Tips June 25th, 2003
Case Study of a High-Volume Account Servicing Application Using J2EE Technology (Page last updated June 2003, Added 2003-06-25, Author Carol McDonald, Joseph Paulchell, Publisher Sun). Tips:
- High volume is: peak 23000 concurrent sessions, average 30 transactions per second, 24/7 site, (still growing rapidly).
- Architecture: browser->load balancer->web server (static content/encryption)->app server cluster (servlets/JSP/EJB)->DB/Legacy systems
- [Architecture for high performance Enterprise system covered in good detail in this article].
- Architecture: Presentation Layer (Struts)->Decoupling Layer: Front-end user data cache (recently used data) and business logic proxies (increases decoupling and flexibility)->Business Logic Layer: Stateless Session Beans ONLY (Facade and Delegate patterns used); service beans (non-business logic) and business beans->Data Access Layer: Connection pools and Object/Relational Mapping->Remote Servers Layer: Separates Vendor and Native Code interfaces from core application
- Patterns used: Facades, Factories, Proxies, Adapters, Singletons, Load-on-Demand, Caching
- Each layer can be timed separately to enable performance testing
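The front-end cache of recently used data mentioned above can be sketched with a standard LinkedHashMap in access order. This is only an illustration, not the article's implementation: the class name RecentDataCache and the size limit are assumptions, and modern Java syntax is used for brevity.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a "recently used data" cache for the decoupling layer.
// Not thread-safe on its own; wrap it with Collections.synchronizedMap if shared.
class RecentDataCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    RecentDataCache(int maxEntries) {
        super(16, 0.75f, true); // true = iterate in access order (least recently used first)
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each put; returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

With a limit of 2, putting "a" and "b", reading "a", then putting "c" evicts "b", because "a" was used more recently.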
How J2EE Technology, Open Standards and Open Source Brought You the Architecture of the Miller Time Network (Page last updated June 2003, Added 2003-06-25, Author John Haro, Publisher Sun). Tips:
- Value List Handler avoids repeated database calls, handles search capabilities, caches results and provides a filterable and traversable result set
- Updatable Value Objects encapsulate business data for efficient transfer between tiers
- Performance testing is important.
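A minimal sketch of the Value List Handler idea follows: the query result is fetched once, held in memory, and then paged or filtered without going back to the database. The class shape and the method names page and filter are assumptions for illustration, not the article's code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical Value List Handler: caches one query result and serves pages from it.
class ValueListHandler<T> {
    private final List<T> cached;

    ValueListHandler(List<T> results) {
        this.cached = new ArrayList<>(results); // copy, so later changes by the caller do not leak in
    }

    int size() {
        return cached.size();
    }

    // Returns one page of results; assumes pageIndex >= 0 and pageSize > 0.
    List<T> page(int pageIndex, int pageSize) {
        int from = Math.min(pageIndex * pageSize, cached.size());
        int to = Math.min(from + pageSize, cached.size());
        return new ArrayList<>(cached.subList(from, to));
    }

    // Returns a new handler over the matching items, without another database call.
    ValueListHandler<T> filter(Predicate<? super T> keep) {
        List<T> kept = new ArrayList<>();
        for (T item : cached) {
            if (keep.test(item)) {
                kept.add(item);
            }
        }
        return new ValueListHandler<>(kept);
    }
}
```

A page index past the end of the data yields an empty list rather than an exception, which keeps the traversal code in the presentation layer simple.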
Graphics Performance Writing Optimized Client Applications (Page last updated June 2003, Added 2003-06-25, Author Scott Violet, Joshua Outwater, Chet Haase, Publisher Sun). Tips:
- Use profiling tools to find the hotspots and fix the problems.
- Lazy Loading is the key to a fast startup: Load just enough to show your app; Use a splash screen for immediate feedback; load only those classes immediately needed.
- Use -verbose to see what classes are being loaded at startup and remove or delay loading of those not needed.
- Static variables are initialized BEFORE main is called, and may load classes unnecessarily or too early.
- Determining what caused a class to load: remove that class from rt.jar, then an exception will be thrown showing what caused the removed class to try to be loaded.
- Lazily: Populate JTabbedPanes; Create JInternalFrames; create display representation.
- Using threads is the key to a responsive GUI; Avoid heavy processing on event dispatching thread; Use SwingWorker; Use invokeLater or invokeAndWait to schedule processing; Only access widgets in the GUI thread.
- Consider overriding: revalidate, repaint, validate, firePropertyChange to do nothing (Look at DefaultTableCellRenderer for details).
- Create your own model implementation: Application specific models outperform defaults.
- Use widget constructors that take models, and populate the model BEFORE creating the widget.
- When tiling small images, create a bigger image to reduce the number of paint calls.
- Managed Images (Component.createImage()/GC.createCompatibleImage()/Toolkit.getImage()) use hardware copies to map the offscreen image to screen - much faster than BufferedImage (BufferedImage and other images managed in 1.5+).
- GC.createCompatibleImage() will return the most efficient image format for copying to the screen.
- Graphics.drawImage() is where acceleration takes place (for subsequent draws).
- Draw using the simplest drawing primitive that achieves the effect. Draw complex shapes into an image for repeated rendering.
- Rectangles are much faster than wide lines.
- Use GraphicsDevice.getDefaultConfiguration() rather than GraphicsDevice.getConfigurations()[0].
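The lazy-loading and static-initialization tips above can be illustrated with the initialization-on-demand holder idiom, one common way to keep an expensive static value from being built at startup. The names LazyResources, Holder and buildTable are hypothetical, and the table stands in for any costly resource.

```java
// Hypothetical sketch: defer an expensive static initialization until first use.
class LazyResources {
    static int loads = 0; // counts how many times the expensive table was built

    // The JVM initializes Holder only when Holder.TABLE is first read,
    // not when LazyResources itself is loaded.
    private static class Holder {
        static final int[] TABLE = buildTable();
    }

    private static int[] buildTable() {
        loads++;
        int[] table = new int[256];
        for (int i = 0; i < table.length; i++) {
            table[i] = i * i;
        }
        return table;
    }

    // First call triggers Holder initialization; later calls reuse the table.
    static int square(int i) {
        return Holder.TABLE[i];
    }
}
```

Had TABLE been a static field of LazyResources directly, it would be built as soon as the class was initialized, whether or not square was ever called.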
Performance Tuning the Sun ONE Application Server (Page last updated June 2003, Added 2003-06-25, Author David Dagastine, Eileen Loh, Scott Oaks, Martin Zaun, Publisher Sun). Tips:
- Out of the box configuration is not optimal for performance.
- Correctly size the overall heap: -Xms/-Xmx.
- Too small heap: More frequent GCs; May run out of memory; Overall throughput decreased.
- Too large heap: Long pauses during full GCs; Unnecessary memory use.
- Sizing the young generation (-XX:NewRatio=n, -XX:NewSize=n, -XX:MaxNewSize=n) too big forces full GCs.
- -XX:+AggressiveHeap Automatically sizes heap (Targets long-running memory allocation intensive jobs; Uses throughput collector; Size of initial heap based on system memory; Automatically sizes generations).
- Use parallel GC when average response and throughput are most important (-XX:+UseParNewGC).
- Use concurrent GC when low pause times are most important (-XX:+UseConcMarkSweepGC).
- Web container: Pay attention to HTTP Traffic; Optimize output streams; Cache common queries.
- HTTP tunables: Increase acceptor threads if lots of short-lived connections; Decrease RqThrottle unless you have lots of CPUs; Decrease KeepAliveTimeout if clients disconnect a lot.
- KeepAliveQueryMeanTime and KeepAliveQueryMaxSleepTime: reduce if CPU idle else increase.
- In servlets, set content length; for JSP pages, set UseOutputStreamSize.
- Turn off page reloading <jsp-config><property name="reload-interval" value="-1"/></jsp-config>.
- LD_PRELOAD=/usr/lib/ bin/startserv increases SSL performance.
- Optimize pool and cache configuration parameters: commit-option, max-cache-size, cache-idle-timeout-in-seconds, removal-timeout-in-seconds, steady-pool-size, max-pool-size, pool-idle-timeout-in-seconds, (pool-,cache-)resize-quantity.
- Monitor: total-beans-created/destroyed; num-beans-in-pool; GC activity, memory footprint; cache-misses/hits ratios; num-passivations (stateful session beans); total-beans-in-cache.
- Increase bean pool size when observing excessive creation and deletion of bean instances.
- Decrease bean pool size when accumulating a steadily large number of instances in pool.
- Increase cache size until a good cache-hits rate is reached.
- Decrease cache size if accumulating a large number of instances and cache hit rate doesn't improve.
- Choose optimistic concurrency for CMPs with read-mostly access.
- Use statement caching if possible.
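To check from inside the JVM whether the heap sizing took effect, and to watch GC activity and memory footprint as suggested above, something like the following sketch could be used. It relies on java.lang.management, which arrived in Java 5 and so postdates the 1.4-era JVMs the article describes; the class name HeapReport and the output format are assumptions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: one-line summary of heap usage and GC activity.
class HeapReport {
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        long usedBytes = rt.totalMemory() - rt.freeMemory();
        StringBuilder sb = new StringBuilder();
        sb.append("heap used=").append(usedBytes)
          .append(" committed=").append(rt.totalMemory())
          .append(" max=").append(rt.maxMemory()); // upper limit, normally set by -Xmx
        // One entry per collector, e.g. a young-generation and an old-generation collector.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("; ").append(gc.getName())
              .append(" collections=").append(gc.getCollectionCount())
              .append(" timeMs=").append(gc.getCollectionTime());
        }
        return sb.toString();
    }
}
```

Logging this summary periodically gives a rough picture of GC frequency and cumulative pause time; it is no substitute for the server's own monitoring counters.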
Optimizing EJB Performance in High-Volume Data-Warehousing Applications Patterns, Strategies and Best Practices (Page last updated June 2003, Added 2003-06-25, Author Samrat Ray, Arunabh Hazarika, Publisher Sun). Tips:
- Use batch inserts for large updates.
- Cache data from small tables.
- Use Optimistic Concurrency in place of higher isolation levels. Handle Optimistic Concurrency violations.
- Read-mostly data lends itself easily to caching.
- Use a timer-based invalidation when the data does not have to be real time; Use programmatic invalidation for real-time data.
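The two invalidation styles above can be combined in one small cache, sketched here under stated assumptions: entries expire after a fixed time-to-live (timer-based), and invalidate removes an entry immediately (programmatic). The class name TimedCache, the injected clock and the loader function are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical read-mostly cache with time-based expiry and explicit invalidation.
class TimedCache<K, V> {
    private static final class Entry<T> {
        final T value;
        final long loadedAt;
        Entry(T value, long loadedAt) {
            this.value = value;
            this.loadedAt = loadedAt;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis in production
    private final Function<K, V> loader; // e.g. a lookup against a small table

    TimedCache(long ttlMillis, LongSupplier clock, Function<K, V> loader) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
        this.loader = loader;
    }

    // Returns the cached value, reloading it if it is missing or older than the TTL.
    synchronized V get(K key) {
        long now = clock.getAsLong();
        Entry<V> e = entries.get(key);
        if (e == null || now - e.loadedAt >= ttlMillis) {
            e = new Entry<>(loader.apply(key), now);
            entries.put(key, e);
        }
        return e.value;
    }

    // Programmatic invalidation for data that must be fresh after a known update.
    synchronized void invalidate(K key) {
        entries.remove(key);
    }
}
```

Passing the clock in as a parameter is a design choice made so that expiry can be tested without sleeping.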
A Billion Hits a Day (Page last updated June 2003, Added 2003-06-25, Author Deepak Alur, Rajmohan Krishnamurthy, Arnold Goldberg, Publisher Sun). Tips:
- eBay: 1,000 million hits/day => 11,500 hits per second (2005 estimate; 2002 reality 380M/day, 4500/second).
- Scaling achieved by: using patterns; performance testing; capacity planning; configuration tuning; redundant infrastructure.
- Minimize server-side state.
- Don't use server affinity.
- Partition the database horizontally and vertically.
- Optimize persistency by generating optimal specialized code.
- Use data caching
- Use lazy loading
- Target the dataset to fetch data accurately (minimizing round trips or data transferred or both).
- Target end-tier data store location late, i.e. make the location dynamically choosable (horizontal scalability, failover).
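Horizontal partitioning with a late-bound data store location can be sketched as a router that picks the store per request from the key. This is a simplified assumption of how such routing might look, not eBay's scheme: the class ShardRouter, the store names and the modulo rule are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical router: chooses the data store for a key at request time.
class ShardRouter {
    private final List<String> dataStores;

    ShardRouter(List<String> dataStores) {
        if (dataStores.isEmpty()) {
            throw new IllegalArgumentException("at least one data store is required");
        }
        this.dataStores = new ArrayList<>(dataStores);
    }

    // floorMod keeps the index non-negative even for negative ids.
    String storeFor(long userId) {
        int index = (int) Math.floorMod(userId, (long) dataStores.size());
        return dataStores.get(index);
    }
}
```

A plain modulo rule reshuffles most keys when a store is added, so a production system would more likely use a lookup table or consistent hashing; the point here is only that callers never hard-code the location.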
Measuring Java 2 Platform, Enterprise Edition (J2EE) Application Performance in Production (Page last updated June 2003, Added 2003-06-25, Author Geoff Vona, Publisher Sun). Tips:
- Basic Metrics: Response Time (R); Throughput (X); Resource Utilization (U). R tends to increase with load; X and U increase linearly until U is maxed out; Once U is maximum, R and X plateau or decrease. [This is the first time I have ever wanted to have a graphic as a tip. Check out page 10 for a crystal clear picture of the behavior of these metrics].
- Response Time reflects the end-user experience. It can vary significantly because of locking, resource contention and container activity. Deviation implies there will be outliers, and these outliers will generate complaints.
- Throughput measures the number of transactions that are executed by the system over a period of time: a measure of the system's capacity for load.
- A common performance goal is to target maximizing throughput, while maintaining 95% of requests having response times below a given value.
- Measure performance! Don't guess.
- Try to change only one thing between measurements.
- Metrics to measure: client response time; OS CPU utilization; OS memory use; OS disk activity; JVM heap usage; JVM locks; JVM thread call stacks; JVM method exclusive time; JVM memory size; Servlet response times; JSP response times; EJB utilization; EJB response times; JDBC utilization & response times; JMS utilization & response times; JCA utilization & response times; JNDI utilization; Transaction rates and duration; Threads utilization; Queue sizes and throughput; General configuration; [article also covers Weblogic/WebSphere/JBoss/Oracle metrics].
- [Discusses various ways of obtaining performance measurements: protocol sniffers; logs; manual instrumentation; automatic bytecode insertion]
- Understand the impact of your measurements!
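The goal of keeping 95% of requests under a response-time limit while maximizing throughput can be checked from recorded timings, roughly as sketched below. The class ResponseStats and its method names are assumptions; the percentile uses the nearest-rank definition, which is one of several in common use.

```java
import java.util.Arrays;

// Hypothetical helper for summarizing measured response times.
class ResponseStats {
    // Nearest-rank percentile: the smallest sample such that at least
    // `percent` percent of all samples are less than or equal to it.
    static long percentile(long[] responseMillis, int percent) {
        if (responseMillis.length == 0 || percent < 1 || percent > 100) {
            throw new IllegalArgumentException("need samples and a percent in 1..100");
        }
        long[] sorted = responseMillis.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (percent * sorted.length + 99) / 100; // integer ceiling of percent * n / 100
        return sorted[rank - 1];
    }

    // Throughput (X): transactions completed per second over the measured interval.
    static double throughputPerSecond(int transactions, long elapsedMillis) {
        return transactions * 1000.0 / elapsedMillis;
    }
}
```

For example, 300 transactions completed in 10,000 ms give a throughput of 30 per second, and a run meets the goal if percentile(samples, 95) is below the agreed limit.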
Garbage Collection in the Java HotSpot Virtual Machine (Page last updated June 2003, Added 2003-06-25, Author John Coomes, Tony Printezis, Publisher Sun). Tips:
- new java.lang.Object() is about 10 native instructions (with Sun 1.4.1 JVM).
- Stop-the-world GC in HotSpot does not stop threads running in native code.
- Generational GC assumes (i.e. is tuned for) most objects being short-lived and there being few references from old to young objects.
- Parallel (-XX:+UseParallelGC) and concurrent (-XX:+UseConcMarkSweepGC) GCs also now available.
- Finalization slows allocation of finalizable objects and delays their garbage collection.
- Limit the number of finalizable objects. Reorganize classes so that a finalizable object holds no extra data.
- Beware when extending finalizable objects in standard libraries: GUI elements, nio buffers.
- Use java.lang.ref.WeakReference to avoid finalizers but still get object cleanup where necessary
- Use Object Pools only if allocation or initialization is expensive.
- Size the heap appropriately: the maximum should be larger than working set but smaller than available physical memory.
- Avoid calls to java.lang.System.gc().
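The WeakReference tip above can be sketched as follows: a weak reference registered with a ReferenceQueue carries the cleanup action, and the application polls the queue at a convenient point instead of relying on a finalizer. The names CleanupTracker, Handle, register and drain are hypothetical, and this is one possible arrangement rather than the article's code.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashSet;
import java.util.Set;

// Hypothetical finalizer replacement built on WeakReference and ReferenceQueue.
class CleanupTracker {
    // Carries the cleanup action. The action must not reference the owner,
    // or the owner would never become unreachable.
    static final class Handle extends WeakReference<Object> {
        final Runnable cleanup;
        Handle(Object owner, ReferenceQueue<Object> queue, Runnable cleanup) {
            super(owner, queue);
            this.cleanup = cleanup;
        }
    }

    private final ReferenceQueue<Object> queue = new ReferenceQueue<>();
    private final Set<Handle> pending = new HashSet<>(); // keeps handles reachable until processed

    synchronized Handle register(Object owner, Runnable cleanup) {
        Handle handle = new Handle(owner, queue, cleanup);
        pending.add(handle);
        return handle;
    }

    // Runs the cleanup for every handle the collector has enqueued; returns how many ran.
    synchronized int drain() {
        int ran = 0;
        Handle handle;
        while ((handle = (Handle) queue.poll()) != null) {
            pending.remove(handle);
            handle.cleanup.run();
            ran++;
        }
        return ran;
    }
}
```

Calling drain from an existing housekeeping point (for example, whenever a new resource is registered) avoids a dedicated thread, and no call to System.gc() is needed.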
Jack Shirazi
