Java Performance Tuning
Tips July 2012
Back to newsletter 140 contents
Top 10 Causes of Java EE Enterprise Performance Problems (Page last updated June 2012, Added 2012-07-30, Author Pierre-Hugues Charbonneau, Publisher JavaLobby). Tips:
- Measure current and predict future required IT capacity. Metrics to include: JVM heap utilization; GC overhead; CPU utilization; network utilization; and average page sizes, if applicable.
- Regularly performance and load test your application with data and test cases that reflect real-world usage. Determine how many concurrent users and what transaction rates your application can support before hitting bottlenecks.
- Combining too many applications into one server can lead to serious performance issues as the various applications can interact with the underlying system in conflicting ways resulting in resource conflicts or exhaustion.
- Garbage collection is commonly an issue. At a minimum, you should enable verbose GC so that garbage collection costs can be analyzed. Issues include: too small a heap; too large a memory footprint; object throughput too high; suboptimal GC algorithm chosen; object leaks; objects leaking into the old(er) generation(s); JVM partitioned suboptimally; 32-bit JVM heap size competing with process size limits, restricting thread creation.
- Proper garbage collection tuning requires you to perform high-volume load and performance testing simulating multiple users.
- It is important to shield your application from badly responding external systems with the proper use of timeouts, so that you don't exhaust resources waiting for external responses. Note this applies even if the external requests are all handled asynchronously - the resource 'wait' buildup is the issue here.
- Common database related problems include: queries taking too long due to suboptimal SQL, missing indexes, suboptimal execution plans, or a dataset that is too large; hitting table or row-level data locks too often; lack of database administration (e.g. disk space management, log file rotation, etc.).
- The most common application coding performance problems were: concurrency problems - oversynchronization leads to bottlenecks, while undersynchronization and incorrect variable specification lead to unpredictable behaviour and data corruption; lack of timeouts in communication; thread and I/O resource leaks; suboptimal data caching (too little causes too much costly repeated work to be done, too much causes heap and GC problems); excessive logging.
- JEE container tuning options include: thread constraints, capped per app; inbound traffic limits; session timeouts; pool sizes and lifetimes; application loading; connection pools; JMS quotas, retry limits, expiration policy.
- Lack of monitoring prevents you from understanding the platform capacity and health situation. Monitoring can be implemented easily and should be considered essential.
- Overloaded hardware and excessive network latency are common issues that should be monitored for and targeted for fixing. Network latency can come from hardware issues, badly configured network stacks, and badly implemented application protocols and requests.
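The tip about shielding the application from badly responding external systems can be illustrated with a bounded wait. This is only a minimal sketch: the class `BoundedCall`, the method `callWithTimeout`, and the fallback value are hypothetical names invented for illustration, and a real system would usually also set connect and read timeouts on the underlying client.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not from the article): bounds the wait on any external call.
final class BoundedCall {
    private BoundedCall() {}

    /**
     * Submits the call to the pool and waits at most timeoutMillis for its result.
     * On timeout, failure, or interruption the fallback is returned instead,
     * so the calling thread is never tied up indefinitely.
     */
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> call,
                                 long timeoutMillis, T fallback) {
        Future<T> future = pool.submit(call);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it can release its resources
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the external call itself failed; real code would also log this
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return fallback;
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Simulated slow external system: sleeps far longer than the allowed wait.
            String slow = callWithTimeout(pool, () -> { Thread.sleep(5_000); return "late"; },
                                          100, "fallback");
            String fast = callWithTimeout(pool, () -> "ok", 1_000, "fallback");
            System.out.println(slow + " / " + fast);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

Cancelling the future on timeout matters as much as the timeout itself: without it, the worker thread stays blocked on the slow system and the 'wait' buildup simply moves into the pool.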
Native C/C++ Like Performance For Java Object Serialisation (Page last updated July 2012, Added 2012-07-30, Author Martin Thompson, Publisher mechanical-sympathy). Tips:
- Default Java Serialization is not fast - it was designed for a very different purpose than serialising objects as quickly and compactly as possible. Externalizable is faster, but still slow compared to binary protocols.
- For many systems a large part of the cost of remote transfer is the serialisation of state to-and-from byte buffers. Inefficient but readable protocols like Java Serialisation, XML and JSON are always going to be slower than binary protocols.
- Using a simple binary protocol with the ByteBuffer API can provide an order-of-magnitude speedup over a simple Serializable approach. With a small overhead, you can handle byte ordering with Long/Integer.reverseBytes(), determining at initialization whether the ordering needs changing for any particular remote target.
- The sun.misc.Unsafe class allows you to bypass Java's normal memory-safe procedures for a speed gain in some situations, but at the cost of making memory access and update unsafe (hence the very clear "Unsafe" class name) and exposing any program using Unsafe to all the nasty memory violation type bugs and security holes you see in C/C++ programs.
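A simple binary protocol over the ByteBuffer API might look like the sketch below. The `Trade` type, its three fields, and the fixed 20-byte layout are assumptions made purely for illustration, not the article's own benchmark code; the second helper only shows how a value written in one byte order relates to Long.reverseBytes() when read in the other.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical value type used only for illustration.
final class Trade {
    static final int ENCODED_SIZE = Long.BYTES + Integer.BYTES + Double.BYTES; // 8 + 4 + 8 = 20

    final long id;
    final int quantity;
    final double price;

    Trade(long id, int quantity, double price) {
        this.id = id;
        this.quantity = quantity;
        this.price = price;
    }

    /** Writes the fields at the buffer's current position, in the buffer's byte order. */
    void writeTo(ByteBuffer buffer) {
        buffer.putLong(id).putInt(quantity).putDouble(price);
    }

    /** Reads the fields back in the same order they were written. */
    static Trade readFrom(ByteBuffer buffer) {
        long id = buffer.getLong();
        int quantity = buffer.getInt();
        double price = buffer.getDouble();
        return new Trade(id, quantity, price);
    }
}

final class TradeCodec {
    private TradeCodec() {}

    /** Encodes and decodes one Trade through a buffer using the given byte order. */
    static Trade roundTrip(Trade trade, ByteOrder order) {
        ByteBuffer buffer = ByteBuffer.allocate(Trade.ENCODED_SIZE).order(order);
        trade.writeTo(buffer);
        buffer.flip(); // switch from writing to reading
        return Trade.readFrom(buffer);
    }

    /** Writes a long in one byte order and reads the same bytes back in another. */
    static long writeThenReadAs(long value, ByteOrder writeOrder, ByteOrder readOrder) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES).order(writeOrder);
        buffer.putLong(value);
        buffer.flip();
        buffer.order(readOrder);
        return buffer.getLong();
    }

    public static void main(String[] args) {
        Trade copy = roundTrip(new Trade(42L, 100, 12.5), ByteOrder.LITTLE_ENDIAN);
        System.out.println(copy.id + " " + copy.quantity + " " + copy.price);

        long value = 0x0102030405060708L;
        long swapped = writeThenReadAs(value, ByteOrder.LITTLE_ENDIAN, ByteOrder.BIG_ENDIAN);
        System.out.println(swapped == Long.reverseBytes(value));
    }
}
```

Because the layout is fixed and carries no field names or type metadata, both ends must agree on field order and byte order in advance; that agreement is what buys the compactness.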
HotSpot's Hidden Treasure (Page last updated July 2012, Added 2012-07-30, Author Poonam Bajaj, Publisher Oracle). Tips:
- The HotSpot Serviceability Agent (SA) is a snapshot debugger which can examine Java processes or core files. When SA is attached to a process it freezes the process and allows examination of the heap and threads from a Java-aware perspective.
- HSDB and CLHSDB are the GUI and command-line Serviceability Agent debuggers. You need to set the environment variable SA_JAVA to the jdk/bin directory, and the PATH should contain the JVM binary used by the target process. Launch HSDB with: java -Dsun.jvm.hotspot.debugger.useXXXDebugger=true -classpath sa-jdi.jar sun.jvm.hotspot.HSDB, where XXX is 'Windbg' on Windows and 'Proc' on Unix.
- The HotSpot Serviceability Agent lets you examine objects, object histograms, finalizer information, different heap statistics, and supports SOQL to query the heap.
Tracking excessive garbage collection in Hotspot JVM (Page last updated June 2012, Added 2012-07-30, Author Artiom Gourevitch, Publisher artiomg). Tips:
- Use -XX:+HeapDumpOnOutOfMemoryError to dump the heap when the application first gets an OutOfMemoryError. This can take minutes for a large heap, and the application will be frozen during the dump, so it may not be feasible for your application.
- Tune -XX:GCTimeLimit=N, where N is a percentage (default 98%) that defines the proportion of time spent in GC before an OutOfMemoryError is thrown. This covers the case where the GC spends more and more time freezing the application while recovering less and less heap space. The application will eventually hit an OOME anyway, but that may take a while; tuning this option for your application reduces that time.
- Tune -XX:GCHeapFreeLimit=N, where N is a percentage (default 2%) that defines the minimum percentage of free space after a full GC before an OutOfMemoryError is thrown. Investigation suggests 2% is too small and a larger value (the author suggests 20%) should be chosen to reduce the wait before the inevitable OOME is encountered.
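The two quantities these limits refer to, the share of time spent in GC and the share of heap left free, can be approximated from inside a running JVM with the standard java.lang.management beans. The sketch below is an assumption-laden illustration: the class `GcOverheadProbe` is hypothetical, the figures are lifetime averages rather than the windowed values HotSpot uses internally, and with concurrent collectors the reported collection time is not purely pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical monitoring helper; it does NOT reproduce HotSpot's internal limit checks.
final class GcOverheadProbe {
    private GcOverheadProbe() {}

    /** Accumulated collection time in milliseconds, summed over all collectors. */
    static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long time = gc.getCollectionTime(); // -1 means the collector does not report it
            if (time > 0) {
                total += time;
            }
        }
        return total;
    }

    /** Collection time as a percentage of JVM uptime since start. */
    static double gcOverheadPercent() {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * totalGcTimeMillis() / uptimeMillis;
    }

    /** Free heap as a percentage of the maximum heap (or of committed heap if no maximum is defined). */
    static double heapFreePercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return 100.0 * (limit - heap.getUsed()) / limit;
    }

    public static void main(String[] args) {
        System.out.printf("GC overhead: %.2f%%, heap free: %.2f%%%n",
                gcOverheadPercent(), heapFreePercent());
    }
}
```

Sampling these values periodically and alerting well before they approach the configured limits gives earlier warning than waiting for the OutOfMemoryError itself.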
Averages, Web Performance Data, And How Your Analytics Product Is Lying To You (Page last updated May 2012, Added 2012-07-30, Author Josh Fraser, Publisher highscalability). Tips:
- Real User Measurement uses monitoring embedded in the downloaded page to directly measure all real interactions with the server site, providing data on actual user experience. This is unlike the synthetic tests that many websites use, which test website response times from random locations around the world using various synthetic user-modelled transactions.
- Real User Measurement allowed Walmart to determine that the average 7 second page loading time actually broke down to more than half the pages loading in under 4 seconds, but the worst 5% taking over 20 seconds.
- Response times are best broken down into sectors, and the percentage of response times for each sector shown (e.g. as a histogram) to give you a real feel for the user response time. Simple averages and deviations aren't sufficient to determine where you should target your effort in improving response times: with a wide spread, you probably want to reduce the variability or the worst cases; with a tight spread, you probably want to target the average.
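The difference between an average and a breakdown can be shown with a few lines of arithmetic. In this sketch the ten sample values are invented for illustration (they are not Walmart's data), and `ResponseTimeStats` is a hypothetical helper using the nearest-rank percentile definition.

```java
import java.util.Arrays;

// Hypothetical helper for summarising response times (in seconds).
final class ResponseTimeStats {
    private ResponseTimeStats() {}

    static double mean(double[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }

    /** Nearest-rank percentile: the smallest sample with at least p% of samples at or below it. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /**
     * Counts samples per bucket. With edges {e0, e1} the buckets are
     * (-inf, e0), [e0, e1) and [e1, +inf). Edges must be ascending.
     */
    static int[] histogram(double[] samples, double[] edges) {
        int[] counts = new int[edges.length + 1];
        for (double sample : samples) {
            int bucket = 0;
            while (bucket < edges.length && sample >= edges[bucket]) {
                bucket++;
            }
            counts[bucket]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        // Invented page load times in seconds: mostly fast, with a slow tail.
        double[] loadTimes = {1, 2, 2, 3, 3, 3, 4, 5, 20, 27};
        System.out.println("mean   = " + mean(loadTimes));
        System.out.println("median = " + percentile(loadTimes, 50));
        System.out.println("p95    = " + percentile(loadTimes, 95));
        System.out.println("under 4s / 4s to 20s / 20s and over = "
                + Arrays.toString(histogram(loadTimes, new double[] {4, 20})));
    }
}
```

For this invented data the mean is 7 seconds even though six of the ten pages load in under 4 seconds; the two slowest pages dominate the average, which is the kind of distortion the tips above warn about.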
Application Performance and Antipatterns (Page last updated January 2012, Added 2012-07-30, Author Munish K Gupta, Publisher TechSpot). Tips:
- Excessive Layering is a performance antipattern whereby the initially reasonable addition of controllers, commands, facades, etc. to decouple layers gets out of hand, and the implementors add so many layers that every request passes through too many of them, incurring a huge overhead. Excessive Layering is often the result of overengineered solutions.
- Excessive Round Tripping is a performance antipattern where many calls are made to obtain data due to over-detached or inefficient decoupling of data representation from storage. The application should be designed to produce all the data required for a transaction with a single request, or as near as feasible.
- Overstuffed Session is a performance antipattern where too much data is kept within the session object - both from large data objects and from accumulation over time.
- Golden Hammer (Everything is a Service) is a performance antipattern which exposes a web API for every internal service, which can lead to excessive and unnecessary marshalling within the application.
- Chatty Services is a performance antipattern where many short communications are used where one longer one would be hugely more efficient.
- Performance antipatterns are best exposed by having architects with a global overview, application standards and reviews, continuous integration tools integrated with compliance checking tools, and regular profiling and load testing of the end-to-end application.
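The cost difference behind Excessive Round Tripping and Chatty Services can be made visible by counting calls. The sketch below uses a hypothetical in-memory `CountingStore` as a stand-in for a remote service; the names, the data, and the idea that each method call equals one network round trip are all assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a remote service: each public fetch counts as one round trip.
final class CountingStore {
    private final Map<Integer, String> data = new HashMap<>();
    private int roundTrips;

    CountingStore() {
        for (int i = 0; i < 100; i++) {
            data.put(i, "item-" + i);
        }
    }

    /** Chatty style: one round trip per item. */
    String fetchOne(int id) {
        roundTrips++;
        return data.get(id);
    }

    /** Batched style: one round trip for the whole request. */
    List<String> fetchMany(List<Integer> ids) {
        roundTrips++;
        List<String> result = new ArrayList<>(ids.size());
        for (int id : ids) {
            result.add(data.get(id));
        }
        return result;
    }

    int roundTrips() {
        return roundTrips;
    }
}

final class RoundTripDemo {
    private RoundTripDemo() {}

    /** Fetches each id with its own call, as a chatty client would. */
    static List<String> fetchChatty(CountingStore store, List<Integer> ids) {
        List<String> result = new ArrayList<>(ids.size());
        for (int id : ids) {
            result.add(store.fetchOne(id));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ids = Arrays.asList(3, 14, 15, 92);

        CountingStore chatty = new CountingStore();
        fetchChatty(chatty, ids);

        CountingStore batched = new CountingStore();
        batched.fetchMany(ids);

        System.out.println("chatty round trips:  " + chatty.roundTrips());
        System.out.println("batched round trips: " + batched.roundTrips());
    }
}
```

Both styles return the same data; only the number of round trips differs, and over a real network each extra round trip adds latency and marshalling cost.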
Last Updated: 2018-06-28
Copyright © 2000-2018 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.