Java Performance Tuning
Java(TM) - see bottom of page
Tips September 2013
Back to newsletter 154 contents
Livelocks from wait/notify The Java Specialists' Newsletter (Page last updated September 2013, Added 2013-09-29, Author Dr. Heinz M. Kabutz, Publisher The Java Specialists' Newsletter). Tips:
- A Livelock is the situation where thread A holds a lock that thread B wants to acquire, but thread A never releases it because of a logic error in the code. In a Deadlock, by contrast, thread A waits for a lock held by thread B while thread B waits for a lock held by thread A, so neither ever releases its lock.
- Object.wait() can throw an InterruptedException, without first releasing the lock.
- If the thread has been interrupted prior to entering Object.wait(), the call will throw an InterruptedException, without first releasing the lock.
- If looping on Object.wait(), you might want to clear the thread's interrupted status first (e.g. if (Thread.currentThread().isInterrupted()) Thread.interrupted(); where interrupted() is a static method that tests and clears the status) before entering wait() on each iteration; otherwise it could throw the exception on every iteration, leading to a Livelock.
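The behaviour described above can be sketched in a few lines. This is a minimal illustration, not the code from the original newsletter; the class name WaitInterruptDemo, the method names, and the timeouts are hypothetical. The first method shows that wait() throws immediately when the interrupted status is already set; the second clears the status with the static Thread.interrupted() so that the wait actually blocks.

```java
final class WaitInterruptDemo {
    private static final Object LOCK = new Object();

    /** Returns true if wait() threw at once because the thread was already interrupted. */
    static boolean waitThrowsWhenPreInterrupted() {
        Thread.currentThread().interrupt();            // set the interrupted status up front
        synchronized (LOCK) {
            try {
                LOCK.wait(1000);                       // does not block: status is already set
                return false;
            } catch (InterruptedException e) {
                // Throwing InterruptedException clears the interrupted status.
                return !Thread.currentThread().isInterrupted();
            }
        }
    }

    /** Clears a pending interrupt before waiting; returns true if one was pending. */
    static boolean waitAfterClearingInterrupt(long timeoutMillis) {
        Thread.currentThread().interrupt();            // simulate an earlier interrupt
        boolean wasInterrupted = Thread.interrupted(); // static call: tests AND clears
        synchronized (LOCK) {
            try {
                LOCK.wait(timeoutMillis);              // now waits until the timeout elapses
            } catch (InterruptedException e) {
                return false;                          // not expected in this sketch
            }
        }
        return wasInterrupted;
    }

    public static void main(String[] args) {
        System.out.println(waitThrowsWhenPreInterrupted());
        System.out.println(waitAfterClearingInterrupt(50));
    }
}
```

If the interrupt should not be lost, one option is to remember wasInterrupted and call Thread.currentThread().interrupt() again once the loop has finished, rather than before the next wait().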
Looping versus recursion for improved application performance (Page last updated July 2013, Added 2013-09-29, Author Brian S Paskin, Publisher IBM). Tips:
- Tail recursion (where the recursive call is the last operation in the method) does not use the call stack no matter how deep the recursion goes, so it is usually more efficient than non-tail recursion. Not always, though: in some cases holding intermediate data on the stack is more efficient.
- In some cases tail recursive methods are more efficient than looping.
- Sorting is often processed more efficiently with recursion than loops.
- If head recursion is used, consider tuning the call stack size.
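To make the three forms concrete, here is a small sketch that computes the same sum with non-tail recursion, tail recursion, and a loop. The class SumDemo and its methods are hypothetical names chosen for illustration. Note that javac itself does not rewrite tail calls into loops, so whether the tail-recursive form really avoids stack growth depends on the JVM and JIT in use; measure before relying on it.

```java
final class SumDemo {
    /** Non-tail recursion: the addition runs after the recursive call returns. */
    static long sumRecursive(int n) {
        return n == 0 ? 0 : n + sumRecursive(n - 1);
    }

    /** Tail recursion: the recursive call is the last action; the total travels in acc. */
    static long sumTail(int n, long acc) {
        return n == 0 ? acc : sumTail(n - 1, acc + n);
    }

    /** Loop equivalent of the two methods above. */
    static long sumLoop(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sumRecursive(1000));
        System.out.println(sumTail(1000, 0));
        System.out.println(sumLoop(1000));
    }
}
```

For deep recursion that cannot be turned into a loop, the thread stack size can be raised with the -Xss JVM option (for example -Xss2m; the value here is only an illustration).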
Lock-Based vs Lock-Free Concurrent Algorithms (Page last updated August 2013, Added 2013-09-29, Author Martin Thompson, Publisher mechanical-sympathy). Tips:
- StampedLock is better than the existing lock implementations.
- "synchronized" is a good general purpose lock implementation when contention comes from only 2 threads.
- ReentrantReadWriteLock is only useful when reads vastly outnumber writes.
- ReentrantLock remains an acceptable lock implementation as thread counts grow, as previously discovered.
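For readers who have not used StampedLock (available from Java 8), the following sketch shows its optimistic-read pattern, which is where much of its advantage for read-heavy workloads comes from. It follows the general shape of the example in the StampedLock Javadoc; the class name Position and its fields are hypothetical and not taken from the referenced article.

```java
import java.util.concurrent.locks.StampedLock;

final class Position {
    private final StampedLock lock = new StampedLock();
    private long x;
    private long y;

    /** Writers take the exclusive write lock. */
    void move(long dx, long dy) {
        long stamp = lock.writeLock();
        try {
            x += dx;
            y += dy;
        } finally {
            lock.unlockWrite(stamp);
        }
    }

    /** Readers first try an optimistic read and fall back to a read lock if a write intervened. */
    long[] read() {
        long stamp = lock.tryOptimisticRead();
        long cx = x;
        long cy = y;
        if (!lock.validate(stamp)) {          // a writer got in; the copies may be inconsistent
            stamp = lock.readLock();
            try {
                cx = x;
                cy = y;
            } finally {
                lock.unlockRead(stamp);
            }
        }
        return new long[] {cx, cy};
    }
}
```

The optimistic path takes no lock at all when no write occurs between tryOptimisticRead() and validate(), so uncontended reads stay cheap.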
Optimize Late and Not Often (Page last updated September 2013, Added 2013-09-29, Author Tim Kitchens, Publisher timontech). Tips:
- Don't optimize until there is a need - in the time you spend optimizing code that is already fast enough, you could be accomplishing another task.
- The priority order for coding should be: 1. Implement correct system functionality; 2. Make a clean design; 3. Identify performance shortfalls by way of repeatable defined performance tests.
- Automate realistic performance tests with a mix of use cases, running parallel threads that mimic anticipated runtime behavior.
- If the requirements do not indicate a need for a higher level of performance, spend your time elsewhere - don't optimize if it's not needed.
- Performance testing steps: 1. Identify mixes of use cases, realistic loads and test volumes; 2. Design automated performance tests using "record/playback" with testing tools; 3. Run tests with monitoring; 4. Profile any use cases/components that failed to achieve the required performance; 5. Step through code if necessary to get a more detailed understanding of flow through the bottleneck; 6. Improve the code to make it more efficient, and then repeat testing.
- If performance or scalability is critical to the success of your project, address this - but keep in mind that not every aspect of a system requires optimization. Focus your effort on the part of the system that needs optimizing.
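As a rough illustration of a repeatable test that runs parallel threads, the sketch below executes a task on several workers and reports the worst observed latency, which can then be compared against a stated requirement. It is a toy harness under stated assumptions, not a substitute for a real load-testing tool: the class LoadTest, the method worstLatencyNanos, and the thread and iteration counts are all hypothetical, and it ignores JIT warm-up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class LoadTest {
    /** Runs task `iterations` times on each of `threads` workers; returns the worst latency in ns. */
    static long worstLatencyNanos(Runnable task, int threads, int iterations) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                Callable<Long> worker = () -> {
                    long localWorst = 0;
                    for (int i = 0; i < iterations; i++) {
                        long start = System.nanoTime();
                        task.run();
                        localWorst = Math.max(localWorst, System.nanoTime() - start);
                    }
                    return localWorst;
                };
                results.add(pool.submit(worker));
            }
            long worst = 0;
            for (Future<Long> f : results) {
                worst = Math.max(worst, f.get());   // waits for each worker to finish
            }
            return worst;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();     // preserve the interrupt for callers
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

A test built on this would assert that the returned value stays under the required limit, so that a regression fails the build instead of being noticed in production.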
Tips for Tuning the Garbage First Garbage Collector (Page last updated September 2013, Added 2013-09-29, Author Monica Beckwith, Publisher infoQ). Tips:
- The max number of concurrent refinement threads can be limited using -XX:G1ConcRefinementThreads or -XX:ParallelGCThreads but if the concurrent refinement threads cannot keep up with the amount of filled buffers, then the application threads will do the updating, slowing down the application. Look for "0 ( 0.0%) by mutator threads." in the log output - if it's not 0, consider increasing the refinement threads limit.
- Use the option -XX:+G1SummarizeRSetStats and if Scan RS times seem high relative to the overall GC pause time, look for the text string "Did xyz coarsenings" in your GC log output to determine if you have many coarsened remembered sets - you can alter the proportion of time spent in updating remembered sets with -XX:G1RSetUpdatingPauseTimePercent=N (default N=10, i.e. 10%).
- If you see high times during reference processing (GC ref-proc) then turn on parallel reference processing with -XX:+ParallelRefProcEnabled.
- If you find an evacuation failure in your G1 GC logs: 1. Get a simple baseline with min and max heap and a realistic pause time goal, removing heap sizing options such as -Xmn, -XX:NewSize, -XX:MaxNewSize, -XX:SurvivorRatio, etc. Use only -Xms, -Xmx and a pause time goal -XX:MaxGCPauseMillis; 2. Increase the heap size, or if that's not an option decrease -XX:InitiatingHeapOccupancyPercent=N (default N=45), or increase it if the marking cycle is starting early and not reclaiming much; 3. Try with a higher -XX:ConcGCThreads; 4. If "to-space" survivor is the issue, then increase -XX:G1ReservePercent (default is 10).
- Use -XX:+PrintAdaptiveSizePolicy to help explain causes of evacuation failures.
- If you see excessive "humongous allocation" entries (with the -XX:+PrintAdaptiveSizePolicy flag), especially after most of your long-lived objects have been created, then increase the region size, e.g. -XX:G1HeapRegionSize=16M (default is 4M), to more than twice the size of the majority of allocations.
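The baseline step in the evacuation-failure tip (drop explicit young-generation sizing and keep only -Xms, -Xmx and the pause time goal) can be checked from inside a running JVM. The sketch below is one possible way to do that; the class G1FlagCheck and its method are hypothetical, and the prefix list simply mirrors the options named in the tip.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class G1FlagCheck {
    // Options the tip above suggests removing when establishing a G1 baseline.
    private static final String[] SIZING_PREFIXES = {
        "-Xmn", "-XX:NewSize", "-XX:MaxNewSize", "-XX:SurvivorRatio"
    };

    /** Returns the JVM arguments that set young-generation sizing explicitly. */
    static List<String> explicitSizingFlags(List<String> jvmArgs) {
        List<String> found = new ArrayList<>();
        for (String arg : jvmArgs) {
            for (String prefix : SIZING_PREFIXES) {
                if (arg.startsWith(prefix)) {
                    found.add(arg);
                    break;
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excluding main-class arguments).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Sizing flags to reconsider: " + explicitSizingFlags(jvmArgs));
    }
}
```

An empty list means the JVM was started without any of those options, which matches the suggested baseline.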
Sean Hull's 20 Biggest Bottlenecks that Reduce and Slow Down Scalability (Page last updated August 2013, Added 2013-09-29, Author Sean Hull, Publisher highscalability). Tips:
- Two-phase commit and anything that requires confirmation from a remote host before proceeding is slow. Aim for eventual consistency or asynchronous processing.
- Cache at every layer; even interposing extra caching layers can help performance.
- Disk I/O speeds are critical for any type of persisted storage. Use RAID 10 instead of RAID 5; Provisioned IOPS in the cloud.
- Serial processing can be a huge bottleneck in critical parts of a system. Ensure that any "choke" points that processing HAS to flow through are fully parallel. Watch out for single points of failure, which have an even worse impact.
- Ensure that you can selectively disable parts of a system without affecting other parts, that way the system as a whole can continue while specific subsystems that may be causing issues can be worked on without impacting everything.
- Multiple copies of the database allow higher scaling.
- Database performance antipatterns include: Using your database for queueing (scalability killer); inefficient querying (like text searching); object-relational models tend to produce inefficient SQL.
- Instrumentation/monitoring/metrics/logging of your system and applications is essential - without ongoing visibility, everything will be a surprise and more difficult to track down to the causes.
- Supporting a browse-only mode can leave all read-only features available, while allowing any sudden maintenance requirement to proceed with only partial service denial instead of full shutdown.
- Test failover modes regularly using the actual production system, otherwise it's easy for critical components to be missing in failover mode - licenses missing for backup systems, incomplete backups, configurations that are now out of date, the list of potential issues is endless.
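The browse-only mode mentioned above can be as simple as a runtime switch that write paths consult before doing any work. The following is a minimal sketch under that assumption; the class BrowseOnlySwitch, its methods, and the in-memory map standing in for real storage are all hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

final class BrowseOnlySwitch {
    private final AtomicBoolean browseOnly = new AtomicBoolean(false);
    private final Map<String, String> store = new ConcurrentHashMap<>(); // stand-in for real storage

    /** Operators flip this at runtime, e.g. before sudden maintenance. */
    void setBrowseOnly(boolean enabled) {
        browseOnly.set(enabled);
    }

    /** Reads stay available in both modes. */
    String read(String key) {
        return store.get(key);
    }

    /** Writes are refused while browse-only mode is on; returns true if the write was applied. */
    boolean write(String key, String value) {
        if (browseOnly.get()) {
            return false;
        }
        store.put(key, value);
        return true;
    }
}
```

Callers can translate a refused write into a "temporarily read-only" message, so users keep the read-only features instead of facing a full outage.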
Last Updated: 2023-08-28
Copyright © 2000-2023 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.