Java Performance Tuning
Tips April 2015
Java Performance Tuning: Getting the Most Out of Your Garbage Collector (Page last updated April 2015, Added 2015-04-28, Author Alex Zhitnitsky, Haim Yadid, Publisher takipi). Tips:
- Symptoms suggesting GC problems include: slow response times; high CPU and memory utilization; irregular extremely slow transactions; irregular disconnections.
- Do not ignore the outlier measurements - these are where GC problems hide. Using averages hides these.
- Define acceptable criteria for GC pause frequency and duration (these will be specific to the application).
- Turn on GC logging (eg -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:mygclogfilename.gc); this is essential for analysing GC.
- 5% is the generally accepted upper bound for GC overhead, but of course acceptable pause times depend on your application requirements.
- You have four overall options for fixing GC pauses: switching garbage collector; tuning heap flags; code changes; and alternative JVMs.
- If predictable performance is an important factor and the heap size isn't large, try the Parallel garbage collector.
- If the average response times or latency are your top priority, then try the CMS or G1 garbage collectors.
- If GC overhead is high, increasing the heap size often improves the situation.
- For solving long pauses in CMS or G1 (usually caused by fragmentation or the promotion rate), either start the GC earlier or increase the heap size.
- Large off-heap caches are a good way to avoid making the heap too large.
- Tune the young gen size, survivor space sizes, and tenuring threshold so that objects created for short-lived requests expire entirely in the young generation.
- The old gen heap should be sized so that the objects which are retained in memory for long periods are easily accommodated in the old generation, and there is sufficient further free memory for other objects that will get promoted there.
- Code changes that can attack fragmentation problems include: reusing objects; using off heap storage; splitting the process into a latency sensitive one with short lived objects, with the remaining state in another process.
- Alternative JVMs include Azul with the C4 GC, and OpenJDK with the Shenandoah GC.
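As a rough illustration of the 5% overhead guideline above, the sketch below estimates GC overhead from inside a running JVM using the standard java.lang.management beans. The class and method names (GcOverhead, overheadPercent, totalGcMillis) are made up for this example. The figure is a lifetime average and only approximate (some collectors also count concurrent work in their collection time), so it complements the GC log rather than replacing it, and like any average it hides outlier pauses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Rough GC overhead estimate: accumulated collection time as a share of JVM uptime. */
final class GcOverhead {

    /** Overhead as a percentage; returns 0 when the elapsed time is not positive. */
    static double overheadPercent(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return 100.0 * gcMillis / elapsedMillis;
    }

    /** Sums the accumulated collection time reported by every collector in this JVM. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        double pct = overheadPercent(totalGcMillis(), uptimeMillis);
        System.out.printf("GC overhead since JVM start: %.2f%%%n", pct);
        if (pct > 5.0) {
            // 5.0 is the guideline from the tips above, not a hard limit
            System.out.println("Above the 5% guideline; check the GC log for details.");
        }
    }
}
```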
Why is APM Important? (Page last updated April 2015, Added 2015-04-28, Author Steven Haines, Publisher AppDynamics). Tips:
- Application Performance Management is primarily determining when applications are behaving abnormally and why.
- APM tools typically monitor metrics from: hardware; OS and virtual machine layer; the JVM; any containers (application server or web container); the application itself; and infrastructure such as network communications, databases, caches, external web services, and legacy systems.
- APM tools use the metrics they've gathered across the system they are monitoring, and identify deviations from normal behaviour for that particular time (ie they include expected variations across time).
- APM tools will alert on deviations from normal behaviour, allowing the alert to be correlated to the metrics that caused the alert and allowing operational staff to dive into the anomaly and identify the causes; and then fix them (potentially using the APM tool itself in some cases).
- Alternatives to holistic APM tools include: synthetic transactions; manual instrumentation; user feedback.
- Log files rarely include the right data for being able to diagnose performance issues.
- It's frequently extremely difficult to reproduce performance problems outside of production; often it's impossible.
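The idea of "deviation from normal behaviour for that particular time" can be sketched very simply: keep a separate baseline per hour of day and flag values that fall far from that hour's mean. This is a hypothetical illustration only. The class HourlyBaseline, its methods, and the standard-deviation threshold are assumptions made for this example, not how any particular APM product works.

```java
/** Hypothetical baseline check: one running mean and variance per hour of day. */
final class HourlyBaseline {
    private final long[] count = new long[24];
    private final double[] mean = new double[24];
    private final double[] m2 = new double[24]; // running sum of squared deviations

    /** Records one observation (for example a response time in ms) for the given hour (0-23). */
    void record(int hourOfDay, double value) {
        // Welford's online update: no need to store the individual samples
        count[hourOfDay]++;
        double delta = value - mean[hourOfDay];
        mean[hourOfDay] += delta / count[hourOfDay];
        m2[hourOfDay] += delta * (value - mean[hourOfDay]);
    }

    /** True when the value is more than `sigmas` standard deviations from that hour's mean. */
    boolean isAnomalous(int hourOfDay, double value, double sigmas) {
        if (count[hourOfDay] < 2) {
            return false; // not enough history to judge
        }
        double stdDev = Math.sqrt(m2[hourOfDay] / (count[hourOfDay] - 1));
        return Math.abs(value - mean[hourOfDay]) > sigmas * stdDev;
    }
}
```

With a baseline like this, a 400ms response time could be normal during a nightly batch window yet count as an anomaly at 9am, which is the "expected variations across time" point made above.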
Adopting Microservices at Netflix: Lessons for Architectural Design (Page last updated February 2015, Added 2015-04-28, Author Tony Mauro, Publisher nginx). Tips:
- A loosely coupled architecture means that you can update services independently; this gives you the ability to get changes into a service as quickly as possible (as you don't need to update dependencies); this is very useful when tuning. In the case of an implementation with a scaling limitation, you can even reimplement a service completely differently, as long as you maintain the API contract.
- A microservice with correctly bounded context is self-contained - you can understand and update the microservice's code without knowing anything about the internals of its peers; the microservice and its peers interact strictly through APIs and so don't share data structures, database schemata, or other internal representations. This allows you to focus on fixing a single bottleneck without impacting anything else in the system.
- Using separate data stores for each microservice scales the system much better; however there can be consistency issues which may require external reconciliation across data stores.
- Server instances, particularly those that run customer-facing code, should be interchangeable; this is most easily enabled by keeping them stateless.
JVM concurrency: To block, or not to block? (Page last updated July 2014, Added 2015-04-28, Author Dennis Sosnoski, Publisher IBM). Tips:
- Blocking approach: a thread waits for the event and then takes action. Nonblocking approach: the event triggers the action when it occurs.
- java.util.concurrent.Future lets you know of completion by means of either polling or waiting.
- java.util.concurrent.CompletableFuture gives you the ability to execute code when an event completes. CompletableFutures support both blocking and nonblocking approaches to handling events, including callbacks.
- A deadlock is where two or more threads each control resources that the other thread(s) need to progress.
- Thread starvation is where some threads are unable to progress because other threads are hogging shared resources.
- Livelocks are where threads are trying to adjust to one another but end up making no progress.
- Blocking threads implies that you'll be context switching threads; context switching has overheads (including cache invalidation) which can dramatically slow down applications in some cases.
- For highest performance, minimize thread switches by using nonblocking code wherever possible.
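The difference between the two styles can be sketched as follows, using only java.util.concurrent classes. The class BlockingVsNonblocking and the helper slowSquare are hypothetical names for this example, and the 50ms sleep simply stands in for slow work.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class BlockingVsNonblocking {

    /** Hypothetical stand-in for slow work such as a remote call. */
    static int slowSquare(int x) {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return x * x;
    }

    /** Blocking: the calling thread waits inside get() until the result is ready. */
    static int blocking(ExecutorService pool, int x) {
        Callable<Integer> task = () -> slowSquare(x);
        Future<Integer> future = pool.submit(task);
        try {
            return future.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Nonblocking: returns at once; the thenApply callback runs when the work completes. */
    static CompletableFuture<String> nonblocking(ExecutorService pool, int x) {
        return CompletableFuture
                .supplyAsync(() -> slowSquare(x), pool)
                .thenApply(result -> "square=" + result);
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(blocking(pool, 7));
            CompletableFuture<String> cf = nonblocking(pool, 7);
            cf.thenAccept(System.out::println); // callback, no thread parked waiting
            cf.join(); // only so this demo does not shut the pool down too early
        } finally {
            pool.shutdown();
        }
    }
}
```

In the nonblocking version the caller is free to do other work while the computation runs; the final join() exists only to keep the demo alive, and a real event-driven design would avoid it.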
Java Performance: Tune the HotSpot JVM Step-by-Step (Page last updated February 2015, Added 2015-04-28, Author Charlie Hunt, Publisher informIT). Tips:
- Define and prioritise the system requirements: maintainability, scalability, availability, manageability, performance (throughput, latency, footprint).
- Choose a JVM deployment model: single JVM or multiple JVMs, on one box or many (the system requirements guide this, eg availability implies multiple JVMs on multiple boxes).
- Choose between client, server, and tiered JVMs depending on startup and throughput requirements (client for fast startup, server for highest throughput, tiered would offer the best of both but isn't mature in Java 8).
- Suggested high-level tuning order: decide how much memory your application needs and adjust to meet that; decide on the latency requirements and adjust to meet those; decide on the throughput requirements and adjust to meet those. Adjustments might start from the high-level system requirements or from further along, depending on how much change is needed to achieve the targets.
Why it's difficult to find performance problems during pre-production tests (Page last updated January 2015, Added 2015-04-28, Author Daniel Witkowski, Publisher JaxEnter). Tips:
- Coordinated Omission is where a single event impacts multiple measurements but is not considered correctly, eg a long pause affects the measurements at the beginning of the pause more dramatically than those later in the pause, but if you average the effect then you don't get the significance of the pause.
- Compare the same measurements over different reporting intervals to see if averaging is hiding outliers.
- Coordinated Omission can cause you to miss samples in your result set that would be significant if not missed.
- If 1% of partial requests are slow, but you need 100 partial requests (even in parallel) to complete a full request, then most full requests are slow. Outliers are important, sometimes disproportionately so.
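The arithmetic behind the last tip can be checked with a few lines of code. The sketch assumes that partial requests are slow independently of one another (an assumption added here, not stated in the original); under that assumption the chance that a full request avoids every slow partial request is 0.99 raised to the power of the fan-out. The class and method names are made up for this example.

```java
final class FanOutLatency {

    /** Probability that at least one of n independent partial requests is slow. */
    static double probabilityFullRequestSlow(double slowFraction, int partialRequests) {
        return 1.0 - Math.pow(1.0 - slowFraction, partialRequests);
    }

    public static void main(String[] args) {
        // 1% slow partial requests, with increasing fan-out per full request
        for (int n : new int[] {1, 10, 100}) {
            System.out.printf("fan-out %d: %.1f%% of full requests hit a slow partial request%n",
                    n, 100 * probabilityFullRequestSlow(0.01, n));
        }
    }
}
```

With a fan-out of 100 the result is roughly 63%, so a 1% outlier rate at the partial-request level becomes the common case at the full-request level.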
Last Updated: 2020-03-30
Copyright © 2000-2020 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss