Java Performance Tuning
Tips July 2024
https://www.youtube.com/watch?v=9dMPp-E5-vk
Understanding Garbage Collection, Memory Leaks, Heap and Thread Dumps (Page last updated June 2024, Added 2024-07-26, Author Ram Lakshmanan, Publisher JPrime). Tips:
- The JVM process memory is the heap (sized by -Xmx) plus various native memory spaces, eg metaspace, thread stacks, code cache, direct buffers, GC management structures, JNI, etc.
- Both "OutOfMemoryError: Java heap space" and "OutOfMemoryError: GC overhead limit exceeded" mean the heap is unable to find more space to allocate an object even after garbage collection and trying to expand the heap up to Xmx.
- The GC log over time usually shows when a heap leak will cause an OutOfMemoryError, because the heap used after each GC gradually grows. Enable GC logging with -Xlog:gc and analyze with https://fasterj.com/tools/gcloganalysers.shtml (a leak sketch and a combined diagnostics example follow this list).
- ycrash will pull the following data from your system: the GC log, a thread dump, a heap dump, heap data, top (and with -H), ps, disk usage, dmesg, netstat, ping, vmstat, iostat, kernel params, your app logs, and system metadata. This combination is useful for diagnosing issues happening in a JVM.
- Heap dumps are analysed with heap dump analysers (see https://fasterj.com/tools/heapdumpanalysers.shtml).
- "OutOfMemoryError: Requested array size exceed the VM limit" means the application tried to create an array too large to fit in to the current heap - the error shows the stack trace of where that was tried to be created so should be straightforward to fix.
- "OutOfMemoryError: Metaspace" means too many classes have been generated or loaded. Assuming this is not a class leak (eg classes being continually generated but all held on to), you can increase the Metaspace size. Log class loading with -Xlog:class+load=info. "OutOfMemoryError: Permgen space" (only on old JVMs, before Java 8) is usually a similar cause.
- "OutOfMemoryError: Unable to create new native threads" is caused by the OS not having any native memory left to allocate a new thread. In this situtaion usually the app has too many threads, taking a thread dump shows you the other threads in the JVM. Alternative options are to increase thread limits (ulimit), or make more native memory available in some way (more RAM, fewer competing processes, smaller heap, smaller thread stack sizes).
- "OutOfMemoryError: direct buffer memory" - either the direct buffer is not large enough, or the objects using it are not reclaiming the memory used fast enough. Note Java 17 has fixed some issues in this area, so try that if you are on a lower JVM.
- OutOfMemoryError from "Kill process or sacrifice child" happens when the OS kills the JVM because it decides the JVM process is trying to allocate more memory than it is allowed to. This message is in the kernel logs (eg viewable from dmesg). As this includes native memory, you may need to analyze why that is growing too large, eg with NativeMemoryTracking.
- An OutOfMemoryError whose stack trace ends in a native method happens when JNI or other native code causes the memory issue.
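To make the heap-leak pattern above concrete, here is a minimal sketch (the class name and sizes are made up for illustration): objects keep being added to a collection that is never trimmed, so the heap used after each GC keeps climbing until the "Java heap space" error is thrown.

  import java.util.ArrayList;
  import java.util.List;

  public class LeakSketch {
      // A static collection that only ever grows is a classic heap-leak shape.
      private static final List<byte[]> CACHE = new ArrayList<>();

      public static void main(String[] args) {
          while (true) {
              CACHE.add(new byte[1024 * 1024]); // retains ~1MB per iteration, never released
          }
      }
  }

Run it with a small heap (eg -Xmx64m) and -Xlog:gc to watch the post-GC heap usage climb before the OutOfMemoryError appears.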
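The diagnostics mentioned in these tips can be turned on together. The following launch line and jcmd commands are a sketch - the file paths, sizes and <pid> are placeholders to adapt:

  java -Xlog:gc*:file=gc.log:time,uptime \
       -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/var/tmp/app.hprof \
       -Xlog:class+load=info \
       -XX:NativeMemoryTracking=summary \
       -jar app.jar

  jcmd <pid> Thread.print                      (thread dump)
  jcmd <pid> GC.heap_dump /var/tmp/app.hprof   (heap dump on demand)
  jcmd <pid> VM.native_memory summary          (native memory breakdown, needs NMT enabled as above)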
https://www.youtube.com/watch?v=F4bpkT9pTLs
JVM Ergonomics for Software Engineers (Page last updated June 2024, Added 2024-07-26, Author Fabio Arcidiacono, Publisher Ticino Software Craft). Tips:
- When -Xmx is not set, the JVM sets the maximum heap to a quarter of the available (container) memory; if less than 256MB is available the maximum heap is 50% of it, and if between 256MB and 512MB is available the maximum heap is approximately 127MB. Using -XX:MaxRAMPercentage=PERCENT lets you change that default.
- JVM GC choice (if no explicit GC algorithm flag is set): If the number of CPUs is equal to or greater than 2 and the amount of memory is greater than 1792MB, the chosen GC will be the G1 GC (or ParallelGC before Java 9). If either of these two conditions is below the mentioned values, the chosen GC will be the SerialGC.
- Garbage collector options: SerialGC (low overhead) for single core small heaps; ParallelGC (low overhead) for multi-core small heaps or batch workloads with any heap size; G1GC (low to moderate overhead) for responsiveness in medium to large heaps; ZGC (moderate overhead) for responsiveness in medium to large heaps; ShenandoahGC (moderate overhead) for responsiveness in medium to large heaps.
- Kubernetes pod CPU throttling happens per 100ms scheduling window: if in the first part of a 100ms window you use all of your allocated millicore budget, the pod is not allocated any more CPU until the start of the next 100ms window. This CPU budget includes your application CPU usage as well as the JVM and GC CPU usage. The JVM uses Runtime.availableProcessors() to size the GC thread count and internal JVM thread pools, but this is an integer value, so millicores are rounded up to the next full core. This means the JVM can expect more CPU than is actually allocated, eg with 1100 millicores the JVM decides it has 2 cores and allocates 2 GC threads, which might cause too much CPU utilization by the GC and cause the application to be throttled. Using -XX:ActiveProcessorCount=N lets you explicitly tell the JVM how many cores it should use; it also lets you increase the thread pool sizes if that is beneficial (see the sketch after this list).
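A quick way to see what the JVM has decided is to print the derived values from inside the container. This is a minimal sketch (the class name is made up):

  public class ErgonomicsCheck {
      public static void main(String[] args) {
          // Max heap and CPU count as derived from container limits
          // (or as overridden by -XX:MaxRAMPercentage / -XX:ActiveProcessorCount).
          long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
          int cpus = Runtime.getRuntime().availableProcessors();
          System.out.println("Max heap (MB): " + maxHeapMb);
          System.out.println("Available processors: " + cpus);
      }
  }

Running it with, for example, java -XX:MaxRAMPercentage=75 -XX:ActiveProcessorCount=1 ErgonomicsCheck confirms that the overrides are picked up.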
https://www.youtube.com/watch?v=QfvtxTtD2oQ
Low latency Java systems (Page last updated June 2024, Added 2024-07-26, Author Stefan Angelov, Publisher JPrime). Tips:
- Bandwidth is the number of requests per period that can be pumped into a system; throughput is the number of requests per period that can be processed by the system; latency is the amount of time a request takes to be processed including time spent waiting to get in and out of the system.
- Java low latency performance challenges include: garbage collection, warmup, unpredictable compilation, lack of memory layout control, non-direct memory access.
- Guard logging statements with a check on the log level so that logging doesn't generate garbage when it's not used (see the logging sketch after this list).
- The JVM memory has heap (young gen, old gen) and non-heap (class metadata/metaspace, thread stacks, code cache, GC overhead). Low latency Java needs efficient memory use, including reducing object churn.
- Efficient heap memory techniques: make objects smaller, lazy initialization, immutable objects, canonical objects, reuse objects (object pools and thread locals), using Enums for singletons, flyweight patterns, collections that use primitive types instead of wrapping primitives.
- The costs of synchronization include: context switching, contention, locking (acquisition and release), and memory consistency overheads. But most important of all is that the code becomes serial: only one thread (in the synchronized block) executes at a time.
- Avoid synchronization with thread-locals and/or compare-and-swap based actions (see the AtomicLong sketch after this list).
- False sharing happens because CPU caches load cache lines, and a line can contain data from multiple objects/data fields. If two different fields are being processed in different threads, there should be no contention, but if the fields are in the same cache line, then both threads are handling the same cache line and contention happens. This can be avoided with padding (see the padding sketch after this list).
- Thread affinity binds a thread to a specific core, which increases data locality and reduces context switching - but it needs to be managed at the whole-host level so that other threads don't contend. OpenHFT AffinityLock supports this mechanism.
- LMAX Disruptor is an efficient pipeline technique for low latency Java.
- OpenHFT ChronicleMap is a fast off-heap (memory-mapped) in-memory non-blocking map designed for low latency. ChronicleQueue is similarly a persisted (memory-mapped) queue designed for low latency.
- For low latency, you need to minimize serialization and deserialization overheads. This often means using binary pre-mapped messaging formats instead of common formats like JSON.
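The logging sketch: a minimal example of the guarded-logging tip using java.util.logging; the describe() method is a hypothetical stand-in for any expensive message construction.

  import java.util.logging.Level;
  import java.util.logging.Logger;

  public class GuardedLogging {
      private static final Logger LOG = Logger.getLogger(GuardedLogging.class.getName());

      static String describe() {
          return "expensively built state"; // placeholder for costly work and allocation
      }

      public static void main(String[] args) {
          // The guard ensures describe() and the string concatenation only run
          // (and only allocate) when FINE logging is actually enabled.
          if (LOG.isLoggable(Level.FINE)) {
              LOG.fine("state: " + describe());
          }
          // The Supplier overload achieves the same deferral without an explicit guard.
          LOG.fine(() -> "state: " + describe());
      }
  }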
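The AtomicLong sketch: a minimal example of replacing a synchronized counter with a compare-and-swap loop (java.util.concurrent.atomic also offers incrementAndGet, which performs the same retry internally).

  import java.util.concurrent.atomic.AtomicLong;

  public class CasCounter {
      private final AtomicLong count = new AtomicLong();

      long increment() {
          long current;
          // Lock-free: on contention a thread simply retries the CAS instead of
          // blocking on a monitor, so no thread is ever descheduled holding a lock.
          do {
              current = count.get();
          } while (!count.compareAndSet(current, current + 1));
          return current + 1;
      }

      public static void main(String[] args) throws InterruptedException {
          CasCounter counter = new CasCounter();
          Runnable work = () -> { for (int i = 0; i < 100_000; i++) counter.increment(); };
          Thread t1 = new Thread(work), t2 = new Thread(work);
          t1.start(); t2.start();
          t1.join(); t2.join();
          System.out.println(counter.count.get()); // 200000, with no synchronized block
      }
  }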
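The padding sketch: a minimal illustration of keeping two hot counters on different cache lines by padding. It assumes 64-byte cache lines and that the JVM keeps this field layout; the JDK-internal Contended annotation does this more reliably but is not a public API.

  public class PaddedCounters {
      static final class PaddedLong {
          volatile long value;
          // 7 filler longs (56 bytes) push the next hot field onto another cache line.
          long p1, p2, p3, p4, p5, p6, p7;
      }

      final PaddedLong a = new PaddedLong();
      final PaddedLong b = new PaddedLong();

      public static void main(String[] args) throws InterruptedException {
          PaddedCounters c = new PaddedCounters();
          // Each thread writes only its own counter; the padding stops the two
          // writes from repeatedly invalidating each other's cache line.
          Thread t1 = new Thread(() -> { for (int i = 0; i < 10_000_000; i++) c.a.value++; });
          Thread t2 = new Thread(() -> { for (int i = 0; i < 10_000_000; i++) c.b.value++; });
          t1.start(); t2.start();
          t1.join(); t2.join();
          System.out.println(c.a.value + " " + c.b.value);
      }
  }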
Jack Shirazi