Java Performance Tuning
Tips January 2025
https://www.youtube.com/watch?v=HCcq6VLuXe0
Trash Talk - Exploring the Memory Management in the JVM (Page last updated August 2024, Added 2025-01-27, Author Gerrit Grunwald, Publisher JAVAPRO). Tips:
- Memory is divided into the heap (where objects live and garbage collection happens) and the stack (where thread frames and local object references live). Objects on the heap stay alive as long as they are reachable through references, ultimately from the stack and other GC roots. Setting one direct reference to null does not free an object if other references to it remain; only when no references remain does the object become eligible for garbage collection (see the reachability sketch after this list).
- Garbage collection terminology: Tracing/Marking traverses references from the GC roots to identify live (reachable) objects, so anything left unmarked is dead; Reclaiming/Freeing releases the memory occupied by dead objects; Compaction rearranges live objects to create larger contiguous blocks of free memory; a Minor GC collects the young generation; a Major GC collects the old generation; a Full GC collects both young and old generations; the Remembered Set/Card Table tracks references from the old generation into the young generation so that minor collections remain correct without scanning the whole old generation.
- Basic garbage collector algorithms: Non-Moving Collector/Mark and Sweep - marks live objects, then sweeps (frees) the dead ones; it leaves fragmentation, which makes allocation harder over time (a toy mark-and-sweep sketch follows this list). Moving Collector/Mark and Compact - marks live objects, removes dead ones, then slides the live objects to one end of the heap; compaction time grows linearly with heap size. Copy Collector/Mark and Copy - divides the heap into two spaces; objects are allocated in the active space (the from-space), and when it fills, the live objects are marked and copied into the other space (the to-space), after which the roles of the two spaces are swapped; this is inefficient for long-lived objects, which get copied over and over.
- Generational collectors are based on the weak generational hypothesis: most objects die young. They divide the heap into generations: a young generation, where new objects are allocated and survive only a few GC cycles (e.g. managed with from- and to-spaces), and an old generation (tenured space) for long-lived objects.
- Concurrent garbage collection runs concurrently with application execution. If the application (the mutator) modifies object references while the collector is marking, the collector's view of the object graph can become inconsistent, so barriers (read and write barriers), snapshots, remarking, and forwarding (Brooks) pointers are used to prevent inconsistencies. Write barriers are generally preferred because programs perform far fewer writes than reads, so they add less overhead.
- JVM collectors include: Serial Collector - single-threaded, stop-the-world, good for single-core systems and small heaps (<4GB); Parallel Collector/Throughput Collector - multi-threaded, stop-the-world, maximizes throughput (application time vs. GC time), good for batch processing, scientific computing, and data analysis; CMS (Concurrent Mark and Sweep) - removed in JDK 14; G1 (Garbage First) - region-based collector (the heap is divided into regions) that balances throughput and response time; Epsilon - a no-op garbage collector, useful for measuring application performance without GC overhead or for extremely memory-constrained applications where memory usage is known exactly; Shenandoah - region-based, concurrent collector aimed at low response times on large heaps; ZGC (Z Garbage Collector) - region-based, concurrent, generational collector designed for large heaps. (Flags for selecting these collectors are sketched after this list.)
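The reachability point above can be made concrete with a minimal Java sketch; the class and variable names are invented for illustration.

    // Minimal sketch of reachability and GC eligibility (names are illustrative).
    public class ReachabilityDemo {
        static class Node {
            Node next;
            byte[] payload = new byte[1024]; // give each object some weight
        }

        public static void main(String[] args) {
            Node head = new Node();     // reachable via the stack reference 'head'
            head.next = new Node();     // reachable transitively through 'head'

            Node alias = head.next;     // a second reference to the same object
            head.next = null;           // still reachable via 'alias', so still live

            alias = null;               // no references to the second object remain ...
            head = null;                // ... and now none to the first either,
                                        // so both are eligible for garbage collection
            System.gc();                // only a hint; the JVM decides when to collect
        }
    }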
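As a thought experiment only (this is not how HotSpot implements it), a toy mark-and-sweep pass over a hand-managed object graph might look like the sketch below; all names are invented.

    import java.util.*;

    // Toy mark-and-sweep over a hand-managed "heap" (illustration only).
    public class ToyMarkSweep {
        static class Obj {
            final List<Obj> refs = new ArrayList<>(); // outgoing references
            boolean marked;                           // mark bit
        }

        static final List<Obj> heap = new ArrayList<>();  // every allocated object
        static final List<Obj> roots = new ArrayList<>(); // root ("stack") references

        static Obj alloc() {
            Obj o = new Obj();
            heap.add(o);
            return o;
        }

        // Mark phase: trace everything reachable from the roots.
        static void mark() {
            Deque<Obj> pending = new ArrayDeque<>(roots);
            while (!pending.isEmpty()) {
                Obj o = pending.pop();
                if (!o.marked) {
                    o.marked = true;
                    pending.addAll(o.refs);
                }
            }
        }

        // Sweep phase: anything left unmarked is unreachable and is dropped.
        // (In a real non-moving collector the freed memory leaves fragmented gaps.)
        static void sweep() {
            heap.removeIf(o -> !o.marked);
            heap.forEach(o -> o.marked = false); // reset mark bits for the next cycle
        }

        public static void main(String[] args) {
            Obj a = alloc(); Obj b = alloc(); Obj garbage = alloc();
            a.refs.add(b);
            roots.add(a);   // 'a' and 'b' are reachable, 'garbage' is not
            mark();
            sweep();
            System.out.println("live objects: " + heap.size()); // prints 2
        }
    }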
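For reference, the collectors above are selected with JVM command-line flags such as the following (availability varies by JDK version and vendor build - Shenandoah, for example, is not in every build, and Epsilon still requires unlocking experimental options; app.jar is just a placeholder):

    java -XX:+UseSerialGC -jar app.jar       # Serial collector
    java -XX:+UseParallelGC -jar app.jar     # Parallel/throughput collector
    java -XX:+UseG1GC -jar app.jar           # G1 (the default in current JDKs)
    java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -jar app.jar   # Epsilon (no-op)
    java -XX:+UseShenandoahGC -jar app.jar   # Shenandoah
    java -XX:+UseZGC -jar app.jar            # ZGC
    java -XX:+UseZGC -XX:+ZGenerational -jar app.jar   # generational ZGC (JDK 21/22; the default from JDK 23)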
https://www.youtube.com/watch?v=z1t_ShLop7U
JVM Performance Engineering (Page last updated January 2025, Added 2025-01-27, Authors Monica Beckwith and Kirk Pepperdine, Publisher GOTO Conferences). Tips:
- Performance engineering focuses on enriching the user experience. SLAs, functionality, maintainability, and availability all contribute to this.
- Observability is a property of the system - how easily can information be extracted? Its impact on application performance should be understood and minimized.
- Understanding how hardware works (e.g., memory access, processor pipelines, large pages, instruction-level parallelism, data-level parallelism) is needed for achieving maximally efficient software.
- Benchmarking is often done incorrectly. Look for signs that something is wrong with the benchmark, understand why, and correct it.
- Benchmarking is useful to establish baselines and peak/stress performance measurements, as well as how useful particular improvements are.
- Correct benchmarking is challenging. Systems are complex with a multitude of variables (known and unknown), and reproducibly simulating real-world conditions (with noise and isolation) is very hard.
- Top-down and bottom-up investigation approaches are complementary and can be used together. Top-down starts with the highest level (user experience, overall system performance) and drills down to identify bottlenecks. Bottom-up starts with the lowest level (code, system internals, hardware) and builds up an understanding of how these details impact overall performance.
- Performance engineering is a holistic and systematic approach that requires understanding of software, hardware, and user needs. It involves careful measurement, experimentation, and continuous improvement. Observability plays a key role in understanding system behavior.
https://www.youtube.com/watch?v=sNqbGU-8ys8
How difficult can it be to write efficient code? (Page last updated January 2025, Added 2025-01-27, Author Roberto Cortez, Publisher Devoxx). Tips:
- Useful tools for performance analysis include Async Profiler, JMH (the Java Microbenchmark Harness), and flame graphs (a minimal JMH sketch follows this list).
- Focus on allocations - reducing memory allocations usually leads to improvements in both memory usage and CPU time.
- Always measure performance before and after making changes to verify improvements.
- Focus on optimizing the largest performance bottlenecks first before moving on to smaller optimizations.
- If a task doesn't need to be performed immediately, defer its execution to a later time to improve initial performance.
- Performance optimization is often an iterative process, involving repeated measurement and improvement.
- In a use-case example of optimizing Quarkus, the implementation was redesigned to minimize string allocations, match character by character rather than allocating substrings, and defer generating UUIDs until they are actually needed, resulting in significant performance improvements (a sketch of these techniques appears after this list).
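Since JMH is named above, here is a minimal benchmark sketch; the class name, method names and the scenario (string concatenation) are illustrative, not from the talk.

    import java.util.concurrent.TimeUnit;
    import org.openjdk.jmh.annotations.*;
    import org.openjdk.jmh.infra.Blackhole;

    // Minimal JMH sketch comparing an allocation-heavy and an allocation-light approach.
    @BenchmarkMode(Mode.AverageTime)
    @OutputTimeUnit(TimeUnit.NANOSECONDS)
    @State(Scope.Thread)
    public class ConcatBenchmark {

        @Param({"10", "100"})
        public int size;

        @Benchmark
        public void plusConcat(Blackhole bh) {
            String s = "";
            for (int i = 0; i < size; i++) {
                s += i;                    // allocates a new String every iteration
            }
            bh.consume(s);                 // Blackhole prevents dead-code elimination
        }

        @Benchmark
        public void builderConcat(Blackhole bh) {
            StringBuilder sb = new StringBuilder();
            for (int i = 0; i < size; i++) {
                sb.append(i);              // reuses the builder's internal buffer
            }
            bh.consume(sb.toString());
        }
    }

Build and run it through the JMH Maven archetype (or the org.openjdk.jmh.Main runner) so that warmup, forking and iteration counts are handled by JMH rather than by hand-rolled timing loops.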
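The Quarkus redesign itself is not reproduced here, but the general techniques it names - character-by-character matching instead of allocating substrings, and deferring expensive values such as UUIDs until first use - can be sketched as follows; all names are hypothetical.

    import java.util.UUID;

    // Hypothetical sketch of allocation avoidance and deferred execution.
    public class AllocationTips {

        // Allocation-heavy: substring() creates a new String just to compare it.
        static boolean hasPrefixAllocating(String path, String prefix) {
            return path.length() >= prefix.length()
                    && path.substring(0, prefix.length()).equals(prefix);
        }

        // Allocation-free: compare character by character in place
        // (String.startsWith does essentially this; shown here to make the idea explicit).
        static boolean hasPrefixInPlace(String path, String prefix) {
            if (path.length() < prefix.length()) return false;
            for (int i = 0; i < prefix.length(); i++) {
                if (path.charAt(i) != prefix.charAt(i)) return false;
            }
            return true;
        }

        // Deferred execution: the UUID is only generated if something asks for it.
        static class Request {
            private String id;                         // lazily initialized
            String id() {
                if (id == null) {
                    id = UUID.randomUUID().toString(); // pay the cost only when needed
                }
                return id;
            }
        }

        public static void main(String[] args) {
            System.out.println(hasPrefixInPlace("/orders/42", "/orders")); // true
            Request r = new Request();
            System.out.println(r.id()); // the UUID is created on this first access
        }
    }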
Jack Shirazi
Last Updated: 2025-01-27