Java Performance Tuning
Tips November 2019
https://www.infoq.com/presentations/low-latency-cloud-oss/
Achieving Low-Latency in the Cloud with OSS (Page last updated October 2019, Added 2019-11-27, Author Mark Price, Publisher QCon). Tips:
- Latency targets of tens of microseconds rule out databases on the fast path: keep the data in memory (persisted asynchronously to an append-only log with snapshots), do no IO on the fast path, and use lock-free techniques.
- Some cloud providers offer cluster placement groups, which ensure that nodes are as physically close together as possible (ideally a single rack); this is needed to minimize network latencies.
- Use kernel-bypass technologies to get incoming requests to the application faster (instead of network-card -> socket-buffer -> application, these allow network-card -> application, and similarly in reverse).
- Avoid jitter: use allocation-free (no-GC) techniques and avoid CPU cache misses.
- For low latency, your measurement tools must not introduce any jitter into the measurements, which means they themselves must have no pauses of more than a few microseconds.
- If you need low latency on a cloud node, rent the entire machine even if you only need part of it, otherwise noisy neighbours introduce far too much jitter.
- For low latency, turn off hyper-threading, or you get CPU cache contention between threads.
- For low latency, use the L3 cache to pass messages between threads and processes (a minimal lock-free sketch follows this list).
- For low latency, lock threads to specific cores using CPU affinity.
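As an illustration of the lock-free, allocation-free message passing these tips describe, here is a minimal single-producer/single-consumer ring buffer in Java. This is only a sketch under assumptions of my own (a long payload, a power-of-two capacity, exactly one producer thread and one consumer thread), not code from the talk; core pinning itself is not shown and typically needs a library (e.g. OpenHFT Java-Thread-Affinity) or OS tooling such as taskset.

import java.util.concurrent.atomic.AtomicLong;

// Minimal single-producer/single-consumer ring buffer: lock-free and
// allocation-free on the hot path (illustrative sketch, not the talk's code).
public final class SpscRingBuffer {
    private final long[] buffer;                       // pre-allocated, so no GC on the fast path
    private final int mask;                            // capacity must be a power of two
    private final AtomicLong head = new AtomicLong();  // next slot to read (consumer side)
    private final AtomicLong tail = new AtomicLong();  // next slot to write (producer side)

    public SpscRingBuffer(int capacityPowerOfTwo) {
        buffer = new long[capacityPowerOfTwo];
        mask = capacityPowerOfTwo - 1;
    }

    // Called only by the single producer thread.
    public boolean offer(long value) {
        long t = tail.get();
        if (t - head.get() == buffer.length) {
            return false;                              // full; caller decides whether to spin or drop
        }
        buffer[(int) (t & mask)] = value;
        tail.lazySet(t + 1);                           // ordered write publishes the slot to the consumer
        return true;
    }

    // Called only by the single consumer thread; returns Long.MIN_VALUE when empty
    // (a sentinel, so that value cannot be used as a real payload in this sketch).
    public long poll() {
        long h = head.get();
        if (h == tail.get()) {
            return Long.MIN_VALUE;                     // empty
        }
        long value = buffer[(int) (h & mask)];
        head.lazySet(h + 1);                           // frees the slot for the producer
        return value;
    }
}

For brevity the sketch omits cache-line padding around head and tail; a real low-latency queue would pad them (or use a library such as JCTools) to avoid false sharing, which ties in with the cache-contention tips above.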
https://www.infoq.com/articles/slos-engineering-team-API/
SLOs Are the API for Your Engineering Team (Page last updated October 2019, Added 2019-11-27, Author Charity Majors, Publisher InfoQ). Tips:
- Service level objectives (SLOs) provide an interface between your application and anyone (or any other application) that interacts with it, covering: the business agreement for how the application should perform; the signal to your team for when it is no longer productive to keep working on performance; what your customers can expect; and prioritization amongst competing requirements (breaching your SLOs is treated the same as any high-priority bug that needs fixing).
- If a performance problem is reported which is not already flagged by your service level objectives (SLOs), then you need to either adapt the existing SLOs or add new ones that will identify it (a minimal latency SLO check is sketched below).
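As a concrete illustration of an SLO acting as that interface, the sketch below evaluates a hypothetical latency objective - "99% of requests complete within 300ms" - from recorded request latencies, using the HdrHistogram library. The threshold, percentile and class name are assumptions made up for this example, not anything prescribed by the article.

import org.HdrHistogram.Histogram;

// Hypothetical latency SLO: 99% of requests within 300ms (the values are
// illustrative assumptions, not taken from the article).
public final class LatencySloCheck {
    private static final long SLO_THRESHOLD_MS = 300;
    private static final double SLO_PERCENTILE = 99.0;

    // Track latencies from 1ms up to 1 minute with 3 significant digits.
    private final Histogram latencies = new Histogram(60_000L, 3);

    public void record(long latencyMs) {
        latencies.recordValue(latencyMs);
    }

    // True while the objective is met; a breach is treated like any other
    // high-priority bug, as the article suggests.
    public boolean sloMet() {
        return latencies.getValueAtPercentile(SLO_PERCENTILE) <= SLO_THRESHOLD_MS;
    }
}

A breach of sloMet() is exactly the kind of signal the article suggests wiring into your prioritization; if a reported performance problem never trips a check like this, the existing checks need adjusting or a new one needs adding.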
https://www.youtube.com/watch?v=8quHAGrU-sI
Is writing performant code too expensive? (Page last updated July 2019, Added 2019-11-27, Author Tomasz Kowalczewski, Publisher GeeCON). Tips:
- Fail fast: if the feature is going to be rejected, there is no need to spend time optimizing it.
- Compare the cost of adding another server with the cost of the developer time spent making the application run without needing that extra server.
- Once you have achieved the target, there is no point in optimizing further; e.g. 250ms is about the limit of human perception of a change, so making a human-facing response faster than that is wasted effort.
- Optimizing a complex third-party product often produces a much less efficient result than implementing a simple custom solution.
- Placing data according to the hardware layout (NUMA nodes, CPU caches, threads staying on the same core, avoiding sharing cache lines across cores) can make the application much faster.
- Sequential access is much faster than random access, because prefetching (by the CPU's hardware prefetcher, and by the OS for file IO) only works when the next access is predictable (see the access-pattern sketch after this list).
- Branch prediction works much better when one outcome is far more likely. Converting the branch into a set of deterministic operations (e.g. using bitwise operations, as in the sketch below) gives a good speedup for branches which are not heavily biased.
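To make the branch-prediction point concrete, here is a small Java sketch of a conditional accumulation rewritten as branch-free bitwise arithmetic. It is a generic illustration, not code from the talk, and it assumes the values and threshold are small enough that the subtraction cannot overflow an int.

// Branchy version: fastest when the condition is heavily biased, because the
// branch predictor almost always guesses right.
static long sumAtLeastBranchy(int[] values, int threshold) {
    long sum = 0;
    for (int v : values) {
        if (v >= threshold) {
            sum += v;
        }
    }
    return sum;
}

// Branch-free version: the sign bit of (v - threshold) becomes an all-ones or
// all-zeros mask, so there is nothing for the predictor to mispredict.
// Assumes v - threshold does not overflow int.
static long sumAtLeastBranchless(int[] values, int threshold) {
    long sum = 0;
    for (int v : values) {
        int mask = ~((v - threshold) >> 31);  // 0xFFFFFFFF if v >= threshold, else 0
        sum += v & mask;
    }
    return sum;
}

The branch-free version does more arithmetic per element, so, as the tip says, it only pays off when the condition is unpredictable (roughly 50/50); for heavily biased data the branchy version wins.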
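For the sequential-versus-random access tip, here is a rough sketch of the effect (the array size, seed and timing approach are my own choices; for real measurements use a proper harness such as JMH):

import java.util.Random;

// Sums the same array sequentially and then through a shuffled index array.
// Illustrative only - JIT warm-up and measurement noise are ignored.
public final class AccessPatternDemo {
    public static void main(String[] args) {
        int n = 1 << 24;                        // ~16M ints, much larger than a typical L3 cache
        int[] data = new int[n];
        int[] order = new int[n];
        Random rnd = new Random(42);
        for (int i = 0; i < n; i++) {
            data[i] = i;
            order[i] = i;
        }
        // Fisher-Yates shuffle of the index array to force a random access pattern.
        for (int i = n - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = order[i]; order[i] = order[j]; order[j] = tmp;
        }

        long t0 = System.nanoTime();
        long seqSum = 0;
        for (int i = 0; i < n; i++) {
            seqSum += data[i];                  // prefetcher-friendly: linear walk
        }
        long t1 = System.nanoTime();
        long rndSum = 0;
        for (int i = 0; i < n; i++) {
            rndSum += data[order[i]];           // defeats the prefetcher: random walk
        }
        long t2 = System.nanoTime();

        System.out.printf("sequential: %dms, random: %dms (sums %d / %d)%n",
                (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000, seqSum, rndSum);
    }
}

Both loops perform the same additions; the gap comes almost entirely from the cache misses that prefetching cannot hide in the second loop.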
https://www.youtube.com/watch?v=pMeNDXyNv4E
What do I do with 1000 cores (Page last updated July 2019, Added 2019-11-27, Author Andrzej Grzesik, Publisher GeeCON). Tips:
- Cloud hosts vary in performance mainly because of the other processes running on the same host; in the worst case performance can degrade dramatically.
- Code whose memory accesses cross NUMA nodes takes a significant performance hit.
- Sequential processing on each core is a good approach, using event sourcing to drive the processing.
- If you want to run on many cores, you need to use lock-free algorithms.
- In tests, also try the reverse test to verify your hypothesis.
- Reactive concepts like backpressure are very useful (a minimal sketch using the JDK Flow API follows this list).
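As a minimal illustration of backpressure, the sketch below uses the JDK's java.util.concurrent.Flow API (Java 9+): the subscriber only requests the next item after it has processed the current one, so a fast producer cannot overrun a slow consumer. The item count and the simulated 10ms processing delay are assumptions made up for the example.

import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

// Minimal backpressure sketch: the subscriber pulls one item at a time, so the
// publisher's bounded buffer (not the consumer) absorbs bursts from the producer.
public final class BackpressureDemo {
    public static void main(String[] args) throws InterruptedException {
        try (SubmissionPublisher<Integer> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<Integer>() {
                private Flow.Subscription subscription;

                @Override public void onSubscribe(Flow.Subscription s) {
                    subscription = s;
                    s.request(1);                        // pull the first item
                }

                @Override public void onNext(Integer item) {
                    try {
                        TimeUnit.MILLISECONDS.sleep(10); // simulate a slow consumer
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                    System.out.println("processed " + item);
                    subscription.request(1);             // only now ask for the next item
                }

                @Override public void onError(Throwable t) { t.printStackTrace(); }
                @Override public void onComplete() { System.out.println("done"); }
            });

            for (int i = 0; i < 20; i++) {
                publisher.submit(i);                     // blocks if the publisher's buffer is full
            }
        }
        TimeUnit.SECONDS.sleep(1);                       // give the async consumer time to drain (demo only)
    }
}

The JDK Flow interfaces mirror the Reactive Streams contract that libraries such as RxJava and Reactor build on; the pull-based request(n) call is what implements the backpressure.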
Jack Shirazi