Java Performance Tuning

News December 2018

Back to newsletter 217 contents

Continuing from last month, I'm selecting the tips of the year (from the more than 300 I published). Top of the list this month is one repeated in several articles: that on modern systems, memory transfer often dominates runtime, so O(N) complexity analysis alone is insufficient. This applies from the CPU cache level right up to distributed memory. I gave guidelines for this in my Devoxx talk last year, where I explained the three axes of performance: concurrency, data size and responsiveness.
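A small sketch of why memory transfer matters more than big-O: both loops below are O(N) sums, but the array is one contiguous block the CPU can stream through cache lines, while the linked list chases pointers to scattered heap objects. The class and method names here are illustrative, not from any of the cited articles.

```java
import java.util.LinkedList;
import java.util.List;

public class MemoryTransferDemo {

    // Sequential scan over a contiguous int[]: cache-friendly.
    static long sumArray(int[] data) {
        long sum = 0;
        for (int v : data) sum += v;
        return sum;
    }

    // Same O(N) work over a LinkedList: every element is a separate
    // heap object reached via a pointer, so the CPU stalls on memory.
    static long sumList(List<Integer> data) {
        long sum = 0;
        for (int v : data) sum += v;
        return sum;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        int[] array = new int[n];
        List<Integer> list = new LinkedList<>();
        for (int i = 0; i < n; i++) { array[i] = i; list.add(i); }

        long t0 = System.nanoTime();
        long a = sumArray(array);
        long t1 = System.nanoTime();
        long b = sumList(list);
        long t2 = System.nanoTime();
        // Typically the array sum is several times faster, despite
        // identical algorithmic complexity.
        System.out.printf("array: %d ns, list: %d ns%n", t1 - t0, t2 - t1);
        assert a == b;
    }
}
```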

The next top tip, taken from Daniel Shaya's excellent masterclass in ultra-low latency programming (see this month's talks, below), tells you what you need for ultra-low latency (sub-100-microsecond) processing:
- no GCs at all;
- use shared memory;
- apply single-threaded processing logic (no synchronization), with the thread pinned to a core and all other threads excluded from that core;
- use simple (single-threaded) object pooling;
- scale by partitioning data across non-shared threads/processes/microservices;
- spin when waiting for data, to keep hold of the CPU and keep it hot;
- record everything so that you can replay in test to analyse outliers;
- don't cross NUMA regions; each process/microservice should run on one core;
- use wait-free data structures (no waits; the thread is guaranteed to always make progress and to finish within a set number of cycles);
- run replicated hot-hot for high availability.
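To make the object-pooling point concrete, here is a minimal sketch of a single-threaded pool. The class and its API are hypothetical (not from the masterclass): because only the one pinned thread ever touches the pool, no synchronization is needed, and reusing objects avoids allocation, and therefore GC, on the hot path.

```java
import java.util.ArrayDeque;
import java.util.function.Supplier;

// Hypothetical single-threaded object pool: safe only when acquire()
// and release() are always called from the same (pinned) thread.
public class SingleThreadedPool<T> {

    private final ArrayDeque<T> free = new ArrayDeque<>();
    private final Supplier<T> factory;

    public SingleThreadedPool(Supplier<T> factory, int preallocate) {
        this.factory = factory;
        // Pre-allocate during startup so the hot path never allocates.
        for (int i = 0; i < preallocate; i++) free.push(factory.get());
    }

    // Reuse a pooled object if one is free; allocate only as a fallback
    // (ideally never after warm-up).
    public T acquire() {
        T obj = free.poll();
        return obj != null ? obj : factory.get();
    }

    // Return the object for reuse; the caller must drop its reference
    // and must reset any state the object carries.
    public void release(T obj) {
        free.push(obj);
    }
}
```

Usage is plain `acquire`/`release` pairs around each message; real ultra-low-latency pools typically also pre-size the backing storage so the deque itself never grows on the hot path.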

My seventh (including last month's) top tip of the year tells you how to optimize load balancing: choose two servers at random, then pick the less busy of the two. This keeps coordination overhead low while still giving a reasonable chance of using spare resources.
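This "two random choices" rule can be sketched in a few lines. The class is illustrative; the load metric here is just an int array, where in a real balancer it would be a cheap (possibly slightly stale) per-server figure such as active connection count.

```java
import java.util.concurrent.ThreadLocalRandom;

public class TwoChoiceBalancer {

    // Pick two distinct servers at random and return the index of the
    // less busy one - no global scan, no central coordination.
    static int pick(int[] load, ThreadLocalRandom rnd) {
        int a = rnd.nextInt(load.length);
        int b = rnd.nextInt(load.length - 1);
        if (b >= a) b++;                 // make the second choice distinct
        return load[a] <= load[b] ? a : b;
    }

    public static void main(String[] args) {
        int[] load = new int[10];
        ThreadLocalRandom rnd = ThreadLocalRandom.current();
        for (int i = 0; i < 100_000; i++) load[pick(load, rnd)]++;

        int min = Integer.MAX_VALUE, max = 0;
        for (int l : load) { min = Math.min(min, l); max = Math.max(max, l); }
        // Loads stay far more even than purely random assignment would give.
        System.out.println("min=" + min + " max=" + max);
    }
}
```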

And my final top tip of the year is clarity on when to use parallel streams: parallel streams are ideal for substantial in-memory work. Fetching data from outside memory (e.g. from disk or the network) swamps any benefit; similarly, if the work is small, going parallel will actually take longer than staying sequential. Any data structure used needs to be splittable (cheaply divisible for parallel processing), and any lambdas used in the stream need to be thread-safe (avoid updating shared variables).
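A short example of a workload that fits all the criteria above: CPU-heavy, entirely in memory, over a cheaply splittable source (an `IntStream` range), with a stateless lambda and a reduction instead of shared mutable state. The class name and the prime-counting task are my own illustration.

```java
import java.util.stream.IntStream;

public class ParallelStreamDemo {

    // Stateless and thread-safe: reads only its argument.
    static boolean isPrime(int n) {
        if (n < 2) return false;
        for (int i = 2; (long) i * i <= n; i++)
            if (n % i == 0) return false;
        return true;
    }

    public static void main(String[] args) {
        // Good fit for .parallel(): substantial in-memory CPU work,
        // splittable range source, no shared state, result via count().
        long primes = IntStream.rangeClosed(2, 1_000_000)
                               .parallel()
                               .filter(ParallelStreamDemo::isPrime)
                               .count();
        System.out.println(primes);   // 78498 primes up to 1,000,000

        // Anti-patterns to avoid: tiny workloads (the parallel overhead
        // dominates), I/O inside the stream, or lambdas that mutate a
        // shared collection instead of using a reduction/collector.
    }
}
```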

Now on to this month's tips, tools, news, articles, and talks from our extensive community. And of course the tips from this month's articles and talks are, as ever, extracted into this month's tips page.


Java performance tuning related news.


Java performance tuning related tools.


Jack Shirazi


Last Updated: 2022-06-29
Copyright © 2000-2022 All Rights Reserved.
All trademarks and registered trademarks appearing on this site are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. This site is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.