Back to newsletter 173 contents
Tuning is an ongoing journey. Changes you made at one time can become redundant, or even make performance worse, as a system evolves. HTTP keep-alive is an interesting example of such a tuning journey. Let's look at that journey and the lessons we can learn from it.
1. At first, browsers made one connection at a time to the webserver to get a page. Early web pages were quite simple and network bandwidth was low, so this worked well. But as average bandwidth improved and websites became more ambitious, pages evolved to include multiple embedded resources. Each resource required a separate HTTP request, so full page rendering became annoyingly slow. Lesson: Make sure you have SLAs, and retest them after any change - performance drifts as a system changes, even from data changes alone.
2. To improve performance, browsers applied two generic performance tips - Lesson: Move non-UI delays to a background thread outside of the UI; and Lesson: parallelise slow operations. Specifically, browsers started rendering those elements of a partially downloaded page that could be rendered before the full download had completed; and they opened multiple simultaneous connections to fetch resources in parallel rather than sequentially. Network bandwidth had improved, so having multiple connections was now acceptable.
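The "parallelise slow operations on a background pool" lesson can be sketched in Java. This is a minimal illustration, not browser code: the `fetch` method is a hypothetical stand-in that just simulates network latency.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.stream.Collectors;

public class ParallelFetch {

    // Hypothetical stand-in for a real HTTP fetch - just simulates latency.
    static String fetch(String url) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "content-of-" + url;
    }

    // Fetch every resource concurrently on a background pool, keeping the
    // calling (UI) thread free until the results are actually needed.
    static List<String> fetchAll(List<String> urls) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<CompletableFuture<String>> futures = urls.stream()
                    .map(u -> CompletableFuture.supplyAsync(() -> fetch(u), pool))
                    .collect(Collectors.toList());
            return futures.stream()
                    .map(CompletableFuture::join)
                    .collect(Collectors.toList());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(fetchAll(List.of("style.css", "app.js", "logo.png")));
    }
}
```

With four worker threads, the three simulated fetches overlap instead of running back to back - the same win the browsers got from opening multiple connections.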
3. But too many connections can saturate your bandwidth and also put too heavy a load on webservers. Both browsers and websites responded with the classic tuning control mechanism - Lesson: use queueing and pooling to pipeline requests efficiently. Browsers limited the total number of concurrent connections (globally and per website), queueing further requests; webservers similarly served only a certain number of concurrent requests from any particular browser instance, queueing the rest. But one solution is seldom ideal for everyone, and amusingly some websites actually wanted MORE concurrent requests than browsers now allowed, so that their pages would download faster! To enable this they applied a further lesson - Lesson: break up your resources across multiple domains so that the page download is not delayed by the browser's per-website concurrent connection limit - and this is still a recommended performance tuning option even now.
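The browser-style "limit concurrency and queue the rest" mechanism is easily sketched with a fair semaphore. The class and its limit value here are illustrative assumptions, not anything specified by HTTP:

```java
import java.util.concurrent.Semaphore;

public class HostConnectionLimiter {

    private final Semaphore slots;

    public HostConnectionLimiter(int maxConcurrent) {
        // fair=true: requests over the limit queue and are served in arrival order
        this.slots = new Semaphore(maxConcurrent, true);
    }

    // Blocks (queues) the caller until a connection slot is free.
    public void acquire() throws InterruptedException { slots.acquire(); }

    // Non-blocking variant: take a slot only if one is free right now.
    public boolean tryAcquire() { return slots.tryAcquire(); }

    public void release() { slots.release(); }

    public int available() { return slots.availablePermits(); }

    public static void main(String[] args) throws InterruptedException {
        HostConnectionLimiter limiter = new HostConnectionLimiter(2);
        limiter.acquire();                        // connection 1
        limiter.acquire();                        // connection 2
        System.out.println(limiter.available());  // 0 - a third request would queue
        limiter.release();
        System.out.println(limiter.available());  // 1
    }
}
```

A browser would hold one such limiter per host plus a global one; a website that shards its resources across several domains effectively gets one limiter's worth of connections per domain, which is exactly why sharding speeds the page up.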
4. Profiling the average page download at this stage showed one common inefficiency: every request needed a separate connection to be set up, even when it was to the same webserver (Lesson: profile and eliminate identified inefficiencies). To eliminate this inefficiency, the first keep-alive capability was created: an implementation which allowed the browser to keep the connection open and make multiple requests over that same connection. Lesson: like most performance tuning changes, there is a benefit but also a cost. The cost here is that a connection remains open even when it's not being used, which means resources - particularly the all-important socket handles, which are limited on every server - can sit idle, whereas a non-keep-alive architecture only uses resources as needed. To manage this, a timeout was introduced controlling how long the connection stays alive waiting for a new request after completing the last one. Lesson: for most performance optimisations you can introduce a balancing parameter that lets you tune between the old implementation and the new one. Keep-alive is especially efficient if you make many HTTPS requests to the same site, since it avoids repeating the expensive TLS handshake. Keep-alive itself evolved to allow request pipelining (initiating requests before a previous request has completed), making the connection as efficient as possible. Lesson: Optimal use of a shared resource is achieved by multiplexing where possible.
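In modern Java this whole journey is built in: `java.net.http.HttpClient` (Java 11+) pools and reuses connections (keep-alive) automatically, and with HTTP/2 multiplexes concurrent requests over a single connection. A minimal sketch follows; the URL is illustrative, and `jdk.httpclient.keepalive.timeout` is the JDK networking property controlling the idle-connection timeout - the balancing parameter discussed above:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

public class KeepAliveSketch {

    // The client reuses pooled connections (keep-alive) automatically;
    // requesting HTTP/2 lets it multiplex requests over one connection
    // where the server supports it, falling back to HTTP/1.1 otherwise.
    static HttpClient newClient() {
        return HttpClient.newBuilder()
                .version(HttpClient.Version.HTTP_2)
                .connectTimeout(Duration.ofSeconds(10))
                .build();
    }

    public static void main(String[] args) {
        // Balancing parameter: how long (seconds) an idle pooled connection
        // is kept before being closed - the keep-alive timeout trade-off.
        System.setProperty("jdk.httpclient.keepalive.timeout", "30");

        HttpClient client = newClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("https://example.com/")).build(); // illustrative URL
        System.out.println(client.version()); // HTTP_2
        // Repeated client.send(request, ...) calls to the same host would
        // reuse the same pooled connection rather than reconnecting.
    }
}
```

Note that the request is built but not sent here; the point is that connection reuse and multiplexing need no application code at all - the client handles them.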
5. As sites' traffic grew, the number of connections increased and ever more resources were needed. Keep-alive connections had a higher resource cost, especially in memory, and the tuning recommendation flipped: websites were advised to disable keep-alive or keep the timeout very short, because CPU capability had increased much faster than memory capacity, so a site could scale to more concurrent load with keep-alive disabled. Lesson: A performance optimisation can become a performance bottleneck over time - keep testing your SLAs.
6. To attack the scaling limits imposed by webserver memory, newer webserver implementations eliminated this memory cost by moving to multiplexed IO and ensuring that inactive connections use negligible memory. The limiting factor for modern webservers is now primarily how many open socket handles the server's OS can manage. Once again the tuning recommendation has flipped, back to recommending that websites enable keep-alive! (Of course you may be using a less than optimal implementation - I'm talking about the latest, optimal ones.) Although in some scenarios having keep-alive enabled uses more concurrent connections on average, even here the recommendation tends to be to scale horizontally and load balance rather than disable keep-alive.
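Multiplexed IO is what makes idle keep-alive connections cheap: one thread services all sockets through a `Selector`, so an idle connection costs only its registration rather than a dedicated thread and stack. A stripped-down sketch of the pattern (a toy echo loop, not a production server):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

public class MultiplexedEcho {

    // One pass of the event loop: service whatever is ready, then return.
    // Idle (keep-alive) connections stay registered but consume no thread time.
    static int pollOnce(Selector selector, ServerSocketChannel server,
                        long timeoutMillis) throws IOException {
        int handled = 0;
        if (selector.select(timeoutMillis) == 0) {
            return 0; // nothing ready - thousands of idle sockets cost nothing here
        }
        Iterator<SelectionKey> it = selector.selectedKeys().iterator();
        while (it.hasNext()) {
            SelectionKey key = it.next();
            it.remove();
            if (key.isAcceptable()) {
                SocketChannel client = server.accept();
                client.configureBlocking(false);
                client.register(selector, SelectionKey.OP_READ);
            } else if (key.isReadable()) {
                SocketChannel client = (SocketChannel) key.channel();
                ByteBuffer buf = ByteBuffer.allocate(4096);
                if (client.read(buf) < 0) {
                    client.close(); // peer closed: free the socket handle
                } else {
                    buf.flip();
                    client.write(buf); // echo the data back
                }
            }
            handled++;
        }
        return handled;
    }

    public static void main(String[] args) throws IOException {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress("localhost", 0));
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);
            // With no clients connected, one poll finds nothing to do:
            System.out.println(pollOnce(selector, server, 100)); // 0
        }
    }
}
```

The design point matches the text: the per-connection cost shifts from memory (threads, buffers) to just an open socket handle, which is why the OS handle limit becomes the new ceiling.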
Now on to all our usual sections: links to tools, articles, news, talks and as ever, all the extracted tips from all of this month's referenced articles.
Java performance tuning related news.
synchronized(something) {
    while (didNotReceiveNotification()) {
        something.wait();
    }
    doSomething();
}
handleNotification();
Java performance tuning related tools.