Java Performance Tuning
Tips September 2011
But what about performance testing using virtualization? (Page last updated July 2011, Added 2011-09-27, Author Kirk Pepperdine, Publisher kodewerk). Tips:
- There are plenty of great reasons to use virtualization, but performance isn't one of them.
- If your test environment differs from your production environment you are likely to mask existing bottlenecks with phantom or artificially created ones.
- Fixing "issues" that don't actually exist in production is at best a waste of time and resources; at worst it can cause real issues in production.
- If you use virtualization in production, you *need* to be using it in the test environment.
- Performance and virtualization is about balancing risk: the risk of configuration mistakes versus the risk that virtualization simply gives you the wrong performance numbers.
- Most (observed) applications running in a virtualized environment were I/O bound; while attention is paid to ensuring that enough CPU is available, I/O provisioning is often poorly considered.
A Non-Foolish Consistency (Page last updated July 2011, Added 2011-09-27, Author Kyle Brandt, Publisher serverfault). Tips:
- The most common way to look at response time is to look at the average response time for the primary web request in a page load. You need to include page rendering time.
- A Content Delivery Network (CDN) can help reduce load times for people who are geographically far from the main servers.
- After getting page delivery and rendering times fast enough on average, you need to focus on the variation in the times measured. The average is insufficient to give you the user's view of performance, because users disproportionately remember the outliers.
- A simple way to measure the average and variance of response times for a server is to model the user request as closely as possible and periodically execute such requests as a simulated user, including all overheads such as network delivery and page rendering times. If you have no such data yet, start where you can: server request weblogs, or even network packet sniffing.
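The simulated-user measurement described in the last tip could be sketched roughly as follows. This is a minimal illustration, not code from the article: `ResponseTimeProbe` and its methods are hypothetical names, the request is a stand-in `Runnable` rather than a real page fetch and render, and the samples are taken back to back rather than on a schedule.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not from the original article): times a simulated
// user request repeatedly and summarizes the spread of the timings.
class ResponseTimeProbe {

    // Runs the request `samples` times and returns each elapsed time in milliseconds.
    static List<Double> sample(Runnable request, int samples) {
        List<Double> timesMs = new ArrayList<>();
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            request.run();
            timesMs.add((System.nanoTime() - start) / 1_000_000.0);
        }
        return timesMs;
    }

    static double mean(List<Double> values) {
        double sum = 0;
        for (double v : values) sum += v;
        return sum / values.size();
    }

    // Population variance: the average squared distance from the mean.
    static double variance(List<Double> values) {
        double m = mean(values);
        double sumSq = 0;
        for (double v : values) sumSq += (v - m) * (v - m);
        return sumSq / values.size();
    }

    public static void main(String[] args) {
        // Assumption: a 5 ms sleep stands in for the real request,
        // which should include network delivery and rendering.
        Runnable fakeRequest = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        List<Double> times = sample(fakeRequest, 20);
        System.out.printf("mean=%.2f ms, variance=%.2f%n", mean(times), variance(times));
    }
}
```

In a real setup the probe would probably run from a scheduler on a machine outside the data centre, so that the timings include the same network path a user sees.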
How to get C like performance in Java (Page last updated May 2011, Added 2011-09-27, Author Peter Lawrey, Publisher vanillajava). Tips:
- The JVM does implicit bounds checking on array access and updates. This has a small overhead - you can unsafely eliminate this (and open yourself to buffer overflows and other problems) using the Unsafe class or direct buffers.
- Use memory-minimized collections to reduce memory usage.
- You can use Direct memory to store data how you wish (this is what BigMemory uses).
- Use blocking IO in NIO (which is the default for a Channel) - don't use Selectors unless you need them.
- Most systems can handle 1K-10K threads efficiently. Scalability beyond 10K users/server doesn't buy you anything in the real world since the server resources will be consumed servicing 10k concurrent users.
- -XX:+UseCompressedStrings uses byte[] instead of char[] for strings which don't need 16-bit characters - this saves memory but is 5%-10% slower.
- To reduce string space usage, you can use your own Text type which wraps a byte[], or get your text data from ByteBuffer, CharBuffer or use Unsafe or -XX:+UseCompressedStrings.
- To start the JVM faster, load fewer libraries.
- Use primitives instead of primitive wrapper objects.
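As one possible illustration of the direct-memory and primitives tips above, the sketch below keeps fixed-size records of primitives in a direct `ByteBuffer` instead of allocating one object per record. The `OffHeapPrices` class and its 16-byte record layout are assumptions made up for this example, not something taken from the article.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical example type: a fixed-capacity table of (id, price) records
// held in direct (off-heap) memory rather than as one object per record.
class OffHeapPrices {
    // Assumed layout: an 8-byte long id followed by an 8-byte double price.
    private static final int RECORD_BYTES = Long.BYTES + Double.BYTES;

    private final ByteBuffer buffer;

    OffHeapPrices(int capacity) {
        buffer = ByteBuffer.allocateDirect(capacity * RECORD_BYTES)
                           .order(ByteOrder.nativeOrder());
    }

    void put(int index, long id, double price) {
        int offset = index * RECORD_BYTES;
        // Absolute accessors: the index is still range-checked by ByteBuffer.
        buffer.putLong(offset, id);
        buffer.putDouble(offset + Long.BYTES, price);
    }

    long id(int index) {
        return buffer.getLong(index * RECORD_BYTES);
    }

    double price(int index) {
        return buffer.getDouble(index * RECORD_BYTES + Long.BYTES);
    }

    public static void main(String[] args) {
        OffHeapPrices table = new OffHeapPrices(1_000);
        table.put(0, 42L, 9.5);
        System.out.println(table.id(0) + " -> " + table.price(0));
    }
}
```

The records live outside the Java heap, so they add no per-object header overhead and are not scanned by the garbage collector; the trade-off is that the layout has to be managed by hand.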
Java Threads on Steroids (Page last updated August 2011, Added 2011-09-27, Author Wojciech Kudla, Publisher JavaLobby). Tips:
- LMAX disruptor performs better than an ArrayBlockingQueue (by avoiding blocking).
- To eliminate the cost of context switching you need to force threads to run only on a specified set of CPUs - this is called processor affinity. In Java you can achieve this using the RealTime JVM, or use a JNI call into the kernel. Tests show this could speed up throughput significantly where context switching is significant.
Application Performance Monitoring in production - A Step-by-Step Guide - Part 1 (Page last updated April 2011, Added 2011-09-27, Author Michael Kopp, Publisher dynatrace). Tips:
- Useful application performance targets are response time, throughput, and concurrent users targets; CPU usage or other hardware, resource, OS or JVM statistics should be secondary targets.
- Users will typically tolerate 3-4 seconds of unresponsiveness before getting frustrated enough to give up.
- For transaction oriented applications, the main performance target is usually throughput (typically in total transactions completed per second).
- Good places to take measurements are the interfaces between distributed components of the application, e.g. page load (start and end), service requests (ingoing and outgoing), business transactions (response and service times), and total roundtrip times from initiation to final presentation.
- The closer your measurements come to being taken at the end-user, the closer it gets to the real world view of the performance (but typically the harder it is to measure).
- Averages are acceptable measurements for throughput targets, but not so useful for response times, because they ignore volatility - percentiles are a better option for response times, especially the 90th percentile and above.
- High volatility of response times is not desirable and indicates performance instabilities.
- You should try to categorize the different transaction types and different request types and provide separate measurements for each type, as different types usually have different performance profiles.
- Errors can easily get incorrectly categorized as fast transactions - an error in handling a request will often result in a much quicker response than correctly processing the request, and if not correctly categorized the error will be classed as a fast correct response, producing misleading statistics. Care should be taken to handle and measure errors separately in performance monitoring.
- Alerting should be careful to distinguish between occasional non-critical errors and frequent non-critical errors - when you get an increase in frequency of non-critical errors, this is often worth alerting on, as it tends to indicate a widespread problem that is likely to impact performance too.
- Monitor the separate tiers of the application, both transfer time between tiers and service time of the tiers. If a particular tier is identified as causing a bottleneck, increase monitoring details within that tier to identify the bottleneck more finely.
- The correct way to measure application performance improvement or degradation is with respect to the impact on the end user - other measures are inadequate.
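Several of the tips above (percentiles rather than averages, separate measurements per transaction type, and errors kept out of the timing statistics) can be combined in a small recorder. The following is a minimal sketch under those assumptions; `ResponseTimeStats` is a hypothetical name, and the nearest-rank percentile used here is one common definition among several.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical recorder: keeps successful response times per transaction
// type and counts errors separately so they cannot pose as fast responses.
class ResponseTimeStats {
    private final Map<String, List<Long>> successMillis = new HashMap<>();
    private final Map<String, Integer> errorCounts = new HashMap<>();

    void record(String type, long millis, boolean error) {
        if (error) {
            errorCounts.merge(type, 1, Integer::sum);
        } else {
            successMillis.computeIfAbsent(type, k -> new ArrayList<>()).add(millis);
        }
    }

    // Nearest-rank percentile over the successful samples of one type.
    long percentile(String type, int pct) {
        List<Long> sorted = new ArrayList<>(
                successMillis.getOrDefault(type, Collections.emptyList()));
        if (sorted.isEmpty()) {
            throw new IllegalStateException("no successful samples for " + type);
        }
        Collections.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.size() / 100.0);
        return sorted.get(Math.max(rank, 1) - 1);
    }

    double mean(String type) {
        List<Long> values = successMillis.getOrDefault(type, Collections.emptyList());
        if (values.isEmpty()) {
            throw new IllegalStateException("no successful samples for " + type);
        }
        long sum = 0;
        for (long v : values) sum += v;
        return (double) sum / values.size();
    }

    int errors(String type) {
        return errorCounts.getOrDefault(type, 0);
    }

    public static void main(String[] args) {
        ResponseTimeStats stats = new ResponseTimeStats();
        long[] searchTimes = {100, 110, 120, 130, 140, 150, 160, 170, 180, 2000};
        for (long t : searchTimes) stats.record("search", t, false);
        stats.record("search", 5, true); // a fast failure, counted as an error only
        System.out.println("mean=" + stats.mean("search")
                + " p90=" + stats.percentile("search", 90)
                + " p99=" + stats.percentile("search", 99)
                + " errors=" + stats.errors("search"));
    }
}
```

With the sample data in `main`, a single 2000 ms outlier pulls the mean up to 326 ms while the 90th percentile stays at 180 ms and the 99th percentile exposes the outlier, which is why the two kinds of figure are worth reporting separately.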
High Performance And Smarter Logging (Page last updated June 2011, Added 2011-09-27, Author Archanaa Panda, Publisher JavaLobby). Tips:
- Logs can be the very lifeline of your production application and should not be taken lightly or as an afterthought.
- Logging is the most common way of implementing application monitoring.
- Too much logging can result in useful incident logs being rolled over, thus defeating the very purpose of logging. Too much logging also has a very detrimental effect on performance from I/O waits, and also from thread blocking within the application if (as is common) there is a synchronization point in the logging stack.
- Changing the log level from DEBUG to INFO or WARNING can give significant performance benefits.
- Applications can use a special class of logs (redo logs) for recoverability if required.
- Where a repeated identical error occurs, your application should log the first in detail and then switch to restricted or summary logging of the repeated errors, to reduce unnecessary I/O and log spam.
- Consider whether you will support distributed logs (and all the management that implies) for your distributed application, or centralised logs using a centralised log server.
- Client applications need their logs accessible by the development and operations team - ensure you design for a mechanism to retrieve these logs efficiently.
- Log4J's Mapped Diagnostic Context (MDC) and Nested Diagnostic Context (NDC) use ThreadLocal storage to store context-specific information, e.g. a user name or transaction id that identifies all operations done by a particular user or transaction.
- Logging source code information (filename, line number, etc) can have overheads.
- Do not use logging as a replacement for other monitoring strategies - it can interfere too much with the very performance it is supposed to be measuring.
- A high performance logging solution is to combine centralized logging with a logger facade that uses integer codes instead of Strings for logs. A separate file lists the mapping between the error codes and the complete human decipherable String. This solution reduces memory, I/O, rollover and disk consumption. It also provides for potentially very efficient searching.
- SLF4J improves on Log4J by reducing intermediate garbage generation.
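The tip above about repeated identical errors (log the first in detail, then switch to summary logging) might look something like the sketch below. `RepeatedErrorLogger`, its `summaryEvery` parameter, and the message formats are all hypothetical; the sink is a plain `Consumer<String>` so that the example does not depend on any particular logging library.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Hypothetical wrapper: logs the first occurrence of an error in detail,
// then emits only a periodic summary line for repeats of the same key.
class RepeatedErrorLogger {
    private final Consumer<String> sink;
    private final int summaryEvery;
    // Lock-free counters, to avoid adding a synchronization point of our own.
    private final ConcurrentHashMap<String, AtomicInteger> counts = new ConcurrentHashMap<>();

    RepeatedErrorLogger(Consumer<String> sink, int summaryEvery) {
        if (summaryEvery < 1) {
            throw new IllegalArgumentException("summaryEvery must be at least 1");
        }
        this.sink = sink;
        this.summaryEvery = summaryEvery;
    }

    void error(String key, Throwable cause) {
        int n = counts.computeIfAbsent(key, k -> new AtomicInteger()).incrementAndGet();
        if (n == 1) {
            // First occurrence: full detail.
            sink.accept("ERROR " + key + ": " + cause);
        } else if (n % summaryEvery == 0) {
            // Repeats: one short summary line every `summaryEvery` occurrences.
            sink.accept("ERROR " + key + " repeated, " + n + " occurrences so far");
        }
    }

    public static void main(String[] args) {
        RepeatedErrorLogger log = new RepeatedErrorLogger(System.out::println, 100);
        for (int i = 0; i < 250; i++) {
            log.error("db-timeout", new RuntimeException("connection timed out"));
        }
    }
}
```

Run as shown, 250 identical failures produce three log lines (the first occurrence plus summaries at 100 and 200) instead of 250, which also reduces the chance of useful incident logs being rolled over.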
Last Updated: 2017-10-01
Copyright © 2000-2017 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.