Java Performance Tuning
Java(TM) - see bottom of page
Tips September 2010
Back to newsletter 118 contents
Java Memory Model From a Programmers Point-of-View (Page last updated July 2010, Added 2010-09-27, Author Yongjun Jiao, Publisher JavaLobby). Tips:
- The order of execution of statements is not guaranteed to be the order written in the code.
- Two operations are guaranteed not to be reordered if both reference the same memory location and at least one of them is a store (so they have a data dependency on each other).
- To enforce ordering of statements, you need to use synchronized to ensure one thread at a time processes the block.
- Accesses (reads or writes) to volatile variables cannot be reordered with each other, nor can they be reordered with normal field accesses around them.
- volatile variables cannot be allocated in CPU registers, which makes them less efficient than normal variables.
- from Section 17.5 of JLSv3 "Final fields allow programmers to implement thread-safe immutable objects without synchronization. A thread-safe immutable object is seen as immutable by all threads, even if a data race is used to pass references to the immutable object between threads. This can provide safety guarantees against misuse of an immutable class by incorrect or malicious code. Final fields must be used correctly to provide a guarantee of immutability."
- The write of the default value (zero, false or null) to each variable happens-before the first action in every thread. Conceptually every object is created at the start of the program with its default initialized values.
- A program is correctly synchronized if and only if all its sequentially consistent executions are free of data races.
- Synchronizing only on a write to a variable is incorrect (for thread-safety) as although the synchronization on the writer thread flushes the new value to main memory, the reader thread may still read its old cached value without synchronization.
- Accesses (reads or writes) to all Java primitive types and references (not the referenced objects) are atomic, except for non-volatile double and long fields, which may be written as two separate 32-bit halves.
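The volatile ordering rule above can be illustrated with a minimal sketch. The class and method names here (VolatilePublication, publish, awaitAndRead, demo) are hypothetical and do not come from the article; the sketch only assumes the guarantee that a normal write made before a volatile write is visible to a thread that later reads the volatile.

```java
// Hypothetical illustration: safe publication of a plain field through a volatile flag.
final class VolatilePublication {
    private int payload;             // plain field, written before the volatile flag
    private volatile boolean ready;  // the volatile write/read pair orders the accesses

    void publish(int value) {
        payload = value; // (1) normal write
        ready = true;    // (2) volatile write: (1) cannot be reordered after it
    }

    int awaitAndRead() {
        while (!ready) {   // (3) volatile read: spins until (2) becomes visible
            Thread.yield();
        }
        return payload;    // (4) guaranteed to observe the value written at (1)
    }

    // Runs a reader thread and a writer (the calling thread) and returns what the reader saw.
    static int demo() {
        VolatilePublication box = new VolatilePublication();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = box.awaitAndRead());
        reader.start();
        box.publish(42);
        try {
            reader.join(); // join() makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If ready were not volatile, the reader could spin forever or observe a stale payload, which is the failure mode the "synchronizing only on a write" tip describes.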
ESB Performance Pitfalls (Page last updated August 2010, Added 2010-09-27, Author Martin Vecera, Publisher JavaLobby). Tips:
- Parsing XML documents is an expensive operation - creating and parsing XML is the real performance blocker.
Thousands of Threads and Blocking I/O (Page last updated July 2010, Added 2010-09-27, Author Paul Tyma, Publisher ). Tips:
- An NIO-based server is notified when an I/O event is ready to be processed and then processes it. Because all I/O is effectively multiplexed, the server must keep track of where each client is within its I/O transaction, i.e. state must be maintained for all clients (unless a stateless protocol is used, e.g. all state is part of the request).
- NIO is not faster than IO, but it can be more scalable, though the scalability is an issue of how efficient the OS is at handling many threads.
- NIO transfer rates can be only 75% of those of a plain IO connection (several benchmark studies show this sort of comparative maximum rate).
- A multithreaded IO server tends to take advantage of multiple cores automatically, whereas an NIO server may explicitly need to hand processing off to a pool of worker threads (though that is the common design).
- On modern OSs, idle threads cost little, context switching is fairly efficient, and uncontended synchronization is cheap.
- Nonblocking data structures scale well: ConcurrentLinkedQueue, ConcurrentHashMap, NonBlockingHashMap (and NonBlockingHashMapLong).
- A good architecture throttles incoming requests to the maximum rate the server can handle optimally; otherwise, if the server gets overloaded, both overall throughput and individual request service times degrade to unacceptable levels.
- Avoid Executors.newCachedThreadPool, as an unbounded number of threads tends to be bad for applications (e.g. more threads get created just when you are already maxed out on CPU).
- If you do multiple sends per request, use a buffered stream. If you do one send per request, don't buffer (as you effectively already have).
- Try to keep everything in byte arrays if possible, rather than converting back and forth between bytes and strings.
- In a thread-per-request model, watch for socket timeouts.
- Multithreaded server coding is more intuitive than an event based server.
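One way to combine the throttling tip with the advice against Executors.newCachedThreadPool is a fixed-size pool with a bounded queue that rejects work once it is full. The following is a minimal sketch, not the design from the talk: the class name BoundedPoolDemo, its methods, and the pool and queue sizes are assumptions chosen for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: a bounded pool that sheds load instead of growing without limit.
final class BoundedPoolDemo {

    // Fixed number of threads plus a bounded queue; excess work is rejected (AbortPolicy).
    static ThreadPoolExecutor newBoundedPool(int threads, int queueCapacity) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(queueCapacity),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Submits 'requests' tasks that all block until released, and counts how many are rejected.
    static int countRejected(int threads, int queueCapacity, int requests) {
        ThreadPoolExecutor pool = newBoundedPool(threads, queueCapacity);
        CountDownLatch release = new CountDownLatch(1);
        int rejected = 0;
        try {
            for (int i = 0; i < requests; i++) {
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a slow request
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    rejected++; // a real server might answer "busy, retry later" here
                }
            }
        } finally {
            release.countDown(); // let the accepted tasks finish
            pool.shutdown();
        }
        return rejected;
    }
}
```

With 2 threads, a queue of 2, and 10 simultaneous slow requests, countRejected(2, 2, 10) returns 6: four requests are accepted (two running, two queued) and the rest are turned away rather than adding threads to an already saturated machine.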
How to Measure Application Performance (Page last updated July 2010, Added 2010-09-27, Author Alois Reitbauer, Publisher DynaTrace). Tips:
- If you are not measuring you are blind.
- Response times alone cannot tell you why it took that time to deliver that response.
- Components of application response time are: CPU time (time processing the request); Wait time (time waiting for a resource to become available); Suspension time (time when the system suspended execution to do some non-application processing, e.g. GC).
- CPU time tells you how computationally intensive your transaction is. Look at the relation between response time and CPU time. If increasing load on your system makes CPU usage grow in proportion to the number of transactions while response times stay stable, this is called linear scaling.
- Waiting for resources can mean waiting for a database connection, the response of a service call, file operations or other shared resources. If response times increase disproportionately with the number of requests, this means that your resource wait times are increasing.
- Split out response time and its components by application components such as business layer, database layer, communication layer, etc. If this is possible, this information can identify the exact component that is causing bad response times.
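The split between CPU time and the rest of the response time can be approximated for a single thread with the standard ThreadMXBean. This is a rough sketch under stated assumptions: the class ResponseTimeBreakdown and its methods are hypothetical, the transaction runs entirely on the calling thread, and wait time and suspension time are lumped together as "non-CPU" time because this API cannot separate them.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Hypothetical illustration: elapsed time versus CPU time for one transaction on one thread.
final class ResponseTimeBreakdown {
    final long elapsedNanos; // wall-clock response time
    final long cpuNanos;     // CPU time used by the calling thread, or -1 if unavailable

    private ResponseTimeBreakdown(long elapsedNanos, long cpuNanos) {
        this.elapsedNanos = elapsedNanos;
        this.cpuNanos = cpuNanos;
    }

    // Wait time plus suspension time (not separable here), or -1 if CPU time is unavailable.
    long nonCpuNanos() {
        return cpuNanos < 0 ? -1 : Math.max(0L, elapsedNanos - cpuNanos);
    }

    static ResponseTimeBreakdown measure(Runnable transaction) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable = threads.isCurrentThreadCpuTimeSupported()
                && threads.isThreadCpuTimeEnabled();
        long cpuStart = cpuAvailable ? threads.getCurrentThreadCpuTime() : -1L;
        long wallStart = System.nanoTime();
        transaction.run();
        long elapsed = System.nanoTime() - wallStart;
        long cpu = cpuAvailable ? threads.getCurrentThreadCpuTime() - cpuStart : -1L;
        return new ResponseTimeBreakdown(elapsed, cpu);
    }

    // Stand-in for waiting on a resource such as a database connection.
    static void simulateWait(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

For example, ResponseTimeBreakdown.measure(() -> ResponseTimeBreakdown.simulateWait(50)) should report an elapsed time of roughly 50 ms with very little CPU time, the signature of a transaction dominated by waiting rather than computation.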
7 Scaling Strategies Facebook Used to Grow to 500 Million Users (Page last updated August 2010, Added 2010-09-27, Author Todd Hoff, Publisher ). Tips:
- Scale (and architect in order to scale) to an arbitrary number of machines, for all parts of the system.
- Break things up into many distinct parts so that you can make small changes rolled out on a few machines to a few users at a time, and measure the benefits/impacts.
- System stability is increased by incremental change because you know sooner if a particular strategy is working. It's easier to figure out where things go wrong when dealing with smaller increments.
- Measure both system and application level statistics to know what's happening.
- Check what's happening in the 95th or 99th percentile, as averages hide important issues.
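A small worked example shows how an average can hide a slow tail. The sketch below uses the nearest-rank percentile definition; the class LatencyStats and the sample data (98 requests at 10 ms, 2 requests at 1000 ms) are made up for illustration and are not measurements from the article.

```java
import java.util.Arrays;

// Hypothetical illustration: mean versus nearest-rank percentiles of response times.
final class LatencyStats {

    static double mean(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // 'percent' percent of all samples are less than or equal to it.
    static long percentile(long[] samples, int percent) {
        if (samples.length == 0 || percent < 1 || percent > 100) {
            throw new IllegalArgumentException("need samples and 1 <= percent <= 100");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Integer form of ceil(percent * n / 100), avoiding floating-point rounding.
        int rank = (int) (((long) percent * sorted.length + 99) / 100);
        return sorted[rank - 1];
    }

    // Made-up data: 98 fast requests (10 ms) and 2 very slow ones (1000 ms).
    static long[] sampleLatencies() {
        long[] samples = new long[100];
        Arrays.fill(samples, 10L);
        samples[98] = 1000L;
        samples[99] = 1000L;
        return samples;
    }
}
```

For this data the mean is 29.8 ms and the 95th percentile is 10 ms, both of which look healthy, while the 99th percentile is 1000 ms and exposes the requests that took a full second.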
Java Best Practices - High performance Serialization (Page last updated July 2010, Added 2010-09-27, Author Justin Cater, Publisher javacodegeeks). Tips:
- If you don't explicitly set a serialVersionUID class attribute, the serialization mechanism has to compute it by going through all the fields and methods to generate a hash, which can be quite slow.
- With the default serialization mechanism, all the serializing class description information is included in the stream, including descriptions of the instance, the class and all the serializable superclasses.
- Externalization eliminates almost all the reflective calls used by the Serialization mechanism and gives you complete control over the marshalling and demarshalling algorithms, resulting in dramatic performance improvements. However, Externalization requires you to rewrite your marshalling and demarshalling code whenever you change your class definitions.
- Use simpler data representations to serialize objects where possible, e.g. just the timestamp instead of a Date object.
- You can eliminate serializing null values by serializing meta information about which fields are being serialized.
- Google protobuf is an alternative serialization mechanism with good size advantages when using compression.
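Several of the tips above (an explicit serialVersionUID, Externalization, and a timestamp in place of a Date object) can be combined in one small sketch. The classes DefaultEvent and CompactEvent and the helper methods are hypothetical examples, not code from the article, and the sketch compares stream sizes only, not speed.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.Date;

// Hypothetical illustration: default serialization versus a hand-written external form.
final class SerializationDemo {

    // Default mechanism: field descriptions and the nested Date's class description
    // are written to the stream along with the data.
    static final class DefaultEvent implements Serializable {
        private static final long serialVersionUID = 1L; // explicit, so it is never computed
        final String name;
        final Date when;
        DefaultEvent(String name, Date when) { this.name = name; this.when = when; }
    }

    // Externalizable: we decide exactly which bytes are written, here a UTF string and a long.
    public static final class CompactEvent implements Externalizable {
        private static final long serialVersionUID = 1L;
        String name;
        long whenMillis; // simpler representation than a Date object

        public CompactEvent() { } // Externalizable requires a public no-arg constructor
        CompactEvent(String name, long whenMillis) { this.name = name; this.whenMillis = whenMillis; }

        @Override public void writeExternal(ObjectOutput out) throws IOException {
            out.writeUTF(name);
            out.writeLong(whenMillis);
        }

        @Override public void readExternal(ObjectInput in) throws IOException {
            name = in.readUTF();       // must read fields in the order they were written
            whenMillis = in.readLong();
        }
    }

    static byte[] toBytes(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static Object fromBytes(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Serializing the same event both ways and comparing toBytes(...).length should show the externalized form to be smaller. The cost is the maintenance burden noted above: writeExternal and readExternal must be updated by hand whenever the class changes.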
Last Updated: 2018-04-29
Copyright © 2000-2018 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.