Java Performance Tuning
Tips May 22nd, 2003
Continuous Performance (Page last updated 2003 April, Added 2003-05-22, Author Cody Menard, Publisher TheServerSide). Tips:
- It isn't unusual for Java development teams to spend 20 percent of their time - up to a week every month - tediously fixing performance problems
- [Article advocates continuous performance testing throughout project development, i.e. tune while you build. The trouble, from my point of view, is having performance goals available to identify bottlenecks at all stages of development, without wasting resources on bottlenecks that don't matter in the final application].
Understanding Performance in Web Service Development (Page last updated 2003 April, Added 2003-05-22, Author Peter Varhol, Publisher Web Services Journal). Tips:
- Individual application components may not show performance problems but, on integration, performance bottlenecks may result from interactions between components.
- You need to measure throughput: the ability to accept a request and provide a response in accordance with required performance parameters.
- You should measure the performance of all of the application components simultaneously, during the same test run.
- Bottlenecks can come from application components, resource or processing issues, synchronization problems, networking, or data throughput.
- Carefully monitor and optimize the amount of data transmitted in each SOAP call.
- A chatty (fine-grained) interface passing less complex data may turn out to be less computationally expensive because its marshalling isn't as complex.
- Chunky calls (coarse-grained) are more efficient because they reduce the overall volume of calls
Caching in J2EE Architectures (Page last updated 2003 April, Added 2003-05-22, Author Helen Thomas, Publisher JDJ). Tips:
- Caching objects in memory, called in-process caching, can reduce object creation and destruction overhead and provide fast object access.
- Longer lived objects are much more expensive for the garbage collector than short-lived objects
- Soft (weak) references are often used to cache objects. However they can impose a heavy overhead on the garbage collector
- Article compared different types of caches: no-cache provided constant load; soft reference cache provided a sawtooth load with spikes when memory filled and the cache was reclaimed; a hard reference cache crashed with an out-of-memory error;
- A soft reference cache is more efficient than no cache if the cache does not need to be reclaimed. If the cache needs to be reclaimed regularly, a soft reference cache can be less efficient than no cache
- In memory-constrained application environments, in-process caching can be detrimental to application performance because of the additional garbage collection overheads
- An external cache using serialized objects, where the cache server handles all the cache and resource management, is an efficient alternative to caching objects with high external resource costs. The external cache imposes no additional garbage collection overheads on the JVM, which is the reason for its efficiency profile.
- In-memory caches also have distributed cache coherency problems which an external cache can solve or reduce.
- An external cache imposes the overhead of interprocess calls and marshalling.
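The soft reference cache described above can be sketched as follows. This is a minimal illustration, not the article's implementation: the class name `SoftCache` and the loader function are assumptions made for the example. Values are held through `java.lang.ref.SoftReference`, so the garbage collector may reclaim them when memory runs low, after which they are rebuilt on the next request.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical in-process cache whose values the GC may reclaim under memory pressure. */
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader; // rebuilds a value after a miss

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // The value is null if the key was never cached or the GC already cleared the reference.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key); // pay the creation cost again
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }
}
```

Every reclaimed entry costs a reload, which is consistent with the tip that a regularly reclaimed soft reference cache can end up less efficient than no cache at all.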
Errant Architectures (Page last updated 2003 April, Added 2003-05-22, Author Martin Fowler, Publisher Software Development Magazine). Tips:
- Transparency is valuable, but while many things can be made transparent in distributed objects, performance isn't usually one of them.
- Don't distribute objects for performance reasons. It doesn't work.
- A procedure call within a process is extremely fast. A procedure call between two separate processes is orders of magnitude slower. Make that a process running on another machine, and you can add another order of magnitude or two.
- A fine-grained interface doesn't work well when it's remote.
- Remote calls are best coarse-grained, designed not for flexibility and extendibility but for minimizing calls.
- Any object that may be used remotely should have a coarse-grained interface, while every object that isn't used remotely can have a fine-grained interface (and should for good design).
- It only makes sense to pay the remote call cost when you need to, and so you need to minimize the number of interprocess collaborations.
- You can't just take a group of classes that you design in the world of a single process, throw some distribution architecture at them and come up with a distributed model. Distributed design is more than that.
- If you base your distribution strategy on classes, you'll end up with a system that does a lot of remote calls and thus needs awkward, coarse-grained interfaces.
- Even with coarse-grained interfaces on every remotable class, you'll still end up with too many remote calls and a system that's awkward to modify as a bonus.
- Fowler's First Law of Distributed Object Design: Don't distribute your objects.
- Distribute applications using clustering: put all the classes into a single process and then run multiple copies of that process on the various nodes.
- Unavoidable distribution boundaries include: client/server separation; database communications; web server/application server interface.
- Remote Facades are useful for limiting distribution boundaries. Remote Facades provide a remote interface to fine-grained objects.
- Don't be transparent about a potential remote call.
- Data Transfer Objects (also known as Value Objects) allow you to bundle data into fewer transfers, thus reducing remote calls.
- Applications can see a significant performance improvement by replacing an XML-based interface with a (binary) remote call.
- Use XML Web services only when a more direct approach isn't possible.
- Asynchronous message based systems are probably more efficient than synchronous RPC based systems.
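As a rough illustration of the Remote Facade and Data Transfer Object tips above, the sketch below keeps a fine-grained `Customer` object inside the process and exposes a single coarse-grained call that returns all of its fields at once. All type and method names here are hypothetical, and the facade is implemented in-process; the remoting layer itself is left out.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Fine-grained domain object, meant for in-process use only (hypothetical). */
final class Customer {
    private final String name;
    private final String city;
    private final String email;

    Customer(String name, String city, String email) {
        this.name = name;
        this.city = city;
        this.email = email;
    }

    String getName() { return name; }
    String getCity() { return city; }
    String getEmail() { return email; }
}

/** Data Transfer Object: bundles the fields so a single call carries them all. */
final class CustomerDto implements Serializable {
    private static final long serialVersionUID = 1L;
    final String name;
    final String city;
    final String email;

    CustomerDto(String name, String city, String email) {
        this.name = name;
        this.city = city;
        this.email = email;
    }
}

/** Remote Facade: the only interface a remote client would see (hypothetical). */
interface CustomerFacade {
    CustomerDto findCustomer(String id); // one coarse-grained call instead of three getters
}

/** In-process implementation; a real system would export it through its remoting layer. */
final class LocalCustomerFacade implements CustomerFacade {
    private final Map<String, Customer> customers = new HashMap<>();

    void register(String id, Customer customer) {
        customers.put(id, customer);
    }

    @Override
    public CustomerDto findCustomer(String id) {
        Customer c = customers.get(id);
        if (c == null) {
            return null; // unknown id
        }
        return new CustomerDto(c.getName(), c.getCity(), c.getEmail());
    }
}
```

A remote client calling `findCustomer` pays for one round trip, whereas calling the three getters remotely would pay for three.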
Performance of Lists (Page last updated 2003 April, Added 2003-05-22, Author karschten, Publisher JPTC). Tips:
- ArrayList is quite fast for accessing elements in direct or random order, but for purely sequential use LinkedList might be the faster List.
- Vector is slower than ArrayList.
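The two access patterns in the tips above can be compared with a small harness like the one below. It is only a sketch: the class `ListAccess` and its methods are made up for the example, and the timings depend heavily on the JVM, list size, and warm-up, so treat any numbers it prints as a starting point for measurement rather than a verdict.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

final class ListAccess {
    /** Access by index: constant time per get() on ArrayList, linear per get() on LinkedList. */
    static long sumByIndex(List<Integer> list) {
        long sum = 0;
        for (int i = 0; i < list.size(); i++) {
            sum += list.get(i);
        }
        return sum;
    }

    /** Sequential access through the iterator: efficient for both implementations. */
    static long sumSequential(List<Integer> list) {
        long sum = 0;
        for (int value : list) {
            sum += value;
        }
        return sum;
    }

    static List<Integer> fill(List<Integer> list, int n) {
        for (int i = 0; i < n; i++) {
            list.add(i);
        }
        return list;
    }

    public static void main(String[] args) {
        int n = 20_000;
        List<Integer> arrayList = fill(new ArrayList<>(), n);
        List<Integer> linkedList = fill(new LinkedList<>(), n);

        long t0 = System.nanoTime();
        sumSequential(arrayList);
        long t1 = System.nanoTime();
        sumSequential(linkedList);
        long t2 = System.nanoTime();
        sumByIndex(linkedList); // expected to be the slow case: each get(i) walks the list
        long t3 = System.nanoTime();

        System.out.println("ArrayList sequential ns:  " + (t1 - t0));
        System.out.println("LinkedList sequential ns: " + (t2 - t1));
        System.out.println("LinkedList by index ns:   " + (t3 - t2));
    }
}
```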
Select for high-speed networking (Page last updated 2003 April, Added 2003-05-22, Author Greg Travis, Publisher Javaworld). Tips:
- The stream model (java.io) is very flexible, but not always the fastest. A buffer model (java.nio) allows data to be dealt with in large blocks, which can maximize throughput.
- The main advantage of buffers is that they deal with data in bulk.
- In some cases buffer implementations can represent system-level buffers which means you can read and write data with minimal data copying.
- Blocking I/O requires one thread per concurrent open I/O stream. Threads can be resource-intensive. Under many implementations, each thread can occupy a sizeable amount of memory, even if it's not doing anything. And a performance hit can result from having too many threads.
- NIO Select allows you to use one thread to handle multiple open I/O channels.
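To make the last tip concrete, here is a minimal sketch of a single-threaded echo server built on `java.nio.channels.Selector`. The class name `SelectEchoServer` is an assumption for this example, and the code takes shortcuts a production server would not: the write loop spins if the socket buffer is full, and any I/O error stops the whole loop.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ClosedSelectorException;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.util.Iterator;

/** Hypothetical echo server: one thread services every connection through a Selector. */
final class SelectEchoServer implements Runnable, AutoCloseable {
    private final Selector selector;
    private final ServerSocketChannel server;

    SelectEchoServer() throws IOException {
        selector = Selector.open();
        server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress("127.0.0.1", 0)); // port 0: the OS picks a free port
        server.configureBlocking(false);
        server.register(selector, SelectionKey.OP_ACCEPT);
    }

    int port() throws IOException {
        return ((InetSocketAddress) server.getLocalAddress()).getPort();
    }

    @Override
    public void run() {
        ByteBuffer buffer = ByteBuffer.allocate(4096); // one buffer shared by all channels
        try {
            while (selector.isOpen()) {
                selector.select(); // blocks until at least one channel is ready
                if (!selector.isOpen()) {
                    break;
                }
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();
                    if (!key.isValid()) {
                        continue;
                    }
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client != null) {
                            client.configureBlocking(false);
                            client.register(selector, SelectionKey.OP_READ);
                        }
                    } else if (key.isReadable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        buffer.clear();
                        int n = client.read(buffer);
                        if (n < 0) { // peer closed the connection
                            client.close();
                            continue;
                        }
                        buffer.flip();
                        while (buffer.hasRemaining()) {
                            client.write(buffer); // simplification: busy-waits if the socket is full
                        }
                    }
                }
            }
        } catch (IOException | ClosedSelectorException e) {
            // Selector closed or I/O failure: stop the loop (a real server would log and recover).
        }
    }

    @Override
    public void close() throws IOException {
        selector.close(); // wakes up a blocked select()
        server.close();
    }
}
```

Running the server takes one thread, for example `new Thread(new SelectEchoServer()).start()`, no matter how many clients connect, in contrast to the one-thread-per-stream cost of blocking I/O.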
Watch your HotSpot compiler go (Page last updated 2003 April, Added 2003-05-22, Author Vladimir Roubtsov, Publisher Javaworld). Tips:
- HotSpot compiles methods after a while, causing a temporary slowdown followed by faster method execution.
- Use the -XX:+PrintCompilation HotSpot option to see which methods are being compiled (and when they are compiled).
- Warming up code by running it for a while can be useful in some specialized situations, such as for more accurate timing.
- The -XX:CompileThreshold option lets you specify how many invocations of a method are needed before it is compiled (default 1500 for client mode and 10,000 for server mode).
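The warm-up idea can be tried with a small program such as the sketch below, which times the same batch of calls twice so the second pass runs after HotSpot has had a chance to compile the method. The class `WarmUp` and its workload are hypothetical, and the call counts are arbitrary; whether compilation actually happens during the first pass depends on the JVM and its settings.

```java
final class WarmUp {
    // Written to on every pass so the JIT cannot discard the computed results.
    static volatile long sink;

    /** Hypothetical workload: sum of squares below n. */
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    /** Returns the elapsed nanoseconds for the given number of calls to work(). */
    static long timeCalls(int calls) {
        long total = 0;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            total += work(1_000);
        }
        long elapsed = System.nanoTime() - start;
        sink = total;
        return elapsed;
    }

    public static void main(String[] args) {
        long cold = timeCalls(20_000); // may include interpretation and compilation time
        long warm = timeCalls(20_000); // same work after the warm-up pass
        System.out.println("cold ns: " + cold + ", warm ns: " + warm);
    }
}
```

Running it as `java -XX:+PrintCompilation WarmUp` prints the compilation events alongside the two timings, which makes it easier to see whether `work` was compiled during the first pass.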
Proactive Application Monitoring (Page last updated 2003 April, Added 2003-05-22, Author Alexandre Polozoff, Publisher IBM). Tips:
- Proactive Application Monitoring allows you to detect and respond to problems before end users are even aware that a problem exists.
- Care must be taken to avoid synchronizing logging in distributed environments or serialized execution can result, severely limiting the scalability and performance of the application.
- Logging to files can severely slow down an enterprise application by impacting the server's file I/O subsystem. Server caches and other mechanisms should be configured to minimize such hits, but this may still be a serious and unavoidable bottleneck, especially in high volume situations where the application is continually sending data to the log.
- Application monitoring tools automate the logging process and reduce the chances that the server process will be adversely compromised.
- Monitor Network latency (Ping time and network bandwidth measurements) for Timings > 1000 ms or network bandwidth maxed
- Monitor CPU utilization (all servers) for utilization > 80% over x minutes
- Monitor Memory utilization (all servers) for utilization > 80% over x minutes
- Monitor Paging/swapping (OS level, all servers) for paging/swapping beyond background levels
- Monitor File system available file space (all servers) for space > 80% used
- Monitor Network components SNMP traps for Degraded counters
- Monitor Java naming server using scripts to run JNDI queries for response times > 3 secs
- Monitor Average servlet and JSP response times for response times > 8 secs
- Monitor The EJB container average response time for response times > 900 ms
- Monitor SQL INSERT, UPDATE, DELETE statements for response times > 1600 ms
- Monitor Gateways average response times for response time > 1 second
- Monitor Web server for response time retrieving 1K GIF > 1 second
- Monitor Databases for average response times > 1000 ms
- Monitor Message queues for average response times > 200 ms and queue depths exceeding 500
- Monitor The application, e.g. for complex page requests > 10 secs
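A few of the thresholds listed above can be expressed as a simple check, sketched below. The class `ThresholdMonitor` and the metric names are invented for this example; only the numeric limits (80% CPU, 900 ms EJB response, 1000 ms database response, queue depth 500) come from the tips. How the samples are collected is outside the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical threshold checker: flags any sampled metric above its limit. */
final class ThresholdMonitor {
    private final Map<String, Double> limits = new LinkedHashMap<>();

    ThresholdMonitor limit(String metric, double max) {
        limits.put(metric, max);
        return this;
    }

    /** Returns the names of the metrics in the sample that exceed their limits. */
    List<String> breaches(Map<String, Double> sample) {
        List<String> exceeded = new ArrayList<>();
        for (Map.Entry<String, Double> entry : limits.entrySet()) {
            Double value = sample.get(entry.getKey());
            if (value != null && value > entry.getValue()) {
                exceeded.add(entry.getKey());
            }
        }
        return exceeded;
    }

    /** Limits taken from the tips above; the metric names are made up for this sketch. */
    static ThresholdMonitor withListedLimits() {
        return new ThresholdMonitor()
                .limit("cpu.percent", 80)
                .limit("ejb.avgResponseMs", 900)
                .limit("db.avgResponseMs", 1000)
                .limit("queue.depth", 500);
    }
}
```

A monitoring loop would call `breaches` on each new sample and raise an alert for every name returned, ideally before end users notice the problem.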
Practical examples for improving system responsiveness (Page last updated 2003 April, Added 2003-05-22, Author Cameron Laird, Publisher IBM). Tips:
- Users need to know quickly and reliably how long an activity will take. Use progress monitors, progress bars, stopwatches, etc.
- You can solve many apparent performance problems just by teaching your programs to show what they're doing.
- Superior algorithms often outperform inferior ones by a factor of a thousand or more.
- Avoid sorting by indexing on insertion. Insertion time increases but ordered lookup is much faster.
- Avoid sorting by using secondary structures to hold data in the alternative required order.
- Memory impact can dominate performance if it exceeds memory boundaries (e.g. process starts paging).
- An increase in main memory is an inexpensive improvement that often yields dramatic performance results.
- Disk drive combinations can easily have faults which do not lead to data loss, but do lead to performance problems, such as a RAID unit that has lost one spindle. Monitor disks and use reliable ones.
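The two "avoid sorting" tips above can be illustrated with a secondary structure that is kept in order as items are inserted. In this sketch the `Directory` class is hypothetical, names are assumed to be unique, and `java.util.TreeMap` plays the role of the ordered index.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical directory that never sorts: ordering is maintained on insertion. */
final class Directory {
    // Primary structure: fast lookup by id.
    private final Map<Integer, String> nameById = new HashMap<>();
    // Secondary structure: the same entries kept in name order (assumes unique names).
    private final TreeMap<String, Integer> idByName = new TreeMap<>();

    void add(int id, String name) {
        nameById.put(id, name);
        idByName.put(name, id); // O(log n) insert keeps the index ordered
    }

    String nameOf(int id) {
        return nameById.get(id);
    }

    /** Ordered listing without a sort step. */
    List<String> namesInOrder() {
        return new ArrayList<>(idByName.keySet());
    }

    /** Ordered lookup: the first name at or after the given text, or null if none. */
    String firstNameAtOrAfter(String text) {
        return idByName.ceilingKey(text);
    }
}
```

Each insertion is slower than appending to an unsorted list, but ordered listings and range lookups no longer need a sort.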
Minimize Contention (Page last updated 2003 March, Added 2003-05-22, Author Ted Neward, Publisher Neward). Tips:
- Scalable means that we can achieve higher throughput in the system as demand grows by adding hardware to the system without redesign.
- Contention refers to the conflict that arises when multiple concurrent operations try to access a shared resource.
- The scalability of a given system is constrained by contention for shared resources within the system.
- [Article gives a nice example of contention restraints causing performance problems].
- Each operation trying to acquire exclusive access to a resource may already hold exclusive access to other resources, forcing even more concurrent operations to wait. This widens the bottleneck, and system throughput suffers seriously as a result.
- Eliminate or minimize that contention, and the system becomes more scalable.
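One common way to reduce contention in Java, offered here as an illustration rather than as the article's own example, is to avoid a single lock shared by all operations. The hypothetical `HitCounter` below uses `ConcurrentHashMap` and `LongAdder` from `java.util.concurrent`, so threads updating different pages, or even the same page, rarely block one another.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical page-hit counter designed to keep threads from waiting on one shared lock. */
final class HitCounter {
    private final ConcurrentHashMap<String, LongAdder> hits = new ConcurrentHashMap<>();

    void record(String page) {
        // computeIfAbsent creates the adder once per page; increment() spreads updates
        // across internal cells instead of serializing every thread on one variable.
        hits.computeIfAbsent(page, p -> new LongAdder()).increment();
    }

    long count(String page) {
        LongAdder adder = hits.get(page);
        return (adder == null) ? 0L : adder.sum();
    }
}
```

The alternative of wrapping a plain `HashMap<String, Long>` in `synchronized` methods would make every update from every thread contend for the same lock, which is the kind of shared-resource bottleneck the tips describe.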
6 Tips for High-Performance Java Apps (Page last updated 2003 March, Added 2003-05-22, Author Peter Varhol, Publisher ftponline). Tips:
- The garbage collector can be a significant performance bottleneck so allocate memory wisely.
- The more memory your code allocates, the more often the garbage collector has to run, and for longer time periods.
- Use profiling tools.
- Threads can perform actions independently and provide the appearance of faster overall performance.
- Poorly designed thread use can lead to unnecessary memory consumption and deadlocks.
- Test methodically.
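As a small illustration of the allocation tips above, the sketch below builds the same string two ways. The class `JoinParts` is hypothetical. The first method creates a new `String` on every loop iteration, while the second appends into one `StringBuilder`, giving the garbage collector less to clean up.

```java
final class JoinParts {
    /** Allocation-heavy: each += creates a new String holding everything joined so far. */
    static String joinWasteful(String[] parts) {
        String result = "";
        for (String part : parts) {
            result += part + ",";
        }
        return result;
    }

    /** Allocation-light: one builder grows in place and a single String is created at the end. */
    static String joinFrugal(String[] parts) {
        StringBuilder builder = new StringBuilder();
        for (String part : parts) {
            builder.append(part).append(',');
        }
        return builder.toString();
    }
}
```

Both methods return the same text; a profiler is the right tool for confirming whether the difference matters in a given application.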
Last Updated: 2021-07-28
Copyright © 2000-2021 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.