Java Performance Tuning

The Interview: Brian Goetz, January 29th, 2003

Back to newsletter 026 contents

Brian Goetz writes the IBM developerWorks "Java theory and practice" column and serves on several Java expert groups. This month, Brian took some time to answer our questions.

Q. Can you tell us a bit about yourself?

I am currently a Principal Consultant at Quiotix, a software consulting firm in Los Altos, CA. Over the past 20 years, I've done work in kernel internals, device drivers, protocol implementations, compilers, server applications, web applications, scientific computing, data visualization, and enterprise infrastructure tools. I participate in a number of open source projects, including the Lucene text search and retrieval system, and the WebMacro template framework. I also write a monthly column in IBM developerWorks Java Zone and have contributed numerous feature articles, including several on performance, to JavaWorld and IBM developerWorks Java Zone.

Q. What do you consider to be the biggest Java performance issue currently?

This is probably going to be an unpopular answer, but I think it is XML. Don't get me wrong, XML is a powerful framework for representing and validating structured data, but XML, or more precisely the overuse/misuse of XML (and the technologies surrounding it, such as XSL), has been one of the biggest sources of performance problems in many of the projects I've worked with.

XML and related technologies activate all the standard performance warnings -- lots of intermediate temporary objects, heavily layered implementations, extensive use of "interchange types", and complex specifications that make efficient implementation difficult. Choosing XML-based technologies can not only increase your memory and bandwidth requirements by several orders of magnitude but, if you are not careful, you'll end up with an application whose structure is driven by the structure of your XML tools, rather than by your business process requirements.

Q. Given what you say about XML, have you got any specific advice to our readers on how to reduce or prepare for those XML overheads?

If possible, try not to let XML into the core of your application. Use it at the periphery -- configuration, preferences, data import, data export -- but not as your in-memory data representation for core processing. Using it as your in-memory representation not only subjects you to negative performance impact, but it ties your implementation and class hierarchy way too tightly to the structure of how the XML parsing tools work.
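One way to picture keeping XML at the periphery is a small reader that converts the document into a plain object as soon as it is parsed. This is only a sketch: the `ServerConfig` type, the `ServerConfigXmlReader` class, and the `host`/`port` attribute names are hypothetical, chosen for illustration. The parsing calls are the standard JAXP DOM API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical domain type: a plain object with no dependency on any XML API.
final class ServerConfig {
    final String host;
    final int port;

    ServerConfig(String host, int port) {
        this.host = host;
        this.port = port;
    }
}

// Hypothetical boundary class: the only place that knows the data arrives as XML.
final class ServerConfigXmlReader {
    static ServerConfig parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            // Copy the values out; the DOM tree is not retained after this method returns.
            return new ServerConfig(
                    root.getAttribute("host"),
                    Integer.parseInt(root.getAttribute("port")));
        } catch (Exception e) {
            // Malformed XML or a non-numeric port ends up here.
            throw new IllegalArgumentException("Invalid server configuration", e);
        }
    }
}
```

The rest of the application would then work only with `ServerConfig`, so swapping the XML parser or the file format touches one class.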

Q. Can you tell us a little bit about JSR 166?

The goal of JSR 166 is to produce a set of basic concurrency utilities which can serve as the building blocks of server applications. Server application developers need simple facilities to enforce mutual exclusion, synchronize responses to events, communicate data across multiple cooperating activities, and asynchronously schedule tasks. The low-level synchronization primitives Java provides -- wait, notify, and synchronized -- are effective, but they are too low-level for most applications, difficult to use, and easy to misuse. JSR 166 will be contributing a package of utilities as the java.util.concurrent package, including efficient thread pools, thread-safe collection classes, synchronization primitives such as mutexes and semaphores, and utilities for reliably exchanging data between threads. Many of the elements will be familiar to users because they have their roots in the widely used util.concurrent package by Doug Lea.
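As a rough illustration of the kinds of building blocks described here, the sketch below combines a thread pool, a thread-safe map, and a semaphore. It assumes the `java.util.concurrent` API as it later shipped in the JDK, plus Java 8 lambdas, so it may not match the drafts under discussion at the time of this interview. The `ConcurrencySketch` class and the word-counting task are hypothetical.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Hypothetical example class, not part of JSR 166 itself.
final class ConcurrencySketch {
    // Counts words across lines, one pooled task per line.
    static ConcurrentHashMap<String, Integer> countWords(List<String> lines) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        Semaphore permits = new Semaphore(2);                   // at most 2 tasks in the guarded section
        ExecutorService pool = Executors.newFixedThreadPool(4); // pool size chosen arbitrarily
        try {
            for (String line : lines) {
                pool.execute(() -> {
                    permits.acquireUninterruptibly();
                    try {
                        for (String word : line.split("\\s+")) {
                            // merge() is atomic per key, so no explicit locking is needed.
                            if (!word.isEmpty()) counts.merge(word, 1, Integer::sum);
                        }
                    } finally {
                        permits.release();
                    }
                });
            }
        } finally {
            pool.shutdown(); // stop accepting new tasks; queued tasks still run
        }
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("word count timed out");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }
}
```

No `wait`, `notify`, or `synchronized` appears in the calling code; the coordination is delegated to the library classes.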

Q. Do you participate in other JCP Expert Groups?

I am also a member of the JSR 106 Expert Group, whose charter is to standardize a framework for providing caching services to J2SE and J2EE applications. I was motivated to join this JSR because of my experience developing a caching subsystem for WebMacro, and my frustration at having to reinvent the wheel from scratch. Caching is critical to the performance and scalability of all sorts of applications, and developers should be able to take advantage of existing caching frameworks rather than having to write their own.

Q. What are the most common performance mistakes that you have seen in Java projects?

I think the most common Java performance mistakes fall into two major categories, which are opposite flavors of the same basic mistake, which is not giving performance management the correct place in the development process. The first class of mistakes is to ignore performance completely, and treat it as something that can be handled at the end of the project, like writing the release notes. This strategy is basically relying on luck, but it often works, so it keeps getting used.

The other common class of performance error is to let micro-performance considerations drive architectural and design decisions. Developers love to optimize code, and with good reason: it is satisfying and fun. But knowing when to optimize is far more important. Unfortunately, developers generally have horrible intuition about where the performance problems in an application will actually be. As a result, they waste effort optimizing infrequently executed code paths, or worse, compromise good design and development practices in order to optimize some component which doesn't have any performance problems in the first place. (See October's JT&P for a rant on premature optimization.) When you've got your head down in the code, it's easy to miss the performance forest for the trees. Making each individual code path as fast as it can possibly be is no guarantee that the final product is going to perform well.

The right balance is to integrate performance measurement and planning into the process from the beginning, but sit on your hands when tempted to implement some clever performance tweak you just thought up. Performance work should be driven by performance goals, which must be supported by performance measurement. Anything else is just "playing."

Q. What change have you seen applied in a project that gained the largest performance improvement?

That would have to be the effective application of caching, at multiple levels. When confronted with an expensive operation that is done frequently, there are two ways to improve your performance -- make the operation less expensive, or do it less often. While the former is often more fun, it is usually the case that the latter approach is easier to implement correctly and efficiently.
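The "do it less often" approach can be sketched as a small read-side cache that runs the expensive operation at most once per key. This is a minimal illustration, not a caching framework: the `ReadThroughCache` class and its `loader` parameter are hypothetical names, and the sketch has no eviction, expiry, or invalidation, all of which a real cache would need.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical read-side cache: unbounded, never invalidated.
final class ReadThroughCache<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;                      // the expensive operation
    private final AtomicInteger loads = new AtomicInteger();  // how often the loader actually ran

    ReadThroughCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        // computeIfAbsent invokes the loader only when the key is not yet cached.
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    int loadCount() {
        return loads.get();
    }
}
```

Comparing `loadCount()` with the number of `get` calls gives a quick measure of how much work the cache is actually saving.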

Unfortunately, the framework tools for implementing caching are quite poor, and so adding caching to an application involves a certain amount of infrastructure investment. Hopefully, when JSR 106 bears fruit, this will be less of an issue. Caching can dramatically improve both the response time and scalability of applications. Caching on the read side -- caching data retrieved from a database, web service, or file -- is used frequently, but caching on the write side can also be effective in applications where data changes frequently but some minimal level of data loss in the case of system failure is acceptable. An example of this would be a chat server which maintains a permanent record of chat data. By updating the data store no more than once a minute, instead of every time a message is sent, you can eliminate a great number of writes to the database, at some small cost to data durability. (Is losing 30 seconds of chat log in a system crash a disaster? Maybe not.) An important element of performance management is being able to determine when tricks like this are acceptable and when they are not.
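The chat-log example can be sketched as a write-behind buffer that batches messages and writes them out only when a flush interval has elapsed. Everything here is an assumption made for illustration: the `WriteBehindLog` class is hypothetical, the `store` callback stands in for the database write, and the current time is passed in explicitly so the behavior is easy to follow. A real server would more likely flush from a timer thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical write-side cache for chat messages.
final class WriteBehindLog {
    private final List<String> pending = new ArrayList<>();
    private final Consumer<List<String>> store;  // stands in for the database write
    private final long flushIntervalMillis;
    private long lastFlushMillis;

    WriteBehindLog(Consumer<List<String>> store, long flushIntervalMillis, long nowMillis) {
        this.store = store;
        this.flushIntervalMillis = flushIntervalMillis;
        this.lastFlushMillis = nowMillis;
    }

    // Buffers the message; writes the whole batch once the interval has elapsed.
    synchronized void append(String message, long nowMillis) {
        pending.add(message);
        if (nowMillis - lastFlushMillis >= flushIntervalMillis) {
            flush(nowMillis);
        }
    }

    // One store write per batch instead of one per message.
    synchronized void flush(long nowMillis) {
        if (!pending.isEmpty()) {
            store.accept(new ArrayList<>(pending)); // hand over a copy of the batch
            pending.clear();
        }
        lastFlushMillis = nowMillis;
    }
}
```

Messages still sitting in `pending` when the process dies are lost, which is exactly the durability trade-off described above.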

Q. Do you know of any performance-related tools that you would like to share with our readers?

I am extremely disappointed in the current state of Java performance tools. The leading profilers, JProbe and OptimizeIt, are fairly primitive as profiling tools go. They have certainly improved in recent years, but they still are intrusive and offer little in the way of drilling down to find all but the most obvious hotspots. At the risk of sounding like an old curmudgeon, I'm still waiting to see a profiling tool as good as DEC's PCA, which was released nearly 20 years ago. I've not seen one since that even comes close.

The enterprise performance tools such as those from Precise are impressive, but they are limited to a small number of J2EE containers, and are prohibitively expensive for smaller shops. Load generation tools such as SilkPerformer and LoadRunner are also quite useful, but they are also quite expensive.

Q. Could you describe some of the features of DEC's PCA profiling tool which you'd like to see in Java profilers?

PCA was a two-stage tool. A low-impact data-gathering component took CPU and stack trace snapshots while your program was running and stored them in a database for later analysis; runtime overhead when gathering data was in the 5-10% range. After a run, you would use the offline analyzer to analyze the data, which included not only CPU data but call chain data as well. There was a rich query language, not unlike SQL, that let you select which CPU or call chain data points you wanted to feed into the table, chart, or graph engine. The result was a degree of drilldown that today's Java tools can't even come close to.

Using the query language, you could separate calls to A when B and C were on the call stack from calls to A when they weren't, and so on, much more powerfully than the simple all-or-nothing filtering supported by JProbe or OptimizeIt. You could progressively refine result sets with a series of filters to drill down to the precision you wanted. The filtering was powerful enough that you could often identify a hotspot, have the analyzer remove those call chains from your data set, and continue to profile and identify a second-order hot spot without re-running the program. And it could pinpoint hot spots at the level of module, routine, line, or machine instruction, as you preferred. It made today's Java profilers look like toys.
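The style of call-chain filtering described above can be approximated over a list of recorded stack samples. This is not PCA's query language or data model, only a sketch of the idea: `StackSample`, `SampleQuery`, and their methods are hypothetical, and each sample is assumed to hold at least one frame, listed outermost first.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical record of one captured call stack, outermost frame first.
final class StackSample {
    final List<String> frames;

    StackSample(String... frames) {
        this.frames = Arrays.asList(frames);
    }

    boolean contains(String routine) {
        return frames.contains(routine);
    }

    // The routine that was executing when the sample was taken (assumes a non-empty stack).
    String top() {
        return frames.get(frames.size() - 1);
    }
}

// Hypothetical offline queries over previously gathered samples.
final class SampleQuery {
    // Samples where `routine` was executing and every routine in `callers` was on the stack.
    static List<StackSample> inRoutineWithCallers(
            List<StackSample> samples, String routine, String... callers) {
        return samples.stream()
                .filter(s -> s.top().equals(routine))
                .filter(s -> Arrays.stream(callers).allMatch(s::contains))
                .collect(Collectors.toList());
    }

    // Drops every sample that passes through `routine`, so the remaining data
    // can be searched for a second-order hot spot without another run.
    static List<StackSample> without(List<StackSample> samples, String routine) {
        return samples.stream()
                .filter(s -> !s.contains(routine))
                .collect(Collectors.toList());
    }
}
```

Because the queries run against stored samples, filters can be chained and refined repeatedly without collecting new data.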

All Brian Goetz's published articles can be accessed from here.

(End of interview).

Last Updated: 2022-06-29
Copyright © 2000-2022 All Rights Reserved.