Java Performance Tuning

Newsletter no. 26, January 29th, 2003

JavaWorld stopped free access to its archived articles on January 20th. You can still access new articles free for one week after they are published, but after that they are available only to subscribers.

Update: at the moment (end January) this policy seems to be on hold and all articles are currently accessible

This is an effort by JavaWorld to increase revenue. JavaWorld has a useful article archive, but the last few articles from JavaWorld seem to have been on how not to use Java. Two recent ones were about using Flash for presentation and moving Java apps to .Net. Hardly attractive articles for the readership of a Java magazine.

There is also an element of closing the stable door after the horse has bolted here. For those of you who need access to old archived articles, I suggest using Google's cache. If you search for a JavaWorld article on Google, Google presents the option of viewing Google's cached version of the article. You need to be a little clever about searching for second and subsequent pages of an article, but mainly they are all cached and accessible. This won't work for JavaWorld articles published after 20th January, but that's life.

Fortunately, all the most useful JavaWorld articles have had (or will have) their useful bits extracted here in our tips sections.

A note from this newsletter's sponsor

JProbe helps developers understand precisely what is causing
problems in Java applications - right down to the offending
line of source code. Download a free evaluation of JProbe today.

Here, our archives remain completely free to access, and we are increasing the amount of information we provide with our newsletters. This month we have added a new regular section, "Question of the Month". This month's question addresses the garbage collection algorithms currently available in JVM 1.4.1+. In the future we hope to add yet more regular sections.

The articles listed this month would be sufficient to set you up with a highly scaled J2EE site, including the infrastructure. Articles on NIO and webserver implementation (JAWS) show how to create a high performance server; High availability WAS shows how to configure for reliable high performance; distributed design and large scale architecture are covered by Venners and Ludin; then we have servlet best practices and Ace's Hardware showing how to optimize and scale servlet based J2EE; a detailed comparison of J2EE solutions from Rice University; and a couple of articles on efficient J2EE clients. We also list efficient pooling and sorting, optimized microjava games and how to build simulations and microbenchmarks.

Our other regular sections are all present. Kirk (the roundup) covers discussions on XML in 1.4 (slower), loop count ordering, timer resolution, EJBs, and more. Javva (the hutt) continues his diary, and relates a fictional (absolutely, definitely fictional) dialogue he didn't overhear. Our interview this month with Brian Goetz covers XML, a couple of expert groups, caching, when to optimize, better profiling tools and more. All the latest performance tips are extracted from our listed articles, and we have a new tool report on IBM's WebSphere Studio Profiler.

A note from this newsletter's sponsor

Get a free download of GemStone Facets 2.0 and see how this
patented technology supports JCA 1.0 to seamlessly integrate
with J2EE application servers to enhance performance.

Tool Reports


Java performance tuning related news.


A note from this newsletter's sponsor

Java Performance Tuning, 2nd ed. covers Java SDK 1.4 and
includes four new J2EE tuning chapters. The best value Java
performance book just got even better! Order now from Amazon


Recent Articles

Older Articles

Jack Shirazi

Kirk's Roundup

Before starting this month's round-up, I'd like to give a final thought about the recent Middleware/Microsoft benchmarking debacle. Although Ricard Orberg makes good arguments in the Java Developer's Journal about the folly of such an exercise, let's eliminate the rhetoric and consider this. Although the Java Pet Store was never intended to be a benchmark, it is a real example of an application that both participants have reworked. The message to a non-technical management team is that .NET outperforms J2EE. If you want more evidence, try using both the Java and .NET versions of Web Services side by side. From a technical perspective, we all know the value J2EE offers over .NET, but try explaining this value of J2EE (in dollars) to the business side of the house. Now we need to explain why this benchmark is flawed without sounding like we're all whiners. There are many times in life that you need to know the answer before you ask the question. Stepping into a benchmarking exercise like this is like not knowing about all of the exits before rushing into a fire-fight. Sorry Ed, but your team has just earned the famed Meadow Muffin award.

The JavaRanch

From all reports, JDK 1.4 has proved to be much more performance oriented than its predecessors. But as one poster found out, there is no guarantee that your application will run faster under JDK 1.4. In fact, this poster experienced a 7x increase in run time. After profiling the application, it was noted that the newly included support for XML was responsible for the performance degradation. Once these classes were overridden with the previously used xalan package, the application ran as expected.

We also had a question regarding whether compiling a Java program to a native binary would improve performance. Though it did draw a bit of discussion regarding HotSpot etc, the best advice is the old adage, make it work, make it right, and make it fast by profiling.

Moving on to, you'll find that the discussion groups are sporting a new and much improved look. And all of this happened without affecting the quality of the discussions. For proof, let's consider a question regarding microbenchmarking. In question were the performance characteristics of the following three loops.

for (int i = 0; i < array.length; i++)
    // do stuff
for (int i = array.length - 1; i >= 0; i--)
    // do stuff
for (int i = array.length;  --i >= 0; )
    // do stuff

The second for loop would supposedly take advantage of a special instruction to test against 0. Though the answer to this question is VM/processor dependent, in the Wintel world, this is no longer a valid optimization. One argument for the first example is that processors may optimize front to back memory access. I don't know of any processor that optimizes back to front access.
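One rough way to check the claim on your own VM is to time the three loop forms side by side. The sketch below is only an illustration: the class name LoopOrderBench and the array size are assumptions, System.nanoTime() arrived in a later Java release than the 1.4 VMs discussed here, and a naive timing loop like this is easily distorted by JIT warm-up and dead-code elimination, so treat the numbers as a hint rather than a measurement.

```java
final class LoopOrderBench {
    // Forward loop: the index counts up and is compared against array.length.
    static long sumForward(int[] array) {
        long sum = 0;
        for (int i = 0; i < array.length; i++) {
            sum += array[i];
        }
        return sum;
    }

    // Backward loop: the index counts down and is compared against zero.
    static long sumBackward(int[] array) {
        long sum = 0;
        for (int i = array.length - 1; i >= 0; i--) {
            sum += array[i];
        }
        return sum;
    }

    // Backward loop with the decrement folded into the loop test.
    static long sumBackwardFolded(int[] array) {
        long sum = 0;
        for (int i = array.length; --i >= 0; ) {
            sum += array[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] data = new int[1000000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        // Several rounds so that later rounds run compiled code.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            long a = sumForward(data);
            long t1 = System.nanoTime();
            long b = sumBackward(data);
            long t2 = System.nanoTime();
            long c = sumBackwardFolded(data);
            long t3 = System.nanoTime();
            // Printing the sums keeps the loops from being optimized away.
            System.out.println("round " + round
                    + " forward=" + (t1 - t0) + "ns"
                    + " backward=" + (t2 - t1) + "ns"
                    + " folded=" + (t3 - t2) + "ns"
                    + " sums=" + a + "," + b + "," + c);
        }
    }
}
```

All three methods compute the same sum, so any timing difference comes from the loop form and memory access order alone.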

This discussion was followed up with a discussion on escape analysis. Using escape analysis, the VM would recognize when an object only exists on a local stack. The benefit that can be derived from this type of analysis is the ability to use a much cheaper form of GC. The JET VM has successfully used this technology to reduce the impact of garbage collecting these short lived objects.

Write Once Run Anywhere? Java game programmers certainly have a different view on this moniker, as is evident in a discussion of threads on Linux. In order to provide a smooth animation experience, gamers live and die on the quality of the timers that they use. Consequently a number of discussions center around the granularity, accuracy, and consistency of different timers. The common measurement is fps, or frames per second. In this posting, the programmer was conveying observations he made while experimenting with sleep() in a timing loop. With a sleep time of 12ms, the frame rate was 33 per second. When the sleep time was set to 0, the frame rate jumped to 200 per second. Changing the sleep time to 1ms dropped the frame rate to 50. As it turns out, this behavior is related to the granularity of the timer. Calling sleep(1) will most likely not sleep for 1 millisecond. Instead, it will sleep for the time specified by the granularity of the clock. On Linux, this is typically 1ms. On Windows, however, this usually results in a 10-15ms sleep. Were there other differences which affected performance? Well, the most significant one would be support (or lack thereof) for the underlying graphics card. The lesson here is that even though Java goes a long way to support the WORA mantra, it does not completely bridge the gap.
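The effect is easy to observe directly by timing how long a requested sleep really takes. The following is a minimal sketch under stated assumptions: the class and method names are made up for illustration, and it uses System.nanoTime(), which was added to Java after the VMs discussed in the thread. The figures it prints depend entirely on your operating system and VM.

```java
final class SleepGranularity {
    // Average wall-clock duration, in nanoseconds, of Thread.sleep(requestedMillis).
    static long averageSleepNanos(long requestedMillis, int samples) {
        long total = 0;
        for (int i = 0; i < samples; i++) {
            long start = System.nanoTime();
            try {
                Thread.sleep(requestedMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up on the measurement.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while measuring", e);
            }
            total += System.nanoTime() - start;
        }
        return total / samples;
    }

    public static void main(String[] args) {
        long[] requests = {1, 5, 12};
        for (long ms : requests) {
            long avg = averageSleepNanos(ms, 50);
            System.out.println("sleep(" + ms + ") averaged "
                    + (avg / 1000000.0) + " ms");
        }
    }
}
```

If the average for sleep(1) comes out well above 1 ms, the timer granularity, not your rendering code, is setting the ceiling on the frame rate.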

The Server Side

From the Middleware Company there is an interesting thread regarding the choice of middleware. Eventually, the discussion broke down into a comparison of RMI to SOAP. There certainly are trade-offs between the two technologies. First, RMI offers a much better performance profile than SOAP. On the other hand, RMI is an all-Java solution. Because SOAP uses XML as metadata, it offers the possibility of integrating non-Java clients. In the end, the choice should be driven by your requirements.

Another thread of interest involves a discussion of two different EJB architectures. The first uses Stateless Session Beans to wrap POJOs (plain old Java objects) that were retrieved directly from the database. The second architecture uses the same Stateless Session Bean to retrieve and wrap Entity Beans. The thread starts with a plea for configuration tips to improve the Entity Bean performance. What the poster has stumbled upon is the fact that Entity Beans are expensive to use. In this case, the data is read-only, so there appears to be no benefit in incurring the cost of using them.

Finally, we have a posting from a developer who is designing a typical ERP application. The developer wishes to use an application server but does not want to use an EJB server. In short, he was planning on building his own application server and was looking for any tips on how to handle concurrency. It was suggested that if performance was an issue, using a caching technology may result in a significant performance difference. Where is the Cache JSR when you need it?

Kirk Pepperdine.

Javva The Hutt

The following fictional interchange DID NOT TAKE PLACE. Anyone suggesting otherwise is completely incorrect, I deny any possibility that I did not make this up.

Joe: Hey Bob, have you got that password checking routine? You need to speed it up and get all the extra conditions by twelve or you lose your bonus.

Bob: Goddammit I know. This is driving me crazy. They want it faster and it has to support all these other damn systems. I've pretty much done them, but I'm having trouble with the MIT realm. It's got these freaky conditions if you try to change your password and it's driving me nuts trying to deal with them.

Joe: Why don't you just stick in the nutter test. You know, the one that no one will hit. It'll satisfy the functional requirements, and if anyone ever hits it, we can always issue a patch. It's not like this is open source, no one will ever get to see the code.

Bob: You know, you're right. Who the hell logs onto an MIT realm anyway! Let's see ... if you try to change your password, I'll set it to output this stupid error message. That's what I call efficient. This is the fastest password check in the world!

I repeat, this interchange is entirely fictional and has no relationship whatsoever to the Microsoft Windows error message "Your Password Must Be at Least 18770 Characters and Cannot Repeat Any of Your Previous 30689 Passwords". Why, I'm sure that type of error could be produced in any system. In fact, here is a Java implementation:

// Rejects every new password, whatever it is (InvalidPasswordException is assumed to exist).
public boolean checkNewPasswordIsValid(String password)
    throws InvalidPasswordException {
  throw new InvalidPasswordException(
      "Your Password Must Be at Least 18770 Characters and "
        + "Cannot Repeat Any of Your Previous 30689 Passwords");
}

Diary of a Hutt

December 4. Well, well, well. All that seemingly pointless and annoying explaining about how Java is perfectly fast enough has brought some unexpected rewards. My section is now becoming a real department and not just a group under HasntGotAClue. Actually, my line management isn't changing, but I'm getting a promotion and some actual dweebs to push around. Yes! Power at last! It's not officially official yet, but we've talked the talk, and cleared the low hanging fruit and ... in case you didn't realize I've been surfing for managerial speak. I don't think I've quite got the knack yet but, hey, it's only a matter of finding half a dozen really thin books on how to be a manager. I'm going to try and express myself using real-world concepts, but right now I'm facing a rain dance of rubber-volcanic proportions (the source for these last few can be found here).

December 11. Things are firming. Boris will be my first disciple. Not great material, but good enough. I made a bid for Weevil too. Didn't mention it to him of course, but I can't think of anything funnier than having Weevil under me. For a couple of months, that is, until I have to can him for something or other. But I suspect it's not going to happen. Not as long as Weevil can still breathe.

December 17. Little bastard Weevil can still throw spanners even in his current mostly ignored state. He practically burst a blood vessel when he heard about my proposed group, and then I could swear something popped when he found out about me trying to co-opt him. I saw him having lunch with HasntGotAClue and I didn't even know that people had veins in that part of their face! But by the end of the day the little twerp was smiling away, and I found out the next day that somehow he had screwed things up. I don't know how, but I do know that now I'm having to justify the internal Java performance group as a separate entity when a few days ago it was a done deal.

December 18. Still no freebies. I've dropped hints to each of the J2EE monitor salesmen that the other has given me a mug, but they seem to be too dense to get it. I've decided not to close this quarter. But first I'm going to bring them both to the edge. I'll try to see if they have veins in the same places as Weevil. Think of it as a modern experiment in physical anatomy.

December 18, pm. I made appointments for each of them for Friday, but I'm starting my holiday then, so I won't be here. Boris is going to see them. He has no authority to do anything, but given that he thinks he's God's gift, he won't let them know that. I imagine they'll have to justify their products from scratch. Knowing Boris, he'll make them think that they have to get him on their side or they haven't got a hope.

December 30. Frantic calls from the salesmen. "Oh dear, I was really intending to sign today", (yeah sure), "but we're having an emergency on the server", (the office couldn't be quieter), "and I'll be too busy all day. Sorry. Call me next week." Boy, these guys are too amateurish. They were dropping the price on the phone, and they both sent me nice presents, bottle of wine and a mug. My freebies at last! Don't you love the end of the quarter. What a great time to buy something. Or not. Well, there's always another quarter. My budget transfers. I bet they didn't even make their quotas this quarter. Still, if you don't want that kind of pressure, you shouldn't be a salesman.


Javva The Hutt.

The Interview: Brian Goetz

Brian Goetz writes the IBM developerWorks "Java theory and practice" column, and is on several Java expert groups. This month, Brian took some time to answer our questions.

Q. Can you tell us a bit about yourself?

I am currently a Principal Consultant at Quiotix, a software consulting firm in Los Altos, CA. Over the past 20 years, I've done work in kernel internals, device drivers, protocol implementations, compilers, server applications, web applications, scientific computing, data visualization, and enterprise infrastructure tools. I participate in a number of open source projects, including the Lucene text search and retrieval system, and the WebMacro template framework. I also write a monthly column in IBM developerWorks Java Zone and have contributed numerous feature articles, including several on performance, to JavaWorld and IBM developerWorks Java Zone.

Q. What do you consider to be the biggest Java performance issue currently?

This is probably going to be an unpopular answer, but I think it is XML. Don't get me wrong, XML is a powerful framework for representing and validating structured data, but XML, or more precisely the overuse/misuse of XML (and the technologies surrounding it, such as XSL), has been one of the biggest sources of performance problems in many of the projects I've worked with.

XML and related technologies activate all the standard performance warnings -- lots of intermediate temporary objects, heavily layered implementations, extensive use of "interchange types", and complex specifications that make efficient implementation difficult. Choosing XML-based technologies can not only increase your memory and bandwidth requirements by several orders of magnitude but, if you are not careful, you'll end up with an application whose structure is driven by the structure of your XML tools, rather than by your business process requirements.

Q. Given what you say about XML, have you got any specific advice to our readers on how to reduce or prepare for those XML overheads?

If possible, try not to let XML into the core of your application. Use it at the periphery -- configuration, preferences, data import, data export -- but not as your in-memory data representation for core processing. Using it as your in-memory representation not only subjects you to negative performance impact, but it ties your implementation and class hierarchy way too tightly to the structure of how the XML parsing tools work.
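To make the "periphery only" advice concrete, here is a small sketch in which XML is parsed once at the edge of the application and immediately converted into a plain object that the rest of the code uses. The ServerConfig class, its fields, and the XML layout are all hypothetical, chosen only for illustration; the parsing uses the standard javax.xml.parsers and org.w3c.dom APIs.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Plain in-memory representation; nothing here depends on XML types.
final class ServerConfig {
    final String host;
    final int port;

    ServerConfig(String host, int port) {
        this.host = host;
        this.port = port;
    }

    // The only place where XML is touched: parse, copy the values out, discard the DOM.
    static ServerConfig fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Element root = doc.getDocumentElement();
            return new ServerConfig(
                    root.getAttribute("host"),
                    Integer.parseInt(root.getAttribute("port")));
        } catch (Exception e) {
            throw new IllegalArgumentException("Bad configuration XML", e);
        }
    }
}
```

Core code then works with ServerConfig fields directly, so a change of XML tool or document layout only affects fromXml.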

Q. Can you tell us a little bit about JSR 166?

The goal of JSR 166 is to produce a set of basic concurrency utilities which can serve as the building blocks of server applications. Server application developers need simple facilities to enforce mutual exclusion, synchronize responses to events, communicate data across multiple cooperating activities, and asynchronously schedule tasks. The low level synchronization primitives Java provides -- wait, notify, and synchronized -- are effective but too low-level for most applications, difficult to use, and easy to misuse. JSR 166 will be contributing a package of utilities as the java.util.concurrent package, including efficient thread pools, thread-safe collection classes, synchronization primitives such as mutexes and semaphores, and utilities for reliably exchanging data between threads. Many of the elements will be familiar to users because they have their root in the widely-used util.concurrent package by Doug Lea.
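For a flavour of the kind of building blocks described, the sketch below combines a thread pool, a semaphore, and a thread-safe map. It is written against java.util.concurrent as it exists in current Java releases (plus lambdas), not against whatever draft the expert group had at the time of the interview, and the word-counting task and class name are assumptions made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;

final class ConcurrencyUtilitiesSketch {
    // Counts words across lines, one task per line, on a fixed-size thread pool.
    static Map<String, Integer> wordCounts(List<String> lines, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Semaphore permits = new Semaphore(2);            // at most two tasks count at once
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (String line : lines) {
                pending.add(pool.submit(() -> {
                    permits.acquireUninterruptibly();
                    try {
                        for (String word : line.trim().split("\\s+")) {
                            if (!word.isEmpty()) {
                                counts.merge(word, 1, Integer::sum); // atomic update
                            }
                        }
                    } finally {
                        permits.release();
                    }
                }));
            }
            for (Future<?> task : pending) {
                task.get();                               // wait for each task to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Interrupted while counting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Counting task failed", e);
        } finally {
            pool.shutdown();
        }
        return counts;
    }
}
```

Note that no wait, notify, or synchronized appears anywhere: the pool schedules the work, the semaphore limits concurrency, and the map handles its own thread safety.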

Q. Do you participate in other JCP Expert Groups?

I am also a member of the JSR 106 Expert Group, whose charter is to standardize a framework for providing caching services to J2SE and J2EE applications. I was motivated to join this JSR because of my experience developing a caching subsystem for WebMacro, and my frustration at having to reinvent the wheel from scratch. Caching is critical to the performance and scalability of all sorts of applications, and developers should be able to take advantage of existing caching frameworks rather than having to write their own.

Q. What are the most common performance mistakes that you have seen in Java projects?

I think the most common Java performance mistakes fall into two major categories, which are opposite flavors of the same basic mistake, which is not giving performance management the correct place in the development process. The first class of mistakes is to ignore performance completely, and treat it as something that can be handled at the end of the project, like writing the release notes. This strategy is basically relying on luck, but it often works, so it keeps getting used.

The other common class of performance error is to let micro-performance considerations drive architectural and design decisions. Developers love to optimize code, and with good reason: it is satisfying and fun. But knowing when to optimize is far more important. Unfortunately, developers generally have horrible intuition about where the performance problems in an application will actually be. As a result, they waste effort optimizing infrequently executed code paths, or worse, compromise good design and development practices in order to optimize some component which doesn't have any performance problems in the first place. (See October's JT&P for a rant on premature optimization.) When you've got your head down in the code, it's easy to miss the performance forest for the trees. Making each individual code path as fast as it can possibly be is no guarantee that the final product is going to perform well.

The right balance is to integrate performance measurement and planning into the process from the beginning, but sit on your hands when tempted to implement some clever performance tweak you just thought up. Performance work should be driven by performance goals, which must be supported by performance measurement. Anything else is just "playing."

Q. What change have you seen applied in a project that gained the largest performance improvement?

That would have to be the effective application of caching, at multiple levels. When confronted with an expensive operation that is done frequently, there are two ways to improve your performance -- make the operation less expensive, or do it less often. While the former is often more fun, it is usually the case that the latter approach is easier to implement correctly and efficiently.

Unfortunately, the framework tools for implementing caching are quite poor, and so adding caching to an application involves a certain amount of infrastructure investment. Hopefully, when JSR 106 bears fruit, this will be less of an issue. Caching can dramatically improve both the response time and scalability of applications. Caching on the read side -- caching data retrieved from a database, web service, or file -- is used frequently, but caching on the write side can also be effective in applications where data changes frequently but some minimal level of data loss in the case of system failure is acceptable. An example of this would be a chat server which maintained a permanent record of chat data. By updating the data store no more than every minute, instead of every time a message is sent, you can eliminate a great number of writes to the database, at some small cost to data durability (is losing 30s of chat log in a system crash a disaster? Maybe not.) An important element of performance management is being able to determine when tricks like this are acceptable and when they are not.
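The chat-server example of write-side caching could be sketched roughly as follows. Everything named here is hypothetical: ChatStore stands in for whatever data store is really used, and the current time is passed in explicitly so that the behaviour is easy to test. A production version would also need a timer-driven flush and a flush on shutdown, which are omitted.

```java
import java.util.ArrayList;
import java.util.List;

final class WriteBehindChatLog {
    // Hypothetical stand-in for the real database or file writer.
    interface ChatStore {
        void writeBatch(List<String> messages);
    }

    private final ChatStore store;
    private final long flushIntervalMillis;
    private final List<String> pending = new ArrayList<>();
    private long lastFlushMillis;

    WriteBehindChatLog(ChatStore store, long flushIntervalMillis, long nowMillis) {
        this.store = store;
        this.flushIntervalMillis = flushIntervalMillis;
        this.lastFlushMillis = nowMillis;
    }

    // Buffers the message; writes to the store only when the interval has elapsed.
    synchronized void record(String message, long nowMillis) {
        pending.add(message);
        if (nowMillis - lastFlushMillis >= flushIntervalMillis) {
            flush(nowMillis);
        }
    }

    // Writes all buffered messages as one batch. Anything still buffered at a
    // crash is lost, which is the durability trade-off described above.
    synchronized void flush(long nowMillis) {
        if (!pending.isEmpty()) {
            store.writeBatch(new ArrayList<>(pending));
            pending.clear();
        }
        lastFlushMillis = nowMillis;
    }
}
```

With a 60 second interval, any number of messages arriving within a minute cost a single store write instead of one write each.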

Q. Do you know of any performance-related tools that you would like to share with our readers?

I am extremely disappointed in the current state of Java performance tools. The leading profilers, JProbe and OptimizeIt, are fairly primitive as profiling tools go. They have certainly improved in recent years, but they still are intrusive and offer little in the way of drilling down to find all but the most obvious hotspots. At the risk of sounding like an old curmudgeon, I'm still waiting to see a profiling tool as good as DEC's PCA, which was released nearly 20 years ago. I've not seen one since that even comes close.

The enterprise performance tools such as those from Precise are impressive, but they are limited to a small number of J2EE containers, and are prohibitively expensive for smaller shops. Load generation tools such as SilkPerformer and LoadRunner are also quite useful, but they are also quite expensive.

Q. Could you describe some of the features of DEC's PCA profiling tool which you'd like to see in Java profilers?

PCA was a two-stage tool. A low-impact data-gathering component would take CPU and stack trace snapshots while your program was running and store them in a database for later analysis. Runtime overhead when gathering data was in the 5-10% range. After a run, you would use the offline analyzer to analyze the data, which included not only CPU data but call chain data as well. There was a rich query language, not unlike SQL, that let you select which CPU or call chain data points you wanted to feed into the table, chart, or graph engine. The result was a degree of drilldown that today's Java tools can't even come close to.

Using the query language, you could separate calls to A when B and C were on the call stack from calls to A when they weren't, etc, much more powerfully than the simple all-or-nothing filtering supported by JProbe or OptimizeIt. You could progressively refine result sets with a series of filters to drill down to the precision you wanted. The filtering was powerful enough that you could often identify a hotspot, have the analyzer remove those call chains from your data set, and continue to profile and identify a second-order hot spot without re-running the program. And it could pinpoint hot spots at the level of module, routine, line, or machine instruction, as you preferred. Made today's Java profilers look like toys.
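The idea of filtering samples by call chain can be illustrated with a few lines of Java. This is not PCA's query language or data format, which are not described here; it is only a sketch, with made-up names, in which each sample is a list of routine names from outermost caller to the routine that was executing.

```java
import java.util.List;

final class CallChainFilter {
    // True if the sample was taken inside `routine` (last frame) and every
    // routine in `requiredCallers` appears somewhere above it on the stack.
    static boolean matches(List<String> chain, String routine, List<String> requiredCallers) {
        if (chain.isEmpty() || !chain.get(chain.size() - 1).equals(routine)) {
            return false;
        }
        return chain.subList(0, chain.size() - 1).containsAll(requiredCallers);
    }

    // Counts the samples that satisfy the filter.
    static int countSamples(List<List<String>> samples, String routine,
                            List<String> requiredCallers) {
        int count = 0;
        for (List<String> chain : samples) {
            if (matches(chain, routine, requiredCallers)) {
                count++;
            }
        }
        return count;
    }
}
```

Because the samples are stored rather than aggregated during the run, the same data can be queried again with a different filter without re-running the program.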

All Brian Goetz's published articles can be accessed from here.

(End of interview).

Question of the Month

What is the difference between the various garbage collectors in 1.4.1?

The 1.4.1 SDK was released with at least six different garbage collection algorithms. To understand the differences between these algorithms, you first need to understand that in 1.4.1 (and previous JVMs since one of the 1.2 releases) the JVM heap is divided into two main areas: the young generation and the old generation. First, a digression on why there are these two areas of heap. (Note that the following explanations are simplified, avoiding complexities such as that the heap has more areas like Perm space, and that objects too large for the young generation are created in the old generation.)

Analysis of object lifecycles in many object-oriented programs shows that most objects tend to have very short lifetimes, with fewer objects having intermediate length lives and some objects being very long-lived. Garbage collection of short-lived objects can be achieved efficiently using a copying collector, whereas a mark-and-sweep collector is more useful for the full heap because this collector avoids object leaks. In their most basic terms, a copying collector copies all live objects from area1 to area2, which then leaves area1 free to reuse for new objects or the next copy collection. A mark-and-sweep collector finds all objects that can be reached from the JVM roots by traversing all object nodes (instance variables and array elements), marking all reached objects as "alive", then sweeping away all remaining objects (the dead objects). Copy collection time is roughly proportional to the number of live objects; mark-and-sweep collection time is roughly proportional to the size of the heap.
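The difference in cost can be seen in a toy model. The sketch below is not how any real JVM collector is implemented: the "heap" is just a list of Node objects, the copying pass merely gathers the live nodes into a new list instead of moving memory and fixing up references, and all names are invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

final class ToyCollectors {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    // Copying collection: only objects reachable from the roots are ever visited.
    static List<Node> copyLive(List<Node> roots) {
        List<Node> toSpace = new ArrayList<>();
        Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (seen.add(n)) {
                toSpace.add(n);
                work.addAll(n.refs);
            }
        }
        return toSpace;
    }

    // Mark-and-sweep: marks reachable objects, then examines every heap slot.
    // Returns the number of slots the sweep had to examine.
    static int markAndSweep(List<Node> heap, List<Node> roots) {
        Deque<Node> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Node n = work.pop();
            if (!n.marked) {
                n.marked = true;
                work.addAll(n.refs);
            }
        }
        int examined = 0;
        for (Iterator<Node> it = heap.iterator(); it.hasNext(); ) {
            Node n = it.next();
            examined++;
            if (n.marked) {
                n.marked = false;   // reset for the next collection
            } else {
                it.remove();        // dead object: sweep it away
            }
        }
        return examined;
    }
}
```

With two live objects in a heap of five, copyLive touches two nodes while the sweep in markAndSweep examines all five, which mirrors the proportionality described above.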

So the heap is split into the young generation and the old generation so that a copying collection algorithm can be used in the young generation and a mark-and-sweep collection algorithm can be used in the old generation. Objects are created in the young generation, most live and die in that heap space and are efficiently collected without forcing a full mark-and-sweep collection. Some objects get moved over to the old generation because they live too long, and if the old generation gets full enough, a mark-and-sweep collection must run.

Okay, now you are armed with sufficient knowledge to understand the six 1.4.1 garbage collectors that I know about. There are three available for the young generation, and three for the old generation. Collectors labelled "parallel" use multiple threads to parallelize the collection and hence shorten the time taken on multiple-CPU machines. Collectors labelled "concurrent" allow application processing to proceed concurrently while the collection is executing, thus reducing or eliminating pauses in the application caused by garbage collection.

Young generation garbage collection algorithms

Old generation garbage collection algorithms

The team

Scaling Server Performance (Page last updated January 2003, Added 2002-12-29, Author Brian Neal, Publisher AcesHardware). Tips:
Implementing High Availability for WebSphere Application Servers (Page last updated November 2002, Added 2002-12-29, Author John Lamb, Michael Laskey, Publisher Tips:
Optimizing the performance of JAWS Webserver (Page last updated 1999, Added 2002-12-29, Author James C. Hu, Irfan Pyarali, Douglas C. Schmidt, Publisher Distributed Object-Oriented Systems). Tips:
Sorting (Page last updated December 2002, Added 2002-12-29, Author Alex Blewitt, Publisher JavaWorld). Tips:
Make Object Pooling Simple (Page last updated November 2002, Added 2002-12-29, Author Karthik Rangaraju, Publisher JavaPro). Tips:
Servlet Best Practices 1 (Page last updated December 2002, Added 2002-12-29, Author Jason Hunter, Publisher OnJava). Tips:
Servlet Best Practices 2 (Page last updated January 2003, Added 2002-12-29, Author Jason Hunter, Publisher OnJava). Tips:
Designing Distributed Systems (Page last updated October 2002, Added 2002-12-29, Author Bill Venners, Publisher Tips:
New IO API (Page last updated December 2002, Added 2002-12-29, Author Todd Stewart, Publisher OCIWeb). Tips:
Simulate discrete simultaneous events (Page last updated December 2002, Added 2002-12-29, Author David Mertz, Publisher IBM). Tips:
Designing and Implementing J2EE Clients (Page last updated June 2002, Added 2002-12-29, Author Mark Johnson, Inderjeet Singh, Beth Stearns, Publisher informIT). Tips:
Intro to MicroJava Game creation (Page last updated December 2002, Added 2002-12-29, Author David Fox, Publisher OnJava). Tips:
J2EE Enterprise Bean Basics (Page last updated August 2002, Added 2002-12-29, Author Dale Green, Kim Haase, Eric Jendrock, Stephanie Bodoff, Monica Pawlan, Beth Stearns, Publisher informIT). Tips:
Benchmarking Method Devirtualization and Inlining (Page last updated December 2002, Added 2002-12-29, Author Osvaldo Pinali Doederlein, Publisher JavaLobby). Tips:
Performance and Scalability of EJB Applications (Page last updated November 2002, Added 2002-12-29, Author Emmanuel Cecchet, Julie Marguerite, Willy Zwaenepoel, Publisher OOPSLA). Tips:
Large-Scale Financial Applications & Service-Oriented Architectures (Page last updated December 2002, Added 2002-12-29, Author Anwar Ludin, Publisher BEA). Tips:

Jack Shirazi

Last Updated: 2017-12-29
Copyright © 2000-2017 All Rights Reserved.
All trademarks and registered trademarks appearing on are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.