Java Performance Tuning

The Roundup July 2005

Back to newsletter 056 contents

The Server Side

In this month's edition of the roundup, we start with a question from TheServerSide: are modern JVMs obsolete? The fun part of me wants to tackle the mix of "modern" and "obsolete" in the same sentence, but the answers suggest that just maybe the current crop of JVM technology is not ready for the new hardware that is only now finding its way into the enterprise. The discussion quickly focuses on the two areas of the technology that currently don't fit in the 64-bit world: memory addressing and garbage collection.

Now it's true that current JVM technology cannot utilize the amount of memory that the 64-bit world can offer, but the real question is: do we need, or even want, our applications to consume all of this memory? When one poster suggested that it didn't make sense to run a JVM with a 100-gigabyte heap, he was rebuffed with the response that his solution of running many VMs on a single node was really a workaround. I'm left wondering if this is really the case. Are we now seeing memory spaces so large that people face the same choice they made a few years ago (and still do today) when deciding between one large mainframe and a set of smaller machines? You may think this analogy is a bit of a stretch, but when one considers the difficulties that each choice brings, the choices of the two eras (and hence the problems) look remarkably similar.

The first problem is plain and simple: how do I get the data to where it's needed, when I need it? In the mainframe world this is not a problem, because you are dealing with one large global memory space. If you are using a set of smaller machines, then the problem becomes one of serialization. Strangely enough, if we had the capability of running many applications in one large JVM, the resulting process model would look remarkably similar to having many processes running in a large global memory space. To continue with the analogy, running an application on a set of JVMs or machines leads to the age-old problem of how to efficiently transfer data between nodes.
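To make the cost concrete, here is a minimal sketch of the serialization round trip that every inter-JVM transfer implies (the `Order` class and its fields are illustrative inventions, not anything from the forum thread). Within one shared heap you would simply pass a reference; between JVMs you pay for a full copy out and a full copy back in:

```java
import java.io.*;

public class SerializationRoundTrip {
    // A simple serializable payload; the class name and fields are
    // purely illustrative.
    static class Order implements Serializable {
        private static final long serialVersionUID = 1L;
        final int id;
        final String item;
        Order(int id, String item) { this.id = id; this.item = item; }
    }

    public static void main(String[] args) throws Exception {
        Order original = new Order(42, "widget");

        // Serialize: this copy cost is paid on every transfer between
        // JVMs, whereas a single shared heap would just pass a reference.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }

        // Deserialize on the "receiving" side, paying the copy cost again.
        Order copy;
        try (ObjectInputStream in = new ObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()))) {
            copy = (Order) in.readObject();
        }

        System.out.println(copy.id + " " + copy.item);
    }
}
```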

There is one problem that is new in the world of JVMs, and that is garbage collection. The cost of GC is related to the number of objects left behind, not the number of objects that have been collected. It may seem a bit counterintuitive at first, but consider this: GC works until it fails, and in this case failure is defined as not being able to collect or reclaim an object's memory. So the cost of GC is related to the number of live objects it has to check. If an object is eligible for collection, GC will discover this fairly quickly, perform the reclaim (another quick operation), and then move on; it is the survivors that must be examined again on every pass.

One other point on GC: generational spaces owe their existence to the desire to reduce the size of the heap that the collector has to troll through. This suggests that the technology favors many smaller spaces over one larger one. Though there are some exceptions to this rule, there is a lot of truth to this perception. The bottom line is: if the JVM is going to support larger heap sizes, GC is going to have to get a lot better.
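In HotSpot terms, carving the heap into generations shows up directly on the command line. A hedged sketch, using real HotSpot flags of that era but with sizes that are purely illustrative (and `MyServer` as a hypothetical main class), of how one sizes the young generation so that most short-lived objects die in a cheap minor collection and the old generation that CMS must scan stays smaller:

```shell
# Pin the heap size, give the young generation a fixed slice, select the
# concurrent (CMS) collector, and log GC activity. All sizes here are
# examples, not recommendations; tune against your own GC logs.
java -Xms2g -Xmx2g \
     -XX:NewSize=512m -XX:MaxNewSize=512m \
     -XX:+UseConcMarkSweepGC \
     -verbose:gc \
     MyServer
```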

We could continue this discussion and analogy by talking about single points of failure, reliability, availability, and so on. My take on all of this is: just as hardware has been able to keep up, so will the JVM. Right now excellent strides are being made to improve GC, and expect escape analysis to provide the JVM with yet another means to partition large heaps into more manageable sizes. Now you might think that this discussion is a bit academic in nature, and to some extent you'd be correct. But we are just now starting to run into applications with requirements to keep upwards of a terabyte of data in RAM at all times. So once again, our requirements for machines with larger amounts of RAM outpace our ability to provide them. Sometimes I long for the days when 640k was a whole heap of memory.

The JavaRanch

In the first post we take from the JavaRanch, we see someone trying to make sense of a very incomplete set of requirements. The sponsors of the project are expecting their user base to expand from its current level of 10,000 to 50,000. It is typical for this type of requirement to be stated this way, so what is missing? Well, here is a good list of questions posted by forum moderator Ilja Preuss.

The essence of these questions is: how much load will these 50,000 users actually put on the application? The sponsors of the application may be proud (and rightfully so) that they have so many users signed up, and of course this is what they will focus on. That said, it is your job to make sure that the system is actually usable for all of these users. Your task will be easy if only 10 of those 50,000 are using the system at any one time; it gets much more difficult if they are all heavy users of the system.
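One way to turn "50,000 registered users" into a concurrency estimate is Little's law: concurrent users = arrival rate × average session time. A minimal sketch, where every input number is an assumption invented for illustration (10% of users visiting per hour, six-minute sessions), not anything stated in the thread:

```java
public class LoadEstimate {
    public static void main(String[] args) {
        // Assumed figures, purely illustrative: of 50,000 registered
        // users, suppose 10% visit per hour and stay 6 minutes each.
        double registeredUsers = 50_000;
        double arrivalsPerSecond = (registeredUsers * 0.10) / 3600.0;
        double avgSessionSeconds = 6 * 60;

        // Little's law: L = lambda * W
        double concurrentUsers = arrivalsPerSecond * avgSessionSeconds;

        System.out.printf("~%.0f concurrent users%n", concurrentUsers);
    }
}
```

Under these assumptions the 50,000 registered users translate into roughly 500 concurrent ones, which is exactly why pinning down usage patterns matters more than the headline number.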

There are not that many open source J2EE performance tools (as in tools that cover the entire stack), so it was nice to run across this link to InfraRed. I've not had an opportunity to test drive the tool, but I am looking forward to it. Of course, you can always look for other free and commercial options on our resource pages.

This month's offerings are deep in numbers and content. Consider this for starters: "Topic: GC Bomb - CMS GC starves high priority system threads". Here is the scoop. This observation has only been made on Windows XP, and using a huge amount of deductive work the group pulled together a reasonable explanation for it. In order for GC to do its job, it must be able to run all by itself, uninterrupted. To pull off this feat, the thread running CMS (the concurrent version of GC) asks the kernel for a lock. The owner of this lock has such complete control of Windows that it can interfere with Windows' ability to respond to cursor movements. Why this only happens with CMS, and only on Windows, is still a mystery. One thing is for sure: unlike other GC algorithms, CMS acquires and releases the lock several times during a single run, and it is exactly this type of behavior that would cause Windows to stutter. A very interesting find when you consider that all they had to work with was some peripheral information.

Are object casts expensive to process? The astonishing answer to that question was no: they take about one clock tick, or 0.3ns on the machine used to make the measurement. Just in case you didn't get that, the answer was one clock tick! That leads directly into the next question: how in the heck does one measure something that happens in one clock tick? Certainly there is no micro-benchmark that can make measurements of that accuracy. So how was this calculated? The Java code was translated into machine code; since machine instructions execute in a fixed number of clock ticks, the cost could be calculated rather than timed. The thread does reveal just how far microprocessors have advanced in the last few years. It used to be that one needed to buy a Cray (or similar hardware) to get the pipelining features that are commonly found in today's hardware. Now, with multi-core CPUs becoming more popular, the Crays of the past don't look so amazing anymore.
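If you do try to time casts from Java, the thread's point becomes obvious: the per-cast cost drowns in loop, JIT, and timer overhead. A hedged micro-benchmark sketch (the structure is my own illustration, not the thread's methodology) that shows why the result can only bound the cost, not resolve a single clock tick:

```java
public class CastTiming {
    public static void main(String[] args) {
        Object[] data = new Object[1_000_000];
        for (int i = 0; i < data.length; i++) data[i] = "s" + (i & 7);

        // Warm up so the JIT compiles the loop before we time it.
        long sink = 0;
        for (int w = 0; w < 10; w++) sink += run(data);

        long start = System.nanoTime();
        sink += run(data);
        long nsPerCast = (System.nanoTime() - start) / data.length;

        // The figure printed here bundles the cast together with array
        // access, loop control, and length() -- a nanosecond-scale timer
        // cannot isolate a one-clock-tick operation from that noise.
        System.out.println(nsPerCast + " ns/cast-ish (sink=" + sink + ")");
    }

    static long run(Object[] data) {
        long total = 0;
        for (Object o : data) {
            String s = (String) o;   // the checked downcast being "measured"
            total += s.length();     // use the result so the JIT keeps the cast
        }
        return total;
    }
}
```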

Till next month... keep on tuning!

Kirk Pepperdine.

Last Updated: 2017-03-01
Copyright © 2000-2017 All Rights Reserved.