Java Performance Tuning

The Roundup August 2004

It's been about a month since I decided to turn off swap (or tone it down as much as I could) on both of my machines. At first, I was only willing to risk my W2K system. The silly thing is that it has 576M of memory whereas my XP machine has 1G. I was a bit concerned over what might happen should I run out of memory. But then I rarely use that much memory. Case in point: I'm currently running a WebLogic admin server and IDEA, as well as a full-blown WebLogic Server and a toned-down WebLogic Express, while running email, a browser, chat and Word, all in less than 600M of RAM. It's only recently that I've been running three instances of WebLogic, so while that might present a problem for my smaller machine, it certainly doesn't present any problems for my larger (and preferred development) machine. As I stated last month, memory is now quite cheap, so why should I sacrifice performance for more memory when it's so inexpensive to get REAL memory?

It turns out that a number of people who read my blog agreed and have now turned off swapping on their PCs. Interestingly, the trend appears to have spread to my favorite Yahoo mailing list, where the bold and brazen have also decided to eliminate swap. And just to make things really interesting, my favorite group of intrepid performance maniacs has apparently decided that virtual memory is counterproductive for them also. The net result: there is now a small army of Java folks who have decided that virtual memory's day has come and gone.

The strange thing is, each and every one of these groups had a different motivation for starting down this path. For example, one of the ST-J-ers (my Yahoo group friends) was complaining that Windows was forever swapping out his favorite IDE, Eclipse. He'd walk away for a drink only to come back and find that Windows had decided to commit Eclipse to disk. On the first keystroke, all he could do was watch the disk light blink as windoze (as he likes to call it) brought Eclipse out of its coma. So instead of biting the head off the hydra, he went off and found a nice little utility called keep resident. This is interesting because we now have a point of comparison against turning swap off altogether. So far, keep resident has managed to keep Windows from swapping Eclipse most of the time.

Not that life has been a bed of roses on this side of the aisle, for as much as I'd like to say that I've completely shut down swapping, it just hasn't happened. As it turns out, Windows lies. It keeps a minimal area of swap space, as occasionally evidenced by my disk light going nuts as the OS tries to squeeze my application through the eye of a needle. My guess was that there were some operations that forced Windows to swap. Microsoft Knowledge Base article 293215 has now lent credence to that assumption. The article states that when a top-level window is minimized, the operating system will "trim the working set" of that process. This is what we used to call, in Cray land, a roll-out, repack and roll-in.

Even though Cray UNICOS didn't utilize virtual memory, it still had to solve some of the memory management issues that are normally taken care of in swap space. These issues would force the OS to re-organize memory. It did this by swapping (or rolling) the entire process out to disk. It would then repack the disk image before swapping (or rolling) it back into memory. The difference between a Cray and a machine that supports swapping is that on a Cray, the image is not runnable when it is not in RAM (i.e. it is swapped out). Repacking an application on a Cray was something that you very much wanted to prevent because at 20+ million USD a pop, utilizing every clock tick is economically important. In a system with virtual memory, the cost of reorganizing a memory image is much less noticeable (and the machines are a whole heck of a lot cheaper).

So, it would appear that if Windows decides to reduce the working set of a process, it is naturally forced to push it through swap space in order to achieve that effect. I say "forced" because, since I've turned off swap, it shouldn't happen at all.

The MS Knowledge Base article goes on to describe how one can capture that signal and prevent Windows from swapping out the application, which helps explain how "keep resident" might work. It also helps to explain why neither of us can keep applications in memory all the time.

Having learned all of this, I'm not about to turn swapping back on because, even though things get swapped out, it happens very rarely. Most of the time, I'm finding that my machines are much more responsive than they used to be, even if every once in a while the disks kick in as the OS decides that this or that page needs to be pushed to or read from disk. Though I no longer feel the performance improvement, writing this has reminded me that it is there. I am no longer annoyed at Word when it suddenly locks up and spins the disk. If you were thinking that this was due to the automatic save feature, then yes, there may be a connection, but I'm still using the feature and I no longer feel these long delays.

So, go on, be brave and join me in the Programmers Movement against Swapping (PMS). Go and turn off swapping and then come back and read the rest of this month's roundup. It may not make you a speed reader, but it will certainly help the pages render faster.

It should be no surprise that the first posting is a discussion on the seeming uselessness of virtual memory on a modern PC. The interesting point that was brought up during the discussion is that a 128M heap can be drawn into memory from a disk that is being read at 20Mbytes/second in just over 6 seconds. So why do the delays seem longer than this? In one of the responses it was pointed out that when a process is swapped out, its working set is greatly reduced. When the process is swapped back in, the first thing it realizes is that it needs more memory. It then has to go back and negotiate with the OS for more memory. Once it gets the memory, it has to swap it into RAM. Doing so often results in data being re-read into real memory several times. So in effect, you may need to read in 100s of MBytes of data in order to fill a 128MByte space. And for all you Linux fans, life is worse in that Linux will swap your application out even if it is being used. The saving grace is that at least with Linux, you can truly eliminate swap. FYI, the Solaris JVM does come with an option that will lock it into memory (search for documents on ISM, Intimate Shared Memory).
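The arithmetic behind that "just over 6 seconds" figure, and what happens to it once pages are read more than once, can be sketched in a few lines. This is only a back-of-the-envelope model: the 128M heap and 20Mbytes/second come from the discussion, while the class name, method name and the re-read factors of 2 to 4 are hypothetical values chosen for illustration, not measurements.

```java
// Back-of-the-envelope estimate of how long it takes to page a heap back in.
// All names here are illustrative; only the 128 MB / 20 MB/s figures come
// from the discussion.
class SwapInEstimate {

    /** Seconds needed to read `megabytes` from a disk sustaining `mbPerSecond`. */
    static double secondsToRead(double megabytes, double mbPerSecond) {
        return megabytes / mbPerSecond;
    }

    public static void main(String[] args) {
        double heapMb = 128.0;       // heap size quoted in the discussion
        double diskMbPerSec = 20.0;  // disk throughput quoted in the discussion

        // Ideal case: every page is read exactly once.
        System.out.printf("single pass: %.1f s%n",
                secondsToRead(heapMb, diskMbPerSec));

        // Hypothetical re-read factors: the total data pulled from disk if pages
        // are evicted and faulted in again while the working set regrows.
        for (int rereadFactor = 2; rereadFactor <= 4; rereadFactor++) {
            double totalMb = heapMb * rereadFactor;
            System.out.printf("%dx re-read (%.0f MB): %.1f s%n",
                    rereadFactor, totalMb, secondsToRead(totalMb, diskMbPerSec));
        }
    }
}
```

Even a modest re-read factor moves the delay from a few seconds into territory that feels like a hang, which is consistent with the explanation offered in the thread.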

Normally, Java gaming is a love fest. In fact, I can't remember if I've ever seen a contentious discussion. So who would think that a topic such as "does HotSpot eliminate dead code" would get heated? It all starts out civil enough. We even get a "new" unpublished option, -XX:+PrintCompilation, that allows us to dump compiled methods. The controversy started with the statement "The goal is to understand what the compiler is/is not optimizing so you know if you need to hand optimize something yourself." Wow, this is quite a statement!

In many respects the quote is very insightful. Take the Cray computer systems for example. To write effective code for a Cray you first had to understand its architecture. Once you understood that, the next step was to understand the code that was produced by the compiler. Once these items were known, one would know how to adjust one's code so that it might best utilize the hardware. This is a clear example of where one needed to understand what the compiler would do so that one could hand optimize the code. Of course there came the usual rounds of postings about how using a profiler will produce better results. But what can you do with the information from a profiler if you don't understand how that code may be optimized? It can be done, but it is certainly less prone to guesswork if one understands what the compiler will optimize.
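For anyone who wants to poke at the dead code question themselves, here is a minimal sketch of the kind of probe one might run under -XX:+PrintCompilation. The class and method names are made up for illustration, and nothing here guarantees that HotSpot will or will not eliminate the ignored computation; that depends on the JVM version and flags, which is rather the point of looking.

```java
// Hypothetical probe. Run with: java -XX:+PrintCompilation DeadCodeProbe
// The flag lists methods as HotSpot compiles them; the timings hint at
// whether the ignored computation was optimized away.
class DeadCodeProbe {

    /** Pure computation; if the caller ignores the result, the JIT may drop the work. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    /** Times `iterations` calls; `keepResult` decides whether the result is consumed. */
    static long timeNanos(int iterations, boolean keepResult) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            long r = sumOfSquares(1_000);
            if (keepResult) {
                sink += r;
            }
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42) {
            System.out.println("unlikely"); // keeps `sink` observable
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int iterations = 100_000;
        // Warm-up passes so both paths get compiled before measuring.
        timeNanos(iterations, true);
        timeNanos(iterations, false);
        System.out.println("result used:    " + timeNanos(iterations, true) + " ns");
        System.out.println("result ignored: " + timeNanos(iterations, false) + " ns");
    }
}
```

Treat the numbers as a hint rather than proof: a naive harness like this is easily distorted by warm-up and on-stack replacement, so the compilation log is the more trustworthy half of the output.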

The JavaRanch

Turning to the Java Ranch, we find in our first posting someone who is looking for performance testing tools. First there is an interesting discourse on the numbers produced by Apache JMeter. It seems as though some have been less than satisfied with their accuracy. Interestingly, this is a tool that Jack and I take advantage of in our Java Performance Tuning courses and we've not noticed a problem with the results that we are receiving. That said, it does show that it's worth your time to test everything that you are going to use, including the tools that will be used for testing. In another interesting aside, it would appear that "author" J.B. Rainsberger has stolen a page out of our book when he commented on how to specify performance goals. I could swear that I wrote the line in the posting myself.

When it comes to performance there is just a never-ending stream of questions regarding coding best practices. When you really think about it, coding best practices can offer minor performance improvements at best. If questions come down to: "is it better to use a switch statement over an if", as was asked in one of this month's postings, then the answer is: I DON'T KNOW!!!! Ok, maybe a bit of an overreaction, but performance tuning is often finding a very specific answer to a very specific problem. Yes, there are design/coding practices that are definitely bad for performance and these are much easier to quantify than those practices that are better for performance. But when it comes to the difference between "if" and "switch" statements, and the question is asked in the abstract, then there cannot be a right or wrong answer. For whatever answer is offered, there will be an infinite number of situations where it will be wrong. If we always did have the answer, then we wouldn't need performance tuning exercises as we'd be able to generate the best solutions. Since we can't do that, we'll be relying on programmers for quite some time to come.
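To make the "very specific answer to a very specific problem" point concrete, here is a sketch of how one might put the question to one particular case instead of asking it in the abstract. The two methods, the three-way dispatch and the input pattern are all invented for illustration; a different number of cases, a different input distribution or a different JVM could easily reverse whatever this prints.

```java
import java.util.function.IntUnaryOperator;

// Illustrative only: one specific if-chain versus one specific switch.
class IfVersusSwitch {

    static int viaIf(int code) {
        if (code == 0) {
            return 10;
        } else if (code == 1) {
            return 20;
        } else if (code == 2) {
            return 30;
        } else {
            return -1;
        }
    }

    static int viaSwitch(int code) {
        switch (code) {
            case 0:  return 10;
            case 1:  return 20;
            case 2:  return 30;
            default: return -1;
        }
    }

    /** Naive timing loop; inputs cycle through 0, 1, 2, 3. */
    static long timeNanos(IntUnaryOperator op, int rounds) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < rounds; i++) {
            sink += op.applyAsInt(i & 3);
        }
        long elapsed = System.nanoTime() - start;
        if (sink == Long.MIN_VALUE) {
            System.out.println(sink); // consume the result so it is not dead code
        }
        return elapsed;
    }

    public static void main(String[] args) {
        int rounds = 50_000_000;
        // Warm up both variants before measuring.
        timeNanos(IfVersusSwitch::viaIf, rounds);
        timeNanos(IfVersusSwitch::viaSwitch, rounds);
        System.out.println("if:     " + timeNanos(IfVersusSwitch::viaIf, rounds) + " ns");
        System.out.println("switch: " + timeNanos(IfVersusSwitch::viaSwitch, rounds) + " ns");
    }
}
```

Whatever the two numbers turn out to be, they answer the question only for this dispatch, these inputs and this JVM, which is exactly why the abstract version has no answer.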

The Server Side

From The Server Side, we run into a short but interesting post from someone who is experiencing latency issues with JSP requests but is not seeing any performance issues when serving up static content. The original posting mentions that the quad-CPU Sun Solaris machine is running at 50%. Even though there doesn't seem to be a lot of information offered, there is still enough to glean some interesting facts. First point: the application never utilizes more than 50% of the CPU. So, right off the mark, we know that we do not have a CPU problem. Thus, we can ignore just about all of the advice provided in the follow-on postings. The second piece of information is that static content is served up without any unreasonable latency. What this says is that the network between the client and the server (at least the part that serves up static content) is also not the bottleneck. Furthermore, the fact that static content is being served up in a timely manner suggests that there are no memory or disk IO performance bottlenecks. This leaves but one problem: a bottleneck within the application, namely contention on critical sections (single or restricted threading). In the case of a Servlet engine or a web server, this would most likely be a problem with the number of threads in the pool that runs Servlets, or with the connection pools that may be used by the JSP pages if and when they make DB calls. In either case, doing a thread dump is a simple means to diagnose the problem. I should mention Cameron Purdy as offering the most probable cause of this application's performance problems.
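A thread dump is usually taken from outside the JVM (on Unix, sending the process a SIGQUIT with kill -3 prints one to the console). For readers on a JVM that ships the java.lang.management API, the same information can be summarized from inside the process. The sketch below is one possible way to do that, not the approach discussed in the thread; the class name and the summary format are invented for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Illustrative in-process thread dump summary using java.lang.management.
class ThreadStateSummary {

    /** Counts threads per state; many BLOCKED or WAITING threads hint at contention. */
    static Map<Thread.State, Integer> countByState(ThreadInfo[] infos) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip threads that have already ended
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // false, false: skip the optional monitor/synchronizer details.
        ThreadInfo[] infos = threads.dumpAllThreads(false, false);

        System.out.println("threads by state: " + countByState(infos));

        // For each blocked thread, show which lock it wants and who holds it.
        for (ThreadInfo info : infos) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                System.out.println(info.getThreadName()
                        + " is blocked on " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
    }
}
```

In a Servlet container starved for worker threads or database connections, one would expect most request threads to show up as blocked or waiting on the same lock or pool, which is the pattern the diagnosis above predicts.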

In our final posting from The Server Side, the question is which technique should be used to measure the amount of memory that a process is using. Specifically, the poster wanted to know whether the information from java.lang.Runtime or from top (the Unix utility) should be used. The answer is: use ps -o on Solaris.
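Part of the reason the two sources disagree is that they measure different things: java.lang.Runtime only knows about the Java heap, while ps and top report on the whole process. A minimal sketch of what Runtime can tell you follows; the class and method names are made up for illustration, and the ps columns mentioned in the comment are an assumption about which fields one would ask for, since the posting only says "ps -o".

```java
// Illustrative: what java.lang.Runtime reports about the Java heap.
// For the process-level view, compare with something like
//   ps -o rss,vsz -p <pid>
// (the rss and vsz columns are an assumed choice, not taken from the posting).
class HeapVersusProcess {

    /** Heap bytes currently in use: committed heap minus the free part of it. */
    static long usedHeapBytes(Runtime rt) {
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long mb = 1024L * 1024L;

        System.out.println("heap used:      " + usedHeapBytes(rt) / mb + " MB");
        System.out.println("heap committed: " + rt.totalMemory() / mb + " MB");
        System.out.println("heap maximum:   " + rt.maxMemory() / mb + " MB");
        // None of these figures include thread stacks, JIT-compiled code or
        // native allocations, which is why the OS tools report a larger number.
    }
}
```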

Kirk Pepperdine.

Last Updated: 2017-04-01
Copyright © 2000-2017 All Rights Reserved.
All trademarks and registered trademarks appearing on are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.