Back to newsletter 036 contents
It's been an interesting month for me. Jack and I have just returned from Hong Kong, where we gave the Asian premiere of our Java Performance Tuning courses. For both of us, it was our first trip to Hong Kong, and we found it to be a lovely place. And for all of you who are hearing that companies are no longer willing to bear the expense of training their employees: while there is most likely some truth to that statement, there are still many progressive companies that see value in giving their employees opportunities to enrich their skill sets. In addition to being able to assist in that process, teaching also gives me an opportunity to advance my own knowledge. Yes, that's right, the trainees are also training the trainer. In every instance, it happens in a slightly different way.
Since some of our exercises do not have a single best answer, we always have a couple of students who come up with very inventive answers that satisfy the requirements. Our experience in Hong Kong was no different. We also often get asked questions that relate directly to the trainees' day-to-day problems. When that happens, it's a great opportunity for everyone to learn. This too happened in Hong Kong. In this case, the problem was an application that appeared to be bottlenecking on a search. The search wound its way through tens of thousands of objects looking for "almost perfect" matches. It was this inexact comparison aspect of the search that eliminated many traditional optimizations. So, Jack and I sat down over dinner and talked about a couple of possible solutions. When we got back the next morning and started to discuss the problem again, we asked a few questions so that the whole class could catch up and then follow along. One of the first questions we asked was, "did you profile the code?" Not surprisingly, the answer was no. I say not surprisingly because it's very common for people not to profile the code when they have a bottleneck. They just note that one part of the application takes a long time to execute, and since that is the complicated bit, it must surely be the bottleneck.
Well, Jack, being the geek that he is (and I do mean that with the deepest of respect), sat down in the airport and wrote a simulation of the searching aspect of the problem. After a few minutes of pounding away on the keyboard, he popped up from behind the screen and commented that he didn't see that the timings on the search were all that bad. This result confirmed what we both felt: were we trying to solve the right problem? From the results of the simulation, it would appear that the answer was no, we were not going after the bottleneck. Which meant we had insufficient information. Though we did not spend a lot of time on this problem, we still did spend some time on it. The lesson we can take away from this experience is that profiling gives us the metrics we need to ensure that we are tackling the real (and not the perceived) problem. In other words, profiling helps us use our time much more effectively. That said, let's move on to the Java Ranch to see what the current topic of discussion is down at the Saloon.
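To make the lesson concrete, here is a minimal sketch of the kind of timing harness Jack might have written: isolate the suspect operation and measure it before assuming it is the bottleneck. The search and data here are hypothetical stand-ins for the real application, not the actual code from the class.

```java
import java.util.ArrayList;
import java.util.List;

public class SearchTiming {
    // Hypothetical stand-in for an "almost perfect" (inexact) match search:
    // scores every candidate and keeps the best, so it cannot exit early.
    static int search(List<String> data, String target) {
        int bestIndex = -1;
        int bestScore = Integer.MIN_VALUE;
        for (int i = 0; i < data.size(); i++) {
            // Toy similarity score; the real application would use its own metric.
            int score = -Math.abs(data.get(i).length() - target.length());
            if (score > bestScore) {
                bestScore = score;
                bestIndex = i;
            }
        }
        return bestIndex;
    }

    public static void main(String[] args) {
        // Tens of thousands of objects, as in the original problem.
        List<String> data = new ArrayList<>();
        for (int i = 0; i < 50_000; i++) {
            data.add("item-" + i);
        }

        // Time the search in isolation, away from the rest of the application.
        long start = System.nanoTime();
        int found = search(data, "item-42");
        long elapsedMicros = (System.nanoTime() - start) / 1_000;

        System.out.println("best match at index " + found
                + " in " + elapsedMicros + " microseconds");
    }
}
```

A few minutes with a harness like this (or better, a real profiler) tells you whether the search is actually worth tuning.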
Top of the list is the question, "how to code review for performance"? When a posting collects so many responses that are all tightly focused on the original question, you know that you've run into a topic where people's experiences are universal. The answers to this question bear that out. Unequivocally, every response gives the advice that unless the programmers have failed to follow best practices, the chance of improving performance by inspecting the code alone is close to zero. Every posting offered the advice that the first step is to profile the code. One posting even offered the advice that Jack and I have always given: the first step is to set performance goals. There are many reasons for this, but in this case, performance tuning the application is an effort that will consume resources. By setting specific performance targets, you will be able to limit the quantity of resources consumed by the process. Without performance targets, when will you stop tuning any particular piece of the application? When will it be fast enough?
A number of the postings went on to back up their advice with some real-life experiences. And, just to show you how durable and ubiquitous this advice is, one posting offered an experience acquired in the 70s while working on a mainframe application written in (you guessed it) COBOL! Tuning is a very dynamic process. Code reviews are very static. Given these two facts, it is not surprising that code reviews are generally not able to provide many improvements in overall performance.
In yet another posting, the thread focused on efficient caching. The real question seemed to be: if I have a number of objects in a collection, how can I find them, given that I may need to search on a range of values? Of course, this is what relational technology is very good at doing, so it was a foregone conclusion that some of the postings would suggest that the query be done in a relational database. In many cases that answer would work, but in this instance, the whole purpose of caching was to avoid a trip to the database. In fact, the purpose of any caching technique is to wrap a much slower technology so that we can avoid having to call upon it. Now, one could just use an in-memory database, but that is not always a good option, and one still has to incur the cost of inflating objects. So, if moving the data into a relational model is not a solution, then let's move the relational process into the object model.
In the relational model, adding more indexes to a table enhances searching capabilities. In the object world, a collection is analogous to a table. So, it seems logical that we should create a special collection that contains multiple collections (or indexes) on the underlying data. In this instance, we are interested in searching on a range, and an N-M tree is a data structure that easily supports that type of search. So we can add HashMaps where appropriate and also add an N-M tree to cover the range searching. If this sounds expensive, then yes, you're correct, it does add some expense. But then again, multiple indexing never comes for free. And when you consider the alternatives, the extra expense of a multiple indexing scheme is well worth it. After all, if it weren't, it wouldn't be so prevalent in the relational world.
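A minimal sketch of such an indexed collection might look like the following, using a HashMap for exact-match lookup and a TreeMap (a balanced tree, standing in here for the N-M tree mentioned above) for range queries. The Employee class and its fields are hypothetical example data, not from the original posting.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class IndexedCache {
    // Exact-match index, analogous to a unique index on a table column.
    private final Map<String, Employee> byName = new HashMap<>();
    // Ordered index supporting range queries via subMap().
    // Assumes unique salaries for brevity; a real index would map to a list.
    private final TreeMap<Integer, Employee> bySalary = new TreeMap<>();

    public void add(Employee e) {
        // Every insert updates all indexes: this is the extra expense
        // of multiple indexing, paid at write time to speed up reads.
        byName.put(e.name, e);
        bySalary.put(e.salary, e);
    }

    public Employee findByName(String name) {
        return byName.get(name);
    }

    // Range search: all employees with salary in [low, high], inclusive.
    public Collection<Employee> findBySalaryRange(int low, int high) {
        return bySalary.subMap(low, true, high, true).values();
    }

    public static class Employee {
        final String name;
        final int salary;

        public Employee(String name, int salary) {
            this.name = name;
            this.salary = salary;
        }
    }
}
```

As with database indexes, each additional index costs memory and write-time maintenance, but turns a full scan into a cheap lookup at read time.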
With that, we move on to the Server Side, where a question is being asked as to which J2EE application server one should use. For the most part, the responses run like a popularity contest, but that all ends with an intelligent post pointing out that the best load balancing technique is to put a load balancer in front of the application servers and to ensure that your applications are stateless and use optimistic transactions. Ensuring that your application is stateless means that your servers will not have to replicate state. This is often a very expensive operation that can result in serious performance degradation. The second point is to use optimistic transactions. As we all know, holding onto locks for longer than necessary can also degrade performance. Using both of these techniques seems like a very sensible choice.
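The essence of the optimistic approach can be sketched with a compare-and-set: read a snapshot without holding any lock, do the work, and commit only if nothing changed in the meantime. This toy account class is illustrative only; real optimistic transactions in a J2EE server would involve a version column and the persistence layer.

```java
import java.util.concurrent.atomic.AtomicLong;

public class OptimisticAccount {
    private final AtomicLong balance = new AtomicLong(0);

    // Returns true if the deposit committed; false means another writer
    // got in first and the caller must re-read and retry. No lock is held
    // while the new value is computed.
    public boolean tryDeposit(long amount) {
        long snapshot = balance.get();      // read without locking
        long updated = snapshot + amount;   // do the work off-lock
        // Commit only if the balance is still what we read.
        return balance.compareAndSet(snapshot, updated);
    }

    // Retry loop: on conflict, simply take a fresh snapshot and try again.
    public long deposit(long amount) {
        while (!tryDeposit(amount)) {
            // conflict detected; loop retries with a fresh snapshot
        }
        return balance.get();
    }
}
```

Under low contention this never blocks, which is exactly why it avoids the lock-holding cost mentioned above; under heavy contention the retries themselves become the cost.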
It would seem that the performance implications of AOP (or Aspect Oriented Programming) are now being considered. I'm not sure if this implies that AOP is mature enough that it is starting to move into the mainstream of thinking, or that we have just run into a very progressive post. In either case, does AOP help performance? Well, I don't see how AOP can help performance. I do see how it might hurt performance or even obfuscate the performance tuning process. Imagine that you've identified a bottleneck in an injected piece of code. First, how would you know that it's injected, and second, how would you find the source? My guess is that if you were familiar with AOP, the fact that the byte codes may not align to any source would not be a point of confusion. Having said this, how many of us are already confused even when everything is (supposedly) apparent? It would seem to me that of all the advantages AOP has to offer, improved performance is not one of them. At best, it should be neutral, but as the technique is still new, the jury is still out on that one.
Last but not least, we visit my favorite discussion group, the Java Gaming group at http://community.java.net/games. As I scanned the list of topics, I was stunned to see "Converting from JDK 1.4 to JDK 1.1" appear in the list. Why on earth would one want to convert back to JDK 1.1? Surely, this must be a mistake! I drilled down into the posting and, sure enough, someone wants to migrate to JDK 1.1, and for a perfectly valid reason: not many people have taken the time to install the latest JDK. In fact, I can't imagine any non-geek taking the time to install any JDK other than the one that came with the OS. And in this case, that VM is Microsoft's. It's a shame to see progress stifled due to politics and questionable business practices, but unfortunately, this is what has motivated the subject of this thread. It is interesting to see the thread run through many of the improvements in the Java platform that were inspired by the participants of this gaming group.
The last thread that we look at this month is one concerning the JNI. The posting is a reminder that although we are playing in a sophisticated piece of software that is capable of managing multiple memory spaces, we are still dealing with a C application and, with it, all of its strengths and frailties. The question is quite simple: does using the JNI allocate space on the Java heap or the OS heap? For clarification, the Java VM is a process like any other. And though its memory model is dictated by the OS for which it has been compiled, there are some commonalities, things that we can count on. One of these commonalities is that the process will contain heap space. It is out of this heap space that the Java VM allocates its "memory spaces". In the Sun VM, that includes a new space, old space, survivor spaces, and perm space. All Java objects are created in one of these Java heap spaces. All other structures are created in the process heap space. This includes all of the structures needed to support the interactions between the Java and process heap spaces. So, we can see from this that any space used by the JNI will be created in process heap space. Perhaps surprisingly, this can have an effect on performance, as improper or excessive use of the JNI can bloat the process's in-memory image size. But most Java profiling and monitoring tools only work within the Java heap space. Is this yet another reason to avoid using the JNI unless it is absolutely necessary?
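You can see the Java-heap versus process-heap split from pure Java, without writing any C, by using a direct ByteBuffer: like JNI allocations, a direct buffer's storage lives in the native (process) heap, outside the spaces that Java-heap profilers watch. This small demo is just an illustration of that split, not the JNI itself.

```java
import java.nio.ByteBuffer;

public class HeapDemo {
    public static void main(String[] args) {
        // Lives in the Java heap: garbage collected, and visible to any
        // profiler that inspects the Java heap spaces.
        byte[] onJavaHeap = new byte[1024];

        // Backing storage lives in the process (native) heap, like memory
        // allocated through the JNI: it grows the process image but does
        // not show up as Java heap usage.
        ByteBuffer offHeap = ByteBuffer.allocateDirect(1024);

        System.out.println("direct (native) buffer? " + offHeap.isDirect());
        System.out.println("java array length: " + onJavaHeap.length);
    }
}
```

If you allocate enough direct buffers, the process size balloons while the Java heap reports look perfectly healthy, which is exactly the blind spot the posting warns about for JNI-heavy code.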