Back to newsletter 022 contents
A while back, I was asked to tune an application that was running in an application server. After a day and a half of taking measurements, performing calculations, and looking at performance graphs, I formed the hypothesis that the size of the transactions in the (OO) database was at the root of the problem. In this instance, the solution was to bulk up the commits. At issue was the client's refusal to allow any coding changes. That's when I realized that I had been called in to find the mystical "go fast" button.
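Bulking up the commits amounts to committing work in batches rather than one record at a time. As a minimal sketch of the idea (the Session class and its methods below are hypothetical stand-ins, not the client's actual database API):

```java
import java.util.ArrayList;
import java.util.List;

public class BatchCommit {
    // Hypothetical stand-in for a database session; it just counts commits.
    static class Session {
        int commits = 0;
        List<String> pending = new ArrayList<>();
        void write(String record) { pending.add(record); }
        void commit() { pending.clear(); commits++; }
    }

    // Commit once per batchSize writes rather than once per write.
    static int writeAll(Session s, List<String> records, int batchSize) {
        int sinceCommit = 0;
        for (String r : records) {
            s.write(r);
            if (++sinceCommit == batchSize) { s.commit(); sinceCommit = 0; }
        }
        if (sinceCommit > 0) s.commit();   // flush the final partial batch
        return s.commits;
    }

    public static void main(String[] args) {
        Session s = new Session();
        List<String> records = new ArrayList<>();
        for (int i = 0; i < 1000; i++) records.add("rec" + i);
        // 1000 writes in batches of 100: 10 commits instead of 1000
        System.out.println("commits=" + writeAll(s, records, 100));
    }
}
```

The saving comes from amortizing the fixed per-commit cost (logging, flushing, lock release) over many writes.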
After a series of discussions, I was allowed to make some coding changes in a development environment. I immediately went to work on the 20 or so lines of offending code. The subsequent tests demonstrated a 600% overall performance improvement. This improvement was in line with simulations performed in another environment (once the differences between the environments were accounted for). It was at this point that the fun began. First, I was told that the coding changes could not be used. Realizing the futility of continuing the exercise, I made plans to go home.
As unreasonable as the client's position seemed to be, it was the only responsible position that they could take. The reason for this was that all of their quality assurance and regression testing was performed by hand. The process took several weeks to complete. In effect, their testing process had handcuffed their development team. I'm sure we've all heard variants of the quote "Make it work, make it right, make it fast." As much as I am in agreement with that statement, I have had the opportunity to work with another brilliant IT specialist who always said that you need to "plan for performance". The question is, are these two statements incompatible?
In this author's humble opinion, the answer is no. Projects already plan for the first and second steps. Step 3 is almost always inevitable. The real question is: was the "make it fast" step planned for, and if so, how much thought and effort went into that planning?
In the case of this client, I was called in just days before the system was scheduled to go live. And as good as it felt to find the source of the problem, it was equally frustrating to leave the client in the position of having to go live with a system that was not going to meet the end users' needs. But I do take solace in the fact that this is still better than walking into a war room where a newly released application is not performing to expectations. These are other war stories for another time, though. And now for this month's round-up of the performance tuning discussion groups.
As Jack stated last month, my schedule has been a little crazy as of late. Consequently, there is quite a backlog of material sitting in the performance tuning forums. So, let's start with a trip to the Saloon and the new look of the Java Ranch. First up is an interesting question regarding the performance implications of sleep vs. yield. Though there was a bit of confusion at first, the sheriff stepped in to calm the greenhorns down by telling it like it is. The first point made was that yield and sleep work in very similar manners. Of this, the most important behavior to remember is that the target thread will not relinquish any monitors that it may have acquired. Though this may be reasonably benign in the case of yield, calling sleep on a thread whilst running in a synchronized block should be avoided.
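The monitor-holding behavior is easy to demonstrate. In this sketch (the class and method names are mine, not from the thread being summarized), one thread sleeps inside a synchronized block while a second thread measures how long it is locked out:

```java
public class SleepHoldsLock {
    static final Object lock = new Object();

    // Measures how long the caller is blocked while another thread
    // sleeps inside a synchronized block on the same monitor.
    static long blockedMillis(long sleepMs) throws InterruptedException {
        Thread sleeper = new Thread(() -> {
            synchronized (lock) {
                try {
                    Thread.sleep(sleepMs);   // sleeps WITHOUT releasing the monitor
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        sleeper.start();
        Thread.sleep(50);                    // give the sleeper time to acquire the lock
        long start = System.nanoTime();
        synchronized (lock) { }              // blocked until the sleeper wakes up
        long waited = (System.nanoTime() - start) / 1_000_000;
        sleeper.join();
        return waited;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("blocked for ~" + blockedMillis(500) + " ms");
    }
}
```

The second thread ends up waiting roughly the full sleep duration, which is exactly why sleeping while holding a monitor should be avoided.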
The next question to ask was: would declaring everything final that could be declared final offer some performance advantages? To answer this question, we must first understand that final is a tag, or hint to the compiler, that once declared, this entity will never change. This allows the compiler to optimize variables by replacing them with their direct value. It allows the compiler to inline the code blocks of final methods. It also allows the compiler to predetermine the target of a method call, thus eliminating the need for the VM to perform the method lookup at run time.
So, with all of these advantages, the answer's got to be yes, right? Well, as with other things in life, the real answer is: it depends. What does it depend on? For starters, the implementation of the VM has a lot to do with performance. For instance, HotSpot is able to perform significant optimizations at run time that rightfully make this type of choice irrelevant. The bigger issue is: what impact would declaring things final have on your ability to extend and maintain an application? On top of this, my previous experiences with the static Cray optimizing compiler tell me that what may seem like a good idea at the time may actually backfire, as the compiler does not have enough information to make a choice, so it guesses. The fact that the major database vendors all include a run-time query optimizer is further evidence that static optimizations are typically not enough. To quote the bartender, "Obfuscators can do this easily (inline code and replace variables with final values), and it makes no change to the source code". Seems like a plan.
Finally, there was the question of why StringBuffer does not outperform the + operator. The answer for this one can be found at http://c2.com/cgi/wiki?StringBuffer. What makes this thread interesting is its length, and the fact that one benchmark revealed a timing of 0ms. The lesson here is that when performing micro-benchmarks, you need to be very careful that your results really are representative of what you are trying to measure. As simple as this may sound, it is my experience that setting up a valid micro-benchmark is more challenging than setting up a regular benchmarking exercise.
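As a sketch of the kind of comparison being debated (my own example, not the benchmark from the thread): if the amount of work is too small, a run can complete inside the timer's resolution and report 0ms, so the loop counts must be large enough for the measurement to mean anything.

```java
public class ConcatBench {
    // Repeated + on a String creates a new buffer and copies on every pass.
    static String withPlus(int n) {
        String s = "";
        for (int i = 0; i < n; i++) s = s + i;
        return s;
    }

    // A single StringBuffer accumulates the characters in place.
    static String withBuffer(int n) {
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < n; i++) sb.append(i);
        return sb.toString();
    }

    public static void main(String[] args) {
        int n = 10_000;  // small enough n can yield 0ms on a coarse timer
        long t0 = System.currentTimeMillis();
        String a = withPlus(n);
        long t1 = System.currentTimeMillis();
        String b = withBuffer(n);
        long t2 = System.currentTimeMillis();
        System.out.println("+ operator  : " + (t1 - t0) + " ms");
        System.out.println("StringBuffer: " + (t2 - t1) + " ms");
        System.out.println("equal results: " + a.equals(b));
    }
}
```

Even this sketch has the usual micro-benchmark caveats (no warm-up, coarse timer, GC interference), which is precisely the point of the thread.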
Every once in a while, I come across a thread of discussion that I just cannot summarize. I found one such thread on the Java Gaming performance discussion group. You can navigate to it via Performance Tuning: Profilers: HPROF. In summary, the back and forth discussion reveals the co-operative thought process as two individuals work to understand a performance issue.
In a similar feat of co-operation, there was a group review of the performance-tuning article that appeared in the 8th issue of the Java Developers Journal. In this discussion, members debated a number of the points brought out in the article. I do know that the JDJ has instituted some peer reviews, which has led to a dramatic increase in the quality of the materials. Even so, the members of this discussion group have really questioned the value and relevance of the information presented in this article. If you plan to use any of the points in the article, I'd recommend reading what these people have to say about it and then running your own tests so that you can decide what is best for your situation.
In contrast to the last two "love" fests, there is a fairly heated discussion on why certain performance enhancing features have not been added to the VM. A fair portion of the discussion centered on the need for a "shared" VM. There are a number of articles that discuss how one might share a VM. In the end, all of the articles that I've seen end up demonstrating that the usefulness of a shared VM is hindered by the inability to share static resources. The underlying tone in this thread is based on that point. In addition to the points already made in that forum, one should consider the effects of the current implementation of GC and thread scheduling. Now, JDK 1.4 has made many improvements in the efficiency of GC and thread scheduling, but there is still much more work to be done so that different applications don't interfere with each other as much as they do today.
As is usually the case, the Server Side performance discussion group offered a number of high quality nuggets of information. The first came as a result of a question on CMP performance characteristics. Of the points made, it was interesting to see the response to the statement, "No matter how good you are, you will not be able to write a persistence layer as good (or better) as the one your container provides, otherwise you would be working for your container vendor by now ;)". The response: sure, they can write a better general purpose persistence layer, but when I'm writing my application I know exactly what pieces of data I need *right now*. I can write one big SQL statement with all the joins, and then optimize that query to my heart's content. I'm not sure that this position doesn't violate the spirit of the "Make it work, make it right, make it fast" axiom.
How could an entity bean application running in a transaction out-perform one not running in a transaction? This was the question of one puzzled developer. The answer is in how the EJB server solved the age-old cache coherency problem. While involved in a transaction, EJBs are protected from changes made by others. Thus the ejbLoad, and possibly the ejbStore, operations can be avoided. If there is no transaction, and consequently no protection from change, then the ejbLoad operation will need to be performed on each access. Puzzle solved!
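The logic can be simulated outside a container. In this hedged sketch (the cache and load counter are hypothetical stand-ins for the container's bean cache and ejbLoad, not real EJB APIs), an access inside a transaction can trust the cached state, while an unprotected access must reload every time:

```java
import java.util.HashMap;
import java.util.Map;

public class TxCache {
    static int loads = 0;                        // counts simulated ejbLoad calls
    static Map<Integer, String> cache = new HashMap<>();

    static String get(int id, boolean inTransaction) {
        if (inTransaction && cache.containsKey(id)) {
            return cache.get(id);                // state is protected: no reload needed
        }
        loads++;                                 // simulate ejbLoad from the database
        String state = "row-" + id;
        cache.put(id, state);
        return state;
    }

    public static void main(String[] args) {
        for (int i = 0; i < 3; i++) get(42, false);  // no transaction: reload every time
        System.out.println("no tx: " + loads + " loads");
        cache.clear();
        loads = 0;
        for (int i = 0; i < 3; i++) get(42, true);   // in transaction: load once, then reuse
        System.out.println("in tx: " + loads + " load");
    }
}
```

Three unprotected accesses cost three loads; three accesses under a transaction cost one, which is the out-performance the puzzled developer observed.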
Last but not least, there was a posting regarding where one might find information on performance tuning Java applications. There was only one response to this post, and I'm happy to report that it said that www.javaperformancetuning.com was the site to visit. And the answer is no, neither Jack nor I paid or otherwise coerced anyone to make that posting.
Finally, a friend of mine who worked for the now defunct WebGain Inc recently contacted me. He requested that I bestow a Meadow Muffin award on the management that decided to put all of their eggs into the IDE basket and then failed to deliver on their all-Java version of WebGain Studio before the money ran out. Consider it bestowed. But, in all fairness, I will also bestow a Meadow Muffin award on Borland, as they decided to hasten the cash burn by suing WebGain over alleged patent violations. In this author's opinion, the suit helped drain the life's blood out of an already ailing company. And just to keep the record clean, I'd also like to give honorable mention to BEA Inc. After BEA spun off WebGain, it seemed like some elements of their sales staff became a lot more competitive trying to sell their WebLogic CMP caching product to TopLink accounts and/or prospects (TopLink is an O/R mapping tool now owned by Oracle, built by the Object People and subsequently purchased by WebGain). With this type of alleged behavior between supposed partners, one can only applaud Sun's ability to keep the Java ship sailing as smoothly as it has been.