This month we've got several pages giving various tips on Java web server technology, covering the full gamut of EJBs, servlets, JSPs and web servers. There is also a mixture of other tips, including Swing and sockets, and Sun has published an updated version of the HotSpot white paper.
Kirk's roundup focuses on some of the more interesting discussions over the last month: warming up loops, load balancing and cache coherency. Kirk also gives us the benefit of his analysis of a couple of contrasting presentation styles, and we get to hear about one of the advantages of life in a sub-tropical environment (we readers in more temperate climates will just have to console ourselves with the knowledge that we get fewer nasty little fauna).
Finally, my usual reminder to our Japanese readers that Yukio Andoh's translation should be available at http://www.hatena.org/JavaPerformanceTuning/ in a week or so.
All the following page references have their tips extracted below.
I was recently asked to review a presentation on Java performance tuning. Each section of the presentation started with a slide showing two code fragments, one labeled "Bad", the other labeled "Good". The slide following that contained a graph showing timings. In several cases, the timings showed differences that could only be measured in milliseconds. I'm sure that the authors had some reasoning as to why one code fragment was "bad" while the other was "good". I do wish they had shared their ideas with their viewers.
One section pertained to synchronization. The section illustrated two common techniques to synchronize critical sections of your code. The first technique used the synchronized keyword as part of the method declaration; the second used a synchronized block around the method body. The difference in total run time was less than half a second. Now, if this result had been produced in a single method call, I'd be concerned. But if this difference could only be seen after making thousands of calls, then is the difference really important?
Both techniques require the executing thread to acquire the monitor for the object before proceeding, so what accounted for the difference in timings? The answer is that if you declare a method to be synchronized, then a flag is set in the class file and there is no byte code generated to trigger the acquisition of the monitor: the acquisition is handled directly by the JVM. Using the "bad" technique (a synchronized block inside the method body) causes extra byte codes to be generated in the class file, and to be executed by the JVM.
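To make the distinction concrete, here is a minimal sketch of the two forms. The class and method names are hypothetical (they are not taken from the presentation), and the reflection check is just one convenient way to see the class-file flag from inside a program.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical class for illustration; the names are not from the presentation.
class SyncCounter {
    private int count;

    // Method-level form: the compiler sets the synchronized flag on the method
    // and emits no monitor instructions; the JVM acquires the monitor on entry.
    synchronized void incrementWithMethodSync() {
        count++;
    }

    // Block form: the compiler emits monitorenter/monitorexit around the body.
    void incrementWithBlockSync() {
        synchronized (this) {
            count++;
        }
    }

    synchronized int get() {
        return count;
    }

    // Reports whether the named no-argument method carries the synchronized flag.
    static boolean hasSynchronizedFlag(String methodName) {
        try {
            Method m = SyncCounter.class.getDeclaredMethod(methodName);
            return Modifier.isSynchronized(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("No such method: " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(hasSynchronizedFlag("incrementWithMethodSync")); // prints true
        System.out.println(hasSynchronizedFlag("incrementWithBlockSync"));  // prints false
    }
}
```

Running javap -c on the compiled class should show the monitorenter and monitorexit instructions in the block form only. Both methods still lock the same monitor, so the sketch says nothing about which one is "good".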
Does this make the "bad" technique bad? The very narrow example used didn't reflect on the effects that minimizing the size of your critical sections can have on the execution of your code. It also ignored a third technique of synchronizing on another object (i.e. not on this), which could offer better performance characteristics in some situations.
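As a rough sketch of that third technique, assuming a hypothetical EventLog class of my own invention: the lock is a private object rather than the instance itself, and only the update to shared state sits inside the critical section. Whether this actually performs better depends on the contention pattern, so treat it as an illustration and measure before adopting it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class; not taken from the presentation under review.
class EventLog {
    // Private lock: code that synchronizes on the EventLog instance itself
    // does not contend with this critical section.
    private final Object lock = new Object();
    private final List<String> entries = new ArrayList<>();

    void record(String source, int code) {
        // Formatting touches no shared state, so it stays outside the lock.
        String line = source + ":" + code;
        synchronized (lock) {
            entries.add(line); // only the shared-list update is guarded
        }
    }

    int size() {
        synchronized (lock) {
            return entries.size();
        }
    }

    // Starts the given number of threads, each recording perThread entries,
    // waits for them all, and returns the total number of entries logged.
    static int runDemo(int threads, int perThread) {
        EventLog log = new EventLog();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final String name = "worker-" + t;
            Thread w = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    log.record(name, i);
                }
            });
            workers.add(w);
            w.start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return log.size();
    }

    public static void main(String[] args) {
        System.out.println(runDemo(4, 1000)); // prints 4000
    }
}
```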
I also recently had the opportunity to review a performance study written by a colleague of mine. The paper focused on the I/O performance characteristics of various Java collections that had been persisted into an OODB. In the end, there was no "bad" collection to use, nor was there a "good" collection to use. There were situations in which one collection performed better than the others. It was left to the reader to decide which collection they should be using.
This paper contained a section on methodology. The reader was in a position to understand what was done, how it was done, and why things were done the way they were. The reader was given enough information to reproduce the results published in the study. After reading the paper, I immediately started thinking about how I could apply the information provided. The presentation discussed earlier did not leave me with the same feeling.
Now, onto a question from the Saloon at the Java Ranch. The question concerns the HotSpot FAQ: what does "warming up a loop" mean? Peter den Haan, a ranch hand, provided a great explanation of how HotSpot helps my rickety code run faster. He points out that with older versions of HotSpot, the JVM would swap your byte code for compiled code only after it had finished executing it. Thus, if you executed a large loop only once, you would miss out on the benefits of HotSpot. To account for this, developers would "warm the loop" by executing it enough times to cause HotSpot to swap out the bytecode. Then they would run the loop for real. The trick is figuring out what "enough times" means for the JVM. Well, as Peter so eloquently points out, now you don't have to. The more recent versions of HotSpot (from 2.0 onwards) will swap out the bytecode while the loop is still running, thus eliminating the need to "warm up your loop".
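For readers who have not seen the idiom, here is a minimal sketch of what warming up looks like in a timing harness. The workload, the class name and the warm-up count of 10,000 calls are all my own assumptions; as noted above, the right count depends on the JVM, and recent HotSpot versions should make the step unnecessary for a single long loop.

```java
// Hypothetical timing harness; names and iteration counts are illustrative only.
class WarmUpDemo {
    // Written by every call so the JIT cannot discard the computed sums.
    static volatile long sink;

    // Hypothetical workload: sums the non-negative integers below n.
    static long sumTo(int n) {
        long total = 0;
        for (int i = 0; i < n; i++) {
            total += i;
        }
        return total;
    }

    // Calls the workload warmUpCalls times untimed, then times one more call.
    // Returns the elapsed time of the timed call in nanoseconds.
    static long timedAfterWarmUp(int warmUpCalls, int n) {
        for (int i = 0; i < warmUpCalls; i++) {
            sink = sumTo(n); // warm-up: gives the JVM a chance to compile sumTo
        }
        long start = System.nanoTime();
        sink = sumTo(n);     // the run "for real"
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 100_000;
        System.out.println("cold: " + timedAfterWarmUp(0, n) + " ns");
        System.out.println("warm: " + timedAfterWarmUp(10_000, n) + " ns");
    }
}
```

The two numbers will vary from run to run and from JVM to JVM, so a single pair of timings proves very little on its own.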
www.TheServerSide.com raised the two-part question: what is meant by clustering, and how does it help in load balancing? www.dictionary.com defines clustering as
"a group of the same or similar elements gathered or occurring closely together; a bunch."
www.webopeadia.com describes load balancing as
"distributing processing and communications activity evenly across a computer network so that no single device is overwhelmed."
From these definitions, we can see that if we connect two or more servers configured in the same manner, containing the same data, we will have created a cluster. Theoretically, we should be able to route traffic to any of the servers in the cluster.
A common way to distribute the load (or load balance) is to use round-robin allocation. The danger with round-robin allocation is that it is not a fair share allocation. Requests are blindly forwarded to a server without regard to its current load. Under certain conditions, this can overwhelm a server in the cluster. To make matters worse, the load balancer does not give the server a chance to recover. In response to this problem, some manufacturers now produce "smart" routers that have the capability of monitoring loads on web servers. These routers use fair share algorithms to balance loads. But is this all there is to it?
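The difference between the two approaches can be sketched in a few lines. The discussion did not say which algorithms the "smart" routers actually use, so the load-aware selection below (pick the server with the fewest active requests) is only my stand-in for a fair share policy, and the class name and load figures are made up.

```java
// Hypothetical sketch; not the algorithm of any particular router or vendor.
class LoadBalancerSketch {
    // Round-robin: servers are chosen in turn, ignoring their current load.
    static int roundRobin(int requestNumber, int serverCount) {
        return requestNumber % serverCount;
    }

    // Load-aware stand-in: choose the server with the fewest active requests.
    // Ties go to the lowest index.
    static int leastLoaded(int[] activeRequests) {
        int best = 0;
        for (int i = 1; i < activeRequests.length; i++) {
            if (activeRequests[i] < activeRequests[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] activeRequests = {9, 1, 4}; // made-up loads; server 0 is struggling
        // Round-robin sends the first request to server 0 regardless of its load.
        System.out.println(roundRobin(0, activeRequests.length)); // prints 0
        // The load-aware policy sends it to the idlest server instead.
        System.out.println(leastLoaded(activeRequests));          // prints 1
    }
}
```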
One other factor that we need to consider is the volatility of your data (or the rate at which your data is changing). If your servers do not remain similar (i.e. the servers go out of sync with each other), then you cannot route traffic based on load and the clustering scheme breaks down. This problem is known as cache coherency. Cache coherency is a well-known problem and performance stress is one factor that can cause loss of coherency.
In other words, to maintain perfect cache coherency, you may well need to sacrifice performance. To gain performance, you may well need to sacrifice cache coherency. How much of each will be sacrificed depends on the volatility of the data, the need to keep caches in sync, and the mechanisms employed to keep the caches in sync.
Each of the major J2EE application server vendors offers clustering, or in some cases "Extreme Clustering". To support their implementation, they all use some mechanism to mitigate the cache coherency problem. Depending upon conditions, one vendor's clustering scheme may perform better than another's. Since your environment determines the conditions you need to operate under, it would seem sensible to challenge each vendor's claim by creating prototypes and/or benchmarks that reflect your needs.
As old as batch processing is, it doesn't appear that it will be going away any time soon. This is evidenced by several queries and responses to the question of how to design a J2EE batch process that performs well. While there are native tools that are faster than J2EE technologies, using them may present problems. The root of the problem is that native tools use their own interface and consequently bypass the application interface. This can create a cache coherency issue. As one respondent pointed out, the solution is to have the entity beans flush, then read themselves before each method call, or take the system down while doing batch updates.
The first solution will cripple performance; the second will kill it (at least for the duration of the batch job). Another respondent focused on his experiences using a batch update to synchronize disparate systems. The techniques they used achieved results that more than satisfied their users (which surely is the most important goal). They used a session bean service layer to wrap their entity beans. They then determined that it was best to process the data in 10M chunks. It's nice to see that this chunking strategy still applies in the J2EE world. The respondent claimed that they experienced no noticeable slowdown in response times when the batch process was underway. Their reason for not using native tools: the systems that they were trying to synchronize were totally different, as they were owned and operated by different companies. They used XML to help bridge the differences. It will be interesting to see how Web Services and UDDI are applied to EDI problems such as this.
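The chunking idea itself is simple enough to sketch. The respondent did not describe their code, so everything below is an assumption on my part: the class name, the use of an in-memory list, and the tiny chunk size chosen purely for the demonstration. In a real batch job each chunk would presumably be handed to the service layer as its own unit of work.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a chunking strategy; not the respondent's implementation.
class ChunkedBatch {
    // Splits items into consecutive chunks of at most chunkSize elements.
    static <T> List<List<T>> chunk(List<T> items, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the slice so each chunk is independent of the source list.
            chunks.add(new ArrayList<>(items.subList(start, end)));
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<Integer> records = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            records.add(i);
        }
        // Each chunk would be processed (and committed) separately.
        for (List<Integer> batch : chunk(records, 4)) {
            System.out.println("processing " + batch.size() + " records");
        }
    }
}
```

With ten records and a chunk size of four, the loop reports batches of 4, 4 and 2 records.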
In less than two years, my little Oronoco Banana tree has grown from a little stub in the ground to a towering 5 1/5 meters. Now at the end of its life cycle, it has sprouted a stalk. The initial growth rate was an astounding 30 cm/day. That rate has slowed but still remains at 10 cm/day. Given this performance, if I had a performance award to hand out, I would hand it out to the Banana trees. I hope to report on how the bananas taste in the July issue of this column.
Swing performance tips (Page last updated 1999, Added 2001-05-21, Author Bill Harlan). Tips:
Swing performance tips (Page last updated March 2001, Added 2001-05-21, Author Steve Wilson). Tips:
Web application scalability. (Page last updated June 2000, Added 2001-05-21, Author Billie Shea). Tips:
Web Load Test Planning (Page last updated April 2001, Added 2001-05-21, Author Alberto Savoia). Tips:
Optimizing StringBuffer usage (Page last updated May 2001, Added 2001-05-21, Author Glen McCluskey). Tips:
Faster JSP with caching (Page last updated May 2001, Added 2001-05-21, Author Serge Knystautas). Tips:
Stateful vs. Stateless EJBs (Page last updated May 2001, Added 2001-05-21, Author Chuck Caveness, Doug Pardee). Tips:
Improving socket transfer rates (Page last updated May 2001, Added 2001-05-21, Author Rama Roberts). Tips:
Server performance testing (Page last updated 2000, Added 2001-05-21, Author Floyd Marinescu). Tips:
Optimizing entity beans (Page last updated May 2001, Added 2001-05-21, Author Akara Sucharitakul). Tips:
EJB best practices (Page last updated April 2001, Added 2001-05-21, Author Sandra L. Emerson, Michael Girdley, Rob Woollen). Tips:
Avoiding memory leaks in EJBs (Page last updated April 2001, Added 2001-05-21, Author Govind Seshadri). Tips: