Java Performance Tuning
Newsletter no. 5, April 20th, 2001
This month we have the usual eclectic mix of Java performance
related articles. Many of these are increasingly aimed at
server-side performance aspects, as are the performance monitoring
tools. And we also have the odd micro-Java performance articles
just beginning to show up.
Kirk continues his indispensable roundup of Java performance
discussion groups, and I've started adding Java performance book
reviews to the books section (mainly because John Zukowski's
comparative review of the available books gave mine the best
marks).
Please note that further requests for the training course cannot
currently be accepted due to lack of resources (i.e. my time).
Finally, a reminder to our Japanese readers that Yukio Andoh's
translation will be available at
http://www.hatena.org/JavaPerformanceTuning/ in a week or so.
Recent Articles
All the following page references have their tips extracted below.
Older Pages
All the following page references have their tips extracted below.
Other additions to the website
Discussion groups
Tools
Jack Shirazi
This month's installment of my roundup includes comments from
performance lists found on www.javaranch.com
and www.theserverside.com.
As a special added bonus, I have included a brief review of a two part
JMS benchmarking article appearing in the March and April editions of the
Java Developers Journal.
The folks down on the ranch were
up to their usual business of providing great answers to everyday
performance issues. The first question was about the differences in
performance when using a for or while loop.
The bartender quickly pulled his Java decompiler off the shelf and showed
us that the byte code for the method
public void foo() {
    Vector v = new Vector();
    Iterator iter = v.iterator();
    for (; iter.hasNext(); ) {}
    while (iter.hasNext()) {}
}
decompiles to
0 new #4 < Class java.util.Vector >
3 dup
4 invokespecial #7 < Method java.util.Vector() >
7 astore_1
8 aload_1
9 invokevirtual #9 < Method java.util.Iterator iterator() >
12 astore_2
13 goto 16
16 aload_2
17 invokeinterface (args 1) #8 < InterfaceMethod boolean hasNext() >
22 ifne 16
25 goto 28
28 aload_2
29 invokeinterface (args 1) #8 < InterfaceMethod boolean hasNext() >
34 ifne 28
37 return
So, the byte code is the same. More importantly, we see that
decompiling is a useful way of determining what effect your coding
style has on a particular compiler. In this instance, the answer is none.
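To repeat the experiment yourself, a class along the following lines can be compiled and then disassembled with javap -c. This is only a sketch: the class and method names are made up for illustration, and the loops walk a small list (rather than the empty Vector above) so that each method returns a result that can be checked.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

/** Two equivalent loops; compile, then run "javap -c LoopForms" to compare the bytecode. */
class LoopForms {

    /** Counts elements using a for loop with empty init and update clauses. */
    static int countWithFor(List<String> items) {
        int count = 0;
        Iterator<String> iter = items.iterator();
        for (; iter.hasNext(); ) {
            iter.next();
            count++;
        }
        return count;
    }

    /** Counts elements using the equivalent while loop. */
    static int countWithWhile(List<String> items) {
        int count = 0;
        Iterator<String> iter = items.iterator();
        while (iter.hasNext()) {
            iter.next();
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(countWithFor(items) + " " + countWithWhile(items));
    }
}
```

The exact instruction layout depends on the compiler version, so the listing you get may differ in detail from the one shown above.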
A greenhorn came in and stated that his animation program was running
much faster on his PII 350 with 64 MB of RAM running NT than on his PIII
500 with 128 MB of RAM running Win ME. Nobody had an explanation for
this phenomenon (although a guess was hazarded about NT having superior
multi-threading), but it does point out that although Java is WORA (write
once, run anywhere), it is still wise to test your application on the
platforms you intend to deploy to.
Another respondent was asking questions about how to best represent data
to accommodate a searching algorithm that needed to perform calculations.
This was to be a "flagship" algorithm that was to be at the heart of
performance. Having had previous experience working with "flagship"
algorithms, I found the advice given validated the position I have taken
in the past. The advice given:
- Don't get too bogged down in saving every little bit of speed early
on.
- Most of the optimizations you might make early on won't matter nearly
as much as you might think.
- If your program is spending 90% of its time in methodA(), doubling
the speed of methodB() won't help you much at all.
- Deferring a lot of the optimization until later will probably save
you a lot of needless work.
- Don't think that every little performance improvement will give you
exponential gains.
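The methodA()/methodB() point can be checked with a little arithmetic. The sketch below assumes, purely for illustration, that the remaining 10% of the run time is spent entirely in methodB(); the class and method names are hypothetical.

```java
/** Overall speedup when only a fraction of the run time is made faster. */
class SpeedupEstimate {

    /**
     * @param fraction share of total run time spent in the tuned code (0..1)
     * @param factor   how many times faster the tuned code becomes
     * @return overall speedup of the whole program
     */
    static double overallSpeedup(double fraction, double factor) {
        return 1.0 / ((1.0 - fraction) + fraction / factor);
    }

    public static void main(String[] args) {
        // Doubling the speed of code that accounts for 10% of the run time.
        System.out.printf("tune the 10%% method: %.3fx%n", overallSpeedup(0.10, 2.0));
        // Doubling the speed of code that accounts for 90% of the run time.
        System.out.printf("tune the 90%% method: %.3fx%n", overallSpeedup(0.90, 2.0));
    }
}
```

Under these assumptions, doubling the speed of methodB() makes the whole program only about 5% faster, while the same effort spent on methodA() makes it roughly 1.8 times faster.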
The Middleware Company (www.theserverside.com)
is writing a book on J2EE. What is interesting is that they are
following the procedure used by Sun Microsystems and the JCP: The
Middleware Company is releasing the book a chapter at a time so that
everyone can review it and make comments. The feedback will be rolled
back into the book. So, in a sense, the community is writing the book.
This will certainly be a fun exercise to watch.
And now, from their performance-tuning list, we had a participant asking
a question regarding the execution speed of the JDK1.3 running on NT.
This produced one useful reminder that HotSpot Server VM for Windows NT
is a separate download for the JDK1.3, so don't forget that when trying
out JVMs.
There was a question on the performance factors of using the J2EE.
Though responses to this question were light, they were effective.
It is certain that system architecture is an important component to how
an application will perform. Sun Microsystems has done much to help
development teams architect scalable distributed applications. The J2EE
blueprint is a start. This has been followed by documents that describe
best practices and design patterns. You can find the documents on their
web site at http://java.sun.com/j2ee/blueprints.
On the question of JDBC performance, it was pointed out that a
statement with a large number of ORs may be broken down by
the query optimizer, resulting in a response time that might be longer than
if the query were split into several calls. So, what looked like a
Java performance-tuning issue quickly turned into an RDB performance-tuning
issue. As is the case when tuning any system, you need to consider the
impact that every component in a system may have on performance.
Again on the subject of J2EE/EJB, a developer was trying to determine the
best way to bulk load data into an RDB. His initial choice was to use an
Entity Bean. Several people responded with the advice to provide the
bulk loading service via a Stateless Session Bean. The Stateless Session
Bean would wrap the Entity Bean. One person replied that they had better
performance using Container Managed Persistence (CMP). Though the responses
started in the right direction, in my experience there is no need to use
an Entity Bean when performing a bulk load. Doing all of the work within
the Stateless Session Bean will offer the same functionality with much
better performance.
As a final note, the March and April editions of the
Java Developers Journal
offer an excellent two-part article written by Dave Chappell and Bill Wood
(of Sonic MQ) titled "Benchmarking JMS Based E-Business Messaging Providers".
The article provides a very in-depth description of how to analyze the
performance of a JMS implementation. In doing so, it addresses just about
every point that one needs to consider when carrying out any benchmarking
activity. The article is well structured and is supported with well
thought out scenarios and code. They also provide you with a strong sense
of the effort required to complete a benchmarking exercise. I highly
recommend anyone considering a benchmarking exercise read this article.
Kirk Pepperdine.
http://www.javareport.com/html/from_pages/article.asp?id=799&mon=4&yr=2001
Various strategies for connecting to databases (Page last updated March 2001, Added 2001-04-20, Author Prakash Malani). Tips:
- Use pooled connections to reduce connection churn overheads.
- javax.sql.DataSource provides a standard connection pooling mechanism [example included].
- Obtain and release pooled connections within each method that requires the resource if the connection is needed only very briefly (termed "Quick Catch-and-Release Strategy" in the article). However, do not release the connection only to use it again almost immediately; instead, hold the connection until it will not be immediately needed.
- The performance penalty of obtaining and releasing connections too frequently is quite small in comparison to potential scalability problems or issues raised because EntityBeans are holding on to the connections for too long.
- The "Quick Catch-and-Release Strategy" is the best default strategy to ensure good performance and scalability.
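The catch-and-release idea can be sketched as a small helper that borrows a connection from a javax.sql.DataSource, runs one unit of work, and closes the connection straight away. This is not the article's code: the class name and the SqlWork interface are hypothetical, and the try-with-resources syntax used here arrived in Java long after the article was written.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

/** Borrows a pooled connection for one unit of work, then returns it promptly. */
class CatchAndRelease {

    /** Work to perform while the connection is held (hypothetical helper type). */
    interface SqlWork<T> {
        T apply(Connection connection) throws SQLException;
    }

    /**
     * Obtains a connection, runs the work, and always closes the connection.
     * With a pooling DataSource, close() typically hands the connection back to the pool.
     */
    static <T> T withConnection(DataSource dataSource, SqlWork<T> work) throws SQLException {
        try (Connection connection = dataSource.getConnection()) {
            return work.apply(connection);
        }
    }
}
```

A caller would pass in everything it needs to do in one go, so that the connection is held once for the whole operation rather than released and re-acquired between closely spaced statements.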
http://www.javaworld.com/javaworld/jw-03-2001/jw-0323-performance.html
Designing remote interfaces (Page last updated March 2001, Added 2001-04-20, Author Brian Goetz). Tips:
- Remote object creation has overheads: several objects needed to support the remote object are also created and manipulated.
- Remote method invocations involve a network round-trip and marshalling and unmarshalling of parameters. Together these impose a significant latency on remote method invocations.
- Different object parameters can have very different marshalling and unmarshalling costs.
- A poorly designed remote interface can kill a program's performance.
- Excessive remote invocation network round-trips are a huge performance problem.
- Calling a remote method that returns multiple values contained in a temporary object (such as a Point), rather than making multiple consecutive method calls to retrieve them individually, is likely to be more efficient. (Note that this is exactly the opposite of the advice offered for good performance of local objects.)
- Avoid unnecessary round-trips: retrieve several related items simultaneously in one remote invocation, if possible.
- Avoid returning remote objects when the caller may not need to hold a reference to the remote object.
- Avoid passing complex objects to remote methods when the remote object doesn't necessarily need to have a copy of the object.
- If a common high-level operation requires many consecutive remote method calls, you need to revisit the class's interface.
- A naively designed remote interface can lead to an application that has serious scalability and performance problems.
- [Article gives examples showing the effect of applying the listed advice].
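The round-trip advice above can be illustrated with two hypothetical remote interfaces, one chatty and one coarse-grained. The sketch below is not taken from the article; all names are invented, and a local stand-in that counts invocations takes the place of a real RMI server so that the example runs without a registry.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;

/** Hypothetical remote interfaces contrasting chatty and coarse-grained designs. */
class RemoteInterfaceSketch {

    /** Small serializable value object returned in a single round-trip. */
    static final class Position implements Serializable {
        private static final long serialVersionUID = 1L;
        final int x;
        final int y;
        Position(int x, int y) { this.x = x; this.y = y; }
    }

    /** Chatty design: the caller needs two remote calls to read one position. */
    interface ChattyShape extends Remote {
        int getX() throws RemoteException;
        int getY() throws RemoteException;
    }

    /** Coarse-grained design: one remote call returns both values. */
    interface CoarseShape extends Remote {
        Position getPosition() throws RemoteException;
    }

    /** Local stand-in that counts invocations, so no RMI registry is needed. */
    static final class CountingShape implements ChattyShape, CoarseShape {
        int calls;
        public int getX() { calls++; return 3; }
        public int getY() { calls++; return 4; }
        public Position getPosition() { calls++; return new Position(3, 4); }
    }

    public static void main(String[] args) throws RemoteException {
        CountingShape shape = new CountingShape();
        ChattyShape chatty = shape;
        chatty.getX();
        chatty.getY();
        System.out.println("chatty calls: " + shape.calls);

        shape.calls = 0;
        CoarseShape coarse = shape;
        Position p = coarse.getPosition();
        System.out.println("coarse calls: " + shape.calls + " -> " + p.x + "," + p.y);
    }
}
```

Each counted call would be a network round-trip in a real deployment, which is why the coarse-grained form tends to win despite creating a temporary object.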
http://www.microjava.com/articles/techtalk/object_lists?content_id=1152
Basic article on a minimal ArrayList implementation, from a micro-Java slant (Page last updated March 2001, Added 2001-04-20, Author Lee Miles). Tips:
- ArrayLists are the fastest SDK collection class.
- System.arraycopy provides an efficient method for copying arrays.
- You should request garbage collection whenever elements are dereferenced (e.g. the list is cleared).
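As a rough illustration of the System.arraycopy tip, here is a minimal growable list. It is a sketch written for this summary, not the article's implementation, and the class name is invented; it clears the vacated slot on removal so that the removed element is no longer referenced by the list.

```java
/** Minimal growable object list (illustrative names, not the article's code). */
class TinyList {
    private Object[] elements = new Object[4];
    private int size;

    /** Appends an element, growing the backing array when it is full. */
    void add(Object element) {
        if (size == elements.length) {
            Object[] bigger = new Object[elements.length * 2];
            // Bulk copy instead of an element-by-element loop.
            System.arraycopy(elements, 0, bigger, 0, size);
            elements = bigger;
        }
        elements[size++] = element;
    }

    /** Returns the element at the given index. */
    Object get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return elements[index];
    }

    /** Removes the element at index, shifting later elements left by one. */
    Object remove(int index) {
        Object removed = get(index);
        System.arraycopy(elements, index + 1, elements, index, size - index - 1);
        elements[--size] = null; // drop the stale reference so it can be collected
        return removed;
    }

    int size() {
        return size;
    }
}
```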
http://developer.java.sun.com/developer/JDCTechTips/2001/tt0327.html
How to use java.rmi.MarshalledObject (Page last updated March 2001, Added 2001-04-20, Author Stuart Halloway). Tips:
- MarshalledObject lets you postpone deserializing objects. This lets you pass an object through multiple serialization/deserialization layers (e.g. passing an object through many JVMs), without incurring the serialization/deserialization overheads until absolutely necessary.
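A minimal sketch of the idea follows, using a String payload. The class and method names are invented for illustration, and the intermediate layers are only hinted at in a comment; checked exceptions are wrapped to keep the example short.

```java
import java.io.IOException;
import java.rmi.MarshalledObject;

/** Wraps a value once and defers deserialization until the final consumer asks for it. */
class DeferredPayload {

    /** Serializes the value into a MarshalledObject that intermediaries can forward as-is. */
    static MarshalledObject<String> wrap(String value) {
        try {
            return new MarshalledObject<>(value);
        } catch (IOException e) {
            throw new IllegalStateException("could not marshal value", e);
        }
    }

    /** Deserializes the contained value; only the final consumer needs to call this. */
    static String unwrap(MarshalledObject<String> payload) {
        try {
            return payload.get();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not unmarshal value", e);
        }
    }

    public static void main(String[] args) {
        MarshalledObject<String> payload = wrap("order-42");
        // ... the payload could pass through several layers here without being unpacked ...
        System.out.println(unwrap(payload));
    }
}
```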
http://developer.java.sun.com/developer/community/chat/JavaLive/2001/jl0327.html
Sun community chat session: Tuning the Java Runtime for "Big Iron" (Page last updated March 2001, Added 2001-04-20, Author Edward Ort). Tips:
- Use the -server option. Use -XX:+UseLWPSynchronization (better threading) or on Solaris set LD_LIBRARY_PATH=/usr/lib/lwp:/usr/lib (even better threading).
- Set the "young" generation space to 1/4 to 1/3 of heap space, e.g. -Xms1024m -Xmx1024m -XX:NewSize=256m -XX:MaxNewSize=256m. On Solaris use vmstat, pstat (utilities) and -verbose:gc (runtime option).
- GC is single-threaded (at least up to 1.3.x), so it cannot take advantage of multiple CPUs (i.e. a multi-processor machine can end up mostly idle during GC phases if using a single JVM).
- Too many threads can lead to thread "starvation" [presumably thrashing].
- Use at least one thread per CPU, more if any threads will be I/O blocked. On Solaris use the mpstat utility to monitor CPU utilization.
- 1.4 will include concurrent GC that should avoid large GC pauses.
- The biggest performance problem is bad design.
- Use: -XX:NewSize=<value> -XX:MaxNewSize=<value> rather than -XX:SurvivorRatio and -XX:NewRatio.
- Set initial heap size to max heap size when you know what size heap you'll want and you want to avoid wasting time growing the heap as it fills up. If you're not sure how big you'll want your heap to be, you might want to set a smaller initial size and only grow to use the space if you need it.
- Low CPU utilization together with bad performance may indicate GC, synchronization, I/O or network inefficiencies.
- -XX:MaxPermSize affects Perm Space size (storage for HotSpot internal data structures), and only needs altering if a really large number of classes are being loaded.
- [The session also discussed some Solaris OS parameters to tune].
- For JDK 1.3, the heap is: TotalHeapSize = -Xmx setting + MaxPermSize; with -Xmx split into new and old spaces [i.e. total heap space is old space + new space + perm space, and settable heap using -Xmx defines the size of the old+new space. -XX:MaxNewSize defines how much of -Xmx heap space goes to new space].
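When experimenting with heap options such as -Xms and -Xmx, it can help to have the JVM report what it actually ended up with. The following sketch (class name invented, and relying on Runtime methods available on current JVMs rather than anything mentioned in the chat) prints the heap limits and the CPU count:

```java
/** Prints heap and CPU figures so that the effect of JVM options can be checked. */
class RuntimeReport {

    /** Converts a byte count to whole megabytes. */
    static long toMegabytes(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        // Upper bound the heap may grow to (influenced by -Xmx).
        System.out.println("max heap MB:     " + toMegabytes(runtime.maxMemory()));
        // Heap currently reserved by the JVM (influenced by -Xms at startup).
        System.out.println("current heap MB: " + toMegabytes(runtime.totalMemory()));
        // Unused part of the currently reserved heap.
        System.out.println("free heap MB:    " + toMegabytes(runtime.freeMemory()));
        // Useful when sizing thread pools to at least one thread per CPU.
        System.out.println("CPUs:            " + runtime.availableProcessors());
    }
}
```

Running it with, for example, java -Xms1024m -Xmx1024m RuntimeReport shows whether the settings took effect; the reported figures are approximate and vary between JVM implementations.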
http://www.sys-con.com/java/article.cfm?id=673
J2EE Application servers (Page last updated April 2001, Added 2001-04-20, Authors Christopher G. Chelliah and Sudhakar Ramakrishnan). Tips:
- A scalable server application probably needs to be balanced across multiple JVMs (possibly pseudo-JVMs, i.e. multiple logical JVMs running in the same process).
- Performance of an application server hinges on caching, load balancing, fault tolerance, and clustering.
- Application server caching should include web-page caches and data access caches. Other caches include caching servers which "guard" the application server, intercepting requests and either returning those that do not need to go to the server, or rejecting or delaying those that may overload the app server.
- Application servers should use connection pooling and database caching to minimize connection overheads and round-trips.
- Load balancing mechanisms include: round-robin DNS (alternating different IP-addresses assigned to a server name); and re-routing mechanisms to distribute requests across multiple servers. By maintaining multiple re-routing servers and a client connection mechanism that automatically checks for an available re-routing server, fault tolerance is added.
- Using one thread per user can become a bottleneck if there are a large number of concurrent users.
- Distributed components should consider the proximity of components to their data (i.e., avoid network round-trips) and how to distribute any resource bottlenecks (i.e., CPU, memory, I/O) across the different nodes.
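The rotation behind round-robin balancing is simple enough to sketch in a few lines. The class below is a hypothetical in-process selector, not a DNS or re-routing server, and it ignores health checks and weighting; it only shows how requests can be spread evenly over a fixed list of servers.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hands out server addresses in rotation, the idea behind round-robin balancing. */
class RoundRobin {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobin(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("at least one server is required");
        }
        this.servers = servers;
    }

    /** Returns the next server in rotation; safe to call from several threads. */
    String pick() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }
}
```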
http://www.sys-con.com/java/article.cfm?id=671
J2EE Application server performance (Page last updated April 2001, Added 2001-04-20, Author Misha Davidson). Tips:
- Good performance has sub-second latency (response time) and hundreds of (e-commerce) transactions per second.
- Avoid n-way database joins: every join has a multiplicative effect on the amount of work the database has to do. The performance degradation may not be noticeable until large datasets are involved.
- Avoid bringing back thousands of rows of data: this can use a disproportionate amount of resources.
- Cache data when reuse is likely.
- Avoid unnecessary object creation.
- Minimize the use of synchronization.
- Avoid using the SingleThreadModel interface for servlets: write thread-safe code instead.
- ServletRequest.getRemoteHost() is very inefficient, and can take seconds to complete the reverse DNS lookup it performs.
- OutputStream can be faster than PrintWriter. JSPs are only generally slower than servlets when returning binary data, since JSPs always use a PrintWriter, whereas servlets can take advantage of a faster OutputStream.
- Excessive use of custom tags may create unnecessary processing overhead.
- Using multiple levels of BodyTags combined with iteration will likely slow down the processing of the page significantly.
- Use optimistic transactions: write to the database while checking that new data is not being overwritten by using WHERE clauses containing the old data. However, note that optimistic transactions can lead to worse performance if many transactions fail.
- Use lazy-loading of dependent objects.
- For read-only queries involving large amounts of data, avoid EJB objects and use JavaBeans as an intermediary to access, manipulate, and store the data for JSP access.
- Use stateless session EJBs to cache and manage infrequently changed data. Update the EJB occasionally.
- Use a dedicated session bean to perform and cache all JNDI lookups in a minimum number of requests.
- Minimize interprocess communication.
- Use clustering (multiple servers) to increase scalability.
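The caching tips (cache data when reuse is likely, cache JNDI lookups) share one pattern: resolve a name once and reuse the result. The sketch below shows that pattern in plain Java; it is not an EJB, the class name is invented, and the resolver passed in is a stand-in for whatever expensive lookup is being wrapped.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Caches the results of an expensive lookup so each name is resolved only once. */
class LookupCache<V> {
    private final Map<String, V> cache = new ConcurrentHashMap<>();
    private final Function<String, V> resolver;

    /** @param resolver the expensive lookup, for example a wrapper around a JNDI call */
    LookupCache(Function<String, V> resolver) {
        this.resolver = resolver;
    }

    /** Returns the cached value, invoking the resolver only on the first request. */
    V get(String name) {
        return cache.computeIfAbsent(name, resolver);
    }
}
```

A cache like this never refreshes its entries, so it only suits data that changes rarely or where a restart is an acceptable way to pick up changes.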
http://www.javaworld.com/javaworld/jw-04-2001/jw-0406-syslog.html
Using the Syslog class for logging (Page last updated April 2001, Added 2001-04-20, Author Nate Sammons). Tips:
- Use Syslog to log system performance.
- Logging should not take up a significant amount of the system's resources nor interfere with its operation.
- Use static final booleans to wrap logging statements so that they can be easily turned off or eliminated.
- Beware of logging to slow external channels. These will slow down logging, and hence the application too.
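The static final boolean guard looks like this in practice. The sketch uses System.out as a stand-in for a real logger, and the class, flag, and method names are invented; the counter exists only to show that the message is never built while the flag is false.

```java
/** Guarding log statements with a compile-time constant. */
class GuardedLogging {
    /** Set to true and recompile to enable debug output. */
    static final boolean DEBUG = false;

    static int messagesBuilt = 0;

    /** Stand-in for an expensive message; counts how often it is actually built. */
    static String describeState() {
        messagesBuilt++;
        return "state dump";
    }

    static void process() {
        if (DEBUG) {
            // Skipped entirely while DEBUG is false: the message is never built.
            System.out.println("debug: " + describeState());
        }
        // ... real work would go here ...
    }
}
```

Because DEBUG is a compile-time constant, a compiler is free to leave the guarded block out of the class file altogether, which is what makes this cheaper than a runtime check of a mutable flag.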
http://developer.java.sun.com/developer/Books/performance/performance2/appendixa.pdf
Appendix A (Garbage Collection) of "Java Platform Performance: Strategies and Tactics." (Page last updated 2001, Added 2001-04-20, Authors Steve Wilson, Jeff Kesselman). Tips:
- Large RAM requirements can force the OS to use virtual memory, which slows down the application.
- Most JVM implementations will not dereference temporary objects until the method has gone out of scope, even if the object is created in an inner block which has gone out of scope. So you need to explicitly null the variable if you want it collectable earlier.
- Adding a finalizer method extends the life of the object, since it cannot be collected until the finalize() method is run.
- Do not use finalizers to free resources in a timely manner.
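One common alternative to finalizers is an explicit close() method that the caller invokes as soon as the resource is no longer needed. The following sketch is an illustration under that assumption, not code from the book; the class name is invented and the try-with-resources syntax is newer than the text being summarized.

```java
/** Releases a resource deterministically instead of relying on finalize(). */
class ScratchBuffer implements AutoCloseable {
    private byte[] data = new byte[1024];
    private boolean closed;

    /** Writes one byte; fails once the buffer has been released. */
    void put(int index, byte value) {
        if (closed) {
            throw new IllegalStateException("buffer already closed");
        }
        data[index] = value;
    }

    boolean isClosed() {
        return closed;
    }

    /** Frees the backing array immediately; safe to call more than once. */
    @Override
    public void close() {
        closed = true;
        data = null; // drop the reference so the array becomes collectable now
    }

    /** Uses a buffer and guarantees its release, even if the work throws. */
    static ScratchBuffer useOnce() {
        ScratchBuffer buffer = new ScratchBuffer();
        try (ScratchBuffer b = buffer) {
            b.put(0, (byte) 1);
        }
        return buffer;
    }
}
```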
http://www.AmbySoft.com/javaCodingStandards.pdf
Coding standards with a small but interesting section (section 7.3) on optimizations (Page last updated January 2000, Added 2001-04-20, Author Scott Ambler). Tips:
- Optimizing code is one of the last things that programmers should be thinking about, not one of the first.
- Don't optimize code that already runs fast enough.
- Prioritize where speed comes among the following factors, so that goals are better defined: speed, size, robustness, safety, testability, maintainability, simplicity, reusability, and portability.
- The most important factors in looking for code to optimize are fixed overhead and performance on large inputs: fixed overhead dominates speed for small inputs and the algorithm dominates for large inputs (a program that works well for both small and large inputs will likely work well for medium-sized inputs).
- Operations that take a particular amount of time, such as the way that memory and buffers are handled, often show substantial time variations between platforms.
- Users are sensitive to particular delays: users will likely be happier with a screen that draws itself immediately and then takes eight seconds to load data than with a screen that draws itself after taking five seconds to load data.
- Give users immediate feedback: you do not always need to make your code run faster to optimize it in the eyes of your users.
- Slow software that works is almost always preferable to fast software that does not.
http://developer.java.sun.com/developer/J2METechTips/2001/tt0416.html
Using Timers (java.util.Timer) (Page last updated April 2001, Added 2001-04-20, Author Eric Giguere). Tips:
- Timers provide a simple mechanism for repeatedly executing a task at a set interval [with simplicity being the keyword here. Don't look for anything sophisticated like thread interrupt control].
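A short sketch of a repeating java.util.Timer task follows. It is written for a desktop JVM rather than J2ME (it uses java.util.concurrent to wait for the ticks), and the class and method names are invented for illustration.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Runs a task repeatedly with java.util.Timer and stops after a few ticks. */
class TimerDemo {

    /** Schedules a repeating task and waits until it has run the requested number of times. */
    static boolean runTicks(int ticks, long periodMillis) {
        CountDownLatch remaining = new CountDownLatch(ticks);
        Timer timer = new Timer(true); // daemon thread, so it cannot keep the JVM alive
        timer.schedule(new TimerTask() {
            @Override
            public void run() {
                remaining.countDown();
            }
        }, 0L, periodMillis);
        try {
            // Generous timeout; returns true once all ticks have happened.
            return remaining.await(ticks * periodMillis + 5000L, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            timer.cancel(); // stop the timer thread
        }
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runTicks(3, 100L));
    }
}
```

All tasks of one Timer share a single thread, so a slow task delays the ones scheduled after it, which is part of why the mechanism is best kept for simple jobs.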
http://www-106.ibm.com/developerworks/java/library/j-super.html
Parallel clustering of machines using Java (Page last updated April 2001, Added 2001-04-20, Author Aashish N. Patil). Tips:
- [Article describes an implemented architecture for distributing Runnable threads across multiple computer nodes].
http://developer.java.sun.com/developer/TechTips/2000/tt0829.html
The Javap disassembler (Page last updated August 2000, Added 2001-04-20, Author Stuart Halloway). Tips:
- [Article describes using the javap disassembler, useful for identifying exactly what the code has been compiled into].
- Use the javap disassembler to determine the efficiency of generated bytecodes. However, javap alone is not sufficient to determine code efficiency, because JIT compilers can apply additional optimizations.
Jack Shirazi
Last Updated: 2025-03-25
URL: http://www.JavaPerformanceTuning.com/newsletter005.shtml