Published August 2002
HPjmeter is a free, supported Java performance analysis tool that graphically displays important metrics related to the performance of Java applications. The metrics available depend on the profiling option used to collect the data: -Xrunhprof (JVMPI, available on all platforms) or -Xeprof (Low-Intrusion Extended Profiling, available on HP's Unix, HP-UX).
HPjmeter reads the profile data produced by the Java VM (or a profiling agent). Therefore, using HPjmeter always requires two steps: first, run the application with profiling enabled (for example, by adding -Xrunhprof:<options> to the command line); second, load the resulting data file into HPjmeter for analysis. The data collected, and therefore the metrics available, depend on the chosen profiling options.
In most cases, the options for analyzing application execution time will be different from the options used for heap analysis. For example, to collect CPU time data using sampling when running the MyApp program, one can use:
java -Xrunhprof:cpu=samples,thread=y,depth=12,cutoff=0 MyApp
The hprof profiler is available for most Java 2 platforms, although its robustness may vary. For more information about the hprof options, please see the on-line help in HPjmeter's Help -> User's Guide, "Profiling with -Xrunhprof."
Download HPjmeter (no cost) from the Hewlett-Packard Java website. Just unzip the downloaded bundle to get the jar file, and invoke it with:
java -mx128m -jar HPjmeter.jar
To demonstrate the metrics used in performance analysis, we will use SciMark2 (http://math.nist.gov/scimark2/), a small Java benchmark for numerical computing.
Because the default options for the benchmark result in runs that are too short to collect meaningful data, we choose the -large 100.0 option to let it run longer. The command line to run the benchmark with profiling is:
java -classpath scimark2lib.jar -Xrunhprof:cpu=times,cutoff=0 jnt.scimark2.commandline -large 100.0
We used Sun's Java 2 SDK 1.4.0 on Windows 2000, but the same command can be used on other platforms.
Upon completion, the benchmark displays the score, but more importantly, the profile data file java.hprof.txt is written to the current folder.
We can now analyze the data, so we start HPjmeter and use the File -> Open command to load the file. The metrics available for the file are highlighted in the Metric menu (see picture below). The relationship between the available metrics and the profiling options used is described in HPjmeter's Help -> User's Guide, "Using HPjmeter - Features and Platforms Summary."
We start by looking at Method Call Count. HPjmeter displays the methods sorted by the metric value, with the methods consuming the most resources at the head of the list. We can see that two methods dominate this metric, while all the other methods are called relatively infrequently.
To find out how the topmost entry, jnt.scimark2.Random.nextDouble, is used by the application, we select it by single-clicking on its name, and then clicking the Mark To Find button (the green check mark pictured below). This puts the name of the selected package.class.method into HPjmeter's Edit -> Find facility and makes that item easy to find when another metric is displayed.
Next, we look at Call Graph Tree (CPU). We can either view the new metric in the same window as the old one, or open a new window (Window -> New Window) that allows us to visually cross-reference items in different metrics.
Initially, the call graph is displayed in collapsed form, showing only the root node(s). Individual nodes can be expanded and collapsed, just as they can in traditional tree views. However, the tree can also be expanded automatically by searching for a specific node. By default, the search will look for the most recently marked method. The name of that method is also visible in the contents of the Edit -> Find popup window. The Find Immediately operation (click on the button with the binoculars and lightning bolt) performs this search.
In general, many call paths can lead to a particular method. HPjmeter always chooses the path representing the consumption of the highest amount of CPU time (or the specific resource for other metrics) for the method displayed. The numbers displayed by HPjmeter for the call graph should be interpreted as follows:
The time spent in jnt.scimark2.Random.nextDouble (and its descendants, if there are any) when called directly from jnt.scimark2.MonteCarlo.integrate is 64519 time units. This is 87% of the total time spent in jnt.scimark2.Random.nextDouble; the remaining 13% is used when the method is called by the 2 other callers of jnt.scimark2.Random.nextDouble.
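As a quick sanity check on these numbers: if 64519 time units represent 87% of the total, the total time spent in jnt.scimark2.Random.nextDouble is roughly 64519 / 0.87 ≈ 74160 time units, leaving about 9640 time units (13%) attributable to the two other callers.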
Double clicking pops up a smaller window that contains the amount of CPU
time spent in this method when called from each of the callers.
Technically, Java does not suffer from "memory leaks" as do other programming languages, but Java applications are often affected by unexpected object retention or lingering objects.
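To make the notion of a lingering object concrete, here is a minimal, hypothetical Java sketch (the class names are invented, not taken from HPjmeter): a static registry keeps a reference to every view ever created, so a view that is "closed" but never deregistered remains reachable, together with everything it references.

import java.util.ArrayList;
import java.util.List;

public class RetentionSketch {
    // The static list is reachable from a GC root, so everything it
    // references stays alive for the lifetime of the application.
    private static final List<GraphView> openViews = new ArrayList<GraphView>();

    static class GraphView {
        final int[] nodes = new int[1000000];   // roughly 4 MB of data per view

        GraphView() {
            openViews.add(this);                // registered on creation
        }

        void close() {
            // Bug: omitting openViews.remove(this) here means the "closed"
            // view and its node data linger on the heap indefinitely.
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            new GraphView().close();            // ten closed views, all still reachable
        }
        System.out.println("Views still retained: " + openViews.size());
    }
}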
One of the early versions of HPjmeter had a problem with lingering objects. Whenever a call graph was displayed, HPjmeter's heap size increased. The heap space was not returned even when another metric was selected in the window or the call graph window was closed, with each of these events followed by a full GC.
To pinpoint the problem, we ran HPjmeter using the hprof profiler, specifying the following option for heap profiling:
-Xrunhprof:heap=all,cutoff=0
We repeated the sequence of actions that we knew was leaving behind objects on the heap, and closed the application. This produced a fairly large file, java.hprof.txt, containing the snapshot of all Java objects that were present on the heap when the application terminated.
After loading the file into HPjmeter, we looked at Metric -> Residual Objects (Count). The analysis of the topmost entries revealed that the heap contained objects of type hp.jmeter.metric.data.GraphDataNode (see picture below). From our understanding of how our tool works, we did not expect to find any live objects of this type. They should all have been garbage collected after we closed the call graph display. Contrary to the programmers' intentions, other live objects were still holding references to these objects.
To find the references to hp.jmeter.metric.data.GraphDataNode, we selected this type, and then marked it by clicking on the Mark button.
We opened a second window, Window -> New Window, and selected another metric, the Metric -> Reference Graph Tree. As the name implies, this is the object reference graph, displayed as a tree. It shows objects and references between objects.
Having marked the object type previously, we performed the Find Immediately operation. This expanded the graph to show one object of the chosen type, and the shortest path leading from a root to that object.
We analyzed the references going upward from the found object (light blue) to the root. On the way we found several objects created by our application, which we also thought should have been collected, but which otherwise made sense. However, we also found a JButton (hand-marked with 1) with a listenerList (2) that still contained one of the action objects used by the application, PruneAction (3). This action object obviously holds references to the data it is supposed to manipulate.
But why was the JButton still there? We had explicitly removed it from the toolbar and made certain that our application no longer held any references to it. The answer was found by proceeding upward in the reference chain, which contained objects not directly created by our code. There we see a javax.swing.plaf.metal.MetalToolBarUI object (4), which has a field marginTable (5). Apparently, this is a hashtable that associates some resources with various UI components. The entry with index 31 used our JButton as a key (6).
This analysis helped us learn more about Swing. Removing a button from the toolbar and revalidating the toolbar obviously works as far as the display behavior is concerned, but the data structures behind it (in this case, marginTable within the MetalToolBarUI object) do not get updated as we had expected.
Instead of fixing the Swing behavior, we decided to remove our ActionListener from the JButton when closing down the call graph view (by the way, this is the recommended set of steps when using Swing).
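A minimal sketch of that cleanup, with hypothetical field names (this is not HPjmeter's actual code), might look like the following:

import javax.swing.JButton;
import javax.swing.JToolBar;
import java.awt.event.ActionListener;

class CallGraphViewCleanup {
    private final JToolBar toolBar;
    private final JButton pruneButton;
    private final ActionListener pruneAction;

    CallGraphViewCleanup(JToolBar toolBar, JButton pruneButton, ActionListener pruneAction) {
        this.toolBar = toolBar;
        this.pruneButton = pruneButton;
        this.pruneAction = pruneAction;
    }

    // Called when the call graph view is closed.
    void dispose() {
        // Removing the button updates the visible toolbar...
        toolBar.remove(pruneButton);
        toolBar.revalidate();
        toolBar.repaint();
        // ...but the look-and-feel may still reference the button internally,
        // so detach the listener as well; the button then no longer pins the
        // action object (and the data it manipulates) to the heap.
        pruneButton.removeActionListener(pruneAction);
    }
}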
A careful reader will note that this does not eliminate the "object retention/memory leak"; it only makes it much less significant. The JButton will still be referenced as a key in the hashtable, but its listenerList will be shorter.
How can we find the lingering objects if we do not know the type of objects to look for? HPjmeter offers heuristic help to suggest some possibilities. Selecting Guess -> Memory Leaks causes HPjmeter to analyze the whole heap (be patient - it can take a long time!) in search of objects that would free a large amount of space if an individual object were not alive. The picture below shows the result of this search in our example.
The last column shows the "bytes held." This is the total number of bytes that could be de-allocated from the heap by the garbage collector, if the particular object "died" (that is, if all references to this object were nullified). Naturally, the tool is unable to tell if the object is kept alive intentionally or not.
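As a hypothetical illustration of the idea (invented class names, not HPjmeter output):

class Holder {
    final long[] data = new long[1000000];   // about 8 MB, referenced only by this object
}

class BytesHeldSketch {
    static Holder holder = new Holder();

    public static void main(String[] args) {
        // While 'holder' is the sole reference, the Holder and its array are live;
        // its "bytes held" is roughly the 8 MB array plus the object's own fields.
        holder = null;   // nullifying the last reference makes the whole chain collectible
        System.gc();     // a collection request is advisory only
    }
}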
By cross-referencing the object IDs with the tree view in the previous picture, we see that this heuristic aid found all the "right" objects. After the numbers of "bytes held" are calculated, they are also displayed by the reference graph.
When profiling with the -Xeprof option, available for Java 2 RTEs running on any version of HP-UX after 11.0 on both PA-RISC and Itanium, HPjmeter is able to present advanced metrics, such as precise CPU times and thread-specific metrics, including measurement of lock contention.
This example is based on a real-world three-tier web application. However, for this example, the names of companies and commercial products have been changed.
The problem with the application was that it did not scale well under load, despite running on a powerful multi-processor machine. After loading the profile data file, we looked at the CPU time consumed by each of the methods in the process (select Metric -> Exclusive Method Times (CPU)). The application ran for about 25 minutes, but the reported CPU times were generally very small, and did not indicate any CPU-related bottleneck.
We then selected Metric -> Threads Histogram. The histogram shows a horizontal bar for each thread used by the application, representing the lifetime of the thread relative to the lifetime of the whole application (the window's width).
The thread bars are color coded, indicating the prevailing state of the thread during the application execution. Double clicking on one of the threads pops up a smaller window that shows the distribution of the thread's states over the thread's lifetime.
We can see that this thread spent most of its time waiting (i.e., within the java.lang.Object.wait method). The CPU usage and garbage collector impact are negligible.
One indicator of a bottleneck is lock contention, which accounted for over 20% of the thread's time. Lock contention is measured as the time elapsed between the thread's request to enter a Java monitor and the time when the monitor is entered.
To uncover the location in the application where the lock contention occurs, we want to look at the profile data for this single thread. To do this, select Scope -> Thread. The window is split vertically, with the right pane showing the list of threads, and the left pane showing the metrics for the selected thread. We select the ninth thread and then the Metric -> Lock Delay - Method Exclusive metric for that thread.
Here we see that almost all of the delay caused by the lock contention occurs in the method subsys.util.logging.Logger.log.
The analysis of the data for other threads reveals the same pattern, which allows us to assume that all these threads contend for a single lock guarding the logging device. In this case, therefore, if logging is only a debugging feature, turning it off should dramatically improve the application's performance; otherwise, the logging needs to be redesigned for concurrent use.
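The kind of construct that produces this pattern is a logger whose log method is guarded by a single monitor. A hypothetical sketch follows (the real subsys.util.logging.Logger is not shown in the article):

import java.io.PrintWriter;

class ContendedLogger {
    private final PrintWriter out;

    ContendedLogger(PrintWriter out) {
        this.out = out;
    }

    // Every calling thread must enter the same monitor; under load,
    // threads queue up here, and HPjmeter reports the wait as lock delay.
    synchronized void log(String message) {
        out.println(System.currentTimeMillis() + " " + message);
        out.flush();
    }
}

Typical remedies are to hold the monitor only long enough to hand the record to a queue drained by a dedicated writer thread, or to buffer records per thread, so that the shared device is no longer a single point of contention.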
HPjmeter is part of the Hewlett-Packard offering for Java: performance-leading JVMs for PA-RISC and Itanium servers running HP-UX, and several tools, including a Java heap tuning tool named HPjtune, that help users achieve the best performance in the deployment environment.
Coupled with the standard profiling interfaces available in the Java Virtual Machines, HPjmeter visually displays the internal performance of your Java application. Quick data collection and display make iterative application performance improvements fast. HPjmeter's design strives to focus your attention on the resource consumption that most limits the throughput or scalability of your application.
We have used HPjmeter with many applications to help engineers in banking, financial, automotive, pharmaceutical, electronics, software, telecom, retail and many other companies and governments troubleshoot and tune their products.
Here we have explored the performance, object retention and thread performance areas, but HPjmeter offers many other metrics and features that we have found extremely useful. Extensive printing capabilities make report writing easy because any information displayed can be written to a file. A metrics file compare feature, File -> Compare, makes the impact of changes in user load or to the program itself easily seen in changes to either memory usage or performance. Use the online tutorial available at www.hp.com/java and the information in Help -> User's Guide to explore these additional areas.
You now have a tool available to explore your application's performance and to make your Java applications run exceptionally fast!