Java Performance Tuning
Java(TM) - see bottom of page
Tips December 2004
http://www-106.ibm.com/developerworks/library/j-jtp12214/
Microbenchmarking in Java (Page last updated December 2004, Added 2004-12-30, Author Brian Goetz, Publisher IBM). Tips:
- Optimizing only the code that is executed frequently has several performance advantages: No time is wasted optimizing code that will execute infrequently, and you can, therefore, spend more time on optimization of hot code paths because you know that the time will be well spent.
- The HotSpot server compiler has been optimized to maximize peak operating speed, and is intended for long-running server applications. The HotSpot client compiler has been optimized to reduce application startup time and memory footprint, employing fewer complex optimizations than the server compiler, and accordingly requires less time for compilation.
- The HotSpot server compiler can perform optimizations such as code hoisting, common subexpression elimination, loop unrolling, range check elimination, dead-code elimination, data-flow analysis, and aggressive inlining of virtual method invocations.
- To get some insight into what the HotSpot compiler is doing, invoke the JVM with the -XX:+PrintCompilation flag, which causes the compiler (client and server) to print a short message every time it runs.
- The HotSpot JIT is continuously recompiling and recompilation can be triggered at unexpected times. Timing measurements in the face of continuous recompilation can be quite noisy and misleading, and it is often necessary to run Java code for quite a long time to amortize the compilation overheads.
- Many microbenchmarks perform much "better" when run with -server than with -client, because the server compiler is more adept at optimizing away blocks of dead code.
- For microbenchmarking you should warm up the JVM: execute your target operation enough times that the compiler will have had time to run and replace the interpreted code with compiled code before starting to time the execution.
- Run your benchmarks with -XX:+PrintCompilation, observe what causes the compiler to kick in, then restructure your benchmark program to ensure that all of this compilation occurs before you start timing and that no further compilation occurs in the middle of your timing loops.
- If the garbage collector is going to have to run during your microbenchmark it can badly distort timing results: a small change in the number of iterations could mean the difference between no garbage collection and one garbage collection, skewing the "time per iteration" measurement.
- Run your benchmarks with -verbose:gc so you can see how much time was spent in garbage collection.
- Run your program for a long, long time, ensuring that you trigger many garbage collections, more accurately amortizing the allocation and garbage collection cost.
- HotSpot can do speculative inlining. For example if a class has no subclasses loaded, its methods can be considered final by the compiler and inlined. If a subclass is later loaded, the inline is backed out. For this reason, declaring methods and classes final does not usually improve performance - HotSpot is already effectively assuming that everything that can be final, is final.
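The warm-up advice above can be sketched as a small harness. This is only an illustration, not the article's code: the class name WarmupBenchmark, the work method, and the iteration counts are all assumptions chosen for the example. It runs the target operation untimed first, then takes several timed runs with System.nanoTime, and uses the computed result so the loop is not eliminated as dead code.

```java
/** Hypothetical microbenchmark harness; names and iteration counts are illustrative only. */
final class WarmupBenchmark {

    // Hypothetical operation under test. It returns a value so the work has an observable result.
    static long work(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Calls work(n) `iterations` times and returns the average elapsed nanoseconds per call.
    static double timePerCallNanos(int iterations, int n) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += work(n);
        }
        long elapsed = System.nanoTime() - start;
        // Consume the accumulated result so the compiler cannot treat the loop as dead code.
        if (sink == 42) {
            System.out.println("unexpected sink value: " + sink);
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        // Warm-up phase: untimed calls give the JIT a chance to compile the hot code first.
        timePerCallNanos(20000, 1000);

        // Timed phase: repeat the measurement to see how noisy the results are.
        for (int run = 1; run <= 5; run++) {
            double nanos = timePerCallNanos(20000, 1000);
            System.out.printf("run %d: %.1f ns per call%n", run, nanos);
        }
    }
}
```

Running it as `java -XX:+PrintCompilation -verbose:gc WarmupBenchmark` should show whether compilation or garbage collection still happens during the timed runs. If it does, the warm-up count here is too low for that JVM and needs raising.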
http://java.sun.com/developer/JDCTechTips/2004/tt1116.html#2
Pooling Threads to Execute Short Tasks (Page last updated November 2004, Added 2004-12-30, Author John Zukowski, Publisher Sun). Tips:
- You should consider pooling threads to execute short tasks.
- J2SE 5.0 java.util.concurrent has concurrency utilities that provide a pre-built thread pooling framework.
- The primary thread pooling interface in 5.0 is Executor. An Executor implementation which does no pooling would be:
class MyExecutor implements Executor { public void execute(Runnable r) {new Thread(r).start();} }
- The 5.0 ThreadPoolExecutor class supports many common pooling operations. You can specify options such as pool size, keep alive time, a thread factory, and a handler for rejected threads.
- The 5.0 ExecutorService interface extends Executor, providing a submit method that allows you to submit a Runnable to be executed asynchronously by the pool and later get a result back: the submit method returns a Future object that you can use to check if the task is done.
- The new 5.0 Executors.newFixedThreadPool factory method creates a fixed-size thread pool that supports both Executor and ExecutorService.
- [Article provides an example of using 5.0 thread pool classes].
- 5.0 allows you to create scheduled thread pools, where you can schedule tasks to run "later".
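As a rough sketch of the pooling classes described above (not the article's own example), the following submits short tasks to a fixed-size pool and collects their results through Future objects, then schedules one task to run after a delay. The class name PoolDemo and both helper methods are hypothetical. The tasks are written as Callable lambdas so that they return values, which assumes Java 8 or later; on 5.0 itself they would be anonymous inner classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo of a fixed thread pool and a scheduled thread pool. */
final class PoolDemo {

    // Submits one short task per number 1..taskCount and sums the squares the tasks return.
    static int sumOfSquares(int poolSize, int taskCount) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<Future<Integer>>();
            for (int i = 1; i <= taskCount; i++) {
                final int n = i;
                Callable<Integer> task = () -> n * n;
                futures.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // blocks until that task has finished
            }
            return total;
        } finally {
            pool.shutdown(); // stop accepting tasks and let the worker threads exit
        }
    }

    // Schedules a task to run after the given delay and waits for its result.
    static String runLater(long delayMillis) throws Exception {
        ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
        try {
            Callable<String> task = () -> "done";
            ScheduledFuture<String> pending =
                    scheduler.schedule(task, delayMillis, TimeUnit.MILLISECONDS);
            return pending.get();
        } finally {
            scheduler.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sum of squares 1..10 = " + sumOfSquares(4, 10));
        System.out.println("scheduled task returned: " + runLater(100));
    }
}
```

Shutting the pool down in a finally block matters because the worker threads of a fixed pool are not daemon threads by default, so a pool that is never shut down can keep the JVM alive.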
http://java.sun.com/j2se/1.5.0/docs/guide/concurrency/overview.html
Concurrency Utilities Overview (Page last updated October 2004, Added 2004-12-30, Author Sun, Publisher Sun). Tips:
- The 5.0 Concurrency Utilities include a high-performance, flexible thread pool; a framework for asynchronous execution of tasks; a host of collection classes optimized for concurrent access; synchronization utilities such as counting semaphores; atomic variables; locks; and condition variables.
- The 5.0 Executor framework is a framework for standardizing invocation, scheduling, execution, and control of asynchronous tasks according to a set of execution policies. Implementations are provided that allow tasks to be executed within the submitting thread, in a single background thread (as with events in Swing), in a newly created thread, or in a thread pool, and developers can create implementations of Executor supporting arbitrary execution policies. The built-in implementations offer configurable policies such as queue length limits and saturation policy which can improve the stability of applications by preventing runaway resource consumption.
- 5.0 includes several new Collections classes including the Queue and BlockingQueue interfaces, and high-performance, concurrent implementations of Map, List, and Queue.
- 5.0 has classes for atomically manipulating single variables (primitive types or references), providing high-performance atomic arithmetic and compare-and-set methods. The atomic variable implementations in java.util.concurrent.atomic offer higher performance than would be available by using synchronization (on most platforms), making them useful for implementing high-performance concurrent algorithms as well as conveniently implementing counters and sequence number generators.
- General-purpose 5.0 synchronization classes include semaphores, mutexes, barriers, latches, and exchangers, which facilitate coordination between threads.
- The 5.0 java.util.concurrent.locks package provides a high-performance lock implementation with the same memory semantics as synchronization, but which also supports specifying a timeout when attempting to acquire a lock, multiple condition variables per lock, non-lexically scoped locks, and support for interrupting threads which are waiting to acquire a lock.
- From 5.0 the System.nanoTime method enables access to a nanosecond-granularity time source for making relative time measurements, and methods which accept timeouts (such as BlockingQueue.offer, BlockingQueue.poll, Lock.tryLock, Condition.await, and Thread.sleep) can take timeout values in nanoseconds. The actual precision of System.nanoTime is platform-dependent.
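Two of the utilities listed above, atomic variables and locks with timeouts, might be used along these lines. This is a minimal sketch under stated assumptions: the class ConcurrencyUtilsDemo and its methods are hypothetical names, and the thread counts and timeout values are arbitrary. The lambdas assume Java 8 or later.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical demo of an atomic counter and a lock acquired with a timeout. */
final class ConcurrencyUtilsDemo {

    private static final Lock LOCK = new ReentrantLock();

    // Increments a shared counter from several threads without using synchronized.
    static long countConcurrently(int threads, int incrementsPerThread)
            throws InterruptedException {
        final AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet(); // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join(); // wait for every worker before reading the total
        }
        return counter.get();
    }

    // Runs the action under the lock, or gives up if the lock is not acquired in time.
    static boolean runWithTimeout(Runnable action, long timeoutMillis)
            throws InterruptedException {
        if (!LOCK.tryLock(timeoutMillis, TimeUnit.MILLISECONDS)) {
            return false; // lock was busy for the whole timeout
        }
        try {
            action.run();
            return true;
        } finally {
            LOCK.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("final count: " + countConcurrently(4, 10000));
        boolean ran = runWithTimeout(() -> System.out.println("holding the lock"), 50);
        System.out.println("action ran: " + ran);
    }
}
```

With plain synchronized there is no way to abandon a lock attempt, which is the case tryLock with a timeout addresses. The unlock call sits in a finally block because these locks are not lexically scoped and are not released automatically.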
http://www.fawcette.com/javapro/2004_11/magazine/features/pvarhol/
Performance process (Page last updated November 2004, Added 2004-12-30, Author Peter Varhol, Publisher JavaPro). Tips:
- Response time is the ability of an application to return a result or be ready for the next action by an individual user. Response time is a critical measure of the usability of an application.
- Scalability is the ability of the application to service the necessary number of users while maintaining an adequate response time.
- Good scalability means that as users are added, response time should degrade gradually, rather than crash or prevent further activity all at once.
- Poor user response time or inadequate scalability costs money, directly or indirectly: it either turns users away during peak times; or seems too unresponsive to keep users engaged; or makes users less productive.
- Poor performance can be a symptom of more serious application problems, such as object leaks or a database call that brings too much data into the application.
- Application users need to define performance as a critical application characteristic.
- You should use tools that assess, forecast, measure, and improve performance at each step of the application lifecycle.
- Include performance characteristics as a part of the application-approval process.
- Underestimating response time and scalability is almost as bad as not specifying them at all; overestimating them entails excessive design, development, and testing efforts that can be expensive and inefficient.
- You should factor in how many users there potentially could be in the future.
- Testing with trivial test cases and unrepresentative data produces the wrong performance data.
- When the application is stressed with real data sets it produces far more representative performance data.
- You should prototype at least two designs and test them for performance and scalability - doing so could save excessive performance tuning later in the development lifecycle.
- Reducing the amount of time it takes to execute lines of code helps the code execute faster.
- Excessive use of temporary objects in code makes the garbage collector work harder, causing more CPU usage.
- Load testing is essential. There are many load testers, including TestMaker, Apache JMeter, and Mercury Interactive.
- It is important to perform final testing on the same systems and configurations that will be used in production.
- Collect data on the application itself, rather than the underlying server and network infrastructure.
- Production monitoring tools need to deliver good diagnostics data with only a small performance impact.
- Try to apply infrastructure constraints during application design, so that the designer is not permitted to apply options that violate those constraints.
- Your project should have a performance life cycle that runs parallel to the application life cycle.
http://www-106.ibm.com/developerworks/websphere/library/techarticles/0412_kochuba/0412_kochuba.html
Developing a WebSphere client to determine a hung thread problem (Page last updated December 2004, Added 2004-12-30, Author James Kochuba, Publisher IBM). Tips:
- [Article describes the steps needed to create a WebSphere client that can identify hung threads in the server].
http://developers.sun.com/techtopics/mobility/midp/ttips/gamecanvas/
Game Canvas Basics (Page last updated December 2004, Added 2004-12-30, Author Eric Giguere, Publisher Sun). Tips:
- The shortcoming of Canvas is that it gives the application no control over when a canvas repaints itself - all it can do is request a repaint - or over how quickly key and pointer events get delivered to the canvas. GameCanvas was designed specifically to fix these weak points.
- MIDP user-interface components are event-driven. GameCanvas is different: It lets the application poll for key events quickly and repaint the canvas in a timely fashion. This polling and repainting is normally done in a loop on a separate thread, hence the term game loop.
- To poll for key events, use getKeyStates().
- A game canvas uses a technique called double buffering: You perform drawing operations in an off-screen buffer, then quickly copy from the buffer to the visible area of the canvas. (The canvas automatically creates and manages the off-screen buffer.)
- Each call to getGraphics() returns a new instance, so you should call the method once outside the game loop and save the reference.
- To update the display after drawing, a call to flushGraphics() will force an immediate repaint that's based on the current contents of the off-screen buffer.
http://www-106.ibm.com/developerworks/websphere/library/techarticles/0411_persichetti/0411_persichetti.html
Heapdump (Page last updated November 2004, Added 2004-12-30, Author Jiwu Tao, Priamo Persichetti, Publisher IBM). Tips:
- Profiling memory use and debugging memory leaks in application code in the early stages of development is considered a best practice for application development, providing early detection of memory problems long before the production stage and averting major architectural changes late in the application life cycle.
- Solving a memory leak problem is a two-step process: identify which Java classes caused the memory leak, and then determine where in the application the leak occurred.
- Tracking memory information in a profiling tool slows down performance.
- [Article runs through using the Heapdump tool to determine the cause of a memory leak].
Jack Shirazi
Last Updated: 2024-08-26
Copyright © 2000-2024 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
URL: http://www.JavaPerformanceTuning.com/news/newtips049.shtml