Java Performance Tuning
Java(TM) - see bottom of page
Tips February 2004
Back to newsletter 039 contents
http://www.javaperformancetuning.com/articles/randomaccess.shtml
Fast random access (Page last updated February 2004, Added 2004-02-29, Author Jack Shirazi, Publisher JavaPerformanceTuning.com). Tips:
- A java.util.List object which implements RandomAccess should be faster when using List.get() than when using Iterator.next().
- Use instanceof RandomAccess to test whether to use List.get() or Iterator.next() to traverse a List object.
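The two tips above can be sketched as follows. This is a minimal illustration, not code from the article; the class name ListTraversal and the method sum are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedList;
import java.util.List;
import java.util.RandomAccess;

class ListTraversal {
    // Sums a list, choosing the traversal style from the RandomAccess marker interface.
    static long sum(List<Integer> list) {
        long total = 0;
        if (list instanceof RandomAccess) {
            // Indexed access is cheap for lists such as ArrayList.
            for (int i = 0, n = list.size(); i < n; i++) {
                total += list.get(i);
            }
        } else {
            // Sequential lists such as LinkedList: get(i) would walk the list on every call.
            for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
                total += it.next();
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<Integer> arrayList = new ArrayList<>(Arrays.asList(1, 2, 3));
        List<Integer> linkedList = new LinkedList<>(Arrays.asList(1, 2, 3));
        System.out.println(sum(arrayList));
        System.out.println(sum(linkedList));
    }
}
```

Both branches return the same result; only the traversal cost differs, and how much it differs depends on the list implementation and the JVM.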
http://www-106.ibm.com/developerworks/library/j-perf02104.html
Exceptions best practice (Page last updated February 2004, Added 2004-02-29, Author Jack Shirazi, Kirk Pepperdine, Publisher IBM). Tips:
- The major differentiator between an exception and any other object is that it can be thrown and caught.
- Handling a thrown exception is quite an expensive proposition: the athrow instruction causes the JVM to pop the exception object off the top of the execution stack. It then searches the current execution stack frame looking for the first catch clause that can handle an exception of that class, or one of its superclasses. If no catch block is found in the current stack frame, then the current stack frame is released and the exception is re-thrown in the context of the next stack frame, and so on until a stack frame with a suitable catch clause is found, or the bottom of the execution stack is reached. Ultimately, if no appropriate catch block is found, all of the stack frames are released, and the thread is terminated after the ThreadGroup object has been given a chance to handle the exception (see ThreadGroup.uncaughtException). If an appropriate catch block is found, the program counter is reset to the first line of code in that block.
- If using the instanceof operator avoids creating an exception, it is a much less expensive operation in both memory and execution resources than creating and handling an exception.
- The data used affects the performance of an algorithm.
- Reserve exceptions for exceptional circumstances, and use checks to avoid throwing exceptions in non-exceptional circumstances; both choices are ideal for performance.
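The instanceof tip can be illustrated with a small sketch. The class and method names are hypothetical, and the example assumes that a non-String value is an ordinary, non-exceptional input.

```java
class CheckBeforeCast {
    // Exception-driven version: every non-String input creates, throws and
    // catches a ClassCastException. (A null input would raise
    // NullPointerException here instead, which this sketch does not handle.)
    static int lengthUsingException(Object value) {
        try {
            return ((String) value).length();
        } catch (ClassCastException e) {
            return -1;
        }
    }

    // Check-driven version: instanceof filters out non-String inputs
    // (including null) without creating any exception object.
    static int lengthUsingCheck(Object value) {
        if (value instanceof String) {
            return ((String) value).length();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(lengthUsingException("abc"));
        System.out.println(lengthUsingException(Integer.valueOf(42)));
        System.out.println(lengthUsingCheck("abc"));
        System.out.println(lengthUsingCheck(Integer.valueOf(42)));
    }
}
```

If non-String values are rare in the data, the exception version costs little in practice; this is one way the data used affects which form performs better.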
http://cocoonhive.org/articles/LoadControlFilter/load_control_filter.htm
Using a Request Filter to Limit the Load on Web Applications (Page last updated January 2004, Added 2004-02-29, Author Kevin Chipalowsky, Ivelin Ivanov, Publisher cocoonhive.org). Tips:
- When a web application is running a little slow users can "try to help it along" by clicking on another link, refreshing the page, using the Back button, or otherwise sending more requests to the server: this makes the situation worse by increasing the load.
- When a single user sends multiple concurrent requests to a server, they usually care most about the last request sent.
- To control the load and prevent unnecessary processing, restrict the server so that it only processes one request at a time per session.
- Use a session queue to control session requests: the queue holds a maximum of one request at a time; a new request always replaces an old request in the queue, except for the request that is currently being processed by the application.
- Use a filter that synchronizes client requests and restricts the load each user can put on your applications.
- It would be possible to build a filter that synchronizes requests spread across multiple servers within the same session, but the overhead may exceed the potential gain. In those environments, an easier solution to performance problems may be to increase the size of the server farm.
- Some types of requests should not be queued: Requests for images and other static resources; Concurrent requests from different browser windows within the same session; Concurrent requests for different frames; Multi-threaded robots spidering a website.
- Time limit the requests waiting in the queue so that the user does not wait too long on a single long running request: if the server does not complete a request within a set amount of time, the filter allows the server to start working on the next request waiting in the queue.
- Pages that take more than five seconds to complete should rarely exist if we expect our users to be happy.
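The single-slot session queue described above might look like the following sketch. It is not the article's filter: the class name SingleSlotQueue and its methods are hypothetical, and the servlet plumbing (storing one queue per session, releasing displaced requests, timing out waiters) is deliberately left out.

```java
import java.util.concurrent.atomic.AtomicReference;

// A queue that holds at most one waiting request. The request currently
// being processed is not stored here, so it can never be displaced.
class SingleSlotQueue<T> {
    private final AtomicReference<T> waiting = new AtomicReference<>();

    // Stores the new request and returns the one it displaced, or null.
    // The caller is expected to abandon the displaced request.
    T offer(T request) {
        return waiting.getAndSet(request);
    }

    // Takes the waiting request, leaving the slot empty; null if none.
    T poll() {
        return waiting.getAndSet(null);
    }

    public static void main(String[] args) {
        SingleSlotQueue<String> queue = new SingleSlotQueue<>();
        queue.offer("/report?page=1");
        String displaced = queue.offer("/report?page=2"); // user clicked again
        System.out.println("displaced: " + displaced);
        System.out.println("next to process: " + queue.poll());
    }
}
```

In a real filter, an instance of such a queue would presumably be kept as a session attribute, with the filter calling poll() each time the in-flight request completes.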
http://www.sys-con.com/weblogic/article.cfm?id=395
How to Diagnose a Performance Problem in a J2EE System (Page last updated November 2003, Added 2004-02-29, Author John Bley, Publisher WeblogicDJ). Tips:
- Memory/Resource leaks degrade performance over time; fix by removing the cause of the leak. Common causes include: adding but never removing elements to Maps and Lists; missing finally blocks which should close() a resource.
- Remote communications can cause consistent slowness; fix by reducing request frequency, batching, breaking up large requests, and tuning the requests.
- CPU-bound components can often be improved using caching.
- Tune pool sizes to handle the highest expected load.
- Handle retries in an efficient fashion i.e. build in design elements to handle retried requests without causing large overloads.
- Measure total memory in use at various levels (JVM heap, OS), using Java heap profilers, GC logging (-verbose:gc), and tools like top, vmstat, and Windows Perfmon.
- Measure CPU time in aggregate per component or per method using OS level statistics and with a Java profiler.
- Measure wall-clock time: Per transaction, per component, per method; an application monitoring solution is probably your best bet to obtain this data.
- Measure internal resources: Number allocated, number in use, number of waiting clients, average wait time to obtain a resource, average time spent using the resource, average time it takes the resource to accomplish requested work. Application servers typically give some minimal visibility into these numbers.
- Measure external resources: Number allocated, number in use, number of waiting clients, average wait time, plus measurements directly on the external system such as its view of how quickly it's completing requested work. Don't forget that the operating system and hardware that the application server runs on are "external resources" as well - e.g., are you using too many processes or ports? Measuring these resources comes in two forms - measuring the bridge layer to that resource from inside the JVM and measuring the external resource with a tool native to that resource.
- Network utilization: Bandwidth usage, latency. Network sniffers and equipment give insight into this, though OS-local tools like netstat can help too.
- Measure system state: Use thread dumps, logs and trace files, stack traces, etc.
- Measuring load: watch the system's behavior over time; vary the load; compartmentalize the system and stress the individual parts in turn.
- To distinguish between poor coding and a bottleneck, try looking at aggregate CPU usage. If it doesn't vary under load but overall response time does, then the application is spending most of its time waiting.
- Architectural diagrams can help you understand overall interactions inside the system.
- Coding mistakes or misunderstandings of architectural intent may make the actual behavior of the system vary from what's expected.
- Trust hard numbers from a performance tool more than a document claiming that only one SQL statement will be issued per user transaction.
- Unless you have specific data to prove otherwise, investigate the simpler theory more fully than the more complex one.
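The tip about comparing CPU usage with response time can be tried on a small scale from inside the JVM. The sketch below is an illustration under assumptions, not the article's method: the class WaitOrWork and its methods are hypothetical, and it measures only the current thread using the standard ThreadMXBean API, where the article talks about aggregate figures from OS tools and profilers.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class WaitOrWork {
    // Returns {wallNanos, cpuNanos} for running the task on the current thread.
    // cpuNanos is -1 when the JVM cannot report per-thread CPU time.
    static long[] measure(Runnable task) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        boolean cpuAvailable = threads.isCurrentThreadCpuTimeSupported()
                && threads.isThreadCpuTimeEnabled();
        long cpuStart = cpuAvailable ? threads.getCurrentThreadCpuTime() : 0L;
        long wallStart = System.nanoTime();
        task.run();
        long wall = System.nanoTime() - wallStart;
        long cpu = cpuAvailable ? threads.getCurrentThreadCpuTime() - cpuStart : -1L;
        return new long[] { wall, cpu };
    }

    // Stands in for a component that waits on an external resource.
    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        long[] t = measure(() -> sleepQuietly(200));
        System.out.println("wall ms: " + t[0] / 1_000_000 + ", cpu ms: " + t[1] / 1_000_000);
    }
}
```

A large gap between wall-clock time and CPU time suggests the code is mostly waiting (on I/O, locks, or a remote system) rather than computing.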
http://www-106.ibm.com/developerworks/java/library/j-nioserver/
Servlet API and NIO (Page last updated February 2004, Added 2004-02-29, Author Taylor Cowan, Publisher IBM). Tips:
- Multiplexed I/O allows a growing number of users to be served by a fixed number of threads.
- [Article shows how to write a Servlet-based Web server using NIO].
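Multiplexing with a Selector can be sketched without any servlet machinery. The example below is not the article's server: it uses in-process Pipe channels as stand-ins for client sockets (an assumption made so the sketch runs without a network), and the class name SelectorSketch is hypothetical. One thread services every channel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

class SelectorSketch {
    // A single thread waits on channelCount channels and returns how many
    // messages it read. Pipes stand in for client connections.
    static int readAll(int channelCount) {
        try (Selector selector = Selector.open()) {
            Pipe[] pipes = new Pipe[channelCount];
            for (int i = 0; i < channelCount; i++) {
                pipes[i] = Pipe.open();
                pipes[i].source().configureBlocking(false);
                pipes[i].source().register(selector, SelectionKey.OP_READ);
                byte[] message = ("msg" + i).getBytes(StandardCharsets.US_ASCII);
                pipes[i].sink().write(ByteBuffer.wrap(message));
            }
            int messages = 0;
            ByteBuffer buffer = ByteBuffer.allocate(64);
            while (messages < channelCount) {
                selector.select(); // blocks until at least one channel is readable
                for (SelectionKey key : selector.selectedKeys()) {
                    buffer.clear();
                    Pipe.SourceChannel source = (Pipe.SourceChannel) key.channel();
                    if (source.read(buffer) > 0) {
                        messages++;
                    }
                }
                selector.selectedKeys().clear(); // keys must be cleared by hand
            }
            for (Pipe pipe : pipes) {
                pipe.sink().close();
                pipe.source().close();
            }
            return messages;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readAll(3));
    }
}
```

Raising channelCount adds channels but not threads, which is the property the first tip describes.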
http://sys-con.com/story/?storyid=43555
NIO (Page last updated February 2004, Added 2004-02-29, Author Vish Krishnan, Publisher JDJ). Tips:
- Minimize the impact of working with slow I/O mediums and maximize throughput and performance, using vectored I/O (scatter/gather) and multiplexing I/O.
- Blocked I/O and unbuffered streams can be inefficient.
- Scattering and gathering, also known as vectored I/O, are widely used for developing high-performance I/O applications. Most communicated packets and data files have several components. Scattering lets the data be read into multiple separate buffers in a single invocation. Gathering lets data from multiple separate buffers be written in a single invocation. This technique avoids the need for multiple system calls to perform the transfers, combining them into one optimized system call. The result is a performance boost from optimized data transfers to and from variable-size buffers.
- Polling burns CPU cycles and is therefore considered inefficient. An NIO Selector allows multiple I/O resources to be checked for I/O availability simultaneously (and so efficiently).
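A minimal sketch of vectored I/O, assuming a record made of a fixed-size header followed by a body. The class name ScatterGather, the record layout and the temporary file are all illustrative choices; FileChannel's array-based write and read methods are the standard gathering and scattering calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class ScatterGather {
    // Writes header and body with a gathering write, reads them back with a
    // scattering read, and returns "header|body".
    static String roundTrip(String headerText, String bodyText) {
        ByteBuffer header = ByteBuffer.wrap(headerText.getBytes(StandardCharsets.US_ASCII));
        ByteBuffer body = ByteBuffer.wrap(bodyText.getBytes(StandardCharsets.US_ASCII));
        try {
            Path file = Files.createTempFile("gather", ".bin");
            try {
                try (FileChannel out = FileChannel.open(file, StandardOpenOption.WRITE)) {
                    ByteBuffer[] parts = { header, body };
                    // Gathering write: both buffers are passed in one call.
                    while (header.hasRemaining() || body.hasRemaining()) {
                        out.write(parts);
                    }
                }
                ByteBuffer headerIn = ByteBuffer.allocate(header.capacity());
                ByteBuffer bodyIn = ByteBuffer.allocate(body.capacity());
                try (FileChannel in = FileChannel.open(file, StandardOpenOption.READ)) {
                    ByteBuffer[] parts = { headerIn, bodyIn };
                    // Scattering read: fills headerIn first, then bodyIn.
                    while ((headerIn.hasRemaining() || bodyIn.hasRemaining())
                            && in.read(parts) >= 0) {
                        // keep reading until both buffers are full or EOF
                    }
                }
                return new String(headerIn.array(), StandardCharsets.US_ASCII)
                        + "|" + new String(bodyIn.array(), StandardCharsets.US_ASCII);
            } finally {
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("HDR1", "payload"));
    }
}
```

Whether the JVM maps each call to a single native vectored call depends on the platform and channel type, so any speedup should be measured rather than assumed.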
http://sys-con.com/story/?storyid=43549
HTTP Session Garbage Collector (Page last updated February 2004, Added 2004-02-29, Author Abhinasha Karana, Publisher JDJ). Tips:
- Failure to remove cached data from HTTP sessions may lead to memory leakage, which becomes noticeable when a user HTTP session continues for hours.
- Each request should go through a cache garbage collector which should delete any unneeded elements.
- A cache framework using softly referenced objects is an efficient mechanism to avoid HTTP session memory leaks.
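A softly referenced cache of the kind mentioned in the last tip could be sketched as below. This is an assumption-laden illustration rather than the article's framework: the class name SoftCache is hypothetical, and storing one instance per HTTP session is left to the caller.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Values are held through SoftReferences, so the garbage collector may
// reclaim them when memory runs low instead of letting the cache leak.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    void put(K key, V value) {
        entries.put(key, new SoftReference<>(value));
    }

    // Returns the cached value, or null if it is absent or was reclaimed.
    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            entries.remove(key, ref); // drop the stale entry
        }
        return value;
    }

    public static void main(String[] args) {
        SoftCache<String, String> cache = new SoftCache<>();
        cache.put("report", "cached report data");
        System.out.println(cache.get("report"));
        System.out.println(cache.get("missing"));
    }
}
```

Callers must always be prepared for get() to return null and rebuild the value, since reclamation timing is up to the JVM.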
http://www.microjava.com/articles/techtalk/optimization
J2ME Game Optimization Secrets (Page last updated January 2004, Added 2004-02-29, Author Mike Shivas, Publisher MicroJava). Tips:
- Skill and Action games: Refresh rates must be at least 10fps (frames per second); there has to be enough action to keep gamers challenged; they must be extremely responsive to user input.
- Providing lots of graphical activity at high framerates while responding quickly to key-presses is why code for real-time games has to be fast.
- If you're not writing a Skill and Action game, there's probably no need to optimize the game, unless the game does a great deal of processing between moves.
- J2ME optimization challenges include that your optimized code might run faster on an emulator, but slower on the actual device, or vice versa; and optimizing for one handset might actually decrease performance on another.
- To turn on the Profiler Utility in the J2ME Wireless Toolkit, select the Preferences item from the Edit menu. This will bring up the Preferences window. Select the Monitoring tab, check the box marked "Enable Profiling", and click the OK button. Now run the program in the emulator and then exit; the Profiler window appears once the emulator has exited.
- Close any background applications, like email clients, and keep activity to a minimum while you're running tests.
- You shouldn't bother optimizing code in your game that's outside the main game loop. Only optimize where it counts.
- Most likely the vast proportion of execution time in a real videogame is spent in the paint() method. Graphics routines take a very long time when compared to non-graphical routines. As graphics routines are not optimizable, you need to make smart decisions about which ones you use and how you use them.
- Using the right (i.e. fastest) algorithm will increase performance much more than using low-level techniques to improve a mediocre algorithm.
- Leave as much code as possible outside of your loops.
- Strings can be a huge memory drain if they're not controlled. Use String constants and StringBuffers.
- The behavior of getGraphics() when called multiple times on the same Image is ill-defined in J2ME and differs across platforms so you cannot use optimizations involving that consistently.
- A technique for speeding up text redraws (like player scores) is to calculate the size of the text and only redraw the clipped area (use setClip()).
- Method speed: synchronized methods are the slowest; interface methods are the next slowest; instance methods are in the middle; final methods are faster; static methods are fastest.
- Marking methods as final static can provide a small speedup.
- Techniques like loop unrolling, strength reduction, common sub-expression elimination, comparing to zero, eliminating method parameters, optimized switch blocks, multiplication instead of division, eliminating casts and using bitshift operators can provide a small speedup.
- Display.callSerially() lets you leave out calls to wait(), but may not work properly. The system will ensure that your work() and paint() methods are called in synch with the user input routines, so your game will remain responsive.
- A lot of the problems presented by game programming such as fast 3D geometry and collision detection have already been solved very elegantly and efficiently; use existing code to handle these.
- Use the profiler to see where to optimize; the profiler won't help you on the device, so use the System timer on the hardware.
- Drawing is slow, so use the Graphics calls as sparingly as possible.
- Pre-calculate and cache like crazy.
- Where possible, remove method calls altogether.
- You can use bit operators to implement circular loops instead of modulo.
- Cache array elements.
- Local variables are faster than instance variables.
- Look inside your Fixed Point math library and optimize it.
- Use proprietary high-performance APIs with care to preserve portability.
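The tip about using bit operators instead of modulo for circular loops can be sketched as follows. The class name RingIndex and the buffer size are illustrative, and the trick assumes the size is a power of two and the index is non-negative.

```java
class RingIndex {
    static final int SIZE = 8;        // must be a power of two for the mask to work
    static final int MASK = SIZE - 1; // binary 0111

    // Conventional wrap-around using the remainder operator.
    static int nextWithModulo(int index) {
        return (index + 1) % SIZE;
    }

    // Same result for non-negative indexes, using a bitwise AND instead.
    static int nextWithMask(int index) {
        return (index + 1) & MASK;
    }

    public static void main(String[] args) {
        int index = 0;
        for (int step = 0; step < 10; step++) {
            index = nextWithMask(index);
            System.out.print(index + " ");
        }
        System.out.println();
    }
}
```

On a modern JIT-compiled JVM the difference is likely negligible; the tip targets the interpreted or lightly optimized VMs found on handsets of the time, so measure on the device.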
http://www.devx.com/Java/Article/19938
Translets (Page last updated January 2004, Added 2004-02-29, Author Raghu Donepudi, Publisher devX). Tips:
- Optimize XSL transformation using Translets. Translets are precompiled XSL documents that are optimized and converted into simple Java classes. When you compile your application Java files, you compile your XSL files into Java class files. During runtime, you can load translets like any regular Java class and perform XSL transformations over and over again. The syntax checking and parsing of XSL documents are done when the XSL files are compiled. The transformation therefore takes only as long as the compiled code takes to execute, which can improve performance severalfold.
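A related sketch using the standard JAXP API: a Templates object holds a stylesheet that has been parsed and compiled once and can then be reused for many transformations. Note that this compiles the stylesheet at runtime, on first use, rather than at build time as the article describes. With XSLTC-based processors, which to my knowledge includes the JDK's built-in default, the Templates object is backed by a translet. The class name CompiledStylesheet and the sample stylesheet are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class CompiledStylesheet {
    private static final String XSL =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/greeting'>Hello, <xsl:value-of select='@name'/>!</xsl:template>"
      + "</xsl:stylesheet>";

    // Parsed and compiled once; a Templates instance may be shared across threads.
    private static final Templates TEMPLATES = compile(XSL);

    static Templates compile(String xsl) {
        try {
            return TransformerFactory.newInstance()
                    .newTemplates(new StreamSource(new StringReader(xsl)));
        } catch (TransformerException e) {
            throw new IllegalStateException("stylesheet failed to compile", e);
        }
    }

    // Each call reuses the compiled stylesheet; only a lightweight Transformer is created.
    static String transform(String xml) {
        try {
            StringWriter out = new StringWriter();
            TEMPLATES.newTransformer()
                    .transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(transform("<greeting name='World'/>"));
    }
}
```

The saving comes from not re-parsing the stylesheet on every request; the size of the gain depends on stylesheet complexity and document size.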
Jack Shirazi
Last Updated: 2024-11-29
Copyright © 2000-2024 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
URL: http://www.JavaPerformanceTuning.com/news/newtips039.shtml
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss