Java Performance Tuning
Java(TM) - see bottom of page
Our valued sponsors who help make this site possible
JProfiler: Get rid of your performance problems and memory leaks!
Tips February 28th, 2003
Get rid of your performance problems and memory leaks!
Get rid of your performance problems and memory leaks!
Back to newsletter 027 contents
BEA JMS performance (Page last updated January 2003, Added 2003-02-28, Author Peter Zadrozny, Publisher BEA). Tips:
- The only way to understand JMS performance is by testing your own application (or a proof of concept).
- Performance of asynchronous messaging systems is typically measured based on throughput, e.g. messages per second
- Throughput is a measure of capacity, not speed, i.e. consumer performance, not response time.
Swing perf community chat (Page last updated January 2003, Added 2003-02-28, Author Edward Ort, Publisher Sun). Tips:
- For trees that contain many nodes, you can set the rowHeight to a fixed value, and set the largeModel property to true. On the down side, this makes your model queried much more often (very little is cached), but can improve memory and performance usage.
- VariableHeightLayoutCache creates one object per visible node to track its size, whereas FixedHeightLayoutCache will only create an object per expanded node. The tradeoff between Fixed* and Variable* is that the model is queried much more often.
- To garbage collect non-visible nodes, send a TreeModelEvent indicating that the nodes are no longer in the model.
- Use RepaintManager.setRepaint(false) if you only need frame animation, not simple component repaints.
- Gaming apps may want to minimize the number of components to maximize frame rates. Swing components have events and rendering overhead that are unnecessary need for simple sprite-based games.
- The single biggest difference between so-so and really great Swing apps has to do with the way developers handle threading issues.
- Create your objects lazily; if you don't need it yet, don't waste time creating it.
- Watch the number of classes you create, since each class adds some amount of overhead (both memory and performance).
- Translucent components or rendering looks great, but can cause some performance bottlenecks in the current releases (1.4.1).
- JFileChooser on large datasets is slow in the current releases (1.4.1).
- The -Xmx flag will let you specify a larger heap.
- getScaledInstance() is not accelerated, so if you are concerned about performance, you might use the drawImage() call that scales during the copy, or cache a new scaled image for repeated use.
- Use a profiler to determine bottlenecks.
- When you call repaint on a bunch of components, the RepaintManager unions the dirty regions together and makes one giant rectangle to repaint. This is good in the general case, but for cases with LOTS and LOTS of small components, things bog down. You can create your own RepaintManager to just repaint the right parts of the screen.
- Don't spawn more rendering threads. Multi- threaded rendering can at the very least buy you nothing in terms of rendering performance. But it can also lead to deadlock situations and undefined behavior.
- Calls like drawLine or drawText are pretty easy for the printer to do. But images have to be scaled and printed completely, which can be very slow. Rendering to a buffer, then printing is probably bad for printing performance.
- Internationalizing fastest to slowest, hold resources in: Java class; Property files; ListResourceBundle (XML)?
Preventing Repeated Operations (Page last updated January 2003, Added 2003-02-28, Author Mark Johnson, Publisher Sun). Tips:
- Generate a unique symbol that can be embedded as a hidden input in the form being processed to ensure that operations are processed only once.
IBM GC 1 (Page last updated August 2002, Added 2003-02-28, Author Sam Borman, Publisher IBM). Tips:
- If the frequency of GCs is too high prior to reaching a steady state, use verbosegc to determine the size of the heap at a steady state and set -Xms to this value.
- If the heap is fully expanded and the occupancy level is greater than 70%, increase the -Xmx value so the heap is not more than 70% occupied. For best performance, try to make sure that the heap never pages. The maximum heap size should, if possible, fit in physical memory.
- If, at 70% occupancy, the frequency of GCs is too great, change the setting of -Xminf. The default is 0.3, which will try to maintain 30% free space by expanding the heap. A setting of 0.4 increases this free space target to 40%, reducing the frequency of GCs.
- If pause times are too long, try using -Xgcpolicy:optavgpause (introduced in 1.3.1), which reduces pause times and makes them more consistent as the heap occupancy rises. There is a cost to pay in throughput. This cost varies and will be about 5%.
- Make sure the heap never pages; the maximum heap size must fit in physical memory.
- Avoid finalizers. If you do use finalizers, try to avoid allocating objects in the finalizer method. A verbosegc trace shows if finalizers are being called.
- Avoid compaction. Compaction is usually caused by requests for large memory allocations. Analyze requests for large memory allocations and avoid them if possible; if they are large arrays, try to split them into smaller pieces.
IBM GC 2 (Page last updated August 2002, Added 2003-02-28, Author Sam Borman, Publisher IBM). Tips:
- GC is done in three phases: mark, sweep, and optionally compaction. [Article describes the GC algorithm].
IBM GC 3 (Page last updated September 2002, Added 2003-02-28, Author Sam Borman, Publisher IBM). Tips:
- The output from -verbosegc lets you analyze the garbage collections. [Article describes the IBM JVM -verbosegc output].
GC (Page last updated January 2003, Added 2003-02-28, Author Sumit Chawla, Publisher IBM). Tips:
- Once the application has reached a steady state garbage collection, where heap expansions are no longer required, a good value for the startup heap size has been determined.
- Determine the maximum heap size by stressing the application and finding the value for -Xmx that avoids an OutOfMemory error.
- Keep the heap small enough to avoid paging. The heap size must never exceed the amount of physical memory installed on the system.
- If the heap size is too small, there will be frequent garbage collection cycles.
- The time to complete a full garbage collection cycle grows directly proportional to the size of the heap, so if the heap is too large, this can lead to long delays in the application.
- A common performance-tuning measure is to make the initial heap size (-Xms) equal to the maximum heap size (-Xmx). Since no heap expansion or contraction occurs, this can result in significant performance gains in some situation.
- Usually, only the applications that need to handle a surge of allocation requests keep a substantial difference between initial and maximum heap size.
- If -Xms is different from -Xmx the application can run into a scenario where too many allocation failures are occurring, but the heap doesn't expand - known as heap thrashing. Increase the -Xminf and -Xmaxf values to avoid this.
- Finalizers are a bad idea, and performing allocations from inside the finalizers should be avoided at all costs.
- Break down large (>500 KB) requests to smaller chunks, if possible. If the heap is fragmented, it is possible to attain an out-of-memory condition when there is plenty of space if trying to allocate a very large object (i.e. large array) which cannot fit into any of the fragments.
- Do not ignore the "mark stack overflow" messages. These indicate an inappropriate object retention problem.
- Concurrent GC smooths the effects of GC but may reduce the throughput of an application.
- Avoid -Xnocompactgc, -Xcompactgc and -Xgcthreads.
- Question whether calls to System.gc() are of any use, and if they are not, remove them.
floating point and decimal numbers (Page last updated January 2003, Added 2003-02-28, Author Brian Goetz, Publisher IBM). Tips:
- [Article has nothing specific about performance, but if you want to improve the performance of manipulating floating point numbers, you need to know all of this].
Performance of networked J2ME apps (Page last updated January 2003, Added 2003-02-28, Author Michael Abernethy, Publisher IBM). Tips:
- Every class in the application brings with it some size overhead. You have to create constructors, variables, and functions for every class you create.
- Every class has to be loaded into memory when it is used.
- Cherish every byte and keep as much available as possible.
- Make each screen its own function, NOT its own class.
- Don't use get/set methods; make every variable a public one.
- Eliminate screen location checking by using a stack and simply push() the screens when we go forward, and pop() the screens as we go backwards.
- When communicating with a database, the size of the data object being sent and returned is at the top of the list of concerns.
- Every request to the database should get as short a response as possible while still answering the request.
- Assume: small screen sizes; slow transmission speeds; slow processing speed; and a cost for every byte transmitted; 2Kb is the maximum size of a file (RecordStore) stored locally (to maintain portability).
XML data binding performance (Page last updated January 2003, Added 2003-02-28, Author Dennis M. Sosnoski, Publisher IBM). Tips:
- The choice of XML framework used dramatically affects performance and memory usage.
- [Article introduces a highly efficient XML data binding framework which avoids reflection and uses a pull parser rather than SAX2].
Manage Java apps for premium performance (Page last updated January 2003, Added 2003-02-28, Author Peter Bochner, Publisher ADTMagazine). Tips:
- The average time for resolving a performance problem is 25.8 hours.
Four Critical Issues for Mobile Apps Developers (Page last updated January 2003, Added 2003-02-28, Author Alan Zeichick, Publisher DevX). Tips:
- Mobile technology is not ubiquitous or reliable. Even when connected, a link's reliability and bandwidth can change instantly, so the software needs to be able gracefully accommodate disconnects and reconnects, with minimal impact to the quality of the user experience.
- Putting all of the program logic on the server with only a user-interface stub on the client, may produce an unsatisfactory user experience due to unpredictable network availability.
- When thinking "mobile," think "power miser." There's a power cost to everything: consuming CPU cycles, lighting the display, driving the speakers, operating a USB-based peripheral, even pinging across the Internet. Among the biggest power drains: memory, processor, and wireless transceiver.
Concise Guide to WebSphere Capacity Management (Page last updated January 2003, Added 2003-02-28, Author Ruth Willenborg Stacy Joines, Publisher e-ProMag.com). Tips:
- Define acceptable service levels: throughput (how many users/second or hits the site must support); response time (the wait a user experiences before receiving a response from the Web site); tolerance for outages (also called the site's availability).
- The first step in establishing a service level for your Web site is to examine what the site must do well.
- Determine the peak load. Peak load is frequently an order-of-magnitude higher than average load. Target the application to handle peak loads rather than average loads.
- Capture at least the following application data: user load (concurrent HTTP sessions or servlet requests being processed), throughput (requests per second), and response time.
- Capacity utilization is: how much CPU is required; the memory footprint of the application; the network capacity utilized at peak loading; the disk capacity utilized during peak loading. This data can be obtained using tools such as vmstat, iostat, or perfmon to report CPU, memory, and disk utilization and netstat for network data.
- JVM capacity can be measured using ?verbosegc or other JVM memory utilization measures.
- Continual garbage collecting indicates the need for a larger heap or another JVM (clone).
- Underutilization of the heap allows you to reduce the JVM's maximum heap allocation and give other JVMs sharing the machine more memory as needed.
- Look for high CPU utilization, excessive paging, unusually high disk I/O, network saturation.
- A component running at full capacity can't handle additional work. Any component running at full utilization may constrain the overall responsiveness and throughput of the Web site.
- Project future capacity requirements using historical growth data, expected additional functionality, marketing campaigns and promotions. Don't extrapolate beyond twice the performance limits measured in your existing production or test environment.
- Continually test and monitor the application.
Web service management (Page last updated January 2003, Added 2003-02-28, Author Justin Murray, Publisher DevX). Tips:
- The key concerns in managing Web services begin with runtime instantiation and responsiveness to requests. These issues include:
- Consider how multiple instances of the Web service may be handled concurrently, and how the load is being dealt with by one instance or shared across those instances
- Consider how the loading characteristics on a Web service can be discovered and presented.
- Measures should include: the number of concurrent messages being processed; current load statistics including byte sizes.
- The Web service platform must be able to identify the source and destination of SOAP messages to understand the usage of a service so that it's usage can be optimized.
- Response time and uptime are measures of the quality of a Web service.
Applying Design Issues and Patterns in Web Services (Page last updated January 2003, Added 2003-02-28, Author Chris Peltz, Publisher DevX). Tips:
- When you parse an XML document using DOM, you have to parse the entire document. So, there is an up-front performance and memory hit in this approach, especially for very large XML documents. DOM is appropriate when you are dealing with more document-oriented XML files.
- The SAX approach (events that invoke a callback when a given tag is found) reduces the overall memory overhead and can increase scalability if you are only processing subsets of the data.
- The Fa?ade pattern takes existing components that are already exposed and encapsulates some of the complexity into high-level, coarse-grained services that meet the specific needs of the client. This can enhance overall performance of the Web service.
Monitoring Performance with WebSphere (Page last updated January 2002, Added 2003-02-28, Author Ruth Willenborg, Publisher e-ProMag.com). Tips:
- A poorly performing Web site causes unsatisfied customers and lost revenue opportunities.
- Monitoring the performance of your Web site, identifying performance problems, and quickly finding the cause of and resolving problems should be a critical part of your overall Web site operations.
- If you don't measure performance, you won't know that you have problems.
- If you don't know where to measure performance, you can't find problems.
- If you don't know how to find the source of a problem, you can't fix problems.
- Users don't care about how a website works, they care only about how fast your Web page appears. Monitoring the end- user view tells you whether you have a publicly visible performance problem.
- If your Web site is too slow, your customers will give up and leave.
- You need to monitor all components, including your application servers, databases, network, and routers
- Watch key metrics and compare them with normal, expected behavior. If you find deviations, investigate these areas more closely.
- Three main metrics to monitor are: load (number of concurrent users); response time; throughput (requests per second).
- As the number of concurrent user requests increases, throughput should grow almost linearly and request response times should remain approximately constant. When throughput starts to grow more slowly or reaches an upper bound, you have a bottleneck ? typically a saturated resource.
- If throughput has reached an upper bound, request response time increases linearly with the rate of requests (until a system resource becomes exhausted).
- If a system resource becomes exhausted, throughput starts to degrade and response times will simultaneously increase.
- The ten most commonly monitored parameters are: concurrent requests being processed (EJB and HTTP); response time (EJB/HTTP); throughput (requests serviced per second EJB/HTTP); busy/idle HTTP servers; thread pool (size and percent maxed); DB connection pool (size/percent in use); JVM heap used/free; CPU utilization; disk I/O read/write rates; paging.
- Thread state profiling, e.g. dumping the thread stack, allows you to identify synchronization bottlenecks and method hotspots.
- Application call flow analysis, i.e. looking at the times spent in the various components of a request, can help eliminate components as bottleneck suspects.
Back to newsletter 027 contents
Last Updated: 2017-11-28
Copyright © 2000-2017 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss
Trouble with this page? Please contact us