Java Performance Tuning
Java(TM) - see bottom of page
Tips May 2006
Back to newsletter 066 contents
http://java.sun.com/developer/technicalArticles/Interviews/Oaks_qa.html
Interview of Scott Oaks (Page last updated May 2006, Added 2006-05-30, Author Tomas Hurka and Jaroslav Bachorik, Publisher Sun). Tips:
- The Grizzly NIO HTTP connector is slightly faster than our previous C-based Glassfish HTTP connector.
- The 1.5 Executor classes provide very flexible thread pools: the ThreadPoolExecutor uses four pieces of information in determining whether to hand off a task to a new thread or queue it for an existing thread; they support different data structures for queueing; they have four or five different lifecycle methods; and so on. But the flexibility has a small performance cost: a simpler thread pool implementation outperforms it.
- When there's more code to execute, it will take more time.
- "You always hope that you'll load an application into a profiler, it will show that there's a hotspot in the code accounting for 60% of the time, and you can just go fix that and call it a day ... But in reality, bottlenecks are subtle. Leaf methods in a profile will all show very little CPU usage; root methods will all be too far removed from the source of the bottleneck. So you're left to walk up and down the stack and figure out where to look. I'm often asked how to do this well, and it's not really something I can explain. I think with practice you just develop an intuition for it."
- Appserver performance is dependent on many external factors: spend time getting the database to run optimally; make sure that the appserver has the correct number of threads; make sure that the OS TCP parameters aren't reducing throughput.
- There are three big issues with current Java profilers: they tend to be very intrusive; offer limited visibility into the JVM (e.g. of gc threads and compiler calls); blocking methods - writes to network sockets, poll calls, waiting for locks - aren't traditionally handled very well by profilers, especially waits for locks - lock-contention analysis is typically tricky with development tools right now.
- Measure everything possible: try to take system statistics (iostat for disk usage, mpstat for CPU usage, jstat/visualgc for garbage collection (GC) usage, database usage statistics and so on), because performance problems can come from outside the application, or at least from something like GC that can be tuned pretty easily. Also gather statistics from the appserver: how it's managing its thread pools, the passivation and creation rate of EJBs, size of the MDB pools, and so on. Analyze all the monitoring data first to make sure that you haven't missed something. Then profile the code.
- On the Solaris Operating System, [Sun engineering] profiles with either NetBeans Profiler or the Sun Java Studio collector/analyzer (which is C based, so it offers some visibility into the JVM).
- Different profilers will sometimes show different results, which is a common artifact of sampling. So sometimes it's useful to try several different profilers before you get a good answer.
- [Interview also includes an honest analysis of the strengths and weaknesses of the NetBeans profiler].
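The thread pool flexibility mentioned in the tips above can be seen in the ThreadPoolExecutor constructor. The following is a minimal sketch, not code from the interview: the class name PoolSketch, the pool sizes, and the queue capacity are all assumptions chosen for illustration, and the tip does not spell out which four inputs it means.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration class; sizes below are arbitrary assumptions.
final class PoolSketch {

    /** Runs 'tasks' trivial jobs on a bounded pool and returns how many completed. */
    static int runTasks(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                2,                                    // core pool size
                4,                                    // maximum pool size
                30, TimeUnit.SECONDS,                 // keep-alive for threads above the core size
                new ArrayBlockingQueue<Runnable>(8),  // bounded work queue
                new ThreadPoolExecutor.CallerRunsPolicy()); // overflow runs in the submitting thread

        AtomicInteger done = new AtomicInteger();
        for (int i = 0; i < tasks; i++) {
            // Below the core size a new thread is started; otherwise the task is
            // queued; threads beyond the core size are added only once the queue is full.
            pool.execute(done::incrementAndGet);
        }

        pool.shutdown(); // stop accepting work, let queued tasks finish
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return done.get();
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runTasks(100));
    }
}
```

Each constructor argument is a decision the pool re-evaluates on every submission, which is where the small overhead relative to a fixed, simpler pool would plausibly come from.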
http://weblogs.java.net/blog/enicholas/archive/2006/05/understanding_w.html
Understanding Weak References (Page last updated May 2006, Added 2006-05-30, Author Ethan Nicholas, Publisher java.net). Tips:
- If an object is reachable via a chain of strong references (strongly reachable), it is not eligible for garbage collection.
- Keeping track of objects (for caching, for example) with strong references instead of weak references is a common cause of memory leaks.
- A weak reference is a reference that isn't strong enough to force an object to remain in memory.
- WeakHashMap works exactly like HashMap, except that the keys (not the values!) are referred to using weak references. If a WeakHashMap key becomes garbage, its entry is removed automatically.
- If you pass a ReferenceQueue into a weak reference's constructor, the reference object will be automatically inserted into the reference queue when the object to which it pointed becomes garbage. You can then process the ReferenceQueue and perform whatever cleanup is needed for dead references.
- There are four different degrees of reference strength: strong, soft, weak, and phantom, in order from strongest to weakest.
- An object which is weakly reachable (the strongest references to it are WeakReferences) will be discarded at the next garbage collection cycle.
- An object which is softly reachable (the strongest references to it are SoftReferences) will generally stick around for a while after the garbage collection cycle that it becomes collectable in. SoftReferences aren't required to behave any differently than WeakReferences, but in practice softly reachable objects are generally retained as long as memory is in plentiful supply.
- A phantom reference's grip on its object is so tenuous that you can't even retrieve the object -- its get() method always returns null. The only use for such a reference is keeping track of when it gets enqueued into a ReferenceQueue, as at that point you know the object to which it pointed is dead. PhantomReferences are enqueued only when the object is physically removed from memory, and the get() method always returns null specifically to prevent you from being able to "resurrect" an almost-dead object.
- PhantomReferences allow you to determine that an object was definitely removed from memory. They are in fact the only way to determine that.
- PhantomReferences allow you to avoid a fundamental problem with finalization: finalize() methods can "resurrect" objects by creating new strong references to them. This means that an object which overrides finalize() must now be determined to be garbage in at least two separate garbage collection cycles in order to be collected. When the first cycle determines that it is garbage, it becomes eligible for finalization. Because of the (slim, but unfortunately real) possibility that the object was "resurrected" during finalization, the garbage collector has to run again before the object can actually be removed. And because finalization might not have happened in a timely fashion, an arbitrary number of garbage collection cycles might have happened while the object was waiting for finalization. This can mean serious delays in actually cleaning up garbage objects, and is why you can get OutOfMemoryErrors even when most of the heap is garbage. PhantomReferences allow you to avoid this situation.
- An important use of PhantomReferences is in DGC (Distributed GC like in RMI). You most certainly do not want to perform remote notification in the GC thread.
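The reference behaviour described in the tips above can be sketched with the java.lang.ref classes. This is an illustration under assumptions: the class and method names (ReferenceSketch, drain, and so on) are hypothetical, and the main method relies on System.gc(), which is only a hint, so its output is not guaranteed.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical illustration class, not taken from the article.
final class ReferenceSketch {

    /** A weak reference still returns its referent while a strong reference exists. */
    static boolean weakRefSeesLiveObject() {
        Object payload = new Object();                       // strong reference
        WeakReference<Object> ref = new WeakReference<>(payload);
        return ref.get() == payload;
    }

    /** A phantom reference never hands back its referent, even while it is alive. */
    static boolean phantomGetIsNull() {
        Object payload = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(payload, queue);
        return ref.get() == null;
    }

    /** Drains a queue without blocking; returns how many dead references it held. */
    static int drain(ReferenceQueue<Object> queue) {
        int cleaned = 0;
        Reference<?> dead;
        while ((dead = queue.poll()) != null) {
            cleaned++; // real code would release whatever resource 'dead' tracked
        }
        return cleaned;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object payload = new Object();
        WeakReference<Object> ref = new WeakReference<>(payload, queue);
        payload = null;     // drop the only strong reference
        System.gc();        // a request only; the JVM may ignore it
        Thread.sleep(100);  // give the collector a moment to enqueue the reference
        System.out.println("cleared: " + (ref.get() == null)
                + ", enqueued: " + drain(queue));
    }
}
```

Polling the queue from an application thread, as drain does, keeps cleanup work out of the collector's own threads.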
http://www.developer.com/java/other/article.php/3606401
Dynamic Loading/Reloading of Classes (Page last updated May 2006, Added 2006-05-30, Author Richard G. Baldwin, Publisher Developer.com). Tips:
- Once a class has been loaded, a special object of the class named Class will have been automatically instantiated. The Class object represents the newly-loaded class.
- Once a class is loaded, it can only be reloaded if it was loaded by a class loader object. Classes loaded by the primordial class loader cannot be reloaded.
- If you ask a class loader object to load a class, the primordial class loader is given an opportunity to load the class first. If the primordial class loader finds the class on the classpath, it will load it.
- You should avoid putting re-loadable classes on the classpath, to prevent the primordial class loader from loading them.
- The URLClassLoader is a useful class loader for controlling class reloading. By using multiple URLClassLoaders you can load more than one class with the same name. Dereferencing previous URLClassLoader objects effectively means that a class of the same name is reloaded.
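The reloading approach in the tips above can be sketched as follows. This is an assumption-laden illustration rather than the article's code: ReloadSketch, loadFresh, compileDemoClass and the class name Reloadable are hypothetical, and the demo compiles a throwaway class at run time with javax.tools (which needs a JDK, not just a JRE) purely so that there is something outside the classpath to load.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Hypothetical illustration class, not taken from the article.
final class ReloadSketch {

    /** Loads className from classesDir through a brand-new URLClassLoader. */
    static Class<?> loadFresh(URL classesDir, String className) {
        // Parent is null, so only the bootstrap (primordial) loader is consulted
        // before this loader; the application classpath is bypassed.
        URLClassLoader loader = new URLClassLoader(new URL[] { classesDir }, null);
        try {
            return loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Compiles a trivial class into a temp directory that is not on the classpath. */
    static URL compileDemoClass(String className) {
        try {
            Path dir = Files.createTempDirectory("reload-demo");
            Path src = dir.resolve(className + ".java");
            Files.write(src, ("public class " + className + " {}")
                    .getBytes(StandardCharsets.UTF_8));
            JavaCompiler javac = ToolProvider.getSystemJavaCompiler(); // null without a JDK
            if (javac == null || javac.run(null, null, null, src.toString()) != 0) {
                throw new IllegalStateException("could not compile " + src);
            }
            return dir.toUri().toURL(); // directory URL ends with '/', as URLClassLoader expects
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URL dir = compileDemoClass("Reloadable");
        Class<?> first = loadFresh(dir, "Reloadable");
        Class<?> second = loadFresh(dir, "Reloadable"); // the "reloaded" version
        System.out.println("same name: " + first.getName().equals(second.getName()));
        System.out.println("same Class object: " + (first == second));
    }
}
```

Because each loader defines its own copy, the two Class objects share a name but are distinct types; once nothing references the older loader and its classes, they become eligible for collection.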
http://www.javaworld.com/javaworld/jw-05-2006/jw-0501-jdbc.html
Design and performance improvements with JDBC 4.0 (Page last updated May 2006, Added 2006-05-30, Author Shashank Tiwari, Publisher Javaworld). Tips:
- The SQLTransientException is thrown where a previously failed operation may succeed on retrial (like a timeout). The SQLNonTransientException is thrown where retrial will not lead to a successful operation unless the cause of the SQLException is corrected (like a syntax error).
- SQLXML resources can be released by calling their free() methods, which might prove pertinent where the objects are valid in long-running transactions.
- Prior to JDBC 4.0, there was no way to distinguish between a stale connection and a closed connection. The new API adds an isValid() method to the Connection interface to query if the connection is still valid.
- Database connections are often shared among clients, and sometimes some clients tend to use more resources than others, which can lead to starvation-like situations. The Connection interface defines a setClientInfo() method to define client-specific properties, which could be utilized to analyze and monitor resource utilization by the clients.
- A new java.sql.Wrapper interface provides the ability to access datasource-vendor-specific resources by retrieving the delegate instance. An unwrap() method returns the object that implements the given interface to allow access to vendor-specific methods.
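The transient versus non-transient exception split described above lends itself to a simple retry helper. The sketch below is an assumption, not part of the article or of JDBC itself: JdbcRetrySketch, SqlWork and withRetry are hypothetical names, and only the exception classes come from java.sql.

```java
import java.sql.SQLException;
import java.sql.SQLTransientException;

// Hypothetical helper, shown only to illustrate the exception hierarchy.
final class JdbcRetrySketch {

    /** Hypothetical callback type for one unit of JDBC work. */
    interface SqlWork<T> {
        T run() throws SQLException;
    }

    /**
     * Runs the work, retrying only when the driver signals a transient failure.
     * Non-transient failures (a syntax error, for example) propagate at once,
     * because repeating the same operation cannot succeed.
     */
    static <T> T withRetry(SqlWork<T> work, int maxAttempts) throws SQLException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        SQLTransientException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return work.run();
            } catch (SQLTransientException e) {
                last = e; // e.g. a timeout: the same operation may succeed next time
            }
        }
        throw last; // every attempt failed transiently
    }
}
```

A caller would typically wrap a statement execution in the callback, and could check Connection.isValid() before retrying to decide whether a fresh connection is needed.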
http://www.informit.com/articles/article.asp?p=465310
Monitoring and Optimizing Apps on Dual-Core and Multiprocessor Systems (Page last updated April 2006, Added 2006-05-30, Author Kurt Hudson, Publisher informit). Tips:
- You can set a process's priority in Windows using Task Manager's Processes tab and selecting Set Priority.
- Be careful when using the Realtime priority setting. Many people have caused their systems to stop responding by setting an application's base priority to Realtime.
- If you have an application that you expect to run for a long time, but you want to minimize the impact of that application on the computer's performance while you're using it, you can set the less important process to run at a BelowNormal or Low priority level.
- If you have a critical application that you want to receive processing time over all other applications on the system, you can set the priority level to AboveNormal or High.
- If you want a particular Windows application to always start with a base priority other than normal, you can use the Start command in a batch file to always launch that application at a higher priority. For example, start /high c:\priorityone.exe
- You can control the processor(s) on which an application runs. In Windows use the Processes tab of the Task Manager, right-click the desired process and select the Set Affinity option to open a dialog box that lets you specify on which CPUs the process can run. By default, Windows operating systems allow processes to run on all available processors, so you shouldn't set processor affinity unless you have a specific reason to.
http://www-128.ibm.com/developerworks/java/library/j-jtp04186/
Introduction to nonblocking algorithms (Page last updated April 2006, Added 2006-05-30, Author Brian Goetz, Publisher IBM). Tips:
- Nonblocking algorithms are concurrent algorithms that derive their thread safety not from locks, but from low-level atomic hardware primitives such as compare-and-swap.
- Modern processors provide special instructions for atomically updating shared data that can detect interference from other threads, and compareAndSet() uses these instead of locks.
- Nonblocking algorithms have only become possible in the Java language as of Java 5.0.
- When used properly, intrinsic locking by synchronizing code blocks can make your programs thread-safe, but locking can be a relatively heavyweight operation when used to protect short code paths when threads frequently contend for the lock.
- Atomic variables provide atomic read-modify-write operations for safely updating shared variables without locks. Atomic variables have memory semantics similar to that of volatile variables, but because they can also be modified atomically, they can be used as the basis for lock-free concurrent algorithms.
- The increment operator is not atomic (in Java); it consists of three separate operations: fetch the value, add one to it, and write the value out.
- A synchronized(LOCK) {++value;} increment can be made nonblocking with AtomicInteger like this: int v = value.get(); while (!value.compareAndSet(v, v + 1)) v = value.get(); The nonblocking version synchronizes at a finer level of granularity (an individual memory location), reducing the chance that there will be contention, and losing threads can retry immediately rather than being suspended and rescheduled. Even with a few failed compareAndSet() operations, this approach is still likely to be faster than being rescheduled because of lock contention.
- A basic characteristic of all nonblocking algorithms is that some algorithmic step is executed speculatively, with the knowledge that it may have to be redone if the compareAndSet() is not successful (called optimistic).
- [Article gives examples of a non blocking stack implementation using speculative compareAndSet() to push/pop, retrying if the compareAndSet() operation fails].
- Under light to moderate contention, nonblocking algorithms tend to outperform blocking ones as few retries are required and delays are shorter than for lock management and context switching.
- Under high contention, lock-based algorithms start to offer better throughput than nonblocking ones because when a thread blocks, it stops pounding and patiently waits its turn, avoiding further contention.
- Contention levels high enough to require lock-based algorithms rather than nonblocking ones are uncommon. Such contention also suggests re-examining your design.
- compareAndSet() enables atomic conditional updates on a single pointer, but not on two. To construct a nonblocking linked list, tree, or hash table, you need to find a way to update multiple pointers with compareAndSet() without leaving the data structure in an inconsistent state.
- The "trick" to building nonblocking algorithms for nontrivial data structures is to make sure that the data structure is always in a consistent state. An interrupting thread can complete the operation, "telling" the interrupted thread there is no need to continue.
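The optimistic retry pattern from the tips above can be sketched for both a counter and a stack. This is an illustration in the spirit of the article's examples, not a copy of them: the names NonblockingSketch, Counter and LockFreeStack are assumptions, and the stack follows the well-known single-pointer (Treiber-style) design.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration classes; names are not from the article.
final class NonblockingSketch {

    /** Counter whose increment retries a compare-and-set instead of taking a lock. */
    static final class Counter {
        private final AtomicInteger value = new AtomicInteger();

        int increment() {
            int v;
            do {
                v = value.get();                       // read the current value
            } while (!value.compareAndSet(v, v + 1));  // retry if another thread got in first
            return v + 1;
        }

        int get() {
            return value.get();
        }
    }

    /** Stack that only ever updates one pointer (the head) atomically. */
    static final class LockFreeStack<E> {
        private static final class Node<E> {
            final E item;
            Node<E> next;
            Node(E item) { this.item = item; }
        }

        private final AtomicReference<Node<E>> head = new AtomicReference<>();

        void push(E item) {
            Node<E> newHead = new Node<>(item);
            Node<E> oldHead;
            do {
                oldHead = head.get();
                newHead.next = oldHead;  // speculative: valid only if head is unchanged
            } while (!head.compareAndSet(oldHead, newHead));
        }

        /** Returns the top item, or null if the stack is empty. */
        E pop() {
            Node<E> oldHead;
            Node<E> newHead;
            do {
                oldHead = head.get();
                if (oldHead == null) {
                    return null;
                }
                newHead = oldHead.next;
            } while (!head.compareAndSet(oldHead, newHead));
            return oldHead.item;
        }
    }
}
```

The stack works with a single compareAndSet() because every operation changes exactly one pointer; structures that must change several pointers at once need the more involved techniques the last two tips allude to.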
http://blogs.sun.com/roller/page/rchrd?entry=why_scalar_optimization_is_important
Why Scalar Optimization Is Important (Page last updated May 2006, Added 2006-05-30, Author Richard Friedman, Publisher Sun). Tips:
- Optimizing the scalar parts of your programs is more important than parallelizing the code.
- If half the code can be parallelized, a 64-processor system might run a program only about twice as fast as a single processor.
- [Article discusses some consequences of Amdahl's law and the efficiency of parallelizing code].
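The 64-processor figure above follows from Amdahl's law, speedup = 1 / ((1 - p) + p / n), where p is the fraction of the work that can be parallelized and n is the number of processors. A minimal sketch of the arithmetic, with the class and method names (AmdahlSketch, speedup) being assumptions for illustration:

```java
// Hypothetical illustration class for the Amdahl's law arithmetic.
final class AmdahlSketch {

    /**
     * Theoretical speedup when 'parallelFraction' of the work is spread
     * perfectly over 'processors' CPUs and the rest stays serial.
     */
    static double speedup(double parallelFraction, int processors) {
        if (parallelFraction < 0.0 || parallelFraction > 1.0 || processors < 1) {
            throw new IllegalArgumentException("need 0 <= fraction <= 1 and processors >= 1");
        }
        double serialFraction = 1.0 - parallelFraction;
        return 1.0 / (serialFraction + parallelFraction / processors);
    }

    public static void main(String[] args) {
        // Half the code parallelized on 64 processors: 1 / (0.5 + 0.5/64) = 128/65
        System.out.println(speedup(0.5, 64));
    }
}
```

With p = 0.5 the result is 128/65, a little under 2, and no number of processors can push it past 1 / (1 - p) = 2, which is why shrinking the scalar (serial) part matters so much.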
Jack Shirazi
Last Updated: 2023-08-28
Copyright © 2000-2023 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
URL: http://www.JavaPerformanceTuning.com/news/newtips066.shtml
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss