Java Performance Tuning

Java(TM) - see bottom of page

|home |services |training |newsletter |tuning tips |tool reports |articles |resources |about us |site map |contact us |
Tools: | GC log analysers| Multi-tenancy tools| Books| SizeOf| Thread analysers|

Our valued sponsors who help make this site possible
JProfiler: Get rid of your performance problems and memory leaks! 

Training online: Threading Essentials course 

Learning From Code: Fast Fail Iterators, page 2 of 2

JProfiler
Get rid of your performance problems and memory leaks!


Java Performance Training Courses
COURSES AVAILABLE NOW. We can provide training courses to handle all your Java performance needs

Java Performance Tuning, 2nd ed
The classic and most comprehensive book on tuning Java

Java Performance Tuning Newsletter
Your source of Java performance news. Subscribe now!
Enter email:


Training online
Threading Essentials course


JProfiler
Get rid of your performance problems and memory leaks!


What is a mysterious field called modCount doing in the List Iterator classes? How can we efficiently handle iterating over collections when another thread may be updating the collection at the same time? Read on in our "Fast Fail Iterators" article from our "Learning From Code" series.

Published January 2004, Authors Jack Shirazi, Kirk Pepperdine

Page 2 of 2
previous page: Iteration and concurrency problems

Locking The Collection

Just as databases lock tables and rows while performing updates, we could lock the collection while we were iterating over it to solve the concurrent access/update problem. In this scenario, we prevent any thread from changing the collection while iteration takes place, thus guaranteeing predictable behavior. So, we might use a Vector object for our collection since that class has it's accessors and updators synchronized on the Vector object itself. Then your iteration can lock the Vector simply:


  public ... search (Vector list)
  {
    //Synchronizing on the Vector itself ensures
    //that no other threads can use the Vector
    //accessors or updators while we are in the
    //code block
    synchronized(list)
    {
      ListIterator iterator = list.listIterator();
      while(iterator.hasNext())
      {
        Object o = iterator.next();
        //do something 
        ...
      }
    }
  }

By taking a pessimistic approach, we have guaranteed that we will avoid any corruption of the collection and that the iteration is consistent. However, there is a cost to this scenario. First we have to use a synchronized collection which has a slightly higher overhead than a non-synchronized collection. And secondly, and much more importantly for performance, the threads are serialized for access and update to the collection, including for the duration of this and any other synchronized iterations. Each thread trying to update, access, or iterate over the collection must wait for any other thread doing so to finish before it can start. In the worst case scenario, this would convert a mutlti-threaded program into an effectively single-threaded program, independently of the number of CPUs available.

Copy On Write

A third scenario is to use a List class like the CopyOnWriteArrayList class, due to be available in the java.util.concurrent package of the 1.5 distribution, and available now as part of Doug Lea's util.concurrent classes. In this case, the iterator carries on iterating over the previous version of the list, while the updator gets to use a new version of the list. In database parlance, these are optimistic transactions; each session gets to carry on it's transaction as if the other didn't run, though the iterator is now using out-of-date information (in a transaction one session would fail to commit its transaction). This solution can carry a heavy performance overhead from the copying, especially if there are many concurrency conflicts.

Fast Fail

The fourth scenario to handle the concurrency issue is to (optimistically) assume that multiple threads with access to the collection will mostly not encounter concurrent update problems. In this case, locking the collection is an unnecessary overhead. Instead, what you can do is just catch those few occasions when concurrent access to the collection is an issue, and handle those cases specially. This is the tactic that AbstractList takes with it's iterators.

AbstractList includes an instance variable called modCount (short for "modification count"). Every time any change is made to the AbstractList object, the modCount is incremented. (Subclasses of AbstractList are expected to adhere to this contract, though it cannot be enforced.) The AbstractList iterators retain the value of modCount that existed when they are created, and check at every access and update whether that modCount variable has been changed from that creation time value. If the modCount variable is changed, then the iterator throws a java.util.ConcurrentModificationException. The iteration code looks like this:


  public ... search (List list)
  {
    try
    {
      ListIterator iterator = list.listIterator();
      while(iterator.hasNext())
      {
        Object o = iterator.next();
        //do something 
        ...
      }
    }
    catch (ConcurrentModificationException concEx)
    {
      //handle this case, maybe just iterate the collection again
      //e.g. return search(list);
    }
  }

This scenario has a very small overhead in the additional cost of updating the modCount variable on any change to the collection, and comparing the modCount variable for each iteration. This should be negligible in most cases.

Optimistic Vs. Pessimistic

The two alternative scenarios of locking the collection or throwing an exception on detecting concurrent access/update are special cases of the more generic pessimistic transaction vs. optimistic transactions scenarios. Pessimistic transactions lock the recources they need to ensure that they can complete their activity without errors, at the cost of making all other transactions wait for the resources to be freed. Optimistic transactions avoid locking resources, detecting if conflicts have occured which require rolling back and re-applying the transaction.

Optimistic transactional processing is usually more efficient where reads predominate over writes; pessimistic transactional processing is usually more efficient where writes predominate over reads. In the case of iterators, if you are unlikely to have an update occur during the iteration, using the "fast fail" scenario is more efficient. But if updates often occur during iteration of the collection, then using "fast fail" would end up producing many ConcurrentModificationExceptions and and also require the iteration to be re-run many times. In this situation locking the collection is usually more efficient.

It is also possible to mix pessimistic and optimistic modes, by locking some resources but not others, and this is a way of fine tuning an application. In the scenarios we described in this article, that mixed mode option is not available, but many concurrency conflicts are more complicated and can benefit from a mixed mode approach.

So there you have it, an interesting technique in which we can use an optimistic approach to solve a concurrency issue. Although you have to be careful when using optimitic approaches to these types of problems, done right, they can radically improve the scalability characteristics of your application. In this instance, we have seen how some clever coding has resulted in one being able to allow safe concurrent access to Iterators without forcing that access to be mutually exclusive.

Page 2 of 2
previous page: Iteration and concurrency problems

My colleague Brian Goetz points out that because modCount is not declared volatile, you cannot rely on the modCount value in the iterating thread to be updated in a timely manner. So one thread could alter the collection but the change might not be noticed by the iterating thread until later. If this is significant behavior for your application, you will need to synchronize access across the threads, or use some alternative acceptable implementation. Implementing your own iterators with modCount declared volatile is one option. The reason modCount is not declared volatile is for performance - if it was declared volatile then the majority of applications that didn't need timely thread concurrent modification notification would suffer at the expense of those few applications that did need that functionality.


Last Updated: 2018-11-28
Copyright © 2000-2018 Fasterj.com. All Rights Reserved.
All trademarks and registered trademarks appearing on JavaPerformanceTuning.com are the property of their respective owners.
Java is a trademark or registered trademark of Oracle Corporation in the United States and other countries. JavaPerformanceTuning.com is not connected to Oracle Corporation and is not sponsored by Oracle Corporation.
URL: http://www.JavaPerformanceTuning.com/articles/fastfail2.shtml
RSS Feed: http://www.JavaPerformanceTuning.com/newsletters.rss
Trouble with this page? Please contact us