Tuesday, March 15, 2011

Concurrent HashMap and CopyOnWriteArrayList


What is ConcurrentModificationException?

In Java ConcurrentModificationException comes when a collection data set is being iterated in multiple threads and at least one of the thread tries to modify the data set stored in the collection.

Lets take an example of ArrayList.There are two threads which are using this shared ArrayList.One of the thread is just iterating the collection and printing it out.Another thread is iterating the collection and deleting the data if it matches with the value passed.

As a result of which we get unchecked exception- ConcurrentModificationException at runtime and program terminates.

ConnModificationExample.java
package com.kunaal.concurrent.list;

import java.util.ArrayList;
import java.util.List;

/**
 * This example starts two threads iterating over a list 
 * One of the thread tries to remove a value 
 * Since both threads share the same data structure we 
 * end up having ConcurrentModification Exception
 *  
 * @author Kunaal A Trehan
 *
 */
public class ConnModificationExample {

 /**
  * @param args
  */
 public static void main(String[] args) {
  List<String> dataList=initializeList();
  DelRunImpl delImpl=new DelRunImpl(dataList, "Grapes");
  TraverseRunImpl traverseImpl=new TraverseRunImpl(dataList);

  Thread traverseThread=new Thread(traverseImpl);
  Thread delThread=new Thread(delImpl);
  
  traverseThread.start();
  delThread.start();
 }
 
 /**
  * Utility method returning list containing fruit names
  * @return
  */
 private static List<String> initializeList(){
  List<String> dataList=new ArrayList<String>();
  dataList.add("Apple");
  dataList.add("Banana");
  dataList.add("Water Melon");
  dataList.add("Musk Melon");
  dataList.add("Grapes");
  dataList.add("Strawberry");
  dataList.add("Mango");
  
  return dataList;
 }

}

DelRunImpl.java
package com.kunaal.concurrent.list;

import java.util.Iterator;
import java.util.List;

/**
 * Run implementation trying to delete data from the 
 * shared data list.
 * 
 * @author Kunaal A Trehan
 *
 */
public class DelRunImpl implements Runnable{

 private List<String> dataList;
 private String delData; 
 
 /**
  * Parameterized constructor containing datalist and date to be deleted.
  * 
  * @param dataList
  * @param dataToDel
  */
 public DelRunImpl(List<String> dataList,String dataToDel) {
  this.dataList=dataList;
  this.delData=dataToDel;
 }

 /**
  * Overridden run implementation where we iterate through the 
  * list and check the contents 
  * If it matches with the data to be deleted 
  * Then current thread sleeps for 1000 milli seconds
  * before deleting the value
  */
 @Override
 public void run() {
  Iterator<String> iterator = dataList.iterator();
  
  while(iterator.hasNext()){
   String nextVal = iterator.next();
   
   if(nextVal.equals(delData)){
    try {
     Thread.currentThread().sleep(1000);
    } catch (InterruptedException e) {
     e.printStackTrace();
    }
    iterator.remove();
   }
   
  }
    
 }

}


TraverseRunImpl.java
package com.kunaal.concurrent.list;

import java.util.Iterator;
import java.util.List;

/**
 * Run implementation in which we traverse through  the elements
 * of the array list and print it out .
 * 
 * @author Kunaal A Trehan
 *
 */
public class TraverseRunImpl implements Runnable {

 private List<String> dataList;
 
 /**
  * Parameterized constructor 
  */
 public TraverseRunImpl(List<String> dataList) {
  this.dataList=dataList;  
 }

 /**
  * Overriden run method in which we iterate to next element 
  * In between thread sleeps for 1000 milli seconds
  */
 @Override
 public void run() {
  Iterator<String> iterator = dataList.iterator();
  
  while(iterator.hasNext()){
   System.out.println("Element is-"+ iterator.next());
   try {
    Thread.currentThread().sleep(1000);
   } catch (InterruptedException e) {
    e.printStackTrace();
   }
  }

 }

}

Data Output
Element is-Apple
Exception in thread "Thread-0" java.util.ConcurrentModificationException
 at java.util.AbstractList$Itr.checkForComodification(Unknown Source)
 at java.util.AbstractList$Itr.next(Unknown Source)
 at com.kunaal.concurrent.list.TraverseRunImpl.run(TraverseRunImpl.java:33)
 at java.lang.Thread.run(Unknown Source)

So it shows that List does not handle the situation gracefully when more than one thread is using the dataset and modifying the underlying collection.
We could use synchronization to handle such scenarios.However its not the ideal solution in most of the cases.Imagine a production system where we have frequent listing and modification routines the system will choke if we use synchronization.

So Java 1.5+ introduced ConcurrentHashMap,CopyOnWriteArrayList and others to make the lock more finer and reduce this contention.We will go through this in detail in further sections.

ConcurrentHashMap

Normally when we acquire a lock on Hash Map full data set is locked.No other thread which requires exclusive lock for any operation can start even though different threads need to work in different regions.Still access to data set will be sequential one by one.
    Hash Map creates buckets/regions to store the data.Java states that if two objects are equal,their hash code must be same.However if hash codes are same,it does not mean that objects are equal .It may or may not be equal.

    ConcurrentHashMap on the contrary does not use one lock for the full data set.It uses fine grained approach and uses lock per region/bucket to allow following operations

    • Multiple reads can execute concurrently.
    • Simultaneous reads and writes can execute concurrently.
    • Multiple simultaneous writes can execute concurrently as long as writes are happening in different buckets/regions
      ConcurrentHashMap, along with the other concurrent collections, further improve on the synchronized collection classes by providing iterators that do not throw ConcurrentModificationException, thus eliminating the need to lock the collection during iteration.


      When we have different threads iterating/modifying/adding/..... the data set.Concurrent Hash Map does not throw ConcurrentModificationException.However it does not loose the modification.But whether it would be reflected in the other thread or not depends upon the state of the other thread which region it is traversing and others.

      To understand it better lets go through few examples.

      Example :- In this example I have a ConcurrentHashMap having Country Name as Key and population as the value.I have created two threads,one is iterating the map and other thread adds the data element to it.If we tweak the sleep time interval in the thread which adds the data element,other thread which is traversing the data set,sees the new entry.It all depends whether the add operation has finished by the time other thread has reached that bucket or not.

      TraverseInfoRunImpl.java
      package com.kunaal.concurrent.map;
      
      import java.util.Iterator;
      import java.util.concurrent.ConcurrentMap;
      
      /**
       * Run implementation printing data set information.
       * 
       * @author Kunaal A Trehan
       *
       */
      public class TraverseInfoRunImpl implements Runnable{
       
       private ConcurrentMap<String,Long>  dataMap;
      
       @Override
       public void run() {
        Iterator<String>  iterator = dataMap.keySet().iterator();
        
        while(iterator.hasNext()){
         String ctryName = iterator.next();
         System.out.println(ctryName + " has population of " + dataMap.get(ctryName));
         try {
          Thread.currentThread().sleep(1000);
         } catch (InterruptedException e) {
          e.printStackTrace();
         }
        }
       }
      
       /**
        * @param dataMap
        */
       public TraverseInfoRunImpl(ConcurrentMap<String,Long>  dataMap) {
        this.dataMap = dataMap;
       }
      
      }
      


      AddInfoRunImpl.java
      package com.kunaal.concurrent.map;
      
      import java.util.concurrent.ConcurrentMap;
      
      /**
       * Run implementation adding new data element to 
       * concurrent hash map.
       * 
       * @author Kunaal A Trehan
       *
       */
      public class AddInfoRunImpl implements Runnable {
       
       private ConcurrentMap<String,Long>  dataMap;
      
       /**
        * Parameterized constructor
        * @param dataMap
        */
       public AddInfoRunImpl(ConcurrentMap<String,Long>  dataMap) {
        this.dataMap = dataMap;
       }
      
       /**
        * Over ridden run() method
        */
       @Override
       public void run() {
        try {
         Thread.currentThread().sleep(1000);//Newly added entry gets reflected in other threadd
         //Thread.currentThread().sleep(5000);//Newly added entry does not get reflected in other thread
        } catch (InterruptedException e) {
         e.printStackTrace();
        }
        dataMap.putIfAbsent("Iceland", new Long(100000 ));
        System.out.println("Iceland added");
       }
      
      }
      

      ConcurrentMapExample.java
      package com.kunaal.concurrent.map;
      
      import java.util.concurrent.ConcurrentHashMap;
      import java.util.concurrent.ConcurrentMap;
      
      /**
       * Concurrent HashMap example in which there are two threads
       * In one thread we are iterating the data set 
       * In another thread we are adding the data
       * 
       * By tweaking the sleep interval in AddInfoRunImpl,
       * newly added data element can be reflected in other concurrently 
       * executing thread.
       * @author Kunaal A Trehan
       *
       */
      public class ConcurrentMapExample {
      
       /**
        * @param args
        */
       public static void main(String[] args) {
        ConcurrentMap<String,Long> dataMap=createDataMap();
        AddInfoRunImpl addImpl=new AddInfoRunImpl(dataMap);
        TraverseInfoRunImpl traverseImpl=new TraverseInfoRunImpl(dataMap);
        
        Thread traverseThread=new Thread(traverseImpl);
        Thread addThread=new Thread(addImpl);
        
        traverseThread.start();
        addThread.start();
       }
      
      
       /**
        * Utility method returning concurrent map containing 
        * information -name of the city and corresponding population.
        * 
        * @return
        */
       private static ConcurrentMap<String,Long> createDataMap(){
        ConcurrentMap<String,Long> dataMap=new ConcurrentHashMap<String, Long>();
        dataMap.putIfAbsent("NYC", new Long(20000000));
        dataMap.putIfAbsent("SFO", new Long(17500000));
        dataMap.putIfAbsent("Chicago", new Long(16500000));
        dataMap.putIfAbsent("London", new Long(24000000));
        dataMap.putIfAbsent("Tokyo", new Long(34500000));
        dataMap.putIfAbsent("New Delhi", new Long(10000000));
        dataMap.putIfAbsent("Bangalore", new Long(9500000));
        dataMap.putIfAbsent("Lisbon", new Long(6500000));
        
        return dataMap;
       }
      }
      

      Output when add thread slept for 1000 ms.In this case other thread was able to see the new entity added
      Bangalore has population of 9500000
      Iceland added
      Tokyo has population of 34500000
      SFO has population of 17500000
      Lisbon has population of 6500000
      London has population of 24000000
      Iceland has population of 100000
      NYC has population of 20000000
      New Delhi has population of 10000000
      Chicago has population of 16500000
      

      Output when add thread slept for 5000 ms.In this case other thread was not able to see the new entity added
      Bangalore has population of 9500000
      Tokyo has population of 34500000
      SFO has population of 17500000
      Lisbon has population of 6500000
      London has population of 24000000
      NYC has population of 20000000
      Iceland added
      New Delhi has population of 10000000
      Chicago has population of 16500000
      


      Example :- In this example I have a ConcurrentHashMap having Country Name as Key and population as the value.I have created two threads,one is iterating the map and other thread deletes a particular data element.If we tweak the sleep time interval in the thread which deletes the data element,other thread which is traversing the data set,may not see the deleted entry.It all depends whether the delete operation has finished by the time other thread has reached that bucket or not.

      ConcurrentMapDelExample.java
      package com.kunaal.concurrent.map;
      
      import java.util.concurrent.ConcurrentHashMap;
      import java.util.concurrent.ConcurrentMap;
      
      /**
       * Concurrent HashMap example in which there are two threads
       * In one thread we are iterating the data set 
       * In another thread we are deleting the data
       * 
       * @author Kunaal A Trehan
       */
      public class ConcurrentMapDelExample {
      
       /**
        * @param args
        */
       public static void main(String[] args) {
        ConcurrentMap<String,Long> dataMap=createDataMap();
        DelRunImpl delImpl=new DelRunImpl(dataMap,"Lisbon");
        TraverseInfoRunImpl traverseImpl=new TraverseInfoRunImpl(dataMap);
        
        Thread traverseThread=new Thread(traverseImpl);
        Thread delThread=new Thread(delImpl);
        
        traverseThread.start();
        delThread.start();
       }
      
      
       /**
        * Utility method returning concurrent map containing 
        * information -name of the city and corresponding population.
        * 
        * @return
        */
       private static ConcurrentMap<String,Long> createDataMap(){
        ConcurrentMap<String,Long> dataMap=new ConcurrentHashMap<String, Long>();
        dataMap.putIfAbsent("NYC", new Long(20000000));
        dataMap.putIfAbsent("SFO", new Long(17500000));
        dataMap.putIfAbsent("Chicago", new Long(16500000));
        dataMap.putIfAbsent("London", new Long(24000000));
        dataMap.putIfAbsent("Tokyo", new Long(34500000));
        dataMap.putIfAbsent("New Delhi", new Long(10000000));
        dataMap.putIfAbsent("Bangalore", new Long(9500000));
        dataMap.putIfAbsent("Lisbon", new Long(6500000));
        
        return dataMap;
       }
      
      }
      

      DelRunImpl.java
      package com.kunaal.concurrent.map;
      
      import java.util.Iterator;
      import java.util.concurrent.ConcurrentMap;
      
      /**
       * Run implementation deleting a particular key.
       * 
       * @author Kunaal A Trehan
       *
       */
      public class DelRunImpl implements Runnable {
      
       private ConcurrentMap<String,Long>  dataMap;
       
       private String keyToBeDel;
       
       /**
        * @param dataMap
        * @param keyToBeDel
        */
       public DelRunImpl(ConcurrentMap<String, Long> dataMap, String keyToBeDel) {
        this.dataMap = dataMap;
        this.keyToBeDel = keyToBeDel;
       }
      
       @Override
       public void run() {
        Iterator<String> iterator = dataMap.keySet().iterator();
        
        while(iterator.hasNext()){
         String next = iterator.next();
         
         if(next.equals(keyToBeDel)){
          try {
           Thread.currentThread().sleep(1000);
          } catch (InterruptedException e) {
           e.printStackTrace();
          }
          dataMap.remove(next);
          break;
         }   
        }
      
       }
      
      }
      

      Output showing the deleted entry does get reflected when other thread is traversing the data set
      Bangalore has population of 9500000
      Tokyo has population of 34500000
      Data corresponding to key-Lisbon deleted from map
      SFO has population of 17500000
      London has population of 24000000
      NYC has population of 20000000
      New Delhi has population of 10000000
      Chicago has population of 16500000
      


      Some important points regarding ConcurrentHashMap
      • Iterators provided by ConcurrentHashMap does not throw ConcurrentModificationException,eliminating the need to lock the collection while iterating the data set. 
      • Since ConcurrentHashMap uses fine grained locks instead of single lock it allows multiple write operations to happen concurrently as long it effects different regions of concurrent hash map.
      • ConcurrentHashMap provides utility methods for compound actions like putIfAbsent,conditional remove and conditional replace.
      • Read operation uses lock only in one case when value is null.This can occur when object is added while initialization is still underway.
      CopyOnWriteArrayList

      This is new class added  in java.util.concurrent package for supporting concurrent threads execution.Its an alternative to synchronized list implementation.

      Whenever any thread modifies the data set a new list  is created and thread updates the newly created list.Any active iterators still points to old copy of the list.Later on all references will point to this newly created list.

      Like ConcurrentHashMap,CopyOnWriteArrayList also does not throw ConcurrentModificationException. Iterators created by a CopyOnWriteArrayList object cannot change the underlying array. Though these iterators do have methods for changing the underlying collection, their implementations throw an UnsupportedOperationException instead of modifying the underlying collection.

      To understand it better lets go through few examples.

      Example: -  In this example I have created CopyOnWriteArrayList containing names of fruits.I have created two threads,one is traversing the list using the iterator.Other thread is tries to delete the entry from the list using iterator.Since iterator returned by CopyOnWriteArrayList does not allow any modification.It throws UnsupportedOperationException.


      TraverseRunImpl.java

      package com.kunaal.concurrent.list;
      
      import java.util.Iterator;
      import java.util.List;
      
      /**
       * Run implementation in which we traverse through  the elements
       * of the array list and print it out .
       * 
       * @author Kunaal A Trehan
       *
       */
      public class TraverseRunImpl implements Runnable {
      
       private List<String> dataList;
       
       /**
        * Parameterized constructor 
        */
       public TraverseRunImpl(List<String> dataList) {
        this.dataList=dataList;  
       }
      
       /**
        * Overriden run method in which we iterate to next element 
        * In between thread sleeps for 1000 milli seconds
        */
       @Override
       public void run() {
        Iterator<String> iterator = dataList.iterator();
        
        while(iterator.hasNext()){
         System.out.println("Element is-"+ iterator.next());
         try {
          Thread.currentThread().sleep(1000);
         } catch (InterruptedException e) {
          e.printStackTrace();
         }
        }
      
       }
      
      }
      

      DelRunImpl.java
      package com.kunaal.concurrent.list;
      
      import java.util.Iterator;
      import java.util.List;
      
      /**
       * Run implementation trying to delete data from the 
       * shared data list.
       * 
       * @author Kunaal A Trehan
       *
       */
      public class DelRunImpl implements Runnable{
      
       private List<String> dataList;
       private String delData; 
       
       /**
        * Parameterized constructor containing datalist and date to be deleted.
        * 
        * @param dataList
        * @param dataToDel
        */
       public DelRunImpl(List<String> dataList,String dataToDel) {
        this.dataList=dataList;
        this.delData=dataToDel;
       }
      
       /**
        * Overridden run implementation where we iterate through the 
        * list and check the contents 
        * If it matches with the data to be deleted 
        * Then current thread sleeps for 1000 milli seconds
        * before deleting the value
        */
       @Override
       public void run() {
        Iterator<String> iterator = dataList.iterator();
        
        while(iterator.hasNext()){
         String nextVal = iterator.next();
         
         if(nextVal.equals(delData)){
          try {
           Thread.currentThread().sleep(1000);
          } catch (InterruptedException e) {
           e.printStackTrace();
          }
          iterator.remove();
         }
         
        }
          
       }
      
      }
      

      AddRemoveExample.java
      package com.kunaal.concurrent.list;
      
      import java.util.List;
      import java.util.concurrent.CopyOnWriteArrayList;
      
      /**
       * This example uses CopyOnWriteArraylist
       * Two thread will be traversing the list 
       * and one of the thread will try to remove some elements
       * 
       * @author Kunaal A Trehan
       *
       */
      public class AddRemoveExample {
      
       /**
        * @param args
        */
       public static void main(String[] args) {
        List<String> dataList=initializeList();
        DelRunImpl delRunImpl=new DelRunImpl(dataList, "Grapes");  
        TraverseRunImpl traverseRunImpl=new TraverseRunImpl(dataList);
        
        Thread traverseThread=new Thread(traverseRunImpl,"TraverseThread");
        Thread delThread=new Thread(delRunImpl,"DeleteThread");
        
        traverseThread.start();
        delThread.start();  
       }
       
       /**
        * Utility method returning list containing fruit names
        * @return
        */
       private static List<String> initializeList(){
        List<String> dataList=new CopyOnWriteArrayList<String>();
        dataList.add("Apple");
        dataList.add("Banana");
        dataList.add("Water Melon");
        dataList.add("Musk Melon");
        dataList.add("Grapes");
        dataList.add("Strawberry");
        dataList.add("Mango");
        
        return dataList;
       }
      
      }
      

      Output showing UnsupportedOperationException for DeleteThread.However other thread keeps on running as if nothing happened.
      Element is-Apple
      Exception in thread "DeleteThread" java.lang.UnsupportedOperationException
       at java.util.concurrent.CopyOnWriteArrayList$COWIterator.remove(Unknown Source)
       at com.kunaal.concurrent.list.DelRunImpl.run(DelRunImpl.java:49)
       at java.lang.Thread.run(Unknown Source)
      Element is-Banana
      Element is-Water Melon
      Element is-Musk Melon
      Element is-Grapes
      Element is-Strawberry
      Element is-Mango
      

      Example:- In this example I have created two threads.One thread is traversing the list.Other list adds an entry to the underlying data list.Once both the threads finishes the work.List is iterated again in the main thread showing that newly added entry is reflected and now all references point to newly created list.

      ConnListExample.java
      package com.kunaal.concurrent.list;
      
      import java.util.Iterator;
      import java.util.List;
      import java.util.concurrent.CopyOnWriteArrayList;
      
      /**
       * This example uses CopyOnWriteArrayList
       * Here one thread iterates over the list 
       * and another thread tries to add on element .
       * 
       * @author Kunaal A Trehan
       *
       */
      public class ConnListExample {
      
       /**
        * @param args
        */
       public static void main(String[] args) {
        List<String> dataList=initializeList();
        TraverseRunImpl traverseImpl=new TraverseRunImpl(dataList);
        AddRunImpl addRunImpl=new AddRunImpl(dataList);
      
        Thread traverseThread=new Thread(traverseImpl,"TraverseThread");
        Thread addThread=new Thread(addRunImpl,"AddThread");
        
        traverseThread.start();
        addThread.start();
        
        try {
         traverseThread.join();
         addThread.join();
        } catch (InterruptedException e) {
         e.printStackTrace();
        }
        
        System.out.println("--------------Results after thread finished work--------------");
        Iterator<String> itr=dataList.iterator();
        while(itr.hasNext()){
         System.out.println(itr.next());
        }
         
       }
       
       /**
        * Utility method returning list containing fruit names
        * @return
        */
       private static List<String> initializeList(){
        List<String> dataList=new CopyOnWriteArrayList<String>();
        dataList.add("Apple");
        dataList.add("Banana");
        dataList.add("Water Melon");
        dataList.add("Musk Melon");
        dataList.add("Grapes");
        dataList.add("Strawberry");
        dataList.add("Mango");
        
        return dataList;
       }
      
      }
      


      AddRunImpl.java
      package com.kunaal.concurrent.list;
      
      import java.util.List;
      
      /**
       * Run implementation trying to add data to the shared data list.
       * @author Kunaal A Trehan
       *
       */
      public class AddRunImpl implements Runnable{
       
       private List<String> dataList;
      
       /**
        * Overriden run method where we add one element in the data list.
        */
       @Override
       public void run() {
        try {
         Thread.currentThread().sleep(1000);
        } catch (InterruptedException e) {
         e.printStackTrace();
        }
        dataList.add("Pineapple");
        System.out.println("Pineapple added by "+ Thread.currentThread().getName());
       }
      
       /**
        * @param dataList
        */
       public AddRunImpl(List<String> dataList) {
        super();
        this.dataList = dataList;
       }
      
      }
      

      TraverseRunImpl.java
      package com.kunaal.concurrent.list;
      
      import java.util.Iterator;
      import java.util.List;
      
      /**
       * Run implementation in which we traverse through  the elements
       * of the array list and print it out .
       * 
       * @author Kunaal A Trehan
       *
       */
      public class TraverseRunImpl implements Runnable {
      
       private List<String> dataList;
       
       /**
        * Parameterized constructor 
        */
       public TraverseRunImpl(List<String> dataList) {
        this.dataList=dataList;  
       }
      
       /**
        * Overriden run method in which we iterate to next element 
        * In between thread sleeps for 1000 milli seconds
        */
       @Override
       public void run() {
        Iterator<String> iterator = dataList.iterator();
        
        while(iterator.hasNext()){
         System.out.println("Element is-"+ iterator.next());
         try {
          Thread.currentThread().sleep(1000);
         } catch (InterruptedException e) {
          e.printStackTrace();
         }
        }
      
       }
      
      }
      

      Output showing all references are updated to newly created list having 'Pineapple'.
      Element is-Apple
      Element is-Banana
      Pineapple added by AddThread
      Element is-Water Melon
      Element is-Musk Melon
      Element is-Grapes
      Element is-Strawberry
      Element is-Mango
      --------------Results after thread finished work--------------
      Apple
      Banana
      Water Melon
      Musk Melon
      Grapes
      Strawberry
      Mango
      Pineapple
      

      Some important points regarding CopyOnWriteArrayList
      • Iterators returned by CopyOnWriteArrayList does not allow any modification of the date set.
        It will throw UnsupportedOperationException.
      • Whenever any modification on the data set is performed by any thread.A new copy of the list is created and modification is performed in that.Later on all object references synch up to the newly created list.







      2 comments:

      1. It was precisely what I needed... Thank you for this post!!!

        ReplyDelete
      2. This Details is Awsome, well explain.
        thank you

        ReplyDelete