In-depth analysis of SynchronizedMap and ConcurrentHashMap

Before we start, what is a Map?

The explanation of Map in javadoc is as follows:

An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value. This interface takes the place of the Dictionary class, which was a totally abstract class rather than an interface. The Map interface provides three collection views, which allow a map's contents to be viewed as a set of keys, collection of values, or set of key-value mappings.



As the description above shows, a Map stores key-value pairs, and each key maps to at most one value.

Map can be implemented in several ways: HashMap is implemented with a hash table, while TreeMap uses a red-black tree.
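One visible consequence of the two implementations is iteration order. The sketch below is only an illustration (the class name MapImplementations and the sample keys are made up for this example): a TreeMap iterates its keys in sorted order, while a HashMap makes no promise about order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class MapImplementations {
    // Inserts the same three entries into the given map and returns the keys
    // in the order in which the map iterates over them.
    static List<String> keysInIterationOrder(Map<String, Integer> map) {
        map.put("pear", 3);
        map.put("apple", 1);
        map.put("mango", 2);
        return new ArrayList<>(map.keySet());
    }

    public static void main(String[] args) {
        // TreeMap keeps its keys sorted by their natural ordering.
        System.out.println(keysInIterationOrder(new TreeMap<>())); // [apple, mango, pear]
        // HashMap gives no guarantee about iteration order.
        System.out.println(keysInIterationOrder(new HashMap<>()));
    }
}
```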

1. Hashtable and HashMap

The two classes differ mainly in the following ways:

    Both Hashtable and HashMap implement the Map interface, but Hashtable extends the abstract Dictionary class, whereas HashMap extends AbstractMap.

    In HashMap, null can be used as a key, and there can be only one such key; any number of keys may map to a null value. When get() returns null, either the key does not exist in the HashMap or the value stored for that key is null. Therefore get() cannot be used to decide whether a key exists in a HashMap; use containsKey() instead. In Hashtable, neither the key nor the value can be null.

    The biggest difference between the two classes is that Hashtable is thread-safe: its methods are synchronized, so it can be used directly in a multithreaded environment. HashMap is not thread-safe, and in a multithreaded environment synchronization has to be added manually. For this purpose the Collections class provides a method that returns a synchronized version of a HashMap:

Java code
  public static <K,V> Map<K,V> synchronizedMap(Map<K,V> m) {
      return new SynchronizedMap<K,V>(m);
  }

This method returns an instance of SynchronizedMap, a static nested class defined in Collections. It implements the Map interface and guards each method with the synchronized keyword, delegating the actual work to the wrapped map.
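The differences described in this section can be checked with a few lines of code. The following is a minimal sketch, not a complete comparison; the class and helper names (HashtableVsHashMap, rejectsNullKey, isPresentWithNullValue) are invented for the example. It shows why containsKey() is needed with HashMap, that Hashtable rejects null, and that the wrapper returned by Collections.synchronizedMap still accepts null because it only adds locking around the wrapped HashMap.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

final class HashtableVsHashMap {
    // Returns true if the map rejects a null key with a NullPointerException.
    static boolean rejectsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // get() returns null both for a missing key and for a key mapped to null,
    // so containsKey() is needed to tell the two cases apart.
    static boolean isPresentWithNullValue(Map<String, String> map, String key) {
        return map.get(key) == null && map.containsKey(key);
    }

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put("k", null);
        System.out.println(isPresentWithNullValue(hashMap, "k"));       // true
        System.out.println(isPresentWithNullValue(hashMap, "missing")); // false

        System.out.println(rejectsNullKey(new HashMap<>()));   // false
        System.out.println(rejectsNullKey(new Hashtable<>())); // true

        // The synchronized wrapper delegates to the HashMap it wraps,
        // so a null key is still accepted.
        Map<String, String> shm = Collections.synchronizedMap(new HashMap<>());
        System.out.println(rejectsNullKey(shm)); // false
    }
}
```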

2. Potential thread safety issues

As mentioned above, Collections provides a synchronized wrapper, SynchronizedMap, for HashMap. Every method of this wrapper is synchronized, but that does not mean that code using it is automatically thread-safe. A sequence of several calls can still produce unexpected results.

Such as the following code:

Java code
  // shm is an instance of SynchronizedMap
  if (shm.containsKey(key)) {
          shm.remove(key);
  }

This code checks whether an element exists before removing it from the map. The containsKey and remove methods are each synchronized, but the code as a whole is not. Consider the following scenario: thread A calls containsKey, gets true, and is about to call remove; thread B then runs, also gets true from containsKey, and performs the remove; when thread A finally calls remove, the element no longer exists. One way to make this code behave as intended is to synchronize the whole block, but doing so for every compound operation is expensive.

The problem becomes more serious during iteration. The Map interface provides three methods that return collections of its keys, values, and key-value pairs:
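A minimal sketch of the synchronized version of this check follows. It assumes the map was created with Collections.synchronizedMap, whose documentation tells callers to synchronize on the returned map itself; the class and method names (CheckThenRemove, removeIfPresent) are illustrative only.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class CheckThenRemove {
    // Holds the wrapper's lock across both calls, so no other thread can
    // remove the key between the containsKey check and the remove.
    static <K, V> boolean removeIfPresent(Map<K, V> shm, K key) {
        synchronized (shm) {
            if (shm.containsKey(key)) {
                shm.remove(key);
                return true;
            }
            return false;
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> shm = Collections.synchronizedMap(new HashMap<>());
        shm.put("key", 1);
        System.out.println(removeIfPresent(shm, "key")); // true
        System.out.println(removeIfPresent(shm, "key")); // false
    }
}
```

The cost is that every thread touching the map waits while the block runs, which is the expense mentioned above.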

Java code
  Set<K> keySet();

  Collection<V> values();

  Set<Map.Entry<K,V>> entrySet();

Based on these three methods, we generally access the elements of a Map as follows:

Java code
  Iterator keys = map.keySet().iterator();

  while (keys.hasNext()) {
          map.get(keys.next());
  }

One thing to note here is that the keySet and its iterator are a "view" of the elements in the Map, not a "copy". The problem is that while one thread is iterating over the Map, another thread may be modifying it. In that case the iterator may detect that modCount != expectedModCount and throw a ConcurrentModificationException.

There are usually two ways to solve this problem. One is to work on a copy of the elements rather than a view. This can be done with the collection's toArray() method, but creating a copy is less efficient than iterating over the view, especially when there are many elements. The other is to lock the entire collection while iterating, which is even less efficient.
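The sketch below illustrates the failure and both workarounds. It is an assumption-laden example, not library code: the class and method names are invented, and the failure is reproduced in a single thread (a put between two next() calls) because that makes the exception deterministic, whereas with two threads it depends on timing.

```java
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

final class IterationStrategies {
    // A structural change made after the iterator was created is detected
    // by the next call to next(), which throws.
    static boolean failsFast() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        Iterator<String> keys = map.keySet().iterator();
        keys.next();
        map.put("d", 4); // modifies the map behind the iterator's back
        try {
            keys.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Option 1: copy the values with toArray() and iterate over the copy.
    static int sumOverSnapshot(Map<String, Integer> syncMap) {
        int sum = 0;
        for (Integer value : syncMap.values().toArray(new Integer[0])) {
            sum += value;
        }
        return sum;
    }

    // Option 2: hold the map's lock for the whole iteration.
    static int sumUnderLock(Map<String, Integer> syncMap) {
        int sum = 0;
        synchronized (syncMap) {
            for (Integer value : syncMap.values()) {
                sum += value;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        Map<String, Integer> syncMap = Collections.synchronizedMap(new HashMap<>());
        syncMap.put("a", 1);
        syncMap.put("b", 2);
        syncMap.put("c", 3);
        System.out.println(failsFast());              // true
        System.out.println(sumOverSnapshot(syncMap)); // 6
        System.out.println(sumUnderLock(syncMap));    // 6
    }
}
```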

 

3. Better option: ConcurrentHashMap

The ConcurrentMap interface and one of its implementation classes, ConcurrentHashMap, were added in Java 5. In ConcurrentHashMap, neither the key nor the value can be null. ConcurrentHashMap uses a different locking mechanism from Hashtable and SynchronizedMap. Hashtable locks the entire hash table for every operation, so only one thread can operate on it at a time, whereas ConcurrentHashMap locks only one part of the table at a time. In the original implementation (Java 5 through 7) the hash table is divided into 16 segments by default, and common operations such as put and remove lock only the segment they need. Where previously only one thread could enter at a time, 16 writer threads can now execute simultaneously, which is a clear improvement in concurrent performance. Java 8 reimplemented the class with even finer-grained locking on individual buckets, but the principle is the same.
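Besides finer-grained locking, the ConcurrentMap interface offers atomic compound methods such as putIfAbsent, which removes the need for external locking in check-then-act code like the example in section 2. The sketch below is illustrative only; the class and helper names are invented for this example.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

final class ConcurrentMapBasics {
    // Returns true if the map rejects a null value with a NullPointerException.
    static boolean rejectsNullValue(ConcurrentMap<String, String> map) {
        try {
            map.put("k", null);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    // remove(key) is one atomic call; a non-null result means this call
    // removed the mapping. Values are never null, so the test is unambiguous.
    static <K, V> boolean removeIfPresent(ConcurrentMap<K, V> map, K key) {
        return map.remove(key) != null;
    }

    public static void main(String[] args) {
        ConcurrentMap<String, String> map = new ConcurrentHashMap<>();
        System.out.println(rejectsNullValue(map));              // true
        System.out.println(map.putIfAbsent("key", "first"));    // null (no previous value)
        System.out.println(map.putIfAbsent("key", "second"));   // first (existing value kept)
        System.out.println(removeIfPresent(map, "key"));        // true
        System.out.println(removeIfPresent(map, "key"));        // false
    }
}
```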

The 16 threads mentioned above are writer threads; most read operations do not need a lock at all. In that implementation, only operations such as size may need to lock the entire hash table.
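To see several writers working on one map without external locking, consider the following sketch. It assumes Java 8 or later (it uses merge and lambdas); the class name, key names, and thread counts are arbitrary choices for illustration. Each worker updates its own key plus a shared "total" key, and the atomic merge ensures no increment is lost.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ConcurrentWriters {
    // Starts the given number of writer threads and returns the map they filled.
    static Map<String, Integer> runWriters(int threads, int incrementsPerThread) {
        ConcurrentHashMap<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            final String ownKey = "worker-" + t;
            pool.execute(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counts.merge(ownKey, 1, Integer::sum);  // key used by one thread
                    counts.merge("total", 1, Integer::sum); // key shared by all threads
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("writers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for writers", e);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = runWriters(16, 1000);
        System.out.println(counts.get("total")); // 16000
        System.out.println(counts.size());       // 17
    }
}
```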

ConcurrentHashMap also iterates differently. If the collection changes after an iterator has been created, ConcurrentModificationException is no longer thrown; the iterator is weakly consistent. Instead, when the collection is changed, new data is created so that the original data is not affected, and once the change is complete the head pointer is replaced with the new data. The iterating thread can therefore keep using the old data while the writer thread completes its change concurrently.
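A short sketch of this behaviour, with invented class and method names: the map is modified in the middle of a for-each loop over its key set. With a HashMap this pattern would fail, but a ConcurrentHashMap iterator carries on. Whether the iterator also visits the newly added key is not guaranteed, so the example prints the count without assuming a value.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class WeaklyConsistentIteration {
    // Adds a key while iterating over the key set and returns how many keys
    // the iterator visited. No ConcurrentModificationException is thrown.
    static int visitWhileModifying(Map<String, Integer> map) {
        int visited = 0;
        for (String key : map.keySet()) {
            map.put("added-during-iteration", 0); // new mapping on the first pass only
            visited++;
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        int visited = visitWhileModifying(map);
        // The three original keys are visited; the added key may or may not be.
        System.out.println("visited " + visited + " keys, map now holds " + map.size());
    }
}
```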

