ConcurrentHashMap: principle and source code analysis

I previously wrote about the underlying principles of HashMap; today, let's look at the thread-safe ConcurrentHashMap.

First, a quick review of the relevant background:

  • HashMap:

HashMap is not thread-safe. In a concurrent environment, it may form a cyclic linked list (this can happen during resizing; for the exact cause, search for it or see the HashMap source code analysis), after which a get can loop forever and spin the CPU. Using HashMap in a concurrent environment is therefore very dangerous.

 

  • Hashtable:

Hashtable works almost the same way as HashMap; the differences are that:

1. Hashtable does not allow null keys or values;

2. Hashtable is thread-safe.

But Hashtable's thread-safety strategy is implemented at great cost, simply and crudely: all get/put operations are synchronized, which is equivalent to putting one big lock on the entire hash table. Under multi-threaded access, as long as one thread is operating on the object, all other threads can only block; every operation is effectively serialized, and concurrent performance under contention is very poor.
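To see what that big lock looks like in code, here is a minimal sketch of the single-lock design, written against a plain HashMap rather than the real JDK internals (BigLockMap is my own illustrative name):

import java.util.HashMap;
import java.util.Map;

// Illustrative sketch of the single-lock design: every operation synchronizes
// on the same monitor, so all access to the table is serialized.
class BigLockMap<K, V> {
    private final Map<K, V> table = new HashMap<>();

    public synchronized V put(K key, V value) {
        if (key == null || value == null)  // like Hashtable, reject nulls
            throw new NullPointerException();
        return table.put(key, value);
    }

    public synchronized V get(K key) {
        return table.get(key);             // even readers take the big lock
    }
}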

As the picture shows:

 

Hashtable's poor performance is mainly due to all operations competing for the same lock. If the container had multiple locks, each locking one slice of the data, then multi-threaded access to data in different slices would involve no lock contention, which effectively improves concurrency. This is the "segment locking" (lock striping) idea that ConcurrentHashMap uses.
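Before diving into the JDK source, here is a minimal lock-striping sketch to make the idea concrete. It is illustrative only, assuming nothing about the JDK internals (StripedMap, the fixed 16 stripes, and the backing HashMaps are my own choices): the key's hash selects one of 16 independent locks, so threads working on different stripes never block each other.

import java.util.HashMap;
import java.util.Map;

// Illustrative lock-striping sketch (not the JDK implementation): the key's
// hash picks one of several independent locks, so threads that operate on
// different stripes do not contend.
class StripedMap<K, V> {
    private static final int STRIPES = 16;  // mirrors the default concurrencyLevel
    private final Object[] locks = new Object[STRIPES];
    @SuppressWarnings("unchecked")
    private final Map<K, V>[] tables = (Map<K, V>[]) new Map[STRIPES];

    StripedMap() {
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new Object();
            tables[i] = new HashMap<>();
        }
    }

    private int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % STRIPES;  // non-negative stripe index
    }

    public V put(K key, V value) {
        int i = stripeFor(key);
        synchronized (locks[i]) {            // lock only this key's stripe
            return tables[i].put(key, value);
        }
    }

    public V get(K key) {
        int i = stripeFor(key);
        synchronized (locks[i]) {
            return tables[i].get(key);
        }
    }
}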

Illustrated as follows:

 

 

  • ConcurrentHashMap

ConcurrentHashMap uses a very subtle "segment lock" strategy: the backbone of a ConcurrentHashMap is a Segment array.

Note that the segments array is declared final:

final Segment<K,V>[] segments;

Segment extends ReentrantLock, so each Segment is itself a reentrant lock.

Within ConcurrentHashMap, a Segment is a sub-hash-table: each Segment maintains its own HashEntry array, so in a concurrent environment, operations on different Segments involve no lock contention. (With the default concurrencyLevel of 16, in theory 16 threads can operate concurrently.)

So thread synchronization only needs to be considered for operations within the same Segment; operations on different Segments need no coordination.

 

To repeat: a Segment is similar to a HashMap, and each Segment maintains a HashEntry array, declared as:

 transient volatile HashEntry<K,V>[] table;

HashEntry is the smallest logical processing unit mentioned here.

In other words: a ConcurrentHashMap maintains a Segment array, and each Segment maintains a HashEntry array.

static final class HashEntry<K,V> {
        final int hash;
        final K key;
        volatile V value;
        volatile HashEntry<K,V> next;
        // other fields and methods omitted
}

We said a Segment is similar to a hash table, and indeed some of its fields are much like those of the HashMap covered before, such as the load factor loadFactor and the threshold. Look at Segment's constructor:

Segment(float lf, int threshold, HashEntry<K,V>[] tab) {
    this.loadFactor = lf;       // load factor
    this.threshold = threshold; // resize threshold
    this.table = tab;           // the backbone array, i.e. the HashEntry array
}

Now let's look at the constructor of ConcurrentHashMap:

public ConcurrentHashMap(int initialCapacity, float loadFactor, int concurrencyLevel) {
    if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
        throw new IllegalArgumentException();
    // MAX_SEGMENTS = 1 << 16 = 65536, i.e. the maximum concurrency level is 65536
    if (concurrencyLevel > MAX_SEGMENTS)
        concurrencyLevel = MAX_SEGMENTS;
    // 2 to the power sshift equals ssize, e.g. ssize = 16 -> sshift = 4; ssize = 32 -> sshift = 5
    int sshift = 0;
    // ssize is the length of the segments array, computed from concurrencyLevel
    int ssize = 1;
    while (ssize < concurrencyLevel) {
        ++sshift;
        ssize <<= 1;
    }
    // segmentShift and segmentMask are used later when locating a segment,
    // explained in detail below
    this.segmentShift = 32 - sshift;
    this.segmentMask = ssize - 1;
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    // compute cap, the length of each Segment's HashEntry array; cap must be 2^n
    int c = initialCapacity / ssize;
    if (c * ssize < initialCapacity)
        ++c;
    int cap = MIN_SEGMENT_TABLE_CAPACITY;
    while (cap < c)
        cap <<= 1;
    // create the segments array and initialize the first Segment;
    // the remaining Segments are initialized lazily
    Segment<K,V> s0 =
        new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                         (HashEntry<K,V>[])new HashEntry[cap]);
    Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
    UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
    this.segments = ss;
}

The constructor takes three parameters; if the caller does not specify them, the defaults are used: initialCapacity is 16, loadFactor is 0.75 (the load factor, needed for resizing), and concurrencyLevel is 16.

From the code above we can see:

The size of the Segment array, ssize, is determined by concurrencyLevel, but is not necessarily equal to it: ssize must be the smallest power of 2 greater than or equal to concurrencyLevel. For example: with the default concurrencyLevel of 16, ssize is 16; with a concurrencyLevel of 14, ssize is 16; with a concurrencyLevel of 17, ssize is 32.
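Those numbers are easy to verify by extracting the constructor's sizing loop into a standalone demo (the class and method names here are mine):

// The sizing loop from the constructor, extracted so it can be run standalone.
public class SegmentSizing {

    static int ssizeFor(int concurrencyLevel) {
        int ssize = 1;
        while (ssize < concurrencyLevel)
            ssize <<= 1;                  // smallest power of 2 >= concurrencyLevel
        return ssize;
    }

    public static void main(String[] args) {
        System.out.println(ssizeFor(16)); // 16
        System.out.println(ssizeFor(14)); // 16
        System.out.println(ssizeFor(17)); // 32
    }
}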

Why must the size of the Segment array be a power of 2?

The key reason is that a power-of-2 length makes it easy to locate the Segment index from the hash with bitwise operations. For a more detailed explanation, see the companion article "HashMap: principle and source code analysis", which analyzes in detail why the array length must be a power of 2.
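Concretely, when the length n is a power of 2, hash & (n - 1) selects the same bucket as hash mod n, but with a single AND instruction instead of a division. A quick standalone check (for non-negative hashes; the class name is mine):

// When n is a power of 2, (h & (n - 1)) == (h % n) for non-negative h, so
// bucket indexing needs only one bitwise AND instead of a division.
public class PowerOfTwoIndexing {
    public static void main(String[] args) {
        int n = 16;                        // a power-of-2 table length
        for (int h : new int[] {5, 21, 33, 1_000_003}) {
            System.out.println(h + ": " + (h & (n - 1)) + " == " + (h % n));
        }
    }
}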

Next, let's take a look at the put method:

public V put(K key, V value) {
    Segment<K,V> s;
    // ConcurrentHashMap does not allow a null key or value
    if (value == null)
        throw new NullPointerException();
    // re-hash the key's hashCode to compensate for poor hashCode
    // implementations and spread entries more uniformly
    int hash = hash(key);
    // unsigned right shift by segmentShift, then AND with the segment mask:
    // this is how the segment is located
    int j = (hash >>> segmentShift) & segmentMask;
    if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
         (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
        s = ensureSegment(j);
    return s.put(key, hash, value, false);
}

From the source, put proceeds in two main steps:

1. Locate the Segment and ensure that the located Segment has been initialized (it is created lazily on first use; a sketch of that pattern follows this list);

2. Call that Segment's put method.
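The lazy initialization in step 1 is done by ensureSegment, which follows a common publish-by-CAS pattern: read the slot with volatile semantics, build a candidate off to the side, and let exactly one thread's CAS win. Here is a sketch of that pattern using AtomicReferenceArray instead of Unsafe (the names LazySlots and ensure are mine, not from the JDK):

import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.Supplier;

// Sketch of the lazy-initialization pattern behind ensureSegment, using
// AtomicReferenceArray in place of Unsafe: only one thread's candidate is
// published; the losers discard theirs and use the winner's.
class LazySlots<T> {
    private final AtomicReferenceArray<T> slots;

    LazySlots(int n) {
        slots = new AtomicReferenceArray<>(n);
    }

    T ensure(int i, Supplier<T> factory) {
        T s = slots.get(i);                  // volatile read
        if (s == null) {
            T candidate = factory.get();     // build without holding any lock
            if (!slots.compareAndSet(i, null, candidate))
                s = slots.get(i);            // another thread won the race
            else
                s = candidate;               // our candidate was published
        }
        return s;
    }
}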

 

About segmentShift and segmentMask

  The global variables segmentShift and segmentMask are mainly used to locate a Segment: int j = (hash >>> segmentShift) & segmentMask.

  segmentMask: the segment mask. If the segments array length is 16, the mask is 16 - 1 = 15; if the length is 32, the mask is 32 - 1 = 31. Every bit of the mask is 1, which helps preserve the uniformity of the hash.

  segmentShift: 2 raised to sshift equals ssize, and segmentShift = 32 - sshift. If the segments length is 16, segmentShift = 32 - 4 = 28; if the length is 32, segmentShift = 32 - 5 = 27. The computed hash is at most 32 bits; the unsigned right shift by segmentShift keeps only the high-order bits (the remaining bits are not used), and ANDing with segmentMask then yields the Segment index.
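Here is the same arithmetic as a small worked example, assuming the default 16 segments (so segmentShift = 28 and segmentMask = 15): the top 4 bits of the hash choose the segment.

// Worked example: with 16 segments, segmentShift = 28 and segmentMask = 15,
// so the top 4 bits of the 32-bit hash select the segment.
public class SegmentLocator {
    public static void main(String[] args) {
        int segmentShift = 28;
        int segmentMask = 15;
        int hash = 0xAB1234CD;                         // an arbitrary 32-bit hash
        int j = (hash >>> segmentShift) & segmentMask; // keeps only the top nibble
        System.out.println("segment index = " + j);    // 0xA == 10
    }
}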

 

get / put methods

  get method

public V get(Object key) {
        Segment<K,V> s; 
        HashEntry<K,V>[] tab;
        int h = hash(key);
        long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
        // first locate the Segment, then locate the HashEntry
        if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
            (tab = s.table) != null) {
            for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                 e != null; e = e.next) {
                K k;
                if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                    return e.value;
            }
        }
        return null;
    }

The get method requires no locking: the shared variables it reads are all declared volatile, and volatile guarantees memory visibility, so it will not read stale data.
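A tiny demonstration of that visibility guarantee (my own demo, not from the article): a reader thread spins on get with no user-level locking until it observes the value published by the main thread's put.

import java.util.concurrent.ConcurrentHashMap;

// Demo: the reader sees the writer's put without any explicit locking,
// relying on the map's internal volatile reads for memory visibility.
public class VisibilityDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();
        Thread reader = new Thread(() -> {
            while (map.get("answer") == null) { }      // spin until visible
            System.out.println("saw " + map.get("answer"));
        });
        reader.start();
        map.put("answer", 42);                         // published to the reader
        reader.join();
    }
}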

 

ConcurrentHashMap's put delegates to the Segment's put method, and it is inside Segment.put that the lock is taken, which keeps the lock granularity fine.

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    // If tryLock fails, scanAndLockForPut traverses the list at the target
    // HashEntry slot (mainly to warm the CPU cache) and, if the key is not
    // found, creates the HashEntry in advance. After a bounded number of
    // tryLock attempts (controlled by MAX_SCAN_RETRIES) it falls back to
    // lock(). If another thread changes the head of the list during the
    // scan, the traversal restarts.
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value);
    V oldValue;
    try {
        HashEntry<K,V>[] tab = table;
        // locate the HashEntry: note that the same hash is used both to
        // locate the Segment and to locate the slot within it, but Segment
        // location uses only the high-order bits
        int index = (tab.length - 1) & hash;
        HashEntry<K,V> first = entryAt(tab, index);
        for (HashEntry<K,V> e = first;;) {
            if (e != null) {
                K k;
                if ((k = e.key) == key ||
                    (e.hash == hash && key.equals(k))) {
                    oldValue = e.value;
                    if (!onlyIfAbsent) {
                        e.value = value;
                        ++modCount;
                    }
                    break;
                }
                e = e.next;
            }
            else {
                if (node != null)
                    node.setNext(first);
                else
                    node = new HashEntry<K,V>(hash, key, value, first);
                int c = count + 1;
                // if c exceeds the threshold, resize and rehash; the new
                // capacity is twice the current one, which maximizes the number
                // of entries that keep their slot after rehashing (see the
                // HashMap article for details). Resizing and rehashing are
                // resource-consuming.
                if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                    rehash(node);
                else
                    setEntryAt(tab, index, node);
                ++modCount;
                count = c;
                oldValue = null;
                break;
            }
        }
    } finally {
        unlock();
    }
    return oldValue;
}
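Finally, a small usage sketch (again my own demo): four threads writing disjoint key ranges into one map constructed with the default concurrencyLevel of 16, so under the segment design above most writes proceed without lock contention.

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Demo: writers touching disjoint key ranges mostly land in different
// segments, so their puts rarely contend for the same Segment lock.
public class ConcurrentPutDemo {
    public static void main(String[] args) throws InterruptedException {
        ConcurrentHashMap<Integer, Integer> map =
            new ConcurrentHashMap<>(16, 0.75f, 16);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int t = 0; t < 4; t++) {
            final int base = t * 1000;
            pool.submit(() -> {
                for (int i = 0; i < 1000; i++)
                    map.put(base + i, i);   // disjoint key range per thread
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println(map.size());     // 4000
    }
}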

 

 

 

 

That's all...

 

References:

1. ConcurrentHashMap: principle and source code analysis

2. ConcurrentHashMap implementation principles that interviews always ask about: data structure, get and put operations (this one illustrates the source code well; each segment of code is commented and is worth a look)

3. ConcurrentHashMap principle analysis (explains the initialization, put, and get flows in detail)

 
