ConcurrentHashMap: Java's Concurrent Map Container

This article takes a look at the thread-safe ConcurrentHashMap.

When it comes to Map structures, you have almost certainly met Hashtable and HashMap, but what is the difference between the two? Hashtable is a thread-safe Map: its operation methods are marked with the synchronized keyword, which guarantees that element operations do not conflict under concurrent access. HashMap is not thread-safe; under concurrent modification its internal linked lists can become corrupted during a resize, after which a lookup can spin in an infinite loop. That problem is briefly explained in another article, "HashMap Infinite Loop Problem".

So why bring up ConcurrentHashMap here? Why not just use Hashtable or HashMap directly?

The answer is: Hashtable's thread safety is heavyweight. It relies on the synchronized keyword and the JVM's intrinsic locks, which causes noticeable performance problems under high concurrency. HashMap, on the other hand, is simply not thread-safe. The JDK therefore provides an optimized thread-safe Map structure, ConcurrentHashMap, which achieves thread safety without putting synchronized on every method. Instead, it uses finer-grained locking at the code level (lock striping in JDK 1.7), which narrows the scope of each lock and improves performance under concurrency.
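As a quick, hedged illustration (the class and variable names below are my own, not from the JDK source): all three containers share the Map interface, but unlike HashMap, neither Hashtable nor ConcurrentHashMap accepts null keys or null values:

```java
import java.util.Hashtable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class MapChoiceDemo {
    public static void main(String[] args) {
        // Both are thread-safe Map implementations; Hashtable serializes
        // every call on one intrinsic lock, ConcurrentHashMap does not.
        Map<String, Integer> legacy = new Hashtable<>();
        Map<String, Integer> concurrent = new ConcurrentHashMap<>();

        legacy.put("a", 1);
        concurrent.put("a", 1);

        // Neither implementation permits null keys or null values.
        boolean legacyRejectsNullKey = false;
        try {
            legacy.put(null, 1);
        } catch (NullPointerException e) {
            legacyRejectsNullKey = true;
        }
        boolean concurrentRejectsNullValue = false;
        try {
            concurrent.put("b", null);
        } catch (NullPointerException e) {
            concurrentRejectsNullValue = true;
        }
        System.out.println(legacyRejectsNullKey && concurrentRejectsNullValue);
    }
}
```

This null restriction exists precisely because of concurrency: in a concurrent map, a null return from get would be ambiguous between "no mapping" and "mapped to null".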

The following describes the implementation of ConcurrentHashMap in JDK 1.7. The JDK version matters because the implementation of this class differs noticeably between versions.

ConcurrentHashMap implementation principle

Implementation in JDK1.7

We know that Hashtable locks the entire table, which is clearly unsuitable for large amounts of data: while one element is being operated on, access to every other element is blocked. ConcurrentHashMap therefore adopts a technique called lock segmentation (also known as lock striping): different segments of the table are guarded by different locks, so operations on different segments do not affect each other. This shrinks the scope of each lock and thereby improves concurrent read and write performance.

In JDK 1.7, the data structure of ConcurrentHashMap consists of a Segment array and multiple HashEntry arrays, as shown in the following figure:

The point of the Segment array is to split one big table into multiple small tables, each locked independently, which is the lock segmentation technique mentioned above. Each Segment element holds a HashEntry array plus linked lists, the same storage structure that HashMap uses.

Initialization process

The initialization code of ConcurrentHashMap is shown below. The size ssize of the Segment array is computed with bit operations, so ssize is always a power of 2. In addition, concurrencyLevel is capped at MAX_SEGMENTS (1 << 16), so the Segment array can hold at most 65536 segments. The HashEntry array under each Segment is likewise sized to a power of 2.

/**
     * Creates a new, empty map with the specified initial
     * capacity, load factor and concurrency level.
     *
     * @param initialCapacity the initial capacity. The implementation
     * performs internal sizing to accommodate this many elements.
     * @param loadFactor  the load factor threshold, used to control resizing.
     * Resizing may be performed when the average number of elements per
     * bin exceeds this threshold.
     * @param concurrencyLevel the estimated number of concurrently
     * updating threads. The implementation performs internal sizing
     * to try to accommodate this many threads.
     * @throws IllegalArgumentException if the initial capacity is
     * negative or the load factor or concurrencyLevel are
     * nonpositive.
     */
    @SuppressWarnings("unchecked")
    public ConcurrentHashMap(int initialCapacity,
                             float loadFactor, int concurrencyLevel) {
        if (!(loadFactor > 0) || initialCapacity < 0 || concurrencyLevel <= 0)
            throw new IllegalArgumentException();
        if (concurrencyLevel > MAX_SEGMENTS)
            concurrencyLevel = MAX_SEGMENTS;
        // Find power-of-two sizes best matching arguments
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        this.segmentShift = 32 - sshift;
        this.segmentMask = ssize - 1;
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        int c = initialCapacity / ssize;
        if (c * ssize < initialCapacity)
            ++c;
        int cap = MIN_SEGMENT_TABLE_CAPACITY;
        while (cap < c)
            cap <<= 1;
        // create segments and segments[0]
        Segment<K,V> s0 =
            new Segment<K,V>(loadFactor, (int)(cap * loadFactor),
                             (HashEntry<K,V>[])new HashEntry[cap]);
        Segment<K,V>[] ss = (Segment<K,V>[])new Segment[ssize];
        UNSAFE.putOrderedObject(ss, SBASE, s0); // ordered write of segments[0]
        this.segments = ss;
    }

As the source shows, note that not all segments are initialized when the ConcurrentHashMap is created. Only one segment, the s0 variable (segments[0]), is created up front; the remaining segments are created lazily, on demand, as the container is operated on.
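The power-of-two sizing logic can be traced standalone. The sketch below re-runs the same arithmetic as the constructor above (the class wrapper and the sample arguments are my own) for concurrencyLevel = 16 and initialCapacity = 33:

```java
public class SegmentSizingDemo {
    // Mirrors the power-of-two sizing in the JDK 1.7 constructor shown
    // above; values are computed standalone for illustration.
    static final int MIN_SEGMENT_TABLE_CAPACITY = 2;

    public static void main(String[] args) {
        int concurrencyLevel = 16;
        int initialCapacity = 33;

        // Segment array length: smallest power of two >= concurrencyLevel.
        int sshift = 0;
        int ssize = 1;
        while (ssize < concurrencyLevel) {
            ++sshift;
            ssize <<= 1;
        }
        int segmentShift = 32 - sshift;
        int segmentMask = ssize - 1;

        // Per-segment table capacity: smallest power of two that spreads
        // initialCapacity across ssize segments.
        int c = initialCapacity / ssize;
        if (c * ssize < initialCapacity)
            ++c;
        int cap = MIN_SEGMENT_TABLE_CAPACITY;
        while (cap < c)
            cap <<= 1;

        System.out.println(ssize + " " + segmentShift + " " + segmentMask + " " + cap);
    }
}
```

With these inputs the map gets 16 segments, each starting with a 4-slot HashEntry table, and the top 4 bits of a key's hash select the segment.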

put operation

Inserting data requires two hash steps to locate the final position of the element: the first locates the Segment, and the second locates the index within that Segment's HashEntry array.

/**
     * Maps the specified key to the specified value in this table.
     * Neither the key nor the value can be null.
     *
     * <p> The value can be retrieved by calling the <tt>get</tt> method
     * with a key that is equal to the original key.
     *
     * @param key key with which the specified value is to be associated
     * @param value value to be associated with the specified key
     * @return the previous value associated with <tt>key</tt>, or
     *         <tt>null</tt> if there was no mapping for <tt>key</tt>
     * @throws NullPointerException if the specified key or value is null
     */
    @SuppressWarnings("unchecked")
    public V put(K key, V value) {
        Segment<K,V> s;
        if (value == null)
            throw new NullPointerException();
        int hash = hash(key);
        int j = (hash >>> segmentShift) & segmentMask;
        if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
             (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
            s = ensureSegment(j);
        return s.put(key, hash, value, false);
    }

As shown above, the hash of the key first determines the subscript j of the target Segment in the segments array. If segments[j] does not exist yet, ensureSegment(j) initializes it. The Segment's own put method is then called to complete the insertion of the key into that Segment's HashEntry array.
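To make the index computation concrete, here is a small standalone sketch (the hash value is an arbitrary illustrative constant) of how `(hash >>> segmentShift) & segmentMask` extracts the top bits of the hash as the segment index when there are 16 segments:

```java
public class SegmentIndexDemo {
    public static void main(String[] args) {
        // With 16 segments: segmentShift = 28 and segmentMask = 15, so the
        // TOP 4 bits of the hash pick the segment; the low bits later pick
        // the slot inside that segment's HashEntry array.
        int segmentShift = 28;
        int segmentMask = 15;

        int hash = 0xA3000007; // top nibble 0xA = 10 selects the segment
        int j = (hash >>> segmentShift) & segmentMask;
        System.out.println(j);
    }
}
```

Using the high bits for the segment and the low bits for the bucket means the two indexing steps are decorrelated, so keys that collide in one segment still spread out across that segment's buckets.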

The put operation logic in the Segment class is as follows:

final V put(K key, int hash, V value, boolean onlyIfAbsent) {
            HashEntry<K,V> node = tryLock() ? null :
                scanAndLockForPut(key, hash, value);
            V oldValue;
            try {
                HashEntry<K,V>[] tab = table;
                int index = (tab.length - 1) & hash;
                HashEntry<K,V> first = entryAt(tab, index);
                for (HashEntry<K,V> e = first;;) {
                    if (e != null) {
                        K k;
                        if ((k = e.key) == key ||
                            (e.hash == hash && key.equals(k))) {
                            oldValue = e.value;
                            if (!onlyIfAbsent) {
                                e.value = value;
                                ++modCount;
                            }
                            break;
                        }
                        e = e.next;
                    }
                    else {
                        if (node != null)
                            node.setNext(first);
                        else
                            node = new HashEntry<K,V>(hash, key, value, first);
                        int c = count + 1;
                        if (c > threshold && tab.length < MAXIMUM_CAPACITY)
                            rehash(node);
                        else
                            setEntryAt(tab, index, node);
                        ++modCount;
                        count = c;
                        oldValue = null;
                        break;
                    }
                }
            } finally {
                unlock();
            }
            return oldValue;
        }

As the code shows, the entire operation runs under the protection of the segment's lock. First, the hash value is ANDed with (table length - 1) to obtain the element's index in the HashEntry array, and the linked list at that index is fetched. If the bucket is empty, a new node becomes the head of the list. Otherwise the list is traversed from the head to check whether the key already exists: if it does, the old value is returned (and overwritten unless onlyIfAbsent is set); if it does not, the new node is linked in at the head. When the insertion would push the segment's count past its threshold, rehash(node) grows the segment's table first.
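The onlyIfAbsent flag in the method above is what distinguishes the public put and putIfAbsent methods. A minimal sketch of the observable behavior (the class name is mine):

```java
import java.util.concurrent.ConcurrentHashMap;

public class PutSemanticsDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, Integer> map = new ConcurrentHashMap<>();

        // put(...) reaches the segment put with onlyIfAbsent = false:
        // it overwrites and returns the previous value (null on first insert).
        System.out.println(map.put("k", 1));
        System.out.println(map.put("k", 2));

        // putIfAbsent(...) uses onlyIfAbsent = true: an existing mapping
        // is left untouched and the current value is returned.
        System.out.println(map.putIfAbsent("k", 9));
        System.out.println(map.get("k"));
    }
}
```

Because the check and the insert happen under one segment lock, putIfAbsent is atomic, unlike a separate containsKey-then-put sequence.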

get operation

The get operation of ConcurrentHashMap is similar to HashMap's, except that ConcurrentHashMap first hashes to locate the Segment, then hashes again to locate the HashEntry bucket, and finally traverses the linked list in that bucket, returning the value on a match or null otherwise.

The source code is as follows:

/**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code key.equals(k)},
     * then this method returns {@code v}; otherwise it returns
     * {@code null}.  (There can be at most one such mapping.)
     *
     * @throws NullPointerException if the specified key is null
     */
    public V get(Object key) {
        Segment<K,V> s; // manually integrate access methods to reduce overhead
        HashEntry<K,V>[] tab;
        int h = hash(key);
        long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
        if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
            (tab = s.table) != null) {
            for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
                     (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
                 e != null; e = e.next) {
                K k;
                if ((k = e.key) == key || (e.hash == h && key.equals(k)))
                    return e.value;
            }
        }
        return null;
    }

From the source: hash(key) first yields the index of the corresponding Segment in the segments array. If that Segment (or its table) is null, the method returns null immediately. Otherwise the hash picks the bucket in the Segment's HashEntry array, and the linked list there is traversed, comparing each node's key with the requested key; a matching element's value is returned, and if the whole list is exhausted without a match, null is returned. Note that get takes no locks at all: it relies on volatile reads (UNSAFE.getObjectVolatile) for visibility.
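A short sketch of the observable get behavior (the class name is mine):

```java
import java.util.concurrent.ConcurrentHashMap;

public class GetDemo {
    public static void main(String[] args) {
        ConcurrentHashMap<String, String> map = new ConcurrentHashMap<>();
        map.put("present", "found");

        // A hit walks the bucket's linked list and returns the value.
        System.out.println(map.get("present"));
        // A miss (missing segment, empty bucket, or exhausted list) is null.
        System.out.println(map.get("absent"));

        // Unlike HashMap, a null key is rejected with NullPointerException.
        boolean rejected = false;
        try {
            map.get(null);
        } catch (NullPointerException e) {
            rejected = true;
        }
        System.out.println(rejected);
    }
}
```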

size operation

Computing the element count of a ConcurrentHashMap is an interesting problem, because the map is operated on concurrently: while size is being computed, other threads may still be inserting data, so the computed size can differ from the actual size by the time it is returned. Let's see how JDK 1.7 handles this.

public int size() {
        // Try a few times to get accurate count. On failure due to
        // continuous async changes in table, resort to locking.
        final Segment<K,V>[] segments = this.segments;
        int size;
        boolean overflow; // true if size overflows 32 bits
        long sum;         // sum of modCounts
        long last = 0L;   // previous sum
        int retries = -1; // first iteration isn't retry
        try {
            for (;;) {
                if (retries++ == RETRIES_BEFORE_LOCK) {
                    for (int j = 0; j < segments.length; ++j)
                        ensureSegment(j).lock(); // force creation
                }
                sum = 0L;
                size = 0;
                overflow = false;
                for (int j = 0; j < segments.length; ++j) {
                    Segment<K,V> seg = segmentAt(segments, j);
                    if (seg != null) {
                        sum += seg.modCount;
                        int c = seg.count;
                        if (c < 0 || (size += c) < 0)
                            overflow = true;
                    }
                }
                if (sum == last)
                    break;
                last = sum;
            }
        } finally {
            if (retries > RETRIES_BEFORE_LOCK) {
                for (int j = 0; j < segments.length; ++j)
                    segmentAt(segments, j).unlock();
            }
        }
        return overflow ? Integer.MAX_VALUE : size;
    }

Here, the element counts of all segments are first summed without locking, and the per-segment modCounts are summed as well. Two consecutive passes are compared: if the modCount sums match, no modification happened in between, and the summed count is taken as the final size. If they differ, the calculation is retried and compared again, until two consecutive results agree or the retry limit is exceeded (RETRIES_BEFORE_LOCK = 2, which in effect allows three unlocked passes before locking). Once the limit is exceeded, concurrency is assumed to be too heavy: every segment of the container is locked to block further modifications, the elements are counted under the locks, and all the locks are released after counting finishes.
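The retry-then-lock strategy can be sketched in isolation. The following single-threaded simulation (the segment counts and modCounts are made-up fixed values, so the sums are stable and the lock path is never triggered) mirrors the loop structure of size():

```java
public class SizeRetryDemo {
    // Simplified sketch of the JDK 1.7 size() strategy: sum per-segment
    // counts, and use the summed modCounts as a stability check. If two
    // passes see the same total modCount, no writes happened in between
    // and the computed size is trusted.
    static final int RETRIES_BEFORE_LOCK = 2;

    static int[] counts = {3, 5, 2};    // simulated per-segment counts
    static int[] modCounts = {7, 9, 4}; // simulated per-segment modCounts

    public static void main(String[] args) {
        long last = 0L;
        int size = 0;
        boolean locked = false;
        for (int retries = -1; ; ) {
            if (retries++ == RETRIES_BEFORE_LOCK) {
                locked = true; // the real code locks every segment here
            }
            long sum = 0L;
            size = 0;
            for (int j = 0; j < counts.length; j++) {
                sum += modCounts[j];
                size += counts[j];
            }
            if (sum == last) // stable across two passes -> done
                break;
            last = sum;
        }
        System.out.println(size + " " + locked);
    }
}
```

With stable inputs the loop settles on the second pass without ever locking, which is exactly the fast path the real implementation is optimized for.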

In JDK 1.8 the Segment concept was abandoned; instead, a Node array + linked list + red-black tree structure is used, with the synchronized keyword and CAS for concurrency control. Readers interested in the JDK 1.8 implementation of ConcurrentHashMap can explore the source for themselves.

Thank you for reading. If you are interested in Java programming, middleware, databases, and open-source frameworks, please follow my blog and Toutiao account (Source Code Empire), where I regularly post related technical articles. Let's discuss and learn together. Thank you.

If you found this article helpful, feel free to leave a tip. A penny is not too little, and a hundred is not too much. ^_^ Thank you.

