What does it mean that HashMap is not thread safe?

Two questions that every Java programmer has been asked at some point:

  • Why is HashMap not thread safe?
  • Why is ConcurrentHashMap thread safe?

Why is HashMap not thread safe?

Fail-Fast mechanism

  • If another thread structurally modifies the map while you are iterating over it, the iterator throws ConcurrentModificationException. This is the so-called fail-fast strategy.
  • Looking at the source code, ArrayList, LinkedList, and HashMap each keep an int field named modCount, and every structural modification to the collection increments it. The iterator records this value in its constructor with the line expectedModCount = modCount. During calls to nextEntry and remove, if modCount != expectedModCount, a ConcurrentModificationException is thrown.
final Entry<K,V> nextEntry() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    // ... advance to and return the next entry ...
}

public void remove() {
    if (modCount != expectedModCount)
        throw new ConcurrentModificationException();
    // ... unlink the current entry and resync expectedModCount ...
}
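
A minimal demonstration of the fail-fast check; a single thread is enough, since any structural modification during iteration increments modCount (a sketch, assuming a JDK 7+ HashMap):

import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        try {
            for (String key : map.keySet()) {
                // Adding a new key is a structural modification: modCount is
                // incremented, so the iterator's next() throws on its next call.
                map.put("c", 3);
            }
        } catch (java.util.ConcurrentModificationException e) {
            System.out.println("fail-fast triggered: " + e);
        }
    }
}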

Infinite loop

  • HashMap has two important parameters: capacity and load factor. The capacity is always a power of 2, so that hashed keys spread evenly across the array (the bucket index is computed as hash & (capacity - 1)). Once the number of entries reaches capacity * load factor, the table is doubled in size; this process is called rehash.
  • In JDK 7, HashMap uses head insertion during rehash: each node is inserted at the head of its new bucket. Suppose thread 1 is suspended inside transfer just after reading a node's next pointer, and thread 2 then runs a complete rehash. Because head insertion reverses a bucket's order, when thread 1 resumes it can link nodes back to each other, producing a cycle in the list; a later get on that bucket then loops forever (a simplified single-threaded simulation follows the transfer source below).
void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {          // table is the old array
        while (null != e) {
            Entry<K,V> next = e.next;
            if (rehash) {
                e.hash = null == e.key ? 0 : hash(e.key);
            }
            int i = indexFor(e.hash, newCapacity);
            e.next = newTable[i];         // head insertion into the new bucket
            newTable[i] = e;
            e = next;
        }
    }
}
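
The transfer loop above head-inserts each node, which reverses the bucket's order; that reversal is what lets two interleaved rehashes stitch nodes into a cycle. A simplified single-threaded simulation (with a toy Node class, not the JDK's Entry) makes the reversal visible:

public class HeadInsertDemo {
    // Toy node, mirroring the shape of JDK 7's Entry chain.
    static class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    public static void main(String[] args) {
        // Old bucket: A -> B -> C
        Node c = new Node("C", null);
        Node b = new Node("B", c);
        Node a = new Node("A", b);

        // The JDK 7 transfer loop: head-insert each node into the new bucket.
        Node newBucket = null;
        for (Node e = a; e != null; ) {
            Node next = e.next;  // in the classic race, thread 1 is suspended
                                 // right here while thread 2 completes the
                                 // whole rehash and reverses the chain
            e.next = newBucket;
            newBucket = e;
            e = next;
        }

        // New bucket is C -> B -> A (reversed). If thread 1 resumes against
        // the reversed chain, e and next can end up pointing at each other,
        // forming a cycle.
        for (Node e = newBucket; e != null; e = e.next)
            System.out.print(e.key + " ");
    }
}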

Why is ConcurrentHashMap thread safe?

Structure in Java 7


  • In Java 7, ConcurrentHashMap is built from a Segment array, where each Segment stores a HashEntry array whose slots are linked lists. In effect, one big table is split into several small tables that can be locked independently.
  • To locate data, the segment is found first, then the bucket within that segment.
  • When multiple threads operate on the same segment, they contend on that segment's ReentrantLock; threads writing to different segments do not block each other. This is the basic principle of segment (striped) locking.
  • Segment extends ReentrantLock and holds the HashEntry[] array. The Segment array itself cannot grow after initialization, but each segment's HashEntry array can be expanded.
static final class Segment<K,V> extends ReentrantLock implements Serializable {
    // holds its own HashEntry<K,V>[] table, count, threshold, etc.
}
  • To insert data, first locate the segment, then call that segment's put method to insert the new node. Inside the segment's put, the thread tries to acquire the segment lock with tryLock(); if that fails, it falls into scanAndLockForPut, which keeps retrying in a loop until the lock is acquired (a toy lock-striping sketch follows the JDK source below).
public V put(K key, V value) {
    Segment<K,V> s;
    if (value == null)
        throw new NullPointerException();
    // hash the key's hashCode a second time
    int hash = hash(key.hashCode());
    // Locate the segment index from the hash. The hash is a 32-bit int;
    // segmentShift = 28, so the unsigned right shift keeps the top 4 bits,
    // and segmentMask = 15 (binary 1111) masks them, so j is the high 4
    // bits of the hash.
    int j = (hash >>> segmentShift) & segmentMask;
    if ((s = (Segment<K,V>)UNSAFE.getObject          // nonvolatile; recheck
         (segments, (j << SSHIFT) + SBASE)) == null) //  in ensureSegment
        // the segment does not exist yet: create it
        s = ensureSegment(j);
    // delegate to that segment's put to insert the new node
    return s.put(key, hash, value, false);
}
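
To make the bit arithmetic concrete: with the default concurrency level of 16, segmentShift is 28 and segmentMask is 15, so the segment index is simply the top 4 bits of the hash. A small self-contained check (the hash value here is arbitrary):

public class SegmentIndexDemo {
    public static void main(String[] args) {
        int segmentShift = 28;  // 32 bits - 4 bits for 16 segments
        int segmentMask = 15;   // binary 1111

        int hash = 0xA3_51_00_FF;  // arbitrary 32-bit hash
        int j = (hash >>> segmentShift) & segmentMask;

        // Top 4 bits of 0xA35100FF are 0xA = 10, so the segment index is 10.
        System.out.println("segment index = " + j);  // prints 10
    }
}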
final V put(K key, int hash, V value, boolean onlyIfAbsent) {
    // Try to take the segment lock; on failure, spin in scanAndLockForPut
    // until the lock is acquired, then perform the insertion.
    HashEntry<K,V> node = tryLock() ? null :
        scanAndLockForPut(key, hash, value);
    // ... insertion under the segment lock elided ...
}
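
To make the striping idea concrete, here is a minimal toy sketch of a lock-striped map (a hypothetical StripedMap, not the JDK implementation), where each stripe pairs a ReentrantLock with its own plain table, so writes to different stripes never contend:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

public class StripedMap<K, V> {
    private static final int STRIPES = 16;
    private final ReentrantLock[] locks = new ReentrantLock[STRIPES];
    @SuppressWarnings("unchecked")
    private final Map<K, V>[] tables = new Map[STRIPES];

    public StripedMap() {
        for (int i = 0; i < STRIPES; i++) {
            locks[i] = new ReentrantLock();
            tables[i] = new HashMap<>();
        }
    }

    private int stripeFor(Object key) {
        return (key.hashCode() & 0x7fffffff) % STRIPES;
    }

    public V put(K key, V value) {
        int i = stripeFor(key);
        locks[i].lock();  // lock only this stripe, not the whole map
        try {
            return tables[i].put(key, value);
        } finally {
            locks[i].unlock();
        }
    }

    public V get(Object key) {
        int i = stripeFor(key);
        locks[i].lock();
        try {
            return tables[i].get(key);
        } finally {
            locks[i].unlock();
        }
    }
}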

Structure in Java 8


  • Java 8 no longer uses Segment or segment locks; instead it uses one large Node array. To handle hash collisions, a bucket's linked list is converted to a red-black tree once its length exceeds a threshold (8 by default), which cuts lookup within that bucket from O(n) to O(log n).
  • Concurrency is controlled with synchronized plus CAS operations.
  • The ConcurrentHashMap constructor does essentially nothing; the table is lazily initialized on the first put.
  • Both val and next in Node are declared volatile, which guarantees their visibility across threads (a simplified sketch of the node follows).
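
The node shape, simplified from the JDK 8 source (only hash and key are final; val and next are volatile):

// Simplified from java.util.concurrent.ConcurrentHashMap (JDK 8):
static class Node<K,V> {
    final int hash;           // immutable once constructed
    final K key;
    volatile V val;           // volatile: readers see the latest value without locking
    volatile Node<K,V> next;  // volatile: safe lock-free traversal of the bin

    Node(int hash, K key, V val, Node<K,V> next) {
        this.hash = hash;
        this.key = key;
        this.val = val;
        this.next = next;
    }
}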

Pessimistic lock: fully guarantees exclusive, correct access to the data, because every request locks the data before operating on it and unlocks afterwards. Acquiring and releasing the lock has a cost, so performance is lower.
Optimistic lock: assumes no conflict will occur; if a conflict is detected, the operation fails and is retried until it succeeds. In databases, optimistic locking is typically implemented with a version column.
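
A minimal sketch of the optimistic pattern, using AtomicInteger's compareAndSet (the same CAS primitive that ConcurrentHashMap builds on): read, compute, attempt the swap, and retry on conflict.

import java.util.concurrent.atomic.AtomicInteger;

public class CasRetryDemo {
    private static final AtomicInteger counter = new AtomicInteger(0);

    // Optimistic update: read, compute, CAS; on conflict, loop and retry.
    static void increment() {
        for (;;) {
            int current = counter.get();
            if (counter.compareAndSet(current, current + 1))
                return;  // no conflict: success
            // another thread won the race; try again
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 10_000; i++) increment();
            });
            threads[t].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.get());  // always 40000
    }
}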

  • The put path loops, retrying continuously: table initialization and casTabAt both rely on compareAndSwapInt / compareAndSwapObject, which is an implementation of optimistic locking (see the CAS sketch above).
  • When the target bucket is non-empty, the specific array node tab[i] is locked with synchronized, and the linked list or red-black tree is traversed to insert the data.
// not written during default serialization
transient volatile Node<K,V>[] table; // volatile guarantees visibility across threads

// tracks the container's size, updated via CAS
private transient volatile long baseCount;

/**
 * Initialization and resize control. Negative while the table is being
 * initialized or resized: -1 for initialization, -(1 + number of resizing
 * threads) during a resize. Defaults to 0; when positive it holds the
 * threshold for the next resize.
 */
private transient volatile int sizeCtl;

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    int hash = spread(key.hashCode());
    int binCount = 0;
    // Endless for(;;) loop: keep retrying, because table initialization
    // and casTabAt rely on compareAndSwapInt / compareAndSwapObject.
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        // If the array has not been initialized yet, initialize it (CAS).
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        // Compute the slot for the new entry; if that slot is null,
        // publish the new node directly with CAS.
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        // A hash of MOVED means a resize is in progress: help transfer.
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // Lock only the head node of this bucket, tab[i].
            synchronized (f) {
                // Recheck that tab[i] has not changed since we read it.
                if (tabAt(tab, i) == f) {
                    if (fh >= 0) {
                        // Ordinary bucket: walk the linked list.
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                // Same hash and key: replace the value
                                // (unless onlyIfAbsent).
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                // Reached the tail: append the new node.
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    else if (f instanceof TreeBin) {
                        // The bucket is a red-black tree.
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                       value)) != null) {
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            if (binCount != 0) {
                // If the list grew past the threshold, convert it to a tree.
                if (binCount >= TREEIFY_THRESHOLD)
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    addCount(1L, binCount);
    return null;
}
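
As a closing experiment, a small demo contrasting the two maps under concurrent writes: swap in the commented HashMap line and the final size can come up short (lost updates), or on JDK 7 the process can even hang in the resize loop; with ConcurrentHashMap the count is always exact.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ConcurrentPutDemo {
    public static void main(String[] args) throws InterruptedException {
        Map<Integer, Integer> map = new ConcurrentHashMap<>();
        // Map<Integer, Integer> map = new java.util.HashMap<>(); // unsafe

        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            final int base = t * 100_000;
            threads[t] = new Thread(() -> {
                for (int i = 0; i < 100_000; i++)
                    map.put(base + i, i);  // disjoint keys per thread
            });
            threads[t].start();
        }
        for (Thread t : threads) t.join();

        System.out.println(map.size());  // 400000 with ConcurrentHashMap
    }
}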

Source: blog.csdn.net/u010659877/article/details/108790315