An in-depth analysis of ConcurrentHashMap


Preface

In daily development, we often use HashMap to store key-value pairs. It is implemented with a hash table, trading space for time to improve query performance.

But in multi-threaded concurrent scenarios, HashMap is not thread-safe.

If you need thread safety, you can use ConcurrentHashMap, Hashtable, Collections.synchronizedMap, and so on.

However, the latter two lock with synchronized at too coarse a granularity (the whole map), so they are rarely used; ConcurrentHashMap from the java.util.concurrent package is preferred.

In ConcurrentHashMap, volatile is used to ensure memory visibility, so read scenarios need no "lock" to stay correct.

Write scenarios use CAS + synchronized, where synchronized locks only the first node at a given index of the hash table. This fine-grained locking improves concurrency.

This article analyzes how to use ConcurrentHashMap, the implementation principles of reading, writing, and expansion, and its design ideas.

Before reading this article, you need to understand hash tables, volatile, CAS, synchronized, etc.

For volatile, you can check out this article: 5 cases and flow charts to help you understand the volatile keyword from 0 to 1

For CAS and synchronized, you can view this article: 15,000 words, 6 code cases, and 5 schematic diagrams to help you thoroughly understand Synchronized

Use ConcurrentHashMap

ConcurrentHashMap is a thread-safe Map in concurrent scenarios. It can query and store K and V key-value pairs in concurrent scenarios.

Immutable objects are absolutely thread-safe, regardless of how they are used by the outside world.

ConcurrentHashMap is not absolutely thread-safe. It only guarantees that individual methods are thread-safe; if used incorrectly by the caller, the result can still be thread-unsafe.

Consider the following case: the value stores a counter. We start 10 threads, each incrementing it 100 times, so the final result should be 1000, but incorrect use yields less than 1000.


import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

public void test() {
//      Map<String, Integer> map = new HashMap<>(16);
    Map<String, Integer> map = new ConcurrentHashMap<>(16);

    String key = "key";
    CountDownLatch countDownLatch = new CountDownLatch(10);

    for (int i = 0; i < 10; i++) {
        new Thread(() -> {
            for (int j = 0; j < 100; j++) {
                incr(map, key);
//              incrCompute(map, key);
            }
            countDownLatch.countDown();
        }).start();
    }

    try {
        // block until all threads have finished
        countDownLatch.await();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    // prints less than 1000
    System.out.println(map.get(key));
}

private void incr(Map<String, Integer> map, String key) {
    map.put(key, map.getOrDefault(key, 0) + 1);
}

In the increment method incr, a read is performed first, then the computation, and finally a write. This read-compute-write composite operation is not atomic, so increments can be lost and the final result is usually less than 1000.

The correct way is to use the compute default method introduced in JDK 8.

ConcurrentHashMap implements compute with the same synchronization used by put, applied around the computation.

private void incrCompute(Map<String, Integer> map, String key) {
    // compute runs atomically in ConcurrentHashMap: store 1 if absent, otherwise v + 1
    map.compute(key, (k, v) -> Objects.isNull(v) ? 1 : v + 1);
}
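
As an aside (not from the original article), the same atomic increment can be written with merge, which ConcurrentHashMap also implements atomically; a minimal sketch, with the method name incrMerge being my own:

private void incrMerge(Map<String, Integer> map, String key) {
    // merge is atomic in ConcurrentHashMap: stores 1 if the key is absent,
    // otherwise combines the old value with 1 using Integer::sum
    map.merge(key, 1, Integer::sum);
}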

Data structure

Like HashMap, it is implemented with a hash table plus linked lists/red-black trees.

Hash table

The hash table is backed by an array. When a hash conflict occurs (two keys hash to the same index), separate chaining is used: the colliding nodes are linked into a list.


When a linked list grows too long, traversal becomes expensive. Once it exceeds the threshold (more than 8 nodes in the bin and a table length of at least 64), the list is converted into a red-black tree, which reduces search cost from O(n) to O(log n).
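
For reference, these thresholds appear as constants in the JDK 8 source (the same values as HashMap):

static final int TREEIFY_THRESHOLD = 8;     // convert a bin's list to a tree when it exceeds 8 nodes
static final int UNTREEIFY_THRESHOLD = 6;   // convert a tree back to a list when it shrinks to 6 or fewer during resize
static final int MIN_TREEIFY_CAPACITY = 64; // below this table length, expand instead of treeifying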


ConcurrentHashMap is built on a Node array. During expansion there are two hash tables, old and new: table and nextTable.

public class ConcurrentHashMap<K,V> extends AbstractMap<K,V>
    implements ConcurrentMap<K,V>, Serializable {	
	// the hash table: an array of Nodes
	transient volatile Node<K,V>[] table;
    
    // during expansion two tables coexist so reads and writes keep working; this is the new one
    private transient volatile Node<K,V>[] nextTable;
    
    // 0 by default
    // -1 while initializing
    // -(1 + number of resizing threads) while expanding
    // after initialization/expansion completes, holds the next resize threshold
    private transient volatile int sizeCtl;
    
    // during expansion, marks the boundary for handing out migration ranges
    private transient volatile int transferIndex;
    
    // counter cells used to track the element count
    private transient volatile CounterCell[] counterCells;
}

Node

Node implements the entries of the hash table array as well as the linked-list nodes built when hash conflicts occur.

// node of the hash table, used in the array and in linked lists
static class Node<K,V> implements Map.Entry<K,V> {
    // hash of the node
	final int hash;
	final K key;
	volatile V val;
    // successor pointer when used as a linked list
	volatile Node<K,V> next;    	
}

// during expansion, once a bin is fully migrated, a ForwardingNode becomes the head of the old table's bin
static final class ForwardingNode<K,V> extends Node<K,V> {}

// placeholder used by compute and computeIfAbsent; replaced by a normal Node once the computation finishes
static final class ReservationNode<K,V> extends Node<K,V> {}

// head node of a tree bin, stores root and first
static final class TreeBin<K,V> extends Node<K,V> {}

// node of a tree bin, stores parent, left, right
static final class TreeNode<K,V> extends Node<K,V> {}

Node hash value

// forwarding node
static final int MOVED     = -1;
// head of a red-black tree bin in the array
static final int TREEBIN   = -2;
// reservation (placeholder) node
static final int RESERVED  = -3;

Forwarding node: extends Node; during expansion it is placed as the first node at an index of the old hash table. When a forwarding node is encountered, the search continues in the new hash table.

static final class ForwardingNode<K,V> extends Node<K,V> {
    	// the new hash table
        final Node<K,V>[] nextTable;
    	
        ForwardingNode(Node<K,V>[] tab) {
            // hash is set to MOVED (-1)
            super(MOVED, null, null, null);
            this.nextTable = tab;
        }
}

TreeBin, the red-black tree's head node in the array: extends Node; first points to the first node of the red-black tree.

static final class TreeBin<K,V> extends Node<K,V> {
        TreeNode<K,V> root;
    	// first node of the red-black tree
        volatile TreeNode<K,V> first;
}    


Red-black tree node TreeNode

static final class TreeNode<K,V> extends Node<K,V> {
        TreeNode<K,V> parent;  
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev; 
    	boolean red;
}

Placeholder node: extends Node. When a computation is needed (the compute method), a placeholder node first occupies the slot; after the computation, a real node is constructed to replace it.

	static final class ReservationNode<K,V> extends Node<K,V> {
        ReservationNode() {
            super(RESERVED, null, null, null);
        }

        Node<K,V> find(int h, Object k) {
            return null;
        }
    }

Implementation principle

Construction

During construction, the input parameters are checked, the hash table capacity is calculated from the requested capacity and the load factor, and the result is finally rounded up to a power of 2.

The table is not created during construction; it is created on first use (lazy initialization).

	public ConcurrentHashMap(int initialCapacity,
                             float loadFactor, int concurrencyLevel) {
        // validate load factor and initial capacity
        if (!(loadFactor > 0.0f) || initialCapacity < 0 || concurrencyLevel <= 0)
            throw new IllegalArgumentException();
        
        // concurrencyLevel: 1
        if (initialCapacity < concurrencyLevel)   // Use at least as many bins
            initialCapacity = concurrencyLevel;   // as estimated threads
        // size = capacity / load factor, rounded up
        long size = (long)(1.0 + (long)initialCapacity / loadFactor);
        // use the maximum capacity if size exceeds it;
        // tableSizeFor rounds the size up to a power of 2
        int cap = (size >= (long)MAXIMUM_CAPACITY) ?
            MAXIMUM_CAPACITY : tableSizeFor((int)size);
        
        // record the capacity
        this.sizeCtl = cap;
    }
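
A quick worked example of this sizing logic (the values are mine, not the article's):

// initialCapacity = 16, loadFactor = 0.75f, concurrencyLevel = 1
// size = (long)(1.0 + 16 / 0.75) = 22, and tableSizeFor(22) = 32
Map<String, Integer> map = new ConcurrentHashMap<>(16, 0.75f, 1); // sizeCtl is set to 32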

Read: get

Reads rely on volatile to ensure visibility: modifications made by other threads are visible, so no additional synchronization is needed.

A read looks the element up in the hash table: the key's hash is first scrambled by the perturbation function, the index is then derived from the scrambled hash, and handling depends on the first node at that index.

  1. The perturbation function spreads the hash bits (to reduce hash conflicts), and ANDing the sign bit with 0 guarantees a non-negative result.

    int h = spread(key.hashCode())

    Perturbation: XOR the high 16 bits of the hash into the low 16 bits.

    After the perturbation, & HASH_BITS (0x7fffffff, binary 0111...1) forces the sign bit to 0, ensuring a non-negative result.

    Negative hash values mark special nodes, such as forwarding nodes, tree bin heads, and placeholder nodes.

    	static final int spread(int h) {
            return (h ^ (h >>> 16)) & HASH_BITS;
        }
    
  2. Use the scrambled hash to compute the index (subscript) in the array.

    n is the length of the hash table: (n = tab.length)

    e = tabAt(tab, (n - 1) & h)

    h is the scrambled hash value; the index is effectively h % n (the hash modulo the table length).

    For performance, the table length n is required to be a power of 2: in binary, n is 1000...0 and n - 1 is 0111...1.

    Therefore (n - 1) & h always falls in the range 0 to n-1, computing the modulo with a cheaper bit operation (a small sketch follows this list).

  3. After locating the first node in the array, compare it with the key.

    After finding the first node in the bin, compare the key to decide whether it is the node being looked up.

    Comparison rule: compare hash values first. If the hash values are equal, it may be the same key, so the key must also be compared (== and equals). If the hash values differ, it is definitely not the same key.

    Comparing hash values first improves search performance: calling equals directly could be expensive (e.g. String's equals).

  4. Separate chaining resolves hash conflicts, so after finding the bin you may need to traverse a linked list or tree; because of table expansion, the search may also continue in the new table.

    4.1 The first node matches: return it directly.

    4.2 The first node's hash is negative, marking a special node: a forwarding node, a tree bin's head, or a reservation placeholder.

    • If it is a forwarding node, expansion is in progress: search in the new array.
    • If it is a TreeBin, search the red-black tree.
    • If it is a placeholder node, return null directly.

    4.3 Traverse the linked list and compare sequentially
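
A minimal sketch (mine, not the article's) tracing steps 1 and 2 for one key, showing that for a power-of-two length n, (n - 1) & h equals h % n:

public class SpreadDemo {
    static final int HASH_BITS = 0x7fffffff; // mirrors ConcurrentHashMap.HASH_BITS

    // mirrors ConcurrentHashMap.spread: fold the high 16 bits into the low 16, clear the sign bit
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    public static void main(String[] args) {
        int n = 16;                          // table length, always a power of 2
        int h = spread("key".hashCode());    // perturbed, guaranteed non-negative
        int index = (n - 1) & h;             // the bin index
        System.out.println(index == h % n);  // prints true: the AND is a cheap modulo
    }
}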

get code

public V get(Object key) {
    Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
    // 1. spread: perturbation, and keeps the hash non-negative
    //    (negative hashes mark special nodes such as tree bins and ForwardingNodes)
    int h = spread(key.hashCode());
    // 2. (n - 1) & h: the index; effectively hash % table length, but faster as a bit operation
    // e: the first node at that index
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (e = tabAt(tab, (n - 1) & h)) != null) {
        // 3. equal hashes mean a possible match, so compare the key
        if ((eh = e.hash) == h) {
            // 4.1 equal keys: found, return the value
            if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                return e.val;
        }
        // 4.2 negative hash: special first node (e.g. a forwarding node during expansion,
        //     whose find searches the new table)
        else if (eh < 0)
            return (p = e.find(h, key)) != null ? p.val : null;
        
        // 4.3 traverse the linked list; return the value if found, otherwise null
        while ((e = e.next) != null) {
            if (e.hash == h &&
                ((ek = e.key) == key || (ek != null && key.equals(ek))))
                return e.val;
        }
    }
    return null;
}

Write: put

When adding elements, atomicity is ensured with CAS + synchronized (locking only the first node of the affected bin in the hash table).

  1. Get the hash value: perturbation + force a non-negative hash
  2. If the hash table is empty, CAS guarantees that exactly one thread initializes it:
	private final Node<K,V>[] initTable() {
        Node<K,V>[] tab; int sc;
        while ((tab = table) == null || tab.length == 0) {
            // sizeCtl < 0: another thread is initializing; yield the CPU
            // and exit once initialization is done
            if ((sc = sizeCtl) < 0)
                Thread.yield(); 
            else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
                // CAS sizeCtl to -1 (a thread is initializing); on success, initialize
                try {
                    if ((tab = table) == null || tab.length == 0) {
                        int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                        @SuppressWarnings("unchecked")
                        Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                        table = tab = nt;
                        // sc = 0.75 * n: the next resize threshold
                        sc = n - (n >>> 2);
                    }
                } finally {
                    sizeCtl = sc;
                }
                break;
            }
        }
        return tab;
    }
  3. Compute the index from the hash and fetch the first node there: f = tabAt(tab, i = (n - 1) & hash)

  4. Handle the different cases:

    • 4.1 If the first node is null, CAS the new node directly into the index position: casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null))


    • 4.2 If the first node's hash is MOVED (-1), it is a forwarding node: expansion is in progress, so help with the expansion.

    • 4.3 Otherwise, lock the first node:

      • 4.3.1 Traverse the linked list to find and add/overwrite


      • 4.3.2 Traverse the tree to find and add/overwrite

  5. addCount: update the element count and check whether expansion is needed

put code

// when onlyIfAbsent is true, an existing value for the key is not overwritten
final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    // 1. hash: perturbation + ensure the hash is non-negative
    int hash = spread(key.hashCode());
    int binCount = 0;
    // optimistic locking: CAS + retry on failure
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        // 2. empty table: CAS ensures only one thread initializes it
        if (tab == null || (n = tab.length) == 0)
            tab = initTable();
        // 3. compute the index and fetch the first node there
        // 4.1 if the bin is empty, CAS in a new node directly
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        // 4.2 first node's hash is MOVED: forwarding node, expansion in progress, help migrate
        else if ((fh = f.hash) == MOVED)
            tab = helpTransfer(tab, f);
        else {
            V oldVal = null;
            // 4.3 lock the first node
            synchronized (f) {
                // the first node has not changed
                if (tabAt(tab, i) == f) {
                    // hash >= 0: the first node is a linked-list node
                    // 4.3.1 traverse the list, then add or overwrite
                    if (fh >= 0) {
                        // counts the nodes on the list
                        binCount = 1;
                        // replace if found; append at the tail if the end is reached
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            // replace
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                oldVal = e.val;
                                // overwrite only when onlyIfAbsent is false
                                // (the xxIfAbsent methods keep existing values)
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            Node<K,V> pred = e;
                            // append
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    // tree bin head: find the corresponding node, then overwrite
                    // 4.3.2 traverse the tree, then add or overwrite
                    else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        // putTreeVal returns null if it inserted, or the existing node
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                       value)) != null) {
                            oldVal = p.val;
                            // overwrite
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
            
            if (binCount != 0) {
                if (binCount >= TREEIFY_THRESHOLD)
                    // treeify only when the bin exceeds TREEIFY_THRESHOLD (8) nodes
                    // and the table length is at least 64; otherwise expand instead
                    treeifyBin(tab, i);
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    // 5. add to the count, used for size statistics (when a node was added), and check expansion
    addCount(1L, binCount);
    return null;
}
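
A small usage sketch (mine) of the onlyIfAbsent flag: put calls putVal with false, putIfAbsent with true.

ConcurrentHashMap<String, Integer> m = new ConcurrentHashMap<>();
m.put("k", 1);                        // onlyIfAbsent = false: always overwrites
Integer prev = m.putIfAbsent("k", 2); // onlyIfAbsent = true: keeps 1, returns the existing value 1
System.out.println(m.get("k"));       // prints 1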

Expansion

To avoid frequent hash conflicts, the table is expanded (its length increased) when the element count divided by the table length exceeds the load factor.

Expansion doubles the table length, for example from 32 to 64, keeping it a power of 2; if the length would exceed the upper limit, the maximum value is used instead.

When expansion occurs, the linked list or tree in each slot in the array needs to be migrated to the new array.

If the processor is multi-core, then this migration operation is not completed by one thread alone, but other threads will also help with the migration.

During migration, each thread claims a range of slots and migrates them from right to left. When its range is done, it checks whether all migration has finished; if not, it loops and claims another range.

The expansion work is done in the transfer method, which is triggered in three scenarios:

  1. addCount: after adding a node, increment the count and check whether expansion is needed.
  2. helpTransfer: a thread doing put finds a forwarding node (migration in progress) and helps expand.
  3. tryPresize: tries to resize in advance (called by putAll for bulk inserts, and by treeifyBin when the table length is below 64).

transfer is divided into the following 3 steps:

  1. Compute how many slots each thread migrates per round, based on the number of CPU cores and the table length; the minimum is 16.

  2. If the new hash table is null, create it (with twice the old length).

  3. Migrate in a loop:

    • 3.1 Claim the interval [bound, i] this thread is responsible for migrating (multiple threads may migrate at the same time).


    • 3.2 Migration: divided into linked list migration and tree migration

      Linked list migration

      1. The nodes of the list are rehashed to two positions in the new hash table: the same index, and index + the old table length (similar to HashMap).

      2. Each node is placed by hash & n (n = old table length): if the result is 0, the node goes to the same index in the new array; otherwise it goes to index + n.


        For example, suppose the old table length is 16 and the bin is at index 3. 16 in binary is 10000, so hash & 16 tests the fifth bit of the node's hash. If that bit is 0 the node is placed at index 3 of the new table; if it is 1 it is placed at index 3 + 16. (A small sketch of this split follows the list.)

      3. Head insertion builds an ln list from nodes whose result is 0 and an hn list from nodes whose result is 1. To ease list construction, the lastRun node is found first: lastRun and all nodes after it belong to the same destination list, so that whole tail can be moved as-is.

        lastRun is found before the lists are built. For example, if lastRun is e->f, the segment e->f first seeds the ln list; then the rest of the original list is traversed with head insertion: visiting a gives a->e->f, and visiting b gives b->a->e->f.


      4. After each index is migrated, a forwarding node is set at the corresponding position of the old hash table. Threads doing a get that encounter it continue the search in the new table; threads doing a put help with the expansion (migrating other ranges).

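
A minimal sketch (made-up hash values, not from the article) of the hash & n split described above, for a bin at index 3 of a length-16 table:

public class SplitDemo {
    public static void main(String[] args) {
        int n = 16; // old table length
        // hashes that all collide at index 3 in the old table ((n - 1) & h == 3)
        int[] hashes = {3, 19, 35, 51};
        for (int h : hashes) {
            // h & n tests the bit that becomes significant once the table doubles:
            // 0 keeps the node at index 3 (ln list), n moves it to index 3 + 16 (hn list)
            int newIndex = ((h & n) == 0) ? 3 : 3 + n;
            System.out.println("hash " + h + " -> new index " + newIndex);
        }
    }
}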

Expansion code

//tab: the old hash table
//nextTab: the new hash table
private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
        //1. compute how many slots each thread migrates per round
        //n: length of the old table (number of slots)
        int n = tab.length, stride;
        //stride: how many slots a thread migrates per round
        //NCPU: number of CPU cores
        //multi-core: stride = (n >>> 3) / NCPU, i.e. n/8 divided by the core count
        //minimum slots per round = MIN_TRANSFER_STRIDE = 16
        if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
            stride = MIN_TRANSFER_STRIDE; // subdivide range
    
        //2. if the new hash table is null, this call initiates the expansion
        if (nextTab == null) {            // initiating
            try {
                @SuppressWarnings("unchecked")
                Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];
                nextTab = nt;
            } catch (Throwable ex) {      // try to cope with OOME
                sizeCtl = Integer.MAX_VALUE;
                return;
            }
            nextTable = nextTab;
            //transferIndex marks the right end of the next range to hand out;
            //ranges are assigned from right to left, starting at the rightmost slot
            transferIndex = n;
        }
        //length of the new hash table
        int nextn = nextTab.length;
        //forwarding node; placed at the head of a migrated bin in the old table,
        //it lets readers find the new hash table
        ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
        //advance: whether to continue the migration loop
        boolean advance = true;
        boolean finishing = false; // to ensure sweep before committing nextTab
        //3. migrate in a loop
        for (int i = 0, bound = 0;;) {
            Node<K,V> f; int fh;
            //3.1 claim the range this thread migrates
            //bound is the left end, i the right end
            while (advance) {
                int nextIndex, nextBound;
                //one slot done: decrement the right end
                if (--i >= bound || finishing)
                    advance = false;
                //transferIndex <= 0: every range has already been handed out
                else if ((nextIndex = transferIndex) <= 0) {
                    i = -1;
                    advance = false;
                }
                //CAS the claimed range so that threads never receive the same range
                else if (U.compareAndSwapInt
                         (this, TRANSFERINDEX, nextIndex,
                          nextBound = (nextIndex > stride ?
                                       nextIndex - stride : 0))) {
                    bound = nextBound;
                    i = nextIndex - 1;
                    advance = false;
                }
            }
            
            //3.2 migrate
            
            //3.2.1 if i is out of range, this thread's migration is done
            if (i < 0 || i >= n || i + n >= nextn) {
                int sc;
                //if the whole migration has finished, install the new table and threshold
                if (finishing) {
                    nextTable = null;
                    table = nextTab;
                    //threshold = 1.5 * n = 0.75 * the new length
                    sizeCtl = (n << 1) - (n >>> 1);
                    return;
                }
                //CAS sizeCtl down by 1: one migrating thread has finished
                if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
                    //not the last thread: return directly
                    if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                        return;
                    //the last thread sets finishing = true and loops again
                    //to recheck and commit the new table, threshold, etc.
                    finishing = advance = true;
                    i = n; // recheck before commit
                }
            }
            //3.2.2 if slot i of the old table is empty, CAS in the forwarding node
            else if ((f = tabAt(tab, i)) == null)
                advance = casTabAt(tab, i, null, fwd);
            //3.2.3 if the head here is already a forwarding node, another thread handled it; loop on
            else if ((fh = f.hash) == MOVED)
                advance = true; // already processed
            else {
                //3.2.4 lock the first node, then migrate
                synchronized (f) {
                    if (tabAt(tab, i) == f) {
                        Node<K,V> ln, hn;
                        //3.2.4.1 linked-list migration
                        //hash >= 0 means a linked-list node
                        if (fh >= 0) {
                            int runBit = fh & n;
                            Node<K,V> lastRun = f;
                            //find the lastRun node
                            for (Node<K,V> p = f.next; p != null; p = p.next) {
                                int b = p.hash & n;
                                if (b != runBit) {
                                    runBit = b;
                                    lastRun = p;
                                }
                            }
                            //if the last computed bit is 0, lastRun and its successors
                            //seed the ln list; otherwise they seed the hn list
                            if (runBit == 0) {
                                ln = lastRun;
                                hn = null;
                            }
                            else {
                                hn = lastRun;
                                ln = null;
                            }
                            
                            //traverse and build the ln and hn lists (head insertion)
                            for (Node<K,V> p = f; p != lastRun; p = p.next) {
                                int ph = p.hash; K pk = p.key; V pv = p.val;
                                //head insertion: the Node constructor's fourth argument is the successor
                                if ((ph & n) == 0)
                                    ln = new Node<K,V>(ph, pk, pv, ln);
                                else
                                    hn = new Node<K,V>(ph, pk, pv, hn);
                            }
                            //place the ln list at index i of the new table
                            setTabAt(nextTab, i, ln);
                            //place the hn list at index i + n
                            setTabAt(nextTab, i + n, hn);
                            //install the forwarding node in the old table
                            setTabAt(tab, i, fwd);
                            advance = true;
                        }
                        //3.2.4.2 tree migration
                        else if (f instanceof TreeBin) {
                            TreeBin<K,V> t = (TreeBin<K,V>)f;
                            TreeNode<K,V> lo = null, loTail = null;
                            TreeNode<K,V> hi = null, hiTail = null;
                            int lc = 0, hc = 0;
                            //split the tree's node list into lo (h & n == 0) and hi lists
                            for (Node<K,V> e = t.first; e != null; e = e.next) {
                                int h = e.hash;
                                TreeNode<K,V> p = new TreeNode<K,V>
                                    (h, e.key, e.val, null, null);
                                if ((h & n) == 0) {
                                    if ((p.prev = loTail) == null)
                                        lo = p;
                                    else
                                        loTail.next = p;
                                    loTail = p;
                                    ++lc;
                                }
                                else {
                                    if ((p.prev = hiTail) == null)
                                        hi = p;
                                    else
                                        hiTail.next = p;
                                    hiTail = p;
                                    ++hc;
                                }
                            }
                            //untreeify a part that is small enough; otherwise rebuild a TreeBin
                            ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
                                (hc != 0) ? new TreeBin<K,V>(lo) : t;
                            hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
                                (lc != 0) ? new TreeBin<K,V>(hi) : t;
                            setTabAt(nextTab, i, ln);
                            setTabAt(nextTab, i + n, hn);
                            setTabAt(tab, i, fwd);
                            advance = true;
                        }
                    }
                }
            }
        }
    }

This walkthrough does not say much about red-black trees. On one hand there are too many red-black tree concepts; on the other I have mostly forgotten them (I'm older now and can no longer write a red-black tree by hand like in college).

Besides, I think knowing the benefits of using red-black trees is enough; the details are rarely needed at work. Memorizing how a red-black tree recolors or rotates left and right to restore its invariants is of little value. Interested readers can study it on their own.

Iterator

Iterators in ConcurrentHashMap are weakly consistent: when an iterator is requested, a new iterator object is constructed over the table captured at that moment.

Entry iterator:

public Iterator<Map.Entry<K,V>> iterator() {
    ConcurrentHashMap<K,V> m = map;
    Node<K,V>[] t;
    int f = (t = m.table) == null ? 0 : t.length;
    return new EntryIterator<K,V>(t, f, 0, f, m);
}

Key iterator:

public Enumeration<K> keys() {
    Node<K,V>[] t;
    int f = (t = table) == null ? 0 : t.length;
    return new KeyIterator<K,V>(t, f, 0, f, this);
}

Value iterator:

public Enumeration<V> elements() {
    Node<K,V>[] t;
    int f = (t = table) == null ? 0 : t.length;
    return new ValueIterator<K,V>(t, f, 0, f, this);
}
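
A minimal sketch (mine) of what weak consistency means in practice: modifying the map while iterating neither throws ConcurrentModificationException nor is guaranteed to be reflected by the iterator.

import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class WeakConsistencyDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new ConcurrentHashMap<>();
        map.put("a", 1);
        map.put("b", 2);

        Iterator<Map.Entry<String, Integer>> it = map.entrySet().iterator();
        map.put("c", 3); // allowed during iteration: no ConcurrentModificationException;
                         // this iterator may or may not see "c"
        while (it.hasNext())
            System.out.println(it.next().getKey());
    }
}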

Summary

ConcurrentHashMap uses a hash table data structure and resolves hash conflicts with separate chaining: nodes hashed to the same index are linked into a list, and once a bin exceeds the threshold the list is converted into a red-black tree.

ConcurrentHashMap stores data in volatile fields, making other threads' modifications visible to readers without extra synchronization. Writes use CAS and synchronized to ensure atomicity.

On get, the key's hash is passed through the perturbation function (XOR of the high and low 16 bits) and forced non-negative (sign bit 0), then ANDed with table length - 1 to obtain the index; the bin is then searched according to its kind (comparing the hash first, then the key if the hashes are equal).

On put, the index is likewise found through perturbation and hashing, and the bin is searched according to its kind. If the key is found, its value is overwritten; if not, a new node is added.

When expansion is needed, threads are assigned slot ranges to migrate, and threads doing put also help migrate. Each time a slot is migrated, a forwarding node is set in the old table so that querying threads can reach the new table through it. When all slots are migrated, the last thread installs the new table and updates the threshold and counters.

Iterators are weakly consistent; a new iterator object is constructed over the current hash table when requested.

ConcurrentHashMap only guarantees relative thread safety, not absolute thread safety. If you need to perform a series of composite operations, you must use it correctly.
