HashMap source code analysis (JDK1.8)

Overview

JDK 1.8 greatly optimized HashMap: the underlying implementation changed from the previous "array + linked list" to "array + linked list + red-black tree".

The data structure of JDK 1.8's HashMap is shown in the figure below. When a bucket holds only a few nodes, they are kept as a linked list; when a bucket holds more nodes (more than 8), the linked list is converted into a red-black tree.

A few points

Understanding the following points first will help you better understand the source code of HashMap and read this article.
1. In this article, the head node refers to the node stored at an index position of the table, i.e. the head of that bucket's linked list.
2. The root node refers to the top node of the red-black tree, i.e. the node that has no parent.
3. The root node of the red-black tree is not necessarily the head node at the index position (i.e. the head of the linked list). HashMap tries to keep the root at the head of the bucket via the moveRootToFront method, but the removal method is an exception.
4. After a bucket is converted to red-black tree nodes, the linked list structure still exists and is maintained through the next attribute; tree operations keep this list up to date. Converting to tree nodes does not mean the linked list disappears.
5. In a tree bin, a leaf node of the red-black tree may still have a next node, because the tree structure and the linked list structure are independent of each other; being a leaf does not imply having no next node.
6. Naming conventions in the source code: for a node p, pl (p left) is p's left child, pr (p right) is p's right child, pp (p parent) is p's parent, ph (p hash) is p's hash value, pk (p key) is p's key, kc (key class) is the class of the key, and so on. The JDK source also likes to assign variables inside if/for conditions, so read carefully.
7. To remove a node from a plain linked list bucket, only the previous node's next pointer needs to be re-wired (see the sketch after this list); everything else stays the same.
8. For the linked list that a red-black tree bin maintains, removing a node only requires re-wiring the prev/next pointers of its neighbours (TreeNode adds a prev attribute for this; see the sketch after this list). Note that this only maintains the linked list view; the red-black tree itself still needs its own removal and rebalancing.
9. When searching the red-black tree, the source code repeatedly applies two rules: 1) if the target hash is less than the hash of node p, go left, otherwise go right; 2) if the target key is less than the key of node p, go left, otherwise go right. Both rules rely on the binary-search-tree property (left node < parent node < right node).
10. When searching the red-black tree, the source code uses dir (direction) to indicate whether to go left or right: dir stores the result of comparing the target node's hash/key with node p's hash/key.
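
The following is a minimal, self-contained sketch of the two unlink operations described in points 7 and 8. ListNode and the method names are illustrative only; in HashMap the same idea is applied to Node.next (point 7) and to TreeNode.prev/TreeNode.next (point 8, which is exactly what removeTreeNode does later in this article).

final class ListNode<V> {
    V value;
    ListNode<V> prev, next;
    ListNode(V value) { this.value = value; }
}

final class UnlinkSketch {
    // Point 7: singly linked removal; "prev" is the node just before "node" in the bucket.
    static <V> void unlinkSingly(ListNode<V> prev, ListNode<V> node) {
        prev.next = node.next;
    }

    // Point 8: doubly linked removal, as done for the linked-list view of a tree bin.
    // Returns the (possibly new) head of the bucket.
    static <V> ListNode<V> unlinkDoubly(ListNode<V> head, ListNode<V> node) {
        ListNode<V> pred = node.prev, succ = node.next;
        if (pred == null)          // node was the head of the bucket
            head = succ;
        else
            pred.next = succ;
        if (succ != null)
            succ.prev = pred;
        return head;
    }
}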

Basic attributes

// Default initial capacity: 16
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; 

// Maximum capacity
static final int MAXIMUM_CAPACITY = 1 << 30;    

// Default load factor: 0.75
static final float DEFAULT_LOAD_FACTOR = 0.75f; 

// Threshold for converting a linked-list bin into a red-black tree (converted when the 9th node arrives)
static final int TREEIFY_THRESHOLD = 8; 

// Threshold for converting a red-black tree bin back into a linked list (converted at 6 nodes)
static final int UNTREEIFY_THRESHOLD = 6;   

// Minimum table length required before a bin is converted to a red-black tree
static final int MIN_TREEIFY_CAPACITY = 64; 

// Linked list node, implements Map.Entry
static class Node<K,V> implements Map.Entry<K,V> {
    
      
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
     
    // ... ...
}

// Red-black tree node
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
    
    
    TreeNode<K,V> parent;  // red-black tree links
    TreeNode<K,V> left;
    TreeNode<K,V> right;
    TreeNode<K,V> prev;    // needed to unlink next upon deletion
    boolean red;
       
    // ...
}

Locate the index position of the hash bucket array

Whether adding, deleting, or looking up a key-value pair, locating the position in the hash bucket array is always the critical first step. As mentioned above, HashMap's data structure is a combination of "array + linked list + red-black tree", so ideally the elements would be distributed as evenly as possible across the array, with only one element at each position: then the hash calculation immediately gives us the element we want, with no need to traverse a linked list or red-black tree, which greatly improves query efficiency. How HashMap locates the array index therefore directly reflects how well the hash method disperses keys. Below is the source code for locating the position in the hash bucket array:

// Code 1
static final int hash(Object key) {
    // Compute the key's hash value
    int h;
    // 1. Take the key's hashCode; 2. XOR in the high 16 bits of the hashCode
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
// Code 2
int n = tab.length;
// AND (tab.length - 1) with the hash value
int index = (n - 1) & hash;

The whole process is essentially three steps:
1. Get the hashCode value of the key.
2. XOR the hashCode with its high 16 bits to obtain the hash value.
3. AND the resulting hash value with (table.length - 1).

Method interpretation

For any given key, as long as its hashCode() return value is the same, the computed hash value is always the same. The first idea that comes to mind is to take the hash value modulo the table length; that would distribute the elements fairly evenly.

However, the modulo operation is relatively expensive, and bit operations are much faster, so the JDK team optimized it: the bitwise AND in code 2 above replaces the modulo. The trick is that "(table.length - 1) & h" yields the object's index, based on the identity x mod 2^n = x & (2^n - 1). Since the length of HashMap's underlying array is always a power of 2, the modulo "h mod table.length" is equivalent to "h & (table.length - 1)", and & is more efficient than %, which makes this a speed optimization.

In JDK 1.8 the hash spreading step is also optimized: the hashCode is XORed with its own high 16 bits, mainly so that the high bits still participate in the calculation when the table is small, without adding much overhead.

The figure below is a simple example.
When the table length is 16, table.length - 1 = 15; in binary, its lower 4 bits are all 1 and its upper 28 bits are all 0. ANDing anything with 0 gives 0, so the result of hashCode & (table.length - 1) depends only on the lower 4 bits of the hashCode. The upper 28 bits have no effect on the result, which increases the probability of hash collisions. This is why JDK 1.8 also mixes the high bits into the calculation: to reduce the probability of hash collisions.
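
The following small, standalone illustration walks through these three steps, reusing the hash() method from code 1; the class name IndexDemo and the sample key "hello" are mine, chosen only for demonstration.

public class IndexDemo {

    static int hash(Object key) {
        int h;
        // 1. Take the key's hashCode; 2. XOR it with its high 16 bits
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int n = 16;                      // table length, always a power of two
        int h = hash("hello");
        int index = (n - 1) & h;         // 3. AND with (table.length - 1)
        System.out.printf("hash = 0x%08x, index = %d%n", h, index);
    }
}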

get method

public V get(Object key) {
    
    
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    
    
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    // 1.对table进行校验:table不为空 && table长度大于0 &&
    // table索引位置(使用table.length - 1和hash值进行与运算)的节点不为空
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
    
    
        // 2.检查first节点的hash值和key是否和入参的一样,如果一样则first即为目标节点,直接返回first节点
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // 3.如果first不是目标节点,并且first的next节点不为空则继续遍历
        if ((e = first.next) != null) {
    
    
            if (first instanceof TreeNode)
                // 4.如果是红黑树节点,则调用红黑树的查找目标节点方法getTreeNode
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            do {
    
    
                // 5.执行链表节点的查找,向下遍历链表, 直至找到节点的key和入参的key相等时,返回该节点
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    // 6.找不到符合的返回空
    return null;
}

4. If the head node is a red-black tree node, call the red-black tree's lookup method getTreeNode; see code block 1 for a detailed explanation.

Code block 1: getTreeNode

final TreeNode<K,V> getTreeNode(int h, Object k) {
    
    
    // 1.首先找到红黑树的根节点;2.使用根节点调用find方法
    return ((parent != null) ? root() : this).find(h, k, null);
}

2. Use the root node to call the find method, see code block 2 for details.

Code block 2: find

/**
 * 从调用此方法的节点开始查找, 通过hash值和key找到对应的节点
 * 此方法是红黑树节点的查找, 红黑树是特殊的自平衡二叉查找树
 * 平衡二叉查找树的特点:左节点<根节点<右节点
    */
    final TreeNode<K,V> find(int h, Object k, Class<?> kc) {
    
    
    // 1.将p节点赋值为调用此方法的节点,即为红黑树根节点
    TreeNode<K,V> p = this;
    // 2.从p节点开始向下遍历
    do {
    
    
        int ph, dir; K pk;
        TreeNode<K,V> pl = p.left, pr = p.right, q;
        // 3.如果传入的hash值小于p节点的hash值,则往p节点的左边遍历
        if ((ph = p.hash) > h)
            p = pl;
        else if (ph < h) // 4.如果传入的hash值大于p节点的hash值,则往p节点的右边遍历
            p = pr;
        // 5.如果传入的hash值和key值等于p节点的hash值和key值,则p节点为目标节点,返回p节点
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        else if (pl == null)    // 6.p节点的左节点为空则将向右遍历
            p = pr;
        else if (pr == null)    // 7.p节点的右节点为空则向左遍历
            p = pl;
        // 8.将p节点与k进行比较
        else if ((kc != null ||
                  (kc = comparableClassFor(k)) != null) && // 8.1 kc不为空代表k实现了Comparable
                 (dir = compareComparables(kc, k, pk)) != 0)// 8.2 k<pk则dir<0, k>pk则dir>0
            // 8.3 k<pk则向左遍历(p赋值为p的左节点), 否则向右遍历
            p = (dir < 0) ? pl : pr;
        // 9.代码走到此处, 代表key所属类没有实现Comparable, 直接指定向p的右边遍历
        else if ((q = pr.find(h, k, kc)) != null)
            return q;
        // 10.代码走到此处代表“pr.find(h, k, kc)”为空, 因此直接向左遍历
        else
            p = pl;
    } while (p != null);
    return null;
    }

8. Compare the p node with k. If the class of the passed-in key (the parameter k in the code) implements the Comparable interface (kc is not null; see the detailed explanation of the comparableClassFor method in code block 3), then compare k with the p node's key through that Comparable implementation and assign the result to dir. If dir < 0, then k < pk, so traverse to the left of the p node (pl); otherwise traverse to the right of the p node (pr).

Code block 3: comparableClassFor

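For reference, here are comparableClassFor and compareComparables from the JDK 1.8 HashMap source (Type and ParameterizedType come from java.lang.reflect; the English comments are adapted). comparableClassFor(x) returns x's class only if that class C directly implements Comparable<C> (String is special-cased), otherwise it returns null; compareComparables then performs the actual compareTo call when the other key is exactly of that class, and returns 0 otherwise.

// Returns x's Class if it is of the form "class C implements Comparable<C>", else null
static Class<?> comparableClassFor(Object x) {
    if (x instanceof Comparable) {
        Class<?> c; Type[] ts, as; Type t; ParameterizedType p;
        if ((c = x.getClass()) == String.class) // bypass checks for the common String case
            return c;
        if ((ts = c.getGenericInterfaces()) != null) {
            for (int i = 0; i < ts.length; ++i) {
                if (((t = ts[i]) instanceof ParameterizedType) &&
                    ((p = (ParameterizedType)t).getRawType() ==
                     Comparable.class) &&
                    (as = p.getActualTypeArguments()) != null &&
                    as.length == 1 && as[0] == c) // type argument is c itself
                    return c;
            }
        }
    }
    return null;
}

// Returns k.compareTo(x) if x matches kc (k's screened comparable class), else 0
@SuppressWarnings({"rawtypes","unchecked"}) // for cast to Comparable
static int compareComparables(Class<?> kc, Object k, Object x) {
    return (x == null || x.getClass() != kc ? 0 :
            ((Comparable)k).compareTo(x));
}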

put method

public V put(K key, V value) {
    
    
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
               boolean evict) {
    
    
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // 1.校验table是否为空或者length等于0,如果是则调用resize方法进行初始化
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // 2.通过hash值计算索引位置,将该索引位置的头节点赋值给p,如果p为空则直接在该索引位置新增一个节点即可
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
    
    
        // table表该索引位置不为空,则进行查找
        Node<K,V> e; K k;
        // 3.判断p节点的key和hash值是否跟传入的相等,如果相等, 则p节点即为要查找的目标节点,将p节点赋值给e节点
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        // 4.判断p节点是否为TreeNode, 如果是则调用红黑树的putTreeVal方法查找目标节点
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
    
    
            // 5.走到这代表p节点为普通链表节点,则调用普通的链表方法进行查找,使用binCount统计链表的节点数
            for (int binCount = 0; ; ++binCount) {
    
    
                // 6.如果p的next节点为空时,则代表找不到目标节点,则新增一个节点并插入链表尾部
                if ((e = p.next) == null) {
    
    
                    p.next = newNode(hash, key, value, null);
                    // 7.校验节点数是否超过8个,如果超过则调用treeifyBin方法将链表节点转为红黑树节点,
                    // 减一是因为循环是从p节点的下一个节点开始的
                    if (binCount >= TREEIFY_THRESHOLD - 1)
                        treeifyBin(tab, hash);
                    break;
                }
                // 8.如果e节点存在hash值和key值都与传入的相同,则e节点即为目标节点,跳出循环
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;  // 将p指向下一个节点
            }
        }
        // 9.如果e节点不为空,则代表目标节点存在,使用传入的value覆盖该节点的value,并返回oldValue
        if (e != null) {
    
    
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e); // 用于LinkedHashMap
            return oldValue;
        }
    }
    ++modCount;
    // 10.如果插入节点后节点数超过阈值,则调用resize方法进行扩容
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);  // 用于LinkedHashMap
    return null;
}

1. Check whether the table is null or its length is 0; if so, call the resize method to initialize it (see the resize method section below for details).

4. If the p node is not the target node, check whether p is a TreeNode; if so, call the red-black tree's putTreeVal method to find (or insert) the target node. See code block 4 for details.

7. Check whether the number of nodes in the bucket now exceeds 8; if so, call the treeifyBin method to convert the linked list nodes into red-black tree nodes. See code block 6 for details.

Code block 4: putTreeVal

/**
 * 红黑树的put操作,红黑树插入会同时维护原来的链表属性, 即原来的next属性
    */
    final TreeNode<K,V> putTreeVal(HashMap<K,V> map, Node<K,V>[] tab,
                               int h, K k, V v) {
    
    
    Class<?> kc = null;
    boolean searched = false;
    // 1.查找根节点, 索引位置的头节点并不一定为红黑树的根节点
    TreeNode<K,V> root = (parent != null) ? root() : this;
    // 2.将根节点赋值给p节点,开始进行查找
    for (TreeNode<K,V> p = root;;) {
    
    
        int dir, ph; K pk;
        // 3.如果传入的hash值小于p节点的hash值,将dir赋值为-1,代表向p的左边查找树
        if ((ph = p.hash) > h)
            dir = -1;
        // 4.如果传入的hash值大于p节点的hash值, 将dir赋值为1,代表向p的右边查找树
        else if (ph < h)
            dir = 1;
        // 5.如果传入的hash值和key值等于p节点的hash值和key值, 则p节点即为目标节点, 返回p节点
        else if ((pk = p.key) == k || (k != null && k.equals(pk)))
            return p;
        // 6.如果k所属的类没有实现Comparable接口 或者 k和p节点的key相等
        else if ((kc == null &&
                  (kc = comparableClassFor(k)) == null) ||
                 (dir = compareComparables(kc, k, pk)) == 0) {
    
    
            // 6.1 第一次符合条件, 从p节点的左节点和右节点分别调用find方法进行查找, 如果查找到目标节点则返回
            if (!searched) {
    
    
                TreeNode<K,V> q, ch;
                searched = true;
                if (((ch = p.left) != null &&
                     (q = ch.find(h, k, kc)) != null) ||
                    ((ch = p.right) != null &&
                     (q = ch.find(h, k, kc)) != null))
                    return q;
            }
            // 6.2 否则使用定义的一套规则来比较k和p节点的key的大小, 用来决定向左还是向右查找
            dir = tieBreakOrder(k, pk); // dir<0则代表k<pk,则向p左边查找;反之亦然
        }
        
        TreeNode<K,V> xp = p;   // xp赋值为x的父节点,中间变量,用于下面给x的父节点赋值
        // 7.dir<=0则向p左边查找,否则向p右边查找,如果为null,则代表该位置即为x的目标位置
        if ((p = (dir <= 0) ? p.left : p.right) == null) {
    
    
            // 走进来代表已经找到x的位置,只需将x放到该位置即可
            Node<K,V> xpn = xp.next;    // xp的next节点
            // 8.创建新的节点, 其中x的next节点为xpn, 即将x节点插入xp与xpn之间
            TreeNode<K,V> x = map.newTreeNode(h, k, v, xpn);
            // 9.调整x、xp、xpn之间的属性关系
            if (dir <= 0)   // 如果时dir <= 0, 则代表x节点为xp的左节点
                xp.left = x;
            else        // 如果时dir> 0, 则代表x节点为xp的右节点
                xp.right = x;
            xp.next = x;    // 将xp的next节点设置为x
            x.parent = x.prev = xp; // 将x的parent和prev节点设置为xp
            // 如果xpn不为空,则将xpn的prev节点设置为x节点,与上文的x节点的next节点对应
            if (xpn != null)
                ((TreeNode<K,V>)xpn).prev = x;
            // 10.进行红黑树的插入平衡调整
            moveRootToFront(tab, balanceInsertion(root, x));
            return null;
        }
    }
    }

6.1 The first time this branch is hit, call the find method (see code block 2 for details) on the left child and the right child of the p node respectively; if the target node is found, return it.
6.2 Otherwise, use the defined tie-breaking rules to compare k with the key of the p node, which decides whether to continue searching to the left or to the right; see code block 5 for details.

10. Perform the red-black tree insertion rebalancing; see explanation 2 at the end of the article.

Code Block 5: tieBreakOrder

// 用于不可比较或者hashCode相同时进行比较的方法, 只是一个一致的插入规则,用来维护重定位的等价性。
static int tieBreakOrder(Object a, Object b) {
    
    
    int d;
    if (a == null || b == null ||
        (d = a.getClass().getName().
         compareTo(b.getClass().getName())) == 0)
        d = (System.identityHashCode(a) <= System.identityHashCode(b) ?
             -1 : 1);
    return d;
}

This defines a consistent ordering rule for the extreme case where two keys are not comparable or compare as equal: compare their class names first, and if those are equal, fall back to System.identityHashCode. It is only used to keep insertions deterministic and consistent across relocations.
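
To make the rule concrete, here is a small hypothetical demo; it copies the method above so that it can run standalone. Two plain Object instances have the same class name, so the ordering falls back to System.identityHashCode.

public class TieBreakDemo {

    // Copy of the tieBreakOrder logic shown above, for standalone use
    static int tieBreakOrder(Object a, Object b) {
        int d;
        if (a == null || b == null ||
            (d = a.getClass().getName().compareTo(b.getClass().getName())) == 0)
            d = (System.identityHashCode(a) <= System.identityHashCode(b) ? -1 : 1);
        return d;
    }

    public static void main(String[] args) {
        Object k1 = new Object(), k2 = new Object();
        // Same class name, so the result is decided by identityHashCode; it is
        // arbitrary but consistent for these two objects within this run.
        System.out.println(tieBreakOrder(k1, k2));
        System.out.println(tieBreakOrder(k2, k1)); // usually the opposite sign
    }
}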

Code block 6: treeifyBin

/**
 * 将链表节点转为红黑树节点
    */
    final void treeifyBin(Node<K,V>[] tab, int hash) {
    int n, index; Node<K,V> e;
    // 1.如果table为空或者table的长度小于64, 调用resize方法进行扩容
    if (tab == null || (n = tab.length) < MIN_TREEIFY_CAPACITY)
        resize();
    // 2.根据hash值计算索引值,将该索引位置的节点赋值给e,从e开始遍历该索引位置的链表
    else if ((e = tab[index = (n - 1) & hash]) != null) {
        TreeNode<K,V> hd = null, tl = null;
        do {
            // 3.将链表节点转红黑树节点
            TreeNode<K,V> p = replacementTreeNode(e, null);
            // 4.如果是第一次遍历,将头节点赋值给hd
            if (tl == null)  // tl为空代表为第一次循环
                hd = p;
            else {
                // 5.如果不是第一次遍历,则处理当前节点的prev属性和上一个节点的next属性
                p.prev = tl;    // 当前节点的prev属性设为上一个节点
                tl.next = p;    // 上一个节点的next属性设置为当前节点
            }
            // 6.将p节点赋值给tl,用于在下一次循环中作为上一个节点进行一些链表的关联操作(p.prev = tl 和 tl.next = p)
            tl = p;
        } while ((e = e.next) != null);
        // 7.将table该索引位置赋值为新转的TreeNode的头节点,如果该节点不为空,则以以头节点(hd)为根节点, 构建红黑树
        if ((tab[index] = hd) != null)
            hd.treeify(tab);
    }
    }

7. Set the table's index position to hd, the head node of the newly converted TreeNode list. If hd is not null, build a red-black tree starting from hd; see code block 7 for details.

Code block 7: treeify

/**
 * 构建红黑树
    */
    final void treeify(Node<K,V>[] tab) {
    TreeNode<K,V> root = null;
    // 1.将调用此方法的节点赋值给x,以x作为起点,开始进行遍历
    for (TreeNode<K,V> x = this, next; x != null; x = next) {
        next = (TreeNode<K,V>)x.next;   // next赋值为x的下个节点
        x.left = x.right = null;    // 将x的左右节点设置为空
        // 2.如果还没有根节点, 则将x设置为根节点
        if (root == null) {
            x.parent = null;    // 根节点没有父节点
            x.red = false;  // 根节点必须为黑色
            root = x;   // 将x设置为根节点
        }
        else {
            K k = x.key;  // k赋值为x的key
            int h = x.hash;  // h赋值为x的hash值
            Class<?> kc = null;
            // 3.如果当前节点x不是根节点, 则从根节点开始查找属于该节点的位置
            for (TreeNode<K,V> p = root;;) {
                int dir, ph;
                K pk = p.key;
                // 4.如果x节点的hash值小于p节点的hash值,则将dir赋值为-1, 代表向p的左边查找
                if ((ph = p.hash) > h)
                    dir = -1;
                // 5.如果x节点的hash值大于p节点的hash值,则将dir赋值为1, 代表向p的右边查找
                else if (ph < h)
                    dir = 1;
                // 6.走到这代表x的hash值和p的hash值相等,则比较key值
                else if ((kc == null && // 6.1 如果k没有实现Comparable接口 或者 x节点的key和p节点的key相等
                          (kc = comparableClassFor(k)) == null) ||
                         (dir = compareComparables(kc, k, pk)) == 0)
                    // 6.2 使用定义的一套规则来比较x节点和p节点的大小,用来决定向左还是向右查找
                    dir = tieBreakOrder(k, pk);
        
                TreeNode<K,V> xp = p;   // xp赋值为x的父节点,中间变量用于下面给x的父节点赋值
                // 7.dir<=0则向p左边查找,否则向p右边查找,如果为null,则代表该位置即为x的目标位置
                if ((p = (dir <= 0) ? p.left : p.right) == null) {
                    // 8.x和xp节点的属性设置
                    x.parent = xp;  // x的父节点即为最后一次遍历的p节点
                    if (dir <= 0)   // 如果时dir <= 0, 则代表x节点为父节点的左节点
                        xp.left = x;
                    else    // 如果时dir > 0, 则代表x节点为父节点的右节点
                        xp.right = x;
                    // 9.进行红黑树的插入平衡(通过左旋、右旋和改变节点颜色来保证当前树符合红黑树的要求)
                    root = balanceInsertion(root, x);
                    break;
                }
            }
        }
    }
    // 10.如果root节点不在table索引位置的头节点, 则将其调整为头节点
    moveRootToFront(tab, root);
    }

3. If the current node x is not the root node, search downward from the root node for the position where x belongs. This code is similar to the search code in code block 2 and code block 4.
10. If the root node is not the head node at the table index position, move it to the head position; see code block 8 for details.

Code Block 8: moveRootToFront

/**
 * 将root放到头节点的位置
 * 如果当前索引位置的头节点不是root节点, 则将root的上一个节点和下一个节点进行关联,
 * 将root放到头节点的位置, 原头节点放在root的next节点上
    */
    static <K,V> void moveRootToFront(Node<K,V>[] tab, TreeNode<K,V> root) {
    int n;
    // 1.校验root是否为空、table是否为空、table的length是否大于0
    if (root != null && tab != null && (n = tab.length) > 0) {
        // 2.计算root节点的索引位置
        int index = (n - 1) & root.hash;
        TreeNode<K,V> first = (TreeNode<K,V>)tab[index];
        // 3.如果该索引位置的头节点不是root节点,则该索引位置的头节点替换为root节点
        if (root != first) {
            Node<K,V> rn;
            // 3.1 将该索引位置的头节点赋值为root节点
            tab[index] = root;
            TreeNode<K,V> rp = root.prev;   // root节点的上一个节点
            // 3.2 和 3.3 两个操作是移除root节点的过程
            // 3.2 如果root节点的next节点不为空,则将root节点的next节点的prev属性设置为root节点的prev节点
            if ((rn = root.next) != null)
                ((TreeNode<K,V>)rn).prev = rp;
            // 3.3 如果root节点的prev节点不为空,则将root节点的prev节点的next属性设置为root节点的next节点
            if (rp != null)
                rp.next = rn;
            // 3.4 和 3.5 两个操作将first节点接到root节点后面
            // 3.4 如果原头节点不为空, 则将原头节点的prev属性设置为root节点
            if (first != null)
                first.prev = root;
            // 3.5 将root节点的next属性设置为原头节点
            root.next = first;
            // 3.6 root此时已经被放到该位置的头节点位置,因此将prev属性设为空
            root.prev = null;
        }
        // 4.检查树是否正常
        assert checkInvariants(root);
    }
    }

4. Check whether the tree is normal, see code block 9 for details.

Code block 9: checkInvariants

/**
 * Recursive invariant check
    */
    static <K,V> boolean checkInvariants(TreeNode<K,V> t) { // 一些基本的校验
    TreeNode<K,V> tp = t.parent, tl = t.left, tr = t.right,
        tb = t.prev, tn = (TreeNode<K,V>)t.next;
    if (tb != null && tb.next != t)
        return false;
    if (tn != null && tn.prev != t)
        return false;
    if (tp != null && t != tp.left && t != tp.right)
        return false;
    if (tl != null && (tl.parent != t || tl.hash > t.hash))
        return false;
    if (tr != null && (tr.parent != t || tr.hash < t.hash))
        return false;
    if (t.red && tl != null && tl.red && tr != null && tr.red)  // 如果当前节点为红色, 则该节点的左右节点都不能为红色
        return false;
    if (tl != null && !checkInvariants(tl))
        return false;
    if (tr != null && !checkInvariants(tr))
        return false;
    return true;
    }

Starting from the passed-in node as the root, traverse all nodes and verify their invariants, mainly to make sure the tree still conforms to the rules of a red-black tree (and that the prev/next linked list is consistent).

resize method

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    // 1.老表的容量不为0,即老表不为空
    if (oldCap > 0) {
        // 1.1 判断老表的容量是否超过最大容量值:如果超过则将阈值设置为Integer.MAX_VALUE,并直接返回老表,
        // 此时oldCap * 2比Integer.MAX_VALUE大,因此无法进行重新分布,只是单纯的将阈值扩容到最大
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // 1.2 将newCap赋值为oldCap的2倍,如果newCap<最大容量并且oldCap>=16, 则将新阈值设置为原来的两倍
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    // 2.如果老表的容量为0, 老表的阈值大于0, 是因为初始容量被放入阈值,则将新表的容量设置为老表的阈值
    else if (oldThr > 0)
        newCap = oldThr;
    else {
        // 3.老表的容量为0, 老表的阈值为0,这种情况是没有传初始容量的new方法创建的空表,将阈值和容量设置为默认值
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // 4.如果新表的阈值为空, 则通过新的容量*负载因子获得阈值
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    // 5.将当前阈值设置为刚计算出来的新的阈值,定义新表,容量为刚计算出来的新容量,将table设置为新定义的表。
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    // 6.如果老表不为空,则需遍历所有节点,将节点赋值给新表
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {  // 将索引值为j的老表头节点赋值给e
                oldTab[j] = null; // 将老表的节点设置为空, 以便垃圾收集器回收空间
                // 7.如果e.next为空, 则代表老表的该位置只有1个节点,计算新表的索引位置, 直接将该节点放在该位置
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                // 8.如果是红黑树节点,则进行红黑树的重hash分布(跟链表的hash分布基本相同)
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    // 9.如果是普通的链表节点,则进行普通的重hash分布
                    Node<K,V> loHead = null, loTail = null; // 存储索引位置为:“原索引位置”的节点
                    Node<K,V> hiHead = null, hiTail = null; // 存储索引位置为:“原索引位置+oldCap”的节点
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // 9.1 如果e的hash值与老表的容量进行与运算为0,则扩容后的索引位置跟老表的索引位置一样
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null) // 如果loTail为空, 代表该节点为第一个节点
                                loHead = e; // 则将loHead赋值为第一个节点
                            else
                                loTail.next = e;    // 否则将节点添加在loTail后面
                            loTail = e; // 并将loTail赋值为新增的节点
                        }
                        // 9.2 如果e的hash值与老表的容量进行与运算为1,则扩容后的索引位置为:老表的索引位置+oldCap
                        else {
                            if (hiTail == null) // 如果hiTail为空, 代表该节点为第一个节点
                                hiHead = e; // 则将hiHead赋值为第一个节点
                            else
                                hiTail.next = e;    // 否则将节点添加在hiTail后面
                            hiTail = e; // 并将hiTail赋值为新增的节点
                        }
                    } while ((e = next) != null);
                    // 10.如果loTail不为空(说明老表的数据有分布到新表上“原索引位置”的节点),则将最后一个节点
                    // 的next设为空,并将新表上索引位置为“原索引位置”的节点设置为对应的头节点
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // 11.如果hiTail不为空(说明老表的数据有分布到新表上“原索引+oldCap位置”的节点),则将最后
                    // 一个节点的next设为空,并将新表上索引位置为“原索引+oldCap”的节点设置为对应的头节点
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    // 12.返回新表
    return newTab;
}

2. The old table's capacity is 0 but its threshold is greater than 0: this happens when an initial capacity was passed when the HashMap was created, for example new HashMap<>(32). HashMap has no capacity attribute, so the (rounded) initial capacity is temporarily stored in the threshold attribute; at this point threshold therefore holds the intended capacity, and the new table's capacity is set to threshold.

4. If the new table's threshold is still 0, compute it as new capacity * load factor. This happens either in the case of point 2 (an initial capacity was passed at construction) or when the old capacity is smaller than 16, so the "double the threshold" branch in 1.2 was not taken.
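
For reference, this is how the JDK 1.8 constructor parks the rounded initial capacity in threshold (reproduced from the JDK 1.8 source; the inline English comments are mine). tableSizeFor rounds the requested capacity up to the nearest power of two, and the first resize() then uses that value as the new table's capacity.

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " +
                                           loadFactor);
    this.loadFactor = loadFactor;
    // There is no capacity field: the rounded capacity waits in threshold until the first resize()
    this.threshold = tableSizeFor(initialCapacity);
}

// Returns the smallest power of two >= cap, e.g. 15 -> 16, 17 -> 32, 32 -> 32
static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

So for new HashMap<>(32), threshold starts out as 32; after the first put, resize() creates a table of length 32 and recomputes threshold as 32 * 0.75 = 24.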

8. If it is a red-black tree node, perform rehash distribution of the red-black tree. See code block 10 for details.

9.1 If (e.hash & oldCap) == 0, the e node's index in the new table is the same as its index in the old table (see example 1 for details). Perform the linked list splicing: if loTail is null, this is the first node, so loHead is set to it; otherwise the node is appended after loTail, and loTail is then moved to the newly appended node.

9.2 If (e.hash & oldCap) is non-zero, the e node's index in the new table is the old index + oldCap (see example 1 for details). Perform the linked list splicing in the same way: if hiTail is null, this is the first node, so hiHead is set to it; otherwise the node is appended after hiTail, and hiTail is then moved to the newly appended node.

Code block 10: split

/**

 * 扩容后,红黑树的hash分布,只可能存在于两个位置:原索引位置、原索引位置+oldCap
    */
    final void split(HashMap<K,V> map, Node<K,V>[] tab, int index, int bit) {
    TreeNode<K,V> b = this;  // 拿到调用此方法的节点
    TreeNode<K,V> loHead = null, loTail = null; // 存储索引位置为:“原索引位置”的节点
    TreeNode<K,V> hiHead = null, hiTail = null; // 存储索引位置为:“原索引+oldCap”的节点
    int lc = 0, hc = 0;
    // 1.以调用此方法的节点开始,遍历整个红黑树节点
    for (TreeNode<K,V> e = b, next; e != null; e = next) {  // 从b节点开始遍历
        next = (TreeNode<K,V>)e.next;   // next赋值为e的下个节点
        e.next = null;  // 同时将老表的节点设置为空,以便垃圾收集器回收
        // 2.如果e的hash值与老表的容量进行与运算为0,则扩容后的索引位置跟老表的索引位置一样
        if ((e.hash & bit) == 0) {
            if ((e.prev = loTail) == null)  // 如果loTail为空, 代表该节点为第一个节点
                loHead = e; // 则将loHead赋值为第一个节点
            else
                loTail.next = e;    // 否则将节点添加在loTail后面
            loTail = e; // 并将loTail赋值为新增的节点
            ++lc;   // 统计原索引位置的节点个数
        }
        // 3.如果e的hash值与老表的容量进行与运算为1,则扩容后的索引位置为:老表的索引位置+oldCap
        else {
            if ((e.prev = hiTail) == null)  // 如果hiHead为空, 代表该节点为第一个节点
                hiHead = e; // 则将hiHead赋值为第一个节点
            else
                hiTail.next = e;    // 否则将节点添加在hiTail后面
            hiTail = e; // 并将hiTail赋值为新增的节点
            ++hc;   // 统计索引位置为原索引+oldCap的节点个数
        }
    }
    // 4.如果原索引位置的节点不为空
    if (loHead != null) {   // 原索引位置的节点不为空
        // 4.1 如果节点个数<=6个则将红黑树转为链表结构
        if (lc <= UNTREEIFY_THRESHOLD)
            tab[index] = loHead.untreeify(map);
        else {
            // 4.2 将原索引位置的节点设置为对应的头节点
            tab[index] = loHead;
            // 4.3 如果hiHead不为空,则代表原来的红黑树(老表的红黑树由于节点被分到两个位置)
            // 已经被改变, 需要重新构建新的红黑树
            if (hiHead != null)
                // 4.4 以loHead为根节点, 构建新的红黑树
                loHead.treeify(tab);
        }
    }
    // 5.如果索引位置为原索引+oldCap的节点不为空
    if (hiHead != null) {   // 索引位置为原索引+oldCap的节点不为空
        // 5.1 如果节点个数<=6个则将红黑树转为链表结构
        if (hc <= UNTREEIFY_THRESHOLD)
            tab[index + bit] = hiHead.untreeify(map);
        else {
            // 5.2 将索引位置为原索引+oldCap的节点设置为对应的头节点
            tab[index + bit] = hiHead;
            // 5.3 loHead不为空则代表原来的红黑树(老表的红黑树由于节点被分到两个位置)
            // 已经被改变, 需要重新构建新的红黑树
            if (loHead != null)
                // 5.4 以hiHead为根节点, 构建新的红黑树
                hiHead.treeify(tab);
        }
    }
    }

2. If (e.hash & bit) == 0 (bit is oldCap), the e node's index in the new table is the same as its index in the old table (see example 1 for details). Perform the linked list splicing: if loTail is null, this is the first node, so loHead is set to it; otherwise the node is appended after loTail, loTail is moved to it, and lc (the number of nodes staying at the original index) is incremented.
3. If (e.hash & bit) is non-zero, the e node's index in the new table is the old index + oldCap (see example 1 for details). Perform the linked list splicing in the same way onto the hi list, and increment hc (the number of nodes going to index + oldCap).
4.1 If the number of nodes is <= 6, the red-black tree is converted back into a linked list structure. See code block 11 for details.
4.4 Otherwise, build a new red-black tree starting from loHead. See code block 7 for details.

Code block 11: untreeify

/**
 * 将红黑树节点转为链表节点, 当节点<=6个时会被触发
    */
    final Node<K,V> untreeify(HashMap<K,V> map) {
    Node<K,V> hd = null, tl = null; // hd指向头节点, tl指向尾节点
    // 1.从调用该方法的节点, 即链表的头节点开始遍历, 将所有节点全转为链表节点
    for (Node<K,V> q = this; q != null; q = q.next) {
        // 2.调用replacementNode方法构建链表节点
        Node<K,V> p = map.replacementNode(q, null);
        // 3.如果tl为null, 则代表当前节点为第一个节点, 将hd赋值为该节点
        if (tl == null)
            hd = p;
        // 4.否则, 将尾节点的next属性设置为当前节点p
        else
            tl.next = p;
        tl = p; // 5.每次都将tl节点指向当前节点, 即尾节点
    }
    // 6.返回转换后的链表的头节点
    return hd;
    }

Example 1: after a resize, why can a node only be redistributed to the "original index position" or the "original index + oldCap" position?
In the resize code, the node's hash value is ANDed with oldCap to decide whether the node goes to the "original index position" or to "original index + oldCap". Why does this work?

Assume the old table's capacity is 16 (oldCap = 16), so the new table's capacity is 16 * 2 = 32. Assume node 1's hash value is 0000 0000 0000 0000 0000 1111 0000 1010 and node 2's hash value is 0000 0000 0000 0000 0000 1111 0001 1010. Calculation 1 in the figure below shows the index positions of node 1 and node 2 in the old table: because of the old table's length, the index depends only on the last 4 bits of the hash value, so both nodes land at the same index.

Now look at calculation 2, the index calculation for the new table. When two nodes have the same index in the old table, their index in the new table depends only on the 5th-lowest bit of the hash value, and the weight of that bit is exactly the old table's capacity, 16. So the node's index in the new table has only two possibilities: the "original index position" or "original index + oldCap", which in this example are 10 and 10 + 16 = 26.

Since the result depends only on the 5th-lowest bit of the node's hash value, and the weight of that bit is exactly the old capacity 16, the new-table index calculation can be replaced by calculation 3: simply AND the node's hash value with the old capacity 16. If the result is 0, the node's new index is the original index; otherwise it is "original index + oldCap".
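
The following small, standalone check (the class name ResizeIndexDemo is mine) reproduces example 1 with the two sample hash values from the text, oldCap = 16 and newCap = 32; it prints the same old index for both nodes, new indexes 10 and 26, and shows that (hash & oldCap) makes the same decision.

public class ResizeIndexDemo {

    public static void main(String[] args) {
        int oldCap = 16, newCap = 32;
        int[] hashes = {0x00000F0A, 0x00000F1A};   // the two sample hash values above
        for (int h : hashes) {
            int oldIndex = h & (oldCap - 1);       // calculation 1: index in the old table
            int newIndex = h & (newCap - 1);       // calculation 2: index in the new table
            boolean stays = (h & oldCap) == 0;     // calculation 3: the check resize() actually uses
            System.out.printf("hash=0x%08X old=%d new=%d -> %s%n",
                    h, oldIndex, newIndex,
                    stays ? "original index" : "original index + oldCap");
        }
    }
}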

remove method

/**
 * 移除某个节点
    */
    public V remove(Object key) {
    
    
    Node<K,V> e;
    return (e = removeNode(hash(key), key, null, false, true)) == null ?
        null : e.value;
    }

final Node<K,V> removeNode(int hash, Object key, Object value,
                           boolean matchValue, boolean movable) {
    
    
    Node<K,V>[] tab; Node<K,V> p; int n, index;
    // 1.如果table不为空并且根据hash值计算出来的索引位置不为空, 将该位置的节点赋值给p
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (p = tab[index = (n - 1) & hash]) != null) {
    
    
        Node<K,V> node = null, e; K k; V v;
        // 2.如果p的hash值和key都与入参的相同, 则p即为目标节点, 赋值给node
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
            node = p;
        else if ((e = p.next) != null) {
    
    
            // 3.否则将p.next赋值给e,向下遍历节点
            // 3.1 如果p是TreeNode则调用红黑树的方法查找节点
            if (p instanceof TreeNode)
                node = ((TreeNode<K,V>)p).getTreeNode(hash, key);
            else {
    
    
                // 3.2 否则,进行普通链表节点的查找
                do {
    
    
                    // 当节点的hash值和key与传入的相同,则该节点即为目标节点
                    if (e.hash == hash &&
                        ((k = e.key) == key ||
                         (key != null && key.equals(k)))) {
    
    
                        node = e;  // 赋值给node, 并跳出循环
                        break;
                    }
                    p = e;  // p节点赋值为本次结束的e,在下一次循环中,e为p的next节点
                } while ((e = e.next) != null); // e指向下一个节点
            }
        }
        // 4.如果node不为空(即根据传入key和hash值查找到目标节点),则进行移除操作
        if (node != null && (!matchValue || (v = node.value) == value ||
                             (value != null && value.equals(v)))) {
    
    
            // 4.1 如果是TreeNode则调用红黑树的移除方法
            if (node instanceof TreeNode)
                ((TreeNode<K,V>)node).removeTreeNode(this, tab, movable);
            // 4.2 如果node是该索引位置的头节点则直接将该索引位置的值赋值为node的next节点,
            // “node == p”只会出现在node是头节点的时候,如果node不是头节点,则node为p的next节点
            else if (node == p)
                tab[index] = node.next;
            // 4.3 否则将node的上一个节点的next属性设置为node的next节点,
            // 即将node节点移除, 将node的上下节点进行关联(链表的移除)
            else
                p.next = node.next;
            ++modCount;
            --size;
            afterNodeRemoval(node); // 供LinkedHashMap使用
            // 5.返回被移除的节点
            return node;
        }
    }
    return null;
}

3.1 If p is a TreeNode, call the red-black tree method to find the node, see code block 1 for details.
4.1 If it is a TreeNode, call the removal method of the red-black tree, see code block 12 for details.

Code block 12: removeTreeNode
This code is fairly long. Its purpose is to remove the node on which the method is called, i.e. the this node inside the method. Removal involves both linked list handling and red-black tree handling; it is easier to follow together with the diagram below.

/**
 * 红黑树的节点移除
    */
    final void removeTreeNode(HashMap<K,V> map, Node<K,V>[] tab,
                          boolean movable) {
    // --- 链表的处理start ---
    int n;
    // 1.table为空或者length为0直接返回
    if (tab == null || (n = tab.length) == 0)
        return;
    // 2.根据hash计算出索引的位置
    int index = (n - 1) & hash;
    // 3.将索引位置的头节点赋值给first和root
    TreeNode<K,V> first = (TreeNode<K,V>)tab[index], root = first, rl;
    // 4.该方法被将要被移除的node(TreeNode)调用, 因此此方法的this为要被移除node节点,
    // 将node的next节点赋值给succ节点,prev节点赋值给pred节点
    TreeNode<K,V> succ = (TreeNode<K,V>)next, pred = prev;
    // 5.如果pred节点为空,则代表要被移除的node节点为头节点,
    // 则将table索引位置的值和first节点的值赋值为succ节点(node的next节点)即可
    if (pred == null)
        tab[index] = first = succ;
    else
        // 6.否则将pred节点的next属性设置为succ节点(node的next节点)
        pred.next = succ;
    // 7.如果succ节点不为空,则将succ的prev节点设置为pred, 与前面对应
    if (succ != null)
        succ.prev = pred;
    // 8.如果进行到此first节点为空,则代表该索引位置已经没有节点则直接返回
    if (first == null)
        return;
    // 9.如果root的父节点不为空, 则将root赋值为根节点
    if (root.parent != null)
        root = root.root();
    // 10.通过root节点来判断此红黑树是否太小, 如果是则调用untreeify方法转为链表节点并返回
    // (转链表后就无需再进行下面的红黑树处理)
    if (root == null || root.right == null ||
        (rl = root.left) == null || rl.left == null) {
        tab[index] = first.untreeify(map);  // too small
        return;
    }
    // --- 链表的处理end ---

    // --- 以下代码为红黑树的处理 ---
    // 11.将p赋值为要被移除的node节点,pl赋值为p的左节点,pr赋值为p 的右节点
    TreeNode<K,V> p = this, pl = left, pr = right, replacement;
    // 12.如果p的左节点和右节点都不为空时
    if (pl != null && pr != null) {
        // 12.1 将s节点赋值为p的右节点
        TreeNode<K,V> s = pr, sl;
        // 12.2 向左一直查找,跳出循环时,s为没有左节点的节点
        while ((sl = s.left) != null)
            s = sl;
        // 12.3 交换p节点和s节点的颜色
        boolean c = s.red; s.red = p.red; p.red = c;
        TreeNode<K,V> sr = s.right; // s的右节点
        TreeNode<K,V> pp = p.parent;    // p的父节点
        // --- 第一次调整和第二次调整:将p节点和s节点进行了位置调换 ---
        // 12.4 第一次调整
        // 如果p节点的右节点即为s节点,则将p的父节点赋值为s,将s的右节点赋值为p
        if (s == pr) {
            p.parent = s;
            s.right = p;
        }
        else {
            // 将sp赋值为s的父节点
            TreeNode<K,V> sp = s.parent;
            // 将p的父节点赋值为sp
            if ((p.parent = sp) != null) {
                // 如果s节点为sp的左节点,则将sp的左节点赋值为p节点
                if (s == sp.left)
                    sp.left = p;
                // 否则s节点为sp的右节点,则将sp的右节点赋值为p节点
                else
                    sp.right = p;
            }
            // s的右节点赋值为p节点的右节点
            if ((s.right = pr) != null)
                // 如果pr不为空,则将pr的父节点赋值为s
                pr.parent = s;
        }
        // 12.5 第二次调整
        // 将p的左节点赋值为空,pl已经保存了该节点
        p.left = null;
        // 将p节点的右节点赋值为sr,如果sr不为空,则将sr的父节点赋值为p节点
        if ((p.right = sr) != null)
            sr.parent = p;
        // 将s节点的左节点赋值为pl,如果pl不为空,则将pl的父节点赋值为s节点
        if ((s.left = pl) != null)
            pl.parent = s;
        // 将s的父节点赋值为p的父节点pp
        // 如果pp为空,则p节点为root节点, 交换后s成为新的root节点
        if ((s.parent = pp) == null)
            root = s;
        // 如果p不为root节点, 并且p是pp的左节点,则将pp的左节点赋值为s节点
        else if (p == pp.left)
            pp.left = s;
        // 如果p不为root节点, 并且p是pp的右节点,则将pp的右节点赋值为s节点
        else
            pp.right = s;
        // 12.6 寻找replacement节点,用来替换掉p节点
        // 12.6.1 如果sr不为空,则replacement节点为sr,因为s没有左节点,所以使用s的右节点来替换p的位置
        if (sr != null)
            replacement = sr;
        // 12.6.1 如果sr为空,则s为叶子节点,replacement为p本身,只需要将p节点直接去除即可
        else
            replacement = p;
    }
    // 13.承接12点的判断,如果p的左节点不为空,右节点为空,replacement节点为p的左节点
    else if (pl != null)
        replacement = pl;
    // 14.如果p的右节点不为空,左节点为空,replacement节点为p的右节点
    else if (pr != null)
        replacement = pr;
    // 15.如果p的左右节点都为空, 即p为叶子节点, replacement节点为p节点本身
    else
        replacement = p;
    // 16.第三次调整:使用replacement节点替换掉p节点的位置,将p节点移除
    if (replacement != p) { // 如果p节点不是叶子节点
        // 16.1 将p节点的父节点赋值给replacement节点的父节点, 同时赋值给pp节点
        TreeNode<K,V> pp = replacement.parent = p.parent;
        // 16.2 如果p没有父节点, 即p为root节点,则将root节点赋值为replacement节点即可
        if (pp == null)
            root = replacement;
        // 16.3 如果p不是root节点, 并且p为pp的左节点,则将pp的左节点赋值为替换节点replacement
        else if (p == pp.left)
            pp.left = replacement;
        // 16.4 如果p不是root节点, 并且p为pp的右节点,则将pp的右节点赋值为替换节点replacement
        else
            pp.right = replacement;
        // 16.5 p节点的位置已经被完整的替换为replacement, 将p节点清空, 以便垃圾收集器回收
        p.left = p.right = p.parent = null;
    }
    // 17.如果p节点不为红色则进行红黑树删除平衡调整
    // (如果删除的节点是红色则不会破坏红黑树的平衡无需调整)
    TreeNode<K,V> r = p.red ? root : balanceDeletion(root, replacement);

    // 18.如果p节点为叶子节点, 则简单的将p节点去除即可
    if (replacement == p) {
        TreeNode<K,V> pp = p.parent;
        // 18.1 将p的parent属性设置为空
        p.parent = null;
        if (pp != null) {
            // 18.2 如果p节点为父节点的左节点,则将父节点的左节点赋值为空
            if (p == pp.left)
                pp.left = null;
            // 18.3 如果p节点为父节点的右节点, 则将父节点的右节点赋值为空
            else if (p == pp.right)
                pp.right = null;
        }
    }
    if (movable)
        // 19.将root节点移到索引位置的头节点
        moveRootToFront(tab, r);
    }

7. If the succ node is not null, set succ's prev to pred; together with the previous step, this is the linked-list removal for TreeNode described in point 8 at the beginning of the article.

12.6 Find the replacement, which is used to replace the p node. Why is sr the first choice for replacement and p is the alternative? See Explanation 1.

PS: the first and second adjustments in the code swap the positions of the p node and the s node and then pick the replacement node for p; the third adjustment overwrites p with the replacement node. The logic of this part of the code is quite involved, and it is worth drawing the process yourself (diagram 1 below walks through these three adjustments).

**Explanation 1:** Why is sr the first choice for the replacement, with p as the fallback?
Analysis: first, what is sr? From the code, sr is assigned right after the leftward traversal from the s node ends, so at that point s has no left child and sr is s's right child. From the first and second adjustments above, p and s have swapped positions, so sr is now effectively p's right child, and p has no left child. Therefore, to remove p it is enough to let p's right child sr take over p's position, which makes sr the first choice for the replacement. If sr is null, then p is a leaf node and can simply be removed directly.

Diagram 1: removeTreeNode Diagram

Note that this illustration ignores the color of the red-black tree.

The diagram below shows the most complex case in the code, i.e. the one with the longest process: the p node is not the root, p has both left and right children, the s node is not the pr node, and s has a right child.

In addition, the split into a "first adjustment" and a "second adjustment" is my own, based on the code; they are easier to understand when taken together:
First adjustment + second adjustment: the positions of the p node and the s node are swapped, and the replacement node is chosen.
Third adjustment: the replacement node takes over the p node's position.

Explanation 2: what about the balance adjustment of the red-black tree?

Answer: red-black tree operations are fairly involved and cannot be explained in a few sentences; if you are interested, they are worth studying separately. For space reasons this article does not cover the concrete operations in detail. Briefly: a red-black tree is a self-balancing binary search tree with excellent query and insert/delete performance, and it is widely used to implement associative arrays.

Compared with an AVL tree, which requires the height difference (balance factor) between the left and right subtrees of every node to be at most 1, a red-black tree relaxes this condition: it only guarantees that the longest possible path from the root to a leaf is no more than twice the shortest possible path, so the tree is roughly balanced. This reduces the rebalancing cost of insertion and deletion and yields better overall performance; queries on a red-black tree may be slightly slower than on an AVL tree, but given the time saved on insertion and deletion, the trade-off is usually worth it.

Application in HashMap: inserting or deleting may trigger the red-black tree's insertion rebalancing (the balanceInsertion method) or deletion rebalancing (the balanceDeletion method). The adjustments mainly use three tools: left rotation (rotateLeft), right rotation (rotateRight), and recoloring nodes (x.red = false / x.red = true), all in order to preserve the red-black tree structure.
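
As an illustration of one of these tools, here is a minimal, self-contained sketch of a left rotation. TreeNodeSketch is illustrative only and is not the JDK's TreeNode, and the recoloring that the real rotateLeft also performs is omitted, so that the shape of the pointer surgery stays visible.

final class TreeNodeSketch {
    int key;
    TreeNodeSketch parent, left, right;
}

final class RotationSketch {
    // Rotates p's right child r up into p's place; returns the (possibly new) root.
    static TreeNodeSketch rotateLeft(TreeNodeSketch root, TreeNodeSketch p) {
        TreeNodeSketch r, pp, rl;
        if (p != null && (r = p.right) != null) {
            if ((rl = p.right = r.left) != null)   // r's left subtree becomes p's right subtree
                rl.parent = p;
            if ((pp = r.parent = p.parent) == null)
                root = r;                          // p was the root, so r becomes the new root
            else if (pp.left == p)
                pp.left = r;
            else
                pp.right = r;
            r.left = p;                            // p becomes r's left child
            p.parent = r;
        }
        return root;
    }
}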

Infinite loop problem

Before JDK 1.8, using HashMap under concurrency could cause a race condition that resulted in an infinite loop: the program would keep the CPU at 100%, and a stack dump would show it hanging in the HashMap.get() method; restarting the program made the problem disappear. Someone reported this to Sun as a bug, but Sun did not consider it one, because HashMap does not guarantee thread safety under concurrency; ConcurrentHashMap must be used instead.

So, has this problem been solved in JDK 1.8?
We know that before JDK 1.8, the main cause of the infinite loop was that resizing reversed the order of nodes in a bucket, as shown in the figure below: node A is in front of node C before the resize, and node C is in front of node A afterwards.
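
For contrast, here is a rough, self-contained sketch modeled on the JDK 1.7 transfer logic (the optional rehash step is omitted and the Entry class is a simplified stand-in): each node is inserted at the head of its new bucket, which is what reverses the order and, under concurrent resizes, could turn the bucket list into a cycle.

public class HeadInsertResizeSketch {

    static final class Entry {
        final int hash;
        final String key;
        Entry next;
        Entry(int hash, String key, Entry next) { this.hash = hash; this.key = key; this.next = next; }
    }

    static Entry[] transfer(Entry[] oldTable, int newCapacity) {
        Entry[] newTable = new Entry[newCapacity];
        for (Entry e : oldTable) {
            while (e != null) {
                Entry next = e.next;
                int i = e.hash & (newCapacity - 1);
                e.next = newTable[i];   // head insertion: point at the current bucket head...
                newTable[i] = e;        // ...and become the new head, reversing the bucket's order
                e = next;
            }
        }
        return newTable;
    }

    public static void main(String[] args) {
        // Bucket 0 of the old table (length 16) holds A -> C; hashes 0 and 32 both map
        // to index 0 in the old table and also both map to index 0 in the new table.
        Entry c = new Entry(32, "C", null);
        Entry a = new Entry(0, "A", c);
        Entry[] oldTable = new Entry[16];
        oldTable[0] = a;
        Entry[] newTable = transfer(oldTable, 32);
        for (Entry e = newTable[0]; e != null; e = e.next)
            System.out.print(e.key + " ");          // prints "C A": the order is reversed
    }
}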

JDK 1.8 expansion process

The JDK 1.8 resize code for an ordinary linked list, shown in the figure below, has already been analyzed above: all nodes at the same index position are processed inside one do/while loop.

For example, assume there are 3 nodes A, B, and C, and assume each node's hash value equals its key. The resize process from the figure above is simulated as follows.
First, look at how the index positions in the old table and the new table are calculated (the leading 28 zero bits of the hash are omitted; only the last 4 bits are shown):

Specific expansion process

Result: after the resize, node A and node C are in the same relative order as before the resize. Therefore, even if multiple threads resize concurrently, the JDK 1.8 code will not produce an infinite loop. Of course, this does not change the fact that HashMap is still not thread-safe; under concurrency, ConcurrentHashMap must be used instead.

Differences between HashMap and Hashtable

  • HashMap allows null keys and null values, while Hashtable does not (see the small demo after this list).

  • The default initial capacity of HashMap is 16; Hashtable's is 11.

  • HashMap doubles its capacity when resizing; Hashtable grows to 2 times the original capacity plus 1.

  • HashMap is not thread-safe; Hashtable is thread-safe.

  • HashMap recomputes a hash from the key's hashCode (mixing in the high bits); Hashtable uses the hashCode directly.

  • HashMap does not have Hashtable's contains method (it only has containsKey and containsValue).

  • HashMap inherits from AbstractMap; Hashtable inherits from Dictionary.
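
A small demo of the first point (the class name NullKeyDemo is arbitrary): HashMap accepts a null key and null values, while Hashtable throws NullPointerException.

import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

public class NullKeyDemo {

    public static void main(String[] args) {
        Map<String, String> hashMap = new HashMap<>();
        hashMap.put(null, "ok");          // allowed: the null key hashes to 0 and goes to bucket 0
        hashMap.put("k", null);           // allowed: null values are fine too

        Map<String, String> hashtable = new Hashtable<>();
        try {
            hashtable.put(null, "boom");  // throws NullPointerException
        } catch (NullPointerException e) {
            System.out.println("Hashtable rejects null keys");
        }
    }
}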

Summary

1. The underlying data structure of HashMap is a Node array (Node<K,V>[] table). If an index position of the array holds multiple nodes, those nodes are stored either as a linked list or as a red-black tree.

2. When searching, adding, updating, or deleting a key-value pair, locating the position in the hash bucket array is the critical first step. The source code does this with three operations: 1) get the key's hashCode; 2) XOR the hashCode with its high 16 bits to obtain the hash value; 3) AND the hash value with (table.length - 1).

3. The default initial capacity of HashMap is 16, and the capacity is always a power of 2. If the initial capacity you pass in is not a power of 2, the smallest qualifying capacity is computed instead; for example, passing 15 results in a capacity of 16. The default load factor is 0.75, and the number of entries that can be stored before a resize (threshold) = capacity * load factor.

4. When HashMap resizes, the capacity and (normally) the threshold double, and all nodes are redistributed. After redistribution, a node can land in only two places in the new table: the "original index position" or "original index + oldCap". For example, with a capacity of 32, a node at index position 3 can only end up at index 3 or index 35 (3 + 32) in the new table.

5. The fundamental reasons why nodes at one index position are redistributed to only two positions are: 1) the table length is always a power of 2; 2) the index is computed as "(table.length - 1) & hash". Resizing is a relatively expensive operation, so try to give a reasonable initial capacity when creating a HashMap.

6. HashMap has loadFactor and threshold attributes but no capacity attribute. If an initial capacity is passed when creating a HashMap, it is temporarily stored in the threshold attribute, and the table is not initialized until the first put. At initialization, the threshold value is used as the new table's capacity, and capacity * loadFactor then gives the real threshold of the new table.

7. When an insertion brings the number of nodes at one index position to 9 and the table length is at least 64, the linked list nodes (Node) are converted into red-black tree nodes (TreeNode); even after the conversion the linked list structure still exists, maintained through the next attribute. The conversion is done by the treeifyBin method in the source code. If the table length is less than 64, the bucket is not treeified; the table is resized instead. (A small reflection-based experiment after this summary illustrates the treeification.)

8. When the number of nodes at one index position drops to 6 or fewer and the bin is a red-black tree, the red-black tree nodes are converted back into linked list nodes. The conversion is done by the untreeify method in the source code.

9. Since JDK 1.8, HashMap no longer has the infinite-loop problem; the root cause before JDK 1.8 was that resizing reversed the order of nodes at the same index position.

10. HashMap is not thread-safe; use ConcurrentHashMap in scenarios that require thread safety or concurrency.
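
The treeification threshold in point 7 can be observed with the following hypothetical experiment (ColdKey is an illustrative class; the reflection works directly on JDK 8, and on JDK 9+ it needs --add-opens java.base/java.util=ALL-UNNAMED): nine keys with the same hashCode are forced into one bucket of a table of length 64, after which the bucket head is a HashMap$TreeNode.

import java.lang.reflect.Field;
import java.util.HashMap;

public class TreeifyDemo {

    // All ColdKey instances collide into the same bucket because hashCode() is constant.
    static final class ColdKey implements Comparable<ColdKey> {
        final int id;
        ColdKey(int id) { this.id = id; }
        @Override public int hashCode() { return 1; }
        @Override public boolean equals(Object o) {
            return o instanceof ColdKey && ((ColdKey) o).id == id;
        }
        @Override public int compareTo(ColdKey o) { return Integer.compare(id, o.id); }
    }

    public static void main(String[] args) throws Exception {
        HashMap<ColdKey, Integer> map = new HashMap<>(64);  // table length 64 >= MIN_TREEIFY_CAPACITY
        for (int i = 0; i < 9; i++)
            map.put(new ColdKey(i), i);                     // the 9th put triggers treeifyBin

        Field tableField = HashMap.class.getDeclaredField("table");
        tableField.setAccessible(true);
        Object[] table = (Object[]) tableField.get(map);
        for (Object bin : table)
            if (bin != null)
                System.out.println(bin.getClass().getName()); // java.util.HashMap$TreeNode
    }
}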

From: The most detailed JDK 1.8 HashMap source code analysis in history

Origin blog.csdn.net/liufang_imei/article/details/132698357