A detailed analysis of the source code logic behind HashMap's underlying implementation in the Java collections

About HashMap

This article is mainly about how HashMap is implemented.
Basic questions such as what a HashMap is, how to use it, and how hashing works are skipped.

The underlying implementation of HashMap differs before and after JDK 1.8, so the logic of the corresponding methods differs as well. This article focuses on JDK 1.8 and later, and compares it with earlier versions where useful.

This article concentrates on the logic of the source code. The first part presents that logic on its own, and the source code with comments is placed at the end. This keeps the thread clear and makes it easier to learn: understand the logic first, then read it alongside the source.

The basic structure of HashMap

First of all, the bottom layer of HashMap is an array, whatever the version. In JDK 1.8 it is declared as:

transient Node<K,V>[] table;

Internally, HashMap holds an array of Node named table. Node implements the Map.Entry interface (not covered further here).

Before JDK 1.8,
the underlying implementation of HashMap was an array + linked list.

  • The array slots act as buckets that store elements. Once the bucket is determined from the hash value of the key, the element is put into that bucket (array position).
  • Two different keys can have the same hashCode, and two different hashCodes can also map to the same bucket. Either case causes a hash collision, and this is where the linked list comes in.
  • Colliding nodes in the same bucket are chained together as a linked list, so every entry is kept even though they share a bucket.
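
Both kinds of collision described above are easy to reproduce. The following is a minimal sketch, not JDK code: the class CollisionDemo and the helper simpleIndex are made up for illustration, and simpleIndex only looks at the low bits of hashCode() (the real HashMap mixes in the high bits first).

```java
import java.util.HashMap;
import java.util.Map;

final class CollisionDemo {
    // Hypothetical helper: simplified bucket index using only the low bits
    // of hashCode(). tableLength is assumed to be a power of two.
    static int simpleIndex(Object key, int tableLength) {
        return key.hashCode() & (tableLength - 1);
    }

    public static void main(String[] args) {
        // Case 1: two different keys with identical hash codes.
        System.out.println("Aa".hashCode() == "BB".hashCode());

        // Case 2: different hash codes (1 and 17) that land in the same
        // bucket of a 16-slot table.
        System.out.println(simpleIndex(1, 16) == simpleIndex(17, 16));

        // Either way, both entries stay retrievable because colliding
        // nodes are chained inside the bucket.
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2);
        System.out.println(map.get("Aa") + " " + map.get("BB"));
    }
}
```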

[Figure: HashMap as an array of buckets, each bucket holding a linked list]
(https://blog.csdn.net/samniwu/article/details/90550196)

But this has a problem of its own. Looking up a key in a bucket's linked list takes O(M) time, where M is the number of nodes in the bucket. That is too slow and does not match HashMap's goal of approximately O(1) lookups.
So the structure was changed in JDK 1.8:
the linked list was replaced with a linked list / red-black tree structure.
When a bucket holds too many nodes, it is converted to a red-black tree (a balanced binary search tree), which reduces the lookup time from O(M) to O(log M) and greatly improves performance in that case.
So now we know what kind of structure HashMap has.

Important fields and attributes inside HashMap

  • table: the bucket array described above
  • capacity: not an actual field, just the length of the array, but important enough to list here
  • loadFactor: the load factor, used to compute the threshold. It describes how full the array is allowed to get: the closer to 1, the denser the data; the closer to 0, the sparser
  • threshold: the element count above which the table is resized. threshold = capacity * loadFactor
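
As a worked example of the threshold formula, here is a small sketch. The class ThresholdDemo is hypothetical; the values 16 and 0.75 are the JDK defaults (DEFAULT_INITIAL_CAPACITY and DEFAULT_LOAD_FACTOR).

```java
final class ThresholdDemo {
    // threshold = capacity * loadFactor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Default table: 16 buckets, load factor 0.75.
        System.out.println(threshold(16, 0.75f));
        // After one doubling: 32 buckets.
        System.out.println(threshold(32, 0.75f));
    }
}
```

With the defaults, the table is resized when the 13th entry is added, because the size then exceeds the threshold of 12.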

Node types:

  • Linked list node (Node)
  • Tree node (TreeNode): a subclass of the linked list node type

How to determine the bucket based on the hashCode

First, the key's hashCode() method is called to obtain its hash code.
The hash code is then perturbed once: (h = key.hashCode()) ^ (h >>> 16). The high 16 bits remain unchanged, and the low 16 bits are XORed with the high 16 bits.
The purpose of the perturbation is to mix the high bits into the low bits, which reduces collisions because only the low bits select the bucket. (>>> is an unsigned right shift; the vacated high bits are filled with 0.)
The result is passed into the putVal method (the core of put),
where (n - 1) & hash determines the bucket.
(Here n is the array length, which is always a power of two. The binary form of n - 1 is therefore all 1s, so (n - 1) & hash is equivalent to hash mod n, only cheaper to compute, and the entries are spread evenly over the buckets.)
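
The two steps can be sketched as follows. The expression in spread mirrors the perturbation quoted above; the class name HashSpreadDemo and the method names spread and indexFor are chosen for this example only.

```java
final class HashSpreadDemo {
    // Perturbation step: XOR the high 16 bits into the low 16 bits.
    // A null key hashes to 0.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket selection: n must be a power of two, so n - 1 is a mask of 1s.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        Integer key = 0x12340000;   // low 16 bits of the hash code are all 0
        int raw = key.hashCode();   // Integer.hashCode() is the value itself
        int mixed = spread(key);

        // Without perturbation this key always falls into bucket 0 of a
        // small table; after mixing, the high bits take part as well.
        System.out.println(indexFor(raw, 16));
        System.out.println(indexFor(mixed, 16));

        // The mask agrees with the mathematical modulus, even for
        // negative hashes.
        System.out.println(indexFor(-7, 16) == Math.floorMod(-7, 16));
    }
}
```

Note that the mask matches Math.floorMod rather than Java's % operator, which can return a negative result for a negative hash.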

When to resize and when to convert to a red-black tree

Resize (expansion)

  • size > capacity * loadFactor: when the number of elements in the HashMap exceeds the threshold, the table is resized
  • When the linked list in a bucket reaches 8 nodes (the default treeify threshold) but the array length is still less than 64, the table is resized instead of converting the bucket

Convert to a red-black tree

  • When the linked list in a bucket reaches 8 nodes (the default treeify threshold) and the array length is at least 64: the array is already long enough and the bucket still holds too many nodes, so the bucket is converted into a red-black tree
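
The choice between the two outcomes can be sketched as a small decision function. This is a simplification of the rule as stated above, not the JDK code itself (there the check is split between putVal and treeifyBin). The constant names follow the JDK; the class BinGrowthDemo, the enum Action, and the method onLongBin are hypothetical.

```java
final class BinGrowthDemo {
    static final int TREEIFY_THRESHOLD = 8;      // bucket length that triggers the check
    static final int MIN_TREEIFY_CAPACITY = 64;  // minimum table length for treeifying

    enum Action { NONE, RESIZE, TREEIFY }

    // Decide what happens to a bucket of the given length in a table of
    // the given length, following the rule described in the text.
    static Action onLongBin(int binLength, int tableLength) {
        if (binLength < TREEIFY_THRESHOLD) {
            return Action.NONE;
        }
        return tableLength < MIN_TREEIFY_CAPACITY ? Action.RESIZE : Action.TREEIFY;
    }

    public static void main(String[] args) {
        System.out.println(onLongBin(3, 64));
        System.out.println(onLongBin(8, 16));
        System.out.println(onLongBin(8, 64));
    }
}
```

Resizing first is the cheaper remedy: a small table is the more likely cause of a crowded bucket, and doubling it tends to split the chain.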

Logical analysis of common methods of HashMap

The following analyzes the source code logic of the common methods to show how HashMap inserts an entry, how it resizes, and how it looks an entry up.
This mainly involves the put, get, and resize methods.

put
The core of the put method is the putVal method, so we mainly look at that method here.

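putVal:
    if table is null or its length is 0:
        resize()
    locate the bucket from the key's hash: (n - 1) & hash
    if the bucket is empty:
        insert the new node directly
    else (the bucket is not empty):
        e = null
        if the first node in the bucket matches the inserted key (same hash and same key):
            e = that node
        else if the first node is a tree node:
            call the red-black tree insertion method (which involves rebalancing):
                if a node with the same key already exists:
                    e = that node (nothing is inserted)
                else:
                    insert into the red-black tree
        else (the first node is a linked list node):
            walk the list:
                if a node with the same key already exists:
                    e = that node
                    break out of the loop
                if the tail is reached:
                    append the new node at the tail
                    check whether the bucket must be treeified (or the table resized instead)
                    break out of the loop

        if e != null (the key already exists):
            replace the old value with the new value
            return the old value
    increment size
    if size exceeds the threshold:
        resize()

    return null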

The specific logic is shown above and
summarized here:

  • Find the bucket first
  • If the bucket is empty, insert directly
  • If the bucket is not empty, check whether the key already exists; if it does, replace the value and return the old one
  • If it does not, add a node (linked list append or red-black tree insert)
  • After a new node is added, check whether the table needs to be resized
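
From the caller's side, the replace-or-insert behaviour summarized above shows up in the return value of put. A minimal usage sketch with java.util.HashMap; the class PutDemo and the method demo are made up for this example.

```java
import java.util.HashMap;
import java.util.Map;

final class PutDemo {
    // Returns {result of first put, result of second put, final value, size}.
    static Object[] demo() {
        Map<String, Integer> map = new HashMap<>();
        Integer first = map.put("a", 1);   // key absent: a node is added
        Integer second = map.put("a", 2);  // key present: the value is replaced
        return new Object[] { first, second, map.get("a"), map.size() };
    }

    public static void main(String[] args) {
        Object[] r = demo();
        // The first put returns null, the second returns the replaced
        // value, and the size stays at 1 because no second node was added.
        System.out.println(r[0] + " " + r[1] + " " + r[2] + " " + r[3]);
    }
}
```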

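get
The get method mainly calls the getNode method to find the node.

getNode:
    if table is null || table is empty || the target bucket is empty:
        return null
    if the first node in the bucket matches the queried key:
        return that node
    if the bucket holds more than one node:
        if the first node is a tree node:
            return the result of the red-black tree lookup
        otherwise (linked list):
            traverse the list
            return the matching node if one is found
    return null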
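
One consequence of this logic is worth a short example: get returns null both when getNode finds nothing and when the node it finds holds a null value. The sketch below uses java.util.HashMap directly; the class GetDemo and the method demo are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class GetDemo {
    // Returns {get(present) == null, get(missing) == null,
    //          containsKey(present), containsKey(missing)}.
    static boolean[] demo() {
        Map<String, String> map = new HashMap<>();
        map.put("present", null);  // a key mapped to a null value
        return new boolean[] {
            map.get("present") == null,
            map.get("missing") == null,
            map.containsKey("present"),
            map.containsKey("missing")
        };
    }

    public static void main(String[] args) {
        for (boolean b : demo()) {
            System.out.println(b);
        }
    }
}
```

To tell the two cases apart, use containsKey, which reports whether a node for the key exists at all.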

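resize method

resize:
    if the old capacity is already greater than or equal to the maximum capacity:
        set threshold to Integer.MAX_VALUE
        return the old table (no resize)
    else (below the maximum capacity):
        set the new capacity to twice the old capacity
    update the threshold
    create the new table array
    loop over the old table:
        determine each node's bucket in the new table (in JDK 1.8 the stored hash is reused, not recomputed)
        move the node into that bucket
    return the new table

  • The capacity is doubled
  • A new array must be created, every node's bucket determined again, and the nodes moved over
  • Resizing is therefore very resource intensive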
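
The JDK 1.8 source listed at the end of this article does not recompute hashes during a resize. Because the capacity doubles, a node either stays at its old index j or moves to j + oldCap, depending on the single bit (e.hash & oldCap). A minimal sketch of that rule; the class ResizeSplitDemo and the method newIndex are made up for illustration.

```java
final class ResizeSplitDemo {
    // New bucket index after doubling a table of length oldCap
    // (oldCap is assumed to be a power of two).
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        // The bit worth oldCap decides: 0 keeps the index, 1 shifts it by oldCap.
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // 5 and 21 share bucket 5 in a 16-slot table; after doubling,
        // 5 stays put and 21 moves to 5 + 16.
        System.out.println(newIndex(5, oldCap));
        System.out.println(newIndex(21, oldCap));
        // The rule gives the same answer as masking with the new length.
        System.out.println(newIndex(21, oldCap) == (21 & (2 * oldCap - 1)));
    }
}
```

This is why one bucket's linked list is split into a "lo" list and a "hi" list in the resize source: every node goes to one of exactly two possible buckets.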

Source code comments

The code comments in this part are from JavaGuide.
Unless marked otherwise, the code is from JDK 1.8.

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    // table is not initialized or has length 0: resize
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    // (n - 1) & hash selects the bucket; if it is empty, a new node is
    // created and placed directly in the array
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    // the bucket already holds at least one node
    else {
        Node<K,V> e; K k;
        // the first node in the bucket (the one in the array) has the same hash and an equal key
        if (p.hash == hash &&
            ((k = p.key) == key || (key != null && key.equals(k))))
                // record the first node in e
                e = p;
        // the first node does not match, and the bucket is a red-black tree
        else if (p instanceof TreeNode)
            // insert into the tree
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        // the bucket is a linked list
        else {
            // walk the list and insert at the tail
            for (int binCount = 0; ; ++binCount) {
                // reached the tail of the list
                if ((e = p.next) == null) {
                    // append the new node at the tail
                    p.next = newNode(hash, key, value, null);
                    // when the node count reaches the threshold (8 by default), call treeifyBin
                    // treeifyBin checks the table length before converting:
                    // only if the table length is at least 64 is the bucket converted to a
                    // red-black tree to cut lookup time; otherwise the table is just resized
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    // leave the loop
                    break;
                }
                // check whether this node's key equals the key being inserted
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    // equal: leave the loop
                    break;
                // advance; together with e = p.next above, this traverses the list
                p = e;
            }
        }
        // a node with the same hash and key was found in the bucket
        if (e != null) {
            // remember the old value
            V oldValue = e.value;
            // onlyIfAbsent is false, or the old value is null
            if (!onlyIfAbsent || oldValue == null)
                // replace the old value with the new one
                e.value = value;
            // post-access callback
            afterNodeAccess(e);
            // return the old value
            return oldValue;
        }
    }
    // structural modification
    ++modCount;
    // resize if the new size exceeds the threshold
    if (++size > threshold)
        resize();
    // post-insertion callback
    afterNodeInsertion(evict);
    return null;
}
// jdk1.7
public V put(K key, V value) {
    if (table == EMPTY_TABLE) {
        inflateTable(threshold);
    }
    if (key == null)
        return putForNullKey(value);
    int hash = hash(key);
    int i = indexFor(hash, table.length);
    // first traverse the bucket looking for an existing key
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

    modCount++;
    addEntry(hash, key, value, i);  // then insert
    return null;
}
public V get(Object key) {
    Node<K,V> e;
    return (e = getNode(hash(key), key)) == null ? null : e.value;
}

final Node<K,V> getNode(int hash, Object key) {
    Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // the node stored in the array matches
        if (first.hash == hash && // always check first node
            ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // the bucket holds more than one node
        if ((e = first.next) != null) {
            // look up in the tree
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // look up in the linked list
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
}
final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        // already at the maximum capacity: stop growing and accept the collisions
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        // below the maximum: double the capacity
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY && oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {
        // signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    // compute the new resize threshold
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ? (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        // move every bucket into the new table
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else {
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        // stays at the original index
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        // moves to original index + oldCap
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    // put the "original index" list into its bucket
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    // put the "original index + oldCap" list into its bucket
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

Reference

JavaGuide

Origin blog.csdn.net/qq_34687559/article/details/114414544