In-depth understanding of the underlying principles of HashMap (Part 2)

Hash table based implementation of the Map interface. This implementation provides all of the optional map operations, and permits null values and the null key. (The HashMap class is roughly equivalent to Hashtable, except that it is unsynchronized and permits nulls.) This class makes no guarantees as to the order of the map; in particular, it does not guarantee that the order will remain constant over time. This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets. Iteration over collection views requires time proportional to the "capacity" of the HashMap instance (the number of buckets) plus its size (the number of key-value mappings). Thus, it is very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.
- From "Baidu Encyclopedia"
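
As a small illustration of the "permits nulls" remark in the quote, the sketch below puts a null key into a HashMap and into a Hashtable. The class and method names (NullKeyDemo, acceptsNullKey) are made up for this example and do not come from the JDK.

```java
import java.util.HashMap;
import java.util.Hashtable;
import java.util.Map;

class NullKeyDemo {
    /** Returns true if the given map accepts a null key without throwing. */
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "value");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // HashMap stores the null key in bucket 0
        System.out.println(acceptsNullKey(new HashMap<>()));
        // Hashtable calls key.hashCode() and therefore rejects null
        System.out.println(acceptsNullKey(new Hashtable<>()));
    }
}
```

On a standard JDK this should print true for HashMap and false for Hashtable.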

Foreword

Step by step, small steps add up to big ones; I believe that steadily accumulating knowledge eventually leads to a qualitative leap.
I am writing this article to record my own learning. I hope more experienced readers will look it over, and if anything is wrong, please point it out. Thank you!

In this part

The previous article gave a rough introduction to several of the basic concepts in HashMap.
If you are interested, please take a look, and if you find a problem, I hope you will point it out.
This article continues from there and takes a closer look at HashMap's expansion (resize) mechanism.
To understand expansion we first need the data structure. Before JDK 1.8, a HashMap was an array plus linked lists. The linked lists exist mainly to resolve hash collisions: on put, HashMap hashes the key and uses the result to compute the array index where the entry is stored, and because different keys can map to the same index, colliding entries are chained together in a linked list at that index. JDK 1.8 changed collision handling significantly: when a chain grows beyond the threshold (8 by default) and the array length is at least 64, the entries at that index are converted into a red-black tree; if the array is still shorter than 64, the table is expanded instead.

In JDK 1.8, both of the above conditions must be satisfied before a bucket is converted into a tree.
Let's take a look at the JDK source.

When the length of a chain passes the threshold, putVal calls treeifyBin to convert it into a red-black tree
treeifyBin then checks whether the array length has reached MIN_TREEIFY_CAPACITY
That is, when the length of the list passes the threshold of 8, putVal calls treeifyBin, and treeifyBin checks again: if the array has not reached the minimum length, it calls resize() to expand the table instead of building a tree.
This article focuses on resize(), so let's first look at how HashMap triggers an expansion.
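
The two conditions can be sketched as a small decision function. This is only a model of the checks in putVal and treeifyBin, not the JDK code itself: the class TreeifyDecision, the enum Action and the method afterAppend are hypothetical names, and the two constants are copied from the private constants in JDK 8's HashMap.

```java
class TreeifyDecision {
    // Values of the corresponding private constants in JDK 8's java.util.HashMap
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { NONE, RESIZE, TREEIFY }

    /**
     * chainLength: number of nodes in the bucket after the new node was appended.
     * tableLength: current length of the bucket array.
     */
    static Action afterAppend(int chainLength, int tableLength) {
        if (chainLength <= TREEIFY_THRESHOLD) {
            return Action.NONE;      // chain is still short: nothing to do
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE;    // table too small: expand instead of building a tree
        }
        return Action.TREEIFY;       // long chain in a large enough table
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(8, 64));  // NONE
        System.out.println(afterAppend(9, 16));  // RESIZE
        System.out.println(afterAppend(9, 64));  // TREEIFY
    }
}
```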

One, what happens when we call put(k, v)?

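// This is the method in the JDK.
/**
*	Parameters:
*		hash(key): the hash computed from the key, used to derive the array index
*		key | value: the key-value pair passed in
*		onlyIfAbsent: boolean; if true, an existing non-null value is not overwritten
*		evict: if false, the table is in creation mode.
*		onlyIfAbsent and evict are internal flags that callers of put never pass themselves, so we only need to know what they are for.
*/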
public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
    }

As we can see, put simply delegates to putVal().
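
To make the hash(key) argument concrete, here is a minimal sketch of how a key ends up at an array index. The spread method mirrors the hash() function in the JDK 8 source as far as I know, and the index formula is the (n - 1) & hash expression used in putVal; the class name HashIndexDemo and the helper name indexFor are made up for this example.

```java
class HashIndexDemo {
    /** Mixes the high 16 bits of hashCode() into the low 16 bits; a null key hashes to 0. */
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    /** Bucket index for a hash in a table whose length is a power of two. */
    static int indexFor(int hash, int tableLength) {
        return (tableLength - 1) & hash;
    }

    public static void main(String[] args) {
        // "a".hashCode() is 97 (0110 0001), so in a table of 16 the index is 97 & 15 = 1
        System.out.println(indexFor(spread("a"), 16));
        // The null key always lands in bucket 0
        System.out.println(indexFor(spread(null), 16));
    }
}
```

Because the table length is always a power of two, the AND with (length - 1) keeps just the low bits of the hash, which is why the high bits are mixed in first.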

Two, how putVal() triggers expansion.

The expansion itself is not performed by putVal() but by resize(). Following the usual habit of reading source code by digging down one step at a time, let's first look at how putVal() calls resize().

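        // Locals: tab is the bucket array, p is a node. JDK 1.8 uses Node; earlier versions used Entry.
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // The table has not been allocated yet, so initialize it
        if ((tab = table) == null || (n = tab.length) == 0)
            // The first resize() allocates the array; n is its length
            n = (tab = resize()).length;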

There is more to putVal than this, of course, but first we need to understand how resize() works. Feel free to jump down to section Three first and then come back (apologies for the inconvenience).

Below is the complete putVal method. Most of it is ordinary insertion logic; note that resize() is called again near the end.

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
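        // Locals: tab is the bucket array, p is a node. JDK 1.8 uses Node; earlier versions used Entry.
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        // The table has not been allocated yet, so initialize it
        if ((tab = table) == null || (n = tab.length) == 0)
            // resize() allocates the array; n is its length
            n = (tab = resize()).length;
        // Compute the index from the hash; if that bucket is empty, create a new node there
        if ((p = tab[i = (n - 1) & hash]) == null)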
            tab[i] = newNode(hash, key, value, null);
        else {
		
            Node<K,V> e; K k;
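            // The head node has the same hash and an equal key as the element being put
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // The bucket is already a red-black tree, so insert into the tree
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    // Reached the tail of the list without finding the key
                    if ((e = p.next) == null) {
                        // Append a new node at the tail
                        p.next = newNode(hash, key, value, null);
                        // If the list is now longer than 8, call treeifyBin (which may resize instead)
                        if (binCount >= TREEIFY_THRESHOLD - 1)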
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
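                    // Advance p to the next node
                    p = e;
                }
            }
            // A mapping for the key already exists
            if (e != null) {
                V oldValue = e.value;
                // Overwrite if overwriting is allowed or the old value is null
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                // Callback that lets LinkedHashMap do post-access work
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount; // Count of structural modifications
        if (++size > threshold) // Size exceeds the threshold
            // Double the array size and move the existing elements into the new array;
            // linked lists and red-black trees have to be redistributed as well
            resize(); // Expand
        // Callback that lets LinkedHashMap do post-insertion work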
        afterNodeInsertion(evict);
        return null;
    }
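
The effect of onlyIfAbsent and of the returned oldValue can be observed through the public API. The sketch below relies on the fact that, in JDK 8, put passes onlyIfAbsent = false and putIfAbsent passes onlyIfAbsent = true to putVal; the class name PutSemanticsDemo and the method observe are made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PutSemanticsDemo {
    /** Collects the return values of a few put calls on the same key, followed by the final value. */
    static List<Integer> observe() {
        Map<String, Integer> map = new HashMap<>();
        List<Integer> seen = new ArrayList<>();
        seen.add(map.put("k", 1));          // no previous mapping: returns null
        seen.add(map.put("k", 2));          // overwrites and returns the old value 1
        seen.add(map.putIfAbsent("k", 3));  // key present: keeps 2 and returns it
        seen.add(map.get("k"));             // still 2
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(observe()); // [null, 1, 2, 2]
    }
}
```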

Three, resize () expansion.

final Node<K,V>[] resize() {
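        Node<K,V>[] oldTab = table; // The array as it was before this resize
        int oldCap = (oldTab == null) ? 0 : oldTab.length; // 0 on first initialization, otherwise the current capacity
        int oldThr = threshold; // Current threshold
        int newCap, newThr = 0; // New capacity and new threshold
        if (oldCap > 0) { // The table already exists
            // Already at the maximum capacity: set the threshold to Integer.MAX_VALUE and return the old array unchanged
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            // Otherwise double the capacity (oldCap << 1); if the doubled capacity is below the maximum
            // and the old capacity is at least the default, double the threshold too
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }

        else if (oldThr > 0) // The initial capacity was stored in threshold; use it as the capacity
            newCap = oldThr;
        else {
            // First initialization with the default capacity and threshold
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        // The threshold has not been computed yet: derive it from the new capacity and the load factor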
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
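        /*
            The net effect of the code above:
            1. On the first put (no table yet):
                1.1 If the map was constructed with arguments
                    (HashMap(int initialCapacity, float loadFactor)),
                    newCap is the capacity saved in threshold by the constructor
                    (initialCapacity rounded up to a power of two),
                    and threshold becomes (int)(newCap * loadFactor)
                1.2 Otherwise the defaults apply: capacity = 16, threshold = 12
            2. If the table already exists, the capacity doubles, and if
               oldCap >= DEFAULT_INITIAL_CAPACITY the threshold doubles as well
        */
        // Create the new hash table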
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
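            // Traverse the old hash table
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e; // Temporary node
                if ((e = oldTab[j]) != null) { // The old bucket is non-empty; e is its head
                    oldTab[j] = null; // Clear the old slot
                    // e has no next node, so the bucket holds a single node (no list):
                    // recompute its index against the new capacity and place it in the new array
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode) // Tree bucket: split the red-black tree
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // Otherwise split the linked list, preserving order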
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) { // This bit decides the new index: 0 means the index stays the same
                                if (loTail == null) // First node: it becomes loHead
                                    loHead = e;
                                else // Append to the lo list
                                    loTail.next = e;
                                loTail = e; // Advance loTail
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null); // Keep going while the list has a next node
                        // After the traversal
                        if (loTail != null) {
                            // Nodes in the lo list keep their index in the new array
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            // Nodes in the hi list move to old index + old capacity
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
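
The capacity and threshold bookkeeping at the top of resize() can be traced with a simplified model. This is a sketch under stated assumptions, not the JDK code: afterResizes only models a map created with new HashMap<>() and ignores the MAXIMUM_CAPACITY branches, and the class name ResizeGrowthDemo is made up. tableSizeFor is reproduced from the JDK 8 source as I remember it; it is the helper the HashMap(int, float) constructor uses to round the requested capacity up to a power of two before storing it in threshold.

```java
class ResizeGrowthDemo {
    // Values of the corresponding constants in JDK 8's java.util.HashMap
    static final int MAXIMUM_CAPACITY = 1 << 30;
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /** Rounds a requested capacity up to the next power of two. */
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    /**
     * Simplified model: returns {capacity, threshold} after the given number
     * of resize() calls on a map created with the no-argument constructor.
     */
    static int[] afterResizes(int resizes) {
        int cap = 0, thr = 0;
        for (int i = 0; i < resizes; i++) {
            if (cap == 0) {
                // First call: allocate with the defaults (16 and 12)
                cap = DEFAULT_INITIAL_CAPACITY;
                thr = (int) (DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
            } else {
                // Later calls: double both capacity and threshold
                cap <<= 1;
                thr <<= 1;
            }
        }
        return new int[] {cap, thr};
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 4; i++) {
            int[] s = afterResizes(i);
            System.out.println("capacity=" + s[0] + " threshold=" + s[1]);
        }
        System.out.println(tableSizeFor(10)); // 16
        System.out.println(tableSizeFor(17)); // 32
    }
}
```

With the default load factor of 0.75 the model walks through 16/12, 32/24, 64/48 and 128/96.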

A walkthrough of the method above, from top to bottom (this is my personal understanding; if anything is wrong, please point it out, and I recommend reading along with the source):

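/**
*	1.
*	oldCap == 0 means the table has not been allocated yet; otherwise it is an existing array with a length.
*	oldTab is the original array (as it was before this put)
*/
int oldCap = (oldTab == null) ? 0 : oldTab.length;
/**
*	2.
*	oldCap > 0 means the array already exists.
*	First branch: the old capacity is already at the maximum, so the threshold is set to Integer.MAX_VALUE and the old array is returned
*	Second branch: the capacity is doubled (oldCap << 1); if the doubled capacity is below the maximum and the old capacity is at least the default, the threshold is doubled too
*	A quick note on the shift operator (experts can skip this)
*	Example: decimal 2
*		0000 0010
*	<<1	0000 0100 -> decimal 4
*	<<1	0000 1000 -> decimal 8
*	<<1	0001 0000 -> decimal 16
*	The pattern:
*	shift once: 2*2 -> shift again: 2*2*2 -> shift again: 2*2*2*2, which is
*	2*(2^1) -> 2*(2^2) -> 2*(2^3) ... 2*(2^n)
*	With this in mind, it should be easy to see why << 1 doubles the value.
*/
if (oldCap > 0) { /* body omitted to save space; compare with the source above, ideally opened in IDEA */ }
/**
*	3.
*	Reaching this else if means if (oldCap > 0) was false, i.e. the table has not been allocated yet.
*	The initial capacity that was saved in threshold becomes the capacity (at this point it has not been multiplied by the load factor)
*/
else if (oldThr > 0) { newCap = oldThr; }
else{
	// First initialization with the default capacity and threshold
}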
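/**
* 4. The key part
* This explains if (oldTab != null) {}. The general meaning is already covered by the comments at the start of this section; below I focus on
* the important parts
*/
// The original array is not null
if (oldTab != null) {
	//1. (the single-node case is omitted here)
	 else if (e instanceof TreeNode) // Tree bucket: split the red-black tree
         ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
  	 else {
		// This is the heart of the expansion.
		/**
		*	After expansion, the contents of the existing buckets are moved into the new array, and each linked list
		*	has to keep the links between its nodes. How does HashMap do this?
		*	Start with an example. As seen above, HashMap computes the index as
		*	[e.hash & (newCap - 1)]
		*	The new size is twice the old one. Suppose the capacity grows from 16 to 32: an index in the old array has only 2
		*	possible outcomes, it changes or it stays the same.
		*	Unchanged index: the node goes into the low list, declared in the else branch as
		*	(Node<K,V> loHead = null, loTail = null); nodes in the low list move to the new array and
		*	keep their index
		*	Changed index: the node goes into the high list, declared in the else branch as
		*	(Node<K,V> hiHead = null, hiTail = null); nodes in the high list move to the new array at
		*	old index + oldCap (the old capacity)
		*	-- Suppose a node in the old table has hash 15 and the old capacity is 16. (e.hash & oldCap) == 0 is
		*	true. Why? A note on & (AND; experts can skip this): a result bit is 1 only when both bits are 1
		*	0000 0010 -> a power of two has exactly one 1 bit; all other bits are 0.
		*	16-> 0001 0000
		*	15-> 0000 1111
		*	&    0000 0000
		*	If (e.hash & oldCap) == 0 is false, the index has to change. For example, hash 31
		*	(0001 1111) has old index 15, and its new index is old index 15 + old capacity 16 = 31
		*	This shows the point of [e.hash & (newCap - 1)] compared with [e.hash & (oldCap - 1)]:
		*	for hash 16, [e.hash & (oldCap - 1)] = 0 while [e.hash & (newCap - 1)] = 16
		*	So the expansion also spreads out entries that used to collide, and the new index can be computed very quickly.
		*	Why test (e.hash & oldCap) == 0? newCap - 1 has exactly one more 1 bit than oldCap - 1, namely the oldCap bit, so that single bit of the hash decides whether the index stays or moves by oldCap. The JDK authors' use of bit operations is truly impressive.
		*	After the traversal, the nodes in lo keep their index and are assigned to the new array,
		*	and the nodes in hi move to old capacity + old index
		*
		*	Now go back to section Two and continue reading putVal
		*
		*/

	}
}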
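
The low/high split rule can be checked numerically. The sketch below compares the shortcut used in resize(), which tests the single bit (hash & oldCap), with recomputing the index from scratch against the doubled capacity. The class name SplitDemo and both method names are made up for this example.

```java
class SplitDemo {
    /** New index derived from the old index plus the (hash & oldCap) test used in resize(). */
    static int newIndexByBit(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    /** New index recomputed directly against the doubled capacity. */
    static int newIndexDirect(int hash, int oldCap) {
        int newCap = oldCap << 1;
        return hash & (newCap - 1);
    }

    public static void main(String[] args) {
        int oldCap = 16;
        int[] hashes = {15, 31, 16, 47};
        for (int h : hashes) {
            System.out.println("hash=" + h
                    + " old=" + (h & (oldCap - 1))
                    + " new(bit)=" + newIndexByBit(h, oldCap)
                    + " new(direct)=" + newIndexDirect(h, oldCap));
        }
    }
}
```

For an old capacity of 16, hashes 15 and 47 stay at index 15, hash 31 moves from 15 to 31, and hash 16 moves from 0 to 16; both methods agree for every hash, which is why resize() never needs to rehash a node.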

Previous: In-depth understanding of the underlying principles of HashMap (Part 1), a rough introduction to the main variables and methods in HashMap. Interested readers are welcome to discuss it and point out any shortcomings.

If anything here is incorrect, I hope readers will point it out promptly. Thanks again.

Origin blog.csdn.net/qq_40409260/article/details/104906133