Reposted from: Programmer Xiaohui (WeChat ID: chengxuyuanxiaohui)
————————————
As we all know, HashMap is a collection that stores Key-Value pairs; each key-value pair is also called an Entry. These Entries are spread across an array, and that array is the backbone of the HashMap.
Every element of the HashMap array is initially Null.
The two HashMap methods we use most often are Get and Put.
1. How the Put method works
What happens when we call the Put method?
For example, calling hashMap.put("apple", 0) inserts an element whose Key is "apple". We then need a hash function to determine the position (index) at which the Entry is inserted:
index = Hash("apple")
Suppose the computed index is 2. The Entry is then stored at position 2 of the array.
However, because the length of the HashMap array is limited, index conflicts become inevitable as more Entries are inserted, no matter how good the Hash function is. For example, a new Entry may map to an index that is already occupied.
What do we do then? We can resolve the conflict with a linked list.
Each element of the HashMap array is not just an Entry object but also the head node of a linked list. Each Entry object points to the next Entry node through a Next pointer. When a new Entry maps to an array position that is already taken, it only needs to be inserted into the corresponding linked list.
Note that a new Entry node is inserted at the head of the linked list ("head insertion"). Why it is not inserted at the tail is explained later.
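The Put steps described above can be sketched in a few lines of Java. This is a simplified illustration of the article's model, not the source of java.util.HashMap: the class name HeadInsertMap, the String/int types, and the use of Math.floorMod as a stand-in for Hash() are all assumptions made for the example, and null keys are not handled.

```java
// Simplified sketch of an array of linked-list buckets with head insertion.
// Hypothetical class for illustration only; not the JDK implementation.
final class HeadInsertMap {
    // One node of a bucket's linked list.
    static final class Entry {
        final String key;
        int value;
        Entry next; // the "Next" pointer to the following Entry, or null

        Entry(String key, int value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    final Entry[] table; // every slot starts out as null

    HeadInsertMap(int length) {
        table = new Entry[length];
    }

    // Placeholder for the article's Hash(): any mapping into [0, length) works here.
    int indexFor(String key) {
        return Math.floorMod(key.hashCode(), table.length);
    }

    void put(String key, int value) {
        int index = indexFor(key);
        // If the key is already in this bucket, overwrite its value.
        for (Entry e = table[index]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        // Head insertion: the new Entry points at the old head and replaces it.
        table[index] = new Entry(key, value, table[index]);
    }

    public static void main(String[] args) {
        // A table of length 1 forces every key into the same bucket,
        // which makes the collision easy to see.
        HeadInsertMap map = new HeadInsertMap(1);
        map.put("apple", 0);
        map.put("banana", 1);
        System.out.println(map.table[0].key);      // banana (inserted last, now the head)
        System.out.println(map.table[0].next.key); // apple
    }
}
```

The last line of put is the whole trick: because the new node's next pointer is the previous head, insertion costs the same no matter how long the list already is.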
2. How the Get method works
What happens when we use the Get method to look up a Value by its Key?
First, the input Key is hashed to obtain the corresponding index:
index = Hash("apple")
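Because of the Hash conflicts just described, several Entries may match the same position, so we have to start at the head node of the corresponding linked list and check the nodes one by one. Suppose the Key we are looking for is "apple":
Step 1: we look at the head node, Entry6. Its Key is banana, which is clearly not what we are looking for.
Step 2: we look at the Next node, Entry1. Its Key is apple, which is exactly what we are looking for.
Entry6 is placed at the head because the inventors of HashMap believed that an Entry inserted later is more likely to be looked up.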
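The two lookup steps above amount to a plain linked-list traversal. The sketch below rebuilds the Entry6 -> Entry1 chain by hand and walks it; the class name ChainLookup and the stored values 6 and 0 are made up for the illustration, and the code is not taken from the JDK.

```java
// Minimal sketch of the bucket traversal performed by Get.
// Hypothetical names and values; for illustration only.
final class ChainLookup {
    static final class Entry {
        final String key;
        final int value;
        final Entry next;

        Entry(String key, int value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Walks the list from the head node and returns the matching value,
    // or null when no Entry in this bucket has the given key.
    static Integer find(Entry head, String key) {
        for (Entry e = head; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Entry entry1 = new Entry("apple", 0, null);    // inserted first
        Entry entry6 = new Entry("banana", 6, entry1); // inserted later, so it is the head
        System.out.println(find(entry6, "apple"));  // 0 (found on the second step)
        System.out.println(find(entry6, "cherry")); // null (not in this bucket)
    }
}
```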
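3. The initial length of HashMap
As mentioned earlier, mapping a Key to its position in the HashMap array uses a Hash function:
index = Hash("apple")
How do we implement a Hash function whose results are distributed as evenly as possible? We perform some operation on the Key's HashCode value.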
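index = HashCode(Key) % Length ?
Taking the remainder would work, but for efficiency HashMap uses a bitwise operation instead. How is the bitwise operation done? The formula is as follows (Length is the length of the HashMap):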
index = HashCode(Key) & (Length - 1)
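As a quick check of how this formula relates to the remainder, the sketch below compares the two expressions over a range of non-negative hash codes. The class and method names are hypothetical, and the comparison is only meaningful for non-negative values, since Java's % can return a negative result for a negative operand.

```java
// Compares h % length with h & (length - 1) for non-negative h.
// Hypothetical helper for illustration only.
final class MaskVersusModulo {
    // Returns true when both expressions agree for every h in [0, limit).
    static boolean agree(int length, int limit) {
        for (int h = 0; h < limit; h++) {
            if ((h % length) != (h & (length - 1))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agree(16, 100_000)); // true: 16 is a power of 2
        System.out.println(agree(10, 100_000)); // false: e.g. 2 % 10 = 2 but 2 & 9 = 0
    }
}
```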
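Let us walk through the whole process with the Key "book":
1. Compute the hashcode of book. The result is 3029737 in decimal, or 101110001110101110 1001 in binary.
2. Assume the HashMap length is the default, 16. Length-1 is then 15 in decimal, or 1111 in binary.
3. AND the two results: 101110001110101110 1001 & 1111 = 1001, which is 9 in decimal, so index = 9.
In other words, the index produced by the Hash algorithm depends entirely on the last few bits of the Key's Hashcode value.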
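The three steps can be reproduced directly in Java, since "book".hashCode() is defined by the language specification. The class name BookIndexDemo and the helper indexFor are assumptions for this sketch. It applies the article's formula as written; the JDK's own HashMap additionally mixes the higher bits of the hash code into the lower ones before masking, so the bucket it actually picks for "book" may differ.

```java
// Reproduces the "book" walk-through with index = HashCode(Key) & (Length - 1).
// Hypothetical demo class; not the JDK's internal hashing.
final class BookIndexDemo {
    static int indexFor(int hashCode, int length) {
        return hashCode & (length - 1);
    }

    public static void main(String[] args) {
        int h = "book".hashCode();
        System.out.println(h);                          // 3029737
        System.out.println(Integer.toBinaryString(h));  // 1011100011101011101001
        System.out.println(Integer.toBinaryString(15)); // 1111
        System.out.println(indexFor(h, 16));            // 9
    }
}
```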
Now suppose the HashMap length is 10, so that Length-1 is 9 in decimal, or 1001 in binary. Repeating the same steps gives 101110001110101110 1001 & 1001 = 1001.
Taken on its own, this result looks fine. Let us try a new HashCode, 101110001110101110 1011. The result is again 1001.
Let us try one more HashCode, 101110001110101110 1111. The result is still 1001.
So although the second-to-last and third-to-last bits of the HashCode changed from 0 to 1, the result of the operation is always 1001. That is, when the HashMap length is 10, some index results occur more often than others, and some index results never occur at all (such as 0111)!
This clearly violates the principle that a Hash algorithm should distribute its results evenly.
By contrast, when the length is 16 or another power of 2, every bit of Length-1 is 1, so the index is simply the last few bits of the HashCode. As long as the input HashCodes themselves are evenly distributed, the result of the Hash algorithm is uniform.
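The difference between the two lengths is easy to observe by collecting every index the formula can produce. The sketch below feeds the hash codes 0 to 999 through index = HashCode & (Length - 1) for both lengths; the class name, the method name, and the sample range are arbitrary choices for the illustration.

```java
import java.util.TreeSet;

// Collects the distinct indexes produced by h & (length - 1).
// Hypothetical demo class for illustration only.
final class MaskDistributionDemo {
    static TreeSet<Integer> reachableIndexes(int length, int sampleSize) {
        TreeSet<Integer> seen = new TreeSet<>();
        for (int h = 0; h < sampleSize; h++) {
            seen.add(h & (length - 1));
        }
        return seen;
    }

    public static void main(String[] args) {
        // Length 10: the mask is 1001, so only bits 3 and 0 can ever be set.
        System.out.println(reachableIndexes(10, 1000)); // [0, 1, 8, 9]
        // Length 16: the mask is 1111, so all 16 slots are reachable.
        System.out.println(reachableIndexes(16, 1000).size()); // 16
    }
}
```

With length 10, six of the ten slots (including 0111, that is 7) are never used, so every Entry crowds into the remaining four.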