Comic: What is HashMap? (Underlying principles)

 
Original link: https://mp.weixin.qq.com/s/HzRH9ZJYmidzW5jrMvEi4w

Reposted from: Programmer Xiaohui (WeChat ID: chengxuyuanxiaohui)


As we all know, HashMap is a collection that stores Key-Value pairs, and each key-value pair is also called an Entry. These Entries are scattered across an array, and this array is the backbone of HashMap.

The initial value of every element in the HashMap array is null.


The two HashMap methods we use most often are Get and Put.

1. How the Put method works

What happens when the Put method is called?

For example, calling hashMap.put("apple", 0) inserts an element whose Key is "apple". We need a hash function to determine where the Entry is inserted (the index):
index = Hash("apple")

Suppose the computed index is 2. The Entry is then stored in slot 2 of the array.

However, because the length of a HashMap is limited, as more and more Entries are inserted, even a perfect Hash function will inevitably produce index conflicts. For example, a new Entry may be mapped to an index that already holds another Entry.

What do we do then? We can use a linked list.

Each element of the HashMap array is not just an Entry object but also the head node of a linked list. Each Entry object points to its next Entry node through a Next pointer. When a new Entry is mapped to an array position that is already taken, it only needs to be inserted into the corresponding linked list.

Note that the new Entry node is inserted into the linked list using "head insertion". Why it is not inserted at the tail of the list is explained later.
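The Put steps above can be sketched in a few lines of Java. This is a simplified illustration, not the JDK source: the names SketchMap, Entry and indexFor are hypothetical, the table length is fixed at 16, and the index is computed with the HashCode(Key) & (Length - 1) formula that the article arrives at in section 3. Head insertion matches the behavior described here (JDK 7); later JDK versions append new nodes at the tail instead.

```java
// Simplified sketch of the array-plus-linked-list layout described above.
// SketchMap, Entry and indexFor are illustrative names, not JDK internals.
class SketchMap {
    static final class Entry {
        final String key;
        int value;
        Entry next; // pointer to the next Entry in the same slot

        Entry(String key, int value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Backbone array; every slot starts out as null.
    final Entry[] table = new Entry[16];

    // Assumed index formula: HashCode(Key) & (Length - 1).
    int indexFor(String key) {
        return key.hashCode() & (table.length - 1);
    }

    void put(String key, int value) {
        int index = indexFor(key);
        // If the key is already present in this slot, overwrite its value.
        for (Entry e = table[index]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                e.value = value;
                return;
            }
        }
        // Head insertion: the new Entry becomes the head and points at the old head.
        table[index] = new Entry(key, value, table[index]);
    }
}
```

The strings "Aa" and "BB" have the same hashCode in Java, so putting both into a SketchMap is a convenient way to see a conflict: whichever is inserted second ends up at the head of the list.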

2. How the Get method works

What happens when we use the Get method to look up a Value by its Key?

First, the input Key is hashed to obtain the corresponding index:
index = Hash("apple")

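Because of the Hash conflicts just mentioned, the same position may match more than one Entry. In that case we start at the head node of the corresponding linked list and search downward one node at a time. Suppose the Key we are looking for is "apple":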

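Step 1: we look at the head node, Entry6. Entry6's Key is banana, which is obviously not the one we want.

Step 2: we look at the Next node, Entry1. Entry1's Key is apple, which is exactly the one we want.

Entry6 sits at the head because the inventor of HashMap believed that a more recently inserted Entry is more likely to be looked up.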
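The two lookup steps above amount to walking the list from its head until a matching Key is found. A minimal sketch follows, assuming a hypothetical BucketLookup class and a single slot whose list is Entry6 (banana) followed by Entry1 (apple). The value 6 stored for banana is made up for illustration; the value 0 for apple comes from the earlier hashMap.put("apple", 0) call.

```java
// Sketch of the Get traversal over one slot's linked list.
// BucketLookup and find are illustrative names, not JDK internals.
class BucketLookup {
    static final class Entry {
        final String key;
        final int value;
        final Entry next;

        Entry(String key, int value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Walks the list from the head; returns the value for key, or null if absent.
    static Integer find(Entry head, String key) {
        for (Entry e = head; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Entry entry1 = new Entry("apple", 0, null);
        Entry entry6 = new Entry("banana", 6, entry1); // 6 is a hypothetical value
        System.out.println(find(entry6, "apple")); // prints 0
    }
}
```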


3. The initial length of HashMap


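As mentioned earlier, mapping a Key to its position in the HashMap array uses a Hash function:
index = Hash("apple")

How do we implement a Hash function whose results are distributed as evenly as possible? We perform some computation on the Key's HashCode value.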

The obvious candidate is a modulo operation:
index = HashCode(Key) % Length ?

HashMap uses a bitwise operation instead. The formula is as follows (Length is the length of the HashMap):
index = HashCode(Key) & (Length - 1)
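The two formulas can be compared directly. The sketch below uses a hypothetical IndexFormula class. It relies on Math.floorMod rather than the % operator, because % returns a negative result for a negative HashCode in Java and would not be a valid array index. When Length is a power of 2, both methods return the same index for every int HashCode.

```java
// Compares the modulo formula with the bitwise formula.
// IndexFormula, byModulo and byMask are illustrative names.
class IndexFormula {
    // Modulo version; floorMod keeps the result in the range [0, length).
    static int byModulo(int hashCode, int length) {
        return Math.floorMod(hashCode, length);
    }

    // Bitwise version from the text: HashCode(Key) & (Length - 1).
    static int byMask(int hashCode, int length) {
        return hashCode & (length - 1);
    }

    public static void main(String[] args) {
        int length = 16; // a power of 2
        int[] samples = {0, 9, 15, 16, 3029737, -1, Integer.MIN_VALUE};
        for (int h : samples) {
            System.out.println(h + " -> " + byModulo(h, length) + " / " + byMask(h, length));
        }
    }
}
```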

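Below we use the Key "book" to walk through the whole process:

1. Compute the hashcode of book. The result is 3029737 in decimal, or 101110001110101110 1001 in binary.

2. Assume the HashMap length is the default 16. Length-1 is 15 in decimal, or 1111 in binary.

3. AND the two results: 101110001110101110 1001 & 1111 = 1001, which is 9 in decimal, so index = 9.

In other words, the index produced by the Hash algorithm depends entirely on the last few bits of the Key's Hashcode value.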
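The three steps for "book" can be reproduced with a short program. This is a sketch that follows the article's simplified formula, with a hypothetical BookIndex class. The real java.util.HashMap in JDK 8 and later also XORs the upper 16 bits of the hashCode into the lower bits before masking, so the slot it actually picks for "book" may differ.

```java
// Reproduces the worked example for the Key "book".
// BookIndex and indexFor are illustrative names.
class BookIndex {
    // Simplified formula from the text: HashCode(Key) & (Length - 1).
    static int indexFor(String key, int length) {
        return key.hashCode() & (length - 1);
    }

    public static void main(String[] args) {
        int hashCode = "book".hashCode();
        int length = 16;
        System.out.println(hashCode);                           // 3029737
        System.out.println(Integer.toBinaryString(hashCode));   // 1011100011101011101001
        System.out.println(Integer.toBinaryString(length - 1)); // 1111
        System.out.println(indexFor("book", length));           // 9
    }
}
```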

Suppose the HashMap length is 10 and repeat the steps above. Length-1 is now 9, or 1001 in binary:
101110001110101110 1001 & 1001 = 1001

Taken on its own, this result looks fine. Let us try a new HashCode, 101110001110101110 1011:
101110001110101110 1011 & 1001 = 1001

Let us try another HashCode, 101110001110101110 1111:
101110001110101110 1111 & 1001 = 1001

Even though we changed the second-to-last and third-to-last bits of the HashCode from 0 to 1, the result is still 1001. In other words, when the HashMap length is 10, some index values come up more often than others, and some never come up at all (such as 0111)!

This clearly violates the principle that a Hash algorithm should distribute evenly.

By contrast, when the length is 16 or another power of 2, every bit of Length-1 is 1, so the index is simply the last few bits of the HashCode. As long as the input HashCodes are evenly distributed, the result of the Hash algorithm is uniform.
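The difference between a length of 10 and a length of 16 can be checked by enumeration. The sketch below uses a hypothetical ReachableIndexes class. It applies HashCode & (Length - 1) to the HashCodes 0 through 1023, an arbitrary range chosen only to cover every combination of the low bits, and collects the distinct index values that come out.

```java
import java.util.TreeSet;

// Lists which index values the bitwise formula can produce for a given length.
// ReachableIndexes and reachable are illustrative names.
class ReachableIndexes {
    static TreeSet<Integer> reachable(int length) {
        TreeSet<Integer> seen = new TreeSet<>();
        // 0..1023 covers every combination of the low 10 bits.
        for (int hashCode = 0; hashCode < 1024; hashCode++) {
            seen.add(hashCode & (length - 1));
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(reachable(10));        // [0, 1, 8, 9]
        System.out.println(reachable(16).size()); // 16
    }
}
```

With a length of 10, only four of the ten slots can ever be used, and index 7 (0111) is never produced. With a length of 16, all sixteen slots are reachable.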


Origin www.cnblogs.com/skycto/p/11470263.html