The complete HashMap resize process

  1. What happens when the size of a HashMap exceeds the capacity defined by the load factor?

The default load factor is 0.75. That is, once the number of entries in a map reaches 75% of the number of buckets, HashMap, like other collection classes that grow their backing array (such as ArrayList), creates a new bucket array twice the size of the original and moves the existing entries into it. This process is called rehashing, because it calls the hash method to find each entry's new bucket position. An entry can only end up in one of two places: at its original index, or at index <original index + original capacity>.
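The "only two possible positions" claim follows from the index formula index = hash & (capacity - 1): doubling the capacity exposes exactly one more bit of the hash. The following is a minimal sketch of that arithmetic, not JDK code; the class and method names are made up for illustration.

```java
class SplitIndexDemo {
    // Same bucket formula as in the article: index = hash & (capacity - 1).
    // Assumes capacity is a power of 2.
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // True if, after doubling, the entry lands either at its old index
    // or at old index + old capacity.
    static boolean landsInOneOfTwoSlots(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        int newIndex = indexFor(hash, oldCapacity * 2);
        return newIndex == oldIndex || newIndex == oldIndex + oldCapacity;
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        // All four hashes share bucket 5 in a 16-slot table.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println("hash=" + hash
                    + " old index=" + indexFor(hash, oldCapacity)
                    + " new index=" + indexFor(hash, oldCapacity * 2));
        }
    }
}
```

With a capacity of 16, hashes 5 and 37 stay at index 5 after doubling, while 21 and 53 move to index 21 (5 + 16).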

  2. What problems can occur when a HashMap is resized?

  • Resizing a HashMap is subject to a race condition: if two threads both find that the HashMap needs to be resized, they will both try to resize it at the same time. During a resize, the order of the elements stored in a linked list is reversed, because when entries are moved to their new bucket, HashMap places each element at the head of the list rather than at the tail, in order to avoid tail traversing. If the race occurs, the result can be an infinite loop. (Do not use HashMap in a multithreaded environment.)
  • Why can multithreading cause an infinite loop, and how does it happen?

  The capacity of a HashMap is limited. After many insertions the HashMap reaches a certain saturation, and the probability that keys collide on the same position gradually rises. At that point the HashMap needs to extend its length, that is, perform a Resize.

  What is a Resize? First, let's get to know two variables:

  1. Capacity

  The current length of the HashMap's bucket array. The length is always a power of 2.

  2. LoadFactor

  The HashMap's load factor; the default value is 0.75f.

  A HashMap decides whether to resize using the following condition:

  HashMap.Size >= Capacity * LoadFactor
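As a worked example of this condition, the sketch below computes the threshold for a few capacities. It follows the condition exactly as the article states it; the precise comparison inside the JDK differs slightly between versions, and the class and method names here are hypothetical.

```java
class ResizeThreshold {
    // Threshold as described in the article: capacity * loadFactor.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // The article's resize condition: size >= capacity * loadFactor.
    static boolean needsResize(int size, int capacity, float loadFactor) {
        return size >= threshold(capacity, loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f;
        for (int capacity = 16; capacity <= 64; capacity *= 2) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + threshold(capacity, loadFactor));
        }
    }
}
```

With the default capacity of 16 and load factor 0.75f, the threshold is 12 entries.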

   Resize steps

  1. Expansion: create a new, empty Entry array twice the length of the original array.

  2. ReHash: traverse the original Entry array and re-hash every Entry into the new array. Why re-hash? Because once the length has grown, the hashing rule changes with it.

  Hash formula: index = HashCode(Key) & (Length - 1)
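The bit mask in this formula behaves like a modulo only because the length is a power of 2. The sketch below compares the mask with Math.floorMod to illustrate this, and shows that the two disagree for a length such as 12. The class and method names are illustrative only.

```java
class MaskVersusModulo {
    // Bucket index via bit mask; equivalent to a modulo only when length is a power of 2.
    static int maskIndex(int hash, int length) {
        return hash & (length - 1);
    }

    // Bucket index via non-negative modulo; valid for any positive length.
    static int moduloIndex(int hash, int length) {
        return Math.floorMod(hash, length);
    }

    public static void main(String[] args) {
        int[] hashes = {7, 18, -3, 1 << 20};
        for (int h : hashes) {
            System.out.println(h + " -> mask=" + maskIndex(h, 16)
                    + ", modulo=" + moduloIndex(h, 16));
        }
        // With a length that is not a power of 2, the mask skips buckets.
        System.out.println("length 12, hash 4: mask=" + maskIndex(4, 12)
                + ", modulo=" + moduloIndex(4, 12));
    }
}
```

Because the mask depends on Length - 1, the same hash can map to a different index once the length doubles, which is why every Entry has to be placed again.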

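  Suppose the HashMap looks like this before the rehash:

  (original figure not preserved)

  After the rehash it might look like this:

  (original figure not preserved)

  The code, from the JDK 7 HashMap implementation, looks like this:
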

/**
  * Transfers all entries from current table to newTable.
  */
void transfer(Entry[] newTable, boolean rehash) {
     int newCapacity = newTable.length;
     for (Entry<K,V> e : table) {
         while ( null != e) {
             Entry<K,V> next = e.next;
             if (rehash) {
                 e.hash = null == e.key ? 0 : hash(e.key);
             }
             int i = indexFor(e.hash, newCapacity);
             e.next = newTable[i];
             newTable[i] = e;
             e = next;
         }
     }
}
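To see the head insertion in isolation, here is a single-threaded sketch with the same loop shape as transfer(), minus the optional rehash step. Node is a hypothetical stand-in for the JDK's Entry class, and the keys and hash values are made up for illustration.

```java
class HeadInsertionDemo {
    // Hypothetical stand-in for Entry: a key, a precomputed hash and a next pointer.
    static final class Node {
        final String key;
        final int hash;
        Node next;

        Node(String key, int hash, Node next) {
            this.key = key;
            this.hash = hash;
            this.next = next;
        }
    }

    // Same loop shape as transfer(): every node becomes the new head of its bucket.
    static void transfer(Node[] oldTable, Node[] newTable) {
        for (Node e : oldTable) {
            while (e != null) {
                Node next = e.next;
                int i = e.hash & (newTable.length - 1);
                e.next = newTable[i]; // point at the current head
                newTable[i] = e;      // and become the head
                e = next;
            }
        }
    }

    // Renders one bucket as "K1->K2->...".
    static String bucketToString(Node[] table, int index) {
        StringBuilder sb = new StringBuilder();
        for (Node n = table[index]; n != null; n = n.next) {
            if (sb.length() > 0) sb.append("->");
            sb.append(n.key);
        }
        return sb.toString();
    }

    // Old table of length 2 with bucket 1 holding A->B->C (all hash 3); resize to length 4.
    static String demo() {
        Node c = new Node("C", 3, null);
        Node b = new Node("B", 3, c);
        Node a = new Node("A", 3, b);
        Node[] oldTable = new Node[2];
        oldTable[1] = a;
        Node[] newTable = new Node[4];
        transfer(oldTable, newTable);
        return bucketToString(newTable, 3);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints C->B->A
    }
}
```

The list A->B->C comes out as C->B->A: each transfer reverses the order of entries that stay in the same bucket.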

  Now consider a scenario with the following hashmap (original figure not preserved).

Two threads, A and B, both perform a put operation on this hash map.

Because there is not enough space, the hashmap has to be resized.

Suppose thread B has reached the Entry3 object and has just executed the line Entry<K,V> next = e.next; (marked with a red box in the original figure) when the thread is suspended. For thread B:

e = Entry3

next = Entry2

Meanwhile thread A carries out its rehash unobstructed. When thread A's rehash completes, the list in its new array is reversed by head insertion, so Entry2.next = Entry3 and Entry3.next = null (in the original figure, e and next are thread B's two references):

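Up to this point, nothing looks wrong. Next, thread B resumes and continues its own ReHash. Thread B's state was:

e = Entry3

next = Entry2

Continuing through the code, Entry3 is placed at index 3 of thread B's array, and e now points to Entry2. At this point e and next are:

e = Entry2

next = Entry2

Next, head insertion places Entry2 at the head of the bucket in thread B's array. Because thread A has already set Entry2.next = Entry3, the loop reads:

e = Entry2

next = Entry3

and on the following iteration:

e = Entry3

next = Entry3.next = null

At this point newTable[i] = Entry2. In a normal run newTable[i] would be null when Entry3 is inserted, but because Entry2 hashed to the same array index and is already at the head, e.next = newTable[i] sets Entry3.next = Entry2:

e = Entry3

Entry2.next = Entry3

Entry3.next = Entry2

The linked list now contains a cycle, which leads to an infinite loop. (In multithreaded code, use ConcurrentHashMap.)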
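The interleaving described above can be replayed deterministically in a single thread. The sketch below is a simulation under simplifying assumptions, not real concurrent code: it models only the one bucket that Entry2 and Entry3 share, assumes both entries land in the same new bucket, and uses a hypothetical Node class in place of Entry. Thread A's and thread B's loops are written out in the order in which the article says they run.

```java
class ResizeCycleSimulation {
    // Hypothetical stand-in for Entry; only the next pointer matters here.
    static final class Node {
        final String key;
        Node next;

        Node(String key) {
            this.key = key;
        }
    }

    // Floyd's cycle detection: true if following next pointers never reaches null.
    static boolean hasCycle(Node head) {
        Node slow = head;
        Node fast = head;
        while (fast != null && fast.next != null) {
            slow = slow.next;
            fast = fast.next.next;
            if (slow == fast) {
                return true;
            }
        }
        return false;
    }

    // Replays the article's schedule and reports whether thread B's bucket ends up cyclic.
    static boolean simulate() {
        Node entry2 = new Node("Entry2");
        Node entry3 = new Node("Entry3");
        entry3.next = entry2; // shared old bucket: Entry3 -> Entry2 -> null

        // Thread B executes "next = e.next" and is then suspended.
        Node eB = entry3;
        Node nextB = eB.next; // Entry2

        // Thread A runs its whole transfer with head insertion.
        Node headA = null;
        for (Node e = entry3; e != null; ) {
            Node next = e.next;
            e.next = headA;
            headA = e;
            e = next;
        }
        // Thread A's bucket is now Entry2 -> Entry3 -> null.

        // Thread B resumes with its stale e and next references.
        Node headB = null;
        while (eB != null) {
            eB.next = headB; // e.next = newTable[i]
            headB = eB;      // newTable[i] = e
            eB = nextB;      // e = next
            nextB = (eB == null) ? null : eB.next; // next = e.next on the next pass
        }
        return hasCycle(headB);
    }

    public static void main(String[] args) {
        System.out.println("cycle after interleaved resize: " + simulate()); // true
    }
}
```

Note that thread B's transfer loop itself still terminates in this replay. The endless loop shows up afterwards, when an operation such as a lookup walks the cyclic bucket and never reaches null.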

 

 

 


Origin www.cnblogs.com/ustc-anmin/p/11891856.html