HashMap-resize relocation

Constants

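    // Default initial capacity: 16
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;
    // Maximum capacity
    static final int MAXIMUM_CAPACITY = 1 << 30;
    // Default load factor
    static final float DEFAULT_LOAD_FACTOR = 0.75f;
    // Node count threshold for converting a list to a tree
    static final int TREEIFY_THRESHOLD = 8;
    // Node count threshold for converting a tree back to a list
    static final int UNTREEIFY_THRESHOLD = 6;
    // Minimum capacity of the table array for converting a list to a tree.
    // If table.length is smaller than this, resize() expands the table instead.
    static final int MIN_TREEIFY_CAPACITY = 64;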

Variables

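    // Node array, initialized or expanded in resize(). Its length is always a power of two.
    transient Node<K,V>[] table;
    // Cached entry set (inner class EntrySet)
    transient Set<Map.Entry<K,V>> entrySet;
    // Number of nodes stored in the table
    transient int size;
    // Number of structural modifications, as in AbstractList (see ConcurrentModificationException)
    transient int modCount;
    // Size at which the next expansion happens: capacity * loadFactor,
    // or the initial capacity computed by tableSizeFor when initCap is specified by the caller
    int threshold;
    // Load factor, used to determine threshold
    final float loadFactor;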

loadFactor

The load factor indicates how full the hash table is allowed to get; the default is 0.75. A larger load factor lets more elements fill the table, which gives higher space utilization but a greater chance of collisions; a smaller load factor does the opposite.

More collisions mean a higher lookup cost, so a balance has to be struck between the two.

threshold

When the constructor is given an initialCapacity, threshold holds the initial capacity computed by the tableSizeFor method.

At all other times it is the size at which the next expansion happens: threshold = capacity * loadFactor
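As a rough illustration of the formula above, the sketch below computes the threshold for a few capacities. The names ThresholdDemo and thresholdFor are made up for this example and are not part of HashMap.

```java
class ThresholdDemo {
    // Hypothetical helper: mirrors threshold = capacity * loadFactor.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f; // same value as DEFAULT_LOAD_FACTOR
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + thresholdFor(capacity, loadFactor));
        }
    }
}
```

With the default capacity of 16 the threshold is 12, so the first expansion happens once size exceeds 12.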

Constructor

The constructor only sets initial values for a few member variables; it does not instantiate the table. The table is actually instantiated in resize().

    public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " + initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
        this.loadFactor = loadFactor;
        this.threshold = tableSizeFor(initialCapacity);
    }

One important method here is tableSizeFor, which guarantees that the table's initial capacity is a power of two.

n |= n >>> 1: the shift >>> is evaluated first, then the bitwise OR assignment, so it is equivalent to n = n | (n >>> 1).

    // Returns a power of two
    static final int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

When initialCapacity is specified by the caller, the method returns the smallest power of two that is >= initialCapacity (capped at MAXIMUM_CAPACITY).
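To make the rounding concrete, here is a small standalone sketch that copies the bit-smearing steps into a demo class (TableSizeForDemo is a name made up for this example) and prints the result for a few inputs.

```java
class TableSizeForDemo {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Same steps as the method quoted above: copy the highest set bit of
    // (cap - 1) into every lower position, then add 1.
    static int tableSizeFor(int cap) {
        int n = cap - 1;
        n |= n >>> 1;
        n |= n >>> 2;
        n |= n >>> 4;
        n |= n >>> 8;
        n |= n >>> 16;
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }

    public static void main(String[] args) {
        int[] caps = {0, 1, 10, 16, 17, 1000};
        for (int cap : caps) {
            System.out.println(cap + " -> " + tableSizeFor(cap));
        }
    }
}
```

Subtracting 1 first is what keeps an input that is already a power of two (such as 16) from being doubled.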

Node

The object stored in the linked-list structure: it holds the key, the value, the hash value obtained from the hash method, and a link to the next Node.

    static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
        ...
    }

TreeNode

The object stored in the red-black tree structure. It inherits from LinkedHashMap.Entry, so it can still maintain a doubly linked list. Its superclass is ultimately still HashMap.Node.

    static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
        TreeNode<K,V> parent;  // red-black tree links
        TreeNode<K,V> left;
        TreeNode<K,V> right;
        TreeNode<K,V> prev;    // needed to unlink next upon deletion
        boolean red;

        TreeNode(int hash, K key, V val, Node<K,V> next) {
            super(hash, key, val, next);
        }
        ...
    }

    static class Entry<K, V> extends HashMap.Node<K, V> {
        Entry<K, V> before, after;

        Entry(int hash, K key, V value, Node<K, V> next) {
            super(hash, key, value, next);
        }
    }

Tree

Linked-list operations are O(n), so performance gets worse as n grows. In JDK 8 a chain is converted to a tree once it reaches the threshold.

putVal

Entry points: put (through putVal); merge, compute, and similar methods append nodes with their own code but apply the same check.

After a new node is appended to the end of the linked list, a check is made: if the chain length > TREEIFY_THRESHOLD, the treeifyBin method is executed:

putVal -> treeifyBin

Check whether table.length < MIN_TREEIFY_CAPACITY:

If true, call resize() to expand the table; if false, convert the bin to a tree structure -> treeify method.
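The decision described above can be summarized in a few lines. This is only a sketch: afterAppend is a hypothetical helper that names the outcome as a string, whereas the real putVal and treeifyBin mutate the table directly.

```java
class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    // Hypothetical helper: what happens after a node is appended to a chain
    // that now has chainLength nodes, in a table of tableLength slots.
    static String afterAppend(int tableLength, int chainLength) {
        if (chainLength <= TREEIFY_THRESHOLD) {
            return "keep list";          // chain is still short enough
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return "resize";             // small table: expand instead of treeifying
        }
        return "treeify";                // large table: convert the bin to a tree
    }

    public static void main(String[] args) {
        System.out.println(afterAppend(16, 9));
        System.out.println(afterAppend(64, 9));
        System.out.println(afterAppend(64, 8));
    }
}
```

Preferring resize() on a small table makes sense because doubling the table usually shortens the chain by spreading its nodes over two bins.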

resize

Instantiating the table

When putVal finds that the table is empty, it calls resize() for the initial instantiation.
At the end of putVal, if size > threshold, resize() is called to expand the table.

resize -> split

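If table[index] is a TreeNode, the split method is executed to decide whether the tree needs to be converted back to a list during relocation.

Once the nodes have been separated into those that move and those that stay (a move always goes from index to index + oldCap),

the node count of each group is checked directly: if it is <= UNTREEIFY_THRESHOLD, the group is converted to a list -> untreeify method.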

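Relocation

A put appends to the tail of the linked list.

In earlier versions, the relocation step of resize() inserted nodes that moved to the same new index at the head of the chain. Under concurrency, an original chain A->B->nil could turn into a circular chain A->B->A for a thread working from a stale view.

JDK 8 optimizes the relocation method so that the relative order of nodes in the list is unchanged after the move.
If (node.hash & oldCap) == 0 the index is unchanged; otherwise the node moves in the new table: newIndex = oldIndex + oldCap.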
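The order-preserving split can be sketched as follows. This is a simplified illustration, not the JDK code: Node here is a stand-in that carries only a hash and a next link, and SplitDemo, split, chain, and hashes are names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

class SplitDemo {
    // Simplified stand-in for HashMap.Node: only the hash and the next link.
    static final class Node {
        final int hash;
        Node next;
        Node(int hash) { this.hash = hash; }
    }

    // Splits one bin into a "lo" list (stays at index) and a "hi" list
    // (moves to index + oldCap). Nodes are appended at the tail, so their
    // relative order is preserved. Returns {loHead, hiHead}.
    static Node[] split(Node head, int oldCap) {
        Node loHead = null, loTail = null, hiHead = null, hiTail = null;
        for (Node e = head, next; e != null; e = next) {
            next = e.next;
            e.next = null;
            if ((e.hash & oldCap) == 0) {
                if (loTail == null) loHead = e; else loTail.next = e;
                loTail = e;
            } else {
                if (hiTail == null) hiHead = e; else hiTail.next = e;
                hiTail = e;
            }
        }
        return new Node[] { loHead, hiHead };
    }

    // Builds a chain from the given hashes, in order.
    static Node chain(int... hs) {
        Node head = null, tail = null;
        for (int h : hs) {
            Node n = new Node(h);
            if (tail == null) head = n; else tail.next = n;
            tail = n;
        }
        return head;
    }

    // Collects the hashes of a chain, in order.
    static List<Integer> hashes(Node head) {
        List<Integer> out = new ArrayList<>();
        for (Node e = head; e != null; e = e.next) out.add(e.hash);
        return out;
    }

    public static void main(String[] args) {
        // All four hashes map to index 1 when the capacity is 16.
        Node[] parts = split(chain(1, 17, 33, 49), 16);
        System.out.println("stays at index 1:  " + hashes(parts[0]));
        System.out.println("moves to index 17: " + hashes(parts[1]));
    }
}
```

Because each node is appended to the tail of its group, no node ever ends up pointing back at one that came before it, which is what the head-insertion approach could not guarantee.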

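Derivation

resize(): if a node has to move, it always moves from index to index + oldCap;

in other words, the nodes at table[index + oldCap] always came from index.

Premises:

table.length is always a power of two.
(The default is 1 << 4; a specified initCap goes through tableSizeFor, which guarantees a power of two.)
The table doubles on expansion: newCap = oldCap << 1, a left shift by 1 bit.
Indexing: index = node.hash & (cap - 1)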

oldCap = 16, newCap = oldCap << 1 = 32

Old index: e.hash & (oldCap - 1)

    eg1:          value   binary
      e.hash   =   10     0000 1010
      oldCap-1 =   15     0000 1111
      &        =   10     0000 1010

    eg2:          value   binary
      e.hash   =   17     0001 0001
      oldCap-1 =   15     0000 1111
      &        =    1     0000 0001

Check whether the Node has to move in the new table: e.hash & oldCap

    eg1:          value   binary
      e.hash   =   10     0000 1010
      oldCap   =   16     0001 0000
      &        =    0     0000 0000   zero

    eg2:          value   binary
      e.hash   =   17     0001 0001
      oldCap   =   16     0001 0000
      &        =   16     0001 0000   nonzero

New index: e.hash & (newCap - 1)

    eg1:          value   binary
      e.hash   =   10     0000 1010
      newCap-1 =   31     0001 1111
      &        =   10     0000 1010
    Conclusion: the index is unchanged.

    eg2:          value   binary
      e.hash   =   17     0001 0001
      newCap-1 =   31     0001 1111
      &        =   17     0001 0001   oldIndex + oldCap = 1 + 16
    Conclusion: the element's position changes in the expanded array; the new index is the old index plus the old array length.

In the example above:

    oldCap       =  16  0001 0000
    newCap       =  32  0010 0000  oldCap shifted left by 1 bit, low bit filled with 0
    (oldCap - 1) =  15  0000 1111
    (newCap - 1) =  31  0001 1111  (oldCap - 1) shifted left by 1 bit, low bit filled with 1

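(newCap - 1) and (oldCap - 1) differ only in the highest bit, which is the bit position of oldCap: it is 0 in (oldCap - 1) and 1 in (newCap - 1).

Therefore:

hash & (oldCap - 1) and hash & (newCap - 1) can differ only in that highest bit; the remaining bits are identical and equal oldIndex.

In addition:

(newCap - 1) and oldCap both have a 1 in that highest bit, and every other bit of oldCap is 0.

Therefore:

the result of (hash & oldCap) can be nonzero only in oldCap's highest bit; every other bit (even when hash has more bits than oldCap) is 0 "because every bit of oldCap other than its highest bit is 0".

It follows that newIndex = oldIndex + the result of the AND on the highest bit, that is, oldIndex + (hash & oldCap).

The highest bit of oldCap is 1, so the result depends on the value of hash at that bit position:

if that bit is 0, the node stays -> newIndex = oldIndex;

if that bit is 1, the node moves -> newIndex = oldIndex + oldCap.

(If the capacity is not a power of two, the bits of oldCap below its highest bit are not necessarily 0. The result of (hash & oldCap) can then differ in the lower bits as well as the highest bit, so the equation cannot be derived and does not hold.)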
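The derivation can be checked by brute force. The sketch below compares the index predicted by the rule with the index computed directly against the doubled table, for a range of hashes; the class and method names are made up for this example. It also tries a capacity of 12 to show the rule failing when the capacity is not a power of two.

```java
class RelocationRuleDemo {
    // Index predicted by the rule: stay, or move by exactly oldCap.
    static int predictedIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return (hash & oldCap) == 0 ? oldIndex : oldIndex + oldCap;
    }

    // Index computed directly against the doubled table.
    static int actualIndex(int hash, int oldCap) {
        int newCap = oldCap << 1;
        return hash & (newCap - 1);
    }

    // True if the rule matches the direct computation for every hash in [from, to].
    static boolean ruleHolds(int oldCap, int from, int to) {
        for (int h = from; h <= to; h++) {
            if (predictedIndex(h, oldCap) != actualIndex(h, oldCap)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("oldCap=16: " + ruleHolds(16, -100000, 100000));
        System.out.println("oldCap=64: " + ruleHolds(64, -100000, 100000));
        // 12 is not a power of two; hash 4 already breaks the rule.
        System.out.println("oldCap=12: " + ruleHolds(12, 0, 100));
    }
}
```

For oldCap = 12 and hash = 4, the old index is 4 & 11 = 0 and the rule predicts 0 + 12 = 12, while the direct computation gives 4 & 23 = 4.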

Q&A

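Q: Why are the thresholds for converting between linked list and red-black tree 6 and 8?

A: With 6 and 8 (a tree is converted back to a list at 6 nodes or fewer, and a list is converted to a tree at the threshold of 8), the value 7 in between effectively prevents frequent conversion between list and tree. Suppose instead that a list were converted to a tree when its length exceeded 8 and a tree were converted back to a list when its length dropped below 8: if a HashMap kept inserting and removing elements so that the length hovered around 8, tree-to-list and list-to-tree conversions would happen constantly, which would be very inefficient.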

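Q: Why is the default loadFactor 0.75?

A: Ideally, with random hash codes, the number of nodes per bin follows a Poisson distribution. The JDK source comments give a parameter of about 0.5 on average for the default resizing threshold of 0.75.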
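To get a feel for what a Poisson distribution implies for bin sizes, the sketch below evaluates P(k) = exp(-λ) · λ^k / k! for k = 0..8. It assumes λ = 0.5, the approximate parameter cited in the JDK source comments for the default load factor; PoissonBinDemo and poisson are names invented for this example.

```java
class PoissonBinDemo {
    // Poisson probability mass function: exp(-lambda) * lambda^k / k!
    static double poisson(double lambda, int k) {
        double p = Math.exp(-lambda);
        for (int i = 1; i <= k; i++) {
            p *= lambda / i;   // builds lambda^k / k! one factor at a time
        }
        return p;
    }

    public static void main(String[] args) {
        double lambda = 0.5; // assumption: parameter quoted in the JDK comments
        for (int k = 0; k <= 8; k++) {
            System.out.printf("%d: %.8f%n", k, poisson(lambda, k));
        }
    }
}
```

Under this assumption the probability of a bin holding 8 nodes is below one in ten million, which is why treeification is expected to be rare with well-distributed hash codes.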

Q: Why are String and wrapper classes such as Integer well suited as HashMap keys?

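A: They are immutable final classes that override equals() and hashCode(), so a key's hash value cannot change after it has been stored.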

Q: If a HashMap key is a custom Object type, which methods does it need to implement?

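A: It needs to override hashCode() and equals(), and the two must be consistent: objects that are equal must return the same hash code.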
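A minimal sketch of a custom key type, assuming the usual contract that a key overrides both equals() and hashCode() consistently. Point is a hypothetical class used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

class KeyDemo {
    // Hypothetical immutable key type.
    static final class Point {
        private final int x;
        private final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Point)) return false;
            Point other = (Point) o;
            return x == other.x && y == other.y;
        }

        @Override
        public int hashCode() {
            // Equal points produce equal hash codes, so they land in the same bin.
            return Objects.hash(x, y);
        }
    }

    // Stores a value under one Point and looks it up with an equal but distinct instance.
    static String lookup() {
        Map<Point, String> map = new HashMap<>();
        map.put(new Point(1, 2), "a");
        return map.get(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(lookup());
    }
}
```

Without the hashCode() override, the two Point instances would almost always hash to different values and the lookup would return null even though equals() reports them as equal.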

Source: http://daohang.1994july.club/