Java HashMap Implementation Principles and Source Code Analysis (JDK 8)

In JDK 1.6 and JDK 1.7, HashMap was implemented as an array of buckets plus linked lists: entries whose hashes map to the same bucket are stored in one linked list. When a bucket holds many elements, looking up a key degrades to a linear scan and becomes inefficient. In JDK 1.8, HashMap is implemented with buckets + linked lists + red-black trees: when a chain grows past a threshold (8), the list may be converted to a red-black tree, which greatly reduces search time.

In short, HashMap works as follows:

There is a table array in which each element is the head of a node list. When a key-value pair is added, the hash of the key is computed first; that hash, combined with the table length, yields an array index, which determines the insertion position. Other elements may already occupy that slot because their hashes map to the same index; in that case the new node is appended to the end of the chain at that position. Nodes in the same chain share the same bucket index, so each array slot stores a linked list. When a chain's length reaches 8, it may be converted to a red-black tree, which greatly improves search efficiency.
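The index computation described above can be sketched as a small standalone program. This is a minimal sketch, not JDK code: the class name `IndexDemo`, the helper `indexFor`, and the sample keys are assumptions for illustration; the `hash` body reproduces the perturbation shown later in this article.

```java
public class IndexDemo {
    // JDK 8's perturbation: XOR the high 16 bits of hashCode() into the low 16,
    // so tables that mask off only the low bits still feel the high bits
    static int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // bucket index: since the capacity n is a power of two,
    // hash & (n - 1) is equivalent to hash % n but faster
    static int indexFor(Object key, int capacity) {
        return (capacity - 1) & hash(key);
    }

    public static void main(String[] args) {
        // a null key hashes to 0 and always lands in bucket 0
        System.out.println(indexFor(null, 16)); // 0
        // "a".hashCode() == 97; 97 ^ 0 == 97; 97 & 15 == 1
        System.out.println(indexFor("a", 16));  // 1
    }
}
```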

Storage structure


static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next; // the next pointer shows this is a linked-list node

        Node(int hash, K key, V value, Node<K,V> next) {
            this.hash = hash;
            this.key = key;
            this.value = value;
            this.next = next;
        }
        // ... (remaining methods omitted)
    }
transient Node<K,V>[] table;
  • Internally, HashMap contains an array `table` of type Node; Node implements Map.Entry.
  • A Node stores one key-value pair. It contains four fields; the `next` field shows that nodes form a linked list.
  • Each position in the `table` array serves as a bucket, and each bucket stores a linked list.
  • HashMap resolves collisions by separate chaining (the "zipper" method): nodes with the same bucket index are stored in the same list.

Fields

// serialization ID
private static final long serialVersionUID = 362498820763181265L;
// default initial capacity: 16 buckets
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
// maximum capacity: 1,073,741,824 (just over one billion)
static final int MAXIMUM_CAPACITY = 1 << 30;
// default load factor; with the defaults, a resize triggers once the
// number of entries exceeds 16 * 0.75 = 12
static final float DEFAULT_LOAD_FACTOR = 0.75f;
// when put() appends to a bucket whose chain length reaches 8,
// the list may be converted to a red-black tree
static final int TREEIFY_THRESHOLD = 8;
// during a resize, a tree whose size drops to 6 or fewer is
// converted back to a linked list
static final int UNTREEIFY_THRESHOLD = 6;
// before treeifying there is one more check: conversion happens only if the
// table capacity is at least 64; otherwise the table is resized instead.
// This avoids unnecessary conversions early in a HashMap's life, when several
// entries happen to land in the same bucket.
static final int MIN_TREEIFY_CAPACITY = 64;
// the array that stores the elements
transient Node<K,V>[] table;
// the number of stored entries
transient int size;
// modification count, used by the fail-fast mechanism
transient int modCount;
// threshold: a resize occurs when the actual size exceeds it (capacity * load factor)
int threshold;
// load factor
final float loadFactor;
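The relationship between capacity, load factor, and threshold can be checked with a little arithmetic. A minimal sketch (the class name `ThresholdDemo` and the helper `thresholdFor` are mine, for illustration only):

```java
public class ThresholdDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // threshold = capacity * load factor; a resize triggers when size exceeds it
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // with the defaults, the 13th entry triggers the first resize
        System.out.println(thresholdFor(DEFAULT_INITIAL_CAPACITY, DEFAULT_LOAD_FACTOR)); // 12
        // after doubling to 32 buckets, the threshold doubles too
        System.out.println(thresholdFor(32, DEFAULT_LOAD_FACTOR)); // 24
    }
}
```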

Constructors

public HashMap() {
        this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}
public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
}
public HashMap(int initialCapacity, float loadFactor) {
        if (initialCapacity < 0)
            throw new IllegalArgumentException("Illegal initial capacity: " +
                                               initialCapacity);
        if (initialCapacity > MAXIMUM_CAPACITY)
            initialCapacity = MAXIMUM_CAPACITY;
        if (loadFactor <= 0 || Float.isNaN(loadFactor))
            throw new IllegalArgumentException("Illegal load factor: " +
                                               loadFactor);
        this.loadFactor = loadFactor;
        // tableSizeFor(initialCapacity) computes the smallest power of two
        // >= initialCapacity and uses it as the initial capacity.
        this.threshold = tableSizeFor(initialCapacity);
}
public HashMap(Map<? extends K, ? extends V> m) {
        this.loadFactor = DEFAULT_LOAD_FACTOR;
        putMapEntries(m, false);
}
  • The HashMap constructors allow the caller to pass a capacity that is not a power of two, because tableSizeFor() automatically rounds the given capacity up to the next power of two.

put() operation source code analysis


public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true);
}
static final int hash(Object key) {
        int h;
        // the "perturbation function"; see https://www.cnblogs.com/zhengwang/p/8136164.html
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}    
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        HashMap.Node<K,V>[] tab; HashMap.Node<K,V> p; int n, i;
        // initialize the table if it has not been initialized yet
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // compute an index by AND-ing the hash with (table length - 1),
        // then check whether a node already exists at that index
        if ((p = tab[i = (n - 1) & hash]) == null)
            // the slot at index is empty: create a new node and place it there
            tab[i] = newNode(hash, key, value, null);
        else {
            // reaching here means a node already exists at the index,
            // i.e. a collision occurred
            HashMap.Node<K,V> e; K k;
            // check whether this node's key equals the key being inserted
            if (p.hash == hash &&
                    ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            // check whether this node is already a red-black tree
            else if (p instanceof HashMap.TreeNode)
                // if so, insert the new node into the tree
                e = ((HashMap.TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    // iterate over the node list

                    // check whether we have reached the tail of the list
                    if ((e = p.next) == null) {
                        // reached the tail: append the new node at the end
                        // (before JDK 8, new nodes were inserted at the head)
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // if the chain length reaches 8, the list may be
                            // converted to a red-black tree
                            treeifyBin(tab, hash);
                        break;
                    }
                    // check whether this node's key equals the key being inserted
                    if (e.hash == hash &&
                            ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            // the key already exists: update the value and return the old value
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // increment the modification count
        ++modCount;
        // resize if the new size exceeds the threshold
        // (initially capacity * 0.75)
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }
  • A brief summary of the put() process:
    1. If the table array is empty or null, resize() it to the default size.
    2. Compute the hash of the key and, together with the table length, derive the array index. If tab[index] == null, insert a new node built from the key-value pair directly; otherwise go to step 3.
    3. A hash collision occurred: check the type of the first node to decide whether the bucket is a linked list or a red-black tree, and handle each case accordingly.
  • Why head insertion was replaced by tail insertion: in JDK 1.7, head insertion favored "hot" data (newly inserted entries are likely to be accessed soon), and it avoided either the O(n) cost of walking to the tail or the extra memory needed to track the tail's address. However, head insertion reverses the order of a list during a resize, which can create a cycle in the list under concurrent use.
  • As the putVal() source shows, HashMap places no restriction on null keys or values (a null key's hash is set to 0), i.e. HashMap allows inserting an entry with a null key. Before JDK 1.8, HashMap stored the null-key entry in bucket 0.
  • Determining the node index: the index is obtained by AND-ing the key's hash with (table length - 1).
  • Before converting a list to a red-black tree, there is one more check: the conversion happens only if the table capacity is at least 64; otherwise the table is resized instead. This avoids unnecessary conversions early in a HashMap's life, when several entries happen to land in the same bucket.
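The observable behavior of put() described above can be checked against the real class. A short usage sketch (variable names are mine):

```java
import java.util.HashMap;

public class PutDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();

        // a fresh key: putVal() appends a new node and returns null
        System.out.println(map.put("a", 1)); // null
        // an existing key: the value is overwritten and the old value returned
        System.out.println(map.put("a", 2)); // 1
        // a null key is allowed; its hash is 0, so it lands in bucket 0
        map.put(null, 3);
        System.out.println(map.get(null)); // 3
        System.out.println(map.size()); // 2
    }
}
```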

get() operation source code analysis

public V get(Object key) {
        HashMap.Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }
    
    final HashMap.Node<K,V> getNode(int hash, Object key) {
        HashMap.Node<K,V>[] tab; HashMap.Node<K,V> first, e; int n; K k;
        // the table is non-empty
        if ((tab = table) != null && (n = tab.length) > 0 &&
                // compute the index by AND-ing the hash with (table length - 1);
                // the element at that index is non-null, i.e. it is a node list
                (first = tab[(n - 1) & hash]) != null) {
            // first examine the head node of the list
            if (first.hash == hash && // always check first node
                    // handles both null and non-null keys
                    ((k = first.key) == key || (key != null && key.equals(k))))
                // the keys are equal: return the first node
                return first;
            // the first node's key differs and the list has more than one node
            if ((e = first.next) != null) {
                // check whether the node list has been converted to a red-black tree
                if (first instanceof HashMap.TreeNode)
                    // if so, search within the red-black tree
                    return ((HashMap.TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    // iterate over the node list, testing each key for equality
                    if (e.hash == hash &&
                            ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        // the key is not present in the table: return null
        return null;
    }
  • The get(key) method first computes the hash of the key, then:
    1. Computes hash & (table.length - 1) to get the position in the bucket array.
    2. Checks whether the key of the first node in the bucket equals the argument key.
    3. If not equal, checks whether the bucket has been converted to a red-black tree; if so, searches within the tree.
    4. Otherwise traverses the linked list looking for an equal key, and returns the corresponding value.
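A usage sketch of the lookup path above (variable names are mine). One practical consequence of returning null for absent keys is that a missing key and a null value look the same through get():

```java
import java.util.HashMap;

public class GetDemo {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put(null, 2);

        // present key: its value is returned
        System.out.println(map.get("a")); // 1
        // null key: matched by the (k = first.key) == key reference comparison
        System.out.println(map.get(null)); // 2
        // absent key: getNode() returns null, so get() returns null --
        // indistinguishable from a stored null value; use containsKey() to tell apart
        System.out.println(map.get("missing"));         // null
        System.out.println(map.containsKey("missing")); // false
    }
}
```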

resize() operation source code analysis

// initialization, or redistribution of elements after a resize
    final HashMap.Node<K,V>[] resize() {
        // the old table
        HashMap.Node<K,V>[] oldTab = table;
        // the old table's capacity
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        // the old table's resize threshold
        int oldThr = threshold;
        // the new table's capacity and threshold
        int newCap, newThr = 0;
        // the old table is non-empty
        if (oldCap > 0) {
            // if the capacity has reached the maximum, raise the threshold
            // to Integer.MAX_VALUE
            // MAXIMUM_CAPACITY = 1 << 30;
            // Integer.MAX_VALUE = (1 << 31) - 1;
            if (oldCap >= MAXIMUM_CAPACITY) {
                // the map is at maximum capacity but data is still being added,
                // so set the threshold to the maximum int value;
                // no further resizes will ever occur
                threshold = Integer.MAX_VALUE;
                // done
                return oldTab;
            }
            // otherwise double the capacity
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                    oldCap >= DEFAULT_INITIAL_CAPACITY)
                // the threshold is doubled as well
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            /*
             * Reaching this branch means the map was created with
             * HashMap(int initialCapacity) or
             * HashMap(int initialCapacity, float loadFactor).
             * Note: whatever initialCapacity was passed, tableSizeFor(initialCapacity)
             * rounded it up to the nearest power of two and stored it in threshold,
             * so the capacity actually created may differ from the one requested.
             */
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            // reaching this branch means the map was created with the
            // no-argument constructor; initialize newCap (the new capacity)
            // and newThr (the new resize threshold) from the defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            // two cases reach here:
            // 1. the old table's capacity was greater than 0 but less than 16;
            // 2. the map was created with a parameterized constructor, so the
            //    threshold must be computed from loadFactor
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                    (int)ft : Integer.MAX_VALUE);
        }
        // update the threshold
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        // allocate a new table with the new capacity
        HashMap.Node<K,V>[] newTab = (HashMap.Node<K,V>[])new HashMap.Node[newCap];
        // replace the old table with the new one
        table = newTab;
        // if oldTab is non-null this is a resize; otherwise return newTab directly
        if (oldTab != null) {
            /* iterate over the old table */
            for (int j = 0; j < oldCap; ++j) {
                HashMap.Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    // the bucket contains only a single node
                    if (e.next == null)
                        // recompute the index with the new capacity and
                        // place the node there
                        newTab[e.hash & (newCap - 1)] = e;
                    // check whether this bucket has been converted to a red-black tree
                    else if (e instanceof HashMap.TreeNode)
                        // inside split(), if the tree's size drops to
                        // UNTREEIFY_THRESHOLD (6) or fewer, it is converted
                        // back to a linked list
                        ((HashMap.TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        // reaching here means the bucket contains multiple nodes
                        HashMap.Node<K,V> loHead = null, loTail = null;
                        HashMap.Node<K,V> hiHead = null, hiTail = null;
                        HashMap.Node<K,V> next;
                        do {
                            // iterate over the bucket
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
  • Data can still be put into the map after it reaches MAXIMUM_CAPACITY; only resizing stops.
  • When a parameterized constructor is used, the capacity actually created is not necessarily equal to the requested initial capacity: tableSizeFor() rounds the requested capacity up to the next power of two.
  • A red-black tree can degenerate back into a linked list.
  • Note that a resize must reinsert every key from the old table into the new table, so this step is very time-consuming.
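The lo/hi split in the loop above relies on a small arithmetic fact: because the new capacity is the old one doubled, an entry's new index is either its old index j or j + oldCap, and the single bit (hash & oldCap) decides which. A minimal sketch (the class name `SplitDemo` and its helpers are mine, for illustration):

```java
public class SplitDemo {
    // index of a hash in a table of the given power-of-two capacity
    static int indexFor(int hash, int capacity) {
        return hash & (capacity - 1);
    }

    // after doubling, only the bit (hash & oldCap) matters:
    // 0 -> the entry stays at j (the "lo" list),
    // otherwise it moves to j + oldCap (the "hi" list)
    static int newIndexAfterDoubling(int hash, int oldCap) {
        int j = indexFor(hash, oldCap);
        return (hash & oldCap) == 0 ? j : j + oldCap;
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // hash 5: bit 16 is clear, so it stays in bucket 5
        System.out.println(newIndexAfterDoubling(5, oldCap));  // 5
        // hash 21: old bucket is 21 & 15 == 5, bit 16 is set,
        // so it moves to 5 + 16 = 21
        System.out.println(newIndexAfterDoubling(21, oldCap)); // 21
        // which agrees with recomputing the index against the new capacity
        System.out.println(indexFor(21, 32)); // 21
    }
}
```

This is why resize() never has to rehash keys: splitting each bucket into a lo and a hi list reproduces exactly what `hash & (newCap - 1)` would compute.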


Origin www.cnblogs.com/neverth/p/11781491.html