Detailed analysis of HashMap source code

Inheritance of HashMap:

  • HashMap implements the Cloneable interface, so it can be cloned

  • HashMap implements the Serializable interface and can be serialized

  • HashMap extends AbstractMap and implements the Map interface, so it provides all the operations of the Map interface

Storage structure:

  • Up to and including JDK 1.7, HashMap is backed by an array of linked lists, that is, a hash table with separate chaining. HashMap takes the key's hashCode, runs it through a perturbation function to obtain a hash value, and then uses (n - 1) & hash to locate the bucket for the element (n is the length of the array). If the bucket already holds an element, HashMap compares the hash value and key of that element with those of the element being stored. If they are the same, the existing value is overwritten. If they differ, there is a hash collision, which is resolved by chaining: elements that land in the same bucket but have different keys are linked together in that bucket's list. If a list grows too long, lookups become slow, so JDK 1.8 optimizes this case

  • Since JDK 1.8, when a linked list grows beyond 8 nodes and the array capacity is at least 64, the list is treeified, that is, converted into a red-black tree (if the capacity is below 64, the table is resized instead). Locating a bucket in the array is O(1), searching a linked list of k nodes is O(k), and searching a red-black tree of k nodes is O(log k). Converting a long list into a red-black tree therefore greatly improves lookup time in buckets with many collisions
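
The (n - 1) & hash step described above can be tried in isolation. The following is a minimal sketch, not JDK code: the class name BucketIndexDemo and the method indexFor are made up for illustration, and the sketch assumes n is a power of two, as it is for HashMap's table.

```java
// Illustrative sketch only; BucketIndexDemo and indexFor are hypothetical names.
final class BucketIndexDemo {

    // Maps a hash value to a bucket index.
    // Assumes n is a power of two, so (n - 1) is a mask of low-order 1 bits.
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int[] hashes = {1, 17, 33, -7, 123456789};
        for (int h : hashes) {
            // For a power-of-two n, the mask agrees with a non-negative modulo.
            System.out.println(h + " -> bucket " + indexFor(h, n)
                    + " (floorMod gives " + Math.floorMod(h, n) + ")");
        }
    }
}
```

The hash values 1, 17 and 33 all land in bucket 1 when n is 16, which is the kind of collision that the linked list (and later the red-black tree) has to absorb.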

Source code analysis (jdk1.8):

/**
     * The default initial capacity is 16.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

    /**
     * The maximum capacity is 2 to the 30th power.
     * The maximum capacity, used if a higher value is implicitly specified
     * by either of the constructors with arguments.
     * MUST be a power of two <= 1<<30.
     */
    static final int MAXIMUM_CAPACITY = 1 << 30;

    /**
     * The default load factor.
     * The load factor used when none specified in constructor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * A bin is treeified when an element is added to a bin that already holds at least 8 nodes.
     * The bin count threshold for using a tree rather than list for a
     * bin.  Bins are converted to trees when adding an element to a
     * bin with at least this many nodes. The value must be greater
     * than 2 and should be at least 8 to mesh with assumptions in
     * tree removal about conversion back to plain bins upon
     * shrinkage.
     */
    static final int TREEIFY_THRESHOLD = 8;

    /**
     * A tree bin is converted back to a linked list when it holds 6 or fewer nodes.
     * The bin count threshold for untreeifying a (split) bin during a
     * resize operation. Should be less than TREEIFY_THRESHOLD, and at
     * most 6 to mesh with shrinkage detection under removal.
     */
    static final int UNTREEIFY_THRESHOLD = 6;

    /**
     * Bins are treeified only once the table has at least 64 buckets.
     * The smallest table capacity for which bins may be treeified.
     * (Otherwise the table is resized if too many nodes in a bin.)
     * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
     * between resizing and treeification thresholds.
     */
    static final int MIN_TREEIFY_CAPACITY = 64;
    /* ---------------- Fields -------------- */

    /**
     * The array that stores the elements; its length is always a power of two.
     * The table, initialized on first use, and resized as
     * necessary. When allocated, length is always a power of two.
     * (We also tolerate length zero in some operations to allow
     * bootstrapping mechanics that are currently not needed.)
     */
    transient Node<K,V>[] table;

    /**
     * Cache for entrySet().
     * Holds cached entrySet(). Note that AbstractMap fields are used
     * for keySet() and values().
     */
    transient Set<Map.Entry<K,V>> entrySet;

    /**
     * The number of elements.
     * The number of key-value mappings contained in this map.
     */
    transient int size;

    /**
     * Counter incremented on every resize and every structural change to the map.
     * The number of times this HashMap has been structurally modified
     * Structural modifications are those that change the number of mappings in
     * the HashMap or otherwise modify its internal structure (e.g.,
     * rehash).  This field is used to make iterators on Collection-views of
     * the HashMap fail-fast.  (See ConcurrentModificationException).
     */
    transient int modCount;

    /**
     * Threshold: the table is resized when the number of entries exceeds this value (capacity * load factor).
     * The next size value at which to resize (capacity * load factor).
     *
     * @serial
     */
    // (The javadoc description is true upon serialization.
    // Additionally, if the table array has not been allocated, this
    // field holds the initial array capacity, or zero signifying
    // DEFAULT_INITIAL_CAPACITY.)
    int threshold;

    /**
     * The load factor for the hash table.
     *
     * @serial
     */
    final float loadFactor;

1. Capacity: the length of the array, that is, the number of buckets. The default is 16 and the maximum is 2 to the 30th power. Once the capacity reaches 64, buckets can be treeified

2. Load factor: loadFactor controls how densely the array is filled. The closer loadFactor is to 1, the more densely packed the table becomes and the more likely hash collisions are; the closer it is to 0, the more array space is wasted. To balance the two, the JDK uses a default of 0.75. With the default initial capacity of 16 and loadFactor = 0.75, threshold = capacity * loadFactor = 12, so the table is resized once the number of stored entries exceeds 12, rather than waiting until it reaches 16.

3. Treeification: when the capacity has reached 64 and a linked list grows beyond 8 nodes, the list is converted into a tree. When a tree bin shrinks to 6 nodes or fewer during a resize, it is converted back into a linked list

4. Node<K,V>[] table: each Node has four fields, of which only hash and key are final:

final int hash;

final K key;

V value;

Node<K,V> next; // point to the next node
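
The threshold arithmetic from item 2 can be checked with a few lines of code. This is a simplified sketch under stated assumptions: ThresholdDemo and thresholdFor are hypothetical names, and the loop assumes the capacity doubles on each resize, which matches the power-of-two table length but glosses over the details of the real resize() method.

```java
// Illustrative sketch only; ThresholdDemo and thresholdFor are hypothetical names.
final class ThresholdDemo {

    // Resize threshold for a given capacity: capacity * loadFactor, truncated to int.
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        float loadFactor = 0.75f; // same value as DEFAULT_LOAD_FACTOR
        // Assumption: the capacity doubles each time the table is resized.
        for (int capacity = 16; capacity <= 128; capacity <<= 1) {
            System.out.println("capacity " + capacity
                    + " -> threshold " + thresholdFor(capacity, loadFactor));
        }
    }
}
```

With the default settings the first threshold is 12, so inserting a 13th entry is what triggers the first resize.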

Source code analysis of put method of HashMap collection:

The put method is as follows:

public V put(K key, V value) {
        return putVal(hash(key), key, value, false, true); // hash(key) computes the hash value of the key
    }

The hash method is as follows:

static final int hash(Object key) {
        int h;
        /**
        * If key is null, the hash value is 0; otherwise call the key's hashCode()
        * and XOR the high 16 bits into the low 16 bits, which reduces hash collisions as far as possible
        */
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
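
To see why the high 16 bits are mixed in, consider two hash codes that differ only above bit 15. The sketch below is illustrative rather than JDK code: HashSpreadDemo, spread and indexFor are hypothetical names, and the two sample values are chosen by hand to make the effect visible.

```java
// Illustrative sketch only; HashSpreadDemo, spread and indexFor are hypothetical names.
final class HashSpreadDemo {

    // Same expression as in the hash method above, applied to a raw hashCode value.
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    // Bucket index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        int a = 0x10000; // the low 16 bits of a and b are identical (all zero)
        int b = 0x20000;
        // Without spreading, only the low bits matter, so both land in bucket 0.
        System.out.println("raw:    " + indexFor(a, n) + " vs " + indexFor(b, n));
        // After spreading, the high bits influence the index: buckets 1 and 2.
        System.out.println("spread: " + indexFor(spread(a), n) + " vs " + indexFor(spread(b), n));
    }
}
```

Because small tables only look at the lowest few bits of the hash, folding the upper half into the lower half lets keys whose hash codes differ only in the high bits end up in different buckets.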

The putVal method is as follows:

final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
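        // If the table is uninitialized or has length 0, allocate it via resize()
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        // (n - 1) & hash computes which bucket the element belongs to
        // If the bucket is still empty, the element becomes the first node in the bucket
        if ((p = tab[i = (n - 1) & hash]) == null)
            // Create a new node and place it in the bucket
            tab[i] = newNode(hash, key, value, null);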
        else {
            Node<K,V> e; K k;
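            // If the first node in the bucket has the same key as the element being inserted,
            // remember it in e; e is checked later to decide whether to return early
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;

            // If the first node is a tree node, insert through the tree node's putTreeVal
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            // Otherwise the bucket holds a linked list
            else {
                // Walk the bucket's linked list; binCount counts the nodes visited
                for (int binCount = 0; ; ++binCount) {
                    // Reached the end of the list without finding the key, so the key is absent: append a new node at the tail
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // If the list is longer than 8 after the insertion, check whether to treeify;
                        // binCount does not count the first node, hence the -1
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            // If the array length is still below 64, the table is resized instead of treeified
                            treeifyBin(tab, hash);
                        break;
                    }
                    // The key being inserted was found in the list, so leave the loop
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }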
            }
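            if (e != null) { // existing mapping for key
                // Remember the old value
                V oldValue = e.value;
                // Decide whether the old value should be replaced
                if (!onlyIfAbsent || oldValue == null)
                    // Replace the old value with the new one
                    e.value = value;
                afterNodeAccess(e);
                // Return the old value
                return oldValue;
            }
        }

        // Increment the modification count
        ++modCount;
        // Increment the element count and check whether a resize is needed
        if (++size > threshold)
            // Resize
            resize();
        afterNodeInsertion(evict);
        // There was no existing mapping for the key, so return null
        return null;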
    }
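
The return values and the onlyIfAbsent flag discussed above can be observed from the outside through the public API. The following is a small sketch: PutSemanticsDemo and returnedValues are hypothetical names, and it relies on the JDK 8 behavior that putIfAbsent goes through the same putVal logic with onlyIfAbsent set to true.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; PutSemanticsDemo and returnedValues are hypothetical names.
final class PutSemanticsDemo {

    // Performs a few insertions and collects what each call returned.
    static List<Integer> returnedValues(Map<String, Integer> map) {
        List<Integer> returned = new ArrayList<>();
        returned.add(map.put("a", 1));         // new key: no old value, returns null
        returned.add(map.put("a", 2));         // existing key: value replaced, returns 1
        returned.add(map.putIfAbsent("a", 3)); // existing non-null value is kept, returns 2
        returned.add(map.put("b", null));      // new key mapped to null, returns null
        returned.add(map.putIfAbsent("b", 4)); // old value is null, so 4 is stored; returns null
        return returned;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        System.out.println(returnedValues(map)); // [null, 1, 2, null, null]
        System.out.println(map.get("a") + " " + map.get("b")); // 2 4
    }
}
```

The last call shows the oldValue == null branch: even with onlyIfAbsent set, a key that is present but mapped to null still has its value replaced.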

Origin blog.csdn.net/weixin_71243923/article/details/128838778