HashMap source code in plain language (part 1)

HashMap can be roughly summarized in one sentence:

Use a hash algorithm to compute an index from the key, then store a HashMapEntry made of the key and value at HashMapEntry[index]; that completes put. For get, recompute the index from the key and fetch the HashMapEntry from the same slot.

That is only the most superficial view. What if different keys produce the same index? This is where the next field in HashMapEntry comes into play.

[Figure: HashMap structure]

As shown in the figure above: the horizontal row of entries is the HashMapEntry[] array, each vertical arrow indicates several entries stored under the same index, and those entries are linked together as a chain.
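
The idea can be sketched in a few lines. This is only a minimal illustration, not the android-23 code: the class name TinyChainedMap, the fixed table length of 4, and the use of String keys with the raw hashCode() are all assumptions made to keep it short.

```java
// Minimal sketch of an array of buckets with chained entries.
// TinyChainedMap and its fixed capacity are hypothetical, for illustration only.
final class TinyChainedMap {
    static final class Entry {
        final String key;
        int value;
        final Entry next; // next entry stored under the same index

        Entry(String key, int value, Entry next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Entry[] table = new Entry[4]; // length is a power of two

    private int indexFor(String key) {
        return key.hashCode() & (table.length - 1);
    }

    void put(String key, int value) {
        int index = indexFor(key);
        for (Entry e = table[index]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                e.value = value; // key already present: update in place
                return;
            }
        }
        table[index] = new Entry(key, value, table[index]); // new head of the chain
    }

    Integer get(String key) {
        for (Entry e = table[indexFor(key)]; e != null; e = e.next) {
            if (e.key.equals(key)) {
                return e.value;
            }
        }
        return null; // not found
    }
}
```

With this sketch, "a" and "e" both land on index 1 (hash codes 97 and 101, masked with 3), so the second put chains onto the first instead of overwriting it.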

That is the general design of HashMap. Next, let's see how the source code is written (the source is from android-23).

Constructor

No-argument constructor: uses the shared empty table and sets the threshold

    /**
     * Min capacity (other than zero) for a HashMap. Must be a power of two
     * greater than 1 (and less than 1 << 30).
     */
    //Minimum capacity of a HashMap. It must be a power of two, greater than 1 and less than 2^30
    private static final int MINIMUM_CAPACITY = 4;

    /**
     * An empty table shared by all zero-capacity maps (typically from default
     * constructor). It is never written to, and replaced on first put. Its size
     * is set to half the minimum, so that the first resize will create a
     * minimum-sized table.
     */
    //Empty table; its capacity is half the minimum capacity
    private static final Entry[] EMPTY_TABLE
            = new HashMapEntry[MINIMUM_CAPACITY >>> 1];//unsigned right shift

    public HashMap() {
        table = (HashMapEntry<K, V>[]) EMPTY_TABLE;
        threshold = -1; // Forces first put invocation to replace EMPTY_TABLE. The threshold is the size beyond which the table must grow
    }

Constructor with a capacity argument

Normalizes the incoming capacity value, then creates the table

    /**
     * Max capacity for a HashMap. Must be a power of two >= MINIMUM_CAPACITY.
     */
    //Likewise: the capacity must be a power of two and at least MINIMUM_CAPACITY
    private static final int MAXIMUM_CAPACITY = 1 << 30;

    public HashMap(int capacity) {
        if (capacity < 0) {//throw an exception
            throw new IllegalArgumentException("Capacity: " + capacity);
        }

        if (capacity == 0) {//With capacity 0 the logic matches the no-argument constructor; I think this could reuse HashMap() and return instead of repeating the code
            @SuppressWarnings("unchecked")
            HashMapEntry<K, V>[] tab = (HashMapEntry<K, V>[]) EMPTY_TABLE;
            table = tab;
            threshold = -1; // Forces first put() to replace EMPTY_TABLE
            return;
        }

        if (capacity < MINIMUM_CAPACITY) {
            capacity = MINIMUM_CAPACITY;
        } else if (capacity > MAXIMUM_CAPACITY) {
            capacity = MAXIMUM_CAPACITY;
        } else {
            capacity = Collections.roundUpToPowerOfTwo(capacity);//smallest power of two greater than or equal to capacity
        }
        makeTable(capacity);
    }

    private HashMapEntry<K, V>[] makeTable(int newCapacity) {
        @SuppressWarnings("unchecked") HashMapEntry<K, V>[] newTable
                = (HashMapEntry<K, V>[]) new HashMapEntry[newCapacity];//create the array with the requested capacity
        table = newTable;
        threshold = (newCapacity >> 1) + (newCapacity >> 2); // 3/4 capacity
        //Set the capacity threshold; the array is enlarged once the size exceeds it
        return newTable;
    }
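
The capacity handling above can be tried in isolation. The sketch below is written under assumptions: CapacitySketch and normalize are hypothetical names, and the bit-smearing body of roundUpToPowerOfTwo is one common way to implement such a function, not necessarily what Collections.roundUpToPowerOfTwo does internally. The capacity == 0 case is left out because the constructor handles it separately.

```java
// Sketch of the capacity normalization performed by HashMap(int capacity).
// The bit-smearing roundUpToPowerOfTwo below is an assumed implementation.
final class CapacitySketch {
    static final int MINIMUM_CAPACITY = 4;
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Smallest power of two >= i, valid for 0 < i <= 2^30.
    static int roundUpToPowerOfTwo(int i) {
        i--; // so that exact powers of two map to themselves
        i |= i >>> 1;
        i |= i >>> 2;
        i |= i >>> 4;
        i |= i >>> 8;
        i |= i >>> 16;
        return i + 1;
    }

    // Clamps to [MINIMUM_CAPACITY, MAXIMUM_CAPACITY], otherwise rounds up.
    static int normalize(int capacity) {
        if (capacity < MINIMUM_CAPACITY) {
            return MINIMUM_CAPACITY;
        }
        if (capacity > MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        return roundUpToPowerOfTwo(capacity);
    }

    // Same arithmetic as makeTable: 3/4 of the capacity.
    static int thresholdFor(int capacity) {
        return (capacity >> 1) + (capacity >> 2);
    }
}
```

For example, a requested capacity of 17 becomes 32, and a table of 16 slots gets a threshold of 12.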

Constructor with a load factor

What is the load factor?

The storage structure of HashMap is an array whose items are linked lists.

The advantage of an array is fast lookup, because its memory is contiguous; but the generated indexes are not necessarily contiguous, so some slots are wasted.

The advantage of a linked list is that insertions and deletions are fast, but lookup is slower than in an array, because the list must be searched one entry at a time.

The load factor balances these trade-offs between the array and the linked lists. With a large load factor the chains grow long and lookups slow down; with a small one the array grows large and wastes space.

In principle, the load factor should be chosen to suit actual usage. This source, however, does not actually apply the argument: it only validates it and always uses 3/4.

    /**
     * Constructs a new {@code HashMap} instance with the specified capacity and
     * load factor.
     *
     * @param capacity
     *            the initial capacity of this hash map.
     * @param loadFactor
     *            the initial load factor.
     * @throws IllegalArgumentException
     *                when the capacity is less than zero or the load factor is
     *                less or equal to zero or NaN.
     */
    public HashMap(int capacity, float loadFactor) {
        this(capacity);

        if (loadFactor <= 0 || Float.isNaN(loadFactor)) {
            throw new IllegalArgumentException("Load factor: " + loadFactor);
        }

        /*
         * Note that this implementation ignores loadFactor; it always uses
         * a load factor of 3/4. This simplifies the code and generally
         * improves performance.
         */
    }
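
To see what "always uses a load factor of 3/4" means in numbers, here is a small sketch. LoadFactorSketch and its method names are hypothetical; only the shift expression comes from makeTable.

```java
// Compares the shift-based threshold from makeTable with an explicit load factor.
// All names here are hypothetical, for illustration only.
final class LoadFactorSketch {
    // Threshold as computed by makeTable: capacity/2 + capacity/4.
    static int shiftThreshold(int capacity) {
        return (capacity >> 1) + (capacity >> 2);
    }

    // The same threshold written with an explicit load factor.
    static int factorThreshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Checks that both forms agree for every power of two from 4 to 2^30.
    static boolean agreeForAllCapacities() {
        for (int shift = 2; shift <= 30; shift++) {
            int capacity = 1 << shift;
            if (shiftThreshold(capacity) != factorThreshold(capacity, 0.75f)) {
                return false;
            }
        }
        return true;
    }
}
```

A smaller factor such as 0.5f would make a 16-slot table grow after 8 entries instead of 12, trading memory for shorter chains; this implementation simply does not offer that choice.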

Constructor from another Map

Sizes its own array according to the size of the given map, then copies that map's entries into itself.

    public HashMap(Map<? extends K, ? extends V> map) {
        //set the array size
        this(capacityForInitSize(map.size()));
        //copy the entries
        constructorPutAll(map);
    }

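    //Returns the required capacity
    static int capacityForInitSize(int size) {
        int result = (size >> 1) + size; // Multiply by 3/2 to allow for growth
        //Multiplies the incoming size by 1.5; a shift is used in place of multiplication because it is fast

        // boolean expr is equivalent to result >= 0 && result<MAXIMUM_CAPACITY
        //The return value must be >= 0 and < MAXIMUM_CAPACITY. How is that checked?
        //Suppose MAXIMUM_CAPACITY were binary 1000:
        //  (MAXIMUM_CAPACITY-1)  = 0111
        //  ~(MAXIMUM_CAPACITY-1) = 1000
        //So result & ~(MAXIMUM_CAPACITY-1) clears the low bits. The outcome is 0 unless a high bit is set, and a set high bit means result is >= MAXIMUM_CAPACITY (or negative), hence: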
        return (result & ~(MAXIMUM_CAPACITY-1))==0 ? result : MAXIMUM_CAPACITY;
    }

    //Fill
    final void constructorPutAll(Map<? extends K, ? extends V> map) {
        if (table == EMPTY_TABLE) {//if the table is still the shared empty table, double the capacity first
            doubleCapacity(); // Don't do unchecked puts to a shared table.
        }
        for (Entry<? extends K, ? extends V> e : map.entrySet()) {
            //put each key/value pair
            constructorPut(e.getKey(), e.getValue());
        }
    }
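
The mask trick in capacityForInitSize is easy to check against the plain comparison it replaces. In the sketch below, CapacityMaskSketch, inRangeByMask and inRangeByComparison are hypothetical names; the body of capacityForInitSize is copied from the source above.

```java
// Checks that the mask test is equivalent to 0 <= result < MAXIMUM_CAPACITY.
// Helper names are hypothetical; capacityForInitSize mirrors the source.
final class CapacityMaskSketch {
    static final int MAXIMUM_CAPACITY = 1 << 30;

    static int capacityForInitSize(int size) {
        int result = (size >> 1) + size; // 1.5 * size, may overflow to a negative value
        return (result & ~(MAXIMUM_CAPACITY - 1)) == 0 ? result : MAXIMUM_CAPACITY;
    }

    // True when neither bit 30 nor bit 31 (the sign bit) is set.
    static boolean inRangeByMask(int result) {
        return (result & ~(MAXIMUM_CAPACITY - 1)) == 0;
    }

    static boolean inRangeByComparison(int result) {
        return result >= 0 && result < MAXIMUM_CAPACITY;
    }
}
```

Because ~(MAXIMUM_CAPACITY - 1) keeps only bits 30 and 31, a single AND covers both "too large" and "overflowed into a negative number", which is why a size of Integer.MAX_VALUE still yields MAXIMUM_CAPACITY.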

Inserting an entry

    private void constructorPut(K key, V value) {
        if (key == null) {//a null key is tracked in entryForNullKey
            HashMapEntry<K, V> entry = entryForNullKey;
            if (entry == null) {
                entryForNullKey = constructorNewEntry(null, value, 0, null);
                size++;
            } else {
                entry.value = value;
            }
            return;
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);//compute the index
        HashMapEntry<K, V> first = tab[index];//if this slot is not empty, search down the chain
        for (HashMapEntry<K, V> e = first; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                e.value = value;
                return;
            }
        }

        // No entry for (non-null) key is present; create one
        tab[index] = constructorNewEntry(key, value, hash, first);
        size++;
    }

    HashMapEntry<K, V> constructorNewEntry(
            K key, V value, int hash, HashMapEntry<K, V> first) {
        return new HashMapEntry<K, V>(key, value, hash, first);
    }
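
The expression hash & (tab.length - 1) only works as an index because the table length is always a power of two. A short sketch, with the hypothetical names IndexMaskSketch and indexFor, compares it with the standard Math.floorMod:

```java
// For a power-of-two table length, masking equals a non-negative modulo.
// IndexMaskSketch and indexFor are hypothetical names.
final class IndexMaskSketch {
    static int indexFor(int hash, int tableLength) {
        return hash & (tableLength - 1);
    }
}
```

The mask keeps only the low bits of the hash, so even a negative hash such as -17 maps to a valid slot (15 in a table of 16), with no division involved.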

PUT

public V put(K key, V value)
Takes a key and a value and computes the hash of the key.
If the key already exists, the value is updated and the old value is returned; otherwise a new HashMapEntry is added and null is returned.

There is also a special case: HashMap maintains a separate entryForNullKey to hold the value stored under key == null.

    @Override public V put(K key, V value) {
        if (key == null) {
            return putValueForNullKey(value);
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {
                preModify(e);
                V oldValue = e.value;
                e.value = value;
                return oldValue;
            }
        }

        // No entry for (non-null) key is present; create one
        modCount++;
        if (size++ > threshold) {
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }
        addNewEntry(key, value, hash, index);
        return null;
    }
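
The return values described above can be observed from the outside. The sketch below uses the standard java.util.HashMap API rather than the android-23 class itself, on the assumption that both follow the same Map contract for put; PutReturnDemo is a hypothetical name.

```java
import java.util.HashMap;
import java.util.Map;

// Observes what put returns for new keys, existing keys, and the null key.
final class PutReturnDemo {
    static String[] run() {
        Map<String, String> map = new HashMap<>();
        String first = map.put("k", "v1");       // new key: no old value
        String second = map.put("k", "v2");      // existing key: old value comes back
        String nullFirst = map.put(null, "n1");  // the null key is allowed
        String nullSecond = map.put(null, "n2");
        return new String[] {
            first, second, nullFirst, nullSecond, String.valueOf(map.size())
        };
    }
}
```

After these four calls the map holds two mappings: one for "k" and one for the null key.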

Step one: storing a null key

    transient int size;
    transient int modCount;
    //transient: the field takes no part in serialization; it is not written out and lives only in this object.

    private V putValueForNullKey(V value) {
        HashMapEntry<K, V> entry = entryForNullKey;
        if (entry == null) {//no old value exists
            addNewEntryForNullKey(value);
            size++;//total number of stored entries
            modCount++;//modification count
            return null;
        } else {//an old value exists
            preModify(entry);//pre-modification hook
            V oldValue = entry.value;
            entry.value = value;
            return oldValue;
        }
    }

    void addNewEntryForNullKey(V value) {
        entryForNullKey = new HashMapEntry<K, V>(null, value, 0, null);
    }

    //Empty implementation; a hook for preprocessing before a modification
    void preModify(HashMapEntry<K, V> e) { }

Step two: updating an existing key

        HashMapEntry<K, V>[] tab = table;
        //derive the index from the hash
        int index = hash & (tab.length - 1);
        //walk the chain
        for (HashMapEntry<K, V> e = tab[index]; e != null; e = e.next) {
            if (e.hash == hash && key.equals(e.key)) {//the key is already in the HashMap when both hash and key match
                preModify(e);//preprocessing
                V oldValue = e.value;
                e.value = value;//update
                return oldValue;
            }
        }

Step three: saving a new key

Reaching this point means the key is not null and has not been stored before.

        modCount++;//increment the modification count
        if (size++ > threshold) {//size exceeds the threshold: double the capacity
            tab = doubleCapacity();
            index = hash & (tab.length - 1);
        }

        addNewEntry(key, value, hash, index);
        return null;
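
The interplay between size++ > threshold and the initial threshold of -1 is easier to follow with numbers. The sketch below simulates only the bookkeeping for distinct new keys; GrowthSketch and putNewKey are hypothetical names, no entries are stored, and the MAXIMUM_CAPACITY limit is ignored.

```java
// Simulates only the size/threshold bookkeeping of put for distinct new keys.
// GrowthSketch is a hypothetical name; no entries are actually stored.
final class GrowthSketch {
    static final int MINIMUM_CAPACITY = 4;

    int capacity = MINIMUM_CAPACITY >>> 1; // length of the shared empty table
    int threshold = -1;                    // as set by the no-argument constructor
    int size = 0;

    // Mirrors "if (size++ > threshold) doubleCapacity()" plus makeTable's threshold update.
    void putNewKey() {
        if (size++ > threshold) {
            capacity *= 2;
            threshold = (capacity >> 1) + (capacity >> 2);
        }
    }
}
```

The first insertion sees 0 > -1, so the empty table of length 2 is replaced by one of length 4, which is exactly MINIMUM_CAPACITY. Because the comparison uses the size before the increment, that table doubles again on the fifth insertion (4 > 3), not the fourth.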

Growing the table

    private HashMapEntry<K, V>[] doubleCapacity() {
        HashMapEntry<K, V>[] oldTable = table;
        int oldCapacity = oldTable.length;
        //already at the maximum, so it cannot grow
        if (oldCapacity == MAXIMUM_CAPACITY) {
            return oldTable;
        }
        //double the capacity
        int newCapacity = oldCapacity * 2;
        HashMapEntry<K, V>[] newTable = makeTable(newCapacity);
        //no data in the table: return directly
        if (size == 0) {
            return newTable;
        }
        //the table holds data, which must be moved to the new table
        for (int j = 0; j < oldCapacity; j++) {
            //code shown below
        }
        return newTable;
    }

Transferring data to the new table

Question: when the chain is redistributed and an entry is placed under its new index, the source code does not seem to set the next pointer of that entry's predecessor to null. Lookups by index still work, but the same object reference then appears to be stored in two places. I did not understand this part; advice is welcome.

        for (int j = 0; j < oldCapacity; j++) {
            /*
             * Rehash the bucket using the minimum number of field writes.
             * This is the most subtle and delicate code in the class.
             */
            HashMapEntry<K, V> e = oldTable[j];
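            if (e == null) {//skip empty slots
                continue;
            }
            int highBit = e.hash & oldCapacity;//the bit that decides between index j and j + oldCapacity
            HashMapEntry<K, V> broken = null;
            newTable[j | highBit] = e;//store under the new index
            //redistribute the chain (the part in question)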
            for (HashMapEntry<K, V> n = e.next; n != null; e = n, n = n.next) {
                int nextHighBit = n.hash & oldCapacity;
                if (nextHighBit != highBit) {
                    if (broken == null)
                        newTable[j | nextHighBit] = n;
                    else
                        broken.next = n;
                    broken = e;
                    highBit = nextHighBit;
                }
            }
            if (broken != null)
                broken.next = null;
        }
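
One way to approach the question raised before the loop is to run the same logic on a tiny chain. The sketch below copies the loop onto a hypothetical Node class that has only the fields the loop touches; SplitSketch, splitBucket, describe and demo are made-up names. On my reading, broken remembers the tail whose next pointer still points into the other chain: that pointer is either overwritten when the next run with the same bit begins (broken.next = n) or cleared at the very end (broken.next = null).

```java
// Runs the bucket-splitting loop on a hand-built chain to inspect the result.
// Node is a hypothetical stand-in for HashMapEntry with only hash and next.
final class SplitSketch {
    static final class Node {
        final int hash;
        Node next;

        Node(int hash, Node next) {
            this.hash = hash;
            this.next = next;
        }
    }

    // Splits the chain at oldTable[j] between newTable[j] and newTable[j + oldCapacity].
    static void splitBucket(Node[] oldTable, Node[] newTable, int j) {
        int oldCapacity = oldTable.length;
        Node e = oldTable[j];
        if (e == null) {
            return;
        }
        int highBit = e.hash & oldCapacity;
        Node broken = null; // tail whose next pointer is stale
        newTable[j | highBit] = e;
        for (Node n = e.next; n != null; e = n, n = n.next) {
            int nextHighBit = n.hash & oldCapacity;
            if (nextHighBit != highBit) {
                if (broken == null) {
                    newTable[j | nextHighBit] = n; // first entry of the other chain
                } else {
                    broken.next = n; // repair the stale pointer
                }
                broken = e;
                highBit = nextHighBit;
            }
        }
        if (broken != null) {
            broken.next = null; // terminate the chain that ended earlier
        }
    }

    // Renders a chain as "1->9" for easy comparison.
    static String describe(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append("->");
            }
            sb.append(n.hash);
        }
        return sb.toString();
    }

    // Old capacity 4, bucket 1 holds 1 -> 5 -> 9 -> 13; new capacity 8.
    static Node[] demo() {
        Node[] oldTable = new Node[4];
        oldTable[1] = new Node(1, new Node(5, new Node(9, new Node(13, null))));
        Node[] newTable = new Node[8];
        splitBucket(oldTable, newTable, 1);
        return newTable;
    }
}
```

In this run the hashes 1 and 9 stay at index 1 while 5 and 13 move to index 5, and both chains end in null, so no entry remains reachable from two buckets once the loop has finished.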

After that, the new entry is added to the table

    void addNewEntry(K key, V value, int hash, int index) {
        table[index] = new HashMapEntry<K, V>(key, value, hash, table[index]);
    }

GET

If the key is null, the value is read from entryForNullKey; otherwise the hash of the key gives the index, and the chain at that index is traversed to find the entry.

An entry in the chain matches when:
the key is the same reference
or
the entry's hash is the same and the key is equal

If nothing is found, null is returned.

    public V get(Object key) {
        if (key == null) {
            HashMapEntry<K, V> e = entryForNullKey;
            return e == null ? null : e.value;
        }

        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        for (HashMapEntry<K, V> e = tab[hash & (tab.length - 1)];
                e != null; e = e.next) {
            K eKey = e.key;
            if (eKey == key || (e.hash == hash && key.equals(eKey))) {
                return e.value;
            }
        }
        return null;
    }
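
The match condition used by get can be isolated into a small predicate. In the sketch below, KeyMatchSketch and matches are hypothetical names, and the raw hashCode() stands in for Collections.secondaryHash, which is an assumption made only to keep the example self-contained.

```java
// The match test from get, extracted into a standalone predicate.
// KeyMatchSketch is a hypothetical name; raw hash codes replace secondaryHash.
final class KeyMatchSketch {
    static boolean matches(Object key, int hash, Object entryKey, int entryHash) {
        // Same reference is the cheap path; otherwise compare hashes before calling equals.
        return entryKey == key || (entryHash == hash && key.equals(entryKey));
    }
}
```

The strings "Aa" and "BB" are a handy test case: they share the hash code 2112 but are not equal, so the hash comparison alone is not enough and equals has the final say.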

Remove

    @Override public V remove(Object key) {
        if (key == null) {//the null-key case
            return removeNullKey();
        }
        int hash = Collections.secondaryHash(key);
        HashMapEntry<K, V>[] tab = table;
        int index = hash & (tab.length - 1);
        for (HashMapEntry<K, V> e = tab[index], prev = null;
                e != null; prev = e, e = e.next) {//standard linked-list removal
            if (e.hash == hash && key.equals(e.key)) {
                if (prev == null) {
                    tab[index] = e.next;
                } else {
                    prev.next = e.next;
                }
                modCount++;
                size--;
                postRemove(e);
                return e.value;
            }
        }
        return null;
    }

    private V removeNullKey() {
        HashMapEntry<K, V> e = entryForNullKey;
        if (e == null) {
            return null;
        }
        entryForNullKey = null;
        modCount++;
        size--;
        postRemove(e);
        return e.value;
    }

    /**
     * Subclass overrides this method to unlink entry.
     */
    //Empty implementation; a hook called after removal
    void postRemove(HashMapEntry<K, V> e) { }
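
The unlinking step in remove is ordinary singly linked list removal with a trailing prev pointer. A minimal sketch follows; ChainRemoveSketch, Node, chain and keys are hypothetical names, and returning the new head stands in for the assignment tab[index] = e.next.

```java
// Removes a key from a singly linked chain using a trailing prev pointer.
// All names are hypothetical; returning the head replaces "tab[index] = e.next".
final class ChainRemoveSketch {
    static final class Node {
        final String key;
        Node next;

        Node(String key, Node next) {
            this.key = key;
            this.next = next;
        }
    }

    // Returns the head of the chain after removing the first node with this key.
    static Node remove(Node head, String key) {
        for (Node e = head, prev = null; e != null; prev = e, e = e.next) {
            if (key.equals(e.key)) {
                if (prev == null) {
                    return e.next;  // removed the head: the bucket now starts at its successor
                }
                prev.next = e.next; // bypass the removed node
                return head;
            }
        }
        return head; // key not present: chain unchanged
    }

    // Builds a chain in the given order.
    static Node chain(String... keys) {
        Node head = null;
        for (int i = keys.length - 1; i >= 0; i--) {
            head = new Node(keys[i], head);
        }
        return head;
    }

    // Renders the chain as "a,b,c".
    static String keys(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(n.key);
        }
        return sb.toString();
    }
}
```

Removing the head is the only case that changes the bucket slot itself; every other removal just rewires the predecessor's next pointer.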
