HashMap internals: resize mechanism, data structure differences between JDK versions, thread safety, and the infinite-loop problem

1. HashMap data structure

 JDK 1.7: array + singly linked list
 JDK 1.8: array + singly linked list + red-black tree

2. Put and resize process (JDK 1.7)

Put method:
1. Check whether the HashMap's backing Entry array has been allocated (whether it is still the empty table). If not, initialize it (the default capacity is 16).
2. Compute the hash of the key, then use the hash and the current array length to compute the bucket index for the key. If that bucket already holds entries (a hash collision) and one of them has an equal key, overwrite its value and return the old value. Otherwise add a new entry and return null.
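The overwrite behavior in step 2 can be observed through the public API. The following is a minimal sketch that uses only java.util.HashMap; the class name PutReturnDemo, the helper putTwice, and the sample keys are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; only java.util.HashMap itself is real API.
class PutReturnDemo {
    // Inserts the same key twice and returns what the second put() reports.
    static Integer putTwice(Map<String, Integer> map, String key) {
        map.put(key, 1);         // first insert: no previous mapping
        return map.put(key, 2);  // second insert: overwrites and returns the old value
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        System.out.println(map.put("a", 1));   // null, the key was absent
        System.out.println(map.put("a", 2));   // 1, the value that was replaced
        System.out.println(map.put(null, 9));  // null, a null key is allowed
        System.out.println(map.get(null));     // 9
    }
}
```

Note that a null return from put is ambiguous: it means either that the key was absent or that it was previously mapped to null.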

public V put(K key, V value) {
    // Check whether the backing Entry array has been allocated (is it still the empty table?)
    if (table == EMPTY_TABLE) {
      inflateTable(threshold); // if it is empty, initialize it
    }
    
    // Handle a null key
    if (key == null)
      return putForNullKey(value); // HashMap allows a null key
    
    // Compute the hash of the key
    int hash = hash(key);
    // Use the hash and the current array length to compute the bucket index for the key
    int i = indexFor(hash, table.length);
    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
      Object k;
      // If an entry in this bucket (a hash collision) has an equal key, overwrite its value and return the old value
      if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
        V oldValue = e.value;
        e.value = value;
        e.recordAccess(this);
        return oldValue;
      }
    }
 
    modCount++;
    // Actually store the new entry
    addEntry(hash, key, value, i);
    return null;
  }
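The listing calls indexFor(hash, table.length) without showing it. As far as I know, the JDK 1.7 implementation is a single bit mask, which works because the table length is always a power of two. The sketch below reproduces that idea under this assumption; IndexForDemo is a hypothetical class name.

```java
// Sketch of the index computation; IndexForDemo is a hypothetical class name.
class IndexForDemo {
    // For a power-of-two length, h & (length - 1) keeps the low bits of h,
    // which equals the non-negative remainder of h modulo length.
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(21, 16));  // 21 = 0b010101, low four bits 0101 -> 5
        System.out.println(indexFor(37, 16));  // 37 = 0b100101, low four bits 0101 -> 5 (collision)
        System.out.println(indexFor(-1, 16));  // all bits set -> 15, never negative
    }
}
```

Because only the low bits decide the bucket, two keys whose hashes differ only in higher bits collide, which is why put has to walk the chain and compare keys.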

In JDK 1.7, put triggers a resize only when both of the following conditions hold:
1. When the new value is stored, the number of existing elements is greater than or equal to the threshold.
2. When the new value is stored, it hits a hash collision, i.e. the bucket at the array index computed from the key's hash is already occupied.
The resize is triggered in the addEntry method:

void addEntry(int hash, K key, V value, int bucketIndex) {
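    // 1. Is the current element count greater than or equal to the threshold?
    // 2. Does this insertion hit a hash collision (is the target bucket occupied)?
    // If both conditions hold, resize
    if ((size >= threshold) && (null != table[bucketIndex])) {
      // Resize and move the elements of the old array into the new array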
      resize(2 * table.length);
      hash = (null != key) ? hash(key) : 0;
      bucketIndex = indexFor(hash, table.length);
    }
 
    createEntry(hash, key, value, bucketIndex);
  }
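The two-part condition in addEntry can be isolated into a small predicate. This is only an illustrative sketch: ResizeConditionDemo, shouldResize, and thresholdFor are hypothetical names, and the threshold is assumed to be capacity times load factor, as in the resize listing.

```java
// Hypothetical helper mirroring the condition in addEntry; not JDK code.
class ResizeConditionDemo {
    // True only when the map is at or above its threshold AND the target bucket is occupied.
    static boolean shouldResize(int size, int threshold, boolean bucketOccupied) {
        return size >= threshold && bucketOccupied;
    }

    // Threshold for a given capacity and load factor (e.g. 16 * 0.75f = 12).
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int threshold = thresholdFor(16, 0.75f);
        System.out.println(threshold);                           // 12
        System.out.println(shouldResize(12, threshold, false));  // false: the bucket is empty
        System.out.println(shouldResize(12, threshold, true));   // true: both conditions hold
        System.out.println(shouldResize(11, threshold, true));   // false: below the threshold
    }
}
```

One consequence of the second condition is that a 1.7 map can hold more entries than its threshold without resizing, as long as every new key lands in an empty bucket.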

During the resize, the existing entries are moved into the new array, and each entry's bucket index is recomputed against the new capacity (the hash itself is recalculated only when the rehash flag is set).

void resize(int newCapacity) {
    Entry[] oldTable = table;
    int oldCapacity = oldTable.length;
    // If the capacity has already reached the maximum, do not resize
    if (oldCapacity == MAXIMUM_CAPACITY) {
      threshold = Integer.MAX_VALUE;
      return;
    }
 
    Entry[] newTable = new Entry[newCapacity];
    // transfer() moves the entries of the old array into the new array
    transfer(newTable, initHashSeedAsNeeded(newCapacity));
    // Point the map at the new array
    table = newTable;
    // Set the new threshold for the resized map
    threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
  }
 

transfer() moves the elements of the old array into the new array during the actual resize:

void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
      while(null != e) {
        Entry<K,V> next = e.next;
        if (rehash) {
          e.hash = null == e.key ? 0 : hash(e.key);
        }
        // Compute the entry's bucket index from its hash and the size of the new array
        int i = indexFor(e.hash, newCapacity);
        e.next = newTable[i];
        newTable[i] = e;
        e = next;
      }
    }
  }

3. Resize issues

Resizing is the most expensive operation of a HashMap: every entry in the old array has its position recomputed and is moved into the new array. So if the number of elements is known in advance, setting an appropriate initial capacity avoids repeated resizes and effectively improves the performance of the HashMap.
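A minimal sketch of presizing, assuming the default load factor of 0.75. The helper names capacityFor and newPresizedMap are hypothetical, and the formula is a common convention rather than a JDK method.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; the sizing formula is a common convention, not a JDK method.
class PresizeDemo {
    // Smallest initial capacity such that `expected` entries stay at or below
    // capacity * 0.75, the default load factor.
    static int capacityFor(int expected) {
        return (int) Math.ceil(expected / 0.75);
    }

    static Map<Integer, String> newPresizedMap(int expected) {
        return new HashMap<>(capacityFor(expected));
    }

    public static void main(String[] args) {
        int expected = 1000;
        Map<Integer, String> map = newPresizedMap(expected);
        for (int i = 0; i < expected; i++) {
            map.put(i, "v" + i);  // no resize should be needed while filling
        }
        System.out.println(capacityFor(expected));  // 1334
        System.out.println(map.size());             // 1000
    }
}
```

HashMap rounds the requested capacity up to the next power of two, so the table actually allocated here is larger than 1334 slots.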
Resizing a HashMap can also cause a race condition under multi-threading. If two threads both find that the HashMap needs to be resized, they will try to resize it at the same time. In JDK 1.7, resizing reverses the order of the elements stored in each linked list, because an element moved to its new bucket is placed at the head of the list rather than at the tail, which avoids traversing to the tail. If the race occurs, a list can end up containing a cycle, and a later lookup in that bucket loops forever.
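The order reversal is easy to reproduce in isolation. The sketch below is a simplified single-bucket model, not JDK code: Node, transferChain, and render are hypothetical names that only mimic the shape of the 1.7 transfer loop, and every node is assumed to land in the same new bucket.

```java
// Simplified single-bucket model of head insertion; all names are hypothetical.
class HeadInsertDemo {
    static final class Node {
        final int key;
        Node next;
        Node(int key, Node next) { this.key = key; this.next = next; }
    }

    // Moves every node of `head` to the front of a new chain, one at a time.
    static Node transferChain(Node head) {
        Node newHead = null;
        Node e = head;
        while (e != null) {
            Node next = e.next;  // remember the rest of the old chain
            e.next = newHead;    // link the node in front of the new chain
            newHead = e;         // the node becomes the new head
            e = next;
        }
        return newHead;
    }

    static String render(Node head) {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) sb.append("->");
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Node chain = new Node(1, new Node(2, new Node(3, null)));
        System.out.println(render(chain));                 // 1->2->3
        System.out.println(render(transferChain(chain)));  // 3->2->1
    }
}
```

If two threads run this loop over the same chain without synchronization, one thread's saved next reference can point at a node the other thread has already relinked, which is how a cycle can form.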

4. Thread safety

HashMap is not thread-safe. Using a HashMap directly from multiple threads leads to unpredictable, hard-to-diagnose problems. There are several ways to use a map from multiple threads:
A. Wrap the HashMap externally and implement your own synchronization
B. Use Map m = Collections.synchronizedMap(new HashMap(...)); to synchronize access (the approach given in the official documentation, but not recommended: it is easy to get wrong when the map is structurally modified during iteration, because iteration has to be synchronized manually)
C. Use java.util.Hashtable, which is the least efficient (almost obsolete)
D. Use java.util.concurrent.ConcurrentHashMap, which is relatively safe and efficient (recommended)
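As an illustration of option D, here is a minimal sketch in which several threads update one shared counter through ConcurrentHashMap.merge. The class name, the key "hits", and the thread and iteration counts are arbitrary choices made for this example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo; the thread and iteration counts are arbitrary.
class ConcurrentCounterDemo {
    // Starts `threads` workers that each increment the same key `perThread` times.
    static int countWith(int threads, int perThread) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    counts.merge("hits", 1, Integer::sum);  // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();  // wait for every worker before reading the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counts.get("hits");
    }

    public static void main(String[] args) {
        System.out.println(countWith(4, 10_000));  // 40000
    }
}
```

The same loop written as get followed by put on a plain HashMap could lose updates, because the read and the write would not form one atomic step.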
One detail to note: the iterators returned by all of HashMap's collection views are fail-fast. If the map is structurally modified at any time after the iterator is created, in any way except through the iterator's own remove method, the iterator throws ConcurrentModificationException. Thus, in the face of concurrent modification, the iterator fails quickly and cleanly instead of risking arbitrary behavior later.
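The fail-fast behavior shows up even in a single thread. The sketch below contrasts removing through the map with removing through the iterator; FailFastDemo and its helper names are hypothetical. Keep in mind that the check is best-effort, so code should not rely on the exception for correctness.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class; only the java.util types are real API.
class FailFastDemo {
    static Map<String, Integer> sample() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        return map;
    }

    // Removes entries through the map while iterating; returns true if the iterator failed fast.
    static boolean removeDirectly(Map<String, Integer> map) {
        try {
            for (String key : map.keySet()) {
                map.remove(key);  // structural modification that bypasses the iterator
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes every entry through the iterator itself; returns the remaining size.
    static int removeViaIterator(Map<String, Integer> map) {
        Iterator<String> it = map.keySet().iterator();
        while (it.hasNext()) {
            it.next();
            it.remove();  // the one structural modification the iterator permits
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(removeDirectly(sample()));     // true
        System.out.println(removeViaIterator(sample()));  // 0
    }
}
```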

Version 1.8

When a bucket's linked list grows longer than 8 nodes (TREEIFY_THRESHOLD) and the array length is at least 64 (MIN_TREEIFY_CAPACITY), the linked list is converted into a red-black tree. If the array is shorter than 64, the map is resized instead.
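The treeify rule can be sketched as a small decision function. This is an illustration, not JDK code: the class, enum, and method names are hypothetical, and the two constants are assumed to match TREEIFY_THRESHOLD (8) and MIN_TREEIFY_CAPACITY (64) in the JDK 8 source, where a table shorter than 64 slots is resized rather than treeified.

```java
// Hypothetical decision helper; the two constants mirror the values of
// TREEIFY_THRESHOLD and MIN_TREEIFY_CAPACITY in the JDK 8 source.
class TreeifyDecisionDemo {
    static final int TREEIFY_THRESHOLD = 8;
    static final int MIN_TREEIFY_CAPACITY = 64;

    enum Action { KEEP_LIST, RESIZE, TREEIFY }

    // chainLength: nodes in the bucket after the insert; tableLength: array length.
    static Action afterInsert(int chainLength, int tableLength) {
        if (chainLength <= TREEIFY_THRESHOLD) {
            return Action.KEEP_LIST;  // the chain is still short enough
        }
        if (tableLength < MIN_TREEIFY_CAPACITY) {
            return Action.RESIZE;     // small table: spread the entries out instead
        }
        return Action.TREEIFY;        // long chain in a large table
    }

    public static void main(String[] args) {
        System.out.println(afterInsert(8, 64));  // KEEP_LIST
        System.out.println(afterInsert(9, 32));  // RESIZE
        System.out.println(afterInsert(9, 64));  // TREEIFY
    }
}
```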

Resize

Compared with 1.7, 1.8 mainly optimizes the calculation of the hash value and adds the conversion to a tree, which improves performance considerably.
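For the hash calculation, JDK 8 XORs the upper 16 bits of hashCode() into the lower 16 bits, so that high bits also influence the bucket index. The sketch below reproduces that one-liner; HashSpreadDemo and spread are hypothetical names, and the sample values are chosen so that they differ only above bit 15.

```java
// Sketch of the JDK 8 hash spreading step; the class and method names are hypothetical.
class HashSpreadDemo {
    // XOR the high 16 bits of the hash code into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    public static void main(String[] args) {
        // Two hash codes that differ only above bit 15 (an Integer's hashCode is its value).
        int a = 0x00010000;
        int b = 0x00020000;
        System.out.println(a & 15);          // 0
        System.out.println(b & 15);          // 0, same bucket without spreading
        System.out.println(spread(a) & 15);  // 1
        System.out.println(spread(b) & 15);  // 2
    }
}
```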

Source code

/**
     * Initializes or doubles table size.  If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *  @return the table
     *
     * Note: when a HashMap is instantiated, the requested capacity is
     * actually stored in the threshold field; HashMap has no capacity
     * field of its own.
     *
     * Purpose of this method: initialize or resize the table.
     */
    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        // New capacity and new resize threshold
        int newCap, newThr = 0;
        // oldTab != null implies oldCap > 0
        if (oldCap > 0) {
            // If oldCap >= MAXIMUM_CAPACITY (1 << 30), the table is already at maximum capacity; set the threshold to Integer.MAX_VALUE and return the old table unchanged
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
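            // If (current capacity * 2 < maximum capacity && current capacity >= default initial capacity (16));
            // the condition also assigns oldCap << 1 (i.e. oldCap * 2) to newCap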
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                // Reaching this branch means the map is being resized, not initialized
                // Assign oldThr << 1 (i.e. oldThr * 2) to newThr
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            // Reaching this branch means the map was created with a capacity argument: public HashMap(int initialCapacity) or public HashMap(int initialCapacity, float loadFactor)
            // Note: whatever initialCapacity is passed in, 'this.threshold = tableSizeFor(initialCapacity);' rounds it up to a power of two (2^n), which becomes the initial capacity (initial capacity == oldThr)
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
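            // Reaching this branch means the map was created with the no-argument constructor,
            // so newCap (new capacity) and newThr (new resize threshold) take their default values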
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
 
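            // There are two ways to reach this branch
            // Case 1: 'if (oldCap > 0)' was entered but neither of its two inner conditions held
            // Case 2: 'else if (oldThr > 0)' was entered

            // Analysis: reaching this branch means the map was created with a capacity argument. Case 1 is a resize with oldCap below 16; case 2 is the first put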
            float ft = (float)newCap * loadFactor;
            // Compute the new resize threshold
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
            Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        // If oldTab != null this is a resize; otherwise return newTab directly
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
 
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        // The bin is a red-black tree (the element is a TreeNode instance)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null; // "lo" list: nodes that stay at index j
                        Node<K,V> hiHead = null, hiTail = null; // "hi" list: nodes that move to index j + oldCap (current index + old capacity)
                        Node<K,V> next;
                        do {
                            next = e.next;
                           
                            if ((e.hash & oldCap) == 0) {
 
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }
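The lo/hi split relies on a property of power-of-two tables: after doubling, an entry either stays at index j or moves to j + oldCap, and the single bit (hash & oldCap) decides which. The sketch below checks that against a direct recomputation; SplitIndexDemo and newIndex are hypothetical names.

```java
// Sketch of the index rule used by the lo/hi split; names are hypothetical.
class SplitIndexDemo {
    // Index of `hash` after doubling a table of power-of-two size oldCap,
    // decided by the same (hash & oldCap) test as in resize().
    static int newIndex(int hash, int oldCap) {
        int j = hash & (oldCap - 1);                   // index in the old table
        return (hash & oldCap) == 0 ? j : j + oldCap;  // "lo" stays, "hi" moves
    }

    public static void main(String[] args) {
        int oldCap = 16;
        // All four hashes share old index 5; the bit with value 16 splits them.
        System.out.println(newIndex(5, oldCap));   // 5
        System.out.println(newIndex(21, oldCap));  // 21
        System.out.println(newIndex(37, oldCap));  // 5
        System.out.println(newIndex(53, oldCap));  // 21
    }
}
```

Since the result always equals hash & (2 * oldCap - 1), the resize never needs to recompute hashes, and appending to the tail of each list keeps the relative order of the nodes.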

Origin blog.csdn.net/qq_37980436/article/details/115002953