Analysis of the underlying source code of the container-HashMap (16)

  1. Under the hood, HashMap is built on a hash table. This is a very important data structure, and understanding it will help with many technologies you meet later.

  2. In data structures, data is stored using arrays and linked lists, each with its own characteristics:

    • Array: occupies contiguous memory, so addressing is easy and lookup by index is fast. Inserting and deleting elements, however, is inefficient.
    • Linked list: occupies non-contiguous memory, so addressing is hard and lookup is slow, but inserting and deleting elements is efficient.

    Can we combine the advantages of arrays and linked lists? Yes: with a hash table. In essence, a hash table is "array + linked list".
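
To make "array + linked list" concrete, here is a minimal sketch of a chained hash table. It is a teaching toy and not the JDK code: the class name TinyHashTable, the fixed length of 16 and the method names are all assumptions made for this example, and it never resizes.

```java
import java.util.Objects;

// Hypothetical teaching class, not the JDK implementation:
// a fixed-size array of buckets, each bucket a singly linked list.
class TinyHashTable<K, V> {
    private static final class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;

        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    @SuppressWarnings("unchecked")
    private final Node<K, V>[] table = (Node<K, V>[]) new Node[16];

    // Array part: the hash code selects a slot directly, which keeps lookup fast.
    private int indexFor(Object key) {
        return (table.length - 1) & Objects.hashCode(key);
    }

    // Returns the previous value for the key, or null if there was none.
    public V put(K key, V value) {
        int i = indexFor(key);
        for (Node<K, V> n = table[i]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                V old = n.value;
                n.value = value;
                return old;
            }
        }
        // Linked-list part: a colliding key is linked in without moving other entries.
        table[i] = new Node<>(key, value, table[i]);
        return null;
    }

    public V get(Object key) {
        for (Node<K, V> n = table[indexFor(key)]; n != null; n = n.next) {
            if (Objects.equals(n.key, key)) {
                return n.value;
            }
        }
        return null;
    }
}
```

The strings "Aa" and "BB" have the same hashCode() in Java, so putting both into this table places them in the same bucket, one linked behind the other, and get still finds each of them.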

  3. HashMap's inheritance structure and main member variables

    public class HashMap<K,V> extends AbstractMap<K,V>
        implements Map<K,V>, Cloneable, Serializable {

        private static final long serialVersionUID = 362498820763181265L;
         /**
         * The default initial capacity - MUST be a power of two.
         */
        static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16
        // Default length of the array: 1 << 4 = 1*2*2*2*2 = 2^4 = 16 (an int). The length is also used when a hash is mapped to an array index.
        /**
         * The maximum capacity, used if a higher value is implicitly specified
         * by either of the constructors with arguments.
         * MUST be a power of two <= 1<<30.
         */
        static final int MAXIMUM_CAPACITY = 1 << 30;
        // Maximum capacity of the array: 2 to the power of 30 (an int).
        /**
         * The load factor used when none specified in constructor.
         */
        static final float DEFAULT_LOAD_FACTOR = 0.75f;
        // Load factor for resizing: the array is expanded once it is 75% full. 16 * 0.75 = 12, so with 12 entries stored, adding the 13th triggers a resize.
        /**
         * The bin count threshold for using a tree rather than list for a
         * bin.  Bins are converted to trees when adding an element to a
         * bin with at least this many nodes. The value must be greater
         * than 2 and should be at least 8 to mesh with assumptions in
         * tree removal about conversion back to plain bins upon
         * shrinkage.
         */
        static final int TREEIFY_THRESHOLD = 8;
        // When the linked list in a bin reaches this threshold of 8 nodes, it is converted to a red-black tree.
    
        /**
         * The bin count threshold for untreeifying a (split) bin during a
         * resize operation. Should be less than TREEIFY_THRESHOLD, and at
         * most 6 to mesh with shrinkage detection under removal.
         */
        static final int UNTREEIFY_THRESHOLD = 6;
        // Threshold for converting a red-black tree back into a linked list.
        /**
         * The smallest table capacity for which bins may be treeified.
         * (Otherwise the table is resized if too many nodes in a bin.)
         * Should be at least 4 * TREEIFY_THRESHOLD to avoid conflicts
         * between resizing and treeification thresholds.
         */
        static final int MIN_TREEIFY_CAPACITY = 64;
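        // A linked list of 8 nodes is converted to a red-black tree only if the array length is at least 64; otherwise the table is resized instead.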
         /**
         * The number of key-value mappings contained in this map.
         */
        transient int size; // number of key-value pairs stored in the map
    
        /**
         * The table, initialized on first use, and resized as
         * necessary. When allocated, length is always a power of two.
         * (We also tolerate length zero in some operations to allow
         * bootstrapping mechanics that are currently not needed.)
         */
        transient Node<K,V>[] table; // this Node array is the table itself; each slot holds the head of a linked list or a red-black tree
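
The arithmetic in the comments above can be checked with a few lines of Java. This is only a sketch: the class HashMapDefaults and its two helper methods are invented for this example and do not exist in the JDK; only the two constant values are taken from the source shown above.

```java
// Illustrative helper, not part of the JDK: recomputes the defaults quoted above.
class HashMapDefaults {
    static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // 16
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // Entry count above which a table of the given capacity is resized.
    static int thresholdFor(int capacity) {
        return (int) (capacity * DEFAULT_LOAD_FACTOR);
    }

    // True when n is a positive power of two, as the table length must be.
    static boolean isPowerOfTwo(int n) {
        return n > 0 && (n & (n - 1)) == 0;
    }
}
```

HashMapDefaults.thresholdFor(16) evaluates to 12, which matches the comment on DEFAULT_LOAD_FACTOR: a default table holds 12 entries, and storing the 13th triggers a resize.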
    
  4. Node types stored in HashMap

    • Node type

      /**
           * Basic hash bin node, used for most entries.  (See below for
           * TreeNode subclass, and in LinkedHashMap for its Entry subclass.)
           */
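      // Node implements the Map.Entry<K,V> interface, so a Node can be used wherever a Map.Entry is expected.
      // This is what makes the third way of reading a map work: iterating entrySet() and calling
      // getKey()/getValue() on each entry. Those methods are actually implemented here, in Node.
          static class Node<K,V> implements Map.Entry<K,V> {
              final int hash;   // hash value of the key
              final K key;      // the key of the key-value pair; it cannot be modified
              V value;          // the value of the key-value pair; it can be modified
              Node<K,V> next;   // the next node; the list is singly linked, so a node records only its successor, never its predecessor, and the list runs from head to tail
      
              Node(int hash, K key, V value, Node<K,V> next) {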
                  this.hash = hash;
                  this.key = key;
                  this.value = value;
                  this.next = next;
              }
      
    • TreeNode inner class

       /**
           * Entry for Tree bins. Extends LinkedHashMap.Entry (which in turn
           * extends Node) so can be used as extension of either regular or
           * linked node.
           */
      // Defines the node type stored in a red-black tree
          static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> {
              TreeNode<K,V> parent;  // red-black tree links: parent of the current node
              TreeNode<K,V> left;    // left child of the current node
              TreeNode<K,V> right;   // right child of the current node
              TreeNode<K,V> prev;    // previous node; needed to unlink next upon deletion
              boolean red;           // node color: true for red, false for black
              TreeNode(int hash, K key, V val, Node<K,V> next) {
                  // the constructor passes these values to the superclass, which assigns the fields
                  super(hash, key, val, next);
              }
      
      
    • TreeNode<K,V> extends LinkedHashMap.Entry<K,V>, so let's look at that class next

       /**
           * HashMap.Node subclass for normal LinkedHashMap entries.
           */
      // Entry<K,V> in turn extends HashMap.Node<K,V>, the Node class we just analyzed
          static class Entry<K,V> extends HashMap.Node<K,V> {
              Entry<K,V> before, after;
              Entry(int hash, K key, V value, Node<K,V> next) {
                  super(hash, key, value, next);
              }
          }
      
    • LinkedHashMap.Entry in turn extends HashMap.Node, the Node class analyzed above, so TreeNode is a subclass of Node through Entry.

    • Recall the table member variable, whose element type is Node. Linked-list bins are made of Node objects, while red-black tree bins are made of TreeNode objects. These are two different types, yet the array itself is declared as an array of Node. Can a TreeNode be stored in it? Yes: TreeNode is a subclass of Node, so a TreeNode can be placed in a Node array.

      transient Node<K,V>[] table; // this Node array holds the head of every bin, whether it is a linked list or a red-black tree
      
    • TreeNode's inheritance chain: TreeNode extends LinkedHashMap.Entry, which extends HashMap.Node, which implements Map.Entry.
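
The point about storing a TreeNode in a Node array can be tried out with simplified stand-in classes. Everything in this sketch is an assumption made for illustration: BinDemo, describe and sampleTable are invented names, and the three nested classes copy only the inheritance chain of the real ones, not their full field lists or generics.

```java
// Simplified stand-ins for HashMap.Node, LinkedHashMap.Entry and HashMap.TreeNode.
// Only the inheritance chain matters here.
class BinDemo {
    static class Node {
        final String key;
        Node next;
        Node(String key) { this.key = key; }
    }

    static class Entry extends Node {
        Entry before, after;
        Entry(String key) { super(key); }
    }

    static final class TreeNode extends Entry {
        TreeNode parent, left, right;
        boolean red;
        TreeNode(String key) { super(key); }
    }

    // Mirrors the kind of check putVal makes with "p instanceof TreeNode".
    static String describe(Node head) {
        if (head == null) {
            return "empty";
        }
        return (head instanceof TreeNode) ? "tree bin" : "list bin";
    }

    static Node[] sampleTable() {
        Node[] table = new Node[4];
        table[0] = new Node("a");     // head of a plain linked list
        table[1] = new TreeNode("b"); // compiles because TreeNode is a Node
        return table;
    }
}
```

Because the array element type is the common superclass, the code that reads a slot has to ask at run time which kind of bin it found, which is why putVal contains an instanceof TreeNode test.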

  5. Array initialization. From JDK 1.8 on, initialization of the array is delayed until it is first needed. The initialization is done by the resize() method, which handles both the initial allocation of the array and its later expansion.

    • Start from HashMap's map.put() method: use Ctrl+Alt on put and select the HashMap implementation class of the Map interface to jump into the source code

          public V put(K key, V value) {
              return putVal(hash(key), key, value, false, true);
          }
      
    • Go straight to the putVal method and read only the parts that matter here (the ones I commented).

      /**
           * Implements Map.put and related methods
           *
           * @param hash hash for key
           * @param key the key
           * @param value the value to put
           * @param onlyIfAbsent if true, don't change existing value
           * @param evict if false, the table is in creation mode.
           * @return previous value, or null if none
           */
          final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                         boolean evict) {
              Node<K,V>[] tab; Node<K,V> p; int n, i; // local variables: tab is a Node<K,V>[], p is a Node<K,V>, n and i are ints
              if ((tab = table) == null || (n = tab.length) == 0)
                  // table, the Node array field defined earlier, is assigned to tab; table is still null at this point, so tab is null too
                  n = (tab = resize()).length;
              // resize() is called and its return value is assigned to tab, which is a Node array, so resize() must return a Node array
              if ((p = tab[i = (n - 1) & hash]) == null)
                  tab[i] = newNode(hash, key, value, null);
              else {
                  Node<K,V> e; K k;
                  if (p.hash == hash &&
                      ((k = p.key) == key || (key != null && key.equals(k))))
                      e = p;
                  else if (p instanceof TreeNode)
                      e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
                  else {
                      for (int binCount = 0; ; ++binCount) {
                          if ((e = p.next) == null) {
                              p.next = newNode(hash, key, value, null);
                              if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                                  treeifyBin(tab, hash);
                              break;
                          }
                          if (e.hash == hash &&
                              ((k = e.key) == key || (key != null && key.equals(k))))
                              break;
                          p = e;
                      }
                  }
                  if (e != null) { // existing mapping for key
                      V oldValue = e.value;
                      if (!onlyIfAbsent || oldValue == null)
                          e.value = value;
                      afterNodeAccess(e);
                      return oldValue;
                  }
              }
              ++modCount;
              if (++size > threshold)
                  resize();
              afterNodeInsertion(evict);
              return null;
          }
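
Two expressions in put and putVal decide which array slot a key lands in: hash(key) and (n - 1) & hash. The sketch below reproduces both. The spread method follows HashMap.hash(Object) as I recall it from the JDK 8 source, which is not shown in this article, so check it against your own JDK; the class name IndexDemo and both method names are invented for this example.

```java
class IndexDemo {
    // Assumed to match HashMap.hash(Object) in JDK 8:
    // XOR the high 16 bits of hashCode() into the low 16 bits.
    static int spread(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

    // Bucket index as computed in putVal, where n is the table length (a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }
}
```

Since n is a power of two, (n - 1) & hash gives the same result as Math.floorMod(hash, n) while avoiding a division, and it is never negative. The spreading step matters for keys that differ only in their high bits: the Integer keys 0 and 65536 would both map to index 0 in a table of length 16 without it, whereas indexFor(spread(65536), 16) gives 1.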
      
    • Next comes the resize() method. It is quite complex, so again read only the parts that matter here (the ones I commented): the parts that initialize the array. The same method also handles array expansion.

       /**
           * Initializes or doubles table size.  If null, allocates in
           * accord with initial capacity target held in field threshold.
           * Otherwise, because we are using power-of-two expansion, the
           * elements from each bin must either stay at same index, or move
           * with a power of two offset in the new table.
           *
           * @return the table
           */
          final Node<K,V>[] resize() {
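              Node<K,V>[] oldTab = table; // table is still null at this point, so oldTab is null as well
              int oldCap = (oldTab == null) ? 0 : oldTab.length;
              // oldTab is null, so the ternary yields 0 and oldCap = 0; the oldCap > 0 branch below can be skipped. With the no-argument constructor threshold is 0 as well, so only the final else needs to be read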
              int oldThr = threshold;
              int newCap, newThr = 0;
              if (oldCap > 0) {
                  if (oldCap >= MAXIMUM_CAPACITY) {
                      threshold = Integer.MAX_VALUE;
                      return oldTab;
                  }
                  else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                           oldCap >= DEFAULT_INITIAL_CAPACITY)
                      newThr = oldThr << 1; // double threshold
              }
              else if (oldThr > 0) // initial capacity was placed in threshold
                  newCap = oldThr;
              // this is the else branch to look at
              else {               // zero initial threshold signifies using defaults
                  newCap = DEFAULT_INITIAL_CAPACITY;
                  // DEFAULT_INITIAL_CAPACITY is the initial length of the array, 16, so newCap = 16
                  newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
                  // DEFAULT_LOAD_FACTOR is the load factor and DEFAULT_INITIAL_CAPACITY the default array length: 0.75 * 16 = 12, so newThr = 12
              }
              if (newThr == 0) {
                  float ft = (float)newCap * loadFactor;
                  newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                            (int)ft : Integer.MAX_VALUE);
              }
              threshold = newThr; // newThr is 12, so threshold = 12
              @SuppressWarnings({"rawtypes","unchecked"})
              Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
              // newCap is 16, so this creates a Node array of length 16 and assigns it to newTab
              table = newTab; // newTab is then assigned to table as well
              // two references now point to the same array object: table and newTab
              if (oldTab != null) {
                  for (int j = 0; j < oldCap; ++j) {
                      Node<K,V> e;
                      if ((e = oldTab[j]) != null) {
                          oldTab[j] = null;
                          if (e.next == null)
                              newTab[e.hash & (newCap - 1)] = e;
                          else if (e instanceof TreeNode)
                              ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                          else { // preserve order
                              Node<K,V> loHead = null, loTail = null;
                              Node<K,V> hiHead = null, hiTail = null;
                              Node<K,V> next;
                              do {
                                  next = e.next;
                                  if ((e.hash & oldCap) == 0) {
                                      if (loTail == null)
                                          loHead = e;
                                      else
                                          loTail.next = e;
                                      loTail = e;
                                  }
                                  else {
                                      if (hiTail == null)
                                          hiHead = e;
                                      else
                                          hiTail.next = e;
                                      hiTail = e;
                                  }
                              } while ((e = next) != null);
                              if (loTail != null) {
                                  loTail.next = null;
                                  newTab[j] = loHead;
                              }
                              if (hiTail != null) {
                                  hiTail.next = null;
                                  newTab[j + oldCap] = hiHead;
                              }
                          }
                      }
                  }
              }
              return newTab; // finally newTab is returned, which completes the initialization of the array
          }
      
    • Finally, resize() returns newTab, a Node array of length 16. Back in putVal the returned array is assigned to tab. tab was null at the start; after initialization it refers to a Node array of length 16.
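
The expansion half of resize(), which the walkthrough above skips, rests on one observation stated in its Javadoc: when the table doubles, each element either stays at the same index or moves by exactly oldCap. The sketch below isolates that rule. SplitDemo and newIndex are names invented for this example; the test (hash & oldCap) == 0 is the one used in the do/while loop of resize().

```java
class SplitDemo {
    // New index of an entry after the table doubles from oldCap to 2 * oldCap,
    // decided by the same bit test as in resize(): (hash & oldCap) == 0.
    static int newIndex(int hash, int oldCap) {
        int oldIndex = hash & (oldCap - 1);
        return ((hash & oldCap) == 0) ? oldIndex : oldIndex + oldCap;
    }
}
```

With oldCap = 16, the hashes 5, 21 and 37 all sit at index 5 before the resize. Afterwards 5 and 37 stay at index 5 (the "lo" list) and 21 moves to index 5 + 16 = 21 (the "hi" list), the same result as computing hash & 31 from scratch.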

Origin blog.csdn.net/Xun_independent/article/details/114766251