Multi-threaded high-concurrency programming (10) -- ConcurrentHashMap source code analysis

1. Background

  The previous article talked about the source code analysis of HashMap, from which we can see the following problems:

  • The put/remove method of HashMap is not thread-safe. If you use synchronized to lock in a multi-threaded concurrent environment, it will lead to inefficiency;
  • Modifying (put/remove) operations while traversing iterative acquisitions will result in concurrent modification exceptions (ConcurrentModificationException);
  • Before JDK1.7, adding a put operation to HashMap will cause the linked list to be reversed, resulting in a linked list loop, resulting in an infinite get loop.
  • If multiple threads detect that the number of elements exceeds the array size * loadFactor at the same time, multiple threads will expand the array at the same time, recalculating the element position and copying data, but in the end only one thread will expand the array. For the table, that is to say, other threads will be lost, and the data put by the respective threads will also be lost;

  Based on the above problems, ConcurrentHashMap can be used to solve them. ConcurrentHashMap uses segment lock technology to solve the concurrent access efficiency, and the modification operation will not cause concurrent modification exceptions and other problems during traversal and iterative acquisition.

  2. Source code analysis

  1. Construction method:

    //最大容量大小
        private static final int MAXIMUM_CAPACITY = 1 << 30;
        //默认容量大小
        private static final int DEFAULT_CAPACITY = 16;
        /**
         *控制标识符,用来控制table的初始化和扩容的操作,不同的值有不同的含义
         *  多线程之间,以volatile方式读取sizeCtl属性,来判断ConcurrentHashMap当前所处的状态。
         *  通过cas设置sizeCtl属性,告知其他线程ConcurrentHashMap的状态变更
         *未初始化:
         *  sizeCtl=0:表示没有指定初始容量。
         *  sizeCtl>0:表示初始容量。
         *初始化中:
         *  sizeCtl=-1,标记作用,告知其他线程,正在初始化
         *正常状态:
         *  sizeCtl=0.75n ,扩容阈值
         *扩容中:
         *  sizeCtl < 0 : 表示有其他线程正在执行扩容
         *  sizeCtl = (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2 :表示此时只有一个线程在执行扩容
         */
        private transient volatile int sizeCtl;
        //并发级别
        private static final int DEFAULT_CONCURRENCY_LEVEL = 16;
        //创建一个新的空map,默认大小是16
        public ConcurrentHashMap() {
        }
        public ConcurrentHashMap(int initialCapacity) {
            if (initialCapacity < 0)
                throw new IllegalArgumentException();
            //调整table的大小,tableSizeFor的实现查看前文HashMap源码分析的构造方法模块
            int cap = ((initialCapacity >= (MAXIMUM_CAPACITY >>> 1)) ?
                       MAXIMUM_CAPACITY :
                       tableSizeFor(initialCapacity + (initialCapacity >>> 1) + 1));
            this.sizeCtl = cap;
        }
        public ConcurrentHashMap(Map<? extends K, ? extends V> m) {
            this.sizeCtl = DEFAULT_CAPACITY;
            putAll(m);
        }
        public ConcurrentHashMap(int initialCapacity, float loadFactor) {
            this(initialCapacity, loadFactor, 1);
        }
        /**
         * concurrencyLevel:并发度,预估同时操作数据的线程数量
         * 表示能够同时更新ConccurentHashMap且不产生锁竞争的最大线程数。
         * 默认值为16,(即允许16个线程并发可能不会产生竞争)。
         */
        public ConcurrentHashMap(int initialCapacity,
                                 float loadFactor, int concurrencyLevel) {
            if (!(loadFactor > 0.0f) || initialCapacity < 0 || concurrencyLevel <= 0)
                throw new IllegalArgumentException();
            //至少使用同样多的桶容纳同样多的更新线程来操作元素
            if (initialCapacity < concurrencyLevel)   // Use at least as many bins
                initialCapacity = concurrencyLevel;   // as estimated threads
            long size = (long)(1.0 + (long)initialCapacity / loadFactor);
            int cap = (size >= (long)MAXIMUM_CAPACITY) ?
                MAXIMUM_CAPACITY : tableSizeFor((int)size);
            this.sizeCtl = cap;
        }
    复制代码

  2. put:

    public V put(K key, V value) {
            return putVal(key, value, false);
        }
        static final int HASH_BITS = 0x7fffffff; // usable bits of normal node hash普通节点哈希的可用位
        //把位数控制在int最大整数之内,h ^ (h >>> 16)的含义查看前文的put源码解析
        static final int spread(int h) {
            return (h ^ (h >>> 16)) & HASH_BITS;
        }
        final V putVal(K key, V value, boolean onlyIfAbsent) {
            //key和value为空抛出异常
            if (key == null || value == null) throw new NullPointerException();
            //得到hash值
            int hash = spread(key.hashCode());
            int binCount = 0;
            //自旋对table进行遍历
            for (Node<K,V>[] tab = table;;) {
                Node<K,V> f; int n, i, fh;
                //初始化table
                if (tab == null || (n = tab.length) == 0)
                    tab = initTable();
                //如果hash计算出的槽位元素为null,CAS将元素填充进当前槽位并结束遍历
                else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
                    if (casTabAt(tab, i, null,
                                 new Node<K,V>(hash, key, value, null)))
                        break;                   // no lock when adding to empty bin
                }
                //hash为-1,说明正在扩容,那么就帮助其扩容。以加快速度
                else if ((fh = f.hash) == MOVED)
                    tab = helpTransfer(tab, f);
                else {
                    V oldVal = null;
                    synchronized (f) {// 同步 f 节点,防止增加链表的时候导致链表成环
                        if (tabAt(tab, i) == f) {// 如果对应的下标位置的节点没有改变
                            if (fh >= 0) {//f节点的hash值大于0
                                binCount = 1;//链表初始长度
                                // 死循环,直到将值添加到链表尾部,并计算链表的长度
                                for (Node<K,V> e = f;; ++binCount) {
                                    K ek;
                                    //hash和key相同,值进行覆盖
                                    if (e.hash == hash &&
                                        ((ek = e.key) == key ||
                                         (ek != null && key.equals(ek)))) {
                                        oldVal = e.val;
                                        if (!onlyIfAbsent)
                                            e.val = value;
                                        break;
                                    }
                                    Node<K,V> pred = e;
                                    //hash和key不同,添加到链表后面
                                    if ((e = e.next) == null) {
                                        pred.next = new Node<K,V>(hash, key,
                                                                  value, null);
                                        break;
                                    }
                                }
                            }
                            //是树节点,添加到树中
                            else if (f instanceof TreeBin) {
                                Node<K,V> p;
                                binCount = 2;
                                if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                               value)) != null) {
                                    oldVal = p.val;
                                    if (!onlyIfAbsent)
                                        p.val = value;
                                }
                            }
                        }
                    }
                     //如果节点添加到链表和树中
                    if (binCount != 0) {
                        //链表长度大于等于8时,将链表转换成红黑树树
                        if (binCount >= TREEIFY_THRESHOLD)
                            treeifyBin(tab, i);
                        if (oldVal != null)
                            return oldVal;
                        break;
                    }
                }
            }
            // 判断是否需要扩容
            addCount(1L, binCount);
            return null;
        }
    复制代码

    1. initTable: initialization

      private final Node<K,V>[] initTable() {
              Node<K,V>[] tab; int sc;
              while ((tab = table) == null || tab.length == 0) {
                  //如果一个线程发现sizeCtl<0,意味着另外的线程执行CAS操作成功,当前线程只需要让出cpu时间片,即保证只有一个线程初始化
                  //由于sizeCtl是volatile的,保证了顺序性和可见性
                  if ((sc = sizeCtl) < 0)
                      Thread.yield(); // lost initialization race; just spin
                  else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {//cas操作判断并置为-1
                      try {
                          if ((tab = table) == null || tab.length == 0) {
                              int n = (sc > 0) ? sc : DEFAULT_CAPACITY;//若没有参数则默认容量为16
                              @SuppressWarnings("unchecked")
                              Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];//创建数组
                              table = tab = nt;//数组赋值给当前ConcurrentHashMap
                              //计算下一次元素到达扩容的阀值,如果n为16的话,那么这里 sc = 12,其实就是 0.75 * n
                              sc = n - (n >>> 2);
                          }
                      } finally {
                          sizeCtl = sc;
                      }
                      break;
                  }
              }
              return tab;
          }
      复制代码

    2. tabAt: Find the data at position i of the specified array in memory

      static final <K,V> Node<K,V> tabAt(Node<K,V>[] tab, int i) {
              /**getObjectVolatile:获取obj对象中offset偏移地址对应的object型field的值,支持volatile load语义。
               * 数组的寻址计算方式:a[i]_address = base_address + i * data_type_size
               * base_address:起始地址;i:索引;data_type_size:数据类型长度大小
               */
              return (Node<K,V>)U.getObjectVolatile(tab, ((long)i << ASHIFT) + ABASE);
          }
      复制代码

    3. helpTransfer: help expansion

      private static int RESIZE_STAMP_BITS = 16;
          /**
           * numberOfLeadingZeros()的具体算法逻辑请参考:https://www.jianshu.com/p/2c1be41f6e59
           * numberOfLeadingZeros(n)返回的是n的二进制标识的从高位开始到第一个非0的数字的之间0的个数,比如numberOfLeadingZeros(8)返回的就是28 ,因为0000 0000 0000 0000 0000 0000 0000 1000在1前面有28个0
           * RESIZE_STAMP_BITS 的值是16,1 << (RESIZE_STAMP_BITS - 1)就是将1左移位15位,0000 0000 0000 0000 1000 0000 0000 0000
           * 然后将两个数字再按位或,将相当于 将移位后的 两个数相加。
           * 比如:
           * 8的二进制表示是: 0000 0000 0000 0000 0000 0000 0000 1000 = 8
           * 7的二进制表示是: 0000 0000 0000 0000 0000 0000 0000 0111 = 7
           * 按位或的结果是:  0000 0000 0000 0000 0000 0000 0000 1111 = 15
           * 相当于 8 + 7 =15
           * 为什么会出现这种效果呢?因为8是2的整数次幂,也就是说8的二进制表示只会在某个高位上是1,其余地位都是0,所以在按位或的时候,低位表示的全是7的位值,所以出现了这种效果。
           */
          static final int resizeStamp(int n) {
              return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
          }
          final Node<K,V>[] helpTransfer(Node<K,V>[] tab, Node<K,V> f) {
              Node<K,V>[] nextTab; int sc;
               //如果table不是空,且node节点是转移类型,且node节点的nextTable(新 table)不是空,尝试帮助扩容
              if (tab != null && (f instanceof ForwardingNode) &&
                  (nextTab = ((ForwardingNode<K,V>)f).nextTable) != null) {
                  //根据length得到一个标识符号
                  int rs = resizeStamp(tab.length);
                   //如果nextTab没有被并发修改,且tab也没有被并发修改,且sizeCtl<0(说明还在扩容)
                  while (nextTab == nextTable && table == tab &&
                         (sc = sizeCtl) < 0) {
                      /**
                       * 如果 sizeCtl 无符号右移16不等于rs( sc前16位如果不等于标识符,则标识符变化了)
                       * 或者 sizeCtl == rs + 1(扩容结束了,不再有线程进行扩容)(默认第一个线程设置 sc ==rs 左移 16 位 + 2,当第一个线程结束扩容了,就会将 sc 减1。这个时候,sc 就等于 rs + 1)
                       * 或者 sizeCtl == rs + 65535  (如果达到最大帮助线程的数量,即 65535)
                       * 或者转移下标正在调整 (扩容结束)
                       * 结束循环,返回 table
                       * 【即如果还在扩容,判断标识符是否变化,判断扩容是否结束,判断是否达到最大线程数,判断扩容转移下标是否在调整(扩容结束),如果满足任意条件,结束循环。】
                       */
                      if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                          sc == rs + MAX_RESIZERS || transferIndex <= 0)
                          break;
                      // 如果以上都不是, 将 sizeCtl + 1, (表示增加了一个线程帮助其扩容)
                      if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1)) {
                          transfer(tab, nextTab);//进行扩容和数据迁移
                          break;
                      }
                  }
                  return nextTab;//返回扩容后的数组
              }
              return table;//没有扩容,返回原数组
          }
      复制代码

    4.  transfer: expansion and data migration, using multi-threaded expansion, the entire expansion process, set sizeCtl, transferIndex and other variables through cas to coordinate multiple threads to perform concurrent expansion;

      1.  transferIndex attribute:

        //扩容索引,表示已经分配给扩容线程的table数组索引位置。主要用来协调多个线程,并发安全地获取迁移任务(hash桶)。
        private transient volatile int transferIndex;
        复制代码
        1. Before expansion, transferIndex is at the far right of the array. At this time, a thread finds that the expansion threshold has been reached, and is ready to start expansion.

        2. To expand the thread, before migrating data, first shift the transferIndex to the left (modify transferIndex=transferIndex-stride (the number of hash buckets to be migrated) by cas  ) to obtain the migration task. Each expansion thread will set the transferIndex through for loop + CAS, so the concurrent security of multi-thread expansion can be ensured. (

          From another perspective, we can regard the table array to be migrated as a task queue, and transferIndex as the head pointer of the task queue. The expansion thread is the consumer of this queue. The process in which the expansion thread sets the transferIndex index through CAS is the process in which consumers obtain tasks from the task queue.

          )

      2.  Expansion process:

        1.  The capacity has reached the expansion threshold and needs to be expanded. At this time, transferindex=tab.length=32

        2. Expansion thread A modifies transferindex=31-16=16 in the form of cas, and then migrates the hash buckets in the interval table[31]--table[16] in descending order

        3. When migrating the hash bucket, the linked list or red-black tree in the bucket will be split into two parts according to a certain algorithm, and inserted into nextTable[i] and nextTable[i+n] (n is the length of the table array). The migrated hash bucket will be set as a ForwardingNode node, so as to inform other threads accessing this bucket that this node has been migrated

        4. At this time, thread 2 has accessed the ForwardingNode node. If thread 2 performs write operations such as put or remove, it will first help expand its capacity. If thread 2 executes read methods such as get, it will call the find method of ForwardingNode to find related elements in nextTable

        5. Thread 2 joins the expansion operation

        6. If you are going to join the expansion thread and find the following situations, give up the expansion and return directly.

          1. It is found that transferIndex=0, that is, all nodes have been allocated
          2. It is found that the expansion thread has reached the maximum number of expansion threads

                                                   

 

      1.  Source code analysis

        private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
            int n = tab.length, stride;
            //先判断CPU核数,如果是多核,将数组长度/8,再/核数,得到stride,否则stride=数组长度,如果stride<16,则stride=16
            //这里的目的是让每个CPU处理的桶一样多,避免出现转移任务不均匀的现象,如果桶较少的话,默认一个CPU(一个线程)处理16个桶,即确保每次至少获取16个桶(迁移任务)
            if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
                stride = MIN_TRANSFER_STRIDE; // subdivide range
            //未初始化进行初始化
            if (nextTab == null) {            // initiating
                try {
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];//扩容2倍
                    nextTab = nt;//更新
                } catch (Throwable ex) {      // try to cope with OOME
                    sizeCtl = Integer.MAX_VALUE;//扩容失败,sizeCtl使用int最大值。
                    return;
                }
                nextTable = nextTab;//更新成员变量
                //transferIndex默认=table.length
                transferIndex = n;
            }
            int nextn = nextTab.length;//新tab的长度
            //创建一个fwd节点,用于占位。当别的线程发现这个槽位中是fwd类型的节点,表示其他线程正在扩容,并且此节点已经扩容完毕,跳过这个节点。关联了nextTab,可以通过ForwardingNode.find()访问已经迁移到nextTab的数据。
            ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);
            //首次推进为 true,如果等于true,说明需要再次推进一个下标(i--),反之,如果是false,那么就不能推进下标,需要将当前的下标处理完毕才能继续推进
            boolean advance = true;
            //完成状态,如果是true,就结束此方法。
            boolean finishing = false; // to ensure sweep before committing nextTab
            //自旋,i表示当前线程可以处理的当前桶区间最大下标,bound表示当前线程可以处理的当前桶区间最小下标
            for (int i = 0, bound = 0;;) {
                Node<K,V> f; int fh;
                 //while:如果当前线程可以向后推进;这个循环就是控制i递减。同时,每个线程都会进入这里取得自己需要转移的桶的区间
                //分析场景:table.length=32,此时执行到这个地方nextTab.length=64 A,B线程同时进行扩容。
                //A,B线程同时执行到while循环中cas这段代码
                //A线程获第一时间抢到资源,设置bound=nextBound=16,i = nextIndex - 1=31 A线程搬运table[31]~table[16]中间16个元素
                //B线程再次回到while起点,然后在次获取到 bound = nextBound-0,i=nextIndex - 1=15,B线程搬运table[15]~table[0]中间16个元素
                //当transferIndex=0的时候,说明table里面所有搬运任务都已经完成,无法在分配任务。
                while (advance) {
                    int nextIndex, nextBound;
                    // 对i减1,判断是否大于等于bound(正常情况下,如果大于bound不成立,说明该线程上次领取的任务已经完成了。那么,需要在下面继续领取任务)
                    // 如果对i减1大于等于 bound,或者完成了,修改推进状态为 false,不能推进了。任务成功后修改推进状态为 true。
                    // 通常,第一次进入循环,i-- 这个判断会无法通过,从而走下面的nextIndex = transferIndex(获取最新的转移下标)。其余情况都是:如果可以推进,将i减1,然后修改成不可推进。如果i对应的桶处理成功了,改成可以推进。
                    if (--i >= bound || finishing)
                        advance = false;//这里设置false,是为了防止在没有成功处理一个桶的情况下却进行了推进
                   // 这里的目的是:1. 当一个线程进入时,会选取最新的转移下标。
                   //             2. 当一个线程处理完自己的区间时,如果还有剩余区间的没有别的线程处理,再次CAS获取区间。
                    else if ((nextIndex = transferIndex) <= 0) {
                        // 如果小于等于0,说明没有区间可以获取了,i改成-1,推进状态变成false,不再推进
                        // 这个-1会在下面的if块里判断,从而进入完成状态判断
                        i = -1;
                        advance = false;//这里设置false,是为了防止在没有成功处理一个桶的情况下却进行了推进
                    }
                    // CAS修改transferIndex,即 length - 区间值,留下剩余的区间值供后面的线程使用
                    else if (U.compareAndSwapInt
                             (this, TRANSFERINDEX, nextIndex,
                              nextBound = (nextIndex > stride ?
                                           nextIndex - stride : 0))) {
                        bound = nextBound;//这个值就是当前线程可以处理的最小当前区间最小下标
                        i = nextIndex - 1;//初次对i赋值,这个就是当前线程可以处理的当前区间的最大下标
                        advance = false;// 这里设置false,是为了防止在没有成功处理一个桶的情况下却进行了推进,这样导致漏掉某个桶。下面的 if(tabAt(tab, i) == f) 判断会出现这样的情况。
                    }
                }
                //i<0(不在 tab 下标内,按照上面的判断,领取最后一段区间的线程结束)
                if (i < 0 || i >= n || i + n >= nextn) {
                    int sc;
                    if (finishing) {// 如果完成了扩容和数据迁移
                        nextTable = null;//删除成员遍历
                        table = nextTab;//更新table
                        sizeCtl = (n << 1) - (n >>> 1);//更新阀值
                        return;//结束transfer
                    }
                    //如果没完成,尝试将sc -1. 表示这个线程结束帮助扩容了,将 sc 的低 16 位减一。
                    if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
                        //如果 sc - 2 不等于标识符左移 16 位。如果他们相等了,说明没有线程在帮助他们扩容了。也就是说,扩容结束了。
                        /**
                         *第一个扩容的线程,执行transfer方法之前(helpTransfer方法中),会设置 sizeCtl = (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2)
                         *后续帮其扩容的线程,执行transfer方法之前,会设置 sizeCtl = sizeCtl+1
                         *每一个退出transfer的方法的线程,退出之前,会设置 sizeCtl = sizeCtl-1
                         *那么最后一个线程退出时:
                         *必然有sc == (resizeStamp(n) << RESIZE_STAMP_SHIFT) + 2),即 (sc - 2) == resizeStamp(n) << RESIZE_STAMP_SHIFT
                        */
                        if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                            return;// 不相等,说明不到最后一个线程,直接退出transfer方法
                        finishing = advance = true;// 如果相等,扩容结束了,更新 finising 变量
                        i = n; // recheck before commit,最后退出的线程要重新check下是否全部迁移完毕
                    }
                }
                else if ((f = tabAt(tab, i)) == null) // 获取老tab的i下标位置的变量,如果是 null,就使用 fwd 占位。
                    advance = casTabAt(tab, i, null, fwd);// 如果成功写入 fwd 占位,再次推进一个下标
                else if ((fh = f.hash) == MOVED)// 如果不是 null 且 hash 值是 MOVED。
                    advance = true; // already processed,说明别的线程已经处理过了,再次推进一个下标
                else {// 到这里,说明这个位置有实际值了,且不是占位符。对这个节点上锁。为什么上锁,防止 putVal 的时候向链表插入数据
                    synchronized (f) {
                        // 判断 i 下标处的桶节点是否和 f 相同
                        if (tabAt(tab, i) == f) {
                            Node<K,V> ln, hn;// low, height 高位桶,低位桶
                            // 如果 f 的 hash 值大于 0 。TreeBin 的 hash 是 -2
                            if (fh >= 0) {
                                // 对老长度进行与运算(第一个操作数的的第n位于第二个操作数的第n位如果都是1,那么结果的第n为也为1,否则为0)
                                // 由于 Map 的长度都是 2 的次方(000001000 这类的数字),那么取于 length 只有 2 种结果,一种是 0,一种是1
                                //  如果是结果是0 ,Doug Lea 将其放在低位,反之放在高位,目的是将链表重新 hash,放到对应的位置上,让新的取于算法能够击中他。
                                int runBit = fh & n;
                                Node<K,V> lastRun = f; // 尾节点,且和头节点的 hash 值取于不相等
                                // 遍历这个桶
                                for (Node<K,V> p = f.next; p != null; p = p.next) {
                                    // 取于桶中每个节点的 hash 值
                                    int b = p.hash & n;
                                    // 如果节点的 hash 值和首节点的 hash 值取于结果不同
                                    if (b != runBit) {
                                        runBit = b; // 更新 runBit,用于下面判断 lastRun 该赋值给 ln 还是 hn。
                                        lastRun = p; // 这个 lastRun 保证后面的节点与自己的取于值相同,避免后面没有必要的循环
                                    }
                                }
                                if (runBit == 0) {// 如果最后更新的 runBit 是 0 ,设置低位节点
                                    ln = lastRun;
                                    hn = null;
                                }
                                else {
                                    hn = lastRun; // 如果最后更新的 runBit 是 1, 设置高位节点
                                    ln = null;
                                }// 再次循环,生成两个链表,lastRun 作为停止条件,这样就是避免无谓的循环(lastRun 后面都是相同的取于结果)
                                for (Node<K,V> p = f; p != lastRun; p = p.next) {
                                    int ph = p.hash; K pk = p.key; V pv = p.val;
                                    // 如果与运算结果是 0,那么就还在低位
                                    if ((ph & n) == 0) // 如果是0 ,那么创建低位节点
                                        ln = new Node<K,V>(ph, pk, pv, ln);
                                    else // 1 则创建高位
                                        hn = new Node<K,V>(ph, pk, pv, hn);
                                }
                                // 其实这里类似 hashMap
                                // 设置低位链表放在新数组的 i
                                setTabAt(nextTab, i, ln);
                                // 设置高位链表,在原有长度上加 n
                                setTabAt(nextTab, i + n, hn);
                                // 将旧的链表设置成占位符,表示处理过了
                                setTabAt(tab, i, fwd);
                                // 继续向后推进
                                advance = true;
                            }// 如果是红黑树
                            else if (f instanceof TreeBin) {
                                TreeBin<K,V> t = (TreeBin<K,V>)f;
                                TreeNode<K,V> lo = null, loTail = null;
                                TreeNode<K,V> hi = null, hiTail = null;
                                int lc = 0, hc = 0;
                                // 遍历
                                for (Node<K,V> e = t.first; e != null; e = e.next) {
                                    int h = e.hash;
                                    TreeNode<K,V> p = new TreeNode<K,V>
                                        (h, e.key, e.val, null, null);
                                    // 和链表相同的判断,与运算 == 0 的放在低位
                                    if ((h & n) == 0) {
                                        if ((p.prev = loTail) == null)
                                            lo = p;
                                        else
                                            loTail.next = p;
                                        loTail = p;
                                        ++lc;
                                    } // 不是 0 的放在高位
                                    else {
                                        if ((p.prev = hiTail) == null)
                                            hi = p;
                                        else
                                            hiTail.next = p;
                                        hiTail = p;
                                        ++hc;
                                    }
                                }
                                // 如果树的节点数小于等于 6,那么转成链表,反之,创建一个新的树
                                ln = (lc <= UNTREEIFY_THRESHOLD) ? untreeify(lo) :
                                    (hc != 0) ? new TreeBin<K,V>(lo) : t;
                                hn = (hc <= UNTREEIFY_THRESHOLD) ? untreeify(hi) :
                                    (lc != 0) ? new TreeBin<K,V>(hi) : t;
                                // 低位树
                                setTabAt(nextTab, i, ln);
                                // 高位数
                                setTabAt(nextTab, i + n, hn);
                                // 旧的设置成占位符
                                setTabAt(tab, i, fwd);
                                // 继续向后推进
                                advance = true;
                            }
                        }
                    }
                }
            }
        }
        复制代码

    1. addCount: count

      // 从 putVal 传入的参数是x=1,check=binCount默认是0,只有hash冲突了才会大于1,且他的大小是链表的长度(如果不是红黑树结构的话,红黑树=2)。
          private final void addCount(long x, int check) {
              CounterCell[] as; long b, s;
               //如果计数盒子不是空或者修改 baseCount 失败
              if ((as = counterCells) != null ||
                  !U.compareAndSwapLong(this, BASECOUNT, b = baseCount, s = b + x)) {
                  CounterCell a; long v; int m;
                  boolean uncontended = true;
                   // 如果计数盒子是空(尚未出现并发)
                   // 如果随机取余一个数组位置为空 或者
                   // 修改这个槽位的变量失败(出现并发了)
                   // 执行 fullAddCount 方法,在fullAddCount自旋直到CAS操作成功才结束退出
                  if (as == null || (m = as.length - 1) < 0 ||
                      (a = as[ThreadLocalRandom.getProbe() & m]) == null ||
                      !(uncontended =
                        U.compareAndSwapLong(a, CELLVALUE, v = a.value, v + x))) {
                      fullAddCount(x, uncontended);
                      return;
                  }
                  if (check <= 1)
                      return;
                  s = sumCount();
              }
              // 检查是否需要扩容,在 putVal 方法调用时,默认就是要检查的(check默认是0,链表是链表长度,红黑树是2),如果是值覆盖了,就忽略
              if (check >= 0) {
                  Node<K,V>[] tab, nt; int n, sc;
                  // 如果map.size() 大于 sizeCtl(达到扩容阈值需要扩容) 且
                  // table 不是空;且 table 的长度小于 1 << 30。(可以扩容)
                  while (s >= (long)(sc = sizeCtl) && (tab = table) != null &&
                         (n = tab.length) < MAXIMUM_CAPACITY) {
                      // 根据 length 得到一个标识
                      int rs = resizeStamp(n);
                      if (sc < 0) {//表明此时有别的线程正在进行扩容
                          // 如果 sc 的低 16 位不等于 标识符(校验异常 sizeCtl 变化了)
                          // 如果 sc == 标识符 + 1 (扩容结束了,不再有线程进行扩容)(默认第一个线程设置 sc ==rs 左移 16 位 + 2,当第一个线程结束扩容了,就会将 sc 减一。这个时候,sc 就等于 rs + 1)
                          // 如果 sc == 标识符 + 65535(帮助线程数已经达到最大)
                          // 如果 nextTable == null(结束扩容了)
                          // 如果 transferIndex <= 0 (转移状态变化了)
                          // 结束循环
                          if ((sc >>> RESIZE_STAMP_SHIFT) != rs || sc == rs + 1 ||
                              sc == rs + MAX_RESIZERS || (nt = nextTable) == null ||
                              transferIndex <= 0)
                              break;
                          // 不满足前面5个条件时,尝试参与此次扩容,把正在执行transfer任务的线程数加1,+2代表有1个,+1代表有0个,表示多了一个线程在帮助扩容,执行transfer
                          if (U.compareAndSwapInt(this, SIZECTL, sc, sc + 1))
                              transfer(tab, nt);
                      }
                      //如果不在扩容,将 sc 更新:标识符左移 16 位 然后 + 2. 也就是变成一个负数。高 16 位是标识符,低 16 位初始是 2.
                      //试着让自己成为第一个执行transfer任务的线程
                      else if (U.compareAndSwapInt(this, SIZECTL, sc,
                                                   (rs << RESIZE_STAMP_SHIFT) + 2))
                          transfer(tab, null);
                      s = sumCount();// 重新计数,判断是否需要开启下一轮扩容
                  }
              }
          }
      复制代码

  1. get:

    public V get(Object key) {
            Node<K,V>[] tab; Node<K,V> e, p; int n, eh; K ek;
            //得到hash
            int h = spread(key.hashCode());
            //table有值,且查找到的槽位有值(tabAt方法通过valatile读)
            if ((tab = table) != null && (n = tab.length) > 0 &&
                (e = tabAt(tab, (n - 1) & h)) != null) {
                //hash、key、value都相同返回当前查找到节点的值
                if ((eh = e.hash) == h) {
                    if ((ek = e.key) == key || (ek != null && key.equals(ek)))
                        return e.val;
                }
                //遍历特殊节点:红黑树、已经迁移的节点(ForwardingNode)等
                else if (eh < 0)
                    return (p = e.find(h, key)) != null ? p.val : null;
                //遍历node链表(e.next也是valitle变量)
                while ((e = e.next) != null) {
                    if (e.hash == h &&
                        ((ek = e.key) == key || (ek != null && key.equals(ek))))
                        return e.val;
                }
            }
            return null;
        }
        Node<K,V> find(int h, Object k) {
            Node<K,V> e = this;
            if (k != null) {
                do {
                    K ek;
                    if (e.hash == h &&
                        ((ek = e.key) == k || (ek != null && k.equals(ek))))
                        return e;
                } while ((e = e.next) != null);
            }
            return null;
        }
    复制代码

  2. remove:

    public V remove(Object key) {
            return replaceNode(key, null, null);
        }
        //通过volatile设置第i个节点的值
        static final <K,V> void setTabAt(Node<K,V>[] tab, int i, Node<K,V> v) {
            U.putObjectVolatile(tab, ((long)i << ASHIFT) + ABASE, v);
        }
        final V replaceNode(Object key, V value, Object cv) {
            int hash = spread(key.hashCode());
            //自旋
            for (Node<K,V>[] tab = table;;) {
                Node<K,V> f; int n, i, fh;
                //数组或查找的槽位为空,结束自旋返回null
                if (tab == null || (n = tab.length) == 0 ||
                    (f = tabAt(tab, i = (n - 1) & hash)) == null)
                    break;
                //正在扩容,帮助扩容
                else if ((fh = f.hash) == MOVED)
                    tab = helpTransfer(tab, f);
                else {
                    V oldVal = null;//返回的旧值
                    boolean validated = false;//是否进行删除链表或红黑树节点
                    synchronized (f) {//槽位加锁
                        //getObjectVolatile获取tab[i],如果此时tab[i]!=f,说明其他线程修改了tab[i]。回到for循环开始处,重新执行
                        if (tabAt(tab, i) == f) {//槽位节点没有变化
                            if (fh >= 0) {//槽位节点是链表
                                validated = true;
                                //遍历链表
                                for (Node<K,V> e = f, pred = null;;) {
                                    K ek;
                                    //hash、key、value相同
                                    if (e.hash == hash &&
                                        ((ek = e.key) == key ||
                                         (ek != null && key.equals(ek)))) {
                                        V ev = e.val;//临时节点缓存当前节点值
                                        //值相同
                                        if (cv == null || cv == ev ||
                                            (ev != null && cv.equals(ev))) {
                                            oldVal = ev;//给旧值赋值
                                            if (value != null)//值覆盖,replace()调用
                                                e.val = value;
                                            else if (pred != null)//有前节点,表示当前节点不是头节点
                                                pred.next = e.next;//删除当前节点
                                            else
                                                setTabAt(tab, i, e.next);//删除头节点,即更新当前槽位(数组槽位)节点为头节点的下一节点
                                        }
                                        break;
                                    }
                                    //当前节点不是目标节点,继续遍历下一个节点
                                    pred = e;
                                    //到达链表尾部,依旧没有找到,跳出循环
                                    if ((e = e.next) == null)
                                        break;
                                }
                            }
                            else if (f instanceof TreeBin) {//红黑树
                                validated = true;
                                TreeBin<K,V> t = (TreeBin<K,V>)f;
                                TreeNode<K,V> r, p;
                                //树有节点且查找的节点不为null
                                if ((r = t.root) != null &&
                                    (p = r.findTreeNode(hash, key, null)) != null) {
                                    V pv = p.val;
                                    //值相同
                                    if (cv == null || cv == pv ||
                                        (pv != null && cv.equals(pv))) {
                                        oldVal = pv;//给旧值赋值
                                        if (value != null)//值覆盖,replace()调用
                                            p.val = value;
                                        else if (t.removeTreeNode(p))//删除节点成功
                                            setTabAt(tab, i, untreeify(t.first));//更新当前槽位(数组槽位)节点为树的第一个节点
                                    }
                                }
                            }
                        }
                    }
                    if (validated) {
                        //如果删除了节点,更新size
                        if (oldVal != null) {
                            if (value == null)
                                addCount(-1L, -1);//数量-1
                            return oldVal;
                        }
                        break;
                    }
                }
            }
            return null;
        }
    复制代码

  3. Summary

  1.  put: Use cas to insert, if it is a linked list or a tree node, the lock synchronization operation will be added, which improves the performance

    1. No key or value is allowed to be null, otherwise an exception will be thrown;
    2. Initialize the table (initTable()) at the first put, and the initialization has concurrency control. Judging by the sizeCtl variable, sizeCtl<0 means that there is already a thread being initialized, and the current thread is not in progress, otherwise sizeCtl is set to -1 (CAS) and create array;
    3. When the slot node calculated by hash is null, use CAS to insert the element;
    4. When the hash is MOVED (-1), it helps to expand the capacity, but it may not help, because each thread defaults to 16 buckets. If there are only 16 buckets, the second thread cannot help expand the capacity;
    5. If the hash conflicts, synchronize the slot node. If the slot is a linked list structure, perform a linked list operation, overwrite the old value or insert it at the end of the linked list; if it is a tree structure, add it to the tree;
    6. Elements are added to the linked list or tree, if the length of the linked list is greater than 8, the linked list is converted into a red-black tree;
    7. Call addCount(), size+1, and determine whether addCount() needs to be expanded. If it is a value overwriting operation, there is no need to call this method;
  2. initTable: initialization

    1. The array table is initialized only when it is null, otherwise the old array is returned directly;
    2. If the current sizeCtl is less than 0, it means that a thread is being initialized, and the current thread is polite to the CPU to ensure that only one thread is initializing the array;
    3. If no thread is initializing, the current thread CAS sets sizeCtl to -1 and creates an array, and then recalculates the threshold;
  3. helpTransfer: help expansion

    1. When trying to insert an operation, it is found that the node is of the forward type, which will help expand the capacity;
    2. Every time a thread is added, the lower 16 bits of sizeCtl will be +1, and the upper 16 bits of the identifier will be checked at the same time;
    3. The maximum help thread for capacity expansion is 65535, which is the maximum limit of the lower 16 bits;
    4. Each thread allocates 16 buckets by default. If the number of buckets is 16, then the second thread cannot help expand the capacity, that is, other threads cannot enter the site to expand the capacity after the buckets are allocated;
  4. transfer: capacity expansion and data migration

    1. According to the number of CPU cores, the same number of buckets are evenly allocated to each CPU. If there are not enough 16 buckets, the default is 16;
    2. Expand capacity according to 2 times capacity;
    3. After each thread finishes processing the interval it has received, it can continue to receive. If there is still, the number of buckets can be controlled by decrementing the transferIndex variable by 16;
    4. Every time an empty bucket is processed, the current bucket will be marked as a forward node, and other threads of put will be told to say "I am expanding the capacity, come and help", but if there are only 16 buckets, only one thread can expand the capacity;
    5. If there is a placeholder MOVED, it means that it has been processed, skip this bucket, and continue to process other buckets;
    6. If there is a real actual value, then lock the head node synchronously to prevent the concurrency of putVal;
    7. In the synchronization block, the linked list is split into two parts, and whether it is 0 is obtained according to hash & length. If it is 0, it is placed in the low position of the new array, otherwise it is placed in the high position of length+i. This is to prevent the next value hash from finding the correct position;
    8. If the bucket type is a red-black tree, it will also be split into two, and then judge whether the size of the split bucket is less than or equal to 6, if it is converted into a linked list;
    9. If there is no optional interval and the task is not completed after the thread is processed, it will check the entire table to prevent omissions;
  5. addCount: expansion judgment

    1. When the insertion is over, it will size+1, and judge whether expansion is needed;
    2. Use the counting box first (if it is not empty, it means concurrent), if the counting box is empty, use the baseCount variable +1;
    3. If the modification of baseCount fails, use the counting box, and if the modification still fails, spin in fullAddCount() until the CAS operation succeeds;
    4. Check whether expansion is required;
    5. If the size is greater than or equal to sizeCtl and the length is less than 1<<30, it can be expanded;
    6. If it is already expanding, help it expand;
    7. If the capacity is not being expanded, enable the capacity expansion by itself, update the sizeCtl variable to a negative number, and assign the value to the high 16 bits of the identifier + 2;
  6. remove: remove the element

    1. The number of spin traversals. If the value of the slot node calculated by the array or hash is null, the spin will be terminated and null will be returned directly;
    2. If the slot node is being expanded, help expand the capacity;
    3. If the slot node has a value, lock it synchronously;
    4. If there is still no change in the slot node, determine whether it is a linked list structure type node or a tree structure type node, and find and delete the node or reset the head node by traversing the search element;
    5. If the node is deleted, update size-1, if there is an old value, return the old value, otherwise return null;

Get more learning materials: leave a message in the comment area or private message

Guess you like

Origin blog.csdn.net/weixin_45536242/article/details/125782172