Concurrency in practice: synchronized, AtomicLong, or LongAdder for counting under high concurrency?

Synchronized, AtomicLong, or LongAdder for counting?

Counting shows up in many systems, so which should we use for it: synchronized, AtomicLong, or LongAdder? Let's run an example.

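import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;

import org.junit.Test; // JUnit 4; with JUnit 5 use org.junit.jupiter.api.Test

public class CountTest {

    // Counter protected by the synchronized add() method
    private int count = 0;

    @Test
    public void startCompare() {
        compareDetail(1, 100 * 10000);
        compareDetail(20, 100 * 10000);
        compareDetail(30, 100 * 10000);
        compareDetail(40, 100 * 10000);
    }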

    /**
     * @param threadCount number of threads
     * @param times number of increments performed by each thread
     */
    public void compareDetail(int threadCount, int times) {
        try {
            System.out.println(String.format("threadCount: %s, times: %s", threadCount, times));
            long start = System.currentTimeMillis();
            testSynchronized(threadCount, times);
            System.out.println("testSynchronized cost: " + (System.currentTimeMillis() - start));

            start = System.currentTimeMillis();
            testAtomicLong(threadCount, times);
            System.out.println("testAtomicLong cost: " + (System.currentTimeMillis() - start));

            start = System.currentTimeMillis();
            testLongAdder(threadCount, times);
            System.out.println("testLongAdder cost: " + (System.currentTimeMillis() - start));
            System.out.println();
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public void testSynchronized(int threadCount, int times) throws InterruptedException {
        List<Thread> threadList = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            threadList.add(new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    add();
                }
            }));
        }
        for (Thread thread : threadList) {
            thread.start();
        }
        for (Thread thread : threadList) {
            thread.join();
        }
    }

    public synchronized void add() {
        count++;
    }

    public void testAtomicLong(int threadCount, int times) throws InterruptedException {
        AtomicLong count = new AtomicLong();
        List<Thread> threadList = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            threadList.add(new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    count.incrementAndGet();
                }
            }));
        }
        for (Thread thread : threadList) {
            thread.start();
        }
        for (Thread thread : threadList) {
            thread.join();
        }
    }

    public void testLongAdder(int threadCount, int times) throws InterruptedException {
        LongAdder count = new LongAdder();
        List<Thread> threadList = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            threadList.add(new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    count.increment();
                }
            }));
        }
        for (Thread thread : threadList) {
            thread.start();
        }
        for (Thread thread : threadList) {
            thread.join();
        }
    }
}

threadCount: 1, times: 1000000
testSynchronized cost: 187
testAtomicLong cost: 13
testLongAdder cost: 15

threadCount: 20, times: 1000000
testSynchronized cost: 829
testAtomicLong cost: 242
testLongAdder cost: 187

threadCount: 30, times: 1000000
testSynchronized cost: 232
testAtomicLong cost: 413
testLongAdder cost: 111

threadCount: 40, times: 1000000
testSynchronized cost: 314
testAtomicLong cost: 629
testLongAdder cost: 162

When concurrency is low, AtomicLong has the clearer advantage: it is built on an optimistic lock, so threads are never blocked and simply retry CAS (compare-and-swap) until it succeeds. When concurrency is high, synchronized comes out ahead of AtomicLong (see the 30- and 40-thread runs above), because a large number of threads keep retrying failed CAS operations, which burns CPU and lowers throughput.
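
The optimistic-lock behaviour described above can be sketched as a read/compare/retry loop. This is only an illustration built on AtomicLong.compareAndSet: the class name CasIncrementSketch and its methods are hypothetical, and the JDK's own incrementAndGet delegates to Unsafe rather than running this exact loop.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration class; not JDK code and not part of the benchmark.
class CasIncrementSketch {
    private final AtomicLong value = new AtomicLong();

    // Optimistic increment: read the value, then retry until compareAndSet wins.
    long increment() {
        for (;;) {
            long current = value.get();
            long next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
            // Another thread updated the value first; loop and retry.
        }
    }

    long get() {
        return value.get();
    }

    // Runs threadCount threads that each increment `times` times; returns the total.
    static long runConcurrently(int threadCount, int times) {
        CasIncrementSketch counter = new CasIncrementSketch();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    counter.increment();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runConcurrently(4, 100_000)); // prints 400000
    }
}
```

The more threads contend for the same value, the more often compareAndSet fails and the loop spins, which is the wasted CPU work the paragraph above refers to.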

LongAdder is competitive at every level of concurrency (with a single thread it is within 2 ms of AtomicLong), and the higher the concurrency, the larger its advantage.
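
For readers who have not used LongAdder, here is a minimal usage sketch. It uses only public java.util.concurrent.atomic.LongAdder methods; the class name LongAdderUsageSketch and the variable names are made up for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration class showing the basic LongAdder API.
class LongAdderUsageSketch {
    // Returns {sum before reset, value returned by sumThenReset, sum after reset}.
    static long[] demo() {
        LongAdder requests = new LongAdder();
        requests.increment();                   // +1
        requests.add(4);                        // +4
        requests.decrement();                   // -1
        long total = requests.sum();            // 1 + 4 - 1 = 4
        long drained = requests.sumThenReset(); // returns 4 and resets to 0
        long afterReset = requests.sum();       // 0
        return new long[] {total, drained, afterReset};
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 4 4 0
    }
}
```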

Alibaba's "Java Development Manual" makes a similar suggestion: on JDK 8, LongAdder is recommended over AtomicLong because it performs better (it reduces the number of optimistic-lock retries).
So how does LongAdder achieve high concurrency?

How does LongAdder achieve high concurrency?

The basic idea

The secret of LongAdder's high concurrency is trading space for time. A CAS on a single value becomes CAS operations spread across multiple values, and reading the count means adding those values together.
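
The space-for-time idea can be sketched with a fixed array of counters. This is a deliberately simplified stand-in, not LongAdder's implementation: the class StripedCounterSketch is hypothetical, it picks a stripe from the thread id instead of a per-thread probe, and it has no base field, no cache-line padding, and no dynamic expansion.

```java
import java.util.concurrent.atomic.AtomicLongArray;

// Hypothetical illustration class; LongAdder itself is more sophisticated.
class StripedCounterSketch {
    private final AtomicLongArray stripes;
    private final int mask;

    // stripeCount must be a power of two so that (id & mask) is a valid index.
    StripedCounterSketch(int stripeCount) {
        if (stripeCount <= 0 || (stripeCount & (stripeCount - 1)) != 0) {
            throw new IllegalArgumentException("stripeCount must be a power of two");
        }
        this.stripes = new AtomicLongArray(stripeCount);
        this.mask = stripeCount - 1;
    }

    // Each thread updates the stripe selected by its id, spreading contention.
    void increment() {
        int index = (int) Thread.currentThread().getId() & mask;
        stripes.incrementAndGet(index);
    }

    // Reading the count means adding all stripes together.
    long sum() {
        long total = 0;
        for (int i = 0; i < stripes.length(); i++) {
            total += stripes.get(i);
        }
        return total;
    }

    static long countWith(int threadCount, int times) {
        StripedCounterSketch counter = new StripedCounterSketch(8);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    counter.increment();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWith(8, 100_000)); // prints 800000
    }
}
```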

In the source code, this works as follows:

First perform a CAS on the base variable, and return if the CAS succeeds.
Otherwise get a hash value for the thread (by calling getProbe), mask it with the array length minus one to locate an element in the Cell array, and perform a CAS on that element.
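
The index computation in the second step relies on the array length being a power of two, so that masking with length - 1 gives the same result as a non-negative modulo. A small sketch, with ProbeIndexSketch and indexFor as hypothetical names:

```java
// Hypothetical illustration class for the probe-to-index mapping.
class ProbeIndexSketch {
    // Maps a probe (hash) value to an index; length must be a power of two.
    static int indexFor(int probe, int length) {
        return probe & (length - 1);
    }

    public static void main(String[] args) {
        System.out.println(indexFor(13, 8));        // prints 5
        System.out.println(Math.floorMod(13, 8));   // prints 5
        System.out.println(indexFor(-7, 8));        // prints 1
        System.out.println(Math.floorMod(-7, 8));   // prints 1
    }
}
```

The mask is cheaper than a division and never yields a negative index, even for negative probe values.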

Incrementing the count

public void increment() {
    add(1L);
}

public void add(long x) {
    Cell[] as; long b, v; int m; Cell a;
    // If the cells array is null, first try a CAS on base and return on success
    if ((as = cells) != null || !casBase(b = base, b + x)) {
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[getProbe() & m]) == null ||
            !(uncontended = a.cas(v = a.value, v + x)))
            longAccumulate(x, null, uncontended);
    }
}

When the array is not null and the slot selected by the thread's hash value is not null, a CAS is attempted on that element. If the CAS succeeds the method returns directly; otherwise (or if the array or the slot is null) it enters the longAccumulate method.

longAccumulate handles three cases:
If the cell array has already been initialized, it places an element in the array, performs a CAS on an existing element, or expands the array.
If the cell array has not been initialized, it initializes the array.
If another thread is initializing the cell array, it falls back to a CAS on base.
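
The initialization case relies on a CAS-guarded flag (cellsBusy in the JDK) plus a recheck once the flag is held. A minimal sketch of that pattern follows. It uses an AtomicInteger in place of the JDK's Unsafe-based casCellsBusy, and the names BusyFlagInitSketch, busy, table, and tryInit are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a CAS-guarded, double-checked initialization.
class BusyFlagInitSketch {
    private final AtomicInteger busy = new AtomicInteger(); // 0 = free, 1 = held
    private volatile long[] table;

    // Returns true only for the call that actually created the table.
    // A caller that finds the flag held backs off instead of blocking.
    boolean tryInit() {
        if (table == null && busy.get() == 0 && busy.compareAndSet(0, 1)) {
            boolean created = false;
            try {
                if (table == null) { // recheck while holding the flag
                    table = new long[2];
                    created = true;
                }
            } finally {
                busy.set(0); // always release the flag
            }
            return created;
        }
        return false;
    }

    // Lets threadCount threads race on tryInit once each; returns how many created the table.
    static int countCreators(int threadCount) {
        BusyFlagInitSketch sketch = new BusyFlagInitSketch();
        AtomicInteger creators = new AtomicInteger();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                if (sketch.tryInit()) {
                    creators.incrementAndGet();
                }
            });
            threads[i].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return creators.get();
    }

    public static void main(String[] args) {
        System.out.println(countCreators(16)); // prints 1
    }
}
```

In LongAdder, a thread that loses this race does not give up on its update: it falls back to a CAS on base and retries the loop.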

final void longAccumulate(long x, LongBinaryOperator fn,
                          boolean wasUncontended) {
    int h;
    if ((h = getProbe()) == 0) {
        ThreadLocalRandom.current(); // force initialization
        h = getProbe();
        wasUncontended = true;
    }
    // Whether the last attempt collided on a non-empty slot
    boolean collide = false;                // True if last slot nonempty
    for (;;) {
        Cell[] as; Cell a; int n; long v;
        if ((as = cells) != null && (n = as.length) > 0) {
            if ((a = as[(n - 1) & h]) == null) {
                // cellsBusy == 1: some thread is modifying the array
                // cellsBusy == 0: no thread is modifying the array
                if (cellsBusy == 0) {       // Try to attach new Cell
                    Cell r = new Cell(x);   // Optimistically create
                    if (cellsBusy == 0 && casCellsBusy()) {
                        boolean created = false;
                        try {               // Recheck under lock
                            Cell[] rs; int m, j;
                            // Same idea as double-checked locking in the singleton pattern
                            if ((rs = cells) != null &&
                                (m = rs.length) > 0 &&
                                rs[j = (m - 1) & h] == null) {
                                rs[j] = r;
                                created = true;
                            }
                        } finally {
                            cellsBusy = 0;
                        }
                        // The new cell was placed in the array
                        if (created)
                            break;
                        continue;           // Slot is now non-empty
                    }
                }
                collide = false;
            }
            // The earlier CAS failed and the slot is already occupied:
            // reset wasUncontended to true so the thread falls through
            // to advanceProbe, rehashes, and picks a new index
            else if (!wasUncontended)       // CAS already known to fail
                wasUncontended = true;      // Continue after rehash
            // CAS the value of the cell
            else if (a.cas(v = a.value, ((fn == null) ? v + x :
                                         fn.applyAsLong(v, x))))
                break;
            // Do not expand if the array length is already >= the number
            // of CPU cores, or if another thread has replaced the array
            // (it is being expanded concurrently)
            else if (n >= NCPU || cells != as)
                collide = false;            // At max size or stale
            else if (!collide)
                collide = true;
            // Reached only when collide is already true (which triggers expansion);
            // the two else-if branches above control collide
            else if (cellsBusy == 0 && casCellsBusy()) {
                try {
                    if (cells == as) {      // Expand table unless stale
                        Cell[] rs = new Cell[n << 1];
                        for (int i = 0; i < n; ++i)
                            rs[i] = as[i];
                        cells = rs;
                    }
                } finally {
                    cellsBusy = 0;
                }
                collide = false;
                continue;                   // Retry with expanded table
            }
            h = advanceProbe(h);
        }
        else if (cellsBusy == 0 && cells == as && casCellsBusy()) {
            boolean init = false;
            try {                           // Initialize table
                if (cells == as) {
                    Cell[] rs = new Cell[2];
                    rs[h & 1] = new Cell(x);
                    cells = rs;
                    init = true;
                }
            } finally {
                cellsBusy = 0;
            }
            if (init)
                break;
        }
        else if (casBase(v = base, ((fn == null) ? v + x :
                                    fn.applyAsLong(v, x))))
            break;                          // Fall back on using base
    }
}

Getting the count

The count is the base value plus the values in the Cell array.

public long sum() {
    Cell[] as = cells; Cell a;
    long sum = base;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    return sum;
}

Note that the number returned by sum() may not be the current count: while sum() is running, other threads may update the base variable or the cell array.
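
A small sketch of what this means in practice: a sum() taken while writers are still running is only a moving snapshot, whereas a sum() taken after all writers have finished is exact. The class LongAdderSumSketch and its run method are hypothetical names used for illustration.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical illustration of LongAdder.sum() during and after concurrent updates.
class LongAdderSumSketch {
    // Returns {sum read while writers may still be running, sum read after join}.
    static long[] run(int threadCount, int times) {
        LongAdder adder = new LongAdder();
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < times; j++) {
                    adder.increment();
                }
            });
            threads[i].start();
        }
        // Not an atomic snapshot: writers may update base or cells during this call.
        long midFlight = adder.sum();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // All writers have finished, so this value is exact.
        long finalSum = adder.sum();
        return new long[] {midFlight, finalSum};
    }

    public static void main(String[] args) {
        long[] r = run(4, 1_000_000);
        System.out.println("mid-flight: " + r[0] + ", final: " + r[1]);
    }
}
```

The mid-flight value varies from run to run; only the final value is deterministic (threadCount multiplied by times).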

// AtomicLong
public final long getAndIncrement() {
    return unsafe.getAndAddLong(this, valueOffset, 1L);
}

AtomicLong#getAndIncrement, by contrast, returns an exact value (the value immediately before the increment; incrementAndGet returns the value after it), because the update is a single atomic operation.
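
The difference between the two AtomicLong increment methods is easy to check. The class name AtomicLongReturnSketch below is hypothetical; the AtomicLong calls are the standard JDK API.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical illustration of the return values of AtomicLong's increment methods.
class AtomicLongReturnSketch {
    // Returns {getAndIncrement result, incrementAndGet result, final value}.
    static long[] demo() {
        AtomicLong counter = new AtomicLong(5);
        long before = counter.getAndIncrement(); // returns 5, counter becomes 6
        long after = counter.incrementAndGet();  // counter becomes 7, returns 7
        return new long[] {before, after, counter.get()};
    }

    public static void main(String[] args) {
        long[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]); // prints 5 7 7
    }
}
```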

Finally, a small bonus. In JDK 1.8, ConcurrentHashMap maintains and reads its element count using the same idea as LongAdder, and the code is nearly identical. Take a look if you are interested.
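
In the JDK 8 source, the corresponding pieces are the baseCount field and the counterCells array, updated in addCount and summed in sumCount. From the outside, that striped count is what size() and mappingCount() report. A minimal sketch, with MapCountSketch and fill as hypothetical names:

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration: concurrent puts, then reading the map's element count.
class MapCountSketch {
    // Each thread inserts keysPerThread distinct keys; returns the reported mapping count.
    static long fill(int threadCount, int keysPerThread) {
        ConcurrentHashMap<Integer, Integer> map = new ConcurrentHashMap<>();
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int offset = t * keysPerThread; // keeps key ranges disjoint
            threads[t] = new Thread(() -> {
                for (int k = 0; k < keysPerThread; k++) {
                    map.put(offset + k, k);
                }
            });
            threads[t].start();
        }
        try {
            for (Thread thread : threads) {
                thread.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // Exact here because all writers have finished.
        return map.mappingCount();
    }

    public static void main(String[] args) {
        System.out.println(fill(8, 10_000)); // prints 80000
    }
}
```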
