The Beauty of Java Concurrent Programming Chapter 4 Reading Notes

Chapter 4 Analysis of the Principles of Atomic Operations in Java Concurrent Packages

The JUC package provides a series of atomic operation classes. These classes are implemented with CAS, a non-blocking algorithm, and perform considerably better than atomic operations implemented with locks.

This chapter explains only the implementation of the simplest class, AtomicLong, and the principles behind the LongAdder and LongAccumulator classes added in JDK 8.

Atomic variable operation classes

Atomic operation classes such as AtomicLong, AtomicInteger, and AtomicBoolean are implemented internally using Unsafe. The listing below reproduces the core of the AtomicLong source under the name AtomicLongTest.

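import java.io.Serializable;
import sun.misc.Unsafe;

// A trimmed copy of the JDK 8 AtomicLong source, renamed AtomicLongTest for these notes.
// It is meant for reading only: in JDK 8, Unsafe.getUnsafe() rejects classes that were
// not loaded by the bootstrap class loader, and the native method below has no
// implementation outside the JDK.
public class AtomicLongTest extends Number implements Serializable {

    private static final long serialVersionUID = 1927816293512124184L;

    // The Unsafe instance used for the CAS operations
    private static final Unsafe unsafe = Unsafe.getUnsafe();

    // Memory offset of the value field inside an instance
    private static final long valueOffset;

    // Whether the JVM supports lock-free CAS on long values
    static final boolean VM_SUPPORTS_LONG_CAS = VMSupportsCS8();
    private static native boolean VMSupportsCS8();

    static {
        try {
            valueOffset = unsafe.objectFieldOffset(AtomicLongTest.class.getDeclaredField("value"));
        } catch (Exception e) {
            throw new Error(e);
        }
    }

    // The actual value; volatile so that updates are visible to all threads
    private volatile long value;

    public AtomicLongTest(long initialValue) {
        value = initialValue;
    }

    @Override
    public int intValue() {
        return (int) value;
    }

    @Override
    public long longValue() {
        return value;
    }

    @Override
    public float floatValue() {
        return (float) value;
    }

    @Override
    public double doubleValue() {
        return (double) value;
    }
}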

Increment and Decrement Code Operations

boolean compareAndSet(long expect,long update)

public final boolean compareAndSet(long expect, long update) {
    return unsafe.compareAndSwapLong(this, valueOffset, expect, update);
}

This method also delegates to unsafe.compareAndSwapLong. If the current value of the atomic variable equals expect, the value is replaced with update and the method returns true; otherwise the value is left unchanged and the method returns false.
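To make the true/false contract concrete, here is a minimal sketch of the usual CAS retry loop built on the public AtomicLong.compareAndSet method. The class name CasRetryDemo and the helper updateMax are hypothetical names chosen for this illustration; they are not part of the JDK.

```java
import java.util.concurrent.atomic.AtomicLong;

public class CasRetryDemo {
    // Hypothetical helper: atomically raise 'target' to at least 'candidate'
    // and return the value held afterwards.
    static long updateMax(AtomicLong target, long candidate) {
        for (;;) {
            long current = target.get();
            if (candidate <= current) {
                return current;              // nothing to update
            }
            if (target.compareAndSet(current, candidate)) {
                return candidate;            // CAS succeeded
            }
            // CAS failed: another thread changed the value, so read it again and retry
        }
    }

    public static void main(String[] args) {
        AtomicLong max = new AtomicLong(5);
        System.out.println(max.compareAndSet(4, 10)); // false: current value is 5, not 4
        System.out.println(max.compareAndSet(5, 10)); // true: value is now 10
        System.out.println(updateMax(max, 42));       // 42
        System.out.println(updateMax(max, 7));        // 42: 7 is not larger
    }
}
```

The loop is the same pattern that makes AtomicLong expensive under heavy contention: every failed compareAndSet costs another read and another attempt.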

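The following example uses AtomicLong to count the zeros in two arrays, with one thread per array. Each array contains two zeros, so the program prints count 0:4.

import java.util.concurrent.atomic.AtomicLong;

public class AtomicTest {
    // Shared counter, updated atomically by both threads
    private static AtomicLong atomicLong=new AtomicLong();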
    private static Integer[] arrayOne=new Integer[]{0,1,2,3,4,5,6,7,56,0};
    private static Integer[] arrayTwo=new Integer[]{10,1,2,3,4,5,6,0,56,0};

    public static void main(String[] args) throws InterruptedException{
        Thread threadOne=new Thread(new Runnable() {
            @Override
            public void run() {
                int size = arrayOne.length;
                for(int i=0;i<size;i++){
                    if(arrayOne[i].intValue()==0){
                        atomicLong.incrementAndGet();
                    }
                }
            }
        });

        Thread threadTwo=new Thread(new Runnable() {
            @Override
            public void run() {
                int size = arrayTwo.length;
                for(int i=0;i<size;i++){
                    if(arrayTwo[i].intValue()==0){
                        atomicLong.incrementAndGet();
                    }
                }
            }
        });


        threadOne.start();
        threadTwo.start();
        threadOne.join();
        threadTwo.join();
        System.out.println("count 0:"+atomicLong.get());

        
    }
}

LongAdder, the new atomic operation class in JDK 8

A brief introduction

As mentioned earlier, AtomicLong provides non-blocking atomic operations through CAS, which already performs much better than synchronizers based on blocking algorithms. Under high concurrency, however, a large number of threads compete to update the same atomic variable at the same time. Because only one thread's CAS operation can succeed at a time, the many threads that lose the race keep retrying the CAS in a spin loop, which wastes CPU resources.

LongAdder instead maintains multiple Cell variables internally, each holding a long with an initial value of 0. With the same amount of concurrency, fewer threads compete to update any single variable, which indirectly reduces contention for the shared resource.

In addition, when several threads contend for the same Cell and a CAS fails, the losing thread does not keep spinning on that Cell but tries its CAS on a different Cell, which raises the chance that the retry succeeds. Finally, the value of a LongAdder is obtained by summing the values of all Cell variables and adding base.

LongAdder maintains a lazily initialized array of atomically updated Cells (by default the Cell array is null) and a base value variable, base. Because the cells take up a relatively large amount of memory, the array is not created up front but only when it is needed, that is, it is lazily loaded.

At first, while the Cell array is null and few threads are contending, all accumulation happens on the base variable. The size of the Cell array is kept at a power of two; it holds 2 elements when it is initialized, and its elements are of type Cell. Cell is an improved version of AtomicLong that reduces cache contention, that is, it solves the false sharing problem.

Padding most isolated atomic variables with extra bytes would be wasteful, because such variables are scattered irregularly through memory (their addresses are not contiguous), so the probability that several of them land in the same cache line is small. The elements of an array of atomic variables, however, are laid out contiguously, so several elements can often share a cache line. For this reason the Cell class is annotated with @sun.misc.Contended, which pads it with bytes so that multiple array elements do not share a cache line, and that improves performance.
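From the caller's side none of this machinery is visible. The following minimal sketch shows several threads incrementing one LongAdder through its public API; the class name LongAdderDemo, the helper countWithThreads, and the thread and iteration counts are assumptions made only for this illustration.

```java
import java.util.concurrent.atomic.LongAdder;

public class LongAdderDemo {
    // Starts 'threads' workers that each call increment() 'incrementsPerThread' times,
    // waits for all of them, and returns the final sum.
    static long countWithThreads(int threads, int incrementsPerThread) {
        LongAdder adder = new LongAdder();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    adder.increment();   // lands on base or on one of the cells
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        // All writers have finished, so sum() is exact here
        return adder.sum();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(8, 100_000));
    }
}
```

Because every worker has been joined before sum() is called, the printed total is exactly threads × incrementsPerThread (800000 for the values in main).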

Code analysis

  • (1) What is the structure of LongAdder?
  • (2) Which Cell element in the Cell array should the current thread access?
  • (3) How to initialize the Cell array?
  • (4) How to expand the Cell array?
  • (5) How are conflicts handled when several threads access the same allocated Cell element?
  • (6) How to ensure the atomicity of the allocated Cell elements for thread operations?

Cell structure

@sun.misc.Contended static final class Cell {
    volatile long value;
    Cell(long x) { value = x; }
    final boolean cas(long cmp, long val) {
        return UNSAFE.compareAndSwapLong(this, valueOffset, cmp, val);
    }

    // Unsafe mechanics
    private static final sun.misc.Unsafe UNSAFE;
    private static final long valueOffset;
    static {
        try {
            UNSAFE = sun.misc.Unsafe.getUnsafe();
            Class<?> ak = Cell.class;
            valueOffset = UNSAFE.objectFieldOffset
                (ak.getDeclaredField("value"));
        } catch (Exception e) {
            throw new Error(e);
        }
    }
}

The structure of Cell is very simple. It holds a single variable, value, which is declared volatile: threads update value without taking a lock, so volatile is needed to guarantee memory visibility. The cas method uses a CAS operation to guarantee that the current thread updates the value of its assigned Cell atomically. Finally, the Cell class is annotated with @sun.misc.Contended to avoid false sharing.

  • long sum() returns the current value. Internally it starts from base and adds the values of all Cells, as in the code below. Because the Cell array is not locked while the sum is computed, other threads may modify Cell values or expand the array during the accumulation. The value returned by sum is therefore not exact: it is not an atomic snapshot of the state at the moment sum was called.
public long sum() {
    Cell[] as = cells; Cell a;
    long sum = base;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                sum += a.value;
        }
    }
    return sum;
}
  • void reset() resets the adder. As the code below shows, it sets base to 0 and, if the Cell array has elements, sets the value of each element to 0.
public void reset() {
    Cell[] as = cells; Cell a;
    base = 0L;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null)
                a.value = 0L;
        }
    }
}
  • long sumThenReset() is a variant of sum. As the code below shows, after adding each Cell's value to the sum it resets that Cell to 0, and it resets base to 0 as well. Calling this method from multiple threads is problematic: for example, once the first caller has cleared the Cells, the next caller accumulates only zeros.
public long sumThenReset() {
    Cell[] as = cells; Cell a;
    long sum = base;
    base = 0L;
    if (as != null) {
        for (int i = 0; i < as.length; ++i) {
            if ((a = as[i]) != null) {
                sum += a.value;
                a.value = 0L;
            }
        }
    }
    return sum;
}

long longValue() is equivalent to sum().
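The behavior of these four methods can be checked with a small single-threaded sketch. It uses only the public LongAdder API; the class name LongAdderResetDemo and the helper observe are hypothetical. With a single thread there are no concurrent writers, so the values are exact.

```java
import java.util.Arrays;
import java.util.concurrent.atomic.LongAdder;

public class LongAdderResetDemo {
    // Records what sum, longValue, sumThenReset and reset report, in that order.
    static long[] observe() {
        LongAdder adder = new LongAdder();
        adder.add(5);
        adder.add(7);
        long sum = adder.sum();                 // 5 + 7
        long viaLongValue = adder.longValue();  // same as sum()
        long drained = adder.sumThenReset();    // returns the sum, then clears it
        long afterDrain = adder.sum();          // cleared by sumThenReset
        adder.increment();
        adder.reset();                          // discards the increment
        long afterReset = adder.sum();
        return new long[] {sum, viaLongValue, drained, afterDrain, afterReset};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observe())); // [12, 12, 12, 0, 0]
    }
}
```

Under concurrent updates the same calls would only give approximate values, for the reasons described in the sum() and sumThenReset() items above.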

  • Let's mainly look at the implementation of the add method. From this method, you can find answers to other questions.
public void add(long x) {
    Cell[] as; long b, v; int m; Cell a;
    if ((as = cells) != null || !casBase(b = base, b + x)) {   // (1)
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||           // (2)
            (a = as[getProbe() & m]) == null ||                // (3)
            !(uncontended = a.cas(v = a.value, v + x)))        // (4)
            longAccumulate(x, null, uncontended);              // (5)
    }
}
final boolean casBase(long cmp, long val){
    return UNSAFE.compareAndSwapLong(this,BASE,cmp, val);
}

Code (1) first checks whether cells is null. If it is, the update is attempted on the base variable with casBase, which works much like AtomicLong.

If cells is not null, or the CAS on base in code (1) fails, execution moves on to code (2). Codes (2) and (3), the as.length - 1 and as[getProbe() & m] expressions, determine which Cell element of the cells array the current thread should access. If the element mapped to the current thread exists, code (4) updates its value with a CAS operation (a.cas). If the element does not exist, or it exists but the CAS fails, code (5) runs, which is the call to longAccumulate. In other words, codes (2), (3) and (4) together obtain the Cell element the current thread should access and perform the CAS update on it, and if any condition fails along the way, control jumps to code (5).

The Cell element the current thread should access is computed as getProbe() & m, where m is the number of elements in the cells array minus 1 and getProbe() returns the value of the threadLocalRandomProbe variable of the current thread. This value starts out as 0 and is initialized in code (5). The current thread guarantees the atomicity of the update through the cas method of its assigned Cell element. This answers questions 2 and 6.
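The index computation relies on the array length being a power of two, so that length - 1 is a bit mask. The sketch below illustrates only that arithmetic; CellIndexDemo and cellIndex are hypothetical names, and the probe values are arbitrary integers rather than values read from a real thread.

```java
public class CellIndexDemo {
    // Same arithmetic as getProbe() & m in add(), with m = length - 1.
    // Only valid when 'length' is a power of two.
    static int cellIndex(int probe, int length) {
        return probe & (length - 1);
    }

    public static void main(String[] args) {
        int[] probes = {13, -1, 0x9e3779b9, 270369};
        int length = 8;
        for (int probe : probes) {
            // For a power-of-two length the mask gives the same result as a
            // non-negative modulo, without a division.
            System.out.println(probe + " -> slot " + cellIndex(probe, length)
                    + " (floorMod: " + Math.floorMod(probe, length) + ")");
        }
    }
}
```

Because the mask keeps only the low bits, the result is always in the range 0 to length - 1, even for negative probe values.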

The longAccumulate method below contains the code that initializes and expands the cells array.

final void longAccumulate(long x, LongBinaryOperator fn,
                          boolean wasUncontended) {
    int h;
    if ((h = getProbe()) == 0) {
        ThreadLocalRandom.current(); // force initialization
        h = getProbe();
        wasUncontended = true;
    }
    boolean collide = false;                // True if last slot nonempty
    for (;;) {
        Cell[] as; Cell a; int n; long v;
        if ((as = cells) != null && (n = as.length) > 0) {          // (7)
            if ((a = as[(n - 1) & h]) == null) {                    // (8)
                if (cellsBusy == 0) {       // Try to attach new Cell
                    Cell r = new Cell(x);   // Optimistically create
                    if (cellsBusy == 0 && casCellsBusy()) {
                        boolean created = false;
                        try {               // Recheck under lock
                            Cell[] rs; int m, j;
                            if ((rs = cells) != null &&
                                (m = rs.length) > 0 &&
                                rs[j = (m - 1) & h] == null) {
                                rs[j] = r;
                                created = true;
                            }
                        } finally {
                            cellsBusy = 0;
                        }
                        if (created)
                            break;
                        continue;           // Slot is now non-empty
                    }
                }
                collide = false;
            }
            else if (!wasUncontended)       // CAS already known to fail
                wasUncontended = true;      // Continue after rehash
            else if (a.cas(v = a.value, ((fn == null) ? v + x :
                                         fn.applyAsLong(v, x))))
                break;
            else if (n >= NCPU || cells != as)              // (10)
                collide = false;            // At max size or stale
            else if (!collide)                              // (11)
                collide = true;
            else if (cellsBusy == 0 && casCellsBusy()) {    // (12)
                try {
                    if (cells == as) {      // Expand table unless stale
                        Cell[] rs = new Cell[n << 1];       // (12.1)
                        for (int i = 0; i < n; ++i)
                            rs[i] = as[i];
                        cells = rs;
                    }
                } finally {
                    cellsBusy = 0;
                }
                collide = false;
                continue;                   // Retry with expanded table
            }
            h = advanceProbe(h);                            // (13)
        }
        else if (cellsBusy == 0 && cells == as && casCellsBusy()) { // (14)
            boolean init = false;
            try {                           // Initialize table
                if (cells == as) {
                    Cell[] rs = new Cell[2];                // (14.1)
                    rs[h & 1] = new Cell(x);                // (14.2)
                    cells = rs;
                    init = true;
                }
            } finally {
                cellsBusy = 0;                              // (14.3)
            }
            if (init)
                break;
        }
        else if (casBase(v = base, ((fn == null) ? v + x :
                                    fn.applyAsLong(v, x))))
            break;                          // Fall back on using base
    }
}

The cells array is initialized in code (14), the branch guarded by cellsBusy == 0 && cells == as && casCellsBusy(). cellsBusy is a flag: 0 means the cells array is not being initialized, expanded, or having a new Cell element created, and 1 means one of these is in progress. The casCellsBusy function switches the flag from 0 to 1 with a CAS operation. Once the current thread has set cellsBusy to 1 through CAS, it starts the initialization, and other threads cannot expand the array in the meantime. Code (14.1), new Cell[2], initializes the cells array with 2 elements. The code then uses h & 1 to compute the position the current thread should access, that is, the current thread's threadLocalRandomProbe value & (number of elements in the cells array - 1), stores a new Cell holding x there, and marks the array as initialized (init = true). Finally, code (14.3) resets the cellsBusy flag. No CAS is used for the reset, but it is still thread-safe: cellsBusy is volatile, which guarantees memory visibility, and no other thread has a chance to modify cellsBusy while it is 1. After initialization, the slot at h & 1 holds the new Cell and the other slot is still null. This answers question 3.
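The flag protocol described above (CAS from 0 to 1 to acquire, recheck, plain volatile write of 0 to release) can be sketched outside the JDK. This is an illustration, not the Striped64 code: it substitutes an AtomicInteger for the Unsafe-based casCellsBusy, and BusyFlagDemo, tryInit and tableLength are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class BusyFlagDemo {
    // 0 = free, 1 = some thread is creating the table (plays the role of cellsBusy)
    private final AtomicInteger busy = new AtomicInteger(0);
    // Lazily created table; volatile so other threads see the published array
    private volatile long[] table;

    // Returns true if this call created the table, false if the table already
    // existed or another thread currently holds the flag.
    boolean tryInit() {
        if (table == null && busy.get() == 0 && busy.compareAndSet(0, 1)) {
            try {
                if (table == null) {        // recheck while holding the flag
                    table = new long[2];
                    return true;
                }
            } finally {
                busy.set(0);                // plain volatile write: only the holder resets
            }
        }
        return false;
    }

    int tableLength() {
        long[] t = table;
        return t == null ? 0 : t.length;
    }

    public static void main(String[] args) {
        BusyFlagDemo demo = new BusyFlagDemo();
        System.out.println(demo.tryInit());     // first call creates the table
        System.out.println(demo.tryInit());     // table already exists
        System.out.println(demo.tableLength());
    }
}
```

A thread that loses the compareAndSet does not block. In longAccumulate the corresponding fallback is the final branch, which tries a CAS on base instead.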

The cells array is expanded in code (12), and only when the conditions in codes (10) and (11) do not hold. Specifically, expansion happens only when the number of elements in cells is less than the number of CPUs on the machine (n < NCPU) and several threads have collided on the same Cell element, causing one thread's CAS to fail. Why is the number of CPUs involved? As the basics chapter mentioned, multithreading is most effective when each thread has its own CPU; when the number of elements in the cells array matches the number of CPUs, each Cell can be worked on by its own CPU, which gives the best performance. Code (12) also first sets cellsBusy to 1 through CAS before expanding. If the CAS succeeds, code (12.1) allocates an array of twice the size (new Cell[n << 1]) and the existing Cell elements are copied into it. After expansion, the cells array contains the copied elements plus new slots whose values are still null. This answers question 4.
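The doubling step itself is simple array copying. The sketch below shows only that step, under stated assumptions: Long objects stand in for Cell instances, maxSize stands in for NCPU, and CellTableGrowthDemo and grow are hypothetical names. The real code performs the copy while holding the cellsBusy flag.

```java
import java.util.Arrays;

public class CellTableGrowthDemo {
    // Doubles the table unless it has already reached maxSize (the n >= NCPU check).
    // Existing entries keep their index; the new upper half starts out as null.
    static Long[] grow(Long[] cells, int maxSize) {
        int n = cells.length;
        if (n >= maxSize) {
            return cells;                    // at max size: no expansion
        }
        Long[] bigger = new Long[n << 1];
        for (int i = 0; i < n; i++) {
            bigger[i] = cells[i];
        }
        return bigger;
    }

    public static void main(String[] args) {
        Long[] cells = {1L, 2L};
        cells = grow(cells, 8);
        System.out.println(Arrays.toString(cells)); // [1, 2, null, null]
        cells = grow(cells, 4);
        System.out.println(cells.length);           // 4: already at the limit
    }
}
```

Since the size doubles from 2 and the check is n >= NCPU, the table stops at the first power of two that is greater than or equal to the CPU count.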

In codes (7) and (8), the current thread computes the index of the Cell element to access from its threadLocalRandomProbe value and the number of elements in cells. If the element at that index is null, the thread creates a new Cell and adds it to the cells array, after first competing to set cellsBusy to 1.

For a thread whose CAS failed, code (13), the call to advanceProbe, recomputes the thread's threadLocalRandomProbe value to reduce the chance of a conflict the next time it accesses the cells array. This answers question 5.
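In the JDK 8 source, advanceProbe derives the next probe value with three xorshift steps and stores it back into the current thread. The sketch below reproduces only the arithmetic as a pure function; ProbeRehashDemo is a hypothetical class name, and the starting value 1 is arbitrary.

```java
public class ProbeRehashDemo {
    // The xorshift steps used to compute the next probe value.
    // The real method also writes the result back to the current thread.
    static int advanceProbe(int probe) {
        probe ^= probe << 13;
        probe ^= probe >>> 17;
        probe ^= probe << 5;
        return probe;
    }

    public static void main(String[] args) {
        int probe = 1;
        int length = 8;                       // assumed table size, a power of two
        for (int i = 0; i < 5; i++) {
            System.out.println("probe=" + probe + " slot=" + (probe & (length - 1)));
            probe = advanceProbe(probe);
        }
    }
}
```

The new probe usually, though not always, maps to a different slot. A probe of 0 would stay 0 under these steps, which is why longAccumulate first forces initialization when getProbe() returns 0.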

Summary

This section introduced LongAdder, the atomic operation class added in JDK 8. Through its internal cells array, the class spreads out the contention that arises when many threads update a single atomic variable under high concurrency, so that multiple threads can operate on different elements of the cells array at the same time. In addition, the array element type Cell is annotated with @sun.misc.Contended, which keeps multiple atomic variables in the cells array from being placed in the same cache line and thus avoids false sharing.

Probe into the principle of LongAccumulator class

LongAdder is a special case of LongAccumulator; LongAccumulator is the more powerful of the two.

In the constructor below, accumulatorFunction is a binary operator interface that takes two arguments and returns a computed value, and identity is the initial value of the LongAccumulator.

public LongAccumulator(LongBinaryOperator accumulatorFunction,
                       long identity) {
    this.function = accumulatorFunction;
    base = this.identity = identity;
}

Calling LongAdder is equivalent to calling LongAccumulator in the following way:

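// java.util.function.LongBinaryOperator, shown here for reference
public interface LongBinaryOperator {
    long applyAsLong(long left, long right);
}

LongAdder adder = new LongAdder();

// Behaves like the LongAdder above: adds values, starting from 0
LongAccumulator accumulator = new LongAccumulator(new LongBinaryOperator() {
    @Override
    public long applyAsLong(long left, long right) {
        return left + right;
    }
}, 0);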

Compared with LongAdder, LongAccumulator lets you give the accumulator a non-zero initial value, whereas LongAdder always starts from 0. LongAccumulator also lets you specify the accumulation rule, for example multiplying instead of adding: just pass a custom binary operator when constructing it. LongAdder's accumulation rule is built in.
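As a minimal sketch of both points, the following uses the public LongAccumulator API with a multiplication rule (identity 1) and a maximum rule (identity Long.MIN_VALUE). LongAccumulatorDemo, product and max are hypothetical names chosen for this illustration.

```java
import java.util.concurrent.atomic.LongAccumulator;

public class LongAccumulatorDemo {
    // Multiplies all values together; the identity for multiplication is 1.
    static long product(long... values) {
        LongAccumulator acc = new LongAccumulator((left, right) -> left * right, 1L);
        for (long v : values) {
            acc.accumulate(v);
        }
        return acc.get();
    }

    // Tracks the largest value seen; the identity is the smallest possible long.
    static long max(long... values) {
        LongAccumulator acc = new LongAccumulator(Long::max, Long.MIN_VALUE);
        for (long v : values) {
            acc.accumulate(v);
        }
        return acc.get();
    }

    public static void main(String[] args) {
        System.out.println(product(2, 3, 4)); // 24
        System.out.println(max(3, 9, -1));    // 9
    }
}
```

The accumulator function should be free of side effects and should not depend on the order in which values arrive, because with several threads the order of accumulation is not guaranteed.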

The code below shows the first difference between the two: when calling casBase, LongAdder passes b + x, whereas LongAccumulator passes r = function.applyAsLong(b = base, x).

public void add(long x) {
    Cell[] as; long b, v; int m; Cell a;
    if ((as = cells) != null || !casBase(b = base, b + x)) {
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[getProbe() & m]) == null ||
            !(uncontended = a.cas(v = a.value, v + x)))
            longAccumulate(x, null, uncontended);
    }
}
public void accumulate(long x) {
    Cell[] as; long b, v, r; int m; Cell a;
    if ((as = cells) != null ||
        (r = function.applyAsLong(b = base, x)) != b && !casBase(b, r)) {
        boolean uncontended = true;
        if (as == null || (m = as.length - 1) < 0 ||
            (a = as[getProbe() & m]) == null ||
            !(uncontended =
              (r = function.applyAsLong(v = a.value, x)) == v ||
              a.cas(v, r)))
            longAccumulate(x, function, uncontended);
    }
}

The second difference is in the call to longAccumulate: LongAccumulator passes function, whereas LongAdder passes null.

Inside longAccumulate, when fn is null the addition v + x is used, which is the LongAdder behavior; when fn is not null, the supplied fn function is applied instead, as in this fragment:

else if (casBase(v = base, ((fn == null) ? v + x : fn.applyAsLong(v, x))))
    break;

Summary: This section briefly introduced how LongAccumulator works. LongAdder is a special case of LongAccumulator; LongAccumulator is more powerful and lets users define their own accumulation rules.

Chapter summary

This chapter introduced the atomic operation classes in the concurrent package. These classes are implemented with CAS, a non-blocking algorithm, and perform considerably better than atomic operations implemented with locks. It first explained the implementation of the simplest class, AtomicLong, and then the principles behind the LongAdder and LongAccumulator classes added in JDK 8. After studying this chapter, readers should be able to choose the atomic operation class that fits the situation at hand and use it to improve system performance in real projects.

Origin blog.csdn.net/weixin_60257072/article/details/130513831