A look at Unsafe and CAS in Java

Unsafe

A brief word about this class. Java cannot access the underlying operating system directly; it has to go through native methods. Even so, the JVM leaves a back door open: the JDK contains a class named Unsafe (sun.misc.Unsafe) that provides hardware-level atomic operations.

Although the methods of this class are public, ordinary application code is not meant to call them, and the JDK API documentation says nothing about them. In short, use of Unsafe is restricted: only trusted code can obtain an instance of it, while classes inside the JDK library can use it freely.

As the first paragraph says, Unsafe provides hardware-level operations, such as finding the location of a field in memory or modifying the value of an object's field even when that field is private. However, Java exists precisely to hide such low-level differences, so ordinary development rarely needs any of this.
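To make the "modify a private field" point concrete, here is a minimal sketch. It assumes a JDK on which sun.misc.Unsafe is still reachable through reflection on its private theUnsafe field (an internal detail, not a supported API; recent JDKs print deprecation warnings for these methods). The Account class and the helper names are hypothetical and exist only for this illustration.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeFieldDemo {
    // Hypothetical class with a private field, used only for this illustration.
    static final class Account {
        private int balance = 100;
        int balance() { return balance; }
    }

    // Assumption: the singleton is stored in the private static field "theUnsafe".
    // Reading it reflectively avoids Unsafe.getUnsafe(), which is reserved for trusted code.
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Overwrites the private field by writing to its offset inside the object.
    static int overwriteBalance(int newValue) {
        Unsafe unsafe = loadUnsafe();
        Account account = new Account();
        try {
            long offset = unsafe.objectFieldOffset(Account.class.getDeclaredField("balance"));
            unsafe.putInt(account, offset, newValue); // bypasses the private modifier
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println(overwriteBalance(42)); // prints 42
    }
}
```

The write never goes through a setter or an access check, which is exactly why this kind of code is rarely appropriate outside the JDK itself.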

Two examples:

public native long staticFieldOffset(Field paramField);

This method returns the memory offset of the given static field paramField. The value is unique and fixed for that field. Another example:

public native int arrayBaseOffset(Class paramClass);
public native int arrayIndexScale(Class paramClass);

The first method returns the offset of the first element of an array, and the second returns the array's scale factor, that is, the address increment between consecutive elements. Together they locate element i at base + i * scale. Finally, look at three methods:
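As a rough sketch of how the two array methods combine, the code below reads an int[] element by computing base + index * scale by hand. It assumes sun.misc.Unsafe can be obtained reflectively through its private theUnsafe field (an internal detail); the class and method names are hypothetical.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class UnsafeArrayDemo {
    // Assumption: the singleton is stored in the private static field "theUnsafe".
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Reads array[index] without the usual bounds check, so the caller
    // must guarantee that index is within range.
    static int readElement(int[] array, int index) {
        Unsafe unsafe = loadUnsafe();
        long base = unsafe.arrayBaseOffset(int[].class);   // offset of element 0
        long scale = unsafe.arrayIndexScale(int[].class);  // bytes per element
        return unsafe.getInt(array, base + index * scale);
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println(readElement(data, 2)); // prints 30
    }
}
```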

public native long allocateMemory(long paramLong);
public native long reallocateMemory(long paramLong1, long paramLong2);
public native void freeMemory(long paramLong);

They allocate, resize, and free memory, respectively.
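The three memory methods behave much like malloc, realloc, and free in C. The sketch below stores an int off-heap, grows the block, and frees it. It assumes sun.misc.Unsafe can be obtained reflectively through its private theUnsafe field, and that the running JDK still permits these deprecated memory-access methods; the class and method names are hypothetical.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

final class OffHeapDemo {
    // Assumption: the singleton is stored in the private static field "theUnsafe".
    static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Writes a value to native memory, resizes the block, and reads the value back.
    static int roundTrip(int value) {
        Unsafe unsafe = loadUnsafe();
        long address = unsafe.allocateMemory(4);            // 4 uninitialized bytes
        try {
            unsafe.putInt(address, value);
            address = unsafe.reallocateMemory(address, 8);  // grow; existing bytes are kept
            return unsafe.getInt(address);
        } finally {
            unsafe.freeMemory(address);                     // not garbage collected: free it by hand
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(1234)); // prints 1234
    }
}
```

Nothing here is tracked by the garbage collector, so a missing freeMemory call is a real leak, just as in C.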

Of course, using them well requires some C/C++ background and some understanding of memory allocation, which is why I have always thought that C/C++ developers have an advantage when they move to Java.

CAS

CAS stands for Compare and Swap, a technique commonly used when designing concurrent algorithms. The java.util.concurrent package is built largely on CAS; without CAS, the package as we know it would not exist. That shows how important CAS is.

Modern processors generally support CAS, although different manufacturers implement it differently. CAS takes three operands: the memory value V, the expected old value A, and the new value B. If and only if the expected value A equals the memory value V, the memory value is changed to B and true is returned; otherwise nothing is done and false is returned.
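The three-operand rule can be observed through the public AtomicInteger.compareAndSet method, which exposes the same semantics. This is only a small single-threaded illustration; the class name is hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasSemanticsDemo {
    static String demo() {
        AtomicInteger v = new AtomicInteger(5);      // memory value V = 5
        boolean first = v.compareAndSet(5, 6);       // A = 5 equals V, so V becomes B = 6
        boolean second = v.compareAndSet(5, 7);      // A = 5 no longer equals V = 6, nothing changes
        return first + " " + second + " " + v.get();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "true false 6"
    }
}
```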

In Java, CAS is also implemented through Unsafe. Look at these three methods of Unsafe:

public final native boolean compareAndSwapObject(Object paramObject1, long paramLong, Object paramObject2, Object paramObject3);
public final native boolean compareAndSwapInt(Object paramObject, long paramLong, int paramInt1, int paramInt2);
public final native boolean compareAndSwapLong(Object paramObject, long paramLong1, long paramLong2, long paramLong3);

Take the middle one, compare-and-swap on an int value, as an example. Without CAS, the code would look roughly like this:

public int i = 1;
public boolean compareAndSwapInt(int j)
{
    if (i == 1)
    {
        i = j;
        return true;
    }
    return false;
}

Of course, this code is broken under concurrency. Thread 1 may pass the check if (i == 1) and be about to execute i = j when thread 2 runs and changes i to 10. When control switches back, thread 1 has already passed the if check, so it goes on to overwrite i anyway, and both threads end up modifying the variable i.

The fix is simple: lock the method by making compareAndSwapInt synchronized, so that it becomes an atomic operation. CAS works the same way: the compare and the swap form a single atomic operation that cannot be interrupted from outside. First the current memory value V is located using paramLong/paramLong1 (the offset of the field) and compared with the expected value A; if they are equal, the memory is updated to the new value B. Because CAS is performed at the hardware level, it is generally more efficient than taking a lock.
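A minimal sketch of the locked variant follows. It generalizes the earlier snippet slightly by passing the expected value as a parameter instead of hard-coding 1; the class name and the countWinners helper are hypothetical and only serve to show that exactly one of several competing threads succeeds.

```java
final class LockedCas {
    private int i = 1;

    // synchronized turns the check and the write into one indivisible step.
    public synchronized boolean compareAndSwapInt(int expected, int newValue) {
        if (i == expected) {
            i = newValue;
            return true;
        }
        return false;
    }

    // Starts several threads that all try to replace the initial 1;
    // returns how many of them succeeded.
    static int countWinners(int threadCount) {
        LockedCas cell = new LockedCas();
        boolean[] won = new boolean[threadCount];
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            final int id = t;
            threads[t] = new Thread(() -> won[id] = cell.compareAndSwapInt(1, 100 + id));
            threads[t].start();
        }
        int winners = 0;
        for (int t = 0; t < threadCount; t++) {
            try {
                threads[t].join(); // join makes the thread's write to won[t] visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (won[t]) winners++;
        }
        return winners;
    }

    public static void main(String[] args) {
        System.out.println(countWinners(8)); // prints 1
    }
}
```

Only the first thread to acquire the lock sees i == 1; every later caller finds a different value and returns false.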

How AtomicInteger works, analyzed through CAS

The atomic classes in the java.util.concurrent.atomic package are all implemented on top of CAS. Let's take AtomicInteger as the example. First, the fields defined in the AtomicInteger class:

private static final Unsafe unsafe = Unsafe.getUnsafe();
private static final long valueOffset;

static {
    try {
        valueOffset = unsafe.objectFieldOffset
            (AtomicInteger.class.getDeclaredField("value"));
    } catch (Exception ex) { throw new Error(ex); }
}

private volatile int value;

Notes on the fields that appear in this code:

Unsafe is the core class behind CAS, as described earlier.
valueOffset is the offset of the value field within an AtomicInteger object; Unsafe locates the data it reads and updates by this offset.
value is declared volatile, which is critical: it makes a write by one thread visible to the others.

Now let's pick a method to see how AtomicInteger is implemented, using the commonly used addAndGet (shown here as it appears in older JDKs such as JDK 6 and 7; newer versions delegate the same kind of retry loop to Unsafe.getAndAddInt):

public final int addAndGet(int delta) {
    for (;;) {
        int current = get();
        int next = current + delta;
        if (compareAndSet(current, next))
            return next;
    }
}
public final int get() {
    return value;
}

How does this code achieve thread safety through CAS, without locks? Let's walk through the execution:

Suppose the initial value in the AtomicInteger is 3, that is, the value in main memory is 3. Under the Java memory model, thread 1 and thread 2 each work with their own copy of value, both equal to 3.
Thread 1 executes int current = get() and obtains 3, then a thread switch occurs.
Thread 2 starts running and also obtains 3. Its CAS compares this with the value in memory, which is still 3, so the comparison succeeds and memory is updated. Say the value in memory is now 4. Another thread switch occurs.
Thread 1 resumes and performs its CAS, which finds that its expected value is 3 while the value in memory is 4. The important conclusion: the value has been modified by another thread in the meantime, so thread 1 must not write its result.
Thread 1's compareAndSet fails, so it loops and tries again. Because value is volatile, it has the visibility property: thread 2's change to value can be seen by thread 1. Once thread 1 reads 4 and the value in memory is still 4 at the time of its CAS, thread 2's modification is complete and thread 1's update can go through.
Finally, suppose a thread 3 is also about to modify value at this moment. That does not matter, because compare-and-swap is an atomic operation that cannot be interrupted. If thread 3 modifies value first, thread 1's compareAndSet returns false, and thread 1 keeps looping, fetching the latest value and retrying compareAndSet, until the value it fetched matches the value in memory.
Throughout the whole process, the CAS mechanism guarantees that modifications of value are thread-safe.
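The walkthrough above can be reproduced with the public AtomicInteger API. The sketch below re-creates the retry loop on top of compareAndSet and runs it from several threads; the class name and the runThreads helper are hypothetical, and the loop is an illustration of the pattern rather than the JDK's own source.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class CasRetryLoop {
    // Same shape as the addAndGet loop: read, compute, CAS, retry on failure.
    static int addAndGet(AtomicInteger target, int delta) {
        for (;;) {
            int current = target.get();
            int next = current + delta;
            if (target.compareAndSet(current, next)) {
                return next; // no other thread changed the value in between
            }
            // Another thread won the race: loop and re-read the latest value.
        }
    }

    // Runs the loop concurrently and returns the final counter value.
    static int runThreads(int threadCount, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger(0);
        Thread[] threads = new Thread[threadCount];
        for (int t = 0; t < threadCount; t++) {
            threads[t] = new Thread(() -> {
                for (int n = 0; n < incrementsPerThread; n++) {
                    addAndGet(counter, 1);
                }
            });
            threads[t].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(4, 10_000)); // prints 40000
    }
}
```

No increment is lost even though no lock is taken, because a failed compareAndSet simply causes that thread to retry with the fresh value.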

Disadvantages of CAS

CAS looks elegant, but it clearly cannot cover every concurrency scenario, and its semantics are not perfect. There is a logical loophole: if a variable V had the value A when it was first read, and it still has the value A when it is checked just before the assignment, can we conclude that no other thread modified it in between? If its value was changed to B and then back to A during that period, the CAS operation will wrongly assume that it was never modified. This loophole is known as the "ABA" problem of CAS.

To address it, the java.util.concurrent.atomic package provides a stamped atomic reference class, AtomicStampedReference, which guarantees the correctness of CAS by attaching a version stamp to the value. In practice, however, this class is of limited use: in most cases the ABA problem does not affect the correctness of a concurrent program, and when it does need to be solved, switching to traditional mutual-exclusion synchronization may be more efficient than using atomic classes.
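The ABA scenario can be simulated on a single thread, with the comments marking which steps a second thread would perform in a real race. The sketch compares a plain AtomicReference with AtomicStampedReference; the class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

final class AbaDemo {
    // Plain CAS: the value goes A -> B -> A and the final CAS cannot tell.
    static boolean plainCasAfterAba() {
        String a = "A", b = "B";
        AtomicReference<String> ref = new AtomicReference<>(a);
        String seen = ref.get();               // "thread 1" reads A
        ref.compareAndSet(a, b);               // "thread 2" changes A to B
        ref.compareAndSet(b, a);               // "thread 2" changes B back to A
        return ref.compareAndSet(seen, "C");   // succeeds: the change went unnoticed
    }

    // Stamped CAS: every update bumps the stamp, so the stale stamp is rejected.
    static boolean stampedCasAfterAba() {
        String a = "A", b = "B";
        AtomicStampedReference<String> ref = new AtomicStampedReference<>(a, 0);
        int[] stampHolder = new int[1];
        String seen = ref.get(stampHolder);    // "thread 1" reads A with stamp 0
        int seenStamp = stampHolder[0];
        ref.compareAndSet(a, b, 0, 1);         // "thread 2": A to B, stamp 0 to 1
        ref.compareAndSet(b, a, 1, 2);         // "thread 2": B to A, stamp 1 to 2
        // The reference matches again, but the stamp is now 2 instead of 0.
        return ref.compareAndSet(seen, "C", seenStamp, seenStamp + 1);
    }

    public static void main(String[] args) {
        System.out.println(plainCasAfterAba());   // prints true
        System.out.println(stampedCasAfterAba()); // prints false
    }
}
```

AtomicStampedReference compares references with ==, so the sketch reuses the same String objects rather than creating new equal ones.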

Origin blog.csdn.net/doubututou/article/details/109299035