An analysis of the principles behind the CAS mechanism in Java concurrent programming

CAS is a topic you have to master when learning Java concurrent programming. This article works from the motivation down to the underlying principle. I hope it helps.

One, why is the CAS mechanism needed?

Why do we need the CAS mechanism? Let's start with a common mistake. We often declare a shared variable with the volatile keyword, which gives it visibility and ordering guarantees. It does not give atomicity, however. Take the common operation a++. It actually breaks down into three steps:

(1) Read a from the memory

(2) Add 1 to a

(3) Write the new value of a back to the memory
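
The three steps above can be replayed by hand to see how an update gets lost. The following is a minimal sketch, not a real race: it runs in a single thread and simply performs the reads and writes of two hypothetical threads, T1 and T2, in the unlucky order. The class name LostUpdateReplay and the method replay are made up for this illustration.

```java
// Hypothetical demo class: replays one unlucky interleaving of two a++ operations.
class LostUpdateReplay {
    static volatile int a = 0;

    static int replay() {
        a = 0;
        int t1 = a;   // T1 step (1): read a, sees 0
        int t2 = a;   // T2 step (1): read a before T1 writes, also sees 0
        t1 = t1 + 1;  // T1 step (2): add 1
        t2 = t2 + 1;  // T2 step (2): add 1
        a = t1;       // T1 step (3): write back 1
        a = t2;       // T2 step (3): write back 1, overwriting T1's update
        return a;
    }

    public static void main(String[] args) {
        // Two increments were performed, but only one is visible.
        System.out.println("a = " + replay()); // prints "a = 1"
    }
}
```

Note that volatile does not help here: every read and write is visible to the other thread, yet the read-add-write sequence as a whole is still not atomic.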

This operation is fine in a single thread, but it goes wrong with multiple threads: one thread may add 1 to a and, before it can write the result back to memory, another thread reads the old value. The result is not thread-safe. How do we solve this? The most common way is to declare a as an AtomicInteger. We can look at the code:

import java.util.concurrent.atomic.AtomicInteger;

public class Test3 {
    // Define a as an AtomicInteger
    static AtomicInteger a = new AtomicInteger();

    public static void main(String[] args) {
        Thread[] threads = new Thread[5];
        for (int i = 0; i < 5; i++) {
            threads[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < 10; j++) {
                        // Increment atomically with incrementAndGet and print the new value
                        System.out.println(a.incrementAndGet());
                        Thread.sleep(500);
                    }
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            });
            threads[i].start();
        }
    }
}

Here we use the AtomicInteger class and call its incrementAndGet method to increment a. How is incrementAndGet implemented? We can look at the source code of AtomicInteger.

    /**
     * Atomically increments by one the current value.
     * @return the updated value
     */

    public final int incrementAndGet() {
        return unsafe.getAndAddInt(this, valueOffset, 1) + 1;
    }

At this step we can see that the work is done by unsafe calling the getAndAddInt method, but that still tells us very little. Let's go deeper into the source code to see how getAndAddInt is implemented.

public final int getAndAddInt(Object var1, long var2, int var4) {   
    int var5;     
    do {          
        var5 = this.getIntVolatile(var1, var2);   
    } while(!this.compareAndSwapInt(var1, var2, var5, var5 + var4));    
    return var5;   
}

At this step things start to become clear. Under the hood, the method calls compareAndSwapInt in a loop, and compareAndSwapInt is the CAS mechanism. So to understand how the atomic operations of AtomicInteger are implemented, we have to understand CAS, which is why it must be mastered.
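
The same read-then-retry loop can be written against the public API. The sketch below is not the JDK source: it rebuilds incrementAndGet on top of AtomicInteger.get() and compareAndSet(), both real methods, while the class name CasLoopSketch and the helper countWithThreads are assumptions made for this example.

```java
import java.util.concurrent.atomic.AtomicInteger;

class CasLoopSketch {
    // Hand-written equivalent of incrementAndGet, for illustration only.
    static int incrementAndGet(AtomicInteger value) {
        int current;
        do {
            current = value.get();                             // read the latest value
        } while (!value.compareAndSet(current, current + 1));  // retry if another thread won
        return current + 1;
    }

    // Starts `threads` threads that each increment one shared counter `perThread` times.
    static int countWithThreads(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    incrementAndGet(counter);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for every worker before reading the total
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 1000)); // prints 4000
    }
}
```

No increment is lost, because a thread whose compareAndSet fails simply reads the value again and retries.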

Two, analyzing CAS

1. Basic meaning

CAS is short for compareAndSwap. As the name suggests, it compares and then swaps. But what is compared with what?

The process works like this. CAS takes three parameters, CAS(V, E, N): V is the variable to be updated, E is the expected value, and N is the new value. Only when the value of V equals E is V set to N. If V and E differ, another thread has already updated the variable, and the current thread does nothing. Finally, CAS returns the actual current value of V (the Java methods shown above return a boolean success flag instead).
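
The rule for CAS(V, E, N) can be modeled in a few lines. This is only a sketch of the semantics: the synchronized keyword stands in for the atomicity that a real CAS gets from the hardware, and SimulatedCas is a hypothetical class, not part of the JDK.

```java
// Hypothetical model of CAS(V, E, N); synchronized only imitates hardware atomicity.
class SimulatedCas {
    private int value; // V: the variable to be updated

    SimulatedCas(int initial) {
        value = initial;
    }

    // Sets V to newValue only if V == expected, and returns the value V held before the call.
    synchronized int compareAndSwap(int expected, int newValue) {
        int old = value;
        if (old == expected) {
            value = newValue;
        }
        return old;
    }

    synchronized int get() {
        return value;
    }

    public static void main(String[] args) {
        SimulatedCas v = new SimulatedCas(5);
        System.out.println(v.compareAndSwap(5, 6)); // V == E, so V becomes 6; prints 5
        System.out.println(v.compareAndSwap(5, 7)); // V != E, nothing changes; prints 6
        System.out.println(v.get());                // prints 6
    }
}
```

A caller can tell whether its update took effect by checking whether the returned value equals the expected value it passed in.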

Here is an example I have used before to illustrate the process:

Suppose you are arranging your son's engagement. Your son is the memory location. You expected him to be with Concubine Yang, but when the day of the engagement comes you find Xi Shi next to him. What do you do? In your anger, you do nothing. If instead he is with Concubine Yang as you expected, you happily go ahead with the engagement, which corresponds to performing the operation. That should make the idea clear.

CAS operates with an optimistic attitude: it always assumes that it can complete the operation successfully. That is why CAS is also called an optimistic lock. So what is a pessimistic lock? The familiar synchronized keyword is one. You can think of pessimistic locking this way: a thread that wants the lock but cannot get it has to wait until the holder releases it.

2. The underlying principle

The best way to understand the underlying principle is to go deep into the source code. We saw above that the work is done by a method of Unsafe, which uses the compareAndSwapInt CAS operation. So let's go one level further and take a look:

public final class Unsafe {
    // compareAndSwapInt is a native method
    public final native boolean compareAndSwapInt(
        Object o,
        long offset,
        int expected,
        int x
    );
    // ... many other methods omitted
}

We can see that there are four parameters. The first is the object being operated on, here a. The second is the memory offset of the value field inside that object. The third is the value we expect the field to hold. The fourth is the new value to write if the expectation holds.
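
Calling Unsafe directly is discouraged in application code, but the same four roles (object, field location, expected value, new value) can be seen through the public AtomicIntegerFieldUpdater class. This is a sketch under that substitution: the updater identifies the field by name rather than by a raw offset, and the class FourParamsSketch with its field value is hypothetical.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

class FourParamsSketch {
    // The field must be a volatile int for the updater to accept it.
    volatile int value = 10;

    // Plays the role of "offset": it tells the CAS which field of the object to touch.
    static final AtomicIntegerFieldUpdater<FourParamsSketch> VALUE =
            AtomicIntegerFieldUpdater.newUpdater(FourParamsSketch.class, "value");

    public static void main(String[] args) {
        FourParamsSketch o = new FourParamsSketch();     // the object being operated on
        boolean ok = VALUE.compareAndSet(o, 10, 11);     // expected 10, new value 11
        System.out.println(ok + " " + o.value);          // prints "true 11"
        boolean stale = VALUE.compareAndSet(o, 10, 12);  // expected value is out of date
        System.out.println(stale + " " + o.value);       // prints "false 11"
    }
}
```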

But notice that compareAndSwapInt is a native method, which means the next level down is C++ code inside the JVM (HotSpot). If we are still curious, we can keep going.

UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, 
                                            jobject obj, jlong offset, jint e, jint x))
  UnsafeWrapper("Unsafe_CompareAndSwapInt");
  oop p = JNIHandles::resolve(obj);
  // Compute the address of value from the offset (valueOffset)
  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);
  // Call Atomic::cmpxchg to perform the compare-and-exchange
  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;
UNSAFE_END

Let's interpret the code above: it first uses the offset to compute the address of value, as a jint pointer, and then calls Atomic::cmpxchg on that address to compare and exchange. So the problem is handed on to cmpxchg, which is where the real work happens. Let's look closer; the truth is not far away.

unsigned Atomic::cmpxchg(unsigned int exchange_value,
                         volatile unsigned int* dest, 
                         unsigned int compare_value) {
    assert(sizeof(unsigned int) == sizeof(jint), "more work to do");
  /*
   * Call the overload for the current platform, depending on the operating
   * system. The compiler decides at build time which platform's overload
   * is called.
   */
    return (unsigned int)Atomic::cmpxchg((jint)exchange_value, 
                     (volatile jint*)dest, (jint)compare_value);
}

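The ball has been passed along once again. Different cmpxchg overloads are called on different operating systems. I am using Windows 10, so let's look at the implementation for that platform. Be patient and keep going: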

inline jint Atomic::cmpxchg (jint exchange_value, volatile jint* dest, 
                            jint compare_value) {
  int mp = os::is_MP();
  __asm {
    mov edx, dest
    mov ecx, exchange_value
    mov eax, compare_value
    LOCK_IF_MP(mp)
    cmpxchg dword ptr [edx], ecx
  }
}

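This code involves assembly instructions, and at this point we are very close to the truth. The three mov instructions copy the value on the right into the register on the left. Then LOCK_IF_MP and the cmpxchg instruction below it perform the compare-and-exchange. We do not yet know how LOCK_IF_MP and cmpxchg do the exchange, but that's fine; let's go one last level deeper.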

Here comes the truth, at last.

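inline jint Atomic::cmpxchg (jint exchange_value, 
                             volatile jint* dest, jint compare_value) {
  // 1. Check whether this is a multiprocessor system
  int mp = os::is_MP();
  __asm {
    // 2. Load the arguments into registers
    mov edx, dest
    mov ecx, exchange_value
    mov eax, compare_value
    // 3. Expansion of LOCK_IF_MP
    cmp mp, 0
    // 4. If mp == 0, the thread is running on a single processor, so je jumps
    //    to label L0 and cmpxchg executes directly. Otherwise the byte 0xF0
    //    (the lock prefix) is emitted in front of cmpxchg.
    je L0
    _emit 0xF0
    // 5. The compare-and-exchange actually happens here
L0:
    /*
     * Compare and exchange. A short explanation of the instruction below
     * (readers familiar with assembly can skip it):
     *   cmpxchg: the "compare and exchange" instruction
     *   dword: short for double word, i.e. two words, four bytes in total
     *   ptr: short for pointer; used together with dword, it says that the
     *        memory operand being accessed is a double word
     * The instruction means:
     *   compare the value in eax (compare_value) with the double word at [edx];
     *   if they are equal, store the value in ecx (exchange_value) into [edx].
     */
    cmpxchg dword ptr [edx], ecx
  }
}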

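By now you should understand how CAS is really implemented: in the end the work is done by a single CPU instruction, cmpxchg (prefixed with lock on multiprocessor systems), not by Java code.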

3. Advantages and disadvantages of the CAS mechanism

(1) Advantages

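As mentioned at the beginning, CAS is an optimistic lock, and more precisely a non-blocking, lightweight one. What does non-blocking mean? A thread that tries to acquire the lock gets an immediate answer telling it whether the attempt succeeded, instead of being suspended. When contention for the resource is low, CAS performs well. By comparison, synchronized as a heavyweight lock has to go through relatively complex locking, unlocking, and wake-up operations.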
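
What non-blocking means can be sketched with AtomicBoolean.compareAndSet, a real JDK method. The class TryOnce and its methods are hypothetical names used for this illustration; the point is only that a failed attempt returns false at once instead of suspending the caller.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class TryOnce {
    private final AtomicBoolean taken = new AtomicBoolean(false);

    // Returns immediately: true if this caller flipped false -> true, false otherwise.
    boolean tryAcquire() {
        return taken.compareAndSet(false, true);
    }

    void release() {
        taken.set(false);
    }

    public static void main(String[] args) {
        TryOnce flag = new TryOnce();
        System.out.println(flag.tryAcquire()); // prints true
        System.out.println(flag.tryAcquire()); // prints false, without blocking
        flag.release();
        System.out.println(flag.tryAcquire()); // prints true
    }
}
```

What to do after a false answer (retry, back off, or give up) is left to the caller, which is the design choice that keeps the operation lightweight.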

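(2) Disadvantages

The disadvantages are also an important topic, because they involve a famous problem called the ABA problem. Suppose a variable holds the value A, is changed to B, and is then changed back to A. CAS cannot detect this, even though the variable has in fact been modified. This is the ABA problem.

The ABA problem can cause many issues, such as inconsistent data. An example helps explain it.

You leave a bottle of water on the table. Someone drinks it all and then refills the bottle. When you come back to drink, the water looks the same as before, so you assume it is the same water. If you knew the truth, that someone else had used the bottle, would you still drink from it?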
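
The ABA sequence can be replayed deterministically with AtomicInteger. This is a minimal single-threaded sketch: the steps of two hypothetical threads, T1 and T2, are written out in order, and the class name AbaReplay and the values 100, 200, and 300 are chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AbaReplay {
    // Returns whether T1's late CAS succeeded although the value changed in between.
    static boolean replay() {
        AtomicInteger v = new AtomicInteger(100);  // value A
        int seenByT1 = v.get();                    // T1 reads A and is then delayed
        v.compareAndSet(100, 200);                 // T2 changes A -> B
        v.compareAndSet(200, 100);                 // T2 changes B -> A
        return v.compareAndSet(seenByT1, 300);     // T1 resumes and cannot see the history
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints true
    }
}
```

T1's CAS succeeds because it compares only the current value with the expected one; nothing in the comparison records that two updates happened in between.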

There are many ways to solve the ABA problem. We will describe and discuss them in a later article.



Origin blog.51cto.com/15082402/2592671