High-concurrency series: CAS operations and the underlying CPU instructions

CAS (Compare-and-Swap) is a technique commonly used to implement concurrent algorithms. Many classes in the Java concurrency package (java.util.concurrent) are built on it, and it is also a frequent interview topic. This article looks at how CAS works, from the Java API down to the CPU instruction.

1. Simple example

When a method performs i++, as getAndIncrement1() below does, calling it from multiple threads is not safe, because i++ is a read-modify-write sequence rather than a single atomic step. One fix is to lock explicitly with the synchronized keyword.

The Java concurrency package also provides a family of atomic classes that guarantee thread safety without explicit locking, for example AtomicInteger.getAndIncrement(), used in getAndIncrement2() below.

package com.wuxiaolong.concurrent;

import java.util.concurrent.atomic.AtomicInteger;

/**
 * Description:
 *
 * @author 诸葛小猿
 * @date 2020-09-14
 */
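public class Test1 {

    public static int i = 0;

    // Shared counter: a new AtomicInteger created on every call would share nothing
    // between threads and would always return 0.
    private static final AtomicInteger ai = new AtomicInteger();

    /**
     * Locking makes concurrent calls from multiple threads safe.
     * @return the value before the increment
     */
    public synchronized static int getAndIncrement1() {
        return i++;
    }

    /**
     * AtomicInteger is thread-safe.
     * @return the value before the increment
     */
    public static int getAndIncrement2() {
        return ai.getAndIncrement();
    }
}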
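To see the difference in practice, here is a minimal sketch that runs several threads against three counters: a plain int, a synchronized one, and an AtomicInteger. The class name CounterRaceDemo and its methods are made up for this illustration and are not part of the article's code.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, written only to illustrate the paragraph above.
class CounterRaceDemo {
    static int plain = 0;                                   // unsynchronized: updates may be lost
    static int locked = 0;                                  // only touched inside incrementLocked()
    static final AtomicInteger atomic = new AtomicInteger();

    static synchronized void incrementLocked() {
        locked++;
    }

    // Starts `threads` threads; each one increments all three counters `perThread` times.
    static void run(int threads, int perThread) {
        plain = 0;
        locked = 0;
        atomic.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    plain++;                                // read-modify-write, not atomic
                    incrementLocked();                      // protected by the class lock
                    atomic.getAndIncrement();               // protected by CAS
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        run(4, 100_000);
        System.out.println("plain  = " + plain + " (can be lower than 400000)");
        System.out.println("locked = " + locked);
        System.out.println("atomic = " + atomic.get());
    }
}
```

The locked and atomic counters always end at threads * perThread. The plain counter usually ends lower, although how much lower varies from run to run.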

2. What is CAS

CAS stands for compare and swap (also called compare and exchange). It lets multiple threads update a value safely without taking a lock.

CAS takes three operands, CAS(V, E, N): V is the variable to be updated, E is the expected value, and N is the new value. V is set to N only if V currently equals E. If V differs from E, another thread has already updated the variable, and the current thread changes nothing; it typically re-reads the value and tries again.
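The semantics of CAS(V, E, N) can be modeled in a few lines. This is only a sketch of the behavior: the class SimulatedCas is hypothetical, and it uses synchronized to stand in for the atomicity that a real CAS gets from the hardware.

```java
// Hypothetical class that models what CAS(V, E, N) does, not how the JDK implements it.
class SimulatedCas {
    private int value;                                      // V: the variable to be updated

    SimulatedCas(int initial) {
        this.value = initial;
    }

    synchronized int get() {
        return value;
    }

    // Sets V to newValue only if V currently equals expected; returns whether it did.
    synchronized boolean compareAndSwap(int expected, int newValue) {
        if (value != expected) {
            return false;                                   // V was changed by someone else: do nothing
        }
        value = newValue;
        return true;
    }

    public static void main(String[] args) {
        SimulatedCas v = new SimulatedCas(0);
        System.out.println(v.compareAndSwap(0, 1));         // true: V was 0, now 1
        System.out.println(v.compareAndSwap(0, 2));         // false: V is 1, not 0
        System.out.println(v.get());                        // 1
    }
}
```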

Implementing i++ with CAS takes the following steps:

1. Read the current value of i (say i = 0) and record it as E, the expected value.

2. Compute i + 1 = 1 and record it as N, the new value.

3. Read the value of i again; this is the current value of V.

4. If V == E, update i to N; otherwise do not update, and start over from step 1. Steps 3 and 4 are carried out together as one atomic operation.
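The steps above can be sketched with the public AtomicInteger.compareAndSet() method. The helper class CasIncrement is hypothetical; it only shows the read, compute, compare-and-set, retry cycle.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, written only to illustrate the steps above.
class CasIncrement {
    // Behaves like i++: returns the value before the increment.
    static int getAndIncrement(AtomicInteger i) {
        while (true) {
            int expected = i.get();                         // step 1: read the current value
            int newValue = expected + 1;                    // step 2: compute the incremented value
            // steps 3 and 4: re-check the value and update it, as one atomic operation
            if (i.compareAndSet(expected, newValue)) {
                return expected;
            }
            // another thread changed i in the meantime: loop and retry with a fresh value
        }
    }

    public static void main(String[] args) {
        AtomicInteger i = new AtomicInteger(0);
        System.out.println(getAndIncrement(i));             // 0
        System.out.println(i.get());                        // 1
    }
}
```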

CAS closely resembles optimistic locking. An optimistic lock assumes that conflicts between threads are rare, so it takes no lock up front; it only checks, at update time, whether the original data has changed since it was read.

In the fourth step above, a successful comparison does not prove that i never changed. Between the first thread's initial read and its comparison, another thread may change i and then change it back, and the first thread has no way to see that intermediate step. This is the ABA problem.
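The effect is easy to reproduce. The sketch below plays both "threads" in a single thread so that the interleaving is deterministic; the class AbaDemo is hypothetical and exists only for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo: the interleaving of two threads is simulated sequentially.
class AbaDemo {
    // Returns whether the CAS succeeds even though the value changed in between.
    static boolean casAfterAba() {
        AtomicInteger i = new AtomicInteger(0);
        int expected = i.get();                             // "thread 1" reads A (0)

        // "thread 2" runs here: A -> B -> A
        i.set(1);
        i.set(0);

        // "thread 1" resumes: the value matches what it read, so the CAS succeeds
        return i.compareAndSet(expected, expected + 1);
    }

    public static void main(String[] args) {
        System.out.println(casAfterAba());                  // true
    }
}
```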

How can the ABA problem be solved? The usual fix is simple: attach a version number to i and increment it on every change. Each update then compares the version number as well as the expected value E, so a value that was changed and changed back no longer matches. In practice, if ABA has no effect on the business logic, it can be ignored.
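In the JDK, one ready-made form of the version-number idea is java.util.concurrent.atomic.AtomicStampedReference, which pairs a reference with an int stamp. The sketch below is an assumption about how one might use it here, not something the article's code does; the class StampedAbaDemo is hypothetical, and the interleaving is again simulated in one thread.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical demo of a versioned CAS; the stamp plays the role of the version number.
class StampedAbaDemo {
    // A CAS that uses a stale stamp fails, even though the value is "A" again.
    static boolean staleCasSucceeds() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        int[] stampHolder = new int[1];
        String seenValue = ref.get(stampHolder);            // "thread 1" reads value A and stamp 0
        int seenStamp = stampHolder[0];

        // "thread 2" runs here: A -> B -> A, bumping the stamp on every change
        ref.set("B", ref.getStamp() + 1);
        ref.set("A", ref.getStamp() + 1);

        // value matches (string literals are the same reference), stamp 0 != 2
        return ref.compareAndSet(seenValue, "C", seenStamp, seenStamp + 1);
    }

    // A CAS that uses the current value and stamp succeeds.
    static boolean freshCasSucceeds() {
        AtomicStampedReference<String> ref = new AtomicStampedReference<>("A", 0);
        ref.set("B", 1);
        ref.set("A", 2);
        int[] stampHolder = new int[1];
        String seenValue = ref.get(stampHolder);
        return ref.compareAndSet(seenValue, "C", stampHolder[0], stampHolder[0] + 1);
    }

    public static void main(String[] args) {
        System.out.println(staleCasSucceeds());             // false
        System.out.println(freshCasSucceeds());             // true
    }
}
```

Note that AtomicStampedReference compares references with ==, which is why the sketch uses string literals rather than freshly created objects.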

3. The use of CAS in AtomicInteger

On x86, the HotSpot implementations of CAS, synchronized, and volatile all ultimately rely on the same mechanism: lock-prefixed CPU instructions. The AtomicInteger class mentioned above serves as the example here.

AtomicInteger is thread-safe and is usually described as lock-free, or as spinning: instead of blocking, it retries a CAS in a loop until it succeeds. This is how the JDK applies CAS.

The source code of the AtomicInteger.getAndIncrement() method:

    /**
     * Atomically increments by one the current value.
     *
     * @return the previous value
     */
    public final int getAndIncrement() {
        return unsafe.getAndAddInt(this, valueOffset, 1);
    }

Unsafe.getAndAddInt() source code:

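    public final int getAndAddInt(Object var1, long var2, int var4) {
        int var5;
        do {
            // volatile read of the int field at offset var2 inside object var1
            var5 = this.getIntVolatile(var1, var2);
            // retry if another thread changed the field between the read and the CAS
        } while(!this.compareAndSwapInt(var1, var2, var5, var5 + var4));

        return var5;
    }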
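Unsafe is an internal class, but the same "object plus field" style of CAS loop can be written with the public AtomicIntegerFieldUpdater API. The following is a minimal sketch under that assumption; the class FieldUpdaterCounter and its field are made up for the example, and this is not how AtomicInteger itself is implemented.

```java
import java.util.concurrent.atomic.AtomicIntegerFieldUpdater;

// Hypothetical class mirroring the shape of Unsafe.getAndAddInt with a public API.
class FieldUpdaterCounter {
    // The updater needs a non-static volatile int field that it is allowed to access.
    volatile int value;

    private static final AtomicIntegerFieldUpdater<FieldUpdaterCounter> VALUE =
            AtomicIntegerFieldUpdater.newUpdater(FieldUpdaterCounter.class, "value");

    // Volatile read, then CAS, retrying until the CAS succeeds; returns the old value.
    int getAndAdd(int delta) {
        int current;
        do {
            current = VALUE.get(this);
        } while (!VALUE.compareAndSet(this, current, current + delta));
        return current;
    }

    public static void main(String[] args) {
        FieldUpdaterCounter c = new FieldUpdaterCounter();
        System.out.println(c.getAndAdd(1));                 // 0
        System.out.println(c.getAndAdd(5));                 // 1
        System.out.println(c.value);                        // 6
    }
}
```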

Unsafe.compareAndSwapInt() source code:

public final native boolean compareAndSwapInt(Object var1, long var2, int var4, int var5);

This method is declared native, so its implementation is not visible in the JDK's Java sources. Java code runs on the JVM, and Oracle's JVM is HotSpot, so the implementation of a native method has to be looked up in the HotSpot source code, which is written mostly in C++. Unsafe.java corresponds to unsafe.cpp in the HotSpot source.

4. The underlying implementation of CAS

To understand how CAS is implemented, you have to read the HotSpot source code, which is available for each JDK version in the OpenJDK repositories. The analysis of the compareAndSwapInt method continues below, using unsafe.cpp from jdk8u as the example.

// Source: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/a0eb08e2db5a/src/share/vm/prims/unsafe.cpp

UNSAFE_ENTRY(jboolean, Unsafe_CompareAndSwapInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint e, jint x))
  UnsafeWrapper("Unsafe_CompareAndSwapInt");
  oop p = JNIHandles::resolve(obj);
  jint* addr = (jint *) index_oop_from_field_offset_long(p, offset);
  return (jint)(Atomic::cmpxchg(x, addr, e)) == e;
UNSAFE_END

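As shown, this calls Atomic::cmpxchg. Following that call leads to atomic.cpp. The jbyte overload listed here updates a single byte by wrapping the jint overload of cmpxchg (the line marked as the key call); the jint overload is the one that Unsafe_CompareAndSwapInt invokes directly:

// Source: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/a0eb08e2db5a/src/share/vm/runtime/atomic.cpp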

jbyte Atomic::cmpxchg(jbyte exchange_value, volatile jbyte* dest, jbyte compare_value) {
  assert(sizeof(jbyte) == 1, "assumption.");
  uintptr_t dest_addr = (uintptr_t)dest;
  uintptr_t offset = dest_addr % sizeof(jint);
  volatile jint* dest_int = (volatile jint*)(dest_addr - offset);
  jint cur = *dest_int;
  jbyte* cur_as_bytes = (jbyte*)(&cur);
  jint new_val = cur;
  jbyte* new_val_as_bytes = (jbyte*)(&new_val);
  new_val_as_bytes[offset] = exchange_value;
  while (cur_as_bytes[offset] == compare_value) {
    // key call: the jint overload of cmpxchg
    jint res = cmpxchg(new_val, dest_int, cur);
    if (res == cur) break;
    cur = res;
    new_val = cur;
    new_val_as_bytes[offset] = exchange_value;
  }
  return cur_as_bytes[offset];
}
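The trick in the jbyte overload, updating one byte by running a CAS on the whole 32-bit word that contains it, can be sketched in Java as well. This is only a rough analogue: the class ByteCasOnWord is hypothetical, bytes are addressed by significance (index 0 is the least significant byte) rather than by memory address, and because compareAndSet returns a boolean the loop re-reads the word after a failed attempt.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical analogue of the C++ jbyte cmpxchg above, built on a 32-bit CAS.
class ByteCasOnWord {
    // Tries to replace byte `index` of `word` with newByte while that byte equals
    // expectedByte. Returns the byte value observed last: expectedByte on success,
    // the conflicting value otherwise.
    static byte cmpxchgByte(AtomicInteger word, int index, byte expectedByte, byte newByte) {
        int shift = index * 8;
        int mask = 0xFF << shift;
        int cur = word.get();
        while ((byte) ((cur & mask) >>> shift) == expectedByte) {
            // same word with only the target byte replaced
            int newVal = (cur & ~mask) | ((newByte & 0xFF) << shift);
            if (word.compareAndSet(cur, newVal)) {
                break;                                      // the whole-word CAS succeeded
            }
            cur = word.get();                               // some byte of the word changed: re-read
        }
        return (byte) ((cur & mask) >>> shift);
    }

    public static void main(String[] args) {
        AtomicInteger word = new AtomicInteger(0x11223344);
        byte seen = cmpxchgByte(word, 1, (byte) 0x33, (byte) 0x7F);
        System.out.println(Integer.toHexString(seen));       // 33
        System.out.println(Integer.toHexString(word.get())); // 11227f44
    }
}
```

The retry loop is needed because the word-sized CAS can fail when a neighboring byte changes, even if the target byte still holds the expected value.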

Each combination of operating system and CPU architecture has its own implementation of the jint overload. atomic.inline.hpp selects the matching header:

// Source: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/a0eb08e2db5a/src/share/vm/runtime/atomic.inline.hpp

#ifndef SHARE_VM_RUNTIME_ATOMIC_INLINE_HPP
#define SHARE_VM_RUNTIME_ATOMIC_INLINE_HPP

#include "runtime/atomic.hpp"

// Linux
#ifdef TARGET_OS_ARCH_linux_x86
# include "atomic_linux_x86.inline.hpp"
#endif
#ifdef TARGET_OS_ARCH_linux_sparc
# include "atomic_linux_sparc.inline.hpp"
#endif
#ifdef TARGET_OS_ARCH_linux_zero
# include "atomic_linux_zero.inline.hpp"
#endif
#ifdef TARGET_OS_ARCH_linux_arm
# include "atomic_linux_arm.inline.hpp"
#endif
#ifdef TARGET_OS_ARCH_linux_ppc
# include "atomic_linux_ppc.inline.hpp"
#endif

// Solaris
#ifdef TARGET_OS_ARCH_solaris_x86
# include "atomic_solaris_x86.inline.hpp"
#endif
#ifdef TARGET_OS_ARCH_solaris_sparc
# include "atomic_solaris_sparc.inline.hpp"
#endif

// Windows
#ifdef TARGET_OS_ARCH_windows_x86
# include "atomic_windows_x86.inline.hpp"
#endif

// AIX
#ifdef TARGET_OS_ARCH_aix_ppc
# include "atomic_aix_ppc.inline.hpp"
#endif

// BSD
#ifdef TARGET_OS_ARCH_bsd_x86
# include "atomic_bsd_x86.inline.hpp"
#endif
#ifdef TARGET_OS_ARCH_bsd_zero
# include "atomic_bsd_zero.inline.hpp"
#endif

#endif // SHARE_VM_RUNTIME_ATOMIC_INLINE_HPP
    

The src/os_cpu/ directory holds the implementations for each operating system and CPU architecture. Within it, src/os_cpu/linux_x86/vm contains the code for x86 on Linux, including the final implementation of cmpxchg:

// Source: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/a0eb08e2db5a/src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp

inline jint     Atomic::cmpxchg    (jint     exchange_value, volatile jint*     dest, jint     compare_value) {
  int mp = os::is_MP();
  __asm__ volatile (LOCK_IF_MP(%4) "cmpxchgl %1,(%3)"
                    : "=a" (exchange_value)
                    : "r" (exchange_value), "a" (compare_value), "r" (dest), "r" (mp)
                    : "cc", "memory");
  return exchange_value;
}

The core is the statement __asm__ volatile (LOCK_IF_MP(%4) "cmpxchgl %1,(%3)" ...). __asm__ introduces inline assembly, that is, instructions written directly for the CPU. The instruction here is cmpxchgl, the 32-bit compare-and-exchange instruction on x86.

LOCK_IF_MP means "lock if multiprocessor" (MP stands for multi-processor). Whether the lock prefix is applied to the cmpxchg instruction is decided at run time from the value of os::is_MP(). On a multiprocessor machine the prefix is applied (lock cmpxchg). On a single-processor machine it is skipped, because a single processor already maintains ordering consistency within itself and does not need the memory barrier effect that the lock prefix provides. The macro, defined as follows, compares mp with 0 and, if they are equal, jumps over the lock prefix to the label 1.

// Source: http://hg.openjdk.java.net/jdk8u/jdk8u/hotspot/file/a0eb08e2db5a/src/os_cpu/linux_x86/vm/atomic_linux_x86.inline.hpp

// Adding a lock prefix to an instruction on MP machine
#define LOCK_IF_MP(mp) "cmp $0, " #mp "; je 1f; lock; 1: "

Summary

As the analysis above shows, CAS on x86 comes down to a single instruction:

lock cmpxchg

On a multiprocessor system, the cmpxchg instruction alone is not atomic with respect to other processors; the atomicity comes from the lock prefix in front of it.

[ Dissemination of knowledge, sharing of value ], thank you friends for your attention and support. I am [ Zhuge Xiaoyuan ], an Internet migrant worker struggling in hesitation.

Origin blog.csdn.net/wuxiaolongah/article/details/108591678