Reading Notes: Java Concurrency Principles and the Java Memory Model (JMM)


Source: Blog Garden (cnblogs) | Author: To Point

Note: This article is a reading note from the book "The Art of Java Concurrent Programming".

1 How Java's concurrency mechanisms are implemented

Application of volatile

  volatile is a lightweight form of synchronized that guarantees the "visibility" of shared variables in multiprocessor development. Visibility means that when one thread modifies a shared variable, another thread can read the modified value. Used appropriately, a volatile variable is cheaper to use and execute than synchronized because it does not cause thread context switching and scheduling.
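As a minimal sketch of this visibility guarantee (class and method names are illustrative, not from the book), a volatile boolean can act as a stop flag between two threads; without volatile, the worker thread might never observe the update:

```java
public class VolatileFlag {
    // volatile: a write by one thread is visible to subsequent reads by others
    private volatile boolean stopped = false;

    public void stop()         { stopped = true; }
    public boolean isStopped() { return stopped; }

    public static void main(String[] args) throws InterruptedException {
        VolatileFlag flag = new VolatileFlag();
        Thread worker = new Thread(() -> {
            while (!flag.isStopped()) {
                // busy-wait; the volatile read guarantees we eventually see stop()
            }
        });
        worker.start();
        flag.stop();       // volatile write: flushed to main memory
        worker.join(1000); // the worker observes the write and exits promptly
    }
}
```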


The realization principle and application of synchronized

  Synchronized bases its synchronization on the fact that every object in Java can be used as a lock. Concretely, this shows up in the following three cases:

 1. For ordinary synchronized methods, the lock is the current instance object.
 2. For static synchronized methods, the lock is the Class object of the current class.
 3. For synchronized blocks, the lock is the object named in the synchronized parentheses.
 When a thread attempts to access a synchronized block, it must first acquire the lock, and it must release the lock when it exits the block or throws an exception.
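The three cases above can be sketched in one class (names are illustrative):

```java
public class SyncForms {
    private int count = 0;

    // 1. Ordinary synchronized method: the lock is the current instance (this)
    public synchronized void increment() { count++; }

    // 2. Static synchronized method: the lock is the SyncForms.class object
    public static synchronized void classLevelWork() { /* holds SyncForms.class */ }

    // 3. Synchronized block: the lock is whatever object appears in the parentheses
    public void incrementViaBlock() {
        synchronized (this) { // same monitor as the synchronized instance method above
            count++;
        }
    }

    public synchronized int getCount() { return count; }
}
```

Note that forms 1 and 3 here lock the same monitor (this), so the two increment paths exclude each other, while the static form locks a different object (the Class) and does not.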

Biased lock

  When a thread accesses a synchronized block and acquires the lock, it stores the ID of the lock-biased thread in the lock record in the object header and in the stack frame. From then on, the thread needs no CAS operation to lock or unlock when entering and exiting the synchronized block; it simply tests whether the Mark Word in the object header stores a biased lock pointing to the current thread. If the test succeeds, the thread has acquired the lock. If it fails, the thread then tests whether the biased-lock flag in the Mark Word is set to 1 (indicating the lock is currently biased): if not, it uses CAS to compete for the lock; if so, it tries to use CAS to point the biased lock in the object header at the current thread. A biased lock is only released once contention occurs: when another thread tries to compete for the biased lock, the thread holding it releases the lock. Revoking a biased lock requires waiting for a global safepoint (a point at which no bytecode is executing).

Lightweight lock

  Before executing a synchronized block, the JVM first creates space for a lock record in the current thread's stack frame and copies the Mark Word from the object header into that record; this copy is officially called the Displaced Mark Word. The thread then uses CAS to try to replace the Mark Word in the object header with a pointer to the lock record. If it succeeds, the current thread acquires the lock; if it fails, other threads are competing for the lock, and the current thread spins trying to acquire it. On lightweight unlocking, an atomic CAS operation replaces the Displaced Mark Word back into the object header. If that succeeds, no contention occurred; if it fails, the lock is contended and inflates into a heavyweight lock. Because spinning consumes CPU, and to avoid useless spinning (for example, when the thread holding the lock is blocked), once a lock is upgraded to a heavyweight lock it never returns to the lightweight state. While a lock is in this state, any other thread trying to acquire it blocks; when the holder releases the lock, the blocked threads are awakened and start a new round of competition for the lock.

Advantages and disadvantages of locks


• Biased lock. Advantage: locking and unlocking need no extra cost; there is only a nanosecond-level gap compared to executing an unsynchronized method. Disadvantage: if threads contend for the lock, revoking the bias adds extra cost. Applicable scenario: only one thread ever accesses the synchronized block.
• Lightweight lock. Advantage: competing threads do not block, which improves response time. Disadvantage: a thread that never wins the lock wastes CPU on spinning. Applicable scenario: response time matters and the synchronized block executes very quickly.
• Heavyweight lock. Advantage: competing threads do not spin, so no CPU is consumed by spinning. Disadvantage: threads block and response time is slow. Applicable scenario: throughput matters and the synchronized block takes a long time to execute.

How atomic operations work

 

Atomic means "the smallest particle that cannot be further divided", while atomic operation means "an operation or series of operations that cannot be interrupted".  

Definition of Terms


• cache line (Cache line): the smallest unit on which the cache operates.
• compare and swap (Compare and Swap): a CAS operation takes two values, the old value (the value expected before the operation) and a new value. During the operation it first checks whether the old value has changed; if it has not, it swaps in the new value, otherwise it does not swap.
• CPU pipeline (CPU pipeline): a CPU pipeline works like an industrial assembly line. Inside the CPU, 5-6 circuit units with different functions form an instruction-processing pipeline, and each x86 instruction is split into 5-6 steps executed by these units, so one instruction can complete per CPU clock cycle, raising the CPU's computation speed.
• memory order violation (Memory order violation): memory order violations are generally caused by false sharing, where multiple CPUs modify different parts of the same cache line at the same time, invalidating one CPU's operation. When a memory order violation occurs, the CPU must flush its pipeline.

How processors implement atomic operations

  Processors provide two mechanisms, bus locking and cache locking, to guarantee the atomicity of complex memory operations.

(1) Using bus locks to guarantee atomicity: the first mechanism guarantees atomicity through a bus lock, which ensures that while CPU1 reads and modifies a shared variable, CPU2 cannot operate on the cache holding that variable's memory address. A bus lock uses the LOCK# signal provided by the processor: while one processor asserts this signal on the bus, requests from other processors are blocked, so that processor can use the shared memory exclusively.
(2) Using cache locks to guarantee atomicity: the second mechanism guarantees atomicity through cache locking. Often we only need the operation on one memory address to be atomic, but a bus lock locks the communication between the CPU and memory, so other processors cannot operate on any other memory address while the lock is held; bus locking is therefore expensive, and in some scenarios current processors optimize by using cache locking instead. A "cache lock" means that if the memory area is cached in the processor's cache line and is locked for the duration of the LOCK operation, then when the locked operation writes back to memory the processor does not assert the LOCK# signal on the bus; instead it modifies the memory address internally and relies on the cache coherence mechanism to guarantee atomicity, because cache coherence prevents two or more processors from simultaneously modifying data in the same cached memory area: when one processor writes back a locked cache line, the copies of that line in other processors' caches are invalidated.
However, there are two situations where the processor does not use cache locking: first, when the data being operated on cannot be cached inside the processor, or spans multiple cache lines, the processor falls back to a bus lock; second, when the processor simply does not support cache locking.

2 How Java implements atomic operations


(1) Using spin CAS to implement atomic operations: the CAS operations in the JVM are implemented with the CMPXCHG instruction provided by the processor. The basic idea of spin CAS is to retry the CAS operation in a loop until it succeeds. Since Java 1.5, the JDK's concurrency package has provided classes that support atomic operations, such as AtomicBoolean.
(2) The three major problems of implementing atomic operations with CAS:
  • The ABA problem. CAS checks whether the value has changed when operating on it and updates it only if it has not. But if a value was originally A, changed to B, and then changed back to A, the CAS check finds the value unchanged even though it actually changed. The solution is to append a version number to the variable and increment it on every update, so A-B-A becomes 1A-2B-3A. Since Java 1.5, the JDK's atomic package has provided the class AtomicStampedReference to solve the ABA problem. Its compareAndSet method first checks whether the current reference equals the expected reference and whether the current stamp equals the expected stamp; only if both match does it atomically set the reference and the stamp to the given update values.
  • Long spins are expensive. If a spin CAS keeps failing for a long time, it imposes a heavy execution cost on the CPU. If the JVM can use the pause instruction provided by the processor, efficiency improves somewhat. The pause instruction does two things: first, it delays pipelined instruction execution so the CPU does not consume too many execution resources (the delay depends on the implementation and is zero on some processors); second, it avoids the pipeline flush caused by a memory order violation when exiting the loop, improving CPU efficiency.
  • Atomicity is guaranteed for only one shared variable. A spin CAS can make an operation on a single shared variable atomic, but it cannot guarantee atomicity across multiple shared variables; locks can be used in that case. Another trick is to merge several shared variables into one. Since Java 1.5, the JDK has provided the AtomicReference class to guarantee atomicity between reference objects, so several variables can be placed in one object and operated on with CAS.
(3) Using the lock mechanism to implement atomic operations: the lock mechanism ensures that only the thread holding the lock can operate on the locked memory area. The JVM implements many locking mechanisms, including biased locks, lightweight locks, and mutex locks. Interestingly, apart from biased locks, every way the JVM implements locks uses spin CAS: a thread uses spin CAS to acquire the lock when entering a synchronized block and uses spin CAS to release it when exiting.
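The spin-CAS loop and the AtomicStampedReference fix for the ABA problem described above can be sketched as follows (the helper method name is illustrative; AtomicInteger and AtomicStampedReference are real JDK classes):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicStampedReference;

public class CasExamples {
    // Spin CAS: loop until compareAndSet succeeds
    static int spinIncrement(AtomicInteger value) {
        for (;;) {
            int current = value.get();
            int next = current + 1;
            if (value.compareAndSet(current, next)) {
                return next;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        spinIncrement(counter);
        System.out.println(counter.get());  // 1

        // ABA: the plain value returns to "A", but the version stamp has moved on
        AtomicStampedReference<String> ref =
                new AtomicStampedReference<>("A", 1);
        ref.compareAndSet("A", "B", 1, 2);  // A -> B, stamp 1 -> 2
        ref.compareAndSet("B", "A", 2, 3);  // B -> A, stamp 2 -> 3
        // A CAS expecting the old stamp now fails, exposing the A-B-A change
        boolean staleCas = ref.compareAndSet("A", "C", 1, 2);
        System.out.println(staleCas);       // false
    }
}
```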

3 Java memory model

Fundamentals of the Java Memory Model

  In concurrent programming, two key issues must be addressed: how threads communicate with each other and how threads synchronize with each other (threads here meaning concurrently executing active entities). Communication is the mechanism by which threads exchange information. In imperative programming, there are two communication mechanisms between threads: shared memory and message passing.
   

Synchronization is the mechanism a program uses to control the relative order in which operations occur across threads. In the shared-memory concurrency model, synchronization is explicit: the programmer must explicitly specify that a method or a piece of code is to execute mutually exclusively between threads. In the message-passing concurrency model, synchronization is implicit, because a message must be sent before it can be received.

Java concurrency uses a shared memory model.

JMM provides Java programmers with memory visibility by controlling the interaction between main memory and each thread's local memory.

Reordering from source code to instruction sequence

When executing a program, the compiler and processor often reorder instructions to improve performance. There are three types of reordering:
1) Compiler-optimization reordering. The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.
2) Instruction-level-parallelism reordering. Modern processors use instruction-level parallelism to overlap the execution of multiple instructions. If there are no data dependencies, the processor can change the order in which the machine instructions corresponding to statements execute.
3) Memory-system reordering. Because the processor uses caches and read/write buffers, loads and stores can appear to execute out of order.

Introduction to happens-before

  JSR-133 (the new memory model Java has used since JDK 1.5) uses the concept of happens-before to describe memory visibility between operations. In the JMM, if the result of one operation needs to be visible to another operation, there must be a happens-before relationship between the two. The two operations can be within a single thread or in different threads.
   

• Program order rule: every operation in a thread happens-before any subsequent operation in that thread.
 • Monitor lock rule: unlocking a lock happens-before every subsequent locking of that same lock.
 • volatile variable rule: a write to a volatile field happens-before any subsequent read of that volatile field.
 • Transitivity: if A happens-before B, and B happens-before C, then A happens-before C.

Two operations having a happens-before relationship does not mean the first must execute before the second; happens-before only requires that the first operation (its result) be visible to the second operation and that the first be ordered before the second.

Reordering

  Reordering is a means by which the compiler and processor rearrange the instruction sequence to optimize program performance.

as-if-serial semantics

  The as-if-serial semantics say that no matter how instructions are reordered, the execution result of a single-threaded program must not change. The compiler, the runtime, and the processor must all obey as-if-serial semantics.

Memory semantics of volatile

  • Visibility. A read of a volatile variable always sees the last write to that volatile variable by any thread.

  • Atomicity. Reads and writes of any single volatile variable are atomic, but compound operations such as volatile++ are not.
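A sketch of why volatile++ is unsafe: the increment is a read-modify-write, and volatile only makes the individual read and write atomic. The synchronized counter below (a minimal fix assumed for illustration, not from the book) always ends at exactly 2000, while the volatile counter may lose updates:

```java
public class VolatileNotAtomic {
    static volatile int unsafeCount = 0; // volatile alone does not make ++ atomic
    static int safeCount = 0;

    // Static synchronized: locks VolatileNotAtomic.class, making ++ mutually exclusive
    static synchronized void safeIncrement() { safeCount++; }

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            for (int i = 0; i < 1000; i++) {
                unsafeCount++;   // read-modify-write: updates can be lost under contention
                safeIncrement(); // no update is ever lost
            }
        };
        Thread t1 = new Thread(task), t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join();  t2.join();
        // safeCount is always 2000; unsafeCount may be less than 2000
        System.out.println("safe=" + safeCount + " unsafe=" + unsafeCount);
    }
}
```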


Memory semantics of a volatile write: when a volatile variable is written, the JMM flushes the shared-variable values in that thread's local memory to main memory.
Memory semantics of a volatile read: when a volatile variable is read, the JMM invalidates that thread's local memory; the thread then reads the shared variables from main memory.
To implement volatile's memory semantics, the compiler inserts memory barriers into the instruction sequence when generating bytecode, forbidding particular kinds of processor reordering.

Memory semantics of locks

Memory semantics of lock release and acquisition

  • When a thread releases a lock: the JMM flushes the shared variables in that thread's local memory to main memory.

  • When a thread acquires a lock: the JMM invalidates that thread's local memory, so the critical-section code protected by the monitor must read shared variables from main memory.


CAS has the memory semantics of both a volatile read and a volatile write.

Memory semantics of fair and unfair locks (taking ReentrantLock as an example):

  • When either a fair or an unfair lock is released, the volatile variable state is written last.

  • When a fair lock is acquired, the volatile variable is read first.

  • When an unfair lock is acquired, CAS is first used to update the volatile variable; this operation has the memory semantics of both a volatile read and a volatile write.


From the analysis of ReentrantLock, the memory semantics of releasing and acquiring a lock can be implemented in at least the following two ways:

  • Using the memory semantics of writes and reads of a volatile variable.

  • Using the volatile-read and volatile-write memory semantics that accompany CAS.


Memory semantics of final fields

  • A write to a final field inside the constructor and the subsequent assignment of the constructed object's reference to a reference variable cannot be reordered with each other.


  • The first read of a reference to an object containing a final field and the subsequent first read of that final field cannot be reordered with each other.


Why did JSR-133 strengthen the semantics of final?
One of the most serious flaws of the old Java memory model was that a thread could see the value of a final field change. For example, a thread might currently see an integer final field with value 0 (the default value before initialization) and, on reading the same final field again some time later, find that the value had become 1 (the value after some thread initialized it).
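The guarantee applies to classes shaped like the following sketch (names are illustrative; the failing race itself cannot be demonstrated deterministically here, this only shows the idiom): a final field written in the constructor is guaranteed visible, with its constructed value, to any thread that sees the object reference, with no extra synchronization, while a non-final field has no such guarantee under a data race.

```java
public class FinalFieldHolder {
    private final int x; // final: the constructor write cannot be reordered
                         // past publication of the object reference
    private int y;       // non-final: a racing reader may see the default 0

    public FinalFieldHolder(int x, int y) {
        this.x = x;
        this.y = y;
    }

    public int getX() { return x; }
    public int getY() { return y; }
}
```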

happens-before

Definition of happens-before:

  • If one operation happens-before another, then the result of the first operation is visible to the second, and the first operation is ordered before the second. (This is the JMM's promise to programmers.)

  • A happens-before relationship between two operations does not mean that a concrete Java platform implementation must execute them in that order. If the result of a reordered execution is identical to the result of executing according to the happens-before relation, the reordering is legal. (This is the JMM's constraint on compiler and processor reordering.)


  as-if-serial semantics guarantee that the execution result of a single-threaded program is not changed; the happens-before relation guarantees that the execution result of a correctly synchronized multithreaded program is not changed.

 as-if-serial semantics create an illusion for programmers writing single-threaded programs: a single-threaded program executes in program order. The happens-before relation creates an illusion for programmers writing correctly synchronized multithreaded programs: such programs execute in the order specified by happens-before.

happens-before rules:

  • Program order rules: every operation in a thread happens-before any subsequent operations in that thread.

  • Monitor lock rules: unlocking a lock happens-before subsequent locking of the lock.

  • volatile variable rule: a write to a volatile field happens-before any subsequent read of that volatile field.

  • Transitivity: if A happens-before B, and B happens-before C, then A happens-before C.

  • start() rule: If thread A performs the operation ThreadB.start(), then the ThreadB.start() operation of thread A happens-before any operation in thread B.

  • join() rule: If thread A executes the operation ThreadB.join() and returns successfully, then any operation in thread B happens-before thread A returns successfully from the ThreadB.join() operation.
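The start() and join() rules above can be sketched deterministically (class and field names are illustrative): the write before start() is guaranteed visible inside thread B, and B's writes are guaranteed visible to the caller after join() returns, even though no field is volatile.

```java
public class StartJoinVisibility {
    static int data = 0;     // deliberately non-volatile
    static int observed = -1;

    public static void main(String[] args) throws InterruptedException {
        data = 42;                  // happens-before b.start() (start rule)
        Thread b = new Thread(() -> {
            observed = data;        // guaranteed to see 42
            data = 99;              // happens-before join() returning (join rule)
        });
        b.start();
        b.join();                   // wait for b to terminate
        System.out.println(observed + " " + data);  // 42 99
    }
}
```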


Double Checked Locking and Lazy Initialization

public class DoubleCheckedLocking {
   private static Instance instance;

   public static Instance getInstance() {
       if (instance == null) {                         // first check, without the lock
           synchronized (DoubleCheckedLocking.class) {
               if (instance == null) {                 // second check, with the lock
                   instance = new Instance();          // the write that causes the problem below
               }
           }
       }
       return instance;
   }
}
 
  
  • Double-checked locking is a relatively common lazy-initialization scheme, but it still has a problem: the line instance = new Instance() actually breaks down into three steps:

 
  
memory = allocate();    // 1: allocate memory for the object
ctorInstance(memory);   // 2: initialize the object
instance = memory;      // 3: point instance at the allocated memory


Steps 2 and 3 in the pseudocode above may be reordered, in which case another thread may see an instance reference that has not yet been initialized. The problem can be solved in either of two ways:
1. Forbid the reordering of 2 and 3.
2. Allow 2 and 3 to be reordered, but do not allow other threads to "see" that reordering.

volatile based solution

public class DoubleCheckedLocking {
   private volatile static Instance instance;

   public static Instance getInstance() {
       if (instance == null) {
           synchronized (DoubleCheckedLocking.class) {
               if (instance == null) {
                   instance = new Instance();  // volatile forbids the 2-3 reordering
               }
           }
       }
       return instance;
   }
}
 
  
  • A solution based on class initialization

 
  
public class DoubleCheckedLocking {
   private static class InstanceHolder {
       public static Instance instance = new Instance();  // runs when InstanceHolder is first initialized
   }

   public static Instance getInstance() {
       return InstanceHolder.instance;  // triggers class initialization, which the JVM serializes
   }
}