Java Memory Model Study Notes (1): Basics

1. Classification of concurrent programming models

In concurrent programming, we need to address two key questions: how threads communicate with each other, and how threads synchronize with each other. Communication refers to the mechanism by which threads exchange information; synchronization refers to the mechanism a program uses to control the relative order of operations between threads.

In imperative programming, there are two communication mechanisms between threads: shared memory and message passing. In the shared-memory concurrency model, threads share the program's common state and communicate implicitly by reading and writing that shared state in memory. In the message-passing concurrency model, there is no shared state between threads; threads must communicate explicitly by sending messages to each other.

In the shared-memory concurrency model, synchronization is performed explicitly: the programmer must explicitly specify, in some way, which pieces of code require mutually exclusive execution between threads. In the message-passing concurrency model, synchronization is performed implicitly, because a message must be sent before it can be received.

Java concurrency uses the shared-memory model. Communication between Java threads is always carried out implicitly, and the whole communication process is completely transparent to the programmer. If the programmer does not understand how this implicit inter-thread communication works, they will run into all kinds of baffling memory-visibility problems.

2. JMM: the Java Memory Model

In Java, all instance fields, static fields, and array elements are stored in heap memory, which is the area shared between threads. Local variables, method parameters, and exception-handler parameters are not shared between threads; they have no memory-visibility problems and are not affected by the memory model.
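A minimal sketch of this split (class and variable names are ours, not from the original): the static field lives on the shared heap and is governed by the memory model, while the local variable inside the task is private to each thread and never has a visibility problem:

```java
public class SharedVsLocal {
    static int sharedCounter = 0; // heap: shared between threads, governed by the JMM

    public static void main(String[] args) throws InterruptedException {
        Runnable task = () -> {
            int local = 0;            // stack: each thread gets its own copy
            for (int i = 0; i < 1000; i++) {
                local++;              // always reaches 1000 per thread, no visibility issue
                sharedCounter++;      // unsynchronized shared write: result is unpredictable
            }
            System.out.println("local = " + local);
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // due to lost updates, sharedCounter can be anywhere up to 2000
        System.out.println("sharedCounter = " + sharedCounter);
    }
}
```

Each thread reliably prints local = 1000, while the shared counter is unreliable without synchronization.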

Communication between Java threads is controlled by the JMM (Java Memory Model): the JMM determines when a write to a shared variable by one thread becomes visible to another thread. From an abstract point of view, the JMM defines an abstract relationship between threads and main memory (important!): shared variables are stored in main memory, and each thread has a private local memory that holds a copy of the shared variables that thread reads or writes. Note: the JMM's local memory is an abstract concept; it does not physically exist. An abstract schematic of the Java memory model is as follows:

As the diagram shows, for thread A and thread B to communicate, the following two steps must occur:

  1. First, thread A flushes the updated shared variables from its local memory to main memory;

  2. Then, thread B reads the shared variables from main memory (and thereby sees thread A's updates).

The JMM makes two stipulations:

  • A thread must perform all operations on shared variables in its own working memory; it cannot read or write main memory directly;

  • Threads cannot directly access each other's working memory; passing variable values between threads must go through main memory.

A schematic diagram explains this:

As shown in the figure, thread A and thread B each copy the shared variable x from main memory into their own working memory. Suppose the initial value is x = 0. Thread A changes the value of x to 1 in its own local memory, and then flushes the modified x to main memory. Thread B then reads the value that thread A modified from main memory; at this point, the value of x in thread B's local memory has also become 1. In this way, thread A and thread B complete one round of communication.

Viewed as a whole, these two steps are essentially thread A sending a message to thread B, and the process must go through main memory. By controlling the interaction between main memory and each thread's local memory, the JMM provides memory-visibility guarantees (the ability of other threads to see a thread's modification of a shared variable).

Therefore, to achieve visibility of shared variables, two things must be guaranteed:

  • After a thread changes the value of a shared variable, the new value is promptly flushed from working memory to main memory;

  • Other threads promptly update their working memory with the latest value of the shared variable from main memory.

At the Java language level, the mechanisms that implement this visibility guarantee are synchronized and volatile.
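A minimal sketch of the synchronized path (class and field names are ours): releasing a monitor lock flushes the thread's writes to main memory, and acquiring the same lock forces a fresh read from main memory, which is what lets the worker thread eventually see the update:

```java
public class SynchronizedVisibility {
    private static boolean stop = false;
    private static final Object LOCK = new Object();

    static boolean readStop() {
        synchronized (LOCK) {   // acquiring the lock forces a re-read from main memory
            return stop;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!readStop()) {
                Thread.onSpinWait();
            }
            System.out.println("worker saw stop = true");
        });
        worker.start();
        Thread.sleep(100);
        synchronized (LOCK) {   // releasing the lock flushes the write to main memory
            stop = true;
        }
        worker.join();
    }
}
```

Without the synchronized blocks (or volatile on the flag), the worker's loop is not guaranteed to ever observe the update.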

3. Instruction reordering

When executing a program, compilers and processors often reorder instructions for performance. Reordering falls into three types:

  • Compiler reordering: the compiler may rearrange the execution order of statements, provided single-threaded semantics are unchanged.

  • Instruction-level parallelism reordering: modern processors use instruction-level parallelism to overlap the execution of multiple instructions (covered in computer-organization courses). If there is no data dependency, the processor may change the execution order of the machine instructions corresponding to statements.

  • Memory-system reordering: because processors use caches and read/write buffers, load and store operations can appear to execute out of order.

From Java source code to the instruction sequence that is finally executed, the code passes through these three kinds of reordering in turn:

These reorderings can cause memory-visibility problems in multithreaded programs. For the compiler, the JMM's compiler reordering rules prohibit specific types of compiler reordering (not all compiler reorderings are prohibited). For the processor, the JMM's processor reordering rules require the Java compiler, when generating instruction sequences, to insert specific types of memory barrier instructions (memory barriers; Intel calls them memory fences), which prohibit specific types of processor reordering (again, not all processor reorderings are prohibited).

The JMM is a language-level memory model. By prohibiting specific types of compiler reordering and processor reordering across different compilers and processor platforms, it provides programs with consistent memory-visibility guarantees.
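A small Java sketch of the boundary these rules draw (variable names are ours): statements with no data dependency may be reordered, while data-dependent statements may not, so single-threaded results never change (the as-if-serial guarantee):

```java
public class ReorderingScope {
    public static void main(String[] args) {
        int a, b;

        // No data dependency: the compiler/processor may execute these
        // two writes in either order; the single-threaded result is identical.
        a = 1;   // step 1
        b = 2;   // step 2 (may legally run before step 1)

        // Data dependency: b reads a, so this pair can NOT be reordered;
        // reordering would change the single-threaded result.
        a = 10;
        b = a + 1;       // must observe a = 10, so b = 11

        System.out.println(a + " " + b);  // prints "10 11"
    }
}
```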

4. Processor reordering and memory barrier instructions

Modern processors use write buffers to temporarily hold data being written to memory. A write buffer keeps the instruction pipeline running: it lets the processor avoid stalling to wait for writes to memory to complete. In addition, flushing the write buffer in batches, and merging multiple writes to the same memory address within the buffer, reduces traffic on the memory bus. Despite these benefits, each processor's write buffer is visible only to that processor. This property has an important consequence for the ordering of memory operations: the order in which a processor executes read/write memory operations is not necessarily the order in which those reads/writes actually take effect in memory! For a concrete illustration, consider the following example:

Processor A        Processor B
a = 1;  // A1      b = 2;  // B1
x = b;  // A2      y = a;  // B2

Initial state: a = b = 0. An execution allowed by the processors can produce the result x = y = 0.

Here, processors A and B can simultaneously write the shared variables into their own write buffers (A1, B1), then read the other shared variable from memory (A2, B2), and only afterwards flush the dirty data held in their write buffers to memory (A3, B3). When executed in this order, the program can produce the result x = y = 0.

From the perspective of when memory operations actually take effect, the write A1 is not really performed until processor A executes A3 to flush its write buffer. Although processor A executes its memory operations in the order A1 -> A2, the order in which they actually take effect in memory is A2 -> A1. At this point, processor A's memory operations have been reordered (the situation for processor B is the same).

The key point is that because a write buffer is visible only to its own processor, it can cause the order in which the processor executes memory operations to differ from the order in which those operations actually take effect in memory. Since modern processors all use write buffers, they all allow write-read (store-load) reordering.
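The two-processor example above can be reproduced as a Java stress test (class and field names are ours). For plain int fields, the JMM permits all four outcomes, including x = y = 0; the sketch only verifies that every observed outcome is one of the four legal ones, and whenever 0,0 shows up it is direct evidence of the write-read reordering just described:

```java
import java.util.HashSet;
import java.util.Set;

public class StoreLoadReordering {
    static int a, b, x, y;

    public static void main(String[] args) throws InterruptedException {
        Set<String> observed = new HashSet<>();
        for (int i = 0; i < 2000; i++) {
            a = 0; b = 0; x = 0; y = 0;
            Thread t1 = new Thread(() -> { a = 1; x = b; }); // A1; A2
            Thread t2 = new Thread(() -> { b = 2; y = a; }); // B1; B2
            t1.start(); t2.start();
            t1.join(); t2.join();                // join gives us a consistent view
            observed.add(x + "," + y);
        }
        // "0,0" (if it appears) means both reads ran before either write
        // became visible -- the writes sat in the write buffers.
        Set<String> legal = Set.of("0,0", "0,1", "2,0", "2,1");
        System.out.println(legal.containsAll(observed)
                ? "all outcomes legal" : "illegal outcome: " + observed);
    }
}
```

Which outcomes actually appear depends on the hardware and JIT; only the legality check is deterministic.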

To guarantee memory visibility, the Java compiler inserts memory barrier instructions at appropriate points in the generated instruction sequence to prohibit specific types of processor reordering. The JMM classifies memory barrier instructions into the following four categories:

  • LoadLoad barriers (Load1; LoadLoad; Load2): ensures that Load1's data is loaded before Load2 and all subsequent load instructions.

  • StoreStore barriers (Store1; StoreStore; Store2): ensures that Store1's data becomes visible to other processors (is flushed to memory) before Store2 and all subsequent store instructions.

  • LoadStore barriers (Load1; LoadStore; Store2): ensures that Load1's data is loaded before Store2 and all subsequent store instructions are flushed to memory.

  • StoreLoad barriers (Store1; StoreLoad; Load2): ensures that Store1's data becomes visible to other processors (is flushed to memory) before Load2 and all subsequent load instructions. A StoreLoad barrier requires all memory-access instructions (stores and loads) before it to complete before any memory-access instruction after it executes.

The StoreLoad barrier is an "all-purpose" barrier: it combines the effects of the other three. Most modern multiprocessors support it (the other barrier types are not necessarily supported by every processor). Executing this barrier is expensive, because the processor usually has to flush all the data in its write buffer to memory (a full buffer flush).

Summary: the Java compiler inserts memory barrier instructions at appropriate points in the generated instruction sequence to prohibit specific types of processor reordering, so that the program executes in the way we expect:

  • It guarantees the execution order of specific operations;

  • It affects the memory visibility of certain data (or of the result of a certain instruction).
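Since Java 9, these barrier categories are exposed at the language level as static fence methods on java.lang.invoke.VarHandle. The mapping in the comments is our reading of the javadoc; the demo runs in a single thread purely to show the API:

```java
import java.lang.invoke.VarHandle;

public class FenceDemo {
    static int data;
    static boolean published;

    public static void main(String[] args) {
        data = 42;
        // storeStoreFence: stores before the fence are not reordered
        // with stores after it (a StoreStore barrier)
        VarHandle.storeStoreFence();
        published = true;

        // fullFence: the "all-purpose" StoreLoad-style barrier; no load or
        // store on either side may be reordered across it
        VarHandle.fullFence();

        boolean seen = published;
        // loadLoadFence: loads before the fence are not reordered
        // with loads after it (a LoadLoad barrier)
        VarHandle.loadLoadFence();
        int value = data;

        System.out.println(seen + " " + value);  // prints "true 42"
    }
}
```

In everyday code these fences are rarely used directly; volatile and synchronized cause the JIT to emit the needed barriers for you.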

5. Happens-before rules

The preceding sections described reordering: sometimes compiler reordering, sometimes processor reordering. If programmers had to understand these low-level implementations and their detailed rules, the burden would be far too heavy and would seriously hurt the productivity of concurrent programming.

Therefore, the JMM provides programmers with higher-level happens-before rules, so that we can reason about cross-thread memory visibility from these rules alone, without understanding the underlying reordering rules. Programmers do not actually care whether two instructions are really reordered; what they care about is that the semantics of the program as executed are not changed (that is, the execution result must not change).

Starting with JDK 5, Java uses the concept of happens-before to describe memory visibility between operations. In the JMM, if the result of one operation needs to be visible to another operation, there must be a happens-before relationship between the two operations. The two operations may be within a single thread or in different threads.

A happens-before relationship between two operations does not mean the first operation must execute before the second! Happens-before only requires that the result of the first operation be visible to the second, and that the first operation be ordered before the second ("the first is visible to and ordered before the second").

The happens-before rules most relevant to programmers are:

  • Program order rule: each operation in a thread happens-before any subsequent operation in that thread;

  • Monitor lock rule: unlocking a monitor lock happens-before every subsequent lock of that same monitor;

  • volatile variable rule: a write to a volatile field happens-before every subsequent read of that field;

  • Transitivity: if A happens-before B, and B happens-before C, then A happens-before C;

  • Thread start() rule: if thread A starts thread B, then thread B can see all operations A performed before calling start(); that is, the start() call happens-before every operation in thread B;

  • Thread join() rule: if thread A waits for thread B to complete, then once B finishes, A can see all of B's operations; that is, every operation in thread B happens-before the return of join().
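The program-order rule, the volatile rule, and transitivity can be chained to publish a plain field safely; a minimal sketch (class and field names are ours):

```java
public class HappensBeforeChain {
    static int data = 0;                 // plain (non-volatile) field
    static volatile boolean flag = false;

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            data = 42;      // (1) program order: (1) happens-before (2)
            flag = true;    // (2) volatile write
        });
        Thread reader = new Thread(() -> {
            while (!flag) { // (3) volatile read: (2) happens-before (3)
                Thread.onSpinWait();
            }
            // transitivity: (1) hb (2) hb (3) hb (4), so data is guaranteed to be 42
            System.out.println("data = " + data);  // (4)
        });
        writer.start();
        reader.start();
        writer.join();
        reader.join();
    }
}
```

Note that data itself is not volatile; the volatile flag alone carries the visibility guarantee for everything written before it.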

References:

[1] 程晓明. 深入理解Java内存模型

[2] 周志明. 深入理解Java虚拟机

[3] 程晓明, 方腾飞, 魏鹏. Java并发编程的艺术


Origin www.cnblogs.com/simon-1024/p/12082269.html