In-Depth Understanding of the Java Memory Model (1): Basics

Classification of concurrent programming models

In concurrent programming, we need to deal with two key issues: how threads communicate with each other and how threads synchronize with each other (threads here refer to active entities that execute concurrently). Communication refers to the mechanism threads use to exchange information. In imperative programming, there are two communication mechanisms between threads: shared memory and message passing.

In the shared-memory concurrency model, threads share the common state of the program and communicate implicitly by writing to and reading from that common state in memory. In the message-passing concurrency model, there is no shared state between threads, so threads must communicate explicitly by sending messages.

Synchronization refers to the mechanism a program uses to control the relative order of operations between different threads. In the shared-memory concurrency model, synchronization is done explicitly: the programmer must explicitly specify that a certain method or piece of code needs to execute mutually exclusively between threads. In the message-passing concurrency model, synchronization is done implicitly, because a message must be sent before it can be received.

Java's concurrency uses the shared-memory model. Communication between Java threads is always implicit, and the entire communication process is completely transparent to the programmer. If a Java programmer writing a multi-threaded program does not understand the working mechanism of this implicit inter-thread communication, they are likely to run into all sorts of strange memory visibility problems.

 

Abstraction of the Java memory model

In Java, all instance fields, static fields, and array elements are stored in heap memory, and heap memory is shared between threads (this article uses the term "shared variable" to refer collectively to instance fields, static fields, and array elements). Local variables, method parameters (called formal method parameters in the Java Language Specification), and exception handler parameters are not shared between threads; they have no memory visibility problems and are not affected by the memory model.
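The distinction above can be sketched in a small program. This is a minimal illustration, not from the original article; the class and field names are mine, and the `join()` call is used only so the main thread reliably sees the worker thread's write (thread termination establishes visibility).

```java
public class SharedVsLocal {
    static int staticField = 0;              // shared: lives on the heap
    static int[] arrayElements = new int[4]; // shared: array elements live on the heap
    int instanceField = 0;                   // shared: lives on the heap with the object

    static int increment(int param) { // param is a formal parameter: thread-private
        int local = param + 1;        // local variable: thread-private, never shared
        return local;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t = new Thread(() -> staticField = 42); // writes a shared variable
        t.start();
        t.join(); // joining the writer guarantees its write is visible here
        System.out.println("staticField=" + staticField);
        System.out.println("increment(1)=" + increment(1));
    }
}
```

Local variables such as `param` and `local` live on each thread's stack, which is why they can never suffer visibility problems.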

Communication between Java threads is controlled by the Java memory model (referred to as JMM in this article), and the JMM determines when one thread's write to a shared variable becomes visible to another thread. From an abstract point of view, the JMM defines the relationship between threads and main memory: shared variables are stored in main memory, and each thread has a private local memory that holds a copy of the shared variables the thread reads and writes. Local memory is an abstraction of the JMM and does not really exist; it covers caches, write buffers, registers, and other hardware and compiler optimizations. The abstract diagram of the Java memory model is as follows:

As the figure above shows, for thread A and thread B to communicate, the following two steps must occur:

  1. First, thread A flushes the updated shared variables in local memory A to main memory.
  2. Then, thread B goes to the main memory to read the shared variable that thread A has updated before.

The following diagram illustrates these two steps:

As shown in the figure above, local memory A and local memory B each hold a copy of the shared variable x in main memory. Assume that at the beginning, the value of x in all three memories is 0. When thread A executes, it temporarily stores the updated value of x (assumed to be 1) in its local memory A. When thread A and thread B need to communicate, thread A first flushes the modified value of x from its local memory to main memory, at which point the value of x in main memory becomes 1. Subsequently, thread B reads thread A's updated value of x from main memory, and the value of x in thread B's local memory also becomes 1.

On the whole, these two steps are essentially thread A sending a message to thread B, and this communication process must go through main memory. The JMM provides memory visibility guarantees for Java programmers by controlling the interaction between main memory and each thread's local memory.
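The two steps above (flush to main memory, then read from main memory) can be sketched with a `volatile` variable, since a volatile write is guaranteed to be flushed to main memory and a volatile read is guaranteed to go to main memory. This is a minimal sketch of my own, not code from the article; `join()` is used only to make the demonstration deterministic.

```java
public class TwoStepCommunication {
    // volatile: a write by thread A is flushed to main memory,
    // and a later read by thread B reads from main memory
    static volatile int x = 0;

    public static void main(String[] args) throws InterruptedException {
        Thread a = new Thread(() -> x = 1); // step 1: A updates x and flushes it
        a.start();
        a.join();
        int seenByB = x;                    // step 2: B (here, the main thread) reads the updated x
        System.out.println("seenByB=" + seenByB);
    }
}
```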

 

Reordering

To improve performance, compilers and processors often reorder instructions when a program executes. There are three types of reordering:

  1. Compiler-optimization reordering. The compiler can rearrange the execution order of statements without changing the semantics of a single-threaded program.
  2. Instruction-level parallel reordering. Modern processors use instruction-level parallelism (Instruction-Level Parallelism, ILP) to overlap the execution of multiple instructions. If there is no data dependence, the processor can change the execution order of the machine instructions that statements correspond to.
  3. Memory-system reordering. Because the processor uses caches and read/write buffers, load and store operations can appear to be performed out of order.
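The data-dependence constraint mentioned above can be seen in a tiny example (my own sketch, not from the article): statements that read what an earlier statement wrote cannot be swapped, while independent statements may be.

```java
public class DataDependence {
    public static void main(String[] args) {
        int a = 1;     // write a
        int b = a + 1; // read a, write b: depends on the write above, so neither
                       // the compiler nor the processor may reorder this pair
        int x = 1;     // write x \  no data dependence between these two writes,
        int y = 2;     // write y /  so they may legally be reordered with each other
        // Single-threaded semantics are preserved either way:
        System.out.println(a + b + x + y);
    }
}
```

Whatever reordering happens under the hood, a single-threaded program can never observe it; only other threads can.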

From Java source code to the instruction sequence that finally executes, a program goes through these three kinds of reordering in turn:

Type 1 above is compiler reordering; types 2 and 3 are processor reordering. These reorderings may cause memory visibility problems in multi-threaded programs. For compilers, the JMM's compiler reordering rules prohibit certain types of compiler reordering (not all compiler reordering must be prohibited). For processors, the JMM's processor reordering rules require the Java compiler to insert specific types of memory barrier instructions (called memory fences by Intel) when generating instruction sequences, using these barrier instructions to prohibit specific types of processor reordering (not all processor reordering must be prohibited).

The JMM is a language-level memory model. By prohibiting specific types of compiler reordering and processor reordering, it provides programmers with consistent memory visibility guarantees across different compilers and different processor platforms.

 

Processor reordering and memory barrier instructions

Modern processors use write buffers to temporarily hold data being written to memory. The write buffer keeps the instruction pipeline running and avoids the delay caused by the processor stalling while waiting for a write to memory to complete. At the same time, by flushing the write buffer in batches and merging multiple writes to the same memory address, it reduces occupation of the memory bus. Despite these benefits, the write buffer on each processor is visible only to the processor it belongs to. This property has an important effect on the execution order of memory operations: the order in which a processor performs its read/write operations is not necessarily the same as the order in which those operations actually take effect in memory! To illustrate, consider the following example:

Processor A          Processor B
a = 1;  // A1        b = 2;  // B1
x = b;  // A2        y = a;  // B2

Initial state: a = b = 0
Possible result after both processors finish: x = y = 0

Assuming that processor A and processor B perform their memory accesses in program order and in parallel, they may still end up with the result x = y = 0. The reason is shown in the figure below:

Processor A and processor B can each write a shared variable into their own write buffer at the same time (A1, B1), then read the other shared variable from memory (A2, B2), and only afterwards flush the dirty data saved in their own write buffers to memory (A3, B3). When executed in this sequence, the program can produce the result x = y = 0.

Judging by the order in which the memory operations actually take effect, write operation A1 is not really performed until processor A executes A3 to flush its own write buffer. Although processor A performs its memory operations in the order A1 -> A2, the order in which they actually take effect in memory is A2 -> A1. At this point, processor A's memory operation sequence has been reordered (processor B's situation is the same as processor A's, so it is not repeated here).

The key here is that because the write buffer is visible only to its own processor, the order in which a processor performs memory operations may be inconsistent with the order in which those operations actually take effect in memory. Since modern processors all use write buffers, modern processors all allow write-read (store-load) reordering.
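The processor A/B example above can be written as a Java stress test. This is my own sketch, not code from the article: it runs the two threads repeatedly and records whether the x = y = 0 outcome ever appears. On hardware that allows store-load reordering (e.g. x86) with plain non-volatile fields, it usually does, but a bounded run cannot guarantee it, so the program only reports what it observed.

```java
import java.util.concurrent.CyclicBarrier;

public class StoreLoadDemo {
    static int a, b, x, y;

    public static void main(String[] args) throws InterruptedException {
        boolean sawZeroZero = false;
        for (int i = 0; i < 10_000 && !sawZeroZero; i++) {
            a = 0; b = 0; x = 0; y = 0;
            CyclicBarrier gate = new CyclicBarrier(2); // start both threads together
            Thread tA = new Thread(() -> { await(gate); a = 1; x = b; }); // A1, A2
            Thread tB = new Thread(() -> { await(gate); b = 2; y = a; }); // B1, B2
            tA.start(); tB.start();
            tA.join(); tB.join();
            // x == 0 && y == 0 means both stores were still sitting in write buffers
            // when the opposite thread performed its load
            if (x == 0 && y == 0) sawZeroZero = true;
        }
        System.out.println("x=y=0 observed: " + sawZeroZero);
    }

    static void await(CyclicBarrier g) {
        try { g.await(); } catch (Exception e) { throw new RuntimeException(e); }
    }
}
```

Declaring `a` and `b` as `volatile` would force the JVM to emit the necessary barriers and make the x = y = 0 outcome impossible.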

The following is a list of the types of reordering allowed by common processors:

             Load-Load   Load-Store   Store-Store   Store-Load   Data dependence
sparc-TSO    N           N            N             Y            N
x86          N           N            N             Y            N
ia64         Y           Y            Y             Y            N
PowerPC      Y           Y            Y             Y            N

The "N" in the cell of the table above indicates that the processor does not allow reordering of the two operations, and "Y" indicates that reordering is allowed.

From the table above, we can see that common processors allow Store-Load reordering, and that common processors do not allow reordering of operations that have data dependencies. sparc-TSO and x86 have relatively strong processor memory models; they only allow write-read (Store-Load) reordering (because they both use write buffers).

※Note 1: sparc-TSO refers to the characteristics of the sparc processor when running under the TSO (Total Store Order) memory model.

※Note 2: The x86 in the above table includes x64 and AMD64.

※Note 3: Since the memory model of the ARM processor is very similar to the memory model of the PowerPC processor, this article will ignore it.

※Note 4: Data dependency will be specifically explained later.

To ensure memory visibility, the Java compiler inserts memory barrier instructions at appropriate positions in the generated instruction sequence to prohibit specific types of processor reordering. The JMM divides memory barrier instructions into the following four categories:

Barrier type         Instruction example          Description
LoadLoad Barriers    Load1; LoadLoad; Load2       Ensures that Load1's data is loaded before Load2 and all subsequent load instructions.
StoreStore Barriers  Store1; StoreStore; Store2   Ensures that Store1's data is visible to other processors (flushed to memory) before Store2 and all subsequent store instructions.
LoadStore Barriers   Load1; LoadStore; Store2     Ensures that Load1's data is loaded before Store2 and all subsequent store instructions are flushed to memory.
StoreLoad Barriers   Store1; StoreLoad; Load2     Ensures that Store1's data becomes visible to other processors (flushed to memory) before Load2 and all subsequent load instructions are loaded. A StoreLoad barrier forces all memory access instructions (stores and loads) before the barrier to complete before any memory access instruction after the barrier executes.

StoreLoad Barriers is an "all-round" barrier that has the effects of the other three barriers at the same time. Most modern multiprocessors support this barrier (the other types may not be supported by every processor). Executing this barrier is expensive, because current processors usually have to flush all the data in the write buffer to memory (a full buffer flush).
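Since Java 9, the four barrier categories above are roughly exposed at the language level through the static fence methods on `java.lang.invoke.VarHandle`. The sketch below is my own single-threaded illustration of where each fence would sit; in a real program the writer and reader would of course be different threads, and the JVM normally inserts these barriers for you around `volatile` accesses.

```java
import java.lang.invoke.VarHandle;

public class FenceSketch {
    static int data;
    static int flag;

    public static void main(String[] args) {
        data = 42;                   // Store1
        VarHandle.storeStoreFence(); // StoreStore barrier: data's store becomes
                                     // visible before flag's store
        flag = 1;                    // Store2

        VarHandle.fullFence();       // the "all-round" StoreLoad-style barrier

        int f = flag;                // Load1
        VarHandle.loadLoadFence();   // LoadLoad barrier: flag is loaded before data
        int d = data;                // Load2
        System.out.println(f + "," + d);
    }
}
```

Application code rarely calls these fences directly; `volatile`, `synchronized`, and the `java.util.concurrent` classes cover almost every need.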

 

happens-before

Starting with JDK 5, Java uses the new JSR-133 memory model (unless otherwise specified, this article refers to the JSR-133 memory model). JSR-133 introduces the concept of happens-before and uses it to describe memory visibility between operations. If the result of one operation needs to be visible to another operation, there must be a happens-before relationship between the two operations. The two operations can be within one thread or in different threads. The happens-before rules most relevant to programmers are as follows:

  • Program order rule: each operation in a thread happens-before any subsequent operation in that thread.
  • Monitor lock rule: an unlock of a monitor lock happens-before every subsequent lock of that same monitor lock.
  • Volatile variable rule: a write to a volatile field happens-before any subsequent read of that volatile field.
  • Transitivity: if A happens-before B, and B happens-before C, then A happens-before C.
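The volatile rule and transitivity combine in the classic flag-publication idiom. This sketch is my own, not from the article: the write to `data` happens-before the volatile write to `ready` (program order), which happens-before the volatile read that observes `true` (volatile rule), so by transitivity the reader is guaranteed to see `data = 1`.

```java
public class HappensBeforeDemo {
    static int data = 0;                   // plain shared field
    static volatile boolean ready = false; // volatile guard

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            data = 1;     // (1) plain write
            ready = true; // (2) volatile write: happens-before any later
                          //     volatile read of ready that sees true
        });
        Thread reader = new Thread(() -> {
            while (!ready) { /* spin until the volatile read observes true */ }
            // program order + volatile rule + transitivity make (1) visible here
            System.out.println("data=" + data);
        });
        writer.start(); reader.start();
        writer.join(); reader.join();
    }
}
```

Remove `volatile` from `ready` and this guarantee disappears: the reader might spin forever or print a stale `data=0`.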

Note that a happens-before relationship between two operations does not mean the first operation must execute before the second! Happens-before only requires that the first operation (more precisely, its result) be visible to the second, and that the first be ordered before the second. The definition of happens-before is subtle; later articles will explain why it is defined this way.

The relationship between happens-before and JMM is shown in the following figure:

As shown in the figure above, one happens-before rule usually corresponds to multiple compiler reordering rules and processor reordering rules. For Java programmers, the happens-before rules are simple and easy to understand; they spare programmers from having to learn complex reordering rules and their concrete implementations in order to understand the memory visibility guarantees the JMM provides.

 

Next article: In-Depth Understanding of the Java Memory Model (2): Reordering

 

Thanks to the author for his contribution to this article

Cheng Xiaoming, Java software engineer, nationally certified system analyst and information project manager. Focus on concurrent programming and work at Fujitsu Nanda. Personal email: [email protected].
---------------------
Author: World coding
Source: CSDN
Original: https://blog.csdn.net/dgxin_605/article/details/86181002
Copyright: This article is the original article of the blogger, please attach a link to the blog post if you reprint it!
