JMM: Understanding happens-before

To understand happens-before, we first need to briefly explain a few basic concepts.

Cache

As CPUs developed rapidly, the gap between CPU speed and memory read/write speed kept growing. If the CPU still read and wrote memory directly, its processing speed would be limited by memory speed. The cache was introduced to bridge this gap and keep the CPU running fast.

A cache is characterized by fast read/write speed, small capacity, and correspondingly high cost.

As CPUs kept getting faster, the demands on cache read/write speed also rose, so better and faster caches were developed to keep up, and they are accordingly more expensive.

Caches are organized by speed and proximity to the CPU into L1 (level-1 cache), L2 (level-2 cache), and L3 (level-3 cache), in descending order of speed. A modern computer has at least an L1 cache.

Java Memory Model

Communication between Java threads is controlled by the Java Memory Model (JMM). The JMM defines that variables shared among threads are stored in main memory, while data private to each thread is stored in that thread's local memory, which also holds copies of the shared variables from main memory. (Local memory is an abstraction, not a real entity; it covers caches, write buffers, registers, and so on.) The abstract model is shown in the figure below.

What is happens-before

The happens-before concept was originally proposed by Leslie Lamport in his far-reaching paper "Time, Clocks, and the Ordering of Events in a Distributed System". There it described a partial order over events in a distributed system.

Starting with JDK 5, the Java memory model adopted JSR-133, which uses the happens-before concept to provide memory-visibility guarantees for both single-threaded and multi-threaded programs.

For programmers, happens-before provides memory-visibility guarantees across threads.

The happens-before rules

  • Program order rule: each operation in a thread happens-before every subsequent operation in that thread
  • Monitor lock rule: a thread's unlocking of a monitor happens-before every subsequent locking of that monitor, by the same thread or another thread
  • volatile variable rule: a write to a volatile field happens-before every subsequent read of that field
  • Transitivity rule: if A happens-before B and B happens-before C, then A happens-before C

Using these rules we can guarantee memory visibility between threads. A detailed analysis follows; the definitions are given here first.

Memory visibility

We said above that happens-before mainly provides memory-visibility guarantees for single-threaded and multi-threaded programs. So what is memory visibility? Let's look at the following definition.

Heap memory is shared between threads, while each thread's stack memory is private. Memory-visibility problems therefore concern data in heap memory; stack memory is not affected by visibility issues.

Memory visibility: the state of shared-memory data that the threads are able to see. That state may be correct or incorrect (our goal, of course, is to guarantee correct memory visibility).

Now let's analyze which situations give rise to memory-visibility problems (that is, under which circumstances an incorrect view of memory causes a multi-threaded program to compute the wrong result).

Visibility problems caused by caches

We know that each CPU has its own cache, and a computer has multiple CPUs. When reading and writing data, a processor may write data to its cache (the cache corresponds to a thread's private local memory in the JMM), and the cache is not flushed to memory (the JMM's abstract main memory) immediately. This can leave the CPUs with inconsistent views of the data, as in the following example.

class Test {
    int val = 0;
    void f() {
        val = val + 1; // three separate steps: read val, add 1, write back
        // ...
    }
}

The figure shows only one possible erroneous state; the result may also turn out correct, since unsynchronized multithreaded execution is nondeterministic.

  • At time T1, thread A runs and loads val = 0 from main memory into its private working memory; at T2 the processor computes val + 1; at T3 the result is written to the local cache; not until T6 is the local cache flushed to main memory
  • At time T4, thread B runs and still finds val = 0 in main memory (thread A has not yet flushed its result back), then computes + 1 as well, leaving val = 1 in thread B's working memory
  • The value finally flushed back to main memory is val = 1

The programmer's intent was for the two threads to each perform val + 1, expecting val = 2; instead the program finishes with val = 1.
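The lost-update scenario above can be reproduced with a small harness. The sketch below is my own illustration (class and method names are hypothetical, not from the original example): two threads each perform val = val + 1 once without synchronization, so the final value may be 2, or 1 when one update is lost.

```java
// Sketch: two unsynchronized threads each increment val once.
// Depending on interleaving and cache flushing, the result is 1 or 2.
class LostUpdateDemo {
    static int val = 0;

    static int run() {
        val = 0;
        Runnable inc = () -> { val = val + 1; }; // read, add, write: not atomic
        Thread a = new Thread(inc);
        Thread b = new Thread(inc);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return val;
    }

    public static void main(String[] args) {
        // Prints 1 or 2 depending on the interleaving; only synchronization guarantees 2.
        System.out.println("val = " + run());
    }
}
```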

Visibility problems caused by instruction reordering

First, let's look at what instruction reordering is.

    void f() {
        int a = 1;
        int b = 2;
        int c = a + b;
    }

After reordering by the compiler or processor, the execution order may change so that b = 2 executes before a = 1. The statement c = a + b, however, cannot be moved ahead of the two assignments it depends on, as explained below.

Instruction reordering is divided into compiler reordering and processor reordering.

Compilers and processors reorder instructions to increase instruction-level parallelism; their goal is to speed up execution. No matter how instructions are reordered, the final result of single-threaded execution must not change. This guarantee does not extend to multithreaded execution, which is why incorrect results become possible there.

To preserve single-threaded correctness, one thing is certain: if two operations have a dependency between them, neither the compiler nor the processor will reorder them. Both already implement this. For example:

    void f() {
        int a = 1;
        // this depends on the previous statement a = 1, so the two are never reordered
        int b = a + 1;
    }

How does instruction reordering lead to memory-visibility problems? Let's look at an example.

class Test {
    private static Instance instance;

    public static Instance getInstance() {
        if (instance == null) {
            synchronized (Test.class) {
                if (instance == null) {
                    // the bug is here
                    instance = new Instance();
                }
            }
        }
        return instance;
    }
}

This is the common (broken) double-checked-locking singleton. It is broken because instruction reordering may cause it to return an uninitialized instance. Let's analyze why.

When the processor executes instance = new Instance();, it actually decomposes it into several steps; in pseudocode:

// step 1: allocate memory
memory = allocate();
// step 2: initialize the object
ctorInstance(memory);
// step 3: point instance at the memory address
instance = memory;

Of the three steps above, steps 2 and 3 have no dependency on each other in a single-threaded scenario: we can first assign the address and then initialize the object's contents. So the instructions may be reordered as follows:

// step 1: allocate memory
memory = allocate();
// step 2: point instance at the memory address
instance = memory;
// step 3: initialize the object
ctorInstance(memory);

In a multithreaded scenario, thread A may have executed only through the reordered step 2 (the object is not yet initialized) when its working memory happens to be flushed to main memory. Thread B then sees instance non-null, assumes it has been fully constructed, and returns it directly. Thread B may thus obtain an uninitialized object, and problems surface when it is used later.

Solving memory-visibility problems

These are the causes of memory-visibility problems, which can lead to unexpected behavior in multithreaded scenarios. To make a multithreaded program execute correctly and produce the right results, we must guarantee correct memory visibility.

Correct memory visibility is mainly achieved through the following techniques:

  • volatile
    • solves both the memory-visibility and the instruction-reordering problem
  • final
  • monitor locks
    • solve the memory-visibility problem
    • provide mutually exclusive access
  • happens-before
    • The happens-before rules, combined with the three techniques above, guarantee correct execution of multithreaded programs. We can also use these rules to reason on paper about whether a multithreaded program's possible results match those of single-threaded execution (that is, whether it computes the correct result)

volatile

When a volatile variable is written, the JMM flushes the shared-variable values in the writing thread's local memory out to main memory.

volatile has two properties:

  • Visibility: a read of a volatile variable always sees the last write to that variable by any thread
  • Atomicity: a read or write of any single volatile variable is atomic, including 64-bit types such as long and double
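A caveat worth demonstrating: the atomicity property covers only a single volatile read or write, not compound operations such as count++. The sketch below is my own illustration (class and method names are hypothetical): increments on a volatile counter can still be lost, because count++ is a volatile read, an add, and a volatile write, three separate steps.

```java
// Sketch: volatile guarantees visibility of each read/write,
// but count++ is still not atomic, so updates can be lost.
class VolatileCounterDemo {
    static volatile int count = 0;

    static int run(int threads, int perThread) {
        count = 0;
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    count++; // volatile read + add + volatile write: three steps
                }
            });
        }
        for (Thread t : ts) t.start();
        try {
            for (Thread t : ts) t.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Often prints less than 400000: increments were lost despite volatile.
        System.out.println("count = " + run(4, 100_000));
    }
}
```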

The JMM restricts reordering around volatile reads and writes; for compilers it defines the following volatile reordering rules:

Can the second operation be reordered with the first? (NO = forbidden; an empty cell means reordering is allowed)

    First operation       | 2nd: ordinary read/write | 2nd: volatile read | 2nd: volatile write
    Ordinary read/write   |                          |                    | NO
    volatile read         | NO                       | NO                 | NO
    volatile write        |                          | NO                 | NO

From the table we can conclude:

  • When the second operation is a volatile write, no preceding operation of any kind can be reordered past it
  • When the second operation is a volatile read, only a preceding ordinary read/write can be reordered with it
  • When the second operation is an ordinary read/write, only a preceding volatile read cannot be reordered with it

If these rules make your head spin, that's because the reasoning behind them isn't clear yet. Let's think it through from one angle.

When a volatile variable is written, the writing thread's local copies of shared variables are flushed to main memory. This means that if one or more writes to shared variables precede the volatile write, all of those modifications are flushed to main memory along with it. Doesn't that property feel particularly important?

After reading the rest of this article, take another look at this table and you'll understand why the rules are what they are.

Let's revisit the earlier broken singleton example caused by instruction reordering. With the following change, the program is guaranteed to be correct:

class Test {
    private static volatile Instance instance;

    public static Instance getInstance() {
        if (instance == null) {
            synchronized (Test.class) {
                if (instance == null) {
                    // now safe: volatile forbids the harmful reordering
                    instance = new Instance();
                }
            }
        }
        return instance;
    }
}

As you can see, all we added is volatile. With it, the steps below can no longer be reordered: they must execute strictly in the order 1 -> 2 -> 3, which guarantees the correctness of this singleton.

// step 1: allocate memory
memory = allocate();
// step 2: initialize the object
ctorInstance(memory);
// step 3: point instance at the memory address
instance = memory;

So how does the compiler implement this rule? In other words, what technique do the reordering rules use to restrict reordering around volatile?

When generating bytecode, the compiler inserts memory barriers into the instruction sequence to forbid particular kinds of processor reordering. Computing the optimal placement of barriers is so cumbersome as to be nearly infeasible, so the JMM takes a conservative strategy and inserts memory barriers as follows:

  • Insert a StoreStore barrier before each volatile write
  • Insert a StoreLoad barrier after each volatile write
  • Insert a LoadLoad barrier after each volatile read
  • Insert a LoadStore barrier after each volatile read

The memory barriers are explained below.

Based on this conservative strategy, any program on any processor platform gets correct volatile memory semantics. The figure below shows the volatile write scenario.

The StoreStore barrier guarantees that all ordinary writes above it are flushed to main memory before the volatile write.

As for the StoreLoad barrier: when the volatile write is the last operation in a method, the compiler cannot tell whether the caller will follow it with a volatile read or write. So whenever a volatile write ends a method, or really is followed by a volatile read or write, the JMM conservatively inserts a StoreLoad barrier after it.

A mnemonic for the names:

  • StoreStore barrier: Store means a write, so StoreStore implies two things to store. Before a volatile write, the ordinary writes above it are first flushed to main memory, and then the volatile write itself is flushed as well: "Store, then Store", hence StoreStore
  • StoreLoad barrier: the preceding operation is a volatile write (Store), and the following operation is unknown; it may be a volatile write or a volatile read. Assuming it is a read (Load), inserting a StoreLoad barrier prevents the volatile write above from being reordered with a volatile read/write below

The following figure shows the volatile read scenario.

Again using the same mnemonic:

  • LoadLoad barrier: two Loads; it forbids reordering the volatile read with any ordinary read below it
  • LoadStore barrier: Load then Store; it forbids reordering the volatile read with any ordinary write below it
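For the curious, since JDK 9 java.lang.invoke.VarHandle exposes explicit fence methods that roughly map onto these barriers. The sketch below is only an illustration of that mapping (class and field names are mine, not from the article): there is no separate loadStoreFence or storeLoadFence; acquireFence and releaseFence each cover two of the barrier types, and fullFence stands in for StoreLoad.

```java
import java.lang.invoke.VarHandle;

// Sketch: ordering plain (non-volatile) fields with explicit fences,
// mimicking the barriers the JMM inserts around volatile accesses.
class FenceDemo {
    static int data = 0;
    static int ready = 0; // deliberately NOT volatile; the fences supply the ordering

    static void write() {
        data = 42;
        VarHandle.releaseFence(); // StoreStore (+LoadStore): earlier accesses stay above
        ready = 1;
        VarHandle.fullFence();    // StoreLoad: the write is ordered before later reads
    }

    static int read() {
        int r = ready;
        VarHandle.acquireFence(); // LoadLoad (+LoadStore): later accesses stay below
        return r == 1 ? data : -1;
    }
}
```

In practice you would simply declare ready volatile; the fences are shown here only to make the barrier placement concrete.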

Now let's analyze a piece of code using volatile together with the happens-before rules.

class Test {
    int num = 10;
    volatile boolean flag = false;

    void writer() {
        num = 100;     // 1
        flag = true;   // 2
    }

    void reader() {
        if (flag) {   // 3
            System.out.println(num);   // 4
        }
    }
}

Suppose thread A first finishes executing the writer method, and thread B then executes the reader method. (Briefly review the rules above if needed.)

  • Program order rule (looking at each thread on its own):
    • in thread A, 1 happens-before 2
    • in thread B, 3 happens-before 4
  • volatile variable rule: 2 happens-before 3
    • 3 is a volatile read; we know LoadLoad and LoadStore barriers are inserted after it, guaranteeing that the ordinary accesses below it are not reordered above it
    • this guarantees 3 and 4 are not reordered: the value of num is read only after flag = true has been seen
  • Transitivity rule: 1 happens-before 2, 2 happens-before 3, 3 happens-before 4, hence 1 happens-before 4
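The writer/reader pair above can be turned into a runnable sketch (the harness code is mine; Thread.onSpinWait requires JDK 9+). Once reader observes flag == true, the happens-before chain guarantees it reads num == 100.

```java
// Sketch of the article's writer/reader example as a runnable program.
class VolatileFlagDemo {
    static int num = 10;
    static volatile boolean flag = false;

    static void writer() {
        num = 100;    // 1: ordinary write
        flag = true;  // 2: volatile write, flushes num to main memory as well
    }

    static int reader() {
        while (!flag) {          // 3: volatile read, spins until the write is visible
            Thread.onSpinWait();
        }
        return num;              // 4: guaranteed to see 100, since 1 happens-before 4
    }

    public static void main(String[] args) {
        Thread b = new Thread(() -> System.out.println("num = " + reader()));
        b.start();
        writer();
        try {
            b.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
    }
}
```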

Finally, a note: the barriers around volatile reads and writes are not always all inserted exactly as described. The compiler optimizes away barriers it can prove unnecessary, but the result is guaranteed to be the same as if they had been inserted, so we still get correct results. We won't go further into that here.

Monitor Lock

Code blocks or methods protected by a lock execute mutually exclusively: only after one thread releases the lock can another thread acquire it and proceed.

Locks have memory semantics similar to volatile:

When a thread releases a lock, the JMM flushes the shared variables in that thread's local memory to main memory.

When a thread acquires a lock, the JMM invalidates that thread's local memory, so the code in the monitor-protected critical section must re-read shared variables from main memory.

Let's look at a piece of code.

    int a = 0;

    public synchronized void writer() {  // 1
        a++;  // 2
    }  // 3

    public synchronized void reader() { // 4
        int i = a; // 5
    } // 6

Suppose thread A executes the writer() method, and thread B then executes the reader() method. Let's continue the happens-before analysis:

  • Program order rule (the monitor-lock rule concerns the critical sections, so we break each critical section into steps):
    • thread A: 1 happens-before 2, 2 happens-before 3
    • thread B: 4 happens-before 5, 5 happens-before 6
  • Monitor lock rule: 3 happens-before 4
    • when thread A releases the lock, it flushes a into main memory
    • when thread B acquires the lock, the JMM invalidates its local memory, so it reads the fresh shared variable a = 1 from main memory
  • Transitivity rule: 1 happens-before 2, 2 happens-before 3, 3 happens-before 4, 4 happens-before 5, 5 happens-before 6. From this we get 2 happens-before 5, so i is assigned correctly
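Under these guarantees, a lock-protected counter is always correct, in contrast with the earlier unsynchronized examples. A minimal sketch (class and method names are mine, not from the article):

```java
// Sketch: N threads increment a shared counter under a monitor lock.
// Lock release flushes a to main memory; lock acquire re-reads it,
// so the final value is always exactly threads * perThread.
class SyncCounterDemo {
    static int a = 0;

    static synchronized void increment() {
        a++; // executed mutually exclusively; visibility guaranteed by the lock
    }

    static int run(int threads, int perThread) {
        a = 0;
        Thread[] ts = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            ts[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) increment();
            });
        }
        for (Thread t : ts) t.start();
        try {
            for (Thread t : ts) t.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return a;
    }

    public static void main(String[] args) {
        // Always prints 400000, unlike the unsynchronized version.
        System.out.println("a = " + run(4, 100_000));
    }
}
```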

Sequential consistency model

The JMM was designed with the sequential consistency model as a reference.

Let's look at the definition of the sequential consistency model:

  • All operations in a thread must execute in program order
  • Whether or not the threads are synchronized, all threads see a single, identical order of execution, and every operation executes atomically and is immediately visible to all threads

The difference from the JMM on the first point is easy to see: the JMM allows instruction reordering, so the actual execution order may change, as long as the final result is the same.

An unsynchronized program under the sequential consistency model behaves like this:

The sequential consistency model requires even unsynchronized programs to achieve this effect, which is actually of little value. Why? Because even with this guarantee, the result of an unsynchronized program is still nondeterministic. The JMM deliberately does not do this by design; the specifics can be analyzed in detail later.

For unsynchronized multithreaded programs, the JMM provides only a minimum safety guarantee: the value a thread reads is either the default value or a value written by some thread.

Finally, there are also the memory-visibility issues involving final and the memory semantics of final; since this article is already long, that deserves its own write-up.

Every time I read "The Art of Java Concurrency Programming" I take away something new; this time I combined it with my own thinking and wrote this article to deepen my understanding.

Reference:

  • The Art of Java Concurrency Programming (《Java并发编程的艺术》)


Origin: juejin.im/post/5d92ad1be51d4577ec4eb955