Java Memory Model (JMM) and happens-before



We know that Java programs run on the JVM, and that the JVM is a virtual machine with its own abstraction of memory. So what does the Java Memory Model (JMM) do?

Consider a simple assignment:

int a=100;

The question the JMM answers is when a thread that reads the variable a is guaranteed to see the value 100. This sounds like a trivial question: surely the value can be read once it has been assigned?

But that is only the order in which we wrote the source code. When the source is compiled, the compiler may emit instructions in an order that differs from the source order. The processor may also execute instructions out of order or in parallel (such reordering is allowed as long as the final result of the program is the same as it would be under strictly serial execution in the JVM). In addition, each processor has a local cache. While a result is held only in that local cache, other threads cannot see it, and the order in which cached writes are committed to main memory may also change.

All of the above can lead to different results in a multithreaded environment. Most of the time, threads are busy with their own tasks. Only when several threads share data do their operations need to be coordinated.

The JMM is a set of minimum guarantees that every JVM must honor. It specifies when writes to variables become visible to other threads.
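To make the visibility problem concrete, here is a minimal sketch of a stop flag shared between two threads. The class name VisibilityDemo and the method stopWorker are made up for this illustration. It assumes that declaring the flag volatile is the chosen way to make the write visible; with a plain boolean the JMM would allow the worker to spin forever.

```java
public class VisibilityDemo {
    // volatile guarantees that the write in stopWorker() becomes visible to the worker.
    // With a plain boolean, the worker would be allowed to keep seeing the old value.
    private static volatile boolean running;

    // Starts a worker that spins on the flag, clears the flag, and reports
    // whether the worker terminated within the timeout.
    public static boolean stopWorker() {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait until another thread clears the flag
            }
        });
        worker.setDaemon(true); // never keep the JVM alive because of this demo thread
        worker.start();
        try {
            Thread.sleep(50);   // give the worker time to start spinning
            running = false;    // volatile write
            worker.join(5000);  // wait up to 5 seconds for the worker to exit
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + stopWorker());
    }
}
```

The 50 ms sleep and the 5 second timeout are arbitrary values chosen for the demo, not part of any rule.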

Reordering

We mentioned reordering in the JVM above. Here is an example to give a better feel for it:

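import lombok.extern.slf4j.Slf4j;

// @Slf4j comes from Lombok and generates the "log" field used below.
@Slf4j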
public class Reorder {

    int x=0, y=0;
    int a=0, b=0;

    private  void reorderMethod() throws InterruptedException {

        Thread one = new Thread(()->{
            a=1;
            x=b;
        });

        Thread two = new Thread(()->{
            b=1;
            y=a;
        });
        one.start();
        two.start();
        one.join();
        two.join();
        log.info("{},{}", x, y);
    }

    public static void main(String[] args) throws InterruptedException {

        for (int i=0; i< 100; i++){
            new Reorder().reorderMethod();
        }
    }
}

The above example is a very simple concurrent program. Because we use no synchronization, the order in which threads one and two execute is not determined. Thread one may run before two, after two, or interleaved with it. Different execution orders may produce different output.

In addition, although the code executes a = 1 first and then x = b, the two statements do not depend on each other. The JVM is free to reorder them so that x = b runs first and a = 1 runs second, which leads to even more surprising results, such as the output 0,0.
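One possible way to rule out the 0,0 result is to declare a and b as volatile. The sketch below is a variant of the Reorder class written for this illustration (ReorderVolatile and sawZeroZero are hypothetical names, and the logger is left out to keep the example self-contained). Volatile reads and writes cannot be reordered with each other, so at least one thread must see the other's write.

```java
public class ReorderVolatile {
    // Assumption for this sketch: the shared fields are made volatile.
    volatile int a = 0, b = 0;
    // x and y are only read after join(), so plain fields are sufficient.
    int x = 0, y = 0;

    // Runs the two racing threads once and returns {x, y}.
    int[] runOnce() {
        Thread one = new Thread(() -> { a = 1; x = b; });
        Thread two = new Thread(() -> { b = 1; y = a; });
        one.start();
        two.start();
        try {
            one.join();
            two.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {x, y};
    }

    // Returns true if any run produced x == 0 and y == 0.
    public static boolean sawZeroZero(int runs) {
        for (int i = 0; i < runs; i++) {
            int[] r = new ReorderVolatile().runOnce();
            if (r[0] == 0 && r[1] == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("saw 0,0: " + sawZeroZero(1000));
    }
}
```

The results 0,1 and 1,0 and 1,1 are all still possible here. Only 0,0 is excluded.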

Happens-Before

To define the order of operations, the JMM specifies an ordering relationship over all operations in the program. This ordering is called happens-before. To guarantee that the thread executing operation B sees the result of operation A, whether A and B are in the same thread or in different threads, A and B must satisfy the happens-before relationship. If two operations are not ordered by happens-before, the JVM can reorder them arbitrarily.

Let's take a look at the rules of happens-before:

  1. Program order rule: If operation A comes before operation B in the program, then within the same thread A happens-before B.

Note that "A before B" here does not forbid reordering. In a single thread, the virtual machine may reorder the corresponding instructions, but the final result must be the same as if they had executed in program order. The virtual machine only reorders code that has no dependencies.

  2. Monitor lock rule: An unlock on a monitor happens-before every subsequent lock on the same monitor.

We are all familiar with locks. The ordering applies only to the same lock. For operations on different locks, no execution order is guaranteed.

  3. Volatile variable rule: A write to a volatile variable happens-before every subsequent read of that same variable.

Reads and writes of atomic variables have the same memory semantics as reads and writes of volatile variables.

  4. Thread start rule: A call to Thread.start on a thread happens-before every operation in the started thread.

  5. Thread termination rule: Every operation in a thread happens-before any other thread detects that the thread has terminated.

  6. Interruption rule: A thread calling interrupt on another thread happens-before the interrupted thread detects the interrupt.

  7. Finalizer rule: The end of an object's constructor happens-before the start of that object's finalizer.

  8. Transitivity: If operation A happens-before operation B, and operation B happens-before operation C, then operation A happens-before operation C.

Rule 2 is easy to understand. While one thread holds a lock, no other thread can acquire it, so other threads must wait for the lock to be released before they can lock it and run their own business logic.
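As a rough sketch of the monitor lock rule, the following counter is guarded by a single lock object. MonitorDemo and countWithTwoThreads are hypothetical names chosen for this example. Because every increment unlocks the same monitor that the next increment locks, each thread sees the previous update and no increment is lost.

```java
public class MonitorDemo {
    private final Object lock = new Object();
    private int count = 0; // guarded by lock

    void increment() {
        synchronized (lock) { // the unlock at the end is ordered before the next lock
            count++;
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Two threads each increment the shared counter perThread times.
    public static int countWithTwoThreads(int perThread) {
        MonitorDemo demo = new MonitorDemo();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                demo.increment();
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(10000)); // prints 20000
    }
}
```

If the two threads synchronized on different lock objects, the rule would not apply and updates could be lost.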

Rules 4, 5, 6, and 7 are also easy to understand: something has to begin before it can end. This matches our general intuition about programs.

The transitivity in rule 8 should be easy to understand for anyone who has studied mathematics.
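Rules 4 and 5 can be sketched with two plain (non-volatile) fields. In this hypothetical StartJoinDemo class, the write to input is visible inside the thread because it precedes start, and the write to result is visible to the caller because join detects the end of the thread.

```java
public class StartJoinDemo {
    private int input;  // plain field, written before start()
    private int result; // plain field, written inside the thread

    public static int compute(int value) {
        StartJoinDemo demo = new StartJoinDemo();
        demo.input = value; // ordered before t.start() by program order

        // Rule 4: everything before start() is visible inside the thread.
        Thread t = new Thread(() -> demo.result = demo.input * 2);
        t.start();
        try {
            t.join(); // Rule 5: everything the thread did is visible after join()
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return demo.result;
    }

    public static void main(String[] args) {
        System.out.println(compute(21)); // prints 42
    }
}
```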

Next, we focus on the combination of rule 3 and rule 1. Before that, let us summarize what happens-before actually does.

Because the JVM reorders instructions, we need the happens-before rules to guarantee the order of execution where it matters. Rules 2, 3, 4, 5, 6, and 7 above can be regarded as reordering nodes. These nodes themselves may not be reordered. Only the instructions between them may be reordered.

Combined with rule 1, the program order rule, we get the real meaning: instructions written in the code before a reordering node take effect before that node executes.

The reordering node is a boundary point, and its position cannot be moved. Take a look at the following intuitive example:

There are two instructions in thread 1: set i = 1, set volatile a = 2.
There are also two instructions in thread 2: get volatile a, get i.

According to the reasoning above, set volatile a and get volatile a are two reordering nodes, and if the get observes the value written by the set, the set is ordered before the get. By rule 1, set i = 1 comes before set volatile a = 2 in the code, and because set volatile is a reordering node, set i = 1 must take effect before set volatile a = 2. In the same way, get volatile a takes effect before get i. By transitivity, set i = 1 happens-before get i, so thread 2 reads i = 1.

This technique is known as piggybacking on synchronization.
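The two-thread example can be written out as the following sketch. PiggybackDemo and readAfterVolatile are hypothetical names. The reader spins until it observes the volatile write a = 2, and only then reads the plain field i.

```java
public class PiggybackDemo {
    private int i = 0;          // plain field
    private volatile int a = 0; // volatile field, the ordering point

    // Returns the value of i that the reader thread observed.
    public static int readAfterVolatile() {
        PiggybackDemo demo = new PiggybackDemo();
        int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (demo.a != 2) {
                // get volatile a: spin until the volatile write is observed
            }
            seen[0] = demo.i; // get i: ordered after the volatile read
        });
        Thread writer = new Thread(() -> {
            demo.i = 1; // set i = 1: ordered before the volatile write
            demo.a = 2; // set volatile a = 2
        });

        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(readAfterVolatile()); // prints 1
    }
}
```

If a were not volatile, the reader would have no guarantee of ever leaving the loop, and even if it did, it could still read i = 0.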

Safe publication

We often use the singleton pattern to create a single shared object. Let's see what is wrong with the following approach:

public class Book {

    private static Book book;

    public static Book getBook(){
        if(book==null){
            book = new Book();
        }
        return book;
    }
}

The above class defines a getBook method that returns the book object. Before returning it, we first check whether book is null. If it is null, a new Book object is created.

At first glance there seems to be no problem, but if you think carefully about the JMM reordering rules, the problem becomes apparent.
book = new Book() is actually a compound operation, not an atomic one. It can be roughly divided into: 1. allocate memory, 2. initialize the object, 3. assign the memory address to the reference.

Steps 2 and 3 may be reordered, so another thread may see a non-null book and return it before the object has been initialized. This leads to unpredictable errors.

Based on the happens-before rules described above, the simplest fix is to add the synchronized keyword to the method:

public class Book {

    private static Book book;

    public static synchronized Book getBook(){
        if(book==null){
            book = new Book();
        }
        return book;
    }
}

Now let us look at the following implementation, which uses a static field:

public class BookStatic {
    private static BookStatic bookStatic= new BookStatic();

    public static BookStatic getBookStatic(){
        return bookStatic;
    }
}

The JVM runs static initializers after the class is loaded and before the class is used by any thread. During this initialization phase the JVM acquires a lock, which ensures that memory writes made during static initialization are visible to all threads.

The above example defines a static variable that is instantiated during the static initialization phase. This approach is called eager initialization.

Let's look at the lazy initialization holder class pattern:


public class BookStaticLazy {

    private static class BookStaticHolder{
        private static BookStaticLazy bookStatic= new BookStaticLazy();
    }

    public static BookStaticLazy getBookStatic(){
        return BookStaticHolder.bookStatic;
    }
}

In the above class, the holder class BookStaticHolder is initialized, and the instance created, only when the getBookStatic method is first called.
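To see the lazy behavior, here is a small sketch with a counter that records how many times the constructor has run. LazyHolderDemo, Holder, and the constructed counter are hypothetical names added only for observation and are not part of the pattern itself.

```java
public class LazyHolderDemo {
    // Hypothetical counter, present only to observe when the constructor runs.
    private static int constructed = 0;

    private LazyHolderDemo() {
        constructed++;
    }

    // The holder class is not initialized until it is first referenced.
    private static class Holder {
        static final LazyHolderDemo INSTANCE = new LazyHolderDemo();
    }

    public static LazyHolderDemo getInstance() {
        return Holder.INSTANCE; // first call triggers initialization of Holder
    }

    public static int constructedCount() {
        return constructed;
    }

    public static void main(String[] args) {
        System.out.println(constructedCount());               // prints 0
        LazyHolderDemo first = getInstance();
        LazyHolderDemo second = getInstance();
        System.out.println(constructedCount());               // prints 1
        System.out.println(first == second);                  // prints true
    }
}
```

No explicit synchronization is needed here because the JVM's class initialization lock, described above, protects the creation of INSTANCE.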

Next, we introduce double-checked locking.

public class BookDLC {
    private static volatile BookDLC bookDLC;

    public static BookDLC getBookDLC(){
        if(bookDLC == null ){
            synchronized (BookDLC.class){
                if(bookDLC ==null){
                    bookDLC=new BookDLC();
                }
            }
        }
        return bookDLC;
    }
}

The above class checks the value of bookDLC twice, and acquires the lock only when bookDLC is null. Everything looks perfect, but note that bookDLC must be volatile.

Without volatile, the first read of bookDLC, which happens outside the synchronized block, has no happens-before relationship with the assignment, so a thread could obtain an instance that is only partially constructed. This is why we must add the volatile keyword.
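As a rough check of the pattern, the sketch below starts several threads that all request the instance at the same time and counts how many distinct objects they receive. DclDemo, CONSTRUCTED, and distinctInstances are hypothetical names. Copying the field into a local variable is an optional refinement that is not in the listing above. Note that a test like this can only show that the volatile version behaves correctly. It cannot reliably reproduce the failure of a non-volatile version.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class DclDemo {
    // Hypothetical counter, present only to observe how often the constructor runs.
    static final AtomicInteger CONSTRUCTED = new AtomicInteger();
    private static volatile DclDemo instance;

    private DclDemo() {
        CONSTRUCTED.incrementAndGet();
    }

    public static DclDemo getInstance() {
        DclDemo local = instance; // single volatile read on the fast path
        if (local == null) {
            synchronized (DclDemo.class) {
                local = instance; // second check, now under the lock
                if (local == null) {
                    instance = local = new DclDemo();
                }
            }
        }
        return local;
    }

    // Starts the given number of threads and returns how many distinct instances they saw.
    public static int distinctInstances(int threads) {
        Set<DclDemo> seen = ConcurrentHashMap.newKeySet();
        CountDownLatch go = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                try {
                    go.await(); // line all threads up before they race
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen.add(getInstance());
            });
            workers[i].start();
        }
        go.countDown();
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(8)); // prints 1
    }
}
```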

Initialization safety

To close this article, we discuss the initialization of objects whose final fields are set in the constructor.

For a properly constructed object (one whose this reference does not escape during construction), initialization safety guarantees that all threads see the values the constructor set for each final field of the object, including any state reachable through a final field (such as the elements of a final array or the contents of a final HashMap).

import java.util.HashMap;

public class FinalSafe {
    private final HashMap<String,String> hashMap;

    public FinalSafe(){
        hashMap= new HashMap<>();
        hashMap.put("key1","value1");
    }
}

In the above example, we defined a final field and initialized it in the constructor. The writes to this final field, and to the HashMap it refers to, will not be reordered with operations that come after the constructor, such as publishing the reference to the new object.
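The guarantee only covers what the constructor wrote, so it helps to make sure the map cannot change afterwards. The following is a minimal sketch of one way to do that. FinalSafeDemo and its get method are hypothetical, and wrapping the map with Collections.unmodifiableMap is an extra precaution assumed for this example rather than something the original class does.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class FinalSafeDemo {
    private final Map<String, String> map;

    public FinalSafeDemo() {
        Map<String, String> m = new HashMap<>();
        m.put("key1", "value1");
        // All writes to the map happen before the final field is assigned,
        // and 'this' is not passed to any other code inside the constructor.
        map = Collections.unmodifiableMap(m);
    }

    // Read-only access: the map is never modified after construction.
    public String get(String key) {
        return map.get(key);
    }

    public static void main(String[] args) {
        System.out.println(new FinalSafeDemo().get("key1")); // prints value1
    }
}
```

If the map were modified after the constructor returned, those later writes would need their own synchronization, because initialization safety does not cover them.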

For examples of this article, please refer to https://github.com/ddean2009/learn-java-concurrency/tree/master/reorder

For more information, please visit flydean's blog


Origin blog.csdn.net/superfjj/article/details/105549370