Classic Java interview question: the underlying implementation of volatile

Foreword

When a shared variable is declared volatile, reads and writes of that variable behave in a special way. Below we demystify volatile.

1. Volatile memory semantics

1.1 volatile features

A volatile variable itself has the following three characteristics:

  1. Visibility: when a thread modifies a variable declared volatile, the new value is immediately visible to other threads that read it. Ordinary variables offer no such guarantee: their values are passed between threads through main memory, with no guarantee about when a write becomes visible.

  2. Ordering: accesses to a volatile variable execute in program order; that is, instruction reordering across reads and writes of the volatile variable is prohibited.

  3. Limited atomicity: the atomicity of volatile differs from that of synchronized. synchronized guarantees that a method or code block it guards executes atomically. volatile does not modify methods or code blocks; it modifies variables. A read or write of a single volatile variable is atomic, but a compound operation such as volatile++ (a read, an increment, then a write) is not. So the atomicity of volatile is limited: in a multi-threaded environment, volatile alone does not make compound operations atomic.
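To make the limited atomicity concrete, here is a small demo (not from the original article; class and field names are mine) contrasting volatile++ with AtomicInteger. The volatile counter usually loses updates under contention, while the atomic counter never does:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class VolatileAtomicityDemo {
    private static volatile int volatileCount = 0;                        // visible, but ++ is not atomic
    private static final AtomicInteger atomicCount = new AtomicInteger(); // atomic read-modify-write

    // Runs `threads` threads, each incrementing both counters `increments` times,
    // and returns {volatileCount, atomicCount}.
    static int[] race(int threads, int increments) throws InterruptedException {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < increments; j++) {
                    volatileCount++;               // read + add + write: interleavings lose updates
                    atomicCount.incrementAndGet(); // CAS loop: no update is ever lost
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join();
        return new int[]{volatileCount, atomicCount.get()};
    }

    public static void main(String[] args) throws InterruptedException {
        int[] r = race(8, 100_000);
        System.out.println("volatile++    -> " + r[0] + " (expected 800000, usually less)");
        System.out.println("AtomicInteger -> " + r[1]);
    }
}
```

On most runs volatileCount ends up below 800000, because two threads can read the same value, both increment it, and both write back the same result.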

1.2 Memory semantics of volatile write-read

Memory semantics of a volatile write: when a writer thread writes a volatile variable, the JMM flushes the shared-variable values in that thread's local (working) memory to main memory.

Memory semantics of a volatile read: when a reader thread reads a volatile variable, the JMM invalidates that thread's local memory, so the thread must then read the shared variable from main memory.
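This write-read pairing is what makes the common volatile-flag handoff work. A minimal sketch (names are mine, not from the article): the volatile write to ready publishes the earlier ordinary write to data, and the reader's volatile read forces a fresh read from main memory:

```java
public class VolatileFlagDemo {
    private static int data = 0;                   // ordinary shared variable
    private static volatile boolean ready = false; // volatile flag

    static int run() throws InterruptedException {
        data = 0;
        ready = false;
        Thread writer = new Thread(() -> {
            data = 42;    // ordinary write
            ready = true; // volatile write: flushes working memory to main memory
        });
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) { /* volatile read: re-reads main memory each iteration */ }
            seen[0] = data; // guaranteed to see 42 (happens-before via the volatile flag)
        });
        reader.start();
        writer.start();
        writer.join();
        reader.join();
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("reader saw data = " + run());
    }
}
```

Without volatile on ready, the reader could spin forever on a stale cached false, or observe ready == true while still seeing a stale data.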

2. The principle of volatile semantic implementation

Before introducing the principle of volatile semantic implementation, let's first look at two professional terms related to CPU:

  • Memory barrier: a processor instruction used to enforce ordering restrictions on memory operations.

  • Cache line: the smallest unit of storage that can be allocated in a CPU cache. The processor always fills an entire cache line at a time.

2.1 The principle of volatile visibility

How is the visibility of volatile implemented? Let's look at a piece of code and print the assembly instructions generated for it (how to print the assembly is described at the end of the article), to see what the CPU does when a volatile variable is written:

public class VolatileTest {

    private static volatile VolatileTest instance = null;

    private VolatileTest(){}

    public static VolatileTest getInstance(){
        if(instance == null){
            instance = new VolatileTest();
        }

        return instance;
    }

    public static void main(String[] args) {
        VolatileTest.getInstance();
    }
}

The above is the familiar lazy-initialization singleton, which is not thread-safe in a multi-threaded environment (two threads can both see instance == null and each create an instance). What is special here is that the instance field is declared volatile. Now let's look at the printed assembly instructions:
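As an aside, the standard way to make this lazy singleton thread-safe while keeping the volatile field is double-checked locking. This corrected variant is the well-known idiom, not code from the article:

```java
public class SafeSingleton {
    // volatile prevents the reference from being published before the constructor finishes
    private static volatile SafeSingleton instance;

    private SafeSingleton() {}

    public static SafeSingleton getInstance() {
        if (instance == null) {                 // first check: skip locking on the fast path
            synchronized (SafeSingleton.class) {
                if (instance == null) {         // second check: only one thread constructs
                    instance = new SafeSingleton();
                }
            }
        }
        return instance;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // true: always the same instance
    }
}
```

Without volatile, the assignment instance = new SafeSingleton() could be reordered so that the reference is published before the constructor runs, letting another thread observe a half-built object.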

(screenshot: assembly printed for the volatile write; the underlined line is a lock-prefixed instruction carrying the comment putstatic instance)

In the screenshot, the underlined line ends with the comment putstatic instance. Readers familiar with JVM bytecode will recognize putstatic as the instruction that assigns a value to a static variable; here it corresponds to instance = new VolatileTest(); in getInstance. Because instance is declared volatile, assigning to it is a volatile write.

With both assembly instructions and bytecode instructions appearing above, the two are easy to confuse, so let me spell out the difference between them:

We all know Java is a cross-platform language. How is this platform independence achieved? Through the JVM and Java bytecode files. First, a shared premise: any programming language must ultimately be translated into platform-specific machine instructions before hardware can execute it. C and C++, for example, compile source code directly into CPU-specific instructions. Different CPU families have different architectures and therefore different instruction sets: an x86 CPU executes x86 instructions, an ARM CPU executes ARM instructions. Compiling source code directly into hardware-specific instructions yields high execution performance but poor portability.

To achieve platform independence, the Java compiler javac does not compile Java source into platform-specific instructions. Instead it compiles into an intermediate form: the Java class bytecode file. A bytecode file, as the name implies, stores bytes. Readers who have opened a class file may notice it is displayed in hexadecimal rather than binary; that is simply because binary is verbose: a byte is 8 bits, and two hexadecimal digits represent one byte.

A bytecode file cannot be executed by the CPU directly; it is executed by the JVM. To let Java programs run on different platforms, a Java virtual machine is provided for each platform. The JVM runs on top of the hardware and shields the differences between platforms: it loads the bytecode files produced by javac and ultimately converts them into hardware-specific machine instructions for the CPU to execute.

One question remains: how do we map the bytes in a class file back to the Java source we wrote? Which hexadecimal segment corresponds to which piece of source, and what does it do? Raw hexadecimal is unreadable, so the JVM specification defines recognizable mnemonics for these bytes. Those mnemonics are the Java bytecode instructions (putstatic above is one of them).

On a multi-core processor, the lock-prefixed instruction triggers the following:

It writes the data of the current processor's cache line back to system memory, and it invalidates the copies of that memory address cached by the other CPUs.

To improve processing speed, the processor generally does not communicate with memory directly; it first reads data from system memory into its internal cache and operates on it there, and there is no telling when the modified data will be written back to memory. When a volatile variable is written, however, the JVM emits a lock-prefixed instruction to the processor, which writes the cache line holding the variable back to system memory. But writing back alone is not enough: the cache lines of other processors may still hold the old value. To keep every processor's cache consistent with the newly written memory, the cache-coherence protocol comes into play. After one processor writes its cache line back to system memory, every other processor sniffs the data propagated on the bus to check whether its own cached copy has become stale; when a processor finds that the memory address backing one of its cache lines has been modified, it marks that cache line invalid, and the next time it needs the data it re-reads it from system memory into its cache.

To sum up: volatile visibility is implemented with the CPU's lock prefix. Adding the lock prefix to the machine instruction that performs the volatile write enforces two rules:

When a volatile variable is written, the processor writes its cache line back to main memory.

Writing one processor's cache back to memory invalidates the corresponding cache lines of the other processors.

 

2.2 Implementation principle of volatile order

Volatile's ordering guarantee is achieved by prohibiting instruction reordering. Reordering happens both at the compiler and at the processor, and the JMM restricts the two kinds separately.

So how is instruction reordering prohibited? By inserting memory barriers. The JMM inserts memory barriers around volatile accesses in the following four cases:

  1. Insert a StoreStore barrier before each volatile write, to prevent the preceding ordinary writes from being reordered with the volatile write.
  2. Insert a StoreLoad barrier after each volatile write, to prevent the volatile write from being reordered with subsequent read operations.
  3. Insert a LoadLoad barrier after each volatile read, to prevent the volatile read from being reordered with subsequent read operations.
  4. Insert a LoadStore barrier after each volatile read, to prevent the volatile read from being reordered with subsequent write operations.

This insertion strategy is very conservative. For example, a StoreLoad barrier is placed after every volatile write even though no read may follow it, in which case the barrier is theoretically unnecessary, and on some processors compilers do omit it when they can prove it safe. The JMM's conservative strategy, however, guarantees correct volatile ordering on any processor platform.
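The four insertion points can be visualized with comments in a small Java sketch (the barrier comments mark where the JMM conceptually places barriers; the actual instructions emitted vary by processor):

```java
public class BarrierSketch {
    static int a;          // ordinary field
    static volatile int v; // volatile field

    static void writer() {
        a = 1;             // ordinary write
        // StoreStore barrier: the write to a may not sink below the volatile write
        v = 2;             // volatile write
        // StoreLoad barrier: the volatile write may not reorder with later reads
    }

    static int reader() {
        int r1 = v;        // volatile read
        // LoadLoad barrier: later reads may not float above the volatile read
        // LoadStore barrier: later writes may not float above the volatile read
        int r2 = a;        // if r1 == 2, r2 is guaranteed to be 1
        return r1 + r2;
    }

    public static void main(String[] args) {
        writer();
        System.out.println(reader()); // prints 3 when run single-threaded
    }
}
```

The barriers are what give the reader its guarantee: if the reader observes v == 2, it must also observe a == 1, because neither access can be reordered across the volatile operations.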

3. JSR-133 enhances the semantics of volatile memory

In the old Java memory model before JSR-133, reordering between volatile variables was forbidden, but reordering between a volatile variable and an ordinary variable was allowed. For example, if a write to an ordinary variable followed a write to a volatile variable, the old JMM allowed the two writes to be reordered even though doing so could break the volatile memory semantics.

In the new Java memory model of JSR-133 and later, the memory semantics of volatile are enhanced: whenever reordering between a volatile variable and an ordinary variable could break the volatile memory semantics, that reordering is prohibited by the compiler's reordering rules and by the processor memory-barrier insertion policy.

 

Attachment: configure idea to print assembly instructions

Toolkit download address: link: https://pan.baidu.com/s/11yRnsOHca5EVRfE9gAuVxA
extraction code: gn8z

Unzip the downloaded toolkit and copy it to the bin directory under the jre path of the jdk installation directory, as shown in the figure:

(screenshot: the toolkit files copied into the jre/bin directory)

Then configure IDEA with the VM options: -server -Xcomp -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly -XX:CompileCommand=compileonly,*ClassName.methodName (replace ClassName.methodName with your own class and method).

For the JRE option, select the jre directory into which the toolkit was copied.

The following picture is my idea configuration:

(screenshot: my IDEA run configuration with the VM options above)

After the above configuration is run, the assembly instructions can be printed.


Origin blog.csdn.net/breakout_alex/article/details/105465436