Instruction rearrangement
The instructions a computer executes are normally the instruction sequence emitted by the compiler, and that sequence is expected to produce the same well-defined result on every run. To improve execution efficiency, however, both the compiler and the CPU are allowed to reorder instructions according to certain rules:
Source code -> compiler-optimized reordering -> instruction-level parallel reordering -> memory system reordering -> final executed instructions
Within a single thread, reordering must leave the final result of the program identical to the result of executing the code sequentially.
The processor must respect data dependencies between instructions when reordering.
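The two rules above can be illustrated with a minimal sketch. The class and method names are hypothetical and chosen only for this illustration. Statements (1) and (2) do not depend on each other, so the compiler or processor may swap them. Statement (3) depends on both, so it cannot be moved ahead of either, and the single-threaded result is the same whichever order is chosen.

```java
final class AsIfSerialSketch {
    // Hypothetical helper: three statements with one data dependency.
    static int compute() {
        int a = 1;      // (1) independent of (2)
        int b = 2;      // (2) independent of (1), so (1) and (2) may be swapped
        int c = a + b;  // (3) depends on (1) and (2), so it cannot move before them
        return c;
    }

    public static void main(String[] args) {
        System.out.println(compute()); // prints 3
    }
}
```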
In a multi-threaded environment, threads execute alternately. Because of reordering, there is no guarantee that variables shared between two threads stay consistent, so the result cannot be predicted.
int a = 0, b = 0, x = 0, y = 0;
Thread 1 | Thread 2 |
x = a; | y = b; |
b = 1; | a = 2; |
x = 0 and y = 0 |
If the compiler reorders this code, the following situation may occur:
Thread 1 | Thread 2 |
b = 1; | a = 2; |
x = a; | y = b; |
x = 2 and y = 1 |
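The two-thread program above can be tried out with a small test harness. This is a sketch under assumptions: the class name ReorderingLitmus and the method run are hypothetical, and the fields are deliberately plain (non-volatile). Which outcomes actually show up depends on the JVM, the hardware and thread timing, so the reordered result x=2 y=1 may be rare or may never appear on a given machine.

```java
import java.util.Map;
import java.util.TreeMap;

final class ReorderingLitmus {
    static int a, b, x, y; // plain fields: no ordering guarantees between threads

    // Runs the two-thread program `trials` times and counts each (x, y) outcome.
    static Map<String, Integer> run(int trials) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int i = 0; i < trials; i++) {
            a = 0; b = 0; x = 0; y = 0;
            Thread t1 = new Thread(() -> { x = a; b = 1; });
            Thread t2 = new Thread(() -> { y = b; a = 2; });
            t1.start();
            t2.start();
            try {
                t1.join(); // join() makes the threads' writes to x and y visible here
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            counts.merge("x=" + x + " y=" + y, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The printed counts vary from run to run.
        System.out.println(run(100_000));
    }
}
```

Without reordering, only x=0 y=0, x=0 y=1 and x=2 y=0 are possible. An occurrence of x=2 y=1 would mean that at least one thread's two statements took effect out of program order.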
How volatile prohibits instruction rearrangement
The volatile keyword prevents instruction reordering by means of memory barriers. To implement volatile memory semantics, the compiler inserts memory barriers into the generated instruction sequence to prohibit specific types of processor reordering.
Most processors support memory barrier instructions.
It is almost impossible for the compiler to find an optimal arrangement that minimizes the total number of inserted barriers. For this reason, the Java memory model (JMM) adopts a conservative strategy. The JMM inserts memory barriers as follows:
Insert a StoreStore barrier before each volatile write operation.
Insert a StoreLoad barrier after each volatile write operation.
Insert a LoadLoad barrier after each volatile read operation.
Insert a LoadStore barrier after each volatile read operation.
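As a sketch of what this strategy buys, the earlier two-thread program can be rerun with a and b declared volatile. The class name VolatileLitmus and the method countForbidden are hypothetical. The comments mark where the barriers would conceptually sit under the conservative strategy. They are annotations only, because the actual instructions emitted depend on the JVM and the processor.

```java
final class VolatileLitmus {
    static volatile int a, b; // volatile: accesses keep their program order
    static int x, y;          // only read after join(), so plain fields suffice

    // Returns how many of `trials` runs ended with x == 2 && y == 1.
    static int countForbidden(int trials) {
        int forbidden = 0;
        for (int i = 0; i < trials; i++) {
            a = 0;
            b = 0;
            Thread t1 = new Thread(() -> {
                x = a; // volatile read:  (LoadLoad and LoadStore barriers after it)
                b = 1; // volatile write: (StoreStore barrier before, StoreLoad after)
            });
            Thread t2 = new Thread(() -> {
                y = b; // volatile read
                a = 2; // volatile write
            });
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (x == 2 && y == 1) {
                forbidden++;
            }
        }
        return forbidden;
    }

    public static void main(String[] args) {
        System.out.println(countForbidden(100_000)); // prints 0
    }
}
```

With volatile, each thread's read stays ahead of its write. For x to be 2 the write a = 2 must precede the read of a, and for y to be 1 the write b = 1 must precede the read of b. The two requirements contradict each other, so the outcome x = 2 and y = 1 can no longer occur.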
What is a memory barrier
A memory barrier (sometimes called a memory fence) is a CPU instruction that controls reordering and memory visibility under certain conditions. The Java compiler also refrains from reordering across memory barriers.
A memory barrier has two roles:
1. To guarantee the execution order of specific operations.
2. To guarantee the memory visibility of certain variables (volatile relies on this to achieve memory visibility).
Memory barriers can be divided into the following types:
LoadLoad barrier: for the sequence Load1; LoadLoad; Load2, the data read by Load1 is guaranteed to have been loaded before Load2 and any subsequent loads access their data.
StoreStore barrier: for the sequence Store1; StoreStore; Store2, the write by Store1 is guaranteed to be visible to other processors before Store2 and any subsequent stores are performed.
LoadStore barrier: for the sequence Load1; LoadStore; Store2, the data read by Load1 is guaranteed to have been loaded before Store2 and any subsequent stores are flushed.
StoreLoad barrier: for the sequence Store1; StoreLoad; Load2, the write by Store1 is guaranteed to be visible to all processors before Load2 and any subsequent loads are performed. It is the most expensive of the four barriers. On most processors it acts as an all-purpose barrier that also provides the effects of the other three.
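Java 9 and later expose explicit fences as static methods on java.lang.invoke.VarHandle, which gives a rough way to see the barrier types in code. The sketch below is an approximation, not a statement of how the JVM implements volatile. The class FenceSketch and its methods are hypothetical, and the mapping in the comments follows the documented ordering effect of each fence method. The demonstration in main is single-threaded, so it only shows where the fences go.

```java
import java.lang.invoke.VarHandle;

final class FenceSketch {
    static int data;
    static boolean ready;

    static void publish(int value) {
        data = value;                // Store1
        VarHandle.storeStoreFence(); // StoreStore: Store1 is ordered before Store2
        ready = true;                // Store2
        VarHandle.fullFence();       // full fence, includes StoreLoad:
                                     // the stores above are ordered before later loads
    }

    static int consume() {
        boolean seen = ready;        // Load1
        VarHandle.acquireFence();    // LoadLoad + LoadStore: Load1 is ordered
                                     // before later loads and stores
        return seen ? data : -1;     // Load2
    }

    public static void main(String[] args) {
        publish(42);
        System.out.println(consume()); // prints 42
    }
}
```

VarHandle.loadLoadFence() covers only the LoadLoad case, and VarHandle.releaseFence() combines LoadStore and StoreStore. In ordinary application code, declaring the field volatile is usually the simpler choice.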
Because both the compiler and the processor can reorder instructions, inserting a memory barrier between instructions tells the compiler and the CPU that no instruction may be reordered across the barrier. In other words, inserting a memory barrier prohibits reordering optimizations between the instructions before it and the instructions after it.
Another function of a memory barrier is to force cached data in the CPUs to be flushed, so that a thread on any CPU reads the latest version of the data.
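The visibility effect can be sketched with a volatile flag that publishes a plain field. The class VisibilitySketch and the method readAfterPublish are hypothetical names for this illustration. The writer stores data and then sets the volatile flag. Once the reader sees the flag as true, it is guaranteed to see the data written before it.

```java
final class VisibilitySketch {
    static int data;               // plain field, published through the flag
    static volatile boolean ready; // volatile flag

    // Starts a reader and a writer thread and returns the value the reader saw.
    static int readAfterPublish() {
        data = 0;
        ready = false;
        int[] result = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin: each iteration performs a volatile read of the flag
            }
            result[0] = data; // ordered after the volatile read of ready
        });
        Thread writer = new Thread(() -> {
            data = 42;    // plain write
            ready = true; // volatile write: makes the write to data visible too
        });
        reader.start();
        writer.start();
        try {
            reader.join();
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(readAfterPublish()); // prints 42
    }
}
```

If ready were not volatile, the reader would have no guarantee of ever seeing the update and could spin forever, or could see ready as true while still reading a stale value of data.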