In-depth understanding of the volatile keyword



The root of the problem: a shared-data visibility problem caused by optimizations at the hardware level

CPUs are far faster than main memory (and memory is faster than IO and disk), so a CPU performing data operations would stall waiting on memory access. Several hardware optimizations close this speed gap, and each one introduces a new problem:

1. **High-speed caches.** Each CPU core gets a hierarchy of caches: L1, L2, and L3 (progressively slower; L1 and L2 are typically per core, L3 shared). But once each core holds its own cached copy of shared data, a consistency problem appears: if CPU 1 changes the value of a shared variable `a` but does not write it back to main memory in time, CPU 2 computes with a stale value of `a`. This is the cache coherence problem.
2. **Cache coherence protocols.** Protocols such as MESI (used on the x86 architecture, named for its four cache-line states: Modified, Exclusive, Shared, Invalid) solve this by locking at cache-line granularity (a much finer lock than locking the whole bus) and by having cores notify one another when a cached copy of `a` has become invalid and must be re-fetched (L2 -> L3 -> main memory). But the coherence messages themselves have latency: while a core waits for the acknowledgements, it is blocked, which wastes CPU resources. So what next?
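The visibility problem described above can be observed from Java. Below is a minimal sketch (class and field names are mine): one thread spins on a plain, non-volatile flag while another thread sets it. On some JVMs the JIT may hoist the flag read out of the loop, so the worker may never see the update; the demo bounds the wait so it always finishes either way.

```java
// Sketch of the visibility hazard: a write to a plain (non-volatile) field by
// one thread may never be observed by another thread. Outcome is JVM-dependent,
// so the demo uses a bounded join and then exits.
public class VisibilityHazard {
    static boolean stopRequested = false; // plain field: no visibility guarantee

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            long spins = 0;
            // Without volatile, the JIT may hoist this read out of the loop,
            // so the loop can run forever on some JVMs.
            while (!stopRequested) {
                spins++;
            }
            System.out.println("worker saw the flag after " + spins + " spins");
        });
        worker.start();
        Thread.sleep(100);
        stopRequested = true;   // this write may sit unseen by the worker
        worker.join(1000);      // bounded wait so the demo cannot hang
        System.out.println(worker.isAlive()
                ? "worker never saw the update (stale read)"
                : "worker terminated");
        System.exit(0);         // kill the worker if it is still spinning
    }
}
```

Whether the worker terminates depends on the JVM and JIT settings; that nondeterminism is exactly the hazard.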
3. **Store buffers.** Hardware engineers introduced the store buffer to make the write asynchronous and further improve CPU utilization: when changing the shared variable `a`, the core does not wait for the coherence acknowledgements; it parks the write in the store buffer and keeps executing the following instructions. But this causes yet another problem: instruction reordering. From another core's point of view, the write to `a` may become visible late, or out of order with other writes, so in certain scenarios code appears to execute out of order. This problem cannot be solved purely at the hardware level, because the hardware cannot know which reorderings the software can tolerate. So the CPU exposes instructions for software developers to use: memory barriers (read barrier, write barrier, full barrier). A barrier does not remove the store buffer optimization globally; at the barrier point it forces the pending writes in the store buffer to be flushed (and the corresponding cache lines on other CPUs to be invalidated), which restores visibility of the shared variable across cores. The mapping at the Java level is the `volatile` keyword, which instructs the JVM to emit the appropriate memory-barrier operations.
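The Java-level mapping can be shown with the same flag pattern, now declared `volatile` (class name is mine). The Java Memory Model guarantees that the volatile write becomes visible to the reading thread, so this version always terminates:

```java
// Same pattern with volatile: the JMM guarantees the worker eventually sees
// the write, because the JVM emits the needed memory barriers around
// volatile accesses.
public class VolatileVisibility {
    static volatile boolean stopRequested = false; // volatile: writes are published

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            while (!stopRequested) {
                // busy-wait; a volatile read cannot be hoisted out of the loop
            }
            System.out.println("worker stopped");
        });
        worker.start();
        Thread.sleep(50);
        stopRequested = true; // volatile write: guaranteed visible to the worker
        worker.join();        // guaranteed to return
        System.out.println("done");
    }
}
```

Note that `volatile` guarantees visibility and ordering only; it does not make compound operations such as `a++` atomic.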

Open question:
1. Does the memory barrier simply disable the store buffer optimization, or does it suppress other optimizations as well?
   
 

Origin blog.csdn.net/qq_36336332/article/details/109375645