The definition and implementation principle of volatile

      The 3rd edition of the Java Language Specification defines volatile as follows: the Java programming language allows threads to access shared variables; to ensure that a shared variable is updated accurately and consistently, a thread should ensure that it has exclusive use of the variable, normally by acquiring a lock. The Java language also provides volatile, which in some cases is more convenient than locking. If a field is declared volatile, the Java memory model ensures that all threads see a consistent value for that variable.
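That visibility guarantee can be seen directly in a small program. Here is a minimal sketch (my own example, not from the article): a reader thread spins on a flag, and because the flag is volatile, the writer's update is guaranteed to become visible and the loop terminates. With a plain boolean, the reader could in principle spin forever on a stale cached value.

```java
public class VolatileVisibility {
    static volatile boolean running = true;

    // Starts a spinning reader, clears the flag, and reports whether the
    // reader observed the volatile write within one second.
    static boolean demo() {
        try {
            Thread worker = new Thread(() -> {
                while (running) { /* re-reads the volatile field each pass */ }
            });
            worker.start();
            Thread.sleep(50);      // let the worker start spinning
            running = false;       // volatile write: published to all threads
            worker.join(1000);
            return !worker.isAlive();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("worker stopped: " + demo());
    }
}
```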

      Before looking at how volatile is implemented, let's review the CPU terms related to its implementation.

      Definitions of CPU terms (the table appeared as an image in the original and is not reproduced here).

      How does volatile guarantee visibility? Let's look at what the CPU does when a volatile variable is written, by inspecting the assembly instructions that the JIT compiler generates on an x86 processor (for example, with HotSpot's -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly flags and the hsdis disassembler plugin).
      The Java code is as follows:

instance = new Singleton(); // instance is a volatile field

      Converted to assembly code, as follows:

0x01a3de1d: movb $0x0,0x1104800(%esi);
0x01a3de24: lock addl $0x0,(%esp);

      When a shared variable declared volatile is written, this second line of assembly code is added. Consulting the IA-32 Architecture Software Developer's Manual shows that a Lock-prefixed instruction causes two things on a multi-core processor:

1) Write the data of the current processor cache line back to system memory.
2) This write-back operation will invalidate the data cached at the memory address in other CPUs.
To improve processing speed, the processor does not communicate with memory directly; instead it first reads data from system memory into its internal cache (L1, L2, or other levels) before operating on it, and it is not known when the result will be written back to memory. If a write is performed on a variable declared volatile, however, the JVM sends the processor a Lock-prefixed instruction that writes the cache line containing the variable back to system memory.

But even after the write-back, the copies cached by other processors may still be stale, and computing with them would be incorrect. Therefore, on multiprocessor systems, a cache coherence protocol is implemented to keep every processor's cache consistent: each processor sniffs the data propagated on the bus to check whether its own cached values have expired. When a processor finds that the memory address corresponding to one of its cache lines has been modified, it sets that cache line to the invalid state; the next time it operates on that data, it re-reads it from system memory into the processor cache.
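This write-back/invalidate pair is what makes the common "publish with a volatile flag" idiom work in Java: plain writes made before the volatile store become visible to any thread that later reads the flag. A hedged sketch (the class and field names are mine, not from the article):

```java
public class SafePublication {
    static int payload;                  // plain, non-volatile field
    static volatile boolean published;   // volatile guard

    // Spins until the guard is set, then reads the payload. The volatile
    // read of 'published' guarantees this thread sees the plain write to
    // 'payload' that happened before the volatile write.
    static int readWhenPublished() {
        while (!published) {
            Thread.onSpinWait();         // busy-wait hint (Java 9+)
        }
        return payload;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(
            () -> System.out.println("read " + readWhenPublished()));
        reader.start();
        payload = 42;        // plain write...
        published = true;    // ...made visible by this volatile write
        reader.join();
    }
}
```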

Let's explain these two implementation details of volatile.
1) The Lock-prefixed instruction causes the processor's cache line to be written back to memory. The Lock prefix causes the processor's LOCK# signal to be asserted while the instruction executes. In a multiprocessor environment, the asserted LOCK# signal ensures that the processor has exclusive use of any shared memory. However, on recent processors the LOCK# signal generally does not lock the bus but locks the cache instead, since locking the bus is expensive. The effect of locking operations on the processor cache is detailed in Section 8.1.4 of the IA-32 manual: on Intel486 and Pentium processors, the LOCK# signal is always asserted on the bus during a locked operation, but on P6 and later processors the LOCK# signal is not asserted if the memory region being accessed is already cached inside the processor. Instead, the processor locks that memory region's cache line, writes it back to memory, and relies on the cache coherence mechanism to ensure that the modification is atomic. This operation is called "cache locking". The cache coherence mechanism prevents a memory region cached by more than one processor from being modified by two or more processors at the same time.
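The same lock-prefixed "cache locking" underlies the compare-and-swap style instructions used by java.util.concurrent.atomic. As a sketch (class and iteration counts are mine), contrast an atomic increment, which on x86 compiles to a lock-prefixed instruction, with a plain read-modify-write, which volatile alone would not make atomic:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LockPrefixCas {
    // Runs two threads incrementing a counter 100,000 times each and
    // returns the final total. With 'atomic' the increments use a
    // lock-prefixed instruction and no updates are lost.
    static int race(boolean atomic) {
        AtomicInteger counter = new AtomicInteger();
        int[] plain = {0};
        Runnable task = () -> {
            for (int i = 0; i < 100_000; i++) {
                if (atomic) counter.incrementAndGet();  // atomic RMW
                else plain[0]++;                        // non-atomic RMW
            }
        };
        try {
            Thread t1 = new Thread(task), t2 = new Thread(task);
            t1.start(); t2.start(); t1.join(); t2.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return atomic ? counter.get() : plain[0];
    }

    public static void main(String[] args) {
        System.out.println("atomic total = " + race(true));   // always 200000
        System.out.println("plain  total = " + race(false));  // typically less, due to lost updates
    }
}
```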
2) Writing one processor's cache back to memory invalidates the corresponding cache lines of other processors. IA-32 and Intel 64 processors use the MESI (Modified, Exclusive, Shared, Invalid) protocol to keep their internal caches coherent with the caches of other processors. In a multi-core system, IA-32 and Intel 64 processors can sniff other processors' accesses to system memory and to their internal caches, and they use this sniffing to ensure that data in their internal caches, in system memory, and in other processors' caches stays coherent on the bus. For example, on Pentium and P6 family processors, if one processor sniffs that another processor intends to write to a memory address it currently holds in the shared state, the sniffing processor invalidates its cache line, forcing a cache line fill on its next access to the same memory address.
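The invalidation step above can be modeled as a toy state machine (purely illustrative, my own simplification; real hardware protocols involve more transitions and bus messages than this):

```java
import java.util.Arrays;

public class MesiModel {
    // The four MESI states for one cache line in one processor's cache.
    enum State { MODIFIED, EXCLUSIVE, SHARED, INVALID }

    // Models one processor writing to a line: the writer's copy becomes
    // MODIFIED, and sniffing causes every other cache to drop its copy
    // to INVALID, forcing a cache line fill on its next access.
    static State[] write(State[] caches, int writer) {
        State[] next = caches.clone();
        for (int i = 0; i < next.length; i++) {
            next[i] = (i == writer) ? State.MODIFIED : State.INVALID;
        }
        return next;
    }

    public static void main(String[] args) {
        State[] caches = { State.SHARED, State.SHARED, State.INVALID };
        System.out.println("before: " + Arrays.toString(caches));
        caches = write(caches, 0);   // CPU 0 writes the shared line
        System.out.println("after:  " + Arrays.toString(caches));
    }
}
```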


Author: jiankunking Source: http://blog.csdn.net/jiankunking
