Java memory model and volatile

Foreword

In computer hardware, a CPU cache sits between the processor and main memory as a buffer that bridges the speed gap between the two. In a multi-core processor each core has its own cache, which creates a cache coherence problem. The MESI protocol mentioned earlier is one protocol for handling it: it classifies each cache line into one of several states and requires every core to publish its events on the bus and to monitor the bus. Each core then updates the contents and state of its cache according to these external events and its own internal events, so that the caches stay coherent. However, certain optimizations to MESI can leave temporarily inconsistent data in the caches, so memory barriers were introduced to work around this problem.

Even with a cache, the processor still wastes time waiting when it has to load data. So while the current instruction is blocked waiting for data, the processor tries to execute other instructions that do not depend on that data, in order to keep processing speed as high as possible. This is called out-of-order execution. The processor guarantees that the result is the same as that of in-order execution, but only from the point of view of the current processor. If a computation on another processor depends on intermediate results of the current one, it may observe results that do not match expectations. This problem can also be avoided with memory barriers.

Java memory model

The Java virtual machine specification defines its own Java memory model, which hides the differences between operating systems and hardware so that a program behaves the same on every platform. The model specifies that all variables are stored in main memory and that each thread has its own working memory. A thread reads and writes variables in its working memory and cannot access main memory directly. A thread also cannot access the working memory of another thread, so passing a variable's value between threads must go through main memory. The relationship between threads, working memory and main memory is roughly analogous to the relationship between processors, caches and memory in hardware. In addition, the JVM's just-in-time compiler performs optimizations similar to instruction reordering.

volatile variables

Java has a well-known way to implement the singleton pattern called "double-checked locking". It uses the synchronized and volatile keywords to keep the singleton correct under concurrent execution. In versions before JDK 1.5 (not including 1.5), however, this idiom was broken, because the underlying implementation of the volatile keyword only became fully correct in JDK 1.5.
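
As a minimal sketch of the double-checked locking idiom described above (the class name LazySingleton is hypothetical and not taken from the original article), the pattern is usually written roughly like this on JDK 1.5 or later:

```java
// Illustrative sketch only; LazySingleton is a hypothetical name.
final class LazySingleton {
    // volatile is essential here: without it, the reference could become
    // visible to another thread before the constructor's writes are.
    private static volatile LazySingleton instance;

    private LazySingleton() { }

    static LazySingleton getInstance() {
        LazySingleton local = instance;          // first check, without the lock
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;                // second check, under the lock
                if (local == null) {
                    local = new LazySingleton();
                    instance = local;            // volatile write publishes the object
                }
            }
        }
        return local;
    }
}
```

Reading the field into a local variable is an optional refinement that avoids a second volatile read on the fast path. The idiom is correct with or without it as long as the field is volatile.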

By definition, a variable marked as volatile gains two additional properties:

  1. Visibility. A modification of a volatile variable in one thread is immediately seen by other threads. As mentioned earlier, in the Java memory model values pass between threads through main memory. For performance reasons, a thread does not always fetch the latest value of a variable from main memory, but synchronizes with main memory only at certain points. The volatile keyword forces other threads to synchronize with the contents of main memory.
  2. No instruction reordering. For an ordinary variable, the only guarantee is that every place that depends on the variable gets the correct result. There is no guarantee that the assignments to the variable happen in the same order as in the source code: code that does not depend on the variable may be moved before or after it, so execution only "looks as if" it followed program order. The volatile keyword additionally prohibits this reordering.
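
The two properties can be illustrated together with a small sketch. The class, field and method names below are hypothetical. A reader thread spins on a volatile flag and then reads a plain field that was written before the flag was set. Visibility ensures the loop ends, and the reordering prohibition ensures the plain write cannot move past the volatile write:

```java
// Illustrative sketch only; all names are hypothetical.
final class VisibilityDemo {
    private static int payload;              // ordinary (non-volatile) field
    private static volatile boolean ready;   // volatile flag

    // Publishes a value through the volatile flag and returns what the
    // reader thread observed in the plain field.
    static int runOnce() {
        payload = 0;
        ready = false;
        final int[] seen = new int[1];

        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin: each iteration re-reads the volatile flag
            }
            seen[0] = payload;   // ordered after the volatile read of ready
        });
        reader.start();

        payload = 42;    // plain write; cannot be reordered after the volatile write
        ready = true;    // volatile write makes both writes visible to the reader

        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }
}
```

If ready were an ordinary field, the reader would be allowed to keep using a stale value and might never leave the loop.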

Before JDK 1.5, volatile did not prohibit reordering of ordinary reads and writes around an access to a volatile variable, so the code before and after such an access could still be reordered even though the variable was declared volatile. This is why double-checked locking could not be used to implement a singleton before JDK 1.5.

volatile implementation

The memory barriers mentioned above both keep stale data from lingering in the cache and prevent out-of-order execution across the barrier. volatile implements its two properties through memory barriers.

Memory barriers usually come in several kinds: read-write barriers (all reads and writes before the barrier complete before any reads and writes after it), read barriers (the guarantee covers reads only) and write barriers (the guarantee covers writes only). Different hardware architectures implement memory barriers differently. On x86, the memory barrier instructions are:

  • lfence: read barrier
  • sfence: write barrier
  • mfence: read-write barrier

However, when the machine code actually generated for a Java program is disassembled, these barrier instructions do not appear. Instead, a lock addl $0, 0(%esp) instruction follows each write to a volatile variable. The lock prefix causes the current processor's cache contents to be written back to memory and invalidates the corresponding cache lines of the other processors. This is equivalent to synchronizing the current thread's working memory to main memory, which guarantees visibility. From the point of view of instruction reordering, since the effects of everything before the lock instruction have been synchronized to memory, all operations before the lock have effectively completed, which gives the effect that "operations after the barrier cannot be moved in front of the barrier".
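
To look at this yourself, a small program along the following lines can be used. This is a sketch under assumptions: the class name VolatileWrite is hypothetical, and printing JIT-compiled code on HotSpot requires the diagnostic flags -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly together with the separately installed hsdis disassembler library. The exact instruction and stack offset that appear may differ across JVM versions and CPU architectures.

```java
// Illustrative sketch only; VolatileWrite is a hypothetical name.
// Possible invocation (HotSpot, with the hsdis library installed):
//   java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly VolatileWrite
final class VolatileWrite {
    private static volatile int counter;

    // A hot loop so that the JIT compiler is likely to compile this method.
    static int hotLoop(int iterations) {
        for (int i = 0; i < iterations; i++) {
            counter = i;   // volatile store: look for the lock-prefixed instruction after it
        }
        return counter;
    }

    public static void main(String[] args) {
        System.out.println(hotLoop(1_000_000));
    }
}
```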

The actual role of lock

As we have seen, lock has the semantics of a memory barrier. So what exactly does lock do? lock is an instruction prefix that guarantees the instruction following it executes atomically. The processor achieves this by asserting the LOCK# signal while the instruction executes, which gives it exclusive access to memory (by locking the bus). The LOCK# signal is released automatically when the instruction finishes. Starting with the Intel Pentium Pro, when the memory address to be locked is already loaded in the cache, the processor locks the corresponding cache line instead of asserting the LOCK# signal.
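
From Java, the atomicity that a lock-prefixed instruction provides is most directly visible through the java.util.concurrent.atomic classes. The sketch below uses a hypothetical class name, and it is an assumption about typical HotSpot behaviour on x86, not something stated in the original article, that AtomicInteger.incrementAndGet is compiled to a lock-prefixed instruction. It shows several threads incrementing a shared counter without losing updates:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; AtomicCounterDemo is a hypothetical name.
final class AtomicCounterDemo {
    // Starts the given number of threads, each performing the given number
    // of atomic increments, and returns the final counter value.
    static int countWithThreads(int threads, int incrementsPerThread) {
        AtomicInteger counter = new AtomicInteger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet();   // atomic read-modify-write
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

Note that a plain volatile int incremented with ++ would not give the same guarantee: volatile provides visibility and ordering, but the read, add and write of ++ are still separate steps.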

In other words, volatile is implemented by putting a lock prefix on a no-op instruction, which locks the cache and thereby provides both visibility and the prohibition of reordering. As for why addl $0, 0(%esp) is used together with lock: the lock prefix is only supported on instructions that operate on memory, so it cannot be applied directly to the nop instruction.

Origin juejin.im/post/5dac47d95188254ed108fa8b