Instruction reordering and memory barriers

First, instruction reordering

  Instruction reordering falls into three kinds: compiler optimization reordering, instruction-level parallelism reordering, and memory system reordering. As illustrated, the latter two happen at the processor level (i.e. the hardware level).

  • Compiler optimization reordering: the compiler may rearrange the order of the instructions it emits to improve efficiency, provided the program's result does not change. For example, suppose operation A has to wait for some other resource and the code after A does not depend on it. If everything were kept strictly in source order, the code behind A would sit idle until A completed, which is much slower, so the compiler may place the later, independent code first. Reordering of this kind can give a noticeable improvement in execution efficiency.
  • Instruction-level parallelism (ILP) reordering: the processor overlaps the execution of multiple instructions, again without affecting the program's result, in order to improve efficiency.
  • Memory system reordering: unlike the previous two, this is a pseudo reordering, i.e. execution only looks as if it were out of order. In modern processors a cache sits between the CPU and main memory, mainly to reduce how often the CPU (which is far faster) has to interact with main memory. On a read, the CPU goes to main memory only if the data is not in the cache. On a write, the value is written to the cache first and flushed to main memory later in one go. Fewer trips to main memory mean fewer brief CPU stalls and therefore better performance, but the delayed write-back can cause a problem: inconsistent data.
    // CPU 1 performs the following operations
    a = 1;
    int i = b;

    // CPU 2 performs the following operations
    b = 1;
    int j = a;

    Execution proceeds as shown in the figure: the CPU first writes a = 1 into its write buffer, then reads variable b, and only afterwards flushes a to main memory. Seen from main memory, the operation looks as if b was read first and a was written afterwards, i.e. a reordering occurred. No instruction actually ran out of order, which is why this is called a pseudo reordering.

    The above also shows that, because CPU 1 and CPU 2 may flush their writes at different times, the values finally read for (a, b) can be any of four combinations: (0,0), (0,1), (1,0), and (1,1). For example, if both variables are read while neither write buffer has been flushed to main memory, the result is (0,0); the other cases follow in the same way. This is why the implementation of the Java memory model allows certain types of reordering.
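The two-CPU experiment above can be tried out with a small Java sketch. This is only an illustration under my own assumptions: the class name StoreBufferDemo and the method run are made up for this example, the two "CPUs" are ordinary threads, and the fields are deliberately left non-volatile. It counts each (i, j) pair, that is, the values read for b and a.

```java
import java.util.Map;
import java.util.TreeMap;

final class StoreBufferDemo {
    // Plain (non-volatile) fields, so the compiler and the CPU are free to reorder accesses.
    static int a, b;
    static int i, j;

    /** Runs the two-thread experiment `trials` times and counts each observed (i, j) pair. */
    static Map<String, Integer> run(int trials) {
        Map<String, Integer> counts = new TreeMap<>();
        for (int t = 0; t < trials; t++) {
            a = 0; b = 0; i = 0; j = 0;
            Thread cpu1 = new Thread(() -> { a = 1; i = b; }); // "CPU 1"
            Thread cpu2 = new Thread(() -> { b = 1; j = a; }); // "CPU 2"
            cpu1.start();
            cpu2.start();
            try {
                cpu1.join();
                cpu2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            counts.merge("(" + i + "," + j + ")", 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // The distribution differs from run to run and from machine to machine.
        System.out.println(run(10_000));
    }
}
```

Every pair that shows up is one of the four combinations listed above. Because starting a thread is slow compared with the tiny window in which the reordering is visible, (0,0) may be rare or may not appear at all in a given run; seeing it reliably usually needs a dedicated stress harness.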

 

   as-if-serial semantics: this is the rule that every reordering must obey. It roughly means that, within a single thread, the order in which instructions execute may be changed to improve performance, as long as the final result of the program does not change.
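A tiny sketch of what as-if-serial permits, with names (AsIfSerialDemo, area) that I made up for illustration. Statements A and B do not depend on each other, so they may be swapped; C depends on both, so it cannot move ahead of either.

```java
final class AsIfSerialDemo {
    static double area(double radius) {
        double pi = 3.14;            // A: does not depend on B
        double r = radius;           // B: does not depend on A, so A and B may be swapped
        double result = pi * r * r;  // C: depends on A and B, so it must stay after both
        return result;
    }

    public static void main(String[] args) {
        System.out.println(area(2.0));
    }
}
```

Whichever of A and B runs first, a single thread calling area always gets the same value back, which is exactly the guarantee the rule gives.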

Second, memory barriers

  Using the volatile keyword prohibits instruction reordering on the compiler side; at the hardware level, prohibiting instruction reordering is achieved with memory barriers. These include the two barriers that already exist at the hardware layer, LoadBarriers and StoreBarriers, and the four memory barriers that the JVM wraps and implements.

   The hardware layer

    Memory barriers are divided into two types: LoadBarriers and StoreBarriers.

    • LoadBarriers: before the operations that follow the barrier execute, the cached data is guaranteed to be refreshed. That is, the cache is invalidated and the data is forcibly reloaded from main memory.
      i = a;
      LoadBarriers;
      // other operations ...

      In the pseudo-code above, before any other operation runs, variable a is guaranteed to have been reloaded from main memory into the cache.

    • StoreBarriers: data written to the cache before this barrier is synchronized to main memory, which guarantees that it is visible to other threads.
      a = 1;
      b = 2;
      c = 3;
      StoreBarriers;
      // other operations ...

      In the pseudo-code above, before any other operation runs, the three variables a, b, and c in the write cache are guaranteed to have been synchronized to main memory, so other threads can observe the changes.

  JVM implementation of memory barriers

  1. LoadLoad: for a sequence Load1; LoadLoad; Load2, Load1 is guaranteed to complete before Load2 and all subsequent load operations. For example:
    ...
    int i = a;
    LoadLoad;
    int j = b; 

    In this code, the read int i = a completes before int j = b and any later load. In other words, the read int i = a may not be reordered with the reads that come after it.

  2. LoadStore: for a sequence Load1; LoadStore; Store1, Load1 is guaranteed to complete before Store1 and all subsequent store operations. For example:
    int i = a;
    LoadStore;
    b = 1;

    In this code, the read int i = a completes before the store b = 1 and any later store.

  3. StoreLoad: as above, for a sequence Store1; StoreLoad; Load1, Store1 is guaranteed to complete, and the value it stored to be visible to other processors, before Load1 and all subsequent load operations. Because the store is flushed from the processor cache to main memory immediately and made visible to other processors, this barrier covers the function of the other three, but its overhead is correspondingly higher.
  4. StoreStore: for a sequence Store1; StoreStore; Store2, Store1 is guaranteed to complete before Store2 and all subsequent store operations, and the data written by Store1 is flushed so that it is visible to other processors.
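The four barriers are internal to the JVM and cannot be written in source code directly. The closest public API I know of is the set of static fence methods on java.lang.invoke.VarHandle (Java 9 and later), so the sketch below uses those as stand-ins. The mapping in the comments is my own reading, and the class and method names (FenceSketch, publish, consume) are invented for the example.

```java
import java.lang.invoke.VarHandle;

final class FenceSketch {
    static int data;
    static boolean ready;

    static void publish(int value) {
        data = value;                 // Store1
        VarHandle.storeStoreFence();  // StoreStore: Store1 is not reordered with the store below
        ready = true;                 // Store2
        VarHandle.fullFence();        // full fence; it includes the StoreLoad ordering
    }

    static int consume() {
        boolean seen = ready;         // Load1
        VarHandle.loadLoadFence();    // LoadLoad: Load1 is not reordered with the load below
        int value = data;             // Load2
        // There is no dedicated LoadStore method: acquireFence() covers LoadLoad + LoadStore,
        // and releaseFence() covers LoadStore + StoreStore.
        return seen ? value : -1;     // -1 means "nothing published yet"
    }

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> publish(42));
        writer.start();
        writer.join();                // join() keeps this demo deterministic
        System.out.println(consume());
    }
}
```

The join() call already makes the write visible here, so the fences only mark where each barrier would sit; in everyday code a volatile field is the usual way to get the same ordering.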

  volatile and the prohibition of instruction reordering

    As we all know, the volatile keyword has two semantics:

    • Ensure visibility of memory
    • Prohibit instruction reordering

    At the hardware level, the JVM prohibits instruction reordering by inserting memory barriers before and after accesses to variables declared volatile. The barrier rules for volatile variables are as follows:

  A StoreStore barrier is inserted before each volatile write, and a StoreLoad barrier after it;
  A LoadLoad barrier and a LoadStore barrier are inserted after each volatile read.
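Here is a minimal sketch of what these rules buy in practice: publishing an ordinary field through a volatile flag. The names (VolatileFlagDemo, runOnce) are hypothetical, and the comments only mark where the barriers conceptually sit; the actual instructions emitted depend on the JVM and the CPU.

```java
final class VolatileFlagDemo {
    static int data;                // ordinary field
    static volatile boolean ready;  // volatile flag

    static int runOnce() {
        data = 0;
        ready = false;
        Thread writer = new Thread(() -> {
            data = 42;     // ordinary write
                           // a StoreStore barrier keeps the write above before the volatile write
            ready = true;  // volatile write, followed by a StoreLoad barrier
        });
        writer.start();
        while (!ready) {   // volatile read; LoadLoad / LoadStore barriers keep the
            Thread.onSpinWait(); // read of data below from moving above it
        }
        int seen = data;   // sees 42 once ready has been observed as true
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(runOnce()); // prints 42
    }
}
```

If ready were not volatile, the reader might spin forever or read a stale data, because nothing would stop the two writes, or the two reads, from being reordered.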

    On the compiler side, this works because special rules apply to the six memory operations on volatile variables. See another of my articles, "Talking about the memory model", which introduces the two principles behind volatile and also explains why the volatile keyword does not provide atomicity.

 

 

 If anything in this article is incorrect, please point it out. Thank you!


Origin www.cnblogs.com/zhangweicheng/p/11674660.html