Concurrent programming - visibility and ordering

JMM is the Java Memory Model. It defines the abstract concepts of main memory and working memory, which at the hardware level correspond to CPU registers, caches, main memory, and CPU instruction optimizations. JMM manifests itself in the following aspects:

  • Atomicity: guarantees that instructions are not affected by thread context switches
  • Visibility: guarantees that instructions are not affected by the CPU cache
  • Ordering: guarantees that instructions are not affected by the parallel optimization of CPU instructions

One: Visibility

1.1: Demonstrating the problem

First, take a look at the following piece of code:

public class Demo {

  static boolean run = true;

  public static void main(String[] args) {
    new Thread(() -> {
      while (run) {
      }
      System.out.println("loop stopped");
    }).start();

    try {
      Thread.sleep(1000);
    } catch (InterruptedException e) {
      e.printStackTrace();
    }
    System.out.println("stopping t");
    run = false;
  }
}

Running this code, you will find that even after the main thread sets run to false, the child thread never exits the loop.

When the child thread starts, it reads the value of run from main memory into its working memory.

Because the child thread reads the value of run from main memory very frequently, the JIT compiler caches the value of run in the thread's own working memory, reducing accesses to run in main memory and improving efficiency.

After one second, the main thread modifies the value of run and synchronizes it to main memory, but the child thread keeps reading the variable from the cache in its own working memory, so it always sees the stale value.

1.2: Solving the problem

We can solve this with the volatile keyword. It can be applied to instance fields and static fields. It prevents a thread from looking up the variable's value in its own working cache and forces every access to go to main memory: a thread's reads and writes of a volatile variable operate on main memory directly. The trade-off is that this also reduces the program's efficiency.
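
A minimal sketch of the fix, reusing the Demo above: declaring run as volatile forces the child thread to re-read it from main memory, so the loop now exits.

public class Demo {

  volatile static boolean run = true;  // volatile: reads and writes go to main memory

  public static void main(String[] args) throws InterruptedException {
    new Thread(() -> {
      while (run) {
      }
      System.out.println("loop stopped");
    }).start();

    Thread.sleep(1000);
    System.out.println("stopping t");
    run = false;  // now visible to the child thread
  }
}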

In addition, you can also use synchronized to solve this problem:

public class Demo {

  static boolean run = true;

  // lock object
  final static Object lock = new Object();

  public static void main(String[] args) {
    new Thread(() -> {
      while (true) {
        synchronized (lock) {
          if (!run) break;
        }
      }
      System.out.println("loop stopped");
    }).start();

    try {
      Thread.sleep(1000);
    } catch (InterruptedException e) {
      e.printStackTrace();
    }
    System.out.println("stopping t");
    synchronized (lock) {
      run = false;
    }
  }
}

How synchronized achieves visibility:

  • Before the lock is released, data must be written back to main memory: once a code block or method is protected by synchronized, any modification the thread made to the locked data must be written back from the thread's working memory to main memory before the lock is released, so the contents of working memory and main memory never end up inconsistent.
  • After the lock is acquired, data must be read from main memory: likewise, when a thread enters the code block and obtains the lock, it reads the locked data directly from main memory; since the previous holder wrote its modifications back to main memory before releasing the lock, the data the thread reads from main memory is guaranteed to be the latest.

The difference between synchronized and volatile:

  • In essence, volatile tells the JVM that the current value of the variable in the register (working memory) is uncertain and must be read from main memory; synchronized locks the current variable so that only the current thread can access it, while other threads are blocked.
  • volatile can only be used at the variable level; synchronized can be used at the variable, method, and class levels.
  • volatile only provides visibility of modifications to a variable and cannot guarantee atomicity; synchronized guarantees both visibility and atomicity (see the sketch after this list).
  • volatile does not cause thread blocking; synchronized may cause thread blocking.
  • Variables marked volatile are not optimized by the compiler; variables accessed under synchronized can be optimized by the compiler.
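
To make the atomicity point concrete, here is a small sketch (my own example, not from the original article): counter++ on a volatile field is still a three-step read-modify-write, so concurrent increments can be lost even though every read sees the latest value.

public class VolatileNotAtomic {

  volatile static int counter = 0;  // visible to all threads, but ++ is not atomic

  public static void main(String[] args) throws InterruptedException {
    Runnable task = () -> {
      for (int i = 0; i < 10_000; i++) {
        counter++;  // read, add, write: three steps that can interleave
      }
    };
    Thread t1 = new Thread(task);
    Thread t2 = new Thread(task);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // usually prints less than 20000; a synchronized block (or AtomicInteger) fixes it
    System.out.println(counter);
  }
}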

Two: Instruction reordering

Without affecting correctness, the JVM may adjust the execution order of statements.

Look at the following code first: it assigns values to i and j. Assigning i first and then j gives the same result as assigning j first and then i, so in actual execution either order may occur. This behavior is called instruction reordering.

static int i;
static int j;

// in some thread, execute the following assignments
i = ...;
j = ...;

The point of instruction reordering ultimately comes down to performance. Compared with the cache, memory, and disk I/O, the CPU is faster by orders of magnitude, and it is a precious system resource; optimizing its utilization improves the performance of the entire computer system. Instruction reordering is an optimization idea borrowed from everyday life: when cooking, we usually start the slowest dish first (a soup, say), because while waiting for it to finish (I/O wait) we (the CPU) can do other things. Computers work the same way, optimizing according to the type of instruction so that CPU resources are fully used and overall efficiency improves.

There are three reordering scenarios:

  • Compiler reordering: at the source level, the compiler can adjust and reorder code statements without changing the semantics of a single-threaded program.
  • Instruction-level parallel reordering: at the CPU instruction level, the processor uses instruction-level parallelism to overlap the execution of multiple instructions; if there is no data dependency, it can change the execution order of the machine instructions corresponding to the statements.
  • Memory reordering: because the CPU cache delays writes through a store buffer, the caches of different CPUs can temporarily disagree; this visibility problem makes the observed order of instructions differ from program order, and memory reordering is in fact the main cause of visibility problems (a sketch that tries to expose it follows this list).
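
The sketch below (my own, with hypothetical names; it is not from the article) tries to expose memory reordering: each thread writes one variable and then reads the other. Under the JMM, a run where both reads see 0 is allowed; on a given JVM and CPU it may take many trials to observe, or may not show up at all in a naive harness like this one, which is why a tool such as jcstress is normally used.

public class ReorderDemo {

  static int x, y;
  static int r1, r2;

  public static void main(String[] args) throws InterruptedException {
    for (long trial = 1; trial <= 100_000; trial++) {
      x = 0; y = 0; r1 = 0; r2 = 0;
      Thread t1 = new Thread(() -> { x = 1; r1 = y; });
      Thread t2 = new Thread(() -> { y = 1; r2 = x; });
      t1.start(); t2.start();
      t1.join(); t2.join();
      if (r1 == 0 && r2 == 0) {  // neither thread saw the other's write
        System.out.println("reordering observed on trial " + trial);
        return;
      }
    }
    System.out.println("not observed in this run (the JMM still allows it)");
  }
}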

The principle behind instruction reordering is as-if-serial semantics. The compiler and processor do not reorder instructions in every scenario; they follow one rule: a reordering optimization is performed only when it is judged not to affect the result of the program. If a reordering would change the program's result, the performance optimization would obviously be meaningless. Following as-if-serial semantics, then, the compiler and processor may reorder as they like, on one condition: no reordering may change the result of a single-threaded execution of the program.
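
For example (a small sketch of my own), as-if-serial permits reordering independent statements but forbids reordering across a data dependence:

int a = 1;      // independent of b: may be reordered after b = 2
int b = 2;      // independent of a: may be reordered before a = 1
int c = a + b;  // depends on both a and b: must stay after them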

In a complex multi-threaded environment, the compiler and processor cannot discover the dependencies between code in different threads through semantic analysis; only the person who writes the code knows them. The programmer therefore needs a way to tell the compiler and processor explicitly where logical dependencies exist and reordering must not occur. For this purpose, both the compiler level and the CPU level provide memory barrier instructions that prohibit reordering: the programmer identifies where the data dependencies are and inserts a memory barrier instruction, and the computer will then not apply instruction optimizations across it.

However, different CPU architectures and operating systems have their own memory barrier instructions. To simplify developers' work and spare them from having to understand each underlying system, Java encapsulates a set of specifications that isolates these complex instruction-level operations from developers: this is what we usually call the Java Memory Model (JMM). JMM defines a number of happens-before rules to guide correct concurrent programming, and programmers can use the keywords volatile, synchronized, and final to tell the compiler and processor where reordering is not allowed.
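
A classic illustration of this (my addition, not from the article) is double-checked locking: without volatile, the store of the new reference may be reordered with the writes inside the constructor, so another thread can observe a non-null but half-constructed object.

class Singleton {

  // volatile forbids reordering the reference store with the constructor's writes
  private static volatile Singleton instance;

  private Singleton() { }

  static Singleton getInstance() {
    if (instance == null) {                 // first check, without locking
      synchronized (Singleton.class) {
        if (instance == null) {             // second check, under the lock
          instance = new Singleton();
        }
      }
    }
    return instance;
  }
}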

Three: volatile

A memory barrier has two functions:

  • it prevents the instructions on either side of the barrier from being reordered across it
  • it forces dirty data in the write buffer/cache to be written back to main memory and invalidates the corresponding cache entries

There are two basic types of memory barriers, the Load Barrier and the Store Barrier:

  • inserting a Load Barrier before an instruction invalidates the data in the cache and forces it to be reloaded from main memory
  • inserting a Store Barrier after an instruction writes the latest data in the write buffer to main memory, making it visible to other threads

The so-called four memory barriers in Java, LoadLoad, StoreStore, LoadStore, and StoreLoad, are in fact combinations of the two above, each providing a particular combination of ordering and data synchronization (a JDK-level sketch follows this list):

  • LoadLoad barrier: for a sequence Load1; LoadLoad; Load2, it guarantees that the data read by Load1 is loaded before Load2 and any subsequent reads access their data.
  • StoreStore barrier: for a sequence Store1; StoreStore; Store2, it guarantees that the write of Store1 is visible to other processors before Store2 and any subsequent writes execute.
  • LoadStore barrier: for a sequence Load1; LoadStore; Store2, it guarantees that the data read by Load1 is loaded before Store2 and any subsequent writes are flushed out.
  • StoreLoad barrier: for a sequence Store1; StoreLoad; Load2, it guarantees that the write of Store1 is visible to all processors before Load2 and any subsequent reads execute. Its overhead is the largest of the four; on most processor implementations it is a full barrier that also performs the functions of the other three.
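
Since Java 9 the JDK exposes these barriers directly as static fence methods on java.lang.invoke.VarHandle. The mapping in the comments below is my own rough reading, so treat this as a sketch rather than a specification:

import java.lang.invoke.VarHandle;

public class Fences {
  public static void main(String[] args) {
    VarHandle.loadLoadFence();    // LoadLoad: earlier loads ordered before later loads
    VarHandle.storeStoreFence();  // StoreStore: earlier stores ordered before later stores
    VarHandle.acquireFence();     // roughly LoadLoad + LoadStore
    VarHandle.releaseFence();     // roughly StoreStore + LoadStore
    VarHandle.fullFence();        // full barrier, including the expensive StoreLoad
  }
}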

The underlying implementation of volatile is the memory barrier (Memory Barrier, also called a Memory Fence). Its barrier strategy is strict, conservative, and very pessimistic:

  • a write barrier is added after each write to a volatile variable
  • a read barrier is added before each read of a volatile variable

Write barrier (Store Barrier): ensures that changes to shared variables before the barrier are synchronized to main memory:

static int num = 0;
volatile static boolean ready = false;

public void actor2(I_Result r) {
  num = 2;
  ready = true;  // ready is volatile; the assignment carries a write barrier
  // write barrier: num = 2 above is flushed to main memory no later than ready = true
}

Read barrier (Load Barrier): ensures that reads of shared variables after the barrier load the latest data from main memory:

static int num = 0;
volatile static boolean ready = false;

public void actor1(I_Result r) {
  // read barrier
  // ready is volatile; reading it carries a read barrier
  if (ready) {
    r.r1 = num + num;
  } else {
    r.r1 = 1;
  }
}

Because of these memory barriers, reordering between accesses to volatile variables and other instructions is prevented and communication between threads is achieved, which is why volatile exhibits some of the characteristics of a lock.

volatile performance: reading a volatile variable costs almost the same as reading an ordinary variable, but writing is slightly slower, because the native code must insert memory barrier instructions to keep the processor from executing out of order.

Four: happens-before

Happens-before specifies when a write to a shared variable is visible to reads of that variable by other threads; it is a summary of the rules for visibility and ordering. Outside of the happens-before rules listed below, JMM makes no guarantee that one thread's write to a shared variable is visible to another thread's read of it.

A write to a variable made before a thread unlocks m is visible to reads of that variable by other threads that subsequently lock m:

  static int x;
  static Object m = new Object();

  public static void main(String[] args) {
    new Thread(() -> {
      synchronized (m) {
        x = 10;
      }
    }, "t1").start();

    new Thread(() -> {
      synchronized (m) {
        System.out.println(x);
      }
    }, "t2").start();
  }

A thread's write to a volatile variable is visible to other threads' reads of that variable:

  volatile static int x;

  public static void main(String[] args) {
    new Thread(() -> {
      x = 10;
    }, "t1").start();

    new Thread(() -> {
      System.out.println(x);
    }, "t2").start();
  }

A write to a variable made before a thread is started is visible to reads of that variable inside the thread after it starts:

  static int x;

  public static void main(String[] args) {
    x = 10;

    new Thread(() -> {
      System.out.println(x);
    }, "t2").start();
  }

A write to a variable made before a thread ends is visible to other threads once they know it has ended (for example, by calling t1.isAlive() or t1.join() to wait for it to finish):

  static int x;

  public static void main(String[] args) throws InterruptedException {
    Thread t = new Thread(() -> {
      x = 10;
    });
    t.start();

    t.join();
    System.out.println(x);
  }

A write to a variable by thread t1 before it interrupts t2 is visible to other threads once they know that t2 has been interrupted (via t2.interrupted() or t2.isInterrupted()):

  static int x;

  public static void main(String[] args) {
    Thread t2 = new Thread(() -> {
      while (true) {
        if (Thread.currentThread().isInterrupted()) {
          System.out.println(x);
          break;
        }
      }
    }, "t2");
    t2.start();

    new Thread(() -> {
      try {
        Thread.sleep(1000);
      } catch (InterruptedException e) {
        e.printStackTrace();
      }
      x = 10;
      t2.interrupt();
    }, "t1").start();

    while (!t2.isInterrupted()) {
      Thread.yield();
    }
    System.out.println(x);
  }

A write of a variable's default value (0, false, null) is visible to other threads reading the variable. Happens-before is also transitive: if x happens-before y and y happens-before z, then x happens-before z. Combined with the volatile barriers above, this means writes made before a volatile write are visible after a subsequent volatile read:

  volatile static int x;
  static int y;

  public static void main(String[] args) {
    new Thread(() -> {
      y = 10;
      x = 20;  // volatile write: flushes y = 10 to main memory as well
    }, "t1").start();

    new Thread(() -> {
      // if x=20 is visible to t2, then y=10 is visible to t2 as well
      System.out.println(x);
      System.out.println(y);
    }, "t2").start();
  }
