Concurrent programming notes 1: the causes of concurrency bugs

Foreword:

    Recently I have been learning concurrent programming and plan to study it properly. I have dealt with concurrency problems before, but my study was never systematic and my knowledge is relatively scattered.
So I bought a Geek Time concurrency course, "Java Concurrent Programming in Practice", together with the book "Concurrent Programming in Practice", to learn the subject systematically from the beginning. I will record my learning process and experience here.

    First of all, we need to know why concurrency bugs occur at all. The root cause is the speed gap: CPU processing speed >> memory speed >> I/O speed, where ">>" means "far greater than", as Geek Time puts it. More vividly, it is "a day in the heavens, a year in the mortal world": to the CPU, memory is that slow, and to memory, I/O is slower still. Because of these differences, CPU utilization would be very low if the CPU simply sat waiting for memory and I/O to finish.
Therefore, to narrow the gap between the three, three optimizations were made:
1. The CPU adds caches to balance the speed difference with memory;
2. The operating system adds processes and threads to time-share the CPU, balancing the speed difference between the CPU and I/O devices;
3. The compiler optimizes the order of instruction execution so that caches can be used more effectively.

Each of these three optimizations brings a corresponding problem, and all three can cause concurrency bugs:

1. The cache visibility problem:
    The first optimization above adds a cache between the CPU and memory. On a multi-core machine, each CPU core has its own cache. Now suppose a shared variable value = 0, thread A adds 1 to value in CPU 1's cache, and thread B adds 1 to value in CPU 2's cache. When they flush their copies back to memory, the result may not be the 2 we want.
    One possible execution: threads A and B both read value from memory into their own CPU caches. Thread A increments first and flushes value = 1 back to memory. Thread B then increments its cached copy, which is still 0, and flushes it too, overwriting A's write, so value is still 1. This is the cache visibility problem: each CPU's cache does not know that a thread on another CPU has performed the same operation on the same variable.
A classic demonstration is to start multiple threads that increment the same variable; the final value comes out lower than expected, as the sketch below shows.
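Here is a minimal sketch of that demonstration (the class and method names are my own, not from the article): two threads each add 1 to a shared counter 10,000 times, and without synchronization the result is usually well below 20,000.

public class VisibilityDemo {

  private static long count = 0;

  // Each call adds 1 to the shared counter 10,000 times.
  private static void add10K() {
    for (int i = 0; i < 10_000; i++) {
      count += 1; // unsynchronized read-modify-write on shared state
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(VisibilityDemo::add10K);
    Thread t2 = new Thread(VisibilityDemo::add10K);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    // Expected 20000, but an update made in one core's cache can
    // overwrite the other's, so this usually prints less.
    System.out.println(count);
  }
}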

2. The atomicity problem caused by thread switching:
    This problem is easier to understand. From operating systems we know that a multi-tasking OS does not run process A to completion and only then run process B. Each process is allocated a period of execution time by the operating system (called a time slice); when process A's time slice is used up,
the OS switches to another process. Processes alternate like this, and because each time slice is very short, the user cannot perceive the switching. With switching in mind, consider a high-level language like Java: a single statement may correspond to many underlying instructions.
For example, count += 1 is not an atomic operation to the operating system; it takes at least three instructions: load count from memory into a register, add 1 to the register, and write the result back to memory. Now suppose two threads both increment count. If thread A's time slice runs out after it has read count but before it has written the increased value back, the OS switches to thread B, which reads the still-not-increased value. Both threads end up writing back the same result, so the final value does not meet our expectations.
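One common fix (my own sketch, not from the article) is to make the increment itself atomic. java.util.concurrent.atomic.AtomicLong performs the read-modify-write as a single indivisible operation, so a thread switch can no longer land in the middle of it:

import java.util.concurrent.atomic.AtomicLong;

public class AtomicityDemo {

  private static final AtomicLong count = new AtomicLong(0);

  private static void add10K() {
    for (int i = 0; i < 10_000; i++) {
      count.incrementAndGet(); // atomic read-modify-write, no lost updates
    }
  }

  public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(AtomicityDemo::add10K);
    Thread t2 = new Thread(AtomicityDemo::add10K);
    t1.start();
    t2.start();
    t1.join();
    t2.join();
    System.out.println(count.get()); // always prints 20000
  }
}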

3. The instruction ordering problem caused by compiler optimization:
    Anyone who writes Java knows that the volatile keyword makes writes to a variable visible to other threads and forbids instruction reordering; it exists precisely to address this problem. Compiler optimization can greatly improve execution efficiency, but it can also cause trouble. Reordering "int a = 1; int b = 2;" into "int b = 2; int a = 1;" is harmless in single-threaded code, but in concurrent code reordering can break correctness.
Here is an example from Geek Time:

public class Singleton {

  static Singleton instance;

  static Singleton getInstance() {
    if (instance == null) {
      synchronized (Singleton.class) {
        if (instance == null) {
          instance = new Singleton();
        }
      }
    }
    return instance;
  }
}

This is the double-checked locking singleton. The intended execution order of instance = new Singleton() is:
1. Allocate a block of memory M;
2. Initialize the Singleton object in memory M;
3. Assign the address of M to the instance variable.
It may be optimized to:
1. Allocate a block of memory M;
2. Assign the address of M to the instance variable;
3. Initialize the Singleton object in memory M.
This can lead to the following situation: thread A calls getInstance(), and its time slice runs out just after the reordered step 2. Thread B then comes in, executes the first instance == null check, finds that instance is not null, and returns a reference to an object that has not been initialized yet. Accessing that instance's members may then trigger a null pointer exception.
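The well-known fix, not shown in the original post, is to declare instance as volatile. Under the Java 5+ memory model, volatile forbids the reordering above, so other threads can never observe a half-initialized object:

public class Singleton {

  private static volatile Singleton instance;

  static Singleton getInstance() {
    if (instance == null) {                // first check, without the lock
      synchronized (Singleton.class) {
        if (instance == null) {            // second check, under the lock
          instance = new Singleton();      // volatile write: cannot be reordered
        }
      }
    }
    return instance;
  }
}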
