Process, Thread, Coroutine? A Detailed Look at Coroutines in Go: What Is a Coroutine, and Why Do We Need It?

1. Why do we need coroutines?

Let a factory represent a computer, with memory as the factory's land. A process is then like a building inside the factory: it occupies part of the land (a process is the smallest unit of resource allocation).

What is a thread? Continuing the analogy, a thread is an assembly line inside the building (a thread is the smallest unit of CPU scheduling). Each building can hold multiple assembly lines (a process can contain multiple threads), and the lines occupy the building's space (threads use the memory the system allocated to their process, and threads of the same process share that memory).

To simulate how threads work concurrently, imagine running two pieces of code while the CPU switches between them: the CPU first executes thread one until it produces an intermediate result, then the system switches the CPU away.

After the switch, the CPU executes thread two, obtains its intermediate result, and the cycle repeats. In this way even a single-core CPU can run multiple threads concurrently.

The process above shows that threads occupy CPU time, and that scheduling them is the operating system's job, which is relatively expensive. In the factory analogy, a thread is a production line (the program running on the thread is the production process), and switching lines costs the workers' working time. The drawbacks of threads are therefore:

  • A thread itself occupies significant resources
  • Thread operations (creation and destruction) are expensive
  • Thread switching is expensive, since it goes through the kernel

So how do coroutines work? A coroutine packages executable code together with its temporary execution state (such as the intermediate values of variables). The CPU stays pinned to one thread, so there is no thread-switching overhead: the coroutine's data is placed on the thread and the CPU executes it; when execution pauses, the intermediate results are saved back into the coroutine, the thread is cleared, and coroutine two runs next.

Coroutine two executes the same way as coroutine one: it runs on the thread, then its intermediate variables are saved.

In this way one thread executes multiple coroutines. The essence of a coroutine is a packaged piece of code plus its running state, which can be scheduled across threads. A coroutine therefore does not replace the thread: it still runs on a thread (the thread is the resource the coroutine runs on). The benefits of coroutines are:

  • Resource utilization: a coroutine can run on any available thread without waiting for the CPU scheduler;
  • Fast scheduling: coroutines are scheduled and switched quickly, avoiding system calls and kernel context switches;
  • High concurrency: a limited number of threads can run a very large number of coroutines concurrently.
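To make the concurrency benefit concrete, here is a minimal sketch (the function name `runMany` and the count are illustrative, not from the source): launching a hundred thousand goroutines is routine, because each starts with only a small (around 2KB) stack, while the same number of OS threads would exhaust memory.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runMany launches n goroutines and waits for all of them to finish.
// This scale is practical for goroutines but not for OS threads.
func runMany(n int) int {
	var wg sync.WaitGroup
	var counter int64
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1) // trivial unit of work
		}()
	}
	wg.Wait() // all n goroutines were multiplexed over a few threads
	return int(counter)
}

func main() {
	fmt.Println("goroutines finished:", runMany(100000))
}
```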

2. The essence of coroutines

You can start a coroutine with go functionName(). In the Go language, the essence of a coroutine is a structure named g. This structure has many members, so we pick out a few important ones to explain:
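A minimal illustration of the syntax (the function `say` and the channel are ours, added for illustration):

```go
package main

import "fmt"

// say sends a message on a channel; here it runs in its own goroutine.
func say(msg string, out chan<- string) {
	out <- msg
}

func main() {
	out := make(chan string)
	go say("hello from a goroutine", out) // the go keyword starts the coroutine
	fmt.Println(<-out)                    // wait for the goroutine's message
}
```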


The leftmost structure is the coroutine's. This article focuses on four of its fields. The first is a stack structure containing two pointers: hi, pointing to the high address of the coroutine's stack, and lo, pointing to the low address.

The second field is sched, of type gobuf, which stores the coroutine's current running state. For example, sp is the stack pointer, pointing to the data most recently pushed on the stack, which represents the current execution state; pc is the program counter, recording which instruction is to run next.

The third field, atomicstatus, stores the coroutine's status; the fourth, goid, stores the coroutine's id.
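The four fields can be pictured with the following simplified sketch, loosely based on runtime/runtime2.go (the real g has many more members, and the runtime's exact types differ slightly):

```go
package main

import "fmt"

// stack mirrors the runtime's stack struct: the bounds of the coroutine's stack.
type stack struct {
	lo uintptr // low address of the stack
	hi uintptr // high address of the stack
}

// gobuf holds the saved execution state restored when the coroutine resumes.
type gobuf struct {
	sp uintptr // stack pointer: where execution left off
	pc uintptr // program counter: the next instruction to run
}

// g is a simplified view of the coroutine structure discussed above.
type g struct {
	stack        stack  // field 1: the stack bounds
	sched        gobuf  // field 2: saved running state
	atomicstatus uint32 // field 3: status (running, runnable, waiting, ...)
	goid         uint64 // field 4: unique coroutine id
}

func main() {
	gp := g{goid: 1}
	fmt.Println("goid:", gp.goid)
}
```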

3. How coroutines are executed in threads

Coroutines are executed by threads, so let's look at the thread's underlying representation to understand the relationship between the two. In Go, a thread is essentially a structure named m; again, we focus on only a few relevant fields.
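A correspondingly simplified sketch of m (also abridged from runtime/runtime2.go; the g0 and curg fields matter for what follows):

```go
package main

import "fmt"

type g struct{ goid uint64 }

// m is a simplified view of the thread structure.
type m struct {
	g0   *g    // special coroutine whose stack hosts the scheduling loop
	curg *g    // the user coroutine this thread is currently running
	id   int64 // thread id
}

func main() {
	mp := m{g0: &g{goid: 0}, curg: &g{goid: 1}, id: 0}
	fmt.Println("current goroutine:", mp.curg.goid)
}
```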


In Go, each thread runs a loop of work, also called the single-thread loop: on one side is the stack, on the other the sequence of functions the thread executes, where the business method is the coroutine's own code.
Insert image description here
An ordinary coroutine stack only records the business method's own information, and before the thread has acquired a coroutine there is no such stack at all. The runtime therefore opens a separate g0 stack in memory, used specifically to record the function-call jumps of the scheduling loop itself: it is the loop's execution environment.
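The loop can be sketched as a toy model (a hypothetical simplification; in the real runtime these steps are runtime.schedule, execute, gogo, and goexit, and the loop runs on the g0 stack):

```go
package main

import "fmt"

type goroutine struct {
	id int
	fn func() // the "business method"
}

var runq []*goroutine // the queue of runnable coroutines

// schedule pops the next runnable coroutine, like runtime.schedule.
func schedule() *goroutine {
	if len(runq) == 0 {
		return nil
	}
	gp := runq[0]
	runq = runq[1:]
	return gp
}

func main() {
	runq = []*goroutine{
		{1, func() { fmt.Println("coroutine 1") }},
		{2, func() { fmt.Println("coroutine 2") }},
	}
	// The single-thread loop: fetch a coroutine, run it, return to schedule.
	for gp := schedule(); gp != nil; gp = schedule() {
		gp.fn()
	}
}
```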
This successfully executes coroutines on a thread, but in practice Go runs a multi-thread loop: several threads each run this loop at once.

When multiple threads fetch coroutines from one queue, however, there are concurrency problems, so a lock is needed. This thread loop looks a lot like a thread pool: the operating system knows nothing about coroutines; each thread simply runs a scheduling loop that executes coroutines one after another.

But the thread loop as described here executes coroutines strictly in sequence: with a fixed number of threads, only that many coroutines can run at the same time, which in a sense is still sequential execution. Moreover, in the multi-thread loop every thread must take the lock to fetch coroutine information from the shared queue, which causes further problems.
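The contention can be sketched like this (the name `drain` is ours): three workers stand in for threads, and every fetch from the shared queue must take the same lock.

```go
package main

import (
	"fmt"
	"sync"
)

// drain has `workers` goroutines (standing in for threads) pull tasks from
// one shared queue; every fetch contends on the same global lock.
func drain(tasks []int, workers int) int {
	queue := append([]int(nil), tasks...)
	var mu sync.Mutex
	var wg sync.WaitGroup
	sum := 0
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for {
				mu.Lock() // the lock every "thread" must grab
				if len(queue) == 0 {
					mu.Unlock()
					return
				}
				t := queue[0]
				queue = queue[1:]
				sum += t // kept under the lock to keep the sketch simple
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	return sum
}

func main() {
	fmt.Println("processed sum:", drain([]int{1, 2, 3, 4, 5, 6}, 3))
}
```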

4. The GMP scheduling model

This part addresses the problems of the multi-thread loop above. When multiple threads fetch coroutine tasks from the global queue, they must compete for the lock, making lock conflicts likely:
The essence of the solution is to reduce how often threads contend for the global lock by letting them run coroutines locally, without locking. This idea is called a local queue: when a thread does take the lock, it grabs several coroutines at once and links them into its own local queue. Only after it has executed all of them does it return to the global queue and take the lock again, avoiding most of the lock contention.
The GMP scheduling model introduced next is Go's concrete model for resolving this lock contention. G refers to the coroutine structure g, M to the thread structure m, and P to a structure that is essentially the local queue. Its members are complex, so we look only at those related to scheduling.


Next we summarize the role of P:

  • P is the intermediary between M and G; think of it as a feeder that hands work to the thread
  • P holds a batch of G, so M does not have to fetch each G from the global queue
  • This greatly reduces concurrency conflicts
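A simplified sketch of p's scheduling-related fields (abridged from the runtime; the real runq holds guintptr values, here plain ints stand in for goroutine ids):

```go
package main

import "fmt"

// p is a simplified view of the local-queue structure.
type p struct {
	id       int32
	runqhead uint32   // consumer index into runq
	runqtail uint32   // producer index into runq
	runq     [256]int // local queue of runnable goroutine ids (fixed size 256)
	runnext  int      // a newly created goroutine that jumps the queue
}

func main() {
	pp := p{id: 0}
	// Enqueue goroutine 7 locally: no lock needed, only this P's M touches it.
	pp.runq[pp.runqtail%256] = 7
	pp.runqtail++
	fmt.Println("next to run:", pp.runq[pp.runqhead%256])
}
```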

Note:

  1. If a thread finds no runnable coroutine in its local queue or in the global queue, it will "steal" coroutines from other threads' local queues, which increases thread utilization (work stealing).
  2. When a new coroutine is created, it is placed in the runnext slot of the current P so that it jumps the queue (Go treats new coroutines as high priority). If the local queue is full, the new coroutine is pushed to the global queue.
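Point 2 can be modeled with a toy enqueue (the method and the tiny queue capacity are ours; the real runtime actually moves half the local queue to the global queue on overflow):

```go
package main

import "fmt"

type p struct {
	runnext int   // 0 means empty
	runq    []int // local run queue
}

var globalq []int // the global queue, shared by all Ps

// enqueue places a new goroutine in runnext so it jumps the queue; the
// displaced goroutine falls back to the local queue, or to the global
// queue when the local one is full.
func (pp *p) enqueue(goid int) {
	old := pp.runnext
	pp.runnext = goid
	if old == 0 {
		return
	}
	if len(pp.runq) < cap(pp.runq) {
		pp.runq = append(pp.runq, old)
	} else {
		globalq = append(globalq, old) // overflow to the global queue
	}
}

func main() {
	pp := &p{runq: make([]int, 0, 2)} // tiny capacity to show overflow
	for id := 1; id <= 4; id++ {
		pp.enqueue(id)
	}
	fmt.Println(pp.runnext, pp.runq, globalq)
}
```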

5. Coroutine concurrency

The scheduling model above solved the lock contention of the thread loop. The remaining question is how to make coroutines genuinely concurrent. At first glance this seems like a non-issue, but it actually leads to coroutine starvation: a coroutine the thread is currently executing takes too long, so time-sensitive coroutines waiting in the queue miss their deadlines.

The basic remedy is to suspend the current coroutine after it has run for a while and execute the queued coroutines, so that time-sensitive coroutines are not starved.
When the coroutine a thread is executing is a long-running task, the runtime first saves the coroutine's running state (protecting the scene). If the coroutine should continue later, it is put back into the local queue; if it is put into the sleep state instead, it is not re-queued. The thread then jumps straight back to the schedule function.

This turns each local queue into a small loop of its own. But if every thread's local queue holds very large coroutine tasks, all threads stay busy for a long time and the tasks in the global queue cannot run: the global queue starvation problem. The solution is to have the local loop take a task from the global queue with a certain probability, so that global tasks also join the local loop.
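The "certain probability" is a fixed cadence in the real runtime: once every 61 calls to schedule, the global queue is checked first (schedtick%61 == 0). A sketch:

```go
package main

import "fmt"

// pickSource decides where the scheduler looks first for the next goroutine,
// mirroring the runtime's schedtick%61 check that prevents global-queue
// starvation.
func pickSource(schedtick uint32) string {
	if schedtick%61 == 0 {
		return "global"
	}
	return "local"
}

func main() {
	global := 0
	for tick := uint32(1); tick <= 122; tick++ {
		if pickSource(tick) == "global" {
			global++
		}
	}
	fmt.Println("global queue consulted", global, "times in 122 ticks")
}
```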

This seems perfect, but in practice it is hard to interrupt a running coroutine from the outside. Go uses the following mechanisms:

  1. Active suspension (gopark): when business code reaches this call, the thread returns directly to the schedule function and switches off the coroutine's stack; the current coroutine enters the waiting state, and a waiting coroutine cannot immediately re-enter the run queue. Programmers cannot call gopark directly, but functions built on it, such as Sleep, suspend actively: after the sleep duration (say, five seconds) elapses, the runtime changes the coroutine from waiting to runnable and puts it back into a queue.
  2. Completion of a system call: when a coroutine in the running state makes a system call, the exitsyscall function runs as the underlying call completes; the thread stops executing the current coroutine and puts it back into a queue.
  3. Preemption marking via morestack(): this function is called on function calls; its original purpose is to check whether the coroutine's stack has enough memory and to grow it if not. When the system monitor sees a coroutine running for more than 10ms, it sets g.stackguard0 to 0xfffffade (a preemption flag). The next time morestack runs, it checks whether stackguard0 carries the preemption mark; if so, it returns to the schedule function and puts the current coroutine back into the queue.

6. Signal-based preemptive scheduling

What if, while running, the program neither suspends itself nor makes system calls or even function calls, so that none of the mechanisms above can fire? For this case, signal-based preemptive scheduling was introduced. The signal here is a thread signal: operating systems offer signal-based low-level communication, and a thread can register handler functions for specific signals.

Basic idea:

  • Register a handler function for the SIGURG signal (a signal rarely used elsewhere)
  • When GC is running (GC work means some threads are stopped anyway), send the signal to the target thread
  • On receiving the signal, the thread triggers scheduling.


After GC sends the signal, the thread currently running the coroutine executes the doSigPreempt function, puts the current coroutine back into the queue, and calls the schedule function again.


Origin blog.csdn.net/weixin_39589455/article/details/131333177