Active Scheduling in the Go Language Scheduler (20)

The following content is reproduced from  https://mp.weixin.qq.com/s/zA7KY_25NGjip9pP38RIvg

Awa love to write original programs Zhang  source Travels  2019-05-24

This is the 20th article in the "Go Language Scheduler Source Code Scenario Analysis" series and the first section of Chapter 5, "Active Scheduling".


 

Active scheduling of a goroutine is the scheduling that happens when the currently running goroutine calls the runtime.Gosched() function directly to give up the CPU temporarily.

Active scheduling is entirely under the control of user code, so we can predict from the code where scheduling will occur. For example, in the following program the main goroutine creates a new goroutine, which we call g2, to execute the start function. In the loop of start, g2 repeatedly calls Gosched() to give up the CPU voluntarily so that the scheduler can run another round of scheduling.

package main

import (
    "runtime"
    "sync"
)

const N = 1

func main() {
    var wg sync.WaitGroup
 
    wg.Add(N)
    for i := 0; i < N; i++ {
        go start(&wg)
    }

    wg.Wait()
}

func start(wg *sync.WaitGroup) {
    for i := 0; i < 1000 * 1000 * 1000; i++ {
        runtime.Gosched()
    }

    wg.Done()
}

Below we start from this program to analyze how active scheduling is achieved.

First, start the analysis from the active scheduling entry function Gosched().

runtime/proc.go : 262

// Gosched yields the processor, allowing other goroutines to run. It does not
// suspend the current goroutine, so execution resumes automatically.
func Gosched() {
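    checkTimeouts() // an empty function on the amd64 Linux platform

    // switch to the g0 stack of the current m and run gosched_m there
    mcall(gosched_m)
    // when this goroutine is scheduled again, it resumes running from here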
}

Because we want to observe the state of g2 while the program is running, we combine gdb debugging with reading the source code. First, use b proc.go:266 to set a breakpoint on the mcall(gosched_m) line of the Gosched function. Then run the program and, once the breakpoint is hit, disassemble the function that is currently executing:

(gdb) disass
Dump of assembler code for function main.start:
     0x000000000044fc90 <+0>:mov   %fs:0xfffffffffffffff8,%rcx
     0x000000000044fc99 <+9>:cmp   0x10(%rcx),%rsp
     0x000000000044fc9d <+13>:jbe   0x44fcfa <main.start+106>
     0x000000000044fc9f <+15>:sub   $0x20,%rsp
     0x000000000044fca3 <+19>:mov   %rbp,0x18(%rsp)
     0x000000000044fca8 <+24>:lea   0x18(%rsp),%rbp
     0x000000000044fcad <+29>:xor   %eax,%eax
     0x000000000044fcaf <+31>:jmp   0x44fcd0 <main.start+64>
     0x000000000044fcb1 <+33>:mov   %rax,0x10(%rsp)
     0x000000000044fcb6 <+38>:nop
     0x000000000044fcb7 <+39>:nop
=> 0x000000000044fcb8 <+40>:lea   0x241e1(%rip),%rax        # 0x473ea0
     0x000000000044fcbf <+47>:mov   %rax,(%rsp)
     0x000000000044fcc3 <+51>:callq 0x447380 <runtime.mcall>
     0x000000000044fcc8 <+56>:mov   0x10(%rsp),%rax
     0x000000000044fccd <+61>:inc   %rax
     0x000000000044fcd0 <+64>:cmp   $0x3b9aca00,%rax
     0x000000000044fcd6 <+70>:jl     0x44fcb1 <main.start+33>
     0x000000000044fcd8 <+72>:nop
    0x000000000044fcd9 <+73>:mov   0x28(%rsp),%rax
     0x000000000044fcde <+78>:mov   %rax,(%rsp)
     0x000000000044fce2 <+82>:movq   $0xffffffffffffffff,0x8(%rsp)
     0x000000000044fceb <+91>:callq 0x44f8f0 <sync.(*WaitGroup).Add>
     0x000000000044fcf0 <+96>:mov   0x18(%rsp),%rbp
     0x000000000044fcf5 <+101>:add   $0x20,%rsp
     0x000000000044fcf9 <+105>:retq   
     0x000000000044fcfa <+106>:callq 0x447550 <runtime.morestack_noctxt>
     0x000000000044fcff <+111>:jmp   0x44fc90 <main.start>

You can see that the function currently executing is main.start rather than runtime.Gosched, and that no call to Gosched appears anywhere in start: the compiler has inlined it. The program is now stopped at the instruction 0x000000000044fcb8 <+40>: lea 0x241e1(%rip),%rax. The second instruction after it is the callq that calls runtime.mcall. We first use si 2 to execute two instructions, which leaves the program stopped at the following instruction:

=> 0x000000000044fcc3 <+51>: callq 0x447380 <runtime.mcall>

Then use i r rsp rbp rip to display the values of the CPU's rsp, rbp, and rip registers and note them down for later:

(gdb) i r rsp rbp rip
rsp    0xc000031fb0     0xc000031fb0
rbp    0xc000031fc8     0xc000031fc8
rip     0x44fcc3             0x44fcc3 <main.start+51>

Now look at the callq instruction at 0x000000000044fcc3. It first pushes the address of the instruction that follows it, 0x000000000044fcc8, onto the stack of g2, and then jumps to the first instruction of the mcall function. Recall the execution flow of mcall that we analyzed in detail in Chapter 2. In the current scenario, mcall does the following in sequence:

  1. Take the return address 0x000000000044fcc8 of the call instruction above and save it in the sched.pc field of g2, and save the rsp (0xc000031fb0) and rbp (0xc000031fc8) values we viewed above in the sched.sp and sched.bp fields of g2, respectively. Together these values form the scheduling context of g2;

  2. Restore the values stored in the sched.sp and sched.bp fields of g0 to the rsp and rbp registers of the CPU, which completes the switch from the stack of g2 to the stack of g0;

  3. Execute the gosched_m function on the g0 stack (the gosched_m function is the parameter passed to mcall when the runtime.Gosched function calls mcall).
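The three steps above can be replayed on plain data. The following is a minimal sketch, not the runtime's assembly: the gobuf, g and cpu types, the mcallModel function and the callqLen constant are hypothetical names invented for this illustration, and the values used for g0 are made up. It follows the steps as described in this article, using the register values recorded with gdb.

```go
package main

import "fmt"

// gobuf is a toy stand-in for the saved-register record kept in g.sched.
type gobuf struct {
	sp, bp, pc uintptr
}

// g is a toy goroutine descriptor holding only what this sketch needs.
type g struct {
	name  string
	sched gobuf
}

// cpu models the three registers inspected with gdb in the article.
type cpu struct {
	rsp, rbp, rip uintptr
}

// callqLen is the size of the callq instruction in the listing
// (it sits at <+51> and the next instruction at <+56>).
const callqLen = 5

// mcallModel replays the three steps on plain data. c.rip must point at
// the callq instruction, and c.rsp and c.rbp hold the caller's values.
func mcallModel(c *cpu, cur, g0 *g, fn func(*g)) {
	// Step 1: record where and how cur should resume.
	cur.sched.pc = c.rip + callqLen
	cur.sched.sp = c.rsp
	cur.sched.bp = c.rbp

	// Step 2: load g0's saved stack registers, i.e. switch stacks.
	c.rsp, c.rbp = g0.sched.sp, g0.sched.bp

	// Step 3: run fn "on g0", passing the goroutine that called mcall.
	fn(cur)
}

func main() {
	c := &cpu{rsp: 0xc000031fb0, rbp: 0xc000031fc8, rip: 0x44fcc3}
	g2 := &g{name: "g2"}
	// g0's saved values are made up for illustration.
	g0 := &g{name: "g0", sched: gobuf{sp: 0x7ffd1000, bp: 0x7ffd1010}}

	mcallModel(c, g2, g0, func(gp *g) {
		fmt.Printf("on g0 stack, rsp=%#x, handling %s\n", c.rsp, gp.name)
	})
	fmt.Printf("g2.sched: pc=%#x sp=%#x bp=%#x\n",
		g2.sched.pc, g2.sched.sp, g2.sched.bp)
}
```

The saved pc is the address of the callq plus its length, which is why g2 later resumes at 0x44fcc8 rather than at the call itself.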

Next, look at the gosched_m function:

runtime/proc.go : 2623

// Gosched continuation on g0.
func gosched_m(gp *g) {
    if trace.enabled { // tracing; not our concern here
        traceGoSched()
    }
    goschedImpl(gp) // in our scenario, gp = g2
}

The gosched_m function simply calls goschedImpl:

runtime/proc.go : 2608

func goschedImpl(gp *g) {
    ......
    casgstatus(gp, _Grunning, _Grunnable)
    dropg() // sets m.curg = nil and gp.m = nil for the current m
    lock(&sched.lock)
    globrunqput(gp) // put gp on sched's global run queue, runq
    unlock(&sched.lock)

    schedule() // enter a new round of scheduling
}

The goschedImpl function takes a parameter of type pointer to g; in this scenario the argument passed to it is g2. It first changes the state of g2 from _Grunning to _Grunnable, then uses the dropg function to detach g2 from the current worker thread m (setting m.curg to nil and g2.m to nil), and finally puts g2 on the global run queue by calling the globrunqput function that we analyzed earlier.
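The bookkeeping that goschedImpl performs can be modelled in ordinary Go. This is only a sketch under simplifying assumptions: gStatus, m, g, schedt and goschedModel are hypothetical types and names for this illustration, the global run queue is a slice rather than the runtime's linked list, a sync.Mutex stands in for sched.lock, and the final call to schedule() is left out.

```go
package main

import (
	"fmt"
	"sync"
)

type gStatus int

const (
	gRunnable gStatus = iota // stands in for _Grunnable
	gRunning                 // stands in for _Grunning
)

// m is a toy worker thread; curg is the goroutine it is running.
type m struct {
	curg *g
}

// g is a toy goroutine descriptor.
type g struct {
	name   string
	status gStatus
	m      *m
}

// schedt is a toy scheduler with a lock-protected global run queue.
type schedt struct {
	lock sync.Mutex
	runq []*g
}

// goschedModel mirrors the order of operations described above.
func goschedModel(s *schedt, gp *g) {
	if gp.status != gRunning {
		panic("goschedModel: goroutine is not running")
	}
	gp.status = gRunnable // like casgstatus(gp, _Grunning, _Grunnable)

	// Like dropg: break the link between the worker thread and gp.
	mp := gp.m
	mp.curg = nil
	gp.m = nil

	// Like globrunqput under sched.lock: append gp to the global queue.
	s.lock.Lock()
	s.runq = append(s.runq, gp)
	s.lock.Unlock()
	// The real goschedImpl now calls schedule(); this model just returns.
}

func main() {
	worker := &m{}
	g2 := &g{name: "g2", status: gRunning, m: worker}
	worker.curg = g2

	var s schedt
	goschedModel(&s, g2)
	fmt.Println(g2.status == gRunnable, worker.curg == nil, g2.m == nil, len(s.runq))
}
```

The status change comes first so that g2 is already marked runnable by the time another worker thread could find it on the queue.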

After g2 has been put on the global run queue, the state of g2 and its relationship to the other related structures are as shown in the following figure:

[Figure: g2 linked into the global run queue of sched]

The figure shows clearly that g2 is linked into the global run queue of sched. The queue has a head pointer to the first g object in the queue and a tail pointer to the last g object in the queue, and the g objects in the queue are linked to one another through the schedlink pointer member of g. The sched structure member of g2 stores all the context needed for scheduling (such as the values of the stack registers sp and bp and of the instruction register pc), so that the next time g2 is picked by the schedule function, the gogo function restores this information to the CPU's rsp, rbp, and rip registers and g2 resumes executing its code at address 0x44fcc8 on its own stack.
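The queue layout described above (a head pointer, a tail pointer, and g objects chained through schedlink) can be sketched as an intrusive singly linked list. This is a simplified illustration: the g and gQueue types here are toy versions defined for the sketch, and plain pointers are used where the runtime uses its own pointer-sized integer type.

```go
package main

import "fmt"

// g is a toy goroutine descriptor; schedlink points to the next g in a queue.
type g struct {
	id        int
	schedlink *g
}

// gQueue is a FIFO of g objects linked through their schedlink fields.
type gQueue struct {
	head, tail *g
}

// pushBack appends gp at the tail of the queue.
func (q *gQueue) pushBack(gp *g) {
	gp.schedlink = nil
	if q.tail != nil {
		q.tail.schedlink = gp
	} else {
		q.head = gp // queue was empty
	}
	q.tail = gp
}

// pop removes and returns the g at the head, or nil if the queue is empty.
func (q *gQueue) pop() *g {
	gp := q.head
	if gp == nil {
		return nil
	}
	q.head = gp.schedlink
	if q.head == nil {
		q.tail = nil // queue became empty
	}
	gp.schedlink = nil
	return gp
}

func main() {
	var q gQueue
	for id := 1; id <= 3; id++ {
		q.pushBack(&g{id: id})
	}
	for gp := q.pop(); gp != nil; gp = q.pop() {
		fmt.Println("dequeued g", gp.id)
	}
}
```

Because the link lives inside g itself, putting a goroutine on the queue needs no extra allocation, which suits code that runs inside the scheduler.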

After putting g2 on the global run queue, goschedImpl goes on to call schedule() to enter the next round of the scheduling loop. At this point g2 has voluntarily given up the CPU by calling Gosched(), and a scheduling pass takes place as intended.
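The effect of this hand-off can be observed from user code. The sketch below is an illustration rather than part of the analysed program: yieldUntilDone and runOnOneP are hypothetical helper names, and limiting the program to one P with runtime.GOMAXPROCS(1) is an assumption made so that the worker goroutine can only run when the calling goroutine gives up the CPU. The number of yields it reports is not guaranteed and may vary between runs.

```go
package main

import (
	"fmt"
	"runtime"
	"sync/atomic"
)

// yieldUntilDone starts a worker goroutine, then calls runtime.Gosched in a
// loop until the worker has set the flag. It returns the number of yields.
func yieldUntilDone() int {
	var done int32
	go func() { atomic.StoreInt32(&done, 1) }()

	yields := 0
	for atomic.LoadInt32(&done) == 0 {
		runtime.Gosched() // give the worker a chance to run
		yields++
	}
	return yields
}

// runOnOneP runs yieldUntilDone with a single P, then restores the setting.
func runOnOneP() int {
	prev := runtime.GOMAXPROCS(1)
	defer runtime.GOMAXPROCS(prev)
	return yieldUntilDone()
}

func main() {
	fmt.Println("yields before the worker finished:", runOnOneP())
}
```

Each call to Gosched in the loop takes the path analysed in this article: the caller is marked runnable, placed on the global run queue, and resumes after the call once the scheduler picks it again.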



Origin blog.csdn.net/pyf09/article/details/115254088