A detailed explanation of the role and usage of Context in Go

KDP (Data Service Platform) is a data service product independently developed by KaiwuDB. Built around KaiwuDB, it is a one-stop data service platform for AIoT scenarios. It meets the comprehensive requirements for data collection, processing, computation, analysis, and application in the core business scenarios of industries such as the industrial Internet of Things, digital energy, the Internet of Vehicles, and smart industry, realizing "business is data, data is service" and helping enterprises extract greater business value from their data.

When developing the real-time computing component of the data service platform, you may encounter the following problem. The component lets users define custom rules. After a user registers several rules, runs them for a while, then modifies the rule definitions and restarts them, a coroutine (goroutine) leak occurs.

1. Real case

This article uses pseudocode to walk through the coroutine leak that can occur while developing the real-time computing component of the data service platform.

// Rough data structure of a rule
type DummyRule struct {
    BaseRule
    source []Source
    sink   []Sink
    // flow map: the key is the flow name, the value is the flow instance
    flow map[string]Flow
    ...
}

DummyRule above is the rule data structure used in this example. It contains multiple data sources (Source), multiple data targets (Sink), and the data flows (Flow). The rule works as follows:

Nodes 1 and 2 are two sources. First, an addition is applied to each of the two sources; next, a Merge operation combines them into a single stream; then a Fanout operation produces two identical streams, which flow into nodes 7 and 8; finally, the numbers at nodes 7 and 8 are converted to strings and written to out1.txt and out2.txt respectively.
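The flow described above can be sketched with plain channels. This is only an illustration, not the platform's code: the function names (source, add, merge, fanOut, toStrings, runRule), the input numbers, and the amounts added are all assumptions, and the two file sinks are replaced by in-memory slices so the sketch stays self-contained.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"sync"
)

// source emits the given numbers and then closes its channel.
func source(nums ...int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for _, n := range nums {
			out <- n
		}
	}()
	return out
}

// add applies the addition step to every element of a stream.
func add(in <-chan int, delta int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out)
		for n := range in {
			out <- n + delta
		}
	}()
	return out
}

// merge combines several streams into one.
func merge(ins ...<-chan int) <-chan int {
	out := make(chan int)
	var wg sync.WaitGroup
	for _, in := range ins {
		wg.Add(1)
		go func(in <-chan int) {
			defer wg.Done()
			for n := range in {
				out <- n
			}
		}(in)
	}
	go func() { wg.Wait(); close(out) }()
	return out
}

// fanOut copies every element into two identical streams.
func fanOut(in <-chan int) (<-chan int, <-chan int) {
	a, b := make(chan int), make(chan int)
	go func() {
		defer close(a)
		defer close(b)
		for n := range in {
			a <- n
			b <- n
		}
	}()
	return a, b
}

// toStrings converts numbers to strings and collects them in dst,
// standing in for a sink that writes to a file.
func toStrings(in <-chan int, wg *sync.WaitGroup, dst *[]string) {
	defer wg.Done()
	for n := range in {
		*dst = append(*dst, strconv.Itoa(n))
	}
}

// runRule wires the stages together and returns what each sink received.
func runRule() (out1, out2 []string) {
	merged := merge(add(source(1, 2), 10), add(source(3, 4), 100))
	a, b := fanOut(merged)
	var wg sync.WaitGroup
	wg.Add(2)
	go toStrings(a, &wg, &out1)
	go toStrings(b, &wg, &out2)
	wg.Wait()
	sort.Strings(out1) // arrival order after merge is not deterministic
	sort.Strings(out2)
	return out1, out2
}

func main() {
	out1, out2 := runRule()
	fmt.Println(out1) // [103 104 11 12]
	fmt.Println(out2) // [103 104 11 12]
}
```

Even this small sketch starts ten goroutines, and every one of them exits only because its input channel is eventually closed. A long-running rule has no such natural end, which is where Context comes in.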

type Source struct{
  consumers       []file.reader  
  out             chan interface{}  
  ctx             context.Context  
  cancel          context.CancelFunc  
  ...
}

The code above is pseudocode for the Source data source. consumers are the readers used to read file data, out is the channel that passes data to the next flow, and ctx is a Go Context. Each consumer reads file data in its own coroutine and puts what it reads into out, where it waits to be consumed by the next flow.

type Sink struct{
   producers  []file.writer   
   in         chan interface{}   
   ctx        context.Context   
   cancel context.CancelFunc   
   ...
}

The code above is pseudocode for the Sink data target. producers are the writers used to write files, in is the channel that receives data from the previous flow, and ctx is a Go Context. Each producer also writes file data in its own coroutine.

func (fm FlatMap) Via(flow streams.Flow) streams.Flow {
    go fm.transmit(flow)
    return flow
}

The code above is the source code for passing data between flows. FlatMap is used as curFlow := prevFlow.Via(nextFlow), which hands the data of the previous Flow to the next Flow. As you can see, each transfer runs in its own coroutine.

From the code above, this example rule contains at least 10 coroutines, and in practice there are far more. Managing coroutines in the real-time computing component of the data service platform is therefore quite complex.

After repeated testing and investigation with tools such as go tool pprof, top, and go tool trace, we found that the coroutine leak was caused by the Context of the rule's Sink not being canceled correctly.

Context is an important Go facility for managing goroutines. Using it correctly makes the relationships between goroutines clearer and easier to manage. As the example above shows, learning to use Context properly not only improves code quality but also saves a great deal of time spent investigating coroutine leaks.
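A leak of this kind can be reproduced in a few lines. The sketch below is not the platform's code: sink, newSink, stop, and restartRule are hypothetical names, and it assumes that the sink goroutine can exit only when its Context is canceled. A counter tracks how many sink goroutines are still alive after a rule has been restarted several times.

```go
package main

import (
	"context"
	"fmt"
	"sync"
	"sync/atomic"
)

// running counts the sink goroutines that are currently alive.
var running int64

// sink is a hypothetical stand-in for the article's Sink.
type sink struct {
	in     chan interface{}
	cancel context.CancelFunc
	wg     sync.WaitGroup
}

// newSink starts one goroutine that drains in until ctx is canceled.
func newSink(parent context.Context) *sink {
	ctx, cancel := context.WithCancel(parent)
	s := &sink{in: make(chan interface{}), cancel: cancel}
	s.wg.Add(1)
	atomic.AddInt64(&running, 1)
	go func() {
		defer s.wg.Done()
		defer atomic.AddInt64(&running, -1)
		for {
			select {
			case <-ctx.Done():
				return // cancellation is the only way out
			case <-s.in:
				// a real sink would write the value to a file here
			}
		}
	}()
	return s
}

// stop cancels the sink's Context and waits for its goroutine to exit.
func (s *sink) stop() {
	s.cancel()
	s.wg.Wait()
}

// restartRule replaces the rule's sink n times and returns how many
// sink goroutines this call left running.
func restartRule(n int, stopOld bool) int64 {
	before := atomic.LoadInt64(&running)
	var cur *sink
	for i := 0; i < n; i++ {
		if cur != nil && stopOld {
			cur.stop()
		}
		cur = newSink(context.Background())
	}
	return atomic.LoadInt64(&running) - before
}

func main() {
	fmt.Println("old sinks never canceled:", restartRule(3, false)) // 3
	fmt.Println("old sinks canceled first:", restartRule(3, true))  // 1
}
```

When the old sink is dropped without calling its cancel function, its goroutine stays blocked in select forever, so every restart adds one more leaked goroutine.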

2. Getting to know Context

1. Introduction

Context literally means "context". In Go it can be understood as the running state and environment of a goroutine. Context is passed down between goroutines: a parent goroutine passes its Context to the goroutines it starts.

Before a goroutine runs, it needs to know the current execution state of the program. This state is usually encapsulated in a Context variable and passed to the goroutine that is about to run.

In network programming, handling a received request may involve several goroutines, and these goroutines may need to share some of the request's information. When the request is canceled or times out, every goroutine created for that request should also be terminated.

The context package not only lets program units share state, it also offers a simple way to pass signals such as expiration or cancellation into a called program unit: the caller sets them on the ctx variable from outside the unit being called.

In network programming, if A calls B's API and B calls C's API, then when A cancels its call to B, B's call to C should be canceled as well. The context package makes it convenient to pass request data, cancellation signals, and timeout information between the goroutines that serve a request.
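The A, B, C chain can be sketched as three functions that pass the same Context along. The names callA, callB, and callC and the 200 ms delay are assumptions made for illustration; the slow API is simulated with a timer instead of a real network call.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// callC simulates a slow downstream API: it finishes after 200 ms
// unless ctx ends first.
func callC(ctx context.Context) error {
	select {
	case <-time.After(200 * time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// callB simply forwards the Context it received from A to C.
func callB(ctx context.Context) error {
	return callC(ctx)
}

// callA gives the whole chain a time budget and reports whether the
// chain was cut short by that budget.
func callA(budget time.Duration) bool {
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()
	err := callB(ctx)
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("timed out:", callA(20*time.Millisecond)) // timed out: true
	fmt.Println("timed out:", callA(2*time.Second))       // timed out: false
}
```

B never inspects the Context itself. Forwarding it is enough for A's budget to reach C.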

The core of the Context package is the Context interface:

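// A Context carries a deadline, a cancellation signal, and other values across
// API boundaries.
//
// Context's methods may be called by multiple goroutines simultaneously.
type Context interface {
     // Deadline returns the time at which the context will be canceled.
     // ok is false when no deadline is set. Code can use the deadline to
     // set a timeout on I/O operations.
     Deadline() (deadline time.Time, ok bool)
     // Done returns a channel that is closed when the context is canceled
     // or its deadline passes. Functions listening on it should abandon
     // their current work and return once the channel is closed.
     // Done may return nil if the context can never be canceled,
     // as with the empty context returned by Background.
     Done() <-chan struct{}
     // Err returns nil while Done is not yet closed; afterwards it
     // explains why: Canceled or DeadlineExceeded.
     Err() error
     // Value returns the value associated with key, or nil if there is none.
     Value(key interface{}) interface{}
}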
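As a small illustration of how Done and Err work together, the sketch below checks a Context without blocking, once before and once after it is canceled. The helper names state and demo are made up for this example.

```go
package main

import (
	"context"
	"fmt"
)

// state reports, without blocking, what Done and Err say about ctx.
func state(ctx context.Context) string {
	select {
	case <-ctx.Done():
		return "done: " + ctx.Err().Error()
	default:
		return "active"
	}
}

// demo returns the state of a cancelable Context before and after cancel.
func demo() (before, after string) {
	ctx, cancel := context.WithCancel(context.Background())
	before = state(ctx)
	cancel()
	after = state(ctx)
	return before, after
}

func main() {
	before, after := demo()
	fmt.Println(before) // active
	fmt.Println(after)  // done: context canceled
}
```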

2. How to use

Goroutines are created and called layer by layer, forming a tree, and the Context at the top needs a way to actively stop the goroutines below it. To support this, Contexts also form a tree in which every node is derived, directly or indirectly, from the root node.

To build a Context tree, the first step is to obtain the root node, which is the return value of the context.Background function.

func Background() Context{
    return background
}

This function returns an empty Context. It is generally created by the first goroutine that receives the request and is the root of the Context tree for that incoming request. It cannot be canceled, carries no values, and has no deadline. It usually serves as the top-level Context for handling a request.
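These properties of the root node can be checked directly. In the sketch below, inspectRoot is a hypothetical helper and "any key" is an arbitrary lookup key chosen for illustration.

```go
package main

import (
	"context"
	"fmt"
)

// inspectRoot reports the observable properties of the root Context.
func inspectRoot() (hasDeadline, doneIsNil, errIsNil, valueIsNil bool) {
	root := context.Background()
	_, hasDeadline = root.Deadline()
	return hasDeadline, root.Done() == nil, root.Err() == nil, root.Value("any key") == nil
}

func main() {
	fmt.Println(inspectRoot()) // false true true true
}
```

A receive from a nil channel blocks forever, so a select on the root Context's Done channel never fires. That is exactly what "cannot be canceled" means in practice.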

With the root node, you can create descendant nodes. The Context package provides a series of methods to create them:

func WithCancel(parent Context) (ctx Context, cancel CancelFunc)
func WithDeadline(parent Context, d time.Time) (Context, CancelFunc)
func WithTimeout(parent Context, timeout time.Duration) (Context, CancelFunc)
func WithValue(parent Context, key, val interface{}) Context

Each of these functions takes a parent of type Context and returns a new Context, so Contexts are created layer by layer. The child node is derived from its parent, some of its state is set from the arguments passed in, and the child can then be handed to the goroutines below.

How is a state change passed down through a Context?

In the parent goroutine, WithCancel, WithDeadline, and WithTimeout each return a cancel function, which gives the parent control over the child Context.
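The direction in which cancellation travels through the tree can be seen in a short sketch. The helper names done, cancelLeaf, and cancelRoot are assumptions for this example. It builds a parent with two children and cancels either one child or the parent.

```go
package main

import (
	"context"
	"fmt"
)

// done reports whether ctx has already been canceled.
func done(ctx context.Context) bool { return ctx.Err() != nil }

// cancelLeaf cancels one child and reports the state of the parent,
// that child, and its sibling.
func cancelLeaf() (parentDone, childDone, siblingDone bool) {
	parent, cancelParent := context.WithCancel(context.Background())
	defer cancelParent()
	child, cancelChild := context.WithCancel(parent)
	sibling, cancelSibling := context.WithCancel(parent)
	defer cancelSibling()

	cancelChild()
	return done(parent), done(child), done(sibling)
}

// cancelRoot cancels the parent and reports the same three states.
func cancelRoot() (parentDone, childDone, siblingDone bool) {
	parent, cancelParent := context.WithCancel(context.Background())
	child, cancelChild := context.WithCancel(parent)
	defer cancelChild()
	sibling, cancelSibling := context.WithCancel(parent)
	defer cancelSibling()

	cancelParent()
	return done(parent), done(child), done(sibling)
}

func main() {
	fmt.Println(cancelLeaf()) // false true false
	fmt.Println(cancelRoot()) // true true true
}
```

Cancellation flows from a node to its descendants only. Canceling a child leaves its parent and siblings untouched.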

(1)WithCancel

WithCancel derives a child node from the parent node and additionally returns a variable of the CancelFunc function type, which is defined as: type CancelFunc func()

Calling the CancelFunc cancels the corresponding child Context. In the parent goroutine, WithCancel creates the Context of the child node and at the same time gives the parent control over the child goroutine. Once the CancelFunc has been executed, the child Context is finished. The child goroutine uses code like the following to detect that the Context has ended and exit:

select {
case <-ctx.Done():
    fmt.Println("do some clean work ...... ")
}
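Put into a complete program, the pattern might look like the sketch below. The worker, the jobs channel, and runWorker are hypothetical. The parent sends three jobs, cancels the Context, and waits for the worker to finish before reading the result.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// worker adds up incoming jobs until its Context is canceled.
func worker(ctx context.Context, jobs <-chan int, wg *sync.WaitGroup, sum *int) {
	defer wg.Done()
	for {
		select {
		case <-ctx.Done():
			fmt.Println("worker: cleaning up:", ctx.Err())
			return
		case n := <-jobs:
			*sum += n
		}
	}
}

// runWorker feeds three jobs to one worker, cancels it, and returns the sum.
func runWorker() int {
	ctx, cancel := context.WithCancel(context.Background())
	jobs := make(chan int) // unbuffered: a send returns once the worker has the job
	var wg sync.WaitGroup
	sum := 0

	wg.Add(1)
	go worker(ctx, jobs, &wg, &sum)
	for i := 1; i <= 3; i++ {
		jobs <- i
	}

	cancel()  // tell the worker to stop
	wg.Wait() // wait until it has actually exited before reading sum
	return sum
}

func main() {
	fmt.Println("sum:", runWorker()) // sum: 6
}
```

Calling cancel only sends the signal. Waiting on the WaitGroup is what guarantees that the goroutine has really exited.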

(2)WithDeadline

WithDeadline works much like WithCancel: it also derives a child node from the parent, but the child's expiration time is determined by both the deadline argument and the parent's own expiration time. If the parent expires earlier than deadline, the child's expiration time is the same as the parent's. When a parent node expires, all of its descendants are closed at the same time.
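The rule about the earlier deadline winning can be checked with a short sketch. inheritsParentDeadline is a hypothetical helper and the one-minute parent deadline is an arbitrary choice. It compares the child's effective deadline with the parent's.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// inheritsParentDeadline creates a parent that expires in one minute and
// a child that asks to expire after childOffset. It reports whether the
// child's effective deadline equals the parent's.
func inheritsParentDeadline(childOffset time.Duration) bool {
	now := time.Now()
	parent, cancelParent := context.WithDeadline(context.Background(), now.Add(time.Minute))
	defer cancelParent()
	child, cancelChild := context.WithDeadline(parent, now.Add(childOffset))
	defer cancelChild()

	pd, _ := parent.Deadline()
	cd, _ := child.Deadline()
	return cd.Equal(pd)
}

func main() {
	fmt.Println(inheritsParentDeadline(time.Hour))   // true: the parent expires first
	fmt.Println(inheritsParentDeadline(time.Second)) // false: the child expires first
}
```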

(3)WithTimeout

WithTimeout is similar to WithDeadline, except that its argument is how long the Context should remain alive from now. Both functions also return a CancelFunc that gives control over the newly created child Context.

When the top-level request handler finishes, it can cancel the Context, and the descendant goroutines detect this through a select on ctx.Done() and exit.
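A Context created with WithTimeout can therefore end in two ways, and Err tells them apart. In the sketch below, endReason and the 50 ms timeout are assumptions for illustration.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// endReason waits for a 50 ms timeout Context to end and returns the
// reason. When cancelEarly is true, cancel is called before the timeout.
func endReason(cancelEarly bool) string {
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel() // always release the timer, even if the timeout fires first
	if cancelEarly {
		cancel()
	}
	<-ctx.Done()
	return ctx.Err().Error()
}

func main() {
	fmt.Println(endReason(true))  // context canceled
	fmt.Println(endReason(false)) // context deadline exceeded
}
```

Calling cancel more than once is harmless, which is why the deferred call can stay in place on both paths.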

(4)WithValue

WithValue returns a child of parent; calling Value(key) on the child returns the val that was passed in. This keeps the values already present in the ancestors while adding new ones for descendants. Note that if an ancestor already holds the same key, the child's value shadows it for the child and its descendants, while the ancestor itself still returns its original value.
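The sketch below shows this lookup behaviour. ctxKey, ruleKey, and the values "v1" and "v2" are made up for the example; using a dedicated key type rather than a plain string follows the recommendation in the context package documentation.

```go
package main

import (
	"context"
	"fmt"
)

// ctxKey is a private key type, so keys from other packages cannot collide.
type ctxKey string

// ruleKey is a hypothetical key used only in this example.
const ruleKey ctxKey = "rule"

// lookups stores the same key at two levels of the tree and reads it back.
func lookups() (fromParent, fromChild string, otherFound bool) {
	parent := context.WithValue(context.Background(), ruleKey, "v1")
	child := context.WithValue(parent, ruleKey, "v2")

	fromParent, _ = parent.Value(ruleKey).(string)
	fromChild, _ = child.Value(ruleKey).(string)
	_, otherFound = child.Value(ctxKey("missing")).(string)
	return fromParent, fromChild, otherFound
}

func main() {
	fmt.Println(lookups()) // v1 v2 false
}
```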

3. Examples

package main

import (
        "context"
        "fmt"
        "time"
)

func main() {
        ctxWithCancel, cancel := context.WithTimeout(context.Background(), 5*time.Second)

        go worker(ctxWithCancel, "[1]")
        go worker(ctxWithCancel, "[2]")

        go manager(cancel)

        <-ctxWithCancel.Done()
        // Pause for 1 second so the goroutines can print their output
        time.Sleep(1 * time.Second)
        fmt.Println("example closed")
}

// manager cancels the Context after a fixed delay.
func manager(cancel func()) {
        time.Sleep(10 * time.Second)
        fmt.Println("manager called cancel()")
        cancel()
}

// worker prints once per second until the Context is done.
func worker(ctxWithCancel context.Context, name string) {
        for {
                select {
                case <-ctxWithCancel.Done():
                        fmt.Println(name, "return for ctxWithCancel.Done()")
                        return
                default:
                        fmt.Println(name, "working")
                }
                time.Sleep(1 * time.Second)
        }
}

Running this program prints:

[1] working
[2] working
[2] working
[1] working
[1] working
[2] working
[2] working
[1] working
[1] working
[2] working
[1] return for ctxWithCancel.Done()
[2] return for ctxWithCancel.Done()
example closed

As you can see, the workers stopped this time because the timer of ctxWithCancel expired.

Now change the manager's delay to 2 seconds, keep the WithTimeout duration unchanged, and run the program again. This time the workers run for only 2 seconds before the manager stops them early.

[1] working
[2] working
[2] working
[1] working
manager called cancel()
[1] return for ctxWithCancel.Done()
[2] return for ctxWithCancel.Done()
example closed

Origin blog.csdn.net/ZNBase/article/details/131410274