Golang Channel Implementation Principle and Source Code Analysis

Do not communicate by sharing memory; instead, share
memory by communicating.

Safe access to shared variables is one of the hard parts of concurrent programming. Go advocates sharing memory by communicating, which in practice means passing shared values over channels. At any given time only one goroutine has access to the value, which avoids data races.
This article analyzes how channels are implemented, alongside the runtime source code. Reading the source makes the implementation and the reasons behind it easier to understand. The key steps and variables of the source code are commented. It is not necessary to understand every variable and function; the goal is to see how the code handles exceptional cases and why channel creation, writing, reading, and closing are each split into several cases.

1. Channel data structure

1.1 hchan structure

Reading the source code of the channel, we can find that the data structure of the channel is the hchan structure, which contains the following fields, and the meaning of each field has been annotated:

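type hchan struct {
    qcount   uint           // number of elements currently in the channel
    dataqsiz uint           // capacity of the channel (size of the ring buffer)
    buf      unsafe.Pointer // ring buffer that stores the elements
    elemsize uint16         // size of one element
    closed   uint32         // non-zero once the channel has been closed
    elemtype *_type         // element type
    sendx    uint           // ring buffer index where the next sent element is written
    recvx    uint           // ring buffer index where the next received element is read
    recvq    waitq          // queue of goroutines blocked on receive
    sendq    waitq          // queue of goroutines blocked on send
    lock     mutex          // mutex; only one goroutine may read or write the channel at a time
}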

The data structure shows that a channel uses a ring queue as its buffer. The length of the ring queue, dataqsiz, is fixed when the channel is created. The sendx and recvx fields are the tail and head of the ring queue: sendx is the position where the next element is written, and recvx is the position where the next element is read.
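
To make the index handling concrete, here is a minimal sketch of a ring queue that reuses the field names of hchan. The ring type and its push and pop methods are hypothetical names chosen for illustration; the runtime works on raw memory and holds a lock, which this sketch leaves out.

```go
package main

import "fmt"

// ring is a hypothetical, simplified stand-in for the buffer part of hchan.
// Only the field names mirror hchan; the rest is illustrative.
type ring struct {
	buf      []int
	qcount   int // number of elements currently stored
	dataqsiz int // capacity of the ring
	sendx    int // next write position (tail)
	recvx    int // next read position (head)
}

func newRing(size int) *ring {
	return &ring{buf: make([]int, size), dataqsiz: size}
}

// push stores v at sendx and reports false if the ring is full.
func (r *ring) push(v int) bool {
	if r.qcount == r.dataqsiz {
		return false
	}
	r.buf[r.sendx] = v
	r.sendx++
	if r.sendx == r.dataqsiz {
		r.sendx = 0 // wrap around
	}
	r.qcount++
	return true
}

// pop removes the element at recvx and reports false if the ring is empty.
func (r *ring) pop() (int, bool) {
	if r.qcount == 0 {
		return 0, false
	}
	v := r.buf[r.recvx]
	r.recvx++
	if r.recvx == r.dataqsiz {
		r.recvx = 0 // wrap around
	}
	r.qcount--
	return v, true
}

func main() {
	r := newRing(2)
	r.push(1)
	r.push(2)
	fmt.Println(r.push(3)) // false: the ring is full
	v, _ := r.pop()
	fmt.Println(v, r.sendx, r.recvx) // 1 0 1
}
```

After two pushes into a ring of size 2, sendx has wrapped back to 0, while recvx moves to 1 after the first pop.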

The recvq and sendq fields are the queues of goroutines waiting to receive and waiting to send. A receiving goroutine blocks and joins recvq when the buffer is empty (or the channel is unbuffered and no sender is waiting); a sending goroutine blocks and joins sendq when the buffer is full (or the channel is unbuffered and no receiver is waiting). A blocked goroutine is woken when another goroutine operates on the channel: a blocked reader is woken by a writer, and a blocked writer is woken by a reader.

The fields elemtype and elemsize indicate the type and size of elements in the channel. It should be noted that a channel can only transmit one type of value. If you need to transmit any type of data, you can use the interface{} type.

The lock field ensures that only one goroutine reads or writes the channel at a time.

1.2 Blocking coroutine queue waitq and sudog structure

In hchan, both recvq and sendq are of type waitq, a queue of all goroutines blocked on the channel. Its first and last fields are pointers to sudog and mark the head and tail of the queue. waitq is therefore a doubly linked list of sudog nodes, each holding a waiting goroutine. A sudog represents a goroutine in a wait queue (for example on a channel or a sync.Mutex) and carries the goroutine together with the data it is sending or receiving. The waitq and sudog structures contain the following fields, each annotated with its meaning:

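type waitq struct { // queue of blocked goroutines
    first *sudog // head of the queue
    last  *sudog // tail of the queue
}

type sudog struct { // sudog: a node that wraps a waiting goroutine
    g *g // the goroutine

    next *sudog         // next node in the queue
    prev *sudog         // previous node in the queue
    elem unsafe.Pointer // holds the data being read from or written to the channel

    isSelect bool // whether the goroutine is taking part in a select

    c *hchan // the channel this sudog is interacting with
}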


2. Channel constructor function

2.1 Common types of Channel

  • Unbuffered Channel: commonly used for synchronization, such as coordinating two or more concurrent goroutines or handing over a critical resource.

  • Buffered struct Channel: a buffered channel whose element type contains no pointers (elem.ptrdata == 0 in the source below). It is often used for a one-way data flow, for example to decouple a producer from a consumer and avoid unnecessary waiting.

  • Buffered pointer-type Channel: a buffered channel whose element type contains pointers. It is often used for asynchronous data transfer, decoupling the consumer's reads from the producer's writes.

2.2 Channel constructor function source code analysis

func makechan(t *chantype, size int) *hchan {
    elem := t.elem // element type of the channel

    // each element occupies elem.size bytes and the capacity is size; compute the total memory mem
    mem, overflow := math.MulUintptr(elem.size, uintptr(size))
    if overflow || mem > maxAlloc-hchanSize || size < 0 {
        panic(plainError("makechan: size out of range"))
    }

    var c *hchan
    switch {
    case mem == 0: // unbuffered channel (or zero-size elements)
        // hchanSize is 96 bytes on 64-bit platforms
        c = (*hchan)(mallocgc(hchanSize, nil, true))
        // the race detector uses this location for synchronization
        c.buf = c.raceaddr()
    case elem.ptrdata == 0: // buffered channel whose elements contain no pointers
        c = (*hchan)(mallocgc(hchanSize+mem, nil, true))
        c.buf = add(unsafe.Pointer(c), hchanSize)
    default: // buffered channel whose elements contain pointers
        c = new(hchan)
        c.buf = mallocgc(mem, elem, true)
    }

    // initialize hchan
    c.elemsize = uint16(elem.size) // number of bytes each element occupies
    c.elemtype = elem              // element type
    c.dataqsiz = uint(size)        // maximum number of elements in the queue

    lockInit(&c.lock, lockRankHchan) // initialize the lock that protects reads and writes

    return c
}

This code creates a channel and initializes its fields:

  1. Compute the total buffer size: each element occupies t.elem.size bytes and the channel capacity is size, so mem = elem.size * size. If the multiplication overflows, the result exceeds the allocation limit, or size is negative, makechan panics.
  2. Allocate memory according to mem and the element type. There are three cases: no buffer, a buffer whose elements contain no pointers, and a buffer whose elements contain pointers:
    • If mem is 0 (an unbuffered channel, or a zero-size element type), allocate only hchanSize bytes, which is 96 on 64-bit platforms;
    • If the channel is buffered and its elements contain no pointers, allocate hchanSize + mem bytes in one call and point buf just past the hchan header, at the start of the mem bytes;
    • If the channel is buffered and its elements contain pointers, allocate hchan and buf separately; the two do not need to be contiguous;
  3. Initialize the channel: set elemsize, elemtype, dataqsiz, and lock. elemsize is the number of bytes each element occupies, elemtype is the runtime descriptor of the element type, dataqsiz is the capacity of the queue (0 for an unbuffered channel), and lock protects reads and writes of the channel.
  4. Finally, return the pointer c to the new channel.
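
The effect of makechan can be observed from ordinary Go code. The sketch below uses only the built-in make, cap, and len; the helper names sizes and makePanics are assumptions made for this example. cap and len correspond to the dataqsiz and qcount fields, and a negative size is expected to reach the "size out of range" panic shown above.

```go
package main

import "fmt"

// sizes reports the capacity and current length of ch, which correspond
// to the dataqsiz and qcount fields of hchan.
func sizes(ch chan int) (capacity, length int) {
	return cap(ch), len(ch)
}

// makePanics reports whether make(chan int, n) panics for the given n.
func makePanics(n int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	_ = make(chan int, n)
	return false
}

func main() {
	unbuffered := make(chan int)
	buffered := make(chan int, 3)
	buffered <- 7

	fmt.Println(sizes(unbuffered)) // 0 0
	fmt.Println(sizes(buffered))   // 3 1
	fmt.Println(makePanics(-1))    // true
}
```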

3. Implementation principle of channel write operation

3.1 Channel write exception handling

func chansend1(c *hchan, elem unsafe.Pointer) {
    chansend(c, elem, true, getcallerpc())
}

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
    if c == nil {
        // a send on a nil channel parks the goroutine forever
        gopark(nil, nil, waitReasonChanSendNilChan, traceEvGoStop, 2)
        throw("unreachable")
    }

    lock(&c.lock)

    if c.closed != 0 {
        unlock(&c.lock)
        panic(plainError("send on closed channel"))
    }

    // ...

  • For a nil (uninitialized) channel, the write parks the goroutine forever. The throw("unreachable") after gopark is never expected to run; if every goroutine ends up blocked this way, the runtime reports a deadlock;
  • For a closed channel, the write panics with "send on closed channel";
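
Both exceptional cases can be reproduced without hanging the program. In this sketch, sendPanics and trySend are hypothetical helpers: the first recovers from the panic raised by a send on a closed channel, and the second uses select with a default branch so that the send on a nil channel returns instead of blocking forever.

```go
package main

import "fmt"

// sendPanics reports whether sending v on ch panics.
// ch must have free buffer space or be closed, otherwise the send blocks.
func sendPanics(ch chan int, v int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	ch <- v
	return false
}

// trySend attempts a non-blocking send and reports whether it went through.
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	var nilCh chan int
	fmt.Println(trySend(nilCh, 1)) // false: a send on a nil channel can never proceed

	closed := make(chan int, 1)
	close(closed)
	fmt.Println(sendPanics(closed, 1)) // true: send on closed channel
}
```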

3.2 A blocked reader exists when writing (the ring buffer then holds 0 elements)

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
    // ...

    lock(&c.lock) // acquire the lock

    // ...
    // take one sudog (a wrapped goroutine) from the queue of blocked readers
    if sg := c.recvq.dequeue(); sg != nil {
        // send copies the element directly to the reader sg with memmove,
        // then releases the lock and wakes the reader
        send(c, sg, ep, func() { unlock(&c.lock) }, 3)
        return true
    }

    // ...

The channel lock is taken before writing. If a blocked reader is waiting on the channel, the buffer must be empty, so the reader is woken up and, for efficiency, the value is copied directly to it without going through the buffer.

3.3 No blocked reader when writing, and the ring buffer still has space

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
    // ...
    lock(&c.lock) // acquire the lock
    // ...
    if c.qcount < c.dataqsiz { // the ring buffer still has space
        qp := chanbuf(c, c.sendx) // slot in the ring buffer at index sendx
        // typedmemmove(t, dst, src) is essentially a memory copy:
        // it copies the sent value from ep into the buffer slot qp
        typedmemmove(c.elemtype, qp, ep)
        c.sendx++
        if c.sendx == c.dataqsiz {
            c.sendx = 0
        }
        c.qcount++
        unlock(&c.lock)
        return true
    }

    // ...
}

The channel lock is taken before writing. If no reader is blocked and the ring buffer still has space, the value is written straight into the buffer: the element is copied into the slot at sendx, sendx and qcount are incremented (sendx wraps to 0 at the end of the buffer), the lock is released, and the function returns.
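
The boundary between this case and the next one (buffer full) is visible from user code. The helper fill below is a hypothetical name for this sketch: it keeps sending without blocking until a send would have to wait, so its result equals the free space in the buffer.

```go
package main

import "fmt"

// fill sends 0, 1, 2, ... without blocking until a send would block,
// and returns how many sends succeeded.
func fill(ch chan int) int {
	n := 0
	for {
		select {
		case ch <- n:
			n++
		default:
			return n
		}
	}
}

func main() {
	buffered := make(chan int, 3)
	sent := fill(buffered)
	fmt.Println(sent, len(buffered)) // 3 3

	// An unbuffered channel with no waiting receiver accepts nothing.
	fmt.Println(fill(make(chan int))) // 0
}
```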

3.4 No blocked reader when writing, and the ring buffer has no space

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
    // ...
    lock(&c.lock) // acquire the lock

    // ...
    // build a sudog that wraps the current goroutine and link sudog, goroutine and channel together
    gp := getg()
    mysg := acquireSudog()
    mysg.elem = ep
    mysg.g = gp
    mysg.c = c
    gp.waiting = mysg
    // append the sudog to the channel's queue of blocked writers
    c.sendq.enqueue(mysg)

    // park the current goroutine; chanparkcommit releases c.lock
    atomic.Store8(&gp.parkingOnChan, 1)
    gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanSend, traceEvGoBlockSend, 2)
    // woken up: either a reader took the element, or the channel was closed
    gp.waiting = nil
    closed := !mysg.success
    gp.param = nil
    mysg.c = nil
    releaseSudog(mysg) // recycle the sudog
    if closed {
        // woken by close: sending on a closed channel panics
        panic(plainError("send on closed channel"))
    }
    return true
}

The channel lock is taken before writing. If no reader is blocked and the ring buffer is full, the value cannot be written, so the current goroutine is added to the queue of blocked writers and parked until a reader wakes it up. When a reader wakes it, the element has already been taken by that reader (see section 4.3), so the sudog can be released directly. If the goroutine is instead woken because the channel was closed, mysg.success is false and the send panics.
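
The close case can be sketched as follows. The function sendOutcome is a hypothetical helper; it starts a sender on an unbuffered channel that has no receiver and then closes the channel. The scheduler decides whether the sender parks before or after the close, so this example cannot prove which path ran, but in both orders the send is expected to panic.

```go
package main

import "fmt"

// sendOutcome starts a sender on an unbuffered channel with no receiver,
// closes the channel, and reports whether the send panicked.
func sendOutcome() bool {
	ch := make(chan int)
	result := make(chan bool)
	go func() {
		// Report whether the send below panicked.
		defer func() { result <- recover() != nil }()
		ch <- 1 // no receiver: parks in sendq unless the channel is already closed
	}()
	close(ch)
	return <-result
}

func main() {
	fmt.Println(sendOutcome()) // true
}
```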

3.5 Summary of channel writing process


  1. First check whether the channel is nil (uninitialized); if so, the goroutine blocks forever
  2. If the channel is not nil, lock it, because the channel is a shared resource
  3. Check whether the channel is closed; if so, panic: send on closed channel
  4. If the channel is not nil and not closed, enter the actual write path, and first check whether a blocked reader exists
    • If there is a blocked reader, the ring buffer holds 0 elements: copy the value directly to the reader, unlock, wake the reader, and return
    • If there is no blocked reader, check whether the ring buffer has space
      • If there is space, copy the element into the slot at sendx, update the write position sendx and the element count qcount, unlock, and return
      • If there is no space, add the current goroutine to the queue of blocked writers and park it (parking releases the lock) until a reader wakes it up

4. Implementation principle of channel read operation

4.1 Channel read exception handling: reading a nil channel

func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
    if c == nil {
        gopark(nil, nil, waitReasonChanReceiveNilChan, traceEvGoStop, 2)
        throw("unreachable")
    }
    // ...
}

As shown above, reading from a nil (uninitialized) channel calls runtime.gopark to suspend the current goroutine forever; throw("unreachable") is never expected to run.

4.2 The channel is closed when reading and there are no elements inside the ring buffer

func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
    lock(&c.lock)
    // ...
    if c.closed != 0 {
        if c.qcount == 0 {
            unlock(&c.lock)
            if ep != nil {
                // typedmemclr(t, ptr) resets the memory at ptr to the zero value of type t.
                // The channel is closed and the buffer is empty: return the zero value of c.elemtype
                typedmemclr(c.elemtype, ep)
            }
            return true, false
        }
    }

    // ...

If the channel is closed and the buffer is empty, runtime.chanrecv unlocks and returns the zero value of the element type, with received set to false.
The case where the channel is closed but the buffer still holds data is handled by the checks that follow.
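
Both behaviours show up in a small example: values buffered before close are still delivered, and only then do receives return the zero value. The helper drain is a hypothetical name used for this sketch.

```go
package main

import "fmt"

// drain collects every value that is still buffered in ch.
// The range loop ends once ch is closed and empty.
func drain(ch chan int) []int {
	var out []int
	for v := range ch {
		out = append(out, v)
	}
	return out
}

func main() {
	ch := make(chan int, 2)
	ch <- 1
	ch <- 2
	close(ch)

	fmt.Println(drain(ch)) // [1 2]: buffered data survives close
	fmt.Println(<-ch)      // 0: closed and empty, so the zero value is returned
}
```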

4.3 A blocked writer exists when reading (the channel is unbuffered or the ring buffer is full)


func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
    // acquire the lock
    lock(&c.lock)
    // ...
    // take one writer from the queue of blocked writers
    if sg := c.sendq.dequeue(); sg != nil {
        // recv delivers a value to the receiver's ep and wakes the writer sg:
        // unbuffered channel: read the writer's element directly;
        // buffered channel: read the element at the head of the buffer, store the
        // writer's element at the tail, and update the read and write indexes
        recv(c, sg, ep, func() { unlock(&c.lock) }, 3)
        return true, true
    }
    // ...
}

If a blocked writer exists when reading, the channel must be unbuffered or its ring buffer must be full. The recv function is called, with the following result:

  • If the channel is unbuffered, read the writer's element directly and wake up the writer;
  • If the channel is buffered, read the element at the head of the buffer, store the writer's element at the tail of the buffer, update the read and write indexes, and wake up the writer;

The general flow of the recv function:
1. If the channel is unbuffered (c.dataqsiz == 0), copy the value directly from the writer's sudog to the receiver's ep.
2. Otherwise the buffer is full: copy the element at recvx (the head of the queue) to ep, then copy the writer's element into the same slot. Because the buffer is full, this slot is also the tail of the queue.
3. Advance recvx, wrapping to 0 at the end of the buffer, and set sendx to recvx; qcount does not change.
4. Release the channel lock, mark the writer's sudog as successful, and make the writer goroutine runnable again with goready.
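
The observable consequence is that FIFO order is preserved even when a writer is waiting behind a full buffer. The sketch below uses a hypothetical helper, receiveWithBlockedWriter. Whether the writer has actually parked when the first receive runs depends on scheduling, but the order of the received values is the same either way.

```go
package main

import "fmt"

// receiveWithBlockedWriter fills a one-slot buffer, starts a second
// writer that has to wait for space, and returns the two values in
// the order they are received.
func receiveWithBlockedWriter() (first, second int) {
	ch := make(chan int, 1)
	ch <- 1 // the buffer is now full

	done := make(chan struct{})
	go func() {
		ch <- 2 // waits until the receiver frees the slot
		close(done)
	}()

	first = <-ch  // the head of the buffer
	<-done        // the writer has been released
	second = <-ch // the writer's element, queued behind the first one
	return first, second
}

func main() {
	fmt.Println(receiveWithBlockedWriter()) // 1 2
}
```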


4.4 No blocked writer when reading, and the buffer has elements

func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
    // ...
    // acquire the lock
    lock(&c.lock)
    // ...
    if c.qcount > 0 {
        // slot in the ring buffer at index recvx
        qp := chanbuf(c, c.recvx)
        if ep != nil {
            // typedmemmove(t, dst, src) copies one value of type t from src to dst:
            // copy the element in the buffer slot qp to the receiver's address ep
            typedmemmove(c.elemtype, ep, qp)
        }
        // typedmemclr(t, ptr) resets the memory at ptr to the zero value of type t:
        // clear the buffer slot that was just read
        typedmemclr(c.elemtype, qp)
        c.recvx++
        if c.recvx == c.dataqsiz {
            c.recvx = 0
        }
        c.qcount--
        unlock(&c.lock)
        return true, true
    }
    // ...

When no writer is blocked and the buffer has elements, which is the common case, the element at recvx is read directly from the ring buffer.

4.5 No blocked writer when reading, and the buffer has no elements

func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
    // ...
    // acquire the lock
    lock(&c.lock)
    // ...
    // build a sudog that wraps the current goroutine
    gp := getg()
    mysg := acquireSudog()
    // link sudog, goroutine and channel together
    mysg.elem = ep
    gp.waiting = mysg
    mysg.g = gp
    mysg.c = c
    gp.param = nil
    // append the sudog to the channel's queue of blocked readers
    c.recvq.enqueue(mysg)
    atomic.Store8(&gp.parkingOnChan, 1)
    // park the current reader; chanparkcommit releases c.lock
    gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanReceive, traceEvGoBlockRecv, 2)
    // woken up: either a writer delivered a value (success is true)
    // or the channel was closed (success is false); recycle the sudog
    gp.waiting = nil
    success := mysg.success
    gp.param = nil
    mysg.c = nil
    releaseSudog(mysg)
    // return (the lock was already released when the goroutine parked)
    return true, success
}

When no writer is blocked and the buffer is empty, gopark puts the current goroutine to sleep until another goroutine either sends a value or closes the channel. By the time the goroutine is woken, the writer has already copied the value to ep (or close has cleared it to the zero value), so the sudog is simply released.

4.6 Summary of channel reading process


  1. First check whether the channel is nil (uninitialized); if so, the goroutine blocks forever
  2. If the channel is not nil, lock it, because the channel is a shared resource
  3. Check whether the channel is closed. If it is closed and the ring buffer has no elements, unlock and return the zero value of the element type
  4. Otherwise enter the actual read path, and first check whether a blocked writer exists
    • If there is a blocked writer, the channel is unbuffered or the ring buffer is full, so check whether the channel is unbuffered
      • If the channel is unbuffered, read the writer's element directly and wake up the writer;
      • If the channel is buffered, read the head element of the ring buffer, store the writer's element at the tail of the buffer, update the read and write indexes, and wake up the writer;
    • If there is no blocked writer, check whether the ring buffer has elements
      • If it has elements, copy the element at recvx to the receiver, clear the slot, update the read position recvx and the element count qcount, unlock, and return
      • If it has no elements, add the current goroutine to the queue of blocked readers and park it (parking releases the lock) until a writer or a close wakes it up

4.7 Two protocols for reading channels

As shown above, reading from a channel that is closed and empty returns a zero value, so the reader needs a way to tell a zero value that was actually sent from one returned because the channel is closed. For this reason Go provides two forms of the receive operation:

got1 := <-ch
got2, ok := <-ch

The second return value, of type bool, tells whether the receive delivered a value. If ok is false, the channel is closed and the buffer is empty.
The two forms are compiled into calls to different functions:

func chanrecv1(c *hchan, elem unsafe.Pointer) {
    chanrecv(c, elem, true)
}

//go:nosplit
func chanrecv2(c *hchan, elem unsafe.Pointer) (received bool) {
    _, received = chanrecv(c, elem, true)
    return
}
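
A short sketch of why the two-value form matters: the channel below carries a genuine 0 and is then closed, and only the ok flag distinguishes the two receives. The helper describe is a hypothetical name for this example.

```go
package main

import "fmt"

// describe receives once from ch and says whether a real value arrived
// or the channel was closed and empty.
func describe(ch chan int) string {
	v, ok := <-ch
	if !ok {
		return "closed"
	}
	return fmt.Sprintf("value %d", v)
}

func main() {
	ch := make(chan int, 1)
	ch <- 0 // a real zero that was actually sent
	close(ch)

	fmt.Println(describe(ch)) // value 0
	fmt.Println(describe(ch)) // closed
}
```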

5. Blocking and non-blocking mode

5.1 Overview of blocking and non-blocking modes

Blocking and non-blocking describe two ways of waiting for a result when accessing a resource. The main difference is whether the program can continue with other work while waiting for the call to complete and return.

  • Blocking means that when a process requests an I/O operation (such as reading or writing a disk file) and the device is not ready, the calling process is suspended and waits until the read or write completes. The caller makes no progress until the result is available.

  • A non-blocking call performs the same operation as the blocking call but returns immediately with a status that indicates success or failure. If the operation cannot be performed immediately, it returns failure instead of waiting, and the application can try again later.

Channel operations are blocking by default: a receive <-ch or a send ch <- value waits until another goroutine sends to or receives from the channel (or, for a buffered channel, until the buffer has an element or free space).
A select statement with a default branch is non-blocking: when none of the other branches can proceed immediately, select runs default instead of waiting. (Without a default branch, select blocks until one branch can proceed.) This makes it possible to build non-blocking channel I/O:

ch := make(chan int)
select {
case <-ch:
default:
}

5.2 Non-blocking mode logic

In the source code above, both the write function chansend and the read function chanrecv take a parameter block bool, but because the source was simplified its effect is not shown.
In non-blocking mode, the caller passes block = false, and the bool return value reports whether the read/write was performed:
  • If the read/write can complete immediately, the call returns true in non-blocking mode;
  • If the operation would block forever (nil channel) or would need to park the goroutine, the call returns false in non-blocking mode;

func selectnbsend(c *hchan, elem unsafe.Pointer) (selected bool) {
    return chansend(c, elem, false, getcallerpc())
}

func selectnbrecv(elem unsafe.Pointer, c *hchan) (selected, received bool) {
    return chanrecv(c, elem, false)
}

A select statement with a single channel operation and a default branch is compiled into a call to selectnbrecv or selectnbsend. These reuse chanrecv and chansend, but because the third argument block is false, execution takes the non-blocking branches.
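
At the language level, the non-blocking receive looks like the sketch below. The helper tryRecv is a hypothetical name. Its select has one receive case plus default, which is the shape described above, and it returns immediately in all three situations shown.

```go
package main

import "fmt"

// tryRecv attempts a non-blocking receive. ok is false when the receive
// would have had to wait (empty buffer, no sender, or nil channel).
func tryRecv(ch chan int) (v int, ok bool) {
	select {
	case v = <-ch:
		return v, true
	default:
		return 0, false
	}
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(tryRecv(ch)) // 0 false: nothing buffered yet

	ch <- 5
	fmt.Println(tryRecv(ch)) // 5 true

	var nilCh chan int
	fmt.Println(tryRecv(nilCh)) // 0 false: a nil channel never becomes ready
}
```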

6. Close the channel process

func closechan(c *hchan) {
    if c == nil {
        // closing a nil (uninitialized) channel panics
        panic(plainError("close of nil channel"))
    }

    lock(&c.lock) // acquire the lock
    if c.closed != 0 {
        unlock(&c.lock)
        panic(plainError("close of closed channel")) // closing a channel twice panics
    }

    c.closed = 1

    var glist gList
    // release all readers
    for {
        sg := c.recvq.dequeue()
        if sg == nil {
            break
        }
        if sg.elem != nil {
            typedmemclr(c.elemtype, sg.elem)
            sg.elem = nil
        }
        gp := sg.g
        gp.param = unsafe.Pointer(sg)
        sg.success = false
        glist.push(gp) // collect every blocked reader in glist
    }

    // release all writers (they will panic)
    for {
        sg := c.sendq.dequeue()
        if sg == nil {
            break
        }
        sg.elem = nil
        gp := sg.g
        gp.param = unsafe.Pointer(sg)
        sg.success = false
        glist.push(gp) // collect every blocked writer in glist
    }
    unlock(&c.lock)

    // Ready all Gs now that we've dropped the channel lock.
    for !glist.empty() {
        gp := glist.pop()
        gp.schedlink = 0
        goready(gp, 3) // wake up every goroutine in glist
    }
}


  1. First check whether the channel is nil (uninitialized); closing a nil channel triggers panic(plainError("close of nil channel"))
  2. If the channel is not nil, lock it, because the channel is a shared resource
  3. Check whether the channel is already closed; if so, trigger panic(plainError("close of closed channel"))
  4. If the channel is not nil and not closed, enter the actual close path and set closed to 1:
    • If there are blocked readers, clear their receive slots to the zero value and add them to glist. In this case there can be no blocked writers
    • If there are blocked writers, add them to glist; they panic when they resume. In this case there can be no blocked readers
    • Unlock, then wake up all goroutines in glist.
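
Because close wakes every blocked reader, it is commonly used as a broadcast signal. The sketch below assumes two hypothetical helpers: broadcast counts how many waiting goroutines are released by one close, and closePanics checks the two panic cases from steps 1 and 3.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// broadcast starts n goroutines that wait on one channel, closes the
// channel once, and returns how many goroutines were released.
func broadcast(n int) int {
	done := make(chan struct{})
	var wg sync.WaitGroup
	var woken int32
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-done // returns once done is closed
			atomic.AddInt32(&woken, 1)
		}()
	}
	close(done) // a single close releases every reader
	wg.Wait()
	return int(atomic.LoadInt32(&woken))
}

// closePanics reports whether close(ch) panics.
func closePanics(ch chan int) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	close(ch)
	return false
}

func main() {
	fmt.Println(broadcast(5)) // 5

	var nilCh chan int
	fmt.Println(closePanics(nilCh)) // true: close of nil channel

	ch := make(chan int)
	close(ch)
	fmt.Println(closePanics(ch)) // true: close of closed channel
}
```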


Origin blog.csdn.net/qq_45808700/article/details/130990891