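channel: One World, One Dream

Stay true to yourself and everything is possible --- chicken soup for the soul from 雪糕棍

We will get to know channel through the three questions below, going all the way from beginner level to real project use, and hopefully give readers a different, immersive experience.

Why does golang have channel as a language feature?
How does channel work?
Everyday channel usage and its pitfalls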

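Why golang has such a "monster-level" tool

Do not communicate by sharing memory; instead, share memory by communicating --- golang concurrency philosophy

The advice is: do not communicate by sharing memory; instead, share memory by communicating. In terms of language features, golang's concurrency tools fall into two camps: the sync package and channel. The sync package mainly provides WaitGroup, locks (Mutex, RWMutex), Cond, Once, and sync.Pool. A channel, by contrast, hands each value to exactly one goroutine at a time.

With sync, the idea can be sketched as follows: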

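package main

import (
	"fmt"
	"sync"
)

func main() {
	var a = 0
	var lock sync.Mutex
	var wg sync.WaitGroup
	for i := 0; i < 1000; i++ {
		wg.Add(1)
		go func(idx int) {
			defer wg.Done()
			lock.Lock()
			defer lock.Unlock()
			a += 1 // the mutex keeps updates to a consistent
			fmt.Printf("goroutine %d, a=%d\n", idx, a)
		}(i)
	}
	// Wait until every goroutine has finished before main exits
	// (more reliable than sleeping for a fixed second).
	wg.Wait()
}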

With channel, the same kind of coordination can be written in a higher-level style:

func counter(out chan<- int) {
	for i := 0; i < 100; i++ {
		out <- i + 1
	}
	close(out)
}

func printer(in <-chan int) {
	for i := range in {
		fmt.Println(i)
	}
}

func main() {
	// A second, unused channel (ch2) was removed: Go rejects unused variables.
	ch1 := make(chan int)
	go counter(ch1)
	printer(ch1)
}

How channel works

Open src/runtime/chan.go; the comment at the top of the file reads:

Invariants:
At least one of c.sendq and c.recvq is empty,
except for the case of an unbuffered channel with a single goroutine
blocked on it for both sending and receiving using a select statement,
in which case the length of c.sendq and c.recvq is limited only by the
size of the select statement.

For buffered channels, also:
c.qcount > 0 implies that c.recvq is empty.
c.qcount < c.dataqsiz implies that c.sendq is empty.

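The first word is Invariants. My vocabulary is limited, so let me ask Baidu for help.

Or rather, let's just google the word (ps: my knowledge is limited, so please bear with me!!!)

Oh, so it means properties that never change. Let's keep reading. Wait, what does this mean???

The very first sentence already has me lost. What does it mean that at least one of c.sendq and c.recvq is empty? What is c.sendq? What is it for? And what is c.recvq? Ten thousand you-know-whats just stampeded through my head.

Let's search for them in the same source file: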

type hchan struct {
	qcount   uint           // total data in the queue
	dataqsiz uint           // size of the circular queue
	buf      unsafe.Pointer // points to an array of dataqsiz elements
	elemsize uint16
	closed   uint32
	elemtype *_type // element type
	sendx    uint   // send index
	recvx    uint   // receive index
	recvq    waitq  // list of recv waiters
	sendq    waitq  // list of send waiters

	// lock protects all fields in hchan, as well as several
	// fields in sudogs blocked on this channel.
	//
	// Do not change another G's status while holding this lock
	// (in particular, do not ready a G), as this can deadlock
	// with stack shrinking.
	lock mutex
}

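It turns out the hchan struct declares both fields. From the comments it is easy to see that sendq is the queue of waiting senders and recvq is the queue of waiting receivers.

Back to the opening comment: it says the send wait queue and the receive wait queue are mutually exclusive, so at any moment at most one of them holds waiting goroutines.

The one exception is an unbuffered channel on which a single goroutine is blocked for both sending and receiving in a select statement; in that case both queues can be non-empty, and their length is limited only by the size of the select statement. For buffered channels, c.qcount > 0 implies that the receive wait queue c.recvq is empty (if data is buffered, no receiver has to wait), and c.qcount < c.dataqsiz implies that the send wait queue c.sendq is empty (if there is free space, no sender has to wait).

As a newcomer, all I can say is: I don't fully get it, but it sounds impressive. Come on, let's go back hand in hand to the hchan struct and understand channel in depth.

recvq and sendq are both of type waitq, so let's see what waitq actually is: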

type waitq struct {
	first *sudog
	last  *sudog
}

Now the sudog source:

// sudog represents a g in a wait list, such as for sending/receiving
// on a channel.
//
// sudog is necessary because the g ↔ synchronization object relation
// is many-to-many. A g can be on many wait lists, so there may be
// many sudogs for one g; and many gs may be waiting on the same
// synchronization object, so there may be many sudogs for one object.
//
// sudogs are allocated from a special pool. Use acquireSudog and
// releaseSudog to allocate and free them.
type sudog struct {
	// The following fields are protected by the hchan.lock of the
	// channel this sudog is blocking on. shrinkstack depends on
	// this for sudogs involved in channel ops.

	g *g

	next *sudog
	prev *sudog
	elem unsafe.Pointer // data element (may point to stack)

	// The following fields are never accessed concurrently.
	// For channels, waitlink is only accessed by g.
	// For semaphores, all fields (including the ones above)
	// are only accessed when holding a semaRoot lock.

	acquiretime int64
	releasetime int64
	ticket      uint32

	// isSelect indicates g is participating in a select, so
	// g.selectDone must be CAS'd to win the wake-up race.
	isSelect bool

	// success indicates whether communication over channel c
	// succeeded. It is true if the goroutine was awoken because a
	// value was delivered over channel c, and false if awoken
	// because c was closed.
	success bool

	parent   *sudog // semaRoot binary tree
	waitlink *sudog // g.waiting list or semaRoot
	waittail *sudog // semaRoot
	c        *hchan // channel
}

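A sudog represents a G (goroutine) in a wait list, for example one waiting to send or receive on a channel. waitq is a doubly linked list of sudogs (linked through next and prev), with first and last pointing to its head and tail.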
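To make the shape of the queue concrete, here is a minimal sketch of a FIFO wait queue built the same way. The waiter and waitQueue types are hypothetical stand-ins for sudog and waitq, not the runtime's code, and the sketch ignores locking and select handling.

```go
package main

import "fmt"

// waiter is a hypothetical stand-in for runtime.sudog.
type waiter struct {
	name       string
	next, prev *waiter
}

// waitQueue mimics the shape of runtime.waitq: head and tail pointers.
type waitQueue struct {
	first, last *waiter
}

// enqueue appends w at the tail of the list.
func (q *waitQueue) enqueue(w *waiter) {
	w.next = nil
	w.prev = q.last
	if q.last == nil {
		q.first = w
	} else {
		q.last.next = w
	}
	q.last = w
}

// dequeue removes and returns the head, or nil if the queue is empty.
func (q *waitQueue) dequeue() *waiter {
	w := q.first
	if w == nil {
		return nil
	}
	q.first = w.next
	if q.first == nil {
		q.last = nil
	} else {
		q.first.prev = nil
	}
	w.next, w.prev = nil, nil
	return w
}

func main() {
	var q waitQueue
	q.enqueue(&waiter{name: "g1"})
	q.enqueue(&waiter{name: "g2"})
	for w := q.dequeue(); w != nil; w = q.dequeue() {
		fmt.Println(w.name) // g1, then g2: first in, first out
	}
}
```

Waiters leave the queue in the order they arrived, which is why the longest-blocked goroutine is served first.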

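This touches on Go's GMP scheduling model, so I won't expand on it here. I'll leave a placeholder and cover it in a dedicated chapter later.

Creating, sending on, receiving from, and closing a channel

Creating a channel

ch := make(chan string)  // create an unbuffered channel

ch := make(chan string, 2) // create a buffered channel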
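The difference between the two forms can be observed with the built-in len and cap, which report the number of queued elements and the buffer capacity (the qcount and dataqsiz fields seen earlier). A small sketch; the usage helper is only for illustration:

```go
package main

import "fmt"

// usage is a hypothetical helper that reports how many elements are
// queued in ch and how many the buffer can hold.
func usage(ch chan string) (queued, capacity int) {
	return len(ch), cap(ch)
}

func main() {
	unbuffered := make(chan string)
	buffered := make(chan string, 2)

	buffered <- "a" // does not block: the buffer has free space

	fmt.Println(usage(unbuffered)) // 0 0
	fmt.Println(usage(buffered))   // 1 2
}
```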

Inspecting the compiled code with go tool shows that creation calls runtime.makechan:

 initChannel.go:14     0x1089a61               e87aa2f7ff              CALL runtime.makechan(SB)

Part of the source is pasted below:

func makechan(t *chantype, size int) *hchan {

    elem := t.elem
    
    ...
    mem, overflow := math.MulUintptr(elem.size, uintptr(size))
    
    var c *hchan
	switch {
	case mem == 0:
		// Queue or element size is zero.
		c = (*hchan)(mallocgc(hchanSize, nil, true))
		// Race detector uses this location for synchronization.
		c.buf = c.raceaddr()
	case elem.ptrdata == 0:
		// Elements do not contain pointers.
		// Allocate hchan and buf in one call.
		c = (*hchan)(mallocgc(hchanSize+mem, nil, true))
		c.buf = add(unsafe.Pointer(c), hchanSize)
	default:
		// Elements contain pointers.
		c = new(hchan)
		c.buf = mallocgc(mem, elem, true)
	}

	c.elemsize = uint16(elem.size)
	c.elemtype = elem
	c.dataqsiz = uint(size)
	lockInit(&c.lock, lockRankHchan)
    
    ...
	return c
}

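The structure of the code above is quite clear. Setting aside the parameter checks, its main job is to build an *hchan, and the heart of it is the switch statement.

Line 6 of the snippet (the math.MulUintptr call) multiplies two uintptr values (an integer type large enough to hold any pointer) and reports whether the product overflows. The switch then distinguishes three cases:

1. mem is 0 (the queue or the element size is zero): mallocgc allocates only hchanSize bytes on the heap, and buf is pointed at an address inside hchan that the race detector uses for synchronization.

2. The elements contain no pointers: a single heap allocation of hchanSize + mem bytes holds both the channel and its buf.

3. default: the elements contain pointers, so the channel and buf are allocated separately.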
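The three branches can be mirrored in a small decision function. This is only a sketch: allocPlan is a hypothetical helper that names the branch makechan would take, the element sizes in main assume a 64-bit platform, and nothing is actually allocated.

```go
package main

import "fmt"

// allocPlan is a hypothetical helper that mirrors the three branches of
// the switch in makechan. It only names the branch taken.
func allocPlan(elemSize, bufSize uintptr, elemHasPointers bool) string {
	mem := elemSize * bufSize // the runtime also checks this product for overflow
	switch {
	case mem == 0:
		return "hchan only"
	case !elemHasPointers:
		return "hchan and buf in one allocation"
	default:
		return "hchan and buf allocated separately"
	}
}

func main() {
	fmt.Println(allocPlan(8, 0, false)) // like make(chan int)
	fmt.Println(allocPlan(0, 4, false)) // like make(chan struct{}, 4)
	fmt.Println(allocPlan(8, 4, false)) // like make(chan int, 4)
	fmt.Println(allocPlan(8, 4, true))  // like make(chan *int, 4)
}
```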

Sending data on a channel

c <- "the channel A finished"

Compiling the code shows that the send calls the runtime function runtime.chansend1, which is only a thin wrapper around chansend, so chansend is the function we will read:

func chansend1(c *hchan, elem unsafe.Pointer) {
	chansend(c, elem, true, getcallerpc())
}

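Next, let's analyze part of the source. As usual, the validation code is removed and the function is split into pieces:

1. A receiver goroutine is already blocked waiting: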

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
	...
    // lock
	lock(&c.lock)
    // the channel is already closed
	if c.closed != 0 {
		unlock(&c.lock)
		panic(plainError("send on closed channel"))
	}
    // the channel is still open:
	if sg := c.recvq.dequeue(); sg != nil {
		// Found a waiting receiver. We pass the value we want to send
		// directly to the receiver, bypassing the channel buffer (if any).
		send(c, sg, ep, func() { unlock(&c.lock) }, 3)
		return true
	}
	...
}

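Steps:

Lock the channel first, so the operation is safe under concurrency;
Check whether the channel is closed; if it is, release the lock and panic;
If the channel is open, c.recvq.dequeue() returns the sudog of the first waiting receiver when one exists; if the result is non-nil, send() is called;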

func send(c *hchan, sg *sudog, ep unsafe.Pointer, unlockf func(), skip int) {
	...
	if sg.elem != nil {
        // step1
		sendDirect(c.elemtype, sg, ep)
		sg.elem = nil
	}
	gp := sg.g
	unlockf()
	gp.param = unsafe.Pointer(sg)
	sg.success = true
	if sg.releasetime != 0 {
		sg.releasetime = cputicks()
	}
    // step2
	goready(gp, skip+1)
}

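send does two main things:

1. It copies the value being sent directly to the memory address where the receiver will store it;

2. goready() moves the blocked receiving goroutine from _Gwaiting to _Grunnable, so the scheduler can run it again in a later scheduling round.

2. The channel buffer is not full: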

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
    ...
    // the buffer is not full
    if c.qcount < c.dataqsiz {
		// Space is available in the channel buffer. Enqueue the element to send.
		qp := chanbuf(c, c.sendx)
		if raceenabled {
			racenotify(c, c.sendx, nil)
		}
		typedmemmove(c.elemtype, qp, ep)
		c.sendx++
		if c.sendx == c.dataqsiz {
			c.sendx = 0
		}
		c.qcount++
		unlock(&c.lock)
		return true
	}

	
    ...
}

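c.qcount is the number of elements in the queue and dataqsiz is the total size of the circular queue. chanbuf() returns a pointer to the slot at index sendx in the buffer array, and typedmemmove() copies the value being sent into that slot. Because buf is a circular queue, sendx is incremented and wraps back to 0 when it reaches dataqsiz; then qcount is incremented, the lock is released, and the function returns true.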
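The index arithmetic is easier to follow in isolation. Below is a minimal single-goroutine model of the circular buffer; the ring type is hypothetical and borrows only the field names from hchan, with no locking or blocking.

```go
package main

import "fmt"

// ring is a hypothetical, single-goroutine model of the circular buffer
// inside hchan; the field names follow the runtime, the type does not.
type ring struct {
	buf          []int
	qcount       int // number of elements currently stored
	sendx, recvx int // next slot to write / next slot to read
}

func newRing(size int) *ring { return &ring{buf: make([]int, size)} }

// push stores v at sendx; it reports false when the buffer is full.
func (r *ring) push(v int) bool {
	if r.qcount == len(r.buf) {
		return false
	}
	r.buf[r.sendx] = v
	r.sendx++
	if r.sendx == len(r.buf) {
		r.sendx = 0 // wrap around
	}
	r.qcount++
	return true
}

// pop reads from recvx; it reports false when the buffer is empty.
func (r *ring) pop() (int, bool) {
	if r.qcount == 0 {
		return 0, false
	}
	v := r.buf[r.recvx]
	r.recvx++
	if r.recvx == len(r.buf) {
		r.recvx = 0 // wrap around
	}
	r.qcount--
	return v, true
}

func main() {
	r := newRing(2)
	r.push(1)
	r.push(2)
	fmt.Println(r.push(3)) // false: the buffer is full
	v, _ := r.pop()
	fmt.Println(v) // 1
	r.push(3)      // reuses slot 0 after the wrap-around
	fmt.Println(r.sendx) // 1
}
```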

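3. There is neither a waiting receiver goroutine nor free space in buf. For a non-blocking send, the lock is released and the function returns false right away: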

if !block {
    unlock(&c.lock)
    return false
}
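In user code, a non-blocking send is normally written as a select with a default case; as far as I understand, that is the form that reaches chansend with block set to false. A minimal sketch, where trySend is a hypothetical helper name:

```go
package main

import "fmt"

// trySend attempts a send without blocking and reports whether it succeeded.
func trySend(ch chan<- int, v int) bool {
	select {
	case ch <- v:
		return true
	default:
		return false
	}
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(trySend(ch, 1)) // true: the buffer has room
	fmt.Println(trySend(ch, 2)) // false: buffer full and no receiver waiting
}
```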

4. Blocking send

func chansend(c *hchan, ep unsafe.Pointer, block bool, callerpc uintptr) bool {
    ...
    // Block on the channel. Some receiver will complete our operation for us.
	// get a pointer to the current goroutine
    gp := getg()
    // acquire a sudog to carry the value being sent and its state
	mysg := acquireSudog()
	mysg.releasetime = 0
	if t0 != 0 {
		mysg.releasetime = -1
	}
	// No stack splits between assigning elem and enqueuing mysg
	// on gp.waiting where copystack can find it.
	mysg.elem = ep
	mysg.waitlink = nil
	mysg.g = gp
	mysg.isSelect = false
	mysg.c = c
	gp.waiting = mysg
	gp.param = nil
	c.sendq.enqueue(mysg)
	// Signal to anyone trying to shrink our stack that we're about
	// to park on a channel. The window between when this G's status
	// changes and when we set gp.activeStackChans is not safe for
	// stack shrinking.
	atomic.Store8(&gp.parkingOnChan, 1)
    // put the current goroutine into _Gwaiting and block on the channel
	gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanSend, traceEvGoBlockSend, 2)
	// Ensure the value being sent is kept alive until the
	// receiver copies it out. The sudog has a pointer to the
	// stack object, but sudogs aren't considered as roots of the
	// stack tracer.
    // in short: keep ep alive until the receiver has copied it out
	KeepAlive(ep)

	// someone woke us up.
	if mysg != gp.waiting {
		throw("G waiting list is corrupted")
	}
    // the goroutine has been woken up: leave the blocked state and finish the send
	gp.waiting = nil
	gp.activeStackChans = false
	closed := !mysg.success
	gp.param = nil
	if mysg.releasetime > 0 {
		blockevent(mysg.releasetime-t0, 2)
	}
	mysg.c = nil
    // release the sudog, returning it to the cache on pp.sudogcache
	releaseSudog(mysg)
	if closed {
		if c.closed == 0 {
			throw("chansend: spurious wakeup")
		}
		panic(plainError("send on closed channel"))
	}
	return true
}

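To summarize: for a blocking send, the current goroutine is wrapped in a sudog and put on the channel's send queue, and gopark puts the goroutine into the _Gwaiting state, so it is the sender (not the channel) that blocks. When a receiver later wakes the goroutine, it resumes, clears its waiting state, and releases the sudog.

Receiving data from a channel

1. Receive statements:

// form 1
tmp := <- c

// form 2
tmp, ok := <- c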
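The second form matters once the channel can be closed: ok tells a real value apart from the zero value produced by a closed, drained channel. A short sketch (recvState is a hypothetical helper):

```go
package main

import "fmt"

// recvState receives once from c and returns the value together with ok.
func recvState(c <-chan int) (int, bool) {
	v, ok := <-c
	return v, ok
}

func main() {
	c := make(chan int, 1)
	c <- 7
	close(c)

	fmt.Println(recvState(c)) // 7 true: buffered values survive close
	fmt.Println(recvState(c)) // 0 false: closed and drained
}
```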

2. Runtime entry points for a receive:

// entry points for <- c from compiled code
//go:nosplit
func chanrecv1(c *hchan, elem unsafe.Pointer) {
	chanrecv(c, elem, true)
}

//go:nosplit
func chanrecv2(c *hchan, elem unsafe.Pointer) (received bool) {
	_, received = chanrecv(c, elem, true)
	return
}

As with chansend, there are several cases:

A sender goroutine is already waiting

1. The channel is unbuffered, or its buffer is full

func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
	...

	if sg := c.sendq.dequeue(); sg != nil {
		// Found a waiting sender. If buffer is size 0, receive value
		// directly from sender. Otherwise, receive from head of queue
		// and add sender's value to the tail of the queue (both map to
		// the same buffer slot because the queue is full).
		recv(c, sg, ep, func() { unlock(&c.lock) }, 3)
		return true, true
	}
    ...
}

Here it comes, the key point: the core logic is the call to recv:

func recv(c *hchan, sg *sudog, ep unsafe.Pointer, unlockf func(), skip int) {
	if c.dataqsiz == 0 {
		if raceenabled {
			racesync(c, sg)
		}
		if ep != nil {
			// copy data from sender
			recvDirect(c.elemtype, sg, ep)
		}
	} else {
		// Queue is full. Take the item at the
		// head of the queue. Make the sender enqueue
		// its item at the tail of the queue. Since the
		// queue is full, those are both the same slot.
		qp := chanbuf(c, c.recvx)
		if raceenabled {
			racenotify(c, c.recvx, nil)
			racenotify(c, c.recvx, sg)
		}
		// copy data from queue to receiver
		if ep != nil {
			typedmemmove(c.elemtype, ep, qp)
		}
		// copy data from sender to queue
		typedmemmove(c.elemtype, qp, sg.elem)
		c.recvx++
		if c.recvx == c.dataqsiz {
			c.recvx = 0
		}
		c.sendx = c.recvx // c.sendx = (c.sendx+1) % c.dataqsiz
	}
	sg.elem = nil
	gp := sg.g
	unlockf()
	gp.param = unsafe.Pointer(sg)
	sg.success = true
	if sg.releasetime != 0 {
		sg.releasetime = cputicks()
	}
	goready(gp, skip+1)
}

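The main logic breaks down into three parts:

a. If the buffer size is 0, the data is copied directly from the sender.

b. If the buffer is full, the receiver takes the value at the head of the buf queue, and the sender's value is copied to the tail of the queue (the very slot that was just vacated).

c. goready() moves the waiting sender goroutine (sg.g, taken from sendq) from _Gwaiting to _Grunnable, so it can be scheduled again.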
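Step b is what keeps values in FIFO order even when a sender is parked behind a full buffer. The sketch below exercises that path from user code; collect is a hypothetical helper, and with a single sender the order is deterministic.

```go
package main

import "fmt"

// collect sends 1..n through a channel with the given buffer size and
// returns the values in the order they were received.
func collect(n, size int) []int {
	c := make(chan int, size)
	go func() {
		for i := 1; i <= n; i++ {
			c <- i // blocks whenever the buffer is full
		}
		close(c)
	}()

	var got []int
	for v := range c {
		got = append(got, v)
	}
	return got
}

func main() {
	fmt.Println(collect(5, 2)) // [1 2 3 4 5]
}
```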

The buffer still holds data

// chanrecv receives on channel c and writes the received data to ep.
// ep may be nil, in which case received data is ignored.
// If block == false and no elements are available, returns (false, false).
// Otherwise, if c is closed, zeros *ep and returns (true, false).
// Otherwise, fills in *ep with an element and returns (true, true).
// A non-nil ep must point to the heap or the caller's stack.
func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
	...    
    if c.qcount > 0 {
		// Receive directly from queue
		qp := chanbuf(c, c.recvx)
		if raceenabled {
			racenotify(c, c.recvx, nil)
		}
		if ep != nil {
			typedmemmove(c.elemtype, ep, qp)
		}
		typedmemclr(c.elemtype, qp)
		c.recvx++
		if c.recvx == c.dataqsiz {
			c.recvx = 0
		}
		c.qcount--
		unlock(&c.lock)
		return true, true
	}

	if !block {
		unlock(&c.lock)
		return false, false
	}
    ...
}

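One block here deserves attention: when ep != nil, typedmemmove() is called. What is ep? The comment on chanrecv spells it out: chanrecv receives on channel c and writes the received data to ep; ep may be nil, in which case the received data is ignored, and a non-nil ep must point to the heap or the caller's stack. Put simply, when the destination address ep is non-nil, the value in the buffer slot is copied to it, and typedmemclr then clears that slot in the queue.

Blocking receive (no waiting sender goroutine and no data in the buffer)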

func chanrecv(c *hchan, ep unsafe.Pointer, block bool) (selected, received bool) {
	... 
    // no sender available: block on this channel.
	gp := getg()
	mysg := acquireSudog()
	mysg.releasetime = 0
	if t0 != 0 {
		mysg.releasetime = -1
	}
	// No stack splits between assigning elem and enqueuing mysg
	// on gp.waiting where copystack can find it.
	mysg.elem = ep
	mysg.waitlink = nil
	gp.waiting = mysg
	mysg.g = gp
	mysg.isSelect = false
	mysg.c = c
	gp.param = nil
	c.recvq.enqueue(mysg)
	// Signal to anyone trying to shrink our stack that we're about
	// to park on a channel. The window between when this G's status
	// changes and when we set gp.activeStackChans is not safe for
	// stack shrinking.
	atomic.Store8(&gp.parkingOnChan, 1)
	gopark(chanparkcommit, unsafe.Pointer(&c.lock), waitReasonChanReceive, traceEvGoBlockRecv, 2)

	// someone woke us up
	if mysg != gp.waiting {
		throw("G waiting list is corrupted")
	}
	gp.waiting = nil
	gp.activeStackChans = false
	if mysg.releasetime > 0 {
		blockevent(mysg.releasetime-t0, 2)
	}
	success := mysg.success
	gp.param = nil
	mysg.c = nil
	releaseSudog(mysg)
	return true, success
}

The logic mirrors chansend(); see the analysis of sending above.

Closing a channel

close(c)

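How to close a channel gracefully

Rules for closing a channel:

Do not close a channel from the receiving side, and do not close a channel that has multiple concurrent senders.

1. Use sync.Once to make sure the channel is closed only once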

type MyChannel struct {
    C    chan T
    once sync.Once
}

func NewMyChannel() *MyChannel {
    return &MyChannel{C: make(chan T)}
}

func (mc *MyChannel) SafeClose() {
    mc.once.Do(func() {
        close(mc.C)
    })
}

2. Use sync.Mutex (not recommended)

type MyChannel struct {
    C      chan T
    closed bool
    mutex  sync.Mutex
}

func NewMyChannel() *MyChannel {
    return &MyChannel{C: make(chan T)}
}

func (mc *MyChannel) SafeClose() {
    mc.mutex.Lock()
    defer mc.mutex.Unlock()
    if !mc.closed {
        close(mc.C)
        mc.closed = true
    }
}

func (mc *MyChannel) IsClosed() bool {
    mc.mutex.Lock()
    defer mc.mutex.Unlock()
    return mc.closed
}

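Note: the Go runtime does not guarantee that closing a channel and sending a value on it at the same time is free of data races. If SafeClose is called while another goroutine is sending on the same channel, a data race may occur, and a send that lands after the close will panic.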
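The snippets above leave T undefined. As a runnable illustration of the sync.Once approach, here is a sketch that assumes Go 1.18 or later for generics; SafeChan and NewSafeChan are hypothetical names, not part of any library.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeChan is a hypothetical generic variant of MyChannel.
type SafeChan[T any] struct {
	C    chan T
	once sync.Once
}

// NewSafeChan creates a SafeChan with the given buffer size.
func NewSafeChan[T any](size int) *SafeChan[T] {
	return &SafeChan[T]{C: make(chan T, size)}
}

// SafeClose closes C at most once, no matter how many goroutines call it.
func (sc *SafeChan[T]) SafeClose() {
	sc.once.Do(func() { close(sc.C) })
}

func main() {
	sc := NewSafeChan[int](0)

	var wg sync.WaitGroup
	for i := 0; i < 10; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sc.SafeClose() // a plain close(sc.C) would panic on the second call
		}()
	}
	wg.Wait()

	_, ok := <-sc.C
	fmt.Println(ok) // false: the channel is closed
}
```

This only protects against a double close. It does nothing for a goroutine that is still sending, so the closing rules above still apply.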

Common channel pitfalls in everyday use

Pitfall 1: sending to or receiving from a nil channel blocks forever;

Receiving

func TestChannelNilRecv(t *testing.T) {
	var data chan int

	<-data
}

Sending

func TestChannelNilSend(t *testing.T) {
	var data chan int

	data <- 1
}

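Why does this deadlock?

The buffer is not the reason. Because the channel was never initialized, the variable is nil and no hchan exists at all. In the part of chansend and chanrecv elided above, both functions check for c == nil before doing anything else; a blocking operation on a nil channel parks the current goroutine, and nothing can ever wake it. The sender or receiver therefore stays blocked forever, and once no other goroutine can run, the runtime reports a deadlock.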
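A non-blocking operation on a nil channel behaves differently: it simply never proceeds, so a select falls through to its default case. A small sketch, with tryRecv as a hypothetical helper:

```go
package main

import "fmt"

// tryRecv does a non-blocking receive and reports whether the receive
// case was selected.
func tryRecv(c <-chan int) (int, bool) {
	select {
	case v := <-c:
		return v, true
	default:
		return 0, false
	}
}

func main() {
	var data chan int // nil channel
	_, ok := tryRecv(data)
	fmt.Println(ok) // false: a nil-channel case can never proceed

	ready := make(chan int, 1)
	ready <- 1
	fmt.Println(tryRecv(ready)) // 1 true
}
```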

Pitfall 2: sending on a channel that has already been closed panics;

func TestClosedChannel(t *testing.T) {
	c := make(chan int, 1)
	close(c)

	c <- 1
}
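The panic can be observed without crashing the program by recovering it. This is a sketch for demonstration only, with sendRecovering as a hypothetical helper; in real code the closing rules are a better answer than recover.

```go
package main

import "fmt"

// sendRecovering sends v on c and converts a panic into an error.
func sendRecovering(c chan<- int, v int) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("send failed: %v", r)
		}
	}()
	c <- v
	return nil
}

func main() {
	c := make(chan int, 1)
	close(c)
	fmt.Println(sendRecovering(c, 1)) // send failed: send on closed channel
}
```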

Pitfall 3: deadlocks caused by channels

For unbuffered channels

// nothing is ever sent on the channel, so the receive blocks forever
func TestChannel1(t *testing.T) {
    c := make(chan int)
    <- c
}

// a send with no goroutine receiving blocks forever
func TestChannel(t *testing.T) {
	c := make(chan int)
	c <- 666
}

// Correct version: run the sender in its own goroutine and receive in the test.
func TestUnbufferedChannel(t *testing.T) {
	data := make(chan int)

	go func() {
		data <- 6
	}()
	fmt.Println(<-data) // 6
}

For buffered channels

// the channel holds no data, so the receive blocks
func TestBufferedChannel(t *testing.T) {
	c := make(chan int, 1)

	<-c
}

// the buffer (capacity 1) is already full, so the second send blocks
func TestBufferedChannelFull(t *testing.T) {
	c := make(chan int, 1)

	c <- 1

	c <- 2
}