Go coroutines and channels

Coroutines and Channels

First, let's review some concepts from operating systems.

A process is the basic unit of program execution and runs in its own independent memory address space.

A process consists of one or more threads. Threads exist so that several tasks can be carried out at the same time, making the most of CPU time and avoiding idle waiting. Threads of the same process share its memory address space.

This can be seen clearly in the Windows resource manager, as follows: each application is a process, and two threads are running at the same time under the Typora program.

[Screenshot: Windows resource manager showing the Typora process and its threads]

Concurrency builds on multithreading. The CPU's execution time is divided into many small slices, and multiple threads take turns running in them. Seen from above, they appear to execute at the same time, but on a single CPU they are still executed one after another.

Parallelism means that a program runs on multiple CPUs at the same instant; multi-core processors make parallelism possible.

Therefore, a concurrent program may also run in parallel.

Coroutine (goroutine)

Go natively supports concurrency, relying on the concepts of goroutine and channel.

The term goroutine was coined to distinguish it from concepts such as processes, threads, and coroutines.

Here, coroutine means a coroutine in the conventional sense, whereas goroutines exist only in Go.

Goroutines are lighter than threads and use very little memory and few resources.
Each goroutine starts with a small stack that grows or shrinks dynamically as needed. Stack management is automatic, and the space is released after the goroutine exits.

Goroutines can be spread across multiple threads or run within a single thread, and creating one is so cheap that 100,000 of them can exist in the same address space.
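To get a feel for how cheap goroutines are, here is a minimal sketch that starts 100,000 of them. The helper name spawn is made up for this illustration, and sync.WaitGroup and sync.Mutex are used only to wait for the goroutines and to protect the counter. Most of these goroutines finish almost immediately, so the sketch shows that creating them is cheap rather than proving that all of them are alive at the same moment.

```go
package main

import (
	"fmt"
	"sync"
)

// spawn starts n goroutines, each adding 1 to a shared counter,
// and returns the counter once all of them have finished.
// (spawn is a hypothetical helper, not part of the original article.)
func spawn(n int) int {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		counter int
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock()
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait() // block until every goroutine has called Done
	return counter
}

func main() {
	fmt.Println(spawn(100000)) // prints 100000
}
```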

The coroutine concept also exists in other languages (C#, Java, etc.). It differs from goroutines in two ways:

  • Goroutines imply parallelism (or can be deployed in parallel), while coroutines in general do not, so goroutines are more powerful than coroutines.
  • Goroutines communicate through channels, while coroutines communicate through yield and resume operations.

A goroutine can be described with a simple model:
it is a function that executes concurrently with other goroutines in the same address space.
A goroutine is created and started by putting the go keyword before a function or method call; when the function finishes, the goroutine exits quietly, and any return value is discarded.

// run list.Sort concurrently; do not wait for it to finish
go list.Sort()

The main() function that every Go program must contain can itself be regarded as a goroutine, even though it is not started with the go keyword. Goroutines can also run during program initialization (while init() functions are running).

A goroutine that simply runs and ends is not very useful on its own; to do meaningful work, goroutines need to cooperate through channels.

Channel

The hard part of concurrent programming is getting access to shared variables right. Doing this with mutexes is complicated, and Go encourages a different approach:

shared values are passed around on channels. Like a Unix pipe, a channel is used to send typed data. At any given time only one goroutine has access to a value in the channel, which accomplishes communication between goroutines and avoids the pitfalls of shared memory.

Communicating through channels guarantees synchronization, and at the same time ownership of the data is transferred.

This design philosophy ultimately boils down to one sentence:
Do not communicate by sharing memory; instead, share memory by communicating.

1 Declaration and initialization

The basic form of a channel declaration is shown below; an uninitialized channel has the value nil.

var identifier chan datatype

A channel can only carry values of one type, such as chan int or chan string. The element type can be any type, including the empty interface interface{} and even another channel.

Like a map, a channel is a reference type, so it is initialized with make. An optional second argument sets the buffer size, that is, the number of values the channel can hold. The default is 0, which means unbuffered. An unbuffered channel combines communication (the exchange of a value) with synchronization, guaranteeing that the two goroutines are in a known state.

var ci chan string
// unbuffered channel of strings
ci = make(chan string)
// unbuffered channel of integers
cj := make(chan int, 0)
// buffered channel of pointers to files
cs := make(chan *os.File, 100)
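The declarations above can be tried out in a small program. The sketch below is only an illustration: the helper bufferState is a made-up name, and it relies on the built-in len and cap functions, which report how many values a channel currently holds and how many it can hold.

```go
package main

import "fmt"

// bufferState creates a buffered channel, puts n values into it and
// reports its current length and its capacity.
// (Hypothetical helper; n must not exceed capacity, otherwise the
// send would block forever because nobody is receiving.)
func bufferState(capacity, n int) (int, int) {
	ch := make(chan int, capacity)
	for i := 0; i < n; i++ {
		ch <- i // does not block while the buffer has room
	}
	return len(ch), cap(ch)
}

func main() {
	var ci chan string     // declared but not initialized
	fmt.Println(ci == nil) // true

	unbuffered := make(chan int)
	fmt.Println(cap(unbuffered)) // 0

	l, c := bufferState(100, 3)
	fmt.Println(l, c) // 3 100
}
```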

2 Communication operator <-

The operator shows the transfer of data intuitively: information flows in the direction of the arrow.

Sending to a channel is written ch <- int1, which sends the variable int1 through the channel ch.

Receiving from a channel is written int2 = <-ch, which means that the variable int2 receives a value from the channel ch. If int2 has not been declared yet, int2 := <-ch can be used.

<-ch can also be used on its own to fetch the next value from the channel; the value is discarded unless it is used in an expression, which makes it suitable for checks such as:

if <-ch != 1000 {
	...
}

For readability, channel names usually start with ch or contain chan.
Sending and receiving on a channel are atomic operations: they always complete without interfering with each other.
The following example demonstrates the use of communication operators.

package main

import (
	"fmt"
	"time"
)

func main() {
	ch := make(chan string)

	go sendData(ch)
	go getData(ch)

	// pause the main goroutine for 1 second (1e9 nanoseconds)
	time.Sleep(1e9)
}

// sendData sends five strings into the channel.
func sendData(ch chan string) {
	ch <- "Washington"
	ch <- "Tripoli"
	ch <- "London"
	ch <- "Beijing"
	ch <- "Tokyo"
}

// getData receives strings from the channel and prints them.
func getData(ch chan string) {
	var input string
	// time.Sleep(2e9)
	for {
		input = <-ch
		fmt.Printf("%s ", input)
	}
}

// Output:
// Washington Tripoli London Beijing Tokyo

If two goroutines need to communicate, they must be given the same channel as a parameter.

In the example above, main() starts two goroutines:
sendData() sends 5 strings through the channel ch,
getData() receives them in order and prints them.

Some synchronization details:

1 main() waits 1 second for the two goroutines to complete. Without the wait (comment out time.Sleep(1e9)), main() would return immediately and the goroutines would have no chance to produce output.

2 getData() uses an infinite for loop. Once sendData() has finished sending and ch is empty, getData() blocks on the receive; it ends only when main() returns.

3 If we remove one or all of the go keywords, the program no longer works: the Go runtime aborts it with "fatal error: all goroutines are asleep - deadlock!". The runtime checks whether all goroutines are waiting for something (reading from or writing to a channel), which means deadlock: the program cannot proceed.

Do not rely on print statements to show the order of channel sends and receives: because of the delay between the actual channel operation and the print, the printed order can differ from the order in which things really happened.
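Waiting a fixed second in main() is fragile. One possible alternative, sketched below, is to let the receiver signal completion over a second channel. The names collect and done are invented for this sketch, and it relies on close and for-range over a channel, which the article covers in later sections.

```go
package main

import "fmt"

// collect sends the given words through a channel from one goroutine,
// gathers them in another, and waits for the receiver by blocking on
// a done channel instead of sleeping.
// (collect and done are hypothetical names used for illustration.)
func collect(words []string) []string {
	ch := make(chan string)
	done := make(chan struct{})
	var got []string

	go func() { // sender
		for _, w := range words {
			ch <- w
		}
		close(ch) // tell the receiver that nothing more will arrive
	}()

	go func() { // receiver
		for w := range ch {
			got = append(got, w)
		}
		close(done) // signal main that receiving has finished
	}()

	<-done // blocks until the receiver has finished
	return got
}

func main() {
	fmt.Println(collect([]string{"Washington", "Tripoli", "London"}))
	// prints [Washington Tripoli London]
}
```

Because main blocks on done, the program ends as soon as the work is finished, neither earlier nor later.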

3 Channel blocking

As mentioned earlier, communication is synchronous and unbuffered by default, so sending and receiving block until the other side is ready.

1 For a given channel, a send operation (in a goroutine or function) blocks until a receiver is ready:
as long as the value in ch has not been received, no other value can be passed into the channel.

2 For a given channel, a receive operation (in a goroutine or function) blocks until a sender is available.

The goroutine in the following example sends values to the channel in an infinite loop, but because main() receives only once, only the number 0 is output.

package main

import "fmt"

func main() {
	ch1 := make(chan int)
	go pump(ch1)       // pump hangs
	fmt.Println(<-ch1) // prints only 0
}

func pump(ch chan int) {
	for i := 0; ; i++ {
		ch <- i
	}
}

// Output: 0

Adding a second goroutine that keeps receiving from the channel lets the values be output continuously.

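package main

import (
	"fmt"
	"time"
)

func main() {
	ch1 := make(chan int)
	go pump(ch1)
	go suck(ch1)
	time.Sleep(1e9) // let both goroutines run for 1 second
}

// pump sends 0, 1, 2, ... into the channel forever.
func pump(ch chan int) {
	for i := 0; ; i++ {
		ch <- i
	}
}

// suck receives values from the channel forever and prints them.
func suck(ch chan int) {
	for {
		fmt.Println(<-ch)
	}
}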

4 Semaphore

In Go, a semaphore is usually used to limit the number of goroutines that access a resource concurrently.

A semaphore is a counter object that controls how many goroutines may access a shared resource at the same time.

A semaphore generally has two basic operations:

  • Acquire tries to obtain a permit; if no permits are currently available, it blocks until one becomes available.
  • Release returns a permit so that a blocked goroutine can acquire it.

In Go, a semaphore can be implemented with sync.Mutex and sync.Cond.

sync.Cond is a condition variable used for signaling between goroutines.
It is always associated with a lock, typically a sync.Mutex, to synchronize the goroutines involved.

When a goroutine needs to wait for a condition that another goroutine will establish, it calls Wait() to block until the condition variable is signaled.

When the condition is met, Signal() or Broadcast() wakes up waiting goroutines: Signal() wakes one of them, Broadcast() wakes all of them.

Here is sample code that implements a semaphore with sync.Mutex and sync.Cond:

package main

import (
	"fmt"
	"sync"
)

type Semaphore struct {
	// number of permits currently available
	count int
	// condition variable used to wait for and wake up goroutines
	cond *sync.Cond
}

// NewSemaphore creates a semaphore with the given number of permits.
func NewSemaphore(count int) *Semaphore {
	return &Semaphore{
		count: count,
		cond:  sync.NewCond(&sync.Mutex{}),
	}
}

// Acquire obtains a permit, blocking until one is available.
func (s *Semaphore) Acquire() {
	// take the lock associated with the condition variable
	s.cond.L.Lock()
	defer s.cond.L.Unlock()

	// while no permit is available, wait on the condition variable
	for s.count <= 0 {
		// Wait releases the lock while blocked and reacquires it on wake-up
		s.cond.Wait()
	}
	s.count-- // take a permit
}

// Release returns a permit.
func (s *Semaphore) Release() {
	s.cond.L.Lock()
	defer s.cond.L.Unlock()

	// give the permit back
	s.count++
	// wake up one goroutine waiting on the condition variable
	s.cond.Signal()
}

func main() {
	sem := NewSemaphore(3)
	// the WaitGroup coordinates the goroutines
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1) // increment the counter
		go func(id int) {
			sem.Acquire()
			defer sem.Release()
			fmt.Printf("Goroutine %d acquired semaphore\n", id)
			// do some work
			fmt.Printf("Goroutine %d released semaphore\n", id)
			wg.Done() // decrement the counter
		}(i)
	}

	wg.Wait() // wait for the counter to reach zero
	fmt.Println("All goroutines have finished")
}

root@debiancc:~/www/test# go run test.go
Goroutine 4 acquired semaphore
Goroutine 4 released semaphore
Goroutine 2 acquired semaphore
Goroutine 2 released semaphore
Goroutine 3 acquired semaphore
Goroutine 3 released semaphore
Goroutine 0 acquired semaphore
Goroutine 0 released semaphore
Goroutine 1 acquired semaphore
Goroutine 1 released semaphore
All goroutines have finished
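Since this article is about channels, it is worth noting that a buffered channel can play the same role: each slot in the buffer acts as one permit. The following is only a sketch of that alternative technique, not the implementation shown above. The helper maxConcurrent and its bookkeeping variables are invented here to check that no more than limit goroutines are inside the guarded section at once.

```go
package main

import (
	"fmt"
	"sync"
)

// maxConcurrent runs `workers` goroutines guarded by a buffered channel
// that serves as a counting semaphore with `limit` permits. It returns
// the highest number of goroutines seen inside the guarded section at
// the same time. (Hypothetical helper for illustration.)
func maxConcurrent(workers, limit int) int {
	sem := make(chan struct{}, limit) // one buffer slot per permit
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			sem <- struct{}{}        // acquire: blocks while all permits are taken
			defer func() { <-sem }() // release: frees one slot

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			// ... do some work ...

			mu.Lock()
			running--
			mu.Unlock()
		}()
	}
	wg.Wait()
	return peak
}

func main() {
	fmt.Println(maxConcurrent(5, 3) <= 3) // true
}
```

The send blocks when the buffer is full, which is exactly the Acquire behaviour described earlier, and receiving from the channel corresponds to Release.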

5 Channel factory

A common pattern is the channel factory: instead of passing the channel to the goroutine as a parameter, a function creates the channel, starts the goroutine that feeds it, and returns the channel.

package main

import (
	"fmt"
	"time"
)

func main() {
	stream := pump()
	go suck(stream)
	time.Sleep(1e9)
}

// pump creates a channel, starts a goroutine that feeds it, and returns it.
func pump() chan int {
	ch := make(chan int)
	go func() {
		for i := 0; ; i++ {
			ch <- i
		}
	}()
	return ch
}

func suck(ch chan int) {
	for {
		fmt.Println(<-ch)
	}
}

For one second, values continuously flow into and out of the channel.

6 Using a for loop on channels

The range clause of a for loop can be applied to a channel ch to receive values from it, like this:

for v := range ch {
	fmt.Printf("The value is %v\n", v)
}

The loop reads from the channel until it is closed, so it cannot stand alone: it must be coordinated with code that writes to the channel and closes it.

package main

import (
	"fmt"
	"time"
)

func main() {
	suck(pump())
	time.Sleep(1e9)
}

func pump() chan int {
	ch := make(chan int)
	go func() {
		for i := 0; ; i++ {
			ch <- i
		}
	}()
	return ch
}

func suck(ch chan int) {
	go func() {
		for v := range ch {
			fmt.Println(v)
		}
	}()
}

7 Closing a channel

Channels can be closed explicitly. Only the sender should close a channel, never the receiver.

ch := make(chan float64)
defer close(ch)

To test whether a channel has been closed, use the comma-ok form of the receive operation.

v, ok := <-ch   // ok is true if v was received from the channel, false if ch is closed and empty

A complete example:

package main

import "fmt"

func main() {
	ch := make(chan string)
	go sendData(ch)
	getData(ch)
}

func sendData(ch chan string) {
	ch <- "Washington"
	ch <- "Tripoli"
	ch <- "London"
	ch <- "Beijing"
	ch <- "Tokio"
	close(ch)
}

func getData(ch chan string) {
	for {
		input, open := <-ch
		if !open {
			break
		}
		fmt.Printf("%s ", input)
	}
}

root@debiancc:~/www/test# go run test.go
Washington Tripoli London Beijing Tokio

Reading the channel with for-range is better, though, because the loop automatically detects when the channel is closed.

for input := range ch {
	process(input)
}
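Putting closing, the comma-ok test and for-range together, a small runnable sketch might look like this. The function names produce and sum are made up for the example; the sending goroutine closes the channel, and the receiver never does.

```go
package main

import "fmt"

// produce returns a channel that delivers 0..n-1 and is then closed
// by the sending goroutine. (Hypothetical helper for illustration.)
func produce(n int) chan int {
	ch := make(chan int)
	go func() {
		defer close(ch) // the sender closes the channel
		for i := 0; i < n; i++ {
			ch <- i
		}
	}()
	return ch
}

// sum drains the channel with for-range; the loop ends when ch is closed.
func sum(ch chan int) int {
	total := 0
	for v := range ch {
		total += v
	}
	return total
}

func main() {
	fmt.Println(sum(produce(5))) // 0+1+2+3+4 = 10

	ch := produce(0) // closed without sending anything
	v, ok := <-ch
	fmt.Println(v, ok) // 0 false
}
```

A receive from a closed, empty channel returns immediately with the zero value and ok set to false, which is what the last two lines show.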

Select

Receiving values from several concurrently running goroutines can be done with the select keyword. It looks very much like a switch statement, and it behaves like an "are you ready?" polling mechanism.

select listens for values arriving on channels, and it can also wait until a value can be sent on a channel.

select {
case u := <-ch1:
	...
case v := <-ch2:
	...
	...
default: // no value ready to be received
	...
}

What select does is choose one of the listed communication cases to handle.

  • If all of them are blocked, it waits until one can proceed.
  • If more than one can proceed, it chooses one at random.
  • If none can proceed and there is a default clause, default is executed: default is always runnable (that is, ready to execute).

Using a send operation in a select together with default ensures that the send never blocks. A select without default implements a listening pattern, usually inside an (infinite) loop, which is left with a break statement when some condition is met.
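The non-blocking send mentioned above can be sketched in a few lines. The helper name trySend is invented for this illustration; it simply reports whether the send went through or the default clause was taken.

```go
package main

import "fmt"

// trySend attempts to send v on ch without blocking and reports
// whether the send succeeded. (Hypothetical helper for illustration.)
func trySend(ch chan int, v int) bool {
	select {
	case ch <- v:
		return true
	default: // no receiver is ready and the buffer has no room
		return false
	}
}

func main() {
	ch := make(chan int, 1)
	fmt.Println(trySend(ch, 1)) // true: the buffer has room
	fmt.Println(trySend(ch, 2)) // false: the buffer is full
}
```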

A complete example:

package main

import (
	"fmt"
	"time"
)

func main() {
	ch1 := make(chan int)
	ch2 := make(chan int)

	go pump1(ch1)
	go pump2(ch2)
	go suck(ch1, ch2)

	time.Sleep(1e9)
}

func pump1(ch chan int) {
	for i := 0; ; i++ {
		ch <- i * 2
	}
}

func pump2(ch chan int) {
	for i := 0; ; i++ {
		ch <- i + 5
	}
}

func suck(ch1, ch2 chan int) {
	for {
		select {
		case v := <-ch1:
			fmt.Printf("Received on channel 1: %d\n", v)
		case v := <-ch2:
			fmt.Printf("Received on channel 2: %d\n", v)
		}
	}
}

There are 2 channels, ch1 and ch2, and three goroutines: pump1(), pump2() and suck().

In infinite loops, pump1() and pump2() fill ch1 and ch2 with integers.

suck() also polls for input in an infinite loop: its select statement receives the integers from ch1 and ch2 and prints them.

Which case is selected depends on which channel has a value ready.

The program ends after 1 second, when main() returns.
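Instead of ending the program with a timer, the listening loop can be stopped explicitly. The sketch below is one possible variant under a few assumptions: the senders are finite, and the names receiveAll and quit are invented for it. A third channel tells the loop to stop, and a labeled break leaves the for loop.

```go
package main

import (
	"fmt"
	"sync"
)

// receiveAll starts two senders that each send k values, listens on
// both channels with select, and stops when the quit channel is closed.
// It returns the number of values received.
// (receiveAll and quit are hypothetical names used for illustration.)
func receiveAll(k int) int {
	ch1 := make(chan int)
	ch2 := make(chan int)
	quit := make(chan struct{})

	var wg sync.WaitGroup
	wg.Add(2)
	send := func(ch chan int) {
		defer wg.Done()
		for i := 0; i < k; i++ {
			ch <- i // unbuffered: returns only once the value was received
		}
	}
	go send(ch1)
	go send(ch2)
	go func() {
		wg.Wait()   // both senders have handed over all their values
		close(quit) // tell the receiver to stop
	}()

	received := 0
loop:
	for {
		select {
		case <-ch1:
			received++
		case <-ch2:
			received++
		case <-quit:
			break loop // a plain break would only leave the select
		}
	}
	return received
}

func main() {
	fmt.Println(receiveAll(3)) // prints 6
}
```

Because the channels are unbuffered, a sender only finishes after each of its values has been received, so by the time quit is closed nothing is left in flight.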

Example

1 Lazy generator

A generator is a function that returns the next value in a sequence each time it is called. For example:

generateInteger() => 0
generateInteger() => 1
generateInteger() => 2
....

Each call returns the next value in the sequence rather than the entire sequence.

This is known as lazy evaluation: a value is computed only when it is needed, which saves resources (memory and CPU).

For example, consider an infinite sequence of even numbers: generating the whole sequence up front and then using it one by one is impossible, because memory would overflow. A function built on a channel and a goroutine can meet this requirement easily.

The following example implements a generator using a channel of int.

The channels are named yield and resume, words that are often used in coroutine code.

package main

import (
	"fmt"
)

var resume chan int

func integers() chan int {
	yield := make(chan int)
	count := 0
	go func() {
		for {
			yield <- count
			count++
		}
	}()
	return yield
}

func generateInteger() int {
	return <-resume
}

func main() {
	resume = integers()
	fmt.Println(generateInteger()) //=> 0
	fmt.Println(generateInteger()) //=> 1
	fmt.Println(generateInteger()) //=> 2
}
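The infinite sequence of even numbers mentioned earlier can be handled the same way. This is a minimal sketch with invented helper names (evens and take); nothing is computed until a consumer receives from the channel.

```go
package main

import "fmt"

// evens returns a channel that lazily yields 0, 2, 4, ...
// (Hypothetical helper for illustration.)
func evens() chan int {
	yield := make(chan int)
	go func() {
		for n := 0; ; n += 2 {
			yield <- n // blocks until a consumer asks for the next value
		}
	}()
	return yield
}

// take receives the first n values from ch.
func take(ch chan int, n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, <-ch)
	}
	return out
}

func main() {
	fmt.Println(take(evens(), 5)) // [0 2 4 6 8]
}
```

One caveat of this simple form: the generating goroutine stays blocked on its next send for as long as the program runs, so a long-lived program would need some way to stop it.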

The difference between make and new

In Go, both make and new allocate memory, but they are used in different situations and behave differently.

1 make creates the reference types (slices, maps, and channels) and returns an initialized value of that type, not a pointer. These types refer to underlying data structures that must be initialized before use.

Example:

// create an int slice with length 5 and capacity 10
slice := make([]int, 5, 10)

// create a channel of strings
channel := make(chan string)

2 new allocates memory for a value of the given type (struct, int, float64, etc.) and returns a pointer to it.
The allocated memory is zeroed, that is, set to the zero value of the type.

Example:

var ptr *int

// allocate memory for an int and make ptr point to it
ptr = new(int)

In summary, the main difference between make and new is:
make can only be used to initialize reference types (slices, maps, channels) and returns an initialized value;
new allocates zeroed memory for a value of a type and returns a pointer to it.

Is slice := new([]int, 5, 10) valid?

No.

new takes a single argument and returns a pointer to zeroed memory, so it cannot accept a length and a capacity, and this line does not compile.

A slice is a reference type that must be initialized before use. new([]int) does compile, but it only yields a *[]int pointing to a nil slice; to create a slice with a length and capacity, use make.
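The difference is easy to observe in a short program. The helper names viaNew and viaMake below are made up for the comparison.

```go
package main

import "fmt"

// viaNew returns the slice that new([]int) points to.
// (Hypothetical helper for illustration.)
func viaNew() []int {
	p := new([]int) // p has type *[]int and points to a nil slice
	return *p
}

// viaMake returns an initialized slice of length 5 and capacity 10.
func viaMake() []int {
	return make([]int, 5, 10)
}

func main() {
	a := viaNew()
	fmt.Println(a == nil, len(a), cap(a)) // true 0 0

	b := viaMake()
	fmt.Println(b == nil, len(b), cap(b)) // false 5 10
}
```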

What is the difference between value types and reference types?

Value types and reference types are two different kinds of data types.

Their main differences are:

  • A variable of a value type directly stores the data itself.
  • A variable of a reference type stores a reference to the location where the data is stored.

More specifically, a variable of a value type holds the data itself rather than a reference to it.
This means that when a value-type variable is assigned to another variable, a new copy is created and the two variables do not affect each other. Value types include integers, floating-point numbers, booleans, characters, and more, as well as arrays and structs.

A variable of a reference type holds a reference to the underlying data.
When a reference-type variable is assigned to another variable, both refer to the same underlying data, and a modification made through one is visible through the other. Reference types include slices, maps, channels, interfaces, functions, and more.

Origin blog.csdn.net/weiguang102/article/details/130539435