Learning Go: the Go memory model [must read]

Introduction

The Go memory model specifies the conditions under which a read of a variable in one goroutine is guaranteed to observe the value written to that variable by another goroutine. [Translating this article took me three and a half hours.]

Happens Before

Within a single goroutine, reads and writes must behave as if they executed in the order specified by the program. That is, the compiler and processor may reorder the reads and writes executed within a single goroutine only when the reordering does not change the behavior within that goroutine as defined by the language specification.

Because of this reordering, the execution order observed by one goroutine may differ from the order perceived by another. For example, if one goroutine executes a = 1; b = 2;, another might observe the updated value of b before the updated value of a.

To specify the requirements of reads and writes, we define happens before, a partial order on the execution of memory operations in a Go program. If event e1 happens before event e2, then we say that e2 happens after e1. Also, if e1 does not happen before e2 and does not happen after e2, then we say that e1 and e2 happen concurrently.

In a single goroutine, the happens-before order is the order in the program.

A read r of a variable v is allowed to observe a write w to v if both of the following hold:

r does not happen before w.
There is no other write w' to v that happens after w but before r.

To guarantee that a read r of a variable v observes a particular write w to v, ensure that w is the only write r is allowed to observe. That is, r is guaranteed to observe w if both of the following hold:

w happens before r.
Any other write to the shared variable v either happens before w or after r.

This pair of conditions is stronger than the first pair: it requires that no other writes happen concurrently with w or r.

Within a single goroutine there is no concurrency, so the two definitions are equivalent: a read r observes the value written by the most recent write w to v. When multiple goroutines access a shared variable v, they must use synchronization events to establish happens-before conditions that ensure reads observe the desired writes.

In the memory model, the initialization of a variable v with the zero value for v's type behaves as a write.

Reads and writes of values larger than a single machine word [32 or 64 bits] behave as multiple machine-word-sized operations in an unspecified order.

Synchronization

Initialization

The program initialization operation runs in a single goroutine, but this goroutine may create other concurrently executing goroutines.

If a package p imports package q, the completion of q's init functions happens before the start of any of p's.

The start of the function main.main [that is, the main function] happens after all init functions have finished.
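As a minimal single-file sketch of the last rule, the program below sets a flag in init and reads it in main. The name ready is hypothetical and exists only for this illustration.

```go
package main

import "fmt"

// ready is a hypothetical package-level flag used only for this illustration.
var ready bool

// init runs to completion before main.main starts.
func init() {
	ready = true
}

func main() {
	// The write in init happens before main starts, so it is visible here.
	fmt.Println(ready) // prints "true"
}
```

No extra synchronization is needed because initialization runs in a single goroutine and finishes before main.main begins.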

Goroutine creation

The go statement that starts a new goroutine happens before the goroutine's execution begins.

For example, in this program:

var a string

func f() {
	print(a) // happens after the go statement
}

func hello() {
	a = "hello, world"
	go f() // happens before f starts running
}

Calling hello will print "hello, world" at some point in the future (perhaps after hello has returned). [The assignment a = "hello, world" happens before the go f() statement, and f starts running after the go f() statement, so f observes the assigned value of a.]

Goroutine destruction

The exit of a goroutine is not guaranteed to happen before any event in the program. For example, in this program:

var a string

func hello() {
	go func() { a = "hello" }()
	print(a)
}

The assignment to a is not followed by any synchronization event, so it is not guaranteed to be observed by any other goroutine. In fact, an aggressive compiler might delete the entire go statement.

If the effects of a goroutine must be observed by another goroutine, use a synchronization mechanism such as a lock or channel communication to establish a relative ordering.
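One possible way to repair the hello example above is to have the goroutine signal on a channel after the assignment. This is only a sketch: the function name helloSynced and the done channel are assumptions introduced for illustration, and a lock would work as well.

```go
package main

import "fmt"

// helloSynced is a hypothetical variant of hello that waits for the
// goroutine's assignment before reading a.
func helloSynced() string {
	var a string
	done := make(chan struct{})

	go func() {
		a = "hello"
		done <- struct{}{} // the send happens before the receive below completes
	}()

	<-done // once this receive completes, the write to a is visible
	return a
}

func main() {
	fmt.Println(helloSynced()) // prints "hello"
}
```

The send and the matching receive form the synchronization event that the original program lacks.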

Channel communication

Channel communication is the main method of synchronization between goroutines. Each send on a particular channel is matched to a corresponding receive from that channel, usually in a different goroutine.

A send on a channel happens before the corresponding receive from that channel completes.

This program:

var c = make(chan int, 10) // buffered channel
var a string

func f() {
	a = "hello, world"
	c <- 0 // send: happens before the receive completes
}

func main() {
	go f()
	<-c // receive: completes after the send
	print(a)
}

is guaranteed to print "hello, world". The write to a happens before the send on c, which happens before the corresponding receive on c completes, which happens before the print.

The closing of a channel happens before a receive that returns a zero value because the channel is closed.

In the previous example, replacing c <- 0 with close(c) yields a program with the same guaranteed behavior.
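The close variant might look like the sketch below. The helper name run and the use of local variables instead of package-level ones are assumptions made so the example can be called more than once.

```go
package main

import "fmt"

// run is a hypothetical helper: it starts a goroutine that writes a and then
// closes c, and returns what the receiver observes.
func run() (string, int, bool) {
	var a string
	c := make(chan int, 10)

	go func() {
		a = "hello, world"
		close(c) // the close happens before the receive that reports it
	}()

	v, ok := <-c // v is the zero value and ok is false because c is closed
	return a, v, ok
}

func main() {
	s, v, ok := run()
	fmt.Println(s, v, ok) // prints "hello, world 0 false"
}
```

Unlike a send, a close is observed by every receiver, which makes it a convenient way to signal several goroutines at once.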

A receive from an unbuffered channel happens before the send on that channel completes.

This program (like the one above, but using an unbuffered channel and with the send and receive statements swapped):

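var c = make(chan int) // unbuffered channel
var a string

func f() {
	a = "hello, world"
	<-c // receive: happens before the send completes
}

func main() {
	go f()
	c <- 0 // send: completes after the receive
	print(a)
}

is also guaranteed to print "hello, world". The write to a happens before the receive on c, which happens before the corresponding send on c completes, which happens before the print.

If the channel were buffered (for example, c = make(chan int, 1)), the program would not be guaranteed to print "hello, world". (It might print the empty string, crash, or do something else.)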

The kth receive on a channel with capacity C happens before the (k+C)th send on that channel completes.

This rule generalizes the previous rule to buffered channels. It allows a counting semaphore to be modeled by a buffered channel: the number of items in the channel corresponds to the number of active uses [the semaphore count], the capacity of the channel corresponds to the maximum number of simultaneous uses, sending an item acquires the semaphore, and receiving an item releases the semaphore. This is a common idiom for limiting concurrency.

The following program starts a goroutine for every entry in the work list, but the goroutines coordinate using the limit channel to ensure that at most three work functions are running at a time.

var limit = make(chan int, 3)

func main() {
	for _, w := range work {
		go func(w func()) {
			limit <- 1 // acquire the semaphore
			w()
			<-limit // release the semaphore
		}(w)
	}
	select {} // block forever so the workers can run
}
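The program above never returns on its own, because select{} blocks forever. A self-contained variant that waits for the workers to finish could look like the sketch below. The names runLimited and peakConcurrency are hypothetical, and the use of sync.WaitGroup is an assumption that goes beyond the original text.

```go
package main

import (
	"fmt"
	"sync"
)

// runLimited runs every function in work in its own goroutine, with at most
// max of them executing at once, and returns after all have finished.
func runLimited(work []func(), max int) {
	limit := make(chan struct{}, max)
	var wg sync.WaitGroup
	for _, w := range work {
		wg.Add(1)
		go func(w func()) {
			defer wg.Done()
			limit <- struct{}{}        // acquire the semaphore
			defer func() { <-limit }() // release the semaphore
			w()
		}(w)
	}
	wg.Wait()
}

// peakConcurrency runs n trivial work functions under runLimited and reports
// the highest number that were running at once and how many completed.
func peakConcurrency(n, max int) (peak, completed int) {
	var mu sync.Mutex
	running := 0
	work := make([]func(), n)
	for i := range work {
		work[i] = func() {
			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			mu.Lock()
			running--
			completed++
			mu.Unlock()
		}
	}
	runLimited(work, max)
	return peak, completed
}

func main() {
	peak, completed := peakConcurrency(10, 3)
	fmt.Println(peak <= 3, completed) // prints "true 10"
}
```

The observed peak depends on scheduling, but it can never exceed the channel capacity, since a work function only runs between a successful send and the matching receive.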

Locks

The sync package implements two lock data types, sync.Mutex and sync.RWMutex.

For any sync.Mutex or sync.RWMutex variable l and n < m, the nth call to l.Unlock() happens before the mth call to l.Lock() returns.

This program:

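var l sync.Mutex
var a string

func f() {
	a = "hello, world"
	l.Unlock() // first Unlock: happens before the second Lock returns
}

func main() {
	l.Lock()
	go f()
	l.Lock() // second Lock: returns after the first Unlock
	print(a)
}

is guaranteed to print "hello, world". The first call to l.Unlock() (in f) happens before the second call to l.Lock() (in main) returns, which happens before the print.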
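In everyday code the same rule is what makes a mutex-protected counter safe. The sketch below is an illustration under assumptions: the function name countTo is hypothetical, and sync.WaitGroup is used only to wait for the goroutines to finish.

```go
package main

import (
	"fmt"
	"sync"
)

// countTo starts n goroutines that each increment a shared counter while
// holding the mutex, then returns the final value.
func countTo(n int) int {
	var mu sync.Mutex
	var wg sync.WaitGroup
	counter := 0
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // returns only after the previous holder's Unlock
			counter++
			mu.Unlock()
		}()
	}
	wg.Wait()
	return counter
}

func main() {
	fmt.Println(countTo(100)) // prints "100"
}
```

Each Unlock happens before the next Lock returns, so every increment sees the result of the one before it.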

Once

The sync package provides a safe mechanism for initialization in the presence of multiple goroutines through the Once type. Multiple goroutines can execute once.Do(f) for a particular f, but only one will run f(), and the other calls block until f() has returned.

A single call of f() from once.Do(f) happens (returns) before any call of once.Do(f) returns.

In this program:

var a string
var once sync.Once

func setup() {
	a = "hello, world" // happens before either print
}

func doprint() {
	once.Do(setup)
	print(a) // happens after setup returns
}

func twoprint() {
	go doprint()
	go doprint()
}

Calling twoprint calls setup exactly once. The setup function completes before either call to print, so "hello, world" is printed twice.
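To check that behavior without relying on printed output, the sketch below counts the calls to setup and records what each goroutine observes. The function name runTwice, the calls counter, and the use of sync.WaitGroup are assumptions added for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// runTwice mirrors twoprint: two goroutines call once.Do(setup) and then
// read a. It returns how often setup ran and what each goroutine read.
func runTwice() (int, [2]string) {
	var (
		a     string
		calls int
		once  sync.Once
		wg    sync.WaitGroup
		seen  [2]string
	)
	setup := func() {
		calls++ // only ever executed by the single winning call
		a = "hello, world"
	}
	for i := range seen {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			once.Do(setup) // returns only after setup has returned
			seen[i] = a
		}(i)
	}
	wg.Wait()
	return calls, seen
}

func main() {
	calls, seen := runTwice()
	fmt.Println(calls, seen[0], seen[1]) // prints "1 hello, world hello, world"
}
```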

Incorrect synchronization

Note that a read r may observe the value written by a write w that happens concurrently with r. Even if this occurs, it does not imply that reads happening after r will observe writes that happened before w.

In this program:

var a, b int

func f() {
	a = 1
	b = 2
}

func g() {
	print(b)
	print(a)
}

func main() {
	go f()
	g()
}

it can happen that g prints 2 and then 0. [Printing 2 for b shows that the write to b was observed, yet printing 0 for a shows that the write to a, which precedes the write to b in f, was not. Observing b == 2 does not imply that a == 1.]

This fact invalidates a few common idioms.

Double-checked locking is an attempt to avoid the overhead of synchronization. For example, the twoprint program might be incorrectly written as:

var a string
var done bool
var once sync.Once

func setup() {
	a = "hello, world"
	done = true
}

func doprint() {
	if !done { // incorrect!
		once.Do(setup)
	}
	print(a)
}

func twoprint() {
	go doprint()
	go doprint()
}

but there is no guarantee that, in doprint, observing the write to done implies observing the write to a. This version can (incorrectly) print an empty string instead of "hello, world".

Another incorrect idiom is busy waiting for a value:

var a string
var done bool

func setup() {
	a = "hello, world"
	done = true
}

func main() {
	go setup()
	for !done { // incorrect!
	}
	print(a)
}

As before, there is no guarantee that, in main, observing the write to done implies observing the write to a, so this program could print an empty string too. Worse, there is no guarantee that the write to done will ever be observed by main, since there are no synchronization events between the two goroutines. The loop in main is not guaranteed to finish.

There are subtler variants on this theme, such as this program:

type T struct {
	msg string
}

var g *T

func setup() {
	t := new(T)
	t.msg = "hello, world"
	g = t
}

func main() {
	go setup()
	for g == nil { // incorrect!
	}
	print(g.msg)
}

Even if main observes g != nil and exits its loop, there is no guarantee that it will observe the initialized value of g.msg.

In all these examples, the solution is the same: use explicit synchronization.
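As one possible form of explicit synchronization, the two busy-waiting programs could be rewritten with channels, roughly as sketched below. The function names waitForFlag and waitForPointer are hypothetical, and channels are just one option; a sync.Mutex or sync.Once would serve as well.

```go
package main

import "fmt"

type T struct {
	msg string
}

// waitForFlag replaces the busy loop on done with a channel that is closed
// once a has been written.
func waitForFlag() string {
	var a string
	done := make(chan struct{})
	go func() {
		a = "hello, world"
		close(done) // the close happens before the receive below returns
	}()
	<-done
	return a
}

// waitForPointer hands the initialized *T over a channel instead of
// publishing it through a shared variable.
func waitForPointer() string {
	ch := make(chan *T)
	go func() {
		t := new(T)
		t.msg = "hello, world"
		ch <- t // the write to t.msg happens before this send
	}()
	g := <-ch // the receive completes after the send has started
	return g.msg
}

func main() {
	fmt.Println(waitForFlag())    // prints "hello, world"
	fmt.Println(waitForPointer()) // prints "hello, world"
}
```

In both functions the receive blocks instead of spinning, and the happens-before edge from the sender guarantees that the earlier write is visible.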