The essence of Go interviews: how is escape analysis performed?

In compiler theory, the technique of analyzing the dynamic scope of pointers is called escape analysis. In layman's terms, when a pointer to an object is referenced by more than one function or thread, we say that the pointer has escaped.

Escape analysis in Go is an optimization that the compiler applies after static analysis of the code in order to simplify memory management. It determines whether a variable is allocated on the heap or on the stack.

Anyone who has written C/C++ knows that the famous malloc function and new operator allocate a block of memory on the heap. The responsibility for using and releasing this memory lies with the programmer, and a moment of carelessness leads to a memory leak.

In Go, there is for the most part no need to worry about memory leaks. Go also has a new function, but memory obtained with new is not necessarily on the heap. For the programmer, the distinction between heap and stack is "blurred". All of this, of course, is handled behind the scenes by the Go compiler.

The most basic rule of escape analysis in Go is: if a function returns a reference to a variable, that variable escapes.

To put it simply, the compiler analyzes the code and the lifetime of each variable. A variable in Go is allocated on the stack only if the compiler can prove that it will not be referenced again after the function returns. In all other cases, it is allocated on the heap.

There is no keyword or function in the Go language that can directly cause variables to be allocated on the heap by the compiler. Instead, the compiler determines where to allocate variables by analyzing the code.

A variable whose address is taken may be allocated on the heap. However, if escape analysis finds that the variable is not referenced after the function returns, it is still allocated on the stack.

The compiler will decide whether to escape based on whether the variable is referenced externally:

  1. If there is no reference to it outside the function, it is preferentially placed on the stack;
  2. If there is a reference to it outside the function, it must be placed on the heap.

When writing C/C++ code, pass-by-value is often "upgraded" to pass-by-reference for the sake of efficiency, in an attempt to avoid running a constructor, and a pointer is returned directly.

You will remember that a big pitfall is hidden here: a local variable is defined inside a function, and the address (pointer) of that local variable is returned. Such local variables are allocated on the stack (automatic storage). Once the function has finished executing, the memory they occupied is released. Any use of the returned value (such as dereferencing it) will disrupt the program and may even cause it to crash outright. For example, the following code:

int *foo(void)
{
    int t = 3;
    return &t; /* t is destroyed when foo returns, so this pointer dangles */
}

Some readers may be aware of the pitfall above and use a smarter approach: construct the variable inside the function with new (dynamic memory allocation) and return its address. Because the variable is created on the heap, it is not destroyed when the function exits. But is this enough? When and where should the object created by new be deleted? The caller may forget to delete it, or may pass the return value straight on to another function so that it can no longer be deleted, and a memory leak occurs. On this pitfall, Item 21 of "Effective C++" is well worth reading.

C++ is widely regarded as the language with the most complex syntax; it is said that nobody fully masters it. Things are very different in Go: code like the C++ example above can be written in Go without any problem.

What causes problems in C/C++, as described above, is promoted in Go as a language feature. One language's poison really is another language's honey!

Dynamically allocated memory in C/C++ has to be released manually, which leaves us walking on thin ice when writing programs. This has its advantages: the programmer has complete control over memory. But it also has clear drawbacks: it is easy to forget to release memory, which leads to memory leaks. This is why many modern languages have added garbage collection.

Go's garbage collection makes the heap and the stack transparent to programmers. It truly frees their hands, letting them focus on business logic and write code "efficiently". The complex memory management is left to the compiler, and programmers can enjoy life.

The "clever trick" of escape analysis is to place each variable where it belongs. Even if you request memory with new, if the compiler finds that it is no longer used after the function exits, it puts the variable on the stack; after all, allocating on the stack is much faster than allocating on the heap. Conversely, even if something looks like an ordinary local variable, if escape analysis finds that it is still referenced elsewhere after the function exits, it is allocated on the heap.

If a variable is allocated on the heap, it cannot be cleaned up automatically the way stack memory is when a function returns. Heavy heap allocation makes Go run garbage collection more often, and garbage collection carries a relatively large overhead (while a collection cycle is running, the collector is designed to use roughly 25% of the available CPU).

Compared with the stack, the heap is suited to allocations whose size cannot be predicted. The price paid for this is slower allocation and memory fragmentation. Stack allocation is very fast: allocating and releasing essentially amount to adjusting the stack pointer when a function's frame is set up and torn down, which takes only a couple of CPU instructions. Heap allocation first has to find a memory block of suitable size, and the memory later has to be reclaimed by the garbage collector.

Through escape analysis, variables that do not need to be on the heap can be allocated directly on the stack. Fewer variables on the heap reduce the cost of heap allocation, relieve pressure on the GC, and make the program run faster.

Extension 1: How to check whether a variable has escaped?
There are two methods: use the go command to view the escape analysis results, or inspect the assembly generated from the source code.

For example, use this example:

package main

import "fmt"

func foo() *int {
	t := 3
	return &t
}

func main() {
	x := foo()
	fmt.Println(*x)
}

Use go command:

go build -gcflags '-m -l' main.go

The -l flag is added to prevent the foo function from being inlined. This gives the following output:

# command-line-arguments
src/main.go:7:9: &t escapes to heap
src/main.go:6:7: moved to heap: t
src/main.go:12:14: *x escapes to heap
src/main.go:12:13: main ... argument does not escape

The variable t in the foo function escaped, just as we expected. What is puzzling is why *x in the main function also escapes. The reason is that some functions take parameters of interface type, such as fmt.Println(a ...interface{}). At compile time it is hard to determine the concrete type of such arguments and what will be done with them, so they escape as well.

Inspecting the generated assembly:

go tool compile -S main.go

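Looking at an excerpt of the output, the instructions highlighted in the original figure indicate that memory for t was allocated on the heap, i.e. that an escape occurred. In practice, the telltale sign is typically a call to runtime.newobject in the code generated for foo.
[Figure: excerpt of the assembly output with the heap-allocation instructions highlighted]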

Extension 2: Have the variables in the following code escaped?
Example 1:

package main

type S struct {}

func main() {
  var x S
  _ = identity(x)
}

func identity(x S) S {
  return x
}

Analysis: Go passes function arguments by value. When a function is called, its arguments are copied onto the stack, so nothing escapes.

Example 2:

package main

type S struct {}

func main() {
  var x S
  y := &x
  _ = *identity(y)
}

func identity(z *S) *S {
  return z
}

Analysis: identity returns its input unchanged. It takes no reference to z, so z does not escape. The reference to x never leaves the scope of the main function, so x does not escape either.

Example 3:

package main

type S struct {}

func main() {
  var x S
  _ = *ref(x)
}

func ref(z S) *S {
  return &z
}

Analysis: z is a copy of x. The ref function takes the address of z, so z cannot be placed on the stack; otherwise, how could z be reached through that pointer outside ref? z must therefore escape to the heap. Although main discards the result of ref immediately, the Go compiler is not smart enough to take this into account (at least when inlining is disabled with -l, as in the command shown earlier). No reference to x is ever taken, so x does not escape.

Example 4: What if you assign a reference to a structure member?

package main

type S struct {
  M *int
}

func main() {
  var i int
  refStruct(i)
}

func refStruct(y int) (z S) {
  z.M = &y
  return z
}

Analysis: The refStruct function takes the address of its parameter y and stores it in the struct it returns, so y escapes.

Example 5:

package main

type S struct {
  M *int
}

func main() {
  var i int
  refStruct(&i)
}

func refStruct(y *int) (z S) {
  z.M = y
  return z
}

Analysis: The address of i is taken in the main function and passed to refStruct. That reference is only ever used within the scope of main, so i does not escape. The difference from the previous example is small, but the effect on the program is not: in Example 4, i is first allocated in main's stack frame, then copied into refStruct's stack frame as y, and y then escapes to the heap, where it is allocated once more, for a total of three allocations. In this example, i is allocated only once and then passed by reference.

Example 6:

package main

type S struct {
  M *int
}

func main() {
  var x S
  var i int
  ref(&i, &x)
}

func ref(y *int, z *S) {
  z.M = y
}

Analysis: In this example, i escapes, even though by the reasoning of Example 5 one might expect it not to. The difference between the two examples is that in Example 5 S appears in the return value, and an input can only "flow" into the output. Here S is an input parameter and the pointer is stored through it; the escape analysis cannot track such a store, conservatively gives up, and i has to escape to the heap.

Origin blog.csdn.net/zy_dreamer/article/details/132795412