# Golang escape analysis

A garbage-collected language makes writing programs much more convenient, but it also hides many low-level details, such as whether an object is allocated on the heap or on the stack. Most code does not need to care, but as programmers with a touch of obsessive-compulsive disorder who want the code we write to perform as well as possible, we still need to know what escaping is and how to tell whether it has occurred.

## What are the heap and the stack?

First, we need to be clear about which heap and stack we are talking about. These are not the "heap" and "stack" data structures, but the memory concepts of the operating system.

### Stack

In a program, each function has its own memory area that holds its local variables (those with a small memory footprint), the return address, and the return values. This area has a specific structure and addressing scheme, and its size is determined at compile time, so it is addressed very quickly with very little overhead. This memory is called the stack. The stack is per thread, and its size is fixed when the thread is created, so when the data is too large a "stack overflow" occurs.

### Heap

Global variables, local variables with a large memory footprint, and local variables that escape are stored on the heap. This memory has no specific structure and no fixed size, and it can be adjusted as needed. In short, the heap is where data goes when there is a lot of it to store. The heap is per process. Allocating a variable on the heap is relatively expensive: in a garbage-collected language it increases the pressure on the GC, and it also easily leads to memory fragmentation.

## Why are some variables allocated on the heap and others on the stack?

The question starts with C++. Suppose we have the following C++ code:

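```c++
int* f1() {
    int i = 5;   // local variable, lives in f1's stack frame
    return &i;   // returns the address of a local; it dangles once f1 returns
}

int main() {
    int* i = f1();
    *i = 6;      // undefined behavior: writes through a dangling pointer
    return 0;
}
```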


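The result of this program is unpredictable. Inside `f1`, `i` is a local variable allocated on the stack, and the stack frame becomes invalid once the function returns (in Plan 9 assembly terms, the SP pointer has been modified). The value stored at the address of `i` is therefore unpredictable, and when `main` later writes through the returned address it may overwrite data the running program depends on.

So when a function needs to return an address, C++ requires allocating a block of heap memory with `new`. The heap is per process, that is, global, and its memory is not reclaimed unless the programmer frees it manually (freeing it incorrectly causes a segmentation fault, and forgetting to free it causes a memory leak). This guarantees the address will not be reused for something else, so it can be returned safely.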

## How is escape analysis performed?

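In golang, all memory is managed by the runtime, so programmers do not need to care where a particular variable is allocated or when it is reclaimed. The compiler, however, does need to know, because only then can it determine the size of each function's stack frame and which variables have to be "new"ed on the heap. That is why the compiler performs `escape analysis`. Simply put, `escape analysis` decides whether a variable is allocated on the stack or on the heap.

The most basic rule of golang escape analysis is: `if a function returns the address of a (local) variable, that variable escapes`.

In golang, where a variable is allocated has nothing to do with whether `new` is used. This means programmers cannot manually force a variable onto the stack or onto the heap (if you hand-write asm, pretend I said nothing), so we need some way to find out where a given variable actually ended up.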
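To illustrate that `new` does not decide placement, here is a minimal sketch. The function names `sumWithNew` and `leak` are made up for this example, and the behavior described in the comments is what the standard gc toolchain is expected to do, not something the language specification guarantees.

```go
package main

import "fmt"

// sumWithNew uses new, but the pointer never leaves the function,
// so the compiler is free to keep the int on the stack.
//
//go:noinline
func sumWithNew(a, b int) int {
	p := new(int)
	*p = a + b
	return *p
}

// leak does not use new, but it returns the address of a local variable.
// The variable must outlive the call, so it is expected to move to the heap.
//
//go:noinline
func leak(a, b int) *int {
	s := a + b
	return &s
}

func main() {
	fmt.Println(sumWithNew(1, 2), *leak(3, 4))
}
```

Building this with `go build -gcflags '-m'` should report something like `new(int) does not escape` for `sumWithNew` and `moved to heap: s` for `leak`. The exact wording may differ between Go versions.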

We will use the following code as an example:

```go
package main

func main() {
    a := f1()
    *a++
}

//go:noinline
func f1() *int {
    i := 1
    return &i
}
```

In the code above, `f1` is marked with `//go:noinline` so that the Go compiler does not inline the function.

### Using compiler flags

golang provides a compiler flag that lets us see directly whether a variable escapes. Just pass `-gcflags '-m'` to `go build`:

```
$ go build -gcflags '-m' escape.go
# command-line-arguments
./escape.go:3:6: can inline main
./escape.go:11:9: &i escapes to heap
./escape.go:10:2: moved to heap: i
```

The output shows directly that `i`, declared on line 10 and returned by address on line 11, escapes and will be allocated on the heap.
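The same flag can also be used to check the earlier claim that local variables with a large memory footprint end up on the heap. The sketch below is only an illustration under assumptions: `smallBuf` and `largeBuf` are hypothetical names, and the size at which the gc compiler gives up on the stack is an implementation detail that may change between releases.

```go
package main

import "fmt"

// smallBuf creates a 64-byte slice that never leaves the function.
// It is small enough that the compiler can keep it on the stack.
//
//go:noinline
func smallBuf() int {
	buf := make([]byte, 64)
	buf[0] = 1
	return len(buf)
}

// largeBuf creates a 1 MiB slice that also never leaves the function.
// Its size alone is expected to push it onto the heap.
//
//go:noinline
func largeBuf() int {
	buf := make([]byte, 1<<20)
	buf[0] = 1
	return len(buf)
}

func main() {
	fmt.Println(smallBuf(), largeBuf())
}
```

With `-gcflags '-m'`, the compiler should say that `make([]byte, 64)` does not escape and that `make([]byte, 1 << 20)` escapes to heap, even though neither slice is returned.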

Besides the compiler flag, there is a more low-level, more hard-core, and more accurate way to tell whether an object escapes: read the assembly directly!

### Assembly

We generate the assembly code with `go tool compile -S`:

```
$ go tool compile -S escape.go | grep escape.go:10
    0x001d 00029 (escape.go:10) PCDATA  $2, $1
    0x001d 00029 (escape.go:10) PCDATA  $0, $0
    0x001d 00029 (escape.go:10) LEAQ    type.int(SB), AX
    0x0024 00036 (escape.go:10) PCDATA  $2, $0
    0x0024 00036 (escape.go:10) MOVQ    AX, (SP)
    0x0028 00040 (escape.go:10) CALL    runtime.newobject(SB)
    0x002d 00045 (escape.go:10) PCDATA  $2, $1
    0x002d 00045 (escape.go:10) MOVQ    8(SP), AX
    0x0032 00050 (escape.go:10) MOVQ    $1, (AX)
```

You can see that at offset 00040 there is a call to `runtime.newobject(SB)`. Once we see this call, the conclusion is clear: `i` is allocated on the heap.

## Summary

Either of the two methods above can be used to determine whether a variable escapes: the compiler flag is the simpler one, and reading the assembly is the more hard-core one. Once escapes have been identified with these methods, we can go on to reduce the amount of memory allocated on the heap and relieve the pressure on the GC.
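As one possible way to act on such findings, the sketch below compares a constructor that returns a pointer with one that returns a value. The `point` type and both function names are invented for this illustration, and `testing.AllocsPerRun` from the standard library is used here only as a convenient way to count heap allocations. Treat it as an assumption-laden example, not a general recipe.

```go
package main

import (
	"fmt"
	"testing"
)

type point struct{ x, y int }

// newPointPtr returns the address of a local, so the point escapes to the heap.
//
//go:noinline
func newPointPtr(x, y int) *point {
	p := point{x, y}
	return &p
}

// newPointVal returns the point by value, so nothing needs to escape.
//
//go:noinline
func newPointVal(x, y int) point {
	return point{x, y}
}

// sink keeps the results alive so the calls are not optimized away.
var sink int

func main() {
	// AllocsPerRun reports the average number of heap allocations per call.
	ptrAllocs := testing.AllocsPerRun(1000, func() { sink += newPointPtr(1, 2).x })
	valAllocs := testing.AllocsPerRun(1000, func() { sink += newPointVal(1, 2).x })
	fmt.Println(ptrAllocs, valAllocs)
}
```

On the gc toolchain this should report one allocation per call for the pointer version and none for the value version. Whether returning by value is actually faster depends on the size of the struct, so it is worth measuring before changing real code.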