The Evolution of Go's GC in One Article, in Full Detail

Recently I was discussing Go GC issues with friends from the Go Employment Training Camp, and I came across the material summarized by teacher Liu Danbing. It is so well written that I would like to share it with you.

Our discussions and thoughts have also been compiled into this article. We hope it inspires you.

Garbage collection (GC) is an automatic memory management mechanism provided by a programming language. It automatically frees memory objects that are no longer needed and returns the memory, with no manual work required from the programmer. Many modern programming languages support GC, and the performance of the collector is one of the points on which languages are compared.

Go's GC has gone through many changes. We start with the mark-and-sweep algorithm used before Go V1.3 and its disadvantages.

You can focus on the changes in the following versions:

  • The three-color concurrent marking method of Go V1.5
  • Why does Go V1.5's three-color marking require STW?
  • Why does Go V1.5's three-color marking need a barrier mechanism ("strong-weak" three-color invariant, insertion barrier, deletion barrier)?
  • The hybrid write barrier mechanism of Go V1.8
  • Full-scenario analysis of the Go V1.8 hybrid write barrier

1. Mark and sweep algorithm before Go V1.3

Let's first look at the ordinary mark-and-sweep algorithm that Go used before V1.3. It has two main phases:

  • Mark phase
  • Sweep phase

1 Specific steps of the mark-and-sweep algorithm

The first step is to pause the program's business logic and work out which objects are reachable and which are not.

The figure shows the reachability relationship between the program and its objects. The program can currently reach five objects: objects 1, 2, and 3 along one chain, and objects 4 and 7 along another.

The second step is to start marking. The collector finds all objects reachable from the program and marks them, as shown below:

Five objects are therefore marked: objects 1, 2, and 3, and objects 4 and 7.

The third step is to sweep the unmarked objects once marking is finished. The result is as follows.

The operation is very simple, but one point needs extra attention: the program must be paused while mark and sweep runs. This is STW (stop the world). During STW the CPU executes no user code and is used entirely for garbage collection. The pause has a large impact, so STW is the biggest problem of such collectors and the main target for optimization. Until the third step completes, the program stops all work and waits for collection to finish.

The fourth step is to end the pause and let the program continue running. The process then repeats until the program exits.

The above is the mark-and-sweep collection algorithm.
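The four steps above can be sketched in Go on a toy object graph. This is only an illustration, not the Go runtime's code: the `Object` type, the `markAndSweep` and `sweepDemo` names, and the unreachable objects 5 and 6 are assumptions made for the example.

```go
package main

import "fmt"

// Object is a toy heap object; Refs are its outgoing pointers.
type Object struct {
	ID     int
	Refs   []*Object
	Marked bool
}

// mark sets the mark bit on obj and on everything reachable from it.
func mark(obj *Object) {
	if obj == nil || obj.Marked {
		return
	}
	obj.Marked = true
	for _, ref := range obj.Refs {
		mark(ref)
	}
}

// markAndSweep marks from the roots, then keeps only the marked objects.
// A collector of this kind would keep the program paused (STW) for the whole call.
func markAndSweep(roots, heap []*Object) []*Object {
	for _, r := range roots {
		mark(r)
	}
	var live []*Object
	for _, o := range heap {
		if o.Marked {
			o.Marked = false // clear the bit for the next cycle
			live = append(live, o)
		}
	}
	return live
}

// sweepDemo builds the example graph and returns the IDs that survive.
func sweepDemo() []int {
	objs := make(map[int]*Object)
	var heap []*Object
	for id := 1; id <= 7; id++ {
		objs[id] = &Object{ID: id}
		heap = append(heap, objs[id])
	}
	objs[1].Refs = []*Object{objs[2]}
	objs[2].Refs = []*Object{objs[3]}
	objs[4].Refs = []*Object{objs[7]}
	objs[5].Refs = []*Object{objs[6]} // assumption: 5 and 6 are unreachable
	var ids []int
	for _, o := range markAndSweep([]*Object{objs[1], objs[4]}, heap) {
		ids = append(ids, o.ID)
	}
	return ids
}

func main() {
	fmt.Println(sweepDemo()) // prints [1 2 3 4 7]
}
```

Objects 5 and 6 reference each other but are not reachable from a root, so they are swept.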

2 Disadvantages of mark and sweep

The mark-and-sweep algorithm is clear and its process is straightforward, but it has serious problems:

  • STW (stop the world): the program pauses and appears to freeze (the most important problem);
  • Marking requires scanning the entire heap;
  • Sweeping produces heap fragmentation.

This is how GC was implemented before Go V1.3. The basic flow is to start STW, perform marking, perform sweeping, and finally end STW, as shown in the figure.

In the figure above, all of the GC time falls inside the STW window. The program is paused for too long, which hurts its running performance. Go V1.3 therefore made a simple optimization: it ends STW earlier to shorten the pause, as follows.

The change moves the end of STW one step earlier, to before the sweep. Sweeping does not need STW because the objects being swept are already unreachable, so the program cannot conflict with the collector by writing to them.

But no matter how it is optimized, Go V1.3 still faces one important problem: the mark-and-sweep algorithm pauses the entire program.

How does Go deal with this problem? Go V1.5 uses the three-color concurrent marking method to address it.

3. Three-color concurrent marking method of Go V1.5

Garbage collection in Go mainly uses the three-color marking method. The GC and the user goroutines can run concurrently, but a certain amount of STW (stop the world) is still required. The three-color marking method uses three marking states to decide which objects should be cleared. Let's look at the specific process.

Step 1: every newly created object is marked "white" by default, as shown in the figure.

In the figure above, the memory objects that our program can reach are shown on the left. The mark table on the right records the current color of each object. Note that the "program" here is really a set of root nodes. If we expand the "program", we get a representation like the following, as shown in the figure.

Step 2: each time GC starts, it traverses the objects directly referenced by the root nodes and moves them from the white set to the "gray" set, as shown in the figure.

Note that this traversal is a single, non-recursive pass: it visits only the first level of objects reachable from the program. In the figure above, the objects currently reachable are object 1 and object 4, so at the end of this round, object 1 and object 4 are marked gray and added to the gray mark table.

Step 3: traverse the gray set, move the objects referenced by each gray object from the white set to the gray set, and then move the gray object itself to the black set, as shown in the figure.

This traversal scans only gray objects. It turns the objects directly reachable from them from white to gray (here object 2 and object 7), while the previously gray objects 1 and 4 are marked black and moved from the gray mark table to the black mark table.

Step 4: repeat step 3 until no gray objects remain, as shown in the figure.

Once all reachable objects have been traversed, the gray mark table is empty and every object in memory is either black or white. The black objects are the ones our program logic can reach (and needs). They support the normal operation of the program, are legitimate and useful data, and must not be deleted. The white objects are all unreachable. The program logic no longer depends on them, so they are garbage and need to be cleared.

Step 5: reclaim all objects in the white mark table, that is, collect the garbage, as shown in the figure.

Here we reclaim all the white objects, and what remains are the black objects that the program depends on.
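The five steps above can be sketched as a small Go program. It is a simplified model under stated assumptions, not the runtime's implementation: `Object`, `triColorMark`, `sweepWhite`, and `triColorDemo` are hypothetical names, the gray set is a plain slice used as a queue, and objects 5 and 6 are assumed to be unreachable.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota // not reached yet; garbage if still white when marking ends
	Gray               // reached, but its references have not been scanned
	Black              // reached and fully scanned
)

type Object struct {
	ID    int
	Refs  []*Object
	Color Color
}

// triColorMark grays the roots (step 2), then scans gray objects until
// none remain (steps 3 and 4).
func triColorMark(roots []*Object) {
	var gray []*Object
	shade := func(o *Object) {
		if o != nil && o.Color == White {
			o.Color = Gray
			gray = append(gray, o)
		}
	}
	for _, r := range roots {
		shade(r)
	}
	for len(gray) > 0 {
		obj := gray[0]
		gray = gray[1:]
		for _, ref := range obj.Refs {
			shade(ref)
		}
		obj.Color = Black
	}
}

// sweepWhite models step 5: it returns the IDs of the black (surviving) objects.
func sweepWhite(heap []*Object) []int {
	var ids []int
	for _, o := range heap {
		if o.Color == Black {
			ids = append(ids, o.ID)
		}
	}
	return ids
}

func triColorDemo() []int {
	heap := make([]*Object, 7)
	for i := range heap {
		heap[i] = &Object{ID: i + 1} // step 1: new objects start white
	}
	obj := func(id int) *Object { return heap[id-1] }
	obj(1).Refs = []*Object{obj(2)}
	obj(2).Refs = []*Object{obj(3)}
	obj(4).Refs = []*Object{obj(7)}
	obj(5).Refs = []*Object{obj(6)} // assumption: 5 and 6 are unreachable
	triColorMark([]*Object{obj(1), obj(4)})
	return sweepWhite(heap)
}

func main() {
	fmt.Println(triColorDemo()) // prints [1 2 3 4 7]
}
```

The queue processes objects one at a time rather than level by level, but the final black and white sets are the same.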

The above is the three-color concurrent marking method, and the role of the three colors is easy to see. However, many concurrent flows of execution may be running while the scan takes place, and the memory they use may be interdependent. To keep the data safe during GC, we start STW before three-color marking begins and end it only after the scan has settled which objects are black and which are white. The performance of such a GC scan is obviously too low.

So how does Go solve the stutter (STW, stop the world) problem of the mark-and-sweep algorithm?

4. Three-color marking method without STW

Let's start with a thought experiment. If there were no STW, there would be no such performance problem. So what happens if the three-color marking method runs without STW?

The three-color marking method described above has to rely on STW. If the program is not paused, its logic can change the reference relationships between objects, and a change made during the marking phase can affect the correctness of the marking result. Let's look at a scenario to see what happens when the three-color marking method does not use STW during marking.

Assume the initial state is the one after the first round of scanning: object 1 and object 4 are black, object 2 and object 7 are gray, the others are white, and object 2 points to object 3 through pointer p, as shown in the figure.

If three-color marking runs without STW, any object can be read or written while the GC is scanning. As shown in the figure, before object 2 has been scanned, the already black object 4 creates a pointer q that points to the white object 3.

At the same time, the gray object 2 removes pointer p. The white object 3 now hangs under object 4, which is black and has already been scanned, as shown in the figure.

We then continue with the normal three-color marking logic and turn all gray objects black, so object 2 and object 7 are marked black, as shown in the figure.

Then the last step of three-color marking runs, and all white objects are collected as garbage, as shown in the figure.

In the end, object 3, which object 4 legitimately references, has been "accidentally" collected by the GC.

So there are two situations that the three-color marking method does not want to see:

  • Condition 1: a white object is referenced by a black object (white hangs under black).
  • Condition 2: the reachability path from a gray object to that same white object is broken (gray loses the white object).

If both conditions hold at the same time, the object is lost!

Moreover, in the scenario shown in the figure, if white object 3 has many downstream objects, they are all cleaned up along with it.
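The lost-object scenario can be reproduced with a few lines of Go. This sketch assumes a simplified model in which the program's two pointer writes happen between two marking steps. The `Object` type and the `drain` and `lostObjectDemo` functions are hypothetical names for this example only.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota
	Gray
	Black
)

type Object struct {
	ID    int
	Refs  []*Object
	Color Color
}

// drain finishes marking: scan each gray object, then blacken it.
func drain(gray []*Object) {
	for len(gray) > 0 {
		obj := gray[0]
		gray = gray[1:]
		for _, ref := range obj.Refs {
			if ref.Color == White {
				ref.Color = Gray
				gray = append(gray, ref)
			}
		}
		obj.Color = Black
	}
}

// lostObjectDemo returns the final color of object 3.
func lostObjectDemo() Color {
	// State after the first round: 1 and 4 are black, 2 and 7 are gray, 3 is white.
	o1 := &Object{ID: 1, Color: Black}
	o2 := &Object{ID: 2, Color: Gray}
	o3 := &Object{ID: 3, Color: White}
	o4 := &Object{ID: 4, Color: Black}
	o7 := &Object{ID: 7, Color: Gray}
	o1.Refs = []*Object{o2}
	o2.Refs = []*Object{o3} // pointer p
	o4.Refs = []*Object{o7}

	// The program keeps running, with no STW and no barrier:
	o4.Refs = append(o4.Refs, o3) // condition 1: black object 4 -> white object 3 (pointer q)
	o2.Refs = nil                 // condition 2: gray object 2 drops pointer p

	drain([]*Object{o2, o7})
	return o3.Color
}

func main() {
	if lostObjectDemo() == White {
		fmt.Println("object 3 is still white and would be collected, although object 4 references it")
	}
}
```

Object 4 is never scanned again because it is already black, so nothing ever grays object 3.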

The simplest way to prevent this is STW, which directly forbids user code from interfering with the object references. However, STW clearly wastes resources and has a large impact on every user program. Is it possible to improve GC efficiency and reduce STW time while still guaranteeing that no object is lost? The answer is yes: we only need a mechanism that breaks one of the two necessary conditions above.

5. Barrier mechanism

The collector can guarantee that no object is lost as long as one of the following two invariants holds: the "strong three-color invariant" or the "weak three-color invariant".

(1) "Strong-weak" three-color invariant

  • Strong three-color invariant

There is no pointer from a black object to a white object.

The strong three-color invariant simply forbids black objects from referencing white objects, so a white object can never be deleted by mistake.

  • Weak three-color invariant

Every white object referenced by a black object is in a gray-protected state.

The weak three-color invariant allows a black object to reference a white object, provided the white object is also referenced by a gray object or can be reached from a gray object further upstream. The white object referenced by the black object would otherwise be in danger of deletion, but the reference chain from the upstream gray object protects it.
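To make the two invariants concrete, here is a hedged Go sketch that checks them on a small graph. The checker functions `strongInvariantHolds` and `weakInvariantHolds` and the `checkInvariants` helper are hypothetical and exist only for this illustration. "Gray-protected" is interpreted as "reachable from a gray object through white objects only".

```go
package main

import "fmt"

type Color int

const (
	White Color = iota
	Gray
	Black
)

type Object struct {
	ID    int
	Refs  []*Object
	Color Color
}

// strongInvariantHolds reports whether no black object points to a white object.
func strongInvariantHolds(heap []*Object) bool {
	for _, o := range heap {
		if o.Color != Black {
			continue
		}
		for _, ref := range o.Refs {
			if ref.Color == White {
				return false
			}
		}
	}
	return true
}

// weakInvariantHolds reports whether every white object referenced by a black
// object can still be reached from some gray object through white objects.
func weakInvariantHolds(heap []*Object) bool {
	protected := make(map[*Object]bool)
	var visit func(o *Object)
	visit = func(o *Object) {
		for _, ref := range o.Refs {
			if ref.Color == White && !protected[ref] {
				protected[ref] = true
				visit(ref)
			}
		}
	}
	for _, o := range heap {
		if o.Color == Gray {
			visit(o)
		}
	}
	for _, o := range heap {
		if o.Color != Black {
			continue
		}
		for _, ref := range o.Refs {
			if ref.Color == White && !protected[ref] {
				return false
			}
		}
	}
	return true
}

// checkInvariants builds black 4 -> white 3, plus gray 2 -> white 3 when
// grayKeepsPointer is true, and evaluates both invariants.
func checkInvariants(grayKeepsPointer bool) (strong, weak bool) {
	o2 := &Object{ID: 2, Color: Gray}
	o3 := &Object{ID: 3, Color: White}
	o4 := &Object{ID: 4, Color: Black, Refs: []*Object{o3}}
	if grayKeepsPointer {
		o2.Refs = []*Object{o3}
	}
	heap := []*Object{o2, o3, o4}
	return strongInvariantHolds(heap), weakInvariantHolds(heap)
}

func main() {
	fmt.Println(checkInvariants(true))  // prints false true
	fmt.Println(checkInvariants(false)) // prints false false
}
```

While gray object 2 still points to object 3, only the strong invariant is violated and object 3 is safe. Once that pointer is gone, both invariants fail and object 3 can be lost.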

To maintain these two invariants, GC algorithms developed two kinds of barrier: the "insertion barrier" and the "deletion barrier".

(2) Insertion barrier

Operation: when object A references object B, object B is marked gray. (When B is hung downstream of A, B must be marked gray.)

Satisfies: the strong three-color invariant. (A black object can no longer reference a white object, because the white object is forced to turn gray.)

The pseudocode is as follows:

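AddDownstream(slot, ptr) {   // slot: current downstream object, ptr: new downstream object
  // 1: mark the new downstream object gray
  MarkGray(ptr)

  // 2: store the new pointer in the slot
  slot = ptr
}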

Scenarios:

A.AddDownstream(nil, B)   // A had no downstream object; B is added as a new downstream object and marked gray
A.AddDownstream(C, B)     // A replaces downstream object C with B; B is marked gray

This pseudocode is the write barrier. The memory slot of a black object can live in one of two places: the stack or the heap. Stack space is small but must respond quickly, because function calls use it constantly, so the insertion barrier is not applied to writes in stack space. It is applied only to writes in heap space.
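The insertion barrier described above can be sketched in Go as follows. This is a simplified model, not the runtime's barrier: the `Collector` type, `writeHeapPointer`, and `insertionBarrierDemo` are hypothetical names, and each object has a single downstream slot `Ref` to keep the example short.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota
	Gray
	Black
)

// Object has one downstream slot, Ref, to keep the sketch short.
type Object struct {
	ID    int
	Ref   *Object
	Color Color
}

// Collector holds the gray work list of one marking cycle.
type Collector struct{ gray []*Object }

// shade turns a white object gray and queues it for scanning.
func (c *Collector) shade(o *Object) {
	if o != nil && o.Color == White {
		o.Color = Gray
		c.gray = append(c.gray, o)
	}
}

// writeHeapPointer is the insertion barrier: shade the new target, then store it.
func (c *Collector) writeHeapPointer(slot **Object, ptr *Object) {
	c.shade(ptr)
	*slot = ptr
}

// drain scans gray objects until none remain.
func (c *Collector) drain() {
	for len(c.gray) > 0 {
		obj := c.gray[0]
		c.gray = c.gray[1:]
		c.shade(obj.Ref)
		obj.Color = Black
	}
}

// insertionBarrierDemo replays the lost-object scenario on the heap and
// returns the final color of object 3.
func insertionBarrierDemo() Color {
	c := &Collector{}
	o2 := &Object{ID: 2, Color: Gray}
	o3 := &Object{ID: 3} // white
	o4 := &Object{ID: 4, Color: Black}
	o2.Ref = o3
	c.gray = []*Object{o2}

	c.writeHeapPointer(&o4.Ref, o3)  // black 4 -> 3: the barrier shades object 3
	c.writeHeapPointer(&o2.Ref, nil) // gray 2 drops its pointer to 3

	c.drain()
	return o3.Color
}

func main() {
	fmt.Println(insertionBarrierDemo() == Black) // prints true
}
```

Because object 3 is grayed at the moment it is stored into black object 4, a black object never points to a white one, and object 3 survives.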

Next, we use several pictures to simulate the entire detailed process, hoping that you can see the overall process more clearly.







But because the barrier is not applied to the stack, the stack may still reference white objects after the whole three-color marking scan has finished (such as object 9 in the figure above). The stack must therefore be scanned again with three-color marking. To make sure no object is lost this time, the rescan runs under an STW pause that lasts until marking of the stack space is complete.

Finally, all white nodes that remain in the stack and heap after scanning are cleared. This STW takes roughly 10 to 100 ms.
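The need for the final stack rescan can be illustrated with a short Go sketch. It assumes a simplified model in which a stack write is a plain store with no barrier, and the STW rescan whitens the stack objects and marks them again. `Collector`, `rescanStacks`, and `stackRescanDemo` are hypothetical names.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota
	Gray
	Black
)

type Object struct {
	ID    int
	Ref   *Object
	Color Color
}

type Collector struct{ gray []*Object }

func (c *Collector) shade(o *Object) {
	if o != nil && o.Color == White {
		o.Color = Gray
		c.gray = append(c.gray, o)
	}
}

func (c *Collector) drain() {
	for len(c.gray) > 0 {
		obj := c.gray[0]
		c.gray = c.gray[1:]
		c.shade(obj.Ref)
		obj.Color = Black
	}
}

// rescanStacks models the final STW phase: stack objects are whitened and
// marked again while the program is paused.
func (c *Collector) rescanStacks(stackObjs []*Object) {
	for _, o := range stackObjs {
		o.Color = White
	}
	for _, o := range stackObjs {
		c.shade(o)
	}
	c.drain()
}

// stackRescanDemo returns the color of object 9 before and after the rescan.
func stackRescanDemo() (before, after Color) {
	c := &Collector{}
	s1 := &Object{ID: 1, Color: Black} // stack object, already scanned
	o9 := &Object{ID: 9}               // new white object
	s1.Ref = o9                        // stack write: no barrier, object 9 stays white
	c.drain()                          // concurrent marking finishes with nothing left to scan
	before = o9.Color
	c.rescanStacks([]*Object{s1})
	return before, o9.Color
}

func main() {
	before, after := stackRescanDemo()
	fmt.Println(before == White, after == Black) // prints true true
}
```

Without the rescan, object 9 would be swept even though the stack still references it.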


(3) Deletion barrier

Operation: when an object's reference is deleted, the object is marked gray if it is currently gray or white.

Satisfies: the weak three-color invariant. (It keeps the path from gray objects to white objects from being broken.)

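Pseudocode:

AddDownstream(slot, ptr) {   // slot: current downstream object, ptr: new downstream object
  // 1: the object in slot is about to be removed, so mark it gray
  if (IsGray(slot) || IsWhite(slot)) {
    MarkGray(slot)
  }

  // 2: store the new pointer in the slot
  slot = ptr
}

Scenarios:

A.AddDownstream(B, nil)   // A deletes its reference to B; B is marked gray (if B was white)
A.AddDownstream(B, C)     // A replaces downstream object B with C; B is marked gray (if B was white)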

Next, we use several pictures to simulate the entire detailed process, hoping that you can see the overall process more clearly.

The collection precision of this method is low. Even if the last pointer to an object is deleted, the object still survives the current round and is only cleaned up in the next round of GC.
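Both the protection and the low precision of the deletion barrier show up in a small Go sketch. As before, this is an assumed toy model: `Collector`, `writePointer`, and `deletionBarrierDemo` are hypothetical names, and objects 5 and 6 are invented for the second case.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota
	Gray
	Black
)

type Object struct {
	ID    int
	Ref   *Object
	Color Color
}

type Collector struct{ gray []*Object }

func (c *Collector) shade(o *Object) {
	if o != nil && o.Color == White {
		o.Color = Gray
		c.gray = append(c.gray, o)
	}
}

// writePointer is the deletion barrier: shade the old target, then store.
func (c *Collector) writePointer(slot **Object, ptr *Object) {
	c.shade(*slot)
	*slot = ptr
}

func (c *Collector) drain() {
	for len(c.gray) > 0 {
		obj := c.gray[0]
		c.gray = c.gray[1:]
		c.shade(obj.Ref)
		obj.Color = Black
	}
}

// deletionBarrierDemo returns the final colors of object 3 (moved under a
// black object) and object 6 (dropped completely).
func deletionBarrierDemo() (moved, dropped Color) {
	c := &Collector{}
	o2 := &Object{ID: 2, Color: Gray}
	o3 := &Object{ID: 3}
	o4 := &Object{ID: 4, Color: Black}
	o5 := &Object{ID: 5, Color: Gray} // assumption: extra objects for the second case
	o6 := &Object{ID: 6}
	o2.Ref, o5.Ref = o3, o6
	c.gray = []*Object{o2, o5}

	c.writePointer(&o4.Ref, o3)  // black 4 -> white 3; old value is nil, nothing shaded
	c.writePointer(&o2.Ref, nil) // gray 2 drops 3: the barrier shades object 3
	c.writePointer(&o5.Ref, nil) // object 6 loses its last reference but is shaded anyway

	c.drain()
	return o3.Color, o6.Color
}

func main() {
	moved, dropped := deletionBarrierDemo()
	fmt.Println(moved == Black, dropped == Black) // prints true true
}
```

Object 3 is correctly kept alive. Object 6 is garbage, yet it also ends this cycle black and can only be reclaimed in the next one.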

6. Hybrid write barrier mechanism of Go V1.8

Shortcomings of the insertion and deletion write barriers:

  • Insertion write barrier: at the end, STW is needed to rescan the stack and mark the white objects that the stack still references;
  • Deletion write barrier: collection precision is low. When GC starts, STW scans the stack to record an initial snapshot, which protects every object that was alive at that moment.

Go V1.8 introduced the hybrid write barrier, which avoids the stack rescan and greatly reduces STW time. It combines the advantages of both barriers.


(1) Hybrid write barrier rules

Operation:

1. When GC starts, scan all objects on the stack and mark every reachable object black (no second scan is performed afterwards, so no STW is required).

2. During GC, any new object created on the stack is black.

3. Deleted objects are marked gray.

4. Added objects are marked gray.

Satisfies: a variant of the weak three-color invariant.

Pseudocode:

AddDownstream(slot, ptr) {   // slot: current downstream object, ptr: new downstream object
  // 1: the current downstream object is being removed, so mark it gray
  MarkGray(slot)

  // 2: mark the new downstream object gray
  MarkGray(ptr)

  // 3: store the new pointer in the slot
  slot = ptr
}

Note that the barrier is not applied to the stack, in order to keep stack operations efficient.
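The hybrid write barrier pseudocode can be turned into a runnable Go sketch. It follows the simplified rules given in this article rather than the exact barrier in the Go runtime, and `Collector`, `writeHeapPointer`, and `hybridBarrierDemo` are hypothetical names. The demo moves a white heap object from a heap parent to a stack parent, with the stack write left unbarriered.

```go
package main

import "fmt"

type Color int

const (
	White Color = iota
	Gray
	Black
)

type Object struct {
	ID    int
	Ref   *Object
	Color Color
}

type Collector struct{ gray []*Object }

func (c *Collector) shade(o *Object) {
	if o != nil && o.Color == White {
		o.Color = Gray
		c.gray = append(c.gray, o)
	}
}

// writeHeapPointer is the hybrid barrier for heap slots: shade the old
// target (deletion part) and the new target (insertion part), then store.
func (c *Collector) writeHeapPointer(slot **Object, ptr *Object) {
	c.shade(*slot)
	c.shade(ptr)
	*slot = ptr
}

func (c *Collector) drain() {
	for len(c.gray) > 0 {
		obj := c.gray[0]
		c.gray = c.gray[1:]
		c.shade(obj.Ref)
		obj.Color = Black
	}
}

// hybridBarrierDemo returns the final color of heap object 7.
func hybridBarrierDemo() Color {
	c := &Collector{}
	s1 := &Object{ID: 1, Color: Black} // stack object, blackened when GC started
	h4 := &Object{ID: 4, Color: Gray}  // heap object, not scanned yet
	h7 := &Object{ID: 7}               // white heap object
	h4.Ref = h7
	c.gray = []*Object{h4}

	s1.Ref = h7                      // stack write: plain store, no barrier
	c.writeHeapPointer(&h4.Ref, nil) // heap write: the barrier shades the old target, object 7

	c.drain()
	return h7.Color
}

func main() {
	fmt.Println(hybridBarrierDemo() == Black) // prints true
}
```

The stack never needs a barrier or a rescan here: the deletion half of the barrier on the heap slot is what keeps object 7 alive.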

(2) Specific scenario analysis of hybrid write barrier

Next, we use several pictures to simulate the entire detailed process, hoping that you can see the overall process more clearly.

Note that the hybrid write barrier is a GC barrier mechanism, so it is triggered only while the program is performing GC.

When GC starts, it scans the stack area and marks all reachable objects black.


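Scenario 1: an object loses its reference from a heap object and becomes the downstream of a stack object

Pseudocode:

// Precondition: heapObject4->object7 = object7;  // object 7 is referenced by object 4
stackObject1->object7 = heapObject7;  // hang heap object 7 under stack object 1
heapObject4->object7 = null;          // object 4 deletes its reference to object 7

Scenario 2: an object loses its reference from one stack object and becomes the downstream of another stack object

Pseudocode:

new stackObject9;
object9->object3 = object3;      // hang object 3 under stack object 9
object2->object3 = null;         // object 2 deletes its reference to object 3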

Extension: a question we raised

As shown in the figure above: if object 9 references object 5 and there is no barrier on the stack, object 5 is still white at the end. Won't it be deleted by mistake?
The hybrid write barrier is used on the heap but not on the stack. If a black object on the stack references a white object without any write barrier, the white object would be collected at the end. This puzzled us.

After some research and a consultation with teacher Liu Danbing, we came to this conclusion:

This situation cannot happen. Object 9 cannot see object 5, which is unreachable. If object 5 were reachable, it would not be white.

White here means that the link has already been broken and the object can no longer be referenced. Otherwise it would not have been left white by the STW traversal.

Think again:

If object 2 deletes its reference to object 3 and no other object references object 3 again, will object 3 be collected in this round of GC?

The barrier is not applied to the stack, so object 3 is not collected in this round. It will only be marked white in the next scan and collected then.

Scenario 3: an object loses its reference from one heap object and becomes the downstream of another heap object

Pseudocode:

heapObject10->object7 = heapObject7;  // hang heap object 7 under heap object 10
heapObject4->object7 = null;          // object 4 deletes its reference to object 7

Scenario 4: an object loses its reference from a stack object and becomes the downstream of a heap object

Pseudocode:

heapObject10->object7 = heapObject7;  // hang heap object 7 under heap object 10
heapObject4->object7 = null;          // object 4 deletes its reference to object 7

The hybrid write barrier in Go satisfies the weak three-color invariant and combines the advantages of the deletion and insertion write barriers. It only needs to scan each goroutine's stack concurrently at the start and turn it black. This does not require STW, and because a stack stays black once it has been scanned, no rescan is needed after marking completes, which reduces STW time.

7. Summary

The above covers the mark-and-sweep logic and scenario walkthroughs of Go's GC.

Go V1.3: ordinary mark and sweep. The whole process runs under STW, which is extremely inefficient.

Go V1.5: three-color marking. The write barrier is enabled on the heap but not on the stack. After the full scan, the stack must be rescanned (which requires STW), so efficiency is average.

Go V1.8: three-color marking with the hybrid write barrier. The barrier is enabled on the heap but not on the stack. The whole process needs almost no STW and is highly efficient.
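To see how little time a current Go runtime spends in STW, one option is to read the counters exposed by the standard `runtime` package. The sketch below is only one possible way to observe this. The `gcPauseStats` helper and the allocation sizes are assumptions for the example, while `runtime.GC`, `runtime.ReadMemStats`, `MemStats.NumGC`, and `MemStats.PauseTotalNs` are part of the standard library.

```go
package main

import (
	"fmt"
	"runtime"
)

// sink keeps the allocations below from being optimized away.
var sink []byte

// gcPauseStats allocates some garbage, forces one collection, and returns
// the number of completed GC cycles and the cumulative STW pause time.
func gcPauseStats() (numGC uint32, totalPauseNs uint64) {
	for i := 0; i < 1024; i++ {
		sink = make([]byte, 64<<10) // each iteration turns the previous slice into garbage
	}
	runtime.GC() // blocks until a full collection has finished

	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.NumGC, m.PauseTotalNs
}

func main() {
	n, pause := gcPauseStats()
	fmt.Printf("completed GC cycles: %d, total STW pause: %d ns\n", n, pause)
}
```

The exact numbers depend on the machine and the Go version, so no output is shown here.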

Copyright statement

The content of this article is reproduced with permission from the author

Original link: https://www.yuque.com/aceld/golang/zhzanb

Study together

My articles are published first on my public account of the same name. You are welcome to follow it: Wang Zhongyang Go


Origin blog.csdn.net/w425772719/article/details/135283921