GC in golang

Garbage collection (GC) is an automatic memory management mechanism provided by a programming language. It releases objects that are no longer needed and frees their memory without manual work by the programmer.

Go's garbage collection mechanism

As of Go 1.14, the runtime uses the three-color marking method together with a hybrid write barrier. The GC can run concurrently with user goroutines, but a short STW (stop the world) pause is still required. During STW the CPU executes no user code; all of it is devoted to garbage collection.

GC version changes

Go V1.3: mark and sweep
Go V1.5: three-color marking method
Go V1.8: hybrid write barrier added

Mark and sweep in Go V1.3

Mark-and-sweep process

  1. Trigger GC and pause the program's business logic (STW), then find the reachable and unreachable objects.
    In the original figure, arrows indicate references: objects 1, 2, 3, 4, and 7 are reachable, while objects 5 and 6 are unreachable.
  2. Start marking: find all reachable objects and mark them (shown in red in the original figure).
  3. Finish marking and start sweeping the unmarked objects.
  4. Finish sweeping and end the pause. These four steps repeat in a loop until the process exits.
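
The four steps above can be sketched in a few lines of Go. This is only a toy model, not the runtime's implementation: the `object` type, the helper names, and the edges between objects 1-7 are assumptions chosen so that objects 5 and 6 end up unreachable, as in the description.

```go
package main

import "fmt"

// object is a hypothetical heap object; refs holds its outgoing pointers.
type object struct {
	id     int
	marked bool
	refs   []*object
}

// mark flags o and everything reachable from it.
func mark(o *object) {
	if o == nil || o.marked {
		return
	}
	o.marked = true
	for _, r := range o.refs {
		mark(r)
	}
}

// collect runs one stop-the-world cycle: mark from the roots, then sweep.
// It returns the survivors and clears their mark bits for the next cycle.
func collect(roots, heap []*object) (live []*object) {
	for _, r := range roots {
		mark(r)
	}
	for _, o := range heap {
		if o.marked {
			o.marked = false
			live = append(live, o)
		}
	}
	return live
}

// buildHeap creates objects 1..7. The edges are assumed; they only
// reproduce the outcome in the text (5 and 6 are unreachable).
func buildHeap() (roots, heap []*object) {
	objs := make([]*object, 8)
	for i := 1; i <= 7; i++ {
		objs[i] = &object{id: i}
	}
	objs[1].refs = []*object{objs[2]}
	objs[2].refs = []*object{objs[3]}
	objs[4].refs = []*object{objs[7]}
	objs[5].refs = []*object{objs[6]}
	return []*object{objs[1], objs[4]}, objs[1:]
}

// ids lists object IDs in the order given.
func ids(objs []*object) []int {
	out := make([]int, 0, len(objs))
	for _, o := range objs {
		out = append(out, o.id)
	}
	return out
}

func main() {
	roots, heap := buildHeap()
	fmt.Println(ids(collect(roots, heap))) // prints [1 2 3 4 7]
}
```

In this model nothing else may touch the object graph while `collect` runs, which is exactly the STW requirement.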


Question: What conditions trigger GC? (See "GC trigger conditions" at the end of this article.)

Disadvantages of mark and sweep

  • STW is required. The entire program must be suspended during GC, so the program freezes, which seriously hurts performance (this is the most important drawback).
  • Marking has to scan the entire heap and the stacks, which lengthens the STW pause and further hurts performance.
  • Sweeping produces heap fragmentation, that is, scattered non-contiguous free blocks, which makes the memory harder to reuse and compact later.

Optimizing mark and sweep

The main problem with the mark-and-sweep strategy is that the STW pause is too long, so the optimization direction is to shorten it.
Adjust the order of the GC steps: move the sweep out of the STW window and run it concurrently with the program's business logic. This shortens the STW time.

However, the effect is limited. Finding and marking the reachable objects involves the entire memory space and takes up the bulk of the time, so optimizing the GC should start with marking and try to reduce STW in the marking phase. Traditional mark and sweep, however, can hardly guarantee that no objects are lost without STW (this point comes up repeatedly below), so it was later replaced by three-color marking.

Three-color marking in Go V1.5

The three colors refer to:

  • White set
  • Gray set
  • Black set

The process of three-color marking

1. When marking starts, every object, including newly created ones, is "white" by default. In the example, objects 1-7 are all white.
2. Starting from the program (the root node), traverse the objects it references directly and move each one from the white set to the gray set. Now objects 1 and 4 are gray and the rest are white.
3. Traverse the gray set: move every white object referenced by a gray object into the gray set, then move the gray objects that were just traversed into the black set. Now 1 and 4 are black, 2 and 7 are gray, and 3, 5, and 6 are white.
4. Repeat step 3 until the gray set is empty.
5. Reclaim all objects that are still white. These are the garbage.
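
A minimal sketch of these five steps, assuming a toy `object` type and the same hypothetical edges as the example (1 and 4 are roots, 5 and 6 are unreachable). It processes the gray set one object at a time rather than in whole rounds, which ends with the same colors; the real collector is far more involved.

```go
package main

import "fmt"

type color int

const (
	white color = iota // not visited yet; garbage if still white at the end
	gray               // visited, but its references are not scanned yet
	black              // visited and fully scanned
)

type object struct {
	id    int
	color color
	refs  []*object
}

// markTriColor runs three-color marking with an explicit gray worklist.
func markTriColor(roots []*object) {
	var graySet []*object
	shade := func(o *object) { // white -> gray
		if o.color == white {
			o.color = gray
			graySet = append(graySet, o)
		}
	}
	for _, r := range roots { // step 2
		shade(r)
	}
	for len(graySet) > 0 { // steps 3 and 4
		o := graySet[0]
		graySet = graySet[1:]
		for _, r := range o.refs {
			shade(r)
		}
		o.color = black
	}
}

// whiteIDs returns the IDs of objects that are still white (step 5).
func whiteIDs(heap []*object) []int {
	var out []int
	for _, o := range heap {
		if o.color == white {
			out = append(out, o.id)
		}
	}
	return out
}

// buildHeap creates objects 1..7 with assumed edges.
func buildHeap() (roots, heap []*object) {
	objs := make([]*object, 8)
	for i := 1; i <= 7; i++ {
		objs[i] = &object{id: i}
	}
	objs[1].refs = []*object{objs[2]}
	objs[2].refs = []*object{objs[3]}
	objs[4].refs = []*object{objs[7]}
	objs[5].refs = []*object{objs[6]}
	return []*object{objs[1], objs[4]}, objs[1:]
}

func main() {
	roots, heap := buildHeap()
	markTriColor(roots)
	fmt.Println(whiteIDs(heap)) // prints [5 6]
}
```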

Discussion of three-color marking

First of all, three-color marking was introduced to make the GC more efficient, and the most important point is shortening the STW time.

Question: Three-color marking still has to scan the entire memory space. What is its advantage over the earlier mark and sweep?

Answer: The natural next step is to move the marking process out of STW. Doing this naively is risky, though: when the following two conditions hold at the same time, a referenced object is mistakenly reclaimed (this phenomenon is called a dangling pointer):

  1. A white object is referenced by a black object (the white object hangs under a black one)
  2. The reference from a gray object to that white object is removed (the gray object loses the white one at the same time)

Therefore, if marking runs without STW, these two conditions must be prevented from holding at the same time.
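
To make the risk concrete, here is a small single-threaded simulation, a sketch only, in which the "mutator" changes two pointers between two marker steps. The objects A, B, and C and the `scan` helper are hypothetical names used for illustration.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

// object has a single outgoing pointer to keep the sketch small.
type object struct {
	color color
	ref   *object
}

// scan is one marker step: gray the white referent, then blacken o.
func scan(o *object) {
	if o.ref != nil && o.ref.color == white {
		o.ref.color = gray
	}
	o.color = black
}

// lostObjectDemo interleaves marker and mutator steps with no barrier.
func lostObjectDemo() (a, c *object) {
	c = &object{}                       // white
	b := &object{color: gray, ref: c}   // gray, points at C
	a = &object{color: gray}            // gray, points at nothing yet

	scan(a)     // marker: A is scanned first and turns black
	a.ref = c   // mutator: condition 1, white C now hangs under black A
	b.ref = nil // mutator: condition 2, gray B drops its pointer to C
	scan(b)     // marker: B has nothing left to gray
	return a, c
}

func main() {
	a, c := lostObjectDemo()
	// C is still reachable through A, yet it stayed white and would be swept.
	fmt.Println(a.ref == c, c.color == white) // prints true true
}
```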

For this purpose, two rules on object references were proposed, the strong three-color invariant and the weak three-color invariant. Each breaks one of the two conditions above and so prevents referenced objects from being reclaimed by mistake.

Strong three-color invariant
Black objects are never allowed to reference white objects, which breaks condition 1.

Weak three-color invariant
A black object may reference a white object, but only if that white object is also referenced by a gray object, or a gray object exists upstream on some reference chain to it. This breaks condition 2.

In three-color marking, as long as either the strong or the weak three-color invariant holds, referenced objects are guaranteed not to be lost.

Summary:
Compared with traditional mark and sweep, three-color marking improves GC efficiency by moving the marking process out of STW. However, it loses referenced objects when the two conditions above hold at the same time, so the strong and weak three-color invariants were proposed to keep that from happening. The next question is how to enforce these invariants, and the answer is the barrier mechanism.

Barrier mechanism

Implementing the strong/weak three-color invariants

Barriers can be divided into two kinds:

  • Insert write barrier (a mechanism triggered when a reference to an object is added)
  • Delete write barrier (a mechanism triggered when a reference to an object is removed)

Insert write barrier

Specific operation: when object A references object B, B is marked gray (to hang B downstream of A, B must be marked gray)

Satisfies: the strong three-color invariant (a black object can never reference a white object, because the white object is forced to become gray)

The insertion barrier applies only to objects on the heap, not to objects on the stack.
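
Since the original pseudo-code figure is not available, the following is a rough sketch of an insertion barrier in the style usually attributed to Dijkstra. The `shade` and `writePointerInsert` names and the toy `object` type are assumptions for illustration, not runtime APIs.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

type object struct {
	color color
	ref   *object
}

// shade turns a white object gray; gray and black objects are left alone.
func shade(o *object) {
	if o != nil && o.color == white {
		o.color = gray
	}
}

// writePointerInsert is the insertion barrier: shade the new target
// first, then perform the pointer store into a heap slot.
func writePointerInsert(slot **object, ptr *object) {
	shade(ptr)
	*slot = ptr
}

func main() {
	a := &object{color: black} // already scanned
	b := &object{}             // white
	writePointerInsert(&a.ref, b)
	fmt.Println(b.color == gray) // prints true: black A never points at white B
}
```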

Objects live either on the heap or on the stack, and every barrier-guarded write costs performance. Enabling the write barrier on the stacks of hundreds of goroutines at runtime would add huge overhead. Stacks are also relatively small but must be fast, because function calls push and pop frames constantly. So the insertion barrier is not applied on the stack, only on the heap.

Shortcoming of the insert write barrier
Because the insertion barrier does not cover objects on the stack, after three-color marking has finished with the heap, STW must be started, all black objects on the stack reset to white, and the stacks scanned again. Because stacks are relatively small, this re-scan is short, roughly 10-100 ms.

Delete write barrier

Specific operation: when a reference to an object is deleted, the object is marked gray if it is white.
Satisfies: the weak three-color invariant (the path from a gray object to a white object is protected from being broken)
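
A comparable sketch for the deletion barrier, in the style usually attributed to Yuasa: the old target of the slot is shaded before it is overwritten. As before, `shade`, `writePointerDelete`, and the `object` type are hypothetical.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

type object struct {
	color color
	ref   *object
}

// shade turns a white object gray; gray and black objects are left alone.
func shade(o *object) {
	if o != nil && o.color == white {
		o.color = gray
	}
}

// writePointerDelete is the deletion barrier: shade the object whose
// reference is about to be removed, then perform the pointer store.
func writePointerDelete(slot **object, ptr *object) {
	shade(*slot)
	*slot = ptr
}

func main() {
	c := &object{}                    // white
	b := &object{color: gray, ref: c} // gray B is the only path to C
	writePointerDelete(&b.ref, nil)   // B drops C
	fmt.Println(c.color == gray)      // prints true: C survives this cycle
}
```

The last line also illustrates the cost discussed next: C is kept for this cycle even if nothing references it any more.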

Shortcoming of the delete write barrier:

Reclamation precision is low. Even after the last pointer to an object is deleted, the object can still survive the current GC cycle and is only reclaimed in the next one. This only affects how much memory is available in the meantime, so it can be ignored.

Summary

Version 1.5 uses three-color marking in the marking process. The collection cycle has four main stages, of which marking and sweeping run concurrently with the program, but before and after the marking phase a certain amount of STW time is still needed for GC preparation and the stack re-scan.

Go V1.8: three-color marking + hybrid write barrier

Three-color marking was introduced above. To make GC more efficient, the marking phase runs without STW, but objects are lost when the two conditions hold at the same time. To prevent this, the strong and weak three-color invariants were proposed, implemented by the insert write barrier and the delete write barrier respectively. Both have shortcomings. For performance reasons, the insert write barrier is not applied to objects on the stack, so STW is still needed to finish marking correctly. The delete write barrier has low reclamation precision (and it also still needs a short STW to record a snapshot before reclamation).

To address these shortcomings, Go V1.8 combines the advantages of the insert and delete write barriers in a hybrid write barrier mechanism.

Specific operation:

  1. At the start of GC, all objects are white by default. First scan all reachable objects on the stacks and mark them black (no second scan is needed afterward, so no STW is required)
  2. During GC, every new object created on the stack is marked black
  3. An object on the heap whose reference is deleted is marked gray if it is white
  4. An object on the heap to which a reference is added is marked gray if it is white

Note that the hybrid write barrier is a GC barrier mechanism, so it is only active while the program is performing GC.
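
The rules above can be approximated with the sketch below. It is a simplified model under stated assumptions: `collector`, `newObject`, and `writeHeapPointer` are hypothetical names, allocation is modeled without distinguishing stack from heap, and the real runtime barrier has more cases than this.

```go
package main

import "fmt"

type color int

const (
	white color = iota
	gray
	black
)

type object struct {
	color color
	ref   *object
}

// collector is a hypothetical stand-in for the runtime's GC state.
type collector struct {
	marking bool // true only while a GC cycle is in its mark phase
}

// shade turns a white object gray; gray and black objects are left alone.
func shade(o *object) {
	if o != nil && o.color == white {
		o.color = gray
	}
}

// newObject models rule 2 in simplified form: objects created while
// marking is in progress start out black.
func (c *collector) newObject() *object {
	if c.marking {
		return &object{color: black}
	}
	return &object{color: white}
}

// writeHeapPointer models rules 3 and 4 for a pointer slot on the heap.
// Outside the mark phase it is a plain store.
func (c *collector) writeHeapPointer(slot **object, ptr *object) {
	if c.marking {
		shade(*slot) // rule 3: the reference being deleted turns gray
		shade(ptr)   // rule 4: the reference being added turns gray
	}
	*slot = ptr
}

func main() {
	gc := &collector{marking: true}
	oldTarget := &object{} // white
	newTarget := &object{} // white
	holder := &object{color: black, ref: oldTarget}

	gc.writeHeapPointer(&holder.ref, newTarget)
	fmt.Println(oldTarget.color == gray, newTarget.color == gray,
		gc.newObject().color == black) // prints true true true
}
```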

Satisfies:
A variant of the weak three-color invariant. By combining the advantages of the insert and delete write barriers, the GC only needs to scan each goroutine's stack concurrently at the start, turning it black, and keep it black. This process does not require STW, and once marking ends no re-scan is needed because the stacks have stayed black since they were scanned, which reduces STW time.

Summary

V1.3 used traditional mark and sweep, which requires STW for the whole process and is extremely inefficient. The main way to improve GC efficiency was therefore to shorten STW in the marking phase, but traditional mark and sweep cannot guarantee that no data is lost without STW. V1.5 introduced three-color marking, combined with an insert or delete write barrier, which greatly shortens STW while guaranteeing that no objects are lost. V1.8 combines the advantages of the insert and delete write barriers in a hybrid write barrier, which guarantees that no data is lost while leaving almost no STW.

GC trigger conditions:

  1. Heap-growth trigger: with the default configuration, the Go runtime starts a new GC cycle when the heap reaches twice its size after the previous collection. This can be adjusted with the GOGC environment variable, whose default value is 100, meaning a 100% increase in heap memory triggers GC.
  2. Timed trigger: if no GC has been triggered for a certain period, a new cycle is forced. The condition is controlled by the runtime.forcegcperiod variable, which defaults to 2 minutes.
  3. Manual trigger: the user program can ask the runtime to collect by calling runtime.GC. The call blocks the caller until the current GC cycle completes, and the whole program may also be paused by STW during the collection.
  4. Allocation trigger: when the current thread's memory management unit has no free space, allocating an object smaller than 32KB may trigger GC, and allocating an object larger than 32KB always attempts to trigger it.

Reference article:
https://draveness.me/golang/docs/part3-runtime/ch07-memory/golang-garbage-collector/
https://www.kancloud.cn/aceld/golang/1958308

Origin blog.csdn.net/csdniter/article/details/110575573