A detailed explanation of how the G1 garbage collector works

1. Development background of G1 garbage collector:

1. Defects of CMS garbage collector:

The JVM team designed the G1 collector to replace the CMS collector, whose defects had become apparent in many scenarios:

(1) The CMS collector is very sensitive to CPU resources. Its concurrent phases do not pause user threads, but they do occupy CPU resources, which slows the application down and lowers total throughput. By default, CMS starts (number of CPUs + 3) / 4 collection threads.

(2) The CMS collector cannot handle floating garbage. User threads keep running during the concurrent sweep phase of CMS, so the program naturally keeps producing new garbage. This garbage appears after the marking process and is called "floating garbage". CMS cannot process it in the current collection, so it has to be cleaned up in the next GC.

(3) Because floating garbage is produced while a collection is in progress, the CMS collector cannot wait until the old generation is almost full, as other collectors do. It has to reserve part of the memory for the program to use during concurrent collection. Under the default settings of JDK5, the CMS collector is activated when 68% of the old generation is in use (JDK6 raised this to 92%). The trigger percentage can be raised with the parameter -XX:CMSInitiatingOccupancyFraction to reduce the number of collections and improve performance. If the memory reserved while CMS is running cannot meet the needs of the program's other threads, a "Concurrent Mode Failure" occurs. The virtual machine then falls back to a backup plan: it temporarily enables the Serial Old collector to collect the old generation again, which makes the pause very long. Therefore, setting -XX:CMSInitiatingOccupancyFraction too high easily leads to "Concurrent Mode Failure" and lowers performance instead.

(4) CMS is based on the "mark-sweep" algorithm, which produces a large number of discontinuous memory fragments. When the old generation is too fragmented to provide a contiguous block large enough for an object, a Full GC has to be triggered early. To address this, the CMS collector provides the switch -XX:+UseCMSCompactAtFullCollection, which adds a defragmentation step after the Full GC. The parameter -XX:CMSFullGCsBeforeCompaction sets how many non-compacting Full GCs are executed before one that includes defragmentation.

2. Features of the G1 garbage collector:

The G1 (Garbage First) collector was introduced in JDK7 and became the default collector in JDK9. It is aimed at servers with multi-core CPUs and large memory. It attempts to meet garbage collection (GC) pause time goals with high probability while achieving high throughput. Compared with the CMS collector, its most prominent improvements are:

  • It is based on the "mark-compact" algorithm, so no memory fragmentation is left after a collection.
  • Pause times can be controlled through a configurable pause target, enabling low-pause garbage collection without sacrificing throughput.

Before introducing the garbage collection process of G1, let's briefly look at its memory model and main data structures. These data structures are essential for understanding how G1 collects garbage.

2. Memory model of the G1 garbage collector:

The G1 collector does not use the traditional layout in which the young generation and the old generation are physically separated. It only distinguishes them logically, and divides the entire heap into about 2048 independent memory blocks of equal size, called Regions. Each Region is a logically contiguous section of memory. Its size depends on the actual heap size, stays between 1M and 32M, and is a power of 2 (1M, 2M, 4M, 8M, 16M, or 32M). Different Regions represent the young generation and the old generation. G1 no longer requires Regions of the same type to be adjacent in physical memory; logical continuity is achieved by assigning Regions dynamically.
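The sizing rule above can be sketched in a few lines of Java. This is only an illustration of the arithmetic described in the text (a target of about 2048 regions, power-of-two sizes, clamped to 1M to 32M). The class and method names are hypothetical, and the real JVM heuristic may take additional inputs into account.

```java
final class RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION_BYTES = 1 * MB;
    static final long MAX_REGION_BYTES = 32 * MB;
    static final long TARGET_REGION_COUNT = 2048;

    // Picks a power-of-two region size between 1M and 32M so that the heap
    // is split into roughly TARGET_REGION_COUNT regions (assumed rule).
    static long regionBytes(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGION_COUNT, MIN_REGION_BYTES);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of two
        return Math.min(powerOfTwo, MAX_REGION_BYTES);
    }

    public static void main(String[] args) {
        for (long heapMb : new long[] {1024, 4096, 6144, 131072}) {
            System.out.println(heapMb + "M heap -> "
                    + regionBytes(heapMb * MB) / MB + "M regions");
        }
    }
}
```

Under these assumptions a 4G heap gets 2M regions, while a 128G heap is capped at 32M regions and therefore ends up with more than 2048 of them.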

The G1 collector tracks how much garbage has accumulated in each Region and, within the configured collection time, reclaims the Regions with the highest priority first. This avoids collecting the whole heap in one go, makes stop-the-world pauses shorter and more controllable, and obtains the highest collection efficiency within a limited time.

Note: There is no distinction between survivor0 and survivor1 in G1, there is only one, and the size is allocated dynamically.

Through the area division and priority area recycling mechanism, it is ensured that the G1 collector can obtain the highest garbage collection efficiency in a limited time.

1. Partition Region:

[Figure: G1 heap layout divided into Regions]
The G1 garbage collector divides the heap into Regions, and each Region can play only one role at a time: Eden, Survivor (S), or Old (O). Blank areas represent unallocated memory.

Finally, there is a special kind of Region, H (Humongous), dedicated to storing huge objects. If the size of an object exceeds 50% of a Region's capacity, G1 treats it as a humongous object. Other garbage collectors allocate such objects in the old generation by default, but if a humongous object is short-lived, placing it in the old generation hurts the collector and triggers frequent old generation GCs. To solve this, G1 sets aside H Regions for humongous objects. If one H Region cannot hold the object, G1 looks for contiguous Regions to store it. If it cannot find enough contiguous Regions, it has to start a Full GC.
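As a small worked example of the 50% rule, the sketch below checks whether an object counts as humongous and how many contiguous Regions it would need. The names are hypothetical and the code only mirrors the rule as stated in the text.

```java
final class HumongousSketch {
    // An object is humongous when it is larger than half a region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of contiguous regions needed to hold the object (ceiling division).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long region = 2L * 1024 * 1024; // assumed 2M regions
        long object = 5L * 1024 * 1024; // a 5M array, for example
        System.out.println("humongous: " + isHumongous(object, region));
        System.out.println("regions needed: " + regionsNeeded(object, region));
    }
}
```

With 2M regions, a 5M object is humongous and needs three contiguous Regions, part of the last one being left unused.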

2. Remembered Set:

In the serial and parallel collectors, GC scans the entire heap to determine whether an object is reachable. To avoid an STW-style full-heap scan, G1 gives each Region an RSet (Remembered Set). Internally it works like a reverse pointer: it records references from other Regions into the current Region. This brings a great advantage: when reclaiming a Region, G1 does not need to scan the whole heap. It only needs to scan the Region's RSet to find the external references into it and so determine which objects in the Region are alive. These references are also one of the roots of the initial mark.

Not every reference needs to be recorded in an RSet. If the source of a reference is an object in the same Region, it does not need to be recorded. Likewise, every G1 GC scans the whole young generation, so a reference whose source is a young generation object does not need to be recorded either. In the end, only references whose source is in the old generation need to be recorded.

3. Card Table:

If a thread modifies a reference inside a Region, the RSet has to be told to update its records. If many references are modified, the mutator has to process each one and its overhead becomes very high, so the G1 collector introduces the Card Table to solve this problem.

The Card Table logically divides a Region into contiguous areas of fixed size (between 128 and 512 bytes). Each area is called a Card, so a Card is the smallest unit of granularity in the heap. An allocated object occupies a number of physically contiguous Cards, and references to objects in a Region are looked up and processed Card by Card (see RSet). Each Card uses one byte to record whether it has been modified. The Card Table is the collection of these bytes, that is, a byte array, and the address range of each Card is identified by its index in the array. By default, no Card is marked. When a reference in an address range is modified, the array element corresponding to that range is set to "0", which marks the Card as dirty. The RSet also records the index of this array element.
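The mapping from an address to a Card can be sketched as a byte array indexed by a shifted address. In this sketch the 512-byte card size, the value used for a clean card, and all names are assumptions made for illustration; only the "0 means dirty" convention comes from the text.

```java
import java.util.Arrays;

final class CardTableSketch {
    static final int CARD_SHIFT = 9; // 2^9 = 512-byte cards (assumed size)
    static final byte CLEAN = -1;    // assumed value for an unmodified card
    static final byte DIRTY = 0;     // "0" marks a dirty card, as in the text

    private final long heapStart;
    private final byte[] cards;      // one byte per card

    CardTableSketch(long heapStart, long heapBytes) {
        this.heapStart = heapStart;
        this.cards = new byte[(int) (heapBytes >>> CARD_SHIFT)];
        Arrays.fill(cards, CLEAN);
    }

    // Index of the card that covers the given heap address.
    int cardIndex(long address) {
        return (int) ((address - heapStart) >>> CARD_SHIFT);
    }

    void markDirty(long address) {
        cards[cardIndex(address)] = DIRTY;
    }

    boolean isDirty(int cardIndex) {
        return cards[cardIndex] == DIRTY;
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(0x1000L, 4096L);
        table.markDirty(0x1000L + 1025); // a write somewhere in the third card
        System.out.println("card 2 dirty: " + table.isDirty(2));
    }
}
```

Because the index is just a shift of the address offset, marking a card costs one subtraction, one shift, and one byte store.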

Several threads may modify a Region concurrently, so an RSet may also be modified concurrently. To avoid conflicts, the G1 garbage collector splits the RSet into multiple HashTables, and each thread modifies its own HashTable. Logically, an RSet is the collection of these HashTables. A hash table is a common way to implement an RSet. Its advantage is that it removes duplicates, so the size of the RSet is proportional to the number of distinct modified pointers. Without duplicates, the number of RSet entries equals the number of write operations.

The key of the HashTable is the start address of another Region, and the value is a set whose elements are Card Table indices.
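The key/value layout just described maps naturally onto a map of sets. The following is a minimal single-threaded sketch with hypothetical names; it ignores the per-thread splitting mentioned above and is not how HotSpot lays out its remembered sets in memory.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class RSetSketch {
    // Key: start address of the region that holds the referencing object.
    // Value: indices of the cards in that region that point into this region.
    private final Map<Long, Set<Integer>> cardsByRegion = new HashMap<>();

    void record(long fromRegionStart, int cardIndex) {
        cardsByRegion.computeIfAbsent(fromRegionStart, k -> new TreeSet<>())
                     .add(cardIndex); // duplicates are dropped by the set
    }

    Set<Integer> cardsFrom(long fromRegionStart) {
        return cardsByRegion.getOrDefault(fromRegionStart,
                Collections.<Integer>emptySet());
    }

    int entryCount() {
        int count = 0;
        for (Set<Integer> cards : cardsByRegion.values()) {
            count += cards.size();
        }
        return count;
    }

    public static void main(String[] args) {
        RSetSketch rset = new RSetSketch();
        rset.record(0x100000L, 3);
        rset.record(0x100000L, 3); // same card again, no new entry
        rset.record(0x200000L, 1);
        System.out.println("entries: " + rset.entryCount());
    }
}
```

Recording the same card twice leaves a single entry, which is the deduplication advantage the text attributes to the hash table.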

The relationship between the three data structures above is shown in the following figure:
[Figure: relationship between Region, Card Table, and RSet]
The dotted line around RS in the figure indicates that the RSet is not a data structure separate from the Card Table. RS is a conceptual model, and the Card Table is one implementation of it.

G1 manages memory in units of Regions, while objects are allocated in units of Cards.

4. Write barrier of RSet:

A write barrier means that every write to a reference-type field triggers a Write Barrier, which briefly interrupts the operation to perform some additional actions.

Write barriers have to filter out unnecessary writes, because the barrier instructions are expensive. Filtering both speeds up the mutator and reduces the burden on the collector. The G1 write barrier works together with the RSet: when a write barrier fires, it checks whether the object that the new reference points to is in a different Region from the field being written. If so, the reference information is recorded, via the Card Table, in the RSet of the Region that contains the referenced object. This filtering greatly reduces the size of the RSet.

(1) Pre-write barrier: when an assignment is about to execute and the object on the left-hand side will change its reference to point to another object, the Region containing the object it originally referenced loses a reference. The JVM has to record the object that loses the reference before the assignment takes effect. The JVM does not update the RSet immediately; it batches the work and updates the RSet later.

(2) Post-write barrier: after an assignment has executed, the object on the right-hand side gains a reference from the object on the left-hand side, so the RSet of the Region containing the right-hand object has to be updated as well. To reduce overhead, the RSet is again not updated immediately; the barrier only records an update log entry for later batch processing.

When the G1 garbage collector performs a collection, adding the RSet to the set of GC roots to enumerate ensures that nothing is missed without a global scan. The other generational collectors in the JVM also use write barriers. For example, every time a reference in an old generation object is changed to point to a young generation object, the write barrier captures and records it, so that a young generation collection can avoid scanning the entire old generation to find roots.
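The filtering done by the post-write barrier can be sketched as follows. The XOR-and-shift test is a common way to check whether two addresses fall in the same power-of-two-sized region; the 1M region size, the use of address 0 for null, and all names are assumptions for illustration, not the JVM's actual barrier code.

```java
import java.util.ArrayList;
import java.util.List;

final class PostWriteBarrierSketch {
    static final int LOG_REGION_BYTES = 20; // 2^20 = 1M regions (assumed)
    static final int CARD_SHIFT = 9;        // 512-byte cards (assumed)

    // Log of card numbers to be processed later in a batch.
    final List<Long> dirtyCardLog = new ArrayList<>();

    // True when the two addresses lie in different regions.
    static boolean crossesRegion(long fieldAddress, long newValueAddress) {
        return ((fieldAddress ^ newValueAddress) >>> LOG_REGION_BYTES) != 0;
    }

    // Called after "field = newValue". Address 0 stands for null here.
    void onReferenceStore(long fieldAddress, long newValueAddress) {
        if (newValueAddress == 0) {
            return; // storing null creates no reference
        }
        if (!crossesRegion(fieldAddress, newValueAddress)) {
            return; // same-region references are not recorded in an RSet
        }
        dirtyCardLog.add(fieldAddress >>> CARD_SHIFT); // defer the RSet update
    }

    public static void main(String[] args) {
        PostWriteBarrierSketch barrier = new PostWriteBarrierSketch();
        barrier.onReferenceStore(0x100010L, 0x100800L); // same region: filtered
        barrier.onReferenceStore(0x100010L, 0x300000L); // cross region: logged
        System.out.println("logged cards: " + barrier.dirtyCardLog);
    }
}
```

Only the second store is logged, and the RSet itself is left untouched until the log is processed.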

The write barrier of G1's garbage collector uses a two-level log buffer structure:

  • global set of filled buffers: a global set shared by all threads that stores the log buffers that have been filled
  • thread log buffer: each thread's own log buffer. Every thread first puts its write barrier records into its own log buffer. When the buffer is full, the thread moves it to the global set of filled buffers and then requests a new log buffer.
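The two-level structure can be sketched with a per-thread array and a shared queue. The buffer capacity of 4 is deliberately tiny, and the class and field names are hypothetical; the sketch only shows the hand-off from the thread buffer to the global set.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

final class LogBufferSketch {
    static final int CAPACITY = 4; // tiny on purpose; real buffers are larger

    // Level 1: global set of filled buffers, shared by all threads.
    private final Queue<long[]> filledBuffers;
    // Level 2: this thread's own log buffer.
    private long[] current = new long[CAPACITY];
    private int used = 0;

    LogBufferSketch(Queue<long[]> filledBuffers) {
        this.filledBuffers = filledBuffers;
    }

    // Appends one barrier record; hands the buffer over once it is full.
    void record(long entry) {
        current[used++] = entry;
        if (used == CAPACITY) {
            filledBuffers.add(current);   // publish the full buffer
            current = new long[CAPACITY]; // request a fresh one
            used = 0;
        }
    }

    int pending() {
        return used;
    }

    public static void main(String[] args) {
        Queue<long[]> global = new ConcurrentLinkedQueue<>();
        LogBufferSketch threadLog = new LogBufferSketch(global);
        for (long i = 0; i < 9; i++) {
            threadLog.record(i);
        }
        System.out.println(global.size() + " filled, "
                + threadLog.pending() + " pending");
    }
}
```

Appending to the thread's own buffer needs no synchronization; only the hand-off to the shared queue touches shared state.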

5. Collection Set:

The Collection Set (CSet) is the set of Regions that the G1 garbage collector selects for reclamation in the Evacuation phase. In any collection, all Regions in the CSet are released, and the live objects inside them are moved to free Regions. G1's soft real-time behavior is achieved through the selection of the CSet. Corresponding to the two modes of the algorithm, fully-young generational mode and partially-young mode, CSet selection comes in two forms:

  • Fully-young generational mode: also known as young GC. In this mode the CSet contains only young Regions, and G1 meets the soft real-time goal by adjusting the number of young generation Regions.

  • Partially-young mode: also known as Mixed GC. This mode selects all young Regions plus some old Regions. The old Regions are chosen from the live-object counts gathered during the marking cycle, and the Regions with the highest reclamation benefit (those with the fewest live objects) are added to the CSet.


The admission condition for candidate old generation Regions can be set with the liveness threshold -XX:G1MixedGCLiveThresholdPercent (default 85% in recent JDK releases), which keeps out Regions that would be very expensive to reclaim. In addition, the number of candidate old generation Regions that each mixed collection may include is capped by -XX:G1OldCSetRegionThresholdPercent (default 10%), expressed as the CSet's share of the total heap size.
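The selection of old Regions for a mixed collection can be sketched as a filter, a sort, and a cap. This is a simplified illustration with hypothetical names: the cap is passed in as a plain region count instead of being derived from a heap percentage, and the real collector also weighs the pause target.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class CSetChooserSketch {
    static final class OldRegion {
        final String name;
        final int livePercent; // share of the region occupied by live objects

        OldRegion(String name, int livePercent) {
            this.name = name;
            this.livePercent = livePercent;
        }
    }

    // Keeps regions below the liveness threshold, prefers the ones with the
    // least live data, and stops at the given maximum number of regions.
    static List<String> chooseOldRegions(List<OldRegion> candidates,
                                         int liveThresholdPercent,
                                         int maxOldRegions) {
        return candidates.stream()
                .filter(r -> r.livePercent < liveThresholdPercent)
                .sorted(Comparator.comparingInt((OldRegion r) -> r.livePercent))
                .limit(maxOldRegions)
                .map(r -> r.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<OldRegion> candidates = Arrays.asList(
                new OldRegion("A", 90), new OldRegion("B", 20),
                new OldRegion("C", 60), new OldRegion("D", 5));
        System.out.println(chooseOldRegions(candidates, 85, 2));
    }
}
```

With a threshold of 85 and a cap of two Regions, region A is rejected as too expensive and D and B are chosen ahead of C.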

It can be seen from the above that G1 collections all operate on the CSet. There is no fundamental difference between a young generation collection and a mixed collection; the biggest difference lies in their trigger conditions.

3. The garbage collection process of G1:

G1 provides two GC modes, Young GC and Mixed GC, both of which are Stop The World (STW). Before discussing garbage collection, let's introduce G1's object allocation strategy.

1. Object allocation strategy:

Each Region in use can be divided into an allocated part and an unallocated part, and the boundary between them is called top. In general, allocating an object in a Region only requires increasing the value of top.
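Allocation by advancing top is often called bump-pointer allocation. Here is a minimal single-threaded sketch, with addresses modeled as plain long values and hypothetical names:

```java
final class BumpRegionSketch {
    private final long end; // first address past the region
    private long top;       // boundary between allocated and unallocated space

    BumpRegionSketch(long start, long sizeBytes) {
        this.top = start;
        this.end = start + sizeBytes;
    }

    // Returns the start address of the new object, or -1 if it does not fit.
    long allocate(long bytes) {
        if (top + bytes > end) {
            return -1;
        }
        long objectStart = top;
        top += bytes; // allocation is just moving top forward
        return objectStart;
    }

    public static void main(String[] args) {
        BumpRegionSketch region = new BumpRegionSketch(1000, 100);
        System.out.println(region.allocate(40));
        System.out.println(region.allocate(40));
        System.out.println(region.allocate(40)); // does not fit any more
    }
}
```

The cost of an allocation is one comparison and one addition, which is why the scheme is attractive as long as only one thread touches top.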

(1) Thread Local Allocation Buffer (TLAB):

If objects are allocated in a shared space, a synchronization mechanism is needed to resolve concurrent conflicts. To reduce the time lost to such synchronization, G1 gives each application thread and GC thread a local allocation buffer (TLAB). Objects are allocated inside this buffer with no synchronization between threads, which improves allocation efficiency. When a thread exhausts its buffer, however, it has to request a new one, and this step is still subject to concurrency. The G1 collector handles it with a CAS (Compare And Swap) operation.

Obviously, TLABs introduce fragmentation. For example, a thread allocating in its own buffer may still have space left, but the object is too large to fit in it. The thread then has to request a new buffer, and the free space in the old one is wasted. Both the buffer size and the number of threads affect the amount of this waste.
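The interplay between lock-free allocation inside a TLAB and the CAS needed to obtain a new one can be sketched as below. The shared space, the fixed TLAB size, and all names are assumptions for illustration; a real JVM sizes TLABs adaptively and handles oversized objects differently.

```java
import java.util.concurrent.atomic.AtomicLong;

// Shared space from which threads carve out their buffers.
final class SharedEdenSketch {
    private final AtomicLong top;
    private final long end;

    SharedEdenSketch(long start, long sizeBytes) {
        this.top = new AtomicLong(start);
        this.end = start + sizeBytes;
    }

    // Reserves a chunk with a CAS retry loop; returns -1 when space runs out.
    long carve(long bytes) {
        while (true) {
            long oldTop = top.get();
            long newTop = oldTop + bytes;
            if (newTop > end) {
                return -1;
            }
            if (top.compareAndSet(oldTop, newTop)) {
                return oldTop;
            }
        }
    }
}

// One instance per thread; no synchronization on the fast path.
final class TlabSketch {
    private final SharedEdenSketch eden;
    private final long tlabBytes;
    private long top = 0;
    private long end = 0;
    private long wastedBytes = 0;

    TlabSketch(SharedEdenSketch eden, long tlabBytes) {
        this.eden = eden;
        this.tlabBytes = tlabBytes;
    }

    long allocate(long bytes) {
        if (bytes > tlabBytes) {
            return -1; // too large for a TLAB; the caller allocates elsewhere
        }
        if (top + bytes > end) {
            long start = eden.carve(tlabBytes); // slow path: CAS on shared top
            if (start < 0) {
                return -1;
            }
            wastedBytes += end - top; // the tail of the old buffer is lost
            top = start;
            end = start + tlabBytes;
        }
        long objectStart = top;
        top += bytes;
        return objectStart;
    }

    long wastedBytes() {
        return wastedBytes;
    }

    public static void main(String[] args) {
        TlabSketch tlab = new TlabSketch(new SharedEdenSketch(0, 1000), 100);
        tlab.allocate(60);
        tlab.allocate(60); // does not fit in the remaining 40 bytes
        System.out.println("wasted bytes: " + tlab.wastedBytes());
    }
}
```

In the example the second 60-byte object forces a new buffer, and the 40 bytes left in the first one become the kind of waste described above.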

During each collection, every GC thread can also own a local buffer (GCLAB) for moving objects, copying live objects to Survivor space or old generation space.

Objects promoted from Eden/Survivor space to Survivor/old generation space also go through a GC-exclusive local buffer, called the promotion local allocation buffer (PLAB).

(2) Allocation in Eden area:

For objects that cannot be allocated in the TLAB space, the JVM will try to allocate them in the Eden space. If the Eden space cannot accommodate the object, the space can only be allocated in the old generation.

(3) Humongous area allocation:

A humongous object exclusively occupies one or more contiguous Regions. The first Region is marked Starts Humongous, and the following contiguous Regions are marked Continues Humongous. Because humongous objects cannot benefit from the TLAB optimization and the whole heap has to be scanned to find a contiguous memory range, determining where a humongous object starts is very expensive. Where possible, the application should avoid creating humongous objects.

G1 has an internal optimization: once it finds that no reference points to a humongous object, the object can be reclaimed directly during a young generation collection.

2. G1 Young GC:

When Eden is full and the JVM fails to allocate an object in it, an STW young generation collection (young GC) is triggered. Live objects in Eden are copied to the to-survivor space. Live objects in the from-survivor space are copied, depending on the survival-count threshold, through PLABs to the to-survivor space or to the old generation. If survivor space is insufficient, some data from Eden is promoted directly to the old generation. In the end, Eden is empty, the GC stops, and the application threads resume.

The young GC is also responsible for maintaining object ages (survival counts), which helps decide where a tenuring object goes when it is promoted. The young GC first records the total size of promoted objects per age in the age table. Then, from the age table, the Survivor size, the Survivor fill capacity -XX:TargetSurvivorRatio (default 50%), and the maximum tenuring threshold -XX:MaxTenuringThreshold (default 15), it calculates an appropriate tenuring threshold. All objects that exceed this threshold are promoted to the old generation.
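One way to read the calculation above: walk the age table from the youngest age upward, add up the bytes, and stop at the first age where the running total would overflow the desired Survivor occupancy. The sketch below follows that reading. It is an assumption about the general approach with hypothetical names, not a copy of the JVM's code.

```java
final class TenuringSketch {
    // bytesByAge[a] holds the total size of surviving objects of age a
    // (index 0 is unused). Returns the computed tenuring threshold.
    static int tenuringThreshold(long[] bytesByAge,
                                 long survivorBytes,
                                 int targetSurvivorRatio,
                                 int maxTenuringThreshold) {
        long desired = survivorBytes * targetSurvivorRatio / 100;
        long total = 0;
        int age = 1;
        while (age < bytesByAge.length) {
            total += bytesByAge[age];
            if (total > desired) {
                break; // objects of this age no longer fit in the target
            }
            age++;
        }
        return Math.min(age, maxTenuringThreshold);
    }

    public static void main(String[] args) {
        long[] bytesByAge = new long[16];
        bytesByAge[1] = 30;
        bytesByAge[2] = 30;
        bytesByAge[3] = 30;
        bytesByAge[4] = 30;
        // 200-byte Survivor space, 50% target, maximum threshold 15
        System.out.println(tenuringThreshold(bytesByAge, 200, 50, 15));
    }
}
```

In the example the desired occupancy is 100 bytes, ages 1 to 3 add up to 90 bytes, and age 4 pushes the total to 120, so the computed threshold is 4. When little survives, the threshold simply stays at the configured maximum.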

This raises a question: if only young generation objects are collected, how do we find all the root objects? Are all objects in the old generation roots? Scanning them all would take a lot of time. This is where the RSet introduced above comes in: it records references from other Regions into the current Region. So when scanning roots during a Young GC, only the RSet needs to be scanned, not the entire old generation.

2.1. The detailed recycling process of young GC:

(1) Stage one, root scanning:

Roots are the objects pointed to by static variables, by local variables on the call chains of executing methods, and so on. The root references, together with the external references recorded in the RSet, serve as the entry points for scanning live objects.

(2) Stage two, updating the RSet:

The cards in the dirty card queue are processed and the RSet is updated. After this stage, the RSet accurately reflects the references from the old generation to objects in its Region.

(3) Stage three, processing the RSet:

Objects in Eden that are pointed to by old generation objects are identified. These Eden objects are considered live.

(4) Stage four, object copying:

Live objects in Eden are copied to the to-survivor space. Live objects in the from-survivor space are copied, depending on the survival-count threshold, through PLABs to the to-survivor space or to the old generation. If survivor space is insufficient, some data from Eden is promoted directly to the old generation.

(5) Stage five, processing references:

Soft, weak, and phantom references are processed. In the end, Eden is empty and the GC stops. The objects in the target memory are stored contiguously, so the copying process also compacts memory and reduces fragmentation.

3. G1 Mixed GC:

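As the young generation keeps being collected, the old generation gradually fills up. To keep it from running out of space, G1 starts a mixed collection (Mixed GC) once old generation occupancy exceeds the IHOP threshold -XX:InitiatingHeapOccupancyPercent (default 45%) of the whole heap. A Mixed GC performs a normal young generation collection and also reclaims some of the old generation Regions marked by the background scanning threads. A Mixed GC consists of two main steps: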

(1) Global concurrent marking (global concurrent marking)

(2) Copy surviving objects (evacuation)

Note that a Mixed GC is not a Full GC. A Full GC is triggered only when Mixed GC cannot reclaim old Regions fast enough, that is, when an object needs to be allocated in the old generation but there is not enough space.

3.1. Global concurrent marking

Global concurrent marking is performed before mixed collection. In G1 GC it is not a mandatory part of every collection; it mainly provides the marking service for Mixed GC. Global concurrent marking runs in five steps:

(1) Initial mark (initial mark, STW):

All GC Roots and the objects directly reachable from them are marked. This stage requires a stop-the-world pause, but it is short.

The initial mark is closely tied to young GC. When the IHOP threshold is reached, G1 does not start a concurrent marking cycle immediately. It waits for the next young generation collection and uses that collection's STW pause to complete the initial mark. This approach is called piggybacking.

(2) Root region scan (root region scan):

The survivor Regions from the initial mark are scanned for references into the old generation, and the referenced objects are marked as roots. This phase runs concurrently with the application, and the next STW young GC cannot begin until it completes.

Because the RSet does not record references whose source is in a young Region, a live old generation object may be referenced only by young generation objects. During a young GC, the surviving young generation objects are copied to Survivor Regions, so these Survivor Regions have to be scanned for references into the old generation. Those references become part of the roots for the concurrent marking of the old generation.

(3) Concurrent Marking:

Reachability analysis is performed on the heap starting from GC Roots to find live objects. This process may be interrupted by young GCs. Reference changes made during the concurrent marking phase are recorded by the SATB write barrier, and the concurrent marking threads periodically check and process the entries in the SATB global buffer list to update object reference information. If all objects in a Region are found to be garbage during this phase, the Region is reclaimed immediately. The liveness ratio of the objects in each Region is also calculated during concurrent marking.

In the concurrent marking stage, we have to understand the three-color marking algorithm, which we will introduce below

(4) Remark (Remark, STW):

The remark phase corrects the marking records that changed because the application kept running during concurrent marking. It processes the remaining SATB log buffers and all updates, and finds all live objects that have not yet been visited.

In the CMS collector, remarking uses incremental update, while G1 uses the snapshot-at-the-beginning (SATB) algorithm, which is faster than the CMS approach.

SATB takes a logical snapshot of the live objects at the start of marking, so that garbage can be identified against this snapshot during the concurrent marking phase. When an assignment occurs, the application changes its object graph, so the JVM has to record the overwritten reference. The pre-write barrier therefore records the old value in an SATB log buffer before the reference changes (each thread owns an SATB buffer, initially with room for 256 records). When the buffer is used up, the thread allocates a new SATB buffer and the old one is added to the global list. During the concurrent marking stage, the concurrent marking threads periodically check and process the entries in the global buffer list while marking, then scan the reference fields according to the mark bits of the marking bitmap and update the RSet, correcting any SATB inaccuracies.

The SATB log buffer works like the log buffer used by the RSet write barrier: it has the same two-level structure and the same mechanism.
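The SATB pre-write barrier described above can be sketched on a toy object model. The Node class, the markingActive flag, and the single in-memory log are hypothetical simplifications; the per-thread buffers and the global list are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class SatbBarrierSketch {
    static final class Node {
        final String name;
        Node ref; // a single reference field, for simplicity

        Node(String name) {
            this.name = name;
        }
    }

    boolean markingActive;
    final Deque<Node> satbLog = new ArrayDeque<>();

    // Performs "holder.ref = newValue" with a pre-write barrier.
    void store(Node holder, Node newValue) {
        Node overwritten = holder.ref;
        if (markingActive && overwritten != null) {
            satbLog.add(overwritten); // keep the value from the snapshot
        }
        holder.ref = newValue;
    }

    public static void main(String[] args) {
        SatbBarrierSketch heap = new SatbBarrierSketch();
        Node b = new Node("B");
        Node c = new Node("C");
        heap.store(b, c);          // before marking: nothing is logged
        heap.markingActive = true;
        heap.store(b, null);       // overwrites the reference to C
        System.out.println("logged: " + heap.satbLog.peekFirst().name);
    }
}
```

The marker later treats every logged object as live, so an object that was reachable when marking started cannot be lost just because its last reference was overwritten.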

(5) Cleanup (Cleanup, STW):

This stage is mainly to sort the recovery value and cost of each Region, and formulate a recovery plan according to the user's expected GC pause time. (This stage does not actually do garbage collection, nor does it perform a copy of surviving objects)

The detailed operations performed in the cleanup phase are as follows:

① RSet combing: a heuristic assigns Regions to different levels according to liveness and RSet size, and combing the RSets also helps to discover useless references.

② Organize heap partitions: identify collections of old generation partitions with high recovery benefits (based on free space and pause goals) for mixed collections;

③ Identify all idle partitions: find partitions with no surviving objects, which can be directly reclaimed during the cleanup phase without waiting for the next collection cycle.

If the work of maintaining the Remembered Set is left aside, the process can be divided into four steps (similar to CMS): initial marking, concurrent marking, remarking, and screening and reclamation. The first three are the same as in the CMS collector; only the fourth step is somewhat different.

3.2. Copy surviving objects (evacuation):


Once global concurrent marking has been initiated, G1 does not start a mixed collection immediately. It first waits for the next young generation collection, and determines the CSet for the next mixed collection during that young GC.

After global marking completes, G1 knows which old Regions contain the most reclaimable garbage and simply waits for the right moment to start mixed collection. A mixed collection reclaims the young Regions and also some old Regions (not necessarily all of them). Depending on the pause target, G1 may not be able to reclaim all candidate old Regions at once and can only pick several high-priority Regions, so G1 may run several consecutive mixed collections interleaved with application threads. The selected Regions form the CSet. The algorithm for a single mixed collection is exactly the same as the Young GC algorithm above, except that the CSet additionally contains old generation Regions. In the second step, the live objects in these Regions are copied to free Regions, and the reclaimed Regions are put on the free Region list.

G1 calculates how many Regions to add to the CSet each time and how many mixed collections to run. Between the last young generation collection and the next mixed collection, G1 determines the set of Regions to add to the CSet next (Choose CSet) and decides whether to end the mixed collection cycle.

(1) After concurrent marking ends, old generation Regions that are 100% garbage are reclaimed directly, and Regions that are only partly garbage are reclaimed over 8 collections (configurable with -XX:G1MixedGCCountTarget, default 8). The CSet of a Mixed GC therefore contains one eighth of the candidate old generation Regions, plus the Eden Regions and the Survivor Regions.

(2) Because the old generation Regions are reclaimed over 8 collections by default, G1 reclaims the Regions with the most garbage first: the higher the proportion of garbage in a Region, the earlier it is reclaimed. The threshold -XX:G1MixedGCLiveThresholdPercent decides whether a Region is eligible at all. It is a liveness threshold (65% by default in older JDK releases, 85% in more recent ones): a Region is considered only if its live objects occupy less than this percentage of the Region. If the proportion of garbage is too low, the proportion of live objects is high and copying them takes more time.

(3) Mixed collection does not have to run 8 times. The threshold -XX:G1HeapWastePercent (default 10%) allows 10% of the heap to be wasted: if reclaimable garbage makes up less than 10% of the heap, no further mixed collection is performed, because the GC would spend a lot of time and reclaim very little memory.
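The arithmetic behind points (1) and (3) can be sketched as two small functions. The names are hypothetical, and the parameters simply stand in for -XX:G1MixedGCCountTarget and -XX:G1HeapWastePercent; the real collector also takes the pause target into account.

```java
final class MixedGcPlanSketch {
    // Spreads the candidate old regions over the target number of mixed
    // collections (ceiling division).
    static int oldRegionsPerCollection(int candidateRegions, int mixedGcCountTarget) {
        return (candidateRegions + mixedGcCountTarget - 1) / mixedGcCountTarget;
    }

    // Mixed collections continue only while the reclaimable garbage is a
    // larger share of the heap than the tolerated waste.
    static boolean shouldContinueMixed(long reclaimableBytes,
                                       long heapBytes,
                                       int heapWastePercent) {
        return reclaimableBytes * 100 > heapBytes * heapWastePercent;
    }

    public static void main(String[] args) {
        System.out.println(oldRegionsPerCollection(80, 8));
        System.out.println(shouldContinueMixed(50, 1000, 10));
        System.out.println(shouldContinueMixed(150, 1000, 10));
    }
}
```

With 80 candidate Regions and a target of 8, each mixed collection takes about 10 old Regions. Once only 5% of the heap is reclaimable and the tolerated waste is 10%, the cycle stops early.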

Summary of the G1 garbage collection process: Young GC and Mixed GC are G1's main activities for reclaiming space. When the application starts, the heap still has plenty of free space, and a young generation collection is triggered only when the young generation is full. As old generation usage grows and reaches the IHOP threshold (the ratio of old generation usage to the entire heap, -XX:InitiatingHeapOccupancyPercent, default 45%), G1 starts preparing to collect old generation space. It first runs a concurrent marking cycle to identify old generation Regions with high reclamation benefit. G1 then does not start a mixed collection immediately; it lets the application threads run for a while and waits for a young generation collection to be triggered. During that STW pause, G1 starts to set up the mixed collection cycle and then lets the application threads run again. Over the next few young generation collections, old generation Regions are added to the CSet, which turns them into mixed collections. These consecutive mixed collections are called a mixed collection cycle.

4. Full GC:

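When G1 cannot obtain a new Region from the heap, it falls back to a guarantee mechanism and performs an STW, single-threaded Full GC. The Full GC marks, sweeps, and compacts the entire heap, which afterwards contains only live objects. The parameter -XX:G1ReservePercent (default 10%) reserves space to cope with abnormal situations during promotion. It can take up at most 50% of the heap, and a larger value would be pointless.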

G1 triggers a Full GC in the following scenarios, and records to-space-exhausted and Evacuation Failure in the log:

(1) When copying live objects out of young generation Regions, no free Region is available.
(2) When moving live objects out of old generation Regions, no free Region is available.
(3) When allocating a humongous object, not enough contiguous free Regions can be found in the old generation.

Because applications that use G1 often have large heaps, a Full GC is very expensive and should be avoided.

5. Three-color marking algorithm:

The three-color marking algorithm is an important algorithm in the concurrent collection phase. It is a useful method to describe the tracking collector, and it can be used to deduce the correctness of the collector. First, we divide objects into three types.

Black: A root object, or an object that has been scanned together with all of its sub-objects
Gray: The object itself has been scanned, but its sub-objects have not been scanned yet
White: The object has not been scanned. After all objects have been scanned, the objects that are still white are unreachable, that is, garbage.
Next, we walk through how the colors evolve. When the GC starts scanning objects, it proceeds in the following steps:

5.1. The normal process of the three-color marking algorithm:

(1) The root object is set to black, and the objects it references are set to gray.
(2) Marking continues from the gray objects: each gray object whose sub-objects have all been scanned is set to black.
(3) After the traversal finishes, all reachable objects are black. Objects that are still white are unreachable and need to be cleaned up.
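The three steps above can be sketched as a small worklist algorithm. This is an illustrative model only, not HotSpot's implementation: the Node class, the mark method, and the gray queue are names invented for the example. The root is first shaded gray and turns black once its sub-objects have been shaded, which leads to the same final coloring as described above.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of three-color marking on a small object graph.
final class TriColorMarking {
    enum Color { WHITE, GRAY, BLACK }

    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Color color = Color.WHITE;   // every object starts out white
        Node(String name) { this.name = name; }
    }

    /** Marks everything reachable from the roots; returns names in the order they turn black. */
    static List<String> mark(List<Node> roots) {
        Deque<Node> gray = new ArrayDeque<>();
        for (Node root : roots) {
            shade(root, gray);
        }
        List<String> blackened = new ArrayList<>();
        while (!gray.isEmpty()) {
            Node n = gray.poll();
            for (Node child : n.refs) {
                shade(child, gray);      // white sub-objects become gray
            }
            n.color = Color.BLACK;       // all sub-objects of n have been shaded
            blackened.add(n.name);
        }
        return blackened;
    }

    private static void shade(Node n, Deque<Node> gray) {
        if (n.color == Color.WHITE) {
            n.color = Color.GRAY;
            gray.add(n);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("A"), b = new Node("B"), c = new Node("C"), d = new Node("D");
        a.refs.add(b);
        b.refs.add(c);                   // D is not referenced by anything
        List<Node> roots = new ArrayList<>();
        roots.add(a);
        System.out.println("black: " + mark(roots) + ", D is " + d.color);
    }
}
```

Anything still WHITE when the gray queue is empty was never reached from a root, which is exactly the set of objects the collector may reclaim.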

5.2. Abnormalities of the three-color marking algorithm:

If the application keeps running during marking, object references may change, and we run into the lost-object problem. Consider the moment when the garbage collector has already scanned A (black) and has reached B (gray), while C, which is referenced only by B, is still white. At this time, the application performs the following operations:

A.c=C
B.c=null

After these writes, C is referenced only by the black object A. When the garbage collector continues marking, it scans B, which no longer points to C, and it never revisits the black object A, so C stays white. C is therefore treated as garbage and would be reclaimed even though it is still reachable, which is clearly wrong. So how do we ensure that live objects are not lost while the application runs during marking? There are two possible ways:

Record objects when a reference is inserted.
Record objects when a reference is deleted.
These correspond to the two different implementations used by CMS and G1:

(1) CMS uses incremental update: when the write barrier detects that a reference to a white object is being stored into a field of a black object, the white object is turned gray. In other words, it records on insertion.

(2) G1 uses the SATB (snapshot-at-the-beginning) method: it records the old referent whenever a reference is deleted. It has three steps:

① Take a logical snapshot of the live objects when marking starts.
② During concurrent marking, enqueue every overwritten reference (in the write barrier, every object pointed to by an old reference becomes non-white).
③ Floating garbage may remain; it is collected in the next cycle.
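The lost-object scenario and the two barrier styles can be replayed in a few lines. The sketch below is a simplified model that follows the descriptions above, not the actual CMS or G1 code (real collectors work with card tables and per-thread SATB buffers). The class BarrierDemo, its write method, and the Barrier enum are hypothetical names chosen for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model: replays A.c = C; B.c = null under different write barriers.
final class BarrierDemo {
    enum Color { WHITE, GRAY, BLACK }
    enum Barrier { NONE, INCREMENTAL_UPDATE, SATB }

    static final class Obj {
        final String name;
        Obj c;                           // single reference field, as in A.c and B.c
        Color color = Color.WHITE;
        Obj(String name) { this.name = name; }
    }

    private final Deque<Obj> grayQueue = new ArrayDeque<>();
    private final Barrier barrier;

    BarrierDemo(Barrier barrier) { this.barrier = barrier; }

    private void shade(Obj o) {
        if (o != null && o.color == Color.WHITE) {
            o.color = Color.GRAY;
            grayQueue.add(o);
        }
    }

    /** Mutator store "holder.c = newValue", routed through the simulated write barrier. */
    void write(Obj holder, Obj newValue) {
        if (barrier == Barrier.SATB) {
            shade(holder.c);             // record on deletion: keep the old referent alive
        } else if (barrier == Barrier.INCREMENTAL_UPDATE && holder.color == Color.BLACK) {
            shade(newValue);             // record on insertion: black holder gains a referent
        }
        holder.c = newValue;
    }

    /** Finishes marking by processing the remaining gray objects. */
    void drain() {
        while (!grayQueue.isEmpty()) {
            Obj o = grayQueue.poll();
            shade(o.c);
            o.color = Color.BLACK;
        }
    }

    /** Replays the scenario and returns the final color of C. */
    static Color run(Barrier barrier) {
        BarrierDemo gc = new BarrierDemo(barrier);
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C");
        b.c = c;
        a.color = Color.BLACK;           // A has already been scanned
        gc.shade(b);                     // B is gray, C is still white
        gc.write(a, c);                  // A.c = C
        gc.write(b, null);               // B.c = null
        gc.drain();
        return c.color;
    }

    public static void main(String[] args) {
        for (Barrier barrier : Barrier.values()) {
            System.out.println(barrier + " -> C is " + run(barrier));
        }
    }
}
```

Without a barrier C ends up white and would be reclaimed while still reachable; with either barrier it ends up black. Note that SATB would also keep C alive if A.c = C had never happened, which is where the floating garbage in step ③ comes from.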

4. Parameter optimization

The above briefly introduces the working principle of G1. After knowing the principle, we can better optimize the operation of the program in the process of using G1 in combination with the setting of some common parameters.

-XX:MaxGCPauseMillis
GC maximum pause time, default 200ms. This is a soft goal. G1 will try its best to achieve it. If it cannot be achieved, it will gradually make self-adjustments.

For Young GC, the number of Eden areas will be gradually reduced, and if the Eden space is reduced, the processing time of Young GC will be reduced accordingly.

For Mixed GC, G1 adjusts the share of old generation regions added to the CSet each time; the default upper limit is 10%. Of course, the fewer regions selected for each CSet, the more Mixed GCs are needed to work through the candidates.

When the total Eden space shrinks, Young GC is triggered more often, and Mixed GC also runs more often, because a Mixed GC is triggered by a Young GC, or, put differently, runs together with it. Frequent GC reduces application throughput. If each Mixed GC pause is too short, too little garbage is reclaimed each time, and the collector may eventually fail to keep up with the application's allocation rate, which can lead to a serial Full GC. This must be avoided.

Therefore, a smaller pause time is not always better. Nor should it be set too large: G1 would then try to reclaim as much as possible in each collection, so fewer Mixed GCs are triggered after each concurrent marking cycle, but each one takes longer, the STW time grows, and the impact on the application becomes more noticeable.
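The trade-off can be made concrete with some back-of-the-envelope arithmetic. The following is a deliberately crude model, not how G1 actually predicts pauses: it assumes a constant allocation rate and a fixed number of old regions per Mixed GC, and all names and figures in it are made up for illustration.

```java
// Toy arithmetic for the pause-time trade-off (hypothetical figures).
final class PauseTradeoff {

    /** Seconds between Young GCs if Eden fills at a constant allocation rate. */
    static double youngGcIntervalSeconds(long edenMb, long allocMbPerSecond) {
        return (double) edenMb / allocMbPerSecond;
    }

    /** Mixed GCs needed to work through all candidate old regions (ceiling division). */
    static long mixedGcsNeeded(long candidateOldRegions, long oldRegionsPerMixedGc) {
        return (candidateOldRegions + oldRegionsPerMixedGc - 1) / oldRegionsPerMixedGc;
    }

    public static void main(String[] args) {
        // Halving Eden halves the time between Young GCs.
        System.out.println(youngGcIntervalSeconds(1024, 256));
        System.out.println(youngGcIntervalSeconds(512, 256));
        // Smaller CSets mean more Mixed GCs for the same number of candidates.
        System.out.println(mixedGcsNeeded(100, 25));
        System.out.println(mixedGcsNeeded(100, 10));
    }
}
```

In this model, a tighter pause target that halves Eden doubles the Young GC frequency, and cutting the regions per Mixed GC from 25 to 10 raises the number of Mixed GCs for 100 candidate regions from 4 to 10.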

-XX:G1NewSizePercent and -XX:G1MaxNewSizePercent
The young generation ratio has two bounds. The lower limit is -XX:G1NewSizePercent, with a default value of 5%; the upper limit is -XX:G1MaxNewSizePercent, with a default value of 60%.

G1 dynamically adjusts the size of the young generation, mainly the number of Eden Regions, according to actual GC behavior (mainly the pause time). A larger Eden space is generally preferable: Young GC runs frequently, and a large Eden reduces the number of Young GCs. At the same time, the young and old generation Regions have to be balanced for Mixed GC. If Eden is too large, little space is left for the old generation, which may eventually lead to a Full GC.

Of course, G1 still allows a fixed young generation size (parameters -XX:NewRatio, -Xmn), but then the pause time target loses its meaning.
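As a worked example of the two bounds, the sketch below computes the range within which G1 may size the young generation. The heap size of 8000 MB is a hypothetical figure, the class name is invented, and the calculation ignores that G1 actually sizes the young generation in whole Regions. On the HotSpot builds I am aware of, both flags are experimental and only take effect together with -XX:+UnlockExperimentalVMOptions; verify this for your JDK.

```java
// Worked example: young generation bounds for a hypothetical heap.
final class YoungGenBounds {

    /** Integer percentage of the heap, in MB. */
    static long percentOf(long heapMb, int percent) {
        return heapMb * percent / 100;
    }

    public static void main(String[] args) {
        long heapMb = 8000;              // hypothetical heap size
        int newSizePercent = 5;          // -XX:G1NewSizePercent default
        int maxNewSizePercent = 60;      // -XX:G1MaxNewSizePercent default
        System.out.println("min young MB: " + percentOf(heapMb, newSizePercent));
        System.out.println("max young MB: " + percentOf(heapMb, maxNewSizePercent));
    }
}
```

With the defaults, the young generation of this 8000 MB heap can range from 400 MB to 4800 MB, which shows how much room G1 has to trade Eden size against pause time.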

-XX:G1MixedGCLiveThresholdPercent
Specify the threshold of the living space ratio of the Region included in the Cset, and the default is 85%. In the global concurrent marking phase, if the space ratio of the surviving objects in a Region is lower than this value, it may be included in the Cset.

This value directly affects which Regions Mixed GC chooses to reclaim. If GC pauses are long, you can try lowering this threshold so that Regions with a high proportion of garbage are reclaimed first. However, this can also leave garbage uncollected, which may eventually trigger a Full GC.
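How the threshold filters candidates can be sketched as follows. This is a simplified model under stated assumptions, not G1's real candidate selection: the Region class, the select method, and the byte counts are hypothetical, and the only rules modeled are "live ratio below the threshold" and "most garbage first".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of choosing old Regions for the CSet by live-data ratio.
final class MixedGcCandidates {

    static final class Region {
        final int id;
        final long liveBytes;
        final long capacityBytes;
        Region(int id, long liveBytes, long capacityBytes) {
            this.id = id;
            this.liveBytes = liveBytes;
            this.capacityBytes = capacityBytes;
        }
        double livePercent() { return 100.0 * liveBytes / capacityBytes; }
        long garbageBytes() { return capacityBytes - liveBytes; }
    }

    /** Returns ids of Regions whose live ratio is below the threshold, most garbage first. */
    static List<Integer> select(List<Region> oldRegions, int liveThresholdPercent) {
        List<Region> eligible = new ArrayList<>();
        for (Region r : oldRegions) {
            if (r.livePercent() < liveThresholdPercent) {
                eligible.add(r);
            }
        }
        Comparator<Region> byGarbage = Comparator.comparingLong(Region::garbageBytes);
        eligible.sort(byGarbage.reversed());
        List<Integer> ids = new ArrayList<>();
        for (Region r : eligible) {
            ids.add(r.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region(1, 90, 100));
        regions.add(new Region(2, 20, 100));
        regions.add(new Region(3, 60, 100));
        System.out.println(select(regions, 85));   // threshold at the default 85%
        System.out.println(select(regions, 50));   // lowered threshold
    }
}
```

Lowering the threshold from 85 to 50 drops Region 3 (60% live) from the candidates: each Mixed GC copies less live data, but the garbage in Region 3 is left behind.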

-XX:InitiatingHeapOccupancyPercent
Specifies the usage ratio of the old generation that triggers global concurrent marking. The default value is 45%, that is, the old generation accounts for more than 45% of the heap.

If the value is set too low, old generation usage may still exceed the threshold after a Mixed GC ends, so global concurrent marking is triggered again. This leads to frequent old generation GCs and hurts application throughput, and because the old generation holds little data at that point, each Mixed GC reclaims relatively little space. If the value is set too high, young generation promotion can easily fail and trigger a Full GC. Finding a good value therefore takes several rounds of adjustment and testing.
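The threshold check itself is simple arithmetic. The sketch below assumes marking starts once old generation usage is at or above the given percentage of the whole heap; the class and method names are hypothetical, and a real JVM may adjust the effective threshold on its own, so treat this only as an illustration of the static rule described above.

```java
// Toy version of the IHOP check: old generation usage versus whole-heap capacity.
final class IhopCheck {

    /** True once old generation usage reaches ihopPercent of the heap capacity. */
    static boolean shouldStartConcurrentMark(long oldGenUsedMb, long heapCapacityMb, int ihopPercent) {
        // Compare in integer arithmetic to avoid rounding issues.
        return oldGenUsedMb * 100 >= heapCapacityMb * ihopPercent;
    }

    public static void main(String[] args) {
        long heapMb = 8000;              // hypothetical heap size
        System.out.println(shouldStartConcurrentMark(3599, heapMb, 45));
        System.out.println(shouldStartConcurrentMark(3600, heapMb, 45));
    }
}
```

For the hypothetical 8000 MB heap, the default of 45% corresponds to 3600 MB of old generation usage, so 3599 MB does not trigger marking and 3600 MB does.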

5. Disadvantages of the G1 collector:

(1) If the pause time target is too short, each collection set may cover only a small part of the heap, and the collector gradually fails to keep up with the application's allocation rate. Garbage then accumulates slowly until the heap fills up, which triggers a Full GC and degrades performance.

(2) G1 has a higher memory footprint for garbage collection than CMS, and it also adds more execution overhead while the program is running.

(3) In small-memory applications, CMS is likely to outperform G1. Therefore, the CMS collector is a reasonable choice for small heaps, and the G1 collector can be used for large heaps (G1 is generally recommended for heaps larger than about 6GB).

Reference article:
https://blog.csdn.net/a745233700/article/details/121724998?spm=1001.2014.3001.5502

Origin blog.csdn.net/adaizzz/article/details/130061466