Introduction to G1 Garbage Collector and Recycling Process

1. What is G1?
Like CMS, G1 focuses on pause time, but with G1 the pause time is controllable. It was designed to replace CMS. Because it compacts memory as it collects (live objects are copied from one region to another), it does not suffer from the severe fragmentation problem of CMS, while still providing a controllable pause time.

Features:
1. Unlike earlier garbage collectors, which divide the heap into contiguous young, old and permanent generations, G1 divides the heap into many equal-sized regions (usually about 2048). Each region acts as eden, survivor or old space.
2. It generally gives priority to reclaiming the regions that contain the most garbage, hence the name Garbage-First (G1).
3. Earlier collectors handle either the young generation or the old generation; G1 handles both.
4. Controllability: because G1 can choose to collect only some of the regions, it can keep the pause time under control.
[Figure: the heap structure of G1, divided into regions]

Terminology
Remembered Sets (RSet): Each region has an RSet that records the references pointing into that region from other regions (for example, if an object in region A refers to an object in region B, the RSet of region B records this). This avoids scanning the whole heap to find incoming references; only the RSet needs to be scanned. RSets make it possible to parallelize collection and to collect regions independently. In general, remembered sets consume less than 5% of memory.
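
The bookkeeping described above can be pictured with a minimal sketch. It assumes a simplified model in which an RSet stores source region numbers only (a real collector tracks finer-grained locations); the class and method names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model: one remembered set per region, recording which
// other regions hold references INTO that region.
final class RememberedSetSketch {
    // target region index -> source region indexes that point into it
    private final Map<Integer, Set<Integer>> rsets = new HashMap<>();

    // Called when an object in fromRegion stores a reference to an object in toRegion.
    void recordReference(int fromRegion, int toRegion) {
        if (fromRegion == toRegion) {
            return; // references inside one region need no RSet entry
        }
        rsets.computeIfAbsent(toRegion, k -> new HashSet<>()).add(fromRegion);
    }

    // Regions that must be scanned to find the references into 'region'.
    Set<Integer> incoming(int region) {
        return rsets.getOrDefault(region, Collections.<Integer>emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch rs = new RememberedSetSketch();
        rs.recordReference(0, 1); // an object in region A (0) refers to region B (1)
        System.out.println(rs.incoming(1)); // region B can be collected by scanning only this set
    }
}
```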

Collection Sets (CSet): The set of regions to be collected in a GC. A CSet may contain regions of any generation. Live objects in the CSet are moved (copied) to other regions during the GC, and afterwards the regions in the CSet become free regions.

Card Table: The Java virtual machine uses a data structure called the card table to mark whether an object in a given memory area of the old generation holds a reference to a young-generation object. The number of cards depends on the size of the old generation and on the amount of memory each card covers. Each card corresponds to one entry in the card table. When an old-generation object stores a reference to a young-generation object, the JVM marks the entry of the card that contains this object as dirty (sets it to 1), so that a minor GC scans only the memory areas whose cards are dirty instead of the entire old generation.
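
A minimal sketch of card marking follows. It assumes 512-byte cards and one byte per card entry, both of which are simplifications chosen for this example; addresses are plain offsets into the covered area and all names are hypothetical.

```java
// Hypothetical card table: one byte per card, each card covering 512 bytes.
final class CardTableSketch {
    static final int CARD_SHIFT = 9; // assumed card size: 2^9 = 512 bytes
    static final byte CLEAN = 0;
    static final byte DIRTY = 1;

    private final byte[] cards;

    CardTableSketch(long coveredBytes) {
        // One entry per card, rounding up for a partial last card.
        long count = (coveredBytes + (1L << CARD_SHIFT) - 1) >>> CARD_SHIFT;
        cards = new byte[(int) count];
    }

    // Stand-in for the write barrier: mark the card that contains the updated field.
    void markDirty(long fieldOffset) {
        cards[(int) (fieldOffset >>> CARD_SHIFT)] = DIRTY;
    }

    boolean isDirty(int cardIndex) {
        return cards[cardIndex] == DIRTY;
    }

    int cardCount() {
        return cards.length;
    }
}
```

A minor GC would then walk only the dirty entries and scan the 512-byte areas they cover.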

Humongous region: A region in G1 that stores humongous objects. A humongous object is one that occupies more than 50% of a region's capacity. If one H region cannot hold the object, it is stored in several contiguous H regions. Because moving humongous objects hurts GC efficiency, they are reclaimed directly when the concurrent marking phase finds that they are no longer alive. In some cases a young GC (ygc) also reclaims humongous objects.
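
The 50% rule and the number of contiguous regions an object needs can be written as a small sketch. The helper names are hypothetical, and the object size is assumed to already include its header.

```java
// Hypothetical helpers for the humongous-object rule described above.
final class HumongousSketch {
    // An object is humongous when it occupies more than half of one region.
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    // Number of contiguous H regions needed to hold the object (rounded up).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }
}
```

For example, with 1 MB regions a 3 MB byte array plus its header is humongous and needs 4 contiguous regions.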

TLAB (Thread Local Allocation Buffer): Objects are generally allocated on the heap, and the heap is shared by all threads, so several threads may request heap space at the same time. If every object allocation had to be synchronized between threads, allocation would be slow. So when a thread is initialized, it also requests a block of memory of a given size in the eden area for its exclusive use. Each thread then allocates from its own buffer, and the absence of contention greatly improves allocation efficiency.
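
The reason a TLAB needs no locking is that allocation inside it is a simple pointer bump. A minimal sketch, with addresses modelled as plain numbers and hypothetical names:

```java
// Hypothetical thread-local buffer: allocation is a pointer bump, no locking needed
// because only the owning thread ever touches 'top'.
final class TlabSketch {
    private final long end; // first address past the buffer
    private long top;       // next free address

    TlabSketch(long start, long size) {
        this.top = start;
        this.end = start + size;
    }

    // Returns the start address of the new object, or -1 when the buffer is too full.
    // In that case the caller would fall back to a slower, shared allocation path.
    long allocate(long bytes) {
        if (bytes > end - top) {
            return -1;
        }
        long result = top;
        top += bytes;
        return result;
    }
}
```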

Allocation on the stack: Thread-private objects (objects that cannot be accessed by other threads) can be broken up and allocated on the stack instead of on the heap, so no GC is needed for them: their space is released together with the stack. The stack is small, so large objects cannot be allocated on it. Allocation on the stack depends on escape analysis and scalar replacement.
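
The difference between an object that escapes and one that does not can be seen in a short example. Whether the JIT compiler actually removes the heap allocation is an implementation decision; the code only shows which objects are candidates. The class names are hypothetical.

```java
// Illustrates escaping vs. non-escaping objects (candidates for scalar replacement).
final class EscapeSketch {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    // 'p' never leaves this method, so escape analysis may replace it
    // with two local ints and skip the heap allocation entirely.
    static int lengthSquared(int x, int y) {
        Point p = new Point(x, y);
        return p.x * p.x + p.y * p.y;
    }

    // The returned Point escapes to the caller, so it must live on the heap.
    static Point create(int x, int y) {
        return new Point(x, y);
    }
}
```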

Snapshot-At-The-Beginning (SATB): SATB is an incremental marking algorithm used in the concurrent marking phase of G1 GC. It can be understood as taking a snapshot of the objects in the heap when marking starts. Objects that are alive at that moment are treated as alive for the rest of the marking cycle, and together they form an object graph.
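
One common way to preserve such a snapshot while the application keeps running is a pre-write barrier: before a reference is overwritten during marking, the old target is remembered so the marker can still visit it. The sketch below models that idea with hypothetical names; it is not the JVM's actual barrier code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical SATB pre-write barrier on a single reference field.
final class SatbSketch {
    static final class Node {
        final String name;
        Node next;

        Node(String name) {
            this.name = name;
        }
    }

    private boolean markingActive;
    private final Deque<Node> satbQueue = new ArrayDeque<>();

    void startMarking() {
        markingActive = true;
    }

    // Every reference store goes through here. While marking is active, the value
    // about to be overwritten is queued so that it stays part of the snapshot.
    void writeNext(Node holder, Node newValue) {
        Node old = holder.next;
        if (markingActive && old != null) {
            satbQueue.add(old);
        }
        holder.next = newValue;
    }

    // Number of old references waiting to be marked.
    int pending() {
        return satbQueue.size();
    }
}
```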

2. The garbage collection process of G1
The garbage collection process of G1 may include 4 parts:
1. Young-generation GC
2. Concurrent marking cycle
3. Mixed collection
4. Full GC

1. Young-generation GC:
As in other garbage collectors, it reclaims the eden area and moves the surviving objects to a survivor area or promotes them to the old generation. The difference is that the size of the young generation is adjusted dynamically, based on historical ygc statistics and the pause time set by -XX:MaxGCPauseMillis.

2. Concurrent marking cycle: (somewhat similar to CMS) mainly determines which objects are still alive, and therefore which are reclaimable, and reclaims completely free regions. The process:

  • Initial marking: marks the objects directly reachable from the roots; it piggybacks on a ygc; STW
  • Root region scan: because a ygc has just run, only the survivor regions contain young objects, so this phase scans the references from the survivor regions into the old generation (no ygc may run during this phase; a ygc has to wait until it ends, because a ygc would change the survivor regions)
  • Concurrent marking: scans the entire heap, finds the live objects and marks them (a ygc can interrupt this phase)
  • Remark: STW; because SATB took a snapshot of the live objects when marking began, the remark is fast
  • Exclusive cleanup: STW; computes the live objects and the reclaimable ratio of each region, sorts the regions, identifies the regions eligible for mixed collection, and updates the remembered sets
  • Concurrent cleanup: identifies and cleans up completely free regions
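
As a compact summary of the list above, the phases and whether each one stops the application can be encoded as an enum. This is only a restatement of the phases in code form; the type is hypothetical.

```java
// The phases of the concurrent marking cycle and whether each is stop-the-world (STW).
enum MarkingPhase {
    INITIAL_MARK(true),        // piggybacks on a young GC
    ROOT_REGION_SCAN(false),   // must finish before the next young GC
    CONCURRENT_MARK(false),    // may be interrupted by young GCs
    REMARK(true),
    CLEANUP(true),             // the "exclusive" cleanup
    CONCURRENT_CLEANUP(false);

    final boolean stopTheWorld;

    MarkingPhase(boolean stopTheWorld) {
        this.stopTheWorld = stopTheWorld;
    }

    // Counts the phases that pause application threads.
    static long stwPhaseCount() {
        long count = 0;
        for (MarkingPhase phase : values()) {
            if (phase.stopTheWorld) {
                count++;
            }
        }
        return count;
    }
}
```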

3. Mixed collection
Although the concurrent marking cycle reclaims some regions, the proportion of memory it reclaims is generally low.
A mixed collection not only performs a ygc but also collects the old regions that were marked as containing the most garbage.
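
The garbage-first idea behind mixed collection can be sketched as a greedy selection: sort candidate regions by reclaimable garbage and add them to the CSet until a pause budget is used up. The per-region cost prediction and all names here are assumptions for illustration; the real collector uses more elaborate heuristics.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical greedy CSet selection: most garbage first, within a pause budget.
final class CSetSketch {
    static final class Region {
        final int id;
        final long garbageBytes;  // reclaimable bytes found by marking
        final double predictedMs; // assumed cost of evacuating this region

        Region(int id, long garbageBytes, double predictedMs) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.predictedMs = predictedMs;
        }
    }

    static List<Region> choose(List<Region> candidates, double budgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingLong((Region r) -> r.garbageBytes).reversed());

        List<Region> cset = new ArrayList<>();
        double spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.predictedMs > budgetMs) {
                break; // the next region would exceed the pause budget
            }
            cset.add(r);
            spentMs += r.predictedMs;
        }
        return cset;
    }
}
```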

4. Full GC
A Full GC is triggered when memory runs out during mixed collection. Typical causes:
①. Concurrent mode failure
②. Large object allocation failure
③. Promotion failure

3. G1 GC example

Parameter settings
-Xms10M -Xmx10M -Xmn3m -XX:+PrintGCDetails -XX:+UseG1GC
 -XX:MetaspaceSize=10m -XX:MaxMetaspaceSize=10m
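import java.util.ArrayList;

public class TestCMSGC {
    // A 3 MB array; under the settings above it is allocated as a humongous object.
    private static byte[] mem = new byte[1024 * 1024 * 3];

    public static void main(String[] args) {
        ArrayList<byte[]> arrayList = new ArrayList<>();
        // Add references forever: the list's backing array keeps growing
        // until the 10 MB heap is exhausted, which forces GC activity.
        for (;;) {
            arrayList.add(mem);
        }
    }
}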

ygc log:

[GC pause (G1 Humongous Allocation) (young) (initial-mark), 0.0017438 secs]
   [Parallel Time: 1.5 ms, GC Workers: 6]
      [GC Worker Start (ms): Min: 104.7, Avg: 104.9, Max: 105.2, Diff: 0.4]
      [Ext Root Scanning (ms): Min: 0.4, Avg: 0.5, Max: 1.0, Diff: 0.6, Sum: 3.1]
      [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.0, Avg: 0.7, Max: 0.9, Diff: 0.9, Sum: 4.2]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
         [Termination Attempts: Min: 1, Avg: 45.2, Max: 101, Diff: 100, Sum: 271]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 1.0, Avg: 1.3, Max: 1.4, Diff: 0.4, Sum: 7.6]
      [GC Worker End (ms): Min: 106.2, Avg: 106.2, Max: 106.2, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 0.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.0 ms]
      [Humongous Register: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 3072.0K(3072.0K)->0.0B(2048.0K) Survivors: 0.0B->1024.0K Heap: 6000.1K(10.0M)->4152.0K(10.0M)]
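Parallel Time: 1.5 ms  Time taken by the parallel phase in which the GC worker threads run, 1.5 ms
GC Worker Start  Start times of the 6 GC worker threads
Ext Root Scanning  Time spent scanning the external roots
Update RS  Time spent updating the remembered sets. Each region has an RSet that records the references pointing into it (if an object in region A refers to region B, the RSet of region B records this), so only the RSet has to be scanned instead of the whole heap; this enables parallel collection and independent collection of regions.
Scan RS  Time spent scanning the RSets
Object Copy  Before a region is reclaimed, G1 moves its live objects to other regions, so objects have to be copied
Termination  Time the worker threads spend in the termination phase. Before a GC thread terminates, it checks whether there are still unprocessed objects; if there are, the thread that asked to terminate helps to finish them. Termination Attempts is the number of times each worker thread tried to terminate
GC Worker Other  Time the GC threads spent on other tasks
GC Worker Total  Total time of the GC worker threads
GC Worker End  End time of each GC thread
Other  Time spent on several other tasks
Choose CSet  Time spent selecting the regions of the CSet
Ref Proc  Time spent processing weak and soft references
Ref Enq  Time spent enqueuing weak and soft references
Redirty Cards  Time spent marking cards in the card table as dirty again
Humongous Register, Humongous Reclaim  Information about reclaiming humongous objects. During a young GC, short-lived humongous objects identified through the RSets are reclaimed; they are reclaimed in place without being moved (moving them would be very expensive and is unnecessary)

Free CSet  Time spent freeing the regions of the CSet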
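
To read such logs programmatically, the heap summary at the end of the pause entry can be parsed to see how much memory a collection freed. The sketch below assumes the line format shown in the log above (sizes suffixed with B, K, M or G); the class name and regular expression are written for this example only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the "Heap: before(capacity)->after(capacity)" part of a G1 log line.
final class HeapLineSketch {
    private static final Pattern HEAP =
            Pattern.compile("Heap: ([0-9.]+)([BKMG])\\([^)]*\\)->([0-9.]+)([BKMG])");

    // Converts a value with a unit suffix to kilobytes.
    static double toKb(double value, char unit) {
        switch (unit) {
            case 'B': return value / 1024;
            case 'K': return value;
            case 'M': return value * 1024;
            default:  return value * 1024 * 1024; // 'G'
        }
    }

    // Heap occupancy before the pause minus occupancy after it, in KB.
    static double reclaimedKb(String line) {
        Matcher m = HEAP.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("no Heap section in: " + line);
        }
        double before = toKb(Double.parseDouble(m.group(1)), m.group(2).charAt(0));
        double after = toKb(Double.parseDouble(m.group(3)), m.group(4).charAt(0));
        return before - after;
    }
}
```

Applied to the summary line of the ygc log above, this gives 6000.1K - 4152.0K, that is, about 1848 KB freed by the pause.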

Concurrent marking cycle log:

[GC pause (G1 Humongous Allocation) (young) (initial-mark), 0.0010437 secs]
   [Parallel Time: 0.8 ms, GC Workers: 6]
      [GC Worker Start (ms): Min: 77.5, Avg: 77.6, Max: 77.8, Diff: 0.3]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 0.3, Max: 0.7, Diff: 0.7, Sum: 1.9]
      [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
         [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Object Copy (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 2.1]
      [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
         [Termination Attempts: Min: 1, Avg: 2.5, Max: 5, Diff: 4, Sum: 15]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.2]
      [GC Worker Total (ms): Min: 0.5, Avg: 0.7, Max: 0.8, Diff: 0.3, Sum: 4.4]
      [GC Worker End (ms): Min: 78.3, Avg: 78.3, Max: 78.3, Diff: 0.0]
   [Code Root Fixup: 0.0 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.0 ms]
   [Other: 0.2 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 0.1 ms]
      [Ref Enq: 0.0 ms]
      [Redirty Cards: 0.0 ms]
      [Humongous Register: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.0 ms]
   [Eden: 2048.0K(3072.0K)->0.0B(2048.0K) Survivors: 0.0B->1024.0K Heap: 1621.2K(10.0M)->704.1K(10.0M)]
[Times: user=0.00 sys=0.00, real=0.00 secs]
As can be seen, the concurrent marking cycle starts after the ygc above completes, and is followed by the root region scan, concurrent marking, remark, exclusive cleanup and concurrent cleanup:
[GC concurrent-root-region-scan-start]
[GC concurrent-root-region-scan-end, 0.0003243 secs]
[GC concurrent-mark-start]
[GC concurrent-mark-end, 0.0000137 secs]
[GC remark [Finalize Marking, 0.0000911 secs] [GC ref-proc, 0.0000384 secs] [Unloading, 0.0005454 secs], 0.0007483 secs]
[Times: user=0.00 sys=0.00, real=0.00 secs]
[GC cleanup 6904K->6904K(10M), 0.0001752 secs]
[Times: user=0.00 sys=0.00, real=0.00 secs]
[GC concurrent-cleanup-start]
[GC concurrent-cleanup-end, 0.0002721]

For more G1 GC log examples, see:
https://www.cnblogs.com/javaadu/p/11220234.html

4. Parameter setting

-XX:+UseG1GC

Use G1 collector
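
To check from inside a running program which collector the JVM actually selected, the standard java.lang.management API can list the registered collectors. This is a small sketch; on HotSpot with -XX:+UseG1GC the reported names typically contain "G1", but the exact names depend on the JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the names of the garbage collectors registered in the current JVM.
final class CollectorNames {
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(names());
    }
}
```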

-XX:MaxGCPauseMillis=200

Specifies the target pause time; the default value is 200 milliseconds.

When setting -XX:MaxGCPauseMillis, do not specify the average pause time; specify the time within which 90% of the pauses should fall. Remember that the pause time goal is only a goal, and it will not always be met.
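
One way to check a candidate value against this 90% guideline is to compute the 90th percentile of pause times collected from GC logs. The sketch below uses the nearest-rank percentile definition, which is a choice made for this example; the names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical check of measured pauses against a pause-time goal.
final class PauseGoalSketch {
    // Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
    static double percentile(double[] pausesMs, double p) {
        if (pausesMs.length == 0) {
            throw new IllegalArgumentException("no pause samples");
        }
        double[] sorted = pausesMs.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    // True when 90% of the observed pauses are within the goal.
    static boolean meetsGoal(double[] pausesMs, double goalMs) {
        return percentile(pausesMs, 90) <= goalMs;
    }
}
```

With ten samples of which one outlier takes 500 ms and the rest take at most 90 ms, a 200 ms goal counts as met even though the maximum pause exceeds it.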

-XX:InitiatingHeapOccupancyPercent=45

When the occupancy of the entire heap reaches this percentage, a concurrent marking cycle is triggered; the default is 45%.

To reduce promotion failures, you can usually lower this value so that the concurrent cycle starts earlier.
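
The threshold test itself is simple arithmetic, sketched below. Exactly which occupancy the JVM measures and whether the comparison is inclusive are assumptions of this example; the names are hypothetical.

```java
// Hypothetical check for the InitiatingHeapOccupancyPercent threshold.
final class IhopSketch {
    // True when the occupied part of the heap has reached ihopPercent of its capacity.
    static boolean shouldStartMarking(long occupiedBytes, long heapBytes, int ihopPercent) {
        return occupiedBytes * 100 >= heapBytes * (long) ihopPercent;
    }
}
```

With a 10 MB heap and the default of 45, marking would start once about 4.5 MB is occupied; lowering the value to 30 moves that point to 3 MB.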

-XX:NewRatio=n

Ratio of old generation to young generation. The default value is 2, which means the young generation takes 1/3 of the heap and the old generation 2/3.

Do not set the young generation to a fixed size, otherwise:

  • G1 no longer respects the pause time goal, and
  • G1 can no longer expand or shrink the young generation as needed

-XX:SurvivorRatio=n

Ratio of eden to survivor space. The default value is 8, the same as in other generational collectors.
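
The two ratios translate into sizes as in the classic generational layout, sketched below: the young generation is heap / (NewRatio + 1), and it is split into one eden and two survivor spaces in the proportion SurvivorRatio : 1 : 1. G1 sizes its generations dynamically in whole regions, so treat this only as an illustration of what the ratios mean; the names are hypothetical.

```java
// Hypothetical size calculation for NewRatio and SurvivorRatio.
final class RatioSketch {
    // NewRatio = old / young, so young = heap / (NewRatio + 1).
    static long youngSize(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    // SurvivorRatio = eden / one survivor space; there are two survivor spaces.
    static long survivorSize(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    static long edenSize(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorSize(youngBytes, survivorRatio);
    }
}
```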

-XX:MaxTenuringThreshold=n

The age threshold for promotion from the young generation to the old generation, the same as in other generational collectors.

-XX:ParallelGCThreads=n

Number of garbage collection threads in parallel collection

-XX:ConcGCThreads=n

The number of garbage collection threads in the concurrent marking phase

Increasing this value makes concurrent marking finish faster. If this value is not specified, the JVM calculates it with the following formula:

ConcGCThreads = (ParallelGCThreads + 2) / 4
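
The formula uses integer division, which the following sketch makes explicit. The lower bound of one thread is an assumption added here so the result is never zero; the names are hypothetical.

```java
// Hypothetical calculation of the default ConcGCThreads value.
final class ConcThreadsSketch {
    static int defaultConcGcThreads(int parallelGcThreads) {
        // Integer division; the floor of 1 is an assumption of this sketch.
        return Math.max(1, (parallelGcThreads + 2) / 4);
    }
}
```

For example, 8 parallel threads give (8 + 2) / 4 = 2 concurrent marking threads.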

-XX:G1ReservePercent=n

The percentage of heap memory kept in reserve to reduce the risk of promotion failure. The default is 10, that is, 10% of the heap is reserved by default.

-XX:G1HeapRegionSize=n

The size of each region. The default value is calculated from the heap size and ranges from 1 MB to 32 MB. Usually only the overall heap size is specified and the region size is left to the JVM.
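
How a region size could be derived from the heap size can be sketched as follows, assuming the goal of about 2048 regions mentioned earlier, a power-of-two region size, and the 1 MB to 32 MB bounds. This is a simplified heuristic for illustration, not the JVM's exact code; the names are hypothetical.

```java
// Hypothetical region-size heuristic: aim for ~2048 regions, power of two, 1 MB..32 MB.
final class RegionSizeSketch {
    static final long MB = 1024L * 1024;

    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / 2048, MB); // at least 1 MB
        size = Long.highestOneBit(size);            // round down to a power of two
        return Math.min(size, 32 * MB);             // at most 32 MB
    }
}
```

For the 10 MB heap used in the example above this yields 1 MB regions, which is consistent with the 3072K eden (three regions) in the log.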

-XX:G1MixedGCCountTarget=n

The maximum number of mixed GCs executed after a global concurrent marking. The default value is 8.


Origin blog.csdn.net/u010857795/article/details/112972045