G1 GC technical analysis

Introduction

    G1 GC, short for Garbage-First Garbage Collector, is enabled with the -XX:+UseG1GC parameter. G1 divides the heap into regions and collects them region by region. A region can belong to either the young generation or the old generation, the regions of one generation do not need to be contiguous, and the number of regions in each generation is adjusted dynamically. The old generation is split into regions because some old regions hold a lot of garbage and others hold little, so a collection can concentrate on the regions with the most garbage, which is where the name Garbage-First comes from. This selective approach does not apply to the young generation, which is always collected in full with a copying algorithm; the young generation is still split into regions mainly because that makes it easy to resize the generation.
    G1 GC was designed to replace CMS. Compared with CMS, G1 has the following advantages:
1. A predictable pause model
2. No CMS-style memory fragmentation, because live objects are compacted as they are copied
3. Better performance on very large heaps

G1 Key Concepts

Region

    A Region in G1 is different from the fixed generation spaces of traditional collectors. G1 divides the heap into equally sized Regions (by default the JVM aims for roughly 2048 of them), and all subsequent garbage collection works in units of Regions. The Region is the basis of the G1 algorithm. The Region size can be set with the -XX:G1HeapRegionSize parameter. As shown in the figure below, E represents an Eden Region, S a Survivor Region, O an Old Region, and H a Humongous Region, which holds a giant object (an object that takes up at least half of a Region). The figure shows that the Regions of one generation are not contiguous in the heap. A Region that is Eden at one moment may belong to the old generation at another. When G1 collects garbage, it copies the live objects of one Region into another Region.
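
To make the sizing concrete, here is a minimal sketch of how a Region size could be derived from the heap size. It is a simplification, not HotSpot's actual code: the class and method names are hypothetical, the target Region count is passed in as a parameter (2048 in the example is an assumption), and the 1 MB and 32 MB bounds are the ones listed for -XX:G1HeapRegionSize.

```java
/** Illustrative sketch only; HotSpot's real sizing logic differs in detail between JDK versions. */
final class RegionSizing {
    static final long MIN_REGION_BYTES = 1L << 20;  // 1 MB lower bound
    static final long MAX_REGION_BYTES = 32L << 20; // 32 MB upper bound

    /** Picks a power-of-two Region size so that the heap splits into about targetRegionCount Regions. */
    static long regionSize(long heapBytes, long targetRegionCount) {
        long raw = Math.max(heapBytes / targetRegionCount, MIN_REGION_BYTES);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of two
        return Math.min(powerOfTwo, MAX_REGION_BYTES);
    }

    /** Treats an object as humongous when it needs at least half a Region (simplified rule). */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    public static void main(String[] args) {
        long heap = 4L << 30; // a 4 GB heap
        long region = regionSize(heap, 2048); // 2048 is an assumed target count
        System.out.println("region size (MB): " + (region >> 20));
        System.out.println("region count: " + heap / region);
        System.out.println("600 KB object in a 1 MB Region is humongous: "
                + isHumongous(600L << 10, 1L << 20));
    }
}
```

Setting -XX:G1HeapRegionSize explicitly bypasses this kind of calculation, which is mainly useful when many objects land just above the humongous threshold.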

SATB

The full name of SATB is Snapshot-At-The-Beginning. SATB is the mechanism that keeps concurrent marking correct, and it is the foundation of G1's concurrency. It can be understood as taking a logical snapshot of the object graph at the moment marking starts: every object that is live in that snapshot is treated as live for the whole cycle. Objects allocated while the GC is running are also treated as live, and everything else that is unreachable is treated as garbage.
How does G1 find the objects allocated during the GC? Each Region records two top-at-mark-start (TAMS) pointers, prevTAMS and nextTAMS. Objects above TAMS are newly allocated and are therefore considered implicitly marked. In this way G1 identifies the objects allocated during the GC and treats them as live.
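
The TAMS idea can be sketched with a bump-pointer Region. This is a toy model under stated assumptions: addresses are plain long values, only the nextTAMS pointer is modeled, and the class name TamsRegionSketch is hypothetical.

```java
/** Toy model of one Region with a bump pointer and a top-at-mark-start pointer. */
final class TamsRegionSketch {
    private long top;      // next free address in the Region (bump pointer)
    private long nextTams; // value of top when the current marking cycle started

    TamsRegionSketch(long bottom) {
        this.top = bottom;
        this.nextTams = bottom;
    }

    /** Allocates sizeBytes by bumping top and returns the object's address. */
    long allocate(long sizeBytes) {
        long address = top;
        top += sizeBytes;
        return address;
    }

    /** Called when a marking cycle begins: everything allocated later lies above nextTams. */
    void startMarking() {
        nextTams = top;
    }

    /** During marking, objects at or above nextTams count as live without being traced. */
    boolean isImplicitlyLive(long address) {
        return address >= nextTams && address < top;
    }

    public static void main(String[] args) {
        TamsRegionSketch region = new TamsRegionSketch(4096);
        long before = region.allocate(64);
        region.startMarking();
        long during = region.allocate(32);
        System.out.println("allocated before marking, implicitly live: " + region.isImplicitlyLive(before));
        System.out.println("allocated during marking, implicitly live: " + region.isImplicitlyLive(during));
    }
}
```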
That solves object allocation during GC, but what about references that change during GC? G1's answer is the write barrier. A write barrier intercepts every assignment to a reference field, much like an aspect wrapped around the assignment, so the collector knows which references have changed.
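
A SATB-style write barrier can be sketched in plain Java as a method that wraps the field assignment. In the JVM the barrier is emitted by the JIT compiler and interpreter, not written by hand, so the class below (with the hypothetical names SatbBarrierSketch, Node and writeNext) only illustrates the logic: while marking is active, the value about to be overwritten is recorded so the marker can still visit it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustration of a snapshot-at-the-beginning pre-write barrier. */
final class SatbBarrierSketch {
    static final class Node {
        final String name;
        Node next;
        Node(String name) { this.name = name; }
    }

    boolean markingActive;                            // true while concurrent marking runs
    final Deque<Node> satbQueue = new ArrayDeque<>(); // old values the marker must still visit

    /** Models the assignment holder.next = newValue with the barrier applied first. */
    void writeNext(Node holder, Node newValue) {
        Node oldValue = holder.next;
        if (markingActive && oldValue != null) {
            satbQueue.add(oldValue); // the snapshot-time referent stays visible to the marker
        }
        holder.next = newValue;
    }

    public static void main(String[] args) {
        SatbBarrierSketch heap = new SatbBarrierSketch();
        Node a = new Node("a");
        Node b = new Node("b");
        heap.writeNext(a, b);     // marking inactive: nothing is recorded
        heap.markingActive = true;
        heap.writeNext(a, null);  // marking active: b is recorded before the link is cut
        System.out.println("recorded old value: " + heap.satbQueue.peek().name);
    }
}
```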

RSet

The full name of RSet is Remembered Set. Each Region has its own RSet, which records references from objects in other Regions to objects in this Region (who refers to my objects). G1 has another data structure, the Collection Set (CSet), which is the set of Regions to be collected in a GC. Regions in the CSet can belong to any generation. During a GC, old->young and old->old cross-Region references are found by scanning only the RSets of the Regions in the CSet, rather than the whole heap.
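
The relationship between RSets and the CSet can be sketched as follows. This is a deliberately coarse model: the real RSet tracks cards inside Regions, while this hypothetical RSetSketch class only remembers the ids of referencing Regions.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

/** Coarse model: each Region remembers which other Regions point into it. */
final class RSetSketch {
    static final class Region {
        final int id;
        final Set<Integer> rememberedSet = new HashSet<>(); // ids of Regions referencing this one
        Region(int id) { this.id = id; }
    }

    /** Bookkeeping done when a reference from one Region to another is written. */
    static void recordReference(Region from, Region to) {
        if (from != to) {
            to.rememberedSet.add(from.id);
        }
    }

    /** Regions outside the CSet that must be scanned for references into the CSet. */
    static Set<Integer> regionsToScan(Collection<Region> cset) {
        Set<Integer> csetIds = new HashSet<>();
        for (Region region : cset) {
            csetIds.add(region.id);
        }
        Set<Integer> result = new TreeSet<>();
        for (Region region : cset) {
            for (int sourceId : region.rememberedSet) {
                if (!csetIds.contains(sourceId)) { // Regions inside the CSet are collected anyway
                    result.add(sourceId);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Region r0 = new Region(0), r1 = new Region(1), r2 = new Region(2), r3 = new Region(3);
        recordReference(r0, r2);
        recordReference(r1, r2);
        recordReference(r3, r2);
        recordReference(r2, r3);
        System.out.println("scan these Regions as roots: " + regionsToScan(Arrays.asList(r2, r3)));
    }
}
```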

Pause prediction model

One of the highlights of the G1 collector is its pause prediction model, which sizes the CSet according to the user-configured pause time in order to meet the pause the user expects. The target is set with the -XX:MaxGCPauseMillis parameter, which is somewhat similar to the same setting in the ParallelScavenge collector. A shorter pause target is not always better. The shorter the target, the smaller the CSet collected each time, so garbage gradually accumulates and G1 eventually has to fall back to a Full GC (Serial GC); if the target is too long, every collection causes a long pause that hurts the program's response time.
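
The trade-off can be illustrated with a greedy selection that fills a pause budget with the most profitable Regions first. This is a sketch under assumptions, not G1's real policy: the per-Region cost predictions are given as inputs, the young Regions that G1 always collects are left out, and the names CSetChooserSketch, Candidate and choose are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Greedy "garbage first" selection under a pause-time budget. */
final class CSetChooserSketch {
    static final class Candidate {
        final int regionId;
        final long reclaimableBytes;  // garbage expected to be freed
        final double predictedCopyMs; // predicted cost of evacuating the Region

        Candidate(int regionId, long reclaimableBytes, double predictedCopyMs) {
            this.regionId = regionId;
            this.reclaimableBytes = reclaimableBytes;
            this.predictedCopyMs = predictedCopyMs;
        }

        double efficiency() {
            return reclaimableBytes / predictedCopyMs; // bytes reclaimed per millisecond of pause
        }
    }

    /** Returns the ids of the chosen Regions, most efficient first. */
    static List<Integer> choose(List<Candidate> candidates, double pauseBudgetMs) {
        List<Candidate> sorted = new ArrayList<>(candidates);
        Comparator<Candidate> byEfficiency = Comparator.comparingDouble(Candidate::efficiency);
        sorted.sort(byEfficiency.reversed());

        List<Integer> cset = new ArrayList<>();
        double usedMs = 0.0;
        for (Candidate candidate : sorted) {
            if (usedMs + candidate.predictedCopyMs > pauseBudgetMs) {
                break; // the next Region would exceed the pause target
            }
            cset.add(candidate.regionId);
            usedMs += candidate.predictedCopyMs;
        }
        return cset;
    }

    public static void main(String[] args) {
        List<Candidate> candidates = new ArrayList<>();
        candidates.add(new Candidate(1, 900_000, 10.0));
        candidates.add(new Candidate(2, 200_000, 20.0));
        candidates.add(new Candidate(3, 800_000, 16.0));
        System.out.println("budget 30 ms: " + choose(candidates, 30.0));
        System.out.println("budget 5 ms: " + choose(candidates, 5.0));
    }
}
```

With a very small budget the chosen set is empty, which mirrors the point above: a target that is too short leaves garbage behind on every collection.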

G1 recycling process

G1 garbage collection is divided into two phases:
1. Global concurrent marking
2. Copying of surviving objects (evacuation)

Global concurrent marking phase

    The global concurrent marking phase is based on SATB. It is similar to marking in CMS, but there are differences. The main stages are as follows:
Initial mark: This stage is STW. It scans the root set and pushes all objects directly reachable from the roots onto the scan stack for later processing. In G1 the initial mark piggybacks on a Young GC pause, so no additional pause is required. The Young GC pause becomes slightly longer, but overall GC efficiency still improves.
Concurrent mark: This stage does not need STW. Objects are repeatedly popped from the scan stack and scanned, and the objects referenced by their fields are pushed onto the stack, until the stack is empty, that is, until every object reachable from the GC roots has been traced. This stage also scans the references recorded by the SATB write barrier.
Final mark: Also called Remark; this stage is STW. It processes the remaining references recorded by the write barrier during concurrent marking and also processes weak references. This is where G1 differs most from CMS: the CMS remark rescans the entire root set, with Eden treated as part of it, so it can take a long time.
Cleanup: This stage is STW. It counts the live data per Region and resets the marking state. It is a bit like the sweep stage in mark-sweep, but it does not actually reclaim objects; it only ranks the candidate Regions from which the CSet will be chosen according to the pause model, and leaves the reclaiming to the evacuation phase.
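
The marking stages above can be sketched as a single-threaded traversal with an explicit scan stack. Real G1 marking runs concurrently with the application on multiple threads; the hypothetical MarkingSketch class below only shows the order of work: roots first, then the transitive closure, then the objects recorded by the SATB barrier.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Single-threaded illustration of marking with a scan stack. */
final class MarkingSketch {
    static final class Obj {
        final String name;
        final List<Obj> fields = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    static Set<Obj> mark(Collection<Obj> roots, Collection<Obj> satbRecorded) {
        Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Deque<Obj> scanStack = new ArrayDeque<>();

        // Initial mark: push the objects directly reachable from the roots.
        for (Obj root : roots) {
            if (marked.add(root)) scanStack.push(root);
        }
        // Concurrent mark: trace until the scan stack is empty.
        drain(scanStack, marked);
        // Final mark: process what the write barrier recorded, then trace again.
        for (Obj recorded : satbRecorded) {
            if (marked.add(recorded)) scanStack.push(recorded);
        }
        drain(scanStack, marked);
        return marked;
    }

    private static void drain(Deque<Obj> scanStack, Set<Obj> marked) {
        while (!scanStack.isEmpty()) {
            Obj current = scanStack.pop();
            for (Obj field : current.fields) {
                if (marked.add(field)) scanStack.push(field);
            }
        }
    }

    public static void main(String[] args) {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        Obj unreachable = new Obj("d"), e = new Obj("e");
        a.fields.add(b);
        b.fields.add(c);
        Set<Obj> live = mark(Arrays.asList(a), Arrays.asList(e));
        System.out.println("live objects: " + live.size());
        System.out.println("d is live: " + live.contains(unreachable));
    }
}
```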

Copy live objects phase

    The Evacuation phase is fully STW. It copies the live objects in some Regions into other Regions, which both collects the garbage and compacts the heap. Evacuation picks any number of the Regions selected in the first phase as its collection targets. The Regions to be collected form the CSet, and references into them are found through their RSets.
After the CSet has been chosen, G1 copies the surviving objects of these Regions into other Regions in parallel, much like the copying in ParallelScavenge, and the application is paused for the whole process. The pause time is controlled by choosing how many Regions go into the CSet.

Collection modes of G1:

YoungGC: collects all Regions in the young generation.
MixedGC: collects all Regions in the young generation plus the high-yield old Regions selected by the global concurrent marking phase.
Both YoungGC and MixedGC consist only of the copy (evacuation) phase.

In generational G1 there are two sub-modes for choosing the CSet, corresponding to YoungGC and MixedGC:
YoungGC: the CSet is all Regions in the young generation.
MixedGC: the CSet is all Regions in the young generation plus the high-yield Regions identified in the global concurrent marking phase.

    G1 runs like this: it keeps switching between Young GC and Mixed GC, periodically runs a global concurrent marking cycle, and falls back to Full GC (Serial GC) when collection really cannot keep up. The initial mark is performed during a Young GC. Mixed GC is not performed while global concurrent marking is in progress, and an initial mark is not started while Mixed GCs are in progress. When Mixed GC cannot keep up with the rate at which objects are created, G1 degenerates into Full GC, and this is the point where tuning matters most.

G1 Best Practices

    Following these practices when using the G1 garbage collector avoids many detours:

Continuously tune the pause time target

    The -XX:MaxGCPauseMillis=x parameter sets the target pause time for the application. While running, G1 chooses the CSet according to this parameter in order to meet the target. Under normal circumstances a value of 100 ms or 200 ms is reasonable (it varies by workload), while 50 ms is usually too aggressive. If the pause target is too short, G1 cannot keep up with the rate at which garbage is produced and eventually degenerates into Full GC. Tuning this parameter is therefore a continuous process of gradually adjusting toward the best value.
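
One way to watch the effect of each adjustment from inside the application is the standard java.lang.management API. The sketch below prints the collection count and accumulated collection time per collector; the class name GcPauseProbe is hypothetical, and the collector names reported (for example "G1 Young Generation") depend on the JVM and the collector in use. The average is only a rough indicator, since the reported time is an approximate accumulated value.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Prints per-collector counts and times as a rough check against the pause target. */
final class GcPauseProbe {
    static String summarize() {
        StringBuilder report = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();  // -1 if the JVM does not report it
            long totalMs = gc.getCollectionTime(); // approximate accumulated time in ms
            double averageMs = count > 0 ? (double) totalMs / count : 0.0;
            report.append(String.format("%s: collections=%d, totalMs=%d, avgMs=%.2f%n",
                    gc.getName(), count, totalMs, averageMs));
        }
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.print(summarize());
    }
}
```

A growing count for the old-generation collector bean is a hint that Full GCs are happening and that the pause target may be too aggressive.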

Do not set the size of the young and old generation

    The G1 collector resizes the young and old generations while it runs, changing the generation sizes to adjust the rate and age at which objects are promoted so that it can meet the pause time target we set. Setting the young generation size explicitly is equivalent to giving up the automatic tuning that G1 does for us. All we need to set is the size of the whole heap; G1 decides the size of each generation.

Watch for Evacuation Failure

Evacuation Failure is similar to promotion failure in CMS. When the heap is so full that no free Regions are left to copy surviving objects into, G1 has to degenerate into Full GC and collect the whole heap.

G1 common parameters

Parameter (default value): meaning

-XX:+UseG1GC: Use the G1 garbage collector.
-XX:MaxGCPauseMillis=200: Sets the target for the maximum GC pause time (the JVM tries to meet it, but it is not guaranteed).
-XX:InitiatingHeapOccupancyPercent=45: The heap occupancy percentage at which a concurrent GC cycle starts. Collectors such as G1 use it to trigger concurrent GC cycles based on the occupancy of the entire heap, not just one generation. A value of 0 means "always run GC cycles". The default value is 45.
-XX:NewRatio=n: The ratio of old generation size to young generation size. The default value is 2.
-XX:SurvivorRatio=n: The ratio of Eden size to Survivor space size. The default value is 8.
-XX:MaxTenuringThreshold=n: The maximum tenuring threshold, that is, the maximum number of collections an object can survive before it is promoted to the old generation. The default value is 15.
-XX:ParallelGCThreads=n: Sets the number of threads the garbage collector uses in its parallel phases. The default value varies with the platform the JVM is running on.
-XX:ConcGCThreads=n: The number of threads used for concurrent garbage collection work. The default value varies with the platform the JVM is running on.
-XX:G1ReservePercent=n: The percentage of the heap kept free as a reserve to reduce the chance of promotion failure. The default value is 10.
-XX:G1HeapRegionSize=n: With G1 the Java heap is divided into Regions of uniform size, and this parameter specifies the size of each Region. The default value is calculated from the heap size. The minimum value is 1 MB and the maximum value is 32 MB.
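
To check which values are actually in effect, the flags can be read at runtime on a HotSpot JVM through com.sun.management.HotSpotDiagnosticMXBean. The sketch below assumes a HotSpot-based JVM; the class name G1FlagDump is hypothetical, and a flag that the running JVM does not know is reported as unavailable instead of failing.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

/** Prints the effective values of some G1-related flags on a HotSpot JVM. */
final class G1FlagDump {
    static final String[] FLAGS = {
        "UseG1GC", "MaxGCPauseMillis", "InitiatingHeapOccupancyPercent",
        "G1HeapRegionSize", "G1ReservePercent", "ParallelGCThreads", "ConcGCThreads"
    };

    /** Returns the current value of a VM flag, or a placeholder if it cannot be read. */
    static String valueOf(String flag) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "(not available on this JVM)";
            }
            return bean.getVMOption(flag).getValue();
        } catch (IllegalArgumentException e) {
            return "(not available on this JVM)"; // unknown flag or unsupported bean
        }
    }

    public static void main(String[] args) {
        for (String flag : FLAGS) {
            System.out.println("-XX:" + flag + " = " + valueOf(flag));
        }
    }
}
```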

G1 log analysis

// Young generation GC 
2018-05-03T10:21:43.209-0800: [GC pause (G1 Humongous Allocation) (young) (initial-mark), 0.0035356 secs]   // Young GC that also performs the initial mark; it takes 0.0035 seconds 
   [Parallel Time: 2.4 ms, GC Workers: 8]   // 8 threads in parallel, it takes 2.4ms 
      [GC Worker Start (ms): Min: 813.1, Avg: 813.7, Max: 813.9, Diff: 0.7 ]
      [Ext Root Scanning (ms): Min: 0.0, Avg: 1.1, Max: 1.5, Diff: 1.5, Sum: 9.1]    // Time each thread spends scanning external roots 
      [Update RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]    // Time spent updating RSets. Each Region in G1 has its own RSet, which records references from other Regions into this Region. During collection the RSet is used as part of the root set to speed things up 
         [Processed Buffers: Min: 0, Avg: 0.0, Max: 0, Diff: 0, Sum: 0]   // Processed Buffers are the buffers that record reference changes 
      [Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]    // Time spent scanning RSets 
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]   // Time spent scanning code roots 
      [Object Copy (ms): Min: 0.0, Avg: 0.5, Max: 1.3, Diff: 1.3, Sum: 3.6] // Time spent copying live objects
      [Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.2, Diff: 0.2, Sum: 1.2]   
         [Termination Attempts: Min: 1, Avg: 1.8, Max: 4, Diff: 3, Sum: 14]
      [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
      [GC Worker Total (ms): Min: 1.6, Avg: 1.8, Max: 2.3, Diff: 0.8, Sum: 14.1]    // GC thread time 
      [GC Worker End (ms): Min: 815.4, Avg: 815.4, Max: 815.4, Diff: 0.0 ]
   [Code Root Fixup: 0.0 ms]
   [Code Root Purge: 0.0 ms]
   [Clear CT: 0.1 ms]    // Time spent clearing the card table (CT); RSets rely on the card table to track references into a Region 
   [Other: 1.1 ms]
      [Choose CSet: 0.0 ms]    // Select CSet 
      [Ref Proc: 0.9 ms]   // Time spent processing soft and weak references 
      [Ref Enq: 0.0 ms]    // Time spent enqueueing the processed references onto their reference queues 
      [Redirty Cards: 0.1 ms]
      [Humongous Register: 0.0 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 0.0 ms]    // Time to release the reclaimed area (including their RS) 
   [Eden: 5120.0K(24.0M)->0.0B(12.0M) Survivors: 0.0B->2048.0K Heap: 16.0M(50.0M)->12.4M(50.0M)]
 [Times: user=0.01 sys=0.00, real=0.01 secs] 
// Root region scan 
2018-05-03T10:21:43.213-0800: [GC concurrent-root-region-scan-start]
2018-05-03T10:21:43.214-0800: [GC concurrent-root-region-scan-end, 0.0012422 secs]
// Concurrent mark 
2018-05-03T10:21:43.214-0800: [GC concurrent-mark-start]
2018-05-03T10:21:43.214-0800: [GC concurrent-mark-end, 0.0004063 secs]
// Remark, also known as final mark 
2018-05-03T10:21:43.214-0800: [GC remark 2018-05-03T10:21:43.215-0800: [Finalize Marking, 0.0003736 secs] 2018-05-03T10:21:43.215-0800: [GC ref-proc, 0.0000533 secs] 2018-05-03T10:21:43.215-0800: [Unloading, 0.0007439 secs] 0.0013442 secs]
 [Times: user=0.00 sys=0.00, real=0.00 secs] 
// Cleanup (stop-the-world) 
2018-05-03T10:21:43.216-0800: [GC cleanup 13M->13M(50M), 0.0004002 secs]
 [Times: user=0.01 sys=0.00, real=0.00 secs]


    This is a complete GC log. Overall, young generation GCs may be interleaved with the concurrent marking cycle and with mixed collections. The concurrent marking cycle mainly serves to reclaim old generation space, and it also includes a young generation GC (the one that carries the initial mark).
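
When a log like this has to be checked against the pause target, the pause lines can be extracted with a small parser. The sketch below assumes the log format shown in the sample above (newer JDKs with unified logging print a different format), and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Extracts cause and duration from a "GC pause" line in the format shown above. */
final class G1PauseLine {
    private static final Pattern PAUSE = Pattern.compile(
            "\\[GC pause \\((.+?)\\) \\((young|mixed)\\)([^,]*), ([0-9.]+) secs\\]");

    /** Returns the pause in milliseconds, or -1.0 if the line is not a pause line. */
    static double pauseMillis(String line) {
        Matcher m = PAUSE.matcher(line);
        return m.find() ? Double.parseDouble(m.group(4)) * 1000.0 : -1.0;
    }

    /** Returns the GC cause, for example "G1 Humongous Allocation", or null if absent. */
    static String cause(String line) {
        Matcher m = PAUSE.matcher(line);
        return m.find() ? m.group(1) : null;
    }

    /** True when the pause also carries the initial mark of a concurrent cycle. */
    static boolean isInitialMark(String line) {
        Matcher m = PAUSE.matcher(line);
        return m.find() && m.group(3).contains("initial-mark");
    }

    public static void main(String[] args) {
        String line = "2018-05-03T10:21:43.209-0800: [GC pause (G1 Humongous Allocation) "
                + "(young) (initial-mark), 0.0035356 secs]";
        System.out.printf("cause=%s, pauseMs=%.4f, initialMark=%b%n",
                cause(line), pauseMillis(line), isInitialMark(line));
    }
}
```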
