JVM - G1 garbage collector in-depth analysis

1. Overview of the G1 collector

The HotSpot team has long worked toward more efficient collection and shorter stop-the-world (STW) pauses, producing a series of garbage collectors along the way: the Serial collector, the Parallel collector, the concurrent CMS collector, and now G1.

The G1 (Garbage First) collector is designed to keep pause times low, and it is also well suited to collecting large heaps.

1.1. The biggest feature of the G1 collector

  • The defining feature of G1 is that it partitions the heap into regions, which weakens the traditional notion of fixed, contiguous generations.

  • It budgets the work done in each collection cycle, which addresses many shortcomings of earlier collectors, including CMS.

1.2. Improvement of G1 compared to CMS

  • Algorithm: G1 is based on a mark-compact approach (live objects are copied between regions), so it does not fragment the heap and is not forced into an early Full GC because no contiguous space can be found for a large object.

  • Controllable pause time: G1 lets you set an expected pause time (a pause-time goal) and sizes each collection to fit it, which helps avoid cascading failures (an application "avalanche") caused by long pauses.

  • Parallelism and concurrency: G1 makes full use of multiple CPU cores to shorten stop-the-world pauses.
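
The pause-time goal is usually set on the command line with -XX:MaxGCPauseMillis (for example, java -XX:+UseG1GC -XX:MaxGCPauseMillis=200 ...). As a minimal sketch, the following program reads the values the running JVM actually uses. It assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean; the class name PauseGoalCheck and the helper flag() are made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class PauseGoalCheck {
    /** Returns the current value of a HotSpot flag, or "unknown" if it cannot be read. */
    static String flag(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Flag does not exist, or this is not a HotSpot JVM (assumption: treat both the same).
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseG1GC          = " + flag("UseG1GC"));
        System.out.println("MaxGCPauseMillis = " + flag("MaxGCPauseMillis"));
    }
}
```

The pause-time value is a goal, not a guarantee: G1 tries to size each collection to meet it but can still overshoot.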

1.3 The difference between CMS and G1

  • In CMS, the heap is divided into YoungGen and OldGen (plus PermGen before JDK 1.8), and YoungGen is further divided into Eden and two survivor areas. In G1, the heap is divided evenly into regions. The young and old generations still exist as concepts, but each is a set of regions, and the collector works in units of whole regions.

  • G1 compacts as it reclaims: copying the live objects out of a region frees the whole region at once. CMS does not compact by default and only does so during a stop-the-world (STW) Full GC.

  • G1 collects both the young and the old generation, whereas CMS only collects the old generation (the O area) and has to be paired with a separate young-generation collector.

1.4 Application Scenarios of G1 Collector

The G1 collector is mainly aimed at services running on multi-CPU machines with large amounts of memory. It tries to meet the pause-time goal as closely as possible while still delivering high throughput.
On JDK 8 and earlier, G1 is not the default collector and has to be enabled explicitly (it became the default in JDK 9). G1 may be the better choice in the following scenarios:

  • Server-side applications running on multi-core CPUs with a large JVM heap (at least 4 GB)

  • Applications that fragment memory heavily while running and therefore need frequent compaction

  • Applications that need more controllable, predictable GC pauses to avoid cascading failures under high concurrency

2. G1's heap memory algorithm

2.1, JVM memory model before G1

  • Young generation: Eden space + 2 survivor areas

  • Old generation

  • Permanent generation (PermGen): before JDK 1.8

  • Metaspace: replaces the permanent generation from JDK 1.8 onward

2.2, the memory model of the G1 collector

2.2.1, G1 heap memory structure

The heap is divided into many fixed-size regions (Regions), each of which is a contiguous range of virtual memory.
The region size can be set with the -XX:G1HeapRegionSize parameter. It must be a power of 2, from a minimum of 1 MB to a maximum of 32 MB.

By default, G1 picks the region size so that the heap is divided into roughly 2048 regions.
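
To make the sizing rule concrete, here is a minimal sketch that derives a region size from the heap size using only the constraints stated above: aim for about 2048 regions, round down to a power of 2, and stay within 1 MB to 32 MB. It is a simplification, not HotSpot's actual ergonomics code, and the class and method names are hypothetical.

```java
class RegionSizing {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;    // smallest allowed region
    static final long MAX_REGION = 32 * MB;   // largest allowed region
    static final long TARGET_REGIONS = 2048;  // default target region count

    /** Simplified estimate of the region size G1 would choose for a given heap size. */
    static long regionSize(long heapBytes) {
        long raw = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        long powerOfTwo = Long.highestOneBit(raw); // round down to a power of 2
        return Math.min(powerOfTwo, MAX_REGION);
    }

    public static void main(String[] args) {
        long[] heapsInMb = { 512, 4096, 6144, 131072 };
        for (long heapMb : heapsInMb) {
            System.out.println(heapMb + " MB heap -> "
                + regionSize(heapMb * MB) / MB + " MB regions");
        }
    }
}
```

Note that for very large heaps the 32 MB cap wins, so the heap ends up with more than 2048 regions.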

2.2.2, G1 heap memory allocation

Each region is marked E, S, O, or H. The first three map logically to Eden, Survivor, and the old generation.
Live objects are evacuated (that is, copied or moved) from one region to another. Regions are designed to be collected in parallel, and the collection may suspend all application threads.
Regions can therefore be assigned to Eden, Survivor, or the old generation. There is also a fourth type, the Humongous region (H), which stores huge objects: objects larger than 50% of the standard region size. If a huge object does not fit in a single H region, G1 looks for a run of contiguous regions to hold it. To find enough contiguous regions, G1 sometimes has to fall back to a Full GC.
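
The humongous rule can be sketched as two small calculations: whether an object counts as humongous, and how many contiguous regions it would occupy. The class and method names are hypothetical, and object sizes are given in bytes for simplicity.

```java
class HumongousCheck {
    /** An object is humongous when it is larger than half a region. */
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes > regionBytes / 2;
    }

    /** Number of contiguous regions needed to hold the object (rounded up). */
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long region = 2 * mb;
        long object = 5 * mb;
        System.out.println("humongous: " + isHumongous(object, region));
        System.out.println("regions needed: " + regionsNeeded(object, region));
    }
}
```

Because the last region of a humongous object is usually only partly filled, applications that allocate many large arrays sometimes benefit from a larger -XX:G1HeapRegionSize.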

3. G1 recycling process

When performing garbage collection, G1 operates in a manner similar to the CMS collector.

3.1. The G1 collection cycle is divided into the following stages:

3.1.1, the first stage of G1 execution: initial marking (Initial Marking)

This stage is stop-the-world (STW): all application threads are suspended while the objects directly reachable from the GC Roots are marked.

3.1.2. The second stage of G1 execution: concurrent marking

This stage traces reachability through the heap, starting from the GC Roots, to find the live objects. It takes a long time, but it runs concurrently with the application threads. When concurrent marking completes, the final marking (Final Marking) stage begins.

3.1.3. Final marking (Final Marking)

This stage is a short stop-the-world pause that finishes marking by processing the objects whose references changed while concurrent marking was running.

3.1.4. Screening and recovery

G1 first ranks the regions by how much space each would reclaim and how much it would cost to collect, then draws up a collection plan that fits the GC pause time the user expects, and reclaims only that subset of regions.

Finally, G1 provides two garbage collection modes, Young GC and Mixed GC, both of which are stop-the-world (STW).
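
The selection step can be illustrated with a simple greedy sketch: rank regions by reclaimed bytes per millisecond of predicted cost, then add regions while they still fit in the pause budget. This is an illustration of the idea only, not G1's actual policy code. The Region class, its fields, and the numbers are hypothetical; real G1 predicts the costs from statistics of earlier pauses.

```java
import java.util.ArrayList;
import java.util.List;

class CollectionSetSketch {
    /** Hypothetical per-region statistics. */
    static final class Region {
        final String name;
        final long garbageBytes; // space that collecting the region would free
        final double costMs;     // predicted time needed to evacuate the region

        Region(String name, long garbageBytes, double costMs) {
            this.name = name;
            this.garbageBytes = garbageBytes;
            this.costMs = costMs;
        }

        /** Bytes reclaimed per millisecond of pause: higher is better. */
        double efficiency() {
            return garbageBytes / costMs;
        }
    }

    /** Greedily picks the most efficient regions that fit within the pause goal. */
    static List<Region> choose(List<Region> candidates, double pauseGoalMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));
        List<Region> chosen = new ArrayList<>();
        double remainingMs = pauseGoalMs;
        for (Region r : sorted) {
            if (r.costMs <= remainingMs) {
                chosen.add(r);
                remainingMs -= r.costMs;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 80, 10.0));
        regions.add(new Region("B", 90, 30.0));
        regions.add(new Region("C", 20, 5.0));
        for (Region r : choose(regions, 20.0)) {
            System.out.println("collect region " + r.name);
        }
    }
}
```

Region B holds the most garbage in absolute terms but is skipped here because it would not fit in the 20 ms budget, which is the "garbage first, within the pause goal" trade-off in miniature.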

4. GC mode of G1

4.1, YoungGC young generation collection

When an ordinary (non-humongous) object is allocated and the Eden regions have reached their maximum size so that no more memory can be obtained, a Young GC is triggered. Every Young GC collects all Eden and Survivor regions and copies the surviving objects to old regions or to a new set of Survivor regions.
The Young GC process is as follows:

  • Root scanning: as in CMS, this is stop-the-world and scans the GC Roots.

  • Process the dirty cards and update the RSets (remembered sets).

  • Scan the RSets to find references from old regions into the young (Eden and Survivor) regions being collected.

  • Copy the surviving objects found by the scan to new Survivor regions or to old regions.

  • Process the reference queues: soft references, weak references, and phantom references.
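
The "process dirty cards, update RSet, scan RSet" steps can be modeled with a toy data structure. The sketch below is a strong simplification: addresses are plain numbers, the 512-byte card size is an assumption, and the dirty-card queue stores the target address directly, whereas a real collector rescans the card to find it. All class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class RememberedSetSketch {
    static final long CARD_BYTES = 512; // assumed card size
    final long regionBytes;

    // Queue of {cardIndex, targetAddress} pairs produced by the simulated write barrier.
    private final Deque<long[]> dirtyCards = new ArrayDeque<>();
    // RSet of each region: indexes of cards in other regions that point into it.
    private final Map<Long, Set<Long>> rsets = new HashMap<>();

    RememberedSetSketch(long regionBytes) {
        this.regionBytes = regionBytes;
    }

    long regionOf(long address) {
        return address / regionBytes;
    }

    /** Simulated write barrier: the field at fieldAddress now refers to targetAddress. */
    void recordWrite(long fieldAddress, long targetAddress) {
        // Only cross-region references need to be remembered.
        if (regionOf(fieldAddress) != regionOf(targetAddress)) {
            dirtyCards.add(new long[] { fieldAddress / CARD_BYTES, targetAddress });
        }
    }

    /** "Process dirty cards, update RSet": drain the queue into the per-region RSets. */
    void processDirtyCards() {
        long[] entry;
        while ((entry = dirtyCards.poll()) != null) {
            long targetRegion = regionOf(entry[1]);
            rsets.computeIfAbsent(targetRegion, k -> new TreeSet<>()).add(entry[0]);
        }
    }

    /** "Scan RSet": the cards to scan as extra roots when the given region is collected. */
    Set<Long> cardsPointingInto(long region) {
        return rsets.getOrDefault(region, new TreeSet<>());
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        RememberedSetSketch heap = new RememberedSetSketch(mb);
        heap.recordWrite(3 * mb + 1024, 256); // old region 3 -> young region 0
        heap.processDirtyCards();
        System.out.println("cards pointing into region 0: " + heap.cardsPointingInto(0));
    }
}
```

The point of the RSet is that a Young GC only has to scan these recorded cards, rather than every old region, to find the old-to-young references that keep young objects alive.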

4.2, Mixed GC

As more and more objects are promoted to old regions, the virtual machine triggers a mixed collection (Mixed GC) to keep the heap from being exhausted. A Mixed GC is not an old-generation GC: besides collecting all young regions, it also collects part of the old generation. Note that this is only part of the old generation, not all of it. Because G1 can choose which old regions to collect, it can control how long the collection takes.
G1 has no dedicated Full GC algorithm of its own. When a Full GC is needed, it falls back to a Serial Old style collection that scans the entire heap (Eden, Survivor, old regions, and PermGen where it exists).
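
One way to observe these collections from inside an application is the standard java.lang.management API. The following sketch lists the collectors the running JVM reports, with their collection counts and accumulated times. With G1 the beans are typically named "G1 Young Generation" and "G1 Old Generation", but the exact names and what each one counts vary by JDK version, so treat that as an assumption. The class name GcActivity is hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcActivity {
    /** Names of the garbage collectors reported by the running JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": "
                + gc.getCollectionCount() + " collections, "
                + gc.getCollectionTime() + " ms total");
        }
    }
}
```

A steadily rising count on the old-generation bean is a hint that the application is falling back to Full GCs and that the heap size or pause goal may need tuning.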

G1's first design goal is to give applications a GC solution with low latency on large heaps. In practice this means heaps of around 6 GB or larger, with stable and predictable pause times below 0.5 seconds.
Applications running with the CMS or ParallelOld collector will benefit from switching to G1 if they have one or more of the following characteristics:

  • Full GCs last too long or happen too often

  • The object allocation rate, or the rate of promotion from the young to the old generation, is very high

  • Undesirably long garbage collection or compaction pauses (longer than 0.5 to 1 second)

NOTE: If you are using the CMS or ParallelOld collector and your application is not experiencing long garbage collection pauses, it is fine to stay with your current collector. Switching to G1 is not a requirement for using a newer JDK.

Origin blog.csdn.net/qq_34272760/article/details/129232269