JVM garbage collector (4) - G1 collector

The previous two articles mainly covered the young-generation and old-generation garbage collectors. This article introduces a more unusual one: the G1 garbage collector.

G1 (Garbage-First) is a garbage collector aimed at server-side applications.
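
As a rough how-to, HotSpot selects G1 with the -XX:+UseG1GC flag (G1 has been the default collector since JDK 9). The sketch below uses the standard java.lang.management API to list the collectors that the running JVM reports. The class name ActiveCollectors is only illustrative and is not from the book. Under G1 the reported names typically include "G1 Young Generation" and "G1 Old Generation".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical name): lists the collectors of the running JVM.
final class ActiveCollectors {

    /** Returns the names of all garbage collectors the JVM exposes through JMX. */
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The exact output depends on which collector the JVM was started with.
        System.out.println(names());
    }
}
```

On JDK 11 or later it can be run directly with, for example, java -XX:+UseG1GC ActiveCollectors.java.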

1. Features of G1:

1. Parallelism and concurrency: G1 takes full advantage of multi-CPU, multi-core hardware, using multiple CPUs (or CPU cores) to shorten Stop-The-World pauses. Some GC work that other collectors can only perform while Java threads are paused can be performed by G1 concurrently, while the Java program keeps running.

2. Generational collection: Like other collectors, G1 keeps the concept of generational collection. Although G1 can manage the entire GC heap on its own, without cooperating with another collector, it treats newly created objects differently from older objects that have already survived several GCs, in order to get better collection results.

3. Space compaction: Unlike the "mark-sweep" algorithm of CMS, G1 as a whole behaves like a collector based on the "mark-compact" algorithm, while locally (between two Regions) it is based on the "copying" algorithm. Either way, G1 does not produce memory fragmentation while running and leaves regular, usable memory after a collection. This helps long-running programs: allocating a large object will not trigger the next GC early just because no contiguous block of memory can be found.

4. Predictable pauses: This is the other major advantage of G1 over CMS. Both collectors aim to reduce pause times, but G1 also builds a predictable pause-time model: the user can explicitly specify that, within any time slice of M milliseconds, no more than N milliseconds should be spent on garbage collection. This is close to what real-time Java (RTSJ) garbage collectors offer.
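
To make the "no more than N milliseconds of GC in any M-millisecond slice" rule concrete, here is a minimal sketch that checks a list of recorded pauses against such a goal. The class and method names are hypothetical, and the pause data would have to come from GC logs. In HotSpot the corresponding knobs are -XX:MaxGCPauseMillis (roughly N) and -XX:GCPauseIntervalMillis (roughly M), and G1 treats them as goals rather than hard guarantees.

```java
// Hypothetical helper: checks recorded pauses against an "N ms per M ms" goal.
final class PauseGoalCheck {

    /**
     * Total pause time inside the window [windowStart, windowStart + m).
     * Each entry of pauses is {startMs, durationMs}.
     */
    static long pausedWithin(long[][] pauses, long windowStart, long m) {
        long windowEnd = windowStart + m;
        long total = 0;
        for (long[] p : pauses) {
            long start = p[0];
            long end = p[0] + p[1];
            long overlap = Math.min(end, windowEnd) - Math.max(start, windowStart);
            if (overlap > 0) {
                total += overlap;
            }
        }
        return total;
    }

    /**
     * True if no window of length m contains more than n ms of pause time.
     * The worst window always starts at a pause start or ends at a pause end,
     * so only those candidate windows are checked.
     */
    static boolean meetsGoal(long[][] pauses, long m, long n) {
        for (long[] p : pauses) {
            long start = p[0];
            long end = p[0] + p[1];
            if (pausedWithin(pauses, start, m) > n) return false;
            if (pausedWithin(pauses, end - m, m) > n) return false;
        }
        return true;
    }
}
```

For example, three 10 ms pauses starting at 0, 50 and 100 ms satisfy a goal of M = 100, N = 20, but not M = 100, N = 15.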

2. Internal implementation principle:

Collectors before G1 each work on the whole young generation or the whole old generation. G1 instead works on the whole heap.
It divides the entire Java heap into multiple independent regions (Regions) of equal size. G1 tracks how much space each Region would yield and keeps a priority order, so the Regions with the most reclaimable space are collected first. This is where the name Garbage-First comes from, and it is how G1 aims for the highest possible reclamation rate within a limited time.
Each Region in G1 has a Remembered Set that records references from objects in other Regions into this Region, so a collection can find incoming references without scanning the entire heap.
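
The Region and Remembered Set idea can be sketched as a toy model. This is a simplification under stated assumptions, not HotSpot's actual data structure: addresses are plain numbers, every Region has the same size, and the Remembered Set stores source Region indexes, whereas the real G1 tracks finer-grained cards. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Toy model (hypothetical): a heap split into equal Regions, one Remembered Set per Region.
final class RegionHeapSketch {
    private final long regionSize;
    private final List<Set<Integer>> rememberedSets = new ArrayList<>();

    RegionHeapSketch(long heapSize, long regionSize) {
        this.regionSize = regionSize;
        long regionCount = heapSize / regionSize;
        for (long i = 0; i < regionCount; i++) {
            rememberedSets.add(new TreeSet<>());
        }
    }

    /** Index of the Region that contains the given address. */
    int regionOf(long address) {
        return (int) (address / regionSize);
    }

    /**
     * Simplified write barrier: called when a field at fromAddress is set to
     * point at toAddress. Only cross-Region references are remembered, and they
     * are recorded in the Remembered Set of the target Region.
     */
    void recordReference(long fromAddress, long toAddress) {
        int from = regionOf(fromAddress);
        int to = regionOf(toAddress);
        if (from != to) {
            rememberedSets.get(to).add(from);
        }
    }

    /** Regions that must be scanned for incoming references when collecting the given Region. */
    Set<Integer> regionsToScanFor(int region) {
        return rememberedSets.get(region);
    }
}
```

With 1024-byte Regions, a reference from address 100 to address 5000 adds Region 0 to the Remembered Set of Region 4, so collecting Region 4 only needs to look at the GC Roots and the Regions listed there instead of the whole heap.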

3. Operation steps (the first two steps are similar to those of CMS):

Initial Marking: Only marks the objects directly reachable from the GC Roots and updates the TAMS (Next Top at Mark Start) value, so that when user threads run concurrently in the next stage they create new objects in the correct available Regions. This stage must pause user threads, but only very briefly.
Concurrent Marking: Starting from the GC Roots, performs reachability analysis over the objects in the heap to find the live ones. This stage takes longer, but runs concurrently with the user program.
Final Marking: Corrects the marking records that changed because the user program kept running during concurrent marking. The virtual machine records the object changes from this period in each thread's Remembered Set Logs, and this stage merges the data from the Remembered Set Logs into the Remembered Sets. This stage must pause user threads, but it can run in parallel.
Live Data Counting and Evacuation: First ranks the Regions by collection value and cost, then draws up a collection plan based on the GC pause time the user expects. This stage could in principle run concurrently with the user program, but because only some of the Regions are collected, its duration is under the user's control, and pausing user threads makes collection much more efficient.
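
The planning part of the last step can be sketched as a greedy selection: rank Regions by how much they reclaim per millisecond of estimated cost, then take them in that order until the pause budget is used up. This is a minimal sketch, assuming each Region already has a cost estimate. The names and the stop-at-first-misfit rule are illustrative assumptions, not G1's actual policy code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of choosing which Regions to collect within a pause budget.
final class CollectionSetSketch {

    static final class Region {
        final int id;
        final long reclaimableBytes;
        final double estimatedCostMs;

        Region(int id, long reclaimableBytes, double estimatedCostMs) {
            this.id = id;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedCostMs = estimatedCostMs;
        }

        /** Bytes reclaimed per millisecond of estimated collection cost. */
        double efficiency() {
            return reclaimableBytes / estimatedCostMs;
        }
    }

    /** Returns the ids of the chosen Regions, most efficient first. */
    static List<Integer> choose(List<Region> candidates, double pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(candidates);
        // Highest efficiency first: this is the "garbage first" ordering.
        sorted.sort((a, b) -> Double.compare(b.efficiency(), a.efficiency()));

        List<Integer> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.estimatedCostMs > pauseBudgetMs) {
                break; // stop at the first Region that would exceed the budget
            }
            spentMs += r.estimatedCostMs;
            chosen.add(r.id);
        }
        return chosen;
    }
}
```

Because only the chosen Regions are evacuated, a smaller pause budget simply produces a shorter list, which is what makes the pause time controllable.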

The two collectors differ because CMS uses a mark-sweep algorithm, while G1 combines a mark-compact algorithm with Region-based memory division.
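
The practical effect of that difference is fragmentation. The toy sketch below models a heap as an array of slots (true means the slot holds a live object after marking) and compares the largest contiguous free run after a sweep with the one after compaction. It illustrates the two algorithm families only and is not how either collector is implemented. All names are hypothetical.

```java
// Toy illustration (hypothetical names): sweep leaves holes, compaction does not.
final class SweepVersusCompact {

    /** Length of the longest run of free (false) slots. */
    static int largestFreeRun(boolean[] live) {
        int best = 0;
        int current = 0;
        for (boolean slot : live) {
            current = slot ? 0 : current + 1;
            best = Math.max(best, current);
        }
        return best;
    }

    /** Mark-sweep: dead objects are freed in place, so live objects do not move. */
    static boolean[] sweep(boolean[] marked) {
        return marked.clone();
    }

    /** Mark-compact: live objects slide to the front, leaving one free block at the end. */
    static boolean[] compact(boolean[] marked) {
        boolean[] result = new boolean[marked.length];
        int next = 0;
        for (boolean slot : marked) {
            if (slot) {
                result[next++] = true;
            }
        }
        return result;
    }
}
```

With four live objects scattered over eight slots, both layouts have four free slots in total, but only the compacted one can satisfy an allocation that needs all four to be contiguous.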

4. Schematic diagram of G1 collector operation:

[Figure: schematic diagram of G1 collector operation]

The content and figures in this article are drawn from the book "In-depth understanding of the Java virtual machine: JVM advanced features and best practices".