Summary of the 7 garbage collectors of the JVM

1. Detailed introduction of 7 kinds of garbage collectors

The picture above shows the 7 collectors. A line between two collectors means they can be used together, and the area a collector sits in shows whether it is a young-generation or an old-generation collector. Young-generation collectors mostly use the copying algorithm, and old-generation collectors mostly use the mark-compact algorithm.
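To see which young/old pair a running JVM actually uses, the standard java.lang.management API can be queried. The following is a minimal sketch; the class name ActiveCollectors is made up for illustration, and the names printed depend on the JVM vendor, version and startup flags (a HotSpot JVM running G1, for example, typically reports names such as "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
class ActiveCollectors {
    // Returns the names of the collectors the current JVM reports.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // Usually one entry for the young generation and one for the old generation.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Running it with different collector flags is a quick way to check which combination from the picture is in effect.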

1. Serial collector

Features:

(1) It is a single-threaded collector. This means not only that it uses a single CPU or a single thread to do the GC, but also that it must suspend all other worker threads while it collects, until the collection ends. (It is as if your computer stopped responding for 5 minutes out of every hour it runs.)

(2) Its advantage is that it is simple and efficient. In a single-CPU environment it has no thread-interaction overhead, so it can devote itself entirely to GC and collects very efficiently.

2. ParNew collector

The ParNew collector is essentially a multi-threaded version of the Serial collector. Apart from using multiple threads for collection, its behavior is exactly the same as Serial's. Besides the Serial collector, ParNew is currently the only young-generation collector that can work with CMS, so it is the preferred young-generation collector for many virtual machines running in Server mode.

P.S.: From here on, several concurrent and parallel collectors appear, so the two concepts are explained first.

Parallel: multiple garbage collection threads work at the same time, while the user threads remain in a waiting state.

Concurrent: the user threads and the garbage collection threads execute during the same period (not necessarily in parallel; they may be interleaved). The user program keeps running while the garbage collector runs on another CPU.

3. Parallel Scavenge Collector

(1) Features:

A young-generation collector that uses the copying algorithm and collects in parallel. It is oriented toward throughput (the throughput-first collector). Throughput = user code running time / (user code running time + garbage collection time).
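The throughput formula can be checked with a small worked example. This is only an illustrative sketch; the class and method names are invented here, and the figures (99 minutes of user code, 1 minute of GC) are sample numbers, not measurements.

```java
// Hypothetical helper class illustrating the throughput formula.
class Throughput {
    // throughput = user time / (user time + GC time); both arguments use the same unit.
    static double throughput(double userTime, double gcTime) {
        if (userTime < 0 || gcTime < 0 || userTime + gcTime == 0) {
            throw new IllegalArgumentException("times must be non-negative and not both zero");
        }
        return userTime / (userTime + gcTime);
    }

    public static void main(String[] args) {
        // A 100-minute run with 99 minutes of user code and 1 minute of GC.
        System.out.println(throughput(99, 1));
    }
}
```

With these sample numbers the throughput is 99%.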

(2) Usage scenarios:

Mainly suitable for background computation that needs little interaction. High throughput makes the most efficient use of CPU time, so the program's computation finishes as soon as possible.

(3) Three parameters:

-XX:MaxGCPauseMillis: The target maximum garbage collection pause time, in milliseconds. The value must be greater than zero.

-XX:GCTimeRatio: An integer greater than 0 and less than 100 that sets the ratio of user code time to garbage collection time. With a value of N, GC may take at most 1/(1+N) of the total time. For example, 19 allows 5%, and the default of 99 allows 1%.

-XX:+UseAdaptiveSizePolicy: A switch. When it is enabled, the virtual machine monitors the running system and dynamically adjusts its settings to best meet the pause-time or throughput goal. This is called the GC adaptive adjustment policy, and it is an important difference between Parallel Scavenge and ParNew.
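The relation between a -XX:GCTimeRatio value N and the share of time GC is allowed to take, 1/(1+N), is easy to get backwards, so here is a small sketch of the arithmetic. The class and method names are hypothetical; only the formula comes from the flag's definition.

```java
// Hypothetical helper class for the GCTimeRatio arithmetic.
class GcTimeRatio {
    // With -XX:GCTimeRatio=N, the GC share of total time is targeted at no more than 1 / (1 + N).
    static double maxGcFraction(int ratio) {
        if (ratio <= 0) {
            throw new IllegalArgumentException("ratio must be positive");
        }
        return 1.0 / (1 + ratio);
    }

    public static void main(String[] args) {
        System.out.println(maxGcFraction(19)); // GC may use up to 1/20 of the time
        System.out.println(maxGcFraction(99)); // GC may use up to 1/100 of the time
    }
}
```

A larger N therefore means a stricter limit on GC time and a higher throughput target.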

4. Serial Old Collector

It is the old-generation version of the Serial collector and is mainly used by virtual machines in Client mode. In Server mode it has two uses:

(1) Paired with Parallel Scavenge in JDK 1.5 and earlier versions;

(2) As a fallback for the CMS collector, used when concurrent collection fails.

5. Parallel Old Collector

The Parallel Old collector is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm.

Before this collector appeared, if the young generation used Parallel Scavenge, the old generation could only use Serial Old, because Parallel Scavenge cannot work with CMS. That combination is not as powerful as ParNew + CMS.

Since Parallel Old appeared, Parallel Scavenge + Parallel Old can be chosen wherever throughput is the priority.

6. CMS collector

This collector aims for the shortest possible collection pause time. As the name CMS (Concurrent Mark Sweep) suggests, it is based on the mark-sweep algorithm.

Its operation is divided into 4 stages:

(1) Initial mark

Suspends all other threads and records only the objects directly reachable from the GC Roots, which is fast.

(2) Concurrent mark

GC threads and user threads run at the same time while the collector traces from the GC Roots. Because user threads may keep updating reference fields during this process, the GC threads cannot guarantee that the reachability analysis is up to date, and cannot guarantee that every currently reachable object has been marked.

(3) Remark

Suspends all other threads to correct the marks for references that changed during concurrent marking because the user threads kept running. The pause in this stage is generally slightly longer than the initial mark, but much shorter than concurrent marking.

(4) Concurrent sweep

The user threads keep running while the GC threads sweep the unmarked objects. Care is needed in this stage not to reclaim objects that the user threads have just allocated.
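The mark and sweep steps themselves can be illustrated with a toy object graph. This is a deliberately simplified, single-threaded sketch of the mark-sweep idea only: it has no concurrency, no remark stage, and nothing in common with HotSpot's real data structures. All class and method names are invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy, stop-the-world mark-sweep over a hand-built object graph (illustration only).
class MarkSweepSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark: flag every object reachable from the roots.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (current.marked) continue;   // already visited, also breaks cycles
            current.marked = true;
            pending.addAll(current.refs);
        }
    }

    // Sweep: remove unmarked objects from the heap and reset marks on survivors.
    // Returns the names of the reclaimed objects.
    static List<String> sweep(List<Obj> heap) {
        List<String> reclaimed = new ArrayList<>();
        Iterator<Obj> it = heap.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                reclaimed.add(o.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);   // a -> b, reachable from the root a
        c.refs.add(d);   // c and d reference each other,
        d.refs.add(c);   // but nothing reachable points at them
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        mark(Arrays.asList(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In the demo, c and d form an unreachable cycle and are reclaimed, while a and b survive. CMS splits the marking work into the stages listed above so that most of it overlaps with the user threads.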

Advantages: concurrent collection, low pause

Disadvantages:

1 Very sensitive to CPU resources. The concurrent stages take up CPU time that the application would otherwise use, so overall throughput drops.

2 Cannot handle floating garbage. Because the user threads are still running during CMS's concurrent sweep, the program naturally keeps producing new garbage. This garbage appears after the marking process, so CMS cannot reclaim it in the current collection and has to leave it for the next GC.

3 Produces a lot of memory fragmentation. CMS is based on the mark-sweep algorithm, so a large amount of fragmented space remains at the end of a collection. CMS provides a parameter (-XX:+UseCMSCompactAtFullCollection) that compacts memory, but compaction cannot run concurrently, so fragmentation is solved only at the cost of a longer pause. Another parameter (-XX:CMSFullGCsBeforeCompaction) sets how many full GCs run without compaction before one with compaction follows.
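Why fragmentation matters can be shown with a toy heap of fixed-size slots. The sketch below is an assumption-laden model (a boolean array where true means "occupied"); it only demonstrates that free space left by a sweep may be plentiful in total yet useless for a larger allocation until the live data is compacted. The names are hypothetical.

```java
import java.util.Arrays;

// Toy model of a heap as an array of slots (illustration only).
class FragmentationSketch {
    // Counts free slots (false = free).
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean slot : used) {
            if (!slot) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean slot : used) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    // An allocation needs 'size' adjacent free slots.
    static boolean canAllocate(boolean[] used, int size) {
        return largestFreeRun(used) >= size;
    }

    // Compaction: slide all live slots to the front, leaving one free block at the end.
    static boolean[] compact(boolean[] used) {
        int live = used.length - totalFree(used);
        boolean[] result = new boolean[used.length];
        Arrays.fill(result, 0, live, true);
        return result;
    }

    public static void main(String[] args) {
        // Heap after a sweep: live and free slots alternate.
        boolean[] heap = {true, false, true, false, true, false, true, false};
        System.out.println(totalFree(heap) + " free slots, largest block " + largestFreeRun(heap));
        System.out.println("allocate 3 before compaction: " + canAllocate(heap, 3));
        System.out.println("allocate 3 after compaction: " + canAllocate(compact(heap), 3));
    }
}
```

In this model, four slots are free after the sweep, but a 3-slot allocation fails until compaction merges them into one block, which is the trade-off the compaction parameters control.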

7. G1 collector (Garbage-First)

Features:

(1) Parallelism and Concurrency.

(2) Generational collection. G1 can manage the entire GC heap by itself, without cooperating with another collector, and it still treats newly created objects differently from old objects that have been alive for a while.

(3) Spatial integration. Viewed as a whole, G1 is based on the mark-compact algorithm; viewed locally (between two Regions), it is based on the copying algorithm. Neither produces memory fragmentation.

(4) Predictable pauses. G1 can build a predictable pause-time model that lets the user explicitly specify that, within any time slice of M milliseconds, no more than N milliseconds may be spent on GC.

With G1, the memory layout of the Java heap is very different from that of the other collectors. G1 divides the entire Java heap into multiple independent Regions of equal size. The young generation and the old generation are no longer physically separated; each is a collection of Regions, which do not need to be contiguous. G1 does not have to collect the entire Java heap at once. It tracks the value of the garbage accumulated in each Region (based on the amount of space that can be reclaimed and the time needed to reclaim it) and maintains a priority list in the background. Within the allowed collection time, the Region with the greatest value is reclaimed first, which is where the name Garbage-First comes from. This lets G1 achieve the highest possible collection efficiency within a limited time.
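The "most valuable Region first, within the allowed time" idea can be sketched as a simple greedy selection. This is only a simplified model under stated assumptions: each Region is given a made-up reclaimable size and a made-up collection cost, value is taken as reclaimable space per millisecond, and the names Region, value and chooseCollectionSet are hypothetical. HotSpot's actual policy is considerably more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified greedy model of Garbage-First region selection (illustration only).
class GarbageFirstSketch {
    static final class Region {
        final int id;
        final double reclaimableMb;  // assumed estimate of space that can be reclaimed
        final double costMillis;     // assumed estimate of time needed to collect it
        Region(int id, double reclaimableMb, double costMillis) {
            this.id = id;
            this.reclaimableMb = reclaimableMb;
            this.costMillis = costMillis;
        }
        // Value = reclaimed space per millisecond of pause.
        double value() { return reclaimableMb / costMillis; }
    }

    // Picks regions in descending value order while the pause budget allows.
    static List<Integer> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((x, y) -> Double.compare(y.value(), x.value()));
        List<Integer> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMillis <= pauseBudgetMillis) {
                chosen.add(r.id);
                spent += r.costMillis;
            }
        }
        return chosen;
    }

    static List<Integer> demo() {
        List<Region> regions = Arrays.asList(
                new Region(0, 8, 4),   // value 2.0
                new Region(1, 6, 2),   // value 3.0
                new Region(2, 1, 5),   // value 0.2
                new Region(3, 9, 6));  // value 1.5
        return chooseCollectionSet(regions, 7);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a 7 ms budget the sketch picks region 1 and then region 0 (6 ms in total) and leaves the other two for a later collection.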

How does G1 avoid doing a full heap scan?

Whether it is references between Regions in G1 or references between the young and old generations in other collectors, the virtual machine uses a Remembered Set to avoid scanning the whole heap. Each Region in G1 has its own Remembered Set. When the virtual machine finds that the program is writing to a field of Reference type, it triggers a Write Barrier that briefly interrupts the write and checks whether the referenced object is in a different Region (in the generational case, whether an old-generation object refers to a young-generation object). If so, the reference is recorded, via the CardTable, in the Remembered Set of the Region that the referenced object belongs to. During a collection, the Remembered Sets are added to the enumeration range of the GC Roots, which guarantees that nothing is missed even though the whole heap is not scanned.
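The write-barrier bookkeeping described above can be modelled in a few lines. The sketch below is a toy under loud assumptions: addresses are plain long values, a Region is a fixed range of 1024 addresses, and the Remembered Set stores the exact address of the referring field. Real G1 records at card granularity through the card table, and stale entries are not removed here. All names are hypothetical.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model of a write barrier feeding per-region remembered sets (illustration only).
class RememberedSetSketch {
    static final int REGION_SIZE = 1024; // assumed region size in address units

    // Region index -> addresses of fields in OTHER regions that point into it.
    private final Map<Integer, Set<Long>> rememberedSets = new HashMap<>();
    // Field address -> address of the object it references.
    private final Map<Long, Long> fields = new HashMap<>();

    static int regionOf(long address) {
        return (int) (address / REGION_SIZE);
    }

    // A reference store followed by the barrier check.
    void writeReference(long fieldAddress, long targetAddress) {
        fields.put(fieldAddress, targetAddress);
        int from = regionOf(fieldAddress);
        int to = regionOf(targetAddress);
        if (from != to) {
            // Cross-region reference: record it with the region being pointed into.
            rememberedSets.computeIfAbsent(to, k -> new HashSet<>()).add(fieldAddress);
        }
    }

    // Extra roots to scan when collecting the given region.
    Set<Long> rememberedSet(int region) {
        return rememberedSets.getOrDefault(region, Collections.emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch();
        heap.writeReference(100, 200);   // same region, nothing recorded
        heap.writeReference(100, 2053);  // region 0 -> region 2, recorded
        System.out.println(heap.rememberedSet(0));
        System.out.println(heap.rememberedSet(2));
    }
}
```

When region 2 is collected, only the fields listed in its Remembered Set need to be scanned as additional roots, instead of every object in every other Region.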

How G1 works:

1. Initial mark. Marks the objects that the GC Roots can reach directly and updates the TAMS (Next Top at Mark Start) value so that, when the user program runs concurrently in the next stage, new objects are created in the correct available Region. This stage suspends the user threads, but only very briefly.

2. Concurrent mark. Starting from the GC Roots, performs reachability analysis on the objects in the heap to find the live objects. This stage takes a long time, but it can run concurrently with the user program.

3. Final mark. Corrects the marks that changed because the user program kept running during concurrent marking. The virtual machine records these changes in each thread's Remembered Set Logs, and this stage merges the data from the Remembered Set Logs into the Remembered Sets. This stage suspends the user threads, but it can run in parallel.

4. Screening and collection. First sorts the Regions by their collection value, then draws up a collection plan according to the GC pause time the user expects. This stage could run concurrently with the user program, but suspending the user threads greatly improves collection efficiency.

2. Schematic diagram of the operation process of 7 kinds of garbage collectors

 

 
