[JVM] 7 classic garbage collectors

Reference for this article: In-depth understanding of Java virtual machine: JVM advanced features and best practices (3rd edition)

1. Garbage Collector Overview

image-20230212113322170

The seven collectors shown in the figure are not the most advanced technology available, but they have been thoroughly proven in practice. For the next two or three years they can reasonably be regarded as the full set of collectors that are safe to use in production.

The seven collectors work on different generations. A line connecting two collectors in the figure means that they can be used together, and a collector's position in the figure shows whether it is a young-generation or an old-generation collector.

Garbage collectors can be divided into:

  1. Serial
  2. Throughput priority
  3. Response time priority
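
To see which young-generation and old-generation collectors a given JVM is actually running, the standard `java.lang.management` API can be queried. The sketch below is only an illustration: the class name `ActiveCollectors` is made up for this example, and the names it prints depend on the JVM vendor, version, and GC flags (for instance `-XX:+UseSerialGC` or `-XX:+UseG1GC`).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; only the java.lang.management calls are real JDK API.
class ActiveCollectors {

    // Returns the names of the collectors registered in the running JVM.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // The exact names vary between JVM versions and GC settings.
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```

Running it with different GC flags is a quick way to confirm which pairing from the figure is in effect.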

2. Serial collector

The Serial collector is the most basic and oldest collector. It is a young-generation collector.

This collector is single-threaded. "Single-threaded" means not only that it uses a single processor or a single collection thread to do its work, but more importantly that while it is collecting, all other working threads must be suspended until the collection is complete.

In other words, all of the user's normal working threads are stopped at a time the user can neither know about nor control, which is unacceptable for many applications.

image-20230212114757514

Even though Serial is the earliest collector, it is still the default young-generation collector of the HotSpot virtual machine running in client mode.

The advantage of the Serial collector is that it is simple and efficient. In environments with limited memory, it has the smallest additional memory footprint of all collectors. On single-core processors or processors with few cores, Serial has no thread-interaction overhead and can concentrate entirely on garbage collection, which gives it high single-threaded collection efficiency.


3. ParNew collector

The ParNew collector is essentially a multi-threaded version of the Serial collector. Apart from using multiple threads to collect garbage at the same time, its behavior, including all control parameters, the collection algorithm, and so on, is the same as that of the Serial collector, and it brings little innovation.

image-20230212115554893

Apart from the Serial collector, ParNew is currently the only collector that can work with the CMS collector.

CMS can be considered an epoch-making collector for highly interactive applications. It was the first collector to support concurrency in the true sense, allowing garbage collection threads to work at the same time as user threads for the first time. The emergence of CMS therefore consolidated ParNew's position.

In a single-core environment, ParNew will never perform better than Serial. Because of the overhead of thread interaction, it is not even guaranteed to beat Serial in a pseudo dual-core environment implemented through hyper-threading.

However, as the number of processor cores increases, ParNew makes much more efficient use of system resources during garbage collection. By default, the number of collection threads it starts equals the number of processor cores.


4. Parallel Scavenge collector

The Parallel Scavenge collector is based on the mark-copy algorithm, and it is also a multi-threaded collector that can collect in parallel.

This collector is often referred to as the "throughput-first collector".

Parallel: describes the relationship between multiple garbage collection threads. It means that several such threads are working together at the same time. User threads are usually assumed to be waiting while this happens.

Concurrent: describes the relationship between garbage collection threads and user threads. It means that both are running at the same time. Because user threads are not frozen, the program can still respond to service requests, but because the collection threads take up some system resources, the application's throughput is reduced to some extent.

The defining goal of the Parallel Scavenge collector is to achieve a controllable throughput.

image-20230212133427488

The Parallel Scavenge collector provides two parameters for precise control of throughput: -XX:MaxGCPauseMillis, which controls the maximum garbage collection pause time, and -XX:GCTimeRatio, which sets the throughput target directly.

  1. -XX:MaxGCPauseMillis: The value is a number of milliseconds greater than 0. The collector does its best to keep the time spent on memory reclamation below the configured value. Setting it smaller does not simply make collection faster, however, because shorter pauses are bought at the expense of throughput and young-generation space. To meet a smaller target the system shrinks the young generation. Each collection is then faster, but collections happen more often, so throughput drops.
  2. -XX:GCTimeRatio: The value should be an integer greater than 0 and less than 100. It sets the ratio between garbage collection time and application time, and therefore the throughput target. For example, a value of 19 allows garbage collection to take at most 5% of total time (1/(1+19)). The default is 99, which allows at most 1% (1/(1+99)).
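
The arithmetic behind -XX:GCTimeRatio can be checked with a few lines of Java. This is a small sketch, not JVM code: the class and method names are invented for illustration, and `throughput` uses the usual definition of throughput as user-code time divided by user-code time plus garbage collection time.

```java
// Hypothetical helper; it only reproduces the 1 / (1 + N) arithmetic quoted above.
class GcTimeRatioDemo {

    // Largest share of total time that GC may take for a given -XX:GCTimeRatio value.
    static double maxGcFraction(int gcTimeRatio) {
        if (gcTimeRatio <= 0) {
            throw new IllegalArgumentException("GCTimeRatio must be greater than 0");
        }
        return 1.0 / (1 + gcTimeRatio);
    }

    // Throughput = user-code time / (user-code time + GC time).
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    public static void main(String[] args) {
        System.out.println("GCTimeRatio=19 allows a GC share of " + maxGcFraction(19));
        System.out.println("GCTimeRatio=99 allows a GC share of " + maxGcFraction(99));
        // 99 ms of user code and 1 ms of GC correspond to the default target.
        System.out.println("throughput for 99 ms user / 1 ms GC: " + throughput(99, 1));
    }
}
```

A value of 19 therefore corresponds to a throughput target of 95%, and the default of 99 to a target of 99%.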

5. Serial Old Collector

The Serial Old collector is the old-generation version of the Serial collector. It is also single-threaded and uses the mark-compact algorithm.

image-20230212135752822

In server mode it can serve two purposes: it can be paired with the Parallel Scavenge collector in JDK 5 and earlier, and it acts as the fallback for the CMS collector when a Concurrent Mode Failure occurs during concurrent collection.


6. Parallel Old Collector

The Parallel Old collector is the old-generation version of the Parallel Scavenge collector. It supports multi-threaded parallel collection and is implemented with the mark-compact algorithm.

image-20230212140005545

Before Parallel Old appeared, Parallel Scavenge could only be paired with the Serial Old collector. Because Serial Old drags down application performance, choosing Parallel Scavenge did not necessarily maximize overall throughput. This combination was not even guaranteed to be as good as ParNew plus CMS.

Only with the arrival of Parallel Old did the "throughput-first" collector get a combination worthy of the name. When throughput matters most or processor resources are scarce, the combination of Parallel Scavenge and Parallel Old deserves first consideration.


7. CMS Collector

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause time. A large share of Java applications run on the server side of Internet websites or browser-based B/S systems. Such applications usually care most about service response speed and want pauses to be as short as possible, to give users a good interactive experience.

The CMS collector is implemented with the mark-sweep algorithm. Its collection process has four steps:

  1. Initial mark

    1. Requires Stop The World. Only the objects directly reachable from GC Roots are marked, so it is very fast.
  2. Concurrent mark

    1. Traverses the entire object graph starting from the objects directly associated with GC Roots. It takes a long time but does not pause user threads, which run concurrently with the garbage collection threads.
  3. Remark

    1. Requires Stop The World. It corrects the mark records of objects whose marks changed because the user program kept running during concurrent marking.
    2. The pause is longer than the initial mark, but shorter than the concurrent mark phase.
  4. Concurrent sweep

    1. Deletes the objects that the marking phases judged to be dead. Because surviving objects do not need to be moved, this phase can also run concurrently with user threads.

image-20230212140536014
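
The four CMS steps and their pause behavior can be summarized as a small data model. This is purely an illustrative sketch: `CmsPhases` and its members are names invented here and do not correspond to any HotSpot class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the four CMS steps described above; not HotSpot code.
class CmsPhases {

    enum Phase {
        INITIAL_MARK(true),      // marks only objects directly reachable from GC Roots
        CONCURRENT_MARK(false),  // traverses the object graph alongside user threads
        REMARK(true),            // fixes marks that changed during concurrent marking
        CONCURRENT_SWEEP(false); // frees dead objects without moving survivors

        final boolean stopsTheWorld;

        Phase(boolean stopsTheWorld) {
            this.stopsTheWorld = stopsTheWorld;
        }
    }

    // Returns the phases that pause user threads, in execution order.
    static List<Phase> pausingPhases() {
        List<Phase> result = new ArrayList<>();
        for (Phase phase : Phase.values()) {
            if (phase.stopsTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println("Stop The World phases: " + pausingPhases());
    }
}
```

Only the two short marking steps pause user threads, while the two long-running steps are the concurrent ones.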

The advantages of CMS are concurrent collection and low pauses.

However, CMS is not perfect. It has the following shortcomings:

  1. The CMS collector is very sensitive to processor resources.
  2. CMS cannot handle floating garbage.
    1. User threads keep running during the concurrent mark and concurrent sweep phases, so the program naturally keeps producing new garbage. Garbage that appears after marking has finished cannot be removed in the current collection and has to wait for the next one.
  3. Because it is implemented with the mark-sweep algorithm, it leaves a lot of space fragmentation.
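
The fragmentation problem follows directly from how mark-sweep works, which a toy model can show. The sketch below is a deliberately simplified, single-threaded mark-sweep over an array of slots. It is not how CMS is implemented (there is no concurrency here), and every name in it is made up for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy stop-the-world mark-sweep over a slot array; an illustration only, not CMS itself.
class ToyMarkSweep {

    // heap[i] == null means slot i is free; otherwise it lists the slots that object i references.
    final int[][] heap;

    ToyMarkSweep(int[][] heap) {
        this.heap = heap;
    }

    // Mark: find every slot reachable from the given root slots.
    boolean[] mark(int... roots) {
        boolean[] marked = new boolean[heap.length];
        Deque<Integer> pending = new ArrayDeque<>();
        for (int root : roots) {
            pending.push(root);
        }
        while (!pending.isEmpty()) {
            int slot = pending.pop();
            if (heap[slot] == null || marked[slot]) {
                continue;
            }
            marked[slot] = true;
            for (int ref : heap[slot]) {
                pending.push(ref);
            }
        }
        return marked;
    }

    // Sweep: free unmarked slots in place and return how many were freed. Survivors never move.
    int sweep(boolean[] marked) {
        int freed = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !marked[i]) {
                heap[i] = null;
                freed++;
            }
        }
        return freed;
    }

    // '#' is a live slot, '.' is a free slot.
    String layout() {
        StringBuilder sb = new StringBuilder();
        for (int[] slot : heap) {
            sb.append(slot == null ? '.' : '#');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Slot 0 -> 2 -> 4 is reachable from the root; slots 1, 3 and 5 are garbage.
        ToyMarkSweep gc = new ToyMarkSweep(new int[][] { {2}, {}, {4}, {1}, {}, {} });
        int freed = gc.sweep(gc.mark(0));
        System.out.println("freed " + freed + " slots, layout " + gc.layout());
    }
}
```

After the sweep the three free slots are scattered between the three live ones, so a request that needs two adjacent slots cannot be satisfied even though half of the heap is free. That is the fragmentation that a mark-compact collector avoids by moving survivors.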

8. Garbage First collector

The Garbage First collector is also known as the G1 collector. It is a milestone in the history of garbage collector technology: it pioneered the idea of designing a collector around partial collection and introduced the Region-based memory layout.

G1 can form a collection set (CSet for short) out of any part of the heap. The criterion is no longer which generation a piece of memory belongs to, but which piece holds the most garbage and gives the greatest return when collected. This is the Mixed GC mode of the G1 collector.

The Region-based heap layout pioneered by G1 is the key to achieving this goal. Although G1 is still designed around generational collection theory, its heap layout differs sharply from that of other collectors: it no longer insists on a fixed size and a fixed number of generational areas, but divides the contiguous Java heap into many independent regions (Regions) of equal size. Each Region can act as Eden space, Survivor space, or old-generation space as needed. The collector applies different strategies to Regions playing different roles, so both newly created objects and old objects that have survived many collections are collected well.

A Region is the smallest unit that G1 collects at a time, which lets it plan collections so that it avoids collecting the entire Java heap. G1 tracks the value of the garbage in each Region, that is, how much space collecting it would free and an empirical estimate of how long that would take, and maintains a priority list in the background. At each collection it handles the Regions with the highest collection value first, within the pause time the user allows, which gives G1 the highest collection efficiency it can achieve in a limited time.
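
The idea of ranking Regions by value and filling a pause-time budget can be sketched as a simple greedy selection. This is a simplified illustration under stated assumptions, not G1's actual policy: the `Region` fields, the value formula (reclaimable space divided by estimated time), and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of value-based region selection; G1's real policy is more involved.
class RegionSelectionSketch {

    // Hypothetical per-region estimates. estimatedMillis is assumed to be greater than 0.
    static final class Region {
        final String name;
        final double reclaimableMb;
        final double estimatedMillis;

        Region(String name, double reclaimableMb, double estimatedMillis) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.estimatedMillis = estimatedMillis;
        }

        // Collection value: space gained per millisecond of estimated pause.
        double value() {
            return reclaimableMb / estimatedMillis;
        }
    }

    // Greedily picks the most valuable regions whose estimated cost fits the pause budget.
    static List<Region> chooseCollectionSet(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort((a, b) -> Double.compare(b.value(), a.value()));
        List<Region> collectionSet = new ArrayList<>();
        double usedMillis = 0;
        for (Region region : sorted) {
            if (usedMillis + region.estimatedMillis <= pauseBudgetMillis) {
                collectionSet.add(region);
                usedMillis += region.estimatedMillis;
            }
        }
        return collectionSet;
    }

    public static void main(String[] args) {
        List<Region> candidates = new ArrayList<>();
        candidates.add(new Region("A", 8, 4)); // value 2.0
        candidates.add(new Region("B", 6, 2)); // value 3.0
        candidates.add(new Region("C", 1, 5)); // value 0.2
        candidates.add(new Region("D", 5, 5)); // value 1.0
        for (Region region : chooseCollectionSet(candidates, 10)) {
            System.out.println(region.name);
        }
    }
}
```

With a 10 ms budget this sample picks B and then A (6 ms in total) and skips D and C, because adding either would exceed the budget. A smaller pause target shrinks the collection set in the same way.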

image-20230212161608985

Operation steps of the G1 collector:

  1. Initial mark
    1. Only marks the objects that GC Roots can reach directly and updates the TAMS pointer, so that user threads running concurrently in the next phase can correctly allocate new objects in the available Regions. This phase pauses user threads, but only briefly, and it piggybacks on a Minor GC, so it adds no extra pause of its own.
  2. Concurrent mark
    1. Starting from GC Roots, performs reachability analysis on the objects in the heap, recursively scanning the whole object graph to find the objects to reclaim. This phase takes a long time but runs concurrently with the user program. Once the object graph has been scanned, the objects recorded by SATB whose references changed during concurrent marking must be processed again.
  3. Final mark
    1. Pauses user threads briefly once more to process the few SATB records left over after the concurrent phase ends.
  4. Live data counting and evacuation
    1. Updates the statistics for each Region, sorts the Regions by collection value and cost, and draws up a collection plan based on the pause time the user expects. Any number of Regions can be chosen to form the collection set. The surviving objects in the chosen Regions are copied into empty Regions, and the old Regions are then cleared entirely. Because this moves surviving objects, user threads must be paused, and the work is done in parallel by multiple collector threads.

image-20230212162540809


Origin blog.csdn.net/weixin_51146329/article/details/128996486