Deep Understanding of the Java Virtual Machine (Second Edition) Learning 3: Garbage Collectors

garbage collector

If the collection algorithm is the methodology of memory recycling, then the garbage collector is the specific implementation of memory recycling.

The collectors discussed here are mainly those of the HotSpot VM after JDK 1.7 Update 14.

Serial collector

The Serial collector is the most basic and oldest collector, and before JDK 1.3.1 it was the only choice for young-generation collection in the virtual machine.

This is a single-threaded collector. "Single-threaded" here means more than using one CPU or one collection thread to do the garbage collection work: more importantly, while it collects, all other worker threads must be suspended until the collection finishes, as shown below:

[Figure: Serial collector at work]

Advantages:
Simple and efficient (compared with a single thread of any other collector). In an environment limited to a single CPU, the Serial collector has no thread-interaction overhead, so it can concentrate on garbage collection and naturally achieves the highest single-threaded collection efficiency.

The Serial collector is a good choice for virtual machines running in Client mode.

Serial Old Collector

Serial Old is the old-generation version of the Serial collector. It is also a single-threaded collector and uses the "mark-compact" algorithm.

The main purpose of this collector is also to serve virtual machines in Client mode.

In Server mode, it has two other uses:

  1. Works with the Parallel Scavenge collector in JDK 1.5 and earlier
  2. As a fallback for the CMS collector, it is used when Concurrent Mode Failure occurs in concurrent collections

[Figure: Serial Old collector at work]

ParNew collector

The ParNew collector is essentially a multi-threaded version of the Serial collector. Apart from using multiple threads for garbage collection, the rest of its behavior, including all the control parameters available to the Serial collector, the collection algorithm, Stop The World, the object allocation rules, and the reclamation strategy, is identical to the Serial collector. The ParNew collector works as follows:

[Figure: ParNew collector at work]

The ParNew collector is the young-generation collector of choice for many virtual machines running in Server mode. One reason is unrelated to performance but important: apart from the Serial collector, it is currently the only collector that works with the CMS collector.

By default, the ParNew collector starts as many collection threads as there are CPUs. In an environment with a large number of CPUs (for example, 32), you can use the -XX:ParallelGCThreads parameter to limit the number of garbage collection threads.

Parallel Scavenge Collector

The Parallel Scavenge collector is a young-generation collector. It also uses the copying algorithm and is also a parallel, multi-threaded collector... It looks the same as ParNew, so what is special about it?

Difference:

  • Collectors such as CMS: shorten the pause time of user threads during garbage collection as much as possible;
  • Parallel Scavenge collector: achieve a controllable throughput.

Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time). If the virtual machine runs for a total of 100 minutes, of which garbage collection takes 1 minute, the throughput is 99%.
High throughput uses CPU time efficiently and completes the program's computing tasks as soon as possible (at the cost of possibly longer pauses). It is mainly suitable for tasks that compute in the background and need little interaction, so it is commonly used in server environments, for example in applications that perform batch processing, order processing, payroll, or scientific computing.
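The throughput formula above can be checked with a few lines of Java. This is only a worked example of the arithmetic; the class and method names are made up for illustration and are not part of any JVM API.

```java
final class ThroughputExample {
    /** throughput = time running user code / (time running user code + GC time) */
    static double throughput(double userCodeMinutes, double gcMinutes) {
        return userCodeMinutes / (userCodeMinutes + gcMinutes);
    }

    public static void main(String[] args) {
        // 100 minutes in total, 1 of them spent in GC, so 99 minutes of user code.
        double t = throughput(99, 1);
        System.out.printf("throughput = %.0f%%%n", t * 100);
    }
}
```

With the numbers from the text (99 minutes of user code, 1 minute of GC) the result is 0.99, i.e. 99%.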

Therefore, the Parallel Scavenge collector is also known as the "throughput-first" collector.

Parameter Description

  • -XX:MaxGCPauseMillis
    controls the maximum garbage collection pause time (i.e. STW time). The allowed value is a number of milliseconds greater than 0, and the collector will try to keep each memory reclamation within the set value. Do not set this value too small: GC pause time is shortened at the expense of throughput and young-generation space, so as the pause time goes down, throughput goes down too.

  • -XX:GCTimeRatio
    sets the throughput target. The allowed value is an integer greater than 0 and less than 100. It determines the ratio of garbage collection time to total time, which corresponds to the inverse of throughput. If this parameter is set to 19, the maximum allowed GC time is 5% of the total time (i.e. 1 / (1 + 19)); the default value is 99, which allows at most 1% (i.e. 1 / (1 + 99)) of the time for garbage collection. The larger MaxGCPauseMillis is, the higher the achievable throughput tends to be.

  • -XX:+UseAdaptiveSizePolicy
    Sets the Parallel Scavenge collector to have a GC adaptive scaling policy (GC Ergonomics) .
    In this mode, parameters such as the size of the young generation, the ratio of Eden and Survivor, and the age of objects promoted to the old generation are automatically adjusted to achieve a balance between heap size, throughput and pause time . In situations where manual tuning is difficult, you can directly use this adaptive method to specify only the maximum heap, target throughput (GCTimeRatio), and pause time (MaxGCPauseMillis) of the virtual machine, and let the virtual machine complete the tuning work by itself .
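The relationship between -XX:GCTimeRatio and the allowed share of GC time, 1 / (1 + N), can be worked through in code. This is a minimal sketch of the arithmetic only; the class and method names are hypothetical, and the range check simply mirrors the "greater than 0 and less than 100" rule stated above.

```java
final class GcTimeRatioExample {
    /** Maximum fraction of total time the collector may spend in GC for a given GCTimeRatio. */
    static double maxGcFraction(int gcTimeRatio) {
        if (gcTimeRatio <= 0 || gcTimeRatio >= 100) {
            throw new IllegalArgumentException("GCTimeRatio must be in (0, 100): " + gcTimeRatio);
        }
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int ratio : new int[] {19, 99}) {
            double gc = maxGcFraction(ratio);
            System.out.printf("GCTimeRatio=%d -> GC at most %.1f%%, throughput target %.1f%%%n",
                    ratio, gc * 100, (1 - gc) * 100);
        }
    }
}
```

A value of 19 gives 1/20 = 5% GC time, and the default of 99 gives 1/100 = 1%, matching the figures in the parameter description.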

Features

  1. throughput priority
  2. With GC adaptive adjustment strategy, the virtual machine completes the adjustment work by itself

Parallel Old Collector

The Parallel Old collector is the old-generation version of the Parallel Scavenge collector, using multiple threads and the "mark-compact" algorithm.

Before this collector was introduced (before JDK 1.6), the young-generation Parallel Scavenge collector was in a rather awkward position.
The reason is that if the young generation used the Parallel Scavenge collector, the old generation had no choice but the Serial Old (PS MarkSweep) collector, because Parallel Scavenge does not work with CMS. Serial Old "dragged down" server-side performance (a single-threaded collector cannot make full use of the server's multiple CPUs), so using Parallel Scavenge did not necessarily maximize the throughput of the application as a whole.

The workflow of the Parallel Old collector is shown in the following figure:

[Figure: Parallel Old collector workflow]

CMS collector

The CMS (Concurrent Mark Sweep) collector is a collector whose goal is the shortest collection pause time. Some documents also call it the Concurrent Low Pause Collector. CMS is the first truly concurrent collector in the HotSpot virtual machine: for the first time, the garbage collection thread and the user threads work (basically) at the same time. As the name suggests (it contains "Mark Sweep"), the CMS collector is based on the "mark-sweep" algorithm, and its operation is divided into 7 steps (the book only describes 4...):

  • Initial mark (CMS initial mark), which causes STW
  • Concurrent mark (CMS concurrent mark), running concurrently with user threads
  • Preclean (CMS concurrent preclean), running concurrently with user threads
  • Abortable preclean (CMS concurrent abortable preclean), running concurrently with user threads
  • Remark (CMS remark), which causes STW
  • Concurrent sweep (CMS concurrent sweep), running concurrently with user threads
  • Concurrent reset (CMS concurrent reset), which resets state and waits for the next CMS trigger, running concurrently with user threads
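The seven phases listed above, and which of them pause the application, can be written down as a small enum. This is purely an illustrative data structure for these notes; the names are hypothetical and do not correspond to HotSpot source code.

```java
import java.util.EnumSet;

enum CmsPhase {
    INITIAL_MARK(true),
    CONCURRENT_MARK(false),
    CONCURRENT_PRECLEAN(false),
    CONCURRENT_ABORTABLE_PRECLEAN(false),
    REMARK(true),
    CONCURRENT_SWEEP(false),
    CONCURRENT_RESET(false);

    /** true if user threads are suspended for the whole phase. */
    final boolean stopTheWorld;

    CmsPhase(boolean stopTheWorld) {
        this.stopTheWorld = stopTheWorld;
    }

    /** The phases that cause a Stop The World pause. */
    static EnumSet<CmsPhase> pausingPhases() {
        EnumSet<CmsPhase> result = EnumSet.noneOf(CmsPhase.class);
        for (CmsPhase phase : values()) {
            if (phase.stopTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }
}

final class CmsPhaseDemo {
    public static void main(String[] args) {
        for (CmsPhase phase : CmsPhase.values()) {
            System.out.println(phase + (phase.stopTheWorld ? " (STW)" : " (concurrent)"));
        }
    }
}
```

Only the initial mark and remark phases are flagged as pausing, which is why CMS is described as a low-pause collector.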

The execution process is as follows:

[Figure: CMS collector execution phases]

Initial mark

This is one of the two "Stop The World" events in CMS. This step marks surviving objects in two parts:

  1. Mark all GC Roots objects in the old generation, as shown in node 1 in the following figure;
  2. Mark the old-generation objects referenced by live objects in the young generation (that is, objects still alive in the young generation whose references point into the old generation), as shown by nodes 2 and 3 in the following figure

[Figure: objects marked in the initial mark phase]

A concept is introduced here: cross-generational reference.
A cross-generational reference is a reference from an object in the young generation to an object in the old generation, or from an object in the old generation to an object in the young generation.

Generally, each generational garbage collector prefers to mind its own business and only mark the surviving objects of its own generation. For example, in a Young GC, when the GC thread encounters a reference into the old generation, it stops traversing, because it is only responsible for reclaiming young-generation memory and does not need to visit old-generation objects. This is not the case with the CMS collector.

Card Marking

During Young GC, cross-generational references are usually found in one of these ways:

  1. When the object reference path points to the old generation, continue to traverse the old generation object to find the cross-generation reference
  2. Linearly scan old generation objects, mark cross-generational references, and use sequential reads instead of discrete reads
  3. From the moment the program starts running, record every cross-generational reference that is created in a collection, and during Young GC scan that collection for cross-generational references pointing into the young generation.

The first two methods need to traverse the old generation objects during Young GC. Because there are many surviving objects in the old generation and the workload is too large, the JVM uses the third method.

How do cross-generational references arise?
A cross-generational reference (a->b) from the old generation to the young generation can arise in two ways.
One is that the GC thread moves object a from the young generation to the old generation.
The other is that a is already an old-generation object and the application thread modifies a reference in a so that it points to b in the young generation (for cross-generational references from the young generation to the old generation, only the second case exists).

For the cross-generational reference created by the GC thread itself, it can be directly recorded by the GC thread when it is created, so the question becomes: how to record the cross-generational reference created when the application thread modifies the object reference?

The JVM again uses divide and conquer: the old generation is divided into multiple Cards (similar to Linux memory pages), each 512 bytes in size.
Whenever an object reference within a Card is modified by the application thread, the Card is marked Dirty. Young GC then scans the old-generation memory areas corresponding to Dirty Cards and records the cross-generational references in them. This method is called Card Marking.

The JVM monitors reference modifications made by application threads through a write barrier and marks the corresponding Card. The write barrier works somewhat like the proxy pattern: when a reference assignment instruction is executed, instructions that update the corresponding Card Table entry are added.

Summary: The JVM uses Card Marking to avoid scanning all surviving objects in the old generation during Young GC, at the cost of extra instructions to implement the write barrier each time a reference is modified and extra memory to store the Card Table.
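The Card Marking idea can be sketched as a byte array indexed by address. The sketch below is a simplified model, not HotSpot's implementation: it assumes 512-byte cards (a shift of 9), uses made-up CLEAN/DIRTY values, and models addresses as plain long offsets. All class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class CardTableSketch {
    static final int CARD_SHIFT = 9;   // assumption: 2^9 = 512 bytes per card
    static final byte CLEAN = 0;       // hypothetical encoding
    static final byte DIRTY = 1;       // hypothetical encoding

    private final long oldGenBase;
    private final byte[] cards;

    CardTableSketch(long oldGenBase, long oldGenSizeBytes) {
        this.oldGenBase = oldGenBase;
        long cardSize = 1L << CARD_SHIFT;
        // One byte per card, rounding the old generation size up to whole cards.
        this.cards = new byte[(int) ((oldGenSizeBytes + cardSize - 1) >>> CARD_SHIFT)];
    }

    /** Maps an old-generation address to the index of the card that covers it. */
    int cardIndex(long address) {
        return (int) ((address - oldGenBase) >>> CARD_SHIFT);
    }

    /** Stand-in for the write barrier: runs whenever a reference field is stored to. */
    void onReferenceStore(long fieldAddress) {
        cards[cardIndex(fieldAddress)] = DIRTY;
    }

    /** What a Young GC would do: find the dirty cards to scan, then reset them. */
    List<Integer> collectAndClearDirtyCards() {
        List<Integer> dirty = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                dirty.add(i);
                cards[i] = CLEAN;
            }
        }
        return dirty;
    }
}
```

The trade-off described in the summary is visible here: every reference store pays for one extra array write, and the table costs one byte per 512 bytes of old generation, but a Young GC only has to look at the cards returned by collectAndClearDirtyCards().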

Concurrent mark

Starting from the objects marked in the "initial mark" phase, find all surviving objects.
Because this phase runs concurrently with the application, during it young-generation objects may be promoted to the old generation, objects may be allocated directly in the old generation, or the reference relationships of old-generation objects may be updated. All of these objects need to be re-marked, otherwise some objects would be missed. To make re-marking more efficient, this phase marks the Cards containing such objects as Dirty, so that later only the objects in Dirty Cards need to be scanned rather than the entire old generation.

The concurrent marking phase only marks the Cards whose references have changed as Dirty; it does not process them.
As shown in the figure below, starting from nodes 1, 2 and 3, nodes 4 and 5 are eventually found.

Concurrent marking runs concurrently with application threads, so not all surviving objects in the old generation will be marked, because the application changes some object references while marking is in progress.
Since this phase is concurrent with user threads, a "concurrent mode failure" may occur. If a concurrent mode failure occurs during CMS concurrent marking, a mark-sweep-compact Full GC follows, which is fully Stop The World.

[Figure: objects found during concurrent marking]

Preclean

As explained for the previous phase, not all surviving objects in the old generation can be marked, because the application changes some object references while marking is in progress. This phase handles the surviving objects that were left unmarked because their reference relationships changed during the previous phase: it scans all Cards marked Dirty.

As shown in the figure below, while the collector was running concurrently the reference of node 3 was changed to point to 6, so the Card of node 3 is marked Dirty:
[Figure: Card of node 3 marked Dirty]

Finally, 6 is marked as alive, as shown in the following figure:

[Figure: node 6 marked as alive]

Abortable preclean

This phase tries to take on as much of the work of the next phase, Final Remark, as it can. Its duration depends on many factors, because it keeps repeating the same work until one of the abort conditions is met (e.g. number of repetitions, amount of work, duration).

PS: The maximum duration of this phase is 5 seconds. One reason it may last as long as 5 seconds is the hope that a Young GC will occur within those 5 seconds and clean up young-generation references, which reduces the time the next phase (remark) spends scanning references from the young generation to the old generation.

Remark

This phase causes the second Stop The World pause. Its task is to complete the marking of all surviving objects in the entire old generation.

In this phase, the memory range scanned for remarking is the entire heap, including Young Gen and Old Gen.

Why scan the young generation?
Because an old-generation object is regarded as alive if it is referenced by an object in the young generation. Even if that young-generation object is itself unreachable, it is still treated as a "GC Root" by CMS when scanning the old generation.

Therefore, the young-generation objects that reference old-generation objects are also treated as "GC Roots" for the old generation.

When this phase takes a long time, you can add the parameter -XX:+CMSScavengeBeforeRemark. It performs a Young GC before remarking to reclaim useless young-generation objects and to move the survivors into the Survivor area or promote them to the old generation. Scanning the young generation then only requires scanning the objects in the Survivor area, which is usually small, so the scanning time is greatly reduced.

Since the preceding preclean phases run concurrently with user threads, the references from young-generation objects to the old generation may have changed a lot by this point. The remark phase then needs a lot of time to process these changes, which leads to a long Stop The World pause, so CMS usually tries to run the Final Remark phase when the young generation is as clean as possible.

Alternatively, parallel remarking can be turned on: -XX:+CMSParallelRemarkEnabled.

Concurrent sweep

After the marking phases above, all surviving objects in the old generation have been marked, and the garbage collector now sweeps away the objects that are no longer usable.
This phase mainly clears the unmarked objects and reclaims their space.
Because user threads are still running during the CMS concurrent sweep phase, new garbage is naturally produced as the program runs. This garbage appears after the marking process, so CMS cannot dispose of it in the current collection and has to leave it for the next GC. This garbage is called "floating garbage".

Since the collector threads can work together with user threads during concurrent mark and concurrent sweep, the longest phases of the whole process, the memory reclamation of the CMS collector is, on the whole, performed concurrently with user threads.

Advantages of the CMS collector

  • concurrent collection
  • low pause

Disadvantages of CMS collectors

Very sensitive to CPU resources

The CMS collector is very sensitive to CPU resources. In the concurrent phases it does not pause user threads, but it slows the application down because it occupies some threads (or CPU resources), and total throughput is reduced. By default CMS starts (number of CPUs + 3) / 4 collection threads. With 4 or more CPUs, the garbage collection threads take no less than 25% of the CPU resources during concurrent collection, and this share decreases as the number of CPUs increases.
To deal with this, the virtual machine provides a variant of the CMS collector called "Incremental Concurrent Mark Sweep" (i-CMS), which lets the GC threads and user threads run alternately during concurrent marking and concurrent sweeping to minimize the time the GC threads monopolize resources. The whole garbage collection process takes longer, but the impact on user programs is smaller. (In practice the effect is not obvious, and i-CMS is no longer recommended.)
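The default thread count formula, (number of CPUs + 3) / 4 with integer division, is easy to tabulate. The sketch below only reproduces that arithmetic; the class and method names are invented for illustration.

```java
final class CmsThreadCountExample {
    /** Default number of CMS collection threads: (CPUs + 3) / 4, integer division. */
    static int defaultCmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    /** Fraction of the CPUs that those threads occupy during concurrent collection. */
    static double cpuShare(int cpus) {
        return (double) defaultCmsThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 2, 4, 5, 8, 16, 32}) {
            System.out.printf("%2d CPUs -> %d CMS thread(s), %.0f%% of the CPUs%n",
                    cpus, defaultCmsThreads(cpus), cpuShare(cpus) * 100);
        }
    }
}
```

With 2 CPUs a single collection thread already takes half of the machine, which shows why CMS hurts most on small machines; from 4 CPUs upward the share stays at or above 25%.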

Can't handle floating garbage

The CMS collector cannot handle floating garbage, and a "Concurrent Mode Failure" may occur, leading to another Full GC.

Floating garbage was mentioned above under concurrent sweep: the program is still running during sweeping and keeps producing new garbage. This garbage appears after the marking process, so CMS cannot dispose of it in the current collection and has to leave it for the next GC.

Also, because user threads must keep running during garbage collection, enough memory has to be reserved for them. The CMS collector therefore cannot wait until the old generation is almost completely full before collecting, as other collectors do; it has to reserve part of the space for the program to use during concurrent collection.
The -XX:CMSInitiatingOccupancyFraction parameter sets the occupancy threshold at which the CMS collector starts. In JDK 1.6, the default start threshold of the CMS collector was raised to 92%.

If the memory reserved while CMS is running cannot meet the program's needs, a "Concurrent Mode Failure" occurs. The virtual machine then falls back to its backup plan: it temporarily starts the Serial Old collector to redo the old-generation collection, so the pause becomes very long. Setting -XX:CMSInitiatingOccupancyFraction too high can therefore easily cause many "Concurrent Mode Failure" events and degrade performance.
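The interplay between the start threshold and a "Concurrent Mode Failure" can be illustrated with a deliberately simplified model. It assumes that a cycle starts once old-generation occupancy reaches the threshold, and that the failure happens when the space already used plus the space the application needs during the concurrent cycle exceeds the capacity. Real CMS behavior is more involved; all names here are hypothetical.

```java
final class CmsTriggerSketch {
    /** True once old-generation occupancy reaches the configured percentage. */
    static boolean shouldStartCycle(long usedBytes, long capacityBytes, int initiatingOccupancyPercent) {
        return usedBytes * 100 >= capacityBytes * initiatingOccupancyPercent;
    }

    /** Simplified model: the reserve is exhausted before the concurrent cycle finishes. */
    static boolean concurrentModeFailure(long usedAtCycleStart, long neededDuringCycle, long capacityBytes) {
        return usedAtCycleStart + neededDuringCycle > capacityBytes;
    }

    public static void main(String[] args) {
        long capacity = 1000;          // arbitrary units for the illustration
        long neededDuringCycle = 100;  // space the application asks for while CMS runs
        for (int threshold : new int[] {68, 92}) {
            long usedAtStart = capacity * threshold / 100;
            System.out.printf("threshold %d%% -> concurrent mode failure: %b%n",
                    threshold, concurrentModeFailure(usedAtStart, neededDuringCycle, capacity));
        }
    }
}
```

In this model a 92% threshold leaves only 8% of the old generation as a reserve, so an application that needs 10% during the cycle fails, while a lower threshold leaves enough headroom.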

A lot of space fragmentation is left at the end of a collection

As mentioned earlier, the CMS collector is based on the "mark-sweep" algorithm, which means a lot of space fragmentation is left at the end of a collection. Too much fragmentation makes allocating large objects troublesome: there is often plenty of space left in the old generation, but no contiguous block large enough for the current object can be found, so a Full GC has to be triggered early.

To solve the fragmentation problem, the CMS collector provides the -XX:+UseCMSCompactAtFullCollection switch (enabled by default), which starts a memory compaction pass when the CMS collector can no longer hold out and has to perform a Full GC. Memory compaction cannot be performed concurrently, so fragmentation is eliminated but the pause time becomes longer.
The virtual machine designers also provide another parameter, -XX:CMSFullGCsBeforeCompaction, which sets how many Full GCs are performed without compaction before one with compaction follows (the default value is 0, which means every Full GC compacts).
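The counting rule described for the compaction parameter (N Full GCs without compaction, then one with compaction, and 0 meaning "compact every time") can be sketched as a small counter. This models only the rule as stated in the text, not the JVM's internal code; the class and method names are hypothetical.

```java
final class FullGcCompactionPolicy {
    private final int fullGcsBeforeCompaction;
    private int uncompactedSinceLastCompaction = 0;

    FullGcCompactionPolicy(int fullGcsBeforeCompaction) {
        if (fullGcsBeforeCompaction < 0) {
            throw new IllegalArgumentException("must be >= 0");
        }
        this.fullGcsBeforeCompaction = fullGcsBeforeCompaction;
    }

    /** Called once per Full GC; returns true if this Full GC should compact the old generation. */
    boolean onFullGc() {
        if (uncompactedSinceLastCompaction >= fullGcsBeforeCompaction) {
            uncompactedSinceLastCompaction = 0;
            return true;
        }
        uncompactedSinceLastCompaction++;
        return false;
    }

    public static void main(String[] args) {
        FullGcCompactionPolicy policy = new FullGcCompactionPolicy(2);
        for (int i = 1; i <= 6; i++) {
            System.out.println("Full GC #" + i + " compacts: " + policy.onFullGc());
        }
    }
}
```

A larger value trades shorter Full GC pauses most of the time for more accumulated fragmentation between compactions.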

Specific examples of space fragmentation problems

During a Minor GC, the unused Survivor space may not be large enough to hold the surviving objects from Eden and the other Survivor space. The excess is then moved to the old generation, which is called Premature Promotion. It causes short-lived objects to accumulate in the old generation and can lead to serious performance problems.
Furthermore, if the old generation is full, a Full GC is performed after the Minor GC, which traverses the entire heap. This is called a Promotion Failure.

Promotion failure happens when, during a Minor GC, objects do not fit in the Survivor space and can only be placed in the old generation, but the old generation is too fragmented and the objects being moved from the young generation are relatively large, so no contiguous memory area can be found to store them.

Reasons for premature promotion:

  1. The Survivor space is too small to hold all the objects with short lifetimes at runtime.
    If this is the cause, try enlarging the Survivor space; otherwise short-lived objects are promoted too quickly, the old generation fills up fast, and Full GCs become frequent;
  2. Objects are too large, and Survivor and Eden do not have enough space to store these large objects.

Reason for promotion failure:
During promotion, the old generation turns out not to have enough contiguous space to hold the object. Why "not enough contiguous space" rather than "not enough free space"?

There are two situations in which the old generation cannot accommodate promoted objects:

  1. There is not enough free space in the old generation;
  2. There is plenty of free space in the old generation, but it is too fragmented, and no contiguous free block is large enough to store the object.
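The difference between "enough free space" and "enough contiguous space" can be shown with a toy free list. The sketch assumes the old generation's free memory is described by an array of free chunk sizes and that an object needs a single chunk at least as large as itself; the names are hypothetical and the model ignores everything else a real allocator does.

```java
final class FragmentationSketch {
    /** Total free space: the sum of all free chunks. */
    static long totalFree(long[] freeChunkSizes) {
        long total = 0;
        for (long size : freeChunkSizes) {
            total += size;
        }
        return total;
    }

    /** An object can be promoted only if one single chunk can hold it. */
    static boolean canPromote(long[] freeChunkSizes, long objectSize) {
        for (long size : freeChunkSizes) {
            if (size >= objectSize) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] freeChunks = {64, 128, 32, 96};  // fragmented old generation, arbitrary units
        long objectSize = 200;
        System.out.println("total free = " + totalFree(freeChunks));
        System.out.println("can promote " + objectSize + "? " + canPromote(freeChunks, objectSize));
    }
}
```

Here 320 units are free in total, yet a 200-unit object cannot be promoted because the largest chunk is only 128 units, which corresponds to the second situation in the list above.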

Solution

  1. If promotion of large objects fails because of memory fragmentation, CMS needs to defragment and compact the space;
  2. If it is caused by objects being promoted too quickly, the free space in the Survivor area is insufficient, so try enlarging the Survivor space;
  3. If it is caused by insufficient space in the old generation, try lowering the occupancy threshold that triggers CMS.

The content of the CMS part is borrowed from:
https://zhuanlan.zhihu.com/p/150696908
https://www.jianshu.com/p/86e358afdf17

G1 collector

The G1 collector is a garbage collector for server-side applications. Its biggest feature is the idea of partitioning the heap into regions, which weakens the concept of generations, makes reasonable use of resources in each garbage collection cycle, and addresses many shortcomings of other collectors, CMS included.
The G1 collector has the following characteristics:

  • Parallelism and concurrency: G1 can take full advantage of multi-CPU, multi-core hardware, using multiple CPUs (CPUs or CPU cores) to shorten Stop-The-World pauses. For some GC actions that other collectors must pause Java threads to perform, G1 can still let the Java program keep running by doing them concurrently.
  • Generational collection: As in other collectors, the concept of generations is preserved in G1. G1 can manage the entire GC heap on its own without cooperating with other collectors, yet it treats newly created objects differently from old objects that have already survived for a while and through several GCs, in order to get better collection results.
  • Space compaction: Unlike the "mark-sweep" algorithm of CMS, G1 as a whole is based on the "mark-compact" algorithm and locally (between two Regions) on the "copying" algorithm. Either way, both algorithms mean that G1 produces no memory fragmentation while it runs and leaves regular, usable memory after a collection. This helps long-running programs: allocating a large object will not trigger the next GC early for lack of contiguous memory.
  • Predictable pauses: This is another major advantage of G1 over CMS. Reducing pause time is a concern of both G1 and CMS, but beyond low pauses G1 can also build a predictable pause-time model that lets users explicitly specify that, within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection. This is almost a feature of real-time Java (RTSJ) garbage collectors.

The rest will be added after I go back and read more literature on the G1 collector. "Deep Understanding of the Java Virtual Machine" (Second Edition) is too shallow on the G1 collector (perhaps the second edition is simply too old)…

PS: You can also go to my personal blog to see more content
Personal blog address: Xiaoguan classmate's blog


Origin blog.csdn.net/weixin_45784666/article/details/120621566