Detailed explanation of the JVM (6) | The CMS garbage collector in detail

Welcome to my homepage to see the rest of my JVM column.

I have been too busy recently, so updating articles has been a lot slower, but no matter how busy I am, I will still insist on finishing this series. Those who follow jvm, please rest assured~

In the last article, we talked about a few basic garbage collectors. Starting with this article, we will cover the more commonly used garbage collectors and show you how to read garbage collection logs. These topics come up often in job interviews. Be warned that this first article is fairly difficult: skimming is not recommended, and it is not aimed at people who are just getting started in this field.

1. Introduction

CMS, short for Concurrent Mark Sweep, is a garbage collector whose goal is low pause times. It is a truly concurrent collector. Browser/server (B/S) applications usually have strict requirements for low pauses, because short pauses improve the user's interactive experience. CMS meets exactly this requirement, which is why it is widely used on the server side of all kinds of applications.

When learning this collector, keep in mind that CMS is a collector for the old generation. On JDK 1.8, when the -XX:+UseConcMarkSweepGC parameter is turned on, the young generation uses ParNew by default, which is why ParNew entries appear in the GC log after CMS is enabled.
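If you want to check which collectors a running JVM really selected, one option is to list the names of its GarbageCollectorMXBeans. This is a minimal sketch using the standard java.lang.management API; the class name ActiveCollectors is made up for illustration. On a JDK 1.8 JVM started with -XX:+UseConcMarkSweepGC I would expect the names ParNew and ConcurrentMarkSweep, but the output depends on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of any library.
class ActiveCollectors {
    /** Returns the names of the collectors registered in this JVM. */
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // The printed names depend on the JVM version and the GC flags in use.
        System.out.println(names());
    }
}
```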

2. Working process

1. Main work flow

CMS collection uses the mark-sweep algorithm and goes through the following stages:

  • Initial Mark (STW)

  • Concurrent Mark

  • Concurrent Preclean

  • Concurrent Abortable Preclean

  • Final Remark (STW)

  • Concurrent Sweep

  • Concurrent Reset

Let's go through them one by one:

  • Initial Mark: This stage starts from the GC Roots and marks all live objects in the old generation that are directly referenced by a GC Root. In practice the young generation is scanned as well, because before scanning it is not known whether an object referenced by a GC Root lives in the old generation. You can think of this stage as the step from state 1 to state 2 in the three-color marking discussion in the algorithm article. This stage is STW for a simple reason: if it ran together with the user threads, its accuracy would suffer badly.

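This stage is also known as root node enumeration. The collectors covered later, such as G1 and ZGC, have this stage too, and in them it is also STW, again to guarantee the accuracy of the work done in this stage.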

This stage is very fast, so even though it is STW, the pause is short. It can also be made multi-threaded with the CMSParallelInitialMarkEnabled parameter.

  • Concurrent Mark: This stage starts from the objects found in the previous stage and traverses the entire object reference chain. It takes a long time, but it runs concurrently with the user threads, so it causes no pause. Note, however, that the application threads keep allocating objects while this stage runs, so CMS must start a GC early enough to keep some space in reserve for those allocations. If the reserved space runs out, a "concurrent mode failure" is triggered, which is a very serious problem: CMS then degenerates to Serial Old for the collection, which causes a very long pause.

  • Concurrent Preclean: In this stage the GC threads also run concurrently with the application threads. Because the concurrent mark stage ran alongside the application, some reference relationships may have changed. With card marking, the old generation is logically divided in advance into equal-sized regions (cards). When a reference changes, the JVM marks the affected region as a "dirty card". This stage finds those dirty cards, refreshes the reference relationships, and clears the dirty marks.

  • Concurrent Abortable Preclean: This stage does not stop the application threads. It tries to do as much work as possible before the STW Final Remark stage, in order to shorten the pause of that stage. It loops over two tasks: marking reachable objects in the old generation, and scanning and processing objects in dirty cards. The loop ends when any of these conditions is met: 1. the maximum number of iterations is reached; 2. the loop's execution time threshold is reached; 3. the memory usage of the young generation reaches its threshold. The maximum duration of this stage (condition 2) is 5 seconds. It is allowed to run that long in the hope that a young GC happens within those 5 seconds and cleans up the young generation, which reduces the time the following remark stage spends scanning the young generation for references into the old generation.

  • Final Remark: This stage is STW. Its goal is to complete the marking of all live objects in the old generation. It does three things: 1. traverses the young generation objects and re-marks; 2. re-marks from the GC Roots; 3. traverses the dirty cards in the old generation and re-marks.

  • Concurrent Sweep: This stage runs concurrently with the application, without an STW pause, and clears garbage objects based on the marking results.

  • Concurrent Reset: This stage runs concurrently with the application and resets the internal data of the CMS algorithm to prepare for the next GC cycle.
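To make the stages above more concrete, here is a toy model of one cycle on a five-object heap. It is only a sketch under heavy assumptions: the class ToyCmsCycle, its integer object ids and the dirty set (standing in for dirty cards) are all invented for illustration, everything runs on one thread, and none of it reflects how HotSpot actually implements CMS. What it shows is why the dirty-card rescan matters: object 4 only becomes reachable through a write that happens after object 1 has already been scanned.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only; names and data structures are hypothetical.
class ToyCmsCycle {
    final Map<Integer, List<Integer>> heap = new HashMap<>(); // object id -> referenced ids
    final Set<Integer> marked = new HashSet<>();
    final Set<Integer> dirty = new HashSet<>();               // stand-in for dirty cards

    void object(int id, Integer... refs) {
        heap.put(id, new ArrayList<>(Arrays.asList(refs)));
    }

    // Marks everything reachable from the given starting objects.
    void trace(Iterable<Integer> starts) {
        Deque<Integer> work = new ArrayDeque<>();
        for (int s : starts) work.push(s);
        while (!work.isEmpty()) {
            for (int ref : heap.get(work.pop())) {
                if (marked.add(ref)) work.push(ref);
            }
        }
    }

    // Application write while marking is in progress: the changed object is recorded.
    void mutatorWrite(int from, int to) {
        heap.get(from).add(to);
        dirty.add(from);
    }

    Set<Integer> runCycle(List<Integer> rootTargets) {
        marked.addAll(rootTargets);        // initial mark (STW): direct root targets only
        trace(rootTargets);                // concurrent mark: follow the reference chains
        mutatorWrite(1, 4);                // the application changes a reference meanwhile
        trace(new ArrayList<>(dirty));     // preclean / final remark: rescan dirty areas
        dirty.clear();
        Set<Integer> garbage = new TreeSet<>(heap.keySet());
        garbage.removeAll(marked);         // concurrent sweep: unmarked objects are garbage
        heap.keySet().removeAll(garbage);
        return garbage;
    }

    public static void main(String[] args) {
        ToyCmsCycle gc = new ToyCmsCycle();
        gc.object(1, 2); gc.object(2, 3); gc.object(3); gc.object(4); gc.object(5, 4);
        System.out.println("swept: " + gc.runCycle(Arrays.asList(1)));
    }
}
```

In this setup only object 5 ends up unreachable. Without the rescan of the dirty set, object 4 would have been swept by mistake.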

2. CMS background and foreground modes

This is an important point about CMS that many people don't know. People who tune GC with CMS often wonder why a GC occurs before the CMSInitiatingOccupancyFraction threshold is reached. The reason is that this threshold is not the only thing that triggers a CMS GC.

a. Foreground mode: This mode is relatively easy to understand. It is triggered when memory has to be allocated for an object but there is not enough space. The mark-sweep algorithm is used and no compaction is performed.

b. Background mode: This mode is more complicated. The CMS background thread scans continuously, and once it finds that a trigger condition is met, it starts a background-mode GC. Let's look at the specific conditions.

I. The GC was requested by System.gc() and ExplicitGCInvokesConcurrent is configured, or the cause is the GC locker and GCLockerInvokesConcurrent is configured.

Let's take the two cases in turn. With CMS, a call to System.gc() triggers a full GC, and normally a full GC suspends the application for its entire duration. If ExplicitGCInvokesConcurrent is configured, however, part of the work runs concurrently (as a background GC), which improves efficiency.

The GC locker is a complicated topic. Put simply, GC is not allowed while a thread is executing inside a JNI critical section. If a GC ran before that thread finished and cleared the data it needs, things would break (a typical scenario is a native method using JNI functions to access string or array data in the JVM). So what happens when a GC is needed while a thread is inside a critical section? The GC is blocked (this is the GC locker's job). After all threads have left their critical sections, the GC locker triggers a GC, and that GC is a background GC.

II. Dynamic calculation based on statistics (only when UseCMSInitiatingOccupancyOnly is not configured). In this case the JVM decides dynamically, based on statistical data, whether a CMS GC is required.

The logic is: if the predicted time for a CMS GC to complete is greater than the estimated time until the old generation fills up, a GC is started. These predictions rely on statistics from previous CMS GCs. Before the first CMS GC, no valid statistics exist yet, so the occupancy ratio of the old generation is used to decide instead.

III. Based on old generation usage. There are two cases. In the first, UseCMSInitiatingOccupancyOnly is configured, and a GC is triggered when the proportion of the old generation in use exceeds the CMSInitiatingOccupancyFraction we set (92% by default if not set). In the second, UseCMSInitiatingOccupancyOnly is not configured, and there are two sub-cases. One is that the old generation has just been expanded to satisfy an object allocation, which triggers a background GC (this shows why it matters to set -Xmx and -Xms to the same value). The other is more complicated and involves the free list of the CMS old generation; the complexity comes from the free list itself. Put simply, the CMS background thread triggers a background GC when it determines that the free list does not have enough space for the objects expected to be promoted to the old generation next.

IV. Whether a young GC would fail (similar to the pessimistic strategy of Parallel GC mentioned before; CMS behaves this way too). When a young GC is expected to fail, a background GC is triggered. Why would a young GC fail? Most often because the old generation does not have enough space. The JVM also tracks the average size of previous promotions; if the old generation does not have room for that amount, an old generation GC is triggered as well, and it is also a background GC.

V. Based on metaspace usage. If CMSClassUnloadingEnabled is configured (this parameter controls whether CMS may reclaim metaspace), a CMS GC is triggered when the metaspace is expanded. This case looks very strange when it happens: the old generation is clearly barely used, yet a GC occurs.
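Conditions II and III can be sketched as a small decision function, which also shows why a GC can start below the configured threshold. This is a simplified model, not HotSpot's code: the class, method and parameter names are hypothetical, the expansion, free-list, young-GC and metaspace triggers are left out, and reusing the initiating fraction before any statistics exist is my own assumption, since the text above does not say which threshold applies then.

```java
// Simplified, hypothetical model of two background triggers; not HotSpot code.
class CmsTriggerSketch {
    static boolean shouldStartBackgroundCycle(
            boolean useOccupancyOnly,      // -XX:+UseCMSInitiatingOccupancyOnly
            double initiatingFraction,     // -XX:CMSInitiatingOccupancyFraction / 100
            double oldGenOccupancy,        // used / capacity of the old generation
            boolean hasStatistics,         // false until a first CMS cycle has completed
            double predictedCycleSeconds,  // estimated time a CMS cycle needs
            double secondsUntilFull) {     // estimated time until the old generation is full
        // Condition II: statistics-based decision, skipped when only the threshold counts.
        if (!useOccupancyOnly && hasStatistics && predictedCycleSeconds > secondsUntilFull) {
            return true;
        }
        // Condition III: plain occupancy threshold (assumed fallback before statistics exist).
        return oldGenOccupancy > initiatingFraction;
    }

    public static void main(String[] args) {
        // 60% occupancy is below the 92% threshold, but the cycle would not finish in time.
        System.out.println(shouldStartBackgroundCycle(false, 0.92, 0.60, true, 8.0, 5.0));
        // Same situation with UseCMSInitiatingOccupancyOnly: only the threshold matters.
        System.out.println(shouldStartBackgroundCycle(true, 0.92, 0.60, true, 8.0, 5.0));
    }
}
```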

3. Trade-off between collection and compaction

We know that CMS uses the mark-sweep algorithm, which means it does not normally compact the space. This leads to memory fragmentation, and fragmentation lowers the efficiency of object allocation. If fragmentation is bad enough that a large object cannot be allocated, a full GC must be performed. The parameter -XX:+UseCMSCompactAtFullCollection controls whether objects are compacted during that full GC. Compaction solves the fragmentation problem, but the pause becomes longer. It is also a passive solution, so pause times easily become uncontrollable or hard to predict. To make pause times easier to predict and control, CMS provides another parameter, -XX:CMSFullGCsBeforeCompaction. It tells CMS to compact on the next full GC only after that many full GCs have run without compaction. The default value is 0, which means every full GC compacts. If the value is too small, compaction happens too often and reduces efficiency; if it is too large, fragmentation builds up and slows down object allocation. To tune this value precisely in practice, you have to observe both throughput and pause time under stress testing and pick the most suitable value.
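The counting rule for CMSFullGCsBeforeCompaction is easy to get wrong, so here is a small sketch of it as described above. The class CompactionPolicySketch and its methods are hypothetical and only model the counter, assuming UseCMSCompactAtFullCollection is enabled; it is not taken from the JVM source.

```java
// Hypothetical model of the compaction counter; not JVM source code.
class CompactionPolicySketch {
    private final int fullGcsBeforeCompaction; // value of -XX:CMSFullGCsBeforeCompaction
    private int fullGcsSinceCompaction = 0;

    CompactionPolicySketch(int fullGcsBeforeCompaction) {
        this.fullGcsBeforeCompaction = fullGcsBeforeCompaction;
    }

    /** Called once per full GC; returns true if this full GC should compact. */
    boolean onFullGc() {
        if (fullGcsSinceCompaction >= fullGcsBeforeCompaction) {
            fullGcsSinceCompaction = 0;
            return true;
        }
        fullGcsSinceCompaction++;
        return false;
    }

    /** 'C' marks a compacting full GC, '-' a full GC without compaction. */
    static String pattern(int n, int fullGcs) {
        CompactionPolicySketch policy = new CompactionPolicySketch(n);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fullGcs; i++) {
            sb.append(policy.onFullGc() ? 'C' : '-');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(pattern(0, 6)); // CCCCCC
        System.out.println(pattern(2, 6)); // --C--C
    }
}
```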

3. Disadvantages of the CMS collector

CMS has never been the default collector in any JDK version, because it has several obvious shortcomings.

1. Memory fragmentation

Memory fragmentation is caused by the mark-sweep algorithm. It was covered above and is not repeated here.

2. Floating garbage

Because CMS runs concurrently with the user threads, the running program naturally produces new garbage in the meantime. CMS cannot process this garbage in the current cycle; it has to wait until the next collection. This is floating garbage.

3. Concurrent mode failure

Precisely because the user program keeps running, CMS cannot wait until the old generation is 100% full before collecting. This is the reason for configuring CMSInitiatingOccupancyFraction. It is worth mentioning that if, during a collection, the old generation does not have enough space for the objects produced by the user program, a concurrent mode failure is triggered, which results in a full GC. Setting CMSInitiatingOccupancyFraction therefore requires a trade-off: if it is too large, concurrent mode failures are easy to trigger; if it is too small, collections happen frequently, which increases pause time and reduces throughput.
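The headroom argument can be turned into a rough calculation: the space that fills up while a concurrent cycle runs has to be free when the cycle starts, which gives an upper bound for -XX:CMSInitiatingOccupancyFraction. The sketch below assumes a constant promotion rate and a known cycle duration; the class name and all numbers (512 MB old generation, 20 MB/s, 3 s) are made up for illustration, and real values have to come from your own GC logs.

```java
// Back-of-the-envelope estimate; the inputs are hypothetical example numbers.
class InitiatingOccupancyEstimate {
    /**
     * Highest occupancy percentage at which a concurrent cycle can start and still
     * finish before the old generation fills up, assuming a constant promotion rate.
     */
    static double maxInitiatingPercent(double oldGenMb, double promotionMbPerSec,
                                       double cycleSeconds) {
        double headroomMb = promotionMbPerSec * cycleSeconds; // space consumed during the cycle
        return Math.max(0.0, (oldGenMb - headroomMb) / oldGenMb * 100.0);
    }

    public static void main(String[] args) {
        // 20 MB/s * 3 s = 60 MB of headroom, so (512 - 60) / 512 = 88.28125 percent
        System.out.println(maxInitiatingPercent(512, 20, 3));
    }
}
```

In practice you would leave a safety margin below this bound, because floating garbage and allocation bursts are not part of the model.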

4. Common parameters

In addition to the parameters mentioned in the article, the following parameters are more commonly used:

1、-XX:+UseConcMarkSweepGC

The default is false. It makes the old generation use the CMS collector. On 1.8, the young generation then uses ParNew by default.

2、-XX:+CMSScavengeBeforeRemark

The default is false. When enabled, a young GC runs before the remark to reclaim dead objects in the young generation, moving survivors into the survivor space or promoting them to the old generation. The following scan of the young generation then only needs to cover the objects in the survivor space. The survivor space is usually small, so this greatly reduces scan time and therefore STW time.

3、-XX:+UseCMSCompactAtFullCollection

The default is true. When enabled, CMS compacts the object layout in the old generation during a full GC. It is used together with the fourth parameter.

4、-XX:CMSFullGCsBeforeCompaction=n

If the third parameter is true, the object layout is compacted after n full GCs without compaction. If n is 0, it is compacted on every full GC.

5、-XX:ParallelGCThreads=n

This is the number of GC threads used in the parallel (STW) phases; the number of concurrent CMS threads is derived from it unless it is set separately. Generally, the default value can be used.

6、-XX:CMSInitiatingOccupancyFraction=n

This parameter is used together with the seventh parameter. It triggers a major GC (that is, an old GC) when n percent of the old generation is occupied. On 1.8 the default is 92, but if the application handles a lot of concurrent load (meaning objects are allocated quickly), this value should be set lower.

7、-XX:+UseCMSInitiatingOccupancyOnly

CMS automatically adjusts the threshold that triggers a major GC. If you want only the value we set to be used, you also need this parameter.
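When tuning, it helps to confirm which of these parameters a running JVM was actually started with. The sketch below filters the JVM's input arguments, obtained through the standard RuntimeMXBean.getInputArguments() call, for the seven flags listed in this section. The class CmsFlagCheck and its parsing rules are my own illustration; flags that were left at their defaults do not show up this way.

```java
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; only the flag names come from the list above.
class CmsFlagCheck {
    static final List<String> CMS_FLAG_NAMES = Arrays.asList(
            "UseConcMarkSweepGC", "CMSScavengeBeforeRemark", "UseCMSCompactAtFullCollection",
            "CMSFullGCsBeforeCompaction", "ParallelGCThreads",
            "CMSInitiatingOccupancyFraction", "UseCMSInitiatingOccupancyOnly");

    /** Keeps only the -XX arguments that set one of the flags listed in this section. */
    static List<String> cmsRelated(List<String> jvmArgs) {
        List<String> hits = new ArrayList<>();
        for (String arg : jvmArgs) {
            if (!arg.startsWith("-XX:")) continue;
            String body = arg.substring(4);            // "+Name", "-Name" or "Name=value"
            if (body.startsWith("+") || body.startsWith("-")) body = body.substring(1);
            int eq = body.indexOf('=');
            String name = eq >= 0 ? body.substring(0, eq) : body;
            if (CMS_FLAG_NAMES.contains(name)) hits.add(arg);
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prints the explicitly passed CMS-related flags of the current JVM (possibly none).
        System.out.println(cmsRelated(ManagementFactory.getRuntimeMXBean().getInputArguments()));
    }
}
```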

5. Log interpretation

1. Focus on gc logs

The main purpose here is to teach you how to read CMS logs and how to use a GC visual analysis tool, GC Viewer. Before reading GC logs, we must first understand what to look for; otherwise there is little point in reading them. These are the basic points to pay attention to:

  • The number of gcs within a certain period of time

  • Time consuming for a single gc

  • Number of full gcs

  • The cause of each GC

One more thing to note: the JVM has no specification for GC logs (one is currently being planned), so different garbage collectors produce different logs, which is why reading GC logs is a bit of a headache.
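The first two points, the number of GCs and the time they take, can also be read from inside the process without touching the log. The following is a minimal sketch using the standard GarbageCollectorMXBean API; the class name GcCounters is made up, and the values are cumulative since JVM start, so for "within a certain period" you would sample twice and subtract. The cause of each GC is not available this way.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class for illustration.
class GcCounters {
    /** Average time per collection in milliseconds, or 0 if there were none. */
    static double averageMillis(long count, long totalMillis) {
        return count <= 0 ? 0.0 : (double) totalMillis / count;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = bean.getCollectionCount();  // -1 if undefined for this collector
            long millis = bean.getCollectionTime();  // approximate accumulated time in ms
            System.out.printf("%s: count=%d, total=%d ms, avg=%.2f ms%n",
                    bean.getName(), count, millis, averageMillis(count, millis));
        }
    }
}
```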

2. ParNew log

Using the data of our test environment as an example, the jvm parameters of our test environment are as follows:

-Xms1024m 
-Xmx1024m 
-XX:NewRatio=1 
-XX:+UseConcMarkSweepGC 
-XX:+CMSParallelRemarkEnabled 
-XX:+UseParNewGC 
-XX:+PrintGCDetails 
-XX:+PrintGCDateStamps 
-XX:+PrintGC 
-Xloggc:/export/Logs/gc.log 
-XX:+PrintHeapAtGC 
-XX:+HeapDumpOnOutOfMemoryError 
-XX:HeapDumpPath=/export/Logs/oom_error.hprof

Since PrintHeapAtGC is configured, the heap size will be displayed before and after each gc, and the log will appear relatively long. There is no need to worry here, they are all paper tigers, and there is nothing to be afraid of. Another thing you need to know is that if you use CMS, the young generation uses ParNew, so the young generation logs will all have ParNew. Let’s take a look at an example:

The numbers 1 to 6 that I marked in red show heap memory usage before and after collection. Note that the total size at 1 equals 2 + 3, because only one of the from and to spaces counts toward it. Item 1 is interesting: because we are using the ParNew collector, it is labeled par new generation. Item 5 is the old generation; likewise, because we are using CMS, it is labeled concurrent mark-sweep generation. Item 6 is the usage of the metaspace and of the class space. I believe these are easy to understand, so I will not explain the numbered markings further.
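The totals that show up in the logs below (471872K for the young generation, 524288K for the old generation, 996160K for the heap) can be reproduced from the parameters above. The sketch assumes the default -XX:SurvivorRatio=8, which is not in the parameter list, and a 64K alignment of the survivor size, which is purely my assumption chosen because it reproduces the logged figures. The class name YoungGenSizing is made up.

```java
// Arithmetic sketch; the 64K alignment is an assumption, not a documented rule here.
class YoungGenSizing {
    static final long ALIGN_K = 64;

    static long alignDown(long valueK) {
        return valueK / ALIGN_K * ALIGN_K;
    }

    /** Returns {youngK, oldK, survivorK, edenK, usableYoungK, reportedHeapK}. */
    static long[] layout(long heapK, int newRatio, int survivorRatio) {
        long youngK = heapK / (newRatio + 1);                       // NewRatio = old / young
        long oldK = heapK - youngK;
        long survivorK = alignDown(youngK / (survivorRatio + 2));   // SurvivorRatio = eden / survivor
        long edenK = youngK - 2 * survivorK;
        long usableYoungK = edenK + survivorK;                      // only one survivor space counts
        return new long[] {youngK, oldK, survivorK, edenK, usableYoungK, usableYoungK + oldK};
    }

    public static void main(String[] args) {
        long[] r = layout(1024L * 1024L, 1, 8);                     // -Xmx1024m, NewRatio=1
        System.out.println("young total  = " + r[4] + "K");         // 471872K
        System.out.println("old total    = " + r[1] + "K");         // 524288K
        System.out.println("heap in logs = " + r[5] + "K");         // 996160K
    }
}
```

This is also why the heap size in the log is smaller than the 1024m that was configured: one survivor space is never counted as usable.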

Let’s take a look at the longer of the two red boxes. To facilitate explanation, I’ll post the log here:

I separated each part with a red box and marked it with a blue number. Let's look at them one by one:

"GC (Allocation Failure) 45.679:" in the first part means that this GC happened 45.679 seconds after the JVM started and was caused by a failure to allocate memory for an object;

In the second part, "455889K->31600K(471872K), 0.3074756 secs", 455889K is the space used by the young generation before collection, the same as the used value at 1 in the previous figure; 31600K is what remains occupied after collection; 471872K is the total capacity of the young generation, the same as the total value at 1 in the figure above; and 0.3074756 is the collection time;

In Part 3, "464536K->46851K(996160K)", 464536K is the heap space used before collection, 46851K is the heap space used after collection, and 996160K in parentheses is the size of the entire heap;

Part 4, "0.3076113 secs" (which actually belongs together with Part 3), is the time taken by the collection as a whole. Although only the young generation is collected, the work may include promoting objects to the old generation and so on, so this time is larger than the one in Part 2;

Part 5, "Times: user=0.63 sys=0.02, real=0.31 secs", involves three kinds of time: user is the total CPU time used by the GC threads during the collection; sys is the time spent in system calls or waiting for system events; real is the wall-clock time for which the application was paused. Because the GC is multi-threaded, real is less than (user + sys). With a single GC thread, real would be close to (user + sys).
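The parts explained above can also be extracted with a regular expression, which makes it possible to compute things the log does not print directly, such as how much was promoted to the old generation. This is a sketch: the class ParNewLineParser is hypothetical, and the sample line is reassembled from the fragments quoted above, so its exact prefix is an assumption (with PrintGCDateStamps the real line also starts with a date).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for the ParNew line format discussed above.
class ParNewLineParser {
    // Matches "[ParNew: A->B(C), T secs] D->E(F), U secs]"
    static final Pattern PAR_NEW = Pattern.compile(
            "\\[ParNew: (\\d+)K->(\\d+)K\\s*\\((\\d+)K\\), ([\\d.]+) secs\\]\\s*"
          + "(\\d+)K->(\\d+)K\\s*\\((\\d+)K\\), ([\\d.]+) secs\\]");

    // Sample reassembled from the quoted fragments; the prefix is an assumption.
    static final String SAMPLE =
            "45.679: [GC (Allocation Failure) 45.679: [ParNew: 455889K->31600K(471872K), "
          + "0.3074756 secs] 464536K->46851K(996160K), 0.3076113 secs] "
          + "[Times: user=0.63 sys=0.02, real=0.31 secs]";

    /** KB promoted to the old generation, or -1 if the line does not match. */
    static long promotedK(String line) {
        Matcher m = PAR_NEW.matcher(line);
        if (!m.find()) return -1;
        long youngFreed = Long.parseLong(m.group(1)) - Long.parseLong(m.group(2));
        long heapFreed = Long.parseLong(m.group(5)) - Long.parseLong(m.group(6));
        return youngFreed - heapFreed; // left the young generation but stayed in the heap
    }

    /** CPU time divided by wall-clock time; values above 1 indicate parallel GC threads. */
    static double cpuToWallRatio(double user, double sys, double real) {
        return (user + sys) / real;
    }

    public static void main(String[] args) {
        // (455889 - 31600) - (464536 - 46851) = 6604
        System.out.println("promoted: " + promotedK(SAMPLE) + "K");
        System.out.println("cpu/wall: " + cpuToWallRatio(0.63, 0.02, 0.31));
    }
}
```

For this line the young generation shrank by 424289K while the whole heap shrank by only 417685K, so about 6604K moved into the old generation.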

Take another look at the short frame of the first picture:

The content of this box is short and consists of two parts. The first is invocations=5, which means 5 GCs have occurred since the JVM started. (Old generation GCs are not counted here, so this means 5 young generation GCs. A full GC collects the entire heap, so a full GC also increases this value by 1.) The second is full=2, which means a full GC has occurred twice. From these two values we can see how many GCs and how many full GCs occurred in a given period.

3. CMS log

The parameters are still the same as above. Let’s look at a CMS log.

The reason there is no screenshot here is that CMS and ParNew log entries are interleaved and a CMS cycle lasts relatively long, so it is hard to capture a complete one in a single picture. But if you have read this far carefully, this log should be trivial for you.

See the comments inside the log for the meaning of each part.

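// 1. This is the first stage, initial mark. 0K is the used size of the old generation, 524288K is the total size of the old generation, 268461K is the used size of the whole heap, 996160K is the total size of the whole heap, and 0.0369037 secs is the time this stage took. This stage is STW. Times has the same meaning as above.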
2021-01-15T15:10:22.071+0800: 4.271: [GC (CMS Initial Mark) [1 CMS-initial-mark: 0K(524288K)] 268461K(996160K), 0.0369037 secs] [Times: user=0.10 sys=0.00, real=0.04 secs] 
// 2. Concurrent mark starts
2021-01-15T15:10:22.109+0800: 4.308: [CMS-concurrent-mark-start]
2021-01-15T15:10:22.109+0800: 4.308: [CMS-concurrent-mark: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
// 3. Concurrent preclean starts
2021-01-15T15:10:22.109+0800: 4.308: [CMS-concurrent-preclean-start]
2021-01-15T15:10:22.111+0800: 4.310: [CMS-concurrent-preclean: 0.002/0.002 secs] [Times: user=0.02 sys=0.00, real=0.00 secs] 
// 4. Abortable concurrent preclean starts
2021-01-15T15:10:22.111+0800: 4.310: [CMS-concurrent-abortable-preclean-start]
// 5. Final remark; this one is explained in the article text because it is too long
2021-01-15T15:10:24.601+0800: 6.800: [GC (CMS Final Remark) [YG occupancy: 26035 K (471872 K)]6.800: [Rescan (parallel) , 0.0727898 secs]6.873: [weak refs processing, 0.0000510 secs]6.873: [class unloading, 0.0168876 secs]6.890: [scrub symbol table, 0.0033340 secs]6.893: [scrub string table, 0.0005541 secs][ CMS-remark: 0K(524288K)] 26035K(996160K), 0.0968042 secs] [Times: user=0.20 sys=0.01, real=0.10 secs] 
// 6. Concurrent sweep starts
2021-01-15T15:10:24.698+0800: 6.897: [CMS-concurrent-sweep-start]
2021-01-15T15:10:24.698+0800: 6.897: [CMS-concurrent-sweep: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
// 7. Concurrent reset
2021-01-15T15:10:24.698+0800: 6.897: [CMS-concurrent-reset-start]
2021-01-15T15:10:24.771+0800: 6.970: [CMS-concurrent-reset: 0.007/0.073 secs] [Times: user=0.08 sys=0.01, real=0.07 secs] 

Part 5, Final Remark:

YG occupancy is the occupancy of the young generation: 26035 K is the space currently used by the young generation and 471872 K is its total size. These two values have the same meaning as in the ParNew log above;

Rescan (parallel) is to complete the marking of all living objects when the application is suspended. This stage is processed in parallel, and it took 0.0727898 secs;

weak refs processing, 0.0000510 secs: The first sub-stage, its job is to process weak references;

class unloading, 0.0168876 secs: The second sub-stage, its job is: unloading the unused classes;

scrub symbol table, 0.0033340 secs and scrub string table, 0.0005541 secs: The last sub-stage, which cleans up the symbol table (class-level metadata) and the string table (interned strings);

CMS-remark: 0K(524288K): 0K is the space occupied by the old generation after this stage and 524288K is the total size of the old generation;

26035K(996160K), 0.0968042 secs: 26035K is the heap usage after this stage, 996160K is the size of the entire heap, and 0.0968042 secs is the time this stage took.
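Since only the initial mark and the final remark stop the application, the pause cost of a CMS cycle is the sum of those two stage times. The sketch below pulls them out of the log lines shown above, which are copied verbatim. The class CmsPauseSummary and its regular expression are my own illustration and assume the JDK 1.8 log format used in this article.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that sums the STW stages of a CMS cycle from log lines.
class CmsPauseSummary {
    // The stage total is the ", N secs]" that directly precedes " [Times:" on the line.
    static final Pattern STAGE_TOTAL = Pattern.compile(", ([\\d.]+) secs\\] \\[Times:");

    static final List<String> SAMPLE = Arrays.asList(
            "2021-01-15T15:10:22.071+0800: 4.271: [GC (CMS Initial Mark) [1 CMS-initial-mark: "
          + "0K(524288K)] 268461K(996160K), 0.0369037 secs] "
          + "[Times: user=0.10 sys=0.00, real=0.04 secs]",
            "2021-01-15T15:10:22.109+0800: 4.308: [CMS-concurrent-mark: 0.000/0.000 secs] "
          + "[Times: user=0.00 sys=0.00, real=0.00 secs]",
            "2021-01-15T15:10:24.601+0800: 6.800: [GC (CMS Final Remark) [YG occupancy: 26035 K "
          + "(471872 K)]6.800: [Rescan (parallel) , 0.0727898 secs]6.873: [weak refs processing, "
          + "0.0000510 secs]6.873: [class unloading, 0.0168876 secs]6.890: [scrub symbol table, "
          + "0.0033340 secs]6.893: [scrub string table, 0.0005541 secs][ CMS-remark: "
          + "0K(524288K)] 26035K(996160K), 0.0968042 secs] "
          + "[Times: user=0.20 sys=0.01, real=0.10 secs]");

    /** Sums the stage times of the two STW stages; concurrent stages are ignored. */
    static double stwSeconds(List<String> lines) {
        double total = 0.0;
        for (String line : lines) {
            boolean stw = line.contains("(CMS Initial Mark)") || line.contains("(CMS Final Remark)");
            if (!stw) continue;
            Matcher m = STAGE_TOTAL.matcher(line);
            if (m.find()) total += Double.parseDouble(m.group(1));
        }
        return total;
    }

    public static void main(String[] args) {
        // 0.0369037 + 0.0968042, roughly 0.134 seconds of pause for this cycle
        System.out.printf("STW total: %.7f secs%n", stwSeconds(SAMPLE));
    }
}
```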

Careful readers will notice that old generation usage is 0K, yet an old generation GC occurred. This is because the size of the metaspace changed at that moment. Some explanation is needed: this GC happened right after our Spring Boot project started, and it is actually part of a full GC (as I said before, a full GC includes an old GC and a young GC). At startup, many classes are loaded and ASM is called frequently to generate new classes, so the size of the metaspace changes. This can be confirmed by looking at a graph of loaded classes.

4. GC Viewer

Counting GCs across an entire log by hand is tedious, so here is a fairly easy-to-use visualization tool, GC Viewer. I will not go into downloading and installing it; let's look at some basic usage. Below is the interface after opening our GC log with GC Viewer:

The first part is a line chart of changes in memory over time. I generally don’t look at this (not that it’s useless); the second part is the statistics of various data. What I look at most is the following interface:

This view shows all GC pauses and full GC times in the GC log file, as well as some data on the concurrent stages. Because GC log formats differ, the data shown here differs too. Anyone familiar with CMS should understand this view without much explanation. For details on using GC Viewer, see github.com/chewiebug/G…[6]

Origin blog.csdn.net/weixin_54542328/article/details/134883532