14.JVM-Garbage Collector


1. GC classification and performance indicators

The Java Virtual Machine Specification says little about the garbage collector, so different vendors and different JVM versions are free to implement it in their own way.

Because the JDK has been iterating rapidly, Java has accumulated many GC implementations over the years.

Garbage collectors can be classified in several ways, depending on the angle of analysis.

New features in different versions of Java

  • 1. Syntax level: lambda expressions, switch, autoboxing and unboxing, enum...
  • 2. API level: Stream API, new date and time API, Optional, String, collections framework
  • 3. Low-level optimization: JVM optimization, GC changes, metaspace, static fields, string constant pool, etc.

1.1. Garbage collector classification

1.1.1. By number of threads

By the number of garbage collection threads (the orange part in the figure below), collectors can be divided into serial and parallel garbage collectors.

Serial collection means that only one CPU performs garbage collection at any given time. The application threads are suspended until the collection finishes.

  • On modest hardware, such as a single-CPU machine or an application with a small heap, the serial collector can outperform the parallel and concurrent collectors. For this reason, serial collection is the default for a JVM running in Client mode.
  • On CPUs with strong parallel capabilities, the parallel collector produces shorter pauses than the serial collector.

In contrast to serial collection, parallel collection uses multiple CPUs to collect garbage at the same time, which improves application throughput. However, parallel collection is still exclusive, like serial collection, and uses the "Stop-the-World" mechanism.

1.1.2. By working mode

By working mode, collectors can be divided into concurrent and exclusive garbage collectors.

  • A concurrent garbage collector works alternately with the application threads to keep application pauses as short as possible.
  • Once an exclusive garbage collector (Stop the World) runs, it stops all user threads in the application until the garbage collection has finished.

1.1.3. By fragmentation handling

By how they handle fragmentation, collectors can be divided into compacting and non-compacting garbage collectors.

  • A compacting garbage collector compacts the surviving objects after collection to eliminate the fragmentation that collection leaves behind.

    Allocating space for new objects afterwards: bump-the-pointer (pointer collision)

  • A non-compacting garbage collector does not perform this step.

    Allocating space for new objects afterwards: free list
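To make the difference between the two allocation styles concrete, here is a minimal Java sketch. It is a toy model, not how HotSpot is implemented: the class names (AllocationSketch, BumpAllocator, FreeListAllocator) are made up for illustration, and plain integer offsets stand in for memory addresses.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy models of the two allocation strategies; offsets stand in for addresses. */
class AllocationSketch {

    /** Bump-the-pointer: free space is one contiguous range starting at 'top'. */
    static final class BumpAllocator {
        private final int capacity;
        private int top = 0;

        BumpAllocator(int capacity) {
            this.capacity = capacity;
        }

        /** Returns the start offset of the new object, or -1 if it does not fit. */
        int allocate(int size) {
            if (top + size > capacity) {
                return -1;
            }
            int start = top;
            top += size; // "bump" the pointer past the new object
            return start;
        }
    }

    /** Free list: free space is a set of separate holes, searched first-fit. */
    static final class FreeListAllocator {
        // Each entry is {start, length}.
        private final List<int[]> holes = new ArrayList<>();

        void addHole(int start, int length) {
            holes.add(new int[] {start, length});
        }

        /** Returns the start offset of the new object, or -1 if no hole is big enough. */
        int allocate(int size) {
            for (int i = 0; i < holes.size(); i++) {
                int[] hole = holes.get(i);
                if (hole[1] >= size) {
                    int start = hole[0];
                    hole[0] += size; // shrink the hole from the front
                    hole[1] -= size;
                    if (hole[1] == 0) {
                        holes.remove(i);
                    }
                    return start;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator compacted = new BumpAllocator(100);
        System.out.println(compacted.allocate(30)); // 0
        System.out.println(compacted.allocate(30)); // 30

        FreeListAllocator fragmented = new FreeListAllocator();
        fragmented.addHole(10, 20);
        fragmented.addHole(50, 40);
        System.out.println(fragmented.allocate(30)); // 50: the first hole is too small
        System.out.println(fragmented.allocate(25)); // -1: 30 units are free in total, but no single hole holds 25
    }
}
```

The last call shows the cost of not compacting: enough memory is free overall, yet the allocation fails because the free space is scattered.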

1.1.4. By working memory area

By the memory area they work on, collectors can be divided into young generation and old generation garbage collectors.

1.2. Evaluating GC performance: indicators

  • Throughput: the ratio of time spent running user code to total running time.

    (Total running time = program running time (a) + memory collection time (b). The larger a / (a + b), the better.)

  • Garbage collection overhead: the complement of throughput, that is, the ratio of time spent collecting garbage to total running time.

    (The smaller b / (a + b), the better.)

  • Pause time (STW): how long the program's worker threads are paused while garbage collection runs.

  • Collection frequency: how often collections occur relative to the execution of the application. (A low frequency says nothing about how long each collection takes.)

  • Memory footprint: the amount of memory occupied by the Java heap.

  • Speed: the time it takes for an object to be reclaimed.

Throughput, pause time, and memory footprint together form an "impossible triangle". All three improve overall as technology advances, but a good collector can usually optimize at most two of them at once. Of the three, pause time has become increasingly important. As hardware develops, a larger memory footprint is more and more tolerable, and faster hardware also reduces the collector's impact on the running application, which improves throughput. Larger memory, however, has a negative effect on latency.

To put it simply, we mainly focus on two points:

  • Throughput
  • Pause time

1.3. Evaluating GC performance: throughput

Throughput is the ratio of the CPU time spent running user code to the total CPU time consumed, that is, throughput = time running user code / (time running user code + garbage collection time).

For example, if the virtual machine runs for a total of 100 minutes, of which garbage collection takes 1 minute, the throughput is 99%.

Applications that prioritize throughput can tolerate longer individual pauses: they are measured over a long time window, so fast response is not a concern.

Throughput priority means keeping the total STW time per unit of time as small as possible, for example two pauses of 0.2 s: 0.2 + 0.2 = 0.4. (The more garbage each collection has to clean up, the longer that single pause, that is, the longer each STW.)

1.4. Evaluating GC performance: pause time

"Pause time" is the period during which the application threads are suspended so that the GC threads can run.

For example, a 100 millisecond pause during GC means that no application thread is active during those 100 milliseconds.

Pause time priority means keeping each individual STW as short as possible, for example five pauses of 0.1 s: 0.1 + 0.1 + 0.1 + 0.1 + 0.1 = 0.5

1.5. Throughput vs. Pause Time

Comparing the two previous figures over a 6 s window: throughput priority gives (6s - (0.2 + 0.2)s) / 6s = 0.9333..., while pause time priority gives (6s - (0.1 + 0.1 + 0.1 + 0.1 + 0.1)s) / 6s = 0.9166...
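The same arithmetic can be written as a small Java sketch. The class and method names (ThroughputSketch, throughput, longestPause) are illustrative only; the numbers are the ones used in the text.

```java
/** Reproduces the throughput comparison for a 6-second window. */
class ThroughputSketch {

    /** Throughput = (total time - time spent in GC pauses) / total time. */
    static double throughput(double totalSeconds, double... pauseSeconds) {
        double gcSeconds = 0;
        for (double p : pauseSeconds) {
            gcSeconds += p;
        }
        return (totalSeconds - gcSeconds) / totalSeconds;
    }

    /** The longest single pause, which is what a latency-sensitive user notices. */
    static double longestPause(double... pauseSeconds) {
        double longest = 0;
        for (double p : pauseSeconds) {
            longest = Math.max(longest, p);
        }
        return longest;
    }

    public static void main(String[] args) {
        double[] fewLongPauses = {0.2, 0.2};
        double[] manyShortPauses = {0.1, 0.1, 0.1, 0.1, 0.1};

        System.out.println("throughput priority: throughput=" + throughput(6, fewLongPauses)
                + ", longest pause=" + longestPause(fewLongPauses));
        System.out.println("pause time priority: throughput=" + throughput(6, manyShortPauses)
                + ", longest pause=" + longestPause(manyShortPauses));
    }
}
```

The first configuration wins on throughput (about 0.933 versus 0.917), the second on the longest single pause (0.1 s versus 0.2 s).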

High throughput is better because it gives the end user the impression that only the application threads are doing "productive" work. Intuitively, the higher the throughput, the faster the program runs.

Low pause time (low latency) is better because, from the end user's perspective, a hung application is always bad, whether the cause is GC or something else. Depending on the type of application, even a brief 200 ms pause can disrupt the user experience. A low maximum pause time is therefore very important, especially for an interactive application.

Unfortunately, "high throughput" and "low pause time" are competing goals.

  • If you prioritize throughput, you must reduce how often memory is collected, but then each collection needs a longer pause.
  • Conversely, if you prioritize low latency, then to shorten each pause you have to collect memory more frequently, which shrinks the young generation and lowers program throughput.

When designing (or using) a GC algorithm, we must decide on our goal: a GC algorithm can target only one of the two (maximum throughput or minimum pause time), or try to find a compromise between them.

The current standard: with maximum throughput as the priority, reduce pause times. (This is the current standard for GC and is a compromise: raise throughput as far as possible while keeping pause time within a controllable bound.)

2. Overview of different garbage collectors

The garbage collection mechanism is Java's signature capability, and it greatly improves development efficiency. It is of course also a popular interview topic.

So, what are the common garbage collectors in Java?

2.1. History of garbage collector development

Where there is a virtual machine, there must be a mechanism for collecting garbage. This is Garbage Collection, and the corresponding product is called a Garbage Collector.

  • In 1999, Serial GC shipped with JDK 1.3.1; it was the first GC. The ParNew garbage collector is a multi-threaded version of the Serial collector.
  • On February 26, 2002, Parallel GC and Concurrent Mark Sweep GC (CMS, the first concurrent garbage collector, designed for low latency) were released together with JDK 1.4.2.
  • Parallel GC became HotSpot's default GC after JDK 6.
  • In 2012, G1 (Garbage-First) became available in JDK 1.7u4.
  • In 2017, G1 became the default garbage collector in JDK 9, replacing CMS.
  • In March 2018, JDK 10 made the full garbage collection of G1 parallel, to improve worst-case latency. (Parallelism improves throughput.)
  • ---------- Watershed ----------
  • In September 2018, JDK 11 was released. It introduced the Epsilon garbage collector, also known as the "No-Op" collector, as well as ZGC, a scalable low-latency garbage collector (Experimental).
  • In March 2019, JDK 12 was released. It enhanced G1 to automatically return unused heap memory to the operating system, and introduced Shenandoah GC, a low-pause-time GC (Experimental) that comes from Red Hat and competes with ZGC.
  • In September 2019, JDK 13 was released. It enhanced ZGC to automatically return unused heap memory to the operating system.
  • In March 2020, JDK 14 was released. It removed the CMS garbage collector (CMS becomes history) and extended ZGC to macOS and Windows.

2.2. 7 classic garbage collectors

  • Serial collectors: Serial (young generation), Serial Old (old generation)
  • Parallel collectors: ParNew, Parallel Scavenge, Parallel Old
  • Concurrent collectors: CMS, G1

2.3. The relationship between the 7 classic collectors and the generations

Young generation collectors: Serial, ParNew, Parallel Scavenge;

Old generation collectors: Serial Old, Parallel Old, CMS;

Whole-heap collector: G1

2.4. Combination relationship of garbage collectors

This figure has been updated to JDK 14, while many versions found online only go up to JDK 8.

Analysis: (Note that in ParNew, Par is short for Parallel, and New means that it only handles the young generation.)

  • A line between two collectors indicates that they can be used together: Serial/Serial Old, Serial/CMS, ParNew/Serial Old, ParNew/CMS, Parallel Scavenge/Serial Old, Parallel Scavenge/Parallel Old, G1;
  • Serial Old serves as the fallback for CMS when a "Concurrent Mode Failure" occurs.
  • (Red dotted line) Because of maintenance and compatibility-testing costs, the two combinations Serial + CMS and ParNew + Serial Old were deprecated in JDK 8 (JEP 173), and support for them was removed completely in JDK 9 (JEP 214).
  • (Green dotted line) In JDK 14, the Parallel Scavenge + Serial Old combination was deprecated (JEP 366).
  • (Cyan dashed line) In JDK 14, the CMS garbage collector was removed (JEP 363).

Some collectors cannot be used together because they are built on different frameworks. For example, CMS and Parallel Scavenge cannot be combined.

Combinations available in JDK 8

Why are there so many collectors? Isn't one enough? Java is used in many scenarios, from mobile devices to servers, so different garbage collectors are provided for different scenarios to improve garbage collection performance.

Although we will compare the collectors, the aim is not to pick the best one. No collector is perfect or suitable for every scenario, so we only choose the collector that best fits the specific application. (Choosing a different garbage collector is also part of tuning.)

2.5. How to view the default garbage collector

-XX:+PrintCommandLineFlags: prints the command-line flags in effect (including the garbage collector in use).

From the command line, you can also run: jinfo -flag <garbage collector flag> <process ID> (for example, jinfo -flag UseParallelGC <process ID>)
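The collectors in use can also be inspected from inside a running program through the standard java.lang.management API. The sketch below is one possible way to do it; the class name ShowCollectors is made up for illustration, and the names printed depend on the JVM and the flags it was started with (on HotSpot, for instance, -XX:+UseSerialGC typically reports "Copy" and "MarkSweepCompact").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

/** Lists the garbage collectors the current JVM has registered. */
class ShowCollectors {

    /** Returns the names reported by the JVM's GarbageCollectorMXBeans. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount/getCollectionTime return -1 if the value is undefined.
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time (ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running the same class with different flags, for example -XX:+UseSerialGC and then -XX:+UseParallelGC, is a quick way to confirm which young and old generation collectors a flag selects.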

3. Serial collector: serial collection

The Serial collector is the most basic and oldest garbage collector. Before JDK 1.3, it was the only choice for collecting the young generation.

The Serial collector is the default young generation garbage collector for HotSpot in Client mode.

The Serial collector uses the copying algorithm, serial collection, and the "Stop-the-World" mechanism to reclaim memory.

Besides the young generation collector, there is also the Serial Old collector for old generation garbage collection. Serial Old likewise uses serial collection and the "Stop-the-World" mechanism, but its algorithm is mark-compact.

  • Serial Old is the default old generation garbage collector running in Client mode.

  • Serial Old has two main uses in Server mode:

    • ① Used in conjunction with the new generation Parallel Scavenge
    • ② As a backup garbage collection solution for the old generation CMS collector (Serial Old GC is similar to a social butterfly)

This is a single-threaded collector, but "single-threaded" does not only mean that it uses one CPU or one collection thread to do the garbage collection work. More importantly, while it is collecting, all other worker threads must be paused until it finishes (Stop The World).

Advantages: simple and efficient (compared with the single-threaded performance of other collectors). In an environment limited to a single CPU, the Serial collector has no thread interaction overhead, so by concentrating on garbage collection it naturally achieves the highest single-thread collection efficiency.

It is a good choice for a virtual machine running in Client mode.

In desktop application scenarios, the available memory is generally not large (tens of MB to one or two hundred MB), and a garbage collection can finish in a short time (tens of ms to a little over a hundred ms). As long as collections do not happen frequently, using the Serial collector is acceptable.

In the HotSpot virtual machine, use the -XX:+UseSerialGC parameter to specify that both the young generation and the old generation use the serial collector.

It is equivalent to using Serial GC in the new generation and Serial Old GC in the old generation.

Example: (the previous example, only the jvm parameters are different)

On JDK 9 the default is G1; to switch to Serial GC, use these JVM parameters: -XX:+PrintCommandLineFlags -XX:+UseSerialGC

Summary:

It is enough to know that this collector exists; serial collection is rarely used today. It only pays off on single-core CPUs, and machines are no longer single-core.

For highly interactive applications, this garbage collector is unacceptable. Generally, serial garbage collectors are not used in Java web applications.

4. ParNew collector: parallel recycling

If Serial GC is the single-threaded garbage collector of the young generation, then the ParNew collector is the multi-threaded version of the Serial collector.

Par is short for Parallel, and New means that it only handles the young generation.

Apart from using parallel collection to reclaim memory, ParNew is almost identical to Serial. ParNew also uses the copying algorithm and the "Stop-the-World" mechanism in the young generation.

ParNew is the default garbage collector for the new generation of many JVMs running in Server mode.

For the new generation, recycling times are frequent and the parallel method is efficient.

For the old generation, collections are infrequent, and the serial approach saves resources. (Parallel collection requires the CPU to switch between threads; serial collection avoids that switching cost.)

Since ParNew is based on parallel collection, can we conclude that it is more efficient than the Serial collector in every scenario?

  • In a multi-CPU environment, ParNew can take full advantage of hardware resources such as multiple CPUs and multiple cores, so it completes garbage collection faster and improves program throughput.
  • In a single-CPU environment, however, ParNew is no more efficient than Serial. Although Serial is based on serial collection, the CPU does not need to switch tasks frequently, which avoids the extra overhead of interleaving multiple threads.

Besides Serial, ParNew is currently the only young generation collector that can work with the CMS collector.

Developers can select the ParNew collector explicitly with the option "-XX:+UseParNewGC". This makes the young generation use the parallel collector and does not affect the old generation.

-XX:ParallelGCThreads limits the number of threads; by default, as many threads as there are CPUs are started. (Do not configure more threads than CPUs, otherwise the threads compete for the CPU and add overhead.)

5. Parallel Scavenge collector: throughput priority

Besides ParNew, HotSpot's young generation also has the Parallel Scavenge collector, which is likewise based on parallel collection and also uses the copying algorithm and the "Stop-the-World" mechanism.

So is the Parallel Scavenge collector redundant?

  • Unlike ParNew, the goal of Parallel Scavenge is a controllable throughput, which is why it is also called the throughput-first garbage collector. (It tries to increase throughput as much as possible.)
  • The adaptive sizing policy (dynamic memory adjustment) is another important difference between Parallel Scavenge and ParNew.

High throughput makes efficient use of CPU time and finishes the program's computation as quickly as possible. It suits tasks that run in the background and need little interaction, so it is common in server environments, for example in applications for batch processing, order processing, payroll, and scientific computing.

In JDK 1.6, the Parallel Old collector was added for old generation garbage collection, replacing the Serial Old collector in the old generation.

The Parallel Old collector uses the mark-compact algorithm and is likewise based on parallel collection and the "Stop-the-World" mechanism.

Where program throughput is the priority, the combination of Parallel Scavenge and Parallel Old delivers very good memory collection performance in Server mode. In Java 8, this combination is the default garbage collector.

5.1. Parameter configuration

-XX:+UseParallelGC explicitly makes the young generation use the Parallel collector.

-XX:+UseParallelOldGC explicitly makes the old generation use the Parallel Old collector.

  • The two flags apply to the young generation and the old generation respectively. Both are enabled by default in JDK 8.
  • Enabling either of these flags also enables the other (they activate each other).

(Corresponding to the following example 1)

-XX:ParallelGCThreads sets the number of threads of the young generation parallel collector. It is generally best to make it equal to the number of CPUs, so that an excessive number of threads does not hurt garbage collection performance.

  • By default, when the number of CPUs is 8 or fewer, ParallelGCThreads equals the number of CPUs.
  • When the number of CPUs is greater than 8, ParallelGCThreads equals 3 + (5 * CPU_Count) / 8, using integer division. (For CPU_Count = 12: 3 + (5 * 12) / 8 = 3 + 7 = 10, that is, 10 threads perform garbage collection, leaving some CPUs for other tasks.)
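The default described above can be checked with a few lines of Java. This is only a sketch of the rule as stated in this article; the class and method names are made up, and the value a real JVM picks may differ by version and platform.

```java
/** Computes the default ParallelGCThreads value using the rule described in the text. */
class ParallelGcThreadsSketch {

    static int defaultParallelGcThreads(int cpuCount) {
        if (cpuCount <= 8) {
            return cpuCount; // one GC thread per CPU on small machines
        }
        return 3 + (5 * cpuCount) / 8; // integer division, as in the 12-CPU example
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("CPUs on this machine: " + cpus
                + ", GC threads by the rule: " + defaultParallelGcThreads(cpus));
        System.out.println("12 CPUs -> " + defaultParallelGcThreads(12)); // 10
    }
}
```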

-XX:MaxGCPauseMillis sets the maximum pause time of the garbage collector (that is, the STW time), in milliseconds.

  • To keep pauses within MaxGCPauseMillis as far as possible, the collector adjusts the Java heap size and other parameters as it runs. (For a small pause target, the heap is made smaller so that collection can finish within the target; for a large target, the heap is made larger.)
  • For users, the shorter the pause, the better the experience (long pauses make the client appear frozen). On the server side, however, the focus is on high concurrency and overall throughput, so Parallel is the better fit for server-side use.
  • Use this parameter with caution.

-XX:GCTimeRatio sets the ratio of garbage collection time to total time (= 1 / (N + 1)). It is used to measure throughput. (This is the "garbage collection overhead" mentioned earlier, that is, the complement of throughput.)

  • Value range: (0, 100). The default value is 99, which means garbage collection time should not exceed 1%.
  • It conflicts to some degree with the previous -XX:MaxGCPauseMillis parameter: the longer the pause time, the more easily the GC time ratio exceeds the configured value.

-XX:MaxGCPauseMillis and -XX:GCTimeRatio pull against each other: tightening one makes the other harder to meet.
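The 1 / (N + 1) relation is easy to misread, so here is the arithmetic as a small Java sketch. The class and method names are illustrative, not part of any JVM API.

```java
/** Translates a GCTimeRatio value N into the GC time share 1 / (N + 1). */
class GcTimeRatioSketch {

    /** Fraction of total time the collector is allowed to spend on GC. */
    static double gcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    /** The corresponding throughput goal, i.e. the complement of the GC share. */
    static double throughputGoal(int gcTimeRatio) {
        return 1.0 - gcTimeFraction(gcTimeRatio);
    }

    public static void main(String[] args) {
        // N = 99 (the default mentioned in the text): at most 1% of the time in GC.
        System.out.println("N=99 -> GC share " + gcTimeFraction(99) + ", throughput goal " + throughputGoal(99));
        // N = 19: at most 5% of the time in GC.
        System.out.println("N=19 -> GC share " + gcTimeFraction(19) + ", throughput goal " + throughputGoal(19));
    }
}
```

A larger N therefore means a stricter throughput goal, which the collector tends to meet with fewer, longer collections; that is where the tension with -XX:MaxGCPauseMillis comes from.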

-XX:+UseAdaptiveSizePolicy enables the adaptive sizing policy of the Parallel Scavenge collector.

  • In this mode, parameters such as the size of the young generation, the ratio of Eden to Survivor, and the age at which objects are promoted to the old generation are adjusted automatically (which is why the 8:1:1 ratio mentioned earlier shows up as 6:1:1), to balance heap size, throughput, and pause time.
  • When manual tuning is difficult, you can use this adaptive mode directly: specify only the maximum heap size, the throughput target (GCTimeRatio), and the pause time target (MaxGCPauseMillis), and let the virtual machine do the tuning itself.


6. CMS collector: low latency

In the JDK 1.5 era, HotSpot introduced a garbage collector that was almost epoch-making for highly interactive applications: the CMS (Concurrent-Mark-Sweep) collector. It was the first truly concurrent collector in the HotSpot virtual machine, and for the first time it let the garbage collection threads and the user threads work at the same time.

The focus of the CMS collector is to shorten the pause time of the user thread during garbage collection as much as possible. The shorter the pause time (low latency), the more suitable it is for programs that interact with users. Good response speed can improve the user experience.

At present, a large part of Java applications are concentrated on the servers of Internet websites or B/S systems. Such applications pay special attention to the response speed of the service and hope that the system pause time will be the shortest to provide users with a better experience. The CMS collector fits the needs of this type of application very well.

CMS uses the mark-sweep algorithm (mark-sweep lets the user threads keep running while garbage is being cleaned, because the memory they rely on is left in place; mark-compact would interfere with the running user threads), and it still incurs "Stop-the-World" pauses.

Unfortunately, CMS, as an old generation collector, cannot work with Parallel Scavenge, the young generation collector that already existed in JDK 1.4.0. So when CMS collects the old generation in JDK 1.5, the young generation can only use either the ParNew or the Serial collector.

Before the emergence of G1, CMS was still widely used. To this day, there are still many systems using CMS GC.

The entire CMS process is more complicated than the previous collector. The entire process is divided into four main stages, namely the initial marking stage, the concurrent marking stage, the re-marking stage and the concurrent clearing stage. (STW is required in the initial marking and re-marking stages, but not in other stages.)

  • Initial mark (Initial-Mark) phase: all worker threads in the program are briefly suspended by the "Stop-the-World" mechanism. The only task of this phase is to mark the objects directly reachable from the GC Roots. Once marking is complete, all suspended application threads resume. Because few objects are directly reachable, this phase is very fast.
  • Concurrent mark (Concurrent-Mark) phase: traverses the entire object graph (a graph is a data structure) starting from the objects directly reachable from the GC Roots. This takes a long time but does not require pausing the user threads, which run concurrently with the garbage collection threads.
  • Remark (Remark) phase: because the worker threads and the garbage collection threads run at the same time, or interleaved, during concurrent marking, this phase corrects the marks of objects that changed because the user program kept running. Its pause (STW) is usually slightly longer than the initial mark phase, but much shorter than the concurrent mark phase.
  • Concurrent sweep (Concurrent-Sweep) phase: cleans up the objects judged dead during marking and frees their memory. Because live objects do not need to be moved, this phase can also run concurrently with the user threads. (Garbage newly created during the concurrent mark phase is not cleaned in this cycle.)
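As a compact summary of the four phases, the following Java sketch records for each phase whether it pauses the application. It is a memory aid only, with made-up names (CmsPhasesSketch, Phase), and does not model how CMS actually marks or sweeps.

```java
import java.util.ArrayList;
import java.util.List;

/** The four main CMS phases, in order, and whether each one stops the world. */
class CmsPhasesSketch {

    enum Phase {
        INITIAL_MARK(true),      // marks objects directly reachable from GC Roots
        CONCURRENT_MARK(false),  // traverses the object graph alongside user threads
        REMARK(true),            // corrects marks changed while user threads ran
        CONCURRENT_SWEEP(false); // frees dead objects alongside user threads

        final boolean stopTheWorld;

        Phase(boolean stopTheWorld) {
            this.stopTheWorld = stopTheWorld;
        }
    }

    /** Returns the phases during which application threads are paused. */
    static List<Phase> pausingPhases() {
        List<Phase> result = new ArrayList<>();
        for (Phase phase : Phase.values()) {
            if (phase.stopTheWorld) {
                result.add(phase);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Phase phase : Phase.values()) {
            System.out.println(phase + (phase.stopTheWorld ? " (STW)" : " (concurrent)"));
        }
    }
}
```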

Although CMS uses concurrent collection (non-exclusive), it still needs the "Stop-the-World" mechanism to pause the program's worker threads during its initial mark and remark phases. These pauses are not long. It is therefore fair to say that none of the current garbage collectors eliminates "Stop-the-World" completely; they only shorten the pause as much as possible.

Because the most time-consuming phases, concurrent mark and concurrent sweep, do not pause the application, the collection as a whole is low-pause.

In addition, because user threads are not interrupted during collection, the application must still have enough memory available while CMS is collecting. CMS therefore cannot wait until the old generation is almost full before collecting, as other collectors do. Instead, it starts collecting when heap usage reaches a certain threshold, so that the application still has enough space to run while CMS works. If the memory reserved during a CMS cycle cannot satisfy the program, a "Concurrent Mode Failure" occurs (the garbage collection threads cannot keep up with the user threads and memory runs out). The virtual machine then activates its fallback: it temporarily switches to the Serial Old (mark-compact) collector to collect the old generation again, which makes the pause very long.

CMS uses the mark-sweep algorithm, which means that after each collection the memory freed from dead objects is very likely to consist of discontiguous blocks, so some memory fragmentation is inevitable. As a result, CMS cannot use the bump-the-pointer technique when allocating space for a new object and has to allocate from a free list instead.

Some people may think that since Mark Sweep can cause memory fragmentation, why not change the algorithm to Mark Compact?

The answer is simple: during a concurrent sweep, if memory were compacted, what would happen to the memory the user threads are currently using? For the user threads to keep running, the resources they rely on must stay untouched. Mark-compact is better suited to "Stop-the-World" scenarios. (Mark-compact has to stop the user threads, otherwise things go wrong, whereas the sweep here runs concurrently.)

6.1. Advantages of CMS

  • concurrent collection
  • low latency

6.2. Disadvantages of CMS

(The disadvantages of CMS, that is, the reason for the emergence of G1)

  • 1) Memory fragmentation occurs, so after a concurrent sweep the space available to user threads may be insufficient. When a large object cannot be allocated, a Full GC has to be triggered early. (A Full GC time bomb.)
  • 2) The CMS collector is very sensitive to CPU resources. During the concurrent phases it does not pause user threads, but it slows the application down because it occupies some of the threads, and total throughput drops. (The relevant formula, covered later: (ParallelGCThreads + 3) / 4)
  • 3) The CMS collector cannot handle floating garbage, which may lead to a "Concurrent Mode Failure" and in turn another Full GC. During concurrent marking, the worker threads and the garbage collection threads run at the same time or interleaved; garbage created in that phase cannot be marked by CMS, so it is not reclaimed in time and its memory is only released by the next GC. (Recap: CMS has four phases: initial mark, concurrent mark, remark, and concurrent sweep. Remark only re-checks what was marked earlier, so garbage created during concurrent marking goes unmarked. Initial mark marks the objects directly reachable from the GC Roots, and concurrent mark walks the whole object graph starting from those objects.)

6.3. Related parameters

-XX:+UseConcMarkSweepGC explicitly selects the CMS collector for memory collection.

  • Turning on this parameter automatically turns on -XX:+UseParNewGC as well, giving the combination ParNew (Young area) + CMS (Old area) + Serial Old (fallback).

(Corresponding to the following example 1)

-XX:CMSInitiatingOccupancyFraction sets the heap usage threshold. Once the threshold is reached, collection begins.

  • The default in JDK 5 and earlier is 68, meaning a CMS collection starts when old generation usage reaches 68% (a low threshold, so CMS triggers more easily). The default in JDK 6 and later is 92% (the higher the threshold, the fewer CMS collections).
  • If memory usage grows slowly, a somewhat higher value can be set: a higher threshold lowers how often CMS triggers, and fewer old generation collections noticeably improve application performance. If memory usage grows quickly, the threshold should be lowered (collect earlier) to avoid frequently falling back to the old generation serial collector. Used this way, the option can effectively reduce the number of Full GCs.

-XX:+UseCMSCompactAtFullCollection specifies that the memory space is compacted after a Full GC to avoid memory fragmentation. However, because compaction cannot be executed concurrently, the cost is a longer pause time. (It is used together with -XX:CMSFullGCsBeforeCompaction, which says after how many Full GCs the memory is compacted, so that it does not become too fragmented.)

-XX:CMSFullGCsBeforeCompaction sets how many Full GCs are executed before the memory space is compacted.

-XX:ParallelCMSThreads sets the number of CMS threads (that is, the concurrent marking threads).

  • The default number of threads started by CMS is (ParallelGCThreads + 3) / 4, where ParallelGCThreads is the number of threads of the young generation parallel collector. When CPU resources are tight, application performance may be very poor during the garbage collection phase because of the CMS collector threads. (For example, with 4 CPUs, (4+3)/4=1, so 1 thread, or 1/4=25% of the CPUs, is used for garbage collection; with 5 CPUs, (5+3)/4=2, so 2/5=40% of the CPUs are used for garbage collection, which is a heavy load. The example plugs the CPU count into the formula because ParallelGCThreads defaults to the number of CPUs when there are 8 or fewer; the result of the formula is the number of threads started by CMS.)
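The arithmetic in the formula above can be checked with a few lines of Java. This is only a sketch: the class and method names are made up for illustration, and it assumes ParallelGCThreads equals the CPU count, which is the HotSpot default only on machines with 8 or fewer CPUs.

```java
// Sketch of the default CMS thread formula (ParallelGCThreads + 3) / 4.
// CmsThreadCount and concurrentThreads are hypothetical names, not JVM APIs.
final class CmsThreadCount {
    // Integer division, exactly as in the formula.
    static int concurrentThreads(int parallelGcThreads) {
        return (parallelGcThreads + 3) / 4;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {4, 5, 8}) {
            // Assumption: ParallelGCThreads == number of CPUs (default up to 8 CPUs).
            int cms = concurrentThreads(cpus);
            System.out.printf("cpus=%d cmsThreads=%d share=%d%%%n", cpus, cms, 100 * cms / cpus);
        }
    }
}
```

With 4 CPUs the share is 25%, and with 5 CPUs it is 40%, which matches the example in the text.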

6.4. Summary

HotSpot has so many garbage collectors, so if someone asks, what are the differences between the three GCs: Serial GC, Parallel GC, and Concurrent Mark Sweep GC?

Keep the following rule of thumb in mind:

  • If you want to minimize memory usage and parallel overhead, choose Serial GC;
  • If you want to maximize application throughput, choose Parallel GC;
  • If you want to minimize GC interruption or pause time, choose CMS GC.

6.5. Changes in CMS in subsequent versions of JDK

New feature of JDK9: CMS is marked as Deprecated (JEP 291)

If you use the parameter -XX:+UseConcMarkSweepGC to turn on the CMS collector for the HotSpot virtual machine of JDK9 and above, the user will receive a warning message indicating that CMS will be abandoned in the future.

New features of JDK14: Remove CMS garbage collector (JEP363)

The CMS garbage collector has been removed. If -XX:+UseConcMarkSweepGC is used in JDK14, the JVM does not report an error and does not exit; it only prints a warning message and automatically falls back to starting in the default GC mode.

7. G1 collector: regionalized generational style

Why release Garbage First (G1) when we already have the powerful GCs described earlier?

The reason is that the business handled by applications is getting larger and more complex, and there are more and more users. Without GC, the normal operation of the application cannot be guaranteed, yet GCs that frequently cause STW pauses cannot keep up with actual demand, so efforts to optimize GC have never stopped. The G1 (Garbage-First) garbage collector is a new garbage collector introduced in Java 7 update 4. It is one of the most cutting-edge achievements in the development of today's collector technology.

At the same time, to adapt to ever-growing memory and an increasing number of processors, it further reduces pause time while maintaining good throughput.

The official goal set for G1 is to obtain the highest possible throughput with controllable latency, so it carries the responsibility and expectation of being a "full-featured collector".

Why is it called Garbage First (G1)?

Because G1 is a parallel collector, it divides the heap memory into many unrelated regions (Regions) that are physically discontinuous. Different Regions are used to represent Eden, survivor area 0, survivor area 1, the old generation, and so on.

G1 GC deliberately avoids collecting garbage across the entire Java heap at once. G1 tracks the value of the garbage accumulated in each Region (the amount of space that collection would reclaim and the empirical time the collection would take) and maintains a priority list in the background. Each time, based on the allowed collection time, it collects the Region with the greatest value first.

Since this method focuses on recycling the Regions with the largest amount of garbage, G1 was given the name Garbage First.

G1 (Garbage-First) is a garbage collector for server-side applications. It is mainly aimed at machines equipped with multi-core CPUs and large-capacity memory. It can meet the GC pause time target with a very high probability while also delivering high throughput.

Officially enabled in JDK1.7, with the Experimental label removed, it is the default garbage collector from JDK9 onward, replacing the CMS collector and the Parallel + Parallel Old combination. Oracle officially calls it a "full-featured garbage collector".

At the same time, CMS has been marked as deprecated in JDK9. G1 is not the default garbage collector in JDK8 and needs to be enabled using -XX:+UseG1GC.
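One way to confirm which collector a running JVM actually selected is to list the garbage collector MXBeans from the standard java.lang.management API. The sketch below is only a convenience check; the class name ActiveCollectors is made up. The bean names depend on the JVM and version; under G1 on HotSpot they typically include "G1 Young Generation" and "G1 Old Generation".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// ActiveCollectors is a hypothetical helper; only the MXBean calls are real JDK APIs.
final class ActiveCollectors {
    // Returns the names of the collectors registered in this JVM.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(bean.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        // Run once with and once without -XX:+UseG1GC to compare the output.
        System.out.println(names());
    }
}
```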

7.1. Features (advantages) of the G1 garbage collector

Compared with other GC collectors, G1 uses a new partitioning algorithm, whose characteristics are as follows:

Parallelism and Concurrency

  • Parallelism: During G1 recycling, multiple GC threads can work simultaneously, effectively utilizing multi-core computing power. At this time the user threads are paused (parallel collection requires STW).
  • Concurrency: G1 can run alternately with the application, and part of its work can be executed at the same time as the application. Therefore, generally speaking, the application is not completely blocked during the entire recycling phase (concurrent work does not need STW).

Collection by generation

  • From a generational perspective, G1 is still a generational garbage collector. It distinguishes between the young generation and the old generation, and the young generation still has the Eden area and the Survivor area. But from the perspective of the heap structure, it does not require the entire Eden area, young generation, or old generation to be contiguous, and it no longer insists on a fixed size and a fixed number. (After a region is recycled, it may next hold data of a different generation or a different area.)
  • The heap space is divided into several regions (Regions), which contain the logical young generation and old generation.
  • Unlike previous collectors, it takes care of both the young and the old generation, whereas other collectors work in either the young generation or the old generation.

G1 no longer lays out the heap like other garbage collectors; instead, it divides the heap into regions.

Space integration

  • CMS: "mark-and-sweep" algorithm, memory fragmentation, defragmentation after several GCs
  • G1: Memory is divided into regions, and memory recycling uses the region as its basic unit. Between regions the copying algorithm is used, but overall it can be regarded as a mark-compact (Mark-Compact) algorithm. Both algorithms avoid memory fragmentation. This is good for long-running programs: when a large object is allocated, the next GC is not triggered early because no contiguous memory space can be found. The advantage of G1 is even more obvious when the Java heap is very large. (The larger the heap, the more it pays off to collect only the most valuable garbage.)

Predictable pause time model (ie: soft real-time)

Real-time means that, given a required STW time, the garbage is cleaned up within that time. Soft real-time means that if STW=10ms, about 90% of collections finish within 10ms, but a collection may occasionally exceed 10ms.

  • The predictable pause time model is another major advantage of G1 over CMS. In addition to pursuing low pauses, G1 can also establish a predictable pause time model, allowing users to explicitly specify that within a time segment of M milliseconds, the time spent on garbage collection shall not exceed N milliseconds.

  • Because of partitioning, G1 can select only part of the regions for memory recycling, which narrows the scope of recycling and makes global pauses easier to control.

  • G1 tracks the value of the garbage accumulated in each Region (the amount of space obtained by recycling and the empirical time required for recycling) and maintains a priority list in the background. Each time, based on the allowed collection time, the Region with the greatest value is recycled first. This ensures that the G1 collector obtains the highest possible collection efficiency within a limited time.

  • Compared with CMS GC, G1 may not match the low pause times of CMS in the best case, but it is much better in the worst case.
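The priority-list idea can be illustrated with a small greedy selection. This is a simplified sketch, not G1's actual policy code: the Region class, its fields, and the "bytes per millisecond" definition of value are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: Region, its fields, and select() are hypothetical names.
final class GarbageFirstSelection {
    static final class Region {
        final String name;
        final long reclaimableBytes;   // space a collection of this region would free
        final double estimatedMillis;  // empirical estimate of the time to collect it

        Region(String name, long reclaimableBytes, double estimatedMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedMillis = estimatedMillis;
        }

        // "Value" is modeled here as bytes reclaimed per millisecond of pause.
        double value() {
            return reclaimableBytes / estimatedMillis;
        }
    }

    // Pick the most valuable regions first until the pause budget is used up.
    static List<Region> select(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> byValue = new ArrayList<>(candidates);
        byValue.sort((a, b) -> Double.compare(b.value(), a.value()));
        List<Region> chosen = new ArrayList<>();
        double spent = 0.0;
        for (Region region : byValue) {
            if (spent + region.estimatedMillis <= pauseBudgetMillis) {
                chosen.add(region);
                spent += region.estimatedMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> candidates = new ArrayList<>();
        candidates.add(new Region("A", 8_000_000L, 4.0));
        candidates.add(new Region("B", 6_000_000L, 2.0));
        candidates.add(new Region("C", 1_000_000L, 5.0));
        for (Region region : select(candidates, 6.0)) {
            System.out.println(region.name); // B first, then A; C does not fit the budget
        }
    }
}
```

The point of the sketch is the trade-off: a tighter pause budget means fewer regions per collection, so garbage is reclaimed in smaller increments.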

7.2. Disadvantages of the G1 garbage collector

Compared with CMS, G1 does not yet have all-round, overwhelming advantages. For example, while a user program is running, both the memory footprint (Footprint) that G1 needs for garbage collection and its additional execution load (Overload) are higher than those of CMS (G1 itself takes up 10%-20% of the memory).

From experience, CMS will most likely perform better than G1 in small memory applications, while G1 shows its advantages in large memory applications. The balance point is between 6-8GB. (Within 6-8GB the two perform about the same; above this range, G1 has the advantage.)

7.3. G1 parameter setting

  • -XX:+UseG1GC manually specifies to use the G1 garbage collector to perform memory recycling tasks
  • -XX:G1HeapRegionSize sets the size of each Region. The value is a power of 2, ranging from 1MB to 32MB. The goal is to divide the heap into approximately 2048 regions based on the minimum Java heap size (the number of regions is fixed, and the region size can float). The default is 1/2000 of the heap memory. (Region size ranges from 1MB to 32MB with a total of 2048 regions, so the total space ranges from 2G to 64G.)
  • -XX:MaxGCPauseMillis sets the maximum GC pause time target (the JVM will try its best to achieve it, but it is not guaranteed). The default value is 200ms. (If the value is set too small, fewer regions are recycled each time; if the user threads generate garbage faster than it can be recycled, Full GC will be triggered, which leads to a longer STW.)
  • -XX:ParallelGCThreads sets the number of STW worker threads. Set it to at most 8 (the default equals the number of logical CPUs when there are 8 or fewer).
  • -XX:ConcGCThreads sets the number of concurrent marking threads. Set n to about 1/4 of the number of parallel garbage collection threads (ParallelGCThreads).
  • -XX:InitiatingHeapOccupancyPercent sets the Java heap occupancy threshold that triggers a concurrent GC cycle. If this value is exceeded, GC is triggered. The default value is 45. (As discussed later, the cycle has three stages: 1. Young GC; 2. Concurrent marking; 3. Mixed GC.)

7.4. Common operating steps of G1 collector

The design principle of G1 is to simplify JVM performance tuning. Developers only need three simple steps to complete the tuning:

  • Step 1: Turn on the G1 garbage collector
  • Step 2: Set the maximum memory of the heap
  • Step 3: Set the maximum pause time

G1 provides three garbage collection modes: Young GC, Mixed GC, and Full GC, which are triggered under different conditions.

7.5. Applicable scenarios of G1 collector

For server-side applications, targeting machines with large memory and multiple processors. (Its performance on an ordinary-sized heap is unremarkable.)

Its main use is to provide a solution for applications that require low GC latency and have large heaps;

For example: when the heap size is about 6GB or larger, the predictable pause time can be less than 0.5 seconds. (G1 keeps each GC pause from becoming too long by incrementally cleaning only part of the Regions each time instead of all of them.)

Used to replace the CMS collector in JDK1.5; in the following situations, using G1 may be better than CMS:

  • ① More than 50% of the Java heap is occupied by active data;
  • ② The frequency of object allocation or age promotion varies greatly;
  • ③ GC pause time is too long (longer than 0.5 to 1 second)

In HotSpot, every garbage collector except G1 uses built-in JVM threads to perform the multi-threaded GC work, whereas G1 GC can also use application threads to take on GC work that runs in the background: when the JVM's GC threads are processing slowly, the system calls on application threads to help speed up the garbage collection process.

7.6. Partition Region: break the whole into parts

(It means dividing a whole piece into small pieces.)

When the G1 collector is used, it divides the entire Java heap into about 2048 independent Region blocks of the same size. The size of each Region block is determined by the actual size of the heap space. It is kept between 1MB and 32MB and is a power of 2, that is, 1MB, 2MB, 4MB, 8MB, 16MB, or 32MB. It can be set with -XX:G1HeapRegionSize. All Regions are the same size, and the size does not change during the JVM life cycle.
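The sizing rule described above (about 2048 regions, a power of 2, between 1MB and 32MB) can be sketched as follows. This is a simplification for illustration, not HotSpot's exact heuristic, and RegionSize and regionSizeFor are made-up names.

```java
// Simplified sketch of the region sizing rule; not HotSpot's exact calculation.
final class RegionSize {
    static final long MB = 1024L * 1024L;

    // Aim for about 2048 regions, clamp to [1MB, 32MB], round down to a power of 2.
    static long regionSizeFor(long heapBytes) {
        long target = heapBytes / 2048;
        long clamped = Math.max(MB, Math.min(32 * MB, target));
        return Long.highestOneBit(clamped);
    }

    public static void main(String[] args) {
        long[] heapsInGb = {1, 4, 6, 128};
        for (long gb : heapsInGb) {
            long size = regionSizeFor(gb * 1024 * MB);
            System.out.println(gb + "GB heap -> " + (size / MB) + "MB regions");
        }
    }
}
```

For example, a 6GB heap gives a 3MB target, which rounds down to 2MB regions, so the heap ends up with about 3072 regions rather than exactly 2048.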

Although the concepts of the new generation and the old generation are still retained, the new generation and the old generation are no longer physically isolated. They are both collections of a part of the Region (which does not need to be continuous). Logical continuity is achieved through the dynamic allocation of Region.

A region may belong to the Eden, Survivor, or Old/Tenured memory areas, but a region can only have one role at a time. E in the figure indicates that the region belongs to the Eden memory area, S indicates that it belongs to the Survivor memory area, and O indicates that it belongs to the Old memory area. The blank space in the figure represents unused memory space. (After a region is cleared, it is placed in the free list. If new space is needed, it is taken directly from the free list.)

The G1 garbage collector also adds a new memory area called the Humongous memory area, shown as the H blocks in the figure. It is mainly used to store large objects. An object whose size is at least half a region is placed in H.

Reason for setting H:

Large objects are by default allocated directly to the old generation, but a short-lived large object has a negative impact on the garbage collector. To solve this problem, G1 sets aside the Humongous area specifically for storing large objects. If one H region cannot hold a large object, G1 looks for contiguous H regions to store it. To find contiguous H regions, a Full GC sometimes has to be started. Most of G1's actions treat the H area as part of the old generation.
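As a small worked example of how large objects map onto H regions, the sketch below assumes the threshold documented for HotSpot's G1, where an object of at least half a region counts as humongous. The class and method names are hypothetical.

```java
// Sketch under an assumption: objects >= half a region are treated as humongous.
final class HumongousCheck {
    static boolean isHumongous(long objectBytes, long regionBytes) {
        return objectBytes >= regionBytes / 2;
    }

    // Number of contiguous regions needed to hold the object (ceiling division).
    static long regionsNeeded(long objectBytes, long regionBytes) {
        return (objectBytes + regionBytes - 1) / regionBytes;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long region = 2 * mb;
        System.out.println(isHumongous(1 * mb, region));   // true: exactly half a region
        System.out.println(regionsNeeded(5 * mb, region)); // 3 contiguous regions
    }
}
```

The second call shows why contiguous space matters: a 5MB object with 2MB regions needs three adjacent free regions, which a fragmented heap may not have.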

7.7. Recycling process of G1 garbage collector

The garbage collection process of G1 GC mainly includes the following three stages:

  • Young GC
  • Old generation concurrent marking process (Concurrent Marking)
  • Mixed GC (both the young and old generations are recycled)
  • (If necessary, a single-threaded, exclusive, high-intensity Full GC still exists. It provides a fail-safe mechanism for when GC evaluation fails, that is, forced recycling, just as CMS falls back to the MSC backup plan when its collection fails.)

Garbage collection proceeds clockwise in the order young GC -> young GC + concurrent marking -> Mixed GC.

The application allocates memory, and the young generation collection process starts when the Eden area of the young generation is exhausted. G1's young generation collection is parallel and exclusive: during it, G1 GC suspends all application threads and starts multiple threads to perform the collection, then moves the surviving objects from the young generation regions to Survivor regions or old regions, or possibly both.

When the heap memory usage reaches a certain value (default 45%), the old generation concurrent marking process starts.

Mixed collection begins immediately after marking is completed. During a mixed collection, G1 GC moves surviving objects from old regions to free regions, and these free regions become part of the old generation. G1's old generation collection differs from other GCs: it does not need to recycle the entire old generation, only to scan and recycle a small part of the old generation Regions at a time. These old generation Regions are recycled together with the young generation.

For example: a web server, the maximum heap memory of the Java process is 4G, responds to 1500 requests per minute, and newly allocates about 2G of memory every 45 seconds. G1 will perform young generation recycling every 45 seconds. Every 31 hours, the usage rate of the entire heap will reach 45%, and it will start the old generation concurrent marking process. After the marking is completed, four to five mixed collections will begin.

7.7.1. Remembered Set (referred to as RSet)

The problem of an object being referenced by different areas.

A Region cannot be isolated. Objects in a Region may be referenced by objects in any other Region. When determining the survival of an object, is it necessary to scan the entire Java heap to ensure accuracy?

This problem also exists in other generational collectors (and G1 is more prominent); does recycling the new generation have to scan the old generation at the same time? This will reduce the efficiency of MinorGC.

(If an object can be referenced from different areas, every area would have to be traversed to find out whether it references the object, which would be prohibitively slow; hence the RSet.)

Solution:

  • Whether in G1 or in other generational collectors, the JVM uses a Remembered Set to avoid global scans:
  • Each Region has a corresponding Remembered Set;
  • Every time data of a Reference type is written, a Write Barrier briefly interrupts the operation;
  • It then checks whether the object the written reference points to is in a different Region from the Reference type data (other collectors: check whether an old generation object refers to a young generation object);
  • If they are different, the reference information is recorded, through the CardTable, into the Remembered Set of the Region that contains the referenced object;
  • When performing garbage collection, the Remembered Set is added to the enumeration range of the GC root nodes; this ensures that no global scan is needed and nothing is missed.
  • (As mentioned earlier, G1 itself consumes about 10%-20% of the memory because it stores RSets.)
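The steps above can be sketched as a toy model. It is not how HotSpot implements RSets (which record cards, not whole regions); the class name, the use of plain numeric addresses, and recording the source region index are all simplifications chosen for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of per-region remembered sets; all names here are hypothetical.
final class RememberedSetSketch {
    private final long regionBytes;
    // Target region index -> indexes of regions that hold references into it.
    private final Map<Long, Set<Long>> rsets = new HashMap<>();

    RememberedSetSketch(long regionBytes) {
        this.regionBytes = regionBytes;
    }

    long regionOf(long address) {
        return address / regionBytes;
    }

    // Models the write barrier for "holder.field = target".
    void onReferenceWrite(long holderAddress, long targetAddress) {
        long from = regionOf(holderAddress);
        long to = regionOf(targetAddress);
        if (from != to) {
            // Cross-region reference: remember it in the target region's set.
            rsets.computeIfAbsent(to, key -> new TreeSet<>()).add(from);
        }
    }

    // Regions that must be consulted as extra roots when collecting 'region'.
    Set<Long> incomingRegions(long region) {
        return rsets.getOrDefault(region, Collections.<Long>emptySet());
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch(1024);
        heap.onReferenceWrite(100, 200);   // same region, nothing recorded
        heap.onReferenceWrite(100, 2053);  // region 0 -> region 2
        heap.onReferenceWrite(3100, 2100); // region 3 -> region 2
        System.out.println(heap.incomingRegions(2)); // [0, 3]
    }
}
```

When region 2 is collected, only regions 0 and 3 need to be examined for incoming references, instead of the whole heap.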

7.7.2. G1 recycling process one: young generation GC

When the JVM starts, G1 first prepares the Eden area. The program continuously creates objects in the Eden area during operation. When the Eden space is exhausted, G1 will start a young generation garbage collection process.

Young generation garbage collection will only recycle the Eden area and Survivor area.

During YGC, G1 first stops the execution of the application (exclusive, Stop-The-World) and creates a collection set (Collection Set), which is the set of memory segments that need to be recycled. In the young generation recycling process, the collection set contains all memory segments in the young generation's Eden area and Survivor area.

Then start the recycling process as follows:

The first stage is scanning the root.

The root refers to the object pointed to by the static variable, the local variables in the method call chain being executed, etc. The root reference, together with the external reference recorded by RSet, serves as the entry point for scanning live objects.

(When recycling the old generation, you don’t have to worry about the young generation reference pointing to the old generation, because the recycling of the old generation will also trigger the recycling of the new generation, and the GC Root must be determined)

In the second stage, RSet is updated.

Process cards in the dirty card queue and update RSet. After this phase is completed, RSet can accurately reflect the old generation's references to objects in the memory segment. (Mainly refers to the object references from the old generation to the new generation)

For the application's reference assignment statement object.field=object, the JVM will perform special operations before and after to enqueue a card that stores the object reference information in the dirty card queue. When the young generation is recycled, G1 will process all cards in the Dirty Card Queue to update the RSet to ensure that the RSet accurately reflects the reference relationship in real time.

So why not update RSet directly at the reference assignment statement? This is for performance needs. The processing of RSet requires thread synchronization (the queue is first in, first out, no thread synchronization is required), and the overhead will be very high. The performance of using the queue will be much better.
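The deferred update can be sketched as a queue that the application side only appends to and the collector drains later. This is a single-threaded toy model: the class name is hypothetical, and the 512-byte card size is an assumption used only to turn an address into a card index.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Queue;
import java.util.Set;

// Toy model of a dirty card queue; names and the card size are assumptions.
final class DirtyCardQueueSketch {
    static final int CARD_BYTES = 512; // assumed card size, for illustration only
    private final Queue<Long> dirtyCards = new ArrayDeque<>();
    private final Set<Long> rememberedCards = new HashSet<>();

    // Application side: cheap, just enqueue the card that covers the written field.
    void afterReferenceWrite(long fieldAddress) {
        dirtyCards.add(fieldAddress / CARD_BYTES);
    }

    // Collector side: drain the queue and fold the cards into the remembered set.
    int drain() {
        int processed = 0;
        Long card;
        while ((card = dirtyCards.poll()) != null) {
            rememberedCards.add(card);
            processed++;
        }
        return processed;
    }

    boolean remembers(long fieldAddress) {
        return rememberedCards.contains(fieldAddress / CARD_BYTES);
    }

    public static void main(String[] args) {
        DirtyCardQueueSketch queue = new DirtyCardQueueSketch();
        queue.afterReferenceWrite(1000);
        queue.afterReferenceWrite(5000);
        System.out.println(queue.remembers(1000)); // false until the queue is drained
        queue.drain();
        System.out.println(queue.remembers(1000)); // true after the update
    }
}
```

The design choice is the same one the text describes: the expensive remembered set update is moved out of the assignment path and batched on the collector side.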

The third stage is to process RSet.

Identify objects in Eden pointed to by old generation objects. These objects in Eden pointed to are considered alive objects.

The fourth stage is to copy the object.

At this stage, the object tree is traversed, and the surviving objects in the Eden memory segments are copied to empty memory segments in the Survivor area. For surviving objects in the Survivor memory segments, if their age has not reached the threshold, the age is increased by 1; once the threshold is reached, they are copied to empty memory segments in the Old area. If the Survivor space is not enough, some data in the Eden space is promoted directly to the old generation space.
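The copy decision described above can be written as a small function. This is a sketch under assumptions: the names are hypothetical, and the exact comparison against the threshold (here "age below threshold stays in Survivor") is a simplification of what a real collector does.

```java
// Sketch of the copy destination decision; names and boundary rule are assumptions.
final class SurvivorAging {
    enum Destination { SURVIVOR, OLD }

    // age: number of young collections the object has already survived.
    static Destination destinationFor(int age, int tenuringThreshold, boolean survivorHasRoom) {
        if (!survivorHasRoom) {
            return Destination.OLD; // Survivor overflow: promote directly
        }
        return age < tenuringThreshold ? Destination.SURVIVOR : Destination.OLD;
    }

    public static void main(String[] args) {
        int threshold = 15; // illustrative value
        System.out.println(destinationFor(0, threshold, true));  // SURVIVOR
        System.out.println(destinationFor(15, threshold, true)); // OLD
        System.out.println(destinationFor(0, threshold, false)); // OLD
    }
}
```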

The fifth stage is to deal with references.

Handle Soft, Weak, Phantom, Final, JNI Weak and other references. In the end, the data in the Eden space is empty, the GC stops working, and the objects in the target memory are stored continuously without fragmentation, so the copy process can achieve the effect of memory defragmentation and reduce fragmentation.

7.7.3. G1 recycling process two: concurrent marking process

1. Initial marking phase: Mark objects directly reachable from the root node. This stage is STW and will trigger a young generation GC.

2. Root Region Scanning: G1 GC scans the old generation objects directly reachable from the Survivor area and marks the referenced objects. This process must be completed before a young GC.

3. Concurrent Marking: Concurrent marking is performed on the entire heap (executed concurrently with the application). This process may be interrupted by a young GC. During the concurrent marking phase, if all objects in a region are found to be garbage, the region is recycled immediately (real-time recycling, no need to wait). At the same time, the object liveness of each region (the proportion of surviving objects in the region) is calculated.

4. Remark: Because the application keeps running, the previous marking result needs to be corrected. This phase is STW. G1 uses an initial snapshot algorithm that is faster than the one in CMS: snapshot-at-the-beginning (SATB).

5. Exclusive cleanup (cleanup, STW): Calculate the surviving objects and the GC recycling ratio of each region, sort them, and identify regions that can be collected in mixed recycling. This paves the way for the next stage. This phase is STW.

This stage does not actually do garbage collection.

6. Concurrent cleaning phase: Identify and clean completely free regions.

7.7.4. G1 recycling process three: mixed recycling

When more and more objects are promoted to old regions, the virtual machine triggers a mixed garbage collection, that is, Mixed GC, to avoid exhausting the heap memory. This is not an Old GC: in addition to recycling the entire Young Region, it also recycles part of the Old Regions. Note that it is part of the old generation, not all of it (this is why G1 pauses are short). Which Old Regions are collected can be chosen, so the time spent on garbage collection can be controlled. Note also that Mixed GC is not Full GC.

After concurrent marking ends, the old generation memory segments that are 100% garbage are recycled, and the memory segments that are partially garbage are identified. By default, these old generation memory segments are recycled in 8 passes (configurable with -XX:G1MixedGCCountTarget).

The collection set of a mixed collection includes one-eighth of the old generation memory segments, the Eden area memory segments, and the Survivor area memory segments. The algorithm of mixed collection is exactly the same as that of young generation collection, except that memory segments from the old generation are collected as well. For the specific process, please refer to the young generation recycling process above.

Since memory segments in the old generation are recycled in 8 passes by default, G1 prioritizes memory segments with a lot of garbage. The higher the proportion of garbage in a memory segment, the earlier it is recycled. There is also a threshold that determines whether a memory segment is recycled at all, -XX:G1MixedGCLiveThresholdPercent, with a default of 65%, which means a memory segment is only recycled if its live objects occupy no more than 65% of it. If the garbage ratio is too low, the ratio of surviving objects is high and copying takes more time.

Mixed recycling does not necessarily have to be done 8 times. There is a threshold, -XX:G1HeapWastePercent, with a default value of 10%, which means that 10% of the space in the entire heap is allowed to be wasted: if the garbage that can be recycled accounts for less than 10% of the heap memory, mixed recycling is no longer performed, because GC would take a lot of time but recover very little memory.
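The three settings discussed here can be read as three small checks. The sketch below is illustrative only: the class and method names are made up, and it interprets -XX:G1MixedGCLiveThresholdPercent as a cap on a region's live data, which is how the flag name reads.

```java
// Sketch of the mixed GC decisions; names are hypothetical, values are the quoted defaults.
final class MixedGcPolicy {
    // A region is a candidate only if its live data is below the live threshold.
    static boolean isCandidate(long liveBytes, long regionBytes, int liveThresholdPercent) {
        return liveBytes * 100 < regionBytes * liveThresholdPercent;
    }

    // Keep running mixed collections only while reclaimable space exceeds the waste limit.
    static boolean shouldContinue(long reclaimableBytes, long heapBytes, int heapWastePercent) {
        return reclaimableBytes * 100 > heapBytes * heapWastePercent;
    }

    // Old regions added per mixed collection when spread over the target count.
    static int regionsPerMixedGc(int candidateRegions, int mixedGcCountTarget) {
        return (candidateRegions + mixedGcCountTarget - 1) / mixedGcCountTarget;
    }

    public static void main(String[] args) {
        System.out.println(isCandidate(60, 100, 65));    // true: 60% live
        System.out.println(isCandidate(70, 100, 65));    // false: too much live data to copy
        System.out.println(shouldContinue(5, 100, 10));  // false: only 5% reclaimable
        System.out.println(regionsPerMixedGc(80, 8));    // 10 old regions per mixed GC
    }
}
```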

7.7.5. G1 recycling optional process four: Full GC

The original intention of G1 is to avoid Full GC. However, if the methods above do not work properly, G1 stops the execution of the application (Stop-The-World) and uses a single-threaded memory recovery algorithm for garbage collection. The performance is very poor and the application pause time is very long.

Full GC should be avoided, and adjustments are needed once it occurs. When does Full GC occur? For example, if the heap memory is too small, G1 may find no empty memory segments available when it copies live objects, and it falls back to Full GC. This situation can be solved by increasing the memory.

There may be two reasons for G1 Full GC:

  • 1. There is not enough to-space to store promoted objects during Evacuation (recycling phase);
  • 2. Space is exhausted before concurrent processing is completed. (Full GC is triggered before memory overflow)

7.7.6. G1 recycling process: supplement

From the information disclosed by Oracle, we know that the recycling phase (Evacuation) was originally also designed to execute concurrently with the user program, but this is more complicated to do. Because G1 recycles only part of the Regions and the pause time is controllable by the user, there was no urgency to implement it, and the feature was instead left to the low-latency garbage collector that appeared after G1 (that is, ZGC). In addition, G1 is not aimed only at low latency: pausing user threads maximizes garbage collection efficiency, so to ensure throughput, the implementation that completely pauses user threads was chosen.

7.7.7. Optimization suggestions for G1 recycling

young generation size

  • Avoid explicitly setting the young generation size using related options like -Xmn or -XX:NewRatio
  • A fixed young generation size overrides the pause time target (because YGC is exclusive, an unreasonable young generation size makes the pause time target unreachable; let the JVM adjust it dynamically)

Don’t be too strict with your pause time goals

  • The throughput target for G1 GC is 90% application time and 10% garbage collection time
  • When evaluating G1 GC throughput, don't be too strict with your pause time goals. Targets that are too strict mean you are willing to incur more garbage collection overhead, which will directly affect throughput.
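To see roughly how an application compares with the 90% / 10% target, the accumulated GC time reported by the standard GarbageCollectorMXBean can be divided by the JVM uptime. This is an approximation under assumptions: the helper names are made up, and for collectors with concurrent phases the reported collection time is not purely pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Rough throughput estimate; GcThroughput and its methods are hypothetical helpers.
final class GcThroughput {
    // Share of elapsed time left for the application, in percent.
    static double applicationPercent(long gcMillis, long uptimeMillis) {
        if (uptimeMillis <= 0) {
            return 100.0;
        }
        return 100.0 * (uptimeMillis - gcMillis) / uptimeMillis;
    }

    // Sum of the approximate collection times reported by all collectors.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = bean.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("application time: %.1f%%%n", applicationPercent(totalGcMillis(), uptime));
    }
}
```

For example, 100ms of GC over 1000ms of uptime corresponds to exactly the 90% application time target.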

8. Summary of garbage collector

8.1. Summary of 7 classic garbage collectors

As of JDK1.8, there are 7 different garbage collectors. Each garbage collector has different characteristics. When using it, you need to choose different garbage collectors according to the specific situation.

GC development stage: Serial => Parallel (parallel) => CMS (concurrency) => G1 => ZGC

8.2. Garbage collector combination

There is a large gap in the implementation of virtual machines from different manufacturers and different versions. All collectors and combinations (connections) of the HotSpot virtual machine after JDK7/8 are as shown below: (mentioned earlier)

8.3. How to choose a garbage collector

The configuration of the Java garbage collector is a very important choice for JVM optimization. Choosing the appropriate garbage collector can greatly improve the performance of the JVM.

How to choose a garbage collector?

  • 1. First adjust the heap size and let the JVM adapt on its own.
  • 2. If the memory is less than 100M, use the serial collector.
  • 3. If it is a single-core, stand-alone program with no pause time requirement, use the serial collector.
  • 4. If there are multiple CPUs, high throughput is required, and pauses longer than 1 second are acceptable, choose the parallel collector or let the JVM choose.
  • 5. If there are multiple CPUs, low pause time is the goal, and fast response is needed (for example, the delay cannot exceed 1 second, as in Internet applications), use a concurrent collector.
  • 6. The official recommendation is G1, which has high performance. Nowadays, Internet projects basically use G1. (My company does not use it.)

Finally, one point needs to be made clear:

  • There is no best collector, and there is no universal collector.
  • Tuning is always based on specific scenarios and specific needs, and there is no collector that can be used once and for all (if it is done once and for all, the value of tuners will have no meaning.)

8.4. Interview

Regarding garbage collection, the interviewer can go deeper step by step from various angles of theory and practice, and the interviewee is not necessarily required to know everything. But understanding the principles will definitely be a bonus in the interview. The more general and basic parts are as follows:

  • What are the garbage collection algorithms? How to determine whether an object can be recycled?
  • The basic process of garbage collector work.

In addition, everyone needs to pay more attention to the various commonly used parameters in the garbage collector chapter.

9. GC log analysis

9.1. JVM parameters

The GC log analysis involved in the JVM is introduced earlier.

By reading GC logs, we can understand the Java virtual machine memory allocation and recycling strategy.

Parameter list for memory allocation and garbage collection:

  • -XX:+PrintGC outputs GC logs. Similar to: -verbose:gc
  • -XX:+PrintGCDetails outputs detailed logs of GC
  • -XX:+PrintGCTimeStamps outputs the GC timestamp (in the form of base time)
  • -XX:+PrintGCDateStamps outputs the GC timestamp (in the form of date, such as 2013-05-04T21:53:59.234+0800)
  • -XX:+PrintHeapAtGC prints heap information before and after GC
  • -Xloggc:../logs/gc.log log file output path

Turn on GC log ( -XX:+PrintGC )

Supplementary notes to the log:

  • "[GC..." and "[Full GC..." indicate the pause type of this garbage collection. "Full" marks a Full GC, during which "Stop The World" occurs. Young-generation collections by the collectors below also pause application threads, usually for a much shorter time.
  • The name of the new generation using the Serial collector is Default New Generation, so it is displayed as "[DefNew"
  • Using the ParNew collector, the name of the new generation will become "[ParNew", which means "Parallel New Generation"
  • The name of the new generation using Parallel Scavenge collector is "[PSYoungGen"
  • The collection in the old generation is the same as that in the new generation, and the name is also determined by the collector. There will be Old in the old generation, such as "[ParOldGen"
  • If you use the G1 collector, it will be displayed as "garbage-first heap"

Allocation Failure indicates that the reason for this GC is that there is not enough space in the young generation to store new data.

[PSYoungGen: 5986K->696K(8704K) ] 5986K->704K(9216K) Inside the square brackets: the young generation's occupancy before GC, its occupancy after GC, and (the total size of the young generation). Outside the square brackets: the combined occupancy of the young and old generations before GC, the combined occupancy after GC, and (the total size of the young and old generations).
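
Since the figures outside the brackets cover both generations, the old generation's numbers follow by subtraction. The sketch below works this out for the sample line above; the class GcLineParser and its regular expression are illustrative assumptions that handle only this line shape, not a general GC log parser.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLineParser {
    // Matches "[<gen>: <before>K-><after>K(<total>K) ] <before>K-><after>K(<total>K)"
    private static final Pattern LINE = Pattern.compile(
        "\\[(\\w+): (\\d+)K->(\\d+)K\\((\\d+)K\\)\\s*\\] (\\d+)K->(\\d+)K\\((\\d+)K\\)");

    /** Returns {oldBefore, oldAfter, oldTotal} in KB, derived as heap figures minus young figures. */
    static long[] oldGenUsage(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized GC line: " + line);
        }
        long youngBefore = Long.parseLong(m.group(2));
        long youngAfter  = Long.parseLong(m.group(3));
        long youngTotal  = Long.parseLong(m.group(4));
        long heapBefore  = Long.parseLong(m.group(5));
        long heapAfter   = Long.parseLong(m.group(6));
        long heapTotal   = Long.parseLong(m.group(7));
        return new long[] {
            heapBefore - youngBefore, heapAfter - youngAfter, heapTotal - youngTotal
        };
    }

    public static void main(String[] args) {
        long[] old = oldGenUsage("[PSYoungGen: 5986K->696K(8704K) ] 5986K->704K(9216K)");
        // prints "old gen: 0K->8K(512K)"
        System.out.println("old gen: " + old[0] + "K->" + old[1] + "K(" + old[2] + "K)");
    }
}
```

Here the old generation grew from 0K to 8K during the young collection, which means 8K of surviving objects were promoted.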

user is the CPU time the collection spent in user mode, sys is the CPU time spent in kernel mode, and real is the actual elapsed (wall-clock) time. On multi-core machines, user plus sys may exceed real.

YGC/Minor GC log:

9.2. GC log analysis tool

You can use some tools to analyze these GC logs.

Commonly used log analysis tools include: GCViewer, GCEasy, GCHisto, GCLogViewer, HPjmeter, garbagecat, etc.

GCViewer

Releases · chewiebug/GCViewer · GitHub

GCEasy

Universal JVM GC analyzer - Java Garbage collection log analysis made easy

10. New developments in garbage collectors

GC is still developing rapidly. G1, the current default, is constantly being improved. Many of the shortcomings once attributed to it, such as serial Full GC and inefficient Card Table scanning, have been greatly improved. For example, since JDK 10, G1's Full GC runs in parallel, and in many scenarios it performs slightly better than the parallel Full GC implementation of Parallel GC.

Even though Serial GC is relatively old, its simple design and implementation are not necessarily outdated. Its own overhead, whether in GC-related data structures or in threads, is very small, so with the rise of cloud computing, Serial GC has found a new stage in new application scenarios such as Serverless.

Unfortunately, due to theoretical flaws in its algorithm and other reasons, CMS GC was deprecated in JDK 9 and removed in JDK 14, although it still has a very large user base.

JDK11 new features:

10.1. Shenandoah GC of Open JDK12

The G1 collector has been the default collector for several years now.

We have also seen the introduction of two new collectors: ZGC (introduced in JDK 11) and Shenandoah (OpenJDK 12).

Key Features: Low pause time

Shenandoah GC for OpenJDK 12: a low-pause-time GC (experimental).

Shenandoah is undoubtedly the loneliest among the many GCs. It is the first HotSpot garbage collector not developed by the Oracle team, so it inevitably met official rejection. For example, Oracle, which claims that there is no difference between OpenJDK and OracleJDK, still refuses to support Shenandoah in OracleJDK 12.

The Shenandoah garbage collector began as Pauseless GC, a garbage collector research project at Red Hat that aimed to meet low-pause requirements for memory reclamation on the JVM. It was contributed to OpenJDK in 2014.

The Shenandoah team at Red Hat claims that the pause time of the Shenandoah garbage collector is independent of heap size: whether the heap is set to 200MB or 200GB, 99.9% of garbage collection pauses can be kept under ten milliseconds. Actual performance will depend on the actual heap size and workload.

This is data from a paper published by Red Hat in 2016. The test used Elasticsearch (ES) to index 200GB of Wikipedia data. From the results:

  • The pause time is indeed a qualitative leap compared to other collectors, but it did not achieve the goal of keeping the maximum pause time within ten milliseconds.
  • There was a significant drop in throughput, and the total running time was the longest among all tested collectors.

Summary

  • Shenandoah GC's weakness: reduced throughput under high load.
  • Shenandoah GC's strength: low latency.
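
The throughput side of this tradeoff can be observed roughly from inside a running program. The sketch below uses the standard GarbageCollectorMXBean counters to estimate the share of elapsed time attributed to GC. The class name GcOverhead and the allocation workload are hypothetical, and for concurrent collectors the reported collection time is only an approximation that does not necessarily equal pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcOverhead {
    /** Fraction of elapsed time attributed to GC; both arguments are in milliseconds. */
    static double gcFraction(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 0.0;
        }
        return (double) gcMillis / elapsedMillis;
    }

    /** Sums accumulated collection time over all collectors, skipping undefined (-1) values. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long startNanos = System.nanoTime();
        long gcBefore = totalGcMillis();

        // Hypothetical workload: short-lived arrays that provoke collections.
        long checksum = 0;
        for (int i = 0; i < 20_000; i++) {
            byte[] chunk = new byte[64 * 1024];
            chunk[0] = (byte) i;
            checksum += chunk[0];
        }

        long elapsedMillis = (System.nanoTime() - startNanos) / 1_000_000;
        long gcMillis = totalGcMillis() - gcBefore;
        System.out.printf("GC share of elapsed time: %.2f%% (checksum %d)%n",
                100 * gcFraction(gcMillis, elapsedMillis), checksum);
    }
}
```

Running the same program under different collectors gives a first impression of the throughput cost, though a realistic comparison needs a real workload and the GC log.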

10.2. Shocking and revolutionary ZGC

ZGC's goals are highly similar to Shenandoah's: with as little impact on throughput as possible, keep garbage collection pauses under ten milliseconds at any heap size.

The book "In-depth Understanding of the Java Virtual Machine" defines ZGC as follows: the ZGC collector is a garbage collector that uses a Region-based memory layout, has (for now) no generations, and relies on technologies such as read barriers, colored pointers, and memory multi-mapping to implement a concurrent mark-compact algorithm, with low latency as its primary goal.
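
The idea behind colored pointers is to store GC state in otherwise unused bits of a 64-bit reference. The toy sketch below illustrates this with plain long values. The bit positions (42 address bits followed by four flag bits) are an assumption modeled on the early Linux x86-64 ZGC layout and differ in later versions, and the class ColoredPointerSketch is purely illustrative, not how HotSpot is implemented.

```java
public class ColoredPointerSketch {
    // Assumed layout: low 42 bits hold the address, the next 4 bits hold GC metadata.
    static final int  ADDRESS_BITS = 42;
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;
    static final long MARKED0      = 1L << 42;
    static final long MARKED1      = 1L << 43;
    static final long REMAPPED     = 1L << 44;
    static final long FINALIZABLE  = 1L << 45;

    /** Returns the pointer with exactly one metadata flag set on top of the address bits. */
    static long color(long address, long flag) {
        return (address & ADDRESS_MASK) | flag;
    }

    /** Strips the metadata bits and returns the plain address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** Tests whether the given metadata flag is set. */
    static boolean has(long pointer, long flag) {
        return (pointer & flag) != 0;
    }

    public static void main(String[] args) {
        long p = color(0x1234L, MARKED0);
        System.out.println("address = 0x" + Long.toHexString(address(p)));
        System.out.println("marked0 = " + has(p, MARKED0) + ", remapped = " + has(p, REMAPPED));
    }
}
```

Because the state travels with the reference itself, a read barrier can inspect the flag bits on load and decide whether the reference needs fixing before the application uses it.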

The working process of ZGC can be divided into 4 stages: concurrent marking, concurrent preparation for relocation, concurrent relocation, and concurrent remapping. (The concurrent work itself runs without STW.)

ZGC runs concurrently almost everywhere; the exception is initial marking, which is STW. Pause time is therefore spent almost entirely in initial marking, and in practice that time is very short.

Although ZGC is still in an experimental state and has not completed all features, its performance is already quite impressive at this time. It is not an exaggeration to describe it as "shocking and revolutionary".

In the future, it will be the preferred garbage collector for server-side, large memory, and low-latency applications.

New features of JDK14:

JEP 364: ZGC on macOS

JEP 365: ZGC on Windows

Before JDK 14, ZGC was supported only on Linux.

Although many ZGC users run Linux-like environments, people also need ZGC on Windows and macOS for development, deployment, and testing. Many desktop applications can also benefit from ZGC. Therefore, ZGC was ported to Windows and macOS.

ZGC can now also be used on macOS or Windows, for example:

-XX:+UnlockExperimentalVMOptions -XX:+UseZGC

10.3. Other garbage collection: AliGC

AliGC was developed by the Alibaba JVM team on the basis of the G1 algorithm and targets large-heap (LargeHeap) application scenarios. Comparison under specified scenarios:

Of course, other vendors also provide their own GC implementations, such as the well-known low-latency GC Zing (The Azul Garbage Collector). If you are interested, you can refer to the link provided.

Origin blog.csdn.net/weixin_47465999/article/details/127118579