Personal Summary: JVM Garbage Collectors

I have organized my notes on JVM garbage collection into three blog posts. I wrote the posts while studying the material, and I find that learning this way makes things stick much better. I used to learn by reading PDF documents and books, but the drawback was that I would remember the material at the time and forget it a while later, so now I write up the important parts as notes that I can review afterwards (this is only my personal opinion on study technique). I also hope this blog can help fellow readers. Setting a flag here: I will keep blogging from now on. Haha, well, let's get back to business.

Knowledge review:

The first post, "Jvm Garbage Collector (Basics)", mainly covers how to judge whether an object is alive or dead: the two basic algorithms (reference counting and reachability analysis), plus reclamation of the method area. The second post, "Jvm Garbage Collector (Algorithm)", introduces the commonly used garbage collection algorithms: mark-sweep, copying, mark-compact, and generational collection. This post focuses on the JVM's garbage collectors themselves (Serial, ParNew, Parallel Scavenge, Serial Old, Parallel Old, CMS, and G1). Everything I covered before was paving the way for them.

Before we officially start, take a look at the collectors included in the HotSpot virtual machine, as shown in the figure:

The figure shows 7 collectors that act on different generations. A line connecting two collectors means they can be used together. The area a collector sits in indicates whether it is a new-generation or an old-generation collector.

New generation collectors : Serial, ParNew, Parallel Scavenge

Old generation collectors : CMS, Serial Old, Parallel Old

Whole-heap collector : G1
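To see which of these collectors a given JVM is actually running, here is a small sketch using the standard `java.lang.management` API. The class name `ListCollectors` is my own choice for illustration. The reported names depend on the JVM and its startup flags (with G1 active, HotSpot typically reports names like "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper class: prints the collectors the running JVM reports.
class ListCollectors {
    // Names of the collectors registered with the platform MBean server.
    static List<String> collectorNames() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Collection count and accumulated collection time (ms) so far.
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Normally one bean shows up for the new generation and one for the old generation, which matches the pairing idea described for the figure.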

Several related concepts:

Parallel collection : Multiple garbage collection threads work in parallel, while the user threads remain in a waiting state.

Concurrent collection : Refers to user threads and garbage collection threads working at the same time (not necessarily in parallel, they may be executed alternately). The user program continues to run, while the garbage collector is running on another CPU.

Throughput : the ratio of the time the CPU spends running user code to the total time consumed by the CPU (throughput = running user code time / (running user code time + garbage collection time)). For example: the virtual machine runs for 100 minutes, and the garbage collector takes 1 minute, then the throughput is 99%
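The throughput formula above can be sketched in a few lines. The class and method names (`ThroughputDemo`, `throughput`) are made up for this illustration, and the inputs are the 100-minute example from the text (99 minutes of user code plus 1 minute of garbage collection).

```java
// Hypothetical helper: throughput = user code time / (user code time + GC time).
class ThroughputDemo {
    static double throughput(double userCodeTime, double gcTime) {
        return userCodeTime / (userCodeTime + gcTime);
    }

    public static void main(String[] args) {
        // 100 minutes in total, of which 1 minute is garbage collection.
        double t = throughput(99, 1);
        System.out.printf("throughput = %.0f%%%n", t * 100);
    }
}
```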

One: Serial collector

The Serial collector is the most basic collector with the longest development history.

Features: Single-threaded, simple and efficient (compared with other collectors running on a single thread). In an environment limited to a single CPU, the Serial collector has no thread interaction overhead, so by concentrating solely on garbage collection it naturally achieves the highest single-threaded collection efficiency. While it performs garbage collection, it must suspend all other worker threads until collection finishes (Stop The World).

Application scenario : Applicable to virtual machines in Client mode.

Schematic diagram of the operation of the Serial / Serial Old collector

Two: ParNew collector

The ParNew collector is actually a multi-threaded version of the Serial collector.

Except for the use of multithreading, the other behaviors are exactly the same as the Serial collector (parameter control, collection algorithm, Stop The World, object allocation rules, recycling strategies, etc.).

Features : Multithreaded. By default, the number of collection threads the ParNew collector starts equals the number of CPUs. In an environment with many CPUs, you can use the -XX:ParallelGCThreads parameter to limit the number of garbage collection threads.

Like the Serial collector, it has the Stop The World problem.

Application scenario : The ParNew collector is the preferred new-generation collector for many virtual machines running in Server mode, because apart from the Serial collector it is the only one that can work with the CMS collector.

The operation diagram of the ParNew/Serial Old combined collector is as follows:

Three: Parallel Scavenge collector

It is closely tied to throughput, so it is also called the throughput-first collector.

Features : It is a new-generation collector that also uses the copying algorithm, and it is a parallel multi-threaded collector (similar to the ParNew collector).

The goal of this collector is to achieve a controllable throughput. Another point worth noting is the GC adaptive adjustment strategy (the most important difference from the ParNew collector).

GC adaptive adjustment strategy : The Parallel Scavenge collector accepts the -XX:+UseAdaptiveSizePolicy parameter. When this switch is on, there is no need to manually specify the size of the young generation (-Xmn), the ratio of Eden to Survivor space (-XX:SurvivorRatio), the size threshold above which objects are allocated directly in the old generation (-XX:PretenureSizeThreshold), and so on. The virtual machine collects performance monitoring information based on how the system is currently running and dynamically adjusts these parameters to provide the most suitable pause time or the highest throughput. This adjustment method is called the GC adaptive adjustment strategy.

The Parallel Scavenge collector uses two parameters to control throughput:

-XX:MaxGCPauseMillis controls the maximum garbage collection pause time.
-XX:GCTimeRatio directly sets the throughput target.
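The throughput flag (spelled -XX:GCTimeRatio in HotSpot) takes an integer N and asks the collector to keep garbage collection time to about 1 / (1 + N) of the total. The sketch below only does that arithmetic. The class and method names are my own, and it does not read or change any JVM setting.

```java
// Hypothetical helper: the arithmetic behind -XX:GCTimeRatio=N.
class GcTimeRatioDemo {
    // Target fraction of total time spent in GC: 1 / (1 + N).
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    // The matching throughput target is whatever is left for user code.
    static double targetThroughput(int gcTimeRatio) {
        return 1.0 - maxGcFraction(gcTimeRatio);
    }

    public static void main(String[] args) {
        for (int n : new int[] {19, 99}) {
            System.out.printf("GCTimeRatio=%d -> GC at most %.1f%%, throughput target %.1f%%%n",
                    n, maxGcFraction(n) * 100, targetThroughput(n) * 100);
        }
    }
}
```

So N = 19 allows up to 5% of the time for GC, and N = 99 allows 1%, which lines up with the 99% throughput example earlier.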

Four: Serial Old Collector

Serial Old is the old-generation version of the Serial collector.

Features : It is also a single-threaded collector, using the mark-compact algorithm.

Application scenario : It is mainly used in a virtual machine in Client mode. It can also be used in Server mode.

The two main uses in Server mode (explained in detail in the follow-up...):

Used together with the Parallel Scavenge collector in JDK 1.5 and earlier versions.
As a fallback for the CMS collector, used when a Concurrent Mode Failure occurs during concurrent collection.

Serial / Serial Old collector working process diagram (Serial collector diagram is the same):

Five: Parallel Old Collector

It is the old-generation version of the Parallel Scavenge collector.

Features : Multi-threaded, using the mark-compact algorithm.

Application scenarios : Where throughput matters most and CPU resources are a sensitive concern, give priority to the Parallel Scavenge + Parallel Old combination.

Parallel Scavenge/Parallel Old collector working process diagram:

Six: CMS collector

A collector that aims for the shortest collection pause time.

Features : Based on the mark-sweep algorithm. Concurrent collection, low pause.

Application scenario : It suits scenarios that emphasize service response speed and want the shortest possible system pauses for a better user experience, such as web applications and B/S services.

The operation process of the CMS collector is divided into the following 4 steps:

Initial mark : marks only the objects that GC Roots can reach directly. It is fast, but it still has the Stop The World problem.

Concurrent marking : performs GC Roots Tracing to find the live objects; user threads can execute concurrently.

Re-marking : corrects the marking records of objects whose marks changed because the user program kept running during concurrent marking. It still has the Stop The World problem.

Concurrent cleanup : reclaims the objects that marking determined to be dead, while user threads keep running.

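Because the two longest phases, concurrent marking and concurrent cleanup, run alongside the user threads, the memory reclamation process of the CMS collector as a whole is executed concurrently with the user threads.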
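The difference between the initial mark and the tracing that follows can be sketched on a toy object graph. This is a single-threaded simplification under my own assumptions: objects are plain strings, the class and method names are invented, and the real concurrency and the re-marking step are left out.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical toy model of marking; not how HotSpot represents objects.
class CmsMarkingDemo {
    // Initial mark: only the objects that GC roots reference directly.
    static Set<String> initialMark(Collection<String> directlyReachable) {
        return new LinkedHashSet<>(directlyReachable);
    }

    // Marking phase: trace outward from the initially marked objects.
    static Set<String> trace(Set<String> initiallyMarked, Map<String, List<String>> refs) {
        Set<String> marked = new LinkedHashSet<>(initiallyMarked);
        Deque<String> work = new ArrayDeque<>(initiallyMarked);
        while (!work.isEmpty()) {
            String obj = work.pop();
            for (String child : refs.getOrDefault(obj, Collections.<String>emptyList())) {
                if (marked.add(child)) { // true only the first time we see it
                    work.push(child);
                }
            }
        }
        return marked;
    }

    // Cleanup: everything on the heap that was never marked is garbage.
    static Set<String> garbage(Set<String> heap, Set<String> marked) {
        Set<String> dead = new TreeSet<>(heap);
        dead.removeAll(marked);
        return dead;
    }
}
```

With roots pointing at A and B, and references A -> C, C -> D, E -> F, the initial mark covers only A and B, tracing adds C and D, and E and F are left over as garbage.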

Working process diagram of CMS collector:

Disadvantages of CMS collector:

Very sensitive to CPU resources.
Cannot handle floating garbage; a Concurrent Mode Failure may occur and trigger another Full GC.
Because it uses the mark-sweep algorithm, space fragmentation occurs. As a result, there may be no contiguous space large enough for a large object, and a Full GC has to be triggered early.
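The fragmentation problem is easy to see on a toy heap. In the sketch below (my own simplified model, not HotSpot's memory layout) the heap is an array of slots, each holding the id of the object occupying it or 0 when free. Sweeping frees dead objects in place without moving the survivors.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy heap showing why sweeping without compaction fragments memory.
class FragmentationDemo {
    // Sweep: free every slot whose object is not live; survivors stay where they are.
    static void sweep(int[] heap, Set<Integer> liveIds) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != 0 && !liveIds.contains(heap[i])) {
                heap[i] = 0;
            }
        }
    }

    static int totalFree(int[] heap) {
        int free = 0;
        for (int slot : heap) {
            if (slot == 0) free++;
        }
        return free;
    }

    // Largest run of adjacent free slots, i.e. the biggest object that still fits.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 1, 2, 2, 3, 3, 4, 4};               // four objects, two slots each
        Set<Integer> live = new HashSet<>(Arrays.asList(1, 3)); // objects 2 and 4 are dead
        sweep(heap, live);
        System.out.println(Arrays.toString(heap));
        System.out.println("free slots = " + totalFree(heap)
                + ", largest contiguous run = " + largestFreeRun(heap));
    }
}
```

After the sweep there are 4 free slots in total, but the largest contiguous run is only 2, so a 3-slot object cannot be allocated even though enough memory is free overall.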

Seven: G1 collector

A garbage collector for server applications.

Features are as follows:

Parallelism and concurrency: G1 can take full advantage of multi-CPU, multi-core hardware, using multiple CPUs to shorten the Stop-The-World pause time. Where other collectors would need to pause the Java threads to perform some GC actions, G1 can let the Java program keep running by doing that work concurrently.

Generational collection: G1 can manage the entire Java heap on its own, yet it still handles newly created objects differently from old objects that have survived for some time and through multiple GCs, in order to get better collection results.

Space compaction: G1 does not produce memory fragmentation while it runs, and leaves regular, usable memory after collection.

Predictable pauses: Besides pursuing low pauses, G1 can also build a predictable pause time model, letting users explicitly specify that within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection.

Why can G1 build a predictable pause time model?

Because it systematically avoids collecting the entire Java heap at once. G1 tracks the value of the garbage accumulated in each Region, maintains a priority list in the background, and each time reclaims the most valuable Regions first, within the allowed collection time. This ensures the highest possible collection efficiency in a limited time.
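The priority-list idea can be sketched as a greedy selection. This is only an illustration under my own assumptions: the `Region` class, its fields, and the "value = reclaimable space / estimated cost" formula are invented here, and the real G1 uses its own prediction model to estimate costs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: pick the most valuable regions that fit a pause budget.
class RegionSelectionDemo {
    static final class Region {
        final String name;
        final long reclaimableMb;     // space we expect to get back
        final double estimatedCostMs; // time we expect collection to take

        Region(String name, long reclaimableMb, double estimatedCostMs) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.estimatedCostMs = estimatedCostMs;
        }

        // Assumed value metric: space reclaimed per millisecond of pause.
        double value() {
            return reclaimableMb / estimatedCostMs;
        }
    }

    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMs) {
        List<Region> byValue = new ArrayList<>(regions);
        // Highest value first.
        byValue.sort((a, b) -> Double.compare(b.value(), a.value()));

        List<Region> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region r : byValue) {
            if (spentMs + r.estimatedCostMs <= pauseBudgetMs) {
                chosen.add(r);
                spentMs += r.estimatedCostMs;
            }
        }
        return chosen;
    }
}
```

For example, with regions A (8 MB, 4 ms), B (6 MB, 2 ms), C (1 MB, 5 ms), D (3 MB, 3 ms) and a 6 ms budget, the sketch picks B and then A, and leaves C and D for a later collection.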

The difference between G1 and other collectors :

The working scope of other collectors is the entire young or old generation, whereas the G1 collector works on the entire Java heap. With G1, the Java heap is divided into multiple independent regions of equal size (Regions). The concepts of new generation and old generation are retained, but the two are no longer physically isolated from each other; each is a collection of Regions (not necessarily contiguous).

Problems with the G1 collector:

Regions cannot be fully isolated. An object allocated in one Region can have a reference relationship with any other object in the Java heap. When reachability analysis is used to decide whether an object is alive, the entire Java heap would have to be scanned to guarantee accuracy. Other collectors have this problem too (it is just more prominent in G1), and it lowers the efficiency of Minor GC.

How does the G1 collector solve the above problems?

Use a Remembered Set to avoid scanning the entire heap. Each Region in G1 has a corresponding Remembered Set. When the virtual machine detects that the program is writing to Reference-type data, it triggers a Write Barrier that temporarily interrupts the write and checks whether the referenced object lives in a different Region (for example, whether an old-generation object references a new-generation object). If so, the reference information is recorded, via the CardTable, in the Remembered Set of the Region that the referenced object belongs to. When memory is reclaimed, adding the Remembered Set to the enumeration range of the GC roots guarantees that nothing is missed even though the whole heap is not scanned.
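Here is a rough sketch of the bookkeeping described above. It is a simplified model under my own assumptions: `Obj`, `writeField` and `extraRoots` are invented names, the write barrier is just an ordinary method that every reference store goes through, and the CardTable and the cleanup of stale entries are left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of per-region remembered sets maintained by a write barrier.
class RememberedSetDemo {
    static final class Obj {
        final String name;
        final int region; // id of the Region this object lives in
        Obj field;        // a single reference field, for simplicity

        Obj(String name, int region) {
            this.name = name;
            this.region = region;
        }
    }

    // Region id -> objects in OTHER regions that hold a reference into it.
    private final Map<Integer, Set<Obj>> rememberedSets = new HashMap<>();

    // Stand-in for the write barrier: runs on every reference store.
    void writeField(Obj holder, Obj target) {
        holder.field = target;
        if (target != null && holder.region != target.region) {
            rememberedSets
                    .computeIfAbsent(target.region, id -> new HashSet<>())
                    .add(holder);
        }
    }

    // When collecting one region, these holders are treated as additional roots,
    // so the rest of the heap does not have to be scanned.
    Set<Obj> extraRoots(int region) {
        return rememberedSets.getOrDefault(region, Collections.<Obj>emptySet());
    }
}
```

A store from an object in Region 1 to an object in Region 0 adds the holder to Region 0's remembered set, while a store between two objects in the same Region records nothing.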

Leaving aside the work of maintaining the Remembered Sets, the operation of the G1 collector can be roughly divided into the following steps:

Initial mark : Only marks the objects that GC Roots can reach directly, and modifies the value of TAMS (Next Top at Mark Start) so that when the user program runs concurrently in the next phase, new objects are created in the correct, available Regions. (Threads must be paused, but only briefly.)

Concurrent marking : Analyzes the reachability of objects in the heap starting from GC Roots to find the surviving objects. (It takes a long time, but it can be executed concurrently with the user program.)

Final mark : Corrects the part of the marking records that changed because the user program kept running during concurrent marking. The virtual machine records these object changes in per-thread Remembered Set Logs, and this phase merges the data in the Remembered Set Logs into the Remembered Sets. (Threads must be paused, but the work can be done in parallel.)

Screening and recycling : Sorts the Regions by reclamation value and cost, and draws up a collection plan according to the GC pause time the user expects. (Can be executed concurrently.)

Schematic diagram of G1 collector operation: