Fourth, the garbage collector
1. Garbage collection algorithm
[1] Mark-Sweep algorithm (Mark-Sweep)
The garbage collector first identifies and marks every object that needs to be recycled, then sweeps the heap and reclaims all the marked objects
Disadvantages: marking and sweeping are both inefficient, and the freed space is left scattered across the heap (memory fragmentation)
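The fragmentation drawback can be seen in a toy model. The sketch below is only an illustration, not how a JVM lays out its heap: the heap is assumed to be an int array in which 0 means a free slot, and the class and method names are made up for this example.

```java
import java.util.Arrays;

/** Toy heap: each slot holds an object id (> 0) or 0 for free space. */
final class MarkSweepSketch {

    /** Sweep phase: frees in place every slot whose object was marked for recycling. */
    static int[] sweep(int[] heap, boolean[] garbage) {
        int[] result = Arrays.copyOf(heap, heap.length);
        for (int i = 0; i < result.length; i++) {
            if (garbage[i]) {
                result[i] = 0; // reclaimed where it is; survivors do not move
            }
        }
        return result;
    }

    /** Longest run of adjacent free slots, i.e. the largest object that still fits. */
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] garbage = {false, true, false, true, false, true}; // result of the mark phase
        int[] after = sweep(heap, garbage);
        System.out.println(Arrays.toString(after)); // [1, 0, 3, 0, 5, 0]
        System.out.println(largestFreeRun(after));  // 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object needing two slots cannot be allocated.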
[2] Copying algorithm (Copying)
When a collection runs, the objects that are still alive are copied from the memory area being collected to another, free memory area, and the collected area is then reclaimed as a whole
Advantages: no memory fragmentation
Disadvantage: wasted memory, because the area reserved as the copy target cannot be used for allocation
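A minimal sketch of the copying idea, under the same toy assumptions (an int array as the heap, 0 as a free slot, hypothetical names). It shows why the survivors end up contiguous and why a second area of equal size has to be kept free.

```java
import java.util.Arrays;

/** Toy semispace heap: 0 marks a free slot, any other value is an object id. */
final class CopyingSketch {

    /** Copies the live objects of fromSpace into a fresh to-space of the same size. */
    static int[] collect(int[] fromSpace, boolean[] live) {
        int[] toSpace = new int[fromSpace.length]; // the reserved, initially empty area
        int next = 0;                              // bump pointer into to-space
        for (int i = 0; i < fromSpace.length; i++) {
            if (live[i]) {
                toSpace[next++] = fromSpace[i];
            }
        }
        return toSpace; // from-space can now be reused as a whole
    }

    public static void main(String[] args) {
        int[] from = {1, 2, 3, 4, 5, 6};
        boolean[] live = {true, false, true, false, true, false};
        System.out.println(Arrays.toString(collect(from, live))); // [1, 3, 5, 0, 0, 0]
    }
}
```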
[3] Mark-Compact algorithm
The garbage collector first identifies and marks every object that needs to be recycled, then moves the objects that do not need to be recycled (the unmarked ones) toward one end of the memory, and finally reclaims all the memory beyond the boundary of the moved objects
Advantages: no memory fragmentation
Disadvantages: low efficiency, because the surviving objects have to be moved
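The compaction step can be sketched in the same toy model (int array heap, 0 as free space, hypothetical names). Unlike the copying algorithm, no second area is needed: the survivors slide within the same array.

```java
import java.util.Arrays;

/** Toy heap: 0 marks a free slot, any other value is an object id. */
final class MarkCompactSketch {

    /** Slides the surviving objects toward index 0 and frees everything behind them. */
    static int[] compact(int[] heap, boolean[] live) {
        int[] result = Arrays.copyOf(heap, heap.length);
        int next = 0; // boundary between compacted survivors and reclaimable space
        for (int i = 0; i < result.length; i++) {
            if (live[i]) {
                result[next++] = result[i]; // next <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(result, next, result.length, 0); // reclaim all memory past the boundary
        return result;
    }

    public static void main(String[] args) {
        int[] heap = {1, 2, 3, 4, 5, 6};
        boolean[] live = {true, false, true, false, true, false};
        System.out.println(Arrays.toString(compact(heap, live))); // [1, 3, 5, 0, 0, 0]
    }
}
```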
[4] Generational Collection Algorithm (Generational Collection)
The memory is divided into several blocks according to the life cycles of objects; the Java heap is generally divided into the young generation and the old generation. This way, the most suitable collection algorithm can be chosen for the characteristics of each generation
In the young generation, objects are short-lived, so the copying algorithm is a good fit
In the old generation, objects are long-lived and there is no extra space to act as an allocation guarantee, so the mark-sweep or mark-compact algorithm is a good fit
2. Java garbage collector
【1】Serial
Serial collector
In both the new generation and the old generation, there is only one collection thread to perform garbage collection operations. When the collection thread performs garbage collection, other worker threads must be suspended until the collection ends (STW, Stop The World)
The young generation is collected by a single thread using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm
【2】 ParNew
Young-generation parallel collector
Compared with Serial, its improvement is that the young generation is collected by multiple threads in parallel. By default the number of collection threads equals the number of CPUs, and it can be limited with the -XX:ParallelGCThreads parameter
The young generation is collected by parallel threads using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm
【3】Parallel Scavenge
Young-generation parallel collector
Compared with ParNew, its main goal is a controllable throughput (Throughput) rather than the shortest possible STW time
Throughput is the ratio of the time used by the CPU to execute user code to the total time consumed by the CPU (time to execute user code + GC time). For example, if the virtual machine runs for 100 minutes, and garbage collection consumes 1 minute, the throughput is (100-1) / 100 * 100% = 99%
The maximum pause time can be controlled with -XX:MaxGCPauseMillis=???, and the throughput can be controlled with -XX:GCTimeRatio=???
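The throughput arithmetic above can be checked with a few lines of Java. The sketch assumes the usual HotSpot reading of -XX:GCTimeRatio=N, namely that at most 1/(1+N) of the total time should be spent in GC; the class and method names are made up for this example.

```java
final class ThroughputSketch {

    /** Throughput in percent: time running user code divided by total time. */
    static double throughputPercent(double totalMinutes, double gcMinutes) {
        return 100.0 * (totalMinutes - gcMinutes) / totalMinutes;
    }

    /** Share of total time allowed for GC, in percent, for a given GCTimeRatio value (assumed rule). */
    static double gcSharePercent(int gcTimeRatio) {
        return 100.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println(throughputPercent(100, 1)); // 99.0, the example from the text
        System.out.println(gcSharePercent(99));        // 1.0, i.e. a 99% throughput goal
        System.out.println(gcSharePercent(19));        // 5.0, i.e. a 95% throughput goal
    }
}
```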
In addition, there is the parameter -XX:+UseAdaptiveSizePolicy. When it is enabled, the JVM collects performance monitoring information about the running system and dynamically adjusts parameters such as -Xmn (young generation size), -XX:SurvivorRatio (ratio of the Eden area to a Survivor area) and -XX:PretenureSizeThreshold (size threshold above which objects go directly into the old generation)
The young generation is collected by parallel threads using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm
【4】Serial Old(PS MarkSweep)
Old-generation serial collector
It is now mainly used as the fallback for the CMS collector, that is, it takes over when a Concurrent Mode Failure occurs
It uses only one collection thread for garbage collection in both the young and old generations
The old-generation collector paired with Parallel Scavenge is PS MarkSweep, but its implementation is very close to that of Serial Old, so many official materials describe Serial Old in place of PS MarkSweep
The young generation is collected by a single thread using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm
【5】Parallel Old
Old-generation parallel collector
The young generation is collected by parallel threads using the copying algorithm; the old generation is collected by parallel threads using the mark-compact algorithm
【6】CMS(Concurrent Mark Sweep)
Concurrent mark-sweep collector
It is a collector that aims for the shortest STW time
The old generation is marked and swept by threads that run concurrently with the user threads, using the mark-sweep algorithm
Work process:
-
CMS initial mark【STW】
Mark the objects that are directly reachable from GC Roots
-
CMS concurrent mark [runs together with user threads]
Perform GC Roots Tracing
-
CMS remark【STW】
Correct the mark records of the objects whose marks changed because the user program kept running during concurrent marking
This stage takes longer than the initial mark but less time than the concurrent mark
-
CMS concurrent sweep [runs together with user threads]
Disadvantages:
-
CMS is very sensitive to CPU resources, as programs designed for concurrency generally are. The concurrent phases do not stop the world, but they occupy some threads (CPU resources), which slows the application down and reduces overall throughput
The default number of collection threads is (number of CPUs + 3) / 4. With 4 or more CPUs, the collection threads take no less than 25% of the CPU resources during concurrent collection; with fewer than 4 CPUs (for example 2), as much as half of the computing power has to be given to the CMS collection threads
-
CMS cannot clean up floating garbage (Floating Garbage), so a Concurrent Mode Failure may occur and lead to a Full GC
Floating garbage refers to new garbage objects that appear after marking, while the user threads keep running. These objects can only be collected in the next GC
-
Unlike other collectors, CMS cannot wait until the old generation is almost full before collecting, because it has to reserve part of the space for the program, which keeps running concurrently. The trigger percentage can be adjusted with -XX:CMSInitiatingOccupancyFraction. The default value of this parameter is 68 in JDK5, 92 in JDK6, -1 in JDK7 (JDK7 has another parameter, CMSInitiatingPermOccupancyFraction, which is the corresponding trigger percentage for the permanent generation), and -1 in JDK8
If the reserved memory cannot meet the program's needs, a Concurrent Mode Failure occurs. The JVM then falls back to its backup plan and temporarily uses Serial Old to perform the collection, which increases the STW time. So the reserved memory space should not be too small
-
CMS uses the mark-sweep algorithm, which produces memory fragmentation. Too much fragmentation makes it hard to allocate large objects: even when the old generation has plenty of free space, a Full GC is triggered because no contiguous block of memory that is large enough can be found
The parameter -XX:+UseCMSCompactAtFullCollection controls whether memory is compacted during a Full GC, but the compaction cannot run concurrently with the user threads. Another parameter, -XX:CMSFullGCsBeforeCompaction, sets how many Full GCs without compaction are executed before one with compaction. The default is 0, that is, every Full GC compacts memory
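To make the CPU cost in the first disadvantage concrete, the sketch below evaluates the (number of CPUs + 3) / 4 formula quoted above for a few CPU counts. It only reproduces the arithmetic from the text with integer division; the class and method names are made up for this example.

```java
final class CmsThreadsSketch {

    /** Default number of CMS collection threads, using the formula quoted in the text. */
    static int defaultCmsThreads(int cpus) {
        return (cpus + 3) / 4; // integer division
    }

    /** Percentage of the CPUs occupied by those threads during the concurrent phases. */
    static double cpuSharePercent(int cpus) {
        return 100.0 * defaultCmsThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        // 2 CPUs -> 1 thread (50%), 4 -> 1 (25%), 8 -> 2 (25%), 16 -> 4 (25%)
        for (int cpus : new int[] {2, 4, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + defaultCmsThreads(cpus)
                    + " thread(s), " + cpuSharePercent(cpus) + "% of CPU");
        }
    }
}
```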
【7】 G1
Garbage-First is a garbage collector for server-side applications
G1 divides the entire Java heap into multiple regions of equal size (Region). Unlike earlier collectors, although it keeps the concepts of the young generation and the old generation, the two are no longer physically contiguous. Instead, each generation is a set of Regions that need not be contiguous
By default, the Java heap is divided into about 2048 Regions (too few Regions hurts collection efficiency and increases scanning time). The size of each Region can be adjusted with the -XX:G1HeapRegionSize=??? parameter. The minimum is 1MB (1048576B) and the maximum is 32MB (33554432B), and the size must be a power of two: a value below 1MB is raised to 1MB, and a value above 1MB is rounded down to a power of two
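The sizing rule can be sketched as follows. This is a simplification under stated assumptions: it divides a single heap size by the target of 2048 Regions, rounds down to a power of two and keeps the result between 1MB and 32MB. The real JVM may derive the heap size it uses differently, and the class and method names here are hypothetical.

```java
final class G1RegionSizeSketch {

    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGIONS = 2048;

    /** Region size in bytes for a given heap size, following the rule described above. */
    static long regionSize(long heapBytes) {
        long size = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION); // below 1MB: take 1MB
        size = Long.highestOneBit(size);                              // round down to a power of two
        return Math.min(size, MAX_REGION);                            // never above 32MB
    }

    public static void main(String[] args) {
        System.out.println(regionSize(1L << 30) / MB);   // 1GB heap   -> 1
        System.out.println(regionSize(6L << 30) / MB);   // 6GB heap   -> 2
        System.out.println(regionSize(128L << 30) / MB); // 128GB heap -> 32
    }
}
```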
G1 can also plan its work to avoid collecting the whole Java heap at once. It tracks the value of the garbage accumulated in each Region (the amount of memory a collection would free and the time the collection would take) and maintains a priority list in the background. Each time, within the allowed collection time, it collects the Regions with the highest value first (the origin of the name Garbage-First)
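The idea of collecting the most valuable Regions first within a time budget can be illustrated with a greedy selection. This is only a sketch of the principle, not G1's actual policy: the Region class, its fields and the value measure (reclaimable memory per millisecond) are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class GarbageFirstSketch {

    /** Hypothetical per-Region statistics; the names are illustrative only. */
    static final class Region {
        final String name;
        final long reclaimableMb;
        final double collectMillis;

        Region(String name, long reclaimableMb, double collectMillis) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.collectMillis = collectMillis;
        }

        /** Collection value: memory freed per millisecond of collection time. */
        double value() {
            return reclaimableMb / collectMillis;
        }
    }

    /** Picks Regions in descending value order while they still fit into the pause budget. */
    static List<String> choose(List<Region> regions, double pauseBudgetMillis) {
        List<Region> byValue = new ArrayList<>(regions);
        byValue.sort((a, b) -> Double.compare(b.value(), a.value()));
        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : byValue) {
            if (spent + r.collectMillis <= pauseBudgetMillis) {
                chosen.add(r.name);
                spent += r.collectMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 8, 4.0)); // value 2.0
        regions.add(new Region("B", 6, 2.0)); // value 3.0
        regions.add(new Region("C", 2, 4.0)); // value 0.5
        regions.add(new Region("D", 5, 5.0)); // value 1.0
        System.out.println(choose(regions, 7.0)); // [B, A]
    }
}
```

With a 7 ms budget, B and A are collected (6 ms in total); D and C would exceed the budget and are left for a later collection.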
Features:
-
Parallelism and concurrency
G1 makes full use of multi-core CPUs to shorten STW time and improve throughput, and its garbage collection threads can run concurrently with the user threads
-
Generational collection
G1 logically assigns each Region the role of Eden, Survivor or Old, and the role of a Region keeps changing as the G1 collector runs
-
Space compaction
Viewed as a whole, G1 uses a mark-compact algorithm; viewed locally (between two Regions), it is based on the copying algorithm. This means that G1 produces no memory fragmentation while it runs, so the next GC is not triggered early just because the program needs to allocate a large object and cannot find contiguous memory
-
Predictable pause
The maximum STW time can be specified with the -XX:MaxGCPauseMillis parameter. G1 then follows the collection priority list it maintains and collects the Regions with the highest collection value first
Work process:
-
Initial Marking【STW】
Mark the objects that are directly reachable from GC Roots. Short execution time
-
Concurrent Marking [runs together with user threads]
Starting from GC Roots, perform reachability analysis on the objects in the heap to find the surviving objects. Long execution time, but it runs concurrently with the user program
-
Final Marking【STW】
Correct the mark records that changed because the program kept running during concurrent marking. User threads are paused, but the work can be done by multiple GC threads in parallel
-
Screening and recycling (Live Data Counting and Evacuation)
Make a collection plan according to the pause time specified by the user and carry out the collection. The work can be done by multiple GC threads in parallel
Note: For now, the G1 collector is not stable and it is not recommended to use it on a production system
3. The choice of garbage collector
[1] JDK source code
bool Arguments::check_gc_consistency() {
  bool status = true;
  uint i = 0;
  if (UseSerialGC) i++;
  if (UseConcMarkSweepGC || UseParNewGC) i++;
  if (UseParallelGC || UseParallelOldGC) i++;
  if (UseG1GC) i++;
  if (i > 1) {
    jio_fprintf(defaultStream::error_stream(),
                "Conflicting collector combinations in option list; "
                "please refer to the release notes for the combinations "
                "allowed\n");
    status = false;
  }
  return status;
}
In the C++ source code of the JDK, there is a method to check the correctness of the GC policy configuration
【2】Serial + Serial Old
Use the parameter -XX:+UseSerialGC
The new generation uses serial collectors, and the old generation also uses serial collectors
【3】ParNew + Serial Old
Use the parameter -XX:+UseParNewGC
The young generation uses the parallel collector and the old generation uses the serial collector
Note that with this configuration the virtual machine prints a warning message:
Java HotSpot(TM) 64-Bit Server VM warning: Using the ParNew young collector with the Serial old collector is deprecated and will likely be removed in a future release
It is recommended not to use this policy configuration, it may be removed in future releases
【4】ParNew + (CMS + Serial Old)
Use the parameter -XX:+UseConcMarkSweepGC
The young generation uses the parallel collector, and the old generation uses the concurrent mark-sweep collector (with the serial collector as the fallback)
【5】Parallel Scavenge + Parallel Old
Use the parameter -XX:+UseParallelGC or -XX:+UseParallelOldGC
The young generation uses the parallel collector, and the old generation also uses the parallel collector
This is the default GC configuration in JDK8
【6】 G1
Use the parameter -XX:+UseG1GC
[7] Summary
Single core: Serial
Multi-core, high throughput: Parallel Scavenge and Parallel Old
Multi-core, fast response: ParNew and CMS
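To check which combination a running JVM actually selected, the standard java.lang.management API can list the active collectors. The sketch below uses only GarbageCollectorMXBean from the JDK; the class name is made up, and the names printed depend on the JVM and its flags (on HotSpot with -XX:+UseParallelGC they are typically "PS Scavenge" and "PS MarkSweep").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectorsSketch {

    /** Names of the collectors the current JVM registered, usually one per generation. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Output varies with the JVM and the -XX:+Use...GC flag it was started with.
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```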
4. Memory allocation of Java objects
-
Objects are allocated in Eden (the Eden area of the young generation) first
-
Large objects go directly into the old generation. A large object is one that needs a large amount of contiguous memory, such as a very long string or a large array.
The size threshold can be adjusted with the -XX:PretenureSizeThreshold=??? parameter; the default is 0, that is, no limit. This parameter takes no unit. Note that it is only effective with the Serial and ParNew collectors; the Parallel Scavenge collector does not recognize it
-
When the number of Minor GCs an object has survived reaches MaxTenuringThreshold, it enters the old generation
-
If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, the objects whose age is greater than or equal to that age are copied to the old generation during the Minor GC. This is dynamic object age determination
-
When a Minor GC occurs, if the surviving objects in Eden and one Survivor space cannot all be copied to the other Survivor space, the space allocation guarantee mechanism borrows memory from the old generation and copies the objects to the old generation early.
The -XX:+HandlePromotionFailure parameter allows the guarantee to fail
If guarantee failure is allowed, the JVM checks whether the largest contiguous free memory in the old generation is greater than the average size of the objects promoted to the old generation in previous collections. If it is, the JVM attempts a Minor GC; otherwise it performs a Full GC. The Minor GC is risky here, because an unusually large number of survivors may have to be promoted, and if the old generation does not have enough memory for them, a Full GC results
If guarantee failure is not allowed, the JVM goes directly to a Full GC
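Two of the rules above, dynamic age determination and the allocation guarantee check, can be written down as small decision functions. This is a sketch under stated assumptions, not JVM code: the age rule is implemented in one common reading, where sizes are accumulated from the youngest age upward until they exceed half of the Survivor space, and all names and parameters are hypothetical.

```java
final class PromotionSketch {

    /**
     * Dynamic age determination (assumed cumulative reading of the rule).
     * bytesByAge[a] is the total size of Survivor objects of age a; index 0 is unused.
     * Objects whose age is at least the returned value are promoted at the next Minor GC.
     */
    static int tenuringThreshold(long[] bytesByAge, long survivorCapacity, int maxTenuringThreshold) {
        long limit = survivorCapacity / 2;
        long total = 0;
        for (int age = 1; age < bytesByAge.length; age++) {
            total += bytesByAge[age];
            if (total > limit) {
                return Math.min(age, maxTenuringThreshold);
            }
        }
        return maxTenuringThreshold; // limit never exceeded: only the fixed threshold applies
    }

    /** Allocation guarantee check as described in the text: which collection is attempted. */
    static String collectionFor(boolean handlePromotionFailure,
                                long oldGenMaxContiguousFree,
                                long averagePromotedBytes) {
        if (handlePromotionFailure && oldGenMaxContiguousFree > averagePromotedBytes) {
            return "Minor GC"; // risky: may still end in a Full GC if the promotion fails
        }
        return "Full GC";
    }

    public static void main(String[] args) {
        long[] bytesByAge = {0, 30, 25, 10};                            // sizes for ages 1, 2, 3
        System.out.println(tenuringThreshold(bytesByAge, 100, 15));     // 2
        System.out.println(collectionFor(true, 64, 32));                // Minor GC
        System.out.println(collectionFor(true, 16, 32));                // Full GC
        System.out.println(collectionFor(false, 64, 32));               // Full GC
    }
}
```

In the first call, ages 1 and 2 together occupy 55 of the 100 units of Survivor space, which is more than half, so objects of age 2 and older are promoted.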