Java virtual machine (4): garbage collector

Part 4: The garbage collector

1. Garbage collection algorithm

[1] Mark-Sweep algorithm (Mark-Sweep)

The garbage collector first determines which objects need to be recycled and marks them, then sweeps the heap and reclaims every marked object

Disadvantages: neither the marking phase nor the sweeping phase is very efficient, and sweeping leaves memory fragmentation behind
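As a rough illustration of where the fragmentation comes from, here is a minimal sketch on a toy heap. The `Obj` class, the slot array and the object names are made up for this example. Note that the sketch marks the reachable objects and sweeps the rest, which is how the idea is usually implemented; it identifies the same set of garbage as the description above.

```java
import java.util.ArrayList;
import java.util.List;

class MarkSweepDemo {
    // Hypothetical toy object: a name plus outgoing references.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: flag everything reachable from the given object.
    static void mark(Obj o) {
        if (o == null || o.marked) return;
        o.marked = true;
        for (Obj r : o.refs) mark(r);
    }

    // Sweep phase: free every unmarked slot in place; survivors do not move.
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue;
            if (heap[i].marked) heap[i].marked = false; // reset for the next cycle
            else heap[i] = null;                        // reclaim the slot
        }
    }

    // Render the heap, using '_' for a free slot.
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) sb.append(o == null ? "_" : o.name);
        return sb.toString();
    }

    static String demo() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"),
            d = new Obj("D"), e = new Obj("E");
        a.refs.add(c);
        c.refs.add(e);
        Obj[] heap = {a, b, c, d, e};
        mark(a);      // A is the only root in this example
        sweep(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints A_C_E
    }
}
```

The two free slots are not adjacent, so an object needing two contiguous slots could not be placed even though two slots are free.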

[2] Copying algorithm (Copying)

When garbage collection runs, the objects that are still alive are copied from the memory area being collected to another, free memory area, and the original area is then reclaimed as a whole

Advantages: no memory fragmentation

Disadvantage: part of the memory must always be kept free as the copy target, so the usable memory is reduced
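The copying idea can be sketched with two equally sized arrays standing in for the two memory areas. This is only an illustration: the set of live objects is passed in directly (an assumption that skips reachability analysis), and all names are invented.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingDemo {
    // Copy the live objects from 'from' to the start of a fresh 'to' space,
    // then treat the whole from-space as free.
    static String[] collect(String[] from, Set<String> live) {
        String[] to = new String[from.length];
        int next = 0;
        for (String o : from) {
            if (o != null && live.contains(o)) to[next++] = o;
        }
        Arrays.fill(from, null);
        return to;
    }

    // Render a space, using '_' for a free slot.
    static String layout(String[] space) {
        StringBuilder sb = new StringBuilder();
        for (String o : space) sb.append(o == null ? "_" : o);
        return sb.toString();
    }

    static String demo() {
        String[] from = {"A", "B", "C", "D", "E"};
        String[] to = collect(from, new HashSet<>(Arrays.asList("A", "C", "E")));
        return layout(from) + "|" + layout(to);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints _____|ACE__
    }
}
```

The survivors end up packed together with no gaps, but only half of the total memory is usable at any one time.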

[3] Mark-Compact algorithm (Mark-Compact)

The garbage collector first determines which objects need to be recycled and marks them, then moves the surviving (unmarked) objects toward one end of the memory, and finally reclaims all the memory beyond the boundary of the surviving objects

Advantages: no memory fragmentation

Disadvantages: low efficiency
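A minimal sketch of the compaction step, again on a toy array heap with the live set supplied by the caller (both are assumptions made for the example):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarkCompactDemo {
    // Slide every live object toward index 0 and clear everything else.
    // Returns the index of the first free slot after compaction.
    static int compact(String[] heap, Set<String> live) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            String o = heap[i];
            heap[i] = null;
            if (o != null && live.contains(o)) heap[free++] = o;
        }
        return free;
    }

    // Render the heap, using '_' for a free slot.
    static String layout(String[] heap) {
        StringBuilder sb = new StringBuilder();
        for (String o : heap) sb.append(o == null ? "_" : o);
        return sb.toString();
    }

    static String demo() {
        String[] heap = {"A", "B", "C", "D", "E"};
        int firstFree = compact(heap, new HashSet<>(Arrays.asList("A", "C", "E")));
        return layout(heap) + " firstFree=" + firstFree;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints ACE__ firstFree=3
    }
}
```

Compared with the copying approach no second area is needed, but every surviving object may have to be moved within the same space, which is where the extra cost comes from.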

[4] Generational Collection Algorithm (Generational Collection)

According to the different life cycles of objects, memory is divided into several blocks; generally the Java heap is divided into the young generation and the old generation. The most suitable collection algorithm can then be chosen according to the characteristics of each generation

In the young generation, objects are short-lived, so the copying algorithm is suitable

In the old generation, objects are long-lived and there is no extra space to act as an allocation guarantee, so the mark-sweep or mark-compact algorithm is suitable

2. Java garbage collector

【1】Serial

Serial collector

In both the young generation and the old generation, a single collection thread performs garbage collection. While it is collecting, all other worker threads must be suspended until the collection ends (STW, Stop The World)

The young generation is collected by a single thread using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm

【2】 ParNew

Young-generation parallel collector

Compared with the Serial collector, its improvement is that the young generation is collected by multiple threads in parallel. By default the number of collection threads equals the number of CPUs; it can be limited with the -XX:ParallelGCThreads parameter

The young generation is collected by parallel threads using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm

【3】Parallel Scavenge

Young-generation parallel collector

Compared with ParNew, its distinguishing goal is a controllable throughput (Throughput) rather than the shortest possible STW time

Throughput is the ratio of the time used by the CPU to execute user code to the total time consumed by the CPU (time to execute user code + GC time). For example, if the virtual machine runs for 100 minutes, and garbage collection consumes 1 minute, the throughput is (100-1) / 100 * 100% = 99%

The maximum pause time can be controlled with -XX:MaxGCPauseMillis=???

The throughput can be controlled with -XX:GCTimeRatio=???
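The throughput formula above is easy to check in code. The sketch below also shows how a -XX:GCTimeRatio value maps to a GC time budget: according to the HotSpot documentation, a value of N asks for a ratio of GC time to application time of 1:N, that is, at most 1 / (1 + N) of the total time spent in GC. The class and method names are invented for the example.

```java
class ThroughputDemo {
    // Throughput = user-code time / (user-code time + GC time).
    static double throughput(double userMinutes, double gcMinutes) {
        return userMinutes / (userMinutes + gcMinutes);
    }

    // Largest fraction of total time that GC may take for -XX:GCTimeRatio=N.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 100 minutes in total, 1 of them in GC, so 99 minutes of user code.
        System.out.println(throughput(99, 1));
        // Budgets for two sample settings: N = 99 and N = 19.
        System.out.println(maxGcFraction(99));
        System.out.println(maxGcFraction(19));
    }
}
```

With N = 99 the budget is 1% of the total time, which matches the 99% throughput of the worked example; N = 19 would allow 5%.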

In addition, there is the parameter -XX:+UseAdaptiveSizePolicy. When it is enabled, the JVM collects performance monitoring information about the running system and dynamically adjusts parameters such as -Xmn (young generation size), -XX:SurvivorRatio (ratio of the Eden area to a Survivor area) and -XX:PretenureSizeThreshold (size above which an object is allocated directly in the old generation)

The young generation is collected by parallel threads using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm

【4】Serial Old(PS MarkSweep)

Old-generation serial collector

Now it is mainly used as a fallback for the CMS collector, that is, it is used when a Concurrent Mode Failure occurs

Only one collection thread performs garbage collection in the young and old generations

The old-generation collector that comes with Parallel Scavenge is PS MarkSweep, but its implementation is very close to that of Serial Old, so many official materials describe Serial Old in place of PS MarkSweep

The young generation is collected by a single thread using the copying algorithm; the old generation is collected by a single thread using the mark-compact algorithm

【5】Parallel Old

Old-generation parallel collector

The young generation is collected by parallel threads using the copying algorithm; the old generation is collected by parallel threads using the mark-compact algorithm

【6】CMS(Concurrent Mark Sweep)

Concurrent mark-sweep collector

It is a collector that aims for the shortest STW time

The old generation is marked and swept by threads that run concurrently with the user program, using the mark-sweep algorithm

Work process:

  1. Initial mark (CMS initial mark)【STW】

    Mark the objects that GC Roots can reach directly

  2. Concurrent mark (CMS concurrent mark) [Runs alongside user threads]

    Perform GC Roots Tracing

  3. Remark (CMS remark)【STW】

    Correct the marking records of the objects whose marks changed because the user program kept running during concurrent marking

    This stage takes longer than the initial mark and less time than the concurrent mark

  4. Concurrent sweep (CMS concurrent sweep) [Runs alongside user threads]

Disadvantages:

  1. CMS is very sensitive to CPU resources; in fact, all programs designed for concurrency are. Although the concurrent phases do not cause STW, they occupy some threads (CPU resources), which slows the application down and reduces overall throughput

    The default number of collection threads is (number of CPUs + 3) / 4. With 4 or more CPUs, the collection threads take no less than 25% of the CPU resources during concurrent collection. With fewer than 4 CPUs the share is larger: with 2 CPUs, for example, half of the computing power has to be given to the CMS collection thread

  2. CMS cannot clean up floating garbage (Floating Garbage), Concurrent Mode Failure may occur, resulting in Full GC

    Floating garbage refers to new garbage that the still-running user program produces after marking. CMS cannot reclaim it in the current collection, so these objects can only be collected in the next GC

  3. Unlike other collectors, CMS cannot wait until the old generation is almost full before collecting, because user threads keep running during the collection. It must reserve part of the space for the concurrently running program. The trigger percentage can be adjusted with -XX:CMSInitiatingOccupancyFraction. The default value of this parameter is 68 in JDK5, 92 in JDK6, -1 in JDK7 (JDK7 has another parameter, CMSInitiatingPermOccupancyFraction, which is the corresponding trigger percentage for the permanent generation), and -1 in JDK8

    If the reserved memory cannot meet the needs of the program, a Concurrent Mode Failure occurs. The JVM then falls back to its backup plan and temporarily uses Serial Old to perform the collection, which increases the STW time. So the reserved memory space should not be too small

  4. CMS uses the mark-sweep algorithm, which produces memory fragmentation. Too much fragmentation makes it hard to allocate large objects: even though the old generation has plenty of free space, a Full GC is triggered because no contiguous block of memory is large enough

    The parameter -XX:+UseCMSCompactAtFullCollection controls whether memory is compacted during a Full GC; the compaction cannot run concurrently with the user program. The parameter -XX:CMSFullGCsBeforeCompaction sets how many Full GCs without compaction are performed before one with compaction. The default is 0, meaning every Full GC compacts
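The thread-count formula from disadvantage 1 can be checked with a few lines. This sketch only evaluates the formula as given in the text; the class and method names are invented, and a real JVM may derive its default differently depending on version and flags.

```java
class CmsThreadsDemo {
    // Default number of CMS collection threads as stated above (integer division).
    static int defaultThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    // Share of the CPUs occupied by those threads while they run.
    static double cpuShare(int cpus) {
        return (double) defaultThreads(cpus) / cpus;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {2, 4, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + defaultThreads(cpus)
                    + " thread(s), share " + cpuShare(cpus));
        }
    }
}
```

For 2 CPUs the share is one half, while for 4, 8 and 16 CPUs it is one quarter, which is why CMS hurts most on machines with few cores.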

【7】 G1

Garbage-First is a garbage collector aimed at server-side applications

G1 divides the entire Java heap into multiple independent regions of equal size (Region). Unlike the earlier collectors, although it keeps the concepts of the young generation and the old generation, the two are no longer physically contiguous. Instead, each generation is a set of Regions that need not be contiguous

By default the Java heap is divided into about 2048 Regions (too few Regions hurts collection efficiency and lengthens scanning). The size of each Region can be set with -XX:G1HeapRegionSize=???. The minimum is 1MB (1048576B), the maximum is 32MB (33554432B), and the size must be a power of two: a value smaller than 1MB is raised to 1MB, and a value larger than 1MB is rounded down to a power of two
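The rounding rules in the paragraph above can be sketched as follows. This is not HotSpot's code: the method names are invented, and the heap-based calculation simply assumes a target of 2048 Regions as described, so the value a real JVM picks may differ.

```java
class G1RegionSizeDemo {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final int TARGET_REGIONS = 2048;

    // Apply the rules to a requested size: at least 1MB, at most 32MB,
    // rounded down to a power of two.
    static long normalize(long requestedBytes) {
        long size = Math.max(requestedBytes, MIN_REGION);
        size = Long.highestOneBit(size); // largest power of two <= size
        return Math.min(size, MAX_REGION);
    }

    // Assumed default: aim for about 2048 Regions, then apply the same rules.
    static long regionSizeForHeap(long heapBytes) {
        return normalize(heapBytes / TARGET_REGIONS);
    }

    public static void main(String[] args) {
        System.out.println(normalize(3 * MB) / MB);                  // 2
        System.out.println(normalize(512 * 1024) / MB);              // 1
        System.out.println(normalize(64 * MB) / MB);                 // 32
        System.out.println(regionSizeForHeap(4L * 1024 * MB) / MB);  // 2
    }
}
```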

G1 can also plan its work so that it avoids collecting the entire Java heap at once. It tracks the value of the garbage accumulated in each Region (the amount of memory a collection would free and the time the collection would take) and maintains a priority list in the background. Each time, within the allowed collection time, it collects the Regions with the highest value first (hence the name Garbage-First)
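The priority idea can be illustrated with a simple greedy selection. This is a sketch under strong assumptions, not G1's actual policy: the `Region` class, the value measure (reclaimable memory per millisecond of collection time) and the sample numbers are all made up.

```java
import java.util.ArrayList;
import java.util.List;

class GarbageFirstDemo {
    // Hypothetical per-Region bookkeeping.
    static final class Region {
        final String name;
        final double reclaimableMb; // memory a collection would free
        final double costMillis;    // estimated time to collect the Region
        Region(String name, double reclaimableMb, double costMillis) {
            this.name = name;
            this.reclaimableMb = reclaimableMb;
            this.costMillis = costMillis;
        }
        double value() { return reclaimableMb / costMillis; }
    }

    // Pick Regions in order of decreasing value while the pause budget allows.
    static List<String> choose(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((x, y) -> Double.compare(y.value(), x.value()));
        List<String> chosen = new ArrayList<>();
        double used = 0;
        for (Region r : sorted) {
            if (used + r.costMillis <= pauseBudgetMillis) {
                chosen.add(r.name);
                used += r.costMillis;
            }
        }
        return chosen;
    }

    static List<String> demo() {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("R1", 8, 4)); // value 2
        regions.add(new Region("R2", 6, 2)); // value 3
        regions.add(new Region("R3", 1, 1)); // value 1
        regions.add(new Region("R4", 9, 9)); // value 1, too expensive here
        return choose(regions, 7);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [R2, R1, R3]
    }
}
```

R4 holds the most garbage in absolute terms, yet it is left for a later collection because it does not fit into the remaining pause budget.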

Features:

  1. Parallelism and concurrency

    G1 makes full use of multi-core CPUs to shorten STW time and improve throughput, and its collection threads can run concurrently with user threads

  2. Generational collection

    G1 logically divides each Region into Eden, Survivor and Old. Each Region will continuously adjust and switch as the G1 collector runs

  3. Space compaction

    Viewed as a whole, G1 is based on the mark-compact algorithm; viewed locally (between two Regions), it is based on the copying algorithm. Either way, G1 produces no memory fragmentation while it runs, so a large allocation will not trigger the next GC early for lack of contiguous memory

  4. Predictable pause

    The maximum STW time can be specified with the -XX:MaxGCPauseMillis parameter. Guided by the priority list it maintains, G1 collects the Regions with the highest collection value first

Work process:

  1. Initial Marking【STW】

    Mark the objects that GC Roots can directly associate to. Shorter execution time

  2. Concurrent Marking [Work with user threads]

    Starting from GC Roots, the reachability analysis of the objects in the heap memory is performed to find out the surviving objects. Long execution time, but can be executed concurrently with the user program

  3. Final Marking【STW】

    Correct the marking records that changed because the user program kept running during concurrent marking. This stage pauses user threads, but the work can be done by multiple GC threads in parallel

  4. Screening and recycling (Live Data Counting and Evacuation)

    According to the pause time specified by the user, draw up a collection plan and carry it out. User threads are paused during this stage so that multiple GC threads can do the collection in parallel

Note: In early releases the G1 collector was not considered stable enough for production systems. It has since matured and has been the default collector since JDK 9

3. The choice of garbage collector

[1] JDK source code

bool Arguments::check_gc_consistency() {
    bool status = true;
    
    uint i = 0;
    if (UseSerialGC)                        i++;
    if (UseConcMarkSweepGC || UseParNewGC)  i++;
    if (UseParallelGC || UseParallelOldGC)  i++;
    if (UseG1GC)                            i++;
    if (i > 1) {
        jio_fprintf(defaultStream::error_stream(),
                    "Conflicting collector combinations in option list; "
                    "please refer to the release notes for the combinations "
                    "allowed\n");
        status = false;
    }

    return status;
}

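The method above, from the C++ source code of the JDK, checks that the GC policy configuration is consistent: the collector flags fall into four groups, and selecting flags from more than one group is reported as a conflict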
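For readers more comfortable with Java, the same grouping logic can be restated as a small sketch. It mirrors only the counting shown in the excerpt; the class name and the boolean parameters are stand-ins for the JVM flags, not a real API.

```java
class GcConsistencyDemo {
    // Returns true when flags from at most one collector group are enabled.
    static boolean isConsistent(boolean useSerialGC,
                                boolean useConcMarkSweepGC,
                                boolean useParNewGC,
                                boolean useParallelGC,
                                boolean useParallelOldGC,
                                boolean useG1GC) {
        int groups = 0;
        if (useSerialGC) groups++;
        if (useConcMarkSweepGC || useParNewGC) groups++;  // CMS and ParNew share a group
        if (useParallelGC || useParallelOldGC) groups++;  // both Parallel flags share a group
        if (useG1GC) groups++;
        return groups <= 1;
    }

    public static void main(String[] args) {
        // CMS together with ParNew: same group, accepted.
        System.out.println(isConsistent(false, true, true, false, false, false));
        // Serial together with G1: two groups, rejected.
        System.out.println(isConsistent(true, false, false, false, false, true));
    }
}
```

This also explains why the combinations listed below always pair flags from within a single group.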

【2】Serial + Serial Old

Use the parameter -XX:+UseSerialGC

The young generation uses a serial collector, and the old generation also uses a serial collector

【3】ParNew + Serial Old

Use the parameter -XX:+UseParNewGC

The young generation uses a parallel collector and the old generation uses a serial collector

Note that with this configuration the virtual machine prints a warning

Java HotSpot(TM) 64-Bit Server VM warning: Using the ParNew young collector with the Serial old collector is deprecated and will likely be removed in a future release

This configuration is not recommended, since it may be removed in a future release

【4】ParNew + (CMS + Serial Old)

Use the parameter -XX:+UseConcMarkSweepGC

The young generation uses a parallel collector, and the old generation uses the concurrent mark-sweep collector (with the serial collector as the fallback)

【5】Parallel Scavenge + Parallel Old

Use the parameter -XX:+UseParallelGC or -XX:+UseParallelOldGC

The young generation uses a parallel collector, and the old generation also uses a parallel collector

This is the default GC policy configuration in JDK8

【6】 G1

Use the parameter -XX:+UseG1GC

[7] Summary

Single core: Serial

Multi-core, high throughput: Parallel Scavenge and Parallel Old

Multi-core, fast response: ParNew and CMS
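To see which collectors a running JVM actually ended up with, the standard java.lang.management API can be queried. The snippet below is a small sketch; the class name is arbitrary, and the names it prints depend on the JVM version and on the flags it was started with, so no particular output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the collectors registered in this JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", time(ms)=" + gc.getCollectionTime());
        }
    }
}
```

Running it with different flags from the list above, for example -XX:+UseSerialGC and then -XX:+UseG1GC, is a quick way to confirm that a flag took effect.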

4. Memory allocation of Java objects

  1. Objects are allocated in Eden (the Eden area of the young generation) first

  2. Large objects go directly into the old generation. A large object is one that needs a large amount of contiguous memory, such as a very long string or a large array. The size threshold can be set with -XX:PretenureSizeThreshold=???; the default is 0, that is, no size limit is applied. The value is a plain number of bytes, without a unit. Note that this parameter only takes effect with the Serial and ParNew collectors; the Parallel Scavenge collector does not recognize it

  3. When the number of collections an object has survived (its age) reaches MaxTenuringThreshold, it is promoted to the old generation

  4. If the total size of all objects of the same age in the Survivor area is greater than half of the Survivor space, objects whose age is greater than or equal to that age are copied to the old generation during the Minor GC. This is dynamic object age determination

  5. When a Minor GC occurs and the surviving objects in Eden and one Survivor area do not fit into the other Survivor area, the space allocation guarantee mechanism borrows memory from the old generation and copies the objects there early

    The -XX:+HandlePromotionFailure parameter can be passed to allow the guarantee to fail

    If guarantee failure is allowed, the JVM checks whether the largest available contiguous memory in the old generation is greater than the average size of the objects promoted to the old generation in previous collections. If it is, a Minor GC is attempted; otherwise a Full GC is performed. The Minor GC is risky here, because a large number of surviving objects may need promotion when the old generation does not have enough space, which results in a Full GC anyway

    If guarantee failure is not allowed, a Full GC is performed directly
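Rules 3 and 4 above can be combined into one small calculation of the effective promotion age. The sketch follows the rule exactly as worded in this section; the method name and the array layout are invented. The real HotSpot computation differs in detail (it works with cumulative sizes and -XX:TargetSurvivorRatio), so treat this only as an illustration.

```java
class TenuringDemo {
    // sizeByAge[i] is the total size in bytes of Survivor objects of age i
    // (index 0 is unused). Returns the age from which objects are promoted.
    static int promotionAge(long[] sizeByAge, long survivorBytes, int maxTenuringThreshold) {
        for (int age = 1; age < sizeByAge.length && age < maxTenuringThreshold; age++) {
            // Rule 4: objects of one age occupy more than half of Survivor.
            if (sizeByAge[age] * 2 > survivorBytes) return age;
        }
        // Rule 3: otherwise the fixed threshold applies.
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        // Age 2 occupies 60 of 100 bytes, so promotion starts at age 2.
        System.out.println(promotionAge(new long[] {0, 10, 60, 5}, 100, 15)); // 2
        // No single age exceeds half of Survivor, so the threshold stays at 15.
        System.out.println(promotionAge(new long[] {0, 10, 20, 5}, 100, 15)); // 15
    }
}
```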

Origin blog.csdn.net/adsl624153/article/details/103865572