Repost: JVM series (3): Java GC algorithms and garbage collectors

GC algorithms and garbage collectors

Overview

Garbage collection, usually abbreviated as "GC", was born with the Lisp language at MIT in 1960. After more than half a century of development, it is now a very mature technology.

In the JVM, the program counter, the virtual machine stack, and the native method stack are all created and destroyed together with their thread. Stack frames are pushed and popped as methods are entered and exited, so this memory is cleaned up automatically. Garbage collection therefore concentrates on the Java heap and the method area, where memory is allocated and used dynamically while the program runs.

 

Determining whether an object is alive

There are generally two ways to determine whether an object is alive:

Reference counting: Each object has a reference count attribute. When a new reference is added, the count is increased by 1, and when a reference is released, the count is decreased by 1. When the count reaches 0, the object can be reclaimed. This method is simple, but it cannot handle objects that reference each other in a cycle.
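The circular reference problem can be shown with a toy counter. This is only a sketch: the RcObject and RefCountDemo classes are invented for illustration and have nothing to do with how HotSpot actually manages memory.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy reference-counted object (hypothetical, for illustration only). */
final class RcObject {
    final String name;
    int refCount = 0;
    final List<RcObject> fields = new ArrayList<>();

    RcObject(String name) { this.name = name; }

    /** Stores a reference to target in this object, so target gains one reference. */
    void addField(RcObject target) {
        fields.add(target);
        target.refCount++;
    }
}

final class RefCountDemo {
    /** Builds a <-> b, drops both external references, and returns the remaining counts. */
    static int[] countsAfterDroppingCycle() {
        RcObject a = new RcObject("a");
        RcObject b = new RcObject("b");
        a.refCount++;      // external reference held by a local variable
        b.refCount++;
        a.addField(b);     // a -> b
        b.addField(a);     // b -> a
        a.refCount--;      // the local variables are released
        b.refCount--;
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingCycle();
        // Neither count reaches 0, so a pure reference counter never frees the pair.
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Nothing outside the pair refers to a or b any more, yet each still holds one reference to the other, so both counts stay above zero.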

Reachability analysis: The search starts from the GC Roots and proceeds downward, and the paths it traverses are called reference chains. When an object is not connected to any GC Root by a reference chain, it is unreachable and is considered no longer usable.

In the Java language, GC Roots include:

  Objects referenced in the virtual machine stack.

  Objects referenced by static class fields in the method area.

  The object referenced by the constant in the method area.

  Objects referenced by JNI in the native method stack.
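Reachability analysis is essentially a graph traversal that starts at the roots. The following is a minimal sketch under simplifying assumptions: Node and Reachability are hypothetical classes, and a plain list of nodes stands in for the GC Roots.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy heap object holding outgoing references (hypothetical). */
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();
    Node(String name) { this.name = name; }
}

final class Reachability {
    /** Returns every node that can be reached from the roots by following references. */
    static Set<Node> reachableFrom(List<Node> roots) {
        // Identity-based set: objects are compared by reference, not by equals().
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (marked.add(n)) {          // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return marked;
    }
}
```

Two nodes that only reference each other never enter the result set unless some root leads to them, which is why this approach does not suffer from the circular reference problem.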

 

Garbage Collection Algorithms

Mark-sweep algorithm

The "mark-sweep" algorithm, as its name suggests, has two phases, "marking" and "sweeping": first, all objects that need to be reclaimed are marked, and once marking is complete, all marked objects are reclaimed together. It is regarded as the most basic collection algorithm because the later algorithms build on this idea and improve on its shortcomings.

It has two main disadvantages. One is efficiency: neither the marking nor the sweeping process is very efficient. The other is space: sweeping leaves behind a large number of discontiguous memory fragments. With too much fragmentation, the program may later fail to find enough contiguous memory for a large object and have to trigger another garbage collection early.
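The fragmentation problem is easy to see on a toy heap. In this sketch (MarkSweepDemo is a made-up class, and each array slot stands for one equally sized object), the sweep frees half of the heap, yet no two free slots are adjacent.

```java
final class MarkSweepDemo {
    /** Sweep: keeps only the allocated slots that were marked live. */
    static boolean[] sweep(boolean[] allocated, boolean[] marked) {
        boolean[] after = new boolean[allocated.length];
        for (int i = 0; i < allocated.length; i++) {
            after[i] = allocated[i] && marked[i];
        }
        return after;
    }

    /** Total number of free slots. */
    static int totalFree(boolean[] allocated) {
        int free = 0;
        for (boolean slot : allocated) {
            if (!slot) free++;
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(boolean[] allocated) {
        int best = 0;
        int run = 0;
        for (boolean slot : allocated) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }
}
```

If a fully allocated heap of 8 slots has live objects only at the even positions, totalFree reports 4 after the sweep but largestFreeRun reports 1, so an object needing 2 adjacent slots still cannot be placed.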


Copying algorithm

"Copying" is a collection algorithm that divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleared in a single pass.

Because a whole half is reclaimed each time, allocation never has to deal with fragmentation: it only moves the heap-top pointer and allocates memory sequentially, which is simple to implement and efficient to run. The cost is that usable memory is cut to half of the original, and repeatedly copying long-lived objects lowers efficiency.
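A rough sketch of the copy step, assuming a toy heap in which each array slot holds one object (SemiSpaceDemo and its parameters are invented for illustration): survivors are placed one after another in the empty half by advancing a single pointer, and the old half is emptied as it is scanned.

```java
final class SemiSpaceDemo {
    /**
     * Copies the live entries of fromSpace into toSpace with a bump pointer
     * and clears fromSpace. Returns the new top of toSpace.
     */
    static int copyLive(String[] fromSpace, boolean[] live, String[] toSpace) {
        int top = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && live[i]) {
                toSpace[top++] = fromSpace[i];   // survivors end up contiguous
            }
            fromSpace[i] = null;                 // the whole used half is released
        }
        return top;
    }
}
```

After the copy, new allocations simply continue from the returned top index, which is why allocation in this scheme needs no free list.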


Mark-compact algorithm

The copying algorithm performs many copy operations when the object survival rate is high, which lowers its efficiency. More importantly, unless 50% of the space is sacrificed, extra space is needed as an allocation guarantee for the extreme case in which every object in the used memory is still alive. For these reasons, the old generation generally cannot use this algorithm directly.

Based on the characteristics of the old generation, the "mark-compact" algorithm was proposed. Its marking phase is the same as in "mark-sweep", but instead of reclaiming the dead objects in place, it moves all surviving objects toward one end and then clears the memory beyond the boundary.
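The compaction step can be sketched on the same kind of toy heap, with one object per array slot. MarkCompactDemo is a hypothetical class, and a real collector would also have to update every reference to a moved object, which this sketch leaves out.

```java
final class MarkCompactDemo {
    /**
     * Slides the marked entries to the low end of the heap, in place, and
     * clears everything past the boundary. Returns the boundary index.
     */
    static int compact(String[] heap, boolean[] marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[boundary++] = heap[i];   // boundary never overtakes i
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null;                   // memory beyond the boundary is freed
        }
        return boundary;
    }
}
```

Unlike the copying algorithm, no second half of memory is needed, and all free space ends up as one contiguous block after the boundary.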


Generational Collection Algorithm

The basic assumption behind generational GC is that most objects have a very short lifetime.

The "generational collection" algorithm divides the Java heap into a young generation and an old generation, so that each can use the collection algorithm best suited to its characteristics. In the young generation, most objects die in every collection and only a few survive, so the copying algorithm is used: the collection completes at the small cost of copying the few survivors. In the old generation, objects have a high survival rate and there is no extra space to serve as an allocation guarantee, so it must be collected with the "mark-sweep" or "mark-compact" algorithm.
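How objects move from the young to the old generation can be sketched with an age counter. Everything here is an assumption made for illustration: GenerationalDemo, its Obj class, and the PROMOTION_AGE threshold of 2 are invented, and liveness is supplied by the caller instead of being computed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

final class GenerationalDemo {
    /** Toy object that remembers how many minor collections it has survived. */
    static final class Obj {
        final String name;
        int age = 0;
        Obj(String name) { this.name = name; }
    }

    static final int PROMOTION_AGE = 2;   // hypothetical threshold for this sketch

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    /** Minor collection: drops dead young objects, ages survivors, promotes old-enough ones. */
    void minorGc(Predicate<Obj> isLive) {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!isLive.test(o)) {
                it.remove();                        // dead: reclaimed
            } else if (++o.age >= PROMOTION_AGE) {
                it.remove();                        // long-lived: moved to the old generation
                old.add(o);
            }
        }
    }
}
```

Short-lived objects disappear in the first minor collection, while an object that keeps surviving is eventually moved out of the young generation so it is no longer copied on every collection.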

Garbage collectors

If collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation.

Serial collector

The Serial collector is the oldest and most stable collector. It is efficient for a single thread, but it uses only one thread for collection and may cause long pauses. Both the young and old generations are collected serially: the young generation uses the copying algorithm and the old generation uses mark-compact. Garbage collection stops the world (application threads are suspended).

Parameter control: -XX:+UseSerialGC enables the Serial collector


ParNew collector

The ParNew collector is essentially a multithreaded version of the Serial collector. The young generation is collected in parallel and the old generation serially; the young generation uses the copying algorithm and the old generation uses mark-compact.

Parameter control: -XX:+UseParNewGC enables the ParNew collector

-XX:ParallelGCThreads limits the number of GC threads


Parallel collector

The Parallel Scavenge collector is similar to ParNew, but it is more concerned with system throughput. An adaptive tuning policy can be turned on through a parameter: the virtual machine collects performance monitoring data about the running system and dynamically adjusts its settings to provide the most suitable pause time or the maximum throughput. Parameters can also cap GC time, either as a maximum pause in milliseconds or as a ratio of total run time. The young generation uses the copying algorithm and the old generation uses mark-compact.

Parameter control: -XX:+UseParallelGC uses the Parallel collector with a serial old generation
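Throughput here means application running time / (application running time + GC time), and the arithmetic is simple enough to write down. The sketch below uses an invented ThroughputDemo class. The second method reflects the HotSpot flag -XX:GCTimeRatio, which this article does not name: a value of n asks the collector to spend at most 1 / (1 + n) of total time on GC.

```java
final class ThroughputDemo {
    /** Throughput = application time / (application time + GC time). */
    static double throughput(double appMillis, double gcMillis) {
        return appMillis / (appMillis + gcMillis);
    }

    /** Largest share of total time GC may use for a given GCTimeRatio value n: 1 / (1 + n). */
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }
}
```

For example, 990 ms of application work with 10 ms of GC gives a throughput of 0.99, and a ratio value of 99 corresponds to a GC share of 1%.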

Parallel Old Collector

Parallel Old is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the "mark-compact" algorithm. This collector only became available in JDK 1.6.

Parameter control: -XX:+UseParallelOldGC uses the Parallel collector with a parallel old generation

CMS collector

The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause. A large share of Java applications today run on the server side of Internet sites or B/S (browser/server) systems. Such applications care most about service response time and want pauses to be as short as possible to give users a better experience.

As the name ("Mark Sweep") suggests, CMS is based on the "mark-sweep" algorithm. Its operation is more complex than that of the previous collectors and is divided into 4 steps:

Initial mark (CMS initial mark)

Concurrent mark (CMS concurrent mark)

Remark (CMS remark)

Concurrent sweep (CMS concurrent sweep)

The initial mark and remark steps still require "Stop The World". The initial mark only marks the objects directly reachable from the GC Roots, so it is very fast. The concurrent mark phase traces the object graph from the GC Roots. The remark phase corrects the marks of objects whose references changed because the user program kept running during concurrent marking; its pause is generally slightly longer than the initial mark but much shorter than concurrent marking.

Because the collector threads work alongside the user threads during concurrent mark and concurrent sweep, the two longest phases, the CMS collection process as a whole can be regarded as running concurrently with the user threads. CMS is an old generation collector (the young generation uses ParNew).
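The four steps and their pause behaviour can be summarized in a small table-like sketch. CmsPhases is an invented class used only to restate the description above; it is not an API of the JVM.

```java
final class CmsPhases {
    /** The four CMS steps, each tagged with whether it pauses application threads. */
    enum Phase {
        INITIAL_MARK(true),        // marks objects directly reachable from GC Roots
        CONCURRENT_MARK(false),    // traces the object graph alongside the application
        REMARK(true),              // corrects marks changed during concurrent marking
        CONCURRENT_SWEEP(false);   // reclaims dead objects alongside the application

        final boolean stopsTheWorld;

        Phase(boolean stopsTheWorld) { this.stopsTheWorld = stopsTheWorld; }
    }

    /** Number of phases that require a stop-the-world pause. */
    static long pausingPhaseCount() {
        long count = 0;
        for (Phase p : Phase.values()) {
            if (p.stopsTheWorld) count++;
        }
        return count;
    }
}
```

Only the two short marking steps pause the application; the two long steps run concurrently, which is where the low pause times come from.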

Advantages: concurrent collection, low pauses

Disadvantages: heavy space fragmentation; the concurrent phases reduce throughput

Parameter control: -XX:+UseConcMarkSweepGC uses the CMS collector

-XX:+UseCMSCompactAtFullCollection performs a defragmentation after a Full GC; defragmentation is exclusive (stop-the-world), so it lengthens pauses

-XX:CMSFullGCsBeforeCompaction sets how many Full GCs run before a defragmentation is performed

-XX:ParallelCMSThreads sets the number of CMS threads (generally equal to the number of available CPUs)


G1 collector

G1 is one of the most advanced results of current collector development. The mission the HotSpot team gave it is to eventually replace the CMS collector released in JDK 1.5. Compared with CMS, G1 has the following characteristics:

1. Space compaction. G1 uses the mark-compact algorithm, so it does not produce memory fragmentation. When a large object is allocated, the next GC is not triggered early for lack of contiguous space.

2. Predictable pauses. This is another major advantage of G1. Reducing pause time is a goal shared by G1 and CMS, but beyond pursuing low pauses, G1 can also build a predictable pause time model. It lets users state explicitly that within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection, which is almost a feature of real-time Java (RTSJ) garbage collectors.

The collectors described above collect the entire young generation or the entire old generation; G1 no longer works this way. With G1, the memory layout of the Java heap is very different from that of the other collectors: the whole heap is divided into multiple independent regions of equal size. The concepts of young and old generation are retained, but the two are no longer physically separated; each is a set of regions that need not be contiguous.


Young generation collection in G1 is similar to ParNew: a collection starts when young generation occupancy reaches a certain proportion. As with CMS, G1 pauses briefly when collecting old generation objects.

Collection steps:

1. Initial Mark. The marking phase starts with the initial mark, which is a stop-the-world event and is carried out as part of a normal Minor GC. The corresponding GC log entry is: GC pause (young) (initial-mark)

2. Root Region Scanning. While the application runs, the survivor regions are scanned for references into the old generation. This must complete before the next young GC can start.

3. Concurrent Marking. Marking runs over the entire heap concurrently with the application, and it may be interrupted by a young GC. If every object in a region is found to be garbage during this phase, the region is reclaimed immediately (marked X in the figure). The liveness of each region (the proportion of surviving objects in it) is also calculated during concurrent marking.


4. Remark. Marking is completed with a short stop-the-world pause. This phase accounts for the garbage newly produced during concurrent marking, which ran together with the application. G1 uses the snapshot-at-the-beginning (SATB) algorithm, which is faster than the approach used in CMS.

5. Copy/Cleanup. Dead objects are removed by multiple threads, with a stop-the-world pause. G1 copies the surviving objects of the regions being reclaimed into new regions, clears the Remembered Sets, and at the same time empties the reclaimed regions and returns them to the free region list.


6. After the copy/cleanup phase, the live objects of the collected regions have been compacted into the regions shown in dark blue and dark green in the figure.
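The per-region liveness computed during concurrent marking is what allows regions with the most garbage to be collected first within a pause target. The following is only a toy sketch of that idea, not G1's real selection logic: G1SelectionDemo, its Region class, and the assumption that copy cost is proportional to the live ratio are all invented for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class G1SelectionDemo {
    /** Toy region: an id plus the proportion of surviving objects (0.0 to 1.0). */
    static final class Region {
        final int id;
        final double liveRatio;
        Region(int id, double liveRatio) { this.id = id; this.liveRatio = liveRatio; }
    }

    /**
     * Picks the regions with the least live data first and stops when the
     * assumed copy cost would exceed the pause budget.
     * costPerFullRegionMs is a hypothetical cost for copying a fully live region.
     */
    static List<Integer> chooseCollectionSet(List<Region> regions, double pauseBudgetMs,
                                             double costPerFullRegionMs) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.liveRatio));
        List<Integer> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            double cost = r.liveRatio * costPerFullRegionMs;
            if (spent + cost > pauseBudgetMs) {
                break;                 // the next region would exceed the pause target
            }
            spent += cost;
            chosen.add(r.id);
        }
        return chosen;
    }
}
```

Regions that are mostly garbage are cheap to evacuate and free the most space, so under this cost assumption they are always chosen before regions that are mostly live.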


Common collector combinations

Combination 1
Young generation GC strategy: Serial
Old generation GC strategy: Serial Old
Description: Both Serial and Serial Old perform GC with a single thread, and all application threads are suspended during GC.

Combination 2
Young generation GC strategy: Serial
Old generation GC strategy: CMS + Serial Old
Description: CMS (Concurrent Mark Sweep) is a concurrent GC: GC threads work alongside application threads, so the application does not have to be suspended for the whole collection. When a CMS collection fails, the JVM automatically falls back to the Serial Old strategy.

Combination 3
Young generation GC strategy: ParNew
Old generation GC strategy: CMS
Description: Enabled with the -XX:+UseParNewGC option. ParNew is a parallel version of Serial. The number of GC threads defaults to the number of CPUs and can be set with the -XX:ParallelGCThreads option. If the -XX:+UseConcMarkSweepGC option is specified, the young generation uses the ParNew GC strategy by default.

Combination 4
Young generation GC strategy: ParNew
Old generation GC strategy: Serial Old
Description: Enabled with the -XX:+UseParNewGC option. The young generation uses the ParNew GC strategy, and the old generation uses the Serial Old GC strategy by default.

Combination 5
Young generation GC strategy: Parallel Scavenge
Old generation GC strategy: Serial Old
Description: The Parallel Scavenge strategy focuses on reaching a controllable throughput: application running time / (application running time + GC time). This keeps CPU utilization as high as possible, which suits applications that run continuously in the background; it is not suitable for highly interactive applications.

Combination 6
Young generation GC strategy: Parallel Scavenge
Old generation GC strategy: Parallel Old
Description: Parallel Old is a parallel version of Serial Old.

Combination 7
Young generation GC strategy: G1GC
Old generation GC strategy: G1GC
Description:
-XX:+UnlockExperimentalVMOptions -XX:+UseG1GC   # enable G1
-XX:MaxGCPauseMillis=50   # pause time target
-XX:GCPauseIntervalMillis=200   # pause interval target
-XX:G1YoungGenSize=512m   # young generation size
-XX:SurvivorRatio=6   # survivor space ratio
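To check which combination a running JVM has actually selected, the standard java.lang.management API can list the active collectors. This is a small sketch: CollectorInfo is a made-up class name, and the collector names it prints depend on the JVM version, vendor, and the options passed on the command line.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class CollectorInfo {
    /** Names of the collectors the running JVM reports through its platform MXBeans. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // One line per collector: name, number of collections so far, total time spent.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + gc.getCollectionTime() + " ms");
        }
    }
}
```

Running the class once with each option of interest, for example -XX:+UseSerialGC, usually reports one young generation and one old generation collector, which makes it easy to compare the output against the combinations above.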

 
