JVM garbage collection mechanism and collector

1 How to determine that an object is garbage?

      To perform garbage collection, the JVM must first know which objects are garbage.

1.1 Reference counting method

      The reference counting method works as follows: each object stored in the heap keeps a counter in its object header. Whenever a new reference to the object is created, the counter is incremented (counter++); whenever a reference is removed or becomes invalid, it is decremented (counter--). When the counter reaches 0, nothing refers to the object any more, so it is no longer alive and can be reclaimed.

      Disadvantage: "islands" (circular references) occur easily. If objects A and B hold references to each other, neither counter ever drops to 0, so neither object can ever be recycled, even when nothing else refers to them.
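The "island" problem can be reproduced with a toy model. The sketch below is only an illustration under assumptions: the `Obj` class, `addRef`, and `dropRef` are hypothetical helpers that simulate counters by hand, and they are not how any real JVM manages its heap.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting (hypothetical helper names, for illustration only).
class RefCountDemo {
    static class Obj {
        final String name;
        int counter = 0;                              // number of references pointing at this object
        final List<Obj> fields = new ArrayList<>();   // objects this object refers to
        Obj(String name) { this.name = name; }
    }

    // "from" starts referencing "to" (from == null models a local variable): counter++
    static void addRef(Obj from, Obj to) {
        if (from != null) from.fields.add(to);
        to.counter++;
    }

    // A reference to "to" goes away: counter--
    static void dropRef(Obj from, Obj to) {
        if (from != null) from.fields.remove(to);
        to.counter--;
    }

    // Returns the counters of A and B after the program has dropped its own references.
    static int[] islandCounters() {
        Obj a = new Obj("A");
        Obj b = new Obj("B");
        addRef(null, a);   // a local variable points at A
        addRef(null, b);   // a local variable points at B
        addRef(a, b);      // A holds a reference to B
        addRef(b, a);      // B holds a reference to A
        dropRef(null, a);  // the local variables go out of scope
        dropRef(null, b);
        return new int[] { a.counter, b.counter };
    }

    public static void main(String[] args) {
        int[] c = islandCounters();
        System.out.println("A=" + c[0] + ", B=" + c[1]);   // both stay at 1, never 0
    }
}
```

Both counters end at 1 although the program can no longer reach A or B, which is exactly why a pure reference counter would never reclaim them.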


1.2 Reachability analysis

      Starting from a set of objects called GC Roots, the collector follows references downward to check whether each object is reachable.

      The following can serve as GC Roots: class loaders, threads, the local variable table of a virtual machine stack frame, static members, constant references, variables in the native method stack, and so on.

     Taking the GC Roots as starting nodes, the search proceeds downward along the reference chains. An object that is not connected to any GC Root by a reference chain is marked as recyclable. For example, even if Object5, Object6, and Object7 hold references to each other, they cannot be reached from a GC Root, so they are judged to be recyclable.
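The search can be sketched as a plain graph traversal. This is a minimal illustration, not the JVM's implementation: object names are strings, and the edges among `GCRoot` and Object1 to Object4 are assumptions made up for the example. Only the cycle between Object5, Object6, and Object7 comes from the text.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ReachabilityDemo {
    // Breadth-first search from the GC Roots over a "who references whom" map.
    static Set<String> reachable(List<String> gcRoots, Map<String, List<String>> refs) {
        Set<String> seen = new HashSet<>(gcRoots);
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String current = pending.poll();
            for (String next : refs.getOrDefault(current, Collections.<String>emptyList())) {
                if (seen.add(next)) pending.add(next);   // visit each object only once
            }
        }
        return seen;
    }

    // Builds the sample graph and returns the objects that no GC Root can reach.
    static Set<String> garbage() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("GCRoot", Arrays.asList("Object1"));              // assumed edges
        refs.put("Object1", Arrays.asList("Object2", "Object3"));
        refs.put("Object2", Arrays.asList("Object4"));
        refs.put("Object5", Arrays.asList("Object6"));             // the isolated cycle
        refs.put("Object6", Arrays.asList("Object7"));
        refs.put("Object7", Arrays.asList("Object5"));

        Set<String> live = reachable(Arrays.asList("GCRoot"), refs);
        Set<String> garbage = new TreeSet<>(refs.keySet());        // sorted for stable output
        garbage.removeAll(live);
        return garbage;
    }

    public static void main(String[] args) {
        System.out.println(garbage());   // [Object5, Object6, Object7]
    }
}
```

Unlike reference counting, the traversal never enters the cycle, so the three objects are reported as recyclable.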

2 Garbage collection algorithm

     Once we can determine that an object is garbage, the next question is how to reclaim it.

2.1 Mark-Sweep

  • Mark

      Find the objects in memory that need to be recycled and mark them.

  • Sweep

      Remove the objects marked for recycling and release the memory they occupy.

**Disadvantages:**

  1. Both the marking and the sweeping phases are time-consuming, so the algorithm is inefficient.
  2. It leaves a large number of non-contiguous memory fragments. With too much fragmentation, the program may fail to find enough contiguous memory when it later needs to allocate a large object, which forces another garbage collection to be triggered early.
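The fragmentation problem is easy to see in a toy heap. In the sketch below, the heap is an array of one-character slots and the mark result is simply given as input; both are assumptions made for illustration, not the layout of a real JVM heap.

```java
class MarkSweepDemo {
    static final char FREE = '.';

    // Sweep phase: release every slot whose object was marked for recycling.
    // Surviving objects stay exactly where they are.
    static char[] sweep(char[] heap, boolean[] markedForRecycling) {
        char[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (markedForRecycling[i]) after[i] = FREE;
        }
        return after;
    }

    // Length of the longest run of contiguous free slots.
    static int largestFreeRun(char[] heap) {
        int best = 0;
        int run = 0;
        for (char slot : heap) {
            run = (slot == FREE) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    static String demo() {
        char[] heap = "ABCDEFGH".toCharArray();
        // Assumed result of the mark phase: B, D, F and H need to be recycled.
        boolean[] marked = { false, true, false, true, false, true, false, true };
        return new String(sweep(heap, marked));
    }

    public static void main(String[] args) {
        String after = demo();
        System.out.println(after);                                 // A.C.E.G.
        System.out.println(largestFreeRun(after.toCharArray()));   // 1
    }
}
```

Four slots are free after the sweep, but the largest contiguous run is a single slot, so an object that needs two adjacent slots still cannot be allocated.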

2.2 Copying

      Divide the memory into two areas of equal size and use only one of them at a time.

      When the area in use is full, the surviving objects are copied to the other area, and the used area is then cleared all at once.

**Disadvantages:** Space utilization is reduced, because only half of the memory is usable at any time.
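A minimal sketch of the copying step, assuming a toy heap of one-character slots and a liveness array that is given as input (both hypothetical):

```java
import java.util.Arrays;

class CopyingDemo {
    static final char FREE = '.';

    // Copies the surviving objects from the area in use into the empty area
    // of the same size, packing them together at the start.
    static char[] copyLive(char[] fromSpace, boolean[] live) {
        char[] toSpace = new char[fromSpace.length];
        Arrays.fill(toSpace, FREE);
        int next = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (live[i]) toSpace[next++] = fromSpace[i];
        }
        return toSpace;
    }

    static String demo() {
        char[] fromSpace = "ABCD".toCharArray();
        boolean[] live = { true, false, true, false };   // assumed: A and C survive
        char[] toSpace = copyLive(fromSpace, live);
        Arrays.fill(fromSpace, FREE);                    // the used area is cleared all at once
        return new String(toSpace);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // AC..
    }
}
```

The survivors end up contiguous, so there is no fragmentation, but the two areas together occupy eight slots while only four are ever available for allocation.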

2.3 Mark-Compact

      The marking phase is the same as in the "mark-sweep" algorithm, but the next step does not clean up the recyclable objects in place. Instead, all surviving objects are moved toward one end of the memory, and everything beyond the boundary of the surviving objects is then cleared.
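The move-then-clear step can be sketched on a toy heap of one-character slots. The slot model and the liveness array are assumptions for illustration only.

```java
class MarkCompactDemo {
    static final char FREE = '.';

    // Slides every surviving object toward the start of the heap,
    // then clears everything beyond the boundary of the survivors.
    static char[] compact(char[] heap, boolean[] live) {
        char[] after = heap.clone();
        int boundary = 0;
        for (int i = 0; i < after.length; i++) {
            if (live[i]) after[boundary++] = after[i];   // boundary <= i, so nothing live is overwritten
        }
        for (int i = boundary; i < after.length; i++) {
            after[i] = FREE;
        }
        return after;
    }

    static String demo() {
        char[] heap = "ABCDEFGH".toCharArray();
        // Assumed result of the mark phase: A, C, E and G survive.
        boolean[] live = { true, false, true, false, true, false, true, false };
        return new String(compact(heap, live));
    }

    public static void main(String[] args) {
        System.out.println(demo());   // ACEG....
    }
}
```

All four free slots are now contiguous, at the cost of moving objects, and no second memory area is needed.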

2.4 Generational collection algorithm

  • Young area: copying algorithm (most objects have a short life cycle after allocation, so few survive and copying is efficient in the Young area)
  • Old area: mark-sweep or mark-compact (objects in the Old area live longer, so copying them is not worthwhile; it is better to mark and then clean up)

3 Garbage collector

     If collection algorithms are the methodology of garbage collection, then garbage collectors are their concrete implementations.

3.1 Serial collector

A young-generation collector, and the only choice for young-generation collection in early JDK versions.

It is a single-threaded collector, and all other threads are suspended while it collects garbage.

Advantages: simple, and efficient for single-threaded collection

Disadvantages: all other threads must be suspended

Algorithm: copying algorithm

Scope of application: young generation

Application: the default young-generation collector in Client mode

3.2 ParNew collector

A young-generation collector that can be understood as a multi-threaded version of the Serial collector.

Advantages: with multiple CPUs, it is more efficient than Serial.

Disadvantages: the collection process still suspends all application threads, and on a single CPU it is less efficient than Serial.

Algorithm: copying algorithm

Scope of application: young generation

Application: the preferred young-generation collector for virtual machines running in Server mode

3.3 Parallel Scavenge collector

      The Parallel Scavenge collector is also a young-generation collector that uses the copying algorithm and collects with multiple threads in parallel. It looks the same as ParNew, but Parallel Scavenge focuses on the throughput of the system.

      Throughput = time spent running user code / (time spent running user code + garbage collection time)

      For example, if the virtual machine runs for a total of 100 minutes and garbage collection takes 1 minute of that, the throughput is (100 - 1) / 100 = 99%. Higher throughput means that a smaller share of the time goes to garbage collection, so user code can make fuller use of CPU resources and finish the program's computational tasks sooner.
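The arithmetic can be checked with a few lines of Java. The method names are hypothetical. The second method relies on the documented meaning of the `-XX:GCTimeRatio=n` option listed in this section: it targets a ratio of garbage collection time to application time of 1 / (1 + n).

```java
class ThroughputDemo {
    // Throughput = user-code time / (user-code time + garbage collection time)
    static double throughput(double userCodeMinutes, double gcMinutes) {
        return userCodeMinutes / (userCodeMinutes + gcMinutes);
    }

    // GC time relative to application time targeted by -XX:GCTimeRatio=n, i.e. 1 / (1 + n)
    static double gcShareForTimeRatio(int n) {
        return 1.0 / (1 + n);
    }

    public static void main(String[] args) {
        System.out.println(throughput(99, 1));         // 99 minutes of user code, 1 minute of GC
        System.out.println(gcShareForTimeRatio(99));   // 1% target with n = 99
    }
}
```

With 99 minutes of user code and 1 minute of collection the result is 0.99, matching the 99% in the example, and `GCTimeRatio=99` corresponds to a 1% garbage collection target.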

-XX:MaxGCPauseMillis controls the maximum garbage collection pause time
-XX:GCTimeRatio directly sets the throughput target

3.4 Serial Old Collector

       The Serial Old collector is the old-generation version of the Serial collector. It is also single-threaded; the difference is that it uses the "mark-compact" algorithm. Its operation is otherwise the same as that of the Serial collector.

3.5 Parallel Old collector

      The Parallel Old collector is the old-generation version of the Parallel Scavenge collector. It uses multiple threads and the "mark-compact" algorithm for garbage collection, and it prioritizes throughput.

3.6 CMS collector

       The CMS (Concurrent Mark Sweep) collector aims for the shortest possible collection pause time.

       It uses the "mark-sweep" algorithm, and the whole process is divided into 4 steps:

  1. Initial mark (CMS initial mark): marks the objects directly associated with GC Roots. Stop The World, but fast.
  2. Concurrent mark (CMS concurrent mark): performs GC Roots tracing while user threads keep running.
  3. Remark (CMS remark): corrects the marks for objects that changed because the user program kept running during concurrent marking. Stop The World.
  4. Concurrent sweep (CMS concurrent sweep): clears the garbage objects while user threads keep running.

      Because the collector thread works alongside the user threads during concurrent marking and concurrent sweeping, the memory reclamation of the CMS collector is, on the whole, executed concurrently with the user threads.

Advantages: concurrent collection, low pauses

Disadvantages: the mark-sweep algorithm produces a lot of space fragmentation, and the concurrent phases reduce throughput

3.7 G1 collector

      Parallelism and concurrency

      Generational collection (the concept of generations is still retained)

      Space compaction (as a whole it is based on the "mark-compact" algorithm and does not cause space fragmentation)

      Predictable pauses (an advance over CMS: users can explicitly specify that, within a time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection)

      When the G1 collector is used, the memory layout of the Java heap is very different from that of other collectors. G1 divides the entire Java heap into multiple independent regions (Region) of equal size. The concepts of young generation and old generation are still retained, but the two are no longer physically separated; each is a collection of regions that need not be contiguous.

Working process (similar to CMS):

  1. Initial Marking: marks the objects directly associated with GC Roots and modifies the value of TAMS; user threads must be suspended.
  2. Concurrent Marking: performs reachability analysis from GC Roots to find the surviving objects; runs concurrently with user threads.
  3. Final Marking: corrects the marks that changed because the user program kept running during concurrent marking; user threads must be suspended.
  4. Live Data Counting and Evacuation (screening and recycling): ranks the Regions by reclamation value and cost, and draws up a collection plan according to the GC pause time the user expects.
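The idea behind the last step, ranking Regions and fitting a plan into the expected pause time, can be sketched with a simple greedy selection. This is only an illustration under assumptions: the `Region` fields, the sample numbers, and the greedy rule are made up here, and the real G1 heuristics are considerably more elaborate.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class RegionSelectionDemo {
    static class Region {
        final String name;
        final int garbageMb;   // memory that collecting this region would reclaim (assumed known)
        final int costMs;      // estimated pause time to collect it (assumed known)
        Region(String name, int garbageMb, int costMs) {
            this.name = name;
            this.garbageMb = garbageMb;
            this.costMs = costMs;
        }
        double value() { return (double) garbageMb / costMs; }   // reclaimed MB per millisecond
    }

    // Greedy plan: most valuable regions first, as long as the pause budget is not exceeded.
    static List<String> plan(List<Region> regions, int pauseBudgetMs) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.value()).reversed());
        List<String> chosen = new ArrayList<>();
        int spentMs = 0;
        for (Region r : sorted) {
            if (spentMs + r.costMs <= pauseBudgetMs) {
                chosen.add(r.name);
                spentMs += r.costMs;
            }
        }
        return chosen;
    }

    static List<String> demo() {
        List<Region> regions = Arrays.asList(
                new Region("R1", 8, 4),
                new Region("R2", 6, 2),
                new Region("R3", 1, 5),
                new Region("R4", 9, 6));
        return plan(regions, 10);   // expected pause time: 10 ms
    }

    public static void main(String[] args) {
        System.out.println(demo());   // [R2, R1]
    }
}
```

With a 10 ms budget the plan takes R2 and R1 (6 ms in total) and leaves R4 and R3 for a later collection, because either one would push the pause past the budget.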

3.8 Classification

  • Serial collectors -> Serial and Serial Old

Only one garbage collection thread runs, and user threads are suspended while it does. Suitable for embedded devices with relatively little memory.

  • Parallel collectors [throughput priority] -> Parallel Scavenge, Parallel Old

Multiple garbage collection threads work in parallel, but user threads still wait. Suitable for scenarios with little user interaction, such as scientific computing and background processing.

  • Concurrent collectors [pause time priority] -> CMS, G1

User threads and garbage collection threads execute at the same time (not necessarily in parallel; they may alternate), so the garbage collection threads do not suspend user threads for most of the collection. Suitable for scenarios with response-time requirements, such as Web applications.

3.9 How to choose a suitable garbage collector

Official website

[https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/collectors.html#sthref28](https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/collectors.html#sthref28)

  • First adjust the heap size and let the JVM choose the collector by itself
  • If the memory is less than 100 MB, use the serial collector
  • If there is a single core and no pause time requirement, use the serial collector or let the JVM choose
  • If pauses of more than 1 second are acceptable, use the parallel collector or let the JVM choose
  • If response time matters most and pauses must not exceed 1 second, use a concurrent collector
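Read as a decision procedure, the guidelines above might look like the sketch below. `CollectorChooser` and its parameters are hypothetical, the first guideline (tune the heap and let the JVM choose) is deliberately not modeled, and the returned strings are the standard HotSpot options for each collector family.

```java
class CollectorChooser {
    // Hypothetical helper that encodes the rules of thumb above.
    // pausesOverOneSecondOk == true also covers "no pause time requirement".
    static String suggest(int heapMb, int cpuCores, boolean pausesOverOneSecondOk) {
        if (heapMb < 100) {
            return "-XX:+UseSerialGC";                         // small memory: serial collector
        }
        if (cpuCores == 1 && pausesOverOneSecondOk) {
            return "-XX:+UseSerialGC";                         // single core, no pause requirement
        }
        if (pausesOverOneSecondOk) {
            return "-XX:+UseParallelGC";                       // long pauses acceptable: parallel collector
        }
        return "-XX:+UseConcMarkSweepGC or -XX:+UseG1GC";      // response time first: concurrent collector
    }

    public static void main(String[] args) {
        System.out.println(suggest(64, 4, false));
        System.out.println(suggest(4096, 8, true));
        System.out.println(suggest(4096, 8, false));
    }
}
```

Treat the output as a starting point for measurement rather than a final answer; the official guide linked above recommends letting the JVM choose first.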

**How to enable:**

(1) Serial

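-XX:+UseSerialGC

-XX:+UseSerialOldGC (note: this may not be recognized by HotSpot, where -XX:+UseSerialGC already selects Serial Old for the old generation)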

(2) Parallel (throughput priority):

-XX:+UseParallelGC

-XX:+UseParallelOldGC

(3) Concurrent collector (response time priority)

-XX:+UseConcMarkSweepGC

-XX:+UseG1GC
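To confirm which collectors a flag actually enabled, one option is to ask the running JVM through the standard `java.lang.management` API. This is a minimal sketch; the class name `ActiveCollectors` is made up, and the reported names depend on the JVM and its version.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the garbage collectors registered in the running JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(names());
    }
}
```

Running it as, for example, `java -XX:+UseG1GC ActiveCollectors` prints one name for the young-generation collector and one for the old-generation collector in use.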

 


Origin blog.csdn.net/yuandengta/article/details/109188592