A quick primer on the JVM Garbage Collector


  1. Algorithms for deciding which objects can be reclaimed
  • Reference counting

    Each object carries a reference counter. The counter is incremented when a new reference to the object is created and decremented when a reference becomes invalid. An object whose counter reaches 0 can be reclaimed. The drawback of reference counting is that it cannot handle circular references: two objects that only reference each other never reach a count of 0. Circular references are very common in Java, so Java garbage collectors generally do not use reference counting.
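The circular-reference problem can be reproduced with a toy model. The sketch below is not how any JVM manages memory; `RefCountDemo`, `Node`, `addRef`, `addRootRef`, and `dropRootRef` are hypothetical names used only to illustrate the counting rule described above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting (illustration only, all names hypothetical).
class RefCountDemo {
    static final class Node {
        final String name;
        int refCount = 0;                              // number of incoming references
        final List<Node> fields = new ArrayList<>();   // outgoing references
        Node(String name) { this.name = name; }
    }

    // holder starts referencing target, so target's counter goes up by one.
    static void addRef(Node holder, Node target) {
        holder.fields.add(target);
        target.refCount++;
    }

    // A root (for example a local variable) starts pointing at the node.
    static void addRootRef(Node target) { target.refCount++; }

    // That root reference becomes invalid.
    static void dropRootRef(Node target) { target.refCount--; }

    // Builds a two-node cycle, drops both root references,
    // and returns the counters that are left over.
    static int[] cycleCountsAfterRootsDropped() {
        Node a = new Node("a");
        Node b = new Node("b");
        addRootRef(a);
        addRootRef(b);
        addRef(a, b);   // a -> b
        addRef(b, a);   // b -> a
        dropRootRef(a);
        dropRootRef(b);
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCountsAfterRootsDropped();
        // Neither counter is 0, so a pure reference counter would never free a or b.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);  // prints a=1, b=1
    }
}
```

Both nodes are unreachable from the program at the end, yet each still has a count of 1 because of the other.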

  • Reachability analysis

    Starting from a set of GC Roots, the collector traverses the reference graph and marks every object that can be reached from the roots. Objects that cannot be reached are considered garbage and are reclaimed.

    • GC Roots include the following:

      • System Class: classes loaded by the system class loader

      • Native Stack: objects referenced from JNI (Java Native Interface) code in the native method stack

      • Thread: objects referenced from the virtual machine stack (the local variable table in each stack frame) of live threads

      • Busy Monitor: objects that are currently held as locks

      • Objects referenced by class static properties in the method area

      • Objects referenced by constants in the method area
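A minimal sketch of reachability analysis, assuming the heap is modelled as a map from object name to the names it references. `ReachabilityDemo`, `mark`, `garbage`, and the sample graph are hypothetical; a real collector walks actual object headers and fields, not strings.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy reachability analysis over a named object graph (illustration only).
class ReachabilityDemo {
    // Returns every object reachable from the given roots.
    static Set<String> mark(Map<String, List<String>> graph, List<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!marked.add(current)) {
                continue;   // already visited, which also makes cycles harmless
            }
            for (String next : graph.getOrDefault(current, List.of())) {
                pending.push(next);
            }
        }
        return marked;
    }

    // Everything in the graph that was not marked is garbage.
    static Set<String> garbage(Map<String, List<String>> graph, List<String> roots) {
        Set<String> unreachable = new TreeSet<>(graph.keySet());
        unreachable.removeAll(mark(graph, roots));
        return unreachable;
    }

    // stackVar -> a -> b is reachable; c and d only reference each other.
    static Map<String, List<String>> sampleGraph() {
        return Map.of(
            "stackVar", List.of("a"),
            "a", List.of("b"),
            "b", List.of(),
            "c", List.of("d"),
            "d", List.of("c"));
    }

    public static void main(String[] args) {
        System.out.println(garbage(sampleGraph(), List.of("stackVar")));  // prints [c, d]
    }
}
```

Unlike reference counting, the cycle between `c` and `d` is found to be garbage because neither object is reachable from a root.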

    • The four reference types

      • Strong reference: the most common type of reference. If no other reference type is explicitly specified, a strong reference is used by default. Strong references apply to any object in Java, including strings, arrays, and instances of custom classes. As long as an object can be reached from a GC Root through a chain of strong references, it will not be garbage collected.
String str = "Hello World!";

In this example, the variable str is a reference to a string object created from a string literal. It is a strong reference, the default reference type, so the object cannot be reclaimed by the garbage collector until the reference is cleared, for example by setting str to null.

  • Soft references: a soft reference does not keep its referent reachable from GC Roots. If an object has only soft references and no strong references, it is reclaimed when memory is about to run out, which makes soft references a good fit for memory-sensitive caches.
    Soft references are created with the SoftReference class.
SoftReference<String> softRef = new SoftReference<>("str");
String str = softRef.get();
  • Weak references: a weak reference does not keep its referent reachable from GC Roots. When the garbage collector finds an object that is referenced only by weak references, it reclaims the object regardless of whether memory is sufficient.
    Weak references are created with the WeakReference class.
WeakReference<String> weakRef = new WeakReference<>("str");
String str = weakRef.get();
  • Phantom references: the weakest of all reference types. A phantom reference cannot be used on its own; it must be used together with a reference queue, and its get() method always returns null. A typical use is releasing the native memory behind a direct ByteBuffer. When an object is referenced only by a phantom reference, the garbage collector puts the phantom reference into the associated reference queue as part of reclaiming the object, so the program can learn that the object has been collected by polling the queue.
    • Finalizer reference: an internal reference type used to track whether an object has been finalized. When the garbage collector is about to reclaim an object whose finalize() method has not been called yet, the finalizer reference is added to a global reference queue, where it waits for the Finalizer thread to call the object's finalize() method. Because the Finalizer thread is scheduled by the JVM itself, the time at which finalize() runs is unpredictable, which may affect the performance and reliability of the program.
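The following sketch illustrates the parts of the reference types that are deterministic: soft and weak references return their referent while it is still strongly reachable, and `PhantomReference.get()` always returns null. `ReferenceTypesDemo` and its method names are hypothetical. Whether a weak referent is actually cleared after `System.gc()` depends on the JVM, so `main` prints the result without assuming it.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Demonstrates the java.lang.ref reference classes (class and method names are hypothetical).
class ReferenceTypesDemo {
    // While a strong reference exists, soft and weak references still see the object.
    static boolean softAndWeakSeeLiveObject() {
        Object payload = new Object();                       // strong reference
        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        return soft.get() == payload && weak.get() == payload;
    }

    // A phantom reference never hands out its referent; it is only useful via its queue.
    static boolean phantomGetIsNull() {
        Object payload = new Object();
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);
        return phantom.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("soft/weak see live object: " + softAndWeakSeeLiveObject());
        System.out.println("phantom.get() is null: " + phantomGetIsNull());

        Object payload = new Object();
        WeakReference<Object> weak = new WeakReference<>(payload);
        payload = null;   // drop the only strong reference
        System.gc();      // a request only; the JVM may or may not collect now
        System.out.println("weak referent after GC request: " + weak.get());
    }
}
```

Note that the demo uses `new Object()` rather than a string literal: literals are also referenced from the string pool, so they are a poor way to observe collection.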

Finalizer method finalize(): In Java, every object inherits a finalize() method from Object, which the JVM calls at most once before the object is reclaimed by the garbage collector. It was mainly intended for cleanup work before the object is destroyed, such as closing files or releasing resources. However, finalize() has several problems: its execution time is not guaranteed, and deadlocks cannot be ruled out. finalize() has therefore been deprecated since Java 9, and the Cleaner API (java.lang.ref.Cleaner) is recommended instead. Finalizer references are used to manage these finalizers.
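A minimal sketch of the Cleaner pattern mentioned above, assuming Java 9 or later. `CleanerDemo`, `Resource`, `State`, and the `released` counter are hypothetical names; the counter merely stands in for real cleanup work such as closing a file handle.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of resource cleanup with java.lang.ref.Cleaner instead of finalize().
class CleanerDemo {
    private static final Cleaner CLEANER = Cleaner.create();
    static final AtomicInteger released = new AtomicInteger();   // stand-in for real cleanup

    static final class Resource implements AutoCloseable {
        // The cleanup action must not reference the Resource itself,
        // otherwise the Resource could never become unreachable.
        private static final class State implements Runnable {
            @Override
            public void run() {
                released.incrementAndGet();
            }
        }

        private final Cleaner.Cleanable cleanable;

        Resource() {
            this.cleanable = CLEANER.register(this, new State());
        }

        // Explicit cleanup; the Cleaner only acts as a safety net if close() is forgotten.
        @Override
        public void close() {
            cleanable.clean();   // runs the action at most once
        }
    }

    // Closes a resource twice and reports how often the cleanup action ran.
    static int closeTwice() {
        released.set(0);
        Resource resource = new Resource();
        resource.close();
        resource.close();
        return released.get();
    }

    public static void main(String[] args) {
        System.out.println("cleanup runs: " + closeTwice());   // prints cleanup runs: 1
    }
}
```

Calling `close()` explicitly (or using try-with-resources) keeps cleanup deterministic, which is exactly what finalize() could not guarantee.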

  2. Garbage collection algorithms
  • Mark-sweep algorithm: first mark all surviving objects, then sweep away all unmarked objects and reclaim the memory space they occupied.

    Disadvantages: memory fragmentation cannot be avoided; marking and sweeping can affect the performance of the application.

  • Mark-compact algorithm: first mark all surviving objects, then move them to one end of the memory, and finally reclaim all the space beyond the last surviving object.

    Advantages: can avoid memory fragmentation

    Disadvantages: The location of surviving objects needs to be moved, which affects the performance of the application.

  • Copying algorithm: copy the surviving objects to another memory area, then clear the whole old memory area at once, which leaves a contiguous block of free memory.

    Advantages: can avoid memory fragmentation

    Disadvantages: additional memory space is required to hold the copied objects, and copying adds to the time overhead of GC.
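The three algorithms above can be compared on a toy heap. In this sketch the heap is an array of object names where null means a free slot, and the set of live objects is assumed to be already known from the marking step. `ToyCollectors` and its methods are hypothetical and only illustrate the fragmentation behaviour.

```java
import java.util.Arrays;
import java.util.Set;

// Toy versions of mark-sweep, mark-compact, and copying on an array heap (illustration only).
class ToyCollectors {
    // Mark-sweep: free dead objects in place; live objects stay where they are.
    static void markSweep(String[] heap, Set<String> live) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !live.contains(heap[i])) {
                heap[i] = null;
            }
        }
    }

    // Mark-compact: slide live objects to the front of the same heap.
    static void markCompact(String[] heap, Set<String> live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && live.contains(obj)) {
                heap[next++] = obj;   // next <= i, so nothing live is overwritten
            }
        }
        Arrays.fill(heap, next, heap.length, null);
    }

    // Copying: move live objects into a second space, then free the first one entirely.
    static String[] copy(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        Arrays.fill(fromSpace, null);
        return toSpace;
    }

    // Length of the largest contiguous run of free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        Set<String> live = Set.of("a", "c", "e");
        String[] swept = {"a", "b", "c", "d", "e", "f"};
        String[] compacted = swept.clone();
        String[] fromSpace = swept.clone();

        markSweep(swept, live);
        markCompact(compacted, live);
        String[] toSpace = copy(fromSpace, live);

        // prints sweep=1 compact=3 copy=3
        System.out.println("sweep=" + largestFreeRun(swept)
            + " compact=" + largestFreeRun(compacted)
            + " copy=" + largestFreeRun(toSpace));
    }
}
```

All three runs free the same three slots, but only mark-sweep leaves them scattered, so an allocation needing two adjacent slots would fail there. The copying variant needed a second array of the same size, which is the extra-space cost noted above.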

  • Generational collection algorithm: based on the generational hypothesis (most objects die young, while objects that have already survived several collections tend to live much longer), the memory is divided into two or more spaces, usually called the young generation (also called the new generation) and the old generation. Objects in the young generation have a short life cycle, so the copying algorithm is used; objects in the old generation have a long life cycle, so the mark-compact algorithm is used.

    Young generation: usually consists of three parts: one Eden space and two Survivor spaces.

    Old generation: stores objects with a long life cycle, which are promoted from the young generation. Because these objects are long-lived, the mark-sweep or mark-compact algorithm is used to collect them. During a young-generation collection, most of the old generation does not need to be scanned; only the parts that may contain references into the young generation are examined.

    • At the beginning, Eden Space is empty, and as new objects are created, objects are stored in Eden Space.

    • When Eden Space fills up for the first time, a Minor GC is triggered. All surviving objects in Eden Space are identified and copied to an empty Survivor space.

    • On later Minor GCs, triggered when Eden Space runs out of memory again, the surviving objects in Eden Space and in the from Survivor space are copied to the to Survivor space, and the two Survivor spaces then swap roles.

    • Objects in the young generation that are still alive after multiple GCs are promoted to the old generation.

    • If, during a Minor GC, the Survivor space cannot hold all the surviving objects, the objects that do not fit are promoted to the old generation directly.

    • When there is not enough space in the old generation, the JVM first tries a Minor GC; if that can no longer free enough memory, it triggers a Full GC.
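The Eden, Survivor, and promotion flow described above can be sketched as a small simulation. `ToyGenerationalHeap`, `tenureThreshold`, and `survivorCapacity` are hypothetical names and values chosen for the example; in HotSpot the promotion age is governed by settings such as -XX:MaxTenuringThreshold, and liveness comes from reachability analysis rather than a supplied set of names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Toy simulation of Minor GC in a generational heap (illustration only).
class ToyGenerationalHeap {
    static final class Obj {
        final String name;
        int age;   // number of Minor GCs survived
        Obj(String name) { this.name = name; }
    }

    final int tenureThreshold;    // age at which an object is promoted (assumed value)
    final int survivorCapacity;   // how many objects one Survivor space holds (assumed value)
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    ToyGenerationalHeap(int tenureThreshold, int survivorCapacity) {
        this.tenureThreshold = tenureThreshold;
        this.survivorCapacity = survivorCapacity;
    }

    void allocate(String name) {
        eden.add(new Obj(name));   // new objects always start in Eden
    }

    // Copies live objects from Eden and the from space into the to space,
    // promoting objects that are old enough or that do not fit.
    void minorGc(Set<String> liveNames) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj obj : candidates) {
            if (!liveNames.contains(obj.name)) {
                continue;   // dead objects are simply not copied
            }
            obj.age++;
            if (obj.age >= tenureThreshold || to.size() >= survivorCapacity) {
                old.add(obj);
            } else {
                to.add(obj);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;   // swap the roles of the two Survivor spaces
        from = to;
        to = tmp;
    }

    static List<String> names(List<Obj> space) {
        List<String> result = new ArrayList<>();
        for (Obj obj : space) {
            result.add(obj.name);
        }
        return result;
    }

    static ToyGenerationalHeap demo() {
        ToyGenerationalHeap heap = new ToyGenerationalHeap(2, 1);
        heap.allocate("a");
        heap.allocate("b");
        heap.allocate("c");
        heap.minorGc(Set.of("a", "b"));   // c dies, a goes to Survivor, b overflows into old
        heap.allocate("d");
        heap.minorGc(Set.of("a", "d"));   // d goes to Survivor, a reaches age 2 and is promoted
        return heap;
    }

    public static void main(String[] args) {
        ToyGenerationalHeap heap = demo();
        // prints survivor=[d] old=[b, a]
        System.out.println("survivor=" + names(heap.from) + " old=" + names(heap.old));
    }
}
```

The run shows both promotion paths from the list above: `b` is promoted because the Survivor space is full, and `a` because it survived enough collections.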

  3. Garbage collection types
  • Full GC: mainly reclaims unreachable objects in the old generation. Objects in the old generation have a relatively long life cycle, so Full GC runs relatively rarely, but it suspends all application threads (Stop The World, STW) and takes longer to complete, so it has a noticeable impact on system performance. During a Full GC, the virtual machine collects the entire heap, including the young generation and the old generation, in order to reclaim as much memory as possible.

    How Full GC is triggered:

    • When both the young generation and the old generation are full, a Full GC is triggered to collect the whole heap.

    • When CMS GC cannot cope with memory fragmentation and the configured GC trigger condition is reached, a Full GC is triggered to clean up.

    • When the Perm area (the permanent generation, replaced by Metaspace in Java 8) does not have enough space for new data, the JVM triggers a Full GC to reclaim garbage there and release memory.

    • When System.gc() is called, the JVM is asked to run a Full GC. This is a request rather than a guarantee; in HotSpot it can be disabled with -XX:+DisableExplicitGC.

  • Minor GC: mainly reclaims unreachable objects in the young generation. Because objects in the young generation are short-lived, Minor GC runs relatively often, but each run is usually short and generally has little impact on system performance. A Minor GC also suspends all application threads (Stop The World, STW) while it reclaims memory.

  • Mixed GC: a collection type between Minor GC and Full GC that is specific to the G1 garbage collector. A Mixed GC collects the whole young generation together with a selected part of the old generation.

  • System GC: a garbage collection requested by the application itself, by calling the System.gc() method. Such a request usually triggers a Full GC, that is, garbage collection of the entire heap memory.
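To observe these collection types on a running JVM, the standard `java.lang.management` API exposes one `GarbageCollectorMXBean` per collector, typically one for young-generation collections and one for old-generation or full collections. The sketch below requests a System GC and prints the counters; `GcCounters` and `totalCollections` are hypothetical names, and the bean names and counts printed depend on the JVM and its flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Reads GC counters before and after a System.gc() request (names are hypothetical).
class GcCounters {
    // Sum of the collection counts of all collectors the JVM exposes.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if the count is undefined
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        System.gc();   // a request; it has no effect if explicit GC is disabled
        long after = totalCollections();
        System.out.println("collections before=" + before + ", after=" + after);

        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                + ": count=" + gc.getCollectionCount()
                + ", timeMs=" + gc.getCollectionTime());
        }
    }
}
```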

  4. Garbage collectors
  • Common categories:

    • Serial

      • The simplest garbage collector

      • Uses a single thread for garbage collection: the application is paused (STW) until the collection is complete

      • Suitable for small applications

      • Simple and lightweight

      • Combination: Serial + Serial Old

    • Throughput priority

      • Designed to maximize garbage collection throughput

      • Garbage collection using multithreading

      • Suitable for applications that focus on system throughput, such as back-end data processing, batch processing, etc.

      • Applicable to scenarios with large heap memory

      • Minimizes the total STW time per unit of time

      • Combination: Parallel Scavenge + Parallel Old

    • Response time priority

      • Designed to minimize garbage collection pauses

      • Garbage collection using multithreading

      • Suitable for applications with strict requirements on response time, such as web applications and real-time systems

      • Applicable to scenarios with large heap memory

      • Minimizes the duration of each individual STW pause

  • Common JVM garbage collectors:

    • Serial collector: This is a basic garbage collector of the JVM, which uses a single thread for garbage collection. The Serial collector is suitable for small applications, and its simplicity and light weight make it very popular in client applications.

    • Parallel collector (parallel): The Parallel collector is similar to the Serial collector, but uses multiple threads for garbage collection. It is suitable for large applications, where its parallelism improves the throughput of the application.

    • CMS collector (concurrent): The CMS (Concurrent Mark Sweep) collector is a low-pause garbage collector that does most of its work concurrently with the application threads. It is suitable for scenarios with strict requirements on application response time, such as web applications. CMS was deprecated in JDK 9 and removed in JDK 14. It uses a mark-sweep algorithm that runs in the following 4 phases:

      • Initial mark: marks the objects that GC Roots reference directly. This phase is STW.

      • Concurrent mark: traces the object graph from GC Roots (GC Roots Tracing). This phase runs concurrently with user threads.

      • Remark: re-examines the objects whose references changed while concurrent marking was running and corrects their marks. This phase is much shorter than concurrent marking, and it is STW.

      • Concurrent sweep: reclaims the unmarked objects. This phase runs concurrently with user threads.

    • G1 collector: The G1 garbage collector is a region-based garbage collector. It divides the Java heap memory into multiple regions (Region) of equal size. The size of each region is usually 1 MB or larger. The G1 garbage collector decides which regions to collect first by calculating the reclamation yield of each region.

      Reclamation yield is the ratio of the amount of memory that can be freed by reclaiming a region to the amount of memory currently used by the region. The G1 garbage collector calculates the reclamation yield for each region through the following steps:

      1. Track object allocation and collection for each region to calculate the percentage of garbage in each region.

      2. Mark live objects in each region during garbage collection and calculate the percentage of live objects in each region.

      3. Derive the reclamation yield of each region from its garbage ratio and its live-object ratio: the more garbage and the fewer live objects a region contains, the higher its yield.

      After calculating the reclamation yield of each region, the G1 garbage collector collects the regions with the highest yield first, so that each collection frees as much memory as possible.
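The selection step can be sketched as follows, using the definition of reclamation yield given above (freed memory divided by used memory). This is a simplified illustration, not G1's actual policy, which also accounts for the expected cost of collecting each region and the pause-time goal. `ToyRegionSelector`, `Region`, `reclaimRatio`, and the sample numbers are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy "highest yield first" region selection (illustration only).
class ToyRegionSelector {
    static final class Region {
        final String id;
        final int usedKb;   // memory currently used in the region
        final int liveKb;   // memory occupied by live objects

        Region(String id, int usedKb, int liveKb) {
            this.id = id;
            this.usedKb = usedKb;
            this.liveKb = liveKb;
        }

        // Share of the used memory that would be freed by collecting this region.
        double reclaimRatio() {
            return usedKb == 0 ? 0.0 : (double) (usedKb - liveKb) / usedKb;
        }
    }

    // Returns the ids of up to maxRegions regions, highest yield first.
    static List<String> select(List<Region> regions, int maxRegions) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.reclaimRatio()).reversed());
        List<String> chosen = new ArrayList<>();
        for (Region region : sorted) {
            if (chosen.size() == maxRegions) {
                break;
            }
            chosen.add(region.id);
        }
        return chosen;
    }

    static List<Region> sample() {
        return List.of(
            new Region("r1", 1024, 900),   // mostly live, low yield
            new Region("r2", 1024, 100),   // mostly garbage, high yield
            new Region("r3", 512, 256));   // half garbage
    }

    public static void main(String[] args) {
        System.out.println(select(sample(), 2));   // prints [r2, r3]
    }
}
```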

  • Default garbage collector

    • Java 8 and earlier: Parallel garbage collector on server-class machines (Serial garbage collector on client-class machines)

    • Java 9 and later: G1 Garbage Collector
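One way to check which collector a JVM actually runs is to look at the names of its `GarbageCollectorMXBean`s. The sketch below maps those names to the collector families discussed here. The name strings are the ones HotSpot builds commonly report and should be treated as an assumption, not a specification; `ActiveCollector` and `classify` are hypothetical names.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Guesses the active collector family from GC MXBean names (name strings are assumptions).
class ActiveCollector {
    // Typical HotSpot flags and bean names:
    //   -XX:+UseSerialGC         -> "Copy", "MarkSweepCompact"
    //   -XX:+UseParallelGC       -> "PS Scavenge", "PS MarkSweep"
    //   -XX:+UseConcMarkSweepGC  -> "ParNew", "ConcurrentMarkSweep" (flag removed in JDK 14)
    //   -XX:+UseG1GC             -> "G1 Young Generation", "G1 Old Generation"
    static String classify(Collection<String> beanNames) {
        for (String name : beanNames) {
            if (name.startsWith("G1")) {
                return "G1";
            }
            if (name.startsWith("PS ")) {
                return "Parallel";
            }
            if (name.equals("ParNew") || name.equals("ConcurrentMarkSweep")) {
                return "CMS";
            }
        }
        for (String name : beanNames) {
            if (name.equals("Copy") || name.equals("MarkSweepCompact")) {
                return "Serial";
            }
        }
        return "Unknown";
    }

    static List<String> currentBeanNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<String> names = currentBeanNames();
        System.out.println(names + " -> " + classify(names));
    }
}
```

Running it with and without an explicit flag such as -XX:+UseSerialGC shows the default for the JDK in use.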

  • The difference between CMS and G1:

Characteristic | CMS | G1
Collection algorithm | Mark-sweep | Composite algorithm (copying between regions, compacting overall)
Fragmentation | Produces a lot of fragmentation | Divides the Java heap into multiple regions of equal size to reduce fragmentation
Pause time | Short pauses in the initial mark and remark phases, but a long pause when it falls back to a Full GC | Collects with concurrent threads and spreads the work over several collections, giving short pause times
Extra space | Needs reserved space for the garbage produced while it runs concurrently | Uses part of the Java heap for garbage collection without additional space
Applicable scenarios | Applications that are sensitive to pause times | Applications that need large heaps and low latency, with minimal pause times and reduced memory fragmentation during garbage collection

Origin blog.csdn.net/m0_56170277/article/details/130468533