[JVM] Garbage collection algorithms and generational collection

Reference for this article: In-depth understanding of Java virtual machine: JVM advanced features and best practices (3rd edition)

1. Overview of Garbage Collection Algorithms

Depending on how they determine that an object is dead, garbage collection algorithms fall into two categories:

  1. Reference Counting Garbage Collection (Direct Garbage Collection)
  2. Tracing Garbage Collection (Indirect Garbage Collection)
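
The two categories above can be contrasted with a minimal sketch. It assumes a toy object model (`CountedNode`, `pointTo`, `deadByRefCount`, and `deadByTracing` are hypothetical names, not JVM internals) and shows one well-known case where the two approaches disagree: two objects that reference each other but are reachable from no root.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Toy heap object; every name here is hypothetical, not a JVM structure. */
final class CountedNode {
    final String name;
    final List<CountedNode> refs = new ArrayList<>(); // outgoing references
    int refCount = 0;                                 // number of incoming references

    CountedNode(String name) { this.name = name; }

    void pointTo(CountedNode target) {
        refs.add(target);
        target.refCount++;
    }
}

final class RefCountVsTracing {
    /** Reference counting: an object is considered dead only when its count is zero. */
    static boolean deadByRefCount(CountedNode n) {
        return n.refCount == 0;
    }

    /** Tracing: an object is considered dead when no root can reach it. */
    static boolean deadByTracing(CountedNode n, List<CountedNode> roots) {
        Set<CountedNode> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<CountedNode> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            CountedNode cur = stack.pop();
            if (seen.add(cur)) {
                cur.refs.forEach(stack::push);
            }
        }
        return !seen.contains(n);
    }

    public static void main(String[] args) {
        CountedNode a = new CountedNode("a");
        CountedNode b = new CountedNode("b");
        a.pointTo(b);
        b.pointTo(a); // a cycle that no root points at
        System.out.println("dead by ref count: " + deadByRefCount(a));
        System.out.println("dead by tracing:   " + deadByTracing(a, List.of()));
    }
}
```

In this sketch the counts of `a` and `b` never drop to zero because each keeps the other's count at 1, while the tracing check, started from an empty root list, never visits either of them.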

The Java virtual machine uses tracing garbage collection, whose main algorithms are

  1. mark-sweep algorithm
  2. mark-copy algorithm
  3. mark-compact algorithm

2. Mark-sweep algorithm

This is the most basic collection algorithm. It works in two phases: mark and sweep.

  1. Mark: mark all objects that need to be reclaimed.
  2. Sweep: after marking is complete, reclaim all marked objects in one pass.

The reverse is also possible: mark the surviving objects and, once marking is complete, sweep the unmarked ones.

Marking is driven by reachability analysis: the marked objects are the ones that the analysis finds unreachable (or, in the reverse variant, reachable).
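
The two phases can be sketched on a toy object graph. This is an illustration under stated assumptions, not how HotSpot implements it: `ToyMarkSweep`, `Obj`, `mark`, and `sweep` are hypothetical names, the heap is a plain list, and the sketch uses the reverse variant (mark the reachable objects, sweep the unmarked ones).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class ToyMarkSweep {
    /** Hypothetical heap object carrying a mark bit. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>(); // outgoing references
        boolean marked;

        Obj(String name) { this.name = name; }
    }

    /** Mark phase: flag every object reachable from the roots. */
    static void mark(List<Obj> roots) {
        Deque<Obj> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Obj cur = stack.pop();
            if (!cur.marked) {
                cur.marked = true;
                cur.refs.forEach(stack::push);
            }
        }
    }

    /** Sweep phase: drop unmarked objects, reset marks on survivors, return how many were reclaimed. */
    static int sweep(List<Obj> heap) {
        int before = heap.size();
        heap.removeIf(o -> !o.marked);
        heap.forEach(o -> o.marked = false);
        return before - heap.size();
    }
}
```

Calling `mark` with the root list and then `sweep` on a mutable heap list removes exactly the objects that no root can reach.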

Disadvantages of the mark-sweep algorithm

  1. Unstable execution efficiency. If the heap holds a large number of objects and most of them need to be reclaimed, a large number of mark and sweep operations are required, so the efficiency of both phases drops as the number of objects grows.
  2. Memory fragmentation. Marking and sweeping leaves behind many discontiguous free fragments. When a large object has to be allocated later, there may be no contiguous block big enough for it, and another garbage collection has to be triggered.
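
The fragmentation problem can be made concrete with a small sketch. It assumes the heap is modeled as an array of equally sized slots, where `true` means "in use"; `Fragmentation`, `totalFree`, and `largestFreeRun` are hypothetical names used only for illustration.

```java
final class Fragmentation {
    /** Total number of free slots in the heap. */
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) free++;
        }
        return free;
    }

    /** Length of the longest run of adjacent free slots. */
    static int largestFreeRun(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean u : used) {
            run = u ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }
}
```

For a heap such as `{true, false, true, false, true, false, true, false}`, `totalFree` reports four free slots but `largestFreeRun` reports only one, so an object that needs two adjacent slots cannot be placed even though half the heap is empty.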

3. Mark-copy algorithm

The mark-copy algorithm was introduced to solve the low execution efficiency of mark-sweep when there are many reclaimable objects.

The idea of the algorithm is to

  1. Divide the available memory into two halves of equal capacity and use only one of them at a time.
  2. When that half runs out, copy the surviving objects to the other half.
  3. Then clear the used half in one go.
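
The three steps above can be sketched as a toy semispace. This is a simplified illustration: `SemiSpace` and its methods are hypothetical names, objects are represented by string labels, and the set of live objects is assumed to have been computed already by a separate marking step.

```java
import java.util.Arrays;
import java.util.Set;

final class SemiSpace {
    private String[] from; // the half currently used for allocation
    private String[] to;   // the idle half that receives survivors
    private int top = 0;   // next free slot in the active half

    SemiSpace(int halfSize) {
        from = new String[halfSize];
        to = new String[halfSize];
    }

    /** Allocates in the active half; returns false when that half is full. */
    boolean allocate(String obj) {
        if (top == from.length) return false;
        from[top++] = obj;
        return true;
    }

    /** Copies survivors to the idle half, clears the used half in one go, then swaps the roles. */
    void collect(Set<String> live) {
        int newTop = 0;
        for (int i = 0; i < top; i++) {
            if (live.contains(from[i])) {
                to[newTop++] = from[i];
            }
        }
        Arrays.fill(from, null);
        String[] tmp = from;
        from = to;
        to = tmp;
        top = newTop;
    }

    int used() { return top; }

    /** Snapshot of the occupied part of the active half. */
    String[] activeHalf() { return Arrays.copyOf(from, top); }
}
```

Because survivors are written one after another into the idle half, the free space after `collect` is a single contiguous region, and allocation stays a simple pointer bump.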

If most objects in memory are alive, this algorithm incurs a large memory-copy overhead. If most objects are reclaimable, only the few survivors need to be copied.

The algorithm is simple to implement and efficient to run, but it has one major shortcoming:

  1. The usable memory shrinks to half of the original, which wastes a lot of space.

The advantage is that it produces no memory fragmentation.


4. Mark-compact algorithm

The mark-copy algorithm has to perform more copy operations when the object survival rate is high, so its efficiency drops.

It also wastes half of the memory. Avoiding that waste requires extra space as an allocation guarantee, in case every object in the used half survives, so this algorithm generally cannot be used directly in the old generation.

The mark-compact algorithm marks objects the same way mark-sweep does. The difference lies in the step after marking: instead of directly sweeping the reclaimable objects, it moves all surviving objects toward one end of the memory space and then clears the memory beyond the boundary.
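
The "move survivors to one end, then clear past the boundary" step can be sketched as follows. It is a simplified illustration: `SlidingCompactor` and `compact` are hypothetical names, the heap is an array of string labels with `null` for free slots, and the live set is assumed to come from an earlier marking phase.

```java
import java.util.Arrays;
import java.util.Set;

final class SlidingCompactor {
    /**
     * Slides live entries toward index 0, clears everything past the boundary,
     * and returns the boundary (the first free index).
     */
    static int compact(String[] heap, Set<String> live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && live.contains(obj)) {
                heap[boundary++] = obj; // boundary never exceeds i, so no unread slot is overwritten
            }
        }
        Arrays.fill(heap, boundary, heap.length, null);
        return boundary;
    }
}
```

The sketch only moves labels inside one array. A real collector also has to update every reference that points to a moved object, which is where most of the cost of compaction comes from.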

Moving surviving objects and updating every reference to them is a heavy operation, especially in the old generation, where a large number of objects survive each collection. The user application must also be paused while objects are being moved, which is known as Stop The World.


5. Generational collection

The generational collection theory is built on two hypotheses, plus a third that follows from them

  1. Weak Generational Hypothesis: most objects die young.
  2. Strong Generational Hypothesis: the more garbage collections an object has survived, the less likely it is to die.
  3. Intergenerational Reference Hypothesis: cross-generational references are only a small minority compared with references within the same generation.
    1. Two objects that reference each other tend to live and die together. Because old-generation objects are hard to kill, a young-generation object referenced from the old generation also survives collections. As it ages it is promoted to the old generation, at which point the cross-generational reference disappears.

The Java heap is divided into two areas: the young generation and the old generation.

  1. Young generation: most objects here die young. Each collection only needs to take care of keeping the few survivors, rather than marking the many objects that will be reclaimed, so a large amount of space can be recovered at low cost.
  2. Old generation: most objects here are hard to kill, so the virtual machine can collect this area less frequently.

Generational collection process

  1. Objects are first allocated in eden (Eden Space).
  2. When the young generation runs out of space, a minor gc is triggered: the surviving objects in eden and the from survivor space are copied to the to survivor space, their age is increased by 1, and the from and to spaces swap roles.
  3. A minor gc causes stop the world: the user threads are suspended and resume only after the garbage collection ends. When an object's age exceeds the threshold, it is promoted to the old generation. The maximum age is 15 (the age is stored in 4 bits).
  4. When the old generation runs out of space, a minor gc is attempted first. If space is still insufficient, a full gc is triggered, whose STW pause is longer.
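
The process above can be sketched as a toy model. It is an illustration only: `ToyGenerationalHeap` and its members are hypothetical names, the spaces are plain lists with no capacity limits, liveness is supplied by the caller, and promotion follows the wording above (an object is promoted once its age exceeds the threshold), which may differ in detail from a real collector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class ToyGenerationalHeap {
    static final int MAX_AGE = 15; // a 4-bit age field can hold values 0..15

    /** Hypothetical heap object that remembers how many minor GCs it has survived. */
    static final class Obj {
        final String name;
        int age;

        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>(); // survivor space holding earlier survivors
    List<Obj> to = new ArrayList<>();   // empty survivor space that receives the copies
    final List<Obj> old = new ArrayList<>();
    private final int threshold;

    ToyGenerationalHeap(int threshold) {
        this.threshold = Math.min(threshold, MAX_AGE);
    }

    /** New objects always start in eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Minor GC: copy survivors of eden and "from" into "to", age them, promote, then swap. */
    void minorGc(Predicate<Obj> isLive) {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!isLive.test(o)) continue; // dead objects are simply not copied
            o.age++;
            if (o.age > threshold) {
                old.add(o);                // promoted to the old generation
            } else {
                to.add(o);
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }
}
```

With a threshold of 2, an object that stays live sits in the survivor spaces after the first and second `minorGc` calls and moves to `old` on the third, while objects that are no longer live disappear at the first collection after they die.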

Definitions of the collection types

  1. Partial collection (Partial GC): garbage collection whose goal is not to collect the entire Java heap.
    1. Young generation collection (Minor GC/Young GC): garbage collection that targets only the young generation.
    2. Old generation collection (Major GC/Old GC): garbage collection that targets only the old generation.
    3. Mixed collection (Mixed GC): garbage collection that targets the entire young generation and part of the old generation. Currently only the G1 collector behaves this way.
  2. Whole heap collection (Full GC): garbage collection that covers the entire Java heap and the method area.
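
To see which collectors a running JVM exposes, and how often each has run, the standard `java.lang.management` API can be queried. The sketch below is a minimal example; `GcStats` and `describeCollectors` are hypothetical names, and the collector names printed depend on the JVM and the collector selected at startup, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcStats {
    /** Builds one line per collector registered in the running JVM. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

Many collectors register separate beans for young-generation and old-generation work, which makes it possible to watch the counts of the two kinds of collection independently.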

Origin blog.csdn.net/weixin_51146329/article/details/128788460