JVM -- garbage collection; garbage collection algorithm (3)

Recommended reading beforehand

https://blog.csdn.net/MinggeQingchun/article/details/126947384

https://blog.csdn.net/MinggeQingchun/article/details/127066302

1. Determining whether an object is garbage

The JVM memory structure consists of five areas: the program counter, the virtual machine stack, the native method stack, the heap, and the method area.

The program counter, the virtual machine stack, and the native method stack are created and destroyed together with their thread, so memory allocation and reclamation in these areas is deterministic. There is no need to think much about reclaiming them, because the memory is released naturally when the method returns or the thread ends.

The Java heap and the method area are different. Memory in these areas is allocated and reclaimed dynamically, and this is the part the garbage collector has to manage.

(1) Reference counting

Reference counting was an early garbage collection strategy. Every object instance on the heap has a reference count. When an object is created and assigned to a variable, its count is set to 1. Whenever another variable is assigned a reference to the object, the count is incremented (after a = b, the counter of the instance referenced by b is incremented by 1). When a reference to the instance goes out of scope or is set to a new value, the counter is decremented by 1. Any instance whose count is 0 can be garbage collected, and when an instance is collected, the counters of all instances it references are decremented by 1.

  • Advantages: A reference counting collector runs quickly and its work is interleaved with the running program, which benefits real-time environments where the program must not be interrupted for long.

  • Disadvantages: Circular references cannot be detected. If object A references object B and object B in turn references object A, their reference counts never reach 0.

The JVM does not use this approach; it uses reachability analysis instead.
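The circular-reference problem can be made concrete with a toy counter. The sketch below is not how any JVM works; `RefCountedNode` and `RefCountCycleDemo` are hypothetical names, and the counts are maintained by hand purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy object with a manually maintained reference count (illustration only).
final class RefCountedNode {
    final String name;
    int refCount = 0;
    final List<RefCountedNode> fields = new ArrayList<>();

    RefCountedNode(String name) { this.name = name; }

    // Store a reference to target in this object: target gains one reference.
    void addRef(RefCountedNode target) {
        fields.add(target);
        target.refCount++;
    }

    // Drop one incoming reference; at zero the object is "collected"
    // and releases everything it references.
    void release() {
        refCount--;
        if (refCount == 0) {
            for (RefCountedNode child : fields) child.release();
            fields.clear();
        }
    }
}

final class RefCountCycleDemo {
    static int[] run() {
        RefCountedNode a = new RefCountedNode("A");
        RefCountedNode b = new RefCountedNode("B");
        a.refCount++;   // local variable a
        b.refCount++;   // local variable b
        a.addRef(b);    // A -> B, count of B is now 2
        b.addRef(a);    // B -> A, count of A is now 2
        a.release();    // local variable a goes out of scope
        b.release();    // local variable b goes out of scope
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = run();
        System.out.println("A=" + counts[0] + ", B=" + counts[1]); // prints A=1, B=1
    }
}
```

Neither object is referenced by the program any more, yet both counts stay at 1, so a pure reference counting collector would never reclaim them.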

(2) Reachability analysis

The garbage collector in the Java virtual machine uses reachability analysis to find all surviving objects.

Reachability analysis is the algorithm used by mainstream virtual machines today. The program treats all reference relationships as a graph. Starting from a set of nodes called GC Roots, it follows the references to the nodes they refer to, and then continues from those nodes along their own references. Once all referenced nodes have been found, the remaining nodes are considered unreferenced, that is, useless, and are judged to be reclaimable objects.

In the Java language, objects that can be used as GC Roots include the following:

(1) Objects referenced in the virtual machine stack (local variable table in the stack frame)

(2) Objects referenced by class static properties in the method area

(3) Objects referenced by constants in the method area

(4) Objects referenced by JNI (native methods) in the native method stack

In the figure (not reproduced here), reference1, reference2, and reference3 are all GC Roots:

reference1 --> object instance 1

reference2 --> object instance 2

reference3 --> object instance 4 --> object instance 6

It follows that object instances 1, 2, 4, and 6 are reachable, that is, they are surviving objects that GC cannot reclaim. Instances 3 and 5 are connected to each other, but no GC Root leads to them, so they are unreachable from the GC Roots and will be reclaimed by GC.
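The example above can be reproduced with a small graph traversal. This is a minimal sketch, assuming objects are modeled as integers and references as an adjacency map; `ReachabilityDemo` and its methods are hypothetical names, not JVM APIs.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

final class ReachabilityDemo {
    // Depth-first traversal from the roots; everything visited is "alive".
    static Set<Integer> reachable(Map<Integer, List<Integer>> edges, List<Integer> roots) {
        Set<Integer> marked = new TreeSet<>();
        Deque<Integer> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            int node = stack.pop();
            if (marked.add(node)) { // first visit: follow outgoing references
                for (int next : edges.getOrDefault(node, Collections.<Integer>emptyList())) {
                    stack.push(next);
                }
            }
        }
        return marked;
    }

    // Object graph from the text: roots point at 1, 2 and 4; 4 -> 6; 3 and 5 reference each other.
    static Set<Integer> example() {
        Map<Integer, List<Integer>> edges = new HashMap<>();
        edges.put(4, Arrays.asList(6));
        edges.put(3, Arrays.asList(5));
        edges.put(5, Arrays.asList(3));
        return reachable(edges, Arrays.asList(1, 2, 4));
    }

    public static void main(String[] args) {
        System.out.println(example()); // prints [1, 2, 4, 6]
    }
}
```

Instances 3 and 5 never appear in the result even though they reference each other, which is exactly the case reference counting cannot handle.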

2. The principle of garbage collection

The basic principle of garbage collection (GC) is to reclaim objects in memory that are no longer used. The component that performs the reclamation is called a collector. Because GC consumes resources and time, Java analyzes the life cycle characteristics of objects and collects them by young generation and old generation, so that the pauses GC causes in the application are as short as possible.

  • The heap memory is divided into two parts: the young generation and the old generation. The ratio of old generation to young generation is 2:1

  • The young generation is further divided into Eden and Survivor. The ratio of their sizes is 8:2 by default

  • The survivor area is divided into S0 (From Space) and S1 (To Space). These two spaces are exactly the same size, so their ratio is 1:1
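The ratios above translate into concrete sizes with simple arithmetic. The sketch below assumes the default ratios from the list (old : young = 2 : 1, Eden : S0 : S1 = 8 : 1 : 1) and ignores the alignment and adaptive sizing a real JVM applies; `HeapLayoutSketch` is a hypothetical helper.

```java
final class HeapLayoutSketch {
    // Returns {old, young, eden, s0, s1} in MB for the given heap size.
    static long[] layout(long heapMb, int newRatio, int survivorRatio) {
        long young = heapMb / (newRatio + 1);         // old : young = newRatio : 1
        long survivor = young / (survivorRatio + 2);  // eden : s0 : s1 = survivorRatio : 1 : 1
        long eden = young - 2 * survivor;
        long old = heapMb - young;
        return new long[] { old, young, eden, survivor, survivor };
    }

    public static void main(String[] args) {
        long[] l = layout(600, 2, 8);
        // prints old=400 young=200 eden=160 s0=20 s1=20
        System.out.println("old=" + l[0] + " young=" + l[1] + " eden=" + l[2]
                + " s0=" + l[3] + " s1=" + l[4]);
    }
}
```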

Heap memory garbage collection process

1. Newly created objects are first placed in the Eden area. When Eden is full, a Minor GC is triggered.

2. Objects that survive the GC in step 1 are moved to the S0 area (From Space) of the survivor space. At the next Minor GC, the surviving objects of Eden and S0 are moved to the S1 area (To Space), and S0 is emptied.

At the GC after that, the surviving objects move back to S0 and S1 is emptied, and this repeats with every GC. Each GC an object survives increases its age by one; when the age reaches a threshold (15), the object enters the old generation.

3. After a Minor GC (the precondition), a Major GC may occur in the old generation, depending on the garbage collector.

Full GC trigger conditions

  • Calling System.gc() manually requests a Full GC (the request is ignored when -XX:+DisableExplicitGC is set)

  • The old generation space is insufficient/full

  • Insufficient/full space in the method area

Stop-the-world (STW)

Stop-the-world happens in any GC algorithm. It means that the JVM stops executing the application because it needs to perform GC.

When stop-the-world happens, all threads except those needed for GC enter a waiting state until the GC task is complete. GC tuning often aims to reduce the occurrence of stop-the-world pauses.

The JVM GC only reclaims objects in heap memory and in the method area. Data in stack memory is released automatically by the JVM once it goes out of scope, so it is outside the scope of the JVM GC.

For further reading, see the article "7 kinds of JVM garbage collectors".

1. Minor GC

Collection of objects in the young generation

Minor GC is the young generation GC, that is, the garbage collection that takes place in the young generation (the Eden area and the Survivor areas). It is triggered when the young generation cannot allocate memory for a new object. Because most objects in the young generation are very short-lived, Minor GC happens very frequently. It does trigger stop-the-world, but it completes very quickly.

2. Major GC

Collection of objects in the old generation

Major GC cleans up the Tenured area, that is, it reclaims the old generation. A Major GC is usually accompanied by at least one Minor GC.

3. Full GC

Calling System.gc() in the program explicitly requests a GC

Full GC is a global GC covering the whole young generation, the old generation, and the metaspace (which replaced the permanent generation, PermGen, from Java 8 onward). Full GC is not the same as Major GC, nor is it Minor GC + Major GC. What exactly happens in a Full GC depends on the combination of garbage collectors in use.
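Which collectors are active, and how often each has run, can be inspected from inside the program through the standard `java.lang.management` API. The following is a minimal sketch; `GcStatsSketch` is a hypothetical class name, and the collector names that get printed depend on the JVM and the GC flags in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcStatsSketch {
    // One line per registered collector: name, number of collections, accumulated time.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        System.gc(); // only a request; the JVM may ignore it
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Typically one bean reports the young generation collector and another the old generation collector, which makes it easy to see how much more often Minor GCs occur.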

3. The five kinds of references

1. Strong reference

2. Soft reference

3. Weak reference

4. Phantom reference

5. Finalizer reference

(1) Strong reference

An object can be garbage collected only when no GC Root refers to it through a [strong reference].

(2) Soft Reference (SoftReference)

When an object is referenced only by soft references, it is reclaimed if memory is still insufficient after a garbage collection, in which case garbage collection is triggered again. A reference queue can be used to release the soft reference object itself.

In the following code the VM heap is set to 20 MB. The commented-out part, which holds five 4 MB arrays through strong references, fails with a heap memory overflow error.

import java.io.IOException;
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;

/**
 * Soft references
 * -Xmx20m -XX:+PrintGCDetails -verbose:gc
 */
public class GC2SoftReference {
    private static final int _4MB = 4 * 1024 * 1024;

    public static void main(String[] args) throws IOException {
        // Strong references: java.lang.OutOfMemoryError: Java heap space
        /*List<byte[]> list = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            list.add(new byte[_4MB]);
        }
        System.in.read();*/

        // Soft references
        soft();
    }

    public static void soft() {
        // list --> SoftReference --> byte[]

        List<SoftReference<byte[]>> list = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            SoftReference<byte[]> ref = new SoftReference<>(new byte[_4MB]);
            System.out.println(ref.get());
            list.add(ref);
            System.out.println(list.size());

        }
        System.out.println("Loop finished: " + list.size());
        // Arrays reclaimed under memory pressure print as null here
        for (SoftReference<byte[]> ref : list) {
            System.out.println(ref.get());
        }
    }
}

With soft references, the final loop prints null for the first four references because their arrays have been reclaimed; only the fifth array still exists.

Add the following VM parameters to view the GC output:

-Xmx20m -XX:+PrintGCDetails -verbose:gc

GC is triggered when the third object is printed.

Soft references with a reference queue

import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.ArrayList;
import java.util.List;

/**
 * Soft references combined with a reference queue
 * -Xmx20m -XX:+PrintGCDetails -verbose:gc
 */
public class GC3SoftReferenceQueue {
    private static final int _4MB = 4 * 1024 * 1024;

    public static void main(String[] args) {
        List<SoftReference<byte[]>> list = new ArrayList<>();

        // Reference queue
        ReferenceQueue<byte[]> queue = new ReferenceQueue<>();

        for (int i = 0; i < 5; i++) {
            // The queue is attached: when the byte[] behind the soft reference is reclaimed,
            // the soft reference itself is added to the queue
            SoftReference<byte[]> ref = new SoftReference<>(new byte[_4MB], queue);
            System.out.println(ref.get());
            list.add(ref);
            System.out.println(list.size());
        }

        // Take the now useless soft reference objects from the queue and remove them from the list
        Reference<? extends byte[]> poll = queue.poll();
        while (poll != null) {
            list.remove(poll);
            poll = queue.poll();
        }

        System.out.println("===========================");
        for (SoftReference<byte[]> reference : list) {
            System.out.println(reference.get());
        }

    }
}

(3) Weak Reference (WeakReference)

When an object is referenced only by weak references, it is reclaimed at the next garbage collection regardless of whether memory is sufficient. A reference queue can be used to release the weak reference object itself.

import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

/**
 * Weak references
 * -Xmx20m -XX:+PrintGCDetails -verbose:gc
 */
public class GC4WeakReference {
    private static final int _4MB = 4 * 1024 * 1024;

    public static void main(String[] args) {
        //  list --> WeakReference --> byte[]
        List<WeakReference<byte[]>> list = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            WeakReference<byte[]> ref = new WeakReference<>(new byte[_4MB]);
            list.add(ref);
            // Print every reference so far; reclaimed arrays show up as null
            for (WeakReference<byte[]> w : list) {
                System.out.print(w.get() + " ");
            }
            System.out.println();

        }
        System.out.println("Loop finished: " + list.size());
    }
}

(4) Phantom Reference (PhantomReference)

A phantom reference must be used together with a reference queue, and is mainly used with ByteBuffer. When the referenced object is reclaimed, the phantom reference is enqueued, and the Reference Handler thread calls the method associated with the phantom reference to release the direct memory.
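The basic mechanics can be tried with the public `java.lang.ref` classes. The sketch below is only an illustration under assumptions: `PhantomReferenceSketch` is a hypothetical class, the plain `Object` stands in for a resource such as direct memory, and whether the reference is enqueued within the one-second wait depends on the collector honoring the `System.gc()` request.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

final class PhantomReferenceSketch {
    // get() on a phantom reference always returns null, even while the
    // referent is still strongly reachable; the queue is the only signal.
    static boolean getAlwaysReturnsNull() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resource = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(resource, queue);
        return ref.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> ref = new PhantomReference<>(new Object(), queue);

        System.gc(); // request a collection; not guaranteed to run
        Reference<?> enqueued = queue.remove(1000); // wait up to one second

        // When the reference shows up in the queue, cleanup code could run here.
        System.out.println(enqueued == ref ? "enqueued, cleanup could run now" : "not enqueued yet");
    }
}
```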

(5) Finalizer reference (FinalReference)

Every class inherits from Object, which has a finalize() method. A class can override finalize(), and the method is called when the object is garbage collected. At that point the object no longer has a strong reference, so the call to finalize() is actually implemented through a finalizer reference. In the figure (not reproduced here), after object B drops its strong reference to A4, the finalizer reference is added to a reference queue, which is scanned by a finalizeHandler thread with very low priority. When the thread finds the finalizer reference in the queue, it calls the finalize() method of the referenced object A4. Because finalize() is not executed immediately (the reference is enqueued first) and the scanning thread has low priority, finalize() may be delayed for a long time, so it is not recommended for releasing resources.

No manual coding is needed; finalizer references are used internally together with a reference queue. During garbage collection the finalizer reference is enqueued (the referenced object is not reclaimed yet), then the Finalizer thread finds the referenced object through the finalizer reference and calls its finalize() method. The referenced object can only be reclaimed at the second GC.

4. Garbage collection algorithms

(1) Mark-sweep algorithm (Mark Sweep)

Marking phase:

Marking is the reachability analysis described earlier: traverse all GC Roots and mark every object reachable from them, usually by recording in the object header that the object is reachable.

Sweep phase:

Sweeping traverses the heap; any object that is not marked as reachable (determined by reading the object header) is reclaimed.

Disadvantages:

(1) It produces memory fragmentation

(2) When a large object needs contiguous memory, the fragmented free space may not fit it, so another garbage collection may be triggered

Conclusion: it suits the old generation and is more efficient when many objects survive
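The fragmentation problem shows up even in a toy model. The sketch below assumes the heap is an array of slots holding object ids (0 meaning free) and that the marking phase has already produced the set of reachable ids; `MarkSweepSketch` is a hypothetical class, not a real collector.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class MarkSweepSketch {
    // Sweep phase: free every slot whose object was not marked. 0 means "free slot".
    static int[] sweep(int[] heap, Set<Integer> marked) {
        int[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != 0 && !marked.contains(result[i])) {
                result[i] = 0;
            }
        }
        return result;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(int[] heap) {
        int best = 0;
        int run = 0;
        for (int slot : heap) {
            run = (slot == 0) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        int[] heap = { 1, 2, 3, 4, 5, 6 };
        Set<Integer> marked = new HashSet<>(Arrays.asList(1, 3, 5));
        int[] swept = sweep(heap, marked);
        System.out.println(Arrays.toString(swept)); // prints [1, 0, 3, 0, 5, 0]
        System.out.println(largestFreeRun(swept));  // prints 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object that needs two contiguous slots cannot be placed.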

(2) Mark-compact algorithm (Mark Compact)

After the marked garbage has been cleaned up, the scattered memory is compacted: the surviving objects are moved together into a contiguous region, so that the used memory has as few holes as possible.

Advantages: avoids heavy memory fragmentation

Disadvantages: lower overall efficiency, because objects have to be moved
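Compaction can be sketched on the same kind of toy heap: an array of slots holding object ids, with 0 meaning free. `MarkCompactSketch` is a hypothetical class; a real collector would also have to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class MarkCompactSketch {
    // Slide every marked object toward the start of the heap, then free the tail.
    static int[] compact(int[] heap, Set<Integer> marked) {
        int[] result = heap.clone(); // copy only so the caller's array stays untouched
        int free = 0;                // next slot a live object will be moved to
        for (int scan = 0; scan < result.length; scan++) {
            int obj = result[scan];
            if (obj != 0 && marked.contains(obj)) {
                result[free++] = obj; // free <= scan, so nothing unread is overwritten
            }
        }
        Arrays.fill(result, free, result.length, 0); // everything behind the live objects is free
        return result;
    }

    public static void main(String[] args) {
        int[] heap = { 1, 2, 3, 4, 5, 6 };
        Set<Integer> marked = new HashSet<>(Arrays.asList(1, 3, 5));
        System.out.println(Arrays.toString(compact(heap, marked))); // prints [1, 3, 5, 0, 0, 0]
    }
}
```

The free space ends up as one contiguous block, at the cost of moving the surviving objects.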

(3) Copying algorithm (Copy)

Mark all surviving objects and copy them to a new piece of memory (the right-hand memory space in the figure), then reclaim all of the old memory (the left-hand memory space in the figure)

Advantages:

(1) High efficiency, no fragmentation

(2) The space is scanned only once

Disadvantages:

(1) Needs an empty memory space

(2) Needs to copy and move objects

(3) Memory utilization is low, and it is not suitable for the old generation, where the object survival rate is high

It suits the young generation, where most objects are "born in the morning and dead by evening"
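The copy-and-swap step can be sketched with two equally sized arrays. This is a toy model under assumptions: slots hold object ids with 0 meaning free, the set of live ids is given, and `CopyingSketch` is a hypothetical class rather than a real collector.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class CopyingSketch {
    int[] from; // space currently holding objects
    int[] to;   // empty space of the same size

    CopyingSketch(int[] initial) {
        from = initial.clone();
        to = new int[initial.length];
    }

    // Copy live objects into the empty space, wipe the old space, swap the roles.
    // Returns the number of objects that survived.
    int collect(Set<Integer> live) {
        int next = 0;
        for (int obj : from) {
            if (obj != 0 && live.contains(obj)) {
                to[next++] = obj;
            }
        }
        Arrays.fill(from, 0); // the whole old space is reclaimed at once
        int[] tmp = from;
        from = to;
        to = tmp;
        return next;
    }

    public static void main(String[] args) {
        CopyingSketch heap = new CopyingSketch(new int[] { 1, 2, 3, 4, 5, 6 });
        int survivors = heap.collect(new HashSet<>(Arrays.asList(1, 3, 5)));
        System.out.println(survivors + " " + Arrays.toString(heap.from)); // prints 3 [1, 3, 5, 0, 0, 0]
    }
}
```

Only the survivors are touched, which is cheap when few objects survive, but half of the memory always sits idle as the empty space.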

5. Generational collection algorithm

The generational collection algorithm is the collection algorithm virtual machines currently use. It divides memory into generations so that each can be collected with a suitable algorithm, which addresses the fact that the copying algorithm is unsuitable for the old generation. In general, the heap is divided into the old generation (Tenured Generation) and the young generation (Young Generation), and besides these there is the permanent generation (Permanent Generation).

Before JDK 8, the implementation of the method area was called the permanent generation, which used a part of the heap as the method area

From JDK 8 onward, the permanent generation was removed and replaced by the metaspace. The metaspace uses operating system memory as the method area instead of being part of the heap.

Different generations use different algorithms so that each gets the most suitable one. The survival rate in the young generation is low, so the copying algorithm can be used. The survival rate in the old generation is high and there is no extra space to act as an allocation guarantee, so only the mark-sweep or mark-compact algorithm can be used

Objects are first allocated in the Eden area

When space in the young generation is insufficient, a Minor GC is triggered: the surviving objects in Eden and From are copied to To, their age is increased by 1, and From and To are swapped

Minor GC triggers stop-the-world: other user threads are suspended and resume only after the garbage collection has finished

When the age of an object exceeds the threshold, it is promoted to the old generation; the maximum age is 15 (4 bits)

When space in the old generation is insufficient, a Minor GC is attempted first; if space is still insufficient, a Full GC is triggered, and the STW pause is longer
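The aging and promotion rule can be followed in a small simulation. The sketch below is a toy model: `TenuringSketch` is a hypothetical class, objects are plain names, each Minor GC is told which names are still alive, and promotion when the age reaches the threshold is an assumption made for the illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;

final class TenuringSketch {
    final Map<String, Integer> young = new HashMap<>(); // object name -> age
    final Set<String> old = new HashSet<>();
    final int threshold;

    TenuringSketch(int threshold) { this.threshold = threshold; }

    void allocate(String name) { young.put(name, 0); }

    // One Minor GC: drop dead objects, age the survivors, promote the old enough ones.
    void minorGc(Set<String> live) {
        Iterator<Map.Entry<String, Integer>> it = young.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Integer> entry = it.next();
            if (!live.contains(entry.getKey())) {
                it.remove();                 // unreachable: reclaimed
                continue;
            }
            int age = entry.getValue() + 1;  // survived one more collection
            if (age >= threshold) {
                old.add(entry.getKey());     // promoted to the old generation
                it.remove();
            } else {
                entry.setValue(age);
            }
        }
    }

    public static void main(String[] args) {
        TenuringSketch heap = new TenuringSketch(15);
        heap.allocate("cache");
        heap.allocate("temp");
        Set<String> live = Collections.singleton("cache");
        for (int i = 0; i < 14; i++) heap.minorGc(live);
        System.out.println(heap.young + " " + heap.old); // prints {cache=14} []
        heap.minorGc(live);
        System.out.println(heap.young + " " + heap.old); // prints {} [cache]
    }
}
```

The short-lived object disappears at the first Minor GC, while the long-lived one is copied around the young generation until its age reaches the threshold.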

Set VM-related parameters

| Parameter | Meaning |
| --- | --- |
| -Xms | Initial heap size (unit m or g) |
| -Xmx or -XX:MaxHeapSize=size | Maximum heap size, generally not more than 80% of physical memory |
| -Xmn or (-XX:NewSize=size + -XX:MaxNewSize=size) | Young generation size |
| -XX:InitialSurvivorRatio=ratio and -XX:+UseAdaptiveSizePolicy | Survivor area ratio (dynamic) |
| -XX:SurvivorRatio=ratio | Survivor area ratio |
| -XX:MaxTenuringThreshold=threshold | Age threshold for promotion to the old generation |
| -XX:+PrintTenuringDistribution | Promotion details |
| -XX:+PrintGCDetails -verbose:gc | GC details |
| -XX:+ScavengeBeforeFullGC | Run a Minor GC before a Full GC |
| -XX:PermSize | Initial size of the non-heap memory (permanent generation, before JDK 8); for a typical application an initial 200m and a maximum of 1024m is enough |
| -XX:MaxPermSize | Maximum allowed size of the non-heap memory (permanent generation, before JDK 8) |
| -XX:SurvivorRatio=8 | Capacity ratio of the Eden area to one Survivor area in the young generation; the default is 8, that is, 8:1 |
| -XX:+DisableExplicitGC | Disable explicit System.gc() calls |
| -XX:+CollectGen0First | Whether to run a young generation GC first during a Full GC; the default is false |
| -XX:TLABWasteTargetPercent | Percentage of the Eden area used by TLAB; the default is 1% |
| -Xnoclassgc | Disable garbage collection of classes |
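Whether the parameters actually took effect can be checked from inside the program with standard APIs. The sketch below uses `Runtime.maxMemory()` and the runtime MXBean; `JvmSettingsSketch` is a hypothetical class name. Note that the reported maximum is only roughly the -Xmx value, since some collectors exclude part of the survivor space from it.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

final class JvmSettingsSketch {
    // Maximum heap the JVM will attempt to use, in MB.
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    // Arguments passed to the JVM itself, such as -Xmx20m or -XX:+UseSerialGC.
    static List<String> jvmArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB): " + maxHeapMb());
        System.out.println("JVM arguments: " + jvmArguments());
    }
}
```

Running it with, for example, `java -Xms20M -Xmx20M -Xmn10M JvmSettingsSketch` should list those three flags in the second line.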

TLAB memory

TLAB stands for Thread Local Allocation Buffer. As the name suggests, it is a memory allocation area dedicated to a thread, created to speed up object allocation.

Each thread gets its own TLAB, which is an exclusive working area for that thread. The Java virtual machine uses the TLAB to avoid contention between threads and to make object allocation more efficient.

A TLAB is generally not very large. When a large object cannot be allocated in the TLAB, it is allocated directly on the heap, outside the TLAB
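The idea behind a TLAB can be illustrated with a bump-pointer buffer. This is a toy model, not HotSpot's implementation: `TlabSketch` is a hypothetical class, sizes are abstract units, and a return value of -1 stands for "fall back to allocation in the shared heap".

```java
final class TlabSketch {
    private final int capacity;
    private int top = 0; // offset of the first unused unit in the buffer

    TlabSketch(int capacity) { this.capacity = capacity; }

    // Bump-pointer allocation: no locking is needed because one thread owns the buffer.
    // Returns the offset of the new object, or -1 if it does not fit.
    int allocate(int size) {
        if (size > capacity - top) {
            return -1; // too large for the remaining space: allocate on the shared heap
        }
        int offset = top;
        top += size;
        return offset;
    }

    public static void main(String[] args) {
        TlabSketch tlab = new TlabSketch(100); // each thread would own one such buffer
        System.out.println(tlab.allocate(40)); // prints 0
        System.out.println(tlab.allocate(40)); // prints 40
        System.out.println(tlab.allocate(40)); // prints -1
        System.out.println(tlab.allocate(20)); // prints 80
    }
}
```

Allocation inside the buffer is just a comparison and an addition, which is why a per-thread buffer speeds up object creation.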

| Parameter | Meaning |
| --- | --- |
| -XX:+UseTLAB | Use TLAB |
| -XX:TLABSize=size | Set the TLAB size |
| -XX:TLABRefillWasteFraction | Limits the size of a single object allowed into the TLAB space. It is a fraction with a default of 64: an object larger than 1/64 of the space is created on the heap |
| -XX:+PrintTLAB | View TLAB information |
| -XX:+ResizeTLAB | Automatically adjust the TLABRefillWasteFraction threshold |

/**
 * Generational collection
 * -Xms20M -Xmx20M -Xmn10M -XX:+UseSerialGC -XX:+PrintGCDetails -verbose:gc -XX:-ScavengeBeforeFullGC
 */
public class GC5Generational {
    private static final int _512KB = 512 * 1024;
    private static final int _1MB = 1024 * 1024;
    private static final int _6MB = 6 * 1024 * 1024;
    private static final int _7MB = 7 * 1024 * 1024;
    private static final int _8MB = 8 * 1024 * 1024;

    // -Xms20M -Xmx20M -Xmn10M -XX:+UseSerialGC -XX:+PrintGCDetails -verbose:gc -XX:-ScavengeBeforeFullGC
    public static void main(String[] args) throws InterruptedException {
        // Allocate byte arrays of the sizes above here to observe the cases listed below
    }
}

Allocation in Eden

Objects that survive in Eden are moved to the survivor space, and From and To are swapped

Large objects are promoted directly to the old generation

OOM (OutOfMemoryError)

Origin blog.csdn.net/MinggeQingchun/article/details/127089533