13. JVM: garbage collection related concepts


1. Understanding System.gc()

By default, calling System.gc() or Runtime.getRuntime().gc() explicitly triggers a Full GC, which collects both the old and young generations and tries to release the memory occupied by discarded objects.

However, System.gc() comes with a disclaimer: the call is not guaranteed to run the garbage collector. (System.gc() only tells the JVM that a collection is wanted; whether and when one actually runs is up to the JVM.)

JVM implementers decide how their JVM responds to System.gc() calls. Under normal circumstances garbage collection should happen automatically and does not need to be triggered manually, which would be far too tedious. In a few special cases, such as writing a performance benchmark, we can call System.gc() between runs.

Example:

package _03;

public class _77_SystemGCTest {
    public static void main(String[] args) {
        new _77_SystemGCTest();
        // Hint to the JVM's garbage collector to run a GC; it is not guaranteed to run right away.
        // Has the same effect as Runtime.getRuntime().gc().
        System.gc();

//        System.runFinalization(); // Force finalize() to run on objects that have lost all references
    }

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("SystemGCTest overrode finalize()");
    }
}

Across multiple runs, the finalize() message is sometimes printed and sometimes not. After System.gc() is called, the collection may not run promptly, and the program may exit before any GC happens.

Uncomment System.runFinalization(); and run the program several times: the finalize() message is now printed on every run.

2. Memory overflow and memory leak

2.1. Out of memory (OOM)

Although memory overflow is easier to understand than a memory leak, it is likewise one of the main culprits behind program crashes.

Because GC has been improving steadily, OOM is unlikely under normal circumstances unless the application's memory usage grows so fast that garbage collection can no longer keep up with memory consumption.

In most cases the GC collects the various generations as needed. If that is still not enough, it escalates to an exclusive Full GC, which reclaims a large amount of memory for the application to keep using.

The javadoc explains OutOfMemoryError as follows: there is no free memory, and the garbage collector cannot make any more available. (By analogy: the house has no room left, and if things still do not fit after tidying up, nothing more can be put in. That is OOM.)

First, consider what "no free memory" means: the heap of the Java virtual machine is not large enough. There are two causes:

(1) The heap size configured for the Java virtual machine is too small.

For example, there may be a memory leak, or, just as likely, the heap size is unreasonable: we need to process a considerable amount of data, but the JVM heap size is not explicitly specified or the specified value is too small. It can be adjusted with the -Xms and -Xmx parameters.

(2) The code creates many large objects that the garbage collector cannot reclaim for a long time (because they are still referenced).
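To check which heap limits a running JVM actually ended up with, the standard Runtime API can be queried. The following is a minimal sketch; the class name HeapInfo and its method names are made up for illustration.

```java
// Minimal sketch: report the heap limits the running JVM is using.
// HeapInfo is a hypothetical helper name, not part of any library.
class HeapInfo {
    /** Upper bound the heap may grow to (roughly the effective -Xmx). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently reserved by the JVM (starts near -Xms and may grow). */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Part of the committed heap that is currently occupied. */
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max       = " + maxHeapBytes() / mb + " MB");
        System.out.println("committed = " + committedHeapBytes() / mb + " MB");
        System.out.println("used      = " + usedHeapBytes() / mb + " MB");
    }
}
```

Running it with different settings, for example java -Xms256m -Xmx1g HeapInfo, shows how the two parameters change the reported values.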

In older versions of Oracle JDK, the permanent generation has a limited size and the JVM collects it very reluctantly (for example, recycling the constant pool or unloading types that are no longer needed). So when we keep adding new types, OutOfMemoryError in the permanent generation is common, especially when many dynamic types are generated at runtime. Likewise, an interned-string cache that takes up too much space can cause OOM. The corresponding exception message names the permanent generation: "java.lang.OutOfMemoryError: PermGen space".

With the introduction of the metaspace, method-area memory is no longer so constrained, so the corresponding OOM has changed. When it occurs, the exception message becomes "java.lang.OutOfMemoryError: Metaspace". Insufficient direct memory (native memory) can also cause OOM.

The implication is that before an OutOfMemoryError is thrown, the garbage collector is usually triggered and does its best to free space.

  • For example, as the analysis of the reference mechanism shows, the JVM tries to reclaim objects that are reachable only through soft references.
  • In the java.nio.Bits.reserveMemory() method, we can clearly see that System.gc() is called to free space.

Of course, the garbage collector is not triggered in every case.

  • For example, if we allocate a very large object, such as a huge array that exceeds the maximum size of the heap, the JVM can determine that garbage collection cannot solve the problem, so it throws an OutOfMemoryError directly.
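The last case can be observed with a small probe. This is a sketch under assumptions: the class HugeAllocation is hypothetical, and the behavior described is that of a HotSpot-style JVM, where an array of Integer.MAX_VALUE elements is larger than the VM allows and is refused up front.

```java
// Sketch: ask for an array that cannot possibly fit and observe the immediate refusal.
// HugeAllocation is a hypothetical name; HotSpot-like behavior is assumed.
class HugeAllocation {
    /** Returns true if the JVM refused the allocation with OutOfMemoryError. */
    static boolean isRejected(int length) {
        try {
            long[] block = new long[length];
            // Using the array keeps the allocation from being optimized away;
            // this comparison is false whenever the allocation succeeded.
            return block.length != length;
        } catch (OutOfMemoryError e) {
            System.out.println("Rejected: " + e.getMessage());
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected(16));                // small array: allocation succeeds
        System.out.println(isRejected(Integer.MAX_VALUE)); // about 16 GiB of longs
    }
}
```

Catching OutOfMemoryError is acceptable in a probe like this because nothing was allocated; in application code it is rarely a sound recovery strategy.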

2.2. Memory Leak

(A memory leak may lead to memory overflow (OOM), but does not necessarily do so.)

Also known as a "storage leak". Strictly speaking, a memory leak occurs only when objects will never be used by the program again, yet the GC cannot reclaim them. (It is a bit like buying an apartment: you cannot use the shared areas such as stairwells and walls, but you still pay for them.)

In practice, though, bad habits (or oversights) often make an object's life cycle very long, sometimes to the point of OOM. This too can be called a "memory leak" in the broad sense.

A memory leak does not crash the program immediately, but once it occurs, the memory available to the program is gradually eaten away until it is exhausted. Eventually an OutOfMemoryError is thrown and the program crashes.

Note that the storage space here refers not to physical memory but to virtual memory, whose size depends on the size of the disk swap area.

Examples:

1. Singleton pattern

A singleton lives as long as the application, so if a singleton holds a reference to an external object, that object cannot be reclaimed, which leads to a memory leak.

2. Resources that provide close() are not closed, causing memory leaks.

Database connections (dataSource.getConnection()), network connections (sockets), and I/O connections must be closed manually; otherwise they cannot be reclaimed.
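For the second case, try-with-resources is the usual way to make sure close() is always called. The sketch below uses a made-up TrackedResource class in place of a real connection so that the number of open handles can be counted.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: TrackedResource is a hypothetical stand-in for a connection or stream.
class TrackedResource implements AutoCloseable {
    private static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed = false;

    TrackedResource() {
        OPEN.incrementAndGet(); // one more handle is open
    }

    String read() {
        if (closed) {
            throw new IllegalStateException("resource is closed");
        }
        return "data";
    }

    @Override
    public void close() {
        if (!closed) {
            closed = true;
            OPEN.decrementAndGet(); // handle released
        }
    }

    static int openCount() {
        return OPEN.get();
    }

    /** try-with-resources calls close() even if read() throws. */
    static String useSafely() {
        try (TrackedResource r = new TrackedResource()) {
            return r.read();
        }
    }
}
```

After useSafely() returns, openCount() is back to 0. If the resource were created without try-with-resources and close() were forgotten, the counter would keep growing, which is the leak described above.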

3. Stop The World

Stop-the-World, or STW for short, is the application pause that occurs during a GC event. During the pause, all application threads are suspended and nothing responds, which feels as if the program has frozen.

  • Enumerating the root nodes (GC Roots) in the reachability analysis algorithm pauses all Java execution threads. (GC Roots keep changing, so STW is required.)

    • The analysis must be performed on a consistent snapshot.
    • Consistency means that the whole execution system appears frozen at a single point in time for the duration of the analysis.
    • If object reference relationships kept changing during the analysis, the accuracy of its results could not be guaranteed.

Application threads suspended by STW resume once the GC completes. Frequent pauses feel to the user like a video stuttering on a slow network, so we need to reduce how often STW occurs.

STW events do not depend on which collector is used; every collector has them.

Even G1 cannot avoid Stop-the-World entirely. All that can be said is that collectors keep getting better, collection keeps getting more efficient, and pause times are kept as short as possible.

STW is initiated and completed automatically by the JVM in the background; all of the user's normal worker threads are stopped without the user being aware of it.

Do not call System.gc() during development; it causes Stop-the-World pauses.

Example:

package _03;

import java.util.ArrayList;
import java.util.List;

public class _79_StopTheWorldDemo {
    public static class WorkThread extends Thread {
        List<byte[]> list = new ArrayList<byte[]>();

        public void run() {
            try {
                while (true) {
                    for(int i = 0;i < 1000;i++){
                        byte[] buffer = new byte[1024];
                        list.add(buffer);
                    }

                    if(list.size() > 10000){
                        list.clear();
                        // Triggers a Full GC and therefore an STW event; the pause makes
                        // the printed times drift away from the one-second interval.
                        System.gc();
                    }
                }
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    public static class PrintThread extends Thread {
        public final long startTime = System.currentTimeMillis();

        public void run() {
            try {
                while (true) {
                    // Print the elapsed time once per second
                    long t = System.currentTimeMillis() - startTime;
                    System.out.println(t / 1000 + "." + t % 1000);
                    Thread.sleep(1000);
                }
            } catch (Exception ex) {
                ex.printStackTrace();
            }
        }
    }

    public static void main(String[] args) {
        WorkThread w = new WorkThread();
        PrintThread p = new PrintThread();
//        w.start(); // uncomment to start the allocating thread
        p.start();
    }
}

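Running it as is (with w.start() commented out), there are no noticeable STW pauses, and the interval between adjacent printed times stays at roughly 1 second. (The part after the dot is the millisecond remainder without zero padding, so 3.1 means 3 s 1 ms.)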

0.0
1.0
2.0
3.1
4.2
5.3
6.4
7.4
8.4
...

Uncomment w.start() and run again.

Now the program stalls whenever STW occurs, and the times are no longer printed at one-second intervals, which indirectly demonstrates the existence of STW.

0.0
1.2
2.6
3.6
4.8
5.11
6.14
7.17
...
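Rather than inferring pauses from drifting timestamps, the JVM's own counters can be read through the standard java.lang.management API. The sketch below is only an illustration: GcStats is a made-up class name, and the reported time is the approximate accumulated collection time, which for concurrent collectors is not the same thing as STW pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read the JVM's own GC counters. GcStats is a hypothetical name.
class GcStats {
    /** Total collections reported by all collectors; undefined values (-1) are skipped. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] garbage = new byte[1024]; // short-lived allocation
            checksum += garbage.length;
        }
        System.out.println("allocated bytes: " + checksum);
        System.out.println("collections: " + (totalCollections() - countBefore));
        System.out.println("gc millis:   " + (totalCollectionMillis() - millisBefore));
    }
}
```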

4. Parallelism and concurrency of garbage collection

4.1. Concurrent

In operating systems, concurrency means that over a given period of time several programs are all somewhere between having started and having finished, and they all run on the same processor.

Concurrency is not "simultaneous" in the true sense. The CPU divides a period of time into several time slices and switches back and forth among the programs across these slices. Because the CPU is very fast, as long as the time slices are scheduled properly, users feel that multiple applications are running at the same time. (Like the earlier example of eating while talking on the phone.)

4.2. Parallel

When a system has more than one CPU, one CPU can execute one process while another CPU executes a different process. The two processes do not compete for each other's CPU resources and can proceed at the same time. This is called parallelism.

In fact, what determines parallelism is not the number of CPUs but the number of CPU cores; for example, the multiple cores of a single CPU can run in parallel.

Parallelism suits scenarios with little user interaction, such as scientific computing and background processing.

4.3. Concurrency vs Parallelism

Comparing the two:

Concurrency means multiple things happening within the same time period.

Parallelism means multiple things happening at the same instant.

Concurrent tasks compete with one another for resources.

Parallel tasks do not compete with one another for resources.

Parallelism occurs only with multiple CPUs or with multiple cores on one CPU. Otherwise, things that appear to happen at the same time are actually executed concurrently.
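In Java terms, the distinction shows up in how many worker threads can really run at once. The sketch below (ParallelSum is a made-up name, and small values of n are assumed) splits a sum across a thread pool. With one thread the chunks run one after another; with several threads they run concurrently, and truly in parallel only if the machine has enough cores, which Runtime.getRuntime().availableProcessors() reports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch: ParallelSum is a hypothetical example class.
class ParallelSum {
    /** Sums 1..n by splitting the range into one chunk per worker thread. */
    static long sum(int n, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            int chunk = (n + threads - 1) / threads; // ceiling division
            for (int t = 0; t < threads; t++) {
                final int from = t * chunk + 1;
                final int to = Math.min(n, (t + 1) * chunk);
                Callable<Long> task = () -> {
                    long s = 0;
                    for (int i = from; i <= to; i++) {
                        s += i;
                    }
                    return s;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // wait for each chunk
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        System.out.println("cores = " + cores);
        System.out.println(sum(1000, 1));     // chunks run one after another
        System.out.println(sum(1000, cores)); // chunks may run in parallel
    }
}
```

Both calls return the same result; only the way the work is scheduled differs.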

4.4. Parallelism and concurrency of garbage collection

In the context of garbage collectors, concurrency and parallelism can be explained as follows:

  • Parallel: multiple garbage collection threads work in parallel, while the user threads remain in a waiting state.

    • Examples: ParNew, Parallel Scavenge, Parallel Old.
  • Serial

    • In contrast to parallel: collection runs on a single thread.
    • If memory runs short, the program is paused and the JVM's garbage collector starts collecting. Once collection finishes, the program threads are started again.
  • Concurrent: the user threads and the garbage collection threads execute at the same time (not necessarily in parallel; they may alternate). The garbage collection threads do not pause the user program while they run.

    • The user program keeps running while the garbage collection thread runs on another CPU.
    • Examples: CMS, G1.
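To see which of these collectors a given JVM is actually running, the names exposed by the standard GarbageCollectorMXBean interface can be listed. This is a small sketch; CollectorNames is a made-up class name, and the exact names printed depend on the JVM version and the selected collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Sketch: list the collectors active in this JVM. CollectorNames is a hypothetical name.
class CollectorNames {
    static List<String> active() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(active());
    }
}
```

Starting the JVM with -XX:+UseSerialGC, -XX:+UseParallelGC, or -XX:+UseG1GC and comparing the printed names is a quick way to confirm which collector is in effect.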

5. Safepoints and safe regions

(User threads can stop only at safepoints or in safe regions.)

5.1. Safepoint

A running program cannot pause for GC at arbitrary places; it can do so only at specific locations, which are called "safepoints".

The choice of safepoints matters. Too few, and the GC may wait too long; too many, and runtime performance suffers. Most instructions execute very quickly, so the usual criterion is "whether the instruction has the characteristic of letting the program run for a long time". For example, instructions that can lead to long execution, such as method calls, loop jumps, and exception jumps, are chosen as safepoints.

When a GC occurs, how is it ensured that every thread runs to the nearest safepoint and stops there?

  • Preemptive suspension (no virtual machine currently uses it): first interrupt all threads. If a thread is not at a safepoint, resume it and let it run to one.
  • Voluntary suspension: set an interrupt flag. Each thread polls this flag when it reaches a safepoint and, if the flag is true, suspends itself.
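The flag-polling approach can be imitated at the application level. The following sketch is only an analogy written with ordinary Java monitors, not how the JVM implements safepoints internally; the class SafepointPollSketch and all of its members are hypothetical.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Application-level analogy of flag polling; not the JVM's real mechanism.
class SafepointPollSketch {
    private final Object lock = new Object();
    private volatile boolean pauseRequested = false; // the "interrupt flag"
    private int parked = 0;                          // guarded by lock

    /** Called by workers at their "safepoints"; parks the caller while a pause is requested. */
    void poll() throws InterruptedException {
        if (!pauseRequested) {
            return; // fast path: a single volatile read
        }
        synchronized (lock) {
            parked++;
            lock.notifyAll(); // tell the requester that one more thread has stopped
            while (pauseRequested) {
                lock.wait();
            }
            parked--;
        }
    }

    /** Raises the flag and blocks until the given number of workers have parked. */
    void pauseAll(int workers) throws InterruptedException {
        synchronized (lock) {
            pauseRequested = true;
            while (parked < workers) {
                lock.wait();
            }
        }
    }

    /** Lowers the flag and wakes every parked worker. */
    void resumeAll() {
        synchronized (lock) {
            pauseRequested = false;
            lock.notifyAll();
        }
    }

    /** Returns how far the worker's counter advanced while the "world" was stopped. */
    static long demo() {
        SafepointPollSketch sp = new SafepointPollSketch();
        AtomicLong counter = new AtomicLong();
        AtomicBoolean done = new AtomicBoolean(false);
        Thread worker = new Thread(() -> {
            try {
                while (!done.get()) {
                    counter.incrementAndGet(); // "mutator" work
                    sp.poll();                 // poll at the loop back-edge
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        worker.start();
        try {
            sp.pauseAll(1);               // returns once the worker is parked
            long before = counter.get();
            Thread.sleep(50);             // the "GC" would do its work here
            long after = counter.get();
            done.set(true);
            sp.resumeAll();
            worker.join();
            return after - before;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

demo() returns 0 because the worker cannot advance its counter between pauseAll() and resumeAll(). The worker stops only where it calls poll(), which mirrors why the placement of safepoints affects how long a pause request has to wait.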

5.2. Safe Region

The safepoint mechanism guarantees that an executing program will, within a short time, reach a safepoint where GC can begin. But what about a program that is "not executing"? For example, a thread in the Sleep or Blocked state cannot respond to the JVM's interrupt request and "walk" to a safepoint to suspend itself, and the JVM is unlikely to wait for the thread to be woken up. This situation calls for a safe region.

A safe region is a stretch of code in which object reference relationships do not change, so it is safe to start GC anywhere within it. A safe region can also be thought of as a stretched-out safepoint.

In actual execution:

  • 1. When a thread reaches safe-region code, it first marks itself as having entered the safe region. If a GC occurs during this time, the JVM ignores threads marked as being in a safe region.
  • 2. When the thread is about to leave the safe region, it checks whether the JVM has completed the GC. If so, it continues running; otherwise it must wait until it receives a signal that it is safe to leave the safe region.
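The two steps above can be sketched with plain Java monitors. As before, this is an application-level analogy under stated assumptions, not JVM internals: SafeRegionSketch and its methods are hypothetical, and a latch stands in for a sleeping or blocked thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Application-level analogy of a safe region; not the JVM's real mechanism.
class SafeRegionSketch {
    private final Object lock = new Object();
    private boolean gcInProgress = false;                  // guarded by lock
    private final List<String> events = new ArrayList<>(); // guarded by lock

    /** Step 1: the thread marks itself as being inside a safe region. */
    void enterSafeRegion() {
        synchronized (lock) {
            events.add("entered");
        }
    }

    /** Step 2: a leaving thread must wait while a collection is still running. */
    void leaveSafeRegion() throws InterruptedException {
        synchronized (lock) {
            while (gcInProgress) {
                lock.wait();
            }
            events.add("left");
        }
    }

    /** The "GC" starts without waiting for threads that are in a safe region. */
    void beginGc() {
        synchronized (lock) {
            gcInProgress = true;
            events.add("gc-begin");
        }
    }

    void endGc() {
        synchronized (lock) {
            events.add("gc-end");
            gcInProgress = false;
            lock.notifyAll(); // signal that leaving is safe again
        }
    }

    static List<String> demo() {
        SafeRegionSketch region = new SafeRegionSketch();
        CountDownLatch entered = new CountDownLatch(1);
        CountDownLatch wakeUp = new CountDownLatch(1);
        Thread sleeper = new Thread(() -> {
            try {
                region.enterSafeRegion();
                entered.countDown();
                wakeUp.await();           // stands in for sleep or blocked I/O
                region.leaveSafeRegion(); // may have to wait for gc-end
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        sleeper.start();
        try {
            entered.await();
            region.beginGc();
            wakeUp.countDown(); // the sleeper wakes while the "GC" may still be running
            region.endGc();
            sleeper.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (region.lock) {
            return new ArrayList<>(region.events);
        }
    }
}
```

Whichever thread is scheduled first after the wake-up, the recorded order is entered, gc-begin, gc-end, left: the sleeper can never leave the region before the collection has ended.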

6. References revisited

We would like to describe a class of objects that can stay in memory while there is enough space, but that can be discarded if memory is still tight after a garbage collection.

[An obscure yet very frequently asked interview question] What are the differences among strong, soft, weak, and phantom references? What is each one used for?

Since JDK 1.2, Java has extended the concept of references and divided them into:

  • Strong Reference
  • Soft Reference
  • Weak Reference
  • Phantom Reference

The strength of these four reference types decreases in that order. Apart from strong references, the other three can be found in the java.lang.ref package, each with a corresponding class that developers can use directly in applications.

Among the Reference subclasses, only the finalizer reference is package-private. The other three reference types are public and can be used directly in applications.

  • Strong Reference: the most traditional definition of "reference", meaning the reference assignments found everywhere in program code, such as "Object obj = new Object()". As long as the strong reference relationship exists, the garbage collector will never reclaim the referenced object.
  • Soft Reference: before the system is about to run out of memory, these objects are included in the collection scope for a second collection. If that collection still does not free enough memory, a memory overflow exception is thrown. (Reclaimed when memory is insufficient; kept when memory is sufficient.)
  • Weak Reference: objects associated only with weak references survive only until the next garbage collection. When the garbage collector runs, they are reclaimed regardless of whether memory is sufficient. (Reclaimed as soon as they are discovered, even when memory is sufficient.)
  • Phantom Reference: whether an object has a phantom reference has no effect at all on its lifetime, and an object instance cannot be obtained through a phantom reference. The only purpose of associating a phantom reference with an object is to receive a system notification when the object is reclaimed by the collector. (Tracking object reclamation.)

6.1. Strong references

In Java programs, the most common reference type is the strong reference (well over 99% of references in ordinary systems are strong). It is the ordinary object reference we use every day and the default reference type.

When you create a new object with the new operator and assign it to a variable, that variable becomes a strong reference to the object.

A strongly referenced object is reachable, and the garbage collector will never reclaim it.

An ordinary object with no other references can be collected as garbage once its reference goes out of scope or the corresponding (strong) reference is explicitly set to null. Exactly when it is collected still depends on the garbage collection strategy.

By contrast, objects held only by soft, weak, or phantom references are softly, weakly, or phantom reachable and can be reclaimed under certain conditions. This is why strong references are one of the main causes of Java memory leaks.

Example:

package _03;

public class _80_StrongReferenceTest {
    public static void main(String[] args) {
        StringBuffer str = new StringBuffer ("Hello WORD");
        StringBuffer str1 = str;

        str = null;
        System.gc();

        try {
            Thread.sleep(3000);
        } catch (InterruptedException e) {
            e.printStackTrace();
        }

        System.out.println(str1);
    }
}

Running it shows that the object was not reclaimed:

Hello WORD

Summary:

Both references in this example are strong references. Strong references have the following characteristics:

  • A strong reference gives direct access to the target object.
  • The object a strong reference points to is never reclaimed by the system. The virtual machine would rather throw an OOM error than reclaim an object that is strongly referenced.
  • Strong references can cause memory leaks.

6.2. Soft references

(Objects held by soft or weak references are kept when there is enough space and cleared when there is not. Caches, for example, can use either.)

Soft references describe objects that are useful but not essential. Before the system throws a memory overflow exception, objects associated only with soft references are included in the collection scope for a second collection. If that collection still does not free enough memory, the memory overflow exception is thrown.

Soft references are often used to implement memory-sensitive caches. For example, a cache built on soft references keeps its entries while memory is free and has them cleared when memory runs short, so the cache can be used without exhausting memory. (Some of MyBatis's internal classes use soft references.)

When the garbage collector decides at some point to reclaim a softly reachable object, it clears the soft reference and optionally places the reference in a reference queue.

Soft references resemble weak references, except that the Java virtual machine tries to keep a soft reference alive longer and clears it only as a last resort.

Since JDK 1.2, the java.lang.ref.SoftReference class has implemented soft references.

// Declare a strong reference
Object obj = new Object();
// Create a soft reference
SoftReference<Object> sf = new SoftReference<>(obj);
obj = null; // Drop the strong reference

(As noted earlier, these objects are all still reachable in some sense: objects held only by soft, weak, or phantom references are softly, weakly, or phantom reachable.)
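A memory-sensitive cache of the kind described in this section could look roughly like the sketch below. SoftCache is a made-up class written for illustration; it is not MyBatis's implementation.

```java
import java.lang.ref.SoftReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a soft-reference cache. SoftCache is a hypothetical class.
class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> map = new ConcurrentHashMap<>();

    void put(K key, V value) {
        map.put(key, new SoftReference<>(value));
    }

    /** Returns the cached value, or null if it was never cached or the GC has cleared it. */
    V get(K key) {
        SoftReference<V> ref = map.get(key);
        if (ref == null) {
            return null;
        }
        V value = ref.get();
        if (value == null) {
            map.remove(key, ref); // the referent was reclaimed; drop the stale entry
        }
        return value;
    }
}
```

A caller has to treat a null result as a cache miss and recompute or reload the value, because the collector may clear an entry at any time once memory becomes tight.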

6.3. Weak references

Weak references also describe non-essential objects. Objects associated only with weak references survive only until the next garbage collection. During a GC, as soon as such an object is discovered it is reclaimed, regardless of whether heap space is sufficient.

(Reclaimed on discovery.)

However, because the garbage collector thread usually has a very low priority, objects held only by weak references may not be discovered quickly, in which case they can survive for a longer time.

Like soft references, weak references can be given a reference queue when they are constructed. When the weakly referenced object is reclaimed, the reference is added to that queue, through which the object's reclamation can be tracked.

Soft and weak references are both well suited to holding cache data that is nice to have but dispensable. When system memory runs short, the cached data is reclaimed instead of causing a memory overflow. When memory is plentiful, the cached data can live for a long time and speed up the system.

Since JDK 1.2, the java.lang.ref.WeakReference class has implemented weak references.

// Declare a strong reference
Object obj = new Object();
// Create a weak reference
WeakReference<Object> wr = new WeakReference<>(obj);
obj = null; // Drop the strong reference

The biggest difference between weakly and softly referenced objects is that during a collection the GC uses an algorithm to decide whether to reclaim softly referenced objects, whereas it always reclaims weakly referenced objects. Weakly referenced objects are therefore reclaimed more easily and sooner.

Interview question: have you ever used WeakHashMap in development?
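For readers who have not used it, java.util.WeakHashMap holds its keys through weak references, so an entry can disappear once its key is no longer strongly reachable. The sketch below uses a made-up class name. Only the first method has a guaranteed outcome; the second depends on when the collector runs.

```java
import java.util.Map;
import java.util.WeakHashMap;

// Sketch: WeakHashMapSketch is a hypothetical example class.
class WeakHashMapSketch {
    /** While the key is strongly reachable, the entry must stay in the map. */
    static int sizeWhileKeyHeld() {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "metadata");
        System.gc(); // only a hint, and it cannot remove a reachable key
        int size = map.size();
        if (!map.containsKey(key)) { // this use also keeps 'key' reachable until here
            throw new IllegalStateException("entry vanished while key was reachable");
        }
        return size;
    }

    /** After the last strong reference is dropped, the entry usually disappears after a GC. */
    static boolean clearedAfterKeyDropped() {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "metadata");
        key = null; // the key is now only weakly reachable
        for (int attempt = 0; attempt < 10 && !map.isEmpty(); attempt++) {
            System.gc();
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return map.isEmpty(); // likely true, but not guaranteed
    }
}
```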

6.4. Phantom references

(Tracking object reclamation.)

Also called a "ghost reference", the phantom reference (sometimes translated as "virtual reference") is the weakest of all reference types.

Whether an object has a phantom reference has no bearing at all on its life cycle. An object that is held only by a phantom reference is almost the same as an object with no references and may be collected at any time.

A phantom reference cannot be used on its own, nor can the referenced object be obtained through it: calling get() on a phantom reference always returns null.

The only purpose of associating a phantom reference with an object is to track the garbage collection process, for example to receive a system notification when the object is reclaimed by the collector.

Phantom references must be used together with a reference queue: a reference queue must be supplied as a parameter when the phantom reference is created. When the garbage collector is about to reclaim an object and finds that it still has a phantom reference, it adds the phantom reference to the reference queue after reclaiming the object, which notifies the application that the object has been reclaimed.

Since phantom references can track when an object is reclaimed, resource-release operations can also be performed and recorded through them.

Since JDK 1.2, the java.lang.ref.PhantomReference class has implemented phantom references.

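// Declare a strong reference
Object obj = new Object();
// Declare the reference queue
ReferenceQueue<Object> phantomQueue = new ReferenceQueue<>();
// Declare the phantom reference (the reference queue must be passed in)
PhantomReference<Object> pr = new PhantomReference<>(obj, phantomQueue);
obj = null; // Drop the strong reference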
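To show how the queue is actually consumed, here is a slightly fuller sketch. PhantomSketch is a made-up class name, and Reference.reachabilityFence requires Java 9 or later. Only the first method has a guaranteed result; the second depends on the collector honoring the System.gc() hint within the timeout.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Sketch: PhantomSketch is a hypothetical example class (Java 9+ assumed).
class PhantomSketch {
    /** get() returns null even while the referent is alive, and nothing is enqueued yet. */
    static boolean getIsAlwaysNull() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        boolean result = ref.get() == null && queue.poll() == null;
        Reference.reachabilityFence(referent); // keep the referent alive up to this point
        return result;
    }

    /**
     * Drops the referent, hints a GC, and waits up to timeoutMillis (must be positive;
     * 0 would wait forever) for the phantom reference to appear in the queue.
     */
    static boolean enqueuedAfterGc(long timeoutMillis) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object referent = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(referent, queue);
        referent = null;                                   // only phantom reachable now
        System.gc();                                       // a hint, not a guarantee
        Reference<?> polled = queue.remove(timeoutMillis); // null if the wait timed out
        return polled == ref;
    }
}
```

In a real application the queue would typically be drained by a dedicated thread, which performs the resource-release work for each reference it receives.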

6.5. Finalizer references

Finalizer references are used to implement an object's finalize() method.

No manual coding is required; they work internally together with a reference queue.

During GC, finalizer references are enqueued (similar to phantom references). The Finalizer thread finds the referenced object through the finalizer reference and calls its finalize() method. The referenced object can be reclaimed only at the second GC.

Origin blog.csdn.net/weixin_47465999/article/details/127104039