JVM Garbage Collection: Heap and Method Area

GC of HotSpot JVM heap https://www.cnblogs.com/shudonghe/p/3457990.html

1 Overview

Garbage Collection (GC) has accompanied Java since its inception, but GC is actually older than Java: it dates back to 1960, when Lisp, developed at MIT, first used dynamic memory allocation and garbage collection. After nearly 60 years of development, dynamic memory allocation and reclamation techniques are now very mature. Garbage collection is fully automated, and automatic reclamation has been optimized repeatedly over successive iterations, so its efficiency and performance are quite good.

Why do you need to know about GC?

You need to monitor and tune the GC mechanism when troubleshooting memory overflows and memory leaks, when tuning program performance, and when garbage collection becomes a performance bottleneck under high concurrency.

2 How to identify garbage objects?

Objects that are no longer in use are considered "garbage" and should be reclaimed. In Java, GC primarily targets heap memory. The Java language has references rather than pointers, and a heap object that can no longer be reached through any reference, for example from a stack frame, should be reclaimed.

2.1 Reference counting algorithm

Reference counting is one algorithm for deciding whether an object is alive. It adds a reference counter to each object: whenever a new reference to the object is created, the counter is incremented by 1; whenever a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used and will be reclaimed by the garbage collector.

Disadvantage: it cannot handle circular references between objects. When two objects reference each other, both counters stay at 1 even after nothing else refers to them. The objects should be reclaimed at the end of their life cycle but cannot be, which results in a memory leak.

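public class GcTest {

    public static void main(String[] args) {
        MyObject myObject_1 = new MyObject();
        MyObject myObject_2 = new MyObject();

        // Make the two objects reference each other
        myObject_1.instance = myObject_2;
        myObject_2.instance = myObject_1;

        // Drop the only external references to both objects
        myObject_1 = null;
        myObject_2 = null;

        // The objects form a reference cycle; under a reference counting
        // algorithm these two objects could not be reclaimed
        System.gc();
    }

    static class MyObject {
        Object instance;
    }
}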
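
To make the counter arithmetic concrete, here is a minimal sketch that simulates reference counting by hand for the cycle above. The Node class and its refCount field are hypothetical bookkeeping invented for this illustration; the JVM does not track objects this way.

```java
// Toy model of reference counting (an illustration only, not how HotSpot works).
class RefCountDemo {

    // Hypothetical object that records how many references point at it.
    static final class Node {
        int refCount = 0;
        Node instance; // a single outgoing reference, like MyObject.instance

        // Point the field at target and keep both counters up to date.
        void setInstance(Node target) {
            if (instance != null) {
                instance.refCount--;
            }
            instance = target;
            if (target != null) {
                target.refCount++;
            }
        }
    }

    // Replays the steps of the example and returns the two final counters.
    static int[] countsAfterDroppingLocals() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++; // the local variable myObject_1
        b.refCount++; // the local variable myObject_2

        a.setInstance(b); // myObject_1.instance = myObject_2
        b.setInstance(a); // myObject_2.instance = myObject_1

        a.refCount--; // myObject_1 = null
        b.refCount--; // myObject_2 = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        // Neither counter reaches 0, so a pure reference counter would keep both objects.
        System.out.println("a=" + counts[0] + ", b=" + counts[1]); // a=1, b=1
    }
}
```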

2.2 Reachability Analysis Algorithm

Mainstream implementations currently use reachability analysis to decide whether an object is alive. The basic idea: take a set of objects called "GC Roots" as starting points and search downward from them. The path traveled by the search is called a reference chain (Reference Chain). When no reference chain connects an object to any GC Root, the object is proven to be unusable.

Which objects can be used as GC Roots?

Objects referenced from the virtual machine stack (the local variable table in a stack frame);
objects referenced by static fields of classes in the method area;
objects referenced by constants in the method area;
objects referenced by JNI (native methods) in the native method stack;
objects referenced by active threads.
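
The search from GC Roots can be sketched as a plain graph traversal. This is a simplified model under stated assumptions: objects are represented by string ids, the reference graph is a map from an id to the ids it references, and the names reachable, root, a, b, x, and y are all made up for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilityDemo {

    // Returns every object id that can be reached from the given roots.
    static Set<String> reachable(Map<String, List<String>> references, List<String> roots) {
        Set<String> visited = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (!visited.add(current)) {
                continue; // already visited, which also stops cycles from looping forever
            }
            for (String next : references.getOrDefault(current, List.of())) {
                pending.push(next);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = Map.of(
                "root", List.of("a"),
                "a", List.of("b"),
                "x", List.of("y"),  // x and y reference each other...
                "y", List.of("x")); // ...but nothing reachable from a root points at them
        Set<String> live = reachable(refs, List.of("root"));
        System.out.println(live.contains("b")); // true
        System.out.println(live.contains("x")); // false
    }
}
```

Unlike reference counting, the isolated cycle between x and y is never visited, so both would be treated as garbage.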

3 Four kinds of references in Java

Before JDK 1.2, the definition of a reference in Java was very simple: if the value stored in a reference-type variable was the starting address of another block of memory, that variable was said to hold a reference to that memory. This definition is too narrow, because it cannot describe an object that sits somewhere between "referenced" and "unreferenced". Since JDK 1.2, Java has expanded the concept and divides references into four kinds: strong references (Strong Reference), soft references (Soft Reference), weak references (Weak Reference), and phantom references (Phantom Reference), in decreasing order of strength.

Strong Reference
Strong references are the ordinary references found everywhere in program code, for example a variable assigned an object created with the new keyword. As long as a strong reference to an object exists, the garbage collector will never reclaim that object.

Soft Reference
Soft references describe objects that are still useful but not essential. Before the system is about to run out of memory, these objects are included in the reclaimable set for a second round of collection. If that collection still does not free enough memory, an out-of-memory error is thrown. Soft references can be used to implement caches. When a softly referenced object is reclaimed, the soft reference is placed in the reference queue (ReferenceQueue) it was registered with, if any.

// Soft reference
SoftReference<String> softReference = new SoftReference<>("北风IT之路");

Weak Reference

Weak references also describe non-essential objects, but they are weaker than soft references. **Objects associated only with weak references survive only until the next GC. When the garbage collector runs, such objects are reclaimed regardless of whether memory is sufficient.** When a weakly referenced object is reclaimed, the weak reference is placed in the reference queue (ReferenceQueue) it was registered with, if any.

// Weak reference
WeakReference<String> weakReference = new WeakReference<>("北风IT之路");

Phantom Reference
Phantom references, also called ghost references, are the weakest kind of reference. Whether an object has phantom references does not affect its lifetime at all, and the object cannot be obtained through a phantom reference. The object may be reclaimed at any time. Phantom references are generally used as a sentinel to track when an object is reclaimed by the garbage collector. They must be used together with a ReferenceQueue.

// Phantom reference; must be used together with a reference queue
ReferenceQueue<String> referenceQueue = new ReferenceQueue<>();
PhantomReference<String> phantomReference = new PhantomReference<>("北风IT之路", referenceQueue);
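
The following sketch shows what get() reports for each reference kind. It deliberately avoids depending on when a GC actually runs: the referent stays strongly reachable throughout, and the weak reference is cleared by hand with clear() rather than by the collector. The class and method names are made up for this example.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {

    // Returns four observations made while the referent is still strongly held.
    static boolean[] observe() {
        Object payload = new Object(); // strong reference: keeps the object alive

        SoftReference<Object> soft = new SoftReference<>(payload);
        WeakReference<Object> weak = new WeakReference<>(payload);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(payload, queue);

        boolean softSeesObject = soft.get() == payload;
        boolean weakSeesObject = weak.get() == payload;
        // A phantom reference never hands out its referent.
        boolean phantomHidesObject = phantom.get() == null;

        // Clearing by hand stands in for what the collector does to a weak reference.
        weak.clear();
        boolean weakCleared = weak.get() == null;

        return new boolean[] { softSeesObject, weakSeesObject, phantomHidesObject, weakCleared };
    }

    public static void main(String[] args) {
        boolean[] r = observe();
        System.out.println("soft.get() returns the object:    " + r[0]); // true
        System.out.println("weak.get() returns the object:    " + r[1]); // true
        System.out.println("phantom.get() is always null:     " + r[2]); // true
        System.out.println("weak.get() is null after clear(): " + r[3]); // true
    }
}
```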

4 finalize() gives the object a second chance

An object found unreachable by reachability analysis is not necessarily reclaimed; it has one chance to be reborn. Every object is marked twice before it is reclaimed. The first mark happens when no reference chain connects the object to GC Roots. The second is a filter that checks whether the object overrides the finalize() method and therefore needs to have it executed.

If the JVM decides that the finalize() method must be executed, the object is placed in a queue called the F-Queue, and a low-priority finalizer thread created automatically by the virtual machine executes it later. "Execute" here means that the virtual machine triggers the method, **but it does not promise to wait for it to finish running.** This is a deliberate safeguard: if an object's finalize() method runs for a long time or enters an infinite loop, other objects in the F-Queue could wait forever, and the whole memory reclamation system could even break down. To resurrect an object in its finalize() method, you can do the following:

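public class GcTest {

    public static GcTest instance = null;

    @Override
    protected void finalize() throws Throwable {
        super.finalize();
        System.out.println("The collector is running finalize(); the object gets one chance to be reborn");
        // Re-attach the object to a GC root (a static field) so it is reachable again
        instance = this;
    }

    public static void main(String[] args) throws InterruptedException {
        instance = new GcTest();
        // Clear the reference; the heap object is now considered garbage
        instance = null;
        // Request a GC
        System.gc();
        // The finalizer thread has low priority, so wait briefly for it to run
        Thread.sleep(500);
        // Although a GC ran, the object may have been resurrected in finalize(),
        // so this may print the object instead of null
        System.out.println(instance);
        // Output of the original run: jvm.GcTest@7cc355be
    }
}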

Note: the system calls an object's finalize() method at most once. It is called only the first time the object is found unreachable, no matter how many GCs follow, so an object has only one chance to be reborn.

5 GC in the method area

If the string "abc" has entered the constant pool but no String object in the system currently refers to it, the constant should be reclaimed. Garbage collection in the method area (the permanent generation in the HotSpot virtual machine) mainly reclaims two kinds of content: obsolete constants and useless classes. The "abc" above is an example of an obsolete constant. So which classes count as useless?

All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap;
the ClassLoader that loaded the class has been reclaimed;
the java.lang.Class object for the class is not referenced anywhere, so the class's methods cannot be accessed through reflection anywhere.

Many people think that the method area (or the permanent generation in the HotSpot virtual machine) is not garbage collected. The Java virtual machine specification does say that a virtual machine is not required to implement garbage collection in the method area, and the "cost-effectiveness" of collecting there is generally low: in the heap, especially in the young generation, one garbage collection in a typical application can reclaim 70% to 95% of the space, while collection in the permanent generation is far less productive.

Permanent generation garbage collection mainly reclaims two kinds of content: obsolete constants and useless classes. Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Take literals in the constant pool as an example. Suppose the string "abc" has entered the constant pool, but no String object in the system has the value "abc". In other words, no String object references the "abc" constant in the constant pool, and nothing else references this literal either. If memory reclamation happens at this point and it is necessary, the system removes the "abc" constant from the constant pool. Symbolic references to other classes (interfaces), methods, and fields in the constant pool are handled similarly.

Deciding whether a constant is an "obsolete constant" is relatively simple, but the conditions for deciding whether a class is a "useless class" are much stricter. A class must meet all three of the following conditions to be considered a "useless class":

All instances of the class have been reclaimed, that is, no instance of the class exists in the Java heap.
The ClassLoader that loaded the class has been reclaimed.
The java.lang.Class object for the class is not referenced anywhere, so the class's methods cannot be accessed through reflection anywhere.

The virtual machine can reclaim useless classes that meet the three conditions above. Note that this is only "can": unlike objects, classes are not necessarily reclaimed once they are no longer used. HotSpot provides the -Xnoclassgc parameter to control whether classes are reclaimed. You can also use -verbose:class, -XX:+TraceClassLoading, and -XX:+TraceClassUnloading to view class loading and unloading information. -verbose:class and -XX:+TraceClassLoading work in Product builds of the virtual machine, while -XX:+TraceClassUnloading requires a FastDebug build.

In scenarios that make heavy use of reflection, dynamic proxies, bytecode frameworks such as CGLib, dynamically generated JSPs, and frameworks such as OSGi that frequently define custom ClassLoaders, the virtual machine needs class unloading to ensure that the permanent generation does not overflow.

6 Garbage Collection Algorithms

1. Mark-Sweep Algorithm (Mark-Sweep)

Algorithm idea: The algorithm has two phases, "mark" and "sweep". It first marks all objects that need to be reclaimed, and after marking completes it reclaims all marked objects in one pass.

Drawbacks:

Neither the marking phase nor the sweeping phase is very efficient;
memory fragmentation occurs easily, and too much fragmentation can make it impossible to allocate large objects.

Suited to cases where most objects survive.

Image source: https://cloud.tencent.com/developer/article/1336613
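
The fragmentation problem is easy to see in a toy model. In this sketch the heap is an array of slots holding object names, and the marking result is passed in as the set of surviving objects (marking the survivors and marking the garbage are complementary views of the same information). The slot model and all names are assumptions made for the illustration.

```java
import java.util.Arrays;
import java.util.Set;

class MarkSweepDemo {

    // Sweep phase: every slot whose object was not marked as live becomes free (null).
    static String[] sweep(String[] heap, Set<String> marked) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !marked.contains(result[i])) {
                result[i] = null;
            }
        }
        return result;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        String[] after = sweep(heap, Set.of("A", "C", "E"));
        System.out.println(Arrays.toString(after)); // [A, null, C, null, E, null]
        System.out.println(largestFreeRun(after));  // 1
    }
}
```

Three slots are free after the sweep, but no two of them are adjacent, so an object that needs two contiguous slots cannot be placed.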

2. Copying Algorithm (Copying)

Algorithm idea: Divide the available memory into two halves of equal size and use only one of them at a time. When that half is used up, copy the surviving objects to the other half, and then clean up the used half in one pass.

Drawback:

The available memory is reduced to half of the original.

The algorithm is efficient and suits cases where only a few objects survive.

Image source: https://cloud.tencent.com/developer/article/1336613
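
A minimal sketch of the copy step, using an array of slots as an assumed model of each half. The names copyLive, fromSpace, and toSpace are invented for the example.

```java
import java.util.Arrays;
import java.util.Set;

class CopyingDemo {

    // Copies the live objects of the from-space into the start of a fresh to-space.
    // The from-space can then be discarded as a whole.
    static String[] copyLive(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0; // allocation pointer in the to-space
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        return toSpace;
    }

    public static void main(String[] args) {
        String[] fromSpace = { "A", "B", "C", "D" };
        String[] toSpace = copyLive(fromSpace, Set.of("B", "D"));
        System.out.println(Arrays.toString(toSpace)); // [B, D, null, null]
    }
}
```

The work done is proportional to the number of survivors, and the free space in the to-space is contiguous, which is why the approach fits areas where few objects survive.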

3. Mark-Compact Algorithm (Mark-Compact)

Algorithm idea: The marking phase is the same as in mark-sweep, but the next step differs. Instead of sweeping in place, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleaned up directly.

This effectively avoids memory fragmentation.
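
The compaction step can be sketched with the same kind of toy slot array. This is an assumed model with made-up names, and it ignores the pointer updates a real collector must perform when it moves objects.

```java
import java.util.Arrays;
import java.util.Set;

class MarkCompactDemo {

    // Slides every marked (live) object toward index 0 and frees everything
    // beyond the last live object. Returns the index of the first free slot.
    static int compact(String[] heap, Set<String> marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && marked.contains(obj)) {
                heap[free++] = obj; // free <= i, so no live object is overwritten
            }
        }
        Arrays.fill(heap, free, heap.length, null);
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        int boundary = compact(heap, Set.of("A", "C", "E"));
        System.out.println(Arrays.toString(heap)); // [A, C, E, null, null, null]
        System.out.println(boundary);              // 3
    }
}
```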

4. Generational Collection

Most current garbage collectors use generational collection. It introduces no new idea of its own: it simply divides memory into several regions according to object lifetimes and collects each region with a different algorithm. **Before JDK 8, memory was divided into three generations: young generation, old generation, and permanent generation. Starting with JDK 8, the permanent generation was removed and replaced by the metaspace.** Generally, the young generation uses the copying algorithm (low object survival rate), and the old generation uses the mark-compact algorithm (high object survival rate).

4.1 Young generation (mainly the copying algorithm)

The young generation allows short-lived objects to be collected quickly. By default it occupies 1/3 of the heap. It is divided into three areas, Eden, Survivor-from, and Survivor-to, whose sizes default to a ratio of 8:1:1 (adjustable). Most new objects are created in Eden. During a collection, the surviving objects in Eden and in the Survivor area currently in use (Survivor-from) are copied to the empty Survivor-to area, their age is increased by 1, and Eden and Survivor-from are then cleared. Survivor-from and Survivor-to then swap roles, so that one Survivor area is always empty (the new Survivor-to is the former Survivor-from), and the cycle repeats. The GC performed in the young generation is called Minor GC.

Objects in the young generation turn over very quickly and most of them are short-lived, so few objects need to be copied and the copying algorithm is efficient. Dividing the space into three areas in this way means that only 10% of the young generation (one Survivor area) sits idle at any time, which improves memory utilization compared with a half-and-half split.
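
As a worked example of the ratios above, the sketch below splits a hypothetical 600 MB heap. The heap size and the helper name split are assumptions for the illustration; the 1/3 share and the 8:1:1 ratio are the defaults described in the text.

```java
class YoungGenSizing {

    // Splits a young generation into {eden, survivorFrom, survivorTo} sizes in MB,
    // where survivorRatio is the Eden-to-one-Survivor ratio (8 means 8:1:1).
    static long[] split(long youngMb, int survivorRatio) {
        long survivor = youngMb / (survivorRatio + 2);
        long eden = youngMb - 2 * survivor;
        return new long[] { eden, survivor, survivor };
    }

    public static void main(String[] args) {
        long heapMb = 600;          // hypothetical total heap size
        long youngMb = heapMb / 3;  // young generation takes 1/3 of the heap: 200 MB
        long[] parts = split(youngMb, 8);

        // Eden plus one Survivor area is usable at any time; the other Survivor is kept empty.
        long usablePercent = (parts[0] + parts[1]) * 100 / youngMb;
        System.out.println("eden=" + parts[0] + " MB, survivor=" + parts[1]
                + " MB x 2, usable=" + usablePercent + "%");
        // prints: eden=160 MB, survivor=20 MB x 2, usable=90%
    }
}
```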

4.2 Old generation (mainly the mark-compact algorithm)

Objects that are still alive after many young-generation collections (age 15 by default) are promoted to the old generation. Because most objects in the old generation stay alive, the mark-compact algorithm is used. A collection of the old generation is a Major GC, which in practice is often referred to as Full GC.

4.3 Permanent Generation/Metaspace

Before JDK 8:

The permanent generation stores class metadata such as Java classes and methods. Reclamation in this area follows the "GC in the method area" rules described above. However, the permanent generation is part of the memory managed by the JVM and has a fixed upper limit, so loading too many classes easily causes an OutOfMemoryError (OOM).

JDK 8 and later:

In JDK 8 the permanent generation was removed and replaced by the metaspace. What is stored is essentially unchanged, but where it is stored has changed: the metaspace uses native (host) memory instead of heap memory, so by default its size is limited only by the memory available on the host. This greatly reduces the risk of running out of memory when a large number of classes are loaded.

7 Minor GC and Full GC

Minor GC and Full GC have been mentioned many times before, so what is the difference between them?

Minor GC is young-generation GC: garbage collection that happens in the young generation. Because most Java objects are short-lived, Minor GC is very frequent and is generally fast.
Major GC / Full GC: happens in the old generation and is often accompanied by at least one Minor GC. A Major GC is generally at least twice as slow as a Minor GC.

Conditions for a Minor GC:

A new object is created and the request for space in Eden fails.

Conditions for a Full GC:

The old generation runs out of space
The permanent generation runs out of space (before JDK 8)
System.gc() is called explicitly
The average size of objects promoted to the old generation by previous Minor GCs exceeds the remaining space in the old generation
Applications that use RMI for RPC or management trigger one Full GC per hour by default
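
To see how often each kind of collection has actually run in a process, the standard java.lang.management API exposes one GarbageCollectorMXBean per collector. The sketch below only lists them; the bean names depend on which collector the JVM is using, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

class GcCountersDemo {

    // Returns the collector beans registered by the running JVM.
    static List<GarbageCollectorMXBean> collectors() {
        return ManagementFactory.getGarbageCollectorMXBeans();
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : collectors()) {
            // Typically one bean covers young-generation collections and another
            // covers old-generation or full collections.
            System.out.printf("%s: count=%d, time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Comparing the counts of the young-generation and old-generation beans over time gives a quick picture of how frequent Minor GCs are relative to Full GCs.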

8 Common garbage collectors (JDK 8 and earlier)

The figure shows the relationships between the different garbage collectors; a line between two collectors indicates that they can be used together.

Serial collector (copying algorithm)
A single-threaded young-generation collector: both marking and cleanup run on a single thread. Its advantage is that it is simple and efficient. It is the default collector in Client mode and can be selected with -XX:+UseSerialGC.

Serial Old collector (mark-compact algorithm)
A single-threaded old-generation collector, the old-generation version of the Serial collector.

ParNew collector (copying algorithm)
A young-generation collector, the multi-threaded version of the Serial collector. It performs better than Serial on multi-core CPUs.

Parallel Scavenge collector (copying algorithm)
A parallel young-generation collector that aims for high throughput and efficient use of the CPU. It suits background applications and other scenarios without strict interactive response requirements. It is the default collector in Server mode. You can select it explicitly with -XX:+UseParallelGC and set the number of threads with, for example, -XX:ParallelGCThreads=2.

Parallel Old collector (mark-compact algorithm)
The old-generation version of the Parallel Scavenge collector: a parallel collector that prioritizes throughput.

CMS (Concurrent Mark Sweep) collector (mark-sweep algorithm)
A highly concurrent, low-pause collector that aims for the shortest GC pause time (Stop The World). It uses relatively more CPU but offers fast response and short pauses, which makes it a good choice on multi-core CPUs when response time matters. Because it uses the mark-sweep algorithm, it tends to produce memory fragmentation.

G1 collector
G1 is a garbage collector for server-side applications. It supports parallel and concurrent collection, generational collection, space compaction, and predictable pauses, and it manages both the young generation and the old generation.

Image source: https://cloud.tencent.com/developer/article/1336613

9 Summary of Garbage Collector Parameters

UseSerialGC: the default for a virtual machine running in Client mode. When enabled, the Serial + Serial Old collector combination is used for memory reclamation.
UseParNewGC: when enabled, the ParNew + Serial Old collector combination is used for memory reclamation.
UseConcMarkSweepGC: when enabled, the ParNew + CMS + Serial Old collector combination is used for memory reclamation. The Serial Old collector serves as the fallback collector after a Concurrent Mode Failure in the CMS collector.
UseParallelGC: the default for a virtual machine running in Server mode. When enabled, the Parallel Scavenge + Serial Old (PS MarkSweep) collector combination is used for memory reclamation.
UseParallelOldGC: when enabled, the Parallel Scavenge + Parallel Old collector combination is used for memory reclamation.
SurvivorRatio: the ratio of the Eden area to one Survivor area in the young generation. The default is 8, meaning Eden:Survivor = 8:1.
PretenureSizeThreshold: the size above which objects are allocated directly in the old generation. Once set, objects larger than this value are allocated directly in the old generation.
MaxTenuringThreshold: the age at which objects are promoted to the old generation. An object's age increases by 1 after each Minor GC it survives, and the object enters the old generation once its age exceeds this value.
UseAdaptiveSizePolicy: dynamically adjusts the size of each area in the Java heap and the age at which objects enter the old generation.
HandlePromotionFailure: whether to allow the allocation guarantee to fail, that is, the extreme case where the remaining space in the old generation is not enough to hold all objects in Eden and the Survivor area if every one of them survives.
ParallelGCThreads: the number of threads used for memory reclamation during parallel GC.
GCTimeRatio: the ratio of GC time to total time. The default is 99, which allows 1% of the time to be spent in GC. Only effective with the Parallel Scavenge collector.
MaxGCPauseMillis: the maximum GC pause time. Only effective with the Parallel Scavenge collector.
CMSInitiatingOccupancyFraction: how full the old generation must be before the CMS collector triggers a collection. The default is 68% in early JDK versions (it was raised to 92% in JDK 6). Only effective with the CMS collector.
UseCMSCompactAtFullCollection: whether the CMS collector defragments memory after completing a garbage collection. Only effective with the CMS collector.
CMSFullGCsBeforeCompaction: how many garbage collections the CMS collector performs before it runs a memory defragmentation. Only effective with the CMS collector.
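
To check which values a running JVM is actually using for parameters like these, one option is the HotSpot-specific diagnostic bean. This sketch assumes a HotSpot-based JVM: com.sun.management.HotSpotDiagnosticMXBean is not part of the Java SE standard API, and flags that have been removed in a given JDK version cause getVMOption to throw an IllegalArgumentException. The class and helper names are made up for the example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class VmOptionDemo {

    // Returns the current value of a HotSpot option as a string.
    // Assumes the JVM is HotSpot-based and the option exists in this JDK version.
    static String optionValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        String[] names = { "SurvivorRatio", "MaxTenuringThreshold", "ParallelGCThreads" };
        for (String name : names) {
            System.out.println(name + " = " + optionValue(name));
        }
    }
}
```

The printed values reflect both the defaults and anything passed on the command line, for example -XX:SurvivorRatio=6.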

Origin blog.csdn.net/weixin_44313315/article/details/105785556