[Three] JVM garbage collection mechanism

 

Table of contents

One, conditions for triggering object reclamation

Two, identifying garbage objects

1. Reference counting algorithm

2. Root search algorithm

Three, reference types

1. Strong references

2. Soft references

3. Weak references

4. Phantom references

Four, garbage collection algorithms

1. Mark-sweep algorithm

2. Copying algorithm

3. Mark-compact algorithm

4. Comparison of the above three algorithms

5. Generational collection algorithm (most collections target the young generation)

1) Young generation

2) Old generation

Five, other


One, conditions for triggering object reclamation

GC mainly deals with reclaiming objects. An object becomes eligible for reclamation in the following situations:

1. The object is no longer referenced

2. An uncaught exception is thrown out of the scope that referenced the object

3. Execution leaves the scope that referenced the object normally

4. The program calls System.exit()

5. The program terminates unexpectedly (the process is killed, etc.)

In the last two cases the whole process ends, so all of its memory is returned to the operating system at once.

 

Two, identifying garbage objects:

1. Reference counting algorithm

This is the simplest approach. When an object is created it is allocated on the heap, and a reference counter is created for it with the value 1. Each new reference to the object increments the counter by 1, and each reference that is destroyed decrements it by 1. When the counter drops to zero, the object has no references and can be reclaimed. The algorithm is easy to implement, but it has a well-known problem, which is why mainstream JVMs do not rely on it to decide whether an object is alive:

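ObjectA a = new ObjectA();   // ObjectA instance: count 1

ObjectB b = new ObjectB();   // ObjectB instance: count 1

a.obj = b;                   // ObjectB instance: count 2

b.obj = a;                   // ObjectA instance: count 2

a = null;                    // ObjectA instance: count 1

b = null;                    // ObjectB instance: count 1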

Once the variables a and b are set to null (or go out of scope), the program can no longer reach either object, so both are garbage. However, the two objects still reference each other, so each counter stays at 1 and neither object is ever reclaimed.
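The cycle problem can be reproduced with a toy model. The following is a minimal sketch, not how any JVM works internally: the class RefCountDemo, the Node type, and the retain/release helpers are hypothetical names, and the counters are maintained by hand because real Java objects expose no such counter.

```java
// Toy model of reference counting. All names here are hypothetical.
final class RefCountDemo {
    static final class Node {
        int refCount = 0;   // number of references currently pointing at this node
        Node obj;           // field that may reference another node
    }

    // Record a new reference to n and return it.
    static Node retain(Node n) {
        n.refCount++;
        return n;
    }

    // Record that one reference to n was destroyed.
    static void release(Node n) {
        n.refCount--;
    }

    // Builds the a <-> b cycle, drops both local references,
    // and returns the two counters that remain.
    static int[] countsAfterDroppingLocals() {
        Node a = retain(new Node());   // local variable a: count 1
        Node b = retain(new Node());   // local variable b: count 1
        a.obj = retain(b);             // b: count 2
        b.obj = retain(a);             // a: count 2
        release(a);                    // local a is dropped: count 1
        release(b);                    // local b is dropped: count 1
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        // Neither counter reaches zero, so a pure reference-counting
        // collector would never free these two nodes.
        System.out.println(counts[0] + " " + counts[1]);   // prints "1 1"
    }
}
```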

2. Root search algorithm

Modern JVMs use this algorithm instead (it is also called reachability analysis). All reference relationships are treated as a graph. Starting from a set of nodes called GC Roots, the collector follows each reference to the node it points to, then follows that node's references, and so on. When the search finishes, every node that was not visited is considered unreferenced, that is, garbage.
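The search itself is an ordinary graph traversal. The sketch below assumes a hypothetical Node type with an explicit list of outgoing references; a real JVM walks actual object fields instead. Note that the two nodes that only reference each other are still found to be garbage, which is exactly the case reference counting cannot handle.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy reachability analysis over a hand-built object graph (hypothetical names).
final class RootSearchDemo {
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();   // outgoing references
        Node(String name) { this.name = name; }
    }

    // Returns every node that can be reached from the given roots.
    static Set<Node> reachable(List<Node> roots) {
        Set<Node> seen = new HashSet<>();            // Node uses identity equality
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (seen.add(n)) {                       // first visit: follow its references
                pending.addAll(n.refs);
            }
        }
        return seen;
    }

    // Graph: root -> x -> y, while c and d only reference each other.
    static List<String> garbageNames() {
        Node root = new Node("root"), x = new Node("x"), y = new Node("y");
        Node c = new Node("c"), d = new Node("d");
        root.refs.add(x);
        x.refs.add(y);
        c.refs.add(d);
        d.refs.add(c);
        Set<Node> live = reachable(Arrays.asList(root));
        List<String> garbage = new ArrayList<>();
        for (Node n : Arrays.asList(root, x, y, c, d)) {
            if (!live.contains(n)) garbage.add(n.name);
        }
        return garbage;
    }

    public static void main(String[] args) {
        System.out.println(garbageNames());   // prints "[c, d]"
    }
}
```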

The objects that can currently serve as GC Roots in Java are:

1. Objects referenced from the virtual machine stack (the local variable table of a stack frame)

2. Objects referenced by static fields of classes in the method area

3. Objects referenced by constants in the method area

4. Objects referenced by JNI (native methods) in the native method stack

JNI (Java Native Interface): JNI lets you implement "native methods" and call them from Java programs.

 

An object is guaranteed not to be reclaimed only while a reference chain connects it to GC Roots, that is, while it is strongly reachable. For some objects, however, we would like to keep them while memory is plentiful and reclaim them only when memory runs short. With strong references alone that is impossible, because a strongly referenced object is never reclaimed. This is why soft, weak, and phantom references exist.

* 1. Strong references

* Never reclaimed, even if the JVM runs out of memory (OOM)

* 2. Soft references

* Reclaimed when memory is insufficient

* 3. Weak references

* Reclaimed whenever a GC runs

* 4. Phantom references

* Their only purpose is to be notified when the object is reclaimed

 

Three, reference types

1. Strong references:

An object that is reachable through a strong reference is never reclaimed by the GC. Strong references have no corresponding class in java.lang.ref, and they are by far the most common kind in everyday code: in Object obj = new Object(); the variable obj is a strong reference. An object with a strong reference is like an essential household item that the garbage collector will never throw away. When memory runs out, the Java virtual machine would rather throw an OutOfMemoryError and let the program terminate abnormally than free memory by reclaiming strongly referenced objects.

2. Soft references:

If an object has only soft references (class SoftReference), it is like a dispensable household item. While memory is sufficient the garbage collector leaves it alone; when memory is insufficient the collector reclaims it. Until it is reclaimed, the program can still use the object, which makes soft references suitable for memory-sensitive caches. A soft reference can be combined with a reference queue (ReferenceQueue): when the object it refers to is garbage collected, the Java virtual machine adds the soft reference to the associated queue.
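A memory-sensitive cache along these lines might look like the sketch below. SoftCache and its loader parameter are hypothetical names chosen for illustration; only SoftReference and its get() method come from java.lang.ref. The point is that get() may return null at any time after the collector has cleared the reference, so the cache must be able to recompute the value.

```java
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical memory-sensitive cache; a sketch, not a production implementation.
final class SoftCache<K, V> {
    private final Map<K, SoftReference<V>> entries = new HashMap<>();
    private final Function<K, V> loader;   // recomputes a value that is missing

    SoftCache(Function<K, V> loader) {
        this.loader = loader;
    }

    V get(K key) {
        SoftReference<V> ref = entries.get(key);
        // ref.get() is null if the GC has already cleared the referent.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    static String demo() {
        SoftCache<String, String> cache =
                new SoftCache<String, String>(key -> key.toUpperCase());
        cache.get("report");          // first call loads and caches the value
        return cache.get("report");   // cached, or reloaded if the GC cleared it
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Either way the caller gets the same value back; the soft reference only decides whether it had to be recomputed.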

3. Weak references (WeakReference):

Weak references are weaker than soft references and are represented by the class WeakReference. A weak reference refers to an object without preventing that object from being reclaimed. With a strong reference, the referenced object cannot be reclaimed as long as the reference exists. Weak references do not have this problem: when the garbage collector runs and every remaining reference to an object is weak, the object is reclaimed. Weak references therefore remove the lifetime coupling between objects that strong references create.

The most common use of weak references is in collections, especially hash tables. A hash table accepts any Java object as a key. When a key-value pair is put into the table, the table itself holds references to the key and the value. If these are strong references, the keys and values cannot be reclaimed while the table is alive, so a long-lived hash table with many entries may eventually consume all the memory in the JVM.

The solution is to refer to these objects through weak references so that they can be garbage collected. Java provides WeakHashMap for this common requirement. It holds its keys weakly (values are still held strongly), and an entry is removed once its key is no longer referenced anywhere else.
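The behaviour can be observed with a small experiment. In this sketch the class WeakDemo and its method names are hypothetical. The part that runs while the key is still strongly held is deterministic; what happens after System.gc() is not, because that call is only a request, so the second size printed by main may or may not be 0.

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

// Sketch of WeakReference and WeakHashMap behaviour (hypothetical class name).
final class WeakDemo {
    // While the key is strongly referenced, both the weak reference
    // and the map entry remain usable.
    static boolean entryVisibleWhileKeyHeld() {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "metadata");
        WeakReference<Object> ref = new WeakReference<>(key);
        return "metadata".equals(map.get(key)) && ref.get() == key;
    }

    public static void main(String[] args) {
        Map<Object, String> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, "metadata");
        System.out.println("entries while key is held: " + map.size());

        key = null;     // drop the only strong reference to the key
        System.gc();    // a request only; the JVM decides whether and when to collect
        System.out.println("entries after GC request: " + map.size());
    }
}
```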

4. Phantom references (PhantomReference):

Before introducing phantom references, we must first introduce the object finalization mechanism provided by Java. The Object class has a finalize method, originally designed to perform cleanup before an object is actually reclaimed. Java has no equivalent of the C++ destructor, so cleanup was implemented through finalize. The problem is that the garbage collector does not run at fixed times, so the moment at which this cleanup runs is unpredictable. Phantom references address this problem. A reference queue must be specified when a PhantomReference is created. After the object's finalize method has been called and the object is about to be reclaimed, its phantom reference is added to the queue. By checking the queue, the program can tell whether an object is ready to be reclaimed.

Phantom references and their queues are rarely used. Their main purpose is finer control over memory usage, which matters on mobile devices: once the program knows that an object is being reclaimed, it can request memory for a new object, which keeps the program's memory consumption relatively low.
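A minimal sketch of the queue-based notification follows. PhantomDemo and its method names are hypothetical; PhantomReference, ReferenceQueue, and Reference.reachabilityFence (Java 9 and later) are standard library APIs. Two properties are deterministic and shown in the helper method: get() on a phantom reference always returns null, and nothing is enqueued while the referent is still strongly reachable. Whether main sees the reference in the queue depends on when the JVM actually collects.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;

// Sketch of phantom-reference notification (hypothetical class name).
final class PhantomDemo {
    static boolean nothingObservableWhileReferentLives() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resource = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(resource, queue);
        // get() never exposes the referent, and the queue stays empty
        // as long as the referent is strongly reachable.
        boolean result = ref.get() == null && queue.poll() == null;
        Reference.reachabilityFence(resource);   // keep resource alive up to this point
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object resource = new Object();
        PhantomReference<Object> ref = new PhantomReference<>(resource, queue);

        resource = null;   // drop the only strong reference
        System.gc();       // a request only

        // Wait up to one second for the notification; null means nothing arrived.
        Reference<?> enqueued = queue.remove(1000);
        System.out.println(enqueued == ref
                ? "referent is being reclaimed"
                : "no notification yet");
    }
}
```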

 

Four, garbage collection algorithms:

1. Mark-sweep algorithm:

When the available memory in the heap is exhausted, the whole program is paused (known as stop-the-world) and two phases run: first marking, then sweeping.

(1) Marking: traverse all GC Roots and mark every object reachable from them as alive.

(2) Sweeping: traverse all objects in the heap and clear every object that is not marked.

In other words, when the running program exhausts the available memory, the GC thread is triggered and the program is suspended. The objects that are still alive are marked, all unmarked objects in the heap are cleared, the marks on the surviving objects are reset, and the program then resumes.

Disadvantages:

1. Efficiency is relatively low (a recursive traversal plus a scan of every object in the heap), and the application must be paused during GC, which hurts the user experience.

2. The memory freed this way is not contiguous. Dead objects are scattered at random across the heap, so clearing them leaves the memory layout fragmented. To cope with this, the JVM has to maintain a free list of memory blocks, which is additional overhead.
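The two phases and the resulting fragmentation can be sketched on a toy heap. This is an illustration under simplifying assumptions, not JVM code: the heap is a plain array of slots, heap[0] plays the role of the single GC Root, and MarkSweepDemo and Obj are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy mark-and-clear collector over an array-based heap (hypothetical names).
final class MarkSweepDemo {
    static final class Obj {
        boolean marked;
        final List<Obj> refs = new ArrayList<>();
    }

    // Marking phase: flag everything reachable from obj.
    static void mark(Obj obj) {
        if (obj == null || obj.marked) return;
        obj.marked = true;
        for (Obj ref : obj.refs) mark(ref);
    }

    // Clearing phase: free every unmarked slot, reset the marks on survivors,
    // and return the free list (indices of the now-empty slots).
    static List<Integer> sweep(Obj[] heap) {
        List<Integer> freeList = new ArrayList<>();
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && heap[i].marked) {
                heap[i].marked = false;   // ready for the next collection
            } else {
                heap[i] = null;
                freeList.add(i);
            }
        }
        return freeList;
    }

    static List<Integer> demo() {
        Obj[] heap = new Obj[6];
        for (int i = 0; i < heap.length; i++) heap[i] = new Obj();
        heap[0].refs.add(heap[2]);   // root -> 2 -> 5 stays alive
        heap[2].refs.add(heap[5]);
        heap[1].refs.add(heap[3]);   // 1, 3 and 4 are unreachable from the root
        mark(heap[0]);
        return sweep(heap);
    }

    public static void main(String[] args) {
        // The free slots are not one contiguous block.
        System.out.println(demo());   // prints "[1, 3, 4]"
    }
}
```

The survivors stay at slots 0, 2 and 5, so the free space is split into two pieces, which is the fragmentation described in the second disadvantage.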

2. Copying algorithm

The copying algorithm divides memory into two regions. At any point in time, all dynamically allocated objects live in only one of them (the active region), while the other (the free region) stays empty. When the active region is exhausted, the JVM suspends the program and starts the copying GC thread. The GC thread copies all live objects from the active region into the free region, packed one after another in memory address order, and at the same time updates every reference to a live object so that it points to the new address. The two regions then swap roles: the free region becomes the active region, and all garbage objects are left behind in the old active region, which is now the free region. In effect, the garbage is collected all at once at the moment the active region becomes the free region.


Advantages: the copying algorithm avoids the fragmented memory layout left by the mark-sweep algorithm.

Disadvantages:

1. Half of the memory is wasted, which is a very high price.

2. If the object survival rate is very high (in the extreme case, 100%), every surviving object has to be copied and every reference updated. Once the survival rate passes a certain level, the time spent copying is no longer negligible.

Therefore, the copying algorithm only pays off when the object survival rate is very low.
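The copy-and-update step can be sketched as follows. This is a simplified model with hypothetical names (CopyingDemo, Obj, forward): each object has at most one outgoing reference, the original objects stand for the active region, and the toSpace array stands for the free region. A forwarding pointer records where an object was copied so that shared and cyclic references end up pointing at the single new copy.

```java
// Toy copying collector (hypothetical names; one reference per object for brevity).
final class CopyingDemo {
    static final class Obj {
        final String name;
        Obj ref;       // at most one outgoing reference
        Obj forward;   // forwarding pointer: the copy in the free region, once made
        Obj(String name) { this.name = name; }
    }

    private Obj[] toSpace;   // the free region that receives the survivors
    private int next;        // next unused slot in toSpace

    // Copies obj and everything it references; returns the new copy.
    private Obj copy(Obj obj) {
        if (obj == null) return null;
        if (obj.forward == null) {
            Obj clone = new Obj(obj.name);
            obj.forward = clone;         // set before recursing so cycles terminate
            toSpace[next++] = clone;     // survivors are packed from slot 0 upward
            clone.ref = copy(obj.ref);   // update the reference to the new address
        }
        return obj.forward;
    }

    // Copies everything reachable from roots and returns the new active region.
    Obj[] collect(Obj[] roots, int capacity) {
        toSpace = new Obj[capacity];
        next = 0;
        for (int i = 0; i < roots.length; i++) roots[i] = copy(roots[i]);
        return toSpace;
    }

    static String demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        a.ref = b;
        b.ref = a;   // a and b form a cycle reachable from the root
        c.ref = a;   // nothing references c, so it is garbage
        Obj[] roots = { a };
        Obj[] active = new CopyingDemo().collect(roots, 3);
        StringBuilder layout = new StringBuilder();
        for (Obj o : active) layout.append(o == null ? "-" : o.name);
        return layout.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "ab-"
    }
}
```

The garbage object c is never touched: it simply stays behind in the old region, which matches the observation that the cost depends on the number of survivors, not on the amount of garbage.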

3. Mark-compact algorithm:

(1) Marking: the first phase is exactly the same as in the mark-sweep algorithm: traverse the GC Roots and mark the surviving objects.

(2) Compacting: move all surviving objects so that they are packed together in memory address order, then reclaim all memory beyond the last surviving object. This second phase is called the compaction phase.

The mark-compact algorithm avoids both the fragmentation of the mark-sweep algorithm and the cost of halving the memory in the copying algorithm. Its only disadvantage is that it is not efficient: it has to mark all surviving objects and also update the reference addresses of all surviving objects that were moved.
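The second phase can be sketched with a sliding compaction over an array-based heap. The names MarkCompactDemo and Obj are hypothetical, references are stored as heap indices, and slot 0 is assumed to be the only root. The forwarding table is what makes the reference update possible, and it is also where the extra cost mentioned above comes from.

```java
// Toy sliding compaction (hypothetical names; references are heap indices).
final class MarkCompactDemo {
    static final class Obj {
        final String name;
        int ref = -1;      // heap index of the referenced object, or -1 for none
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Marking phase: follow the reference chain starting at index.
    static void mark(Obj[] heap, int index) {
        if (index < 0 || heap[index].marked) return;
        heap[index].marked = true;
        mark(heap, heap[index].ref);
    }

    // Second phase: returns the number of survivors.
    static int compact(Obj[] heap) {
        // Pass 1: compute the new address of every survivor.
        int[] forward = new int[heap.length];
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            forward[i] = (heap[i] != null && heap[i].marked) ? next++ : -1;
        }
        // Pass 2: rewrite references, then slide each survivor down.
        for (int i = 0; i < heap.length; i++) {
            if (forward[i] < 0) continue;
            Obj o = heap[i];
            if (o.ref >= 0) o.ref = forward[o.ref];
            o.marked = false;
            heap[forward[i]] = o;   // forward[i] <= i, so no live slot is overwritten
        }
        // Everything after the last survivor is reclaimed in one step.
        for (int i = next; i < heap.length; i++) heap[i] = null;
        return next;
    }

    static String demo() {
        Obj[] heap = { new Obj("a"), new Obj("b"), new Obj("c"), new Obj("d"), new Obj("e") };
        heap[0].ref = 2;   // a -> c
        heap[2].ref = 4;   // c -> e; b and d are unreachable
        mark(heap, 0);
        compact(heap);
        StringBuilder layout = new StringBuilder();
        for (Obj o : heap) layout.append(o == null ? "-" : o.name);
        return layout + " a->" + heap[heap[0].ref].name;
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "ace-- a->c"
    }
}
```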

4. Comparison of the above three algorithms

What the three algorithms have in common:

1. All three use the root search algorithm to decide whether an object should be reclaimed, and the root search algorithm in turn relies on the variable scoping rules of the language.

2. In their basic form, all three must pause the application (stop-the-world) while the GC runs. The reason is as follows.

Suppose the collector has just finished marking an object A. At this moment the program creates a new object B that is reachable from A. Because A has already been processed, B missed the marking phase and its mark bit is still 0, so the sweeping phase would clear it. The result is easy to imagine: the GC thread would break the program. This is of course unacceptable: an object we have just created would vanish after a GC.

How the three compare:

Efficiency: copying algorithm > mark-compact algorithm > mark-sweep algorithm (this only compares time complexity; in practice the order may differ).

Memory tidiness: copying algorithm = mark-compact algorithm > mark-sweep algorithm.

Memory utilization: mark-compact algorithm = mark-sweep algorithm > copying algorithm.

5. Generational collection algorithm (most collections target the young generation)

This approach divides memory into several regions according to object lifetime. The Java heap is generally split into the young generation and the old generation. In the young generation, most objects die at every garbage collection and only a few survive, so the copying algorithm is a good fit. In the old generation, the survival rate is high and there is no extra space to act as an allocation guarantee, so the mark-sweep or mark-compact algorithm must be used.

Minor GC: a GC that covers only the young generation.

Full GC (also called major GC): a GC of the old generation, sometimes accompanied by a GC of the young generation and of the permanent generation. Because collecting the old generation and the permanent generation frees relatively little memory, and their usage also grows slowly, it usually takes several minor GCs before a full GC is triggered.

1) Young generation

The young generation is divided into three areas: one Eden area and two Survivor areas, with a default ratio of 8:1:1 that can be changed. Objects are normally allocated in Eden. When Eden does not have enough space, a Minor GC is triggered; if there is still not enough space afterwards, the object is placed in the old generation through the allocation guarantee mechanism, so in a few cases objects are allocated directly in the old generation.

One Survivor area serves as the active area (From), holding the objects that survived the previous GC, and the other serves as the free area (To). At any time the Java virtual machine uses Eden and the From Survivor area. During a Minor GC, the surviving objects in Eden and From are copied in one pass into the To Survivor area (this is the copying algorithm), and Eden and From are then cleared. Objects that survive into the Survivor area for the first time have their age set to 1, and each further GC they survive there increases their age by 1 until they are old enough to be promoted to the old generation. If the To Survivor area does not have enough room for all the survivors of a collection, the objects that do not fit go directly to the old generation through the allocation guarantee mechanism (the old generation is the "spare warehouse" of the young generation).

To sum up:

1. Minor GC is the garbage collection that takes place in the young generation, and it uses the copying algorithm;

2. At most 90% of the young generation is in use at any time: the Eden area holds new objects, and the active Survivor area holds the objects that survived the previous GC;

3. The Eden area and one Survivor area are emptied after every Minor GC.
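The interplay of Eden, the two Survivor areas, object age, and the old generation can be sketched with a small simulation. Everything here is an assumption made for illustration: the class MinorGcDemo, the explicit live flag that stands in for reachability, a promotion age of 3, and a Survivor capacity of 2 objects. Real JVMs size these areas in bytes and use their own configurable thresholds.

```java
import java.util.ArrayList;
import java.util.List;

// Toy simulation of Minor GC in the young generation (all names and limits are assumed).
final class MinorGcDemo {
    static final class Obj {
        final String name;
        int age;
        boolean live = true;   // stands in for "reachable from GC Roots"
        Obj(String name) { this.name = name; }
    }

    static final int PROMOTION_AGE = 3;       // assumed age at which objects are promoted
    static final int SURVIVOR_CAPACITY = 2;   // assumed capacity of one Survivor area

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();       // active Survivor area
    List<Obj> to = new ArrayList<>();         // free Survivor area
    final List<Obj> old = new ArrayList<>();  // old generation

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) continue;            // dead objects are simply not copied
            o.age++;
            if (o.age >= PROMOTION_AGE || to.size() >= SURVIVOR_CAPACITY) {
                old.add(o);                   // promoted by age, or because To is full
            } else {
                to.add(o);
            }
        }
        eden.clear();                         // Eden and From are emptied
        from.clear();
        List<Obj> tmp = from;                 // the two Survivor areas swap roles
        from = to;
        to = tmp;
    }

    static String names(List<Obj> objs) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : objs) sb.append(o.name);
        return sb.toString();
    }

    static String demo() {
        MinorGcDemo heap = new MinorGcDemo();
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c");
        c.live = false;                       // c dies young
        heap.eden.add(a);
        heap.eden.add(b);
        heap.eden.add(c);
        heap.minorGc();                       // a and b reach age 1
        heap.minorGc();                       // age 2
        heap.minorGc();                       // age 3: both are promoted
        heap.eden.add(new Obj("d"));
        heap.eden.add(new Obj("e"));
        heap.eden.add(new Obj("f"));
        heap.minorGc();                       // d and e fit in To; f overflows into old
        return "old=" + names(heap.old) + " from=" + names(heap.from);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "old=abf from=de"
    }
}
```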

2) Old generation

The old generation has no spare warehouse. When the objects about to enter the old generation need more space than it has left, a Full GC (major GC) is triggered.

The old generation stores objects with long life cycles. Larger objects (those that need a large contiguous block of memory) are stored there directly if they still do not fit in the young generation after a Minor GC. It also holds the many objects that were promoted after surviving in the Survivor areas of the young generation.

The old generation is collected by Full GC, which uses the mark-sweep or mark-compact algorithm. Full GC runs less often than Minor GC, and a single Full GC takes longer than a Minor GC.

Conditions for reclaiming a class in the method area (permanent generation); all of them must hold:

1. All instances of the class have been reclaimed

2. The ClassLoader that loaded the class has been reclaimed

3. The Class object of the class is no longer reachable by any means (including reflection)

Five, other

Garbage collector: the concrete implementation of a garbage collection algorithm.

System.gc();

GC in Java is scheduled by the JVM itself and runs when needed. The call above only asks the JVM to run a GC as soon as possible; it does not make the GC run immediately.

Origin blog.csdn.net/Jack_PJ/article/details/87979882