Foreword
Garbage collection (GC) is one of the most important features of the JVM. It automatically reclaims memory that is no longer in use while the program runs, which removes the need for manual memory management and greatly reduces the risk of memory leaks. The design and implementation of the garbage collection mechanism are crucial to the running efficiency and stability of Java programs, so a programmer needs a solid understanding of it in order to write and design reasonable, efficient code.
1. Determining whether an object is dead
In the garbage collection mechanism of the JVM, it is very important to determine whether an object is dead, because only dead objects can be reclaimed by the garbage collector so that the memory they occupy is released. The two commonly used methods are the reference counting algorithm and the reachability analysis algorithm.
1.1 Reference counting algorithm
The reference counting algorithm is a simple garbage collection algorithm. Its basic idea is to maintain a reference counter for each object that records how many references currently point to it. When the counter drops to zero, no reference points to the object any more, so the object is judged dead and can be reclaimed.
Reference counting is simple to implement and the check itself is cheap, so it works well in many cases. CPython, for example, uses reference counting as its primary memory-management mechanism (supplemented by a cycle detector). The biggest problem with reference counting is circular references.
For example, the following code demonstrates a circular reference:
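class Test {
    // Reference-type field that can point to another Test object
    public Test test;
}

public class Demo {
    public static void main(String[] args) {
        Test test1 = new Test();
        Test test2 = new Test();
        // Make the two objects refer to each other
        test1.test = test2;
        test2.test = test1;
        // Drop the only external references; the program can no longer use either object
        test1 = null;
        test2 = null;
    }
}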
The above code is a simple case of circular reference. The field test in the class Test is a reference type, which points to another Test object. In the main function, two Test objects are created, namely test1 and test2, and then each is assigned to the other's test field.
If reference counting is used, the counters of test1 and test2 never reach zero, because the two objects refer to each other: after the local variables are set to null, each object is still referenced once by the other. According to the principle of the reference counting algorithm, even though these two objects can no longer be used by the program, their reference counters will not become zero, so they will never be reclaimed by the garbage collector.
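The counter arithmetic described above can be traced with a small simulation. This is only a sketch of the idea: the Node class, its refCount field and the setField helper are hypothetical names introduced for illustration, and a real JVM neither maintains nor exposes such counters.

```java
// Toy model of reference counting; not how the JVM actually manages memory.
class RefCountDemo {
    // Hypothetical object with a manually maintained reference counter.
    static class Node {
        final String name;
        int refCount = 0;
        Node field; // plays the role of Test.test

        Node(String name) { this.name = name; }
    }

    // Point holder.field at target and keep both counters up to date.
    static void setField(Node holder, Node target) {
        if (holder.field != null) holder.field.refCount--;
        holder.field = target;
        if (target != null) target.refCount++;
    }

    // Replays the Demo program and returns the two counters at the end.
    static int[] run() {
        Node a = new Node("test1");
        Node b = new Node("test2");
        a.refCount++; // local variable test1 refers to a
        b.refCount++; // local variable test2 refers to b

        setField(a, b); // test1.test = test2
        setField(b, a); // test2.test = test1

        a.refCount--; // test1 = null
        b.refCount--; // test2 = null
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = run();
        // prints "test1 count = 1, test2 count = 1"
        System.out.println("test1 count = " + counts[0] + ", test2 count = " + counts[1]);
    }
}
```

Both counters end at 1 rather than 0, so a collector that relies only on the counters would keep both objects forever.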
1.2 Reachability Analysis Algorithm
To solve the problem of circular references, the JVM uses the more complex reachability analysis algorithm. Its basic idea is to take a group of objects called GC Roots as the starting point and determine whether an object is reachable by searching downward along the object reference chain.
The meaning of object reachability:
- An object is reachable if it can be reached from a GC Roots object through a chain of references.
- If an object is unreachable, that is, no chain of references connects it to any GC Roots object, then the object is judged dead and can be reclaimed by the garbage collector.
GC Roots objects in the JVM include:
- The class objects (Class Object) of loaded classes;
- Objects referenced by class static variables;
- Objects referenced from the stack frames (Stack Frame) of active threads (Active Thread);
- Objects referenced by JNI (Java Native Interface) in the native method stack (Native Method Stack).
Through the reachability analysis algorithm, the JVM can accurately judge the life cycle of the object, avoid the circular reference problem of the reference counting algorithm, and effectively perform garbage collection.
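The traversal can be sketched as a plain graph search. The Obj class and the tiny hand-built heap below are assumptions made for illustration; a real JVM walks its own internal object layout rather than a Java list of references.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

class ReachabilityDemo {
    // Hypothetical heap object that records its outgoing references.
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Follow the reference chains from the roots and return every object reached.
    static Set<Obj> reachableFrom(List<Obj> roots) {
        Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {        // first visit: mark it and follow its references
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Builds root -> a plus an isolated cycle c <-> d, and returns the names of the dead objects.
    static List<String> deadObjects() {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        Obj c = new Obj("c");
        Obj d = new Obj("d");
        root.refs.add(a);
        c.refs.add(d);
        d.refs.add(c);

        List<Obj> heap = Arrays.asList(root, a, c, d);
        Set<Obj> live = reachableFrom(Arrays.asList(root));
        List<String> dead = new ArrayList<>();
        for (Obj o : heap) {
            if (!live.contains(o)) dead.add(o.name);
        }
        return dead;
    }

    public static void main(String[] args) {
        System.out.println("dead objects: " + deadObjects()); // prints "dead objects: [c, d]"
    }
}
```

The objects c and d still refer to each other, but because no chain from the root leads to them, both are reported as dead.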
2. Garbage collection algorithm
The garbage collection algorithm is an important part of the JVM garbage collection mechanism. It is responsible for reclaiming objects that are no longer used, releasing memory resources, and ensuring the operating efficiency and stability of the program. Common garbage collection algorithms in the JVM include the mark-sweep algorithm, the copying algorithm, the mark-compact algorithm and the generational algorithm.
2.1 Mark-Sweep Algorithm
The mark-sweep algorithm is one of the most basic garbage collection algorithms. The process is divided into two phases: a marking phase and a sweeping phase.
- Marking phase: Starting from a group of objects called GC Roots, traverse all reachable objects and mark them, indicating that these objects are live and will not be reclaimed.
- Sweeping phase: Traverse the entire heap space, treat unmarked objects as garbage, and directly reclaim their memory. The reclaimed memory forms discontinuous fragments, which may cause memory fragmentation problems.
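The sweeping phase and the fragmentation it leaves behind can be illustrated on a toy heap. The sketch below assumes the marking phase has already produced one mark bit per slot; the array-of-slots heap and the method names are hypothetical simplifications.

```java
import java.util.Arrays;

class MarkSweepDemo {
    // Sweep phase: free every occupied slot whose mark bit is not set, then reset the mark bits.
    static void sweep(String[] heap, boolean[] marked) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !marked[i]) {
                heap[i] = null;   // reclaim in place; surviving objects are not moved
            }
            marked[i] = false;    // reset for the next collection cycle
        }
    }

    // Total number of free slots.
    static int freeSlots(String[] heap) {
        int count = 0;
        for (String slot : heap) {
            if (slot == null) count++;
        }
        return count;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        boolean[] marked = {true, false, true, false, true, false}; // assumed result of marking
        sweep(heap, marked);
        System.out.println(Arrays.toString(heap)); // prints "[A, null, C, null, E, null]"
        System.out.println("free = " + freeSlots(heap) + ", largest run = " + largestFreeRun(heap));
        // prints "free = 3, largest run = 1"
    }
}
```

Three slots are free after the sweep, yet no two of them are adjacent, so an object that needs two contiguous slots could not be allocated. This is the fragmentation problem in miniature.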
2.2 Copying Algorithm
The copying algorithm is a garbage collection algorithm designed to solve the memory fragmentation problem of the mark-sweep algorithm. It divides the heap space into two equally sized regions and uses only one of them at a time. Its process is divided into three phases: a marking phase, a copying phase and a role exchange phase.
- Marking phase: As in the mark-sweep algorithm, start from the GC Roots objects, traverse all reachable objects, and mark the live ones.
- Copying phase: Copy all live objects from the region in use to the other region, so that the copied objects occupy contiguous memory and there is no memory fragmentation.
- Role exchange: After the copy is completed, the roles of the two regions are exchanged. The region that previously held the objects becomes the new free region, and the previously free region becomes the new working region.
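The three phases can be sketched with two arrays standing in for the two regions. This is a minimal model under stated assumptions: the set of live objects is passed in as if the marking phase had already run, and the class and field names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class SemiSpaceDemo {
    String[] from;   // working region: new objects are allocated here
    String[] to;     // free region: stays empty between collections
    int top = 0;     // next free index in the working region

    SemiSpaceDemo(int halfSize) {
        from = new String[halfSize];
        to = new String[halfSize];
    }

    // Bump-pointer allocation in the working region; returns false when it is full.
    boolean allocate(String obj) {
        if (top == from.length) return false;
        from[top++] = obj;
        return true;
    }

    // Copy the live objects into the free region, then exchange the roles of the two regions.
    void collect(Set<String> live) {
        int next = 0;
        for (int i = 0; i < top; i++) {
            if (live.contains(from[i])) {
                to[next++] = from[i];   // survivors end up packed contiguously
            }
        }
        Arrays.fill(from, null);        // the old working region is now entirely free
        String[] tmp = from;
        from = to;
        to = tmp;
        top = next;
    }

    public static void main(String[] args) {
        SemiSpaceDemo heap = new SemiSpaceDemo(4);
        for (String name : new String[] {"A", "B", "C", "D"}) {
            heap.allocate(name);
        }
        heap.collect(new HashSet<>(Arrays.asList("B", "D")));
        // prints "[B, D, null, null] top=2"
        System.out.println(Arrays.toString(heap.from) + " top=" + heap.top);
    }
}
```

After the collection the survivors sit at the start of the new working region and allocation simply continues from top, which is why the copying algorithm leaves no fragments. The price is that half of the memory is always idle.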
Today's commercial virtual machines, including HotSpot, generally use this collection algorithm to reclaim objects in the young generation.
- About 98% of the objects in the young generation are short-lived ("born in the morning, dead by evening"), so it is not necessary to divide the memory space in a 1 : 1 ratio. Instead, the young generation memory is divided into one larger Eden space and two smaller Survivor spaces.
- At any time, Eden and one of the two Survivor regions are in use (one Survivor region is called From (S0) and the other is called To (S1)).
- When collecting, the surviving objects in Eden and the Survivor space in use are copied to the other Survivor space in one pass, and finally Eden and the Survivor space just used are cleaned up.
- When the Survivor space is not large enough, other memory (such as the old generation) is relied on as an allocation guarantee.
About HotSpot:
In HotSpot, the default size ratio of Eden to a Survivor space is 8 : 1, that is, Eden : Survivor From : Survivor To = 8 : 1 : 1. Therefore, the memory available for allocation in the young generation is 90% of the entire young generation capacity, and the remaining 10% is reserved for storing surviving objects after a collection.
HotSpot is the most widely used Java Virtual Machine (JVM) implementation on the Java platform. It is developed by Oracle (formerly Sun Microsystems) as part of the Java Development Kit (JDK) and is the JVM implementation shipped with OpenJDK.
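The 8 : 1 : 1 split can be worked through with a little arithmetic. The sketch below assumes a 100 MB young generation purely as an example figure and ignores the alignment and rounding a real JVM applies; in HotSpot the ratio itself is controlled by the -XX:SurvivorRatio flag.

```java
class YoungGenLayout {
    // Returns { edenBytes, bytesOfOneSurvivorSpace } for Eden : Survivor = survivorRatio : 1.
    static long[] split(long youngGenBytes, int survivorRatio) {
        // The young generation holds survivorRatio parts of Eden plus two Survivor parts.
        long survivor = youngGenBytes / (survivorRatio + 2);
        long eden = youngGenBytes - 2 * survivor;
        return new long[] { eden, survivor };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] sizes = split(100 * mb, 8);      // example: 100 MB young generation
        long usable = sizes[0] + sizes[1];      // Eden plus the one Survivor space in use
        // prints "Eden = 80 MB, each Survivor = 10 MB, usable = 90 MB"
        System.out.println("Eden = " + sizes[0] / mb + " MB, each Survivor = "
                + sizes[1] / mb + " MB, usable = " + usable / mb + " MB");
    }
}
```

Compared with a plain 1 : 1 semispace layout, which keeps 50% of the memory idle, this layout keeps only one Survivor space (10%) in reserve.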
The copying algorithm as implemented by HotSpot proceeds as follows:
- When the Eden area is full, the first Minor GC is triggered and the surviving objects are copied to the Survivor From area. When the Eden area triggers a Minor GC again, both the Eden area and the From area are scanned and garbage collected; objects that are still alive are copied directly to the To area, and the Eden and From areas are cleared.
- When a subsequent Minor GC is triggered in Eden, the Eden and To areas are garbage collected, the surviving objects are copied to the From area, and the Eden and To areas are emptied.
- Some objects are copied back and forth between the From and To areas 15 times (the limit is determined by the JVM parameter MaxTenuringThreshold, which is 15 by default). If they are still alive after that, they are moved to the old generation.
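The back-and-forth copying and the final promotion can be traced with a small simulation of a single long-lived object. It follows the description above, with fixed From and To names and promotion once the object has been copied MaxTenuringThreshold times. The Space enum and the trace method are hypothetical, and a real collector's age bookkeeping has more cases, such as promoting objects earlier when a Survivor space fills up.

```java
import java.util.ArrayList;
import java.util.List;

class TenuringDemo {
    enum Space { EDEN, FROM, TO, OLD }

    // Records where an object that never dies is located after each Minor GC.
    static List<Space> trace(int maxTenuringThreshold, int minorGcs) {
        List<Space> history = new ArrayList<>();
        Space where = Space.EDEN;
        int age = 0; // number of times the object has been copied into a Survivor area
        for (int gc = 1; gc <= minorGcs; gc++) {
            if (where != Space.OLD) {
                if (age >= maxTenuringThreshold) {
                    where = Space.OLD;   // survived enough copies: promote to the old generation
                } else {
                    // Eden -> From on the first collection, then alternate between From and To
                    where = (where == Space.FROM) ? Space.TO : Space.FROM;
                    age++;
                }
            }
            history.add(where);          // objects already in OLD are untouched by a Minor GC
        }
        return history;
    }

    public static void main(String[] args) {
        // A small threshold keeps the output short; prints "[FROM, TO, FROM, OLD, OLD]"
        System.out.println(trace(3, 5));
    }
}
```

With the default threshold of 15, the same model copies the object 15 times and promotes it on the 16th Minor GC.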
2.3 Mark-Compact Algorithm
When the object survival rate is high, the copying algorithm has to perform many copy operations and its efficiency drops. Therefore, the copying algorithm is generally not used in the old generation.
The mark-compact algorithm was proposed to suit the characteristics of the old generation. Its marking process is the same as in mark-sweep, but the subsequent step does not clean up the reclaimable objects in place. Instead, all surviving objects are moved to one end of the memory, and then everything beyond the end boundary is cleaned up directly.
[Figure: flow chart of the mark-compact algorithm]
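The move-to-one-end step can be sketched on a toy heap. The mark bits are assumed to come from a marking phase that has already run, and the array model is a hypothetical simplification: a real collector must also update every reference that points to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

class MarkCompactDemo {
    // Slide every marked object toward index 0, then clear everything past the boundary.
    // Returns the boundary, that is, the index of the first free slot.
    static int compact(String[] heap, boolean[] marked) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[boundary++] = heap[i];   // move the survivor toward the live end
            }
        }
        Arrays.fill(heap, boundary, heap.length, null); // free the rest in one step
        Arrays.fill(marked, false);                      // reset the mark bits
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E", "F"};
        boolean[] marked = {true, false, true, false, true, false}; // assumed result of marking
        int boundary = compact(heap, marked);
        // prints "[A, C, E, null, null, null] boundary=3"
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

All free space now forms one contiguous block after the boundary, at the cost of moving the surviving objects.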
2.5 Generational Algorithm
Each of the three algorithms above has its own problems:
- Efficiency: the mark-sweep and mark-compact algorithms need to traverse the entire heap space, which may lead to low garbage collection efficiency.
- Memory fragmentation: the mark-sweep algorithm produces memory fragments during the sweeping phase, which wastes memory space and leaves the memory layout discontinuous.
The generational algorithm addresses these problems. It is a strategy that combines multiple garbage collection algorithms: the heap memory is divided into regions, and a different garbage collection strategy is adopted for each region, so as to achieve better overall results.
Design idea of the generational algorithm:
In Java programs, the life cycles of different objects often differ greatly. Most newly created objects become garbage very quickly, while some objects may live for a long time. Based on this observation, the heap is divided into generations that hold objects with different life cycles, so that the efficiency and performance of garbage collection can be better optimized.
The generational algorithm usually divides the heap memory into the following generations:
- Young Generation: Newly created objects are usually allocated in the young generation. The young generation is collected with the copying algorithm, because these objects have a short life cycle and most of them quickly become garbage.
- Old Generation: Objects that are still alive after multiple GCs are moved to the old generation. The old generation is collected with the mark-sweep or mark-compact algorithm, because these objects have a long life cycle and produce relatively little garbage.
- Permanent Generation: The permanent generation is used to store information such as class metadata and constants. Starting with Java 8, the permanent generation is replaced by Metaspace.
Through the generational algorithm, different generations adopt different garbage collection strategies, which can perform finer garbage collection for objects with different life cycles, avoid traversing the entire heap space, and improve garbage collection efficiency. This design enables the garbage collector to dynamically adjust the recycling strategy according to the running status of the application, so as to better adapt to different application scenarios.
2.6 Minor GC and Major GC
In the JVM, garbage collection (GC) can be divided into two types according to the region being collected: Minor GC and Major GC (often also called Full GC, although strictly speaking a Full GC collects the whole heap rather than only the old generation).
Minor GC:
Minor GC is the garbage collection process for the young generation. The young generation is the part of the Java heap memory used to store newly created objects. Newly created objects usually have a short lifetime, so the young generation is collected with the copying algorithm.
The working process of Minor GC is as follows:
- Marking phase: Starting from the GC Roots objects, mark all the objects that are still alive in the young generation.
- Copy phase: All surviving objects are copied from the Eden zone and the Survivor zone in use to the other Survivor zone.
- Role swap: After the copy is completed, Eden and the Survivor zone just used are cleared and the roles of the two Survivor zones are swapped, so that the Survivor zone that received the objects becomes the one in use and the other becomes the free zone for the next collection.
The purpose of Minor GC is to clean up the garbage objects in the young generation so that it can allocate space for new objects, while keeping the free space of the young generation contiguous and avoiding memory fragmentation.
Major GC:
Major GC is the garbage collection process for the old generation, which is used to store long-lived objects. Because objects in the old generation have a long life cycle, using the copying algorithm there could incur a large copy cost, so the mark-sweep or mark-compact algorithm is usually used instead.
The working process of Major GC is as follows:
- Marking phase: Starting from the GC Roots objects, mark all the objects that are still alive in the old generation.
- Sweeping or compacting phase: Perform the corresponding collection operation according to the algorithm in use. With the mark-sweep algorithm, unmarked garbage objects are cleaned up in place; with the mark-compact algorithm, live objects are moved to one end and the unmarked garbage objects are cleaned up.
The purpose of Major GC is to clean up the garbage objects in the old generation so that they do not occupy too much memory. When compaction is used, it also keeps the free space in the old generation contiguous and avoids memory fragmentation.
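How often each kind of collection has run can be observed from inside a program through the standard java.lang.management API. The sketch below only reads the counters; the collector names it prints depend on which garbage collector the JVM is running with, so no particular output should be assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcStatsDemo {
    // One line per collector: its name, how many collections it has run, and the total time spent.
    static List<String> collectorSummaries() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + ": count=" + gc.getCollectionCount()
                    + ", timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Allocate short-lived garbage so that young generation collections are likely to occur.
        for (int i = 0; i < 200_000; i++) {
            byte[] garbage = new byte[1024];
        }
        for (String line : collectorSummaries()) {
            System.out.println(line);
        }
    }
}
```

On a generational collector there are typically separate entries for the young and the old generation, and the young generation count usually grows much faster, which matches the idea that Minor GC is frequent and Major GC is rare.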