In-depth understanding of JVM garbage collection algorithms: the copying algorithm

Generally speaking, garbage collection takes up only a small part of a program's total running time; the mutator (the application code itself) takes up most of it. The speed of memory allocation therefore has a direct effect on the performance of the whole program. The mark-sweep algorithm discussed earlier does poorly here: although it is simple and easy to implement, it suffers from serious memory fragmentation, which slows allocation considerably.

The mark-compact algorithm eliminates fragmentation and allocates very quickly, but it traverses the heap several times during collection, which significantly increases collection time.

This article introduces the third basic garbage collection algorithm: the semispace copying algorithm. The collector compacts the heap as it copies, which keeps allocation fast, and it needs to traverse the live objects only once during collection. Its biggest disadvantage is that the usable heap space is cut in half.

The copying algorithm is a typical example of trading space for time.

Implementation principle

In the copying algorithm, the collector divides the heap into two equal-sized semispaces: the **source space (fromspace)** and the **target space (tospace)**. During garbage collection, the collector copies the surviving objects from the source space to the target space. When copying is finished, all surviving objects are packed tightly at one end of the target space, and finally the roles of the source space and the target space are swapped. An overview of the semispace copying algorithm is shown in the figure below.

image

 

image

 

Next, let's look at how this is implemented. The main flow is very simple. A free pointer starts at the beginning of TOSPACE. Starting from the roots, the collector copies every object referenced by a root, together with the child objects it references, to TOSPACE. Each time an object is copied, free is advanced by the size of that object. Finally, FROMSPACE and TOSPACE are swapped. The process can be roughly described by the following code:

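collect() {
    // A leading * marks a pointer variable
    // free points to the start of the TOSPACE semispace
    *free = *to_start;
    for (root in Roots) {
        // Redirect each root to the new copy of the object it referenced
        *root = copy(*free, root);
    }
    // Swap FROMSPACE and TOSPACE
    swap(*from_start, *to_start);
}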

The core function copy is implemented as follows:

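copy(*free, obj) {
    // Check whether obj has already been copied
    // The tag here is only a logical field
    if (obj.tag != COPIED) {
        // Actually copy obj into the space that free points to
        copy_data(*free, obj);
        // Mark obj as COPIED
        // Even if several pointers refer to obj, it is not copied again
        obj.tag = COPIED;
        // Store the new address in the forwarding field of the old object
        obj.forwarding = *free;
        // Advance the free pointer by the size of obj
        *free += obj.size;

        // Recursively copy the child objects referenced by the new copy
        for (child in getRefNode(obj.forwarding)) {
            *child = copy(*free, child);
        }
    }
    return obj.forwarding;
}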

Two points in this code deserve attention. The first is tag = COPIED, which is only a logical concept: it records whether an object has already been copied, so that an object referenced several times is still copied only once. The second is the forwarding field. The forwarding pointer has been mentioned many times before; it stores the new address of an object after it has been moved. In the mark-compact algorithm, for example, the references to moved objects have to be traversed and updated afterwards, and the forwarding pointer is used to find each object's new address. Its role in the copying algorithm is similar: when an object that has already been copied is encountered, its new address can be returned directly from the forwarding field. The basic flow of the whole copying algorithm is shown in the figure below.

image

 

Next, let's walk through a detailed example of the copying algorithm. The relationships between the objects in the heap are shown in the figure below, where the free pointer points to the start of TOSPACE.

image

 

First, starting from the root node, the collector finds the objects B and E that the root references directly, and copies B to TOSPACE first. The state of the heap after B has been copied is shown in the figure below.

image

 

Here, the copy of B is called B'. The tag field of the original object B is marked as copied, and its forwarding pointer stores the address of B'.

After B has been copied, the object A that it references is still in FROMSPACE, so A is copied to TOSPACE next.

image

 

Next, the object E referenced from the root is copied, along with the object B that it references. Because B has already been copied, the pointer from E to B only needs to be replaced with a pointer to B'.

image

image

Finally, as long as FROMSPACE and TOSPACE are interchanged, GC is over. The state of the heap at the end of GC is shown in the figure below.

image

 

Here, objects are searched in the order B, A, E; that is, the search is depth-first.
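The walkthrough above can be reproduced with a small Java sketch. This is only an illustration under stated assumptions, not how a real JVM collector is written: the class RecursiveCopySketch, the Obj type, the extra garbage object G, and the object sizes are all hypothetical, TOSPACE is modelled as a list of copies, and a null forwarding field plays the role of the tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the recursive copy; not a real JVM data structure.
final class RecursiveCopySketch {
    static final class Obj {
        final String name;
        final int size;
        final List<Obj> refs = new ArrayList<>();
        Obj forwarding;   // copy in to-space, or null if not copied yet
        int address = -1; // offset inside to-space (only set on copies)

        Obj(String name, int size) {
            this.name = name;
            this.size = size;
        }
    }

    private final List<Obj> toSpace = new ArrayList<>(); // copies, in copy order
    private int free = 0;                                // bump pointer

    // Copies obj and, recursively, everything it references, exactly once.
    private Obj copy(Obj obj) {
        if (obj.forwarding == null) {
            Obj clone = new Obj(obj.name + "'", obj.size);
            clone.address = free;
            free += obj.size;
            toSpace.add(clone);
            obj.forwarding = clone; // set before recursing so cycles terminate
            for (Obj child : obj.refs) {
                clone.refs.add(copy(child)); // depth-first descent
            }
        }
        return obj.forwarding;
    }

    // Collects from the given roots and returns the names in copy order.
    static List<String> collect(List<Obj> roots) {
        RecursiveCopySketch gc = new RecursiveCopySketch();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, gc.copy(roots.get(i))); // redirect the root itself
        }
        List<String> order = new ArrayList<>();
        for (Obj o : gc.toSpace) {
            order.add(o.name);
        }
        return order;
    }

    // The example heap: roots -> B, E; B -> A; E -> B; G is unreachable.
    static List<Obj> exampleRoots() {
        Obj a = new Obj("A", 2);
        Obj b = new Obj("B", 3);
        Obj e = new Obj("E", 1);
        Obj g = new Obj("G", 4); // garbage: never copied
        b.refs.add(a);
        e.refs.add(b);
        g.refs.add(a);
        return Arrays.asList(b, e);
    }

    public static void main(String[] args) {
        System.out.println(collect(exampleRoots())); // prints [B', A', E']
    }
}
```

The copy order B', A', E' matches the depth-first order described above, and the unreachable object G is never visited at all.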

Algorithm evaluation

The replication algorithm has the following advantages:

  • High throughput: The GC searches for and copies only live objects, so its running time is proportional to the number of live objects rather than to the size of the heap. The larger the heap, the more obvious this advantage becomes.

  • Fast allocation: After GC completes, the free space is one contiguous block of memory. As long as the requested size fits in that block, allocation only needs to move the free pointer. This is significantly faster than the free-list allocation used by the mark-sweep algorithm, which has to traverse the list to find a block of suitable size.

  • No fragmentation: Live objects are packed together at one end of TOSPACE on every collection, so no gaps are left between them.

  • Cache friendly: Recall the principle of locality mentioned earlier. Because all live objects are packed closely together in memory, this is very favorable for the CPU cache.
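To make the fast-allocation point concrete, here is a minimal sketch of pointer-bumping allocation over a contiguous free block. The BumpAllocator class and its use of plain integers as addresses are assumptions made for illustration; a real collector works on raw memory and would trigger a GC instead of returning -1.

```java
// Hypothetical bump-pointer allocator over one contiguous free region.
final class BumpAllocator {
    private final int end; // first address past the free region
    private int free;      // next free address

    BumpAllocator(int start, int end) {
        this.free = start;
        this.end = end;
    }

    // Returns the address of the new block, or -1 if a GC would be needed.
    int allocate(int size) {
        if (size <= 0 || size > end - free) {
            return -1;
        }
        int address = free;
        free += size; // the whole allocation: one bounds check and one add
        return address;
    }

    public static void main(String[] args) {
        BumpAllocator heap = new BumpAllocator(0, 64);
        System.out.println(heap.allocate(16)); // 0
        System.out.println(heap.allocate(40)); // 16
        System.out.println(heap.allocate(16)); // -1, only 8 units remain
    }
}
```

No list is searched at any point, which is where the advantage over a free-list allocator comes from.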

Compared with the previous two GC algorithms, it has two main disadvantages:

  • Low heap utilization: The copying algorithm splits the heap in two and only one half can be used at a time. This very low memory utilization is the biggest defect of the copying algorithm.

  • Recursive function calls: Copying an object requires recursively copying the objects it references. Compared with an iterative algorithm, recursion incurs extra function-call overhead and carries the risk of overflowing the stack.

Cheney copy algorithm

The Cheney algorithm addresses the question of how to traverse the reference graph and move the surviving objects to TOSPACE: it uses iteration instead of recursion.

Let's take a simple example to see how the Cheney algorithm executes. We start from the initial state, which is the previous example with a small change. Two pointers, scan and free, both point to the start of TOSPACE.

image

 

First, all objects directly referenced from the root node are copied; here that means B and E.

image

 

At this point, the objects directly referenced by the root node have been copied. scan still points to the start of TOSPACE, while free has advanced from the start by the combined size of B and E.

Next, scan and free continue to move forward. Each advance of scan means that one copied object has been fully searched; each advance of free means that a new object has been copied.

Continuing the example, once B and E have been copied, the collector starts copying all the objects referenced by B, which here are A and C.

image

 

While A and C are being copied, free moves forward. Once both have been copied, scan advances by the size of B and now points to E. Scanning E finds that the object B it references has already been copied, so only the pointer in E is updated; scan advances by the size of E and free stays where it is. Since object A does not reference any object, scan then advances by the size of A, and free again stays unchanged.

image

 

Next, D, the object referenced by C, is copied. After D has been copied and scan has moved past it, scan and free meet, and copying ends.

image

 

Finally, FROMSPACE and TOSPACE are swapped as before, and GC ends.

The implementation needs only a slight modification of the previous code:

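collect() {
    // scan and free both point to the start of the TOSPACE semispace
    *scan = *free = *to_start;
    // Copy the objects directly referenced by the roots
    for (root in Roots) {
        *root = copy(*free, root);
    }
    // scan now starts to move forward:
    // first fetch the objects referenced by the object at scan;
    // once all of them have been copied, advance scan
    while (*scan != *free) {
        for (child in getRefObject(scan)) {
            // Redirect the reference to the new copy
            *child = copy(*free, child);
        }
        *scan += scan.size;
    }
    swap(*from_start, *to_start);
}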

The copy function no longer contains a recursive call; it only performs the copy itself:

copy(*free, obj) {
    if (!is_pointer_to_heap(obj.forwarding, *to_start)) {
        // Actually copy obj into the space that free points to
        copy_data(*free, obj);
        // Store the new address in the forwarding field of the old object
        obj.forwarding = *free;
        // Advance the free pointer by the size of obj
        *free += obj.size;
    }
    return obj.forwarding;
}

is_pointer_to_heap(obj.forwarding, *to_start) returns TRUE if obj.forwarding is a pointer into TOSPACE, and FALSE otherwise. No tag is used here to tell whether an object has been copied; the obj.forwarding pointer is checked directly instead. If obj.forwarding is not a pointer, or does not point into TOSPACE, the object is considered not yet copied; otherwise it has already been copied.
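The scan/free loop can be tried out with the following Java sketch. It is an illustration under stated assumptions rather than a real collector: CheneySketch and Obj are hypothetical names, TOSPACE is modelled as a list so that scan is a list index and free is implicitly the list size, and a null forwarding field stands in for the is_pointer_to_heap check. The object graph is the one from the Cheney example (roots to B and E, B to A and C, C to D, E to B).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of Cheney's algorithm; not a real JVM data structure.
final class CheneySketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj forwarding; // copy in to-space, or null if not copied yet

        Obj(String name) {
            this.name = name;
        }
    }

    private final List<Obj> toSpace = new ArrayList<>();

    // Non-recursive copy: the new copy still holds from-space references,
    // which the scan loop in collect() fixes up later.
    private Obj copy(Obj obj) {
        if (obj.forwarding == null) {
            Obj clone = new Obj(obj.name + "'");
            clone.refs.addAll(obj.refs);
            toSpace.add(clone); // "free" advances implicitly
            obj.forwarding = clone;
        }
        return obj.forwarding;
    }

    // Collects from the given roots and returns the names in copy order.
    static List<String> collect(List<Obj> roots) {
        CheneySketch gc = new CheneySketch();
        for (int i = 0; i < roots.size(); i++) {
            roots.set(i, gc.copy(roots.get(i)));
        }
        // Objects in [scan, toSpace.size()) are copied but not yet scanned:
        // that region of to-space is the breadth-first queue.
        for (int scan = 0; scan < gc.toSpace.size(); scan++) {
            List<Obj> refs = gc.toSpace.get(scan).refs;
            for (int i = 0; i < refs.size(); i++) {
                refs.set(i, gc.copy(refs.get(i))); // redirect to the copy
            }
        }
        List<String> order = new ArrayList<>();
        for (Obj o : gc.toSpace) {
            order.add(o.name);
        }
        return order;
    }

    // The example heap: roots -> B, E; B -> A, C; C -> D; E -> B.
    static List<Obj> exampleRoots() {
        Obj a = new Obj("A");
        Obj b = new Obj("B");
        Obj c = new Obj("C");
        Obj d = new Obj("D");
        Obj e = new Obj("E");
        b.refs.addAll(Arrays.asList(a, c));
        c.refs.add(d);
        e.refs.add(b);
        return Arrays.asList(b, e);
    }

    public static void main(String[] args) {
        System.out.println(collect(exampleRoots())); // prints [B', E', A', C', D']
    }
}
```

The copy order B', E', A', C', D' is the same as in the walkthrough: the roots' objects first, then the objects B references, then D.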

As the code shows, the Cheney algorithm is a breadth-first algorithm. Readers familiar with breadth-first search will know that it needs a first-in, first-out queue, yet no queue appears here. In fact, the region of the heap between scan and free serves as the queue. Objects to the left of scan have already been searched; objects to its right are still waiting to be searched. When free moves forward, an object is appended to the queue; when scan moves forward, an object is taken from the queue and searched. This satisfies the first-in, first-out requirement.

The following is a typical implementation of a breadth-first traversal algorithm. You can use it for comparison and deepen your understanding.

import java.util.*;

// Node, process() and getChildren() are assumed to be defined elsewhere.
void bfs(List<Node> roots) {
    // Nodes that have already been visited
    Set<Node> visited = new HashSet<>();
    // Queue holding the nodes still to be traversed, in order
    Queue<Node> queue = new ArrayDeque<>();

    for (Node root : roots) {
        // Mark on enqueue so that no node enters the queue twice
        if (visited.add(root)) {
            queue.offer(root);
        }
    }

    while (!queue.isEmpty()) {
        Node current = queue.poll();
        process(current);
        for (Node child : getChildren(current)) {
            if (visited.add(child)) {
                queue.offer(child);
            }
        }
    }
}

Compared with the recursive version, the advantage of the Cheney algorithm is that it uses iteration instead of recursion, which avoids stack consumption and the risk of stack overflow. Using the heap itself as the queue for breadth-first traversal is especially clever. The disadvantage is that objects that reference each other no longer end up adjacent in memory, so the cache cannot be exploited as fully. Note that this does not mean the Cheney algorithm is cache-unfriendly, only that it is not as good in this respect as the recursive version.

Finally

There are many variants of the copying algorithm, too many to list here. For more information, see the two books in the reference materials.

The biggest drawback of the copying algorithm is its low heap utilization, so in most scenarios it is used together with other algorithms. In addition, the heap is not really split into two equal halves; it is divided sensibly according to the actual situation. For example, the heap can be divided into 10 parts, with 2 parts used as the From space and the To space for the copying algorithm, and the remaining 8 parts managed with the mark-sweep algorithm.

Does this remind you of how the JVM divides the heap into a young generation and an old generation? The reasoning is exactly what we have discussed here.
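The arithmetic behind this split can be checked with a short sketch. The HeapUtilization class and its usableFraction method are hypothetical helpers, and the model is deliberately simplified: it assumes that exactly half of the copying region (the To space) is idle at any time and that the rest of the heap is fully usable.

```java
// Hypothetical helper: fraction of the heap that can hold live data when
// copyParts out of totalParts are managed by semispace copying.
final class HeapUtilization {
    static double usableFraction(int totalParts, int copyParts) {
        double idle = copyParts / 2.0; // the To space is always empty
        return (totalParts - idle) / totalParts;
    }

    public static void main(String[] args) {
        // Whole heap managed by the copying algorithm: half is usable.
        System.out.println(usableFraction(10, 10)); // 0.5
        // 2 of 10 parts managed by copying: nine tenths are usable.
        System.out.println(usableFraction(10, 2));  // 0.9
    }
}
```

Under these assumptions, restricting the copying algorithm to 2 of 10 parts raises the usable share of the heap from 50% to 90%.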

Origin blog.csdn.net/AI_mashimanong/article/details/109159167