How the default JVM garbage collectors work

This is a translation of an article from Oracle.

Garbage collection (GC) is the process by which memory that is no longer in use is automatically identified and reclaimed for reuse. Unlike languages in which objects must be created and destroyed manually, Java has a GC mechanism, so developers do not need to check whether each object is still needed. Instead, the GC process silently discards unwanted objects in the background and tidies up the ones that remain. This mechanism helps programs run more efficiently.

What is garbage collection?

The JVM organizes program data as objects. Objects contain fields (data), and they live in a managed address space called the heap.

Consider a binary tree node class definition:

class TreeNode {
    public TreeNode left, right;
    public int data;
    TreeNode(TreeNode l, TreeNode r, int d) {
        left = l; right = r; data = d;
    }
    public void setLeft(TreeNode l) { left = l; }
    public void setRight(TreeNode r) { right = r; }
}

Suppose we then create the following objects:

TreeNode left = new TreeNode(null, null, 13);
TreeNode right = new TreeNode(null, null, 19);
TreeNode root = new TreeNode(left, right, 17);

This creates a binary tree whose root node is 17, with 13 as its left child and 19 as its right child, as shown in the following figure:

Binary Tree

If we then replace the right child, node 19 becomes an isolated garbage object:

root.setRight(new TreeNode(null, null, 21));

The result is shown below:

Replace node
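To make "isolated" concrete, here is a minimal sketch of the reachability walk that a collector's marking step is built around. The class and method names (ReachabilitySketch, Node, reachableFrom) are made up for this illustration, and Node simply mirrors the TreeNode class above so that the example compiles on its own. A real JVM also treats local variables, statics and thread stacks as roots; the sketch only looks at the tree root.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.Set;

class ReachabilitySketch {
    // Same shape as TreeNode, repeated so the sketch is self-contained.
    static class Node {
        Node left, right;
        final int data;
        Node(Node l, Node r, int d) { left = l; right = r; data = d; }
    }

    // Walks the object graph from the root and returns every node that can still be reached.
    static Set<Node> reachableFrom(Node root) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>();
        if (root != null) pending.push(root);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (!marked.add(n)) continue;          // already visited
            if (n.left != null) pending.push(n.left);
            if (n.right != null) pending.push(n.right);
        }
        return marked;
    }

    public static void main(String[] args) {
        Node left = new Node(null, null, 13);
        Node right = new Node(null, null, 19);
        Node root = new Node(left, right, 17);
        root.right = new Node(null, null, 21);     // the tree no longer refers to node 19

        Set<Node> live = reachableFrom(root);
        System.out.println(live.size());           // prints 3 (nodes 17, 13 and 21)
        System.out.println(live.contains(right));  // prints false: node 19 is garbage from the tree's point of view
    }
}
```

Anything that the walk does not mark is a candidate for reclamation.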

You can imagine that, as data structures are built and modified, the heap ends up looking something like this:

Heap

Compacting data means changing its address in memory. A Java program expects to find an object at a specific address, so if the garbage collector moves an object, the program needs to know its new location. The simplest way to achieve this is to stop all Java threads, compact all the objects, update every reference to an old address so that it points to the new address, and then resume the Java program. However, this approach can lead to long periods during which the Java threads are not running (GC pause times).
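The "move everything, then fix every reference" step can be sketched on a toy heap. This is only an illustration under simplifying assumptions: each object holds at most one reference, addresses are array indexes, and the names (CompactionSketch, compact, NONE) are invented for the example rather than taken from any JVM.

```java
import java.util.Arrays;

class CompactionSketch {
    static final int NONE = -1;

    // refs[a] is the address that the object at address a points to (NONE if it points nowhere).
    // live[a] says whether the object at address a survived marking.
    // Returns the compacted reference array: dead slots are dropped and survivors slide left.
    static int[] compact(int[] refs, boolean[] live) {
        // Pass 1: compute the new (forwarding) address of every live object.
        int[] forward = new int[refs.length];
        int next = 0;
        for (int a = 0; a < refs.length; a++) {
            forward[a] = live[a] ? next++ : NONE;
        }
        // Pass 2: move each survivor and rewrite its reference to the target's new address.
        int[] compacted = new int[next];
        for (int a = 0; a < refs.length; a++) {
            if (!live[a]) continue;
            int target = refs[a];
            compacted[forward[a]] = (target == NONE) ? NONE : forward[target];
        }
        return compacted;
    }

    public static void main(String[] args) {
        int[] refs = {2, NONE, 4, NONE, NONE};               // 0 -> 2 -> 4, slots 1 and 3 are garbage
        boolean[] live = {true, false, true, false, true};
        System.out.println(Arrays.toString(compact(refs, live)));  // prints [1, 2, -1]
    }
}
```

Both passes assume that nothing else touches the heap while they run, which is exactly why the simple approach has to stop the Java threads.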

A program that cannot run is unacceptable to any developer. There are two ways to reduce GC pause times, usually referred to in the Java literature as concurrent algorithms (which do their work while the program is running) and parallel algorithms (which use more threads to finish the work faster while the Java threads are stopped). The default garbage collector in JDK 8, which can also be enabled manually with the command-line option -XX:+UseParallelGC, uses the parallel strategy: it uses many GC threads to achieve good throughput.
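If you are unsure which collector a given JVM is actually running, the standard java.lang.management API can list the active collectors. The sketch below uses only ManagementFactory.getGarbageCollectorMXBeans() and GarbageCollectorMXBean.getName(); the class name CollectorNames is made up. The exact names printed depend on the JVM version and flags, so no output is shown; with the parallel collector on HotSpot they are typically names such as "PS Scavenge" and "PS MarkSweep".

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class CollectorNames {
    // Returns the names of the garbage collectors registered in the running JVM.
    static List<String> activeCollectors() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Try running with and without -XX:+UseParallelGC and compare the output.
        System.out.println(activeCollectors());
    }
}
```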

Parallel garbage collector

The parallel garbage collector divides objects into two regions, the young generation and the old generation, based on the number of GC cycles they have survived. New objects are initially allocated in the young generation. During the compaction phase, an object that has not yet survived a certain number of cycles stays in the young generation; once it has survived long enough, it is promoted to the old generation. This way, the program does not have to be paused to clean up the entire heap, which would take a long time; instead, only the part of the heap most likely to contain short-lived objects is cleaned up. As the program keeps running, the longer-lived objects eventually need to be cleaned up as well.
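The aging-and-promotion rule can be sketched as a small simulation. Everything here is illustrative: the class GenerationSketch, its fields, and the tenuringThreshold parameter are hypothetical names for this example, and reachability is a flag set by hand rather than the result of a real marking pass.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class GenerationSketch {
    static class Obj {
        final String name;
        int age;                    // number of young collections survived so far
        boolean reachable = true;   // set by hand in this sketch
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int tenuringThreshold;    // hypothetical: cycles to survive before promotion

    GenerationSketch(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    // New objects always start out in the young generation.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        young.add(o);
        return o;
    }

    // One young collection: drop unreachable objects, age the survivors,
    // and promote those that have survived long enough.
    void youngCollection() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.reachable) { it.remove(); continue; }
            o.age++;
            if (o.age >= tenuringThreshold) { it.remove(); old.add(o); }
        }
    }
}
```

With a threshold of 2, an object that stays reachable remains in young after the first call to youngCollection() and moves to old after the second, while an unreachable object disappears on the first call without the old generation ever being touched.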

To collect only the young generation, the garbage collector needs to know which old-generation objects refer to objects in the young generation, because those references must be updated to point to the young objects' new locations. The JVM does this with a data structure called a card table: whenever a reference is written into an old-generation object, the write is marked in the card table. During the next young GC cycle, the JVM finds the references from the old generation into the young generation by scanning the card table. Because these references are known, the collector can identify which objects can be reclaimed and which references need to be updated. While the program is paused, the collector uses multiple threads so that the GC work finishes as quickly as possible.
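A card table is easy to picture as an array of flags, one per fixed-size chunk ("card") of the old generation. The sketch below is a simplified model, not HotSpot's implementation: the class CardTableSketch, its method names, and the 512-byte card size are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

class CardTableSketch {
    static final int CARD_SIZE = 512;   // assumed card size in bytes for this sketch
    private final boolean[] dirty;      // one flag per card of the old generation

    CardTableSketch(int oldGenBytes) {
        dirty = new boolean[(oldGenBytes + CARD_SIZE - 1) / CARD_SIZE];
    }

    // Write barrier: called whenever a reference field at this old-generation offset is written.
    void recordWrite(int oldGenOffset) {
        dirty[oldGenOffset / CARD_SIZE] = true;
    }

    // Young collection: returns the cards that must be scanned for old-to-young
    // references and resets them to clean for the next cycle.
    List<Integer> dirtyCardsAndClear() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) {
                cards.add(i);
                dirty[i] = false;
            }
        }
        return cards;
    }
}
```

The trade-off is that every reference write pays for one extra store, in exchange for a young collection that scans a handful of dirty cards instead of the whole old generation.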

G1 garbage collector

The G1 garbage collector in the JDK uses both concurrent and parallel threads. It uses concurrent threads to scan for live objects while the program is running, and parallel threads to copy objects quickly, which keeps pause times short.

G1 divides the heap into many regions. While the program is running, a region can belong to either the old generation or the young generation. Young-generation regions must be collected in every GC cycle, but for old-generation regions G1 is free to choose how many to collect, based on the GC pause time the user has specified. This flexibility lets G1 focus its old-generation work on the regions that contain the most garbage, and lets it tune collection pause times to the user-specified target.

As shown below, G1 compacts objects into new regions. The live objects in Region1 and Region2 are compacted into Region4, and new objects are then allocated in Region4. Region3 is not collected, because it would require too much copying (70% live data) to reclaim too little space (30%).

Before and after a G1 run

G1 has a good idea of how much live data each region contains and roughly how long it takes to copy that data. If the user wants short GC pauses, G1 collects only a few regions; if the user does not care about pause times, or sets a long pause target, G1 collects more regions.
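The idea of choosing regions against a pause budget can be sketched as a greedy selection. This is not G1's actual policy, only a toy model under stated assumptions: copy time is taken to be proportional to a region's live fraction, and the names (RegionSelectionSketch, choose, millisPerFullRegion) and the live fractions for Region1 and Region2 are invented. Only Region3's 70% comes from the example above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class RegionSelectionSketch {
    static class Region {
        final String name;
        final double liveFraction;   // share of the region still occupied by live objects
        Region(String name, double liveFraction) {
            this.name = name;
            this.liveFraction = liveFraction;
        }
    }

    // Picks the regions with the least live data first, and stops as soon as the
    // estimated copy time of the next region would exceed the remaining pause budget.
    static List<String> choose(List<Region> candidates, double pauseTargetMillis,
                               double millisPerFullRegion) {
        List<Region> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.liveFraction));
        List<String> chosen = new ArrayList<>();
        double budget = pauseTargetMillis;
        for (Region r : sorted) {
            double cost = r.liveFraction * millisPerFullRegion;   // assumed cost model
            if (cost > budget) break;
            budget -= cost;
            chosen.add(r.name);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("Region1", 0.2));   // made-up value
        regions.add(new Region("Region2", 0.1));   // made-up value
        regions.add(new Region("Region3", 0.7));   // 70% live, as in the figure
        System.out.println(choose(regions, 4.0, 10.0));    // prints [Region2, Region1]
        System.out.println(choose(regions, 20.0, 10.0));   // prints [Region2, Region1, Region3]
    }
}
```

A short pause target leaves Region3 alone, while a generous one lets the collector reclaim it as well.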

To collect only young-generation regions, G1 must maintain a card table; it also needs to record, for each old-generation region, the references into it from other old-generation regions. This data structure is called an into remembered set.

The disadvantage of setting a short pause time is that G1 may not be able to keep up with the program's allocation rate, in which case it eventually gives up and falls back to a full stop-the-world (STW) GC. In other words, both scanning and copying are done while the Java threads are stopped. Note that if G1 cannot meet the pause time target with a partial collection, a full GC will certainly exceed the specified pause time.

In summary, G1 is an excellent garbage collector that balances throughput against pause times.

Shenandoah garbage collector

Shenandoah is an OpenJDK garbage collector project. It is part of OpenJDK 12 and has also been backported to JDK 8 and JDK 11. Like G1, Shenandoah uses a region-based heap layout and concurrent threads that scan the heap to calculate the amount of live data in each region. The difference between the two lies in how they handle compaction.

Shenandoah compacts data concurrently. (Sharp-eyed readers will notice that this means the GC may be moving objects while the program is reading and writing them; don't worry, this problem is addressed shortly.) As a result, Shenandoah does not need to limit the number of regions it collects in order to minimize pause times. Instead, it picks the most profitable regions, that is, regions with very few live objects or, equivalently, a lot of dead space. Throughout the process, only some steps of the initial and final scanning phases require stopping the program.

The main difficulty in Shenandoah is copying objects concurrently: the GC threads doing the copying and the Java threads accessing the heap need to agree on each object's address. That address may be stored in several places, and all of them must appear to be updated at the same time. As with most problems in computer science, the solution is to add another level of indirection.

Each object is allocated with extra space for an indirection pointer. When a Java thread accesses an object, it first reads the indirection pointer to see whether the object has been moved. When the garbage collector moves an object, it updates the indirection pointer to point to the new location. The indirection pointer of a newly allocated object points to the object itself, and it only points elsewhere once the object has been copied during GC.
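The indirection pointer scheme can be modelled in a few lines of plain Java. This is a single-threaded illustration only, with invented names (ForwardingSketch, Obj, forwardee, evacuate); in the real collector the extra read is emitted by the JVM itself, not written by the application.

```java
class ForwardingSketch {
    static class Obj {
        Obj forwardee = this;   // a new object's indirection pointer refers to the object itself
        int value;
        Obj(int value) { this.value = value; }
    }

    // Every access first follows the indirection pointer to the current copy.
    static int read(Obj ref) {
        return ref.forwardee.value;
    }

    static void write(Obj ref, int v) {
        ref.forwardee.value = v;
    }

    // What the collector does when it moves an object: copy it, then redirect the old copy.
    static Obj evacuate(Obj from) {
        Obj to = new Obj(from.value);
        from.forwardee = to;
        return to;
    }

    public static void main(String[] args) {
        Obj o = new Obj(5);
        Obj copy = evacuate(o);          // the program still holds the old address in o
        write(o, 9);                     // the write lands in the new copy
        System.out.println(read(o));     // prints 9
        System.out.println(copy.value);  // prints 9
        System.out.println(o.value);     // prints 5: the stale original is never consulted
    }
}
```

Because stale references keep working through the extra hop, the collector is free to update them lazily instead of all at once.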

Using an indirection pointer is not free. Reading the pointer to find an object's current location costs both time and space, but the cost is smaller than you might think. In terms of space, Shenandoah does not need the off-heap data structures, such as card tables and into remembered sets, that other collectors use to support partial collections. In terms of time, there are several strategies for eliminating the extra reads. The optimizing JIT compiler can recognize when the program is accessing an immutable field, such as the length of an array; in that case the value is the same whether it is read from the original object or from the copy, so no indirect read is needed. Moreover, if the Java program reads several fields of the same object, the JIT can recognize this and remove the subsequent indirect reads.

If a Java program writes to an object that Shenandoah is in the middle of copying, a race condition arises. This is solved by having the Java threads cooperate with the GC threads. If a Java thread is about to write to an object that is targeted for copying, the Java thread first copies the object into its own allocation area, checks that it was the first to copy the object, and then performs the write. If a GC thread copied the object first, the Java thread releases its own allocation and uses the GC thread's copy.
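The "copy first, then check whether you were first" protocol maps naturally onto a compare-and-set. The following is a rough sketch of that idea using java.util.concurrent.atomic.AtomicReference; the names (CopyRaceSketch, copyOrGetExisting) are made up, and it assumes the write only happens once the object has been selected for copying. It is not Shenandoah's actual barrier code.

```java
import java.util.concurrent.atomic.AtomicReference;

class CopyRaceSketch {
    static class Obj {
        final AtomicReference<Obj> forwardee = new AtomicReference<>();
        volatile int value;
        Obj(int value) {
            this.value = value;
            forwardee.set(this);   // initially the object forwards to itself
        }
    }

    // Used by Java threads and GC threads alike: make a private copy, then try to install it.
    // Whoever wins the compare-and-set owns the canonical copy; the loser drops its own.
    static Obj copyOrGetExisting(Obj from) {
        Obj current = from.forwardee.get();
        if (current != from) return current;     // someone else already copied it
        Obj myCopy = new Obj(from.value);
        if (from.forwardee.compareAndSet(from, myCopy)) {
            return myCopy;                        // we were first
        }
        return from.forwardee.get();              // lost the race: use the winner's copy
    }

    // A Java thread writing to an object that is being copied writes to the canonical copy.
    static void write(Obj ref, int v) {
        copyOrGetExisting(ref).value = v;
    }
}
```

However many threads call copyOrGetExisting on the same object, exactly one copy is ever installed, so all writers end up updating the same new location.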

By eliminating the pause needed to copy live objects, Shenandoah provides much shorter program pause times.

Origin juejin.im/post/5df63bf6e51d4557eb30cd41