Garbage collection mechanism in the JVM

  When a program runs, the operating system allocates memory for it. That memory is limited, and the program keeps creating data while it runs. If data that is no longer needed is never cleared, memory eventually runs out and the program fails with an out-of-memory error (OutOfMemoryError in Java).
  The JVM therefore provides a garbage collection (GC) mechanism: while a program runs on the JVM, the JVM automatically reclaims the data the program no longer uses. In an object-oriented program, releasing memory means destroying objects. So which objects does the JVM destroy, and how does it destroy them? That's what I'm going to talk about next.

1. How to judge which objects need to be recycled?

1. Reference counting (generally not used by JVM implementations):

An object that nothing references can no longer be used, so it is garbage that should be released. Reference counting stores a counter in each object. When a new reference to the object is created, the counter is incremented by 1; when a reference goes away, the counter is decremented by 1. When the counter reaches 0, the object is recycled.

// p now points to the newly created object (call it A), so A's reference count is 1 and A is not garbage
Person p = new Person();
// p no longer refers to A, so A's reference count drops to 0 and A is garbage
p = null;

Advantages:

1. Simple to implement, and deciding whether an object is garbage is cheap.
2. No delay in recycling: an object is collected as soon as it becomes garbage.

Disadvantages:

1. Every reference assignment has to update a counter, which adds overhead.
2. Objects with circular references cannot be recycled. Every object in the cycle keeps a counter of at least 1, so none of them is ever judged to be garbage.

Circular reference: two objects A and B each have a field pointing to the other. Once nothing else references them, each counter is still 1, yet the program has no way to reach either object, so both are in fact garbage.
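The cycle problem can be reproduced with a toy model. The sketch below is only an illustration: the Node class, its refCount field and the setField helper are made up for this example and do not correspond to anything inside a real JVM.

```java
// Toy model of reference counting; hypothetical names, not JVM internals.
class RefCountDemo {
    static class Node {
        final String name;
        int refCount = 0;   // number of references currently pointing at this node
        Node field;         // a single outgoing reference
        Node(String name) { this.name = name; }
    }

    // Point holder.field at target and keep both counters up to date.
    static void setField(Node holder, Node target) {
        if (holder.field != null) holder.field.refCount--;
        holder.field = target;
        if (target != null) target.refCount++;
    }

    // Returns the counters of A and B after all outside references are dropped.
    static int[] cycleCounts() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++;       // the local variable a references A
        b.refCount++;       // the local variable b references B
        setField(a, b);     // A.field -> B
        setField(b, a);     // B.field -> A
        a.refCount--;       // simulate "a = null"
        b.refCount--;       // simulate "b = null"
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = cycleCounts();
        System.out.println("A=" + counts[0] + ", B=" + counts[1]); // A=1, B=1
    }
}
```

Both counters stay at 1 even though the program can no longer reach either node, so a pure reference-counting collector would never free them.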

2. Reachability analysis:

A set of root objects called GC Roots serves as the starting point. From these nodes the collector follows references downward; each path it walks is called a "reference chain". If no reference chain connects an object to any GC Root, the object is unreachable and can be recycled.

What objects can be used as GC Roots?
● Objects referenced by local variables in the stack frames of running threads;
● Objects referenced by static fields of classes in the method area;
● Objects referenced by constants in the method area;
● Objects referenced by JNI (native methods) in the native method stack.
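The search from the GC Roots is an ordinary graph traversal. Here is a minimal sketch, assuming a hypothetical Obj class whose refs list stands in for an object's reference fields; a real collector walks actual heap objects rather than a list like this.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy object graph; Obj and refs are illustrative, not JVM data structures.
class ReachabilityDemo {
    static class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();  // outgoing references
        Obj(String name) { this.name = name; }
    }

    // Follow references from the roots and return every object that was reached.
    static Set<Obj> reachableFrom(List<Obj> gcRoots) {
        Set<Obj> marked = new HashSet<>();
        Deque<Obj> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            if (marked.add(current)) {        // true only the first time we see it
                pending.addAll(current.refs);
            }
        }
        return marked;
    }

    // Returns {kept reachable?, A reachable?, B reachable?}.
    static boolean[] demo() {
        Obj root = new Obj("root");
        Obj kept = new Obj("kept");
        Obj a = new Obj("A");
        Obj b = new Obj("B");
        root.refs.add(kept);   // root -> kept
        a.refs.add(b);         // A <-> B form a cycle that no root points to
        b.refs.add(a);
        Set<Obj> live = reachableFrom(Arrays.asList(root));
        return new boolean[] { live.contains(kept), live.contains(a), live.contains(b) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // [true, false, false]
    }
}
```

A and B reference each other, but no chain leads to them from the root, so they count as garbage.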

Advantages:

1. It handles the circular references that reference counting cannot.

Disadvantages:

1. Java threads have to be paused ("Stop The World") for at least part of the analysis. Reachability analysis happens in the marking phase, and if object references keep changing while it runs, the result may be inaccurate. Enumerating the GC Roots always requires a pause; concurrent collectors such as CMS and G1 do most of the remaining marking alongside the application and use extra bookkeeping to stay correct.

2. Common Garbage Collection Algorithms

1. Mark-sweep (mark-clear) algorithm:

Overview: As the name suggests, the algorithm has two phases, mark and sweep:

1. Mark phase: starting from the GC Roots, use reachability analysis to mark every reachable object.
2. Sweep phase: traverse the heap and free every unmarked object.

Advantages: the algorithm is simple and easy to implement.

Disadvantages:

  1. Time/efficiency:

    Marking visits every live object and sweeping walks the whole heap. The heap often holds a very large number of objects, so neither phase is cheap, and the application has to be paused during GC. The app can freeze, which makes for a poor user experience.

  2. Space:

    Sweeping leaves many discontiguous free blocks, so memory becomes fragmented. When a large object later needs a large contiguous block, there may be none available, which triggers GC repeatedly or even causes an OOM.
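The fragmentation problem is easy to see on a toy heap. The sketch below assumes a heap of equally sized slots modelled as boolean arrays, with the mark bits supplied by hand; the class and method names are invented for the illustration.

```java
// Toy heap of fixed-size slots; not how a real JVM lays out memory.
class MarkSweepDemo {
    // Sweep phase: free every allocated slot that was not marked,
    // then clear the mark bits for the next cycle.
    static void sweep(boolean[] allocated, boolean[] marked) {
        for (int i = 0; i < allocated.length; i++) {
            if (allocated[i] && !marked[i]) {
                allocated[i] = false;
            }
            marked[i] = false;
        }
    }

    static int freeSlots(boolean[] allocated) {
        int free = 0;
        for (boolean used : allocated) {
            if (!used) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] allocated) {
        int best = 0;
        int run = 0;
        for (boolean used : allocated) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] allocated = { true, true, true, true, true, true, true, true };
        // Pretend the mark phase found every other object reachable.
        boolean[] marked = { true, false, true, false, true, false, true, false };
        sweep(allocated, marked);
        System.out.println("free=" + freeSlots(allocated)
                + ", largest contiguous=" + largestFreeRun(allocated)); // free=4, largest contiguous=1
    }
}
```

Half of the heap is free after the sweep, but an object that needs two adjacent slots still cannot be allocated.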

2. Copying (replication) algorithm:

Overview: Divide the heap into two halves and use only one at a time. When GC is triggered, copy the objects that are still alive from the half in use to the other half, then clear the first half in one pass. The copying algorithm suits areas where most objects are garbage at collection time (if few objects die, many survivors have to be copied over and over). For example, the young generation, with its Eden and Survivor areas, is collected with the copying algorithm.

Advantages:

  1. The algorithm is simple.

  2. It eliminates memory fragmentation.

Disadvantages:

  1. Only half of the memory can be used at any time.
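A minimal sketch of the copy step, assuming the two halves are plain arrays and the set of live objects is already known. The names fromSpace, toSpace and copyLive are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy semispace copy; strings stand in for objects.
class CopyingDemo {
    // Copy every live object into a fresh to-space, packed from index 0,
    // then reclaim the whole from-space in one step.
    static String[] copyLive(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;  // next free index in to-space
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        Arrays.fill(fromSpace, null);
        return toSpace;
    }

    public static void main(String[] args) {
        String[] from = { "A", "B", "C", "D" };
        Set<String> live = new HashSet<>(Arrays.asList("A", "C"));
        String[] to = copyLive(from, live);
        System.out.println(Arrays.toString(to));   // [A, C, null, null]
        System.out.println(Arrays.toString(from)); // [null, null, null, null]
    }
}
```

The survivors end up packed together, so the free space in the new half is one contiguous block. The price is that the other half sits idle between collections.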

3. Mark-compact (mark-sort) algorithm:

Overview: The mark-compact algorithm is very similar to the mark-sweep algorithm. Its marking phase is exactly the same, but instead of freeing the recyclable objects in place, it moves all surviving objects toward one end of the memory and then frees everything beyond the boundary of the last surviving object.

Advantages:

  1. It fixes the memory fragmentation problem of the mark-sweep algorithm.

  2. It avoids the copying algorithm's cost of giving up half of the memory.

Disadvantages:

  1. Besides marking the surviving objects, it has to move them and update every reference to their new addresses, so it is less efficient than the copying algorithm.
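The move-to-one-end step can be sketched on an array. This is a simplified illustration with invented names (heap, marked, compact). It only slides the objects and skips the reference updating that a real collector also has to do.

```java
import java.util.Arrays;

// Toy sliding compaction; strings stand in for objects.
class MarkCompactDemo {
    // Slide every marked object toward index 0, clear the rest,
    // and return the boundary index (the first free slot).
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free] = heap[i];
                free++;
            }
        }
        for (int i = free; i < heap.length; i++) {
            heap[i] = null;  // everything beyond the boundary is reclaimed
        }
        Arrays.fill(marked, false);
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E" };
        boolean[] marked = { true, false, true, false, true };
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + ", boundary=" + boundary);
        // [A, C, E, null, null], boundary=3
    }
}
```

Unlike the copying algorithm, this works inside a single region, but every surviving object is moved in place.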

4. Generational collection algorithm (key point):

In Java, objects live for different lengths of time. They fall roughly into two categories:

	1. Short-lived objects: most of them are garbage by the next collection, so few survive;
	2. Long-lived objects: they rarely need collecting, and most of them survive each collection;

Objects with different characteristics call for different collection algorithms. The memory is therefore divided into the following areas according to object lifetime, so that each generation can use the algorithm that best suits it.

1. New (young) generation: holds short-lived objects and is collected with the copying algorithm;
  • The new generation is divided into an Eden area and two Survivor areas (Survivor from, Survivor to); the default size ratio is 8:1:1.

2. Old generation: holds long-lived objects and is collected with the mark-sweep or mark-compact algorithm;

3. Permanent generation (JDK 6, JDK 7): an implementation detail specific to HotSpot; other virtual machines do not have this concept. Collecting the permanent generation reclaims very little, so it is rarely garbage collected. JDK 8 removed it and replaced it with Metaspace;
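To make the 8:1:1 split of the new generation concrete, here is a small calculation. It assumes the ratio is expressed the way HotSpot's -XX:SurvivorRatio flag expresses it (Eden size divided by the size of one Survivor area) and ignores the alignment and adaptive resizing a real JVM applies. The YoungGenLayout class is made up for the example.

```java
// Back-of-the-envelope split of the young generation; sizes are not aligned
// the way a real JVM would align them.
class YoungGenLayout {
    // Returns {eden, survivorFrom, survivorTo} in bytes.
    static long[] split(long youngBytes, int survivorRatio) {
        long survivor = youngBytes / (survivorRatio + 2);  // Eden counts as `ratio` parts, each Survivor as 1
        long eden = youngBytes - 2 * survivor;
        return new long[] { eden, survivor, survivor };
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] parts = split(100 * mb, 8);
        System.out.println("Eden=" + parts[0] / mb + " MB, from=" + parts[1] / mb
                + " MB, to=" + parts[2] / mb + " MB"); // Eden=80 MB, from=10 MB, to=10 MB
    }
}
```

With this layout only one Survivor area is ever idle, so about 90% of the new generation is usable, compared with 50% for a plain two-half copying scheme.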

 

Newly created objects are first allocated in the Eden area (except for large objects). When Eden does not have enough room for a new object, a Minor GC (new generation GC) is performed: the copying algorithm copies the surviving objects in Eden and Survivor from into Survivor to, Eden and Survivor from are then cleared, and the age of each surviving object is incremented by 1. After that the two Survivor areas swap roles: the original Survivor from becomes the new Survivor to, and the original Survivor to becomes the new Survivor from.

During copying, if Survivor to cannot hold all the surviving objects, the ones that do not fit are placed in the old generation. If the old generation cannot hold them either, a Full GC is performed, which collects the entire heap (both the new generation and the old generation).
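The Minor GC flow described above can be sketched with a toy model. Everything here is an assumption made for the illustration: capacity is counted in objects rather than bytes, each object carries a fixed live flag in place of a real reachability check, and the class and field names are invented.

```java
import java.util.ArrayList;
import java.util.List;

// Toy young generation; not a model of HotSpot's actual data structures.
class MinorGcDemo {
    static class Obj {
        final String name;
        final boolean live;  // stand-in for "still reachable from GC Roots"
        int age = 0;
        Obj(String name, boolean live) { this.name = name; this.live = live; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int survivorCapacity;  // max number of objects one Survivor area holds

    MinorGcDemo(int survivorCapacity) { this.survivorCapacity = survivorCapacity; }

    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) continue;            // garbage is simply not copied
            o.age++;
            if (to.size() < survivorCapacity) {
                to.add(o);                    // normal case: copy into Survivor to
            } else {
                old.add(o);                   // Survivor to is full: goes to the old generation
            }
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;                 // the two Survivor areas swap roles
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        MinorGcDemo heap = new MinorGcDemo(2);
        heap.eden.add(new Obj("a", true));
        heap.eden.add(new Obj("b", false));
        heap.eden.add(new Obj("c", true));
        heap.eden.add(new Obj("d", true));
        heap.minorGc();
        System.out.println("from=" + heap.from.size() + ", to=" + heap.to.size()
                + ", old=" + heap.old.size()); // from=2, to=0, old=1
    }
}
```

Object d survives but does not fit into the Survivor area, so it ends up in the old generation after a single collection.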

Large objects go straight to the old generation: the JVM parameter -XX:PretenureSizeThreshold makes objects larger than the configured value be allocated directly in the old generation. The purpose is to avoid copying large amounts of memory back and forth between the Eden and Survivor areas.

Long-lived objects move to the old generation: the JVM keeps an age counter for each object. If an object is born in Eden, survives its first Minor GC, and fits in Survivor, it is moved into Survivor to and its age is set to 1. Each Minor GC it survives after that increases its age by 1. When its age reaches a threshold (15 by default, configurable with -XX:MaxTenuringThreshold), it is promoted to the old generation. The JVM does not always wait for the maximum age, however. If the total size of all objects of the same age (say, age x) in the Survivor space is greater than half of the Survivor space, all objects of age x or older go straight to the old generation without waiting to reach the maximum age.
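The age rule can be written down as a small function. This sketch implements the rule exactly as worded in the paragraph above; a real JVM may compute its threshold differently in detail. The sizeByAge table and the method name are assumptions made for the example.

```java
// Promotion-age rule as described in the text; hypothetical helper, not a JVM API.
class TenuringDemo {
    // sizeByAge[x] = total bytes of Survivor objects whose age is x (index 0 unused).
    // Returns the age at which objects are promoted to the old generation.
    static int effectiveThreshold(long[] sizeByAge, long survivorBytes, int maxTenuringThreshold) {
        for (int age = 1; age < sizeByAge.length && age < maxTenuringThreshold; age++) {
            if (sizeByAge[age] > survivorBytes / 2) {
                return age;  // objects of this age alone fill more than half of Survivor
            }
        }
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long survivorBytes = 100;
        // Age 3 holds 60 bytes, which is more than half of the 100-byte Survivor area.
        long[] crowded = { 0, 10, 20, 60, 5 };
        System.out.println(effectiveThreshold(crowded, survivorBytes, 15)); // 3
        // No single age exceeds 50 bytes, so the configured maximum applies.
        long[] sparse = { 0, 10, 20, 30 };
        System.out.println(effectiveThreshold(sparse, survivorBytes, 15));  // 15
    }
}
```

In the first case every object of age 3 or older is promoted at the next Minor GC, long before reaching age 15.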

3. Common Garbage Collectors

  1. Serial: The earliest collector. It is single-threaded and collects the young generation.

  2. Serial Old: The old-generation version of Serial. It is also single-threaded, collects the old generation, and serves as the fallback for the CMS collector.

  3. ParNew: A multi-threaded version of Serial that collects the young generation.

  4. Parallel (Parallel Scavenge): A multi-threaded young-generation collector that uses the copying algorithm. It is a throughput-first collector: it accepts longer pauses in exchange for higher overall throughput.

  5. Parallel Old: The old-generation version of Parallel, which uses the mark-compact algorithm.

  6. CMS: It collects the old generation and is implemented with the mark-sweep algorithm, so it leaves memory fragments behind. When the remaining memory cannot keep up with the running program, a Concurrent Mode Failure occurs and the JVM temporarily falls back to Serial Old for the collection, which hurts performance. CMS trades some throughput for the shortest possible collection pauses.

  7. G1: A collector that balances throughput and pause time. It has been the default collector since JDK 9 and collects the whole heap.
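To see which collectors a running JVM is actually using, the standard java.lang.management API can list them. This is a small sketch; the ListCollectors class name is made up, and the names printed depend on the JVM version and the collector selected at startup.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Lists the garbage collectors registered in the current JVM.
class ListCollectors {
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount/getCollectionTime return -1 if the value is undefined
            System.out.println(bean.getName()
                    + ": collections=" + bean.getCollectionCount()
                    + ", time(ms)=" + bean.getCollectionTime());
        }
    }
}
```

Running it with different startup flags, for example -XX:+UseSerialGC, -XX:+UseParallelGC or -XX:+UseG1GC, shows a different pair of young and old generation collectors each time.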


Origin blog.csdn.net/m0_57614677/article/details/128967816