JVM garbage collection mechanism (GC)

Table of contents

The role of GC

When to apply for memory and when to release memory

Memory leaks and out-of-memory

Memory leak

Out of memory

Disadvantages of GC (garbage collection)

GC (garbage collection) working process

The process of garbage collection

The first stage: looking for garbage / judging garbage

Option 1: Based on reference counting (non-Java)

Drawbacks of reference counting

1. Serious waste of memory space (low space utilization)

2. There will be a circular reference problem

Option 2: Reachability analysis (Java)

What are GC Roots?

After a reference is set to null, will the object it pointed to be recycled immediately?

The second stage: recycling garbage

1. Mark-sweep

The problem with mark-sweep: the freed memory is fragmented

2. Copying algorithm

The problems with the copying algorithm: low space utilization, high overhead when there is little garbage

3. Mark-compact

Generational recycling


 

The role of GC:

GC stands for garbage collection. When writing code we constantly request memory: calling new, creating variables, and so on. Memory is limited, though, and allocating without ever releasing will eventually exhaust it. GC was introduced to solve this problem: it reclaims memory that is no longer in use so that the space can be reused.

When to apply for memory and when to release memory 

The moment to request memory is easy to determine: it is whenever we call new or create a variable. It is much harder to know when that memory is no longer needed. If we release it too early and the program still needs it, the program breaks. If we release it too late, a later request may fail because not enough memory is left. Two common problems arise from this: memory leaks and running out of memory.

Memory leaks and out-of-memory 

Memory leak

If a program keeps requesting memory without releasing what it no longer needs, less and less memory remains available until eventually none is left. This phenomenon is a memory leak. Garbage collection relieves programmers of most of this worry, but GC also has certain disadvantages.
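A collector only reclaims objects that are no longer reachable, so an object that is still referenced but never used again stays in memory. The sketch below illustrates that case. The class LeakyCache and its methods are hypothetical names made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class, not from the original article.
class LeakyCache {
    // A static list stays reachable for as long as the class is loaded,
    // so every buffer stored in it stays reachable as well.
    private static final List<byte[]> CACHE = new ArrayList<>();

    static void handleRequest() {
        byte[] buffer = new byte[1024]; // needed only for this call...
        CACHE.add(buffer);              // ...but never removed afterwards
    }

    static int retainedBuffers() {
        return CACHE.size();
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        // All buffers are still referenced from CACHE, so GC cannot free them.
        System.out.println("retained buffers: " + retainedBuffers());
    }
}
```

Removing entries from the list once they are no longer needed would make the buffers unreachable again and let GC reclaim them.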

Out of memory

Out of memory is a different condition from a leak, although an unchecked leak can eventually lead to it. It means that the program requests memory and there is not enough available to satisfy the request. In Java this surfaces as an OutOfMemoryError, for example when a program tries to allocate an array larger than the space left on the heap.
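A minimal sketch of an out-of-memory condition in Java follows. It assumes that a single array of Integer.MAX_VALUE longs (about 16 GiB) is more than the JVM will hand out in one allocation, which holds on common HotSpot setups but is an assumption. The class name OutOfMemoryDemo is hypothetical.

```java
// Hypothetical demo class: request more memory than the JVM can provide.
class OutOfMemoryDemo {
    // Returns true if the allocation succeeded, false if it was refused.
    static boolean tryHugeAllocation() {
        try {
            // Assumed to be too large for one allocation on this JVM.
            long[] huge = new long[Integer.MAX_VALUE];
            return huge.length > 0; // reached only if the allocation worked
        } catch (OutOfMemoryError e) {
            // The exact message differs between JVMs and heap settings.
            System.out.println("allocation failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("allocated? " + tryHugeAllocation());
    }
}
```

Catching OutOfMemoryError is done here only to keep the demonstration running; real programs rarely recover from it this way.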

Disadvantages of GC (garbage collection)

1. It introduces additional overhead (more resources are consumed).

2. It affects the running speed of the program. GC also causes STW (stop-the-world) pauses, which is an important reason why C++, which pursues speed to the limit, does not introduce GC.

GC (garbage collection) working process

First of all, the JVM's memory is divided into the program counter, the stack, the heap, and the method area (metaspace). Memory on the stack is reclaimed automatically when a method returns, without GC. GC mainly works on the heap, which holds the large number of objects we create with new. The objects GC reclaims are those that are no longer used but still occupy memory.

The process of garbage collection: 

The first stage: looking for garbage / judging garbage 

Option 1: Based on reference counting (non-Java) 

This approach attaches a small counter to each object that records how many references point to it. When the count reaches 0, the object can be recycled.

For example:

public static void fun() {
    Test t1 = new Test(); // first reference to the new object
    Test t2 = t1;         // second reference to the same object
}

After these two lines, the reference count of the new Test() object is 2. When fun returns, its stack frame is popped, t1 and t2 disappear, and the reference count drops to 0. At that point GC can recycle the object.
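The counting in this example can be sketched with a toy class. CountedObject, retain, and release are hypothetical names for illustration only; Java does not expose or use such counters.

```java
// Hypothetical toy model of a reference-counted object.
class CountedObject {
    private int refCount = 0;

    void retain()  { refCount++; }  // a new reference now points here
    void release() { refCount--; }  // a reference went away

    int refCount() { return refCount; }

    boolean isGarbage() { return refCount == 0; }

    public static void main(String[] args) {
        CountedObject obj = new CountedObject();
        obj.retain();  // Test t1 = new Test();
        obj.retain();  // Test t2 = t1;
        System.out.println(obj.refCount());  // prints 2

        obj.release(); // t1 disappears when fun() returns
        obj.release(); // t2 disappears when fun() returns
        System.out.println(obj.isGarbage()); // prints true
    }
}
```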

Reference counting is simple, but it has obvious drawbacks.

Drawbacks of reference counting 

1. Serious waste of memory space (low space utilization)

With reference counting, every new object needs its own counter. The counter takes up space too, and sometimes that space is not small relative to the object. For example, if an object is 4 bytes and its counter is also 4 bytes, half of the memory used is spent on bookkeeping.

2. There will be a circular reference problem

A circular reference is easiest to explain with an example.

Think of a treasure hunt:

[Figure: treasure hunt analogy]

If that example is not easy to follow, here is the same idea in code.

Take a class like this:

class Test{
    Test test=null;
}

Create two instances of this class in a test class:

public class TestDemo {
    public static void main(String[] args) {
        Test t1=new Test();
        Test t2=new Test();
    }
}

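The reference graph at this point: t1 and t2 on the stack each point to their own Test object on the heap, and the test field of each object is null.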

Now we make the two objects reference each other:

public class TestDemo {
    public static void main(String[] args) {
        Test t1=new Test();
        Test t2=new Test();
        t1.test=t2;
        t2.test=t1;
    }
}

Now the references look like this: t1 still points to the first object and t2 to the second, and in addition the test field of the first object points to the second object while the test field of the second object points to the first. Each object therefore has a reference count of 2.

If we now set t1 and t2 to null, the reference count of each object drops to 1, because each is still referenced by the test field of the other.

The two objects refer to each other, but nothing outside can reach either of them (just as in the treasure hunt above). Under reference counting their counts never reach 0, so they can never be recycled, yet they can never be used again either. That is not the result we want, and it causes a memory leak.

So Java does not use reference counting to determine garbage.
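The stuck counts can be reproduced with a toy model of reference counting. RcNode and setField are hypothetical names; the class only imitates what a counting collector would record and is not how the JVM works.

```java
// Hypothetical toy model: an object with a counter and one reference field.
class RcNode {
    int refCount = 0;
    RcNode test = null;

    // Points owner.test at target and adjusts the counts accordingly.
    static void setField(RcNode owner, RcNode target) {
        if (owner.test != null) owner.test.refCount--;
        owner.test = target;
        if (target != null) target.refCount++;
    }

    // Replays the example and returns both counts after t1 = t2 = null.
    static int[] countsAfterDroppingLocals() {
        RcNode a = new RcNode(); a.refCount++; // Test t1 = new Test();
        RcNode b = new RcNode(); b.refCount++; // Test t2 = new Test();
        setField(a, b);                        // t1.test = t2;  b is now 2
        setField(b, a);                        // t2.test = t1;  a is now 2
        a.refCount--;                          // t1 = null;
        b.refCount--;                          // t2 = null;
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = countsAfterDroppingLocals();
        // Both counts stay at 1, so neither object would ever be freed.
        System.out.println(counts[0] + " " + counts[1]); // prints 1 1
    }
}
```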

Option 2: Reachability analysis (Java)

In reachability analysis, a thread periodically scans the objects in memory. The scan resembles a depth-first search that starts from a set of references called GC Roots, and every object it reaches is marked. Marked objects are reachable; unmarked objects are unreachable, that is, garbage. Because only objects reachable from the roots are kept, objects that merely reference each other are still collected, which avoids the circular reference problem.

Although reachability analysis avoids the circular reference problem, the search still takes a lot of time when there are many objects, and that costs performance.
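The marking pass can be sketched as a depth-first traversal over a small object graph. This is a simplified model, not the JVM's implementation; ReachabilityDemo, Node, and mark are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of the marking pass.
class ReachabilityDemo {
    static class Node {
        final String name;
        final List<Node> refs = new ArrayList<>(); // outgoing references
        Node(String name) { this.name = name; }
    }

    // Depth-first traversal from the roots; returns every reachable node.
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = new HashSet<>();
        Deque<Node> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            Node n = stack.pop();
            if (marked.add(n)) {          // first visit: mark and descend
                for (Node r : n.refs) stack.push(r);
            }
        }
        return marked;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b");
        Node c = new Node("c"), d = new Node("d");
        a.refs.add(b);   // a -> b, and a is a root
        c.refs.add(d);   // c and d reference each other,
        d.refs.add(c);   // but no root leads to them

        Set<Node> live = mark(Collections.singletonList(a));
        System.out.println("b reachable? " + live.contains(b)); // true
        System.out.println("c reachable? " + live.contains(c)); // false
    }
}
```

The cycle between c and d is never visited, so both stay unmarked and count as garbage.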

What are GC Roots?

1. Objects referenced by local variables on the stack;

2. Objects referenced from the constant pool;

3. Objects referenced by static members in the method area.

After a reference is set to null, will the object it pointed to be recycled immediately?  

No.

First, setting one reference to null does not mean that no other references to the object remain.

Second, the reachability analysis scan takes time, and an object is recycled only after a scan has judged it to be garbage.

The second stage: recycling garbage

There are three strategies for recycling garbage:

1. Mark-sweep

2. Copying algorithm

3. Mark-compact

1. Mark-sweep

Marking is the reachability analysis described above. After marking, every object found to be garbage is cleared where it lies and its memory is released.

The problem with mark-sweep: the freed memory is fragmented (not contiguous). This hurts the efficiency of the program, because a later request for a large block may find no contiguous space big enough even though enough memory is free in total.
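Fragmentation after sweeping can be seen in a toy heap modelled as an array of slots, where null means free. This is a sketch under that simplification; MarkSweepDemo and its methods are hypothetical names.

```java
// Hypothetical toy heap: each array slot holds one object or is free (null).
class MarkSweepDemo {
    // Frees every occupied slot that was not marked; survivors do not move.
    static void sweep(String[] heap, boolean[] marked) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && !marked[i]) heap[i] = null;
        }
    }

    static int totalFree(String[] heap) {
        int free = 0;
        for (String slot : heap) if (slot == null) free++;
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0, run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        boolean[] marked = { true, false, true, false, true, false };
        sweep(heap, marked);
        // Three slots are free, but no two of them are adjacent.
        System.out.println("free slots: " + totalFree(heap));        // 3
        System.out.println("largest run: " + largestFreeRun(heap));  // 1
    }
}
```

An object needing two adjacent slots could not be placed here, although three slots are free.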

2. Copying algorithm

The copying algorithm simply divides the memory into two halves. Objects that are still in use are copied to the other half, and then the whole half that contained the garbage is released at once. This avoids memory fragmentation.

The problems with the copying algorithm: low space utilization (only half of the memory is usable at a time) and high overhead when there is little garbage, since most objects then have to be copied.
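The copying step can be sketched with two arrays standing in for the two halves of memory. This is a simplified model; CopyingDemo and copyLive are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical toy model: two equally sized halves of the heap.
class CopyingDemo {
    // Copies marked objects to the start of toSpace, then releases the
    // whole fromSpace. Returns the number of objects copied.
    static int copyLive(String[] fromSpace, boolean[] marked, String[] toSpace) {
        int next = 0;
        for (int i = 0; i < fromSpace.length; i++) {
            if (fromSpace[i] != null && marked[i]) {
                toSpace[next++] = fromSpace[i]; // survivors end up adjacent
            }
            fromSpace[i] = null;                // this half is freed entirely
        }
        return next;
    }

    public static void main(String[] args) {
        String[] from = { "A", "B", "C", "D" };
        boolean[] marked = { true, false, true, false };
        String[] to = new String[from.length];

        int copied = copyLive(from, marked, to);
        System.out.println(copied);               // prints 2
        System.out.println(Arrays.toString(to));  // prints [A, C, null, null]
    }
}
```

The free space in the target half is one contiguous block, but the other half sits unused until the next collection.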

3. Mark-compact

Mark-compact works like shifting elements in an array. Objects that are not garbage are moved toward the front, leaving the garbage at the back, and the garbage is then recycled in one go.

The overhead of mark-compact is also relatively large, because the surviving objects have to be moved.
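The array comparison translates directly into a small sketch. As before, the heap is modelled as an array of slots; MarkCompactDemo and compact are hypothetical names.

```java
import java.util.Arrays;

// Hypothetical toy heap: each array slot holds one object or is free (null).
class MarkCompactDemo {
    // Slides marked objects toward index 0 and frees everything behind them.
    // Returns the index of the first free slot after compaction.
    static int compact(String[] heap, boolean[] marked) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[next++] = heap[i]; // next <= i, so nothing live is lost
            }
        }
        Arrays.fill(heap, next, heap.length, null); // recycle the tail at once
        return next;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", "C", "D", "E", "F" };
        boolean[] marked = { true, false, true, false, true, false };

        int firstFree = compact(heap, marked);
        System.out.println(firstFree);              // prints 3
        System.out.println(Arrays.toString(heap));  // prints [A, C, E, null, null, null]
    }
}
```

Unlike the copying algorithm, this needs no second half of memory, at the price of moving objects within the same space.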

Each of the schemes above is a single strategy. The JVM does not rely on one of them alone; it combines them in a strategy called "generational recycling".

Generational recycling 

Generational recycling classifies objects by "age" and recycles each category differently.

The age of an object: each time the object survives a round of GC scanning, its age increases by 1. The age is stored in the object header.

The memory area that stores objects is divided into the new generation and the old generation.

The new generation is divided into the Eden area and the survivor area (there are two survivor areas).

Generational recycling process:

1. Newly created objects are placed in the Eden area.

2. Objects that survive a round of GC are copied to a survivor area (using the copying algorithm). Most objects do not survive even one round of GC.

3. In subsequent rounds of GC, the surviving objects are copied back and forth between the two survivor areas (again using the copying algorithm), and objects that have become garbage are eliminated along the way.

4. An object that has survived multiple rounds of GC is moved into the old generation. Objects in the old generation are scanned much less often than those in the new generation, and the old generation is recycled with mark-compact.

Special case: a very large object (one that takes up a lot of memory) does not go through multiple rounds of GC but enters the old generation directly, because copying it back and forth would cost too much performance.
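The aging and promotion rules can be sketched as a small simulation. It deliberately merges Eden and the two survivor areas into one "young" list, and the values of PROMOTION_AGE and LARGE_OBJECT_SIZE are made-up assumptions, not JVM defaults. All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical toy model of generational collection.
class GenerationalDemo {
    static final int PROMOTION_AGE = 3;        // assumed threshold
    static final int LARGE_OBJECT_SIZE = 1000; // assumed size limit

    static class Obj {
        final String name;
        final int size;
        int age = 0;
        boolean live = true; // stands in for "still reachable"
        Obj(String name, int size) { this.name = name; this.size = size; }
    }

    final List<Obj> young = new ArrayList<>(); // Eden + survivor areas
    final List<Obj> old = new ArrayList<>();

    // Very large objects skip the new generation entirely.
    void allocate(Obj o) {
        if (o.size >= LARGE_OBJECT_SIZE) old.add(o); else young.add(o);
    }

    // One round of GC in the new generation: drop garbage, age the
    // survivors, and promote those that have reached PROMOTION_AGE.
    void minorGc() {
        Iterator<Obj> it = young.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (!o.live) { it.remove(); continue; }
            o.age++;
            if (o.age >= PROMOTION_AGE) { it.remove(); old.add(o); }
        }
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo();
        Obj temp = new Obj("temp", 16);
        Obj longLived = new Obj("longLived", 16);
        Obj big = new Obj("big", 5000);
        heap.allocate(temp);
        heap.allocate(longLived);
        heap.allocate(big);      // goes straight to the old generation

        temp.live = false;       // becomes garbage before the first GC
        for (int i = 0; i < 3; i++) heap.minorGc();

        System.out.println(heap.young.size()); // prints 0
        System.out.println(heap.old.size());   // prints 2
    }
}
```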

Origin blog.csdn.net/m0_67995737/article/details/130022596