An In-Depth Look at the Java Garbage Collection Mechanism

1. Introduction

The garbage collection mechanism is one of the core components of the Java Virtual Machine (JVM) and plays a vital role in memory management. It automatically tracks the objects an application creates and, once those objects are no longer used, reclaims the memory they occupy so that it can be reused. This greatly reduces the burden of manual memory management on developers and avoids many of the memory leaks that come from forgetting to free memory. It is a significant advantage of Java over languages with manual memory management, such as C++.

2. Java memory structure

The memory managed by the JVM at run time is mainly divided into five areas:

  1. Method Area : Stores data such as class information, constants, and static variables for classes that the virtual machine has loaded.
  2. Heap : The Java heap is the largest memory area managed by the JVM. Almost all object instances and arrays are allocated on the heap. It is further divided into the young generation and the old generation so that memory can be allocated and reclaimed efficiently.
  3. Virtual machine stack (Java Stack) : Each thread has its own private stack, which lives exactly as long as the thread. Each method call pushes a stack frame that stores the local variable table, operand stack, dynamic linking information, and method return information.
  4. Native Method Stack : Similar to the virtual machine stack, except that it serves native methods.
  5. Program Counter Register : Acts as the line-number indicator of the bytecode being executed by the current thread. Each thread has its own.

Among them, the method area and the heap are the main areas that the Java garbage collector focuses on, and they are also the focus of our next discussion.
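To make these areas a little more tangible, the sketch below asks the running JVM which memory pools it manages, using the standard java.lang.management API. The class name MemoryAreasDemo is made up for this post, and the pool names printed depend on the JVM version and the collector in use. Thread stacks and program counters are not reported as pools, so only the heap and non-heap areas show up.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: lists the memory pools of the running JVM.
class MemoryAreasDemo {

    // Returns one line per memory pool reported by the JVM.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // HEAP pools belong to the Java heap (its generations or regions);
            // NON_HEAP pools cover areas such as the method area.
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(kind + ": " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Running it with different collectors is a quick way to see how each one names and organizes the heap.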

3. What is garbage

In Java, an object's life cycle begins when it is created (new) and ends when the program can no longer reach it. In other words, once no chain of references leads to the object from a live part of the program (for example, a local variable on a thread's stack or a static field), the object becomes garbage and waits for the garbage collector to reclaim it. Note that a variable referring to the object may still be in scope: if the program can never use the object again (for example, an object used only within a local scope), it may also be treated as garbage. The main job of the garbage collector is to find these garbage objects and release the memory they occupy, making room for new objects.
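One way to observe this from inside a program is a WeakReference, which does not keep its target alive. The following is a minimal sketch, with ReachabilityDemo and isAlive being names invented for this post. The second result is not guaranteed, because System.gc() is only a request to the JVM.

```java
import java.lang.ref.WeakReference;

// Hypothetical demo class: watches an object become eligible for collection.
class ReachabilityDemo {

    // Returns true if the weakly referenced object has not been reclaimed.
    static boolean isAlive(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);

        // The object is strongly reachable through 'strong', so it is not garbage.
        System.out.println("alive while strongly referenced: " + isAlive(weak));

        strong = null; // the only strong reference is gone; the object is now garbage
        System.gc();   // only a request; the JVM may ignore it

        // Often false on HotSpot after the request above, but not guaranteed.
        System.out.println("alive after dropping the reference: " + isAlive(weak));
    }
}
```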

4. Garbage collection algorithm

1. Mark-Sweep algorithm (Mark and Sweep)

This is the most basic garbage collection algorithm. It has two phases: a marking phase and a sweep phase. The marking phase finds and marks all objects that are still alive; the sweep phase then reclaims all unmarked objects.

Although the mark-sweep algorithm is very intuitive, it has two problems. The first is efficiency: neither the marking nor the sweeping process is very efficient. The second is space: sweeping leaves behind a large number of non-contiguous memory fragments.
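The two phases can be sketched on a toy object graph. This is an illustration only, not how the JVM is implemented: the Obj class, the list standing in for the heap, and the explicit root list are all assumptions made for this post, and fragmentation is not modeled. Note that the sweep visits every object on the toy heap, which hints at the efficiency problem, and that the unreachable pair c and d is reclaimed even though the two objects reference each other.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy model of mark-sweep; all names here are hypothetical.
class MarkSweepSketch {

    // A toy heap object: a name, outgoing references, and a mark bit.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: flag everything reachable from the roots.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue; // already visited
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    // Sweep phase: remove unmarked objects and reset marks for the next cycle.
    static int sweep(List<Obj> heap) {
        int reclaimed = 0;
        for (Iterator<Obj> it = heap.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                it.remove();
                reclaimed++;
            }
        }
        return reclaimed;
    }

    // Builds root -> a -> b plus an unreachable cycle c <-> d, then collects.
    static int demo() {
        Obj root = new Obj("root"), a = new Obj("a"), b = new Obj("b");
        Obj c = new Obj("c"), d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(List.of(root, a, b, c, d));
        mark(List.of(root));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("reclaimed objects: " + demo()); // prints "reclaimed objects: 2"
    }
}
```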

2. Copying algorithm (Copying)

To address the efficiency problem, the "copying" algorithm can be used. It divides the available memory into two blocks of equal size and uses only one of them at a time. When that block is used up, the surviving objects are copied to the other block, and the used block is then cleaned up in one pass. In this way, an entire half is reclaimed each time, and allocation does not need to account for memory fragmentation.

Although the copying algorithm is simple to implement, runs efficiently, and does not cause fragmentation, its biggest drawback is that usable memory is cut to half of the original, so memory is not fully utilized. In addition, the more objects survive, the more copying is required, and the algorithm's efficiency drops sharply.
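Here is a rough sketch of the idea using two arrays as the two halves. It is a simplification made for this post: the set of live objects is passed in directly, whereas a real collector discovers survivors by tracing from the roots, and the names CopyingSketch and collect are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;

// Toy model of a copying collector; all names here are hypothetical.
class CopyingSketch {

    // Copies live entries from 'from' into 'to' contiguously, then clears 'from'.
    // Returns the next free index in 'to' (the new allocation pointer).
    static int collect(String[] from, String[] to, Set<String> live) {
        int next = 0;
        for (String obj : from) {
            if (obj != null && live.contains(obj)) {
                to[next++] = obj; // survivors end up packed at the start
            }
        }
        Arrays.fill(from, null); // the whole used half is released in one step
        return next;
    }

    static String demo() {
        String[] from = {"a", "b", "c", "d"};
        String[] to = new String[from.length]; // the reserved, idle half
        int free = collect(from, to, Set.of("b", "d"));
        return Arrays.toString(to) + " free=" + free;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[b, d, null, null] free=2"
    }
}
```

After the collection, new objects can simply be placed at the returned index, which is why allocation in a copying scheme does not have to search for a free gap.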

3. Mark-Compact algorithm (Mark and Compact)

To solve the space problem, the "mark-compact" algorithm can be used. The marking phase is the same as in the mark-sweep algorithm, but instead of reclaiming dead objects in place, the next step moves all surviving objects toward one end of memory and then reclaims everything beyond the boundary of the last surviving object.
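The compaction step can be sketched on an array that stands in for a region of memory. As before, this is only an illustration: the live set is given as input rather than computed by a marking phase, and MarkCompactSketch and compact are names made up for this post.

```java
import java.util.Arrays;
import java.util.Set;

// Toy model of the compaction step; all names here are hypothetical.
class MarkCompactSketch {

    // Slides live entries toward index 0 and clears everything past the boundary.
    // Returns the boundary: the first free slot after compaction.
    static int compact(String[] heap, Set<String> live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && live.contains(obj)) {
                heap[boundary++] = obj; // move the survivor toward the start
            }
        }
        Arrays.fill(heap, boundary, heap.length, null); // reclaim the tail
        return boundary;
    }

    static String demo() {
        String[] heap = {"a", "b", null, "c", "d", null};
        int boundary = compact(heap, Set.of("a", "c", "d"));
        return Arrays.toString(heap) + " boundary=" + boundary;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[a, c, d, null, null, null] boundary=3"
    }
}
```

Unlike the copying sketch, no second half of memory is needed, but every survivor may have to be moved.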

4. Generational Collection

Garbage collection in current commercial virtual machines is based on the "Generational Collection" approach. It divides the Java heap into the young generation and the old generation, so that the most appropriate collection algorithm can be chosen for the characteristics of each. In the young generation, where few objects survive each collection, the copying algorithm is a good fit: a collection only has to pay the cost of copying a small number of survivors. In the old generation, where objects have a high survival rate and there is no spare space to back up allocation, the "mark-sweep" or "mark-compact" algorithm is used instead.
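A toy model can show how objects move between the two generations. The sketch below assumes that an object is promoted to the old generation after surviving a fixed number of young collections; that rule, the threshold value of 2, and all names in the code are simplifications chosen for this post rather than a description of any particular JVM.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Toy model of generational collection; all names and values are hypothetical.
class GenerationalSketch {

    static final int TENURE_THRESHOLD = 2; // assumed value, for illustration only

    static final class Obj {
        final String name;
        int age; // number of young collections survived
        Obj(String name) { this.name = name; }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    // New objects always start in the young generation.
    void allocate(String name) {
        young.add(new Obj(name));
    }

    // Young collection: dead objects vanish, survivors age,
    // and survivors that reach the threshold move to the old generation.
    void minorGc(Predicate<String> isLive) {
        for (Iterator<Obj> it = young.iterator(); it.hasNext(); ) {
            Obj o = it.next();
            if (!isLive.test(o.name)) {
                it.remove();
            } else if (++o.age >= TENURE_THRESHOLD) {
                it.remove();
                old.add(o);
            }
        }
    }

    static String demo() {
        GenerationalSketch heap = new GenerationalSketch();
        heap.allocate("cache");
        heap.allocate("temp1");
        heap.minorGc(n -> n.equals("cache")); // temp1 dies; cache survives with age 1
        heap.allocate("temp2");
        heap.minorGc(n -> n.equals("cache")); // temp2 dies; cache reaches age 2 and is promoted
        return "young=" + heap.young.size() + " old=" + heap.old.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "young=0 old=1"
    }
}
```

The short-lived objects never leave the young generation, which is what keeps young collections cheap.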

Note that Java itself does not provide an API for directly controlling these garbage collection algorithms; the Java virtual machine runs them automatically in the background. Even so, understanding these basic algorithms is the foundation for understanding more advanced techniques such as parallel, concurrent, and incremental collection.

5. Garbage Collector

The Java HotSpot VM includes several garbage collectors, each with its own characteristics and suited to different systems and usage scenarios. These include:

  • Serial collector: A single-threaded collector that must suspend all other worker threads until it finishes collecting garbage.
  • Parallel collector: A multi-threaded collector that also stops all other worker threads until it finishes collecting.
  • CMS (Concurrent Mark Sweep) collector: A concurrent collector whose main design goal is to avoid long pauses when collecting garbage in the old generation. Note that CMS was deprecated in JDK 9 and removed in JDK 14.
  • G1 (Garbage First) collector: A collector aimed at server-side applications that is designed to deliver predictable pause times together with high throughput.

Each garbage collector has its own applicable scenarios, and no collector is simply better or worse than the others. In actual system design and development, we need to select the most suitable collector according to the characteristics of the application (for example, whether it has strict response-time requirements) and the available hardware resources.
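To check which collector a running JVM is actually using, we can query the standard GarbageCollectorMXBean interface. The sketch below is only an illustration, and CollectorInfoDemo is a name made up for this post. On HotSpot the collector is normally chosen with command-line options such as -XX:+UseSerialGC, -XX:+UseParallelGC, or -XX:+UseG1GC, and the bean names printed will differ accordingly.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: reports the collectors active in the running JVM.
class CollectorInfoDemo {

    // Returns one line per collector bean, with its collection count so far.
    static List<String> activeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() returns -1 if the count is undefined for this collector.
            lines.add(gc.getName() + " (collections so far: " + gc.getCollectionCount() + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        activeCollectors().forEach(System.out::println);
    }
}
```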

6. When to trigger garbage collection

In Java, the timing of garbage collection is determined by the JVM. Although we can request a garbage collection by calling System.gc(), this is only a suggestion, and the JVM can choose to ignore it.

In practice, the JVM usually performs garbage collection in the following situations:

  • When heap space is insufficient to allocate a new object, the JVM triggers garbage collection to release the memory occupied by objects that are no longer used.
  • When the old generation is full, a Full GC is triggered, which typically suspends all Java application threads until the collection ends.
  • When the system is idle, some JVM configurations may also perform garbage collection to improve memory usage efficiency.

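The following Java code gives a rough view of garbage collection at work during runtime:

public class GCDemo {
    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        long before = runtime.freeMemory(); // free JVM memory before the loop
        for (int i = 0; i < 1000000; i++) {
            String s = new String("Hello, World!");
            s = null; // explicitly drop the reference so the object can be garbage collected
        }
        long after = runtime.freeMemory(); // free JVM memory after the loop
        // A positive value means free memory shrank during the loop.
        System.out.println("Free memory before - after (bytes): " + (before - after));
    }
}

This code prints how much the JVM's free memory changed across the loop. The number is only a rough indicator, not an exact count of the bytes reclaimed, and it varies between runs and JVM configurations. If the loop allocates more than the young generation can hold, the JVM performs garbage collection at an appropriate time even though we never triggered it explicitly; on a JVM with a large heap, no collection may happen at all.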
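Free-memory figures are an indirect signal. For more direct evidence, the collection counters exposed by GarbageCollectorMXBean can be compared before and after a burst of allocation. The sketch below is one way to do that; GcTriggerDemo, the 1 MiB chunk size, and the chunk count are arbitrary choices made for this post, and the printed number depends on the heap size and collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical demo class: counts collections that happen during allocation.
class GcTriggerDemo {

    // Keeps the most recent chunk reachable so the allocation is not optimized away.
    static volatile byte[] sink;

    // Sums the collection counts of all collectors in the running JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount()); // -1 means "not available"
        }
        return total;
    }

    // Allocates 'chunks' blocks of 1 MiB; each earlier block becomes garbage
    // as soon as 'sink' is overwritten. Returns the collections observed meanwhile.
    static long collectionsDuringAllocation(int chunks) {
        long before = totalCollections();
        for (int i = 0; i < chunks; i++) {
            sink = new byte[1024 * 1024];
        }
        return totalCollections() - before;
    }

    public static void main(String[] args) {
        // No System.gc() call anywhere: any collection counted here was started by the JVM.
        System.out.println("collections during allocation: " + collectionsDuringAllocation(2000));
    }
}
```

Running it with a small heap, for example with the standard -Xmx option set to a low value, usually makes the count go up.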

Conclusion

Understanding Java's garbage collection mechanism is crucial to writing efficient and stable Java programs. In this post, we covered the basic principles of garbage collection, the memory structure of the JVM, the main garbage collection algorithms, the available garbage collectors, and when garbage collection is triggered. Although Java already handles most memory management for us, as Java developers we still need to understand these basic concepts in order to write more efficient code and avoid problems such as memory leaks. I hope this post is helpful to you. If you have any questions, please leave a comment to discuss.

Origin blog.csdn.net/weixin_46703995/article/details/131253783