In-depth understanding of the JVM (focus: parent delegation model + garbage collection algorithms)

1. What is JVM?

JVM stands for Java Virtual Machine. A virtual machine is a complete computer system, simulated in software, that has full hardware functionality and runs in a completely isolated environment. The JVM can be thought of as a customized computer that does not physically exist. Java programs ultimately run inside the JVM.

2. JVM execution process

3. JVM runtime data area

  1. Heap: The largest memory area in a Java program, used to store object instances and arrays created with the new keyword.

  2. Stack: Each thread is allocated its own stack space, which mainly stores the stack frames (local variables and so on) created by method calls.

  3. Method Area: Mainly stores class structure information (class objects), method metadata, constant pools, static variables, etc.

  4. Program Counter Register: Each thread has its own program counter, which stores the address of the instruction the thread is currently executing or of the next instruction to execute.
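As a rough illustration of the difference between the shared heap and the per-thread stack, the sketch below reads the heap limit from the Runtime API and counts how many frames fit on the current thread's stack before a StackOverflowError occurs. The class and method names are made up for this example, and the numbers depend entirely on JVM settings.

```java
final class RuntimeAreasDemo {
    // Number of frames pushed so far (a static field, so it lives outside the stack).
    static int depth = 0;

    // Each call pushes one more stack frame onto the current thread's stack.
    static void recurse() {
        depth++;
        recurse();
    }

    // Recurses until the thread's stack is exhausted and returns the depth reached.
    static int measureStackDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack is full; the frames are popped as the error propagates up.
        }
        return depth;
    }

    public static void main(String[] args) {
        // Upper bound of the heap that the JVM will attempt to use, in bytes.
        System.out.println("max heap bytes: " + Runtime.getRuntime().maxMemory());
        System.out.println("frames before overflow: " + measureStackDepth());
    }
}
```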

4. JVM class loading mechanism

1. Class loading process

For a program to run, the "instructions and data" it depends on must be loaded into memory, which mainly means loading .class files into memory. The process can be summed up in five stages:

  1. Loading: The class loader (ClassLoader) finds and reads the class's bytecode file and loads it into memory. During loading, a Class object representing the class is generated for subsequent operations.
  2. Verification: Checks that the .class file conforms to the required data format.
  3. Preparation: Formally allocates memory for the variables defined in the class (that is, static variables) and sets their initial default values.
  4. Resolution: The Java virtual machine replaces the symbolic references in the constant pool with direct references; in other words, the constant pool entries are resolved.
  5. Initialization: Mainly initializes static members and executes static code blocks; if a parent class exists and has not been initialized yet, this also triggers loading and initializing the parent class.
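The initialization stage can be observed from ordinary Java code. The following is a minimal sketch (all class names are hypothetical) that records the order in which static blocks run when a subclass is first used: the parent is initialized before the child, and the static field already holds its assigned value by the time the child's static block reads it.

```java
import java.util.ArrayList;
import java.util.List;

final class InitOrderDemo {
    // Records the order in which the static initializers run.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent static block"); }
    }

    static class Child extends Parent {
        // Preparation sets this to 0; initialization then assigns 5.
        static int counter = 5;
        static { LOG.add("Child static block, counter=" + counter); }
    }

    // The first active use of Child triggers its initialization (parent first).
    // Later calls add nothing, because a class is initialized only once.
    static List<String> trigger() {
        new Child();
        return LOG;
    }

    public static void main(String[] args) {
        // prints [Parent static block, Child static block, counter=5]
        System.out.println(trigger());
    }
}
```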

2. Parent delegation model

The "Loading" stage involves the concept of the class loader. The JVM has three built-in class loaders, which make up the "parent delegation model":

  1. Bootstrap ClassLoader: Responsible for loading classes in the Java standard library
  2. Extension ClassLoader: Responsible for loading classes in the extension libraries provided by Sun/Oracle
  3. Application ClassLoader: Responsible for loading custom classes in the project and classes in third-party libraries
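The loader chain can be inspected at run time. This is a small sketch (the class name LoaderChainDemo is made up) that walks from the application class loader up through its parents. Two caveats: Java code sees the bootstrap loader as null, and on JDK 9 and later the second level is the platform class loader, which replaced the extension class loader, so the names printed depend on the JDK in use.

```java
final class LoaderChainDemo {
    // The bootstrap loader is implemented inside the JVM and shows up as null.
    static String describe(ClassLoader loader) {
        return loader == null ? "bootstrap (represented as null)" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        // Core classes such as String are loaded by the bootstrap loader.
        System.out.println("String loaded by: " + describe(String.class.getClassLoader()));

        // Walk upward from the application (system) class loader.
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        while (loader != null) {
            System.out.println(describe(loader));
            loader = loader.getParent();
        }
        System.out.println(describe(null));
    }
}
```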

When a class loader receives a class loading request, it does not try to load the class itself first; instead it delegates the request to its parent class loader. This holds at every level, so every loading request is eventually passed up to the top-level class loader. Only when the parent loader reports that it cannot complete the request (the class is not found within its search scope) does the child loader try to load the class itself. This continues downward until some level loads the class successfully and the process moves on to the next step; if no loader finds the class, a ClassNotFoundException is thrown.
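The delegation order can be made visible with a custom loader. The sketch below relies on the standard behavior of ClassLoader.loadClass, which asks the parent first and calls findClass only when the parent fails. LoggingLoader and the class name no.such.pkg.MissingClass are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;

final class DelegationDemo {
    // A loader that finds nothing itself; it only records when it was asked.
    static final class LoggingLoader extends ClassLoader {
        final List<String> askedSelf = new ArrayList<>();

        LoggingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Reached only after every parent loader has failed to find the class.
            askedSelf.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    // Returns true if the class could be loaded through the delegation chain.
    static boolean canLoad(ClassLoader loader, String name) {
        try {
            loader.loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        LoggingLoader loader = new LoggingLoader(ClassLoader.getSystemClassLoader());
        System.out.println(canLoad(loader, "java.lang.String"));         // found by a parent
        System.out.println(canLoad(loader, "no.such.pkg.MissingClass")); // nobody finds it
        System.out.println(loader.askedSelf); // only the missing class reached findClass
    }
}
```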

5. JVM garbage collection strategy

JVM garbage collection (GC) mainly reclaims the memory of objects that the program no longer references. Objects are stored in heap memory, so the main target of GC is the heap, and GC releases memory in units of objects. As for the other memory areas:

Stack: After the method call is completed, the stack frame and local variables of the method are destroyed with the pop operation. The entire stack is also destroyed along with the thread.

Method area: Mainly stores class objects and rarely involves "unloading" operations.

Program Counter: It is just a simple address integer, which is destroyed along with the thread.

1. Algorithm for judging dead objects

All object instances are stored in the Java heap. Before garbage collection on the heap, the garbage collector must first determine which of these objects are still alive and which are "dead". There are two main algorithms for judging whether an object is "dead":

(1) Reference counting algorithm

Add a reference counter to each object. Whenever a new reference to the object is created, the counter is incremented by 1; when a reference expires, the counter is decremented by 1. An object whose counter is 0 can no longer be used, that is, the object is "dead".

Using reference counting to decide whether an object is alive is a very simple idea, and the check is generally efficient, but reference counting cannot solve the problem of circular references between objects.
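The circular reference problem can be shown with a toy model. The JVM does not expose reference counts, so the sketch below simulates them by hand with a hypothetical Node class: two objects that point at each other keep a count of 1 even after every outside reference to them is gone.

```java
import java.util.ArrayList;
import java.util.List;

final class RefCountDemo {
    // A simulated object with a manually maintained reference counter.
    static final class Node {
        int refCount = 0;
        final List<Node> fields = new ArrayList<>();

        // Storing a reference to target in a field increments target's counter.
        void pointTo(Node target) {
            fields.add(target);
            target.refCount++;
        }
    }

    // Builds a two-object cycle, drops the external references,
    // and returns the counters that remain.
    static int[] cycleCountsAfterRelease() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++; // simulated external reference to a (for example a local variable)
        b.refCount++; // simulated external reference to b
        a.pointTo(b);
        b.pointTo(a);
        a.refCount--; // the external references go away...
        b.refCount--;
        return new int[] { a.refCount, b.refCount }; // ...but both counters stay at 1
    }

    public static void main(String[] args) {
        int[] counts = cycleCountsAfterRelease();
        System.out.println(counts[0] + ", " + counts[1]); // prints 1, 1
    }
}
```

Neither counter ever reaches 0, so a pure reference-counting collector would never reclaim the pair.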

(2) Reachability analysis method (the solution adopted by Java)

Treat the reference relationships between objects as a tree-like graph: take a set of objects called "GC Roots" as the starting points and search downward from these nodes. Every object that the traversal can reach is reachable (alive); any object that cannot be reached is considered unusable.

The GC Roots here include the following:

  1. Objects referenced on the stack
  2. Objects referenced in the constant pool in the method area
  3. Objects referenced by static members in the method area

Although reachability analysis solves the circular reference problem of reference counting, the search can take considerable time. In addition, to keep the reference relationships from changing during the search, application (business) threads have to pause their work, which causes the STW (Stop-The-World) problem.
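The traversal itself is an ordinary graph search. Below is a minimal sketch over a hand-built reference graph in which objects are represented by names only; the graph, the names, and the class ReachabilityDemo are assumptions for illustration, not how the JVM stores references.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class ReachabilityDemo {
    // Returns every object reachable from the GC roots by following references.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> gcRoots) {
        Set<String> alive = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (alive.add(current)) { // true only on the first visit
                pending.addAll(refs.getOrDefault(current, Collections.<String>emptyList()));
            }
        }
        return alive;
    }

    // stackVar -> A -> B, plus an isolated cycle C <-> D.
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("stackVar", Arrays.asList("A"));
        refs.put("A", Arrays.asList("B"));
        refs.put("C", Arrays.asList("D")); // C and D reference each other,
        refs.put("D", Arrays.asList("C")); // but no root leads to them
        return refs;
    }

    public static void main(String[] args) {
        // C and D are absent from the result, so the cycle counts as garbage.
        System.out.println(reachable(sampleGraph(), Arrays.asList("stackVar")));
    }
}
```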

2. Garbage collection algorithm

(1) Mark-sweep

The "mark-sweep" algorithm is the most basic garbage collection algorithm. It is divided into two stages, "marking" and "sweeping": first, all objects that need to be reclaimed are marked; once marking is complete, all marked objects are reclaimed together.

The biggest problem with the mark-sweep algorithm is that it produces memory fragmentation. Memory is usually requested as one contiguous block, so fragmentation greatly reduces space utilization.
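The fragmentation effect can be seen in a toy heap of fixed-size slots. This is a simulation under stated assumptions (a boolean array stands in for memory, and the marking result is supplied by hand), not a real collector.

```java
final class MarkSweepDemo {
    // Sweep phase: free every occupied slot that the mark phase flagged as garbage.
    static boolean[] sweep(boolean[] occupied, boolean[] markedGarbage) {
        boolean[] after = new boolean[occupied.length];
        for (int i = 0; i < occupied.length; i++) {
            after[i] = occupied[i] && !markedGarbage[i];
        }
        return after;
    }

    // Total number of free slots.
    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean used : occupied) {
            if (!used) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean used : occupied) {
            run = used ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = { true, true, true, true, true, true, true, true };
        boolean[] garbage = { false, true, false, true, false, true, false, true };
        boolean[] after = sweep(occupied, garbage);
        System.out.println("free slots: " + freeSlots(after));            // prints free slots: 4
        System.out.println("largest free run: " + largestFreeRun(after)); // prints largest free run: 1
    }
}
```

Half of the heap is free after the sweep, yet a request for two adjacent slots cannot be satisfied.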

(2) Copying algorithm

The entire available memory is divided into two equal-sized halves, and only one half is used at a time. When that half needs to be garbage collected, the surviving objects in it are copied to the other half, and then the used half is cleared all at once. The advantage is that an entire half is reclaimed each time, and complications such as memory fragmentation do not need to be considered when allocating memory.

The disadvantage of this algorithm is that memory utilization is relatively low: the maximum usable space is reduced to half of the original. Moreover, if there is very little garbage and many objects must be retained, the cost of copying is relatively high and efficiency is low.
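A toy version of the copy step might look as follows. Arrays of strings stand in for the two halves and the liveness flags are supplied by hand; these are assumptions of the sketch, and CopyingDemo is a made-up name.

```java
import java.util.Arrays;

final class CopyingDemo {
    // Copies the live objects from fromSpace into a fresh to-space,
    // packing them contiguously, then clears the whole from-space at once.
    static String[] copyLive(String[] fromSpace, boolean[] live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0; // next free position in the to-space
        for (int i = 0; i < fromSpace.length; i++) {
            if (live[i]) {
                toSpace[next++] = fromSpace[i];
            }
        }
        Arrays.fill(fromSpace, null); // the used half is emptied in one step
        return toSpace;
    }

    public static void main(String[] args) {
        String[] from = { "a", "b", "c", "d" };
        boolean[] live = { true, false, false, true };
        String[] to = copyLive(from, live);
        System.out.println(Arrays.toString(to));   // prints [a, d, null, null]
        System.out.println(Arrays.toString(from)); // prints [null, null, null, null]
    }
}
```

The survivors end up packed together with no gaps, at the price of keeping a second, idle half of the same size.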

(3) Mark-compact

Mark-compact also solves the memory fragmentation problem. The general idea is similar to deleting elements from the middle of a sequential list: the surviving objects are moved toward one end, and then the memory beyond the end boundary is cleared directly.

Because the mark-compact algorithm has to move objects on every collection, its efficiency is lower.
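The sliding step can be sketched on an array, where null stands for a slot whose object was found dead. This is a simplified model with made-up names; a real collector must also update every reference to a moved object, which the sketch leaves out.

```java
import java.util.Arrays;

final class MarkCompactDemo {
    // Slides the surviving (non-null) entries toward index 0, keeping their order,
    // and returns the boundary: everything at or beyond it is free.
    static int compact(String[] heap) {
        int next = 0; // position the next survivor will be moved to
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null) {
                if (i != next) {
                    heap[next] = heap[i]; // move the survivor
                    heap[i] = null;       // its old slot becomes free
                }
                next++;
            }
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = { "a", null, "b", null, null, "c" };
        int boundary = compact(heap);
        System.out.println(Arrays.toString(heap)); // prints [a, b, c, null, null, null]
        System.out.println(boundary);              // prints 3
    }
}
```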

(4) Generational collection (adopted by the JVM)

The generational algorithm divides memory into areas and applies a different garbage collection strategy to each area (stage). Extensive experience shows that most Java objects are short-lived: typically, a large share of new generation objects die after a single round of scanning. Conversely, if an object has already survived for a long time, experience shows that it is likely to keep surviving even longer.

  1. Newly created objects are placed in the Eden area. When garbage collection scans the Eden area, most objects are reclaimed in the first round of GC.
  2. Objects in the Eden area that survive the first round of GC are copied to the survivor area using the copying algorithm.
  3. The survivor area is divided into two parts of equal size, and only one half is used at a time. When garbage collection scans the survivor area and finds garbage, the copying algorithm copies the surviving objects to the other half of the survivor area.
  4. When an object in the survivor area has survived several rounds of GC and is considered to have reached a certain age, it is promoted: the copying algorithm copies it into the old generation.
  5. Objects that enter the old generation generally have long lifetimes, and their probability of dying is much lower than in the new generation, so GC runs on the old generation much less frequently. If a scan finds that an object in the old generation is garbage, it is reclaimed with a mark-based algorithm.
  6. Special case: a very large object enters the old generation directly, because copying large objects is relatively expensive and there are not many of them.
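The promotion rules above can be sketched as a small simulation. Everything here is an assumption made for illustration: the promotion age of 3 and the large-object size of 1000 are hypothetical values, Eden and the two survivor halves are merged into one young list, and the set of live names stands in for the result of reachability analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class GenerationalDemo {
    static final int PROMOTION_AGE = 3;        // hypothetical age threshold
    static final int LARGE_OBJECT_SIZE = 1000; // hypothetical size threshold

    static final class Obj {
        final String name;
        final int size;
        int age = 0; // number of young collections survived

        Obj(String name, int size) {
            this.name = name;
            this.size = size;
        }
    }

    final List<Obj> young = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    // Very large objects skip the young generation entirely.
    void allocate(Obj o) {
        (o.size >= LARGE_OBJECT_SIZE ? old : young).add(o);
    }

    // One young-generation collection.
    void minorGc(Set<String> liveNames) {
        List<Obj> survivors = new ArrayList<>();
        for (Obj o : young) {
            if (!liveNames.contains(o.name)) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= PROMOTION_AGE) {
                old.add(o);       // old enough: promote to the old generation
            } else {
                survivors.add(o); // otherwise copy to the survivor area
            }
        }
        young.clear();
        young.addAll(survivors);
    }
}
```

For example, after allocating a short-lived object, a long-lived object and a very large object, then running three collections in which only the long-lived one stays referenced, the old generation holds the large object (placed there at allocation) and the long-lived one (promoted at age 3), while the young list is empty.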

6. Garbage collectors in the Java virtual machine (for reference)

The garbage collectors in the Java virtual machine are concrete implementations of the algorithms above, usually with some improvements and optimizations on top. Here are two main ones:

  1. CMS (Concurrent Mark-Sweep) Garbage Collector: CMS is a garbage collector designed to reduce application pause time. It uses a concurrent mark-sweep algorithm that allows the application to keep running during most of the collection process.
  2. G1 (Garbage-First) Garbage Collector: G1 is a garbage collector for server-side applications, designed to provide controllable pause times and high throughput. It uses a generational, region-based garbage collection strategy to control pause times more precisely.
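To see which collectors a running JVM actually uses, the standard java.lang.management API can be queried. The sketch below (the class name CollectorInfoDemo is made up) lists the registered collector beans; the names printed depend on the JDK version and on the GC selected at startup, so no particular output should be expected.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class CollectorInfoDemo {
    // Returns the names of the garbage collectors registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionCount() reports how many collections have run so far.
            System.out.println(bean.getName() + ": " + bean.getCollectionCount() + " collections");
        }
    }
}
```

On HotSpot, G1 can be selected explicitly with the -XX:+UseG1GC flag and has been the default collector since JDK 9; CMS was deprecated in JDK 9 and removed in JDK 14.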

Origin blog.csdn.net/LEE180501/article/details/132414662