Detailed explanation of the Java Virtual Machine (JVM) heap, stack, method area, and class loader

1. The basic structure of JVM

[Figure: basic structure of the JVM]
As the figure shows, the basic structure of the JVM consists of the class loading subsystem, the runtime data areas, the execution engine, and the native library interface.

2. Detailed diagram of each area of the JVM

[Figure: detailed diagram of each area of the JVM]
This figure gives a basic picture of the JVM. The following sections explain each area in detail.

3. Detailed explanation of each JVM runtime data area

3.1 Program Counter (PC) Register

The program counter occupies a small memory space and is private to each thread. The bytecode interpreter works by changing the value of this counter to select the next bytecode instruction to execute. Basic functions such as branching, looping, jumping, exception handling, and thread resumption all depend on the counter.

If the thread is executing a Java method, the counter records the address of the virtual machine bytecode instruction being executed; if it is executing a native method, the value of the counter is undefined. This is the only memory area for which the Java Virtual Machine Specification does not specify any OutOfMemoryError condition.

3.2 Java Virtual Machine Stacks

The virtual machine stack is private to each thread, and its life cycle is the same as that of the thread. It describes the memory model of Java method execution: every method call creates a stack frame that stores the local variable table, operand stack, dynamic link, method exit, and other information. The span from a method's invocation to its completion corresponds to a stack frame being pushed onto and then popped off the virtual machine stack.

Local variable table: stores the basic types (boolean, byte, char, short, int, float, long, double), object references (the reference type), and the returnAddress type (which points to the address of a bytecode instruction).

StackOverflowError: thrown if the stack depth requested by the thread is greater than the depth allowed by the virtual machine.
OutOfMemoryError: thrown if the virtual machine stack can be expanded dynamically but sufficient memory cannot be obtained during expansion.
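
The StackOverflowError condition can be observed with unbounded recursion. The following is only a minimal sketch: the class name StackDepthDemo and its methods are made up for illustration, and the depth reached depends on the JVM, the frame size, and the configured thread stack size.

```java
final class StackDepthDemo {
    private static int depth;

    // Each call pushes one more stack frame onto the current thread's stack.
    private static void recurse() {
        depth++;
        recurse();
    }

    /** Recurses until the JVM refuses to push another frame, then returns the depth reached. */
    static int measureMaxDepth() {
        depth = 0;
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The stack has unwound back to this frame, so execution can continue here.
        }
        return depth;
    }

    public static void main(String[] args) {
        System.out.println("Frames pushed before StackOverflowError: " + measureMaxDepth());
    }
}
```

Running the sketch with a different thread stack size should change the reported depth, which is one way to see that the limit belongs to the thread's stack rather than to the heap.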

3.3 Native Method Stacks

The native method stack differs from the Java virtual machine stack in what it serves: the virtual machine stack serves the execution of Java methods (that is, bytecode), while the native method stack serves the native methods used by the virtual machine. It can also throw StackOverflowError and OutOfMemoryError.

3.4 Heap Memory

For most applications, the heap is the largest piece of memory managed by the JVM. It is shared by all threads and mainly stores object instances and arrays. Internally it may be divided into multiple thread-private allocation buffers (Thread Local Allocation Buffer, TLAB). The heap can be located in physically discontiguous memory, as long as it is logically contiguous.

OutOfMemoryError: If there is no memory in the heap to complete the instance allocation, and the heap can no longer be expanded, this exception is thrown.
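
How close the heap is to the point where it "can no longer be expanded" can be inspected through java.lang.Runtime. This is a small sketch; the class name HeapInfoDemo is hypothetical, and the numbers it prints depend entirely on the JVM and its heap settings.

```java
final class HeapInfoDemo {
    /** Returns {committed heap, free space within the committed heap, maximum heap} in bytes. */
    static long[] heapSnapshot() {
        Runtime rt = Runtime.getRuntime();
        return new long[] { rt.totalMemory(), rt.freeMemory(), rt.maxMemory() };
    }

    public static void main(String[] args) {
        long[] s = heapSnapshot();
        long mib = 1024L * 1024L;
        System.out.println("committed heap: " + s[0] / mib + " MiB");
        System.out.println("free in heap:   " + s[1] / mib + " MiB");
        // maxMemory() is the ceiling the heap may grow to; allocation beyond it fails.
        System.out.println("maximum heap:   " + s[2] / mib + " MiB");
    }
}
```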

3.5 Method Area

The method area is shared among threads. It stores data for classes that have been loaded by the virtual machine: class information (including class name, method information, and field information), constants, static variables, and code compiled by the JIT compiler.

Besides the description of a class's fields, methods, and interfaces, a Class file also contains a constant pool, which stores the literals and symbolic references generated at compile time.

A very important part of the method area is the runtime constant pool, which is the runtime representation of the constant pool of each class or interface. After a class or interface is loaded into the JVM, its runtime constant pool is created.

The runtime constant pool is not limited to the content of the Class file constant pool: new constants can also be added to it at run time, for example through the intern method of String. Although the JVM specification describes the method area as a logical part of the heap, it has the alias Non-Heap.
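
The effect of String.intern() can be seen by comparing references. This is a minimal sketch with a hypothetical class name, InternDemo; it only relies on the documented behavior that string literals are interned and that intern() returns the canonical instance.

```java
final class InternDemo {
    /** Returns {literal == built, literal == built.intern()}. */
    static boolean[] compare() {
        String literal = "jvm";             // comes from the class's constant pool, already interned
        String built = new String("jvm");   // a distinct object created at run time
        return new boolean[] {
            literal == built,               // different objects
            literal == built.intern()       // intern() returns the pooled instance
        };
    }

    public static void main(String[] args) {
        boolean[] result = compare();
        System.out.println("literal == new String(...)          : " + result[0]);
        System.out.println("literal == new String(...).intern() : " + result[1]);
    }
}
```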

The following figure summarizes what each of the areas above stores.
[Figure: contents stored in each runtime data area]

4. Virtual machine class loading mechanism

The virtual machine loads the data describing a class from the Class file into memory, verifies it, resolves and initializes it, and finally produces a Java type that the virtual machine can use directly.
In the Java language, the loading, linking, and initialization of types are all completed while the program is running.

4.1 Class life cycle

[Figure: class life cycle]
The order of the five stages of loading, verification, preparation, initialization, and unloading is fixed. The resolution phase is the exception: in some cases it can start after initialization, in order to support Java's runtime binding (also called dynamic binding or late binding).

In the following five situations a class must be initialized immediately (and loading, verification, and preparation naturally have to be completed before that):

  1. When one of the 4 bytecode instructions new, getstatic, putstatic, or invokestatic is encountered and the class has not yet been initialized, its initialization is triggered first. Typical scenarios: instantiating an object with the new keyword, reading or setting a static field of a class (except static fields that are final and whose value was placed in the constant pool at compile time), and calling a static method of a class.
  2. When a method of the java.lang.reflect package is used to make a reflective call on the class.
  3. When a class is being initialized and its parent class has not yet been initialized, the initialization of the parent class is triggered first.
  4. When the virtual machine starts, the user specifies a main class to execute (the class containing the main() method), and the virtual machine initializes this main class first.
  5. When using the dynamic language support of JDK 1.7, if the final resolution result of a java.lang.invoke.MethodHandle instance is a method handle of kind REF_getStatic, REF_putStatic, or REF_invokeStatic, and the class corresponding to that method handle has not been initialized, its initialization is triggered first.
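
Some of these rules can be observed by logging from static initializer blocks. The sketch below is illustrative only: InitTriggerDemo, Parent, and Child are hypothetical names, and the demo covers just the constant exception in rule 1, a getstatic on an inherited field, and the new instruction.

```java
import java.util.ArrayList;
import java.util.List;

final class InitTriggerDemo {
    static final List<String> LOG = new ArrayList<>();
    private static List<String> events;

    static class Parent {
        static int counter = 1;
        static { LOG.add("Parent initialized"); }
    }

    static class Child extends Parent {
        // A compile-time constant: reads are inlined by the compiler.
        static final String CONSTANT = "constant";
        static { LOG.add("Child initialized"); }
    }

    /** Performs three accesses and returns the order in which things happened. */
    static synchronized List<String> run() {
        if (events == null) {                 // classes initialize only once, so record only once
            String constant = Child.CONSTANT; // no getstatic is emitted, nothing is initialized
            LOG.add("read constant");
            int counter = Child.counter;      // getstatic on a field declared in Parent
            LOG.add("read inherited static");
            new Child();                      // new triggers Child's initialization
            LOG.add("created Child");
            events = new ArrayList<>(LOG);
        }
        return events;
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```

Reading Child.counter initializes only Parent, because a static field reference initializes the class that actually declares the field, even when it is accessed through the name of a subclass.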

4.2 Class loading process

4.2.1 Loading

  1. Obtain the binary byte stream that defines the class through its fully qualified name (the stream may come from a ZIP package, the network, run-time generation, a JSP file, a database, and so on).
  2. Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.
  3. Generate in memory a java.lang.Class object representing this class, which serves as the access point for the class's data in the method area.

The particularity of the array class: the array class itself is not created by the class loader, it is created directly by the Java virtual machine. However, the array class and the class loader still have a close relationship, because the element type of the array class is ultimately created by the class loader. The array creation process is as follows:

  1. If the component type of the array is a reference type, load that component type recursively using the class loading process described here.
  2. If the component type of the array is not a reference type, the Java virtual machine associates the array class with the bootstrap class loader.
  3. The visibility of the array class is the same as the visibility of its component type. If the component type is not a reference type, the visibility of the array class defaults to public.
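
These rules are visible through reflection. In the following small sketch the class name ArrayLoaderDemo is hypothetical; it relies on the documented behavior that Class.getClassLoader() returns null for classes associated with the bootstrap class loader.

```java
import java.lang.reflect.Modifier;

final class ArrayLoaderDemo {
    public static void main(String[] args) {
        // Primitive component type: associated with the bootstrap loader (reported as null).
        System.out.println("int[] loader:    " + int[].class.getClassLoader());
        // String is loaded by the bootstrap loader, so its array types report null as well.
        System.out.println("String[] loader: " + String[].class.getClassLoader());
        // An application class: the array type shares the loader of its element type.
        System.out.println("same loader:     "
                + (ArrayLoaderDemo[].class.getClassLoader() == ArrayLoaderDemo.class.getClassLoader()));
        // An array of a primitive type is public.
        System.out.println("int[] is public: " + Modifier.isPublic(int[].class.getModifiers()));
    }
}
```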

The java.lang.Class object instantiated in memory is stored in the method area (this describes older HotSpot versions; since JDK 7 HotSpot keeps Class objects in the Java heap). It serves as the external interface through which the program accesses the type data in the method area. Parts of the loading phase and the linking phase are interleaved, but the two phases still start in a fixed order.

4.2.2 Verification

Verification is the first step of linking. It ensures that the information contained in the byte stream of the Class file meets the requirements of the current virtual machine.

File format verification

  1. Whether the file starts with the magic number 0xCAFEBABE
  2. Whether the major and minor version numbers are within the range the current virtual machine can handle
  3. Whether the constant pool contains constants of an unsupported type (checked through the constant tag)
  4. Whether any of the index values pointing to constants refers to a non-existent constant or to a constant of the wrong type
  5. Whether any constant of type CONSTANT_Utf8_info contains data that is not valid UTF8 encoding
  6. Whether any part of the Class file, or the file itself, has had information deleted or extra information appended
  7. ……

Only after passing this stage of verification does the byte stream enter the method area in memory for storage. The following three verification stages are therefore all based on the storage structure of the method area and no longer operate on the byte stream directly.
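
The first two checks can be reproduced by reading the first eight bytes of any Class file. The following is a minimal sketch, not how the virtual machine itself does it: the class name ClassHeaderDemo is hypothetical, and it assumes the class file of java.lang.Object is readable as a resource, which should hold on a standard JDK.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ClassHeaderDemo {
    /** Returns {magic, minor version, major version} of java/lang/Object.class. */
    static int[] readObjectClassHeader() {
        try (InputStream in = Object.class.getResourceAsStream("Object.class")) {
            if (in == null) {
                throw new IllegalStateException("Object.class is not available as a resource");
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // 4 bytes, big-endian
            int minor = data.readUnsignedShort();  // 2 bytes
            int major = data.readUnsignedShort();  // 2 bytes
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readObjectClassHeader();
        System.out.printf("magic=0x%08X minor=%d major=%d%n", header[0], header[1], header[2]);
    }
}
```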

Metadata verification

  1. Whether this class has a parent class (every class except java.lang.Object must have one)
  2. Whether the parent class of this class is a class that may not be inherited (a class modified by final)
  3. If this class is not abstract, whether it implements all the methods that its parent class or interfaces require it to implement
  4. Whether the fields and methods of the class conflict with the parent class (for example, overriding a final field of the parent class, or an overload that does not conform to the rules)

This stage is mainly to perform semantic verification on the metadata information of the class to ensure that there is no metadata information that does not conform to the Java language specification.

Bytecode verification

  1. Ensure that the data types on the operand stack and the instruction sequence always match (for example, an int pushed onto the operand stack is never later loaded as a long)
  2. Ensure that jump instructions never jump to bytecode instructions outside the method body
  3. Ensure that type conversions in the method body are valid (assigning a subclass object to a parent-class type is safe, but the reverse is illegal)
  4. ……

This is the most complicated stage of the entire verification process. Its main purpose is to determine, through data flow and control flow analysis, that the program semantics are legal and logical. In this stage the method bodies of the class are verified and analyzed to ensure that the methods of the verified class will not do anything at run time that endangers the security of the virtual machine.

Symbolic reference verification

  1. Whether the class corresponding to the fully qualified name given as a string in the symbolic reference can be found
  2. Whether the specified class contains the fields and methods that match the field descriptors, method descriptors, and simple names
  3. Whether the class, field, or method in the symbolic reference is accessible to the current class (private, protected, public, default)
  4. ……

This final stage of verification takes place when the virtual machine converts symbolic references into direct references. That conversion happens in the third stage of linking, the resolution phase. Symbolic reference verification can be seen as checking that information outside the class itself (the various symbolic references in the constant pool) matches, which usually involves the checks listed above.
The purpose of symbolic reference verification is to ensure that resolution can proceed normally. If it fails, a subclass of java.lang.IncompatibleClassChangeError is thrown, such as java.lang.IllegalAccessError, java.lang.NoSuchFieldError, or java.lang.NoSuchMethodError.

4.2.3 Preparation

This stage formally allocates memory for class variables and sets their initial values; the memory is allocated in the method area. Only class variables (variables modified by static) are involved, not instance variables.

public static int value = 1127;

After the preparation phase, value is 0, not 1127, because no Java method has been executed yet. The putstatic instruction that assigns 1127 to value is placed in the <clinit>() method when the program is compiled, so the assignment only happens during the initialization phase.

The zero values of the basic data types are: int 0, long 0L, short (short) 0, char '\u0000', byte (byte) 0, boolean false, float 0.0f, double 0.0d, and null for the reference type.

Special case: if the field attribute table of a class field contains a ConstantValue attribute (as it would if the declaration above were also marked final), the virtual machine assigns the value specified by ConstantValue, here 1127, during the preparation phase.
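
The zero value set during preparation can be observed by reading a static field before its own initializer has run. This is a minimal sketch with hypothetical names (PreparationDemo, peek); the read goes through a method because a direct forward reference to the field would be rejected by the compiler.

```java
final class PreparationDemo {
    // Static initializers run in textual order inside <clinit>(), so this line
    // executes before the assignment to 'value' below.
    static int seenBeforeAssignment = peek();

    // Preparation sets this field to 0; <clinit>() later assigns 1127.
    static int value = 1127;

    private static int peek() {
        return value;
    }

    public static void main(String[] args) {
        System.out.println("value seen before its initializer ran: " + seenBeforeAssignment);
        System.out.println("value after initialization:            " + value);
    }
}
```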

4.2.4 Resolution

At this stage, the virtual machine replaces the symbolic references in the constant pool with direct references.

  1. Symbolic reference
    A symbolic reference describes the referenced target with a set of symbols. The symbols can be literals of any form.
  2. Direct reference
    A direct reference can be a pointer to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout of the virtual machine implementation.

Resolution is mainly performed on 7 kinds of symbolic references: classes or interfaces, fields, class methods, interface methods, method types, method handles, and call site specifiers, which correspond to 7 constant types in the constant pool.

4.2.5 Initialization

The preceding phases are driven by the virtual machine. Only in the initialization phase does the virtual machine start executing the Java code defined in the class.

4.3 Class loader

A class loader is the component that obtains the binary byte stream describing a class through the class's fully qualified name.

4.3.1 Parent delegation model

From the perspective of the Java virtual machine, there are only two kinds of class loaders: the bootstrap class loader (implemented in C++ in HotSpot, as part of the virtual machine itself), and all other class loaders (implemented in Java, existing outside the virtual machine, and all inheriting from java.lang.ClassLoader).

1. The bootstrap class loader
loads the classes under <JAVA_HOME>/lib or on the path specified by -Xbootclasspath

2. The extension class loader
loads the classes under <JAVA_HOME>/lib/ext or on the path specified by the java.ext.dirs system variable

3. The application class loader
loads the class libraries specified on the user class path (ClassPath).
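
The hierarchy can be printed by walking up from the application class loader with getParent(). This is a small sketch with a hypothetical class name, LoaderChainDemo. The loader class names it prints depend on the JDK version; on JDK 9 and later, for example, the extension class loader has been replaced by a platform class loader.

```java
import java.util.ArrayList;
import java.util.List;

final class LoaderChainDemo {
    /** Walks from the application class loader up to the bootstrap class loader. */
    static List<String> chain() {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = ClassLoader.getSystemClassLoader(); cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        // The bootstrap loader has no Java object; the API represents it as null.
        names.add("bootstrap class loader (null)");
        return names;
    }

    public static void main(String[] args) {
        chain().forEach(System.out::println);
        // Core classes such as String are loaded by the bootstrap loader.
        System.out.println("String's loader: " + String.class.getClassLoader());
    }
}
```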

[Figure: the parent delegation model]
Except for the top-level bootstrap class loader, every class loader has its own parent class loader.
Working process: when a class loader receives a request to load a class, it first delegates the request to its parent class loader, which does the same, recursively up to the top. If the parent class loader can load the class, the load succeeds. If the parent class loader cannot complete the load (it throws ClassNotFoundException), the child class loader calls its own findClass() method to load the class itself, and so on back down the chain.
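
The delegation logic can be sketched in a custom class loader. This is a simplified illustration of the steps above, not the JDK's actual implementation: DelegatingLoader and tryLoad are hypothetical names, and the sketch requires a non-null parent because the bootstrap loader cannot be called directly from Java code.

```java
import java.util.Objects;

final class DelegatingLoader extends ClassLoader {
    DelegatingLoader(ClassLoader parent) {
        super(Objects.requireNonNull(parent, "parent"));
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            // Step 0: a class already defined by this loader is not loaded again.
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                try {
                    // Step 1: always ask the parent first.
                    loaded = getParent().loadClass(name);
                } catch (ClassNotFoundException parentFailed) {
                    // Step 2: only if the parent fails, try to find the class ourselves.
                    // The inherited findClass() simply throws ClassNotFoundException.
                    loaded = findClass(name);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    /** Convenience helper for the demo: returns null instead of throwing. */
    static Class<?> tryLoad(String name) {
        try {
            return new DelegatingLoader(ClassLoader.getSystemClassLoader()).loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println("same String class: " + (tryLoad("java.lang.String") == String.class));
        System.out.println("unknown class:     " + tryLoad("no.such.Type"));
    }
}
```

Because every request goes to the parent first, a request for java.lang.String made through this loader ends up with the same Class object that the rest of the program uses.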

4.3.2 The role of the parent delegation model

1. Ensure that the core classes provided by the JVM are not tampered with, and ensure the safety of class execution

Take the java.lang.String class as an example. No matter which class loader is asked to load it, the parent delegation mechanism means it is ultimately loaded by the top-level bootstrap class loader, so String is the same class in every class loader environment. Without parent delegation, each loader would load String by itself, and the String classes loaded by different class loaders could differ from one another, which would throw the program into chaos.

2. Prevent repeated loading of the same class

Origin blog.csdn.net/cyb_123/article/details/108513384