JVM class loading process

1. Loading

During the loading phase, the virtual machine does three things:

(1) Obtain the binary byte stream of a class by its fully qualified name.

(2) Convert the static storage structure represented by this byte stream into the runtime data structure of the method area.

(3) Generate a java.lang.Class object representing this class in memory as the access entry for various data of this class in the method area.
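As a rough illustration of steps (1) and (3), the sketch below asks the JVM for a type by its fully qualified name and then reads metadata through the resulting java.lang.Class object. The class name ClassEntryDemo and the method superclassOf are made up for this example; only Class.forName and the Class accessors are JDK APIs.

```java
// Hypothetical demo class; only Class.forName and the Class accessors are JDK APIs.
class ClassEntryDemo {
    // Looks a type up by fully qualified name and reads metadata through
    // the java.lang.Class object that represents it.
    static String superclassOf(String fullyQualifiedName) {
        try {
            Class<?> type = Class.forName(fullyQualifiedName);
            Class<?> parent = type.getSuperclass();
            return parent == null ? "<none>" : parent.getName();
        } catch (ClassNotFoundException e) {
            // Step (1) failed: no byte stream could be found for this name.
            return "<not found>";
        }
    }

    public static void main(String[] args) {
        System.out.println(superclassOf("java.util.ArrayList"));
        System.out.println(superclassOf("no.such.Type"));
    }
}
```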

 

The virtual machine specification does not say where the binary byte stream must come from. Common sources include:

(1) Read from an archive: ZIP, JAR, EAR, WAR;

(2) Obtained from the network;

(3) Generated by computation at run time (for example, dynamic proxies);

(4) Generated from other files (for example, classes generated from JSP files);

(5) Read from a database.

……

A non-array class can be loaded either by the bootstrap class loader built into the virtual machine or by a user-defined class loader, written by extending ClassLoader and overriding its findClass() method or, when the delegation logic itself must change, its loadClass() method.

(PS: Why write a custom class loader?

On the one hand, Java bytecode is easy to decompile. To protect your code, you can encrypt the compiled class files and have a custom class loader decrypt them before loading. On the other hand, a custom class loader lets you load code from non-standard sources, such as a network location.)
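A minimal sketch of such a loader follows, assuming the class bytes can be fetched as a classpath resource. ResourceClassLoader, decode, and canLoad are hypothetical names chosen for this example. The decode hook is where a real loader would decrypt the bytes; here it returns them unchanged.

```java
import java.io.IOException;
import java.io.InputStream;

// Hypothetical example loader; the name and the decode() hook are illustrative.
class ResourceClassLoader extends ClassLoader {
    ResourceClassLoader() {
        super(null); // no parent: only the bootstrap loader is consulted before findClass()
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        String path = name.replace('.', '/') + ".class";
        try (InputStream in = ClassLoader.getSystemResourceAsStream(path)) {
            if (in == null) {
                throw new ClassNotFoundException(name);
            }
            byte[] bytes = decode(in.readAllBytes());
            // Hands the byte stream to the JVM, which creates the Class object.
            return defineClass(name, bytes, 0, bytes.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }

    // Placeholder for decryption or any other transformation of the raw bytes.
    protected byte[] decode(byte[] raw) {
        return raw;
    }

    // Convenience wrapper so callers do not have to handle the checked exception.
    static boolean canLoad(String name) {
        try {
            new ResourceClassLoader().loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canLoad("java.lang.String"));
        System.out.println(canLoad("no.such.Type"));
    }
}
```

Overriding findClass() rather than loadClass() keeps the standard delegation in place: core classes such as java.lang.String still come from the bootstrap loader, and only names it cannot find reach the custom code.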

Array classes are different: an array class is not created by a class loader but directly by the Java virtual machine. The element type of the array, however, is still ultimately loaded by a class loader. The creation of an array class (called array C below) follows these rules:

(1) If the component type of the array (the type with one dimension removed) is a reference type, the component type is loaded recursively according to these rules, and array C is marked as belonging to the class namespace of the class loader that loaded the component type (a class is unique only in combination with its class loader).

(2) If the component type of the array is not a reference type (such as int[] array), the Java virtual machine will mark the array C as associated with the bootstrap class loader.

(3) The accessibility of an array class matches that of its component type. If the component type is not a reference type, the array class is public by default.
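These rules can be observed through reflection. The sketch below uses a made-up class ArrayLoaderDemo with a nested Element type. It relies on the documented behavior of Class.getClassLoader(), which reports the element type's loader for an array class and null for the bootstrap loader.

```java
// Hypothetical demo class; Element stands in for any user-defined type.
class ArrayLoaderDemo {
    static class Element { }

    // Rule (2): an array of primitives is associated with the bootstrap loader (reported as null).
    static boolean primitiveArrayUsesBootstrap() {
        return int[].class.getClassLoader() == null;
    }

    // Rule (1): an array of a reference type shares the loader of its element type.
    static boolean arraySharesElementLoader() {
        return Element[][].class.getClassLoader() == Element.class.getClassLoader();
    }

    // The component type is the array type with one dimension removed.
    static Class<?> componentOf(Class<?> arrayType) {
        return arrayType.getComponentType();
    }

    public static void main(String[] args) {
        System.out.println(primitiveArrayUsesBootstrap());
        System.out.println(arraySharesElementLoader());
        System.out.println(componentOf(int[][].class));
    }
}
```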

After the loading phase completes, the binary byte stream from outside the virtual machine is stored in the method area in the format the virtual machine requires. The virtual machine then instantiates a java.lang.Class object in memory, and this object serves as the external interface through which programs access the type's data in the method area. The specification does not require this object to live in the Java heap: in older HotSpot releases the Class object, although an object, was kept in the method area, while HotSpot has kept it in the Java heap since JDK 7.

Parts of the loading phase and the linking phase (such as some of the class file format verification) are interleaved. Linking may begin before loading has finished, but the two phases still start in a fixed order.

2. Verification

The purpose of verification is to ensure that the information in the byte stream of the Class file meets the requirements of the current virtual machine and does not endanger the virtual machine itself. This stage is important but not strictly necessary. If all the code being run has already been used and verified repeatedly, you can pass -Xverify:none to turn off most verification and shorten class loading time (note that this option is deprecated as of JDK 13).

Verification consists of roughly four stages: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

(1) File format verification

Verify that the byte stream conforms to the specification of the Class file format and can be processed by the current version of the virtual machine.

The purpose of this stage is to ensure that the input byte stream can be parsed correctly and stored in the method area, and that its format meets the requirements for describing a Java type. This stage works on the binary byte stream. Only after the byte stream passes it is the stream stored in the method area, so the next three verification stages all work on the storage structure in the method area instead of on the byte stream directly.

Possible verification points are:

Whether it starts with the magic number 0xCAFEBABE;

Whether the major and minor version numbers are within the processing range of the current virtual machine;

Whether the constants in the constant pool have unsupported constant types;

……
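The first two checks can be sketched in a few lines. This is an illustration only, not how any particular virtual machine implements them. ClassFileHeader, parse, and isSupportedUpTo are hypothetical names. The header layout it reads (a 4-byte magic number followed by a 2-byte minor and a 2-byte major version) is the one defined by the Class file format.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper that reads only the first 8 bytes of a class file.
class ClassFileHeader {
    static final int MAGIC = 0xCAFEBABE;
    final int minor;
    final int major;

    ClassFileHeader(int minor, int major) {
        this.minor = minor;
        this.major = major;
    }

    // Reads the magic number and version; rejects streams with the wrong magic number.
    static ClassFileHeader read(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        int magic = data.readInt();
        if (magic != MAGIC) {
            throw new IOException("bad magic number: 0x" + Integer.toHexString(magic));
        }
        int minor = data.readUnsignedShort();
        int major = data.readUnsignedShort();
        return new ClassFileHeader(minor, major);
    }

    // Unchecked wrapper for in-memory bytes.
    static ClassFileHeader parse(byte[] bytes) {
        try {
            return read(new ByteArrayInputStream(bytes));
        } catch (IOException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }

    // Version check: is the major version within what a given VM accepts?
    boolean isSupportedUpTo(int maxMajor) {
        return major <= maxMajor;
    }

    public static void main(String[] args) throws IOException {
        try (InputStream in = ClassLoader.getSystemResourceAsStream("java/lang/Object.class")) {
            if (in == null) {
                System.out.println("class file not available as a resource");
                return;
            }
            ClassFileHeader header = read(in);
            System.out.println("major=" + header.major + " minor=" + header.minor);
        }
    }
}
```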

(2) Metadata verification

Perform semantic verification on the metadata information of the class to ensure that the described information conforms to the Java language specification.

Possible verification points are:

Whether this class has a parent class (except for the java.lang.Object class, all classes should have a parent class);

Whether the parent class of this class is a class that may not be inherited (a class modified by final);

If this class is not an abstract class, whether it implements all the methods required to be implemented in its parent class or interface;

……

(3) Bytecode verification

This is the most complex stage. It uses data flow and control flow analysis to determine that the program semantics are legal and logical. The method bodies of the class are verified and analyzed to ensure that the methods of the verified class will not do anything at run time that endangers the virtual machine, for example:

Ensure that the jump instruction will not jump to the bytecode instruction outside the method body;

Ensure that the type conversion in the method body is valid;

……

PS: Even if a method body passes bytecode verification, it is not guaranteed to be safe. (No program can check with complete accuracy whether another program is correct.)

(4) Symbolic reference verification

This verification takes place when the virtual machine converts a symbolic reference into a direct reference, which happens during resolution, the third stage of linking. Its purpose is to ensure that resolution can proceed normally. Typical verification points are:

Whether the class named by the fully qualified name in the symbolic reference can be found;

Whether the accessibility (public, protected, private, default) of the class, field, or method in the symbolic reference allows access from the current class;

……

3. Preparation

The preparation phase formally allocates memory for class variables and sets their initial values. Conceptually, this memory is allocated in the method area.

PS:

(1) The memory allocation here only includes class variables (variables modified by static), not instance variables. Instance variables will be allocated in the Java heap along with the object when the object is instantiated;

(2) The initial value here is normally the zero value of the data type. For example, given public static int value = 123; the value of the variable after the preparation phase is 0, not 123. There is a special case: given public static final int value = 123; the compiler generates a ConstantValue attribute for the field, and the field is set to 123 during the preparation phase.
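The zero value can be made visible by reading a class variable before its initializer has run. In the sketch below (PreparationDemo and its members are invented for this example), the first two fields are assigned first in textual order, so they capture what the later fields hold at that moment. For the final constant, javac also inlines the value at compile time, so the 123 seen there is consistent with the ConstantValue behavior but does not isolate it.

```java
// Hypothetical demo class; all names are illustrative.
class PreparationDemo {
    // These two initializers run first, before 'value' has been assigned 123.
    static int seenEarly = peekValue();
    static int seenConstantEarly = peekConstant();

    static int value = 123;          // ordinary class variable: zero until its initializer runs
    static final int CONSTANT = 123; // compile-time constant backed by a ConstantValue attribute

    static int peekValue() {
        return value;
    }

    static int peekConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println("value seen early: " + seenEarly);
        System.out.println("constant seen early: " + seenConstantEarly);
        System.out.println("value after initialization: " + value);
    }
}
```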

4. Resolution

The resolution phase is the process by which the virtual machine replaces symbolic references in the constant pool with direct references.

(1) Symbolic reference:

Symbolic references use a set of symbols to describe the referenced target. Symbolic references can be literals in any form. Symbolic references have nothing to do with the memory layout implemented by the virtual machine. The referenced target is not necessarily already in memory.

(2) Direct reference:

A direct reference can be a pointer straight to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout of the virtual machine implementation. The direct references that different virtual machine instances produce from the same symbolic reference generally differ. If a direct reference exists, its target must already be in memory.

 

Resolution mainly applies to seven kinds of symbolic references: classes or interfaces, fields, class methods, interface methods, method types, method handles, and call site specifiers (the last three relate to the dynamic language support added in JDK 7 and are not discussed here):

(1) Class or interface resolution

Determine whether the symbolic reference being converted refers to an array type or an ordinary object type, because the two are resolved differently.

(2) Field resolution

Resolving a field first checks whether the class itself contains a field whose simple name and field descriptor match the target. If so, the search ends. If not, the interfaces the class implements and their parent interfaces are searched recursively from the bottom up along the inheritance hierarchy. If there is still no match, the parent classes are searched recursively from the bottom up until a match is found or the search ends.
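The search order (the class itself, then its interfaces, then its parent class) can be approximated with reflection. Class.getField is a reflective lookup rather than the JVM's own resolution routine, but its documented algorithm visits the candidates in the same order, so it serves as a stand-in here. FieldLookupDemo and its nested types are hypothetical.

```java
// Hypothetical demo types: the same field name is declared in an interface and in a parent class.
class FieldLookupDemo {
    interface HasLimit {
        int LIMIT = 1;
    }

    static class Base {
        public static int LIMIT = 2;
    }

    static class Derived extends Base implements HasLimit { }

    // Reports which type the reflective lookup settles on for the given field name.
    static String declaringTypeOf(Class<?> type, String fieldName) {
        try {
            return type.getField(fieldName).getDeclaringClass().getSimpleName();
        } catch (NoSuchFieldException e) {
            return "<none>";
        }
    }

    public static void main(String[] args) {
        // Derived declares no LIMIT itself, so the interface is consulted before Base.
        System.out.println(declaringTypeOf(Derived.class, "LIMIT"));
    }
}
```

In source code, javac refuses a direct use of Derived.LIMIT as ambiguous, which is why the example goes through reflection instead.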

(3) Class method resolution

Resolving a class method follows steps similar to field resolution, with an extra step that checks whether the method belongs to a class or an interface. In addition, the search looks in the parent classes first and only then in the interfaces.
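The parent-class-first order has a visible counterpart at the language level. In the sketch below (all names are invented for the example), a class inherits a method of the same signature from both its parent class and an interface default method, and the parent class version is the one that runs.

```java
// Hypothetical demo types; who() is available from both a parent class and an interface.
class MethodLookupDemo {
    interface Greeter {
        default String who() {
            return "interface";
        }
    }

    static class Base {
        public String who() {
            return "superclass";
        }
    }

    // Declares no who() of its own, so the lookup has to go upward.
    static class Derived extends Base implements Greeter { }

    static String resolve() {
        return new Derived().who();
    }

    public static void main(String[] args) {
        System.out.println(resolve());
    }
}
```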

(4) Interface method resolution

This is similar to class method resolution, except that an interface has no parent class, so only the parent interfaces are searched recursively upward.

5. Initialization

Initialization is the last step of the class loading process. In the earlier steps, apart from the loading phase, in which the user application can take part through a custom class loader, everything is driven and controlled by the virtual machine. Only in the initialization phase does the Java program code defined in the class actually start to execute.
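That loading and initialization are separate steps can be seen with the three-argument Class.forName, whose second parameter controls whether the class is initialized. In this sketch, InitTimingDemo, Target, and the event strings are made up. The static block of Target runs only once initialization is requested.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class that records when Target's static block runs.
class InitTimingDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static class Target {
        static {
            EVENTS.add("Target initialized");
        }
    }

    static List<String> run() {
        try {
            // A class literal alone does not trigger initialization.
            String name = Target.class.getName();
            ClassLoader loader = InitTimingDemo.class.getClassLoader();

            Class.forName(name, false, loader); // load (and possibly link) only
            EVENTS.add("loaded without initialization");

            Class.forName(name, true, loader);  // now the static block runs
            EVENTS.add("initialization requested");

            return new ArrayList<>(EVENTS);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        run().forEach(System.out::println);
    }
}
```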

Initialization assigns the correct initial values to the static variables of the class. The JVM is responsible for initializing the class, which mainly means initializing its class variables. There are two ways to initialize class variables in Java:

① Specify the initial value when declaring the class variable

② Use a static code block to assign initial values to class variables

 

JVM initialization steps

(1) If the class has not been loaded and linked, the program loads and links it first

(2) If the direct parent class has not been initialized, its direct parent class is initialized first

(3) If the class contains initialization statements, the system executes them in order
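The steps above can be traced with a small sketch. InitOrderDemo, Parent, and Child are hypothetical names. Each initialization statement appends to a shared log, so the resulting string shows that the parent class is initialized first and that statements within a class run in the order they appear in the source.

```java
// Hypothetical demo class that logs every static initialization statement.
class InitOrderDemo {
    static final StringBuilder LOG = new StringBuilder();

    static int log(String what) {
        LOG.append(what).append(';');
        return LOG.length();
    }

    static class Parent {
        static int a = log("Parent.a");          // initial value given at declaration
        static { log("Parent.block"); }          // static code block
    }

    static class Child extends Parent {
        static { log("Child.block"); }
        static int b = log("Child.b");
    }

    static String trace() {
        int touch = Child.b; // first active use of Child triggers initialization
        return LOG.toString();
    }

    public static void main(String[] args) {
        System.out.println(trace());
    }
}
```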

 

The initialization phase is the process of executing the class constructor method <clinit>(). The details are as follows:

(1) The compiler generates the <clinit>() method by collecting the assignments to all class variables in the class and the statements in static statement blocks (static{} blocks) and merging them. The order in which the compiler collects them is the order in which the statements appear in the source file.

(2) The <clinit>() method differs from the instance constructor of a class: it does not need to call the parent class constructor explicitly. The virtual machine guarantees that the parent class's <clinit>() method has finished before the subclass's <clinit>() method runs, so the first class whose <clinit>() method runs in the virtual machine is always java.lang.Object.

(3) Because the parent class's <clinit>() method runs first, static statement blocks defined in the parent class run before the variable assignments of the child class.

(4) The <clinit>() method is not mandatory for a class or interface. If a class has no static statement block and no assignment to class variables, the compiler may omit the <clinit>() method for that class.

(5) Static statement blocks cannot be used in the interface, but there may be variable assignment operations, so the interface will also generate the <clinit>() method. But interfaces are different from classes. Executing the <clinit>() method of the interface does not need to execute the <clinit>() method of the parent interface first. The parent interface is initialized only when the variables defined in the parent interface are used. In addition, the implementation class of the interface will not execute the <clinit>() method of the interface during initialization.

(6) The virtual machine guarantees that the <clinit>() method of a class is correctly locked and synchronized in a multithreaded environment. If several threads try to initialize a class at the same time, only one of them executes the class's <clinit>() method, and the others block until that thread has finished. A long-running operation in a class's <clinit>() method can therefore leave multiple threads blocked.
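Points (5) and (6) can be checked with counters. The sketch below is an illustration under stated assumptions: ClinitRulesDemo and all of its members are invented for the example, and the interface deliberately declares no default methods. The first method shows that initializing an implementation class leaves the interface uninitialized until one of its fields is read. The second shows that a slow static initializer still runs exactly once when several threads touch the class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; counters record how often each initializer runs.
class ClinitRulesDemo {
    static final AtomicInteger INTERFACE_INITS = new AtomicInteger();
    static final AtomicInteger SLOW_INITS = new AtomicInteger();

    interface Marked {
        int TOKEN = INTERFACE_INITS.incrementAndGet(); // not a constant, so it needs <clinit>()
    }

    static class Impl implements Marked {
        // Captures how many times Marked had been initialized when Impl was initialized.
        static final int INTERFACE_INITS_SEEN = INTERFACE_INITS.get();
    }

    static class Slow {
        static {
            SLOW_INITS.incrementAndGet();
            try {
                Thread.sleep(200); // simulate a long-running initializer
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        static void touch() { }
    }

    // Returns {interface inits seen while Impl initialized, interface inits after reading TOKEN}.
    static int[] interfaceInitCounts() {
        int seenByImpl = Impl.INTERFACE_INITS_SEEN; // initializes Impl, not Marked
        int token = Marked.TOKEN;                   // using the field initializes Marked
        return new int[] { seenByImpl, INTERFACE_INITS.get() };
    }

    // Starts several threads that all trigger Slow's initialization; returns how often it ran.
    static int initializeSlowFrom(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> Slow.touch());
            threads[i].start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return SLOW_INITS.get();
    }
}
```

Calling interfaceInitCounts() and then initializeSlowFrom(4) from any entry point is enough to observe both behaviors. While the first thread sleeps inside the static block, the other threads wait instead of running the block again.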
