[JVM]-[In-depth understanding of Java virtual machine study notes]-Chapter 7 Virtual Machine Class Loading Mechanism

Overview

The class loading mechanism of the virtual machine is the process by which the Java virtual machine loads the data describing a class from the Class file into memory, verifies it, converts and resolves it, and initializes it, finally forming a Java type that the virtual machine can use directly.

Class loading timing

From the time a class is loaded into virtual machine memory until it is unloaded, its entire life cycle goes through seven stages: Loading, Verification, Preparation, Resolution, Initialization, Using, and Unloading. Verification, preparation, and resolution are collectively referred to as Linking.
[Figure: class life cycle]
The order of the five stages loading, verification, preparation, initialization, and unloading is fixed, and these stages must start in that order.

Situations in which a class must be initialized

The virtual machine specification strictly stipulates that there are exactly six situations in which a class must be initialized immediately (five of them are listed below); loading, verification, and preparation naturally have to start before that:

  1. When one of the four bytecode instructions new, getstatic, putstatic, or invokestatic is encountered and the class has not been initialized, its initialization must be triggered first. Typical Java code that generates these instructions: instantiating an object with the new keyword; reading or setting a static field of a class (except static fields modified by final whose values were already placed in the constant pool at compile time; these are effectively global constants and have little to do with the class); and calling a static method of a class.
  2. When a reflective call is made on a class, if the class has not been initialized, it must be initialized first.
  3. When a class is being initialized and its parent class has not been initialized, initialization of the parent class must be triggered first.
  4. When the JVM starts, the user specifies a main class to execute (the class containing the main() method). The virtual machine initializes this main class first.
  5. When an interface defines a default method (introduced in JDK 8) and an implementation class of that interface is initialized, the interface must be initialized before it.

A class needs to be initialized because it is about to be used, and the behaviors above are exactly the ones that use the class. They are called active references to a class. All other ways of referencing a type do not trigger its initialization, even though they refer to the class, and are called passive references. For example, referencing a static field of a parent class through a subclass does not lead to subclass initialization; referencing a class through an array definition, such as User[] users = new User[10], does not trigger initialization of the User class; and constants (static final) are stored in the constant pool of the calling class at compile time, so in essence there is no direct reference to the class that defines the constant, and initialization of that class is not triggered.
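The difference between active and passive references can be observed with static initializer blocks. The following is a minimal sketch, not taken from the original notes: the class names (PassiveReferenceDemo, Parent, Child, Constants) and the LOG list are assumptions made for illustration, and the behavior described is what HotSpot is expected to do.

```java
import java.util.ArrayList;
import java.util.List;

class PassiveReferenceDemo {
    // Records which classes have run their static initializer.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static int value = 1;
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    static class User {
        static { LOG.add("User"); }
    }

    static class Constants {
        static final String GREETING = "hello"; // compile-time constant
        static { LOG.add("Constants"); }
    }

    static List<String> run() {
        int v = Child.value;            // passive for Child: only Parent is initialized
        User[] users = new User[10];    // creating an array does not initialize User
        String s = Constants.GREETING;  // the constant is inlined into this class
        new User();                     // active reference: User is initialized here
        return LOG;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [Parent, User]
    }
}
```

Child and Constants never appear in the log because nothing in run() references them actively.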

Class loading process

The class loading process consists of five stages: loading, verification, preparation, resolution, and initialization.

Loading

During the loading phase, the virtual machine needs to complete three things:

  1. Obtain the binary byte stream that defines a class by its fully qualified name
  2. Convert the static storage structure represented by this byte stream into a runtime data structure in the method area
  3. Generate a java.lang.Class object representing this class in memory as an access point for various data of this class in the method area.

Verification

The purpose of the verification phase is to ensure that the information in the byte stream of the Class file complies with all the constraints of the Java Virtual Machine Specification, and that running it as code will not endanger the security of the virtual machine itself. A Class file does not have to be compiled from Java source code; it can be produced by any means, including typing 0s and 1s directly into a binary editor. If the JVM fully trusted its input byte stream without checking it, it could easily load bytecode that contains errors or malicious intent, causing the whole system to be attacked or even crash. Bytecode verification is therefore a necessary measure for the JVM to protect itself, and how rigorous this stage is directly determines whether the JVM can withstand attacks from malicious code.
Overall, the verification phase performs roughly four kinds of checks: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

Preparation

The preparation phase is the stage that formally allocates memory for the variables defined in the class (that is, static variables, those modified by static) and sets their initial values. In JDK 7 and earlier, HotSpot used the permanent generation to implement the method area, so class variables were stored in the method area; in JDK 8 and later, class variables are stored in the Java heap together with the Class object, so "class variables are stored in the method area" is purely a statement about the logical concept.
Memory allocation at this point covers only class variables, not instance variables. Instance variables are allocated in the Java heap together with the object when the object is instantiated. In addition, the "initial value" mentioned here is "usually" the zero value of the data type. For example, if a class variable is defined as private static int v = 123;, the value of v after the preparation phase is 0 rather than 123, because no Java method has been executed yet. The putstatic instruction that assigns 123 to v is placed in the class constructor <clinit>() method when the program is compiled, so the assignment is not executed until the initialization phase.
The initial value is zero only under "normal circumstances". If the field attribute table of a class field contains the ConstantValue attribute, the variable is set to the value specified by ConstantValue during the preparation phase; that is, a class variable modified by static final is assigned its specified value. For example, with private static final int v = 123;, v is set to 123 in the preparation phase.
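The preparation phase itself cannot be paused and inspected from Java code, but its effect is visible if a class variable is read before its own initializer has run. The sketch below is an illustration under that assumption; PreparationDemo and its field names are hypothetical. Note that javac also inlines the constant C at the point of use, so the second reading shows the constant behavior only indirectly.

```java
class PreparationDemo {
    // Static initializers run in textual order inside <clinit>().
    static int vSeenEarly = peekV(); // v still holds the zero value set during preparation
    static int cSeenEarly = peekC(); // C is a compile-time constant, so 123 is already visible

    static int v = 123;              // the assignment of 123 executes at this point
    static final int C = 123;        // stored with a ConstantValue attribute

    static int peekV() { return v; }
    static int peekC() { return C; }

    public static void main(String[] args) {
        System.out.println(vSeenEarly); // prints 0
        System.out.println(cSeenEarly); // prints 123
        System.out.println(v);          // prints 123
    }
}
```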

Resolution

The resolution phase is the process in which the JVM replaces symbolic references in the constant pool with direct references.

Initialization

The initialization phase is the last step of the class loading process. Among the class loading actions described above, user applications can take part only in the loading phase, through a custom class loader; all the other actions are controlled entirely by the Java virtual machine. Only in the initialization phase does the JVM actually begin to execute the Java program code written in the class, handing control over to the application.
In the preparation phase the variables were assigned zero values; in the initialization phase, class variables and other resources are initialized according to the plan the programmer expressed in code. Put more directly: the initialization phase is the process of executing the class constructor <clinit>() method.

  1. The <clinit>() method is generated by the compiler (javac), which automatically collects the assignment actions of all class variables and the statements in static code blocks and merges them. The order in which the compiler collects them is determined by the order in which the statements appear in the source file. A static code block can only access variables defined before it; variables defined after it can be assigned in the earlier static code block but cannot be read (they are write-only there, because other assignment statements may follow the static code block, so the value read while the block executes might not be the variable's final value).
  2. The <clinit>() method differs from the class constructor (the <init>() method) in that it does not need to call the parent class constructor explicitly. The JVM ensures that the parent class's <clinit>() method has finished executing before the subclass's <clinit>() method is executed. Therefore, the first <clinit>() method executed in the JVM must belong to java.lang.Object.
  3. The <clinit>() method is not mandatory for a class or interface. If a class has neither a static code block nor an assignment to a class variable, the compiler does not need to generate a <clinit>() method for it.
  4. Static code blocks cannot be used in interfaces, but interfaces can still contain variable initialization assignments, so a <clinit>() method is generated for them as well. Unlike a class, executing an interface's <clinit>() method does not require first executing the <clinit>() method of the parent interface, because the parent interface is initialized only when a variable defined in it is used.
    In addition, an implementation class of an interface does not execute the interface's <clinit>() method when it is initialized, whereas when a subclass is initialized, its parent class must be initialized first.
  5. The JVM must ensure that the <clinit>() method of a class is correctly locked and synchronized in a multi-threaded environment. If multiple threads initialize a class at the same time, only one of them executes the class's <clinit>() method; the other threads block and wait until the active thread has finished executing it. After the active thread finishes, the waiting threads do not execute <clinit>() again when they are woken up. Under the same class loader, a type is initialized only once.
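Several of the rules above (textual order, parent before child, and once-only initialization across threads) can be checked with a small experiment. This is a sketch with hypothetical names (ClinitOrderDemo, ForwardWrite, initFromThreads), not code from the original notes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class ClinitOrderDemo {
    static final List<String> LOG = Collections.synchronizedList(new ArrayList<>());

    static class Parent {
        static int a = 1;
        static { a = 2; LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static int b = a;              // Parent.<clinit>() has already run, so this reads 2
        static { LOG.add("Child"); }
    }

    static class ForwardWrite {
        static { i = 5; }              // assigning a later-declared variable is allowed
        static int i = 1;              // runs afterwards, so the final value is 1
    }

    // Touches Child from several threads; its initializer should still run only once.
    static List<String> initFromThreads(int threadCount) {
        Thread[] threads = new Thread[threadCount];
        for (int k = 0; k < threadCount; k++) {
            threads[k] = new Thread(() -> { int ignored = Child.b; });
            threads[k].start();
        }
        try {
            for (Thread t : threads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(initFromThreads(4)); // prints [Parent, Child]
        System.out.println(Child.b);            // prints 2
        System.out.println(ForwardWrite.i);     // prints 1
    }
}
```

Replacing `i = 5` with a read such as `System.out.println(i)` inside the static block would be rejected by the compiler as an illegal forward reference.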

Class loader

A class loader is the code that implements the action of "obtaining the binary byte stream describing a class through its fully qualified name" in the loading phase.

Classes and class loaders

For any class, its uniqueness in the Java virtual machine is established jointly by the class loader that loads it and the class itself (that is, its fully qualified name). In other words, comparing whether two classes are "equal" only makes sense if they were loaded by the same class loader. Otherwise, even if the two classes come from the same Class file and are loaded by the same Java virtual machine, they are not equal as long as the class loaders that loaded them are different.
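The following sketch illustrates this rule by defining the same Class file a second time through a separate loader. IsolatingLoader, Sample, and loadIsolated are hypothetical names, and the example assumes the compiled .class files can be read as resources from the class path (as they can after a normal javac and java run).

```java
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {
    public static class Sample { }

    /** Hypothetical loader that defines one target class itself instead of delegating. */
    static class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target) {
            super(ClassIdentityDemo.class.getClassLoader());
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve); // everything else is delegated as usual
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    String path = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(path)) {
                        if (in == null) throw new ClassNotFoundException(name);
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) resolveClass(c);
                return c;
            }
        }
    }

    static Class<?> loadIsolated() throws ClassNotFoundException {
        String name = Sample.class.getName();
        return new IsolatingLoader(name).loadClass(name);
    }

    public static void main(String[] args) throws Exception {
        Class<?> other = loadIsolated();
        Object obj = other.getDeclaredConstructor().newInstance();
        System.out.println(other.getName().equals(Sample.class.getName())); // same name
        System.out.println(other == Sample.class);  // different defining loaders
        System.out.println(obj instanceof Sample);  // so the instance is not a Sample here
    }
}
```

Under the stated assumption, the first line should print true and the other two false: the names match, but the two Class objects belong to different loaders.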

Parent Delegation Model

From the perspective of the Java virtual machine, there are only two kinds of class loaders.
One is the startup class loader (Bootstrap Class Loader), also known as the boot class loader. It is implemented in C++ and is part of the virtual machine itself. Users cannot obtain an instance of this class loader (if the getParent() method of a class loader returns null, its parent class loader is the boot class loader).

The other kind is all other class loaders. These are implemented in Java, exist outside the virtual machine, and all inherit from the abstract class java.lang.ClassLoader. First, let us look at the three class loaders provided by the system:

  • Startup class loader (Bootstrap Class Loader): This class loader is responsible for loading into virtual machine memory the class libraries that are stored in the <JAVA_HOME>\lib directory (that is, Java's core class library, including classes whose package names start with java, javax, and sun) or in the path specified by the -Xbootclasspath parameter, and that are recognized by the Java virtual machine. The startup class loader cannot be referenced directly by Java programs. When writing a custom class loader, if a loading request needs to be delegated to the boot class loader, null can be used directly in its place; that is, the null value represents the boot class loader.
  • Extension Class Loader: This class loader is implemented in Java code in the class sun.misc.Launcher$ExtClassLoader. It is responsible for loading all class libraries in the <JAVA_HOME>\lib\ext directory or in the path specified by the java.ext.dirs system variable. This is in effect an extension mechanism for the Java system class library: the JDK development team allows users to place general-purpose class libraries in the ext directory to extend the functions of Java SE. Since the extension class loader is implemented in Java code, developers can use it directly in a program to load Class files.
  • Application Class Loader: This class loader is implemented by sun.misc.Launcher$AppClassLoader. It is the return value of the getSystemClassLoader() method of the ClassLoader class, so it is also called the "system class loader". It is responsible for loading all class libraries on the user class path (ClassPath); the ClassPath location can be obtained by calling xxx.class.getResource("/").toString() on any class in the project. Developers can also use this class loader directly in their code. If the application has not defined its own class loader, this is generally the default class loader of the program.
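The loader hierarchy can be inspected at run time by following getParent() upward from the system class loader. The sketch below uses a hypothetical class name (LoaderChainDemo); the implementation class names it prints vary by JDK version (for example, they are no longer under sun.misc.Launcher from JDK 9 on), so no exact output is given.

```java
import java.util.ArrayList;
import java.util.List;

class LoaderChainDemo {
    // Walks from the given loader up to the startup class loader, which is represented by null.
    static List<String> chain(ClassLoader start) {
        List<String> names = new ArrayList<>();
        for (ClassLoader cl = start; cl != null; cl = cl.getParent()) {
            names.add(cl.getClass().getName());
        }
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        // Application loader first, then its parents, ending at the startup class loader.
        System.out.println(chain(ClassLoader.getSystemClassLoader()));
        // Core classes are loaded by the startup class loader, so this prints null.
        System.out.println(String.class.getClassLoader());
    }
}
```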

Before JDK 9, Java applications were loaded by these three class loaders working together. Users can also add custom class loaders to extend them, for example to add Class file sources other than disk locations, or to implement class isolation, reloading, and similar features through class loaders. The cooperative relationship between these class loaders is "usually" as follows ("usually" because, although the parent delegation model has been widely used since it was introduced in JDK 1.2, it is not a mandatory model but a best practice for class loader implementation that Java's designers recommend to developers):
[Figure: parent delegation model]

This hierarchical relationship between class loaders is called the Parents Delegation Model of class loaders.

The parent delegation model requires that, except for the top-level startup class loader, every class loader has its own parent class loader. The parent-child relationship here is generally not implemented by inheritance but by composition, which is how the code of the parent loader is reused.

The workflow is: when a class loader receives a class loading request, it does not try to load the class itself first but delegates the request to its parent class loader. Every level of class loader does the same, so all loading requests eventually reach the top-level startup class loader. Only when the parent loader reports that it cannot complete the request (that is, the required class is not found in its search scope) does the child loader try to load the class itself.

An obvious benefit of this model is that Java classes acquire a prioritized hierarchy along with their class loaders. Take java.lang.Object, which is stored in rt.jar: no matter which class loader is asked to load it, the request is eventually delegated to the startup class loader at the top of the model. The Object class is therefore guaranteed to be the same class in every class loader environment of the program.
Other benefits:

  1. It avoids loading classes repeatedly. When the parent loader has already loaded a class, the child loader does not load it again.
  2. It guarantees safety. Classes in the <JAVA_HOME>\lib directory are loaded only by the startup class loader. Imagine that someone maliciously defines a class with the same name as one in the <JAVA_HOME>\lib directory, such as java.lang.Integer. Without the parent delegation model, such a class could be loaded and even used successfully, and the core Java API would be overwritten and tampered with.
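Both benefits can be seen by asking unrelated child loaders for a core class. In this sketch (SharedCoreClassDemo and newChild are hypothetical names), each child loader adds no loading logic of its own and simply inherits the default delegation behavior.

```java
class SharedCoreClassDemo {
    // A child of the application class loader with no loading logic of its own.
    static ClassLoader newChild() {
        return new ClassLoader(ClassLoader.getSystemClassLoader()) { };
    }

    static boolean sameInteger() {
        try {
            Class<?> a = newChild().loadClass("java.lang.Integer");
            Class<?> b = newChild().loadClass("java.lang.Integer");
            // Both requests are delegated upward, so both yield the one Integer class,
            // whose defining loader is the startup class loader (reported as null).
            return a == b && a == Integer.class && a.getClassLoader() == null;
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameInteger()); // prints true
    }
}
```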

The code to implement the parent delegation model is all concentrated in the loadClass() method of java.lang.ClassLoader:

protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
    synchronized (getClassLoadingLock(name)) {
        // First, check whether this class has already been loaded
        Class<?> c = findLoadedClass(name);
        if (c == null) {
            long t0 = System.nanoTime();
            try {
                // Delegate the request to the parent class loader
                if (parent != null) {
                    c = parent.loadClass(name, false);
                } else {
                    // No parent class loader: the parent is the startup (bootstrap) class loader
                    c = findBootstrapClassOrNull(name);
                }
            } catch (ClassNotFoundException e) {
                // The parent throws ClassNotFoundException when it cannot find the class,
                // which means the parent cannot complete the loading request
            }
            if (c == null) {
                long t1 = System.nanoTime();
                // The parent could not load the class, so call this loader's own findClass()
                c = findClass(name);
                // this is the defining class loader; record the stats
                sun.misc.PerfCounter.getParentDelegationTime().addTime(t1 - t0);
                sun.misc.PerfCounter.getFindClassTime().addElapsedTimeFrom(t1);
                sun.misc.PerfCounter.getFindClasses().increment();
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }
}

The logic of the method is: first check whether the requested type has already been loaded. If not, call the loadClass() method of the parent class loader; if the parent loader is null, the startup class loader is used as the parent by default. If the parent class loader fails and throws ClassNotFoundException, the loader then calls its own findClass() method to try to load the class.
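The delegation order can be observed with a loader that overrides only findClass() and records when it is reached. This is a sketch: RecordingLoader and the class name demo.missing.NoSuchClass are hypothetical, and a real loader would locate class bytes and call defineClass() instead of throwing.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    /** Hypothetical loader that records every name that reaches findClass(). */
    static class RecordingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);
            // A real loader would find the bytes and call defineClass() here.
            throw new ClassNotFoundException(name);
        }
    }

    static List<String> run() {
        RecordingLoader loader = new RecordingLoader(ClassLoader.getSystemClassLoader());
        try {
            loader.loadClass("java.lang.String");         // a parent loads it; findClass() is skipped
            loader.loadClass("demo.missing.NoSuchClass"); // no parent finds it; findClass() is called
        } catch (ClassNotFoundException expected) {
            // Thrown by findClass() above for the second request.
        }
        return loader.findClassCalls;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [demo.missing.NoSuchClass]
    }
}
```

Because only findClass() is overridden, the inherited loadClass() keeps the loader compliant with the parent delegation model.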

Breaking the Parent Delegation Model

As shown above, the parent delegation model is implemented in the loadClass() method of ClassLoader, so to break the model you only need to define your own class loader and override the loadClass() method.

Three Times in History That the Parent Delegation Model Was Broken

  1. The first time actually happened before the parent delegation model appeared. The model was only introduced in JDK 1.2, but the concept of the class loader and the abstract class java.lang.ClassLoader already existed in the first version of Java, which means that people had already written custom class loaders and overridden the loadClass() method. To stay compatible with existing code, the possibility of loadClass() being overridden by subclasses could not be removed. The only option was to add a new findClass() method and guide users to override it, rather than loadClass(), when writing their own class loading logic. As the earlier analysis of loadClass() shows, if the parent loader fails to load a class, the loader calls its own findClass() method to complete the loading. This does not limit the user's ability to load classes with custom logic, and it ensures that newly written class loaders comply with the parent delegation model.
  2. The second time was caused by a flaw in the model itself. The parent delegation model ensures that the more basic a class is, the higher the loader that loads it. But if a basic type needs to call back into user code (under the ClassPath), the startup class loader cannot load that code. For example, JDBC is a set of specifications defined by Java itself and is certainly a very basic type in Java, yet it needs to call SPI (Service Provider Interface) code implemented by other vendors and deployed under the application's ClassPath, namely the driver classes that MySQL, Oracle, and other companies implement against JDBC. The class loader that loads JDBC cannot recognize this code.
    To solve this problem, Java introduced the thread context class loader. Services like JDBC use the thread context class loader to load the SPI code they need. This is a case of a parent class loader asking a child class loader to complete class loading, so it violates the parent delegation model.
  3. The third time was caused by the pursuit of hot deployment techniques such as hot code replacement and hot module deployment. For some production systems it is very important to deploy and update without restarting. Related implementations include OSGi, backed by IBM. The key to OSGi's modular hot deployment is its custom class loader mechanism: many class lookups are performed among peer (flat) class loaders rather than up the hierarchy, which breaks the rules of the parent delegation model.
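The thread context class loader from the second case can be sketched with java.util.ServiceLoader, which accepts an explicit loader for looking up providers. Plugin is a hypothetical service interface standing in for something like a JDBC driver interface, and withContextLoader is an illustrative helper, not a JDK method. No providers are registered in this sketch, so the lookup is expected to find none.

```java
import java.util.ServiceLoader;
import java.util.function.Supplier;

class ContextLoaderDemo {
    /** Hypothetical service interface; real SPIs register providers under META-INF/services. */
    public interface Plugin {
        String name();
    }

    // Looks up providers through the thread context class loader instead of the
    // loader that defined the calling class.
    static int countProviders() {
        ClassLoader tccl = Thread.currentThread().getContextClassLoader();
        int n = 0;
        for (Plugin p : ServiceLoader.load(Plugin.class, tccl)) {
            n++;
        }
        return n;
    }

    // Runs an action with the given context class loader and restores the old one afterwards.
    static <T> T withContextLoader(ClassLoader loader, Supplier<T> action) {
        Thread t = Thread.currentThread();
        ClassLoader saved = t.getContextClassLoader();
        t.setContextClassLoader(loader);
        try {
            return action.get();
        } finally {
            t.setContextClassLoader(saved);
        }
    }

    public static void main(String[] args) {
        ClassLoader child = new ClassLoader(ClassLoader.getSystemClassLoader()) { };
        int found = withContextLoader(child, ContextLoaderDemo::countProviders);
        System.out.println(found); // prints 0 because no provider is registered here
    }
}
```

The design point is that code loaded by a higher-level loader can reach classes visible only to a lower-level loader by going through whatever loader the current thread carries.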

Tomcat breaks parent delegation model

Origin blog.csdn.net/Pacifica_/article/details/123647893