Chapter 7: Virtual Machine Class Loading Mechanism

Overview

The Java virtual machine loads the data that describes a class from the Class file into memory, then links and initializes it. All of these steps are completed while the program is running.

Java's natural support for dynamic extension depends on loading and linking at run time.

A "Class file" here means a stream of binary bytes, not necessarily a file that exists on a disk.

Timing of class loading

  1. The full life cycle: loading, linking (verification, preparation, resolution), initialization, use, unloading
  2. The order of loading, verification, preparation, initialization, and unloading is fixed. These phases must start in this order, but they do not have to run or finish one after another: they are usually interleaved, with one phase invoking or activating another while it is still executing.
  3. The resolution phase is the exception: it may start after the initialization phase, which is what supports Java's runtime binding (dynamic binding)

Six situations in which a class must be initialized immediately:

There are exactly six. The behaviors in these situations are called active references to a type. Every other way of referencing a type does not trigger initialization and is called a passive reference.

  1. When one of the four bytecode instructions new, getstatic, putstatic, or invokestatic is encountered and the type has not been initialized yet, its initialization is triggered first. Typical situations:

    1. Instantiating an object with the new keyword
    2. Reading or setting a static field of a type (except a final static field whose value was placed in the constant pool at compile time)
    3. Calling a static method of a type
  2. When a method of the java.lang.reflect package makes a reflective call on the type and the type has not been initialized, its initialization is triggered first

  3. When a class is being initialized and its parent class has not been initialized yet, the parent class is initialized first

  4. When the virtual machine starts, the user specifies a main class to execute (the class containing main()), and the virtual machine initializes that class first

  5. When the dynamic language support added in JDK 7 is used and the class corresponding to a resolved method handle has not been initialized, its initialization is triggered first

  6. When an interface defines a default method (added in JDK 8, marked with the default keyword) and a class implementing that interface is initialized, the interface is initialized before it
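
As a minimal sketch of situations 1 to 3, the following records which static initializers run. The class names (ActiveReferenceDemo, Parent, Child, Lazy) and the LOG list are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ActiveReferenceDemo {
    // Records which static initializers have run, in order.
    static final List<String> LOG = new ArrayList<>();

    static class Parent {
        static { LOG.add("Parent"); }
    }

    static class Child extends Parent {
        static { LOG.add("Child"); }
    }

    static class Lazy {
        static { LOG.add("Lazy"); }
    }

    /** `new Child()` is an active reference; Parent must be initialized before Child. */
    static List<String> instantiateChild() {
        new Child();
        return new ArrayList<>(LOG);
    }

    /** A reflective call (here Constructor.newInstance) also triggers initialization. */
    static boolean reflectiveCallInitializesLazy() {
        try {
            // The class literal alone does not initialize Lazy; newInstance() does.
            Lazy.class.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return LOG.contains("Lazy");
    }

    public static void main(String[] args) {
        System.out.println(instantiateChild());
        System.out.println(reflectiveCallInitializesLazy());
    }
}
```

In the log returned by instantiateChild(), "Parent" appears before "Child", which matches situation 3.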

Passive reference example

  1. Referencing a static field defined in the parent class through a subclass triggers initialization of the parent class only, not of the subclass
  2. Creating an array of a class only makes the virtual machine generate an array type that inherits directly from Object; it does not trigger initialization of the element class
  3. Referencing a constant defined in a class does not trigger initialization of that class: because of constant propagation at compile time, the constant's value is stored directly in the constant pool of the class that uses it
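
The three passive references above can be sketched as follows. All class names here are hypothetical, and the result was reasoned for a HotSpot-style virtual machine and javac; treat it as an illustration rather than a guarantee for every implementation.

```java
import java.util.ArrayList;
import java.util.List;

class PassiveReferenceDemo {
    // Records which static initializers have run, in order.
    static final List<String> LOG = new ArrayList<>();

    static class SuperClass {
        static int value = 123;
        static { LOG.add("SuperClass"); }
    }

    static class SubClass extends SuperClass {
        static { LOG.add("SubClass"); }
    }

    static class Element {
        static { LOG.add("Element"); }
    }

    static class ConstHolder {
        static final String HELLO = "hello";   // compile-time constant
        static { LOG.add("ConstHolder"); }
    }

    static List<String> run() {
        int v = SubClass.value;            // field is declared in SuperClass: only SuperClass is initialized
        Element[] array = new Element[10]; // creates the array type only, Element is not initialized
        String s = ConstHolder.HELLO;      // value was copied into this class's constant pool by javac
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run());   // [SuperClass]
    }
}
```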

The process of class loading

Loading

Tasks

  1. Obtain the binary byte stream that defines the class through the fully qualified name of a class
  2. Convert the static storage structure represented by the byte stream into the runtime data structure of the method area
  3. Generate a Class object representing this class in memory as the access entry for various data of this class in the method area

Ways to obtain the binary byte stream

  1. Read from a ZIP archive, such as a JAR
  2. Fetch over the network, such as a Web Applet
  3. Compute and generate at run time, such as dynamic proxy technology
  4. Generate from other files, such as JSP
  5. Read from a database
  6. Read from an encrypted file, a protection measure against decompiling Class files
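
Item 3 can be illustrated with java.lang.reflect.Proxy, which generates a class at run time that has no Class file on disk. This is a minimal sketch; the Greeter interface and the RuntimeGeneratedClassDemo class are hypothetical names introduced for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

class RuntimeGeneratedClassDemo {
    interface Greeter {
        String greet(String name);
    }

    /** Creates a Greeter whose implementing class is generated by the JDK at run time. */
    static Greeter newGreeter() {
        // The handler only expects calls to greet(String); it is not a general-purpose handler.
        InvocationHandler handler = (proxy, method, args) -> "Hello, " + args[0];
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));
        System.out.println(greeter.getClass().getName());             // a generated name
        System.out.println(Proxy.isProxyClass(greeter.getClass()));
    }
}
```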

Array class loading

  1. If the component type of the array is a reference type, the component type is loaded recursively using the loading process described above, and the array class is marked in the class namespace of the class loader that loads the component type
  2. If the component type of the array is not a reference type (for example int), the array class is marked as associated with the bootstrap class loader
  3. The accessibility of the array class is the same as that of its component type. If the component type is not a reference type, the array class defaults to public
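
These rules are visible through the reflection API. The sketch below assumes a hypothetical Element class; Class.getClassLoader() returns null to represent the bootstrap class loader.

```java
import java.lang.reflect.Modifier;

class ArrayClassLoaderDemo {
    static class Element { }

    /** An array of a reference type reports the class loader of its element type. */
    static boolean sameLoaderAsElement() {
        return Element[].class.getClassLoader() == Element.class.getClassLoader();
    }

    /** An array of a primitive type is associated with the bootstrap loader (null). */
    static boolean primitiveArrayUsesBootstrapLoader() {
        return int[].class.getClassLoader() == null;
    }

    /** An array of a primitive type is public. */
    static boolean primitiveArrayIsPublic() {
        return Modifier.isPublic(int[].class.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(sameLoaderAsElement());
        System.out.println(primitiveArrayUsesBootstrapLoader());
        System.out.println(primitiveArrayIsPublic());
        System.out.println(Element[].class.getSuperclass());   // class java.lang.Object
    }
}
```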

End of loading

After the loading phase ends, the binary byte stream is stored in the method area in the format defined by the virtual machine, and a Class object is then instantiated in the Java heap as the external interface through which the program accesses the type data in the method area

Verification

The purpose of this phase is to ensure that the information in the byte stream of the Class file meets all the constraints of the "Java Virtual Machine Specification", and that running this information as code will not endanger the security of the virtual machine itself

File format verification

Verify whether the byte stream conforms to the Class file format specification and can be processed by the current version of the virtual machine (for example, whether it starts with the magic number 0xCAFEBABE, whether the constant pool contains constants of an unsupported type, and whether any constant of type CONSTANT_Utf8_info contains data that is not valid UTF-8)

The purpose is to ensure that the input byte stream can be parsed correctly and stored in the method area, that is, that its format meets the requirements for describing a Java type
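
The first few checks can be sketched by reading the header of a Class file by hand. This is only an illustration of the file layout (a 4-byte magic number followed by the minor and major version), not the verifier's real code; ClassFileHeaderDemo and readHeader are hypothetical names.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

class ClassFileHeaderDemo {
    /** Returns { magic, minorVersion, majorVersion } read from a class resource. */
    static int[] readHeader(String resourceName) {
        try (InputStream in = ClassLoader.getSystemResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IllegalArgumentException("resource not found: " + resourceName);
            }
            DataInputStream data = new DataInputStream(in);
            int magic = data.readInt();            // big-endian, expected 0xCAFEBABE
            int minor = data.readUnsignedShort();
            int major = data.readUnsignedShort();
            return new int[] { magic, minor, major };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] header = readHeader("java/lang/Object.class");
        System.out.println(Integer.toHexString(header[0]));   // cafebabe
        System.out.println("major version " + header[2]);
    }
}
```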

Metadata verification

Perform semantic analysis on the information described by the bytecode, including:

  1. Whether this class has a parent class (every class except java.lang.Object must have one)
  2. Whether the parent class is a class that must not be inherited (a class marked final)
  3. If this class is not abstract, whether it implements all the methods required by its parent class or interfaces
  4. Whether the fields and methods of the class conflict with the parent class (for example, overriding a final method of the parent class, or overloads that do not conform to the rules)

Bytecode verification

Verify and analyze the method bodies of the class (the Code attribute in the Class file) to ensure that the methods of the verified class will not do anything at run time that endangers the security of the virtual machine, for example:

  1. Ensure that the data types on the operand stack and the instruction sequence work together at every point. For example, it must never happen that an int is placed on the operand stack but is then stored into the local variable table as a long
  2. Ensure that no jump instruction jumps to a bytecode instruction outside the method body
  3. Ensure that the type conversions in the method body are valid

Even so, verification cannot guarantee that a program is completely safe. This follows from the halting problem: one program cannot accurately decide whether another program will finish running in finite time

Symbolic reference verification

Occurs when the virtual machine converts symbolic references into direct references, that is, in the resolution phase. It verifies whether the class is missing, or is denied access to, the external classes, methods, fields, and other resources it depends on. It usually checks:

  1. Whether the class named by the fully qualified name string in the symbolic reference can be found
  2. Whether the specified class contains a method or field that matches the given simple name and descriptor
  3. Whether the current class is allowed to access the classes, fields, and methods in the symbolic reference

Preparation

The phase that formally allocates memory for the static variables (class variables) defined in the class and sets their initial values.
In JDK 8, class variables are stored in the Java heap together with the Class object.
Instance variables are not included; they are allocated in the heap together with the object when the object is instantiated.
Normally the initial value is the zero value of the data type, and the value written in the source is only assigned when the class constructor method <clinit>() runs during the initialization phase.
However, if the field is final and therefore has a ConstantValue attribute, it is initialized during the preparation phase to the value specified by that attribute
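
The zero value assigned during preparation can be observed by reading a class variable before its own assignment has run. A minimal sketch with hypothetical names (PreparationDemo, Holder, peek):

```java
class PreparationDemo {
    static class Holder {
        // <clinit>() runs these in source order, so peek() executes before `value = 123`.
        static final int SEEN_EARLY = peek();
        static int value = 123;

        static int peek() {
            return value;   // still the zero value assigned during preparation
        }
    }

    public static void main(String[] args) {
        System.out.println(Holder.SEEN_EARLY);   // 0
        System.out.println(Holder.value);        // 123
    }
}
```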

Resolution

  1. The process in which the virtual machine replaces symbolic references in the constant pool with direct references

  2. Symbolic reference: a set of symbols that describes the referenced target. It can be any form of literal and is independent of the memory layout implemented by the virtual machine

  3. Direct reference: a pointer that points directly to the target, a relative offset, or a handle that locates the target indirectly. It depends directly on the memory layout implemented by the virtual machine. If a direct reference exists, the referenced target must already exist in the memory of the virtual machine

  4. Before executing a bytecode instruction that operates on symbolic references, the symbolic references it uses must be resolved. The virtual machine implementation may decide for itself whether to resolve them when the class is loaded or just before they are used

Class and interface resolution

Assume the current code is in class D, and a symbolic reference N that has not been resolved yet is to be resolved into a direct reference to a class or interface C

  1. If C is not an array type, the virtual machine passes the fully qualified name represented by N to the class loader of D to load class C
  2. If C is an array type whose element type is an object type, the element type is loaded first according to rule 1, and then the virtual machine generates an array object representing the dimensions and elements of the array
  3. Before resolution completes, symbolic reference verification is performed to confirm whether D has access to C

Field resolution

  1. The symbolic reference to the class or interface C that owns the field is resolved first
  2. If C itself contains a field whose simple name and field descriptor match the target, a direct reference to that field is returned
  3. Otherwise, if C implements interfaces, each interface and its parent interfaces are searched recursively from bottom to top according to the inheritance relationship
  4. Otherwise, the parent classes of C are searched recursively from bottom to top according to the inheritance relationship
  5. If the field is still not found, NoSuchFieldError is thrown
  6. If the search succeeds and a reference is returned, access permission for the field is verified. If access is denied, IllegalAccessError is thrown
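
The search order (the class itself, then its interfaces, then its parent class) can be illustrated with Class.getField, whose documented lookup algorithm follows the same order for public fields. This is an analogy through the reflection API, not the virtual machine's resolver itself, and all type names are hypothetical. Note that javac rejects a direct source reference such as Sub.A as ambiguous, which is why the sketch uses reflection.

```java
import java.lang.reflect.Field;

class FieldLookupOrderDemo {
    interface Interface0 { int A = 0; }
    interface Interface1 extends Interface0 { int A = 1; }
    interface Interface2 { int A = 2; }

    static class Parent implements Interface1 {
        public static int A = 3;
    }

    // Declares no A of its own: the lookup continues in Interface2, then in Parent.
    static class Sub extends Parent implements Interface2 { }

    static class SubWithOwnField extends Parent implements Interface2 {
        public static int A = 4;
    }

    /** Returns the type in which the lookup for field A, starting at c, ends. */
    static Class<?> declaringTypeOfA(Class<?> c) {
        try {
            Field field = c.getField("A");
            return field.getDeclaringClass();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(declaringTypeOfA(SubWithOwnField.class)); // the class itself
        System.out.println(declaringTypeOfA(Sub.class));             // Interface2, before Parent
        System.out.println(declaringTypeOfA(Parent.class));          // Parent, before Interface1
    }
}
```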

Method resolution

  1. The symbolic reference to the class or interface C that owns the method is resolved first
  2. If C turns out to be an interface, java.lang.IncompatibleClassChangeError is thrown
  3. If C itself contains a method whose simple name and descriptor match the target, a direct reference to that method is returned
  4. Otherwise, the parent classes of C are searched recursively
  5. Otherwise, the list of interfaces implemented by C and their parent interfaces are searched recursively
  6. Finally, access permission is verified

Interface method resolution

  1. The symbolic reference to the class or interface C that owns the method is resolved first
  2. If C turns out to be a class rather than an interface, java.lang.IncompatibleClassChangeError is thrown
  3. If C itself contains a method whose simple name and descriptor match the target, a direct reference to that method is returned
  4. Otherwise, the parent interfaces of C are searched recursively
  5. Because an interface may extend multiple interfaces, if several matching methods are found in different parent interfaces, one of them is returned
  6. Because all interface methods were public by default and there were no module access constraints before JDK 9, interface method resolution could not fail on access permission before JDK 9

Initialization

  1. The initialization phase is the process of executing the class constructor method <clinit>(). This method is generated automatically by the javac compiler, which collects the assignments to all class variables and the statements in static blocks and merges them. The collection order is the order in which the statements appear in the source file
  2. A static block can only read variables defined before it. Variables defined after it can be assigned in the block but cannot be read
  3. The <clinit>() of the parent class runs first, so static blocks defined in the parent class take precedence over the variable assignments of the subclass
  4. The <clinit>() method is not mandatory. If a class has no static blocks and no assignments to class variables, no <clinit>() method is generated for it
  5. Executing the <clinit>() method of an interface does not require executing the <clinit>() method of its parent interface first; the parent interface is initialized only when a variable defined in it is used. Likewise, initializing a class that implements the interface does not execute the <clinit>() method of the interface
  6. The virtual machine ensures that the <clinit>() method of a class is properly locked and synchronized in a multi-threaded environment
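
Rules 1, 2, 3, and 6 can be sketched in one small program. The class names and the ONCE_RUNS counter are hypothetical and only serve the illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ClinitOrderDemo {
    static final AtomicInteger ONCE_RUNS = new AtomicInteger();

    static class Order {
        static int i = 1;
        static {
            i = 2;   // i is declared earlier: reading and writing are both allowed
            j = 5;   // j is declared later: assignment compiles, reading it here would not
        }
        static int j = 10;   // collected after the block, so this assignment wins
    }

    static class Parent {
        static int a = 1;
        static { a = 2; }
    }

    static class Sub extends Parent {
        static int b = a;   // Parent's <clinit>() has already finished, so b == 2
    }

    static class Once {
        static { ONCE_RUNS.incrementAndGet(); }
        static void touch() { }
    }

    /** Several threads trigger Once's initialization; returns how often its initializer ran. */
    static int initializeConcurrently() {
        Thread[] threads = new Thread[4];
        for (int k = 0; k < threads.length; k++) {
            threads[k] = new Thread(() -> Once.touch());
            threads[k].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return ONCE_RUNS.get();
    }

    public static void main(String[] args) {
        System.out.println(Order.i + " " + Order.j);   // 2 10
        System.out.println(Sub.b);                     // 2
        System.out.println(initializeConcurrently());  // 1
    }
}
```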

Class loader

Class loader: the code that performs the action "obtain the binary byte stream describing a class through its fully qualified name". This action is implemented outside the Java virtual machine

Classes and class loaders

Each class loader has an independent class namespace

The uniqueness of a class within the Java virtual machine is determined jointly by the class itself and the class loader that loads it

Even if two classes come from the same Class file and are loaded by the same Java virtual machine, they are not equal as long as the class loaders that load them are different

"Equal" here includes the result of the equals() method of the Class object, the instanceof keyword, and so on.
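
A minimal sketch of this rule: a hypothetical loader (IsolatingLoader) defines its own copy of a class from the same Class file instead of delegating to its parent. It assumes the compiled Class file of Target is available as a resource of the parent loader, and it uses InputStream.readAllBytes(), which requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {
    static class Target { }

    /** Hypothetical loader: defines Target itself and delegates every other class. */
    static class IsolatingLoader extends ClassLoader {
        IsolatingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(Target.class.getName())) {
                return super.loadClass(name, resolve);
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    String resource = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(resource)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                return c;
            }
        }
    }

    /** Loads a second copy of Target through a fresh IsolatingLoader. */
    static Class<?> loadIsolated() {
        try {
            ClassLoader parent = ClassIdentityDemo.class.getClassLoader();
            return new IsolatingLoader(parent).loadClass(Target.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> isolated = loadIsolated();
        System.out.println(isolated.getName().equals(Target.class.getName())); // same name
        System.out.println(isolated == Target.class);                          // different class
        System.out.println(isolated.isInstance(new Target()));                 // not an instance
    }
}
```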

Parent delegation mechanism

Virtual machine perspective

The bootstrap class loader (Bootstrap ClassLoader) is implemented in C++ and is part of the virtual machine itself

All other class loaders are implemented in Java, exist independently outside the virtual machine, and all inherit from the abstract class java.lang.ClassLoader

Developer perspective

Java has kept a three-tier class loader structure organized by the parent delegation model

Bootstrap ClassLoader

Responsible for loading into virtual machine memory the class libraries that are stored in the <JAVA_HOME>\lib directory and that the virtual machine recognizes

The bootstrap class loader cannot be referenced directly by a Java program. If a loading request needs to be delegated to the bootstrap class loader, use null in its place

Extension ClassLoader

Responsible for loading the class library in the <JAVA_HOME>\lib\ext directory

It is an extension mechanism for the Java system class libraries

Users may place general-purpose class libraries in the ext directory to extend the functionality of Java SE

Application ClassLoader

Also called the system class loader.

Responsible for loading all class libraries on the user class path (ClassPath). If you do not define your own class loader, it is the default class loader of the program
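
The loader chain can be inspected at run time by following getParent() from the application (system) class loader. This is a small sketch with hypothetical names; the printed loaders depend on the JDK version (on JDK 9 and later a platform class loader appears where the extension class loader used to be).

```java
import java.util.ArrayList;
import java.util.List;

class LoaderHierarchyDemo {
    /** Follows getParent() until it returns null, which stands for the bootstrap loader. */
    static List<ClassLoader> chainFrom(ClassLoader start) {
        List<ClassLoader> chain = new ArrayList<>();
        for (ClassLoader loader = start; loader != null; loader = loader.getParent()) {
            chain.add(loader);
        }
        return chain;
    }

    public static void main(String[] args) {
        for (ClassLoader loader : chainFrom(ClassLoader.getSystemClassLoader())) {
            System.out.println(loader);
        }
        // Core classes are loaded by the bootstrap loader, which Java code sees as null.
        System.out.println(String.class.getClassLoader());   // null
    }
}
```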

Parent Delegation Model

Except for the top-level bootstrap class loader, every class loader should have its own parent class loader. The parent-child relationship is usually implemented by composition rather than inheritance, so that the child reuses the code of the parent loader

Work process

When a class loader receives a class loading request, it does not try to load the class itself first. Instead, it delegates the request to its parent class loader. Every level of class loader behaves this way, so all loading requests are eventually passed up to the top-level bootstrap class loader. Only when the parent loader reports that it cannot complete the request (the required class is not found in its search scope) does the child loader try to load the class itself

Advantage

Classes in Java acquire a priority hierarchy together with their class loaders

For example, whichever class loader is asked to load java.lang.Object, the request is eventually delegated to the bootstrap class loader. Therefore the Object class is guaranteed to be the same class in every class loader environment
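
The process above can be sketched as a custom loader that spells out the three steps. This is a simplified illustration that assumes a non-null parent, not the JDK's actual loadClass() source; TracingLoader, tryLoad, and fallbackCalls are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

class DelegationSketch {
    /** Hypothetical loader that spells out the delegation steps and records fallbacks. */
    static class TracingLoader extends ClassLoader {
        final List<String> findClassCalls = new ArrayList<>();

        TracingLoader(ClassLoader parent) {
            super(Objects.requireNonNull(parent, "this sketch assumes a non-null parent"));
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);        // 1. already loaded by this loader?
                if (c == null) {
                    try {
                        c = getParent().loadClass(name);   // 2. delegate to the parent first
                    } catch (ClassNotFoundException e) {
                        c = findClass(name);               // 3. parent failed: try it ourselves
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            findClassCalls.add(name);   // this sketch has no classes of its own to offer
            throw new ClassNotFoundException(name);
        }
    }

    /** Returns the loaded class, or null if neither the parent nor the loader found it. */
    static Class<?> tryLoad(ClassLoader loader, String name) {
        try {
            return loader.loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }

    /** Names that fell through to findClass() after one JDK class and one missing class. */
    static List<String> fallbackCalls() {
        TracingLoader loader = new TracingLoader(ClassLoader.getSystemClassLoader());
        tryLoad(loader, "java.lang.String");   // answered by the parent chain
        tryLoad(loader, "no.such.Example");    // parent fails, findClass() is consulted
        return loader.findClassCalls;
    }

    public static void main(String[] args) {
        System.out.println(fallbackCalls());   // [no.such.Example]
    }
}
```

Only the class that the parent chain could not find reaches findClass(), which is why custom loaders usually override findClass() and leave loadClass() alone.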

Breaking the parent delegation mechanism

  1. The first time it was "broken": to stay compatible with user-defined class loader code that existed before the parent delegation mechanism appeared. According to the logic of the loadClass() method, if the parent loader fails to load the class, the loader calls its own findClass() method to complete the loading. This does not stop users from loading classes in their own way, and it ensures that newly written class loaders still follow parent delegation
  2. The second time it was "broken": caused by a defect in the model itself. Sometimes a base type needs to call back into user code, but the bootstrap class loader obviously cannot recognize or load that code. The solution is the thread context class loader, with which a parent loader asks a child class loader to complete the class loading. This violates the general principle of the parent delegation mechanism
  3. The third time it was "broken": caused by users' pursuit of program dynamism, such as hot code replacement and hot module deployment. The class loaders in this case no longer form the tree structure recommended by the parent delegation mechanism but a more complex network structure


Origin blog.csdn.net/weixin_42249196/article/details/108164100