In-Depth Understanding of the Java Virtual Machine: The Virtual Machine Class Loading Mechanism

Reading notes on "Deep Understanding of Java Virtual Machine: JVM Advanced Features and Best Practices (Second Edition)", with a summary of common related interview questions

Common interview questions for this section (it helps to read with these questions in mind; the answers all appear in the text):

Briefly describe the class loading process. What operations does it perform?

What do you know about class loaders?

What is the parent delegation model?

How does the parent delegation model work, and what are the benefits of using it?

Foreword:

The shift in the result of code compilation from native machine code to bytecode is one small step in the evolution of storage formats, but one giant leap in the evolution of programming languages.

1 Overview

The previous section covered the class file structure. All the information described in a class file must be loaded into the virtual machine before it can be run and used.

So how does the virtual machine load these class files? What happens to the information in the class file after it enters the virtual machine?

1.1 The concept of virtual machine class loading mechanism

The virtual machine loads the data describing a class from the class file into memory, then verifies, converts, resolves, and initializes that data, finally forming Java types that the virtual machine can use directly. This process is the virtual machine's class loading mechanism.

1.2 Dynamic loading and dynamic linking of Java language

Another important point: in the Java language, the loading, linking, and initialization of types are all completed while the program is running. Although this strategy slightly increases the performance overhead of class loading, it gives Java applications a high degree of flexibility. Java's innate ability to be extended dynamically relies on this runtime dynamic loading and dynamic linking. For example, if you write a program against an interface, you can wait until runtime to specify its concrete implementation class.
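To make the interface example concrete, here is a minimal sketch of choosing an implementation class by name at runtime. The Codec interface and its two implementations are hypothetical names invented for this illustration; Class.forName, asSubclass, and getDeclaredConstructor are standard JDK calls.

```java
import java.util.Locale;

public class RuntimeBindingDemo {

    /** The interface the program is written against. */
    public interface Codec {
        String encode(String text);
    }

    /** Two hypothetical implementations; the calling code never names them directly. */
    public static class UpperCaseCodec implements Codec {
        @Override
        public String encode(String text) {
            return text.toUpperCase(Locale.ROOT);
        }
    }

    public static class ReverseCodec implements Codec {
        @Override
        public String encode(String text) {
            return new StringBuilder(text).reverse().toString();
        }
    }

    /** Loads, links, and initializes the named class at runtime, then instantiates it. */
    public static Codec create(String className) {
        try {
            return Class.forName(className)
                    .asSubclass(Codec.class)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot create codec: " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real program the name would come from configuration or the command line.
        String configured = args.length > 0 ? args[0] : UpperCaseCodec.class.getName();
        System.out.println(create(configured).encode("class loading"));
    }
}
```

The point of the design is that create() compiles without knowing which implementation will be used; the class is located and linked only when the name arrives.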

2 Class loading timing

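The entire life cycle of a class, from being loaded into virtual machine memory to being unloaded from it, includes seven phases: loading, verification, preparation, resolution, initialization, using, and unloading. Verification, preparation, and resolution are collectively called linking.

[Figure: class life cycle]

So when must the first phase of class loading, loading itself, begin? The specification does not constrain that directly; it constrains initialization, and loading, verification, and preparation naturally have to start before it.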

The virtual machine specification strictly states that there are only five situations in which a class must be "initialized" immediately:

  • When the new keyword instantiates an object, when a static field of a class is read or set (except static fields that are final and whose values were placed in the constant pool at compile time), or when a static method of a class is called.
  • When a method of the java.lang.reflect package makes a reflective call on a class that has not yet been initialized, its initialization is triggered first.
  • When a class is initialized and its parent class has not yet been initialized, the parent class is initialized first.
  • When the virtual machine starts, the user specifies a main class to execute (the class that contains the main() method), and the virtual machine initializes this class first.
  • When JDK 1.7's dynamic language support is used: if the final resolution result of a java.lang.invoke.MethodHandle instance is a REF_getStatic, REF_putStatic, or REF_invokeStatic method handle and the class that handle refers to has not been initialized, its initialization is triggered first.

For interfaces the third rule differs: initializing an interface does not require that all of its parent interfaces have been initialized. A parent interface is initialized only when it is actually used (for example, when a constant defined in it is referenced).
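A few of the triggers listed above can be observed directly. The following is a small sketch, with hypothetical class names, in which each static initializer records its own execution; Class.forName(String) stands in for the reflective trigger because it is documented to initialize the class it returns.

```java
import java.util.ArrayList;
import java.util.List;

public class ActiveUseDemo {

    /** Records which static initializers ran, in order. */
    public static final List<String> EVENTS = new ArrayList<>();

    static class Base {
        static { EVENTS.add("Base"); }
    }

    static class Derived extends Base {
        static { EVENTS.add("Derived"); }
    }

    static class Reflected {
        static { EVENTS.add("Reflected"); }
    }

    public static List<String> run() {
        // 'new' is an active use of Derived; its parent Base is initialized first.
        new Derived();
        try {
            // The class literal alone does not initialize Reflected; Class.forName does.
            Class.forName(Reflected.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [Base, Derived, Reflected]
    }
}
```

Because a class is initialized at most once per class loader, calling run() again leaves the recorded list unchanged.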

All ways of referencing a class that do not trigger initialization are called passive references. Here are 3 examples of passive references:

1. Referencing a static field of the parent class through a subclass does not initialize the subclass.

2. Referencing a class through an array definition does not trigger initialization of that class.

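public class SuperClass {
	static {
		System.out.println("SuperClass (parent class) has been initialized...");
	}
	public static int value = 66;
}

public class Subclass extends SuperClass {
	static {
		System.out.println("Subclass (child class) has been initialized...");
	}
}

public class Test1 {

	public static void main(String[] args) {
		// 1: Reading the parent's static field through the subclass does not initialize the subclass.
		// System.out.println(Subclass.value); // prints: SuperClass (parent class) has been initialized... then 66
		// 2: Defining an array of the class does not trigger initialization of the class.
		SuperClass[] superClasses = new SuperClass[3];
		// 3: Creating an object with new does initialize the class. Keep the statement under 1
		// commented out when trying this; otherwise the class is already initialized by the
		// time this line runs and nothing more is printed.
		//SuperClass superClass = new SuperClass();
	}

}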

3. Constants are stored in the constant pool of the calling class at compile time. In essence the calling class holds no direct reference to the class that defines the constant, so initialization of the defining class is not triggered.

public class ConstClass {
	static {
		System.out.println("ConstClass has been initialized...");
	}
	public static final String HELLO = "hello world";
}

public class Test2 {

	public static void main(String[] args) {
		// Prints only "hello world"; the static block of ConstClass never runs.
		System.out.println(ConstClass.HELLO);
	}

}
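The constant case only applies to compile-time constants. As a contrast, the sketch below (hypothetical class names) puts a true constant next to a static final field whose value is computed at runtime; only the second access has to go through the defining class and therefore initializes it.

```java
import java.util.ArrayList;
import java.util.List;

public class ConstantFoldingDemo {

    /** Records which static initializers ran. */
    public static final List<String> EVENTS = new ArrayList<>();

    static class Folded {
        static { EVENTS.add("Folded initialized"); }
        // Compile-time constant: javac copies the value into the caller's constant pool.
        static final String HELLO = "hello world";
    }

    static class Computed {
        static { EVENTS.add("Computed initialized"); }
        // Not a constant expression: the value only exists once the class is initialized.
        static final String HELLO = String.valueOf(new char[] {'h', 'i'});
    }

    public static List<String> run() {
        String folded = Folded.HELLO;     // passive reference, Folded stays uninitialized
        String computed = Computed.HELLO; // real field read, Computed is initialized
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [Computed initialized]
    }
}
```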

3 Class loading process

This section walks through class loading in the Java virtual machine in detail, that is, the specific actions performed in each of the five phases: loading, verification, preparation, resolution, and initialization.

3.1 Loading

"Loading" is a phase of the "class loading" process, and the two must not be confused.

The loading phase consists of three basic actions:

  1. Obtain the binary byte stream that defines the type from its fully qualified name (the specification does not say where or how to obtain it, which leaves the design very open).

  2. Convert the static storage structure represented by this byte stream into runtime data structures in the method area.

  3. Create an instance of java.lang.Class representing the type, which serves as the access entry point for the type's data in the method area.

There are several common sources for the binary byte stream that represents a type with a given fully qualified name:

  • Read from a ZIP archive, which later became the basis for the JAR, EAR, and WAR formats;
  • Obtained over the network, the most typical application being Applets;
  • Computed and generated at runtime, most commonly through dynamic proxy technology;
  • Generated from other files, such as JSP files.
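The "generated at runtime" source in the list above can be seen with the JDK's own dynamic proxy facility. This is a minimal sketch: the Greeter interface and the handler logic are hypothetical, while java.lang.reflect.Proxy is the standard API. The class behind the returned object has no class file on disk; its bytes are produced while the program runs.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class RuntimeGeneratedClassDemo {

    public interface Greeter {
        String greet(String name);
    }

    public static Greeter newGreeter() {
        // The handler receives every call made on the generated class.
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getName().equals("greet")) {
                return "Hello, " + args[0];
            }
            throw new UnsupportedOperationException(method.getName());
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] {Greeter.class},
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));
        System.out.println(Proxy.isProxyClass(greeter.getClass()));
        // The generated class name is chosen by the JDK and varies between versions.
        System.out.println(greeter.getClass().getName());
    }
}
```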

Note: The loading phase of a non-array class can be completed either by a system-provided class loader or by a user-defined class loader (that is, by overriding a class loader's loadClass() method).

3.2 Verification

Verification is the first step of the linking phase. Its purpose is to ensure that the information contained in the byte stream of the Class file meets the requirements of the current virtual machine and will not compromise the security of the virtual machine itself.

If the virtual machine fully trusted the incoming byte stream without checking it, loading a harmful byte stream could well crash the system, so verification is an important way for the virtual machine to protect itself. How rigorous this phase is directly determines whether the Java virtual machine can withstand attacks by malicious code.

On the whole, the verification phase performs roughly four kinds of checks: file format verification, metadata verification, bytecode verification, and symbolic reference verification.

3.2.1 File Format Verification

Verify that the byte stream conforms to the Class file format specification and can be processed by the current version of the virtual machine. The main purpose of this phase is to ensure that the incoming byte stream can be correctly parsed and stored in the method area. This phase operates on the binary byte stream itself. Only after passing it does the byte stream enter the method area in memory for storage, so the following three phases all work on the storage structures in the method area and no longer manipulate the byte stream directly.

3.2.2 Metadata Verification

This phase performs semantic analysis on the information described by the bytecode to ensure that no metadata violates the Java language specification.

3.2.3 Bytecode Verification

This phase mainly performs data flow and control flow analysis to ensure that the methods of the verified class will not do anything at runtime that endangers the security of the virtual machine. For example, it guarantees that jump instructions do not jump to bytecode instructions outside the method body, that type conversions in the method body are valid, and so on.

Because data flow verification is highly complex and time-consuming, an optimization was introduced in Javac after JDK 1.6 (it can be turned off with a parameter): a "StackMapTable" attribute was added to the attribute table of the Code attribute of the method body. It describes the state of the local variable table and operand stack at the start of every basic block in the method body, which turns type inference during bytecode verification into type checking and so saves some time.

Note: A method body that passes bytecode verification is not guaranteed to be safe, because program logic cannot be verified with absolute accuracy.

3.2.4 Symbolic reference verification

The final verification phase occurs when the virtual machine converts symbolic references into direct references. This conversion happens in the third phase of linking, the resolution phase. The purpose of symbolic reference verification is to ensure that resolution can be performed correctly.

The checks mainly include:

  • Whether the class corresponding to the fully qualified name described by the string in the symbolic reference can be found;
  • Whether the specified class contains a method or field that matches the method descriptor or field descriptor and the simple name given in the symbolic reference;
  • Whether the accessibility (private, protected, public, default) of the classes, fields, and methods in the symbolic reference allows access by the current class.

3.3 Preparation

The preparation phase formally allocates memory for class variables and sets their initial values. The memory used by these variables is allocated in the method area. (Note: only class variables, that is, variables modified by static, are allocated at this point, not instance variables. Instance variables are allocated in the Java heap along with the object when the object is instantiated.)

The initial value is usually the zero value of the data type:

For public static int value = 123;, the initial value of value after the preparation phase is 0, not 123. No Java method has been executed yet at this point; the assignment of 123 to value is performed in the initialization phase.

A special case:

For public static final int value = 123;, Javac generates a ConstantValue attribute for value at compile time, and in the preparation phase the virtual machine assigns 123 to value according to that ConstantValue setting.

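Zero values for the basic data types:

  • int: 0
  • long: 0L
  • short: (short) 0
  • char: '\u0000'
  • byte: (byte) 0
  • boolean: false
  • float: 0.0f
  • double: 0.0d
  • reference: null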
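The gap between the zero value set in preparation and the value assigned in initialization can be observed from Java code. In this sketch (hypothetical class names), a field initializer that runs early reads value through a method before the assignment of 123 has executed.

```java
public class PreparationDemo {

    static class Counter {
        // Runs first during initialization and reads 'value' before its own initializer.
        static int seenEarly = peek();
        // The assignment of 123 happens here, in the initialization phase.
        static int value = 123;

        static int peek() {
            return value;
        }
    }

    /** Returns {value as seen before its assignment, value after initialization}. */
    public static int[] observe() {
        return new int[] {Counter.seenEarly, Counter.value};
    }

    public static void main(String[] args) {
        int[] seen = observe();
        System.out.println(seen[0]); // 0
        System.out.println(seen[1]); // 123
    }
}
```

Reading value through peek() is what makes this legal; a direct forward reference to value in the initializer of seenEarly would be rejected by the compiler.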

3.4 Resolution

The resolution phase is the process by which the virtual machine replaces symbolic references in the constant pool with direct references.

So how do symbolic references relate to direct references?

3.4.1 Symbolic references and direct references

Symbolic References: A symbolic reference uses a set of symbols to describe the referenced target. The symbols can be any form of literal, as long as they locate the target unambiguously when used. Symbolic references are independent of the memory layout implemented by the virtual machine, and the referenced target need not have been loaded into memory.

Direct References: A direct reference can be a pointer directly to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout implemented by the virtual machine, and the referenced target must already exist in memory.

The virtual machine specification does not specify when the resolution phase happens. A virtual machine implementation can decide whether to resolve symbolic references in the constant pool when the class is loaded or to wait until a symbolic reference is about to be used.

3.4.2 Caching resolution results

Multiple resolution requests for the same symbolic reference are very common. Except for the invokedynamic instruction, a virtual machine implementation may cache the result of the first resolution to avoid repeating the work. Whether or not resolution is actually performed more than once, the virtual machine must guarantee that, within the same entity, if a symbolic reference has been resolved successfully before, subsequent resolution requests for it always succeed. Likewise, if the first resolution fails, resolution requests for the same symbol from other instructions must receive the same exception.

3.4.3 Targets of resolution

Resolution mainly applies to 7 kinds of symbolic references: class or interface, field, class method, interface method, method type, method handle, and call site specifier. The last three are closely tied to the dynamic language support added in JDK 1.7. Because Java is a statically typed language, they cannot be mapped onto the current Java language without first introducing the semantics of the invokedynamic instruction.

3.5 Initialization

Class initialization is the last step of class loading. In the earlier phases, apart from the loading phase, in which a user application can take part through a custom class loader, all actions are entirely driven and controlled by the virtual machine. Only in the initialization phase does the Java program code (or bytecode) defined in the class actually start to execute.
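That initialization is a separate step from loading can be shown with the three-argument Class.forName, whose initialize flag lets a class be loaded without running any of its code. The sketch below uses a hypothetical Service class and records the order of events.

```java
import java.util.ArrayList;
import java.util.List;

public class LazyInitializationDemo {

    /** Records the order in which things happen. */
    public static final List<String> EVENTS = new ArrayList<>();

    static class Service {
        static { EVENTS.add("Service.<clinit>"); }

        static String name() {
            return "service";
        }
    }

    public static List<String> run() {
        ClassLoader loader = LazyInitializationDemo.class.getClassLoader();
        try {
            // initialize=false: the class is loaded, but none of its code runs yet.
            Class.forName(Service.class.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        EVENTS.add("loaded");
        Service.name(); // first static method call: the static initializer runs now
        EVENTS.add("used");
        return new ArrayList<>(EVENTS);
    }

    public static void main(String[] args) {
        System.out.println(run()); // [loaded, Service.<clinit>, used]
    }
}
```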

4 Class loaders

4.1 Classes and class loaders

For any class, its uniqueness within the Java virtual machine is established jointly by the class itself and the class loader that loads it. Even if two classes originate from the same Class file, they are not equal as long as the class loaders that load them are different.
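The following sketch illustrates that rule. It assumes the compiled classes are reachable as resources through the application class loader (the usual case when running from a class path); IsolatingLoader and Payload are hypothetical names. The loader deliberately defines its own copy of one class instead of delegating, so the same class file ends up as two distinct runtime classes.

```java
import java.io.IOException;
import java.io.InputStream;

public class LoaderIdentityDemo {

    /** Hypothetical sample class; any class available as a resource would do. */
    public static class Payload { }

    /** Defines the target class itself instead of delegating, creating a second copy. */
    static class IsolatingLoader extends ClassLoader {
        private final String target;

        IsolatingLoader(String target, ClassLoader parent) {
            super(parent);
            this.target = target;
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            if (!name.equals(target)) {
                return super.loadClass(name, resolve); // normal delegation for everything else
            }
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);
                if (c == null) {
                    String resource = name.replace('.', '/') + ".class";
                    try (InputStream in = getParent().getResourceAsStream(resource)) {
                        if (in == null) {
                            throw new ClassNotFoundException(name);
                        }
                        byte[] bytes = in.readAllBytes();
                        c = defineClass(name, bytes, 0, bytes.length);
                    } catch (IOException e) {
                        throw new ClassNotFoundException(name, e);
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** True if the copy has the same name but is a different class to the JVM. */
    public static boolean sameNameDifferentClass() {
        try {
            ClassLoader appLoader = LoaderIdentityDemo.class.getClassLoader();
            String name = Payload.class.getName();
            Class<?> copy = new IsolatingLoader(name, appLoader).loadClass(name);
            Object instance = copy.getDeclaredConstructor().newInstance();
            return copy.getName().equals(name)
                    && copy != Payload.class
                    && !(instance instanceof Payload);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sameNameDifferentClass());
    }
}
```

Under the stated assumption this should print true: the instanceof check fails even though both classes come from the same bytes, because equality of classes includes the defining class loader.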

4.2 Introduction to Class Loaders

From the perspective of the Java virtual machine, there are only two kinds of class loaders: the Bootstrap ClassLoader and all other class loaders. The bootstrap class loader is implemented in C++ and is part of the virtual machine itself; all other class loaders are implemented in Java, exist independently of the virtual machine, and all inherit from the java.lang.ClassLoader class. (This applies only to the HotSpot virtual machine.)

From the perspective of Java developers, most Java programs use the following three system-provided class loaders.

Bootstrap ClassLoader:

This class loader is responsible for loading into virtual machine memory the class libraries that are stored in the <JAVA_HOME>\lib directory, or in the path specified by the -Xbootclasspath parameter, and that are recognized by the virtual machine (recognition is by file name only, such as rt.jar; a class library whose name does not match is not loaded even if it is placed in the lib directory).

Extension ClassLoader:

This loader is implemented by sun.misc.Launcher$ExtClassLoader. It is responsible for loading all class libraries in the <JAVA_HOME>\lib\ext directory or in the path specified by the java.ext.dirs system variable. Developers can use the extension class loader directly.

Application ClassLoader:

This class loader is implemented by sun.misc.Launcher$AppClassLoader. Because it is the return value of the getSystemClassLoader() method in ClassLoader, it is generally called the system class loader. It is responsible for loading the class libraries specified on the user's class path (ClassPath). Developers can use this class loader directly. If the application has not defined its own class loader, this is generally the default class loader of the program.

Our applications are loaded by these three class loaders working together. If necessary, we can also add class loaders of our own.
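The loader chain of a running program can be inspected with a few lines. This sketch walks getParent() upward from the system class loader; the bootstrap loader has no Java object and is reported as null. The class names printed depend on the JDK: the sun.misc.Launcher names above belong to JDK 8 and earlier, while JDK 9 and later replace the extension loader with a platform class loader.

```java
import java.util.ArrayList;
import java.util.List;

public class LoaderChainDemo {

    /** Walks getParent() from the given loader up to the bootstrap loader (seen as null). */
    public static List<String> chainOf(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (ClassLoader current = loader; current != null; current = current.getParent()) {
            names.add(current.getClass().getName());
        }
        names.add("bootstrap (null)");
        return names;
    }

    public static void main(String[] args) {
        // The application loader is what getSystemClassLoader() returns.
        for (String name : chainOf(ClassLoader.getSystemClassLoader())) {
            System.out.println(name);
        }
        // Core classes come from the bootstrap loader, which Java code sees as null.
        System.out.println(String.class.getClassLoader());
    }
}
```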

4.3 Parent Delegation Model

The Parent Delegation Model (Parents Delegation Model) requires that every class loader except the top-level bootstrap class loader have its own parent class loader. The parent-child relationship here is usually implemented not by inheritance but by composition, with the child loader reusing the code of the parent loader.

[Figure: parent delegation model]

The working process of the parent delegation model: when a class loader receives a class loading request, it does not try to load the class itself first but delegates the request to its parent class loader. This holds at every level, so all loading requests should eventually reach the top-level bootstrap class loader. Only when the parent loader reports that it cannot complete the request does the child loader try to load the class itself.

An obvious benefit of organizing class loaders with the parent delegation model is that Java classes acquire a priority hierarchy along with their class loaders. For example, java.lang.Object is stored in rt.jar; whichever class loader is asked to load it, the request is ultimately delegated to the bootstrap class loader at the top of the model, so Object is the same class in every class loader environment of the program.
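The delegation steps can be restated as code. The sketch below is not the JDK's implementation; it is a simplified loader, with the hypothetical name TracingLoader, that spells out the same order and records each step. It assumes a non-null parent, whereas the real java.lang.ClassLoader asks the bootstrap loader when the parent is null.

```java
import java.util.ArrayList;
import java.util.List;

public class DelegationSketch {

    /** Re-states the delegation steps explicitly and records each one. */
    static class TracingLoader extends ClassLoader {
        final List<String> trace = new ArrayList<>();

        TracingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            synchronized (getClassLoadingLock(name)) {
                Class<?> c = findLoadedClass(name);      // 1. already loaded by this loader?
                if (c == null) {
                    try {
                        trace.add("delegate to parent: " + name);
                        c = getParent().loadClass(name); // 2. the parent gets the first chance
                    } catch (ClassNotFoundException e) {
                        trace.add("parent failed, findClass: " + name);
                        c = findClass(name);             // 3. only now try it ourselves
                    }
                }
                if (resolve) {
                    resolveClass(c);
                }
                return c;
            }
        }
    }

    /** Loads the named class through a fresh TracingLoader and returns the recorded steps. */
    public static List<String> trace(String className) {
        TracingLoader loader = new TracingLoader(DelegationSketch.class.getClassLoader());
        try {
            loader.loadClass(className);
        } catch (ClassNotFoundException e) {
            loader.trace.add("not found anywhere: " + className);
        }
        return loader.trace;
    }

    public static void main(String[] args) {
        System.out.println(trace("java.lang.String")); // the parent chain answers
        System.out.println(trace("no.such.Type"));     // falls through to findClass
    }
}
```

For java.lang.String the trace stops after the delegation step, because the request is answered further up the chain. For the made-up name no.such.Type the parent fails, the inherited findClass() is tried, and it also fails.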

Note: The parent delegation model is the class loader implementation approach that the Java designers recommend to developers, not a mandatory constraint. Most class loaders in the Java world follow this model, but there are exceptions.

4.4 Breaking the parent delegation model

The parent delegation model has been "broken" on a large scale three times.

The first break came about because class loaders and the abstract class java.lang.ClassLoader have existed since JDK 1.0, while the parent delegation model was introduced only in JDK 1.2. To stay compatible with existing user-defined class loaders, a compromise was made when the model was introduced: a new findClass() method was added to java.lang.ClassLoader. Before that, the only reason for users to extend java.lang.ClassLoader was to override the loadClass() method. Since JDK 1.2, users are discouraged from overriding loadClass() and should put their own class loading logic in findClass() instead. The logic of loadClass() is that if the parent loader fails to load the class, it calls its own findClass() to complete the loading, which ensures that newly written class loaders conform to the parent delegation rules.
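A loader written in the recommended style overrides only findClass(). The sketch below makes two assumptions for the sake of a self-contained demo: class bytes are read as resources through another loader, and the parent is set to null (the bootstrap loader) so that application classes actually reach findClass(). ResourceLoader and Plugin are hypothetical names.

```java
import java.io.IOException;
import java.io.InputStream;

public class FindClassDemo {

    /** Hypothetical class to be defined by the custom loader. */
    public static class Plugin { }

    /** Overrides only findClass(); loadClass() keeps its inherited delegation logic. */
    static class ResourceLoader extends ClassLoader {
        private final ClassLoader byteSource;

        ResourceLoader(ClassLoader byteSource) {
            super((ClassLoader) null); // parent is the bootstrap loader
            this.byteSource = byteSource;
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // Reached only after the parent has failed to load the class.
            String resource = name.replace('.', '/') + ".class";
            try (InputStream in = byteSource.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Returns {Plugin was defined by the custom loader, String is still the core class}. */
    public static boolean[] check() {
        ResourceLoader loader = new ResourceLoader(FindClassDemo.class.getClassLoader());
        try {
            Class<?> plugin = loader.loadClass(Plugin.class.getName());
            Class<?> string = loader.loadClass("java.lang.String");
            return new boolean[] {plugin.getClassLoader() == loader, string == String.class};
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = check();
        System.out.println(result[0] + " " + result[1]);
    }
}
```

Because loadClass() is untouched, core classes such as java.lang.String still come from the bootstrap loader, and only classes the parent cannot find are defined by the custom code.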

The second break was caused by a defect in the model itself. In practice there are scenarios in which base classes, loaded by a higher-level class loader, need to call back into user code, yet that loader does not know the user's classes. To solve this, the Java design team introduced a less than elegant design: the Thread Context ClassLoader. With it, a parent class loader can ask a child class loader to complete a class loading action, which violates the general principle of the parent delegation model.
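The mechanism is visible through Thread.getContextClassLoader() and Thread.setContextClassLoader(). In this sketch, lookupUserClass is a hypothetical stand-in for base-library code that has no compile-time knowledge of user classes, and RecordingLoader is a hypothetical loader that logs what it is asked for.

```java
import java.util.ArrayList;
import java.util.List;

public class ContextLoaderDemo {

    /** Records every class name it is asked for, then delegates normally. */
    static class RecordingLoader extends ClassLoader {
        final List<String> requests = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        @Override
        protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
            requests.add(name);
            return super.loadClass(name, resolve);
        }
    }

    /** Stand-in for base-library code: it finds user classes via the context loader. */
    static Class<?> lookupUserClass(String name) throws ClassNotFoundException {
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        return Class.forName(name, false, context);
    }

    /** Installs a context loader, performs one lookup, and returns what the loader saw. */
    public static List<String> run() {
        Thread thread = Thread.currentThread();
        ClassLoader saved = thread.getContextClassLoader();
        RecordingLoader userLoader = new RecordingLoader(ContextLoaderDemo.class.getClassLoader());
        thread.setContextClassLoader(userLoader);
        try {
            lookupUserClass(ContextLoaderDemo.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        } finally {
            thread.setContextClassLoader(saved); // always restore the previous loader
        }
        return userLoader.requests;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The JDK's own java.util.ServiceLoader.load(Class) follows the same pattern: it uses the current thread's context class loader to locate provider classes.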

The third break was caused by users' pursuit of program dynamism. Dynamism here refers to popular terms such as "hot code replacement" and "hot module deployment". Put plainly, the hope is that applications can work like computer peripherals: a mouse or a USB flash drive can be plugged in and used immediately without restarting the machine. OSGi is the current de facto Java modularization standard in the industry. The key to OSGi's modular hot deployment is its custom class loader mechanism. Each program module (called a Bundle in OSGi) has its own class loader. When a Bundle needs to be replaced, the Bundle is replaced together with its class loader to achieve hot code replacement. In an OSGi environment, class loaders no longer form the tree structure of the parent delegation model but develop into a more complex network structure.

Summary:

This section introduced the actions the virtual machine performs in the five phases of class loading ("loading", "verification", "preparation", "resolution", and "initialization"), as well as the working principles of class loaders and their significance for the virtual machine.
