Java Virtual Machine in Detail (x) ------ The Class Loading Process

  In the previous article we described the structure of a Java Class file in detail. How, then, does the virtual machine load a Class file into memory so that it can be used? That is the subject of this post: the class loading process.

1. Class life cycle

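  From the moment a class is loaded into the virtual machine's memory until it is unloaded from memory, its life cycle runs through the following stages: loading, verification, preparation, resolution, initialization, using and unloading. Verification, preparation and resolution are collectively called linking.

  The order of five of these stages (loading, verification, preparation, initialization and unloading, shown in red in the original figure) is fixed, which means the class loading process must start them one after another in this order. Note that the requirement is only that they start in this order, not that each one is carried out or completed before the next begins: the stages are usually interleaved, and one stage typically invokes or activates another while it is still running.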

2. Loading

  "Loading" is the first stage of the class loading process. During the loading stage the virtual machine has to do three things:

  ①, Obtain the binary byte stream that defines the class, using the class's fully qualified name.

  ②, Convert the static storage structure represented by this byte stream into the run-time data structure of the method area.

  ③, Generate in the Java heap a java.lang.Class object representing the class, which serves as the entry point for accessing the class's data in the method area.

  PS: The fully qualified name of a class can be thought of as the absolute path under which the class is stored. The method area is a runtime data area; up to JDK 1.7 HotSpot implemented it as the permanent generation, and from JDK 1.8 on it is implemented as the Metaspace. It is mainly used to store the class information loaded by the virtual machine, constants, static variables, code compiled by the just-in-time compiler, and similar data. For more information, see the second article of this series, on the runtime memory structure.
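  The role of the fully qualified name and of the java.lang.Class object can be seen in a minimal sketch. The class name LoadByNameDemo and the helper loadByName are made up for this illustration; only Class.forName, getName and getClassLoader are standard API.

```java
public class LoadByNameDemo {

    // Hypothetical helper: asks the JVM for a class by its fully qualified name.
    public static Class<?> loadByName(String fullyQualifiedName) {
        try {
            return Class.forName(fullyQualifiedName);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("No such class: " + fullyQualifiedName, e);
        }
    }

    public static void main(String[] args) {
        Class<?> byName = loadByName("java.util.ArrayList");

        // Looking the class up by name yields the same Class object as the class literal.
        System.out.println(byName == java.util.ArrayList.class); // prints "true"
        System.out.println(byName.getName());                    // prints "java.util.ArrayList"

        // Core library classes are loaded by the bootstrap loader, which is reported as null.
        System.out.println(byName.getClassLoader());             // prints "null"
    }
}
```

  The Class object returned here is the access point described in step ③: reflection calls on it read the type data that the virtual machine keeps for the class.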

  Look again at the first point, obtaining the binary byte stream that defines the class by its fully qualified name. It does not say where the stream must come from or how it must be obtained; in particular, it does not require the stream to come from a Class file. On this stage, Java developers have come up with all kinds of creative approaches over the years:

  1, Reading it from a ZIP package. This later became the basis of the JAR, EAR and WAR formats.

  2, Obtaining it from the network. The typical application is the Applet.

  3, Computing and generating it at run time. This is the technique behind dynamic proxies.

  4, Generating it from other files, for example JSP applications.

  5, Reading it from a database.
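  As an illustration of item 3, here is a minimal sketch that uses java.lang.reflect.Proxy to obtain a class whose bytes are generated at run time rather than read from a Class file. The names RuntimeGeneratedClassDemo, Greeter and newGreeter are invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class RuntimeGeneratedClassDemo {

    // Hypothetical interface; the proxy class implementing it does not exist at compile time.
    public interface Greeter {
        String greet(String name);
    }

    public static Greeter newGreeter() {
        // Every call on the proxy is routed to this handler.
        InvocationHandler handler = (proxy, method, args) -> {
            if ("greet".equals(method.getName())) {
                return "Hello, " + args[0];
            }
            throw new UnsupportedOperationException(method.getName());
        };
        // The JDK generates and defines the implementing class on the fly.
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        Greeter greeter = newGreeter();
        System.out.println(greeter.greet("JVM"));                    // prints "Hello, JVM"
        System.out.println(Proxy.isProxyClass(greeter.getClass())); // prints "true"
    }
}
```

  The generated class still passes through the same loading stage; only the origin of its byte stream differs.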

  When the loading stage is complete, the binary byte stream from outside the virtual machine has been stored in the method area in the format the virtual machine requires, and a java.lang.Class object has been instantiated in the Java heap. This object acts as the interface through which the program accesses the type data in the method area.

  Note that parts of the loading stage and of the linking stage (for example, some of the bytecode file format checks) are interleaved: linking may begin before loading has finished.

3. Verification

  Verification is the first step of the linking phase. Its purpose is to ensure that the information in the byte stream of the Class file meets the requirements of the current virtual machine and will not compromise the safety of the virtual machine itself.

  We say the Java language itself is relatively safe: pure Java code cannot do things like access data beyond an array's bounds or jump to a line of code that does not exist, because the language and its compiler rule them out. But as mentioned earlier, a Class file does not have to be compiled from Java source code. It can be produced by any means, including writing it by hand with a hex editor.

  So if the virtual machine did not check the incoming byte stream, it could load a harmful byte stream and crash the system. The virtual machine specification does not spell out precisely what to check, when to check, or how to check, so implementations may differ, but in general verification covers the following four areas.

①, File format verification

  Checks whether the byte stream complies with the Class file format and can be handled by the current version of the virtual machine:

  First, whether it begins with the magic number 0xCAFEBABE.

  Second, whether the major and minor version numbers are within the range the current virtual machine can handle.

  Third, whether the constant pool contains constants of an unsupported type (checked via the constant's tag).

  Fourth, whether any of the index values that point to constants refer to a constant that does not exist or is of the wrong type.

  Fifth, whether any CONSTANT_Utf8_info constant contains data that is not valid UTF8 encoding.

  Sixth, whether any part of the Class file, or the file itself, has had information deleted or appended.

  These are only some of the checks; there are many more. Only after passing them is the byte stream stored in the method area. The three later verification stages all operate on the storage structure in the method area rather than on the byte stream.
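  The first two checks can be sketched in a few lines. This is only an illustration of the idea, not how any real virtual machine implements verification; the class ClassHeaderCheck, the method hasAcceptableHeader and the parameter maxSupportedMajor are assumptions made for the example. It relies on the Class file layout of a 4-byte magic number followed by a 2-byte minor version and a 2-byte major version (52 is the major version of Java 8).

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

public class ClassHeaderCheck {

    static final int MAGIC = 0xCAFEBABE;

    // Hypothetical check: magic number present and major version not newer than supported.
    public static boolean hasAcceptableHeader(byte[] classBytes, int maxSupportedMajor) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(classBytes))) {
            int magic = in.readInt();            // bytes 0-3
            in.readUnsignedShort();              // bytes 4-5: minor version, ignored in this sketch
            int major = in.readUnsignedShort();  // bytes 6-7
            return magic == MAGIC && major <= maxSupportedMajor;
        } catch (IOException truncated) {
            // Fewer than 8 bytes: cannot be a valid Class file.
            return false;
        }
    }

    public static void main(String[] args) {
        byte[] java8Header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 52};
        byte[] zipHeader = {'P', 'K', 3, 4, 0, 0, 0, 52};

        System.out.println(hasAcceptableHeader(java8Header, 52)); // prints "true"
        System.out.println(hasAcceptableHeader(zipHeader, 52));   // prints "false"
        System.out.println(hasAcceptableHeader(java8Header, 51)); // prints "false"
    }
}
```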

②, Metadata verification

  The second stage performs semantic analysis of the information described by the bytecode, to ensure that it complies with the Java language specification:

  First, whether the class has a parent class (every class except java.lang.Object must have one).

  Second, whether the class's parent is a class that may not be inherited from (a class modified by final).

  Third, if the class is not abstract, whether it implements all the methods that its parent class or interfaces require it to implement.

  Fourth, whether the fields and methods of the class conflict with those of the parent class (for example, overriding a final field of the parent, or a method overload that does not follow the rules).

③, Bytecode verification

  The third stage is the most complex part of the whole verification process. It mainly analyzes data flow and control flow: the method bodies of the class are analyzed to ensure that the verified methods will not do anything at run time that endangers the virtual machine.

  First, ensure that the data types on the operand stack always match the instruction sequence. For example, it must never happen that an int is pushed onto the operand stack but is then stored into the local variable table as a long.

  Second, ensure that no jump instruction jumps to a bytecode instruction outside the method body.

  Third, ensure that type conversions in the method body are valid. For example, assigning a subclass object to a variable of the parent class type is safe, but assigning a parent class object to a variable of the subclass type, or even to a completely unrelated type, is not legal.

④, Symbolic reference verification

  Symbolic reference verification checks that information outside the class itself (the various symbolic references in the constant pool) can be matched. It typically checks the following:

  First, whether the class named by the fully qualified name string in a symbolic reference can be found.

  Second, whether the specified class contains a method or field that matches the simple name and descriptor given in the reference.

  Third, whether the access modifier (private, protected, public, default) of the class, field or method in the symbolic reference allows access from the current class.

4. Preparation

  The preparation stage is where memory is formally allocated for class variables and their initial values are set. This memory is allocated in the method area.

  Note:

  First, only class variables (variables modified by static) are involved here, not instance variables. Instance variables are allocated in the Java heap together with the object when the object is instantiated.

  Second, the initial value is the zero value of the data type: 0 for int, 0L for long, (short) 0 for short, (byte) 0 for byte, '\u0000' for char, 0.0f for float, 0.0d for double, false for boolean, and null for reference types.

  For example, given the definition public static int value = 123, the value of value after the preparation stage is 0, not 123. The assignment of 123 is compiled into the class constructor <clinit>() method and is executed only during the initialization stage. There is one special case: a field that is also modified by final, such as public static final int value = 123, gets a ConstantValue attribute from the compiler and is assigned 123 already in the preparation stage.
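  The difference between the two cases can be observed from plain Java. The sketch below uses invented names (PreparationDemo, readValue, readConstant); it reads both fields from initializers that run before the assignment value = 123 has executed.

```java
public class PreparationDemo {

    // Static initializers run in textual order inside <clinit>(), so these two run first.
    static int valueSeenEarly = readValue();
    static int constantSeenEarly = readConstant();

    static int value = 123;          // assigned during initialization, after the two lines above
    static final int CONSTANT = 123; // compile-time constant, never observed as 0

    static int readValue() {
        return value;    // still the zero value set in the preparation stage
    }

    static int readConstant() {
        return CONSTANT;
    }

    public static void main(String[] args) {
        System.out.println(valueSeenEarly);    // prints "0"
        System.out.println(constantSeenEarly); // prints "123"
        System.out.println(value);             // prints "123"
    }
}
```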

5. Resolution

  The resolution stage is the process in which the virtual machine replaces symbolic references in the constant pool with direct references.

  Symbolic reference (Symbolic References): a symbolic reference describes the referenced target with a set of symbols. The symbols may be literals of any form, as long as they identify the target unambiguously. Symbolic references are independent of the memory layout of the virtual machine implementation, and the referenced target does not need to have been loaded into memory.

  Direct reference (Direct References): a direct reference may be a pointer straight to the target, a relative offset, or a handle that locates the target indirectly. Direct references depend on the memory layout of the virtual machine implementation; the same symbolic reference generally translates to different direct references on different virtual machine instances. If a direct reference exists, the referenced target must already be in memory.

  Resolution mainly applies to four kinds of symbolic references: classes or interfaces, fields, class methods and interface methods, which correspond to the constant pool types CONSTANT_Class_info, CONSTANT_Fieldref_info, CONSTANT_Methodref_info and CONSTANT_InterfaceMethodref_info.

6. Initialization

  Initialization is the last step of the class loading process. In the preceding stages, apart from the loading stage, in which a user-defined class loader can take part, everything is directed and controlled entirely by the virtual machine. Only in the initialization stage does the virtual machine really start executing the Java program code (that is, the bytecode) defined in the class.

  In the preparation stage, class variables have already been assigned their initial zero values once; in the initialization stage, class variables and other resources are initialized according to the plan the programmer laid out in code.

  In other words, the initialization stage is the process of executing the class constructor <clinit>() method.

  ①, The <clinit>() method is produced by the compiler, which automatically collects the assignments to all class variables in the class and the statements in static blocks (static {}) and merges them. The order in which the compiler collects them is the order in which the statements appear in the source file. A static block can only read variables defined before it; variables defined after it can be assigned in the static block but cannot be read.

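  For example, a static block that reads a class variable declared after it does not compile:

  [Code listing missing from this copy of the article.]

  If the variable declaration (line 14 of that listing) is moved above the static block, the error goes away. Alternatively, if the order is left unchanged and the statement that reads the variable (line 11) is removed, the code also compiles.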
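  A minimal sketch of this rule follows. It is not the listing from the original post; the class name ForwardReferenceDemo and the field counter are made up. The offending read is left as a comment so that the example compiles.

```java
public class ForwardReferenceDemo {

    static {
        counter = 1; // allowed: assigning to a class variable declared further down
        // System.out.println(counter); // would not compile: illegal forward reference
    }

    // Runs after the static block, because <clinit>() follows source order.
    static int counter = 2;

    public static void main(String[] args) {
        System.out.println(counter); // prints "2"
    }
}
```

  The printed value is 2 rather than 1 because the initializer of counter is collected into <clinit>() after the static block and therefore overwrites the earlier assignment.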

  ②, The <clinit>() method differs from a class's constructor (that is, the instance constructor <init>() method): it does not need to call the parent class's <clinit>() method explicitly. The virtual machine guarantees that the parent class's <clinit>() method has finished before the subclass's <clinit>() method runs. Consequently, the first <clinit>() method executed in the virtual machine is always that of java.lang.Object.

  ③, Because the parent class's <clinit>() method runs first, static blocks defined in the parent class take precedence over the variable assignments of the subclass.

  ④, The <clinit>() method is not mandatory for a class or interface. If a class has no static blocks and no assignments to class variables, the compiler does not need to generate a <clinit>() method for it.

  ⑤, An interface cannot contain static blocks, but it can still contain variable initializations, so a <clinit>() method is generated for interfaces just as for classes. Unlike a class, however, executing an interface's <clinit>() method does not require first executing the <clinit>() method of its parent interface. The parent interface is initialized only when a variable defined in it is used.

  ⑥, Likewise, initializing a class that implements an interface does not execute the interface's <clinit>() method (the exception for interfaces with default methods is listed among the triggers further down).

  ⑦, The virtual machine ensures that a class's <clinit>() method is correctly locked and synchronized in a multithreaded environment. If several threads try to initialize a class at the same time, only one of them executes the class's <clinit>() method; the others block until it has finished. If a class's <clinit>() method contains a very time-consuming operation, multiple threads may end up blocked.

  For example, the following code:

package com.yb.carton.controller;

/**
 * Create by YSOcean
 */
public class ClassLoadInitTest {


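    static class Hello{
        // This static block is compiled into Hello's <clinit>() method.
        static {
            // if(true) keeps the compiler from rejecting the block because of the endless loop.
            if(true){
                System.out.println(Thread.currentThread().getName() + "init");
                while(true){} // never returns, so initialization of Hello never completes
            }
        }
    }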

    public static void main(String[] args) {
        new Thread(()->{
            System.out.println(Thread.currentThread().getName()+"start");
            Hello h1 = new Hello();
            System.out.println(Thread.currentThread().getName()+"run over");
        }).start();


        new Thread(()->{
            System.out.println(Thread.currentThread().getName()+"start");
            Hello h2 = new Hello();
            System.out.println(Thread.currentThread().getName()+"run over");
        }).start();
    }

}

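  When the program runs, both threads print their start message and one of them prints init, but neither ever prints run over, and the program does not terminate.

  The thread that gets to execute the <clinit>() method first (thread 1 in the original run) never leaves it, because the method is an infinite loop, so the other thread blocks waiting for it.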
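  Returning to rules ② and ③ above, the parent-first order can be checked with a small sketch. The names ParentFirstDemo, Parent and Sub are invented for the illustration.

```java
public class ParentFirstDemo {

    static class Parent {
        static int a = 1;

        static {
            a = 2; // part of Parent's <clinit>(), runs after the initializer above
        }
    }

    static class Sub extends Parent {
        // Parent's <clinit>() has already finished when this runs, so a is 2.
        static int b = a;
    }

    public static void main(String[] args) {
        System.out.println(Sub.b); // prints "2"
    }
}
```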

  Now that we know what class initialization does, when is it triggered? The JVM specification provides roughly the following situations:

  ①, When the virtual machine starts, the main class specified by the user is initialized.

  ②, When a new instruction is encountered that creates an instance of a class, the target class of new is initialized.

  ③, When an instruction that invokes a static method is encountered, the class in which the static method is declared is initialized.

  ④, When an instruction that accesses a static field is encountered, the class in which the static field is declared is initialized.

  ⑤, Initializing a subclass triggers initialization of its parent class.

  ⑥, If an interface defines a default method, initializing a class that implements the interface directly or indirectly triggers initialization of the interface.

  ⑦, When a class is invoked reflectively through the reflection API, the class is initialized.

  ⑧, When a MethodHandle instance is invoked for the first time, the class containing the method that the MethodHandle points to is initialized.
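  Triggers ② and ⑥, together with the earlier rule that an implementing class does not initialize its interface, can be made visible with a sketch like the following. All names (InterfaceInitDemo, Plain, WithDefault, record, run) are hypothetical; the interface fields are initialized through a method call so that they are not compile-time constants and their initialization leaves a trace.

```java
import java.util.ArrayList;
import java.util.List;

public class InterfaceInitDemo {

    // Records the order in which the types below are initialized.
    static final List<String> LOG = new ArrayList<>();

    static int record(String typeName) {
        LOG.add(typeName);
        return 0;
    }

    interface Plain {
        int FLAG = record("Plain"); // runs in Plain's <clinit>()
    }

    interface WithDefault {
        int FLAG = record("WithDefault");

        default void hello() { }
    }

    static class PlainImpl implements Plain {
        static int x = record("PlainImpl");
    }

    static class DefaultImpl implements WithDefault {
        static int x = record("DefaultImpl");
    }

    public static List<String> run() {
        new PlainImpl();       // initializes PlainImpl only
        new DefaultImpl();     // initializes WithDefault first, then DefaultImpl
        int flag = Plain.FLAG; // reading the field finally initializes Plain
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "[PlainImpl, WithDefault, DefaultImpl, Plain]"
    }
}
```

  Plain shows up last because nothing initializes it until its field is read, whereas WithDefault is initialized before its implementing class because it declares a default method.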

 


Origin www.cnblogs.com/ysocean/p/11427536.html