How the JVM implements polymorphism: virtual methods, the virtual method table, static resolution, and dynamic linking explained

1 Introduction

I have been learning the JVM recently, and it has been painful: the knowledge points are scattered, good material is hard to find, and claims are hard to verify. I ran into the concepts in the title while studying, and at first they were difficult to understand. After a lot of searching, watching videos, and reading books, I put together these personal notes. They may not be completely accurate, but I hope they are a useful reference.

The explanation below assumes some prior understanding of the JVM, such as class loading, the JVM memory model, and bytecode files. We will follow the order in which things happen: source code is compiled into bytecode files, and the bytecode files are then loaded into virtual machine memory.

1.1 Virtual and non-virtual methods

Let's first look at the definition in a broad sense (that is, at the Java code level):
Non-virtual method: if the specific version of the method to call is determined at compile time and cannot change at run time, the method is called a non-virtual method. Static methods, private methods, final methods, instance constructors, and parent-class methods (called through super) are all non-virtual methods.
Virtual method: all other methods are called virtual methods.
How should we read these two definitions? The key phrase is "the specific version to call is determined at compile time". That is the essential difference between virtual and non-virtual methods. But what does it mean for a method to be determined at compile time? Can the target of a call really be undetermined?

Let's look at a specific code example:

class Animal {
    void test() {
        System.out.println("Animal");
    }
}

class Cat extends Animal {
    @Override
    void test() {
        System.out.println("Cat");
    }
}

class Test {
    void test(Animal animal) {
        // The target method cannot be determined at this point
        animal.test();
    }
}

The call animal.test() inside the test method of the Test class cannot be resolved at compile time. Only at run time, based on the actual argument, is it known whether Animal's or Cat's version is called. This is what it means for the specific version of a call to be determined (or not) at compile time.
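To make this concrete, here is a minimal sketch that mirrors the Animal and Cat classes above. As an assumption for illustration, the method is renamed speak() and returns a string instead of printing, so the result is easy to check; the wrapper class DispatchDemo and the helper call() are hypothetical names.

```java
class DispatchDemo {
    static class Animal {
        String speak() { return "Animal"; }
    }

    static class Cat extends Animal {
        @Override
        String speak() { return "Cat"; }
    }

    // The static (compile-time) type of the parameter is Animal,
    // but the version that runs depends on the object actually passed in.
    static String call(Animal animal) {
        return animal.speak();
    }

    public static void main(String[] args) {
        System.out.println(call(new Animal())); // prints "Animal"
        System.out.println(call(new Cat()));    // prints "Cat"
    }
}
```

The same call site inside call() runs two different method bodies, which is exactly why the compiler cannot fix the target in advance.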

You can also think of it this way: methods that cannot be overridden are non-virtual methods, such as static methods, final methods, and private methods; methods that may be overridden are virtual methods. Note that any method that could be overridden is a virtual method, even if nobody has overridden it yet.

Now let's look at a more precise definition, which is easier to work with:

Non-virtual method: a method invoked by the invokestatic or invokespecial instruction is a non-virtual method. Methods modified by final also count as non-virtual, even though they are invoked with invokevirtual.
Virtual method: all remaining methods are virtual methods.
What does this mean? Source files are compiled into class files. During compilation, the compiler translates our source code into instructions the JVM can recognize. For method calls, the JVM provides the following four instructions (a fifth, invokedynamic, is outside the scope of this article):

invokestatic: calls static methods; the single method version is determined in the resolution stage
invokespecial: calls instance constructors (<init>), private methods, and parent-class methods; the single method version is determined in the resolution stage
invokevirtual: calls all virtual methods
invokeinterface: calls interface methods

The compiler chooses the instruction for each call site from what it sees in the source code. Deciding whether the target can be determined is simple: it checks whether the method is static, private, a constructor, a parent-class method called through super, or an interface method, and picks the instruction accordingly.
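As a rough illustration, the sketch below puts one call of each kind in a single class, with a comment naming the instruction javac typically emits. All class and method names here are hypothetical. One assumption to flag: the comment on the private call reflects the classic behaviour described above; to my knowledge, javac targeting JDK 11 or later may emit invokevirtual for private instance methods instead, so check with javap -c on your own JDK.

```java
class InvokeKinds {
    interface Greeter { String greet(); }

    static class Base implements Greeter {
        public String greet() { return "base"; }
    }

    static class Derived extends Base {
        static String helper() { return "static"; }
        private String secret() { return "private"; }

        @Override
        public String greet() {
            return super.greet() + "+derived";   // invokespecial (parent-class method)
        }

        String all() {
            Greeter g = this;
            return helper()          // invokestatic
                 + "," + secret()    // invokespecial (assumption: may be invokevirtual on JDK 11+)
                 + "," + greet()     // invokevirtual
                 + "," + g.greet();  // invokeinterface
        }
    }

    public static void main(String[] args) {
        Derived d = new Derived();   // invokespecial on Derived.<init>
        System.out.println(d.all()); // prints "static,private,base+derived,base+derived"
    }
}
```

Compiling this and running javap -c on the generated class files is a quick way to see the actual instructions for yourself.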

1.2 Symbolic references and direct references

Let's look at the definitions first:
Symbolic reference: a string that contains enough information to locate the target when it is actually used.
Direct reference: an address, namely the memory address of the class or method we are using.
Let's first look at a class:

class Test {
    public static void main(String[] args) {
        System.out.println("wqewqeqwe");
    }
}

After this class is compiled into a class file, we can disassemble it with javap -v and get the following content (only part of the output is shown):

Constant pool:
   #1 = Methodref          #6.#20         // java/lang/Object."<init>":()V
   #2 = Fieldref           #21.#22        // java/lang/System.out:Ljava/io/PrintStream;
   #3 = String             #23            // wqewqeqwe
   #4 = Methodref          #24.#25        // java/io/PrintStream.println:(Ljava/lang/String;)V
   #5 = Class              #26            // Test
   #6 = Class              #27            // java/lang/Object
   #7 = Utf8               <init>
   #8 = Utf8               ()V
   #9 = Utf8               Code
  #10 = Utf8               LineNumberTable
  #11 = Utf8               LocalVariableTable
  #12 = Utf8               this
  #13 = Utf8               LTest;
  #14 = Utf8               main
  #15 = Utf8               ([Ljava/lang/String;)V
  #16 = Utf8               args
  #17 = Utf8               [Ljava/lang/String;
  #18 = Utf8               SourceFile
  #19 = Utf8               Solution.java
  #20 = NameAndType        #7:#8          // "<init>":()V
  #21 = Class              #28            // java/lang/System
  #22 = NameAndType        #29:#30        // out:Ljava/io/PrintStream;
  #23 = Utf8               wqewqeqwe
  #24 = Class              #31            // java/io/PrintStream
  #25 = NameAndType        #32:#33        // println:(Ljava/lang/String;)V
  #26 = Utf8               Test
  #27 = Utf8               java/lang/Object
  #28 = Utf8               java/lang/System
  #29 = Utf8               out
  #30 = Utf8               Ljava/io/PrintStream;
  #31 = Utf8               java/io/PrintStream
  #32 = Utf8               println
  #33 = Utf8               (Ljava/lang/String;)V

In the source code we use the System class, but in the class file this use is replaced by java/lang/System, the entry at #28 in the constant pool. This java/lang/System string is the so-called symbolic reference.

Why use symbolic references? Because the compiler does not know how memory will be laid out at run time, it cannot know what the direct reference will be, so it uses a uniquely identifying string instead.

When is it converted into a direct reference? During class loading. The linking stage of class loading contains a sub-stage called resolution (sometimes translated as "parsing"), whose task is to convert the symbolic references of the class into direct references.

Moreover, the result of translating a symbolic reference into a direct reference can be reused. The virtual machine has an area called the method area, and within it an area called the runtime constant pool. The runtime constant pool can be thought of as a table that records the correspondence between symbolic references and direct references. When a class is loaded, all the symbolic references in its class file are added to this table.

If a newly added symbolic reference already exists in the table, it has already been translated and can be converted into a direct reference immediately. If not, the symbolic reference is appearing for the first time, and the JVM has to search for the target using the string content. After the first run, the symbolic reference is replaced with a direct reference, so no search is needed the next time.

The above is my personal understanding, not necessarily correct, for your reference and discussion.
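The "search once, then reuse" idea described above can be sketched as a small cache. This is only an analogy built on Java reflection, not how any real JVM implements resolution. The class name ResolutionCacheSketch, the lookups counter, and the "ClassName#methodName" string format are all assumptions made for this example, and only public no-argument methods are handled.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

class ResolutionCacheSketch {
    // Plays the role of the table: symbolic reference -> direct reference.
    private final Map<String, Method> resolved = new HashMap<>();
    int lookups = 0; // how many times we had to search by string content

    // Hypothetical symbolic format: "java.lang.String#length".
    Method resolve(String symbolic) {
        Method m = resolved.get(symbolic);
        if (m == null) {
            lookups++; // first appearance: search using the string content
            String[] parts = symbolic.split("#");
            try {
                m = Class.forName(parts[0]).getMethod(parts[1]);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot resolve " + symbolic, e);
            }
            resolved.put(symbolic, m); // remember the "direct reference"
        }
        return m;
    }

    public static void main(String[] args) throws Exception {
        ResolutionCacheSketch pool = new ResolutionCacheSketch();
        Method first = pool.resolve("java.lang.String#length");
        Method second = pool.resolve("java.lang.String#length");
        System.out.println(first.invoke("hello")); // prints 5
        System.out.println(first == second);       // prints true
        System.out.println(pool.lookups);          // prints 1
    }
}
```

The second resolve call returns the cached Method object without searching again, which is the behaviour the paragraphs above attribute to resolved constant pool entries.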

1.3 Dynamic linking in stack frames

Each stack frame contains a piece of data called the dynamic link. What is it?
First look at a class:

class Test {
    public static void main(String[] args) {
        System.out.println("wqewqeqwe");
    }
}

Disassemble it with javap -v:
[Image: javap -v output with the constant pool references marked in red]
The references marked in red are the so-called dynamic links. I personally don't like this name because I find it ambiguous. A clearer description would be "the stack frame's reference to the runtime constant pool in the method area".

In other words, the dynamic link is a pointer to a symbolic reference in the runtime constant pool in the method area. If the entry it points to has already been resolved, it is a direct reference, that is, it points to a specific address.

There is also a pair of related concepts, static linking and dynamic linking:
Static linking (static resolution): when a bytecode file is loaded into the JVM, if the target method is known at compile time and stays unchanged at run time, the process of converting the symbolic reference of the called method into a direct reference is called static linking.
Dynamic linking: if the target method cannot be determined at compile time, the symbolic reference of the called method can only be converted into a direct reference while the program is running. Because this conversion is dynamic, it is called dynamic linking.

Note that this "dynamic linking" is not the same thing as the dynamic link in the stack frame, so don't confuse the two. My personal understanding of these concepts is that they differ in when resolution happens. If resolution occurs during class loading, it is static resolution; if it occurs at run time, it is dynamic linking.

1.4 Virtual method table

Let's first look at what actually happens when a virtual method is invoked (the invokevirtual instruction):

1. Find the actual type of the object referenced by the first element on top of the operand stack, and call it C.
2. If type C contains a method whose descriptor and simple name both match the constant, perform an access check. If the check passes, return the direct reference to this method and end the lookup; if it fails, throw java.lang.IllegalAccessError.
3. Otherwise, walk up the inheritance hierarchy from bottom to top and repeat the search and check of step 2 for each parent class of C.
4. If no suitable method is found, throw java.lang.AbstractMethodError.
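The lookup steps above can be approximated with reflection. This is a simplified sketch under stated assumptions: it matches by name only and handles no-argument methods, it skips the access check in the second step, and the names LookupSketch, lookup, and the extra sleep() method are hypothetical.

```java
import java.lang.reflect.Method;

class LookupSketch {
    static class Animal {
        void test() {}
        void sleep() {}
    }

    static class Cat extends Animal {
        @Override
        void test() {}
    }

    // Start at the actual type C and climb the superclass chain
    // until some class declares the method.
    static Method lookup(Class<?> actualType, String name) {
        for (Class<?> c = actualType; c != null; c = c.getSuperclass()) {
            try {
                return c.getDeclaredMethod(name); // found in c (access check omitted)
            } catch (NoSuchMethodException e) {
                // not declared in c; continue with the parent class
            }
        }
        throw new AbstractMethodError(name); // nothing suitable was found
    }

    public static void main(String[] args) {
        Animal receiver = new Cat(); // static type Animal, actual type Cat
        System.out.println(lookup(receiver.getClass(), "test").getDeclaringClass().getSimpleName());  // prints "Cat"
        System.out.println(lookup(receiver.getClass(), "sleep").getDeclaringClass().getSimpleName()); // prints "Animal"
    }
}
```

Every call repeats the climb, which is the cost that motivates replacing the search with a table.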
Dynamic dispatch is used very frequently in object-oriented programming. If every dynamic dispatch had to search the class's method metadata for a suitable target, execution efficiency could suffer. To improve performance, the JVM therefore builds a virtual method table (vtable) for each class in the method area (non-virtual methods do not appear in it) and uses table indexes instead of searching the metadata.
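A toy model may help show why an indexed table is cheaper than a search. The sketch below is not HotSpot's actual data structure; ToyClass, define, slotOf, and invokeVirtual are hypothetical names. It rests on one idea: a subclass copies its parent's table, an override replaces the entry in the same slot, and a new method is appended at the end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class VtableSketch {
    static final class ToyClass {
        final List<String> slotNames;         // slot index -> method name
        final List<Supplier<String>> vtable;  // slot index -> implementation

        ToyClass(ToyClass parent) {
            // A subclass starts with a copy of the parent's table.
            slotNames = parent == null ? new ArrayList<>() : new ArrayList<>(parent.slotNames);
            vtable = parent == null ? new ArrayList<>() : new ArrayList<>(parent.vtable);
        }

        ToyClass define(String method, Supplier<String> impl) {
            int slot = slotNames.indexOf(method);
            if (slot >= 0) {
                vtable.set(slot, impl);   // override: same slot, new target
            } else {
                slotNames.add(method);    // new method: append a new slot
                vtable.add(impl);
            }
            return this;
        }

        int slotOf(String method) { return slotNames.indexOf(method); }

        // Dispatch is a single indexed read, with no climbing of the hierarchy.
        String invokeVirtual(int slot) { return vtable.get(slot).get(); }
    }

    public static void main(String[] args) {
        ToyClass animal = new ToyClass(null)
                .define("test", () -> "Animal")
                .define("sleep", () -> "zzz");
        ToyClass cat = new ToyClass(animal).define("test", () -> "Cat");

        int testSlot = animal.slotOf("test");   // resolved once against the static type
        int sleepSlot = animal.slotOf("sleep");
        System.out.println(animal.invokeVirtual(testSlot)); // prints "Animal"
        System.out.println(cat.invokeVirtual(testSlot));    // prints "Cat"
        System.out.println(cat.invokeVirtual(sleepSlot));   // prints "zzz"
    }
}
```

Because a method keeps the same slot index in every subclass, the index found once for the static type stays valid for whatever actual type shows up at run time.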

What does a virtual method table look like? The following two pictures illustrate it:

[Images: two diagrams illustrating a virtual method table]

To sum up: the four instructions mentioned earlier tell us whether a method is virtual. When the execution engine encounters a call to a virtual method, it looks up the virtual method table of the object's actual class and then executes the concrete method found there.

Origin blog.csdn.net/chuige2013/article/details/129744123