[JVM] The Execution Engine of the JVM

foreword

In this article we will explain the execution engine in the JVM.

Question: How are the Java programs we write compiled into machine code that computers can recognize? How does Java program compilation differ from C/C++ program compilation? And what distinguishes the JIT compiler from the compiler we usually talk about? After reading this article, you should have a clear picture.

  • Java's [front-end] compilation produces only bytecode files, not assembly (let alone machine code). When a Java program runs, the bytecode file is loaded into the Java virtual machine, and the virtual machine "translates" the bytecode into machine instructions to run. Because a virtual machine is implemented for each platform, compiling for the virtual machine makes the code portable.

  • The C/C++ compilation and execution process is divided into four stages: Pre-Processing, Compilation, Assembly, and Linking. C/C++ programs are compiled into machine code, which usually cannot run on machines with a different instruction system; compilation of C/C++ programs generally targets the hardware directly.


Glossary

machine code

  • Definition : Various instructions expressed in binary encoding are called machine instruction codes.

  • Features :

    • It can be understood and accepted by the computer directly: a program written in machine code can be read and run by the CPU as-is, so it executes fastest;
    • It is closely tied to the CPU: different types of CPU have different machine code;

instruction

Since machine code is a binary sequence composed of 0s and 1s, its readability is poor, so instructions appeared.

  • Definition: An instruction simplifies a specific 0/1 sequence of machine code into a corresponding mnemonic (generally an English abbreviation, such as mov or inc) that tells the computer to perform a particular operation. Examples include data transfer instructions, arithmetic operation instructions, bit operation instructions, program flow control instructions, string operation instructions, and processor control instructions.

  • Form of composition:
    An instruction usually consists of two parts: opcode + address code.

    • Operation code (opcode): indicates the type or nature of the operation the instruction performs, such as fetching, adding, or outputting data.
    • Address code: indicates the operand itself or the address of the storage unit where it is located.
  • Features :

    • An instruction is the smallest functional unit of computer operation
    • The same instruction (such as mov) on different hardware platforms may have different machine codes.

Instruction Set

  • Definition : The collection of all instructions on a computer is that computer's instruction system. The instruction system, also known as the instruction set, embodies all the functions of the computer.

  • Features :

    • Different hardware platforms have different supported instructions. Therefore, the instructions supported by each platform are called the instruction set of the corresponding platform.

    • For example:

      • The x86 instruction set corresponds to the x86 architecture platform
      • ARM instruction set, corresponding to the platform of ARM architecture

Assembly language

  • Concept : Assembly language is a low-level language, also known as symbolic language. In assembly language, mnemonics are used instead of opcodes for machine instructions, and address symbols or labels are used instead of addresses of instructions or operands.

  • Features :

    • On different hardware platforms, assembly language corresponds to different machine instruction sets; it is converted into machine instructions through the assembly process.
    • Since computers understand only instruction code, programs written in assembly language must be translated into machine instruction code before the computer can recognize and execute them.

high level language

  • Concept : A high-level language is a machine-independent, procedural or object-oriented language.

  • Features :

    • Write a program in a way that is easier for people to understand, and the written program is called a source program.
    • High-level language has nothing to do with the hardware structure and instruction system of the computer. It has stronger expressive ability, can easily express the operation of data and the control structure of the program, can better describe various algorithms, and is easy to learn and master.
    • When a computer executes a program written in a high-level language, the program must still be interpreted or compiled into the machine's instruction code. The program that completes this process is called an interpreter or a compiler.

bytecode

  • Concept : bytecode is a binary code in a special intermediate state [intermediate code]; it is more abstract than machine code and must be translated before it becomes machine code.

  • Features :

    • Bytecode targets a specific software runtime and software environment and is independent of the hardware environment → 【Cross-platform】
    • The implementation of bytecode is through compilers and virtual machines. The compiler compiles the source code into bytecode, and the virtual machine on a specific platform translates the bytecode into instructions that can be executed directly.

The typical application of bytecode is: Java bytecode
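As a concrete illustration, a trivial class can be compiled with `javac` and its bytecode inspected with `javap -c Add` (the mnemonics quoted in the comment are the JVM instructions javac emits for this method):

```java
// Compile with `javac Add.java`, then run `javap -c Add` to see the
// bytecode the front-end compiler produced for each method.
public class Add {
    public static int add(int a, int b) {
        return a + b;   // bytecode: iload_0, iload_1, iadd, ireturn
    }

    public static void main(String[] args) {
        System.out.println(add(2, 3));   // prints 5
    }
}
```

The same `Add.class` file runs unchanged on any platform with a JVM; only the virtual machine's translation to machine instructions differs.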

Virtual Machine & Physical Machine

"Virtual machine" is a concept relative to "physical machine", both of which have the ability to execute code.

  • the difference:

    • The execution engine of a physical machine is built on the processor, cache, instruction set, and operating system levels.
    • The execution engine of the virtual machine is implemented in software, so it is not restricted by physical conditions: the structure of its instruction set and execution engine can be designed freely, and it can execute instruction-set formats that the hardware does not directly support.

Frontend Compiler & Backend Compiler

Front-end compiler : the process of converting *.java files into *.class files.
Back-end compiler : the process of converting Class files into binary machine codes related to local infrastructure (hardware instruction set, operating system).

Execution engine of JVM

The execution engine is one of the core components of the Java virtual machine. Bytecode cannot run directly on the operating system: bytecode instructions are not equivalent to native machine instructions, and a class file contains only bytecode instructions, symbol tables, and other information that only the JVM can recognize. So after the JVM loads the bytecode, the task of the execution engine (Execution Engine) is to interpret or compile the bytecode instructions into native machine instructions for the corresponding platform. Simply speaking, the execution engine of the JVM acts as a translator from a high-level language to machine language.

How does the execution engine work?

  1. Which bytecode instruction the execution engine executes next depends entirely on the PC register;
  2. Whenever an instruction is executed, the PC register is updated with the address of the next instruction to be executed;
  3. During method execution, the execution engine may use an object reference stored in the local variable table to locate the instance information of the object in the Java heap, and use the metadata pointer in the object header to locate the object's type information.
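The fetch/advance/execute cycle above can be sketched as a dispatch loop. The instruction set below (PUSH/ADD/HALT) is invented for illustration and is far simpler than real JVM bytecode:

```java
import java.util.ArrayDeque;
import java.util.Deque;

// A toy dispatch loop sketching how an execution engine uses the PC register:
// fetch the instruction at pc, advance pc, decode, execute.
public class ToyInterpreter {
    static final int PUSH = 0, ADD = 1, HALT = 2;

    // Runs a program and returns the value left on top of the operand stack.
    static int run(int[] code) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;                              // plays the role of the PC register
        while (true) {
            int op = code[pc++];                 // fetch, then point pc at the next instruction
            switch (op) {
                case PUSH -> stack.push(code[pc++]);               // operand follows the opcode
                case ADD  -> stack.push(stack.pop() + stack.pop());
                case HALT -> { return stack.peek(); }
                default   -> throw new IllegalStateException("bad opcode " + op);
            }
        }
    }

    public static void main(String[] args) {
        // push 2; push 3; add; halt
        System.out.println(run(new int[]{PUSH, 2, PUSH, 3, ADD, HALT}));   // prints 5
    }
}
```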


The specific flow of the JVM execution engine is shown in the figure (omitted here).

  • The input and output of the Java virtual machine's execution engine are uniform:
    Input: a binary stream of bytecode;
    Output: the machine instructions produced after the interpreter and compiler "translate" the bytecode;

A question to keep in mind: why is Java said to be a half-compiled, half-interpreted language? You will find the answer in the content below.

interpreter

  • Concept : When the Java virtual machine starts, the interpreter interprets the bytecode line by line according to predefined specifications, "translating" the content of each bytecode file into native machine instructions of the corresponding platform and executing them. (Bytecode instructions are interpreted line by line and executed immediately, with no compilation step.)

  • how to work

    • After one bytecode instruction is interpreted and executed, the interpreter proceeds to the next bytecode instruction to be executed, as recorded in the PC register.

    • In Hotspot VM, the interpreter is mainly composed of the Interpreter module and the Code module

      • Interpreter module: implements the core functions of the interpreter

      • Code module: manages the native machine instructions generated by Hotspot VM at runtime

Because interpreters are simple in design and implementation, interpreter-based execution has become synonymous with inefficiency. Just-in-time compilation technology appeared to solve this problem.

Just-in-time compiler (JIT)

The "compiler" of the Java language is a very vague concept without a specific context, because it may refer to

  • Front-end compiler (actually also called the front end of the compiler): the process of converting .java files into .class files

  • Back-end compiler (JIT compiler Just In Time Compiler): the process of converting bytecode into machine code

  • Static ahead of time compiler (AOT compiler Ahead Of Time Compiler): the process of directly compiling .java files into local machine code

The just-in-time compiler in the JVM execution engine is a back-end compiler, so the just-in-time compiler is what we mainly explain below.
The Hotspot virtual machine adopts an architecture in which an interpreter and a just-in-time compiler coexist; at run time it finds a balance point where the two cooperate with each other. Thanks to this architecture, Java's running performance can compete with C/C++ programs.

Here comes a question: since the JIT compiler is already built into Hotspot VM, why keep an interpreter that "drags down" the program's execution performance?

Both interpreters and compilers have their own advantages :

  • When the program needs to start and execute quickly, the interpreter steps in first [fast response]: it can run immediately, without waiting for the just-in-time compiler to finish compiling.

  • As the program runs, the compiler kicks in over time and compiles more and more code into native code. Compilation takes a certain amount of time, but once code is compiled to native code, it executes efficiently.

To sum up, the interpreter also serves as an "escape hatch" for the compiler's aggressive optimizations: the compiler may choose, based on probability, optimizations that improve running speed most of the time but cannot be guaranteed correct in every case. When an assumption behind an aggressive optimization turns out to be false (for example, the type hierarchy changes after a new class is loaded), deoptimization falls back to interpreted execution. The two therefore often work together in a complementary manner; the interaction is shown in the figure (omitted here).

The HotSpot virtual machine has two (or three) just-in-time compilers built in.

  • Client Compiler (client compiler/C1 compiler)

    • The C1 compiler performs simple, reliable optimizations on bytecode; it takes less time, so it compiles faster
  • Server Compiler (server compiler/C2 compiler)

    • C2 performs time-consuming optimization and aggressive optimization, but the optimized code execution efficiency is high.
  • Graal compiler: introduced only in JDK 10, intended to eventually replace C2; it is not explained here.

Different optimization strategies of C1 and C2 compilers

  • The C1 compiler performs method inlining, devirtualization, and redundancy elimination

    • Method inlining: compile the called method's code into the call site, eliminating stack-frame creation, parameter passing, and the jump
    • Devirtualization: inline a virtual call when it has only one implementing class
    • Redundancy elimination: fold away code that will never execute at run time
  • C2's optimizations are mainly global and are based on escape analysis. On top of escape analysis, C2 performs the following optimizations:

    • Scalar replacement: replace an aggregate object with its scalar member values
    • Stack allocation: allocate objects that do not escape on the stack instead of the heap
    • Synchronization elimination: remove synchronization actions (usually referring to synchronized) on objects that do not escape the thread
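Method inlining, for example, can be illustrated at the source level (the JIT actually inlines compiled code, not Java source, so this is only an analogy; all names below are invented):

```java
// Source-level analogy for method inlining.
public class InlineDemo {
    static int square(int x) {
        return x * x;
    }

    // Before inlining: each call to square() creates a stack frame,
    // passes a parameter, and jumps to the method body.
    static int sumOfSquares(int a, int b) {
        return square(a) + square(b);
    }

    // After inlining (what the JIT conceptually produces): the method body
    // is substituted at each call site, so no calls or extra frames remain.
    static int sumOfSquaresInlined(int a, int b) {
        return (a * a) + (b * b);
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(3, 4));          // prints 25
        System.out.println(sumOfSquaresInlined(3, 4));   // prints 25
    }
}
```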

Note: The 64-bit version of JDK only supports "-server" mode. For other versions of JDK, users can use the "-client" or "-server" parameter to force the virtual machine to run in client mode or server mode.

  • Interpreted execution (without performance monitoring) can trigger C1, which compiles bytecode into machine code with simple or profile-guided optimization; C2 compilation performs aggressive optimization based on performance monitoring information
  • However, since Java 7, once the developer explicitly specifies the "-server" option, tiered compilation is enabled by default, and C1 and C2 work together

Layered Compilation Strategy

Before the layered compilation mode appeared, the HotSpot virtual machine usually used the interpreter directly together with one of the compilers. To achieve the best balance between program startup responsiveness and running efficiency, the HotSpot virtual machine adds a layered compilation strategy to its compilation subsystem:

  • Layer 0: interpreted execution. The performance monitoring function (Profiling) is enabled by default; if it is disabled, Layer 1 compilation can be triggered;
  • Layer 1: can be called C1 compilation; compiles bytecode into native code with simple, reliable optimizations, without Profiling;
  • Layer 2: also C1 compilation, with Profiling turned on, collecting only method invocation counts and loop back edge counts;
  • Layer 3: also C1 compilation, with all Profiling collected;
  • Layer 4: can be called C2 compilation; it also compiles bytecode into native code, but enables optimizations that take longer to compile and may even apply unreliable aggressive optimizations based on the performance monitoring information.

Virtual Machine Execution Mode

By default, the Hotspot virtual machine uses an architecture in which an interpreter and a compiler coexist. Developers can also adjust it to use an interpreter or a just-in-time compiler.

  • -Xint: Execute the program completely in interpreter mode
  • -Xcomp: Execute the program completely in just-in-time compiler mode. If there is a problem with compilation, the interpreter will still intervene
  • -Xmixed: Execute the program in a mixed mode of interpreter + just-in-time compiler
C:\Users\13832>java -version
java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, mixed mode)

C:\Users\13832>java -Xint -version
java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, interpreted mode)

C:\Users\13832>java -Xcomp -version
java version "1.8.0_162"
Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.162-b12, compiled mode)

Hot code was mentioned above. What is hot code? How do we determine that code is hot? Please read on.

Hot code & detection method

The JIT compiler decides whether to compile bytecode into native machine instructions according to how frequently the bytecode is executed. Frequently executed code is called hot code. The JIT compiler performs deep optimization on this code and compiles it into native machine instructions, so that subsequent executions run the native instructions directly instead of going through the interpreter, thereby improving Java performance.
There are two main types of hot code:

  • method called multiple times
  • loop body executed multiple times

In both cases, the target of compilation is the entire method body. In the second case, because compilation happens while the method is executing, it is called on-stack replacement, or simply OSR (On Stack Replacement): the method's stack frame is still on the stack while the method is replaced.
In the description above, how many times must a method be called, or a loop body execute, to meet the standard? That depends on hotspot detection, whose purpose is to determine whether a piece of code is hot code and whether just-in-time compilation needs to be triggered.
There are two mainstream hotspot detection and judgment methods, which are:

  • Sampling-based hotspot detection: The virtual machine periodically checks the top of each thread's call stack; if a method (or several) frequently appears at the top of the stack, it is a "hot method". This approach is simple and efficient, but it is hard to confirm a method's hotness precisely.
  • Counter-based hotspot detection: The virtual machine creates a counter for each method (or even each code block) to count its executions. If the count exceeds a certain threshold, the method is considered a "hot method". [This is what the HotSpot virtual machine uses]

Using counter-based hotspot detection, Hotspot VM creates two different counters for each method: the method call counter (Invocation Counter) and the back edge counter (Back Edge Counter)

  • The method call counter counts the number of times the method is called
  • The back edge counter counts the number of times loop bodies in the method execute

1) Method call counter

  • This counter counts the number of times a method is called. Its default threshold is 1,500 in Client mode and 10,000 in Server mode; exceeding the threshold triggers JIT compilation. The threshold can be set with the virtual machine parameter -XX:CompileThreshold
  • When a method is called, the JVM first checks whether a JIT-compiled version of the method exists. If it does, the compiled native code is used; if not, the method's call counter is incremented by 1, and then the JVM checks whether the sum of the method call counter and the back edge counter exceeds the method call counter's threshold. If it does, a compilation request for the method is submitted to the compiler
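The decision logic above can be sketched in Java-flavoured pseudocode (all names are invented; HotSpot implements this inside its C++ sources, and 10,000 is the Server-mode default threshold):

```java
// Sketch of the method-call-counter check. Names are invented for
// illustration; the real logic lives in HotSpot's C++ code.
class HotMethodProfile {
    static final int COMPILE_THRESHOLD = 10_000;  // Server-mode default of -XX:CompileThreshold

    int invocationCounter;   // method call counter
    int backEdgeCounter;     // back edge counter
    boolean compiled;        // does a JIT-compiled version already exist?

    // Called on every method invocation; returns true when a compile
    // request should be submitted to the JIT compiler.
    boolean onInvocation() {
        if (compiled) {
            return false;    // run the existing native code directly
        }
        invocationCounter++;
        // the two counters are summed when checking the threshold
        return invocationCounter + backEdgeCounter > COMPILE_THRESHOLD;
    }
}
```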

The concept of heat decay: the method call counter counts not the absolute number of calls but a relative execution frequency, that is, the number of calls within a period of time. If, after a certain time limit, the method has not been called often enough to be handed to the just-in-time compiler, its call counter is halved. This process is called counter decay (Counter Decay), and the period is called the counter half-life time (Counter Half Life Time) of the method's statistics.
Heat decay is performed while the virtual machine runs garbage collection. The virtual machine parameter -XX:-UseCounterDecay turns heat decay off, so that as long as the system runs long enough, most methods in the program will eventually be compiled into native code.

  • You can use the -XX:CounterHalfLifeTime parameter to set the time of the half-life period in seconds.
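Counter decay can be sketched the same way (names invented; HotSpot performs the halving while it runs garbage collection):

```java
// Sketch of counter decay: the call count is halved once per half-life
// period, so the counter reflects recent call frequency, not a lifetime total.
class DecayingCounter {
    int count;

    void onCall() {
        count++;
    }

    // Conceptually invoked once per half-life period (-XX:CounterHalfLifeTime),
    // during garbage collection.
    void decay() {
        count /= 2;
    }
}
```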

2) Back edge counter

  • This counter counts the number of times the loop-body code in a method executes; the purpose of the back edge counter is to trigger on-stack replacement (OSR) compilation.
  • When the interpreter encounters a back edge instruction, it first checks whether a compiled version of the code segment exists. If so, it executes the compiled code; otherwise, it increments the back edge counter by 1 and then checks whether the sum of the method call counter and the back edge counter exceeds the back edge counter's threshold. If it does, an OSR compilation request is submitted and the back edge counter's value is reduced.

code demo

public class IntCompTest {

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        testPrimeNumber(1000000);
        long end = System.currentTimeMillis();
        System.out.println("Time spent: " + (end - start) + "ms");
    }

    public static void testPrimeNumber(int count) {
        for (int i = 0; i < count; i++) {
            // compute the primes up to 100
            label:
            for (int j = 2; j <= 100; j++) {
                for (int k = 2; k <= Math.sqrt(j); k++) {
                    if (j % k == 0) {
                        continue label;
                    }
                }
            }
        }
    }
}

-Xint [pure interpreter]: Time spent: 6184ms
-Xcomp [pure compiler]: Time spent: 746ms
-Xmixed [default]: Time spent: 710ms

Graal Compiler & AOT Compiler

Graal Compiler

  • After JDK10, Hotspot added a brand new just-in-time compiler: the Graal compiler, which aims to replace the C2 compiler in the future

  • Currently in the experimental stage

    • Use -XX:+UnlockExperimentalVMOptions -XX:+UseJVMCICompiler to enable it

AOT compiler

  • After JDK9.0, the AOT compiler (static ahead of time compiler Ahead Of Time Compiler) was added

  • The so-called AOT compiler is a concept opposite to the just-in-time compiler. The just-in-time compiler converts bytecode into machine code when the program is running, and the AOT compiler converts bytecode into machine code before the program runs.

  • benefit:

    • The program can be executed directly at full speed, without waiting for warm-up, reducing the slow first-run experience
  • shortcoming:

    • Breaks Java's "write once, run anywhere": the compiled output is platform-specific, so it must be compiled separately for each target system
    • Reduces the dynamism of Java's dynamic linking: all code to be loaded must be known at compile time
    • Still needs further optimization; initially it supports only the java.base module on Linux x64


Origin blog.csdn.net/u011397981/article/details/130186842