Learning JVM from scratch (6) - direct memory and execution engine

1 Introduction to direct memory

Direct memory is not part of the virtual machine runtime data area, nor is it a memory area defined in the Java Virtual Machine Specification. It is memory outside the Java heap that is requested directly from the operating system. Direct memory was introduced with NIO, which operates on native memory through DirectByteBuffer objects stored in the heap. Access to direct memory is usually faster than access to the Java heap; that is, its read and write performance is high.

Therefore, for performance reasons, read/write-intensive scenarios may consider using direct memory. Java's NIO library allows Java programs to use direct memory for data buffers.

1.1 Non-direct buffers

Using traditional IO to read and write files requires interacting with the disk and switching from user mode to kernel mode. In kernel mode, the data must be held in two copies (a kernel buffer and a user buffer), which is inefficient.
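As a minimal sketch of this traditional stream-based IO (class and file names here are illustrative, not from the original text), every byte passes through a user-space array, so the data is copied between the kernel buffer and the user buffer:

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamCopy {
    // Copies src to dst through a user-space byte array; each read/write
    // involves a copy between the kernel buffer and this array.
    public static void copy(String src, String dst) throws IOException {
        try (FileInputStream in = new FileInputStream(src);
             FileOutputStream out = new FileOutputStream(dst)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("demo-src", ".txt");
        Files.write(src, "hello".getBytes());
        Path dst = Files.createTempFile("demo-dst", ".txt");
        copy(src.toString(), dst.toString());
        System.out.println(new String(Files.readAllBytes(dst)));
    }
}
```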

1.2 Direct buffers

When using NIO, the direct buffer allocated by the operating system can be accessed by Java code directly, so only one copy of the data exists. NIO is well suited to read and write operations on large files.
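By contrast, a hedged sketch of NIO with a direct buffer (names are illustrative): ByteBuffer.allocateDirect allocates native memory that both the operating system and Java code can access, avoiding the extra user-space copy:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectBufferCopy {
    // Copies src to dst through a direct buffer allocated outside the Java heap.
    public static void copy(Path src, Path dst) throws IOException {
        try (FileChannel in = FileChannel.open(src, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(dst, StandardOpenOption.WRITE,
                     StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING)) {
            ByteBuffer buf = ByteBuffer.allocateDirect(8192); // native memory, not heap
            while (in.read(buf) != -1) {
                buf.flip();                       // switch from filling to draining
                while (buf.hasRemaining()) {
                    out.write(buf);
                }
                buf.clear();                      // ready for the next read
            }
        }
    }

    public static void main(String[] args) throws IOException {
        Path src = Files.createTempFile("nio-src", ".bin");
        Files.write(src, "hello NIO".getBytes());
        Path dst = Files.createTempFile("nio-dst", ".bin");
        copy(src, dst);
        System.out.println(new String(Files.readAllBytes(dst)));
    }
}
```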


Insufficient direct memory can also cause an OutOfMemoryError. Since direct memory lies outside the Java heap, its size is not limited by the maximum heap size specified with -Xmx; however, system memory is finite, and the sum of the Java heap and direct memory is still bounded by the maximum memory the operating system can provide.

The direct memory size can be set with -XX:MaxDirectMemorySize; if not specified, it defaults to the maximum heap size set by -Xmx.

Disadvantages of using direct memory: allocation and reclamation are costly, and it is not subject to JVM memory management.

OOM code example caused by direct memory shortage:

import java.nio.ByteBuffer;
import java.util.ArrayList;

public class MaxDirectMemorySizeTest02 {

    // 20 MB per allocation
    private static final int BUFFER = 1024 * 1024 * 20;

    public static void main(String[] args) {
        ArrayList<ByteBuffer> list = new ArrayList<>();

        int count = 0;
        try {
            while (true) {
                ByteBuffer buffer = ByteBuffer.allocateDirect(BUFFER);
                list.add(buffer);
                count++;
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        } finally {
            // prints how many 20 MB blocks were allocated before the OOM
            System.out.println(count);
        }
    }
}

At the same time, set the value of MaxDirectMemorySize to 20m: -Xmx20m -XX:MaxDirectMemorySize=20m. Using ByteBuffer.allocateDirect to request direct memory, the execution result is as follows:
(figure: execution result)

2. Execution engine


2.1 Overview of Execution Engine

The execution engine belongs to the lower layer of the JVM, and it includes the interpreter, the just-in-time (JIT) compiler, and the garbage collector.

The execution engine is one of the core components of the Java virtual machine. "Virtual machine" is a concept relative to "physical machine"; both have code execution capabilities. The difference is that the execution engine of a physical machine is built directly on the processor, cache, instruction set, and operating system, while the execution engine of a virtual machine is implemented in software. The structure of the instruction set and execution engine can therefore be designed freely, without being restricted by physical conditions, and can execute instruction set formats that the hardware does not directly support.

The main task of the JVM is to load bytecode into its internals, but bytecode cannot run directly on the operating system, because bytecode instructions are not equivalent to native machine instructions; a class file contains only bytecode instructions, symbol tables, and other auxiliary information.

So to make a Java program run, the task of the execution engine is to interpret or compile bytecode instructions into native machine instructions for the corresponding platform. In simple terms, the execution engine in the JVM acts as a translator from a high-level language to machine language.

2.2 The working process of the execution engine

  1. Which bytecode instruction the execution engine executes next depends entirely on the PC register.
  2. Whenever an instruction is executed, the PC register is updated with the address of the next instruction to execute.
  3. During method execution, the execution engine may also locate object instances stored in the Java heap through object references in the local variable table, and locate the object's type information through the metadata pointer in the object header.


From the appearance point of view, the input and output of the execution engine of all Java virtual machines are consistent: the input is the bytecode binary stream, the processing process is the equivalent process of bytecode parsing and execution, and the output is the execution result.

2.3 Java code compilation and execution process

(figure: Java code compilation and execution process)
Most of the program code needs to go through the steps in the above figure before it is converted into the target code of the physical machine or the instruction set that the virtual machine can execute.

The Java code compilation is completed by the Java source code compiler, and the flowchart is as follows:
(figure: Java source code compilation flow)
The execution of the Java bytecode is completed by the JVM execution engine, and the flowchart is as follows:
(figure: JVM execution engine flow)

Interpreter: When the Java virtual machine starts, it will interpret the bytecode line by line according to the predefined specifications, and "translate" the content in each bytecode file into the local machine instructions of the corresponding platform for execution.

JIT (Just-In-Time) compiler: a virtual machine component that compiles bytecode directly into machine language for the local platform.

Why is Java a semi-compiled and semi-interpreted language?
In the JDK 1.0 era, it was more accurate to describe the Java language as "interpreted". Later, Java developed compilers that can generate native code directly. Today, when the JVM executes Java code, it usually combines interpreted execution with compiled execution.


2.4 Machine code, instructions, assembly language

2.4.1 Machine code

Instructions expressed in binary encoding are called machine instruction codes. In the beginning, people wrote programs directly with them; this is machine language.

Although machine language can be understood and accepted by computers, it is too different from human language, so it is not easy to be understood and memorized by people, and it is easy to make mistakes when programming with it.

Once the program written in it is input into the computer, the CPU directly reads and runs it, so it has the fastest execution speed compared with programs written in other languages. Machine instructions are closely related to the CPU, so different types of CPUs correspond to different machine instructions.

2.4.2 Instructions

Since the machine code is a binary sequence composed of 0 and 1, the readability is really poor, so people invented instructions.

An instruction maps a specific sequence of 0s and 1s in the machine code to a corresponding mnemonic form (usually an English abbreviation, such as mov or inc), which is slightly more readable.

Because different hardware platforms may use different machine codes to perform the same operation, the same instruction (such as mov) may correspond to different machine codes on different platforms.

2.4.3 Instruction Set

Different hardware platforms support different instructions. The set of instructions supported by a platform is therefore called that platform's instruction set. For example:

● x86 instruction set, corresponding to the platform of x86 architecture
● ARM instruction set, corresponding to the platform of ARM architecture

2.4.4 Assembly language

Because the readability of the instructions is still too poor, people invented assembly language.

In assembly language, mnemonics replace the opcodes of machine instructions, and address symbols or labels replace the addresses of instructions and operands. On different hardware platforms, assembly language corresponds to different machine-language instruction sets and is converted into machine instructions through the assembly process.

Since computers understand only machine instruction codes, programs written in assembly language must also be translated into machine code before the computer can recognize and execute them.

2.4.5 High-level language

To make programming easier for computer users, various high-level computer languages appeared later. High-level languages are closer to human language than machine language and assembly language.

When a computer executes a program written in a high-level language, it still needs to interpret and compile the program into machine instruction codes. The program that completes this process is called an interpreter or a compiler.
High-level languages are not always translated directly into machine instructions; some are translated into assembly code first, such as the C and C++ mentioned below.

The compilation of a C or C++ source program can be divided into two stages: compilation and assembly.
Compilation: read the source program (a character stream), analyze it lexically and syntactically, and convert the high-level language statements into functionally equivalent assembly code.

Assembly: translate the assembly code into target machine instructions.

2.4.6 Bytecode

Bytecode is a binary code (file) in an intermediate state (intermediate code), which is more abstract than machine code and needs to be translated by an interpreter to become machine code

Bytecode is designed for a specific software runtime environment and is independent of the hardware environment.

Bytecode is implemented through compilers and virtual machines: the compiler compiles source code into bytecode, and the virtual machine on a specific platform translates the bytecode into instructions that can be executed directly. The typical application of bytecode is Java bytecode.
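To see Java bytecode concretely, compile a trivial class and disassemble it with the JDK's `javap -c` tool; the sketch below (class name is illustrative) shows the instructions produced for a simple static method:

```java
public class Add {
    public static int add(int a, int b) {
        return a + b;
    }
}
// Compile and disassemble:
//   javac Add.java
//   javap -c Add
// For add(int, int), javap shows bytecode along these lines:
//   iload_0   // push the first int argument
//   iload_1   // push the second int argument
//   iadd      // add the two ints on the operand stack
//   ireturn   // return the int result
```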


2.5 Interpreter

The original intention of the JVM designers was simply to satisfy the cross-platform requirement of Java programs, so they avoided static compilation that directly generates native machine instructions. This led to the idea of an interpreter that executes the program by interpreting bytecode line by line at runtime.

2.5.1 Interpreter working mechanism

The role of the interpreter, in the true sense, is that of a runtime "translator": it "translates" the contents of the bytecode file into native machine instructions for the corresponding platform and executes them.

After a bytecode instruction is interpreted and executed, the interpretation operation is performed according to the next bytecode instruction that needs to be executed recorded in the PC register.

2.5.2 Classification of Interpreters

In the development history of Java, there are two sets of interpretation executors, namely the ancient bytecode interpreter and the now commonly used template interpreter.

  • The bytecode interpreter simulates the execution of the bytecode through pure software code during execution, which is very inefficient.
  • The template interpreter associates each bytecode with a template function, and the template function directly generates the machine code when this bytecode is executed, thereby greatly improving the performance of the interpreter.

In HotSpot VM, the interpreter is mainly composed of Interpreter module and Code module.

  • Interpreter module: implements the core functions of the interpreter
  • Code module: used to manage the local machine instructions generated by the HotSpot VM at runtime

2.5.3 Status Quo

Because interpreters are simple to design and implement, many high-level languages besides Java are also executed via interpreters, such as Python, Perl, and Ruby. But today, interpreter-based execution has become synonymous with inefficiency and is often ridiculed by some C/C++ programmers.

To solve this problem, the JVM platform supports a technology called just-in-time compilation. The purpose of just-in-time compilation is to avoid interpreting a function line by line: instead, the entire method body is compiled into machine code, and each subsequent call executes only the compiled machine code. This can greatly improve execution efficiency.

But in any case, the execution mode based on the interpreter still made an indelible contribution to the development of the intermediate language.

2.6 JIT Compiler

2.6.1 Execution classification of Java code

  • The first is interpreted execution: source code is compiled into a bytecode file, and at runtime the interpreter converts the bytecode into machine code for execution.
  • The second is compiled execution (direct compilation into machine code; note that machine code compiled for different machines differs, whereas bytecode is cross-platform). To improve execution efficiency, modern virtual machines use just-in-time compilation (JIT, Just In Time) to compile methods into machine code before executing them.

HotSpot VM is one of the masterpieces among high-performance virtual machines on the market today. It uses an architecture in which the interpreter and the just-in-time compiler coexist. While the Java virtual machine runs, the interpreter and the just-in-time compiler cooperate, each complementing the other, to choose the most appropriate way to balance the time spent compiling native code against the time spent interpreting code directly.

Today, the running performance of Java programs has been reborn, and has reached the point where it can compete with C/C++ programs.

Some developers may wonder: since HotSpot VM has a built-in JIT compiler, why does it still need an interpreter that "drags down" execution performance? For example, JRockit VM contains no interpreter; all bytecode is compiled and executed by the just-in-time compiler.

First, be clear: when the program starts, the interpreter takes effect immediately, saving compilation time and executing right away. For the compiler to be useful, it needs a certain amount of time to compile code into native code; but once compiled, native code executes efficiently.

So: Although the execution performance of the program in the JRockit VM will be very efficient, the program will inevitably take longer to compile at startup. For server-side applications, startup time is not the focus, but for those application scenarios that are concerned about startup time, it may be necessary to adopt an architecture where an interpreter and a just-in-time compiler coexist in exchange for a balance point. In this mode, when the Java virtual machine starts, the interpreter can take effect first, without having to wait for the JIT compiler to complete all compilations before executing, which can save a lot of unnecessary compilation time. Over time, the compiler comes into play, compiling more and more code into native code for higher execution efficiency.

At the same time, interpreted execution is used as an "escape door" for the compiler when the aggressive optimization of the compiler is not established.

2.6.2 HotSpot JVM Execution Mode

When the virtual machine starts, the interpreter can take effect first, without waiting for the just-in-time compiler to finish compiling, which saves a lot of unnecessary compilation time. As the program runs, the just-in-time compiler gradually comes into play and, guided by hotspot detection, compiles valuable bytecode into native machine instructions in exchange for higher execution efficiency.

Pay attention to the subtle dialectical relationship between interpreted and compiled execution in a production environment: a machine in the warm (JIT-compiled) state can withstand more load than one in the cold state. If traffic is switched over at warm-state levels, cold-state servers may die because they cannot carry it.

During a production release, machines are released in batches, with each batch accounting for at most 1/8 of the cluster. There was once such a failure case: a programmer doing a batched release on the publishing platform mistakenly entered the total number of machines as only two batches. In the warm state, half of the machines could barely have carried the traffic; but the newly started JVMs were still interpreting, since hot-code profiling and JIT dynamic compilation had not yet happened. So as soon as the first half of the release succeeded, all of the newly released servers crashed under the load. This failure demonstrates the existence of JIT. — Ali team


The "compilation period" of the Java language is actually an "uncertain" process, because it may refer to the process by which a front-end compiler (more accurately, "the front end of the compiler") converts .java files into .class files;

it may also refer to the process by which the virtual machine's back-end runtime compiler (the JIT compiler, Just-In-Time Compiler) converts bytecode into machine code;

it may also refer to the process by which a static ahead-of-time compiler (AOT compiler, Ahead-Of-Time Compiler) compiles .java files directly into native machine code.

  • Front-end compiler: Sun's Javac, the incremental compiler (ECJ) in Eclipse JDT.
  • JIT compiler: C1 and C2 compilers of HotSpot VM.
  • AOT compilers: GNU Compiler for Java (GCJ), Excelsior JET.

2.6.3 Hot code and detection technology

Whether the JIT compiler needs to compile bytecode directly into native machine instructions for the platform depends on how frequently the code is called and executed. Bytecode that needs to be compiled into native code is called "hot code"; at runtime the JIT compiler performs deep optimization on frequently called hot code and compiles it directly into native machine instructions for the corresponding platform, improving the execution performance of the Java program.

A method that is called many times, or a loop body that iterates many times inside a method body, can both be called "hot code", and both can be compiled by the JIT compiler into native machine instructions. When compilation is triggered by a hot loop while the method is still executing, the compilation occurs during method execution and is therefore called on-stack replacement, or OSR (On Stack Replacement), compilation.
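A hypothetical sketch of hot code (class name is illustrative): the loop below runs millions of iterations inside a method that is called only a few times, making it an OSR candidate rather than a method-counter candidate. Running with the standard HotSpot flag -XX:+PrintCompilation prints compilation events; entries marked with % indicate OSR compilations:

```java
public class HotLoop {
    // A long-running loop: a classic OSR candidate, since the enclosing
    // method is called once but its loop body executes millions of times.
    public static long sum(int n) {
        long s = 0;
        for (int i = 0; i < n; i++) {
            s += i;
        }
        return s;
    }

    public static void main(String[] args) {
        // Run with: java -XX:+PrintCompilation HotLoop
        // Lines containing '%' mark on-stack replacement compilations.
        System.out.println(sum(10_000_000));
    }
}
```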

How many times must a method be called, or how many times must a loop body execute, to meet this standard? A clear threshold is required before the JIT compiler will compile "hot code" into native machine instructions. This relies mainly on the hotspot detection function.

The hotspot detection method currently used by HotSpot VM is counter-based hotspot detection.

Using counter-based hotspot detection, HotSpot VM will create two different types of counters for each method, namely the method invocation counter (Invocation Counter) and the back edge counter (Back Edge Counter).

  • The method call counter is used to count the number of method calls
  • The back edge counter is used to count the number of times loop bodies are executed

Method call counter
This counter is used to count the number of times a method is called. Its default threshold is 1500 times in Client mode and 10000 times in Server mode. Exceeding this threshold triggers JIT compilation.

This threshold can be set manually via the VM parameter -XX:CompileThreshold.

When a method is called, it will first check whether there is a JIT-compiled version of the method, and if it exists, the compiled native code will be used first for execution. If there is no compiled version, add 1 to the call counter value of this method, and then determine whether the sum of the method call counter and the return edge counter exceeds the threshold value of the method call counter. If the threshold has been exceeded, a code compilation request for the method will be submitted to the just-in-time compiler.
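The decision flow just described can be modeled in simplified Java (the class, field names, and the synchronous "compilation" here are illustrative only; real HotSpot submits compile requests asynchronously in the background):

```java
public class InvocationCounterSketch {
    static final int COMPILE_THRESHOLD = 10_000; // Server mode default

    static int invocationCounter = 0;
    static int backEdgeCounter = 0;
    static boolean compiledVersionExists = false;

    // Simplified model of what happens on each call of one method.
    static String onMethodCall() {
        if (compiledVersionExists) {
            return "execute compiled native code";
        }
        invocationCounter++;
        if (invocationCounter + backEdgeCounter > COMPILE_THRESHOLD) {
            // Submit a compilation request to the JIT compiler; in HotSpot the
            // method keeps being interpreted while compilation proceeds.
            compiledVersionExists = true;
            return "submit compile request, interpret this call";
        }
        return "interpret";
    }
}
```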


Hotspot decay

If no settings are made, the method invocation counter counts not the absolute number of times a method is called, but a relative execution frequency, that is, the number of calls within a period of time. When a certain time limit is exceeded and the method's call count is still not enough to submit it to the just-in-time compiler, the method's invocation counter is halved. This process is called counter decay (Counter Decay), and the period of time is called the counter half-life (Counter Half Life Time).

Counter decay is performed incidentally while the virtual machine performs garbage collection. You can use the VM parameter -XX:-UseCounterDecay to turn off decay and let the method counter count the absolute number of calls; then, as long as the system runs long enough, most methods will be compiled into native code.

In addition, you can use the -XX:CounterHalfLifeTime parameter to set the half-life period, in seconds.

Back edge counter

Its function is to count the number of times the code in a method's loop bodies is executed. An instruction that jumps backward in the bytecode's control flow is called a "back edge" (Back Edge). Clearly, the purpose of back edge counting is to trigger OSR compilation.


2.6.4 HotSpotVM can set the program execution method

By default, the HotSpot VM adopts an architecture in which an interpreter and a just-in-time compiler coexist. Of course, developers can explicitly specify for the Java virtual machine through commands according to specific application scenarios whether to use an interpreter for execution at runtime, or Executed entirely with a just-in-time compiler. As follows:

  • -Xint: Execute the program completely in interpreter mode;
  • -Xcomp: Execute the program completely in just-in-time compiler mode. If there is a problem with just-in-time compilation, the interpreter will step in to execute
  • -Xmixed: Use the mixed mode of interpreter + just-in-time compiler to jointly execute the program.
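To observe the three modes, one can time the same workload under each (a rough illustration with an invented class name, not a rigorous benchmark; timings vary by machine):

```java
public class ModeDemo {
    static long work() {
        long s = 0;
        for (int i = 0; i < 50_000_000; i++) {
            s += i % 7;
        }
        return s;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long result = work();
        long ms = (System.nanoTime() - start) / 1_000_000;
        System.out.println("result=" + result + " time=" + ms + "ms");
        // Try: java -Xint ModeDemo    (interpreter only: slowest steady state)
        //      java -Xcomp ModeDemo   (compile everything up front: slow start)
        //      java -Xmixed ModeDemo  (default mixed mode)
    }
}
```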

2.6.5 JIT classification in HotSpotVM

There are two JIT compilers embedded in HotSpot VM, the Client Compiler and the Server Compiler, but in most cases we simply call them the C1 compiler and the C2 compiler. Developers can explicitly specify which just-in-time compiler the Java virtual machine uses at runtime with the following commands:

  • -client: Specifies that the Java virtual machine runs in Client mode and uses the C1 compiler; the C1 compiler will perform simple and reliable optimization of the bytecode, which takes less time to achieve faster compilation speed.
  • -server: Specifies that the Java virtual machine runs in server mode and uses the C2 compiler. C2 performs time-consuming optimization and aggressive optimization, but the optimized code execution efficiency is higher.

Tiered Compilation strategy: interpreted execution of the program (with profiling disabled) can trigger C1 compilation, which compiles bytecode into machine code, performs simple optimizations, and may also insert profiling; C2 compilation then performs aggressive optimizations based on the profiling information.

However, since Java 7, once the developer explicitly specifies "-server", the tiered compilation strategy is enabled by default, and the C1 and C2 compilers cooperate to perform compilation tasks.

Different optimization strategies of C1 and C2 compilers

Different compilers have different optimization strategies. The C1 compiler mainly performs method inlining, devirtualization, and redundancy elimination.

  • Method inlining: Compile the referenced function code to the reference point, which can reduce the generation of stack frames, parameter passing and jumping process
  • Devirtualization: Inlining the only implementing class
  • Redundancy Elimination: Fold some code that will not be executed during runtime
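A hypothetical sketch of the code shapes these C1 optimizations target (all names are illustrative): a tiny method body is an inlining candidate, and an interface call with only one loaded implementation can be devirtualized:

```java
public class C1Optimizations {
    interface Shape {
        double area();
    }

    // The only implementation the JVM has loaded: calls through Shape.area()
    // can be devirtualized (bound directly to Square.area) and then inlined.
    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        public double area() { return side * side; } // tiny body: inlining candidate
    }

    static double totalArea(Shape[] shapes) {
        double total = 0;
        for (Shape s : shapes) {
            total += s.area(); // after devirtualization + inlining: no call overhead
        }
        return total;
    }

    public static void main(String[] args) {
        Shape[] shapes = { new Square(2), new Square(3) };
        System.out.println(totalArea(shapes)); // 4.0 + 9.0 = 13.0
    }
}
```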

The optimization of C2 is mainly at the global level, and escape analysis (as mentioned earlier, it is not mature) is the basis of optimization. Based on escape analysis, there are the following optimizations in C2:

  • Scalar replacement: replaces an aggregate object with the scalar values of its fields
  • Stack allocation: for objects that do not escape, allocates the object on the stack instead of the heap
  • Synchronization elimination: removes unnecessary synchronization operations, usually synchronized
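A hedged example of what escape analysis enables (class names are illustrative): the Point below never escapes its method, so C2 may scalar-replace it and elide the heap allocation. Comparing runs with -XX:+DoEscapeAnalysis (the default) and -XX:-DoEscapeAnalysis can show the difference in allocation pressure:

```java
public class EscapeDemo {
    static final class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // The Point allocated here never escapes: it is not returned, stored in
    // a field, or passed elsewhere, so C2 can scalar-replace it, keeping x
    // and y in registers with no heap allocation at all.
    static long distanceSquared(int x, int y) {
        Point p = new Point(x, y);
        return (long) p.x * p.x + (long) p.y * p.y;
    }

    public static void main(String[] args) {
        long total = 0;
        // Millions of "allocations" that may never touch the heap once this
        // loop is JIT-compiled with escape analysis enabled.
        for (int i = 0; i < 1_000_000; i++) {
            total += distanceSquared(i % 100, i % 50);
        }
        System.out.println(total);
    }
}
```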

Summary

Generally speaking, the performance of machine code compiled by JIT is higher than that of interpreter. The startup time of the C2 compiler is slower than that of the C1 compiler. After the system runs stably, the execution speed of the C2 compiler is much faster than that of the C1 compiler.

● Since JDK 10, HotSpot has added a new just-in-time compiler: the Graal compiler
● In just a few years its compilation quality has caught up with the C2 compiler, and its future looks promising
● It currently carries the experimental status label and must be enabled with the switch parameters -XX:+UnlockExperimentalVMOptions -XX:+UseJVMCICompiler

JDK 9 introduced the AOT compiler (static ahead-of-time compiler, Ahead-Of-Time Compiler)

Java 9 introduced the experimental AOT compilation tool jaotc. It uses the Graal compiler to convert the input Java class file into machine code and store it in the generated dynamic shared library.

The so-called AOT compilation is a concept opposite to just-in-time compilation. We know that just-in-time compilation refers to the process of converting bytecode into machine code that can be directly run on hardware during the running of the program, and deploying it to the hosting environment. AOT compilation refers to the process of converting bytecode into machine code before the program runs.

Biggest advantage: the Java virtual machine loads libraries that have already been compiled into binary form and can execute them directly, with no need to wait for the just-in-time compiler to warm up, reducing the "slow first run" experience of Java applications.

Disadvantages:

● It breaks Java's "compile once, run anywhere" concept: a separate release package must be compiled for each combination of hardware and OS.
● It reduces the dynamism of the Java linking process: the loaded code must be fully known to the compiler.
● It still needs further optimization; initially only the Linux x64 java.base module is supported.

The notes are summarized from the video tutorial: Shang Silicon Valley Song Hongkang JVM full set of tutorials (detailed java virtual machine)
reference:
1. "In-depth understanding of Java virtual machine" 2nd edition


Origin blog.csdn.net/huangjhai/article/details/120813328