Detailed explanation of JVM JIT optimization

1. The background of JIT

We know that there are two ways to convert a high-level language into machine language that a computer can recognize: compilation and interpretation. In Java, source code must first be compiled into bytecode before it can be executed, but bytecode itself cannot be executed directly on the machine.

Therefore, the JVM has a built-in interpreter that interprets bytecode at runtime, translating it into machine code and then executing it.

The interpreter translates and executes at the same time, so its execution efficiency is low. To solve this inefficiency, HotSpot introduced JIT (Just-In-Time) compilation.

With JIT technology, the JVM still uses the interpreter by default. However, when the JVM finds that a certain method or code block is executed frequently at runtime, it marks it as "hot code". The JIT compiler then compiles the hot code into native machine code for the local platform, optimizes it, and caches the compiled machine code for subsequent use.

(Figure: JIT optimization workflow in the JVM)

2. The HotSpot virtual machine's built-in JIT compilers

The HotSpot virtual machine ships with two JIT compilers: the Client Compiler and the Server Compiler, intended for client-side and server-side applications respectively. In current mainstream HotSpot virtual machines, the interpreter works together with one of these compilers by default.

When the JVM executes code, it does not start compiling it immediately. First, if a piece of code will only be executed once, compiling it is wasteful compared with simply interpreting it, because interpreting bytecode is much faster than first compiling it into machine code and then executing it. Second, the JVM applies optimizations while compiling: the more often a method or loop is executed, the better the JVM understands the code's structure, and the more effective the optimizations it can apply during compilation.
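
If you want to watch this happen, HotSpot can log JIT compilation events as they occur via the standard PrintCompilation flag (the output format varies by JVM version; MyApp below stands in for your own main class):

java -XX:+PrintCompilation MyApp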

1. Client Compiler

The Client Compiler (also known as the C1 compiler, or Client JIT) mainly optimizes for startup speed and memory footprint. It starts compiling early in the program's life to quickly generate executable code, but applies fewer performance optimizations.

2. Server Compiler

The Server Compiler (also known as the C2 compiler, or Server JIT) focuses on deeper optimization of code at runtime to improve execution efficiency. It generates high-performance machine code through dynamic analysis and optimization while the program is running.

3. Checking the local compiler mode

If you want to check which mode the JIT uses in the JDK installed on your machine, run the "java -version" command. It prints the JDK version information, which also includes the JIT compiler mode.

java -version

(Figure: output of java -version showing the compiler mode)
The figure shows JDK 1.8 installed on the local machine, with the JIT compiler running as the Server Compiler. Note, however, that whether the Client Compiler or the Server Compiler is used, the interpreter and the compiler work together in mixed mode, which is what the figure reports as "mixed mode".
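
Since the screenshot is not reproduced here, a representative java -version output for a JDK 1.8 installation looks like the following (the exact build numbers are illustrative):

java version "1.8.0_281"
Java(TM) SE Runtime Environment (build 1.8.0_281-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.281-b09, mixed mode)

"Server VM" indicates the Server Compiler, and "mixed mode" indicates that the interpreter and the JIT compiler are used together.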

3. Common hotspot detection techniques

To trigger JIT compilation, hot code must first be identified. Hotspot detection is used for this purpose, and there are two common approaches.

1. Counter-based hotspot detection

The first is counter-based hotspot detection, which counts the number of times each method is invoked and marks a method as hot code once the count reaches a certain threshold. This approach is simple and direct, and works well for straightforward hotspot scenarios.

2. Sampling-based hotspot detection

The second is sample-based hotspot detection: the virtual machine periodically checks the top of each thread's stack, and if a method frequently appears at the top of the stack, it is considered hot. The advantage of this approach is that it is simple to implement, but it cannot precisely measure how hot a method is, and it is easily disturbed by thread blocking and other factors, which reduces the accuracy of the detection.

The HotSpot virtual machine uses counter-based hotspot detection, so it prepares two counters for each method: a method call counter and a back edge counter.

2.1 Method call counter

The method call counter, as the name suggests, records the number of times a method is called. When the count reaches a certain threshold, the method is marked as hot code.
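
For reference, the invocation threshold is governed by HotSpot's CompileThreshold flag; note that when tiered compilation is enabled (the default since JDK 8), the Tier*CompileThreshold family of flags applies instead. On a Unix-like shell, the current defaults can be inspected like this:

java -XX:+PrintFlagsFinal -version | grep CompileThreshold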

2.2 Back edge counter

The back edge counter records the number of times the loop structures in a method (such as for or while loops) are executed; a jump back to the loop header in the bytecode is called a "back edge". It counts loop iterations and uses that count to decide whether the loop is hot code.
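
A loop like the one below can cross the back edge threshold while its method is still running, in which case HotSpot compiles the loop and switches to the compiled code mid-execution, a mechanism known as on-stack replacement (OSR). This is a minimal sketch (HotLoopDemo is a hypothetical class name):

public class HotLoopDemo {

    public static void main(String[] args) {
        System.out.println(hotLoop());
    }

    static long hotLoop() {
        long sum = 0;
        // Each jump back to the loop header increments the back edge counter;
        // once the threshold is crossed, the loop can be compiled via OSR.
        for (int i = 0; i < 1_000_000; i++) {
            sum += i;
        }
        return sum;
    }
}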

The purpose of these two counters is to help the HotSpot virtual machine identify hot code and trigger JIT compilation for it. The method call counter identifies frequently called methods, and the back edge counter identifies loops that execute many iterations. Identifying this hot code is what allows the program's execution efficiency to be improved.

In summary, the HotSpot virtual machine uses the method call counter and the back edge counter as its counter-based hotspot detection mechanism to identify hot code and optimize it, improving the performance of Java applications.

4. Common JIT optimization techniques

1. Common subexpression elimination

Common subexpression elimination is an optimization technique of the JVM's JIT compiler used to reduce repeated computation and improve execution efficiency.

A common subexpression is an expression that is computed multiple times in a program. Common subexpression elimination merges the repeated computations into a single one, removing unnecessary computational overhead.

The following simple example demonstrates common subexpression elimination:

public class CommonSubexpressionEliminationDemo {

    public static void main(String[] args) {
        int a = 5;
        int b = 3;
        int c = a * b + 2; // common subexpression: a * b
        int d = a * b + 2; // common subexpression: a * b

        System.out.println(c);
        System.out.println(d);
    }
}

In the code above, variables c and d are computed from the same expression, a * b + 2. With common subexpression elimination, the JIT compiler merges the repeated computation into a single calculation.
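
Conceptually, the optimized code behaves as if it had been written as follows (a sketch of the compiler's internal transformation, not literal compiler output):

int tmp = a * b + 2; // the shared computation is performed only once
int c = tmp;
int d = tmp;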

Summary:

  • Common subexpression elimination reduces repeated computation and improves execution efficiency.
  • The JIT compiler identifies repeated expressions and optimizes them into a single computation.
  • The benefit is largest when the same expression is evaluated repeatedly, for example inside a loop.

2. Method inlining

Method inlining is an optimization technique of the JVM's JIT compiler used to reduce the overhead of method calls and improve execution efficiency.

Method inlining means inserting a method's code directly at the call site instead of performing an actual method call. This removes the overhead of the call itself, including stack frame creation and destruction, parameter passing, and related operations.

The following simple example demonstrates method inlining:

public class MethodInliningDemo {

    public static void main(String[] args) {
        int a = 5;
        int b = 3;
        int c = add(a, b); // method call
        int d = a + b;     // manually "inlined" equivalent

        System.out.println(c);
        System.out.println(d);
    }

    public static int add(int a, int b) {
        return a + b;
    }
}

In the code above, variable c is computed through a method call, while variable d shows the equivalent computation with the method body written directly at the call site; this is the transformation the JIT compiler performs automatically for hot call sites.

Summary:

  • Method inlining reduces the overhead of method calls and improves execution efficiency.
  • The JIT compiler identifies methods suitable for inlining and inserts their bodies directly at the call site.
  • The benefit is largest for small, frequently called methods.
  • Note that excessive inlining can cause code bloat and increase compilation time and memory consumption, so there is a trade-off between code size and performance.
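
If you want to observe HotSpot's inlining decisions for a program such as the demo above, the PrintInlining diagnostic flag can be used (diagnostic flags must be unlocked first, and the output format varies by JVM version):

java -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining MethodInliningDemo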

3. Escape analysis

Escape analysis is an optimization technique of the JVM's JIT compiler. It analyzes the scope of an object to determine whether the object escapes the method in which it is created, and uses that information to optimize the object's memory allocation.

The goal of escape analysis is to find objects that do not escape their method and allocate them on the stack instead of the heap, reducing garbage collection overhead.

The following simple example illustrates an object that escapes:

public class EscapeAnalysisDemo {

    public static void main(String[] args) {
        User user = createUser("Alice"); // the object has escaped into the caller

        System.out.println(user.getName());
    }

    public static User createUser(String name) {
        return new User(name); // the object escapes: it is returned to the caller
    }

    static class User {

        private String name;

        public User(String name) {
            this.name = name;
        }

        public String getName() {
            return name;
        }
    }
}

In the code above, the User object created in createUser escapes the method's scope: it is returned to the caller and referenced externally, so it is not eligible for stack allocation.
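
For contrast, here is a sketch of a method that could be added to EscapeAnalysisDemo in which the object does not escape; assuming the JIT can prove the reference never leaves the method, the allocation becomes a candidate for the optimizations described in the following sections:

public static int localNameLength() {
    User user = new User("Bob"); // the reference never leaves this method: no escape
    return user.getName().length(); // candidate for stack allocation / scalar replacement
}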

Summary:

  • Escape analysis is an optimization technique of the JVM's JIT compiler that analyzes the scope of an object to determine whether it escapes the method in which it is created.

  • The goal of escape analysis is to find objects that do not escape their method and allocate them on the stack instead of the heap, reducing garbage collection overhead.

  • Escape analysis reduces heap allocation and garbage collection overhead and improves execution efficiency.

  • Note that escape analysis is not an unconditionally effective optimization; it only helps in specific scenarios. For most applications, heap allocation and garbage collection are not the performance bottleneck, so its benefit is limited.

  • Escape analysis is controlled via JVM parameters. In JDK 7 and later versions, it is enabled by default.

The following are some JVM parameters related to escape analysis:

  • -XX:+DoEscapeAnalysis: Enable escape analysis. It is enabled by default.
  • -XX:-DoEscapeAnalysis: Disable escape analysis.
  • -XX:+PrintEscapeAnalysis: Print escape analysis related information.
  • -XX:+EliminateLocks: Eliminate unnecessary locks through escape analysis.
  • Note that the effect of escape analysis depends on the specific JVM implementation: different JVMs support and optimize it to different degrees, so some scenarios may require adjustment and tuning based on actual measurements.

3.1 Scalar replacement via escape analysis

Scalar replacement is an optimization based on escape analysis. It decomposes an object into its independent scalar components (single primitive values or object references) and allocates those scalars on the stack or in registers, avoiding both the creation of the object and the memory accesses to it.

Here is a simple example that demonstrates the effect of scalar replacement:

public class ScalarReplacementDemo {

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();

        for (int i = 0; i < 10000000; i++) {
            Point point = new Point(i, i);  // create a Point object
            int sum = point.x + point.y;    // compute using the Point's fields
        }

        long endTime = System.currentTimeMillis();
        System.out.println("Time taken: " + (endTime - startTime) + "ms");
    }

    static class Point {

        int x;
        int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }
}

In the code above, we create 10,000,000 Point objects in a loop and add each object's fields. If escape analysis is enabled and scalar replacement takes effect, the JVM replaces the Point object's fields x and y with two independent local variables allocated on the stack or in registers, avoiding the creation and access of Point objects altogether.
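
One way to gauge the effect is to run the demo with scalar replacement switched on and off. In HotSpot, scalar replacement is controlled by the EliminateAllocations flag (enabled by default when escape analysis is on; flag availability can vary by JVM version):

java -XX:+DoEscapeAnalysis -XX:+EliminateAllocations ScalarReplacementDemo
java -XX:+DoEscapeAnalysis -XX:-EliminateAllocations ScalarReplacementDemo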

Summary:

  • Scalar replacement is an optimization based on escape analysis that decomposes an object into independent scalars allocated on the stack or in registers.
  • Scalar replacement improves performance by avoiding object creation and access operations.
  • To benefit from scalar replacement, make sure escape analysis is enabled; the JVM then applies the optimization automatically at runtime.
  • When writing code, you can help the JVM apply scalar replacement through appropriate design, such as using immutable objects and keeping objects local to a method.

3.2 Stack allocation via escape analysis

Stack allocation is another optimization based on escape analysis: it allocates certain objects on the stack instead of the heap. Stack allocation reduces the cost of allocating and reclaiming objects on the heap and improves program performance.

Here is a simple example that demonstrates the effect of stack allocation:

public class StackAllocationDemo {

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();

        for (int i = 0; i < 10000000; i++) {
            Point point = createPoint(i, i); // create a Point object and return its reference
            int sum = point.x + point.y;     // compute using the Point's fields
        }

        long endTime = System.currentTimeMillis();
        System.out.println("Time taken: " + (endTime - startTime) + "ms");
    }

    static Point createPoint(int x, int y) {
        return new Point(x, y);
    }

    static class Point {

        int x;
        int y;

        public Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }
}

In the code above, we create 10,000,000 Point objects in a loop and add each object's fields. Although createPoint returns the object, the JIT compiler can inline createPoint into the loop, after which the object no longer escapes. If escape analysis is enabled and stack allocation takes effect, the JVM allocates the Point objects on the stack instead of the heap, reducing the cost of heap allocation and reclamation.
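
As with scalar replacement, the timings can be compared with escape analysis enabled (the default) and disabled:

java -XX:+DoEscapeAnalysis StackAllocationDemo
java -XX:-DoEscapeAnalysis StackAllocationDemo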

Summary:

  • Stack allocation is an optimization based on escape analysis that allocates certain objects on the stack instead of the heap.
  • Stack allocation reduces the cost of allocating and reclaiming objects on the heap and improves program performance.
  • To benefit from stack allocation, make sure escape analysis is enabled; the JVM then applies the optimization automatically at runtime.
  • When writing code, you can help the JVM by limiting an object's scope to a single method and preferring local variables.

3.3 Synchronization elimination via escape analysis

Synchronization elimination (also called lock elision) is another optimization based on escape analysis. It analyzes the synchronization operations in the code and determines whether they can be safely removed, improving program performance.

Here is a simple example that demonstrates the effect of synchronization elimination:

public class SynchronizationEliminationDemo {

    public static void main(String[] args) {
        long startTime = System.currentTimeMillis();

        for (int i = 0; i < 10000000; i++) {
            synchronizedMethod();
        }

        long endTime = System.currentTimeMillis();
        System.out.println("Time taken: " + (endTime - startTime) + "ms");
    }

    static void synchronizedMethod() {
        Object lock = new Object(); // the lock object never escapes this method
        synchronized (lock) {
            // code inside the synchronized block
        }
    }
}

In the code above, we call synchronizedMethod 10,000,000 times in a loop; the method synchronizes on a lock object that is created locally and never escapes. If escape analysis is enabled and synchronization elimination takes effect, the JVM can determine that the lock object can never be contended by another thread, so the synchronization can be removed, improving program performance.
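
In HotSpot this optimization is controlled by the EliminateLocks flag (enabled by default; availability can vary by JVM version), so its effect can be compared directly:

java -XX:+EliminateLocks SynchronizationEliminationDemo
java -XX:-EliminateLocks SynchronizationEliminationDemo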

Summary:

  • Synchronization elimination is an optimization based on escape analysis: by analyzing the synchronization operations in the code, the JVM determines whether they can be removed to improve performance.
  • The prerequisite is that escape analysis is enabled and the JVM can prove the lock object does not escape to other threads.
  • Synchronization elimination reduces synchronization overhead between threads and improves the program's concurrency performance.
  • When writing code, you can help the JVM by avoiding unnecessary synchronization and keeping lock objects' scope as narrow as possible.

5. Problems that may be caused by JIT optimization

Once we understand how JIT compilation works, it becomes clear that JIT optimization happens at runtime: it cannot be performed the moment a Java process starts, because a certain amount of execution time is needed to determine which code is hot.

Therefore, before JIT optimization kicks in, every request must be interpreted, which is relatively slow. The problem is more pronounced for applications with high request volume: during startup, a flood of incoming requests keeps the interpreter working at full capacity.

If the interpreter consumes a lot of CPU, it indirectly causes metrics such as CPU usage and load to spike, degrading application performance. This is why freshly restarted applications often produce a burst of timeouts during a release.

As requests keep coming in, JIT optimization is eventually triggered, and subsequent hot requests no longer need to be interpreted; they directly run the machine code cached by the JIT.

✨There are two main solutions:✨

1. Improve the efficiency of JIT optimization

One approach is to borrow from Alibaba's JDK Dragonwell, which provides proprietary features on top of OpenJDK, including the JwarmUp technology. JwarmUp records the compilation information from a Java application's previous run into a file, then reads that file on the next startup to complete class loading, initialization, and method compilation in advance, skipping the interpretation stage and executing compiled machine code directly.

2. Reduce instantaneous request volume

When the application has just started, ramp traffic up gradually by adjusting the load balancer, so that the application triggers JIT optimization under light traffic; once optimization is complete, increase the traffic step by step.

This approach is similar to cache warm-up: instead of routing a large amount of traffic to a freshly started application, first allocate it a small portion of the traffic and let that portion trigger JIT optimization; after the optimization is complete, gradually increase the traffic.


Origin: blog.csdn.net/qq_39939541/article/details/131778650