Understanding the JVM in depth: back-end compilation and optimization

This article is a set of reading notes.

1. Just-in-time compiler

Hot spot code
When a method or code block is executed particularly frequently, it is called hot spot code. A back-end compiler that compiles hot spot code into native code at run time is called a just-in-time (JIT) compiler.

Several questions are raised at the outset:

1. Why does the HotSpot virtual machine use an architecture in which an interpreter and a just-in-time compiler coexist?

  • When the program needs to start and run quickly, the interpreter can take over first, skipping compilation so that execution begins immediately.
  • As the program keeps running, the compiler gradually takes effect, compiling more and more code into native code, which removes the interpreter's overhead and yields higher execution efficiency. Conversely, when memory is tightly constrained in the runtime environment, interpreted execution can be used to save memory.
  • The interpreter and the compiler therefore complement each other and usually work together.

2. Why does the HotSpot virtual machine implement two (or three) different just-in-time compilers?
Mainly to support tiered compilation.
With tiered compilation, the interpreter, the client compiler and the server compiler work at the same time, and hot code may be compiled more than once. The client compiler provides faster compilation, while the server compiler provides better code quality. The interpreter no longer has to carry the extra burden of collecting performance-monitoring (profiling) information while interpreting, and while the server compiler runs its high-complexity optimization algorithms, the client compiler can first apply simple optimizations to buy it more compilation time.

3. When is the program executed by the interpreter, and when by the compiler?

4. Which program code is compiled into native code, and how is it compiled?
Two kinds of code are candidates:
  • A method that is called many times.
  • A loop body that is executed many times.

In both cases the target of compilation is the entire method body, not the loop body alone.

In the first case, compilation is triggered by method calls, so the compiler naturally takes the entire method as the unit of compilation. This is the standard form of just-in-time compilation in the virtual machine.

In the second case, compilation is triggered by the loop body and only part of the method is hot, yet the compiler must still take the entire method as the unit of compilation. Only the execution entry point differs: execution starts at a particular bytecode instruction inside the method rather than at its beginning, so the bytecode index (Byte Code Index, BCI) of the entry point is passed in at compile time. This form of compilation is called "On Stack Replacement" (OSR) because it happens while the method is executing: the method's stack frame is still on the stack when the method is replaced.
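
The two triggers described above can be illustrated with a minimal Java sketch. The class and method names are hypothetical and chosen only for this example. `square` is a small method called many times (the first case), while `sumOfSquares` is called once from `main` but contains a long-running loop (the second case, a typical OSR candidate). On a Product build of HotSpot, running the program with `-XX:+PrintCompilation` usually lists both kinds of compilation, with OSR compilations marked by `%`; the exact output depends on the JDK version and is not reproduced here.

```java
// A minimal sketch, assuming a HotSpot JVM; names are hypothetical.
final class HotSpotTriggers {

    // Case 1: a small method that is called many times.
    // Compilation is triggered by how often the method is invoked,
    // and the compiled code is entered at the start of the method.
    static long square(long x) {
        return x * x;
    }

    // Case 2: a method that may be called only once but whose loop body
    // runs many times. The loop is what becomes hot, yet the whole method
    // is compiled; execution is switched to the compiled code in the
    // middle of the loop (On Stack Replacement).
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    public static void main(String[] args) {
        // One call with a long loop: a typical OSR candidate.
        System.out.println(sumOfSquares(1_000_000));
    }
}
```

Whether and when either method is actually compiled is decided by the virtual machine at run time, so this program only sets up the conditions; it does not force compilation.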

5. How can the compilation process and results of the just-in-time compiler be observed from outside?

Some of the relevant runtime parameters are only supported by FastDebug or SlowDebug builds of the HotSpot virtual machine; Product builds do not accept them.
In other words, an ordinary user cannot observe these details without compiling the virtual machine by hand or obtaining an unofficial build.

2. Ahead-of-time compiler

Ahead-of-time (AOT) compilation has two branches. One is similar to traditional C and C++ compilers: the program code is statically translated into machine code before the program runs.

The other branch performs in advance the work that the just-in-time compiler would otherwise do at run time and saves the result. The next time this code is needed (for example, common library code used by other Java processes on the same machine), the saved result is loaded and used directly.

The rise of Android is an example of ahead-of-time compilation being put to good use.

3. Optimization technology

Four kinds of compiler optimization techniques are introduced here:
1. One of the most important optimizations: method inlining
2. One of the most cutting-edge optimizations: escape analysis
3. A classic language-independent optimization: common subexpression elimination
4. A classic language-specific optimization: array bounds check elimination

Method inlining:
Method inlining "copies" the code of the target method, intact, into the method that makes the call, so that no real method invocation takes place.
For Java methods, the difficulty is that many methods are virtual, and which version a polymorphic call will select is not known before run time. To address this, the Java virtual machine introduces a technique called Class Hierarchy Analysis (CHA). Different strategies are used in different situations:
① If it is a non-virtual method, it is simply inlined.
② If it is a virtual method and CHA finds that only one target version can be selected in the current program state, it can be inlined on that assumption; this is called guarded inlining. Because new types may be loaded later and invalidate the CHA conclusion, this is an aggressive speculative optimization, and an "escape hatch" (a slow path to fall back on) must be reserved.
③ If it is a virtual method with multiple versions to choose from, an inline cache is used to reduce the cost of the call. The cache records the receiver version seen at the call site; on the next call, only a check of the receiver's version is needed, and if it matches the cached one, the inlined code can be used immediately.
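
The idea behind cases ② and ③ can be written out by hand as a rough analogue. The following is only a sketch under stated assumptions: the `Shape`, `Square` and `Circle` types are hypothetical, and the guard is written in Java source, whereas a real JIT compiler emits the check in machine code and may fall back by deoptimizing to the interpreter rather than by making an ordinary call.

```java
// A hand-written analogue of guarded inlining; all names are hypothetical.
final class GuardedInliningSketch {

    interface Shape {
        double area();
    }

    static final class Square implements Shape {
        final double side;
        Square(double side) { this.side = side; }
        @Override public double area() { return side * side; }
    }

    static final class Circle implements Shape {
        final double radius;
        Circle(double radius) { this.radius = radius; }
        @Override public double area() { return Math.PI * radius * radius; }
    }

    // What the programmer writes: a virtual call through the interface.
    static double viaVirtualCall(Shape s) {
        return s.area();
    }

    // Roughly what the call site behaves like after guarded inlining,
    // assuming Square was the receiver type observed so far.
    static double viaGuardedInline(Shape s) {
        if (s.getClass() == Square.class) {   // guard: cheap type check
            Square sq = (Square) s;
            return sq.side * sq.side;         // inlined body of Square.area()
        }
        // "Escape hatch": the assumption failed, so take the slow path.
        // In this sketch the slow path is simply a real virtual call.
        return s.area();
    }

    public static void main(String[] args) {
        System.out.println(viaGuardedInline(new Square(3.0))); // prints 9.0
        System.out.println(viaGuardedInline(new Circle(1.0))); // slow path
    }
}
```

Both versions return the same result for every receiver; the guarded version only avoids the virtual dispatch when the assumption holds.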
Escape analysis:
Escape analysis does not optimize code directly; it is an analysis technique that provides the basis for other optimizations.
An object defined in a method may later be referenced outside that method, for example by being passed as an argument to another method; this is called method escape. It may even be accessed by other threads, which is called thread escape. No escape, method escape and thread escape are the degrees of object escape, from low to high.
If an object does not escape, or escapes only to a limited degree, the following optimizations are possible:
① Stack allocation: instead of allocating the object on the Java heap, memory is allocated directly on the stack, so the space occupied by the object is destroyed when the stack frame is popped.
② Scalar replacement: data that cannot be decomposed into smaller pieces is a scalar; data that can is an aggregate. An object is a typical aggregate. If the object does not escape, it can be broken up, and the member variables the program uses can be restored to primitive types and accessed directly.
③ Synchronization elimination: an object that does not escape its thread cannot be accessed by other threads, so synchronization on it can be removed.
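
The escape degrees and the optimizations they permit can be sketched in Java as follows. The `Point` class and all method names are hypothetical; `noEscapeScalarReplaced` is a hand-written equivalent of what scalar replacement might produce, not actual compiler output, and whether the virtual machine applies any of these optimizations depends on the compiler and its settings.

```java
// A minimal sketch of escape degrees; all names are hypothetical.
final class EscapeAnalysisSketch {

    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // A static field is reachable from any thread.
    static Point lastPoint;

    // No escape: p is never stored in a field or returned, so it is a
    // candidate for stack allocation or scalar replacement.
    static int noEscape(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    // Hand-written equivalent after scalar replacement: the object is
    // gone and its fields have become plain local variables.
    static int noEscapeScalarReplaced(int x, int y) {
        int px = x;
        int py = y;
        return px + py;
    }

    // Method escape: the object is returned to the caller.
    static Point methodEscape(int x, int y) {
        return new Point(x, y);
    }

    // Thread escape: the object is published through a static field.
    static void threadEscape(int x, int y) {
        lastPoint = new Point(x, y);
    }

    // Synchronization elimination candidate: the lock object never
    // leaves this method, so no other thread can contend for it.
    static int lockOnLocal(int x) {
        Object lock = new Object();
        synchronized (lock) {
            return x + 1;
        }
    }
}
```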
Common subexpression elimination:
If an expression has already been computed and the values of all its variables have not changed since, later occurrences of that expression are common subexpressions. There is no need to compute them again; the earlier result can be reused in place of the expression.
If the optimization is limited to a single basic block of the program, it is called local common subexpression elimination; if its scope covers multiple basic blocks, it is called global common subexpression elimination.
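
A small worked example may make this concrete. The sketch below shows the same computation in three hand-written forms; the method names are hypothetical, and the rewritten forms only illustrate what a compiler could derive, not the output of any particular JIT compiler. In `original`, `c * b` and `b * c` compute the same value, so they form a common subexpression.

```java
// A minimal sketch of common subexpression elimination; names are hypothetical.
final class CseSketch {

    // As written in source: c * b and b * c are the same value.
    static int original(int a, int b, int c) {
        return (c * b) * 12 + a + (a + b * c);
    }

    // After common subexpression elimination: compute it once as e.
    static int afterCse(int a, int b, int c) {
        int e = c * b;
        return e * 12 + a + (a + e);
    }

    // A compiler may then also apply algebraic simplification:
    // e * 12 + e == e * 13 and a + a == a * 2.
    static int afterAlgebraicSimplification(int a, int b, int c) {
        int e = c * b;
        return e * 13 + a * 2;
    }

    public static void main(String[] args) {
        System.out.println(original(1, 2, 3));                     // prints 80
        System.out.println(afterCse(1, 2, 3));                     // prints 80
        System.out.println(afterAlgebraicSimplification(1, 2, 3)); // prints 80
    }
}
```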
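Array bounds check elimination:
Java is a dynamically safe language: every array access is checked against the array's lower and upper bounds, and each check costs some running time. The compiler therefore tries to remove checks it can prove unnecessary, for example when a loop index is known to stay within the array's length. A related approach is implicit exception handling, in which the explicit check is taken off the normal path and the error is handled only when it actually occurs.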
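
The following sketch shows the kind of loop in which a bounds check is provably redundant, together with the behavior the check protects. The names are hypothetical, and whether a given JIT compiler actually removes the check in `sum` is an assumption; the code only sets up the situation.

```java
// A minimal sketch around array bounds checks; names are hypothetical.
final class BoundsCheckSketch {

    // The loop condition guarantees 0 <= i < a.length on every iteration,
    // so a compiler can prove that the bounds check on a[i] never fails
    // and may remove it.
    static int sum(int[] a) {
        int total = 0;
        for (int i = 0; i < a.length; i++) {
            total += a[i];
        }
        return total;
    }

    // With an arbitrary index nothing can be proven in advance, so the
    // check must stay. Returns true if the access is rejected.
    static boolean readThrows(int[] a, int i) {
        try {
            int unused = a[i];
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sum(new int[] {1, 2, 3}));     // prints 6
        System.out.println(readThrows(new int[] {1}, 1)); // prints true
    }
}
```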

Origin blog.csdn.net/littlewhitevg/article/details/105567500