Talk about JVM from the principle (5): JVM compilation process and optimization methods | JD Cloud technical team

1. Front-end compilation

Front-end compilation is the process of compiling Java source code files into Class files. The compilation process is divided into 4 steps:

1 Preparation

Initialize the pluggable annotation processors (Annotation Processing Tool).

2 Parsing and filling the symbol table

Convert the character stream of the source code into a collection of tokens, then construct the abstract syntax tree (AST) from them.

Each node of the abstract syntax tree represents a grammatical structure in the program code, including packages, types, modifiers, operators, interfaces, return values, code comments, etc.

The subsequent behavior of the compiler is based on the abstract syntax tree.
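The tree that javac builds can be inspected from Java itself. The following is a minimal sketch, assuming a full JDK (with the jdk.compiler module) is available; it uses the standard Compiler Tree API (com.sun.source) to parse a source string and list a few kinds of nodes. The class names AstDump and StringSource and the sample source are made up for illustration.

```java
import com.sun.source.tree.ClassTree;
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

class AstDump {
    /** In-memory source file, so the example needs no files on disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses the source (no semantic analysis) and lists class, field and method nodes. */
    static List<String> describe(String className, String code) {
        // getSystemJavaCompiler() returns null when no compiler is available (e.g. a bare JRE).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null, List.of(new StringSource(className, code)));
        List<String> nodes = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitClass(ClassTree node, Void p) {
                        nodes.add("class " + node.getSimpleName());
                        return super.visitClass(node, p);
                    }

                    @Override
                    public Void visitMethod(MethodTree node, Void p) {
                        nodes.add("method " + node.getName());
                        return super.visitMethod(node, p);
                    }

                    @Override
                    public Void visitVariable(VariableTree node, Void p) {
                        nodes.add("variable " + node.getName());
                        return super.visitVariable(node, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return nodes;
    }

    public static void main(String[] args) {
        describe("Demo", "class Demo { int a = 1 + 2; void run() {} }")
                .forEach(System.out::println);
    }
}
```

Only the parse step is run here, so the output reflects the tree as written in the source, before annotation processing or semantic analysis.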

The symbol table can be understood as a collection of KV structures, storing the following information:

  • Variable names and constants
  • Procedure and function names
  • Literal constants and strings
  • Temporaries generated by the compiler
  • Labels in the source language

The compiler uses the symbol table to look up symbols quickly during compilation.
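The "collection of KV structures" idea can be sketched as a stack of maps, one per scope. This is a toy model for illustration only, not javac's actual data structure; the class SymbolTable and its methods are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy scoped symbol table: each scope maps a name to a short description. */
class SymbolTable {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    SymbolTable() {
        enterScope(); // outermost scope
    }

    void enterScope() {
        scopes.push(new HashMap<>());
    }

    void exitScope() {
        scopes.pop();
    }

    /** Declares a symbol in the innermost scope; returns false if it already exists there. */
    boolean declare(String name, String info) {
        return scopes.peek().putIfAbsent(name, info) == null;
    }

    /** Looks a name up from the innermost scope outwards. */
    Optional<String> lookup(String name) {
        for (Map<String, String> scope : scopes) { // iteration starts at the most recent push
            String info = scope.get(name);
            if (info != null) {
                return Optional.of(info);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        SymbolTable table = new SymbolTable();
        table.declare("count", "field int");
        table.enterScope();
        table.declare("count", "local long"); // shadows the outer entry
        System.out.println(table.lookup("count").orElse("undeclared")); // local long
        table.exitScope();
        System.out.println(table.lookup("count").orElse("undeclared")); // field int
        System.out.println(table.lookup("total").orElse("undeclared")); // undeclared
    }
}
```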

3 Annotation processors

The annotation processor can be regarded as a set of compiler plug-ins that can be used to read and write arbitrary elements in the abstract syntax tree.

Put simply, an annotation processor lets the compiler execute specific logic for specific annotations. It is generally used to generate code; the widely used Lombok and MapStruct are both built on this mechanism.

If the syntax tree is modified during this step, the compiler returns to "parsing and filling the symbol table" and processes the tree again. Each such cycle is called a "round".

This is the only way the developer can control the behavior of the compiler.
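The round mechanism can be observed with a small processor. The sketch below assumes a full JDK is available and uses only the standard javax.annotation.processing and javax.tools APIs. The processor RoundLogger is a hypothetical example that records the root elements it sees in each round; it does not modify the syntax tree the way Lombok does.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

/** Hypothetical processor that only records what it sees in each round. */
class RoundLogger extends AbstractProcessor {
    final List<String> log = new ArrayList<>();
    private int round = 0;

    @Override
    public Set<String> getSupportedAnnotationTypes() {
        return Set.of("*"); // run even when the source contains no annotations
    }

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment env) {
        round++;
        for (Element e : env.getRootElements()) {
            log.add("round " + round + ": " + e.getSimpleName());
        }
        return false; // do not claim any annotations
    }

    /** Runs the processor over an in-memory source; -proc:only skips class file generation. */
    static List<String> run(String className, String code) {
        SimpleJavaFileObject source = new SimpleJavaFileObject(
                URI.create("string:///" + className + ".java"), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return code;
            }
        };
        // getSystemJavaCompiler() returns null when no compiler is available (e.g. a bare JRE).
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        RoundLogger processor = new RoundLogger();
        JavaCompiler.CompilationTask task = compiler.getTask(
                null, null, null, List.of("-proc:only"), null, List.of(source));
        task.setProcessors(List.of(processor));
        task.call();
        return processor.log;
    }

    public static void main(String[] args) {
        run("Demo", "class Demo {}").forEach(System.out::println);
    }
}
```

A processor that generates new source files through the Filer would cause additional rounds, in which the generated classes appear as root elements.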

4 Analysis and bytecode generation

The preceding steps produce a structurally correct syntax tree; semantic analysis then verifies whether the program it represents is logically valid.

This stage is divided into four steps:

4.1 Attribute check

Attribute checking mainly verifies whether a variable has been declared before it is used, whether the type of a variable matches the value assigned to it, and so on.

An optimization called "constant folding" is also performed at this stage. For example, the Java code int a = 1 + 2; is folded into int a = 3;
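Constant folding is visible from ordinary Java code, because folded string constants end up as the same interned literal. The sketch below relies on the language rules for compile-time constant expressions; the class and method names are made up for illustration.

```java
class ConstantFolding {
    static final int A = 1 + 2;         // compile-time constant, stored as 3
    static final String AB = "a" + "b"; // folded into the single literal "ab"

    /** Both operands are compile-time constants, so they are the same interned String. */
    static boolean sameLiteral() {
        return AB == "ab";
    }

    /** 'a' is not final, so the concatenation happens at run time and yields a new String. */
    static boolean notFolded() {
        String a = "a";
        String ab = a + "b";
        return ab == "ab";
    }

    public static void main(String[] args) {
        System.out.println(A);             // 3
        System.out.println(sameLiteral()); // true
        System.out.println(notFolded());   // false
    }
}
```

The reference comparison with == is used here deliberately to expose the folding; normal code should compare strings with equals().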

4.2 Data and control flow analysis

Data flow analysis and control flow analysis further verify the contextual logic of the program. They check, for example, whether local variables are assigned before use, whether every path through a method returns a value, and whether all checked exceptions are handled correctly.
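Two of these checks can be illustrated with code that passes them, with comments noting what javac would reject. This is a small sketch; the class FlowAnalysis and its methods are hypothetical.

```java
class FlowAnalysis {
    /** Every path assigns 'label' before it is read, so the assignment check passes. */
    static String classify(int n) {
        String label;
        if (n < 0) {
            label = "negative";
        } else if (n == 0) {
            label = "zero";
        } else {
            label = "positive";
        }
        // Without the final else branch, javac would reject the next line:
        // "variable label might not have been initialized".
        return label;
    }

    /** Every path returns a value, so the "missing return statement" check passes. */
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(-5));              // negative
        System.out.println(parseOrDefault("42", 0));   // 42
        System.out.println(parseOrDefault("oops", 7)); // 7
    }
}
```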

4.3 Decomposing syntactic sugar

Java has a lot of syntactic sugar that simplifies code, such as automatic boxing and unboxing, generics, variable-length parameters, and so on. The compiler reduces these constructs to basic language structures. This process is called desugaring.
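The effect of desugaring can be approximated by writing the same method twice, once with the sugar and once by hand without it. The second version is an approximation of what the compiler produces, not its exact output; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class Desugar {
    /** Uses varargs, autoboxing, generics and the enhanced for loop. */
    static int sugared() {
        List<Integer> values = Arrays.asList(1, 2, 3);
        int total = 0;
        for (int v : values) {
            total += v;
        }
        return total;
    }

    /** Roughly the same logic with the sugar removed by hand. */
    @SuppressWarnings("rawtypes")
    static int desugared() {
        // Varargs become an explicit array; boxing becomes Integer.valueOf.
        List values = Arrays.asList(
                new Integer[] {Integer.valueOf(1), Integer.valueOf(2), Integer.valueOf(3)});
        int total = 0;
        // The enhanced for loop becomes an explicit Iterator; generics are erased,
        // so a cast is needed, and unboxing becomes intValue().
        for (Iterator it = values.iterator(); it.hasNext(); ) {
            int v = ((Integer) it.next()).intValue();
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sugared());   // 6
        System.out.println(desugared()); // 6
    }
}
```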

4.4 Bytecode Generation

This is the final stage of the javac compilation process. At this stage, the compiler turns the abstract syntax tree and symbol table produced earlier into class files, and also performs a small amount of code addition and conversion.

2. Runtime compilation

The main purpose of runtime compilation is to compile the code into native code, thus saving the time of interpretation and execution.

However, the JVM does not start compiling immediately after startup. To avoid a compilation delay, it first executes the code with the interpreter. While the program runs, hotspot detection identifies hot code, which is then compiled selectively and gradually replaces interpreted execution. HotSpot JVM therefore adopts an architecture in which the interpreter and the just-in-time compiler coexist.

1 When compiled execution is used

Sun JDK decides mainly by checking whether counters on a method exceed a threshold; if they do, the method is compiled and then executed as native code.

  • Call counter

Records the number of times a method is called. The default threshold is 1500 in client mode and 10000 in server mode, and it can be set with -XX:CompileThreshold=10000.

  • Back edge counter

Records the number of times the code in a loop body is executed. Its threshold is derived from -XX:OnStackReplacePercentage, whose default is 933 in client mode and 140 in server mode; it can be set with, for example, -XX:OnStackReplacePercentage=140.
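The relationship between these flags and the back edge threshold can be worked through as arithmetic. The formulas below are the ones commonly described for the classic (non-tiered) HotSpot interpreter and are not stated in this article, so treat them as an assumption. The same applies to the InterpreterProfilePercentage default of 33. The class OsrThreshold is a hypothetical helper.

```java
class OsrThreshold {
    /** Client mode (assumed): CompileThreshold * OnStackReplacePercentage / 100. */
    static int clientThreshold(int compileThreshold, int onStackReplacePercentage) {
        return compileThreshold * onStackReplacePercentage / 100;
    }

    /**
     * Server mode (assumed):
     * CompileThreshold * (OnStackReplacePercentage - InterpreterProfilePercentage) / 100.
     */
    static int serverThreshold(int compileThreshold, int onStackReplacePercentage,
                               int interpreterProfilePercentage) {
        return compileThreshold
                * (onStackReplacePercentage - interpreterProfilePercentage) / 100;
    }

    public static void main(String[] args) {
        System.out.println(clientThreshold(1500, 933));      // 13995
        System.out.println(serverThreshold(10000, 140, 33)); // 10700
    }
}
```

Under these assumptions, a loop has to run roughly ten thousand iterations or more within one invocation before the back edge counter triggers compilation.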

2 Compilation modes

Sun JDK provides two compilers: the client compiler (-client) and the server compiler (-server).

2.1 Client compiler

Also known as C1, it is relatively lightweight and mainly performs the following optimizations:

2.1.1 Method inlining

The most important optimization made by the compiler is method inlining.

In object-oriented design, properties are usually accessed through setter/getter methods rather than by direct field access, and such method calls are expensive, especially relative to how little code the methods contain.

Modern JVMs usually execute such methods as inlined code. For example:

Order o = new Order();
o.setTotalAmount(o.getOrderAmount() * o.getCount());

And what the compiled code essentially does is:

Order o = new Order();
o.totalAmount = o.orderAmount * o.count;

Inlining is enabled by default and can be turned off with -XX:-Inline. However, turning it off is not recommended because it has a huge impact on performance.

Whether a method is inlined depends on how hot it is and how big it is.
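A small program makes it possible to watch these decisions. The sketch below is a hypothetical completion of the Order example, with assumed field types and a loop added so the accessors become hot. On HotSpot it can be run with the diagnostic flags -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to print inlining decisions, or with -XX:-Inline to compare run time.

```java
class InlineDemo {
    /** Hypothetical Order class with the accessors used in the article's example. */
    static final class Order {
        private final long orderAmount;
        private final long count;
        private long totalAmount;

        Order(long orderAmount, long count) {
            this.orderAmount = orderAmount;
            this.count = count;
        }

        long getOrderAmount() { return orderAmount; }
        long getCount() { return count; }
        long getTotalAmount() { return totalAmount; }
        void setTotalAmount(long totalAmount) { this.totalAmount = totalAmount; }
    }

    /** Calls the tiny accessors often enough for them to be treated as hot. */
    static long run(int iterations) {
        Order o = new Order(3, 2);
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            o.setTotalAmount(o.getOrderAmount() * o.getCount());
            checksum += o.getTotalAmount();
        }
        return checksum;
    }

    public static void main(String[] args) {
        System.out.println(run(1_000_000)); // 6000000
    }
}
```

The exact diagnostic output depends on the JVM version, so it is not reproduced here.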

2.1.2 Devirtualization

If the JVM finds that a method has only one implementation among the loaded classes, calls to that method can be bound directly to that implementation and inlined.

public interface Animal {
  void eat();
}

public class Cat implements Animal {
  @Override
  public void eat() {
    System.out.println("Cat eat !");
  }
}

public class Demo {
  public void execute(Animal animal){
    animal.eat();
  }
}

If Cat is the only class in the JVM that implements the Animal interface, then when the execute() method is compiled, it evolves into a structure similar to the following:

public void execute(Animal animal) {
  System.out.println("Cat eat !");
}

That is, the execute() method directly inlines the body of the eat() method of the Cat class.

2.1.3 Redundancy Elimination

Redundancy elimination means folding or removing code at compile time based on what is known about how the program runs.

For example:

private static final boolean isDebug = false;

public void execute() {
  if (isDebug) {
    log.debug("do execute.");
  }
  System.out.println("done");
}

After C1 compilation, it evolves into the following structure:

public void execute() {
  System.out.println("done");
}

This is why it is usually recommended to guard a call to log.debug() with a check first rather than call it directly.
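The guard also matters before any compilation happens, because the arguments of an unguarded call are evaluated even when the message is discarded. The sketch below illustrates this with a made-up logger; debug(), describe() and the counter are hypothetical stand-ins, not a real logging API.

```java
class GuardedLogging {
    private static final boolean IS_DEBUG = false;

    /** Counts how many times the costly message was built. */
    static int descriptionsBuilt = 0;

    /** Stand-in for an expensive message computation. */
    static String describe() {
        descriptionsBuilt++;
        return "state=" + System.nanoTime();
    }

    /** Hypothetical logger that drops the message when debug is off. */
    static void debug(String message) {
        if (IS_DEBUG) {
            System.out.println(message);
        }
    }

    /** The argument is built before debug() gets a chance to discard it. */
    static void unguarded() {
        debug("do execute: " + describe());
    }

    /** The constant-false condition lets the whole block be removed. */
    static void guarded() {
        if (IS_DEBUG) {
            debug("do execute: " + describe());
        }
    }

    public static void main(String[] args) {
        unguarded();
        guarded();
        System.out.println(descriptionsBuilt); // 1
    }
}
```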

2.2 Server compiler

Also known as C2, it is more heavyweight than C1. C2 is more about global optimization than code block optimization.

Escape analysis

Escape analysis determines, based on how the program runs, whether an object created in a method can be referenced from outside that method. If it can, the object is considered to have escaped.

If escape analysis is enabled with -XX:+DoEscapeAnalysis (on by default), the server compiler performs more aggressive optimizations.
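The difference between escaping and non-escaping objects can be sketched with three small methods. The Point class and method names are hypothetical, and the classification in the comments describes each method viewed in isolation; the JIT makes its decision after inlining, so the real outcome can differ.

```java
class EscapeDemo {
    static final class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** A field that other methods and threads can read. */
    static Point lastPoint;

    /** The Point never leaves this method, so it does not escape. */
    static int noEscape(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    /** The reference is handed back to the caller, so the object escapes the method. */
    static Point escapesByReturn(int x, int y) {
        return new Point(x, y);
    }

    /** The reference is stored in a static field, so the object escapes. */
    static void escapesByField(int x, int y) {
        lastPoint = new Point(x, y);
    }

    public static void main(String[] args) {
        System.out.println(noEscape(1, 2));          // 3
        System.out.println(escapesByReturn(1, 2).x); // 1
        escapesByField(3, 4);
        System.out.println(lastPoint.y);             // 4
    }
}
```

Running a hot loop over noEscape() with and without -XX:-DoEscapeAnalysis is one way to observe the effect on allocation, though the results depend on the JVM version.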

2.2.1 Scalar substitution

Point point = new Point(1, 2);
System.out.println("point.x = " + point.x + "; point.y = " + point.y);

If the point object does not escape and is not used afterwards, the compiled code evolves into the following structure:

int x = 1, y = 2;
System.out.println("point.x = " + x + "; point.y = " + y);

2.2.2 Allocation on the stack

In the example above, if point does not escape, C2 can avoid allocating the point object on the heap and keep its data on the stack instead. In HotSpot this is achieved in practice through scalar replacement rather than by placing a whole object on the stack.

The advantage of allocating on the stack is that object creation is faster, and the object is reclaimed automatically when the method returns, without involving the garbage collector.

2.2.3 Synchronization elimination

Point point = new Point(1, 2);
synchronized(point) {  
  System.out.println("point.x = " + point.x);
}

If the analysis finds that point does not escape, no other thread can ever lock it, so the compiled code becomes the following structure:

Point point = new Point(1, 2);
System.out.println("point.x = " + point.x);

2.3 OSR (On-Stack Replacement)

The main difference between OSR and ordinary C1/C2 compilation is the entry point that gets replaced: OSR replaces only the entry of a loop body, while ordinary C1 and C2 compilation replaces the entry of the method call.

As a result, after OSR compilation the whole method has been compiled, but the compiled machine code is entered only at the loop body; the rest of the current invocation is still interpreted.

With ordinary method compilation, when the JVM finds that a method is hot and compiles it, the optimized code only takes effect the next time the method is called; the invocation already in progress keeps running the unoptimized code. OSR mainly solves this problem. For example, if the JVM finds that a loop inside a method is hot, it compiles the loop, and the execution engine jumps to the newly compiled code when the next iteration begins.
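A method that is called only once but loops many times is the typical case for OSR. The sketch below is a hypothetical example. On HotSpot it can be run with -XX:+PrintCompilation, where OSR compilations are marked with a % sign; whether and when one appears depends on the JVM version and its thresholds.

```java
class OsrDemo {
    /** A single long-running loop: hot by back edge count, not by call count. */
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // sumTo is invoked exactly once, so only the back edge counter can trigger compilation.
        System.out.println(sumTo(1_000_000)); // 499999500000
    }
}
```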

Author: JD Technology Kang Zhixing

Source: JD Cloud developer community. Please credit the source when reprinting.

Origin blog.csdn.net/jdcdev_/article/details/132535702