Chapter 10: Front-end compilation and optimization

Overview

"Compile time" in Java may refer to any of the following:

  1. Front-end compiler: converts *.java files into *.class files, such as Javac
  2. Just-in-time compiler (JIT compiler): converts bytecode into native machine code at runtime, such as the C1 and C2 compilers of the HotSpot virtual machine
  3. Static ahead-of-time compiler (AOT compiler): compiles the program directly into binary code for the target machine's instruction set

In Java, the just-in-time compiler's optimizations at runtime drive the continuous improvement of program execution efficiency, while the front-end compiler's optimizations at compile time improve programmers' coding efficiency and language users' happiness.

Javac compiler

Compilation process

  1. Preparation: initialize the pluggable annotation processors
  2. Parsing and filling the symbol table
    • Lexical and syntax analysis: construct the abstract syntax tree
    • Fill the symbol table: generate symbol addresses and symbol information
  3. Annotation processing: the execution phase of the pluggable annotation processors
  4. Semantic analysis and bytecode generation
    • Attribute check: check the static information of the syntax
    • Data flow and control flow analysis: check the dynamic running process of the program
    • Desugaring: reduce syntactic sugar, which simplifies code writing, to its original form
    • Bytecode generation: convert the information generated in the previous steps into bytecode

New symbols may be generated while the pluggable annotation processors run. If so, the compiler must return to the parsing and symbol-table-filling step and process the new symbols there.

Parse and fill symbol table

Lexical and syntax analysis

Lexical analysis transforms the character stream of the source code into a sequence of tokens. A token is the smallest element at compile time. For example, the keyword int is a single token, even though it consists of three characters.

Syntax analysis constructs an abstract syntax tree from the token sequence. The abstract syntax tree is a tree representation of the syntactic structure of the program code.
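To make the lexical step concrete, here is a minimal sketch of a tokenizer. It is not how Javac's scanner is implemented; the class name TinyLexer and its three token categories (identifier or keyword, integer literal, single-character symbol) are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy lexer: splits a line of source text into tokens.
class TinyLexer {
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < source.length()) {
            char c = source.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but is not a token itself
            } else if (Character.isJavaIdentifierStart(c)) {
                int start = i;
                while (i < source.length() && Character.isJavaIdentifierPart(source.charAt(i))) {
                    i++;
                }
                tokens.add(source.substring(start, i)); // keyword or identifier
            } else if (Character.isDigit(c)) {
                int start = i;
                while (i < source.length() && Character.isDigit(source.charAt(i))) {
                    i++;
                }
                tokens.add(source.substring(start, i)); // integer literal
            } else {
                tokens.add(String.valueOf(c)); // operator or separator
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int a = b + 2;")); // prints [int, a, =, b, +, 2, ;]
    }
}
```

A parser would then consume this token list to build the syntax tree, for example a variable declaration node whose initializer is a binary + node.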

Fill symbol table

The symbol table is a data structure composed of a set of symbol addresses and symbol information. It can be pictured as a hash table of key-value pairs.

Annotation processor

Pluggable annotation processor: annotations generally take effect at runtime. Pluggable annotation processing moves the handling of specific annotations in the code forward to compile time, where it can influence the work of the front-end compiler.

A pluggable annotation processor can be regarded as a plug-in of the compiler. If these plug-ins modify the syntax tree while processing annotations, the compiler returns to the parsing and symbol-table-filling step and processes the code again, until no pluggable annotation processor modifies the syntax tree any more. Each such cycle is called a round.

Pluggable annotation processors can implement many features, such as generating getter/setter, equals(), and hashCode() methods automatically from annotations.
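As a rough sketch of what such a plug-in looks like, the processor below uses the standard javax.annotation.processing API to warn about class names that do not start with an uppercase letter. It only inspects elements; generating methods such as getters would require considerably more machinery. The class name NameCheckSketch and the naming rule are assumptions for illustration, and the class is left package-private here only to keep the example in one file.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// Hypothetical processor; declare it public in its own file before
// registering it with javac -processor.
@SupportedAnnotationTypes("*") // process every compilation unit, annotated or not
class NameCheckSketch extends AbstractProcessor {

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    // Illustrative rule: a class name should start with an uppercase letter.
    static boolean startsUpperCase(CharSequence name) {
        return name.length() > 0 && Character.isUpperCase(name.charAt(0));
    }

    // Called once per round with the root elements produced in that round.
    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        for (Element root : roundEnv.getRootElements()) {
            if (root.getKind().isClass() && !startsUpperCase(root.getSimpleName())) {
                processingEnv.getMessager().printMessage(
                        Diagnostic.Kind.WARNING,
                        "class name should start with an uppercase letter",
                        root);
            }
        }
        return false; // do not claim any annotations; other processors may still run
    }
}
```

Because this sketch never changes the syntax tree or generates new sources, it would not trigger an additional round.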

Semantic analysis and bytecode generation

The abstract syntax tree can represent a structurally correct source program, but it cannot guarantee that the program's semantics are logical. The main task of semantic analysis is to perform context-sensitive checks on the structurally correct source program, such as type checking.

Most of the errors that an IDE underlines in red while you write code come from the semantic analysis stage.

Attribute check

Checks whether a variable is declared before use, whether the types of a variable and the value assigned to it match, and so on.

Constant folding is also performed at this stage: int a = 1 + 2 becomes int a = 3.
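Constant folding is not directly visible in source code, but its effect on strings can be observed. The sketch below relies on the language rule that constant expressions of type String are interned; the class and method names are made up for this example.

```java
// Hypothetical demo class: observes constant folding through string identity.
class ConstantFoldingDemo {
    static final String EXPECTED = "ab"; // compile-time constant

    static boolean foldedAtCompileTime() {
        // "a" + "b" is a constant expression, so the compiler stores the
        // single literal "ab" in the class file and no concatenation runs.
        String folded = "a" + "b";
        return folded == EXPECTED; // same interned object
    }

    static boolean concatenatedAtRunTime() {
        String a = "a";        // not final, so not a constant variable
        String joined = a + "b"; // built at run time as a new String object
        return joined == EXPECTED;
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());   // prints true
        System.out.println(concatenatedAtRunTime()); // prints false
    }
}
```

The reference comparison is used here purely to expose the compiler's behavior; ordinary code should compare strings with equals().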

Data and control flow analysis

Checks, among other things, whether local variables are assigned before use, whether every path through a method returns a value, and whether all checked exceptions are handled correctly.
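A small sketch of the "assigned before use" check: the method below compiles only because every path assigns the local variable before it is read. The class and method names are illustrative.

```java
// Hypothetical demo class for definite assignment analysis.
class FlowAnalysisDemo {
    static int sign(int x) {
        int result; // declared without an initial value
        if (x > 0) {
            result = 1;
        } else if (x < 0) {
            result = -1;
        } else {
            result = 0;
        }
        // If the final else branch were removed, javac would reject the next
        // line, because result would not be assigned on the path where x == 0.
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sign(5));  // prints 1
        System.out.println(sign(-3)); // prints -1
        System.out.println(sign(0));  // prints 0
    }
}
```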

Syntactic sugar

Syntactic sugar is syntax added to a language that has no actual effect on the compiled result or on what the language can do, but makes the language more convenient for programmers. It reduces the amount of code, improves readability, and lowers the chance of errors.

Bytecode generation

The instance constructor <init>() and the class constructor <clinit>() are added to the syntax tree at this stage.

Generating <init>() and <clinit>() is a process of code convergence: the compiler gathers initializer blocks, variable initializers, the call to the parent class's instance constructor, and similar operations into these two methods.

Regardless of where these pieces appear in the source, <init>() always calls the parent class's instance constructor first, then runs the variable initializers and initializer blocks in the order they appear in the source, and finally runs the rest of the constructor body.
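The execution order can be observed with a small sketch that records each step. The constructor is deliberately written before the field and the initializer block to show that source position does not move the parent constructor call or the constructor body. All names here are illustrative.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class: logs the order in which initialization code runs.
class InitOrderDemo {
    static final List<String> LOG = new ArrayList<>();

    static int note(String step) {
        LOG.add(step);
        return 0;
    }

    static class Parent {
        Parent() {
            note("parent constructor");
        }
    }

    static class Child extends Parent {
        Child() {
            // The implicit super() call runs before anything else in <init>().
            note("child constructor body");
        }

        int field = note("child field initializer");

        {
            note("child initializer block");
        }
    }

    static List<String> run() {
        LOG.clear();
        new Child();
        return new ArrayList<>(LOG);
    }

    public static void main(String[] args) {
        // prints [parent constructor, child field initializer,
        //         child initializer block, child constructor body]
        System.out.println(run());
    }
}
```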

The taste of Java syntactic sugar

Generics

The essence of generics is the application of parameterized types, or parametric polymorphism. That is, the data type being operated on can be specified as a special kind of parameter in the method signature.

Java's generics are "type erasure generics": they exist only in the source code. In the compiled bytecode, every generic type is replaced with its raw type, and casts are inserted at the places where they are needed.

So ArrayList<Integer> and ArrayList<String> are the same type at runtime.
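Both effects of erasure can be observed from ordinary code. In the sketch below, the first method shows that two lists with different type arguments share one runtime class, and the second shows the compiler-inserted cast failing once a raw reference has slipped a wrong element into the list. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class for type erasure.
class ErasureDemo {
    static boolean sameRuntimeClass() {
        List<Integer> numbers = new ArrayList<>();
        List<String> words = new ArrayList<>();
        // Both objects are plain java.util.ArrayList instances at run time.
        return numbers.getClass() == words.getClass();
    }

    @SuppressWarnings({"unchecked", "rawtypes"})
    static boolean insertedCastFails() {
        List<String> words = new ArrayList<>();
        List raw = words; // the raw type bypasses the compile-time check
        raw.add(42);      // an Integer ends up in a List<String>
        try {
            // The compiler inserted a cast to String here, and it fails.
            int length = words.get(0).length();
            return false;
        } catch (ClassCastException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass());  // prints true
        System.out.println(insertedCastFails()); // prints true
    }
}
```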

In both usability and runtime efficiency, erased generics lag behind reified generics. Their only advantage is that implementing them only required changes to the Java compiler...

Type erasure

Java chose to generify its existing types in place, for example turning ArrayList into ArrayList<T>.

The raw type is regarded as the common parent type of all generic instantiations of that type, so that all of them, such as ArrayList<Integer> and ArrayList<String>, automatically become subtypes of ArrayList.

Java implements raw types crudely: at compile time it simply reverts ArrayList<Integer> to ArrayList and automatically inserts casts and type checks wherever elements are accessed or modified.

However, erasure only removes generic information from the bytecode in each method's Code attribute. The generic information is still retained in the class file's metadata, which is why parameterized types can be obtained through reflection.
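A minimal sketch of reading that retained metadata with the standard java.lang.reflect API follows. The class name SignatureDemo and its field are assumptions for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical demo class: recovers a field's declared type argument.
class SignatureDemo {
    List<String> names; // the declaration's generic signature is kept in the class file

    static Type elementType() {
        try {
            Field field = SignatureDemo.class.getDeclaredField("names");
            // getGenericType() reads the retained signature, not the erased type.
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            return type.getActualTypeArguments()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementType()); // prints class java.lang.String
    }
}
```

Note that this recovers the type argument written in a declaration. It cannot tell what type argument a particular ArrayList object was created with, because the object itself carries no such information.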

Problems

No support for primitive types

Casts between primitive types such as int and Object are not supported, so once the generic information is erased there is no cast that the compiler could insert where one is needed. Java therefore does not support generics over primitive types, and code can only use ArrayList<Integer> instead, which leads to the overhead of constructing countless wrapper objects and of boxing and unboxing.

Verbose code

Generic type information is not available at runtime
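One common place where these two problems meet is creating an array of a type parameter: since T is erased, the caller has to pass the element class explicitly. The sketch below assumes a helper named toArray in a made-up class; it is one way to work around the limitation, not the only one.

```java
import java.lang.reflect.Array;
import java.util.List;

// Hypothetical demo class: passes a Class token because T is erased.
class ClassTokenDemo {
    @SuppressWarnings("unchecked")
    static <T> T[] toArray(List<T> list, Class<T> componentType) {
        // new T[list.size()] does not compile, so the array is created
        // reflectively from the explicitly supplied component type.
        T[] array = (T[]) Array.newInstance(componentType, list.size());
        for (int i = 0; i < list.size(); i++) {
            array[i] = list.get(i);
        }
        return array;
    }

    public static void main(String[] args) {
        String[] words = toArray(List.of("a", "b"), String.class);
        System.out.println(words.length); // prints 2
    }
}
```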

Ambiguity

When generics meet overloading
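The overloading problem can be sketched as follows: two methods that differ only in the type argument of a parameter have the same erasure, so the compiler cannot accept both. The second overload is therefore left commented out, and reflection is used to show the erased parameter type. All names are illustrative.

```java
import java.util.List;

// Hypothetical demo class: overloads that differ only by type argument clash.
class OverloadErasureDemo {
    static String describe(List<String> list) {
        return "List<String>";
    }

    // Uncommenting this overload makes javac report a name clash, because
    // both methods erase to describe(List):
    // static String describe(List<Integer> list) { return "List<Integer>"; }

    static String erasedParameterType() {
        try {
            // The method is looked up by its erased parameter type.
            return OverloadErasureDemo.class
                    .getDeclaredMethod("describe", List.class)
                    .getParameterTypes()[0]
                    .getName();
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(erasedParameterType()); // prints java.util.List
    }
}
```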

Autoboxing, unboxing, and for-each loops

A for-each loop is desugared into code that uses an iterator, which is why a class that is traversed this way must implement the Iterable interface.
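The two methods below sketch this desugaring: the first uses a for-each loop, and the second is a hand-written approximation of what the compiler produces for it, with the iterator and the unboxing call spelled out. The class and method names are assumptions for illustration, and the exact generated code may differ in detail.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class: a for-each loop and its approximate desugared form.
class ForEachDemo {
    static int sumWithForEach(List<Integer> values) {
        int sum = 0;
        for (int value : values) { // each Integer element is unboxed automatically
            sum += value;
        }
        return sum;
    }

    static int sumDesugared(List<Integer> values) {
        int sum = 0;
        for (Iterator<Integer> it = values.iterator(); it.hasNext(); ) {
            int value = it.next().intValue(); // explicit unboxing
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Integer> values = List.of(1, 2, 3, 4);
        System.out.println(sumWithForEach(values)); // prints 10
        System.out.println(sumDesugared(values));   // prints 10
    }
}
```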

A variable-length parameter is desugared into an array-typed parameter, and the arguments at each call site are packed into an array.
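A short sketch of the varargs case: inside the method the parameter is an ordinary array, and a caller may pass either separate arguments or an existing array. The class and method names are made up for this example.

```java
// Hypothetical demo class: a varargs parameter is an array inside the method.
class VarargsDemo {
    static int count(String... items) {
        return items.length; // items has type String[]
    }

    public static void main(String[] args) {
        System.out.println(count());                    // prints 0 (an empty array is passed)
        System.out.println(count("a", "b"));            // prints 2 (packed into a String[])
        System.out.println(count(new String[] {"x"}));  // prints 1 (array passed as is)
    }
}
```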

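Integer a = 1;
Integer b = 2;
Integer c = 3;
Integer d = 3;
Integer e = 321;
Integer f = 321;
Long g = 3L;
System.out.println(c == d);           // true: both come from the default Integer cache (-128 to 127)
System.out.println(e == f);           // false: 321 is outside the default cache, so two distinct objects
System.out.println(c == (a + b));     // true: a + b is an int, so c is unboxed and values are compared
System.out.println(c.equals(a + b));  // true: a + b is boxed to an Integer with value 3
System.out.println(g == (a + b));     // true: g is unboxed to long and the int is widened
System.out.println(g.equals(a + b));  // false: a + b is boxed to Integer, not Long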

Because == on wrapper types does not unbox automatically unless an arithmetic operation is involved, and their equals() methods do not handle conversions between numeric types, avoid relying on autoboxing and unboxing in this way in real code.

Conditional compilation

How Java source is compiled: the compiler does not compile the .java files one at a time. It puts the top-level syntax tree nodes of all compilation units into a to-do list and then compiles them, so the files can provide symbol information to one another.

Conditional compilation in Java: use an if statement whose condition is a constant. The condition is evaluated at compile time.


public static void main(String[] args)
{
    if (true)
    {
        System.out.println("1");
    }
    else
    {
        System.out.println("2");
    }
}

The decompilation result of the Class file after the code is compiled:

public static void main(String[] args)
{
    System.out.println("1");
}

The compiler will eliminate the invalid code in the branch.

This kind of syntactic sugar can only be written inside the method body, and can only achieve conditional compilation at the basic block level of the statement, but there is no way to adjust the structure of the entire Java class according to conditions.

Origin blog.csdn.net/weixin_42249196/article/details/108295165