Overview
In Java, "compile time" can refer to any of the following:
- Front-end compiler: converts .java files into .class files, for example Javac
- Just-in-time (JIT) compiler: converts bytecode into native machine code at runtime, for example the C1 and C2 compilers of the HotSpot virtual machine
- Ahead-of-time (AOT) compiler: compiles the program directly into binary code for the target instruction set
In Java, the JIT compiler's runtime optimizations drive the continuous improvement of program execution efficiency, while the front-end compiler's compile-time optimizations improve programmers' coding efficiency and the happiness of language users.
Javac compiler
Compilation process
- Preparation: initialize the plug-in annotation processors
- Parsing and filling the symbol table
  - Lexical and syntax analysis: build the abstract syntax tree
  - Fill the symbol table: generate symbol addresses and symbol information
- Annotation processing: the execution phase of the plug-in annotation processors
- Semantic analysis and bytecode generation
  - Attribute check: check the static information of the syntax
  - Data-flow and control-flow analysis: check the program's dynamic execution paths
  - Desugaring: reduce the syntactic sugar that simplifies code writing to its original form
  - Bytecode generation: convert the information produced by the previous steps into bytecode
New symbols may be generated while the plug-in annotation processors run. If so, the compiler must go back to the parsing stage and process the new symbols again while filling the symbol table.
Parsing and filling the symbol table
Lexical and syntax analysis
Lexical analysis transforms the character stream of the source code into a sequence of tokens. A token is the smallest element at compile time. For example, the keyword int is a single token.
Syntax analysis builds an abstract syntax tree from the token sequence. The abstract syntax tree is a tree representation that describes the syntactic structure of the program code.
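To make the idea of a token stream concrete, here is a minimal sketch of a toy lexer. It is not how Javac's scanner works; the class name ToyLexer and the regular expression are assumptions chosen only for illustration, and the lexer recognizes nothing beyond identifiers, integer literals, and single punctuation characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical toy lexer, for illustration only (not Javac's scanner).
class ToyLexer {
    // An identifier or keyword, an integer literal, or any single non-whitespace character.
    private static final Pattern TOKEN =
            Pattern.compile("[A-Za-z_][A-Za-z_0-9]*|\\d+|\\S");

    // Splits the character stream into tokens; whitespace is skipped.
    static List<String> tokenize(String source) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(source);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Prints [int, a, =, b, +, 2, ;]
        System.out.println(tokenize("int a = b + 2;"));
    }
}
```

Note that the three characters of int come out as one token: a token is a unit of meaning, not a character. A parser would then consume this sequence to build the syntax tree.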
Fill symbol table
The symbol table is a data structure made up of symbol addresses and symbol information. It can be thought of as a hash table that stores key-value pairs.
Annotation processor
Plug-in annotation processor: annotations generally take effect at runtime, but this mechanism moves the handling of specific annotations forward to compile time, which lets it influence how the front-end compiler works.
A plug-in annotation processor can be regarded as a compiler plug-in. If a processor modifies the syntax tree while handling annotations, the compiler goes back to parsing and filling the symbol table and processes the result again, until no processor modifies the syntax tree any more. Each such cycle is called a round.
Plug-in annotation processors can do many things, such as generating getter/setter, equals(), and hashCode() methods automatically from annotations.
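As a rough sketch of what such a plug-in looks like, the processor below only inspects the types in each round and reports a warning; it does not modify the syntax tree. The class name TypeNameCheckProcessor and the naming rule it enforces are assumptions made for this example. The types it uses come from the standard javax.annotation.processing and javax.lang.model packages.

```java
import java.util.Set;
import javax.annotation.processing.AbstractProcessor;
import javax.annotation.processing.RoundEnvironment;
import javax.annotation.processing.SupportedAnnotationTypes;
import javax.lang.model.SourceVersion;
import javax.lang.model.element.Element;
import javax.lang.model.element.TypeElement;
import javax.tools.Diagnostic;

// Hypothetical processor: warns when a top-level type name does not start
// with an upper-case letter. To register it with javac it would need to be
// a public class in its own source file.
@SupportedAnnotationTypes("*") // take part in every round, whatever annotations are present
class TypeNameCheckProcessor extends AbstractProcessor {

    @Override
    public SourceVersion getSupportedSourceVersion() {
        return SourceVersion.latestSupported();
    }

    // Called once per round with the root elements produced by the previous round.
    @Override
    public boolean process(Set<? extends TypeElement> annotations, RoundEnvironment roundEnv) {
        if (!roundEnv.processingOver()) {
            for (Element root : roundEnv.getRootElements()) {
                boolean isType = root.getKind().isClass() || root.getKind().isInterface();
                String name = root.getSimpleName().toString();
                if (isType && !name.isEmpty() && !Character.isUpperCase(name.charAt(0))) {
                    processingEnv.getMessager().printMessage(
                            Diagnostic.Kind.WARNING,
                            "Type name '" + name + "' should start with an upper-case letter",
                            root);
                }
            }
        }
        return false; // do not claim the annotations, so other processors still see them
    }
}
```

A processor like this is typically passed to the compiler with the -processor option of javac.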
Semantic analysis and bytecode generation
The abstract syntax tree can represent a structurally correct source program, but it cannot guarantee that the program is semantically sound. The main task of semantic analysis is to perform context-sensitive checks on the structurally correct program, such as type checking.
Most of the errors that an IDE underlines in red while you write code come from the semantic analysis stage.
Attribute check
Checks whether a variable is declared before use, whether the types on both sides of an assignment match, and so on.
Constant folding also happens here: int a = 1 + 2 is compiled as if it were written int a = 3.
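One way to observe constant folding from ordinary code is through strings: a constant expression such as "a" + "b" is folded by the compiler into the single literal "ab", so it refers to the same interned object as a literal written directly. The sketch below contrasts that with a concatenation that can only happen at runtime. The class and method names are made up for the example.

```java
// Hypothetical demo class, for illustration only.
class ConstantFoldingDemo {

    // "a" + "b" is a compile-time constant expression, so the class file
    // contains the single literal "ab"; both references are the same object.
    static boolean foldedAtCompileTime() {
        final String ab = "a" + "b";
        return ab == "ab";
    }

    // 'a' is not final, so it is not a constant variable; the concatenation
    // runs at runtime and produces a new String object.
    static boolean concatenatedAtRuntime() {
        String a = "a";
        String ab = a + "b";
        return ab == "ab";
    }

    public static void main(String[] args) {
        System.out.println(foldedAtCompileTime());   // true
        System.out.println(concatenatedAtRuntime()); // false
    }
}
```

The == comparisons are deliberate here because they expose object identity; ordinary code should compare strings with equals().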
Data and control flow analysis
Checks, for example, whether local variables are assigned before use, whether every path through a method returns a value, and whether all checked exceptions are handled correctly.
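The following sketch shows the kind of code these checks accept, with rejected variants left as comments so that the example still compiles. The class and method names are assumptions for illustration.

```java
// Hypothetical demo class, for illustration only.
class FlowAnalysisDemo {

    // Accepted: x is assigned on every path before it is read,
    // and every path through the method returns a value.
    static int pick(boolean flag) {
        int x;
        if (flag) {
            x = 1;
        } else {
            x = 2;
        }
        return x;
    }

    // Rejected by javac if uncommented: y might not have been initialized
    // when flag is false.
    //
    // static int broken(boolean flag) {
    //     int y;
    //     if (flag) {
    //         y = 1;
    //     }
    //     return y;
    // }

    public static void main(String[] args) {
        System.out.println(pick(true));  // 1
        System.out.println(pick(false)); // 2
    }
}
```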
Syntactic sugar
Syntactic sugar is syntax added to a language that has no real effect on the compiled result or on what the language can do, but makes the language more convenient for programmers. It reduces the amount of code, improves readability, and lowers the chance of coding errors.
Bytecode generation
The instance constructor <init>() and the class constructor <clinit>() are added to the syntax tree at this stage.
Generating <init>() and <clinit>() is a process of code convergence: the compiler gathers initializer blocks, variable initializers, the call to the superclass instance constructor, and similar operations into these two methods.
For <init>(), it guarantees that, wherever these pieces appear in the source, the superclass instance constructor runs first, then the variable initializers and initializer blocks in their source order, and finally the constructor body.
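The ordering can be observed with a small experiment. In the sketch below the constructor is deliberately written first in the source, yet its body runs last. The class names and the shared LOG list are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes, for illustration only.
class InitOrderBase {
    static final List<String> LOG = new ArrayList<>();

    InitOrderBase() {
        LOG.add("parent constructor");
    }
}

class InitOrderChild extends InitOrderBase {
    // Written first in the source, but javac places the super() call and the
    // initializers below ahead of this body in the generated instance constructor.
    InitOrderChild() {
        LOG.add("constructor body");
    }

    int field = log("field initializer");

    {
        LOG.add("initializer block");
    }

    private static int log(String step) {
        LOG.add(step);
        return 0;
    }

    public static void main(String[] args) {
        new InitOrderChild();
        // Prints [parent constructor, field initializer, initializer block, constructor body]
        System.out.println(LOG);
    }
}
```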
A taste of Java syntactic sugar
Generics
The essence of generics is the application of parameterized types, or parametric polymorphism: the data type being operated on can be specified as a special parameter in the method signature.
Java's generics are "type erasure generics". They exist only in the program's source code. In the compiled bytecode, every generic type is replaced with its raw type, and casts are inserted where needed.
So ArrayList<Integer> and ArrayList<String> are the same type at runtime.
In both usability and runtime efficiency, erasure-based generics lag behind reified generics. The only advantage is that implementing erasure requires changes only to the Java compiler...
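A short check makes the effect of erasure visible: two lists with different type arguments report the same runtime class. The class name ErasureDemo is an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, for illustration only.
class ErasureDemo {

    // Both lists are plain ArrayList objects at runtime; the type arguments
    // exist only in the source code.
    static boolean sameRuntimeClass() {
        List<Integer> ints = new ArrayList<>();
        List<String> strings = new ArrayList<>();
        return ints.getClass() == strings.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // true
    }
}
```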
Type erasure
Java chose to generify its existing types in place, for example turning ArrayList into ArrayList<T>.
The raw type should be regarded as the common supertype of all generic instances of that type, so that every generic instance type, such as ArrayList<Integer> and ArrayList<String>, automatically becomes a subtype of ArrayList.
Java implements raw types in a simple and crude way: at compile time it turns ArrayList<Integer> back into ArrayList and automatically inserts casts and type-check instructions where elements are accessed or modified.
However, erasure only removes generic information from the bytecode in a method's Code attribute. The generic information is still kept in the class file's metadata (for example in the Signature attribute), which is why parameterized types can be obtained through reflection when coding.
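The inserted cast can be made visible by smuggling a wrong element into a list through its raw type. The sketch below is an illustration under assumed names (InsertedCastDemo, readThroughGenericView); the failure surfaces at the read, where the compiler placed the cast, not at the write.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, for illustration only.
class InsertedCastDemo {

    @SuppressWarnings({"unchecked", "rawtypes"})
    static String readThroughGenericView() {
        List<String> strings = new ArrayList<>();
        List raw = strings; // after erasure both variables are just List
        raw.add(42);        // accepted: the erased add(Object) takes any object
        try {
            // The source has no cast here, but the compiled code casts the
            // result of get(0) to String, and that cast fails.
            String s = strings.get(0);
            return "no exception, read " + s;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(readThroughGenericView()); // ClassCastException
    }
}
```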
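As a sketch of recovering that metadata, the example below reads the declared type of a field through the standard java.lang.reflect API. The class SignatureDemo and its field names are hypothetical.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Hypothetical demo class, for illustration only.
class SignatureDemo {
    List<String> names; // the declaration carries its type argument in the class file metadata

    // Returns the type argument recorded for the 'names' field.
    static String elementTypeOfNames() {
        try {
            Field field = SignatureDemo.class.getDeclaredField("names");
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            Type argument = type.getActualTypeArguments()[0];
            return argument.getTypeName();
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(elementTypeOfNames()); // java.lang.String
    }
}
```

This works for declarations (fields, method signatures, superclasses). It does not recover the type argument of an arbitrary object such as new ArrayList<String>(), because the object itself carries no such information.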
Problems
No support for primitive types
Because primitive types such as int cannot be cast to or from Object, the compiler has nowhere to insert the cast once the generic information is erased. Java therefore does not support generics over primitive types. You can only use ArrayList<Integer>, which incurs the overhead of constructing wrapper objects and of countless boxing and unboxing operations.
Verbose code
Generic type information is unavailable at runtime
Introduces ambiguity
When generics meet overloading
Autoboxing, unboxing, and enhanced for loops
An enhanced for loop is restored to an iterator-based implementation, which is why the class being traversed must implement the Iterable interface.
A variable-length parameter becomes an array parameter at the call site.
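The two methods below sketch what desugaring does to a loop over a list. The first is written with sugar; the second is roughly the form the compiler reduces it to, written out by hand. It is an approximation for illustration, not the exact code Javac emits, and the class name DesugarDemo is an assumption.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, for illustration only.
class DesugarDemo {

    // With sugar: varargs call, autoboxing, enhanced for loop, auto-unboxing.
    static int sumSugared() {
        List<Integer> list = Arrays.asList(1, 2, 3, 4);
        int sum = 0;
        for (int i : list) {
            sum += i;
        }
        return sum;
    }

    // Roughly the desugared form: an explicit array for the varargs call,
    // explicit boxing, an explicit iterator, and explicit unboxing.
    static int sumDesugared() {
        List<Integer> list = Arrays.asList(new Integer[] {
                Integer.valueOf(1), Integer.valueOf(2), Integer.valueOf(3), Integer.valueOf(4)});
        int sum = 0;
        for (Iterator<Integer> it = list.iterator(); it.hasNext(); ) {
            int i = it.next().intValue();
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumSugared());   // 10
        System.out.println(sumDesugared()); // 10
    }
}
```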
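The following snippet shows some pitfalls of autoboxing:
Integer a = 1;
Integer b = 2;
Integer c = 3;
Integer d = 3;
Integer e = 321;
Integer f = 321;
Long g = 3L;
System.out.println(c == d);          // true: values from -128 to 127 come from the Integer cache
System.out.println(e == f);          // false: 321 is outside the default cache range, so these are two objects
System.out.println(c == (a + b));    // true: the arithmetic unboxes, so int values are compared
System.out.println(c.equals(a + b)); // true: a + b is boxed to an Integer with value 3
System.out.println(g == (a + b));    // true: g is unboxed and the int 3 is widened to long
System.out.println(g.equals(a + b)); // false: a + b is boxed to Integer, not Long
The == operator on wrapper classes does not unbox automatically unless an arithmetic operation is involved, and their equals() methods do not handle conversion between numeric types. Avoid relying on autoboxing and unboxing in comparisons like these in real code.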
Conditional compilation
How Java compiles: the compiler does not compile .java files one at a time. Instead, it puts the top-level syntax tree node of every compilation unit into a to-do list and then compiles them, so the files can provide symbol information to one another.
Conditional compilation in Java: use an if statement whose condition is a constant. The condition is evaluated at compile time.
public static void main(String[] args)
{
    if (true)
    {
        System.out.println("1");
    }
    else
    {
        System.out.println("2");
    }
}
The decompilation result of the Class file after the code is compiled:
public static void main(String[] args)
{
    System.out.println("1");
}
The compiler eliminates the code in the branch whose condition does not hold.
This kind of syntactic sugar can only be written inside a method body. It achieves conditional compilation only at the level of basic blocks of statements; there is no way to adjust the structure of an entire Java class according to a condition.