Compiler Principles: 1. Introduction




1.1 Basic concepts


There are basically two ways to execute a programming language source program:

  • Translation: A translator converts the source program into a target program in a lower-level language, and the target program is then executed.
  • Interpretation: An interpreter reads the source program and executes it statement by statement.

Compiler: A program that reads a program written in one language (the source language) and translates it into an equivalent program written in another language (the target language).

Interpreter: Another common kind of language processor. It does not produce a target program by translation. From the user's point of view, the interpreter directly performs the operations specified in the source program on the input the user supplies.

Interpretive program: A kind of high-level language translator. It takes a program written in the source language as input and hands each statement to the computer for execution as soon as that statement has been interpreted, without ever producing a target program.

Compiling program: Takes a high-level language source program as input and translates it into a target program in machine language. The computer then executes the target program to obtain the result.

The difference between a compiler and an interpreter: The biggest difference is that the former generates target code, while the latter does not. A compiler is a translator that turns a source program written in a high-level language into an equivalent target program in machine language or assembly language. An interpreter is also a translator that takes a source program as input, but it executes the program as it interprets it. No target program is produced while the interpreter runs. Instead, the source program itself is interpreted and executed according to the definition of the source language.
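
The distinction can be made concrete with a minimal sketch. It assumes a toy expression language written as nested tuples and a made-up stack machine as the target. The names interpret, compile_expr, and run_target are hypothetical and chosen only for this illustration.

```python
# A tiny source language: an int, or a tuple (op, left, right) with op "+" or "*".
# Everything here is illustrative; no real compiler API is implied.

def interpret(expr):
    """Interpreter: execute the source directly; no target program is produced."""
    if isinstance(expr, int):
        return expr
    op, left, right = expr
    a, b = interpret(left), interpret(right)
    return a + b if op == "+" else a * b


def compile_expr(expr):
    """Compiler: translate the source into a target program (stack-machine code)."""
    if isinstance(expr, int):
        return [("PUSH", expr)]
    op, left, right = expr
    code = compile_expr(left) + compile_expr(right)
    code.append(("ADD",) if op == "+" else ("MUL",))
    return code


def run_target(code):
    """Stand-in for the machine that executes the target program afterwards."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()


source = ("+", 1, ("*", 2, 3))
target = compile_expr(source)   # a separate artifact that can be stored and run later
print(interpret(source))        # 7
print(run_target(target))       # 7
```

Both paths compute the same value, but only the compiler path leaves behind a target program (the list in target) that exists independently of the source.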


1.2 Modules and interfaces


Each phase of the compiler is implemented by one or more software modules. Breaking the compiler into phases like this makes its components reusable. For example, to retarget the compiler to a different machine, only the stack frame layout (Frame Layout) and instruction selection (Instruction Selection) modules need to change. To change the source language, at most the modules in front of the Translate module need to change, and the compiler can also be attached to a language-oriented syntax editor at the Abstract Syntax interface.
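
As a rough sketch of why this decomposition helps, the following assumes a deliberately trivial source language (sums of names such as "a + b") and two invented targets. The functions front_end, stack_backend, register_backend, and compile_for are hypothetical names, not part of any real compiler.

```python
def front_end(source):
    """Source-language-specific part: produce a tiny IR (a list of operand names)."""
    return [name.strip() for name in source.split("+")]


def stack_backend(ir):
    """Target-specific part for an imaginary stack machine."""
    code = [f"push {ir[0]}"]
    for name in ir[1:]:
        code += [f"push {name}", "add"]
    return code


def register_backend(ir):
    """Target-specific part for an imaginary register machine."""
    code = [f"mov r0, {ir[0]}"]
    for name in ir[1:]:
        code.append(f"add r0, {name}")
    return code


def compile_for(source, back_end):
    """Changing the target machine means swapping only the back-end module."""
    return back_end(front_end(source))


print(compile_for("a + b", stack_backend))     # ['push a', 'push b', 'add']
print(compile_for("a + b", register_backend))  # ['mov r0, a', 'add r0, b']
```

The front end is untouched when the target changes, which is the reuse the phase structure is meant to allow.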

Interfaces such as Abstract Syntax, IR Trees, and Assem take the form of data structures. For example, the parsing actions phase builds an abstract syntax data structure and passes it to the semantic analysis phase. Other interfaces are abstract data types: the Translate interface is a set of functions that the semantic analysis phase can call, and the Tokens interface is a function that the parser calls to get the next token of the input program.
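
The two kinds of interface can be sketched side by side. The example below is an illustration under simplifying assumptions: the grammar only covers sums of integers, and the names Token, Num, BinOp, make_lexer, and parse_sum are hypothetical. The token interface is a function the parser calls, while the abstract syntax is a plain data structure handed to the next phase.

```python
import re
from dataclasses import dataclass


@dataclass
class Token:
    kind: str   # "NUM", "ID", "OP", or "EOF"
    text: str


@dataclass
class Num:      # abstract syntax node: integer literal
    value: int


@dataclass
class BinOp:    # abstract syntax node: binary operation
    op: str
    left: object
    right: object


TOKEN_RE = re.compile(r"(\d+)|([A-Za-z_]\w*)|(.)")
KINDS = {1: "NUM", 2: "ID", 3: "OP"}


def make_lexer(source):
    """Return the token interface: a function that yields the next token per call."""
    pos = 0

    def next_token():
        nonlocal pos
        while pos < len(source) and source[pos].isspace():
            pos += 1                      # skip whitespace between tokens
        if pos == len(source):
            return Token("EOF", "")
        m = TOKEN_RE.match(source, pos)
        pos = m.end()
        return Token(KINDS[m.lastindex], m.group())

    return next_token


def parse_sum(next_token):
    """Parser for NUM ('+' NUM)*: pulls tokens and builds an abstract syntax tree."""
    tree = Num(int(next_token().text))
    tok = next_token()
    while tok.kind == "OP" and tok.text == "+":
        tree = BinOp("+", tree, Num(int(next_token().text)))
        tok = next_token()
    return tree


print(parse_sum(make_lexer("1 + 2 + 3")))
```

The parser never sees the source text, only the next_token function, and a later phase would see only the Num and BinOp values. Error handling is omitted: a malformed input simply raises ValueError in this sketch.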


This modular design is typical of many real compilers. However, some compilers combine syntax analysis, semantic analysis, translation, and normalization into a single phase, and others postpone instruction selection and merge it with code emission. Simple compilers usually have no dedicated control flow analysis, data flow analysis, or register allocation phases.


1.3 Compilation process


Roughly, the process of a compiler compiling a language source program is as follows:

Order  Stage                   Description
1      Lexical analysis        Break the source file into individual tokens.
2      Syntax analysis         Analyze the phrase structure of the program.
3      Semantic actions        Build the abstract syntax tree corresponding to each phrase.
4      Semantic analysis       Determine the meaning of each phrase, associate uses of variables with their declarations, check the types of expressions, and request translation of each phrase.
5      Stack frame layout      Place variables, function parameters, and so on into activation records (stack frames) in the way the machine requires.
6      Translation             Produce intermediate representation trees (IR trees), a notation that is tied neither to a particular source language nor to a particular target machine architecture.
7      Normalization           Hoist side effects out of expressions and clean up conditional branches for the convenience of the following phases.
8      Instruction selection   Group IR tree nodes into clumps that correspond to target machine instructions.
9      Control flow analysis   Analyze the sequence of instructions and build a control flow graph, which shows every flow of control the program might follow when it executes.
10     Data flow analysis      Gather information about the flow of data through the program's variables. For example, liveness analysis computes the places where each variable holds a value that is still needed (that is, where it is live).
11     Register allocation     Choose a register for each variable and temporary value in the program. Two variables that are not live at the same point can share the same register.
12     Code emission           Replace the temporary names in each machine instruction with machine registers.
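
The data flow analysis and register allocation stages can be illustrated with a minimal sketch. It makes several simplifying assumptions: the program is straight-line code (so no control flow graph is needed), each instruction is written as a pair (defined variable, used variables), and registers are handed out greedily. The names live_out_sets, interference, and allocate are hypothetical.

```python
# a = 1; b = 2; c = a + b; return c
program = [
    ("a", []),
    ("b", []),
    ("c", ["a", "b"]),
    (None, ["c"]),          # "return c" defines nothing
]


def live_out_sets(program):
    """Backward liveness: the variables still needed after each instruction."""
    live = set()
    result = [None] * len(program)
    for i in range(len(program) - 1, -1, -1):
        defined, used = program[i]
        result[i] = set(live)
        live = (live - {defined}) | set(used)
    return result


def interference(program):
    """A newly defined variable conflicts with everything live after its definition."""
    edges = set()
    for (defined, _), live_out in zip(program, live_out_sets(program)):
        if defined is None:
            continue
        for other in live_out - {defined}:
            edges.add(frozenset((defined, other)))
    return edges


def allocate(program):
    """Greedy allocation: take the lowest register not used by a conflicting variable."""
    edges = interference(program)
    assignment = {}
    for defined, _ in program:
        if defined is None or defined in assignment:
            continue
        taken = {r for v, r in assignment.items() if frozenset((defined, v)) in edges}
        reg = 0
        while reg in taken:
            reg += 1
        assignment[defined] = reg
    return assignment


print(live_out_sets(program))  # [{'a'}, {'a', 'b'}, {'c'}, set()] (set order may vary)
print(allocate(program))       # {'a': 0, 'b': 1, 'c': 0}
```

Here a and b are live at the same point and need different registers, while c becomes live only after a and b are dead, so it can reuse register 0. A real allocator works on a control flow graph and must also handle spilling, both of which this sketch leaves out.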


Origin blog.csdn.net/LYS00Q/article/details/128927914