<Self-examination study record> Course code 6370 "Compilation Technology" <2>

1.2 Overview of the compilation process
Compiling a program is the entire process from the input of the source program to the output of the target program.
The work of a compiler is divided into stages, and each stage transforms one representation of the source program into another.
The whole process can be divided into five stages: lexical analysis, syntax analysis, intermediate code generation, code optimization, and object code generation.
There are also two important tasks, table management and error handling, both of which are involved in all five stages.
Table management: store the various pieces of information about the source program, and construct, search, or update the relevant entries as each stage of compilation needs them.
Error handling: when errors are found in the source program during compilation, the compiler should report the nature and the location of each error, and confine the impact of the error to the smallest possible range so that the rest of the source program can still be compiled.
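The two supporting tasks can be illustrated with a minimal sketch. The class names `SymbolTable` and `CompileError`, the fields stored per entry, and the line and column numbers are all assumptions made for illustration; the notes do not prescribe any particular layout.

```python
from dataclasses import dataclass, field


@dataclass
class CompileError:
    line: int
    column: int
    message: str


@dataclass
class SymbolTable:
    # Maps an identifier name to the information recorded about it.
    entries: dict = field(default_factory=dict)
    # Errors are collected instead of raised, so compilation can continue.
    errors: list = field(default_factory=list)

    def declare(self, name, type_name, line, column):
        """Construct a new entry; a duplicate is reported, not fatal."""
        if name in self.entries:
            self.errors.append(
                CompileError(line, column, f"'{name}' is already declared"))
            return
        self.entries[name] = {"type": type_name, "line": line}

    def lookup(self, name, line, column):
        """Search for an entry; an unknown name is reported and None returned."""
        info = self.entries.get(name)
        if info is None:
            self.errors.append(
                CompileError(line, column, f"'{name}' is not declared"))
        return info


table = SymbolTable()
for col, name in [(11, "sum"), (15, "first"), (21, "count")]:
    table.declare(name, "real", line=1, column=col)
table.declare("sum", "real", line=2, column=5)  # duplicate: recorded as an error
table.lookup("total", line=3, column=1)         # unknown: recorded as an error
for err in table.errors:
    print(f"{err.line}:{err.column}: {err.message}")
```

Collecting errors in a list rather than stopping at the first one is one simple way to keep an error from affecting the rest of the compilation.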
The tasks of each stage are introduced below in terms of the representation that the source program is converted into at that stage:
Lexical analysis
The task of lexical analysis is to read the source program character by character, from left to right, and to scan and decompose the character stream in order to identify words. A word here is a group of logically related characters with a specific meaning: for example, an identifier is a word that starts with a letter followed by a sequence of letters and digits. Other kinds of words include operators, delimiters, and so on.
Example: begin var sum,first,count:real;
The lexical analysis stage will form the following sequence of words:
1. Reserved word begin
2. Reserved word var
3. Identifier sum
4. Comma ,
5. Identifier first
6. Comma ,
7. Identifier count
8. Colon :
9. Reserved word real
10. Semicolon ;
The spaces between these words are filtered out in the lexical analysis stage.
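The example above can be reproduced with a small scanner. This is only a sketch: the set `RESERVED` contains just the three reserved words that appear in the example, and the token kind names are chosen for illustration rather than taken from any particular compiler.

```python
import re

# Assumed subset of reserved words: only those used in the example.
RESERVED = {"begin", "var", "real"}

TOKEN_PATTERN = re.compile(r"""
    (?P<space>\s+)
  | (?P<word>[A-Za-z][A-Za-z0-9]*)
  | (?P<comma>,)
  | (?P<colon>:)
  | (?P<semicolon>;)
""", re.VERBOSE)


def tokenize(source):
    """Scan the source left to right and return (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {source[pos]!r} at position {pos}")
        kind = match.lastgroup
        text = match.group()
        pos = match.end()
        if kind == "space":
            continue  # spaces between words are filtered out
        if kind == "word":
            # A word is a reserved word if it is in the table, else an identifier.
            kind = "reserved" if text in RESERVED else "identifier"
        tokens.append((kind, text))
    return tokens


for number, (kind, text) in enumerate(
        tokenize("begin var sum,first,count:real;"), start=1):
    print(number, kind, text)
```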
Word symbols are the basic components of a language and the basic elements with which people understand and write programs. Recognizing and understanding these elements is the foundation of translation.
It is like translating from English to Chinese: if you do not understand the English words or are not familiar with how words are formed, you cannot produce a correct translation.
The lexical analysis stage follows the word-formation (lexical) rules of the language.
Syntax analysis
The task of syntax analysis is to build on lexical analysis and decompose the string of word symbols into grammatical units according to the grammar rules of the language.
This decomposition determines whether the entire input string constitutes a grammatically correct "program". Syntax analysis follows the grammar rules of the language.
Example: a+0.1/b represents an "arithmetic expression", so the task of parsing is to recognize that this symbol string belongs to the category of "arithmetic expression".
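One way to recognize that a symbol string is an arithmetic expression is a recursive-descent recognizer. The sketch below assumes a small grammar that is not given in these notes: expr → term { (+|-) term }, term → factor { (*|/) factor }, factor → identifier | number | ( expr ). The names `scan`, `Parser`, and `is_arithmetic_expression` are hypothetical.

```python
import re

OPERAND = re.compile(r"\d+(\.\d+)?|[A-Za-z][A-Za-z0-9]*")


def scan(text):
    # Numbers, identifiers, or any other single non-space character.
    return re.findall(r"\d+\.\d+|\d+|[A-Za-z][A-Za-z0-9]*|\S", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr -> term { (+|-) term }
        self.term()
        while self.peek() in ("+", "-"):
            self.advance()
            self.term()

    def term(self):
        # term -> factor { (*|/) factor }
        self.factor()
        while self.peek() in ("*", "/"):
            self.advance()
            self.factor()

    def factor(self):
        # factor -> identifier | number | ( expr )
        token = self.advance()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if token == "(":
            self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
        elif OPERAND.fullmatch(token) is None:
            raise SyntaxError(f"unexpected symbol {token!r}")


def is_arithmetic_expression(text):
    parser = Parser(scan(text))
    try:
        parser.expr()
    except SyntaxError:
        return False
    # The whole input must be consumed for the string to be accepted.
    return parser.peek() is None


print(is_arithmetic_expression("a+0.1/b"))
print(is_arithmetic_expression("a+/b"))
```

Each method corresponds to one grammatical unit, which is why this style is often used to show how a parser follows the grammar rules.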
Intermediate code generation
After the syntax analysis, some compilers convert the source program into an internal representation according to the semantics of the language, and this internal representation is called intermediate language or intermediate code.
The so-called "intermediate code" is a notation system with simple structure and clear meaning, which can be designed in various forms.
Important design principles:
1. Easy to generate.
2. It is easy to translate it into object code.
Many compilers use a "quadruple" intermediate code, which is close to a "three-address instruction".
Introduction to quadruples and triples: https://baike.baidu.com/item/%E5%9B%9B%E5%85%83%E5%BC%8F
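As an illustration, the expression a+0.1/b can be turned into quadruples of the form (operator, operand 1, operand 2, result). The sketch below assumes the expression has already been parsed into a nested tuple, and the temporary names t1, t2 are a common convention rather than something fixed by these notes.

```python
def to_quadruples(ast):
    """Translate a nested-tuple expression tree into a list of quadruples."""
    quads = []
    counter = 0

    def visit(node):
        nonlocal counter
        if isinstance(node, str):
            return node  # an identifier or constant is its own "place"
        op, left, right = node
        left_place = visit(left)
        right_place = visit(right)
        counter += 1
        temp = f"t{counter}"  # a fresh temporary holds the result
        quads.append((op, left_place, right_place, temp))
        return temp

    visit(ast)
    return quads


# a + 0.1 / b, with division binding tighter than addition (assumed tree shape)
ast = ("+", "a", ("/", "0.1", "b"))
for quad in to_quadruples(ast):
    print(quad)
# ('/', '0.1', 'b', 't1')
# ('+', 'a', 't1', 't2')
```

Each quadruple names at most three addresses (two operands and one result), which is why the form is close to a three-address instruction.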
Code optimization
This stage changes or transforms the intermediate code so that the generated object code is more efficient, saving both time and space.
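Constant folding is one simple example of such a transformation: an operation whose operands are both constants is computed at compile time instead of at run time. The sketch below works on quadruples of the form (operator, operand 1, operand 2, result). It assumes that every result name is assigned only once and that a folded result is a temporary consumed by a later quadruple; division is left alone to avoid dividing by zero at compile time. The function name `fold_constants` is hypothetical.

```python
import operator
import re

FOLDABLE = {"+": operator.add, "-": operator.sub, "*": operator.mul}
NUMBER = re.compile(r"-?\d+(\.\d+)?")


def is_number(text):
    return NUMBER.fullmatch(text) is not None


def fold_constants(quads):
    """Replace operations on two constants by their computed value."""
    known = {}   # temporary name -> constant value, kept as text
    result = []
    for op, arg1, arg2, dest in quads:
        # Substitute temporaries whose value is already known.
        arg1 = known.get(arg1, arg1)
        arg2 = known.get(arg2, arg2)
        if op in FOLDABLE and is_number(arg1) and is_number(arg2):
            # Computed at compile time, so no quadruple is emitted.
            known[dest] = str(FOLDABLE[op](float(arg1), float(arg2)))
        else:
            result.append((op, arg1, arg2, dest))
    return result


before = [("*", "2", "3", "t1"), ("+", "a", "t1", "t2")]
print(fold_constants(before))
# [('+', 'a', '6.0', 't2')]
```

The optimized sequence has one quadruple instead of two, so the object code generated from it is both shorter and faster.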
Object code generation
This stage converts the intermediate code into absolute instruction code, relocatable instruction code, or assembly instruction code for a specific machine. It is the final stage of compilation, and its work depends on the hardware architecture and on the meaning of the machine instructions. The work of this stage is very complicated: it involves the use of the functional components of the hardware, the selection of machine instructions, the allocation of storage space for variables of various data types, and the scheduling of registers and buffer registers.
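A very naive code generator can be sketched as follows. It translates each quadruple into three instructions for an imaginary machine with a single register R0. The mnemonics LOAD, STORE, ADD, SUB, MUL, and DIV are hypothetical and do not correspond to any real instruction set; a practical generator would also handle register allocation and storage layout, which this sketch ignores.

```python
# Hypothetical mnemonics for an imaginary single-register machine.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def generate(quads):
    """Emit assembly-like text for (operator, arg1, arg2, result) quadruples."""
    lines = []
    for op, arg1, arg2, dest in quads:
        lines.append(f"{'LOAD':<5} R0, {arg1}")       # R0 := arg1
        lines.append(f"{OPCODES[op]:<5} R0, {arg2}")  # R0 := R0 op arg2
        lines.append(f"{'STORE':<5} R0, {dest}")      # dest := R0
    return lines


quads = [("/", "0.1", "b", "t1"), ("+", "a", "t1", "t2")]
print("\n".join(generate(quads)))
```

Storing every result back to memory and reloading it is wasteful; keeping t1 in a register between the two quadruples is the kind of decision that register scheduling makes.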
Code generation: https://baike.baidu.com/item/%E4%BB%A3%E7%A0%81%E7%94%9F%E6%88%90
Summary
In fact, not all compilers are divided into these stages. Some compilers have no optimization requirement, so the optimization stage can be omitted. In some cases, to speed up compilation, the intermediate code generation stage can also be omitted and the target instruction code generated directly. Most practical compilers, however, follow the staged process described above.
