Selected Video Lessons Abroad: Introduction to Compilation Principles 1

Overview of compilation principles

Compilation is the translation of source code written by a programmer in a high-level language into object code, that is, executable machine code that the computer can run directly.
Compilation is done by a program called a compiler. Because the compiled program must run on a specific type of processor, how the compiler is implemented also depends on the architecture of the target machine.

What makes a good compiler?

Works correctly

The first requirement of a compiler is that a program written in the high-level language runs correctly once it has been compiled.

Detects all static errors

A compiler must detect all static errors, that is, every violation of the rules of the language. Do not expect it to catch dynamic errors: these can only be detected at run time and, if the program does not handle them, they may cause it to crash. Nor should you expect the compiler to find logic errors in the code: these do not crash the program at run time, but they make its output incorrect.
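The three kinds of error can be told apart with a small experiment. The sketch below uses Python's built-in `compile()` as a stand-in for a compiler's static check (an assumption made for illustration; Python only reports syntax errors at this point, whereas compilers for statically typed languages catch more). The helper name `classify` is hypothetical.

```python
def classify(source: str) -> str:
    """Report where (if anywhere) an error in `source` shows up."""
    try:
        # Static check: compile() parses the text without running it.
        code = compile(source, "<example>", "exec")
    except SyntaxError:
        return "static error (caught at compile time)"
    try:
        # Dynamic check: the error only appears when the code executes.
        exec(code, {})
    except Exception:
        return "dynamic error (caught only at run time)"
    return "ran without error"


print(classify("x = (1 + 2"))          # unbalanced parenthesis
print(classify("x = 1 / 0"))           # division by zero
print(classify("avg = (2 + 4) * 2"))   # logic error: should divide by 2
```

The last line runs without complaint even though the computed average is wrong, which is exactly why logic errors are beyond the reach of the compiler.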

Produces meaningful diagnostics

A compiler should produce clear and meaningful diagnostics. If errors are found at compile time, the error messages should be clear and should pinpoint the location of the error in the original source code.
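As a rough illustration of what a precise diagnostic contains, the sketch below formats the file name, line, column, message, and a caret under the offending position. It relies on the attributes of Python's `SyntaxError` (`lineno`, `offset`, `text`, `msg`); the function name `diagnose` and the output layout are assumptions, and the exact message and column vary between Python versions.

```python
def diagnose(source: str, filename: str = "<example>") -> str:
    """Return a one-error diagnostic for `source`, or 'no errors'."""
    try:
        compile(source, filename, "exec")
    except SyntaxError as err:
        line = (err.text or "").rstrip("\n")
        # offset is 1-based, so shift the caret left by one column.
        caret = " " * ((err.offset or 1) - 1) + "^"
        return (f"{filename}:{err.lineno}:{err.offset}: error: {err.msg}\n"
                f"{line}\n{caret}")
    return "no errors"


print(diagnose("x = 1\ny = = 2\n"))
```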

Generates optimal machine code

A good compiler generates machine code that is as fast and as compact as it can make it.

Compiles quickly

Compilation must be fast

Easy to use

The compiler should be simple to use

Modular

Building a compiler as a set of modules with as little coupling as possible between components takes a great deal of knowledge and experience.
This approach allows parts of the compiler to be reused across multiple source languages and multiple target machine architectures.

Documented and easy to maintain

Like any good software, a good compiler must have detailed documentation and be easy to maintain

Stages of Compilation

Lexical analysis

Think of writing a simple English sentence and then splitting it into individual words and punctuation marks
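The sentence analogy can be tried directly. This is a minimal sketch using a regular expression; the function name `split_sentence` is hypothetical.

```python
import re


def split_sentence(sentence: str) -> list[str]:
    # \w+ matches a run of letters or digits (a word);
    # [^\w\s] matches a single punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)


print(split_sentence("The cat sat, then slept."))
# ['The', 'cat', 'sat', ',', 'then', 'slept', '.']
```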

Syntax analysis

This is like checking whether those words form a grammatically valid sentence and working out its structure

Machine code generation

This stage is like translating the sentence into another language

The compiler generates machine code, the 0s and 1s that the processor understands, and it also optimizes that code for speed and size

For this reason, code generation and optimization are usually treated as a single stage
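To give a feel for what "optimizing" means, here is a sketch of one common optimization, constant folding, which evaluates constant sub-expressions at compile time. The choice of this optimization and the tuple-based tree format `(operator, left, right)` are assumptions made for illustration; the source does not name a specific technique.

```python
def fold(node):
    """Replace constant sub-expressions with their computed value."""
    if not isinstance(node, tuple):
        return node  # a number or a variable name: nothing to fold
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, int) and isinstance(right, int):
        # Both operands are known at compile time, so compute now.
        return {"+": left + right, "*": left * right}[op]
    return (op, left, right)


tree = ("+", "x", ("*", 2, 3))   # x + 2 * 3
print(fold(tree))                # ('+', 'x', 6)
```

The folded tree needs one fewer instruction at run time and takes less space, which is the speed and size trade-off described above.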

The stages of a compiler do not necessarily run strictly one after another

For example, lexical analysis and syntax analysis are performed at the same time

Lexical and syntax analysis depend on the high-level language being compiled, but they are independent of the architecture of the target machine. They are therefore called the front end of the compiler

Code generation and optimization, on the other hand, depend on the instruction set of the target machine.
This stage is therefore called the back end of the compiler.

Front end operation

The source code of the program is fed to the lexical analyzer as a stream of text. The lexical analyzer converts the words of the source program into a stream of tokens (lexical units) and, on request, passes those tokens one by one to the syntax analyzer.
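A lexical analyzer that hands out tokens one at a time maps naturally onto a Python generator. The sketch below assumes a toy language with numbers, identifiers, and a few operators; the token kinds and the names `Token`, `TOKEN_SPEC`, and `tokenize` are hypothetical.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Token kinds for a toy language (an assumption for this sketch).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/=()]"),
    ("SKIP", r"\s+"),
]
PATTERN = re.compile("|".join(f"(?P<{kind}>{rx})" for kind, rx in TOKEN_SPEC))


def tokenize(source: str) -> Iterator[Token]:
    """Yield tokens lazily; nothing is scanned until the caller asks."""
    pos = 0
    while pos < len(source):
        match = PATTERN.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        pos = match.end()
        if match.lastgroup != "SKIP":   # whitespace produces no token
            yield Token(match.lastgroup, match.group())


tokens = tokenize("total = price * 2")
print(next(tokens))   # the consumer requests the first token
print(next(tokens))   # ...and then the next one
```

Because the generator only advances when `next()` is called, scanning and parsing can be interleaved instead of running as two separate passes.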
The syntax analyzer will construct an abstract syntax tree to represent the source program

Abstract syntax tree

The abstract syntax tree is a dynamic data structure used to represent the hierarchical structure of the source program.
Once the syntax tree is built, the compiler uses it to check whether the source code complies with the grammatical rules of the programming language
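The following sketch shows one way a syntax analyzer might build such a tree and reject input that breaks the grammar. It is a recursive-descent parser for a toy expression grammar with `+`, `*`, and parentheses; the grammar, the tuple-based tree format, and the function name `parse` are all assumptions made for illustration.

```python
import re


def parse(source: str):
    """Build a tuple-based syntax tree for a toy expression grammar."""
    # Toy tokenizer: whitespace and characters outside the grammar are skipped.
    tokens = re.findall(r"\d+|[A-Za-z_]\w*|[+*()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():                      # expr := term ('+' term)*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():                      # term := factor ('*' factor)*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                    # factor := NUMBER | IDENT | '(' expr ')'
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok in ("+", "*", ")"):
            raise SyntaxError(f"unexpected {tok!r}")
        return int(tok) if tok.isdigit() else tok

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree


print(parse("a + 2 * (b + 3)"))
# ('+', 'a', ('*', 2, ('+', 'b', 3)))
```

Note how the nesting of the tuples records the hierarchy: the multiplication sits below the addition because it binds more tightly.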

When compiling, the lexical analyzer creates a symbol table at the same time. The symbol table is frequently accessed and modified at all stages of the compilation process. The symbol table contains information about the names used by the programmer in the source code, such as variable and function names.
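A symbol table can be pictured as a dictionary keyed by name. The sketch below is a deliberately small version with a single scope; the fields stored for each symbol (`kind`, `line`) and the class names are assumptions, since real compilers record much more (types, scopes, memory locations).

```python
from dataclasses import dataclass


@dataclass
class Symbol:
    name: str
    kind: str   # e.g. "variable" or "function"
    line: int   # source line where the name was declared


class SymbolTable:
    def __init__(self) -> None:
        self._symbols: dict[str, Symbol] = {}

    def declare(self, name: str, kind: str, line: int) -> Symbol:
        """Record a new name; redeclaring an existing name is an error."""
        if name in self._symbols:
            first = self._symbols[name].line
            raise NameError(f"{name!r} already declared on line {first}")
        self._symbols[name] = Symbol(name, kind, line)
        return self._symbols[name]

    def lookup(self, name: str):
        """Return the symbol for `name`, or None if it is unknown."""
        return self._symbols.get(name)


table = SymbolTable()
table.declare("total", "variable", line=1)
table.declare("compute", "function", line=3)
print(table.lookup("total"))
print(table.lookup("missing"))   # None
```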
For some compilers, the abstract syntax tree is the only intermediate representation between source code and machine code. It is the output of the syntax analyzer and the final output of the compiler front end, and it is then converted directly into machine code.
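To illustrate the direct route from tree to code, the sketch below walks a tuple-based syntax tree and emits instructions for an imaginary stack machine. The instruction names (`PUSH`, `LOAD`, `ADD`, `MUL`) and the tree format are hypothetical; a real back end would emit the binary instructions of an actual processor.

```python
def generate(node) -> list[str]:
    """Emit stack-machine instructions for a tuple-based syntax tree."""
    if isinstance(node, int):
        return [f"PUSH {node}"]      # constant operand
    if isinstance(node, str):
        return [f"LOAD {node}"]      # variable operand
    op, left, right = node
    opcode = {"+": "ADD", "*": "MUL"}[op]
    # Post-order walk: operands first, then the operator.
    return generate(left) + generate(right) + [opcode]


tree = ("+", "a", ("*", 2, "b"))     # a + 2 * b
for instruction in generate(tree):
    print(instruction)
# LOAD a
# PUSH 2
# LOAD b
# MUL
# ADD
```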

However, some compilers do more work on the front end

Before constructing the abstract syntax tree, the compiler may first construct a parse tree (also called a syntax analysis tree), a tree that mirrors the grammar rules applied while reading the source program.
The information obtained by traversing the abstract syntax tree is combined with the information in the symbol table to generate another intermediate representation of the source code
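One widely used intermediate representation is three-address code, in which every instruction has at most one operator. The source does not say which representation is meant, so treat the choice as an assumption. The sketch below combines a tuple-based syntax tree with a symbol table (here just a dictionary of declared names) to produce it; the function name `to_three_address` is hypothetical.

```python
from itertools import count


def to_three_address(tree, symbols: dict[str, str]) -> list[str]:
    """Lower a tuple-based syntax tree to three-address code."""
    code: list[str] = []
    temps = count(1)                 # numbering for temporaries t1, t2, ...

    def visit(node) -> str:
        if isinstance(node, int):
            return str(node)
        if isinstance(node, str):
            # The symbol table tells us whether the name was declared.
            if node not in symbols:
                raise NameError(f"undeclared name {node!r}")
            return node
        op, left, right = node
        lhs, rhs = visit(left), visit(right)
        temp = f"t{next(temps)}"
        code.append(f"{temp} = {lhs} {op} {rhs}")
        return temp

    visit(tree)
    return code


symbols = {"a": "int", "b": "int"}
print(to_three_address(("+", "a", ("*", 2, "b")), symbols))
# ['t1 = 2 * b', 't2 = a + t1']
```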

Origin blog.csdn.net/weixin_44522477/article/details/112060848