If you are a beginner, it is recommended to watch the following video first to deepen your understanding, and then read the content below:
https://www.bilibili.com/video/BV1N24y1B7nQ?p=7
In the video, The Cherno also shows how to convert object files into assembly code, how the CPU executes instructions, and how the compiler optimizes by removing redundant variables.
Statement:
The following content was generated by ChatGPT as a summary of the video. I hope it will be helpful to everyone.
What is a C++ compiler
A C++ compiler is a software tool that converts C++ source code into an executable program. A well-known example is MSVC, the compiler that ships with Visual Studio (Visual Studio itself is the development environment around it).
Working principle
The process can be broken down into three main stages: preprocessing, compiling, and linking.
1. Preprocessing
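The preprocessing stage handles the preprocessor directives in the source code, such as #include and #define, and replaces them with the corresponding source text: #include pastes in the contents of the named file, and #define sets up a textual substitution.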
The preprocessor can also perform conditional compilation, which includes or excludes code based on conditions defined in the code. After processing is complete, preprocessed source code is generated.
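As a rough illustration of the directives described above, the following sketch uses #define for two macros and #if / #else for conditional compilation. The names ENABLE_LOGGING, SQUARE, logging_enabled, and square_of_next are made up for this example and do not come from the video.

```cpp
// Object-like macro: every later use of ENABLE_LOGGING is replaced by 1
// before the compiler proper sees the code.
#define ENABLE_LOGGING 1

// Function-like macro: SQUARE(x) is replaced textually by ((x) * (x)).
#define SQUARE(x) ((x) * (x))

// Conditional compilation: only one of these two definitions survives
// preprocessing, depending on the value of ENABLE_LOGGING.
#if ENABLE_LOGGING
constexpr bool logging_enabled() { return true; }
#else
constexpr bool logging_enabled() { return false; }
#endif

// After preprocessing, the body reads: return ((n + 1) * (n + 1));
constexpr int square_of_next(int n) { return SQUARE(n + 1); }
```

With g++, passing -E instead of -c prints the preprocessed source, which makes it easy to see what the substitutions produced.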
2. Compilation
The compilation stage converts the preprocessed source code into intermediate code, including operations such as generating an abstract syntax tree.
The compiler performs lexical analysis and syntax analysis on the code, and performs semantic checks on the code to ensure that it conforms to the C++ language specification. The compiler then converts the intermediate code into machine code, producing object files.
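To make the lexical analysis step a little more concrete, here is a minimal sketch of a tokenizer that splits source text into identifiers, integer literals, and single-character punctuation. It is a toy under simplifying assumptions (no string literals, comments, or multi-character operators) and is not how any real C++ compiler is implemented; the function name tokenize is invented for this example.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Splits source text into tokens. Whitespace separates tokens and is dropped.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) { ++i; continue; }
        std::size_t start = i;
        if (std::isalpha(c) || c == '_') {
            // Identifier or keyword: letters, digits, and underscores.
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) ++i;
        } else if (std::isdigit(c)) {
            // Integer literal: a run of digits.
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
        } else {
            ++i;  // Any other character becomes a one-character token.
        }
        assert(i > start);  // Every iteration consumes at least one character.
        tokens.push_back(src.substr(start, i - start));
    }
    return tokens;
}
```

For instance, calling tokenize on the text "int x = 2 + 30;" yields the seven tokens int, x, =, 2, +, 30 and ;. Syntax analysis would then arrange such tokens into a tree.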
2.1 What is intermediate code?
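During the compilation phase, the C++ compiler converts the preprocessed source code into intermediate code (often called an intermediate representation). This is not the same thing as object code (Object Code): object code is the machine code stored in the object files that the compiler produces at the end of this stage.
Features:
Intermediate code is a low-level, largely platform-independent form of the program that lives inside the compiler. Some compilers can also write it out in a textual or binary format.
Before generating intermediate code, the compiler first parses the source code into an Abstract Syntax Tree (AST).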
2.2 What is AST
Concept:
An AST is a data structure used by the compiler during compilation to represent the grammatical structure of the source code.
The compiler performs a series of optimizations and transformations on the AST and the intermediate code before generating object code. These optimizations include removing redundant code, extracting common subexpressions, constant folding, and more.
The intermediate code is platform-independent because it is not yet tied to a specific CPU architecture. The object files generated from it, however, contain machine code for one particular target. In the linking stage, the linker combines these object files into an executable file, linking them with the library files for the operating system and CPU architecture to generate the final executable.
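The idea of an AST and of constant folding can be sketched with a toy expression tree. The Node type and the helper functions below are invented for this illustration, and real compilers use far richer representations. The fold function replaces any addition or multiplication whose operands are both constants with a single constant node.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// A toy AST node: a constant, a named variable, or a binary '+' / '*'.
struct Node {
    enum class Kind { Constant, Variable, Add, Mul };
    Kind kind = Kind::Constant;
    int value = 0;                   // used only when kind == Constant
    char name = '?';                 // used only when kind == Variable
    std::unique_ptr<Node> lhs, rhs;  // used only for Add and Mul
};

std::unique_ptr<Node> constant(int v) {
    auto n = std::make_unique<Node>();
    n->value = v;
    return n;
}

std::unique_ptr<Node> variable(char c) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Variable;
    n->name = c;
    return n;
}

std::unique_ptr<Node> binary(Node::Kind k, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    assert(k == Node::Kind::Add || k == Node::Kind::Mul);
    assert(l && r);
    auto n = std::make_unique<Node>();
    n->kind = k;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Constant folding: rewrite the tree bottom-up, replacing any Add/Mul whose
// operands are both constants with one Constant node holding the result.
std::unique_ptr<Node> fold(std::unique_ptr<Node> n) {
    if (n->kind != Node::Kind::Add && n->kind != Node::Kind::Mul) return n;
    n->lhs = fold(std::move(n->lhs));
    n->rhs = fold(std::move(n->rhs));
    if (n->lhs->kind == Node::Kind::Constant && n->rhs->kind == Node::Kind::Constant) {
        int v = (n->kind == Node::Kind::Add) ? n->lhs->value + n->rhs->value
                                             : n->lhs->value * n->rhs->value;
        return constant(v);
    }
    return n;
}
```

Folding the tree for x + (2 * 3) leaves the addition in place, because x is unknown at compile time, but turns the right-hand side into the constant 6.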
3. Linking
The linking phase combines multiple object and library files into a single executable.
The linker parses the symbols in the code, finds their definitions, and links them. These symbols may come from other object files or library files.
3.1 Specific examples
Suppose we have two C++ source files, main.cpp and hello.cpp. main.cpp calls a function defined in hello.cpp, so the two need to be linked to produce an executable.
First, main.cpp, with the following content:
#include <iostream>
#include "hello.h"
int main() {
    hello();
    return 0;
}
The other is hello.cpp, with the following content:
#include <iostream>
#include "hello.h"
void hello() {
    std::cout << "Hello, world!" << std::endl;
}
There is also a header file, hello.h, with the following content:
#ifndef HELLO_H
#define HELLO_H
void hello();
#endif
The code is compiled and linked with the following commands:
$ g++ -c main.cpp
$ g++ -c hello.cpp
$ g++ -o hello main.o hello.o
The first command compiles main.cpp into the object file main.o, the second command compiles hello.cpp into the object file hello.o, and the last command links the two object files to produce an executable named hello.
We can then run the program with the command ./hello, and it should print "Hello, world!" (ChatGPT said this; I haven't tested it, but the logic looks reasonable).
As this shows, during the linking phase the linker merges main.o and hello.o into one executable file. First, the linker performs symbol resolution on the object files: it finds the reference in main.o to the function hello, which is defined in hello.cpp, and then finds the matching symbol definition in hello.o. The linker then connects the reference to the definition to produce an executable.
3.2 Additional issues (problems with conflicting symbols)
Concept:
The linker also has to deal with symbol conflicts. When the same symbol is defined in more than one object file, the linker reports a duplicate-definition error because it cannot tell which definition should be used.
Solution:
To avoid this problem, C++ provides some mechanisms.
- When a variable is declared extern in a header file, the declaration only announces that the symbol exists and does not define it. The symbol is then defined in exactly one source file, and the linker resolves every reference to that single definition at link time.
- Additionally, code can be packaged into static or dynamic libraries. Static libraries are incorporated directly into the executable at the link stage, while dynamic libraries are loaded into memory at runtime, when their symbols are resolved.
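The split between declaration and definition can be sketched as follows. For the sake of a self-contained example, everything is shown in one file, and the comments mark which part would normally live in a header and which in a single source file. The names counter.h, counter.cpp, call_count, record_call, and calls_after are hypothetical.

```cpp
#include <cassert>

// --- What would go in a header such as counter.h (hypothetical) ---
// Declarations only: they tell the compiler the symbols exist, but reserve
// no storage and emit no definition. Any number of .cpp files can include
// them without causing a conflict.
extern int call_count;
void record_call();

// --- What would go in exactly one source file such as counter.cpp ---
// The single definition that every reference is linked against. Writing
// "int call_count = 0;" in the header instead would define the symbol once
// per including .cpp file, and the linker would reject the duplicates.
int call_count = 0;
void record_call() { ++call_count; }

// Small helper for trying the sketch out: resets the counter, records n
// calls, and returns the resulting count.
int calls_after(int n) {
    call_count = 0;
    for (int i = 0; i < n; ++i) record_call();
    assert(call_count == n);
    return call_count;
}
```

In a real project, main.cpp and any other source file would include the header and call record_call(), and the build would link them with the object file produced from the one file containing the definitions.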
4. Summary
The resulting executable file can be run on the computer to perform the operations described by the program.
In short, a C++ compiler works by converting source code into an executable file in three stages: preprocessing, compiling, and linking.