[C++ review 2] How the C++ compiler works

If you are a beginner, it is recommended to watch the following video first to deepen your understanding, and then read the content below:
https://www.bilibili.com/video/BV1N24y1B7nQ?p=7

In the video, The Cherno also shows how to view an object file as assembly code, how the CPU executes instructions, and how the compiler optimizes code by removing redundant variables.

Statement:

The following content was generated by ChatGPT and by summarizing the video. I hope it will be helpful to everyone.

What is a C++ compiler

A C++ compiler is a software tool that converts C++ source code into machine code; together with a linker, it produces an executable program. Well-known examples are MSVC (the compiler shipped with Visual Studio), GCC, and Clang.

Working principle

The process can be broken down into three main stages: preprocessing, compiling, and linking.

1. Preprocessing

The preprocessing stage handles the preprocessing directives in the source code, such as #include and #define: an #include directive is replaced by the contents of the named file, and each macro defined with #define is replaced by its expansion.

The preprocessor can also perform conditional compilation, which includes or excludes code based on conditions defined in the code. After processing is complete, preprocessed source code is generated.

2. Compilation

The compilation stage converts the preprocessed source code into intermediate code, including operations such as generating an abstract syntax tree.

The compiler performs lexical analysis and syntax analysis on the code, and performs semantic checks on the code to ensure that it conforms to the C++ language specification. The compiler then converts the intermediate code into machine code, producing object files.

2.1 What is intermediate code?

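During the compilation phase, the C++ compiler converts the preprocessed source code into intermediate code (an intermediate representation, IR). This is not the same as object code: object code is the machine code that the compiler writes into object files at the end of this phase.

Features:

Intermediate code is a low-level representation used inside the compiler (for example, LLVM IR in Clang or GIMPLE in GCC), and it is largely independent of the target CPU. Object code, by contrast, is binary and specific to a CPU architecture and object file format.

Before producing intermediate code, the compiler first parses the source code into an Abstract Syntax Tree (AST).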

2.2 What is AST

Concept:

An AST is a data structure used by the compiler during compilation to represent the grammatical structure of the source code.

The compiler performs a series of optimizations and transformations on the AST and on the intermediate code derived from it before generating object code. These optimizations include removing redundant code, extracting common subexpressions, constant folding, and more.

Intermediate code is largely platform-independent because it is not yet tied to a specific CPU architecture; the object files generated from it are platform-specific. In the linking stage, the linker combines these object files and links them with the library files for the target operating system and CPU architecture to generate the final executable file.

3. Linking

The linking phase combines multiple object and library files into a single executable.

The linker parses the symbols in the code, finds their definitions, and links them. These symbols may come from other object files or library files.

3.1 Specific examples

Suppose we have two C++ source code files: main.cpp and hello.cpp. main.cpp calls a function that is defined in hello.cpp, so the two files need to be linked to produce an executable.

The content of main.cpp is as follows:

#include <iostream>
#include "hello.h"

int main() {
    // hello() is only declared in hello.h; its definition lives in hello.cpp
    hello();
    return 0;
}

The other is hello.cpp, the content is as follows:

#include <iostream>
#include "hello.h"

void hello() {
    // The definition that the linker will match to the call in main.cpp
    std::cout << "Hello, world!" << std::endl;
}

There is also a header file, hello.h, with the following content:

#ifndef HELLO_H
#define HELLO_H

void hello();

#endif

The code is compiled and linked with the following commands:

$ g++ -c main.cpp
$ g++ -c hello.cpp
$ g++ -o hello main.o hello.o

The first command compiles main.cpp into the object file main.o, the second command compiles hello.cpp into the object file hello.o, and the last command links the two object files to produce an executable named hello.

We can run the program with the ./hello command, and the output should be "Hello, world!" (ChatGPT said so; I haven't tested it, but the logic looks reasonable).

It can be seen that during the linking phase, the linker merges main.o and hello.o into one executable file. First, the linker performs symbol resolution on the object files: it finds in main.o the reference to the hello function, and finds the definition of that symbol in hello.o. The linker then connects the reference to the definition to produce an executable.

3.2 Additional issues (problems with conflicting symbols)

Concept:

The linker also has to deal with symbol conflicts. When the same symbol is defined in multiple object files, the linker reports a "multiple definition" error because it cannot tell which definition should be used.

Solution:
To avoid this problem, C++ provides some mechanisms.

  • Put only a declaration in the header file (for a variable, mark it extern) and put the definition in exactly one source file. A declaration does not create a symbol definition, so every file can include the header while the linker still sees a single definition. The reference is still resolved at link time, not at runtime.
  • Static and dynamic libraries do not resolve conflicts by themselves, but they determine when symbols are bound. Code from a static library is copied directly into the executable at the link stage, while a dynamic library is loaded into memory at runtime and its symbols are bound when it is loaded.

4. Summary

The resulting executable file can be run on the computer to perform the operations described by the program.

In general, a C++ compiler toolchain converts source code into an executable file in three stages: preprocessing, compiling, and linking.


Origin blog.csdn.net/kokool/article/details/130534484