Bootstrapping a compiler (how to implement a self-hosting language)

A claim that is neither entirely right nor entirely wrong:

First, a popular opinion: underneath, Java is implemented in C and C++. This statement is neither entirely right nor entirely wrong. In the OpenJDK source code, the HotSpot virtual machine, including its traditional JIT compilers (C1 and C2) and its garbage collectors, is written mainly in C++.

But parts of today's Java are implemented in neither C nor C++. They are implemented in Java itself.

This self-implementation seems to contradict intuition, like a chicken-and-egg problem.

The key concept here is bootstrapping:

Compiler bootstrapping means writing a compiler in the same language that it compiles. It is generally a milestone in a compiler's development: the language is promoted to a self-compiling (self-hosting) language, and the compiler can from then on implement the syntax and semantics of the language in the language itself, without depending on another language.

Java is in fact a language that has been bootstrapped. Oracle has implemented, in Java, a compiler that compiles Java bytecode into machine code: Graal. As a mature modern compiler, Graal is broadly comparable in capability to top modern compilers such as Clang and GCC. Like the instruction set of physical hardware, the bytecode it consumes is tied only to the characteristics of the (virtual) machine, not to the features of any particular high-level language.

How to create a bootstrapped programming language:

1. Create a Turing-complete language X

(About Turing completeness: the Turing machine is the mathematical model that Turing proposed in his 1936 paper "On Computable Numbers, with an Application to the Entscheidungsproblem". Being a mathematical model, it is an abstract idea rather than a physical device. In the paper, Turing described what the machine is and argued that such a machine can solve any computable problem. A language is Turing complete if it can simulate a Turing machine, which is what makes it expressive enough to write a compiler in.)

2. Use another language (C, C++, or something else) to write a compiler for the language you created. This compiler can compile X, so let's call it compiler A.

3. Write the compiler's source code again, this time in X itself, and call it B. Then use A to compile B.

4. B can now compile programs written in X, including its own source code. Extend that source code so that it supports an improved language X1, then compile it with B to get B1.

5. Improve X1 to get X2 in the same way, and use B1 to compile the new compiler source to get B2.

6. Repeat the steps above, and you will find that the language can compile itself.
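The stages above can be modeled with a small Python sketch. It does not compile anything real. It only tracks which language features each compiler binary accepts and which features each compiler's source code is written in. The class names, the `build` function, and the feature names ("x-core", "generics", "closures") are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompilerSource:
    name: str
    written_in: frozenset  # language features the source code itself uses
    accepts: frozenset     # language features the described compiler can compile


@dataclass(frozen=True)
class CompilerBinary:
    name: str
    accepts: frozenset     # language features this binary can compile
    built_by: str          # name of the compiler that produced this binary


def build(compiler: CompilerBinary, source: CompilerSource) -> CompilerBinary:
    """Compile a compiler's source with an existing compiler binary."""
    missing = source.written_in - compiler.accepts
    if missing:
        raise ValueError(
            f"{compiler.name} cannot compile {source.name}: missing {sorted(missing)}"
        )
    return CompilerBinary(source.name, source.accepts, compiler.name)


# Hypothetical language versions, each a superset of the previous one.
X = frozenset({"x-core"})
X1 = X | {"generics"}
X2 = X1 | {"closures"}

# Step 2: A is written in another language and built by an existing toolchain.
a = CompilerBinary("A", accepts=X, built_by="existing C toolchain")

# Step 3: B is written in X and compiled by A.
b = build(a, CompilerSource("B", written_in=X, accepts=X))

# Step 4: B1 is still written in X, but accepts X1. B compiles it.
b1 = build(b, CompilerSource("B1", written_in=X, accepts=X1))

# Step 5: B2 may now use X1 features in its own source. B1 compiles it.
b2 = build(b1, CompilerSource("B2", written_in=X1, accepts=X2))

# Step 6: B2 can rebuild itself from its own source.
b2_again = build(b2, CompilerSource("B2", written_in=X1, accepts=X2))
print(b2_again.built_by)
```

In this model the only rule is that a compiler's source may use nothing its builder does not already accept. That is why each generation has to be written in the previous version of the language, and why compiler A is no longer needed once B exists.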

Take the common C++ language as an example:

Use C++98 to write a C++ compiler that supports C++11. When the time is ripe, use C++11 to write a C++ compiler that supports C++14, then use C++14 to write a C++ compiler that supports C++20, and so on, with each generation building the next.

Basically:

The executables A and B are independent of the programming language their source was written in, because in the end everything is translated into machine code. What a compiler does is translate a high-level language into machine code, and what actually runs when an exe executes is that machine code. In other words, the compiler itself has no direct relationship with the language in which it is written.
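As a rough illustration of this point, here is a minimal sketch of two compilers for the same tiny expression language (integers, `+`, `*`, parentheses). Both happen to be written in Python, but in very different ways, standing in for two different implementation languages. The stack-machine instructions stand in for machine code. All names here (`compile_with_ast`, `compile_by_hand`, `run`, the `PUSH`/`ADD`/`MUL` instructions) are made up for this example, and malformed input is not handled.

```python
import ast
import re


def compile_with_ast(src):
    """Compiler 1: reuses Python's own parser, then walks the syntax tree."""
    def emit(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return [("PUSH", node.value)]
        if isinstance(node, ast.BinOp) and isinstance(node.op, (ast.Add, ast.Mult)):
            op = "ADD" if isinstance(node.op, ast.Add) else "MUL"
            return emit(node.left) + emit(node.right) + [(op,)]
        raise SyntaxError("unsupported expression")
    return emit(ast.parse(src, mode="eval").body)


def compile_by_hand(src):
    """Compiler 2: a hand-written recursive-descent parser and emitter."""
    tokens = re.findall(r"\d+|[+*()]", src)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = take()
        if tok == "(":
            code = expr()
            take()  # consume the closing ")"
            return code
        return [("PUSH", int(tok))]

    def term():
        code = atom()
        while peek() == "*":
            take()
            code = code + atom() + [("MUL",)]
        return code

    def expr():
        code = term()
        while peek() == "+":
            take()
            code = code + term() + [("ADD",)]
        return code

    return expr()


def run(code):
    """The 'machine': executes instructions with no knowledge of any compiler."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
        else:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b if instr[0] == "ADD" else a * b)
    return stack.pop()


source = "2 + 3 * (4 + 1)"
code_1 = compile_with_ast(source)
code_2 = compile_by_hand(source)
print(code_1 == code_2, run(code_1))  # prints "True 17"
```

The emitted instruction list carries no trace of which compiler produced it, and `run` needs neither compiler to execute it.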


Origin blog.csdn.net/qq_36653924/article/details/128655595