Implementing the Python interpreter in Python

Introduction

Byterun is a Python interpreter implemented in Python. While working on Byterun, I was pleasantly surprised to find that the basic structure of a Python interpreter fits within 500 lines of code. In this chapter we'll walk through the structure of the interpreter and give you enough background to explore it further. The goal is not to explain every detail of the interpreter: as with programming itself and other interesting areas of computer science, you could devote years to learning more about this topic.

Byterun was written by Ned Batchelder and myself, building on the work of Paul Swartz. Its structure is similar to that of the primary implementation of Python, CPython, so understanding Byterun will help you understand interpreters in general and the CPython interpreter in particular. (If you don't know which Python you're using, it's probably CPython.) Despite its short length, Byterun is capable of running most simple Python programs. (This chapter is based on the bytecode produced by Python 3.5 and earlier; the bytecode changed somewhat in Python 3.6.)

Python interpreter

Before we begin, let's narrow down what we mean by "a Python interpreter". The word "interpreter" can be used in several different ways when discussing Python. Sometimes it refers to the Python REPL, the interactive prompt you get by typing python at the command line. Sometimes people use "the Python interpreter" more or less interchangeably with "Python" to talk about executing Python code from start to finish. In this chapter, "interpreter" has a narrower meaning: it is the last step in the process of executing a Python program.

Before the interpreter takes over, Python performs three other steps: lexing, parsing, and compiling. Together, these steps transform the programmer's source code from lines of text into structured code objects containing instructions that the interpreter can understand. The interpreter's job is to take these code objects and follow the instructions.
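
The three steps can be observed from Python itself. The following is a minimal sketch using the standard library's tokenize and ast modules together with the built-in compile; the source string and variable names are made up for illustration. Note that ast.parse does its own lexing internally, so tokenize is used here only to make the token stream visible.

```python
import ast
import io
import tokenize

source = "x = 7 + 5\n"

# Step 1: lexing turns the source text into a stream of tokens.
tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.string.strip()  # drop the NEWLINE and ENDMARKER tokens
]
print(tokens)

# Step 2: parsing turns the source into an abstract syntax tree.
tree = ast.parse(source)
print(type(tree).__name__)

# Step 3: compiling turns the tree into a code object.
code = compile(tree, "<example>", "exec")
print(type(code).__name__)

# Only now does the interpreter proper run: it executes the code object.
namespace = {}
exec(code, namespace)
print(namespace["x"])
```

Everything before the call to exec is preparation; the subject of this chapter is what happens inside that last step.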

You may be surprised that there is a compilation step when executing Python code. Python is often called an interpreted language, like Ruby and Perl, as opposed to compiled languages like C and Rust. However, this terminology is not as precise as it seems. Most interpreted languages, including Python, do have a compilation step. Python is called interpreted because, compared to a compiled language, it does relatively little work in the compilation step and relatively more in the interpreter. As you'll see later in this chapter, Python's compiler requires much less information about a program's behavior than a C compiler does.

A Python interpreter written in Python

Byterun is a Python interpreter written in Python. This may strike you as odd, but it is no stranger than writing a C compiler in C. (Indeed, the widely used gcc compiler was itself written in C.) You could write a Python interpreter in almost any language.

Writing a Python interpreter in Python has both advantages and disadvantages. The biggest disadvantage is speed: executing code with Byterun is much slower than executing it with CPython, whose interpreter is implemented in C and carefully optimized. However, Byterun was designed for learning, so speed is not important to us. The biggest advantage of using Python is that we can implement just the interpreter, without worrying about the rest of the Python runtime, particularly the object system. For example, when Byterun needs to create a class, it falls back to "real" Python. Another advantage is that Byterun is easy to understand, partly because it is written in a high-level language (Python!) that many people find easy to read. (We also leave out interpreter optimizations: once again, clarity and simplicity matter more than speed.)

Build an interpreter

Before we examine the Byterun code, we need to have some high-level understanding of the interpreter structure. How does the Python interpreter work?

The Python interpreter is a virtual machine: software that emulates a physical computer. This particular virtual machine is a stack machine: it manipulates several stacks to perform its operations (as opposed to a register machine, which writes to and reads from particular memory locations).

The Python interpreter is a bytecode interpreter: its input is a set of instructions called bytecode. When you write Python code, the lexer, parser, and compiler generate code objects for the interpreter to operate on. Each code object contains a set of instructions to be executed (the bytecode) plus other information the interpreter needs. Bytecode is an intermediate representation of Python code: it expresses the source code in a form the interpreter can understand. This is analogous to the way assembly language serves as an intermediate representation between C code and machine code.
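
To see that a code object really is bytecode plus supporting information, here is a small sketch using the built-in compile. The expression and the file name "<example>" are arbitrary choices for illustration; the exact bytes in co_code depend on the Python version, so the example does not print them.

```python
# Compile an expression to a code object without running it.
code = compile("a + b", "<example>", "eval")

# The instructions themselves are stored as raw bytes.
print(type(code.co_code).__name__, len(code.co_code))

# The "other information" includes the names the instructions refer to.
print(code.co_names)

# Handing the code object to eval asks the interpreter to execute it.
result = eval(code, {"a": 7, "b": 5})
print(result)
```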

A tiny interpreter

To make this concrete, let's start with a very minimal interpreter. It can only add numbers, and it understands just three instructions. All the code it can execute consists of these three instructions in different combinations. Here are the three instructions:

LOAD_VALUE

ADD_TWO_VALUES

PRINT_ANSWER

We don't care about lexing, parsing, or compilation in this chapter, so it doesn't matter how these instruction sets are produced. You can imagine writing 7 + 5 and having a compiler generate a combination of these three instructions. With the right compiler, you could even write Lisp syntax that compiles to the same instructions.

Suppose that

7 + 5

produces this instruction set:

what_to_execute = {

    "instructions": [("LOAD_VALUE", 0),  # the first number

                     ("LOAD_VALUE", 1),  # the second number

                     ("ADD_TWO_VALUES", None),

                     ("PRINT_ANSWER", None)],

    "numbers": [7, 5] }

The Python interpreter is a stack machine, so it must manipulate stacks to add two numbers (see the figure below). The interpreter begins by executing the first instruction, LOAD_VALUE, which pushes the first number onto the stack. Next it pushes the second number onto the stack. For the third instruction, ADD_TWO_VALUES, it pops both numbers off, adds them together, and pushes the result onto the stack. Finally, it pops the answer off the stack and prints it.

Figure: A stack machine
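
The same sequence can be traced by hand with an ordinary Python list standing in for the stack. This is only a sketch of the four steps described above; the history list is an addition that records the stack after each step.

```python
numbers = [7, 5]
stack = []
history = []  # snapshot of the stack after each instruction

stack.append(numbers[0])          # LOAD_VALUE 0
history.append(list(stack))

stack.append(numbers[1])          # LOAD_VALUE 1
history.append(list(stack))

second = stack.pop()              # ADD_TWO_VALUES pops both operands ...
first = stack.pop()
stack.append(first + second)      # ... and pushes their sum
history.append(list(stack))

answer = stack.pop()              # PRINT_ANSWER pops the result
history.append(list(stack))
print(answer)

for snapshot in history:
    print(snapshot)
```

The stack grows to two items, shrinks to one when the sum replaces the operands, and is empty again once the answer has been printed.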

The LOAD_VALUE instruction tells the interpreter to push a number onto the stack, but the instruction alone does not specify which number. Each instruction needs an extra piece of information telling the interpreter where to find the number to load. So our instruction set has two pieces: the instructions themselves, and a list of constants that the instructions will need. (In Python, what we are calling "instructions" is the bytecode, and the what_to_execute object is the code object.)

Why not just put the numbers directly in the instructions? Imagine that we were adding strings instead of numbers. We wouldn't want the strings stuffed in with the instructions, since they could be arbitrarily long. This design also means we need only one copy of each object: to add 7 + 7, for example, the constant table "numbers" could be just [7].
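
As a sketch of the 7 + 7 case, the instruction set might look like the following. This is a hypothetical instruction set in the same format as the 7 + 5 example, with both loads pointing at index 0 of the constant table.

```python
what_to_execute = {
    "instructions": [("LOAD_VALUE", 0),  # the first 7
                     ("LOAD_VALUE", 0),  # the same constant again
                     ("ADD_TWO_VALUES", None),
                     ("PRINT_ANSWER", None)],
    "numbers": [7],
}

# Resolve each LOAD_VALUE argument against the constant table.
loaded = [
    what_to_execute["numbers"][argument]
    for name, argument in what_to_execute["instructions"]
    if name == "LOAD_VALUE"
]
print(loaded)
```

Two values are loaded, but the table stores the constant only once.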

You may be wondering why any instruction other than ADD_TWO_VALUES is needed at all. Indeed, for the simple case of adding two numbers, the example is a little contrived. However, these instructions are building blocks for more complex programs. For example, with just the three instructions defined so far, we can add three numbers together, or any number of numbers, given the right sequence of instructions. The stack also provides a clean way to keep track of the interpreter's state, and it will support more complexity as we go along.

Now let's complete our interpreter. The interpreter object requires a stack, which can be represented by a list. It also requires a method to describe how to execute each instruction. For example, LOAD_VALUE will push a value onto the stack.

class Interpreter:

    def __init__(self):

        self.stack = []

    def LOAD_VALUE(self, number):

        self.stack.append(number)

    def PRINT_ANSWER(self):

        answer = self.stack.pop()

        print(answer)

    def ADD_TWO_VALUES(self):

        first_num = self.stack.pop()

        second_num = self.stack.pop()

        total = first_num + second_num

        self.stack.append(total)

These three methods implement the three instructions our interpreter understands. The interpreter needs one more piece: a way to tie everything together and actually execute it. This method, run_code, takes the what_to_execute dictionary defined above as an argument. It loops over the instructions, processes the argument to each instruction if there is one, and then calls the corresponding method on the interpreter object.

    def run_code(self, what_to_execute):

        instructions = what_to_execute["instructions"]

        numbers = what_to_execute["numbers"]

        for each_step in instructions:

            instruction, argument = each_step

            if instruction == "LOAD_VALUE":

                number = numbers[argument]

                self.LOAD_VALUE(number)

            elif instruction == "ADD_TWO_VALUES":

                self.ADD_TWO_VALUES()

            elif instruction == "PRINT_ANSWER":

                self.PRINT_ANSWER()

To test, we create an interpreter object and call run_code with the 7 + 5 instruction set defined earlier.

    interpreter = Interpreter()

    interpreter.run_code(what_to_execute)

Sure enough, it prints the answer: 12.

Even though our interpreter is very limited, this process is almost identical to how a real Python interpreter handles addition. Here, we have a few more points to note.

First, some instructions need arguments. In real Python bytecode, about half of all instructions take arguments. As in our example, the arguments are packed in with the instructions. Notice that the arguments to the instructions are different from the arguments to the methods that are called.

Second, notice that the ADD_TWO_VALUES instruction takes no argument. Instead, the values to be added are popped off the interpreter's stack. This is the defining feature of a stack-based interpreter.

Recall that, given the right set of instructions, we can add more than two numbers without changing the interpreter at all. Consider the instruction set below. What do you expect to happen? If you had a friendly compiler, what code could you write to generate this instruction set?

    what_to_execute = {

        "instructions": [("LOAD_VALUE", 0),

                         ("LOAD_VALUE", 1),

                         ("ADD_TWO_VALUES", None),

                         ("LOAD_VALUE", 2),

                         ("ADD_TWO_VALUES", None),

                         ("PRINT_ANSWER", None)],

        "numbers": [7, 5, 8] }

At this point, we can begin to see how this structure is extensible: we can add methods on the interpreter object that describe many more operations (as long as we have a compiler to hand us well-formed instruction sets).
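
To check the three-number instruction set, here is a condensed, self-contained copy of the interpreter described so far. The printed list is not part of the chapter's design; it is an assumption added so the result can be inspected after the run.

```python
class Interpreter:
    """Condensed copy of the three-instruction interpreter."""

    def __init__(self):
        self.stack = []
        self.printed = []  # added for illustration: remembers printed answers

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    def ADD_TWO_VALUES(self):
        second_num = self.stack.pop()
        first_num = self.stack.pop()
        self.stack.append(first_num + second_num)

    def PRINT_ANSWER(self):
        answer = self.stack.pop()
        self.printed.append(answer)
        print(answer)

    def run_code(self, what_to_execute):
        numbers = what_to_execute["numbers"]
        for instruction, argument in what_to_execute["instructions"]:
            if instruction == "LOAD_VALUE":
                self.LOAD_VALUE(numbers[argument])
            elif instruction == "ADD_TWO_VALUES":
                self.ADD_TWO_VALUES()
            elif instruction == "PRINT_ANSWER":
                self.PRINT_ANSWER()


what_to_execute = {
    "instructions": [("LOAD_VALUE", 0),
                     ("LOAD_VALUE", 1),
                     ("ADD_TWO_VALUES", None),
                     ("LOAD_VALUE", 2),
                     ("ADD_TWO_VALUES", None),
                     ("PRINT_ANSWER", None)],
    "numbers": [7, 5, 8],
}

interpreter = Interpreter()
interpreter.run_code(what_to_execute)  # prints 20
```

The intermediate sum 12 stays on the stack while 8 is loaded, which is why the same two-operand instruction is enough for 7 + 5 + 8.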

Variables

Next, let's add variables to our interpreter. Variables require an instruction for storing the value of a variable, STORE_NAME; an instruction for retrieving it, LOAD_NAME; and a mapping from variable names to values. For now, we'll ignore namespaces and scoping, so we can store the variable mapping on the interpreter object itself. Finally, we need to make sure that what_to_execute has a list of variable names in addition to its list of constants.

>>> def s():

...     a = 1

...     b = 2

...     print(a + b)

# a friendly compiler transforms `s` into:

    what_to_execute = {

        "instructions": [("LOAD_VALUE", 0),

                         ("STORE_NAME", 0),

                         ("LOAD_VALUE", 1),

                         ("STORE_NAME", 1),

                         ("LOAD_NAME", 0),

                         ("LOAD_NAME", 1),

                         ("ADD_TWO_VALUES", None),

                         ("PRINT_ANSWER", None)],

        "numbers": [1, 2],

        "names":   ["a", "b"] }

Our new implementation is below. To keep track of which name is bound to which value, we add an environment dictionary to the __init__ method. We also added the STORE_NAME and LOAD_NAME methods, which obtain the variable name and then set or retrieve the variable value from the environment dictionary.

The arguments to an instruction can now mean two different things: they can be an index into the "numbers" list or an index into the "names" list. The interpreter knows which one it should be by checking which instruction it is executing. We'll factor this logic, the mapping from instructions to what their arguments mean, out into a separate method.

class Interpreter:

    def __init__(self):

        self.stack = []

        self.environment = {}

    def STORE_NAME(self, name):

        val = self.stack.pop()

        self.environment[name] = val

    def LOAD_NAME(self, name):

        val = self.environment[name]

        self.stack.append(val)

    def parse_argument(self, instruction, argument, what_to_execute):

        """ Understand what the argument to each instruction means."""

        numbers = ["LOAD_VALUE"]

        names = ["LOAD_NAME", "STORE_NAME"]

        if instruction in numbers:

            argument = what_to_execute["numbers"][argument]

        elif instruction in names:

            argument = what_to_execute["names"][argument]

        return argument

    def run_code(self, what_to_execute):

        instructions = what_to_execute["instructions"]

        for each_step in instructions:

            instruction, argument = each_step

            argument = self.parse_argument(instruction, argument, what_to_execute)

            if instruction == "LOAD_VALUE":

                self.LOAD_VALUE(argument)

            elif instruction == "ADD_TWO_VALUES":

                self.ADD_TWO_VALUES()

            elif instruction == "PRINT_ANSWER":

                self.PRINT_ANSWER()

            elif instruction == "STORE_NAME":

                self.STORE_NAME(argument)

            elif instruction == "LOAD_NAME":

                self.LOAD_NAME(argument)

At only five instructions, the run_code method is starting to get tedious. If we kept this structure, we'd need one branch of the if statement for every instruction. Here, we can make use of Python's dynamic method lookup. We'll always define a method called FOO to execute the instruction called FOO, so we can use Python's getattr function to look up the method at run time instead of using the big if statement. The method that replaces run_code, named execute in the listing below, then looks like this:

    def execute(self, what_to_execute):

        instructions = what_to_execute["instructions"]

        for each_step in instructions:

            instruction, argument = each_step

            argument = self.parse_argument(instruction, argument, what_to_execute)

            bytecode_method = getattr(self, instruction)

            if argument is None:

                bytecode_method()

            else:

                bytecode_method(argument)
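
Putting the pieces of this section together gives the following self-contained sketch. It combines the methods from the earlier listings with parse_argument and the getattr-based execute, then runs the instruction set produced for the function s.

```python
class Interpreter:
    def __init__(self):
        self.stack = []
        self.environment = {}  # maps variable names to values

    def LOAD_VALUE(self, number):
        self.stack.append(number)

    def STORE_NAME(self, name):
        self.environment[name] = self.stack.pop()

    def LOAD_NAME(self, name):
        self.stack.append(self.environment[name])

    def ADD_TWO_VALUES(self):
        second_num = self.stack.pop()
        first_num = self.stack.pop()
        self.stack.append(first_num + second_num)

    def PRINT_ANSWER(self):
        print(self.stack.pop())

    def parse_argument(self, instruction, argument, what_to_execute):
        """Translate an index into the constant or the name it refers to."""
        if instruction == "LOAD_VALUE":
            return what_to_execute["numbers"][argument]
        if instruction in ("LOAD_NAME", "STORE_NAME"):
            return what_to_execute["names"][argument]
        return argument  # None for instructions without an argument

    def execute(self, what_to_execute):
        for instruction, argument in what_to_execute["instructions"]:
            argument = self.parse_argument(instruction, argument, what_to_execute)
            bytecode_method = getattr(self, instruction)  # dynamic lookup
            if argument is None:
                bytecode_method()
            else:
                bytecode_method(argument)


what_to_execute = {
    "instructions": [("LOAD_VALUE", 0),
                     ("STORE_NAME", 0),
                     ("LOAD_VALUE", 1),
                     ("STORE_NAME", 1),
                     ("LOAD_NAME", 0),
                     ("LOAD_NAME", 1),
                     ("ADD_TWO_VALUES", None),
                     ("PRINT_ANSWER", None)],
    "numbers": [1, 2],
    "names": ["a", "b"],
}

interpreter = Interpreter()
interpreter.execute(what_to_execute)  # prints 3
```

Supporting a new instruction now means adding one method; execute itself does not change.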

Real Python bytecode

Now let's abandon our little instruction set and look at real Python bytecode. The structure of the bytecode is similar to the instruction set of our little interpreter, except that the bytecode represents the instruction with a byte instead of a name. To understand its structure, we will examine the bytecode of a function. Consider the following example:

>>> def cond():

...     x = 3

...     if x < 5:

...         return 'yes'

...     else:

...         return 'no'

...

Python exposes a great deal of its internals at run time, and we can access them right from the REPL. For the function object cond, cond.__code__ is the code object associated with it, and cond.__code__.co_code is the bytecode. There is almost never a good reason to use these attributes directly when writing Python code, but they do let us get up to all sorts of mischief, and look at the internals in order to understand them.

>>> cond.__code__.co_code  # the bytecode as raw bytes

b'd\x01\x00}\x00\x00|\x00\x00d\x02\x00k\x00\x00r\x16\x00d\x03\x00Sd\x04\x00Sd\x00

   \x00S'

>>> list(cond.__code__.co_code)  # the bytecode as numbers

[100, 1, 0, 125, 0, 0, 124, 0, 0, 100, 2, 0, 107, 0, 0, 114, 22, 0, 100, 3, 0, 83, 

 100, 4, 0, 83, 100, 0, 0, 83]

When we output this bytecode directly, it looks completely incomprehensible - the only thing we understand is that it is a string of bytes. Fortunately, we have a very powerful tool at our disposal: the dis module in the Python standard library.

dis is a bytecode disassembler. The disassembler takes as input low-level code written for the machine, such as assembly code and bytecode, and outputs it in a human-readable form. When we run dis.dis, it outputs an explanation of each bytecode.

>>> dis.dis(cond)

  2           0 LOAD_CONST               1 (3)

              3 STORE_FAST               0 (x)

  3           6 LOAD_FAST                0 (x)

              9 LOAD_CONST               2 (5)

             12 COMPARE_OP               0 (<)

             15 POP_JUMP_IF_FALSE       22

  4          18 LOAD_CONST               3 ('yes')

             21 RETURN_VALUE

  6     >>   22 LOAD_CONST               4 ('no')

             25 RETURN_VALUE

             26 LOAD_CONST               0 (None)

             29 RETURN_VALUE

What does all this mean? Let's look at the first instruction, LOAD_CONST, as an example. The number in the first column (2) is the line number in our Python source code. The second column is an index into the bytecode, telling us that the LOAD_CONST instruction appears at position 0. The third column is the instruction itself, mapped to its human-readable name. The fourth column, when present, is the argument to that instruction. The fifth column, when present, is a hint about what the argument means.
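
The same columns are available programmatically through dis.get_instructions, which yields one object per instruction. The sketch below collects the offset, name, raw argument, and resolved argument for cond. The exact instructions and offsets vary between Python versions, so your output will probably not match the Python 3.5 listing above line for line.

```python
import dis


def cond():
    x = 3
    if x < 5:
        return 'yes'
    else:
        return 'no'


# One tuple per instruction: (bytecode index, name, argument, resolved value).
rows = [
    (ins.offset, ins.opname, ins.arg, ins.argval)
    for ins in dis.get_instructions(cond)
]

for offset, opname, arg, argval in rows:
    print(offset, opname, arg, argval)
```

Here offset corresponds to the second column of the dis.dis output, opname to the third, arg to the fourth, and argval to the hint in the fifth.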

Consider the first few bytes of this bytecode: [100, 1, 0, 125, 0, 0]. These 6 bytes represent two instructions with parameters. We can use dis.opname, a mapping of bytes to readable strings, to find what instructions 100 and 125 represent:

>>> dis.opname[100]

'LOAD_CONST'

>>> dis.opname[125]

'STORE_FAST'

The second and third bytes (1, 0) are the argument to LOAD_CONST, while the fifth and sixth bytes (0, 0) are the argument to STORE_FAST. Just as in our toy example, LOAD_CONST needs to know where to find its constant, and STORE_FAST needs to know the name to store under. (You can think of Python's LOAD_CONST as our toy interpreter's LOAD_VALUE, and of LOAD_FAST as LOAD_NAME.) So these six bytes represent the first line of source code, x = 3. (Why use two bytes for each argument? If Python used just one byte to locate constants and names, you could have only 256 constants or names per code object. With two bytes, you can have 256 squared, or 65,536.)
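
Here is a small sketch of how those six bytes decode. The byte values and the two opcode numbers are copied from the text above; the decode function itself is hypothetical, and it assumes the Python 3.5 layout in which each of these instructions is followed by a two-byte argument with the low byte first. Later Python versions lay out bytecode differently.

```python
# Opcode numbers and names as reported by dis.opname in the text above.
OPNAMES = {100: "LOAD_CONST", 125: "STORE_FAST"}

code_bytes = [100, 1, 0, 125, 0, 0]


def decode(code_bytes):
    """Split 3.5-style bytes into (index, name, argument) tuples.

    Assumes every instruction has a two-byte argument, which is true
    for the two opcodes in this example but not for all opcodes.
    """
    decoded = []
    for index in range(0, len(code_bytes), 3):
        opcode, low, high = code_bytes[index:index + 3]
        argument = low + (high << 8)  # low byte first, then high byte
        decoded.append((index, OPNAMES[opcode], argument))
    return decoded


decoded = decode(code_bytes)
for index, name, argument in decoded:
    print(index, name, argument)
```

The result lines up with the first two rows of the disassembly: LOAD_CONST 1 at index 0 and STORE_FAST 0 at index 3.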

Conditional statements and loop statements

So far, the interpreter has executed code simply by stepping through the instructions one by one. This is a problem: often we want to execute certain instructions many times, or skip them under certain conditions. To allow us to write loops and if statements, the interpreter must be able to jump around in the instruction set. In a sense, Python handles loops and conditionals with GOTO statements in the bytecode! Take another look at the disassembly of the function cond:

>>> dis.dis(cond)

  2           0 LOAD_CONST               1 (3)

              3 STORE_FAST               0 (x)

  3           6 LOAD_FAST                0 (x)

              9 LOAD_CONST               2 (5)

             12 COMPARE_OP               0 (<)

             15 POP_JUMP_IF_FALSE       22

  4          18 LOAD_CONST               3 ('yes')

             21 RETURN_VALUE

  6     >>   22 LOAD_CONST               4 ('no')

             25 RETURN_VALUE

             26 LOAD_CONST               0 (None)

             29 RETURN_VALUE

The conditional expression if x < 5 in the third line is compiled into four instructions: LOAD_FAST, LOAD_CONST, COMPARE_OP and POP_JUMP_IF_FALSE. x < 5 corresponds to loading x, loading 5, and comparing the two values. The instruction POP_JUMP_IF_FALSE completes this if statement. This instruction pops the value at the top of the stack. If the value is true, nothing happens. If the value is false, the interpreter will jump to another instruction.

The instruction to land on is called the jump target, and it is provided as the argument to the POP_JUMP_IF_FALSE instruction. Here, the jump target is 22. The instruction at index 22 is LOAD_CONST, on line 6 of the source code. (dis marks jump targets with >>.) If the result of x < 5 is false, the interpreter skips line 4 (return 'yes') and jumps straight to line 6 (return 'no'). Thus, the interpreter uses jump instructions to selectively skip over parts of the instruction set.

Python loops also rely on jumping. In the bytecode below, notice that the line while x < 5 generates almost the same bytecode as if x < 5 did. In both cases, the comparison is calculated and then POP_JUMP_IF_FALSE controls which instruction is executed next. At the end of line 4, the end of the loop body, the instruction JUMP_ABSOLUTE always sends the interpreter back to instruction 9 at the top of the loop. When x < 5 becomes false, POP_JUMP_IF_FALSE jumps the interpreter past the end of the loop, to instruction 34.

>>> def loop():

...      x = 1

...      while x < 5:

...          x = x + 1

...      return x

...

>>> dis.dis(loop)

  2           0 LOAD_CONST               1 (1)

              3 STORE_FAST               0 (x)

  3           6 SETUP_LOOP              26 (to 35)

        >>    9 LOAD_FAST                0 (x)

             12 LOAD_CONST               2 (5)

             15 COMPARE_OP               0 (<)

             18 POP_JUMP_IF_FALSE       34

  4          21 LOAD_FAST                0 (x)

             24 LOAD_CONST               1 (1)

             27 BINARY_ADD

             28 STORE_FAST               0 (x)

             31 JUMP_ABSOLUTE            9

        >>   34 POP_BLOCK

  5     >>   35 LOAD_FAST                0 (x)

             38 RETURN_VALUE
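
To see how jumps alone can express both if and while, here is a toy executor in the style of our earlier tiny interpreter. It is a sketch under several assumptions: instructions are (name, argument) tuples, jump targets are list indices rather than byte offsets, variable names are passed directly as arguments, COMPARE_LESS is a made-up stand-in for COMPARE_OP, and SETUP_LOOP and POP_BLOCK are left out.

```python
def run(instructions, constants):
    stack = []
    variables = {}
    position = 0  # index of the next instruction to execute
    while position < len(instructions):
        name, argument = instructions[position]
        position += 1
        if name == "LOAD_CONST":
            stack.append(constants[argument])
        elif name == "STORE_NAME":
            variables[argument] = stack.pop()
        elif name == "LOAD_NAME":
            stack.append(variables[argument])
        elif name in ("COMPARE_LESS", "BINARY_ADD"):
            right = stack.pop()
            left = stack.pop()
            stack.append(left < right if name == "COMPARE_LESS" else left + right)
        elif name == "POP_JUMP_IF_FALSE":
            if not stack.pop():
                position = argument   # jump only when the value is false
        elif name == "JUMP_ABSOLUTE":
            position = argument       # unconditional jump
        elif name == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError("unknown instruction: " + name)


cond_program = [
    ("LOAD_CONST", 0), ("STORE_NAME", "x"),          # x = 3
    ("LOAD_NAME", "x"), ("LOAD_CONST", 1),           # if x < 5:
    ("COMPARE_LESS", None), ("POP_JUMP_IF_FALSE", 8),
    ("LOAD_CONST", 2), ("RETURN_VALUE", None),       #     return 'yes'
    ("LOAD_CONST", 3), ("RETURN_VALUE", None),       # else: return 'no'
]

loop_program = [
    ("LOAD_CONST", 0), ("STORE_NAME", "x"),          # x = 1
    ("LOAD_NAME", "x"), ("LOAD_CONST", 1),           # while x < 5:
    ("COMPARE_LESS", None), ("POP_JUMP_IF_FALSE", 11),
    ("LOAD_NAME", "x"), ("LOAD_CONST", 0),           #     x = x + 1
    ("BINARY_ADD", None), ("STORE_NAME", "x"),
    ("JUMP_ABSOLUTE", 2),                            # back to the test
    ("LOAD_NAME", "x"), ("RETURN_VALUE", None),      # return x
]

cond_result = run(cond_program, [3, 5, "yes", "no"])
loop_result = run(loop_program, [1, 5])
print(cond_result, loop_result)
```

The only difference between the two programs' control flow is the backward JUMP_ABSOLUTE at the end of the loop body; the test at the top is compiled the same way in both.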

Explore bytecode

I encourage you to try running dis.dis on functions you write. Some interesting questions to explore:

What is the difference between a for loop and a while loop as far as the interpreter is concerned?

Is it possible to write two different functions that produce the same bytecode?

How does elif work? What about list comprehensions?

Frames

So far, we've learned that the Python virtual machine is a stack machine. It steps and jumps through instructions, pushing and popping values on a stack. There are still some gaps in our mental model, though. In the examples above, the last instruction is RETURN_VALUE, which corresponds to the return statement in the code. But where does the instruction return to?

To answer this question, we must add a layer of complexity: the frame. A frame is a collection of information and context for a chunk of code. Frames are created and destroyed on the fly as your Python code executes. There is one frame for each call of a function, so while each frame has exactly one code object associated with it, a code object can have many frames. If you had a function that called itself recursively ten times, you'd have eleven frames: one for each level of recursion and one for the module you started from. In general, there is a frame for each scope in a Python program: each module, each function call, and each class definition has a frame.

Frames live on the call stack, a completely different stack from the one we've been discussing so far. (The call stack is the stack you're most familiar with already: you've seen it printed out in the tracebacks of exceptions. Each line in a traceback starting with "File 'program.py'" corresponds to one frame on the call stack.) The stack we've been examining, the one the interpreter manipulates while it executes bytecode, we'll call the data stack. There is also a third stack, called the block stack. Blocks are used for certain kinds of control flow, particularly looping and exception handling. Each frame on the call stack has its own data stack and block stack.

Let's make this concrete with an example. Suppose the Python interpreter is currently executing the line marked (3) below. The interpreter is in the middle of a call to foo, which is in turn calling bar. The figure below shows a schematic of the call stack of frames, the block stacks, and the data stacks. (The code is written like a REPL session, so the functions are defined first.) At the moment we're interested in, the interpreter started with foo() at the bottom, reached into the body of foo, and then went up into bar.

>>> def bar(y):

...     z = y + 3     # <--- (3) ... and the interpreter is here.

...     return z

...

>>> def foo():

...     a = 1

...     b = 2

...     return a + bar(b) # <--- (2) ... which is returning a call to bar ...

...

>>> foo()             # <--- (1) We're in the middle of a call to foo ...

6

Figure: The call stack

At this point, the interpreter is in the middle of the function call to bar. There are three frames on the call stack: one for the module level, one for the function foo, and one for bar (see the figure above). Once bar returns, the frame associated with it is popped off the call stack and discarded.
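
Real frames can be inspected while a program runs. The sketch below uses sys._getframe, a CPython-specific function, to walk from bar's frame back through its callers via each frame's f_back reference. Only the first two names are shown, because the frames beneath foo depend on how the script is started.

```python
import sys


def bar(y):
    # Collect the name of each frame's code object, innermost first.
    names = []
    frame = sys._getframe()       # the frame currently executing bar
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back      # the frame that called this one
    return names


def foo():
    a = 1
    b = 2
    return bar(b)


call_stack = foo()
print(call_stack[:2])
```

When the code is run as a plain script, the next entry after 'foo' is '<module>', matching the three frames described above.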

The bytecode instruction RETURN_VALUE tells the interpreter to pass a value between frames. First, it pops the top value of the data stack in the frame at the top of the call stack. Then the entire frame is popped and discarded. Finally, this value is pushed onto the data stack of the next frame.
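
Those three steps can be sketched with a toy model. ToyFrame and return_value are hypothetical names used only for this illustration, and the stack contents model the moment just before bar returns in the example above: foo has pushed a and is waiting for bar(b), while bar has z on its stack.

```python
class ToyFrame:
    """A stand-in for a frame: just a name and its own data stack."""

    def __init__(self, name):
        self.name = name
        self.stack = []


def return_value(call_stack):
    value = call_stack[-1].stack.pop()   # 1. pop the top of the data stack
    call_stack.pop()                     # 2. discard the finished frame
    call_stack[-1].stack.append(value)   # 3. push onto the caller's data stack
    return value


module_frame, foo_frame, bar_frame = ToyFrame("<module>"), ToyFrame("foo"), ToyFrame("bar")
call_stack = [module_frame, foo_frame, bar_frame]

foo_frame.stack.append(1)   # a, waiting to be added to bar(b)
bar_frame.stack.append(5)   # z = y + 3, about to be returned

return_value(call_stack)
print([frame.name for frame in call_stack], foo_frame.stack)
```

After the return, foo's data stack holds both operands of a + bar(b), ready for the addition.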

When Ned Batchelder and I were writing Byterun, for a long time we had a significant error in our implementation. Instead of having one data stack on each frame, we had just one data stack on the entire virtual machine. We wrote a lot of tests, small snippets of Python code that we ran through both Byterun and the real Python interpreter to make sure the same thing happened in both. Nearly all of these tests passed. The only thing we couldn't get working was generators. Finally, by reading the CPython code more carefully, we realized the mistake (my thanks to Michael Arntzenius for his insight on this bug). Moving a data stack onto each frame fixed the problem.

Looking back on this bug, I was amazed at how little Python relies on each frame having its own data stack. Nearly all operations in the Python interpreter carefully clean up the data stack, so the fact that the frames were sharing one stack didn't matter. In the example above, when bar finishes executing, it will have left its data stack empty. Even if foo shared the same stack, foo's values would simply sit lower down and stay untouched. With generators, however, a key feature is the ability to pause a frame, return to some other frame, and later return to the generator's frame and find it in exactly the state you left it.
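
The paused frame of a generator can be observed directly in CPython through the generator's gi_frame attribute. In this sketch, counter is a made-up generator; between calls to next, its frame stays alive and keeps its local variables.

```python
def counter():
    total = 0
    while True:
        total += 1
        yield total   # pause this frame and hand control back to the caller


gen = counter()
first = next(gen)     # run the frame until the first yield
second = next(gen)    # resume the same frame where it left off

# The generator is paused again, but its frame still holds its state.
paused_total = gen.gi_frame.f_locals["total"]
print(first, second, paused_total)
```

Because the frame must survive while other frames run, any state it owns, including its data stack, cannot live in a structure shared with those other frames.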

Byterun

Now we have enough background on the Python interpreter to examine Byterun.

There are four types of objects in Byterun.

VirtualMachine class: manages the highest-level structure, particularly the call stack of frames, and contains the mapping of instructions to operations. It is a more complex version of the Interpreter object above.

Frame class: every Frame instance has one code object and manages a few other necessary bits of state, particularly the global and local namespaces, a reference to the calling frame, and the last bytecode instruction executed.

Function class: used in place of real Python functions. Recall that calling a function creates a new frame in the interpreter. We implement Function ourselves so that we control the creation of new Frames.

Block class: just wraps the three attributes of a block. (The details of blocks aren't central to the Python interpreter, so we won't spend much time on them; they're included here only because Byterun needs them.)

VirtualMachine class

Only one instance of VirtualMachine is created each time the program runs, because we only have one Python interpreter. VirtualMachine stores the call stack, the exception state, and return values while they are being passed between frames. The entry point for executing code is the run_code method, which takes a compiled code object as an argument. It starts by setting up and running a frame. That frame may create other frames; the call stack grows and shrinks as the program executes. When the first frame eventually returns, execution is finished.

class VirtualMachineError(Exception):

    pass

class VirtualMachine(object):

    def __init__(self):

        self.frames = []   # The call stack of frames.

        self.frame = None  # The current frame.

        self.return_value = None

        self.last_exception = None

    def run_code(self, code, global_names=None, local_names=None):

        """ An entry point to execute code using the virtual machine."""

        frame = self.make_frame(code, global_names=global_names, 

                                local_names=local_names)

        self.run_frame(frame)

Frame class

Next we'll write the Frame object. A frame is a collection of attributes with no methods. As mentioned above, the attributes include the code object created by the compiler; the local, global, and builtin namespaces; a reference to the previous frame; a data stack; a block stack; and the last instruction executed. (We have to do a little extra work to get to the builtin namespace, because Python treats this namespace differently in different modules; this detail is not important to the virtual machine.)

class Frame(object):

    def __init__(self, code_obj, global_names, local_names, prev_frame):

        self.code_obj = code_obj

        self.global_names = global_names

        self.local_names = local_names

        self.prev_frame = prev_frame

        self.stack = []

        if prev_frame:

            self.builtin_names = prev_frame.builtin_names

        else:

            self.builtin_names = local_names['__builtins__']

            if hasattr(self.builtin_names, '__dict__'):

                self.builtin_names = self.builtin_names.__dict__

        self.last_instruction = 0

        self.block_stack = []
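
The extra work for the builtin namespace is the hasattr check in Frame.__init__. In CPython, the value stored under '__builtins__' may be either the builtins module or that module's dictionary, depending on the module, and the documentation describes this as an implementation detail. The sketch below isolates that normalization step; builtins_as_dict is a hypothetical helper name, not part of Byterun.

```python
import builtins


def builtins_as_dict(value):
    """Return a plain namespace dict whether given a module or a dict."""
    if hasattr(value, '__dict__'):   # a module has a __dict__ ...
        return value.__dict__
    return value                     # ... a dict does not


from_module = builtins_as_dict(builtins)        # module form
from_dict = builtins_as_dict(vars(builtins))    # dict form

print(from_module['len'] is len, from_dict['len'] is len)
```

Either way, the frame ends up with a dictionary in which built-in names such as len can be looked up.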

Next, we'll add frame manipulation to the virtual machine. There are three helper functions for frames: one to create new frames (which is responsible for sorting out the namespaces for the new frame) and one each to push and pop frames on and off the frame stack. A fourth function, run_frame, does the main work of executing a frame. We'll come back to it soon.

class VirtualMachine(object):

    [... abridged...]

    # Frame manipulation

    def make_frame(self, code, callargs={}, global_names=None, local_names=None):

        if global_names is not None:

            if local_names is None:

                local_names = global_names

        elif self.frames:

            global_names = self.frame.global_names

            local_names = {}

        else:

            global_names = local_names = {

                '__builtins__': __builtins__,

                '__name__': '__main__',

                '__doc__': None,

                '__package__': None,

            }

        local_names.update(callargs)

        frame = Frame(code, global_names, local_names, self.frame)

        return frame

    def push_frame(self, frame):

        self.frames.append(frame)

        self.frame = frame

    def pop_frame(self):

        self.frames.pop()

        if self.frames:

            self.frame = self.frames[-1]

        else:

            self.frame = None

    def run_frame(self, frame):

        pass

        # we'll come back to this shortly

Function class

The implementation of Function is a bit convoluted, but most of the details are not important to understanding the interpreter. The important thing is that when the function is called - that is, the __call__ method is called - it creates a new Frame and runs it.

import inspect

import types

class Function(object):

    """

    Create a realistic function object, defining the things the interpreter expects.

    """

    __slots__ = [

        'func_code', 'func_name', 'func_defaults', 'func_globals',

        'func_locals', 'func_dict', 'func_closure',

        '__name__', '__dict__', '__doc__',

        '_vm', '_func',

    ]

    def __init__(self, name, code, globs, defaults, closure, vm):

        """You don't need to follow this closely to understand the interpreter."""

        self._vm = vm

        self.func_code = code

        self.func_name = self.__name__ = name or code.co_name

        self.func_defaults = tuple(defaults)

        self.func_globals = globs

        self.func_locals = self._vm.frame.local_names

        self.__dict__ = {}

        self.func_closure = closure

        self.__doc__ = code.co_consts[0] if code.co_consts else None

        # Sometimes, we need a real Python function.  This is for that.

        kw = {

            'argdefs': self.func_defaults,

        }

        if closure:

            kw['closure'] = tuple(make_cell(0) for _ in closure)

        self._func = types.FunctionType(code, globs, **kw)

    def __call__(self, *args, **kwargs):

        """When calling a Function, make a new frame and run it."""

        callargs = inspect.getcallargs(self._func, *args, **kwargs)

        # Use callargs to provide a mapping of arguments: values to pass into the new 

        # frame.

        frame = self._vm.make_frame(

            self.func_code, callargs, self.func_globals, {}

        )

        return self._vm.run_frame(frame)

def make_cell(value):

    """Create a real Python closure and grab a cell."""

    # Thanks to Alex Gaynor for help with this bit of twistiness.

    fn = (lambda x: lambda: x)(value)

    return fn.__closure__[0]
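Two standard-library pieces do most of the work in Function, and they can be tried on their own. The sketch below assumes nothing beyond the code above: inspect.getcallargs maps call arguments to parameter names (the Python documentation marks it as deprecated in favor of Signature.bind, but it is what the chapter's code uses), and make_cell produces a real cell object. The greet function is a made-up example.

```python
import inspect


def make_cell(value):
    """Create a real Python closure and grab a cell (same as above)."""
    fn = (lambda x: lambda: x)(value)
    return fn.__closure__[0]


def greet(name, punctuation="!"):
    """Made-up function used only to show argument binding."""
    return name + punctuation


# Bind positional arguments and defaults to parameter names,
# exactly what Function.__call__ needs to seed the new frame's locals.
callargs = inspect.getcallargs(greet, "Ada")

# A cell wraps a single value captured by a closure.
cell = make_cell(42)
print(callargs, cell.cell_contents)
```

The dictionary returned by getcallargs is what make_frame merges into the new frame's local namespace.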

Next, returning to the VirtualMachine object, we also add some helper methods for data stack operations. The stack for bytecode operations is always the data stack of the current frame. These helper functions make our implementation of POP_TOP, LOAD_FAST, and other instructions that operate on the stack more readable.

class VirtualMachine(object):

    [... abridged...]

    # Data stack manipulation

    def top(self):

        return self.frame.stack[-1]

    def pop(self):

        return self.frame.stack.pop()

    def push(self, *vals):

        self.frame.stack.extend(vals)

    def popn(self, n):

        """Pop a number of values from the value stack.

        A list of `n` values is returned, the deepest value first.

        """

        if n:

            ret = self.frame.stack[-n:]

            self.frame.stack[-n:] = []

            return ret

        else:

            return []
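The ordering returned by popn matters for binary operations, and the `if n` guard is not decorative. The following standalone sketch, which uses a plain module-level list instead of a frame (an assumption made to keep it short), shows both points.

```python
stack = []  # stands in for self.frame.stack


def push(*vals):
    stack.extend(vals)


def popn(n):
    """Pop n values; the deepest value comes first in the result."""
    if n:
        ret = stack[-n:]
        stack[-n:] = []
        return ret
    # Without this branch, stack[-0:] would mean stack[0:], i.e. everything.
    return []


push(19, 5)
x, y = popn(2)       # x is the deeper value, y was on top
result = x % y

push(1, 2, 3)
nothing = popn(0)    # must leave the stack untouched
```

Because the deepest value comes first, `x, y = popn(2)` yields the operands in the order they were written in the source expression.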

Before we can run the frame, we need two more methods.

The first method, parse_byte_and_args, reads the next bytecode from the current frame, checks whether it takes an argument, and if so parses that argument. It also updates the frame's last_instruction attribute, which tracks the interpreter's position in the bytecode. An instruction without an argument is one byte long, while an instruction with an argument is three bytes long: one byte for the opcode and two for the argument. The meaning of the argument depends on the instruction. For example, as mentioned earlier, the argument of POP_JUMP_IF_FALSE is the jump target; for BUILD_LIST it is the number of elements in the list; and for LOAD_CONST it is an index into the list of constants.

Some instructions take simple numbers as arguments. For others, the virtual machine has to do a little work to discover what the argument means. The dis module in the standard library exposes a cheat sheet that explains which arguments mean what, which keeps our code cleaner. For example, the list dis.hasname tells us that the arguments of LOAD_NAME, IMPORT_NAME, LOAD_GLOBAL, and nine other instructions all have the same meaning: each is an index into the list of names in the code object.

class VirtualMachine(object):

    [... abridged...]

    def parse_byte_and_args(self):

        f = self.frame

        opoffset = f.last_instruction

        byteCode = f.code_obj.co_code[opoffset]

        f.last_instruction += 1

        byte_name = dis.opname[byteCode]

        if byteCode >= dis.HAVE_ARGUMENT:

            # index into the bytecode

            arg = f.code_obj.co_code[f.last_instruction:f.last_instruction+2]  

            f.last_instruction += 2   # advance the instruction pointer

            arg_val = arg[0] + (arg[1] * 256)

            if byteCode in dis.hasconst:   # Look up a constant

                arg = f.code_obj.co_consts[arg_val]

            elif byteCode in dis.hasname:  # Look up a name

                arg = f.code_obj.co_names[arg_val]

            elif byteCode in dis.haslocal: # Look up a local name

                arg = f.code_obj.co_varnames[arg_val]

            elif byteCode in dis.hasjrel:  # Calculate a relative jump

                arg = f.last_instruction + arg_val

            else:

                arg = arg_val

            argument = [arg]

        else:

            argument = []

        return byte_name, argument
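The argument decoding can be checked by hand on a few bytes. The sketch below is an illustration under stated assumptions: the opcode numbers and the HAVE_ARGUMENT threshold are written out as a small hypothetical table that mirrors the Python 3.5-era layout this chapter targets, rather than read from dis, because the real values differ between Python versions.

```python
# Hypothetical opcode table for illustration only; the real numbers live in
# dis.opname / dis.HAVE_ARGUMENT and change between Python versions.
HAVE_ARGUMENT = 90
OPNAMES = {1: "POP_TOP", 100: "LOAD_CONST"}


def decode(code, offset):
    """Return (name, arg_or_None, next_offset) for the instruction at offset."""
    opcode = code[offset]
    offset += 1
    if opcode >= HAVE_ARGUMENT:
        low, high = code[offset], code[offset + 1]
        arg = low + high * 256   # two bytes, least significant byte first
        offset += 2
    else:
        arg = None
    return OPNAMES[opcode], arg, offset


# LOAD_CONST with argument bytes (2, 1), followed by a one-byte POP_TOP.
code = bytes([100, 2, 1, 1])
first = decode(code, 0)
second = decode(code, first[2])

# The same 16-bit value, computed with the standard library.
same_value = int.from_bytes(code[1:3], "little")
```

The expression `arg[0] + (arg[1] * 256)` in parse_byte_and_args is this same little-endian decoding, done inline.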

The next method is dispatch, which looks up the operation for a given instruction and performs it. In CPython, this dispatch is implemented as a huge switch statement spanning over 1,500 lines of code. Fortunately, since we are writing Python, our code can be much more compact. We define a method for each bytecode name and then use getattr to look it up. As in our small interpreter earlier, if an instruction is named FOO_BAR, its corresponding method is byte_FOO_BAR. For now, treat these methods as black boxes. Each instruction method returns either None or a string, called why, which is an extra piece of state the virtual machine needs in some cases. These return values are used only as internal indicators of interpreter state and must not be confused with the return values of executing frames.

class VirtualMachine(object):

    [... abridged...]

    def dispatch(self, byte_name, argument):

        """ Dispatch by bytename to the corresponding methods.

        Exceptions are caught and set on the virtual machine."""

        # When later unwinding the block stack,

        # we need to keep track of why we are doing it.

        why = None

        try:

            bytecode_fn = getattr(self, 'byte_%s' % byte_name, None)

            if bytecode_fn is None:

                if byte_name.startswith('UNARY_'):

                    self.unaryOperator(byte_name[6:])

                elif byte_name.startswith('BINARY_'):

                    self.binaryOperator(byte_name[7:])

                else:

                    raise VirtualMachineError(

                        "unsupported bytecode type: %s" % byte_name

                    )

            else:

                why = bytecode_fn(*argument)

        except:

            # deal with exceptions encountered while executing the op.

            self.last_exception = sys.exc_info()[:2] + (None,)

            why = 'exception'

        return why

    def run_frame(self, frame):

        """Run a frame until it returns (somehow).

        Exceptions are raised, the return value is returned.

        """

        self.push_frame(frame)

        while True:

            byte_name, arguments = self.parse_byte_and_args()

            why = self.dispatch(byte_name, arguments)

            # Deal with any block management we need to do

            while why and frame.block_stack:

                why = self.manage_block_stack(why)

            if why:

                break

        self.pop_frame()

        if why == 'exception':

            exc, val, tb = self.last_exception

            e = exc(val)

            e.__traceback__ = tb

            raise e

        return self.return_value
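The getattr-based dispatch and the role of the why value can be seen in a much smaller setting. The sketch below is a hypothetical miniature written for this illustration (TinyVM and its instruction list format are not part of Byterun): instructions are already decoded into (name, argument list) pairs, there are no frames or blocks, and exceptions are not caught.

```python
class TinyVM:
    """Hypothetical miniature of the dispatch loop; not Byterun itself."""

    def __init__(self):
        self.stack = []
        self.return_value = None

    def byte_LOAD_CONST(self, const):
        self.stack.append(const)

    def byte_BINARY_ADD(self):
        y = self.stack.pop()
        x = self.stack.pop()
        self.stack.append(x + y)

    def byte_RETURN_VALUE(self):
        self.return_value = self.stack.pop()
        return "return"   # a why code: tells the loop to stop

    def dispatch(self, byte_name, argument):
        # Look the handler up by name instead of using a big switch.
        bytecode_fn = getattr(self, "byte_%s" % byte_name, None)
        if bytecode_fn is None:
            raise ValueError("unsupported bytecode type: %s" % byte_name)
        return bytecode_fn(*argument)

    def run(self, instructions):
        for byte_name, argument in instructions:
            why = self.dispatch(byte_name, argument)
            if why:       # None means "keep going"
                break
        return self.return_value


program = [
    ("LOAD_CONST", [7]),
    ("LOAD_CONST", [5]),
    ("BINARY_ADD", []),
    ("RETURN_VALUE", []),
]
result = TinyVM().run(program)
```

Note that the string "return" produced by byte_RETURN_VALUE only steers the loop; the value handed back to the caller is the separate return_value attribute.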

Block class

Before we implement the methods for each bytecode instruction, let's briefly discuss blocks. A block is used for certain kinds of control flow, specifically exception handling and looping. The block is responsible for making sure that the data stack is in the appropriate state when the operation is finished. For example, in a loop, a special iterator object remains on the stack while the loop is running and is popped off when the loop finishes. The interpreter must keep track of whether the loop is continuing or has finished.

To keep track of this extra information, the interpreter sets a flag to indicate its state. We implement this flag as a variable called why, which can be None or one of the strings "continue", "break", "exception", or "return". It indicates what kind of manipulation of the block stack and data stack should happen. Returning to the iterator example, if the top of the block stack is a loop block and the why code is "continue", the iterator should stay on the data stack; if why is "break", it should be popped off.

The details of block operations are more tedious than this and we won't spend time on them, but it's worth a closer look for interested readers.

Block = collections.namedtuple("Block", "type, handler, stack_height")

class VirtualMachine(object):

    [... abridged...]

    # Block stack manipulation

    def push_block(self, b_type, handler=None):

        stack_height = len(self.frame.stack)

        self.frame.block_stack.append(Block(b_type, handler, stack_height))

    def pop_block(self):

        return self.frame.block_stack.pop()

    def unwind_block(self, block):

        """Unwind the values on the data stack corresponding to a given block."""

        if block.type == 'except-handler':

            # The exception itself is on the stack as type, value, and traceback.

            offset = 3  

        else:

            offset = 0

        while len(self.frame.stack) > block.stack_height + offset:

            self.pop()

        if block.type == 'except-handler':

            traceback, value, exctype = self.popn(3)

            self.last_exception = exctype, value, traceback

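    def manage_block_stack(self, why):

        """Manage the current frame's block stack.

        Manipulate the block stack and data stack for looping,

        exception handling, or returning, according to `why`."""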

        frame = self.frame

        block = frame.block_stack[-1]

        if block.type == 'loop' and why == 'continue':

            self.jump(self.return_value)

            why = None

            return why

        self.pop_block()

        self.unwind_block(block)

        if block.type == 'loop' and why == 'break':

            why = None

            self.jump(block.handler)

            return why

        if (block.type in ['setup-except', 'finally'] and why == 'exception'):

            self.push_block('except-handler')

            exctype, value, tb = self.last_exception

            self.push(tb, value, exctype)

            self.push(tb, value, exctype) # yes, twice

            why = None

            self.jump(block.handler)

            return why

        elif block.type == 'finally':

            if why in ('return', 'continue'):

                self.push(self.return_value)

            self.push(why)

            why = None

            self.jump(block.handler)

            return why

        return why
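The core idea behind unwinding, recording the data stack height when a block is entered and trimming back to it when the block is left, can be shown without a virtual machine. The sketch below uses plain lists and a made-up handler offset of 40; it covers only the loop/break case and ignores exception handlers.

```python
import collections

Block = collections.namedtuple("Block", "type, handler, stack_height")

data_stack = ["caller value"]
block_stack = []

# Entering a loop: remember how tall the data stack is right now.
# The handler offset 40 is an arbitrary value chosen for this example.
block_stack.append(Block("loop", 40, len(data_stack)))

# While the loop runs, the iterator and a temporary sit on the data stack.
data_stack.extend(["<iterator>", "temporary"])


def unwind(block):
    """Drop everything pushed since the block was entered."""
    while len(data_stack) > block.stack_height:
        data_stack.pop()


# A 'break': pop the loop block, unwind, and continue at its handler.
block = block_stack.pop()
unwind(block)
next_instruction = block.handler
```

After the break, the data stack holds exactly what it held before the loop started, which is the invariant blocks exist to protect.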

Instructions

All that's left is to implement the methods for the instructions themselves: byte_LOAD_FAST, byte_BINARY_MODULO, and so on. The implementations of these instructions are not very interesting, so we show only a handful here; the complete implementation is on GitHub. (The instructions included here are enough to execute all the code samples disassembled earlier.)

class VirtualMachine(object):

    [... abridged...]

    ## Stack manipulation

    def byte_LOAD_CONST(self, const):

        self.push(const)

    def byte_POP_TOP(self):

        self.pop()

    ## Names

    def byte_LOAD_NAME(self, name):

        frame = self.frame

        if name in frame.local_names:

            val = frame.local_names[name]

        elif name in frame.global_names:

            val = frame.global_names[name]

        elif name in frame.builtin_names:

            val = frame.builtin_names[name]

        else:

            raise NameError("name '%s' is not defined" % name)

        self.push(val)

    def byte_STORE_NAME(self, name):

        self.frame.local_names[name] = self.pop()

    def byte_LOAD_FAST(self, name):

        if name in self.frame.local_names:

            val = self.frame.local_names[name]

        else:

            raise UnboundLocalError(

                "local variable '%s' referenced before assignment" % name

            )

        self.push(val)

    def byte_STORE_FAST(self, name):

        self.frame.local_names[name] = self.pop()

    def byte_LOAD_GLOBAL(self, name):

        f = self.frame

        if name in f.global_names:

            val = f.global_names[name]

        elif name in f.builtin_names:

            val = f.builtin_names[name]

        else:

            raise NameError("global name '%s' is not defined" % name)

        self.push(val)

    ## Operators

    BINARY_OPERATORS = {

        'POWER':    pow,

        'MULTIPLY': operator.mul,

        'FLOOR_DIVIDE': operator.floordiv,

        'TRUE_DIVIDE':  operator.truediv,

        'MODULO':   operator.mod,

        'ADD':      operator.add,

        'SUBTRACT': operator.sub,

        'SUBSCR': operator.getitem,

        'LSHIFT':   operator.lshift,

        'RSHIFT':   operator.rshift,

        'AND':      operator.and_,

        'XOR':      operator.xor,

        'OR':       operator.or_,

    }

    def binaryOperator(self, op):

        x, y = self.popn(2)

        self.push(self.BINARY_OPERATORS[op](x, y))

    COMPARE_OPERATORS = [

        operator.lt,

        operator.le,

        operator.eq,

        operator.ne,

        operator.gt,

        operator.ge,

        lambda x, y: x in y,

        lambda x, y: x not in y,

        lambda x, y: x is y,

        lambda x, y: x is not y,

        lambda x, y: issubclass(x, Exception) and issubclass(x, y),

    ]

    def byte_COMPARE_OP(self, opnum):

        x, y = self.popn(2)

        self.push(self.COMPARE_OPERATORS[opnum](x, y))

    ## Attributes and indexing

    def byte_LOAD_ATTR(self, attr):

        obj = self.pop()

        val = getattr(obj, attr)

        self.push(val)

    def byte_STORE_ATTR(self, name):

        val, obj = self.popn(2)

        setattr(obj, name, val)

    ## Building

    def byte_BUILD_LIST(self, count):

        elts = self.popn(count)

        self.push(elts)

    def byte_BUILD_MAP(self, size):

        self.push({})

    def byte_STORE_MAP(self):

        the_map, val, key = self.popn(3)

        the_map[key] = val

        self.push(the_map)

    def byte_LIST_APPEND(self, count):

        val = self.pop()

        the_list = self.frame.stack[-count] # peek

        the_list.append(val)

    ## Jumps

    def byte_JUMP_FORWARD(self, jump):

        self.jump(jump)

    def byte_JUMP_ABSOLUTE(self, jump):

        self.jump(jump)

    def byte_POP_JUMP_IF_TRUE(self, jump):

        val = self.pop()

        if val:

            self.jump(jump)

    def byte_POP_JUMP_IF_FALSE(self, jump):

        val = self.pop()

        if not val:

            self.jump(jump)

    ## Blocks

    def byte_SETUP_LOOP(self, dest):

        self.push_block('loop', dest)

    def byte_GET_ITER(self):

        self.push(iter(self.pop()))

    def byte_FOR_ITER(self, jump):

        iterobj = self.top()

        try:

            v = next(iterobj)

            self.push(v)

        except StopIteration:

            self.pop()

            self.jump(jump)

    def byte_BREAK_LOOP(self):

        return 'break'

    def byte_POP_BLOCK(self):

        self.pop_block()

    ## Functions

    def byte_MAKE_FUNCTION(self, argc):

        name = self.pop()

        code = self.pop()

        defaults = self.popn(argc)

        globs = self.frame.global_names

        fn = Function(name, code, globs, defaults, None, self)

        self.push(fn)

    def byte_CALL_FUNCTION(self, arg):

        lenKw, lenPos = divmod(arg, 256) # KWargs not supported here

        posargs = self.popn(lenPos)

        func = self.pop()

        frame = self.frame

        retval = func(*posargs)

        self.push(retval)

    def byte_RETURN_VALUE(self):

        self.return_value = self.pop()

        return "return"
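The jump instructions above call self.jump, which does not appear in this excerpt. A plausible minimal version, assuming a jump simply resets the frame's last_instruction so that parse_byte_and_args reads from the target next, might look like the sketch below. SketchFrame and SketchVM are hypothetical stand-ins, and the offsets 22 and 99 are arbitrary.

```python
class SketchFrame:
    """Hypothetical frame with just the attributes this sketch needs."""

    def __init__(self):
        self.stack = []
        self.last_instruction = 0


class SketchVM:
    def __init__(self):
        self.frame = SketchFrame()

    def jump(self, target):
        """Assumed behavior: make `target` the next offset to execute."""
        self.frame.last_instruction = target

    def byte_POP_JUMP_IF_FALSE(self, target):
        val = self.frame.stack.pop()
        if not val:
            self.jump(target)


vm = SketchVM()
vm.frame.stack.append(False)
vm.byte_POP_JUMP_IF_FALSE(22)   # falsy value: the jump is taken
after_false = vm.frame.last_instruction

vm.frame.stack.append(True)
vm.byte_POP_JUMP_IF_FALSE(99)   # truthy value: position is unchanged
after_true = vm.frame.last_instruction
```

Under this assumption, a jump is nothing more than an assignment to the instruction pointer; the main loop in run_frame needs no special handling for it.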

Dynamic typing: what the compiler doesn't know

You may have heard that Python is a dynamic language, and in particular that it is dynamically typed. The work we've done on the interpreter so far sheds some light on what that means.

One sense of "dynamic" is that a lot of work is done at run time. We saw earlier that the Python compiler doesn't have much information about what the code actually does. As an example, consider the simple function mod below. It takes two arguments and returns the first modulo the second. In its bytecode, we see that the variables a and b are loaded first, and then the bytecode BINARY_MODULO performs the modulo operation itself.

>>> def mod(a, b):

...    return a % b

>>> dis.dis(mod)

  2           0 LOAD_FAST                0 (a)

              3 LOAD_FAST                1 (b)

              6 BINARY_MODULO

              7 RETURN_VALUE

>>> mod(19, 5)

4

Calculating 19 % 5 gives us 4 - not surprising at all. What if we use different types of parameters?

>>> mod("by%sde", "teco")

'bytecode'

What just happened? You have probably seen this syntax before, used to format strings:

>>> print("by%sde" % "teco")

bytecode

Using the % symbol to format a string also invokes the bytecode BINARY_MODULO. This instruction applies the % operation to the top two values on the stack, whether they are strings, integers, or instances of a class you defined yourself. The bytecode was generated when the function was compiled (in effect, when it was defined), and the same bytecode is used for arguments of different types.

Python's compiler knows very little about the effect the bytecode will have. It is up to the interpreter to determine the type of the object that BINARY_MODULO is operating on and do the right thing for that type. This is why Python is described as dynamically typed: you don't know the types of a function's arguments until you actually run it. By contrast, in a statically typed language, the programmer tells the compiler up front what the argument types will be (or the compiler infers them itself).

Compiler ignorance is a challenge in optimizing Python - by just looking at the bytecode without actually running it, you don't know what each bytecode is doing! You can define a class that implements the __mod__ method, and Python will automatically call this method when you use % on an instance of this class. Therefore, BINARY_MODULO can actually run any code.

Looking at the code below, the first a % b looks useless.

def mod(a, b):

    a % b

    return a % b

Unfortunately, static analysis of this code (the kind you can do without running it) cannot be certain that the first a % b really does nothing. Calling __mod__ with % might write to a file, interact with another part of the program, or do literally anything else that is possible in Python. It is hard to optimize a function when you don't know what it does. In Russell Power and Alex Rubinsteyn's excellent paper "How Fast Can We Interpret Python?", they note, "In the general absence of type information, each instruction must be treated as INVOKE_ARBITRARY_METHOD."
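A small experiment makes the point tangible. The Noisy class below is a made-up example whose __mod__ records every call in a list; with it, the seemingly useless first a % b has an observable effect, so removing it would change the program's behavior.

```python
calls = []  # records every invocation of Noisy.__mod__


class Noisy:
    """Made-up class whose % operator has a visible side effect."""

    def __init__(self, value):
        self.value = value

    def __mod__(self, other):
        calls.append((self.value, other))   # the side effect
        return self.value % other


def mod(a, b):
    a % b            # looks useless, but may run arbitrary code
    return a % b


result = mod(Noisy(19), 5)
print(result, calls)
```

The function still returns 4, but the list now holds two entries, one for each time the % operation ran.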

Summary

Byterun is a compact Python interpreter that is easier to understand than CPython. Byterun replicates CPython's main structure: a stack-based interpreter operates on a set of instructions called bytecode, stepping or jumping through those instructions while pushing data onto a stack and popping data from it. The interpreter creates, destroys, and jumps between frames as functions and generators are called and return. Byterun also shares the real interpreter's limitations: because Python uses dynamic typing, the interpreter must determine the correct behavior of each instruction at run time.

I encourage you to disassemble your program and run it using Byterun. You'll quickly discover instructions that are not implemented in this shortened version of Byterun. The complete implementation is at https://github.com/nedbat/byterun, or you can read the real CPython interpreter ceval.c carefully, or you can implement your own interpreter!
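As a starting point for that exploration, the standard dis module can list the instructions of any function. The sketch below uses dis.get_instructions on the mod function from earlier. No expected output is shown because the opcode names depend on the Python version: the listings in this chapter come from Python 3.5 and earlier, and newer releases emit different instructions for the same source.

```python
import dis


def mod(a, b):
    return a % b


# Collect the opcode names the compiler generated for mod.
opnames = [instr.opname for instr in dis.get_instructions(mod)]
print(opnames)

# dis.dis prints the familiar human-readable listing.
dis.dis(mod)
```

Comparing this output with the instruction methods shown above is a quick way to see which byte_ methods a given program would need.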

Acknowledgments

Thanks to Ned Batchelder for initiating this project and guiding my contributions, to Michael Arntzenius for help debugging the code and revising this article, to Leta Montopoli for revisions, and to the entire Recurse Center community for their support and encouragement. Any remaining shortcomings are entirely my own.

Compiled from: 500 Lines or Less, "A Python Interpreter Written in Python". Author: Allison Kaptur
Original: LCTT, Software Development, "Implementing Python Interpreter in Python". Translator: qingyunha

Origin blog.csdn.net/delishcomcn/article/details/133063997