Python is an interpreted language (although the distinction is blurred by the existence of a bytecode compiler): source code does not need to be compiled into machine code ahead of time. Instead, it is compiled into bytecode, which the interpreter executes at runtime. This means that source files can be run directly, without explicitly creating an executable first.
In a nutshell, the execution of Python scripts can be simplified and summarized into the following two steps:
- Python compiler: compiles Python source code to bytecode
- Python virtual machine: executes the bytecode instruction by instruction
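The two steps can be reproduced by hand with the built-in functions `compile` and `exec`. The following is a minimal sketch; the source string, the `<example>` filename, and the variable names are made up for illustration.

```python
# Step 1: compile source text into a code object that holds the bytecode.
source = "GOLD = 0.618\nresult = GOLD * 3\n"
code_obj = compile(source, "<example>", "exec")  # "<example>" is a placeholder filename

# Step 2: let the Python virtual machine execute that bytecode.
namespace = {}  # module-level names defined by the code end up in this dict
exec(code_obj, namespace)

print(type(code_obj).__name__)  # code
print(namespace["result"])
```

Passing an explicit dictionary to `exec` keeps the names defined by the compiled code separate from those of the calling program.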
Next, let's take a script containing a function that calculates the golden-section value as an example, and look at how the Python script is compiled into bytecode and how that bytecode is run:
GOLD = 0.618
def get_golden_ratio(x):
    """Calculate the golden-section value."""
    return GOLD * x
print(get_golden_ratio(3))
We first store the above Python code in a string `script` (the docstring is omitted, so the line numbers used below refer to this four-line version), and then use the built-in function `compile` to compile it into a code object `code`.
>>> script = ("GOLD = 0.618\n"
... "def get_golden_ratio(x):\n"
... " return GOLD * x\n"
... "print(get_golden_ratio(3))")
>>> code = compile(script, "test.py", "exec")
>>> code
<code object <module> at 0x000002BCD0AE2290, file "test.py", line 1>
The commonly used attributes of `code` objects and their meanings are as follows:
Attribute name | Meaning |
---|---|
co_filename | Name of the file in which the code object was created |
co_firstlineno | Number of the first line in the Python source code |
co_name | Name with which the code object was defined |
co_code | Raw compiled bytecode, as a bytes object |
co_consts | Tuple of constants used in the bytecode |
co_varnames | Tuple of names of arguments and local variables |
co_names | Tuple of names other than arguments and function locals |
co_cellvars | Tuple of names of cell variables (referenced by containing scopes) |
co_freevars | Tuple of names of free variables (referenced via a function's closure) |
co_stacksize | Virtual machine stack space required |
Source 2: Python Documentation > inspect
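As a quick way to see these attributes on a real object, the sketch below compiles the example script again and prints a few of them. The variable name `module_code` is chosen here for illustration; the printed values for `co_code` and `co_stacksize` depend on the interpreter version.

```python
script = (
    "GOLD = 0.618\n"
    "def get_golden_ratio(x):\n"
    "    return GOLD * x\n"
    "print(get_golden_ratio(3))"
)
module_code = compile(script, "test.py", "exec")

# Print a selection of the attributes from the table above.
for attr in ("co_filename", "co_firstlineno", "co_name",
             "co_names", "co_varnames", "co_stacksize"):
    print(f"{attr:<15} {getattr(module_code, attr)!r}")

# co_code holds the raw bytecode; its exact content varies between versions.
print(type(module_code.co_code).__name__, len(module_code.co_code))
```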
For example, we can view the bytecode of this code through the `co_code` attribute (based on Python 3.10):
>>> code.co_code
b'd\x00Z\x00d\x01d\x02\x84\x00Z\x01e\x02e\x01d\x03\x83\x01\x83\x01\x01\x00d\x04S\x00'
>>> [ch for ch in code.co_code]
[100, 0, 90, 0, 100, 1, 100, 2, 132, 0, 90, 1, 101, 2, 101, 1, 100, 3, 131, 1, 131, 1, 1, 0, 100, 4, 83, 0]
In Python 3.6 and above, each bytecode instruction occupies 2 bytes (that is, every 2 integers between 0 and 255 in the above list constitute one bytecode instruction): the first byte is the opcode and the second byte is its argument. If the instruction takes no argument, 0 is used as a placeholder. Note that bytecode is an implementation detail of the CPython interpreter, and there is no guarantee that bytecode will not be added, removed, or changed between Python versions.
Source 3: Python Documentation > glossary > bytecode
Source 4: Python Documentation > dis
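The 2-byte layout described above can be decoded by hand. The sketch below splits `co_code` into (opcode, argument) pairs and looks up each opcode's mnemonic in `dis.opname`; the one-line source and the names `raw` and `pairs` are illustrative, and the opcodes actually printed depend on the interpreter version in use.

```python
import dis

code = compile("GOLD = 0.618", "<example>", "exec")
raw = code.co_code

# Since Python 3.6 every instruction is a 2-byte unit: (opcode, argument).
pairs = [(raw[i], raw[i + 1]) for i in range(0, len(raw), 2)]

for opcode, arg in pairs:
    # dis.opname maps an opcode number to its mnemonic for the running interpreter.
    print(f"{opcode:3d} {dis.opname[opcode]:<20} {arg}")
```

Looking names up through `dis.opname` rather than hard-coding numbers such as 100 or 90 keeps the decoder valid even when opcode numbers change between releases.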
You can also view all the identifiers used by this code through the `co_names` attribute, or view all the constants used by this code through the `co_consts` attribute:
>>> code.co_names
('GOLD', 'get_golden_ratio', 'print')
>>> code.co_consts
(0.618, <code object get_golden_ratio at 0x000002BCD0AE3E10, file "test.py", line 2>, 'get_golden_ratio', 3, None)
As the constants used by this code show, the function `get_golden_ratio` is compiled into a separate `code` object, which is referenced here as a constant. We can therefore go one step further and view the bytecode of the `get_golden_ratio` function (based on Python 3.10):
>>> code.co_consts[1].co_code
b't\x00|\x00\x14\x00S\x00'
>>> [ch for ch in code.co_consts[1].co_code]
[116, 0, 124, 0, 20, 0, 83, 0]
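Because nested code objects are stored as constants of their parent, they can be found without knowing their index in advance. The helper `iter_code_objects` below is a hypothetical name introduced for this sketch; it walks `co_consts` recursively and yields every code object it finds.

```python
import types

def iter_code_objects(code):
    """Yield a code object followed by every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)

script = (
    "GOLD = 0.618\n"
    "def get_golden_ratio(x):\n"
    "    return GOLD * x\n"
    "print(get_golden_ratio(3))"
)
module_code = compile(script, "test.py", "exec")

for obj in iter_code_objects(module_code):
    # Name, first source line, and bytecode length (the length is version-dependent).
    print(obj.co_name, obj.co_firstlineno, len(obj.co_code))
```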
The Python virtual machine is a fully software-defined computer that executes the bytecode generated by the bytecode compiler.
Source 5: Python Documentation > glossary > virtual machine
Besides inspecting the raw bytecode through the `co_code` attribute, we can use the standard library module `dis` to disassemble the above Python code and analyze how its bytecode is executed in the Python virtual machine (based on Python 3.10):
>>> import dis
>>> dis.dis(script)
  1           0 LOAD_CONST               0 (0.618)
              2 STORE_NAME               0 (GOLD)

  2           4 LOAD_CONST               1 (<code object get_golden_ratio at 0x000002BCBEECFAA0, file "<dis>", line 2>)
              6 LOAD_CONST               2 ('get_golden_ratio')
              8 MAKE_FUNCTION            0
             10 STORE_NAME               1 (get_golden_ratio)

  4          12 LOAD_NAME                2 (print)
             14 LOAD_NAME                1 (get_golden_ratio)
             16 LOAD_CONST               3 (3)
             18 CALL_FUNCTION            1
             20 CALL_FUNCTION            1
             22 POP_TOP
             24 LOAD_CONST               4 (None)
             26 RETURN_VALUE

Disassembly of <code object get_golden_ratio at 0x000002BCBEECFAA0, file "<dis>", line 2>:
  3           0 LOAD_GLOBAL              0 (GOLD)
              2 LOAD_FAST                0 (x)
              4 BINARY_MULTIPLY
              6 RETURN_VALUE
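When the disassembly needs to be processed by a program rather than read by a person, `dis.get_instructions` returns the same information as a sequence of `Instruction` objects. A minimal sketch follows; the instruction names printed will differ from the Python 3.10 listing above on other interpreter versions.

```python
import dis

GOLD = 0.618

def get_golden_ratio(x):
    return GOLD * x

# Each Instruction carries the offset, mnemonic, raw argument and resolved argument.
instructions = list(dis.get_instructions(get_golden_ratio))
for instr in instructions:
    print(instr.offset, instr.opname, instr.arg, instr.argrepr)
```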
Working from the bottom of the listing upward, in order of reference, we first explain the bytecode corresponding to line 3 (`return GOLD * x`):
- `LOAD_GLOBAL(116), 0`: takes the identifier at index 0 (`GOLD`) in `co_names` of the current `code` object, looks it up in the global namespace, and pushes a reference to its value onto the top of the stack; the stack now holds 1 element;
- `LOAD_FAST(124), 0`: reads a reference to the local variable at index 0 (`x`) in `co_varnames` of the current `code` object and pushes it onto the top of the stack; the stack now holds 2 elements;
- `BINARY_MULTIPLY(20), 0`: pops the top two elements of the stack (the references to `x` and then `GOLD`), applies the `*` operator, and pushes the result onto the top of the stack; the stack now holds 1 element;
- `RETURN_VALUE(83), 0`: pops the top element of the stack and returns it to the caller; the stack is now empty, and line 3 ends.
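The stack behaviour of these four instructions can be mimicked with a toy interpreter. This is a simplified model written for illustration, not how CPython is implemented; the function `run_function_body` and its parameters are hypothetical, and the instruction list is copied from the Python 3.10 listing.

```python
def run_function_body(instructions, co_names, global_ns, local_values):
    """Execute a tiny subset of Python 3.10 opcodes on an explicit stack (toy model)."""
    stack = []
    for opname, arg in instructions:
        if opname == "LOAD_GLOBAL":
            stack.append(global_ns[co_names[arg]])  # look the name up in globals
        elif opname == "LOAD_FAST":
            stack.append(local_values[arg])         # locals are addressed by index
        elif opname == "BINARY_MULTIPLY":
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        elif opname == "RETURN_VALUE":
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {opname}")
        print(f"{opname:<16} stack depth = {len(stack)}")

body = [("LOAD_GLOBAL", 0), ("LOAD_FAST", 0), ("BINARY_MULTIPLY", 0), ("RETURN_VALUE", 0)]
result = run_function_body(body, ("GOLD",), {"GOLD": 0.618}, [3])
print(result)
```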
The bytecode corresponding to line 1 (`GOLD = 0.618`) has the following meaning:
- `LOAD_CONST(100), 0`: reads a reference to the value at index 0 (`0.618`) in `co_consts` of the current `code` object and pushes it onto the top of the stack; the stack now holds 1 element;
- `STORE_NAME(90), 0`: pops the top element of the stack (the reference to `0.618`) and binds it to the identifier at index 0 (`GOLD`) in `co_names` of the current `code` object; the stack is now empty, and line 1 ends.
The bytecode corresponding to line 2 (`def get_golden_ratio(x):`) has the following meaning:
- `LOAD_CONST(100), 1`: reads a reference to the value at index 1 (the `code` object of `get_golden_ratio`) in `co_consts` of the current `code` object and pushes it onto the top of the stack; the stack now holds 1 element;
- `LOAD_CONST(100), 2`: reads a reference to the value at index 2 (the string `"get_golden_ratio"`) in `co_consts` of the current `code` object and pushes it onto the top of the stack; the stack now holds 2 elements;
- `MAKE_FUNCTION(132), 0`: pops the top element of the stack (the reference to `"get_golden_ratio"`) as the name of the function; then pops the next element (the reference to the `code` object of `get_golden_ratio`) as the code associated with the function; constructs a new function object and pushes it onto the top of the stack; the stack now holds 1 element;
- `STORE_NAME(90), 1`: pops the top element of the stack (the newly constructed `get_golden_ratio` function object) and binds it to the identifier at index 1 (`get_golden_ratio`) in `co_names` of the current `code` object; the stack is now empty, and line 2 ends.
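What `MAKE_FUNCTION` and `STORE_NAME` accomplish together can be approximated in plain Python with `types.FunctionType`, which builds a function object from a code object and a globals dictionary. This is a sketch of the idea rather than the interpreter's actual code path; the `namespace` dictionary and the `<example>` filename are assumptions made for the example.

```python
import types

source = "def get_golden_ratio(x):\n    return GOLD * x\n"
module_code = compile(source, "<example>", "exec")

# The function body is a code object stored among the module's constants.
func_code = next(c for c in module_code.co_consts if isinstance(c, types.CodeType))

namespace = {"GOLD": 0.618}  # plays the role of the module's global namespace

# Roughly what MAKE_FUNCTION does: wrap the code object in a function object.
func = types.FunctionType(func_code, namespace, "get_golden_ratio")

# Roughly what STORE_NAME does: bind the function object to its name.
namespace["get_golden_ratio"] = func

print(func.__name__)
print(func(3))
```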
The bytecode corresponding to line 4 (`print(get_golden_ratio(3))`) has the following meaning:
- `LOAD_NAME(101), 2`: looks up the identifier at index 2 (`print`) in `co_names` of the current `code` object and pushes a reference to the object bound to it (the built-in function `print`) onto the top of the stack; the stack now holds 1 element;
- `LOAD_NAME(101), 1`: looks up the identifier at index 1 (`get_golden_ratio`) in `co_names` of the current `code` object and pushes a reference to the object bound to it (the `get_golden_ratio` function object) onto the top of the stack; the stack now holds 2 elements;
- `LOAD_CONST(100), 3`: reads a reference to the value at index 3 (the integer `3`) in `co_consts` of the current `code` object and pushes it onto the top of the stack; the stack now holds 3 elements;
- `CALL_FUNCTION(131), 1`: pops 1 element from the top of the stack (the reference to the integer `3`) as the argument; then pops the next element (the `get_golden_ratio` function object) as the callable; calls the callable with that argument and pushes its return value (the return value of `get_golden_ratio`) onto the top of the stack; the stack now holds 2 elements;
- `CALL_FUNCTION(131), 1`: pops 1 element from the top of the stack (the reference to the return value of `get_golden_ratio`) as the argument; then pops the next element (the built-in function `print`) as the callable; calls the callable with that argument and pushes its return value (`None`, the return value of `print`) onto the top of the stack; the stack now holds 1 element;
- `POP_TOP(1), 0`: pops the top element of the stack and discards it; the stack is now empty;
- `LOAD_CONST(100), 4`: reads a reference to the value at index 4 (`None`) in `co_consts` of the current `code` object and pushes it onto the top of the stack; the stack now holds 1 element;
- `RETURN_VALUE(83), 0`: pops the top element of the stack and returns it to the caller; the stack is now empty, and line 4 ends.
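The element counts given for line 4 can be checked with a small stack simulation. The code below is a toy model under stated assumptions, not CPython's implementation: `run_module_line` is a hypothetical helper, the instruction list is copied from the Python 3.10 listing, and the code object at index 1 of `co_consts` is replaced by a `None` placeholder because it is not used on this line.

```python
def run_module_line(instructions, co_names, co_consts, namespace):
    """Run a few Python 3.10 opcodes on an explicit stack and record its depth (toy model)."""
    stack, depths = [], []
    for opname, arg in instructions:
        if opname == "LOAD_NAME":
            stack.append(namespace[co_names[arg]])
        elif opname == "LOAD_CONST":
            stack.append(co_consts[arg])
        elif opname == "CALL_FUNCTION":
            # Arguments sit above the callable; pop them, then the callable itself.
            args = [stack.pop() for _ in range(arg)][::-1]
            func = stack.pop()
            stack.append(func(*args))
        elif opname in ("POP_TOP", "RETURN_VALUE"):
            stack.pop()  # the returned value is simply dropped in this toy model
        else:
            raise ValueError(f"unsupported opcode: {opname}")
        depths.append(len(stack))
    return depths

line4 = [("LOAD_NAME", 2), ("LOAD_NAME", 1), ("LOAD_CONST", 3),
         ("CALL_FUNCTION", 1), ("CALL_FUNCTION", 1), ("POP_TOP", 0),
         ("LOAD_CONST", 4), ("RETURN_VALUE", 0)]
co_names = ("GOLD", "get_golden_ratio", "print")
co_consts = (0.618, None, "get_golden_ratio", 3, None)  # index 1: placeholder
namespace = {"print": print, "get_golden_ratio": lambda x: 0.618 * x}

depths = run_module_line(line4, co_names, co_consts, namespace)
print(depths)  # [1, 2, 3, 2, 1, 0, 1, 0]
```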