Compiler Principles: Design of an Intermediate Code Generator in Python

Table of contents

Experiment content:

Experiment procedure:

1. Convert to LL(1) grammar:

2. Write a recursive descent subroutine according to the LL(1) grammar:

3. According to the semantics, modify the recursive subroutine so that it can generate intermediate code

Input example:

Example output:


Experiment content:

Given the following grammar, write an intermediate code generator that produces three-address code:

S->id=E;
S->if C then S;
S->while C do S;
C->E>E;
C->E<E;
C->E=E;
E->E+T;
E->E-T;
E->T;
T->T*F;
T->T/F;
T->F;
F->(E);
F->id;
F->int;

1. For the given grammar, eliminate left recursion and extract left factors.
2. Draw and simplify the syntax diagrams.
3. Write the algorithm for the recursive subroutines.
4. Write each recursive subroutine function.
5. Connect the lexical analysis function scan() from Experiment 1 and test.
6. Design the data structures and algorithm for three-address code generation.
7. Rewrite each recursive subroutine function into a code generation function.
8. Write the test program (main function).
9. Debug the program: input a statement and check the three-address code that is output.
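For step 6, one common way to hold three-address code is a list of quadruples plus a counter for temporaries. The sketch below is only an illustration under that assumption: the names Quad and Emitter are hypothetical, and the program later in this article stores nested lists of strings instead.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Quad:
    """One three-address instruction: result := arg1 op arg2."""
    op: str
    arg1: str
    arg2: str
    result: str

    def __str__(self):
        return f"{self.result} := {self.arg1}{self.op}{self.arg2}"


class Emitter:
    """Collects quadruples and hands out temporaries z1, z2, ..."""

    def __init__(self):
        self.quads = []
        self._temps = count(1)

    def new_temp(self):
        return f"z{next(self._temps)}"

    def emit(self, op, arg1, arg2):
        # Store the instruction and return the temporary that holds its value.
        result = self.new_temp()
        self.quads.append(Quad(op, arg1, arg2, result))
        return result


# Hand-written translation of: a = 15*8-25;
em = Emitter()
t1 = em.emit("*", "15", "8")
t2 = em.emit("-", t1, "25")
print("\n".join(str(q) for q in em.quads))
print(f"a := {t2}")
```

Keeping each instruction as a record rather than as text makes later passes (for example, optimisation) easier, at the cost of a separate printing step.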

Experiment procedure:

1. Convert to LL(1) grammar:

Use the method from the earlier article, Python Preliminary Experiment 2: LL(1) Grammar Construction, to convert the given grammar into an LL(1) grammar.

Note that the method in that article treats each individual character as one grammar symbol, so the multi-character terminals must first be replaced with single characters and then changed back afterwards:

id -> o        if -> f        then -> t        while -> w        do -> d        int -> 1

S->o=E;
S->f C t S;
S->w C d S;
C->E>E;
C->E<E;
C->E=E;
E->E+T;
E->E-T;
E->T;
T->T*F;
T->T/F;
T->F;
F->(E);
F->o;
F->1;
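The elimination of immediate left recursion on this single-character grammar can be sketched as follows. This is a minimal illustration, not the code from the referenced article: the function name is hypothetical, the trailing semicolons of the productions are left out, the empty string stands for the empty production, and the new non-terminal is named by appending '. Left factoring of C is a separate step that is not shown.

```python
def eliminate_immediate_left_recursion(grammar):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | (empty).

    `grammar` maps a single-character non-terminal to a list of
    right-hand sides given as strings.
    """
    result = {}
    for head, bodies in grammar.items():
        recursive = [b[1:] for b in bodies if b.startswith(head)]
        others = [b for b in bodies if not b.startswith(head)]
        if not recursive:
            result[head] = list(bodies)
            continue
        new_head = head + "'"
        result[head] = [b + new_head for b in others]
        result[new_head] = [a + new_head for a in recursive] + [""]
    return result


grammar = {
    "S": ["o=E", "fCtS", "wCdS"],
    "C": ["E>E", "E<E", "E=E"],
    "E": ["E+T", "E-T", "T"],
    "T": ["T*F", "T/F", "F"],
    "F": ["(E)", "o", "1"],
}
converted = eliminate_immediate_left_recursion(grammar)
for head, bodies in converted.items():
    print(head, "->", " | ".join(b or "e" for b in bodies))
```

Only E and T are left-recursive here, so only they gain a primed non-terminal; S, C and F pass through unchanged.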

After eliminating left recursion, extracting left factors and converting the single characters back, the grammar is as follows.

The two columns after each production are its FIRST set and its FOLLOW set respectively.

# The grammar after processing (e denotes the empty string):
# S->id=E|if C then S|while C do S;     id if while     ;
# C->EC';                               ( id int        then do
# E->TE';                               ( id int        > = < ) ; then do
# T->FT';                               ( id int        + - > = < ) ; then do
# F->(E)|id|int;                        ( id int        * / + - > = < ) ; then do
# E'->+TE'|-TE'|;                       + - e           > = < ) ; then do
# T'->*FT'|/FT'|;                       * / e           + - > = < ) ; then do
# C'->>E|=E|<E;                         > = <           then do
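The two set columns can be checked mechanically with the usual fixed-point computation. The sketch below is an illustration written for this grammar only; the names GRAMMAR, first_of and compute_first_follow are hypothetical, and the FOLLOW set of S is seeded with ; because the table above treats ; as the symbol that ends a statement.

```python
EPS = "e"  # the table writes the empty string as e

GRAMMAR = {
    "S": [["id", "=", "E"], ["if", "C", "then", "S"], ["while", "C", "do", "S"]],
    "C": [["E", "C'"]],
    "E": [["T", "E'"]],
    "T": [["F", "T'"]],
    "F": [["(", "E", ")"], ["id"], ["int"]],
    "E'": [["+", "T", "E'"], ["-", "T", "E'"], []],
    "T'": [["*", "F", "T'"], ["/", "F", "T'"], []],
    "C'": [[">", "E"], ["=", "E"], ["<", "E"]],
}


def first_of(symbols, first):
    """FIRST of a sequence of symbols, using the FIRST sets found so far."""
    out = set()
    for sym in symbols:
        sym_first = first[sym] if sym in GRAMMAR else {sym}
        out |= sym_first - {EPS}
        if EPS not in sym_first:
            return out
    out.add(EPS)  # every symbol can vanish, or the sequence is empty
    return out


def compute_first_follow(start="S", end_marker=";"):
    first = {nt: set() for nt in GRAMMAR}
    follow = {nt: set() for nt in GRAMMAR}
    follow[start].add(end_marker)
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for head, bodies in GRAMMAR.items():
            for body in bodies:
                new_first = first_of(body, first)
                if not new_first <= first[head]:
                    first[head] |= new_first
                    changed = True
                for i, sym in enumerate(body):
                    if sym not in GRAMMAR:
                        continue  # FOLLOW is only defined for non-terminals
                    rest = first_of(body[i + 1:], first)
                    new_follow = rest - {EPS}
                    if EPS in rest:
                        new_follow |= follow[head]
                    if not new_follow <= follow[sym]:
                        follow[sym] |= new_follow
                        changed = True
    return first, follow


first, follow = compute_first_follow()
for nt in GRAMMAR:
    print(nt, sorted(first[nt]), sorted(follow[nt]))
```

The grammar is LL(1) when, for each non-terminal, the FIRST sets of its alternatives are disjoint and an alternative that can be empty has a FOLLOW set disjoint from the FIRST sets of the others; the sets printed here can be used for that check.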

2. Write a recursive descent subroutine according to the LL(1) grammar:

For the method of writing the subroutines, see the earlier article "Design and Experimental Principle of Python Recursive Descent Analysis Method Compilation Principle".

Note that scan() has been modified slightly here: it now also recognizes hexadecimal (0x...) and octal (0o...) constants, and more keywords have been added.
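For comparison, the same token categories can also be described with a regular expression. The sketch below is an alternative formulation, not the scan() used in this article: the names TOKEN_RE, TYPE_CODES and tokenize are hypothetical, it reuses the type codes 1 to 5, and it is not an exact replacement (for example, it splits 017 into 0 and 17, which scan() does not).

```python
import re

KEYWORDS = {"if", "int", "for", "while", "do", "return", "break", "continue", "then"}

# One named group per token category; the prefixed number forms come first.
TOKEN_RE = re.compile(r"""
    (?P<number>0x[0-9a-f]+|0o[0-7]+|[1-9][0-9]*|0)
  | (?P<word>[A-Za-z][A-Za-z0-9]*)
  | (?P<operator>[=><!]=|[-+*/=><!])
  | (?P<separator>[,;{}()])
""", re.VERBOSE)

TYPE_CODES = {"number": 3, "operator": 4, "separator": 5}


def tokenize(line):
    """Return [type, text] pairs for one line of input."""
    tokens, pos = [], 0
    while pos < len(line):
        if line[pos].isspace():
            pos += 1
            continue
        m = TOKEN_RE.match(line, pos)
        if m is None:
            raise ValueError(f"unexpected character {line[pos]!r} at column {pos}")
        kind, text = m.lastgroup, m.group()
        if kind == "word":
            code = 1 if text in KEYWORDS else 2  # keyword or identifier
        else:
            code = TYPE_CODES[kind]
        tokens.append([code, text])
        pos = m.end()
    return tokens


tokens = tokenize("if s>0xa then s=s+1;\n")
print(tokens)
# int(text, 0) understands the 0x and 0o prefixes
print([int(text, 0) for code, text in tokens if code == 3])
```

The built-in int(text, 0) interprets the 0x and 0o prefixes, so the numeric value of a constant token can be obtained without evaluating it as Python code.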

# scan part ===========================================================================
# =====================================================================================
# Record one token
def output(str, a, b, type):
    global program
    program.append([type, str[a:b + 1]])


# Check whether str[a:b] is a keyword
# Returns 1 if it is, 2 if it is not
def iskeywords(str, a, b):
    # keywords
    keywords = {"if", "int", "for", "while", "do", "return", "break", "continue", 'then'}
    s = str[a:b]  # copy the characters
    if s in keywords:  # return 1 if present, otherwise 2
        return 1
    else:
        return 2


# Check whether a character is (part of) an operator or a separator.
# Returns 0 if not, 1 if so, and 2 if it is an operator that may be followed by =
def belong_to(str, type):
    if type == 4:  # look in the operators
        library = "+-*/=><!"  # operators
    else:  # look in the separators
        library = ",;{}()"  # separators
    if str in library:  # found
        # one of the symbols that may be followed by =
        if type == 4 and library.index(str) >= 4:
            return 2
        else:
            return 1
    return 0


# Recursive lexical analysis function: reads one line str, starting at position n = 0
# Splits the line into tokens, classifies them and records their types
# Rewritten from the earlier C version
def scan(str, n):
    # States (the recorded token types are 1 - 5)
    # 0: initial
    # 1: keyword, listed in keywords
    # 2: identifier
    # 3: constant (unsigned integer)
    # 4: operators: + - * / = > < >= <= !=
    # 5: separators: , ; { } ( )
    # 6: number starting with 0: decide between hexadecimal (0x64) and octal (0o77)
    # 7: hexadecimal number (0xa)
    # 8: octal number (0o77)
    i = n
    type = 0
    while i < len(str):
        if type == 0:  # initial state
            if str[i] == ' ':  # skip spaces
                n += 1
                i += 1
                continue
            elif str[i] == '\0' or str[i] == '\n':  # end of input
                return
            elif ('a' <= str[i] <= 'z') or ('A' <= str[i] <= 'Z'):
                type = 1  # a letter: keyword or identifier
            elif str[i] == '0':
                type = 6  # a leading 0: hexadecimal, octal or a lone 0
            elif '1' <= str[i] <= '9':
                type = 3  # a digit: decimal constant
            else:
                type = belong_to(str[i], 4)
                if type > 0:  # an operator
                    # an operator that may be followed by =, and the next character is =
                    if type == 2 and i + 1 < len(str) and str[i + 1] == '=':
                        i = i + 1  # move the end position forward
                    output(str, n, i, 4)  # record + recurse + finish
                    scan(str, i + 1)
                    return
                elif belong_to(str[i], 5):  # a separator
                    output(str, n, i, 5)  # record + recurse + finish
                    scan(str, i + 1)
                    return
                else:
                    print("failed:", str[i])
                    return
        elif type == 1:  # keyword or identifier
            if not (('a' <= str[i] <= 'z') or ('A' <= str[i] <= 'Z')):  # no longer a letter
                if '0' <= str[i] <= '9':  # a digit: can only be an identifier
                    type = 2
                else:  # neither letter nor digit
                    type = iskeywords(str, n, i)
                    output(str, n, i - 1, type)  # record + recurse + finish
                    scan(str, i)
                    return
        elif type == 2:  # identifier
            if not (('a' <= str[i] <= 'z') or ('A' <= str[i] <= 'Z')):
                # no longer a letter
                if not ('0' <= str[i] <= '9'):
                    # not a digit either
                    output(str, n, i - 1, type)  # record + recurse + finish
                    scan(str, i)
                    return
        elif type == 3:
            if not ('0' <= str[i] <= '9'):
                # no longer a digit
                output(str, n, i - 1, type)  # record + recurse + finish
                scan(str, i)
                return
        elif type == 6:
            if str[i] == 'x':  # hexadecimal
                type = 7
            elif str[i] == 'o':  # octal
                type = 8
            elif str[i].isalnum():  # e.g. 017 or 0a: not a valid number
                print("%d failed" % type)
                return
            else:  # a lone 0, recorded as a constant (type 3)
                output(str, n, i - 1, 3)  # record + recurse + finish
                scan(str, i)
                return
        elif type == 7:  # hexadecimal
            if not (('0' <= str[i] <= '9') or ('a' <= str[i] <= 'f')):
                # no longer a hexadecimal digit
                output(str, n, i - 1, 3)  # record + recurse + finish
                scan(str, i)
                return
        elif type == 8:  # octal
            if not ('0' <= str[i] <= '7'):
                # no longer an octal digit
                output(str, n, i - 1, 3)  # record + recurse + finish
                scan(str, i)
                return
        else:
            print("%d failed" % type)
            return
        i += 1

# Recursive descent parser ==============================================================
# =====================================================================================
# The grammar after processing (columns: productions, FIRST set, FOLLOW set):
# S->id=E|if C then S|while C do S;     id if while     ;
# C->EC';                               ( id int        then do
# E->TE';                               ( id int        > = < ) ; then do
# T->FT';                               ( id int        + - > = < ) ; then do
# F->(E)|id|int;                        ( id int        * / + - > = < ) ; then do
# E'->+TE'|-TE'|;                       + - e           > = < ) ; then do
# T'->*FT'|/FT'|;                       * / e           + - > = < ) ; then do
# C'->>E|=E|<E;                         > = <           then do


# Function names cannot contain ', so the ' in E', T' and C' is written as 1
def Parse():
    def ParseS():  # parsing subroutine for S: S->id=E|if C then S|while C do S;  FIRST: id if while  FOLLOW: ;
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[0] == 2:  # id=E
            MatchToken(2)
            MatchToken('=')
            ParseE()
        elif lookahead[1] == 'if':  # if C then S
            MatchToken('if')
            ParseC()
            MatchToken('then')
            ParseS()
        elif lookahead[1] == 'while':  # while C do S
            MatchToken('while')
            ParseC()
            MatchToken('do')
            ParseS()
        else:
            print("S error")
            parseerror = 1
            # exit(0)

    def ParseC():  # parsing subroutine for C: C->EC';  FIRST: ( id int  FOLLOW: then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(' or lookahead[0] == 2 or lookahead[0] == 3:  # EC'
            ParseE()
            ParseC1()
        else:
            print("C error")
            parseerror = 2
            # exit(0)

    def ParseE():  # parsing subroutine for E: E->TE';  FIRST: ( id int  FOLLOW: > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(' or lookahead[0] == 2 or lookahead[0] == 3:  # TE'
            ParseT()
            ParseE1()
        else:
            print("E error")
            parseerror = 3
            # exit(0)

    def ParseT():  # parsing subroutine for T: T->FT';  FIRST: ( id int  FOLLOW: + - > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(' or lookahead[0] == 2 or lookahead[0] == 3:  # FT'
            ParseF()
            ParseT1()
        else:
            print("T error")
            parseerror = 4
            # exit(0)

    def ParseF():  # parsing subroutine for F: F->(E)|id|int;  FIRST: ( id int  FOLLOW: * / + - > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(':  # (E)
            MatchToken('(')
            ParseE()
            MatchToken(')')
        elif lookahead[0] == 2:  # id
            MatchToken(2)
        elif lookahead[0] == 3:  # int
            MatchToken(3)
        else:
            print("F error")
            parseerror = 5
            # exit(0)

    def ParseE1():  # parsing subroutine for E': E'->+TE'|-TE'|e;  FIRST: + - e  FOLLOW: > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '+':  # +TE'
            MatchToken('+')
            ParseT()
            ParseE1()
        elif lookahead[1] == '-':  # -TE'
            MatchToken('-')
            ParseT()
            ParseE1()
        elif lookahead[1] == ')' or lookahead[1] == ';' \
                or lookahead[1] == '>' or lookahead[1] == '=' or lookahead[1] == '<' or \
                lookahead[1] == 'then' or lookahead[1] == 'do':
            pass
        else:
            print("E1 error")
            parseerror = 5
            # exit(0)

    def ParseT1():  # parsing subroutine for T': T'->*FT'|/FT'|e;  FIRST: * / e  FOLLOW: + - > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '*':  # *FT'
            MatchToken('*')
            ParseF()
            ParseT1()
        elif lookahead[1] == '/':  # /FT'
            MatchToken('/')
            ParseF()
            ParseT1()
        elif lookahead[1] == '+' or lookahead[1] == '-' or lookahead[1] == ')' or lookahead[1] == ';' or \
                lookahead[1] == '>' or lookahead[1] == '=' or lookahead[1] == '<' or \
                lookahead[1] == 'then' or lookahead[1] == 'do':
            pass
        else:
            print("T1 error")
            parseerror = 5
            # exit(0)

    def ParseC1():  # parsing subroutine for C': C'->>E|=E|<E;  FIRST: > = <  FOLLOW: then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '>':
            MatchToken('>')
            ParseE()
        elif lookahead[1] == '=':
            MatchToken('=')
            ParseE()
        elif lookahead[1] == '<':
            MatchToken('<')
            ParseE()
        else:
            print("C1 error")
            parseerror = 5
            # exit(0)

    def MatchToken(need_type):
        global lookahead, parseerror
        mate = 0
        if parseerror:
            return
        elif isinstance(need_type, int):  # a token type (int) is expected
            if lookahead[0] == need_type:  # the type of the lookahead matches
                mate = 1
        elif lookahead[1] == need_type:  # a literal string is expected and matches
            mate = 1
        if mate:
            lookahead = GetToken()  # read the next token
        else:
            print("expected", need_type, "got", lookahead, "match error")
            parseerror = 6
            # exit(0)

    def GetToken():
        global program, lookahead
        return program.pop(0)

    global program, lookahead, parseerror
    parseerror = 0  # error flag
    lookahead = program.pop(0)
    ParseS()
    if parseerror == 0:
        print("正确")  # prints "correct"


file = "program.txt"
file = open(file)  # open the input file
while i := file.readline():
    program = []  # tokens of the line that was just read
    scan(i, 0)
    print(i[:-1])
    print(program)
    Parse()
file.close()
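The same grammar can also be recognized without global variables by keeping the token list and the current position in an object. The following is a hedged sketch rather than a rewrite of the listing above: the class Recognizer and the function accepts are hypothetical names, the tail-recursive E' and T' are written as loops, errors are raised as exceptions instead of setting parseerror, and the closing ; is matched explicitly, which the listing above does not do.

```python
class Recognizer:
    """Recursive descent recognizer for the LL(1) grammar, without global state."""

    def __init__(self, tokens):
        # A sentinel token keeps the lookahead valid after the last real token.
        self.tokens = list(tokens) + [[0, "$"]]
        self.pos = 0

    @property
    def look(self):
        return self.tokens[self.pos]

    def match(self, expected):
        # An int is compared with the token type, a string with the token text.
        kind, text = self.look
        actual = kind if isinstance(expected, int) else text
        if actual != expected:
            raise SyntaxError(f"expected {expected!r}, got {text!r}")
        self.pos += 1

    def S(self):
        kind, text = self.look
        if kind == 2:                      # S -> id = E
            self.match(2)
            self.match("=")
            self.E()
        elif text == "if":                 # S -> if C then S
            self.match("if")
            self.C()
            self.match("then")
            self.S()
        elif text == "while":              # S -> while C do S
            self.match("while")
            self.C()
            self.match("do")
            self.S()
        else:
            raise SyntaxError(f"unexpected {text!r} at start of statement")

    def C(self):                           # C -> E C',  C' -> > E | = E | < E
        self.E()
        if self.look[1] not in (">", "=", "<"):
            raise SyntaxError(f"expected a comparison, got {self.look[1]!r}")
        self.match(self.look[1])
        self.E()

    def E(self):                           # E -> T E', with E' as a loop
        self.T()
        while self.look[1] in ("+", "-"):
            self.match(self.look[1])
            self.T()

    def T(self):                           # T -> F T', with T' as a loop
        self.F()
        while self.look[1] in ("*", "/"):
            self.match(self.look[1])
            self.F()

    def F(self):                           # F -> ( E ) | id | int
        kind, text = self.look
        if text == "(":
            self.match("(")
            self.E()
            self.match(")")
        elif kind in (2, 3):
            self.match(kind)
        else:
            raise SyntaxError(f"unexpected {text!r} in expression")


def accepts(tokens):
    """Return True if the tokens form one statement followed by ';'."""
    parser = Recognizer(tokens)
    try:
        parser.S()
        parser.match(";")
        return True
    except SyntaxError:
        return False


good = [[1, 'while'], [2, 'a'], [4, '>'], [3, '3'], [1, 'do'],
        [2, 'a'], [4, '='], [2, 'a'], [4, '+'], [3, '3'], [5, ';']]
bad = [[1, 'while'], [2, 'a'], [4, '>'], [3, '3'],
       [2, 'a'], [4, '='], [3, '1'], [5, ';']]   # 'do' is missing
print(accepts(good), accepts(bad))
```

Raising an exception at the first error removes the need to test an error flag at the top of every subroutine.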

3. According to the semantics, modify the recursive subroutine so that it can generate intermediate code

A new class called NonTerminal is added here. It holds the inherited attributes and the synthesized attributes of a grammar symbol; at the end, the synthesized code of S is printed.
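How the attributes are used can be illustrated in isolation. In the sketch below the class is repeated with comments; the split into inherited and synthesized attributes is read off from how the listing uses them (set by the caller before the call, or filled in by the subroutine). The helper flatten is a hypothetical name that mirrors what Print_Place does in the listing.

```python
class NonTerminal:
    def __init__(self):
        # Inherited attributes: set by the caller before the subroutine runs.
        self.begin = ''   # label at which the statement starts
        self.next = ''    # label to jump to after the statement
        self.true = ''    # jump target when a condition holds
        self.false = ''   # jump target when a condition fails
        # Synthesized attributes: filled in by the subroutine itself.
        self.place = ''   # name that holds the value of an expression
        self.code = []    # generated code as a nested list of strings


def flatten(code):
    """Join a nested list of strings depth-first into one string."""
    return ''.join(flatten(c) if isinstance(c, list) else c for c in code)


# Hand-built attributes for the statement a=a+3 with successor label L1.
E = NonTerminal()
E.place = 'z1'
E.code = [[''], 'z1', ' := ', 'a', '+3', '\n']

S = NonTerminal()
S.next = 'L1'
S.code = [E.code, 'a', ' := ', E.place, '\n', 'goto ', S.next, '\n']
print(flatten(S.code), end='')
```

Because code is kept as a nested list, a parent can embed the code of its children without copying strings, and the text is only assembled once at the end.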

# scan part ===========================================================================
# =====================================================================================
# output(), iskeywords(), belong_to() and scan() are identical to the scan part of the
# listing in step 2 and must be copied here unchanged.


# Recursive descent parser ==============================================================
# =====================================================================================
# The grammar after processing (columns: productions, FIRST set, FOLLOW set):
# S->id=E|if C then S|while C do S;     id if while     ;
# C->EC';                               ( id int        then do
# E->TE';                               ( id int        > = < ) ; then do
# T->FT';                               ( id int        + - > = < ) ; then do
# F->(E)|id|int;                        ( id int        * / + - > = < ) ; then do
# E'->+TE'|-TE'|;                       + - e           > = < ) ; then do
# T'->*FT'|/FT'|;                       * / e           + - > = < ) ; then do
# C'->>E|=E|<E;                         > = <           then do


class NonTerminal:
    def __init__(self):
        self.begin = ''
        self.place = ''
        self.next = ''
        self.true = ''
        self.false = ''
        self.code = []


# Function names cannot contain ', so the ' in E', T' and C' is written as 1
def Parse():
    def ParseS(S):  # parsing subroutine for S: S->id=E|if C then S|while C do S;  FIRST: id if while  FOLLOW: ;
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[0] == 2:  # id=E
            id = MatchToken(2)
            MatchToken('=')
            E = NonTerminal()
            ParseE(E)
            # code of E, then the assignment, then a jump to the successor of S
            S.code = [E.code, id.place, " := ", E.place, '\n', 'goto ', S.next, '\n']
        elif lookahead[1] == 'if':  # if C then S
            MatchToken('if')
            C = NonTerminal()
            C.true = newlable()
            C.false = S.next
            ParseC(C)
            MatchToken('then')
            S1 = NonTerminal()
            S1.begin = C.true
            S1.next = S.next
            ParseS(S1)
            S.code = [C.code, C.true, ':\n',  S1.code, ]
        elif lookahead[1] == 'while':  # while C do S
            MatchToken('while')
            C = NonTerminal()
            C.true = newlable()
            C.false = S.next
            ParseC(C)
            MatchToken('do')
            S1 = NonTerminal()
            S1.begin = C.true

            S1.next = S.begin
            ParseS(S1)
            S.code = [C.code, C.true, ':\n', S1.code, ]
        else:
            print("S error")
            parseerror = 1
            # exit(0)

    def ParseC(C):  # parsing subroutine for C: C->EC';  FIRST: ( id int  FOLLOW: then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(' or lookahead[0] == 2 or lookahead[0] == 3:  # EC'
            E = NonTerminal()
            ParseE(E)
            C1 = NonTerminal()
            ParseC1(C1)
            C.code = [E.code, C1.code, 'if ', E.place, C1.place, ' goto ', C.true, '\n', ' goto ', C.false, '\n']
        else:
            print("C error")
            parseerror = 2
            # exit(0)

    def ParseE(E):  # parsing subroutine for E: E->TE';  FIRST: ( id int  FOLLOW: > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(' or lookahead[0] == 2 or lookahead[0] == 3:  # TE'
            T = NonTerminal()
            ParseT(T)
            E1 = NonTerminal()
            ParseE1(E1)
            if len(E1.code) > 0:  # E' is not empty
                E.place = newtemp()
                E.code = [T.code, E1.code, E.place, ' := ', T.place, E1.place, '\n']
            else:
                E.place = T.place
                E.code = [T.code]
        else:
            print("E error")
            parseerror = 3
            # exit(0)

    def ParseT(T):  # parsing subroutine for T: T->FT';  FIRST: ( id int  FOLLOW: + - > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(' or lookahead[0] == 2 or lookahead[0] == 3:  # FT'
            F = NonTerminal()
            ParseF(F)
            T1 = NonTerminal()
            ParseT1(T1)
            if len(T1.code) > 0:  # T' is not empty
                T.place = newtemp()
                T.code = [F.code, T1.code, T.place, ' := ', F.place, T1.place, '\n']
            else:
                T.place = F.place
                T.code = [F.code]
        else:
            print("T error")
            parseerror = 4
            # exit(0)

    def ParseF(F):  # parsing subroutine for F: F->(E)|id|int;  FIRST: ( id int  FOLLOW: * / + - > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '(':  # (E)
            MatchToken('(')
            E = NonTerminal()
            ParseE(E)
            MatchToken(')')
            F.place = E.place
            F.code = E.code
        elif lookahead[0] == 2:  # id
            E = MatchToken(2)
            F.place = E.place
            F.code = E.code
        elif lookahead[0] == 3:  # int
            E = MatchToken(3)
            F.place = E.place
            F.code = E.code
        else:
            print("F error")
            parseerror = 5
            # exit(0)

    def ParseE1(E1):  # parsing subroutine for E': E'->+TE'|-TE'|e;  FIRST: + - e  FOLLOW: > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '+':  # +TE'
            MatchToken('+')
            T = NonTerminal()
            ParseT(T)
            E2 = NonTerminal()
            ParseE1(E2)
            E1.place = '+' + T.place
            E1.code = [T.code, E2.code, ]
        elif lookahead[1] == '-':  # -TE'
            MatchToken('-')
            T = NonTerminal()
            ParseT(T)
            E2 = NonTerminal()
            ParseE1(E2)
            E1.place = '-' + T.place
            E1.code = [T.code, E2.code, ]
        elif lookahead[1] == ')' or lookahead[1] == ';' \
                or lookahead[1] == '>' or lookahead[1] == '=' or lookahead[1] == '<' or \
                lookahead[1] == 'then' or lookahead[1] == 'do':
            pass
        else:
            print("E1 error")
            parseerror = 5
            # exit(0)

    def ParseT1(T1):  # parsing subroutine for T': T'->*FT'|/FT'|e;  FIRST: * / e  FOLLOW: + - > = < ) ; then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '*':  # *FT'
            MatchToken('*')
            F = NonTerminal()
            ParseF(F)
            T2 = NonTerminal()
            ParseT1(T2)
            T1.place = '*' + F.place
            T1.code = [F.code,  T2.code, ]
        elif lookahead[1] == '/':  # /FT'
            MatchToken('/')
            F = NonTerminal()
            ParseF(F)
            T2 = NonTerminal()
            ParseT1(T2)
            T1.place = '/' + F.place
            T1.code = [F.code, '\n', T2.code, ]
        elif lookahead[1] == '+' or lookahead[1] == '-' or lookahead[1] == ')' or lookahead[1] == ';' or \
                lookahead[1] == '>' or lookahead[1] == '=' or lookahead[1] == '<' or \
                lookahead[1] == 'then' or lookahead[1] == 'do':
            pass
        else:
            print("T1 error")
            parseerror = 5
            # exit(0)

    def ParseC1(C1):  # parsing subroutine for C': C'->>E|=E|<E;  FIRST: > = <  FOLLOW: then do
        global lookahead, parseerror
        if parseerror:
            return
        elif lookahead[1] == '>':
            MatchToken('>')
            E = NonTerminal()
            ParseE(E)
            C1.place = '>' + E.place
            C1.code = [E.code]
        elif lookahead[1] == '=':
            MatchToken('=')
            E = NonTerminal()
            ParseE(E)
            C1.place = '=' + E.place
            C1.code = [E.code]
        elif lookahead[1] == '<':
            MatchToken('<')
            E = NonTerminal()
            ParseE(E)
            C1.place = '<' + E.place
            C1.code = [E.code]
        else:
            print("C1 error")
            parseerror = 5
            # exit(0)

    def MatchToken(need_type):
        global lookahead, parseerror
        A = NonTerminal()
        mate = 0
        if parseerror:
            return
        elif isinstance(need_type, int):  # a token type (int) is expected
            if lookahead[0] == need_type:  # the type of the lookahead matches
                if need_type == 2:
                    A.place = lookahead[1]
                if need_type == 3:
                    # base 0 lets int() interpret the 0x and 0o prefixes
                    A.place = str(int(lookahead[1], 0))
                A.code = ['']
                mate = 1
                lookahead = GetToken()  # read the next token
                return A
        elif lookahead[1] == need_type:  # a literal string is expected and matches
            mate = 1
        if mate:
            lookahead = GetToken()  # read the next token
        else:
            print("expected", need_type, "got", lookahead, "match error")
            parseerror = 6
            # exit(0)

    def GetToken():
        global program, lookahead
        return program.pop(0)

    def newlable():
        global lable_count
        lable_count += 1
        return "L" + str(lable_count)

    def newtemp():
        global z_temp
        z_temp += 1
        return "z" + str(z_temp)

    def Print_Place(code):
        s = ''
        for i in code:
            if isinstance(i, list):
                s += Print_Place(i)
            else:
                s = s + i
        return s

    global program, parseerror, lookahead, lable_count, z_temp
    lable_count = 0  # label counter
    z_temp = 0  # counter for temporaries
    parseerror = 0  # error flag
    lookahead = program.pop(0)
    S = NonTerminal()
    S.begin = newlable()
    S.next = 'L0'
    try:
        ParseS(S)
        if parseerror == 0:
            print("正确")  # prints "correct"
            print("中间代码:")  # prints "intermediate code:"
            print(S.begin,':')
            print(Print_Place(S.code))
            print("L0: # S->next")
    except Exception:
        print("Error")



file = "program.txt"
file = open(file)  # open the input file
while i := file.readline():
    print("# =====================================================================================")
    program = []  # tokens of the line that was just read
    scan(i, 0)
    print(i[:-1])
    print(program)
    Parse()
file.close()
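One limitation of this listing is worth noting: ParseE1 and ParseT1 store only the first operator and operand in place, so in a chain such as x*y/z the trailing /z is parsed but does not reach the generated code. A common remedy is to pass the left operand into the primed subroutine as an inherited attribute and emit one temporary per operator. The sketch below shows the idea for T' only, under simplifying assumptions: translate_term and term_tail are hypothetical names, the input is a well-formed list of strings, and every factor is a single identifier or constant.

```python
from itertools import count


def translate_term(tokens):
    """Translate F (('*' | '/') F)* from left to right.

    Returns the place that holds the result and the generated lines.
    """
    temps = count(1)
    code = []

    def term_tail(left, pos):
        # T' -> * F T' | / F T' | e, with `left` as the inherited attribute
        if pos < len(tokens) and tokens[pos] in ('*', '/'):
            op, right = tokens[pos], tokens[pos + 1]
            temp = f"z{next(temps)}"
            code.append(f"{temp} := {left}{op}{right}")
            return term_tail(temp, pos + 2)   # the temp becomes the new left operand
        return left  # empty alternative: the value is already in `left`

    place = term_tail(tokens[0], 1)
    return place, code


place, code = translate_term(['x', '*', 'y', '/', 'z'])
print("\n".join(code))
print("result in", place)
```

Applying the same pattern to E' would make chains of + and - left-associative in the same way.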

Input example:

The file "program.txt" contains one statement per line:

a = 15*8-25;
while a>3 do a=a+3;
if s>0xa then s=s+1;
while a=0o16 do if a>5 then a = a+1;
while a>5 do if a=5 then while a<8 do a=a+1;
while (a3+15)>0xa do if x2>0o7 then while y<z do y=x*y/z;

Example output (in the program output, 正确 means "correct" and 中间代码 means "intermediate code"):

# =====================================================================================
a = 15*8-25;
[[2, 'a'], [4, '='], [3, '15'], [4, '*'], [3, '8'], [4, '-'], [3, '25'], [5, ';']]
正确
中间代码:
L1 :
z1 := 15*8
z2 := z1-25
a := z2
goto L0

L0: # S->next
# =====================================================================================
while a>3 do a=a+3;
[[1, 'while'], [2, 'a'], [4, '>'], [3, '3'], [1, 'do'], [2, 'a'], [4, '='], [2, 'a'], [4, '+'], [3, '3'], [5, ';']]
正确
中间代码:
L1 :
if a>3 goto L2
 goto L0
L2:
z1 := a+3
a := z1
goto L1

L0: # S->next
# =====================================================================================
if s>0xa then s=s+1;
[[1, 'if'], [2, 's'], [4, '>'], [3, '0xa'], [1, 'then'], [2, 's'], [4, '='], [2, 's'], [4, '+'], [3, '1'], [5, ';']]
正确
中间代码:
L1 :
if s>10 goto L2
 goto L0
L2:
z1 := s+1
s := z1
goto L0

L0: # S->next
# =====================================================================================
while a=0o16 do if a>5 then a = a+1;
[[1, 'while'], [2, 'a'], [4, '='], [3, '0o16'], [1, 'do'], [1, 'if'], [2, 'a'], [4, '>'], [3, '5'], [1, 'then'], [2, 'a'], [4, '='], [2, 'a'], [4, '+'], [3, '1'], [5, ';']]
正确
中间代码:
L1 :
if a=14 goto L2
 goto L0
L2:
if a>5 goto L3
 goto L1
L3:
z1 := a+1
a := z1
goto L1

L0: # S->next
# =====================================================================================
while a>5 do if a=5 then while a<8 do a=a+1;
[[1, 'while'], [2, 'a'], [4, '>'], [3, '5'], [1, 'do'], [1, 'if'], [2, 'a'], [4, '='], [3, '5'], [1, 'then'], [1, 'while'], [2, 'a'], [4, '<'], [3, '8'], [1, 'do'], [2, 'a'], [4, '='], [2, 'a'], [4, '+'], [3, '1'], [5, ';']]
正确
中间代码:
L1 :
if a>5 goto L2
 goto L0
L2:
if a=5 goto L3
 goto L1
L3:
if a<8 goto L4
 goto L1
L4:
z1 := a+1
a := z1
goto L3

L0: # S->next
# =====================================================================================
while (a3+15)>0xa do if x2>0o7 then while y<z do y=x*y/z
[[1, 'while'], [5, '('], [2, 'a3'], [4, '+'], [3, '15'], [5, ')'], [4, '>'], [3, '0xa'], [1, 'do'], [1, 'if'], [2, 'x2'], [4, '>'], [3, '0o7'], [1, 'then'], [1, 'while'], [2, 'y'], [4, '<'], [2, 'z'], [1, 'do'], [2, 'y'], [4, '='], [2, 'x'], [4, '*'], [2, 'y'], [4, '/'], [2, 'z'], [5, ';']]
正确
中间代码:
L1 :
z1 := a3+15
if z1>10 goto L2
 goto L0
L2:
if x2>7 goto L3
 goto L1
L3:
if y<z goto L4
 goto L1
L4:

z2 := x*y
y := z2
goto L3

L0: # S->next

Origin blog.csdn.net/weixin_58196051/article/details/131217551