[Formal method] Part A: Concrete execution

In this lesson, we discussed one style of defining program semantics: operational semantics, in which programs are executed on an abstract virtual machine. We also discussed symbolic execution, which is similar to operational semantics except that the program runs on a symbolic virtual machine and all program paths are explored systematically. Finally, we discussed concolic execution, which combines concrete and symbolic execution of the program.

In this assignment, we will build several executors: a concrete executor, a symbolic executor, and a concolic executor. The assignment is divided into four parts, each containing some tutorials and problems. The first part covers concrete execution: you will implement an executor based on the big-step operational semantics we discussed in class. The second part covers symbolic execution, and the third covers concolic execution, which combines symbolic and concrete execution. In the last part, we will gain some experience with applications of symbolic execution by applying the technique to an easy-to-understand security problem: SQL injection. Problems marked as exercises are required; problems marked as challenges are optional. First download this code template.

Part A: Concrete execution

The idea of concrete execution is very simple: we build a virtual machine to execute the program. The VM's execution rules follow the operational semantics (big-step or small-step, depending on the style you choose). Such an execution engine is usually called an interpreter, because it interprets the code directly instead of compiling it to native code.
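The idea can be sketched in a few lines. The snippet below is only an illustration, not the assignment's executor: it evaluates arithmetic expressions in big-step style, where each call returns the final value of its sub-expression. The tuple encoding and the names `eval_expr`, `OPS`, and `store` are assumptions made for this example.

```python
import operator

# Hypothetical encoding (not from the course template): an int is a literal,
# a str is a variable name, and a tuple (op, left, right) is a binary operation.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def eval_expr(expr, store):
    """Big-step evaluation: reduce an expression to its final value in one call."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return store[expr]  # look the variable up in the store
    op, left, right = expr
    return OPS[op](eval_expr(left, store), eval_expr(right, store))


# (n - 3) * 2 evaluated with n = 10
result = eval_expr(("*", ("-", "n", 3), 2), {"n": 10})
print(result)  # 14
```

A small-step interpreter would instead rewrite the expression one reduction at a time; the big-step form is usually shorter because the host language's call stack does the bookkeeping.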

In this part, we will build a concrete executor (interpreter) for the MiniPy language.

MiniPy's abstract syntax is given in the grammar comment at the top of the code listing below.

In that grammar, the nonterminal F represents a function, where (x1, ..., xn) are the function's parameters, S is the function body, and E is the return expression. The nonterminal S represents a statement, E represents an expression, and B represents a binary operator.

The concrete executor (interpreter) you will build executes MiniPy source code directly. To do this, you will first build a data structure to represent MiniPy's syntax. This data structure is called an abstract syntax tree (AST); we have seen it many times in previous assignments, for various languages.
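To make the notion of an AST concrete before reading the template, here is a minimal sketch. The node classes `Num`, `Var`, and `BinOp` and the helper `depth` are hypothetical names chosen for this illustration; the classes the assignment actually uses are the ones defined in mini_py.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: int


@dataclass
class Var:
    name: str


@dataclass
class BinOp:
    op: str
    left: "Node"
    right: "Node"


# Hypothetical node type for this sketch only.
Node = Union[Num, Var, BinOp]


def depth(node: Node) -> int:
    """Height of the tree; a leaf (number or variable) has depth 1."""
    if isinstance(node, BinOp):
        return 1 + max(depth(node.left), depth(node.right))
    return 1


# The source text `i <= n - 3` becomes a tree whose root is `<=`.
tree = BinOp("<=", Var("i"), BinOp("-", Var("n"), Num(3)))
print(tree)
print(depth(tree))  # 3
```

The tree keeps the nesting of the expression explicit, so later passes (printing, execution) can recurse over nodes rather than re-parse text.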

Exercise 1: Read the code in mini_py. It contains the AST data structures defined for you. To keep things simple (but not simpler), we modified the MiniPy syntax; the biggest change is the removal of the function call statement f(E1, ..., En). You will also find that the magic method __str__() of the Function class is not implemented. Your task in this exercise is to complete it. After finishing the code, don't forget to run the test case and check that it prints the expected result.

from enum import Enum
from typing import List

# The following comment gives the syntax of the MiniPy language:
'''
B ::= + | - | * | / | == | != | > | < | >= | <=
E ::= n | x | E B E
S ::= pass
    | x = E
    | seq(S, S)
    | f(E1, ..., En)
    | if(E, S, S)
    | while(E, S)
F ::= f((x1, ..., xn), S, E)
'''


##################################
# bops
class Bop(Enum):
    ADD = "+"
    MIN = "-"
    MUL = "*"
    DIV = "/"
    EQ = "=="
    NE = "!="
    GT = ">"
    GE = ">="
    LT = "<"
    LE = "<="


##########################################
# expressions
class Expr:
    pass


class ExprNum(Expr):
    def __init__(self, n: int):
        self.num = n

    def __str__(self):
        return f"{self.num}"


class ExprVar(Expr):
    def __init__(self, var: str):
        self.var = var

    def __str__(self):
        return f"{self.var}"


class ExprBop(Expr):
    # Binary-operation expression, e.g. a + b or a - b
    def __init__(self, left: Expr, right: Expr, bop: Bop):
        self.left = left
        self.right = right
        self.bop = bop

    def __str__(self):
        if isinstance(self.left, ExprBop):
            left_str = f"({self.left})"
        else:
            left_str = f"{self.left}"

        if isinstance(self.right, ExprBop):
            right_str = f"({self.right})"
        else:
            right_str = f"{self.right}"

        return f"{left_str} {self.bop.value} {right_str}"  # left OP right


###############################################
# statement
class Stmt:
    def __init__(self):
        self.level = 0

    def __repr__(self):
        return str(self)


class StmtAssign(Stmt):
    # Assignment statement, e.g. x = ...

    def __init__(self, var: str, expr: Expr):
        super().__init__()
        self.var = var
        self.expr = expr

    def __str__(self):
        indent_space = self.level * "\t"
        return f"{indent_space}{self.var} = {self.expr}\n"  # a = expression


class StmtIf(Stmt):
    def __init__(self, expr: Expr, then_stmts: List[Stmt], else_stmts: List[Stmt]):
        super().__init__()
        self.expr = expr
        self.then_stmts = then_stmts
        self.else_stmts = else_stmts

    def __str__(self):
        indent_space = self.level * "\t"

        for stm in self.then_stmts:
            stm.level = self.level + 1

        for stm in self.else_stmts:
            stm.level = self.level + 1

        then_stmts_str = "".join([str(stmt) for stmt in self.then_stmts])
        else_stmts_str = "".join([str(stmt) for stmt in self.else_stmts])

        then_str = (f"{indent_space}if {self.expr}:\n"
                    f"{then_stmts_str}")

        if self.else_stmts:
            return (f"{then_str}"
                    f"{indent_space}else:\n"
                    f"{else_stmts_str}")
        else:
            return then_str

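# A while statement has two parts: a condition and a body. The body contains
# several statements, so it is passed as a list of statements (List[Stmt]).
# The condition is usually a comparison, so it is built with ExprBop.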
class StmtWhile(Stmt):
    def __init__(self, expr: Expr, stmts: List[Stmt]):
        super().__init__()
        self.expr = expr
        self.stmts = stmts

    def __str__(self):
        indent_space = self.level * "\t"
        for stmt in self.stmts:
            stmt.level = self.level + 1

        # Body: print the statements in the list one by one
        stmts_str = "".join([str(stmt) for stmt in self.stmts])

        return (f"{indent_space}while {self.expr}:\n"
                f"{stmts_str}")
        # The result is "while <condition>:", a newline, then the body


###############################################
# function
class Function:
    def __init__(self, name: str, args: List[str], stmts: List[Stmt], ret: Expr):
        self.name = name
        self.args = args
        self.stmts = stmts
        self.ret = ret

    def __str__(self):
        arg_str = ",".join(self.args)
        # Statements in the function body sit one indentation level deep.
        # Assigning (rather than incrementing) keeps repeated printing stable.
        for stmt in self.stmts:
            stmt.level = 1

        stmts_str = "".join([str(stmt) for stmt in self.stmts])

        # exercise 1: Finish the magic method __str__() to get the
        # desired code-printing result:
        #
        # Your code here:

        # Header line (with its trailing colon), the body, then the return line.
        return f"def {self.name}({arg_str}):\n{stmts_str}\treturn {self.ret}"

###############################################
# test
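# StmtAssign('s', ExprNum(0)) represents s = 0
# StmtAssign('i', ExprNum(0)) represents i = 0
# StmtWhile represents a while statement. It needs a condition, which is an
# expression, so we build it with ExprBop. Here the loop is: while i <= n - 3
# ExprBop(left, right, operator)
# ExprBop(i, n - 3, <=)
# n - 3 is itself built with ExprBop: ExprBop(n, 3, -)
# Using the proper node types: i becomes ExprVar('i'), n becomes ExprVar('n'),
# 3 becomes ExprNum(3), - becomes Bop.MIN, and <= becomes Bop.LE
# So the while condition is:
# ExprBop(ExprVar('i'), ExprBop(ExprVar('n'), ExprNum(3), Bop.MIN), Bop.LE)
# The if statements are built in the same way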
test_stmt = [StmtAssign('s', ExprNum(0)),
             StmtAssign('i', ExprNum(0)),
             StmtWhile(ExprBop(ExprVar('i'), ExprBop(ExprVar('n'), ExprNum(3), Bop.MIN), Bop.LE),
                       [StmtAssign('s', ExprBop(ExprVar('s'), ExprVar('i'), Bop.ADD)),
                        StmtAssign('i', ExprBop(ExprVar('i'), ExprNum(1), Bop.ADD)),
                        StmtIf(ExprBop(ExprVar('s'), ExprVar('i'), Bop.GT),
                               [StmtAssign("b", ExprBop(ExprVar('s'), ExprNum(1), Bop.MIN))],
                               [])
                        ]),
             StmtIf(ExprBop(ExprVar('s'), ExprVar('i'), Bop.GT),
                    [StmtAssign("s", ExprBop(ExprVar('i'), ExprNum(1), Bop.MIN))],
                    [StmtAssign("s", ExprBop(ExprVar('i'), ExprNum(1), Bop.ADD))])
             ]

test_func = Function(name='printer_test', args=['n'], stmts=test_stmt, ret=ExprVar('s'))


if __name__ == '__main__':
    # Your code should print:
    #
    # def printer_test(n):
    #     s = 0
    #     i = 0
    #     while i <= (n - 3):
    #         s = s + i
    #         i = i + 1
    #         if s > i:
    #             b = s - 1
    #     if s > i:
    #         s = i - 1
    #     else:
    #         s = i + 1
    #     return s
    #
    print(test_func)

The output is:

 

To build a concrete executor, we first need to define a memory model. For this, we will use Python's dataclass mechanism. The file concrete.py defines a Python dataclass for memory, which contains the definition of the concrete memory. The memory operations are defined as:

Exercise 2: Read the code in the concrete.py file and make sure you understand the concrete memory model data structure before continuing.
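As a rough picture of what such a memory model can look like, here is a minimal sketch. It is not the contents of concrete.py: the class name `Memory`, its fields `args` and `concrete_memory`, and the methods `store` and `load` are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Memory:
    """Hypothetical concrete memory: maps each variable name to an int."""
    args: List[str]
    concrete_memory: Dict[str, int] = field(default_factory=dict)

    def store(self, var: str, value: int) -> None:
        # Writing a variable creates it or overwrites its old value.
        self.concrete_memory[var] = value

    def load(self, var: str) -> int:
        # Reading an undefined variable raises KeyError.
        return self.concrete_memory[var]


memory = Memory(args=["n"])
memory.store("n", 10)                    # bind the argument n to 10
memory.store("s", memory.load("n") - 3)  # models the assignment s = n - 3
print(memory.load("s"))  # 7
```

Using `field(default_factory=dict)` gives every `Memory` instance its own dictionary, which matters once several executions run side by side.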

 

USTC School of Software Engineering: notes from Hua Baojian's formal methods course. Feel free to message me to discuss.

 

 

 


Origin blog.csdn.net/weixin_41950078/article/details/112745090