"Translated" What is an abstract syntax tree

AST stands for abstract syntax tree. It is a tree representation of the statements and expressions of a programming language, built from the tokens of the source code. With an AST, an interpreter can evaluate a program and a compiler can generate machine code for it.


Suppose we have the following simple expression:

1 + 2

Represented as an AST, it looks like this:

+ BinaryExpression
 - type: +
 - left_value:
  LiteralExpr:
   value: 1
 - right_value:
  LiteralExpr:
   value: 2

An if statement can be expressed as follows:

if(2 > 6) {
    var d  = 90
    console.log(d)
}

IfStatement
 - condition
  + BinaryExpression
   - type: >
   - left_value: 2
   - right_value: 6
 - body
  [
    - Assign
        - left: 'd'
        - right:
            LiteralExpr:
            - value: 90
    - MethodCall:
         - instanceName: console
         - methodName: log
         - args: [
             - Variable: d
         ]
  ]

The AST tells an interpreter how to evaluate the statement, and it tells a compiler how to generate the corresponding code.
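The tree above can also be written down as plain data. The following is a minimal sketch, assuming each node is a plain object with a type field; the field names and the Variable node for the argument d are illustrative choices, not the output of any real parser.

```javascript
// Hypothetical plain-object form of the AST for:
//   if (2 > 6) { var d = 90; console.log(d) }
const ifAst = {
    type: 'IfStatement',
    condition: {
        type: 'BinaryExpression',
        operator: '>',
        left: { type: 'Literal', value: 2 },
        right: { type: 'Literal', value: 6 },
    },
    body: [
        { type: 'Assign', left: 'd', right: { type: 'Literal', value: 90 } },
        {
            type: 'MethodCall',
            instanceName: 'console',
            methodName: 'log',
            args: [{ type: 'Variable', name: 'd' }],
        },
    ],
}

// Walk the tree recursively and collect the type of every node.
function collectTypes(node, out = []) {
    if (Array.isArray(node)) {
        node.forEach(child => collectTypes(child, out))
    } else if (node && typeof node === 'object') {
        if (node.type) out.push(node.type)
        Object.values(node).forEach(child => collectTypes(child, out))
    }
    return out
}

const nodeTypes = collectTypes(ifAst)
console.log(nodeTypes.join(' '))
```

Walking the structure this way is all an interpreter or compiler does at its core: it visits each node and decides what to do based on the node's type.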

Look at the expression 1 + 2. Our brain recognizes it as an addition whose left and right values are added together. To make the computer work the way our brain does, we have to represent the expression in a form similar to how the brain sees it.

We use a class to represent it. Its properties tell the interpreter what the operation is and what the left and right values are. Because a binary operation involves two values, we name the class Binary:

class Binary {  
    constructor(left, operator, right) {  
        this.left = left  
        this.operator = operator  
        this.right = right  
    }  
}

When instantiating it, we pass 1 as the first argument, ADD as the second, and 2 as the third:

new Binary(1, 'ADD', 2)

When we pass it to the interpreter, the interpreter sees that it is a binary operation and checks the operator. Finding that it is an addition, it reads the left and right values of the instance and adds them:

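const binExpr = new Binary(1, 'ADD', 2)

// Numbers are used so that + adds instead of concatenating strings.
function evaluate(expr) {
    if(expr.operator == 'ADD') {
        return expr.left + expr.right
    }
}

evaluate(binExpr)
// returns `3`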

As you can see, an AST lets the computer evaluate expressions and statements the way our brain does.

Single numbers, strings, Boolean values, and so on are expressions too. They can be represented and evaluated in the AST.

23343
false
true
"nnamdi"

Take this one as an example:

1

We represent it in the AST with a Literal class. A literal is a single word or number, and the Literal class has one property to hold it:

class Literal {  
    constructor(value) {  
        this.value = value  
    }  
}

We can represent the literal 1 like this:

new Literal(1)

When the interpreter evaluates it, it reads the value property of the Literal instance:

const oneLit = new Literal(1)  
oneLit.value  
// `1`

In our binary expression, we passed the values directly:

new Binary(1, 'ADD', 2)

This is not quite right. As we saw above, 1 and 2 are themselves expressions, basic ones. As literals, they also need to be evaluated, so they should be represented by the Literal class:

const oneLit = new Literal(1)
const twoLit = new Literal(2)

The binary expression then takes oneLit and twoLit as its left and right properties:

// ...  
new Binary(oneLit, 'ADD', twoLit)

In the evaluation phase, the left and right properties must also be evaluated to obtain their values:

const oneLit = new Literal(1)
const twoLit = new Literal(2)
const binExpr = new Binary(oneLit, 'ADD', twoLit)

// Numbers are used so that + adds instead of concatenating strings.
function evaluate(expr) {
    if(expr.operator == 'ADD') {
        return expr.left.value + expr.right.value
    }
}

evaluate(binExpr)
// returns `3`

Other statements, such as the if statement, mostly use a Binary in their AST as well.

We know that in an if statement, the code block is executed only when the condition is true.

if(9 > 7) {  
    log('Yay!!')  
}

In the if statement above, the code block runs only if 9 is greater than 7, in which case we see Yay!! printed on the terminal.

For an interpreter or compiler to execute it this way, we represent it with a class that has condition and body properties. condition holds the condition, which must evaluate to true for the block to run. body is an array containing all the statements in the if block. The interpreter traverses the array and executes each statement in it.

class IfStmt {  
    constructor(condition, body) {  
        this.condition = condition  
        this.body = body  
    }  
}

Now, let us represent the following statement with the IfStmt class:

if(9 > 7) {  
    log('Yay!!')  
}

The condition is a binary operation, which is expressed as:

const cond = new Binary(new Literal(9), "GREATER", new Literal(7))

Just like before, remember? This time it is a GREATER operation.

The if block contains only one statement: a function call. A function call can also be represented by a class with two properties: name, which refers to the function being called, and args, which holds the arguments passed to it:

class FuncCall {  
    constructor(name, args) {  
        this.name = name  
        this.args = args  
    }  
}

Therefore, the call log('Yay!!') can be expressed as:

const logFuncCall = new FuncCall('log', [new Literal('Yay!!')])

Putting these together, our if statement can be expressed as:

const cond = new Binary(new Literal(9), "GREATER", new Literal(7));
const logFuncCall = new FuncCall('log', [new Literal('Yay!!')]);

const ifStmt = new IfStmt(cond, [
    logFuncCall
])

The interpreter can then interpret the if statement like this:

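const ifStmt = new IfStmt(cond, [
    logFuncCall
])

// evalExpr and evalStmt stand for the interpreter's routines that
// evaluate an expression and execute a statement; they are not shown here.
function interpretIfStatement(ifStmt) {
    if(evalExpr(ifStmt.condition)) {
        for(const stmt of ifStmt.body) {
            evalStmt(stmt)
        }
    }
}

interpretIfStatement(ifStmt)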

Output:

Yay!!

Because 9 > 7 :)

We interpret the if statement by checking whether condition evaluates to true. If it does, we loop through the body array and execute the statements in it.
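The evalExpr and evalStmt helpers are not defined in the article. A minimal sketch, assuming they dispatch on the node's class with instanceof and that log is the only built-in function, might look like this. The output array is added only so the result can be inspected.

```javascript
// Node classes as defined in the article.
class Literal { constructor(value) { this.value = value } }
class Binary {
    constructor(left, operator, right) {
        this.left = left
        this.operator = operator
        this.right = right
    }
}
class IfStmt { constructor(condition, body) { this.condition = condition; this.body = body } }
class FuncCall { constructor(name, args) { this.name = name; this.args = args } }

const output = [] // collects what `log` prints (illustrative addition)

// Assumed helper: evaluates an expression node to a JavaScript value.
function evalExpr(expr) {
    if (expr instanceof Literal) return expr.value
    if (expr instanceof Binary) {
        const left = evalExpr(expr.left)
        const right = evalExpr(expr.right)
        switch (expr.operator) {
            case 'ADD': return left + right
            case 'GREATER': return left > right
        }
        throw new Error(`Unknown operator: ${expr.operator}`)
    }
    throw new Error('Unknown expression node')
}

// Assumed helper: executes a statement node.
function evalStmt(stmt) {
    if (stmt instanceof IfStmt) return interpretIfStatement(stmt)
    if (stmt instanceof FuncCall && stmt.name === 'log') {
        const args = stmt.args.map(evalExpr)
        output.push(args.join(' '))
        console.log(...args)
        return
    }
    throw new Error('Unknown statement node')
}

function interpretIfStatement(ifStmt) {
    if (evalExpr(ifStmt.condition)) {
        for (const stmt of ifStmt.body) evalStmt(stmt)
    }
}

const cond = new Binary(new Literal(9), 'GREATER', new Literal(7))
const ifStmt = new IfStmt(cond, [new FuncCall('log', [new Literal('Yay!!')])])
interpretIfStatement(ifStmt) // prints "Yay!!"
```

Every new node type needs another instanceof branch in these helpers, which is the kind of growth the visitor pattern is meant to organize.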

AST execution

An AST is evaluated using visitors. The visitor pattern is a design pattern that lets an algorithm over a group of objects be implemented in one place.

The AST classes Literal, Binary, and IfStmt form a group of related classes. Each of them needs a method that lets the interpreter obtain its value or evaluate it.

The visitor pattern lets us create a single class in which the evaluation of every AST node is implemented, and hand that class to the AST nodes. Each AST class has a common method that the interpreter calls with an instance of the implementing class. The AST class then calls its corresponding method on that instance to evaluate itself.

class Literal {  
    constructor(value) {  
        this.value = value  
    }

    visit(visitor) {  
        return visitor.visitLiteral(this)  
    }  
}

class Binary {  
    constructor(left, operator, right) {  
        this.left = left  
        this.operator = operator  
        this.right = right  
    }

    visit(visitor) {  
        return visitor.visitBinary(this)  
    }  
}

Look, the AST classes Literal and Binary each have a visit method, and inside it they call the method on the visitor instance that evaluates them. Literal calls visitLiteral and Binary calls visitBinary.

Now, the Visitor class is the implementation class. It implements the visitLiteral and visitBinary methods:

class Visitor {

    visitBinary(binExpr) {
        // ...
        console.log('not yet implemented')
    }

    visitLiteral(litExpr) {
        // ...
        console.log('not yet implemented')
    }
}

visitBinary and visitLiteral each have their own implementation in the Visitor class. So when an interpreter wants to interpret a binary expression, it calls the binary expression's visit method and passes an instance of the Visitor class:

const binExpr = new Binary(...)  
const visitor = new Visitor()

binExpr.visit(visitor)

The visit method calls the visitor's visitBinary, passing the Binary instance to it, which prints not yet implemented. This is known as double dispatch.

  1. The visit method of Binary is called.
  2. It (Binary) in turn calls visitBinary on the Visitor instance.
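Double dispatch means the code that runs depends on two things: the class of the node and the class of the visitor. A minimal sketch of that idea, assuming two hypothetical visitors named Evaluator and Printer that are not part of the article, could be:

```javascript
class Literal {
    constructor(value) { this.value = value }
    visit(visitor) { return visitor.visitLiteral(this) }
}

class Binary {
    constructor(left, operator, right) {
        this.left = left
        this.operator = operator
        this.right = right
    }
    visit(visitor) { return visitor.visitBinary(this) }
}

// Hypothetical visitor 1: computes the value of the tree.
class Evaluator {
    visitLiteral(lit) { return lit.value }
    visitBinary(bin) {
        if (bin.operator === 'ADD') return bin.left.visit(this) + bin.right.visit(this)
        throw new Error(`Unknown operator: ${bin.operator}`)
    }
}

// Hypothetical visitor 2: renders the same tree as text.
class Printer {
    visitLiteral(lit) { return String(lit.value) }
    visitBinary(bin) {
        return `(${bin.left.visit(this)} ${bin.operator} ${bin.right.visit(this)})`
    }
}

const expr = new Binary(new Literal(1), 'ADD', new Literal(2))
const evaluated = expr.visit(new Evaluator())
const printed = expr.visit(new Printer())
console.log(evaluated) // 3
console.log(printed)   // (1 ADD 2)
```

The node classes stay unchanged while each visitor keeps one whole algorithm in a single class.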

Let's write the full code for visitLiteral. Since the value property of a Literal instance holds the value, we simply return it:

class Visitor {

    visitBinary(binExpr) {
        // ...
        console.log('not yet implemented')
    }

    visitLiteral(litExpr) {
        return litExpr.value
    }
}

For visitBinary, we know that the Binary class has operator, left, and right properties. The operator property tells us which operation to perform on left and right. We can write the implementation as follows:

class Visitor {

    visitBinary(binExpr) {  
        switch(binExpr.operator) {  
            case 'ADD':  
            // ...  
        }  
    }

    visitLiteral(litExpr) {  
        return litExpr.value  
    }  
}

Note that the left and right values are expressions: they may be literal expressions, binary expressions, call expressions, or any other kind of expression. We cannot assume that both sides of a binary operation are always literals. Every expression must have a visit method used to evaluate it, so in visitBinary above we evaluate the left and right properties of the Binary by calling visit on each of them:

class Visitor {

    visitBinary(binExpr) {  
        switch(binExpr.operator) {  
            case 'ADD':  
                return binExpr.left.visit(this) + binExpr.right.visit(this)  
        }  
    }

    visitLiteral(litExpr) {  
        return litExpr.value  
    }  
}

So whatever kind of expression the left and right values hold, they can be evaluated.

So, if we have the following statements:

const oneLit = new Literal(1)
const twoLit = new Literal(2)
const binExpr = new Binary(oneLit, 'ADD', twoLit)
const visitor = new Visitor()

binExpr.visit(visitor)

In this case, the binary operation holds literals.

The visitor's visitBinary is called with binExpr passed in. Inside visitBinary, oneLit is the left value and twoLit is the right value. Because oneLit and twoLit are both Literal instances, their visit methods are invoked with the Visitor instance passed in. For oneLit, the Literal class calls the Visitor's visitLiteral method, passing oneLit, and visitLiteral returns the value property of the Literal, which is 1. Likewise, for twoLit the returned value is 2.

Since the switch statement takes the case 'ADD' branch, the returned values are added together, and 3 is finally returned.

If we pass binExpr.visit(visitor) to console.log, it prints 3:

console.log(binExpr.visit(visitor))  
// 3
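The order of calls described above can be made visible by recording each visitor method as it runs. This is a minimal sketch; TracingVisitor and trace are hypothetical names used only for illustration, while the node classes follow the article.

```javascript
class Literal {
    constructor(value) { this.value = value }
    visit(visitor) { return visitor.visitLiteral(this) }
}

class Binary {
    constructor(left, operator, right) {
        this.left = left
        this.operator = operator
        this.right = right
    }
    visit(visitor) { return visitor.visitBinary(this) }
}

const trace = [] // records each visitor method call in order

class TracingVisitor {
    visitBinary(binExpr) {
        trace.push(`visitBinary(${binExpr.operator})`)
        switch (binExpr.operator) {
            case 'ADD':
                // The left operand is visited before the right one.
                return binExpr.left.visit(this) + binExpr.right.visit(this)
        }
    }

    visitLiteral(litExpr) {
        trace.push(`visitLiteral(${litExpr.value})`)
        return litExpr.value
    }
}

const tracedExpr = new Binary(new Literal(1), 'ADD', new Literal(2))
const tracedResult = tracedExpr.visit(new TracingVisitor())
console.log(trace.join(' -> ')) // visitBinary(ADD) -> visitLiteral(1) -> visitLiteral(2)
console.log(tracedResult)       // 3
```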

Next, let's chain a binary operation with three operands:

1 + 2 + 3

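Here we group it as 1 + (2 + 3): 2 + 3 is evaluated first, and its result becomes the right value of the outer addition. (Addition is usually parsed left to right, as (1 + 2) + 3, but the sum is the same either way.)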

The expression can be represented with the Binary class as:

new Binary (new Literal(1), 'ADD', new Binary(new Literal(2), 'ADD', new Literal(3)))

We can see that the right value is not a literal but a binary expression. So before performing the addition, the visitor must first evaluate this inner binary expression and use its result as the right value of the final evaluation.

const oneLit = new Literal(1)
const threeLit = new Literal(3)
const twoLit = new Literal(2)

const binExpr2 = new Binary(twoLit, 'ADD', threeLit)
const binExpr1 = new Binary(oneLit, 'ADD', binExpr2)

const visitor = new Visitor()

console.log(binExpr1.visit(visitor))
// 6

Adding the if statement

Let's bring the if statement into the mix. To evaluate an if statement, we add a visit method to the IfStmt class, which calls the visitIfStmt method:

class IfStmt {  
    constructor(condition, body) {  
        this.condition = condition  
        this.body = body  
    }

    visit(visitor) {  
        return visitor.visitIfStmt(this)  
    }  
}

Do you see the power of the visitor pattern? When we add a new class to the group, we only need to give it the same visit method, which calls its corresponding method in the Visitor class. This does not break or affect the other related classes, so the visitor pattern lets us follow the open-closed principle.
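As a small illustration of that claim, here is a sketch that adds a hypothetical Negate node (arithmetic negation, as in -5). Negate and visitNegate are assumptions made for this example and do not appear in the article; note that Literal is left untouched.

```javascript
class Literal {
    constructor(value) { this.value = value }
    visit(visitor) { return visitor.visitLiteral(this) }
}

// Hypothetical new node type: it only needs the usual visit method.
class Negate {
    constructor(operand) { this.operand = operand }
    visit(visitor) { return visitor.visitNegate(this) }
}

class Visitor {
    visitLiteral(litExpr) { return litExpr.value }

    // The only change to the visitor: one new method for the new node.
    visitNegate(negExpr) { return -negExpr.operand.visit(this) }
}

const negated = new Negate(new Literal(5)).visit(new Visitor())
console.log(negated) // -5
```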

Next, we implement visitIfStmt in the Visitor class:

class Visitor {  
    // ...

    visitIfStmt(ifStmt) {  
        if(ifStmt.condition.visit(this)) {  
            for(const stmt of ifStmt.body) {  
                stmt.visit(this)  
            }  
        }  
    }  
}

Because the condition is an expression, we call its visit method to evaluate it. We check the returned value with a JS if statement. If it is truthy, we loop through ifStmt.body and evaluate each statement in the array by calling its visit method and passing the Visitor.
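One detail is easy to miss: the visitBinary shown earlier handles only 'ADD', so a condition built with "GREATER" would evaluate to undefined and the block would never run. A minimal sketch that adds a GREATER case is given below. It assumes FuncCall already has a visit method and that log is the only function available; the printed array exists only so the result can be checked.

```javascript
class Literal {
    constructor(value) { this.value = value }
    visit(visitor) { return visitor.visitLiteral(this) }
}
class Binary {
    constructor(left, operator, right) {
        this.left = left
        this.operator = operator
        this.right = right
    }
    visit(visitor) { return visitor.visitBinary(this) }
}
class IfStmt {
    constructor(condition, body) { this.condition = condition; this.body = body }
    visit(visitor) { return visitor.visitIfStmt(this) }
}
class FuncCall {
    constructor(name, args) { this.name = name; this.args = args }
    visit(visitor) { return visitor.visitFuncCall(this) }
}

const printed = [] // illustrative: records what `log` prints

class Visitor {
    visitLiteral(litExpr) { return litExpr.value }

    visitBinary(binExpr) {
        const left = binExpr.left.visit(this)
        const right = binExpr.right.visit(this)
        switch (binExpr.operator) {
            case 'ADD': return left + right
            case 'GREATER': return left > right
            default: throw new Error(`Unknown operator: ${binExpr.operator}`)
        }
    }

    visitIfStmt(ifStmt) {
        if (ifStmt.condition.visit(this)) {
            for (const stmt of ifStmt.body) stmt.visit(this)
        }
    }

    // Simplified assumption: only the built-in `log` is supported here.
    visitFuncCall(funcCall) {
        const args = funcCall.args.map(expr => expr.visit(this))
        if (funcCall.name === 'log') {
            printed.push(args.join(' '))
            console.log(...args)
        }
    }
}

const visitor = new Visitor()
const makeBody = () => [new FuncCall('log', [new Literal('Yay!!')])]

// if (9 > 7) { log('Yay!!') }   -> prints "Yay!!"
new IfStmt(new Binary(new Literal(9), 'GREATER', new Literal(7)), makeBody()).visit(visitor)
// if (67 > 90) { log('Yay!!') } -> prints nothing
new IfStmt(new Binary(new Literal(67), 'GREATER', new Literal(90)), makeBody()).visit(visitor)
```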

This is how we interpret a statement such as:

if(67 > 90)

Adding function calls and function declarations

Next, let's add function calls. We already have a class for them:

class FuncCall {  
    constructor(name, args) {  
        this.name = name  
        this.args = args  
    }  
}

Add a visit method:

class FuncCall {  
    constructor(name, args) {  
        this.name = name  
        this.args = args  
    }

    visit(visitor) {  
        return visitor.visitFuncCall(this)  
    }  
}

Add a visitFuncCall method to the Visitor class:

class Visitor {  
    // ...

    visitFuncCall(funcCall) {  
        const funcName = funcCall.name  
        const args = []  
        for(const expr of funcCall.args)  
            args.push(expr.visit(this))  
        // ...  
    }  
}

There is a problem here. Besides built-in functions, there are user-defined functions. We need to create a "container" for the latter, in which functions are saved and looked up by name.

// An immediately invoked function creates a single shared FuncStore instance.
const FuncStore = (() => {
    class FuncStore {

        constructor() {
            this.map = new Map()
        }

        setFunc(name, body) {
            this.map.set(name, body)
        }

        getFunc(name) {
            return this.map.get(name)
        }
    }
    return new FuncStore()
})()

FuncStore saves functions in a Map and retrieves them from it.

class Visitor {  
    // ...

    visitFuncCall(funcCall) {  
        const funcName = funcCall.name  
        const args = []  
        for(const expr of funcCall.args)  
            args.push(expr.visit(this))  
        if(funcName == "log")  
            console.log(...args)  
        if(FuncStore.getFunc(funcName))  
            FuncStore.getFunc(funcName).forEach(stmt => stmt.visit(this))  
    }  
}

Look at what we did. If the function name funcName (remember, the FuncCall class stores the function name in its name property) is log, we run JS console.log(...) and pass the arguments to it. If we find the function in the function store, we loop through its body, visiting and executing each statement in turn.

Now let's see how function declarations get into the function store.

A function declaration begins with the function keyword. Its general structure is:

function function_name(params) {  
    // function body  
}

Therefore, we can represent a function declaration with a class that has two properties: name holds the function's name, and body is an array holding the statements of the function body:

class FunctionDeclaration {  
    constructor(name, body) {  
        this.name = name  
        this.body = body  
    }  
}

We add a visit method, which calls visitFunctionDeclaration in the Visitor:

class FunctionDeclaration {  
    constructor(name, body) {  
        this.name = name  
        this.body = body  
    }

    visit(visitor) {  
        return visitor.visitFunctionDeclaration(this)  
    }  
}

In the Visitor:

class Visitor {  
    // ...

    visitFunctionDeclaration(funcDecl) {  
        FuncStore.setFunc(funcDecl.name, funcDecl.body)  
    }  
}

The function is saved with its name as the key.

Now, suppose we have the following functions:

function addNumbers(a, b) {  
    log(a + b)  
}

function logNumbers() {  
    log(5)  
    log(6)  
}

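logNumbers can be expressed as follows (addNumbers would also need parameters and variable lookup, which our AST classes do not model yet):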

const funcDecl = new FunctionDeclaration('logNumbers', [  
    new FuncCall('log', [new Literal(5)]),  
    new FuncCall('log', [new Literal(6)])  
])

visitor.visitFunctionDeclaration(funcDecl)

Now, let's call the function logNumbers:

const funcCall = new FuncCall('logNumbers', [])  
visitor.visitFuncCall(funcCall)

The console will print:

5
6
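For reference, the pieces from this section can be assembled into one runnable sketch. It follows the article's classes, with two assumptions of my own: calling an unknown function throws an error, and a printed array records what log receives.

```javascript
const FuncStore = (() => {
    class FuncStore {
        constructor() { this.map = new Map() }
        setFunc(name, body) { this.map.set(name, body) }
        getFunc(name) { return this.map.get(name) }
    }
    return new FuncStore()
})()

class Literal {
    constructor(value) { this.value = value }
    visit(visitor) { return visitor.visitLiteral(this) }
}
class FuncCall {
    constructor(name, args) { this.name = name; this.args = args }
    visit(visitor) { return visitor.visitFuncCall(this) }
}
class FunctionDeclaration {
    constructor(name, body) { this.name = name; this.body = body }
    visit(visitor) { return visitor.visitFunctionDeclaration(this) }
}

const printed = [] // illustrative: records the values passed to `log`

class Visitor {
    visitLiteral(litExpr) { return litExpr.value }

    // Declaring a function only stores its body under its name.
    visitFunctionDeclaration(funcDecl) {
        FuncStore.setFunc(funcDecl.name, funcDecl.body)
    }

    visitFuncCall(funcCall) {
        const args = funcCall.args.map(expr => expr.visit(this))
        if (funcCall.name === 'log') {
            printed.push(...args)
            console.log(...args)
            return
        }
        const body = FuncStore.getFunc(funcCall.name)
        // Assumption: an unknown function is an error rather than a silent no-op.
        if (!body) throw new Error(`Undefined function: ${funcCall.name}`)
        body.forEach(stmt => stmt.visit(this))
    }
}

const visitor = new Visitor()

// function logNumbers() { log(5); log(6) }
new FunctionDeclaration('logNumbers', [
    new FuncCall('log', [new Literal(5)]),
    new FuncCall('log', [new Literal(6)]),
]).visit(visitor)

// logNumbers()
new FuncCall('logNumbers', []).visit(visitor) // prints 5, then 6
```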

Conclusion

Understanding ASTs can be daunting and mentally taxing. Even the simplest parser requires a lot of code.

Note that we did not cover scanners and parsers. We jumped ahead to ASTs to show how they work. If you understand in depth what an AST is and what it requires, you will be more than halfway there when you start writing your own programming language.

Practice makes perfect. You can go on to add other programming language features, such as:

  • Classes and Objects
  • Method Invocation
  • Encapsulation and inheritance
  • for-of Statement
  • while Statement
  • for-in Statement
  • Any other interesting features you can think of
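As a starting point for one item on the list, here is a rough sketch of a while statement. A loop needs state that changes, so the sketch also assumes Variable and Assign nodes and an env Map on the visitor. All of these names, and the 'LESS' operator, are hypothetical and not part of the article.

```javascript
class Literal {
    constructor(value) { this.value = value }
    visit(visitor) { return visitor.visitLiteral(this) }
}
class Binary {
    constructor(left, operator, right) {
        this.left = left
        this.operator = operator
        this.right = right
    }
    visit(visitor) { return visitor.visitBinary(this) }
}
// Hypothetical nodes, not defined in the article:
class Variable {
    constructor(name) { this.name = name }
    visit(visitor) { return visitor.visitVariable(this) }
}
class Assign {
    constructor(name, value) { this.name = name; this.value = value }
    visit(visitor) { return visitor.visitAssign(this) }
}
class WhileStmt {
    constructor(condition, body) { this.condition = condition; this.body = body }
    visit(visitor) { return visitor.visitWhileStmt(this) }
}

class Visitor {
    constructor() { this.env = new Map() } // variable name -> current value

    visitLiteral(litExpr) { return litExpr.value }

    visitVariable(variable) {
        if (!this.env.has(variable.name)) throw new Error(`Undefined variable: ${variable.name}`)
        return this.env.get(variable.name)
    }

    visitAssign(assign) { this.env.set(assign.name, assign.value.visit(this)) }

    visitBinary(binExpr) {
        const left = binExpr.left.visit(this)
        const right = binExpr.right.visit(this)
        switch (binExpr.operator) {
            case 'ADD': return left + right
            case 'LESS': return left < right
            default: throw new Error(`Unknown operator: ${binExpr.operator}`)
        }
    }

    // Re-evaluate the condition before every pass over the body.
    visitWhileStmt(whileStmt) {
        while (whileStmt.condition.visit(this)) {
            for (const stmt of whileStmt.body) stmt.visit(this)
        }
    }
}

// Equivalent source: i = 0; while (i < 3) { i = i + 1 }
const loopVisitor = new Visitor()
new Assign('i', new Literal(0)).visit(loopVisitor)
new WhileStmt(
    new Binary(new Variable('i'), 'LESS', new Literal(3)),
    [new Assign('i', new Binary(new Variable('i'), 'ADD', new Literal(1)))]
).visit(loopVisitor)

const finalI = loopVisitor.env.get('i')
console.log(finalI) // 3
```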

If you have any questions, or anything you think I should add, correct, or remove, feel free to comment or email me.

Thanks!!!

Reproduced from: https://juejin.im/post/5d05b9356fb9a07ef56234a3


Origin blog.csdn.net/weixin_34337381/article/details/93181415