Re-learn front-end study notes (28) - a quick understanding of compiler theory by four arithmetic interpreter

Notes Description

Re-learn the front Cheng Shao non (winter) Former phone Taobao front-end person in charge] a column open at geeks time, 10 minutes a day, remodeling your front-end knowledge , the author finishing process of learning some of the points notes and insights, complete the column can be added to winter learning [sic] voice of winter, if infringement please contact me, email: [email protected].

First, analysis

Compiled in accordance with the relevant principles of knowledge, which is divided into several steps.

  • 定义四则运算: Lexical definitions and syntax definitions outputs four operations.
  • 词法分析: Input stream into the string token.
  • 语法分析: The token turned into an abstract syntax tree AST.
  • 解释执行: Postorder AST, the implementation of the outcome.

Second, the definition of the four operations

2.1, the definition of lexical

  • Token
    • Number The : 1 2 3 4 5 6 7 8 9 0combination
    • Operator : +、-、*、/One
  • Whitespace<sp>
  • LineTerminator :<LE> <CR>

2.2, the definition of grammar

The syntax for defining most of the use BNF. Backus-Naur Form (BNF: Backus-Naur Formabbreviation) was first introduced by John Backus and Peter Naur to a formal notation to describe the syntax of a given language (first used to describe ALGOL 60 programming language). JavaScript standard which is a similar syntax with homemade BNF.

1, addition is then connected by a plurality of multiplications by plus or minus into:

<Expression> ::=
    <AdditiveExpression><EOF>

<AdditiveExpression> ::=
    <MultiplicativeExpression>
    |<AdditiveExpression><+><MultiplicativeExpression>
    |<AdditiveExpression><-><MultiplicativeExpression>
复制代码

2, as a special case of the conventional digital multiplication:

<MultiplicativeExpression> ::=
    <Number>
    |<MultiplicativeExpression><*><Number>
    |<MultiplicativeExpression></><Number>
复制代码

Above is the four-defined operations.

Third, the lexical analysis: state machine

Lexical analysis: the token stream into a character stream, there are two options, one is the state machine, one is a regular expression, they are equivalent.

3.1, implementing a state machine

// 可能产生四种输入元素,其中只有两种 token,状态机的第一个状态就是根据第一个输入字符来判断进入了哪种状态

var token = [];
function start(char) {
    if(char === '1' || char === '2'|| char === '3'
        || char === '4'|| char === '5'|| char === '6'|| char === '7'
        || char === '8'|| char === '9'|| char === '0') {
        token.push(char);
        return inNumber;
    }
    if(char === '+' || char === '-' || char === '*' || char === '/') {
        emmitToken(char, char);
        return start
    }
    if(char === ' ') {
        return start;
    }
    if(char === '\r' || char === '\n') {
        return start;
    }
}
function inNumber(char) {
    if ( char === '1' || char === '2' || char === '3'
        || char === '4'|| char === '5' || char === '6' || char === '7'
        || char === '8' || char === '9' || char === '0') {
        token.push(char);
        return inNumber;
    } else {
        emmitToken("Number", token.join(""));
        token = [];
        // put back char
        return start(char);
    }
}

// 用函数表示状态,用 if 表示状态的迁移关系,用 return 值表示下一个状态。
复制代码

3.2, running state machine

function emmitToken(type, value) {
    console.log(value);
}

var input = "1024 + 2 * 256"

var state = start;

for(var c of input.split(''))
    state = state(c);

state(Symbol('EOF'))

// 输出结果
1024
+
2
*
256
复制代码

Fourth, the syntax analysis: LL

LL parsing to write a function according to each production.

4.1, write the function name

function AdditiveExpression( ){

}
function MultiplicativeExpression(){

}
复制代码

4.2, assuming've got the token

var tokens = [{
    type:"Number",
    value: "1024"
}, {
    type:"+",
    value: "+"
}, {
    type:"Number",
    value: "2"
}, {
    type:"*",
    value: "*"
}, {
    type:"Number",
    value: "256"
}, {
    type:"EOF"
}];
复制代码

4.3, AdditiveExpression processing

1, three cases

<AdditiveExpression> ::=
    <MultiplicativeExpression>
    |<AdditiveExpression><+><MultiplicativeExpression>
    |<AdditiveExpression><-><MultiplicativeExpression>
复制代码

2, AdditiveExpression wording

AdditiveExpression wording is the root node of the incoming synthesis generative new node.

function AdditiveExpression(source){
    if(source[0].type === "MultiplicativeExpression") {
        let node = {
            type:"AdditiveExpression",
            children:[source[0]]
        }
        source[0] = node;
        return node;
    }
    if(source[0].type === "AdditiveExpression" && source[1].type === "+") {
        let node = {
            type:"AdditiveExpression",
            operator:"+",
            children:[source.shift(), source.shift(), MultiplicativeExpression(source)]
        }
        source.unshift(node);
    }
    if(source[0].type === "AdditiveExpression" && source[1].type === "-") {
        let node = {
            type:"AdditiveExpression",
            operator:"-",
            children:[source.shift(), source.shift(), MultiplicativeExpression(source)]
        }
        source.unshift(node);
    }
}
复制代码

4.4, the processing function Expression

1, the top layer of the parsed token handler Expression imparted.

Expression(tokens);
复制代码

2. If the first Expression receive a token, a Number, it is necessary for production of the first item of the layers unfold, call the appropriate handler according to all possibilities, this process is known as demand in translation principle closure.

function Expression(source){
    if(source[0].type === "AdditiveExpression" && source[1] && source[1].type === "EOF" ) {
        let node = {
            type:"Expression",
            children:[source.shift(), source.shift()]
        }
        source.unshift(node);
        return node;
    }
    AdditiveExpression(source);
    return Expression(source);
}
function AdditiveExpression(source){
    if(source[0].type === "MultiplicativeExpression") {
        let node = {
            type:"AdditiveExpression",
            children:[source[0]]
        }
        source[0] = node;
        return AdditiveExpression(source);
    }
    if(source[0].type === "AdditiveExpression" && source[1] && source[1].type === "+") {
        let node = {
            type:"AdditiveExpression",
            operator:"+",
            children:[]
        }
        node.children.push(source.shift());
        node.children.push(source.shift());
        MultiplicativeExpression(source);
        node.children.push(source.shift());
        source.unshift(node);
        return AdditiveExpression(source);
    }
    if(source[0].type === "AdditiveExpression" && source[1] && source[1].type === "-") {
        let node = {
            type:"AdditiveExpression",
            operator:"-",
            children:[]
        }
        node.children.push(source.shift());
        node.children.push(source.shift());
        MultiplicativeExpression(source);
        node.children.push(source.shift());
        source.unshift(node);
        return AdditiveExpression(source);
    }
    if(source[0].type === "AdditiveExpression")
        return source[0];
    MultiplicativeExpression(source);
    return AdditiveExpression(source);
}
function MultiplicativeExpression(source){
    if(source[0].type === "Number") {
        let node = {
            type:"MultiplicativeExpression",
            children:[source[0]]
        }
        source[0] = node;
        return MultiplicativeExpression(source);
    }
    if(source[0].type === "MultiplicativeExpression" && source[1] && source[1].type === "*") {
        let node = {
            type:"MultiplicativeExpression",
            operator:"*",
            children:[]
        }
        node.children.push(source.shift());
        node.children.push(source.shift());
        node.children.push(source.shift());
        source.unshift(node);
        return MultiplicativeExpression(source);
    }
    if(source[0].type === "MultiplicativeExpression"&& source[1] && source[1].type === "/") {
        let node = {
            type:"MultiplicativeExpression",
            operator:"/",
            children:[]
        }
        node.children.push(source.shift());
        node.children.push(source.shift());
        node.children.push(source.shift());
        source.unshift(node);
        return MultiplicativeExpression(source);
    }
    if(source[0].type === "MultiplicativeExpression")
        return source[0];

    return MultiplicativeExpression(source);
};

var source = [{
    type:"Number",
    value: "3"
}, {
    type:"*",
    value: "*"
}, {
    type:"Number",
    value: "300"
}, {
    type:"+",
    value: "+"
}, {
    type:"Number",
    value: "2"
}, {
    type:"*",
    value: "*"
}, {
    type:"Number",
    value: "256"
}, {
    type:"EOF"
}];
var ast = Expression(source);

console.log(ast);
// 输出结果 children: Array(1) children: Array(3)还可以继续展开。。。
{
    type: "Expression",
    children: [
        {
            type: "AdditiveExpression",
            operator: "+",
            children: [
                {
                    type: "AdditiveExpression",
                    children: Array(1)
                },
                {
                    type: "+",
                    value: "+"
                },
                {
                    type: "MultiplicativeExpression",
                    operator: "*",
                    children: Array(3)
                }
            ]
        },
        {
            type: "EOF"
        }
    ]
}
复制代码

V. interpreted

After getting the AST, only it needs to be done to traverse operations performed on the tree.

// 根据不同的节点类型和其它信息,用 if 分别处理即可:
function evaluate(node) {
    if(node.type === "Expression") {
        return evaluate(node.children[0])
    }
    if(node.type === "AdditiveExpression") {
        if(node.operator === '-') {
            return evaluate(node.children[0]) - evaluate(node.children[2]);
        }
        if(node.operator === '+') {
            return evaluate(node.children[0]) + evaluate(node.children[2]);
        }
        return evaluate(node.children[0])
    }
    if(node.type === "MultiplicativeExpression") {
        if(node.operator === '*') {
            return evaluate(node.children[0]) * evaluate(node.children[2]);
        }
        if(node.operator === '/') {
            return evaluate(node.children[0]) / evaluate(node.children[2]);
        }
        return evaluate(node.children[0])
    }
    if(node.type === "Number") {
        return Number(node.value);
    }
}
复制代码

Personal summary

This is a completely senseless force _(:3」∠)_

Reproduced in: https: //juejin.im/post/5cf7be12f265da1b95704615

Guess you like

Origin blog.csdn.net/weixin_33735077/article/details/91467560
Recommended