Abstract Syntax Tree

What is an abstract syntax tree?

An abstract syntax tree (AST) is a hierarchical representation of a program that presents the structure of the source code according to the grammar of the programming language. It is abstract in that it captures the syntactic structure as a tree, and each node in the tree corresponds to one construct in the source code.

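Don't worry if that doesn't make sense yet. Abstract syntax trees are a broad topic, and we don't need to go through every part of it.

This article will help you build a first intuition for what an abstract syntax tree is.

We only need to focus on lexical analysis and syntax analysis, the two steps that play the most important role in turning source code into an abstract syntax tree.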

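Lexical Analysis

This step is performed by the scanner (also called the lexer). It reads every character of the source code and groups the characters into tokens, so the source code ends up as a list of tokens.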

input => const a = 5;
output => [{type: 'keyword', value: 'const', ...}, {type: 'identifier', value: 'a', ...}, {type: 'value', value: '5', ...}, ...]
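The scanning step can be sketched with a toy tokenizer. This is only an illustration under simplifying assumptions: it handles just the characters needed for `const a = 5;`, the function name `tokenize` and the `punctuator` token type are made up for this example, and it is not how Babel's lexer is actually implemented.

```javascript
// Toy tokenizer: a sketch for illustration, not a real JavaScript lexer.
// Token type names follow the article's example; 'punctuator' is an assumption.
const KEYWORDS = new Set(['const', 'let', 'var']);

function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    // Whitespace separates tokens but produces no token of its own.
    if (/\s/.test(ch)) {
      i++;
      continue;
    }
    // A word is either a keyword or an identifier.
    if (/[A-Za-z_$]/.test(ch)) {
      let j = i;
      while (j < source.length && /[A-Za-z0-9_$]/.test(source[j])) j++;
      const word = source.slice(i, j);
      tokens.push({ type: KEYWORDS.has(word) ? 'keyword' : 'identifier', value: word });
      i = j;
      continue;
    }
    // A run of digits is a numeric value (kept as a string, as in the article).
    if (/[0-9]/.test(ch)) {
      let j = i;
      while (j < source.length && /[0-9]/.test(source[j])) j++;
      tokens.push({ type: 'value', value: source.slice(i, j) });
      i = j;
      continue;
    }
    if (ch === '=' || ch === ';') {
      tokens.push({ type: 'punctuator', value: ch });
      i++;
      continue;
    }
    throw new SyntaxError(`Unexpected character "${ch}" at index ${i}`);
  }
  return tokens;
}

const tokens = tokenize('const a = 5;');
console.log(tokens);
```

Note that the whitespace in the input leaves no trace in the token list; only the meaningful pieces of the source survive.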

Syntax Analysis

This step is performed by the parser. It takes the list of tokens produced by the lexer and converts it into a tree representation.

input => [{type: 'keyword', value: 'const', ...}, {type: 'identifier', value: 'a', ...}, {type: 'value', value: '5', ...}, ...]
output => {type: 'VariableDeclaration', kind: 'const', declarations: [{type: 'VariableDeclarator', id: {type: 'Identifier', name: 'a'}, init: {type: 'Literal', value: 5}, ...}], ...}
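The parsing step can be sketched in the same spirit. The toy parser below accepts only one statement shape (`<keyword> <identifier> = <number>;`) and builds an ESTree-style `VariableDeclaration` node. The function name `parseVariableDeclaration`, the `punctuator` token type, and the exact node layout are assumptions made for this illustration; a real parser handles the whole grammar and records much more information, such as source locations.

```javascript
// Toy parser: a sketch for illustration, not a real JavaScript parser.
// It consumes tokens in order and fails as soon as one is not what it expects.
function parseVariableDeclaration(tokenList) {
  let pos = 0;

  // Consume the next token if it has the expected type (and value, if given).
  function expect(type, value) {
    const token = tokenList[pos];
    const matches =
      token !== undefined &&
      token.type === type &&
      (value === undefined || token.value === value);
    if (!matches) {
      throw new SyntaxError(`Expected ${value !== undefined ? value : type} at token ${pos}`);
    }
    pos++;
    return token;
  }

  const kind = expect('keyword').value;
  const name = expect('identifier').value;
  expect('punctuator', '=');
  const raw = expect('value').value;
  expect('punctuator', ';');

  // The '=' and ';' tokens are not stored: the tree shape already implies them.
  return {
    type: 'VariableDeclaration',
    kind,
    declarations: [
      {
        type: 'VariableDeclarator',
        id: { type: 'Identifier', name },
        init: { type: 'Literal', value: Number(raw) },
      },
    ],
  };
}

const tokenList = [
  { type: 'keyword', value: 'const' },
  { type: 'identifier', value: 'a' },
  { type: 'punctuator', value: '=' },
  { type: 'value', value: '5' },
  { type: 'punctuator', value: ';' },
];
const tree = parseVariableDeclaration(tokenList);
console.log(JSON.stringify(tree, null, 2));
```

The flat list has become a nested structure: the declaration owns a declarator, which in turn owns an identifier and a literal.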

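In the end, after lexical analysis and syntax analysis, our code has been converted into tree nodes.

Taken together, these nodes form the abstract syntax tree. The tree does not match the source text 100% (details such as whitespace are typically dropped), but it contains enough information for later stages to process the code correctly.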

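Babel

Babel is a JavaScript compiler. It parses code written in newer ECMAScript syntax and generates backward-compatible JavaScript that older environments can run.

How does it work?

At a high level, Babel's compilation has three steps:

parser => transform => generate

We will use pseudocode to look at the input and output of each step.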

step 1: parser

import * as BabelParser from '@babel/parser';
const code = `const a = 5`;
const ast = BabelParser.parse(code);

First, the parser takes the source code as input and outputs an abstract syntax tree (AST).

step 2: transform

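import traverse from '@babel/traverse';
// traverse() walks the AST and mutates it in place; it does not return a new tree.
traverse(ast, {
  enter(path) {
    if (path.node.type === 'Identifier') {
      // transform the node here
    }
    // ...
  }
});
// The transformed tree is the same object that was passed in.
const new_ast = ast;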

Next, Babel's presets and plugins transform the AST above into a new AST.

step 3: generate

import generate from '@babel/generator';
// generate() returns an object; the generated source is in its `code` property.
const { code: newCode } = generate(new_ast);

Finally, the compiled code is generated from the new AST.

In summary:

parser: source_code => ast
traverse: ast => new_ast
generate: new_ast => target_code
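To make the last two stages concrete without installing Babel, here is a dependency-free sketch that starts from a hand-written AST for `const a = 5;`, rewrites `const` to `var`, and prints the result. The helpers `toyTraverse` and `toyGenerate` are hypothetical names invented for this example and only mimic the idea behind `@babel/traverse` and `@babel/generator`; they are not Babel's actual implementation or API.

```javascript
// Hand-written AST for `const a = 5;` (ESTree-style shape, assumed for illustration).
const inputAst = {
  type: 'VariableDeclaration',
  kind: 'const',
  declarations: [
    {
      type: 'VariableDeclarator',
      id: { type: 'Identifier', name: 'a' },
      init: { type: 'Literal', value: 5 },
    },
  ],
};

// Toy traversal: call visitor.enter on every node, depth first.
function toyTraverse(node, visitor) {
  if (node === null || typeof node !== 'object') return;
  if (Array.isArray(node)) {
    node.forEach((child) => toyTraverse(child, visitor));
    return;
  }
  if (typeof node.type === 'string') visitor.enter(node);
  for (const key of Object.keys(node)) toyTraverse(node[key], visitor);
}

// Toy generator: turn the few node types used here back into source text.
function toyGenerate(node) {
  switch (node.type) {
    case 'VariableDeclaration':
      return `${node.kind} ${node.declarations.map((d) => toyGenerate(d)).join(', ')};`;
    case 'VariableDeclarator':
      return `${toyGenerate(node.id)} = ${toyGenerate(node.init)}`;
    case 'Identifier':
      return node.name;
    case 'Literal':
      return String(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// transform: downgrade `const` declarations to `var`, mutating the tree in place.
toyTraverse(inputAst, {
  enter(node) {
    if (node.type === 'VariableDeclaration' && node.kind === 'const') {
      node.kind = 'var';
    }
  },
});

// generate: print the modified tree as code.
const outputCode = toyGenerate(inputAst);
console.log(outputCode); // prints "var a = 5;"
```

The transform never touches the source text. It only changes one property on one node, and the generator derives the new code from the modified tree.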

In fact, Babel's transformation process comes down to building and modifying an abstract syntax tree.
