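I have long been interested in compilers, so I wrote a calculator to explore how compilation is implemented.
The work splits into three parts:
1. Lexical analysis
Classify the characters of the input string and split them into tokens, for example variables, '+', '-', and numbers.
2. Parsing
Build a parse tree from the structure of the token stream. Put simply, the token structure yields a sequence of commands and their arguments, already ordered according to operator precedence and associativity.
3. Execution
Run the commands.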
Implementation approach:
Lexical analysis:
Input | Token type | Token value |
Space, \t | skipped | |
Newline | skipped | |
Digits | INT | the corresponding integer value |
+ | ADD | + |
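The listings in this post use a Token class and a TokenType enum that are never shown. The following is a minimal sketch of what they might look like. The field layout, the int payload, and the toString format are assumptions inferred from how the later code calls the constructor, getType() and getValue(); they are not the original definitions.

```java
// Hypothetical token kinds; the names mirror the ones used by the lexer and parser.
enum TokenType { INT, ADD, SUB, MUL, DIV, LEFTPARENT, RIGHTPARENT, UNKNOWN, EOF }

// Hypothetical immutable token: a kind plus an int payload.
// For INT the payload is the number; for operators it is the character code.
final class Token {
    private final TokenType type;
    private final int value;

    Token(TokenType type, int value) {
        this.type = type;
        this.value = value;
    }

    TokenType getType() { return type; }

    int getValue() { return value; }

    @Override
    public String toString() { return type + "(" + value + ")"; }
}
```

With an int payload, a call such as new Token(TokenType.ADD, '+') compiles because the char is widened to int, which is why the operator rows of the table can store the character itself as the token value.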
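Parsing:
1. How to parse expressions that can nest to any depth -- recursive descent parsing.
2. How to implement precedence -- the rule for a lower-precedence operator recursively calls the rule for a higher-precedence one.
3. How to handle operator associativity -- adjust the order in which operators are output.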
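The three points above can be sketched in one small, self-contained class. This is an illustration under stated assumptions, not the parser used in this post: the class name RpnSketch and the method toRpn are made up for the example, it reads characters directly instead of a token list, and it uses loops where the post's parser uses tail recursion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical recursive-descent sketch: infix text in, reverse Polish notation out.
final class RpnSketch {
    private final String src;
    private int pos = 0;
    private final List<String> out = new ArrayList<>();

    private RpnSketch(String src) { this.src = src; }

    static List<String> toRpn(String src) {
        RpnSketch p = new RpnSketch(src);
        p.expr();
        if (p.peek() != '\0') throw new IllegalArgumentException("unexpected input at " + p.pos);
        return p.out;
    }

    // expr := term (('+' | '-') term)*
    // The lowest-precedence rule calls the next higher one for its operands.
    private void expr() {
        term();
        while (peek() == '+' || peek() == '-') {
            char op = src.charAt(pos++);
            term();
            out.add(String.valueOf(op)); // emitted right after the right operand: left-associative
        }
    }

    // term := factor (('*' | '/') factor)*
    private void term() {
        factor();
        while (peek() == '*' || peek() == '/') {
            char op = src.charAt(pos++);
            factor();
            out.add(String.valueOf(op));
        }
    }

    // factor := INT | '(' expr ')'
    private void factor() {
        char c = peek();
        if (c == '(') {
            pos++;
            expr(); // parentheses restart at the lowest precedence level
            if (peek() != ')') throw new IllegalArgumentException("expected ')' at " + pos);
            pos++;
        } else if (isDigit(c)) {
            int start = pos;
            while (pos < src.length() && isDigit(src.charAt(pos))) pos++;
            out.add(src.substring(start, pos));
        } else {
            throw new IllegalArgumentException("unexpected input at " + pos);
        }
    }

    // Skips whitespace and returns the next character, or '\0' at end of input.
    private char peek() {
        while (pos < src.length() && Character.isWhitespace(src.charAt(pos))) pos++;
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private static boolean isDigit(char c) { return c >= '0' && c <= '9'; }
}
```

For the input "3 + 4 * (1 + 2) - 6" the sketch produces 3 4 1 2 + * + 6 -, and for "8 - 3 - 2" it produces 8 3 - 2 -, so the first subtraction is applied before the second.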
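Execution:
Once parsing has produced a reverse Polish (postfix) expression, execution is straightforward. Create a stack as temporary storage and scan the expression from left to right: push each number onto the stack; on an operator, pop as many operands as the operator takes, apply it, and push the result back. When the scan finishes, the top of the stack is the result.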
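The stack procedure described above can be sketched on its own, independent of the token classes. This is a minimal illustration that assumes the postfix expression arrives as a list of strings; the class name RpnEvalSketch and the use of ArrayDeque in place of java.util.Stack are choices made for the example.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-alone evaluator for a reverse Polish expression.
final class RpnEvalSketch {
    static int eval(List<String> rpn) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String item : rpn) {
            switch (item) {
                case "+":
                case "-":
                case "*":
                case "/": {
                    if (stack.size() < 2) {
                        throw new IllegalArgumentException("missing operand for " + item);
                    }
                    int b = stack.pop(); // the right operand was pushed last
                    int a = stack.pop();
                    stack.push(apply(item.charAt(0), a, b));
                    break;
                }
                default:
                    stack.push(Integer.parseInt(item)); // anything else must be a number
            }
        }
        if (stack.size() != 1) throw new IllegalArgumentException("malformed expression");
        return stack.pop();
    }

    private static int apply(char op, int a, int b) {
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            default:  return a / b; // '/': integer division
        }
    }
}
```

Evaluating 3 4 1 2 + * + 6 - this way leaves 9 on the stack. The pop order matters for the non-commutative operators: the first value popped is the right operand.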
1. Lexical analysis
package test;

import java.util.ArrayList;
import java.util.List;

public class StringToTokens {

    // Splits the expression into tokens and appends a trailing EOF token.
    public List<Token> parse(String express) throws Exception {
        List<Token> tokens = new ArrayList<>();
        for (int currentIndex = 0; currentIndex < express.length(); currentIndex++) {
            char c = express.charAt(currentIndex);
            // Skip spaces, tabs and line breaks.
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r')
                continue;
            if (isAlphabet(c)) {
                // Variables are not supported yet.
                throw new Exception("unsupported alphabetic character: " + c);
            } else if (isDigit(c)) {
                // Consume the whole run of digits as a single INT token.
                int beginIndex = currentIndex;
                while (currentIndex < express.length() && isDigit(express.charAt(currentIndex))) {
                    currentIndex++;
                }
                int endIndex = currentIndex;
                tokens.add(new Token(TokenType.INT, Integer.parseInt(express.substring(beginIndex, endIndex))));
                // Step back one character because the for loop increments again.
                --currentIndex;
            } else if (c == '+') {
                tokens.add(new Token(TokenType.ADD, c));
            } else if (c == '-') {
                tokens.add(new Token(TokenType.SUB, c));
            } else if (c == '*') {
                tokens.add(new Token(TokenType.MUL, c));
            } else if (c == '/') {
                tokens.add(new Token(TokenType.DIV, c));
            } else if (c == '(') {
                tokens.add(new Token(TokenType.LEFTPARENT, c));
            } else if (c == ')') {
                tokens.add(new Token(TokenType.RIGHTPARENT, c));
            } else {
                // Left for the parser to reject.
                tokens.add(new Token(TokenType.UNKNOWN, c));
            }
        }
        tokens.add(new Token(TokenType.EOF, 0));
        return tokens;
    }

    private boolean isAlphabet(char c) {
        return ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'));
    }

    private boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }
}
2. Parsing and execution
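package test;

import java.util.ArrayList;
import java.util.List;
import java.util.Stack;

public class Express {
    private String express;
    private List<Token> tokens;
    // Index of the token currently being examined by the parser.
    private int currentIndex;
    // Parser output: the tokens in reverse Polish (postfix) order.
    private List<Token> parseResults = new ArrayList<>();
    // Operand stack used while executing the postfix expression.
    private Stack<Integer> temporaryResults = new Stack<>();

    public Express(String express) {
        this.express = express;
    }

    public static void main(String[] args) {
        Express express = new Express("3 + 4 * (1 + 2) - 6");
        // Express express = new Express("3 + 4");
        express.parse();
        express.execute(); // prints "result: 9"
    }

    // Tokenizes the input, then parses it into postfix order.
    public void parse() {
        try {
            tokens = new StringToTokens().parse(express);
            currentIndex = 0;
            parseIter();
            match(TokenType.EOF);
        } catch (Exception e) {
            // Rethrow so that execute() never runs on a half-parsed expression.
            throw new RuntimeException("failed to parse: " + express, e);
        }
    }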

    // expression := term addAndSub
    public void parseIter() {
        term();
        addAndSub();
    }

    // term := factor mulAndDiv
    private void term() {
        factor();
        mulAndDiv();
    }

    // addAndSub := ('+' | '-') term addAndSub | empty
    private void addAndSub() {
        Token currentToken = tokens.get(currentIndex);
        switch (currentToken.getType()) {
            case ADD:
            case SUB:
                match(currentToken.getType());
                term();
                // Emit the operator after its right operand (postfix order).
                pushParseResult(currentToken);
                addAndSub();
                break;
            default:
                return;
        }
    }

    // mulAndDiv := ('*' | '/') factor mulAndDiv | empty
    private void mulAndDiv() {
        Token currentToken = tokens.get(currentIndex);
        switch (currentToken.getType()) {
            case MUL:
            case DIV:
                match(currentToken.getType());
                factor();
                pushParseResult(currentToken);
                mulAndDiv();
                break;
            default:
                return;
        }
    }

    // factor := INT | '(' expression ')'
    private void factor() {
        Token currentToken = tokens.get(currentIndex);
        switch (currentToken.getType()) {
            case INT:
                match(TokenType.INT);
                pushParseResult(currentToken);
                break;
            case LEFTPARENT:
                match(TokenType.LEFTPARENT);
                parseIter();
                match(TokenType.RIGHTPARENT);
                break;
            default:
                throw new RuntimeException("Grammatical errors");
        }
    }

    // Evaluates the postfix token list with an operand stack and prints the result.
    public void execute() {
        for (Token parseResult : parseResults) {
            if (parseResult.getType() == TokenType.INT) {
                temporaryResults.push(parseResult.getValue());
            } else {
                // The right operand was pushed last, so it is popped first.
                int b = temporaryResults.pop();
                int a = temporaryResults.pop();
                switch (parseResult.getType()) {
                    case ADD:
                        temporaryResults.push(add(a, b));
                        break;
                    case SUB:
                        temporaryResults.push(sub(a, b));
                        break;
                    case MUL:
                        temporaryResults.push(mul(a, b));
                        break;
                    case DIV:
                        temporaryResults.push(div(a, b));
                        break;
                    default:
                        throw new RuntimeException(parseResult.getType().toString());
                }
            }
        }
        System.out.println("result: " + temporaryResults.pop());
    }

    public int add(int a, int b) {
        return a + b;
    }

    public int sub(int a, int b) {
        return a - b;
    }

    public int mul(int a, int b) {
        return a * b;
    }

    // Integer division; throws ArithmeticException when b is 0.
    public int div(int a, int b) {
        return a / b;
    }

    // Checks that the current token has the expected type, then advances.
    private void match(TokenType type) {
        if (tokens.get(currentIndex).getType() != type) {
            throw new RuntimeException("Grammatical errors");
        }
        currentIndex++;
    }

    private void pushParseResult(Token token) {
        parseResults.add(token);
    }
}