A Lexer and Parser for the C- Language (Part 2)

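Overview:

To put the material from the Compiler Principles course into practice, I completed the course project: a lexer and a parser for the C- language.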

C- is a subset of C. Its syntax includes:

declarations of integer variables and functions

if/else branch statements

while loop statements


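This post covers the implementation of the parser.

The parser is built by rewriting the grammar into an LL(1) grammar.

Workflow:

  1. Eliminate left recursion
  2. Extract left common factors
  3. Compute the FIRST and FOLLOW sets
  4. Build the predictive parsing table
  5. Match the tokens produced by the lexer and build the syntax tree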

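Data structures:

Because productions are inserted into and deleted from the grammar all the time, I started with std::list<std::list<std::string>>. Each inner list holds the rule for one nonterminal: the nonterminal itself, followed by its productions, each stored as a single string. To tell how many terminals and nonterminals a production is made of, that string has to be scanned character by character for spaces, so later on the grammar is moved into std::vector<std::vector<std::vector<std::string>>>. There, each std::string is one symbol, each innermost vector is one production, the next level up is the rule for one nonterminal (its first entry holds the nonterminal itself), and the outermost vector is the whole grammar.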
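To make the nested layout concrete, here is a minimal sketch of it. The aliases `Production`, `Rule`, `Grammar` and the helpers `split_symbols` and `make_rule` are hypothetical names used only for this illustration, and it assumes symbols are separated by whitespace, as they are in the grammar file.

```cpp
#include <sstream>
#include <string>
#include <vector>

using Production = std::vector<std::string>;  // one alternative: a sequence of symbols
using Rule       = std::vector<Production>;   // rule[0] = {nonterminal}, rule[1..] = its alternatives
using Grammar    = std::vector<Rule>;         // every rule of the grammar

// Split a production such as "type-specifier ID ;" into its symbols.
Production split_symbols(const std::string &text)
{
    Production symbols;
    std::istringstream in(text);
    std::string symbol;
    while (in >> symbol) symbols.push_back(symbol);
    return symbols;
}

// Build one rule from a nonterminal and its alternatives given as strings.
Rule make_rule(const std::string &lhs, const std::vector<std::string> &alternatives)
{
    Rule rule;
    rule.push_back(Production{lhs});
    for (const auto &alt : alternatives) rule.push_back(split_symbols(alt));
    return rule;
}
```

With this layout, `rule[0][0]` is the nonterminal and `rule[i][j]` is the j-th symbol of its i-th alternative, so no string scanning is needed once the grammar is loaded.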

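Eliminating left recursion:

First, for each nonterminal, any production that begins with a nonterminal defined earlier in the grammar has that leading nonterminal replaced by its productions. This turns indirect left recursion into direct left recursion.

Then the direct left recursion is removed:

rewrite A -> Aα | β as

 A -> βA1

A1 -> αA1 | ε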
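The rewrite of direct left recursion can be sketched as follows. This is an illustration under stated assumptions, not the code used later in this post: the `Rule` struct and the function name are hypothetical, the new nonterminal is named by appending "_1", and ε is written as the symbol "empty".

```cpp
#include <string>
#include <vector>

using Production = std::vector<std::string>;

struct Rule {
    std::string lhs;
    std::vector<Production> alternatives;
};

// Rewrite  A -> A a1 | ... | b1 | ...  as
//   A   -> b1 A_1 | ...
//   A_1 -> a1 A_1 | ... | empty
// Returns the rewritten rule for A followed by the new rule for A_1,
// or just the original rule when A has no direct left recursion.
std::vector<Rule> eliminate_direct_left_recursion(const Rule &rule)
{
    std::vector<Production> alphas, betas;
    for (const auto &alt : rule.alternatives) {
        if (!alt.empty() && alt.front() == rule.lhs)
            alphas.emplace_back(alt.begin() + 1, alt.end());  // the alpha of A -> A alpha
        else
            betas.push_back(alt);                              // a beta alternative
    }
    if (alphas.empty()) return {rule};

    const std::string fresh = rule.lhs + "_1";  // assumed naming scheme for the new nonterminal
    Rule head{rule.lhs, {}};
    Rule tail{fresh, {}};
    for (auto beta : betas) {
        if (beta.size() == 1 && beta[0] == "empty") beta.clear();  // an empty beta becomes A -> A_1
        beta.push_back(fresh);
        head.alternatives.push_back(beta);
    }
    for (auto alpha : alphas) {
        alpha.push_back(fresh);
        tail.alternatives.push_back(alpha);
    }
    tail.alternatives.push_back(Production{"empty"});
    return {head, tail};
}
```

For example, `term -> term mulop factor | factor` becomes `term -> factor term_1` and `term_1 -> mulop factor term_1 | empty`.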

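Extracting left common factors:

An LL(1) parser must be able to pick exactly one production by looking at the next input token. Two productions of the same nonterminal that begin with the same symbols make that choice ambiguous, so the common prefix has to be factored out.

Rewrite A -> αB | αC

as A -> αA'

A' -> B | C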
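For the simple case where the common prefix is visible symbol by symbol, factoring two productions might look like the sketch below. The `Factored` struct and `left_factor` are hypothetical names for this illustration, and an empty remainder is written as the symbol "empty".

```cpp
#include <cstddef>
#include <string>
#include <vector>

using Production = std::vector<std::string>;

struct Factored {
    Production common;  // the shared prefix (alpha)
    Production rest1;   // what is left of the first production
    Production rest2;   // what is left of the second production
};

// Split two productions into their longest common prefix and the two remainders.
Factored left_factor(const Production &p1, const Production &p2)
{
    std::size_t n = 0;
    while (n < p1.size() && n < p2.size() && p1[n] == p2[n]) ++n;

    Factored f;
    f.common.assign(p1.begin(), p1.begin() + n);
    f.rest1.assign(p1.begin() + n, p1.end());
    f.rest2.assign(p2.begin() + n, p2.end());
    if (f.rest1.empty()) f.rest1.push_back("empty");  // nothing left: the remainder derives empty
    if (f.rest2.empty()) f.rest2.push_back("empty");
    return f;
}
```

Applied to `return ;` and `return expression ;`, the common prefix is `return` and the remainders are `;` and `expression ;`, which matches the shape A -> αA', A' -> B | C.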

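At first I assumed that eliminating left recursion would be the hard part and that left factoring looked intuitive and easy. After running into trouble, I found left factoring to be the hardest step: two productions can look different and still derive strings that begin with the same terminal, and left factoring has to pull out every such common prefix. Fully expanding every nonterminal just to find them is obviously very tedious.

Checking whether a grammar is LL(1) requires the FIRST sets anyway, and using FIRST sets to check for LL(1) amounts to looking for common left factors.

My approach is to compare the productions of one nonterminal pairwise. If two of them share an obvious left factor, that is, they begin with literally the same symbols (terminals or nonterminals), I factor it out, add the new rule to the grammar, and start the comparison over. If there is no obvious left factor, I compute the FIRST sets of the two productions to see whether a left factor is hidden inside them; if so, I expand the leading nonterminal of the two productions being compared and start over again. Once a full pass over the rules finds no common left factor, all of them have been extracted.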

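Computing the FIRST and FOLLOW sets:

FIRST(α) is defined as the set of terminals that begin the strings derivable from α, where α is any string of grammar symbols.

For a production A -> βγ

computing FIRST(A) requires FIRST(β) first, which suggests a recursive solution. By the rules, everything in FIRST(β) except ε goes into FIRST(A); if FIRST(β) contains ε, FIRST(γ) is computed as well; and if FIRST(γ) also contains ε, ε is added to FIRST(A).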
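As an illustration of these rules, here is a minimal sketch that computes FIRST sets by iterating until nothing changes. Note that this is a different technique from the recursive cal_first used in the code later in this post; the fixed-point form is chosen here only because it does not recurse on the grammar. The aliases and function names are hypothetical, a symbol counts as a nonterminal when it has a rule in the map, and ε is written as "empty".

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

using Production = std::vector<std::string>;
using Grammar    = std::map<std::string, std::vector<Production>>;  // nonterminal -> alternatives
using SetMap     = std::map<std::string, std::set<std::string>>;

// FIRST of a symbol string, using the FIRST sets of nonterminals known so far.
std::set<std::string> first_of(const Production &symbols, const Grammar &g, const SetMap &first)
{
    std::set<std::string> result;
    for (const auto &sym : symbols) {
        if (sym == "empty") continue;            // the empty string contributes no terminal
        if (g.count(sym) == 0) {                 // a terminal is its own first symbol
            result.insert(sym);
            return result;
        }
        bool nullable = false;
        auto it = first.find(sym);
        if (it != first.end()) {
            for (const auto &t : it->second) {
                if (t == "empty") nullable = true;
                else result.insert(t);
            }
        }
        if (!nullable) return result;            // stop at the first symbol that cannot vanish
    }
    result.insert("empty");                      // every symbol can derive empty
    return result;
}

// Repeat the scan over all productions until no FIRST set grows any more.
SetMap compute_first(const Grammar &g)
{
    SetMap first;
    for (const auto &rule : g) first[rule.first];  // start with empty sets
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto &rule : g) {
            for (const auto &alt : rule.second) {
                const auto f = first_of(alt, g, first);
                auto &target = first[rule.first];
                const auto before = target.size();
                target.insert(f.begin(), f.end());
                if (target.size() != before) changed = true;
            }
        }
    }
    return first;
}
```

The same `first_of` helper can then be reused for the FIRST set of a whole production body, which is what the LL(1) check and the table construction need.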

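For a nonterminal A, FOLLOW(A) is defined as the set of terminals that can appear immediately to the right of A in some sentential form.

The rules are:

(1) Put $ in FOLLOW(S), where S is the start symbol and $ is the end marker at the right end of the input.

(2) If there is a production A -> αBβ, then everything in FIRST(β) except ε is in FOLLOW(B).

(3) If there is a production A -> αB, or a production A -> αBβ where FIRST(β) contains ε, then everything in FOLLOW(A) is in FOLLOW(B).

Following these rules, once the FIRST sets are known, the FOLLOW sets can be computed by scanning all the productions. One scan is not always enough, though: rule (3) copies FOLLOW(A), which may itself still be incomplete, so the scan has to be repeated until the sets stop changing.

std::set is a natural fit for storing the FIRST and FOLLOW sets.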
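The three FOLLOW rules can be sketched as a loop that runs until no set grows. This is an illustration under assumptions rather than the code used below: `compute_follow` and the aliases are hypothetical names, the FIRST sets are passed in ready-made (one entry per nonterminal), a symbol is a nonterminal when it has a rule in the map, and ε is written as "empty".

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Production = std::vector<std::string>;
using Grammar    = std::map<std::string, std::vector<Production>>;
using SetMap     = std::map<std::string, std::set<std::string>>;

SetMap compute_follow(const Grammar &g, const SetMap &first, const std::string &start)
{
    SetMap follow;
    for (const auto &rule : g) follow[rule.first];     // start with empty sets
    follow[start].insert("$");                         // rule (1)

    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto &rule : g) {
            for (const auto &alt : rule.second) {
                for (std::size_t i = 0; i < alt.size(); ++i) {
                    if (g.count(alt[i]) == 0) continue;          // only nonterminals have FOLLOW sets
                    auto &target = follow[alt[i]];
                    const auto before = target.size();
                    bool rest_nullable = true;                   // can everything after alt[i] vanish?
                    for (std::size_t j = i + 1; j < alt.size() && rest_nullable; ++j) {
                        if (g.count(alt[j]) == 0) {              // a terminal follows directly
                            target.insert(alt[j]);
                            rest_nullable = false;
                        } else {                                 // rule (2)
                            const auto &fs = first.at(alt[j]);
                            for (const auto &t : fs)
                                if (t != "empty") target.insert(t);
                            rest_nullable = fs.count("empty") > 0;
                        }
                    }
                    if (rest_nullable && rule.first != alt[i]) { // rule (3); skip the no-op self-copy
                        const auto &fa = follow[rule.first];
                        target.insert(fa.begin(), fa.end());
                    }
                    if (target.size() != before) changed = true;
                }
            }
        }
    }
    return follow;
}
```

Because FOLLOW sets are built only from terminals and $, there is no need to strip "empty" from them afterwards in this version.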

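Building the predictive parsing table:

For each production A -> α of the grammar G, do the following:

(1) For each terminal a in FIRST(α), add A -> α to M[A, a].

(2) If ε is in FIRST(α), then for each terminal b in FOLLOW(A), add A -> α to M[A, b]. If ε is in FIRST(α) and $ is in FOLLOW(A), add A -> α to M[A, $] as well.

With the FIRST and FOLLOW sets in hand, the table follows directly from these rules.

Since this step maps a (nonterminal, terminal) pair to a production, the table can be stored in a std::map.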
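The two table rules can be sketched as a function that enters one production into the map and reports a conflict when a cell is already taken by a different production. `add_production` and the aliases are hypothetical names; FIRST(α) and FOLLOW(A) are passed in as already computed sets, and ε is written as "empty". Unlike a plain assignment into the map, this variant makes a non-LL(1) grammar visible instead of silently overwriting a cell.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Production = std::vector<std::string>;
using Table      = std::map<std::pair<std::string, std::string>, Production>;

// Enter A -> alt into the table. Returns false if some cell M[A, t] already
// holds a different production, i.e. the grammar is not LL(1).
bool add_production(Table &table, const std::string &lhs, const Production &alt,
                    const std::set<std::string> &first_of_alt,
                    const std::set<std::string> &follow_of_lhs)
{
    bool ok = true;
    auto put = [&](const std::string &terminal) {
        auto inserted = table.emplace(std::make_pair(lhs, terminal), alt);
        if (!inserted.second && inserted.first->second != alt) ok = false;  // conflicting cell
    };
    for (const auto &t : first_of_alt)
        if (t != "empty") put(t);                    // rule (1)
    if (first_of_alt.count("empty") > 0)
        for (const auto &t : follow_of_lhs) put(t);  // rule (2); $ is covered when it is in FOLLOW(A)
    return ok;
}
```

Calling it once per production of the grammar fills the whole table; a `false` result points at a pair of productions that still share a lookahead token.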

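Matching the tokens from the lexer and building the syntax tree:

Once the predictive parsing table is built, it can be used to process the tokens produced by the lexer. If the grammar has been correctly converted to LL(1) and the source code has no syntax errors, this step builds a syntax tree out of all the tokens.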
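The core of the table-driven matching step can be sketched without the tree-building part. The function `ll1_accepts` and its parameters are hypothetical names for this illustration; it only answers whether a token sequence is accepted, whereas Parse() in the code below also attaches a tree node for every pushed symbol. ε is written as "empty" and $ is the end marker.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <stack>
#include <string>
#include <utility>
#include <vector>

using Production = std::vector<std::string>;
using Table      = std::map<std::pair<std::string, std::string>, Production>;

// Returns true when `tokens` (given without the end marker) can be derived from `start`.
bool ll1_accepts(const Table &table, const std::set<std::string> &nonterminals,
                 const std::string &start, std::vector<std::string> tokens)
{
    tokens.push_back("$");                       // end marker on the input
    std::stack<std::string> stack;
    stack.push("$");                             // end marker on the stack
    stack.push(start);
    std::size_t pos = 0;
    while (!stack.empty()) {
        const std::string top = stack.top();
        stack.pop();
        const std::string &lookahead = tokens[pos];
        if (nonterminals.count(top) == 0) {      // terminal or $: it must equal the lookahead
            if (top != lookahead) return false;
            if (top == "$") return true;         // both end markers met: accept
            ++pos;
            continue;
        }
        auto cell = table.find(std::make_pair(top, lookahead));
        if (cell == table.end()) return false;   // empty cell: syntax error
        const Production &body = cell->second;
        // Push the body right to left so that its first symbol ends up on top.
        for (auto it = body.rbegin(); it != body.rend(); ++it)
            if (*it != "empty") stack.push(*it);
    }
    return false;
}
```

With the balanced-parentheses grammar S -> ( S ) S | empty, the table has three cells: M[S, (] holds `( S ) S`, and M[S, )] and M[S, $] hold `empty`.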


Full code:

//Parser.h
//Author: IuSpet
//Purpose: convert a general grammar into an LL(1) grammar and build the parse tree

#ifndef Parser_h
#define Parser_h
#include"utlib.h"

//A node of the syntax tree
struct node
{
	node *Parent;
	std::string type;
	std::string value;
	std::list<node*> sons;
};
class Parser
{
public:
	Parser();
	
	void get_LL1_grammar();						//derive the LL(1) grammar
	void Parse();								//match the tokens against the grammar and build the tree

	~Parser()
	{
		tokenfile.close();
	}
	
private:
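	std::list<std::list<std::string>> grammar;		//the grammar: each inner list is a nonterminal followed by its productions
	std::vector<std::vector<std::vector<std::string>>> final_grammar;	//the processed LL(1) grammar
	std::map<std::string, bool> can_produce_empty;	//whether a symbol can derive empty (never filled in by the code shown here, so lookups yield false)
	std::map<std::string, bool> is_Vn;
	std::map<std::string, std::set<std::string>> FIRST, FOLLOW;
	//predictive parsing table, laid out as	predictive_table[(nonterminal, terminal)] = production
	std::map<std::pair<std::string, std::string>, std::vector<std::string>> predictive_table;		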
	const char *grammar_file;
	const char *token_file;
	FILE* f;
	int filepos;
	node root;										//root of the syntax tree
	std::ifstream tokenfile;


	void get_grammar();								//read the initial grammar
	void Eliminate_left_recursion();				//eliminate left recursion
	void get_FIRST();								//compute the FIRST sets
	void get_FOLLOW();								//compute the FOLLOW sets
	bool judge_LL1_grammar();						//check whether the grammar is LL(1)
	bool cmp_set(const std::set<std::string> s1, const std::set<std::string> s2);	//returns true when the two sets have no element in common
	void get_predict_table();						//build the predictive parsing table
	void get_left_common_factor();					//extract left common factors
	void get_all_Vn();								//find all nonterminals
	void reconsitution();							//switch to a more convenient data structure
	std::string get_next_token();					//fetch the next token
	void print_grammar0();							//print the initial grammar
	void print_grammar1();							//print the grammar after left-recursion elimination
	void print_grammar2();							//print the grammar after left factoring
	void print_final_grammar();						//print the restructured grammar (for testing)
	std::set<std::string> cal_first(std::string Vn);//recursive helper for computing FIRST sets
	void print_FIRST();								//print the FIRST sets
	void print_FOLLOW();							//print the FOLLOW sets
	void print_predictive_table();					//print the predictive parsing table
	void print_tree();								//print the syntax tree
	void string_to_vector(std::string &s, std::vector<std::string> &v);		//convert a production from string form to vector form
	void vector_to_string(std::string &s, std::vector<std::string> &v);		//convert a production from vector form to string form
	bool has_common_prefix(std::vector<std::string> &gm1, std::vector<std::string> &gm2);	//check whether two productions share a left factor
	std::set<std::string> get_left(std::vector<std::string> & tmp);		//return the leading symbols (the FIRST set) of a production, used to detect left factors
	void get_token_value(std::string &token, std::string &value, std::string &type);		//extract the attributes of a token
	void deep_print(std::ofstream &out, int r, node *t);				//print the tree recursively
};
#endif // !Parser_h
#pragma once
//Parser.cpp
//Author: IuSpet
//Purpose: convert a general grammar into an LL(1) grammar and build the parse tree

#include "Parser.h"
const int BUFFERLENGTH = 4096;

//The constructor fixes the grammar file and the token file
Parser::Parser()
{
	grammar_file = "D://cminus//grammar.txt";
	token_file = "D://cminus//token.txt";
	tokenfile.open(token_file);
	filepos = 0;
	//root of the syntax tree
	root.Parent = NULL;
	root.type = "non-terminal";
	root.value = "program";
}

void Parser::get_LL1_grammar()
{
	get_grammar();						//read the grammar
	print_grammar0();
	Eliminate_left_recursion();			//eliminate left recursion
	print_grammar1();
	get_all_Vn();						//mark all nonterminals
	get_left_common_factor();			//extract left common factors
	print_grammar2();
	reconsitution();					//rebuild the data structure that stores the grammar
	print_final_grammar();
	get_FIRST();						//compute the FIRST sets
	print_FIRST();
	get_FOLLOW();						//compute the FOLLOW sets
	print_FOLLOW();
	if (!judge_LL1_grammar())
	{
		std::cout << "not an LL(1) grammar" << std::endl;
		//system("pause");
	}
	else
	{
		std::cout << "is an LL(1) grammar" << std::endl;
		//system("pause");
	}
	get_predict_table();				//build the predictive parsing table
	print_predictive_table();
}

//Parse the tokens and build the syntax tree
void Parser::Parse()
{
	std::ofstream outfile("D://cminus//matching_process.txt");
	outfile << std::setiosflags(std::ios::left);
	outfile << std::setw(30) << "Stack" << std::setw(20) << "Input" << "Action" << std::endl;
	std::stack<node*> match;
	//std::stack<std::string> match;

	node end;
	end.Parent = NULL;
	end.type = "";
	end.value = "$";

	match.push(&end);
	match.push(&root);

	std::string type;
	std::string value;
	std::string out;
	std::string token = get_next_token();
	get_token_value(token, value, type);

	while (true)
	{
		node &top = *match.top();
		match.pop();
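		//all tokens have been read
		if (token == "$")
		{
			//accept only when the end marker itself was just popped, i.e. the stack is now empty;
			//any symbol still on the stack is expanded by the branch below first
			if (top.value == "$" && match.empty())
			{
				outfile << std::setw(30) << "$" << std::setw(20) << "$" << "accept" << std::endl;
				break;
			}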
			else
			{
				//the stack top is a nonterminal, a symbol or a reserved word; the string to match is stored in value
				bool choose = true;
				if (top.type == "RESERVED WORD" || top.type == "SYMBOL" || top.type == "non-terminal")
				{
					out = top.value + " ... " + "$";
					outfile << std::setw(30) << out << std::setw(20) << "$" << "output" << top.value;
				}
				//the stack top is an identifier or a number; the string to match is stored in type
				else
				{
					out = top.type + " ... " + "$";
					outfile << std::setw(30) << out << std::setw(20) << "$" << "output" << top.value;
					choose = false;
				}

				//look up the action in the table
				std::vector<std::string> &pro = choose ? predictive_table[std::make_pair(top.value, value)] : predictive_table[std::make_pair(top.type, value)];
				out.clear();
				vector_to_string(out, pro);
				outfile << " -> " << out << std::endl;

				//the production is empty
				if (out == "empty")
				{
					continue;
				}

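				//push the new symbols onto the stack (right to left) and link the nodes into the tree
				//the loop must go down to index 0, otherwise the first symbol of the production is dropped
				for (int i = pro.size() - 1; i >= 0; i--)
				{
					node *son;
					son = new node;
					//the new node is a child of the former stack top
					top.sons.push_back(son);
					son->Parent = &top;

					//build the node according to the kind of symbol being pushed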
					if (is_Vn[pro[i]])
					{
						son->type = "non-terminal";
						son->value = pro[i];
					}
					else
					{
						if (pro[i] == "ID" || pro[i] == "NUM")
						{
							son->type = pro[i];
							son->value = "";
						}
						else
						{
							if (pro[i][0] >= 'a' && pro[i][0] <= 'z')
							{
								son->type = "RESERVED WORD";
								son->value = pro[i];
							}
							else
							{
								son->type = "SYMBOL";
								son->value = pro[i];
							}
						}
					}
					match.push(son);
				}
			}
		}		
		else
		{
			bool choose = true;
			if (top.type == "RESERVED WORD" || top.type == "SYMBOL" || top.type == "non-terminal")
			{
				out = top.value + " ... " + "$";
				outfile << std::setw(30) << out;
			}
			//the stack top is an identifier or a number; the string to match is stored in type
			else
			{
				out = top.type + " ... " + "$";
				outfile << std::setw(30) << out;
				choose = false;
			}

			//the token is a reserved word or a symbol
			if (type == "RESERVED WORD" || type == "SYMBOL")
			{
				out = value + " ... " + "$";
				outfile << std::setw(20) << out;
				//match: read the next token
				if (top.value == value)
				{
					outfile << "match" << std::endl;
					token.clear();
					token = get_next_token();
					get_token_value(token, value, type);
					continue;
				}
				//no match: look up the table
				else
				{
					std::vector<std::string> &pro = choose ? predictive_table[std::make_pair(top.value, value)] : predictive_table[std::make_pair(top.type, value)];
					if (pro.size() == 0)
					{
						outfile << "error" << std::endl;
						exit(1);
					}
					if (choose)
					{
						outfile << "output: " << top.value << " -> ";
					}
					else
					{
						outfile << "output: " << top.type << " -> ";
					}
					out.clear();
					vector_to_string(out, pro);
					outfile << out << std::endl;

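					//the production is empty
					if (out == "empty")
					{
						continue;
					}

					//push the new symbols onto the stack (right to left) and link the nodes into the tree
					for (int i = pro.size() - 1; i >= 0; i--)
					{
						node *son;
						son = new node;
						//the new node is a child of the former stack top
						top.sons.push_back(son);
						son->Parent = &top;

						//build the node according to the kind of symbol being pushed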
						if (is_Vn[pro[i]])
						{
							son->type = "non-terminal";
							son->value = pro[i];
						}
						else
						{
							if (pro[i] == "ID" || pro[i] == "NUM")
							{
								son->type = pro[i];
								son->value = "";
							}
							else
							{
								//reserved word
								if (pro[i][0] >= 'a' && pro[i][0] <= 'z')
								{
									son->type = "RESERVED WORD";
									son->value = pro[i];
								}
								//operator or punctuation
								else
								{
									son->type = "SYMBOL";
									son->value = pro[i];
								}
							}
						}
						match.push(son);
					}
				}
			}
			//the token is an identifier or a number
			else
			{
				out = type + " ... " + "$";
				outfile << std::setw(20) << out;
				//match: store the value in the node and read the next token
				if (top.type == type)
				{
					outfile << "match" << std::endl;
					top.value = value;
					token.clear();
					token = get_next_token();
					get_token_value(token, value, type);
					continue;
				}
				else
				{
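					std::vector<std::string> &pro = choose ? predictive_table[std::make_pair(top.value, type)] : predictive_table[std::make_pair(top.type, type)];
					//no table entry for this pair: report a syntax error, as the reserved-word/symbol branch does
					if (pro.size() == 0)
					{
						outfile << "error" << std::endl;
						exit(1);
					}
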
					if (choose)
					{
						outfile << "output: " << top.value << " -> ";
					}
					else
					{
						outfile << "output: " << top.type << " -> ";
					}

					out.clear();
					vector_to_string(out, pro);
					outfile << out << std::endl;

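					//the production is empty
					if (out == "empty")
					{
						continue;
					}

					//push the new symbols onto the stack (right to left) and link the nodes into the tree
					for (int i = pro.size() - 1; i >= 0; i--)
					{
						node *son;
						son = new node;
						//the new node is a child of the former stack top
						top.sons.push_back(son);
						son->Parent = &top;

						//build the node according to the kind of symbol being pushed
						if (is_Vn[pro[i]])
						{
							son->type = "non-terminal";
							son->value = pro[i];
						}
						//a terminal: it is matched against the input once it is popped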
						else
						{
							if (pro[i] == "ID" || pro[i] == "NUM")
							{
								son->type = pro[i];
								son->value = "";
							}
							else
							{
								//reserved word
								if (pro[i][0] >= 'a' && pro[i][0] <= 'z')
								{
									son->type = "RESERVED WORD";
									son->value = pro[i];
								}
								//operator or punctuation
								else
								{
									son->type = "SYMBOL";
									son->value = pro[i];
								}
							}
						}
						match.push(son);
					}
				}
			}
		}
	}
	outfile.close();
	print_tree();
}

//Print the syntax tree
void Parser::print_tree()
{
	std::ofstream outfile("D://cminus//syntax_tree.txt");
	outfile << std::setiosflags(std::ios::left);
	deep_print(outfile, 0, &root);
	outfile.close();
}

//Print the grammar as read from the file
void Parser::print_grammar0()
{
	std::ofstream outfile("D://cminus//grammar0.txt");
	for (const auto &l : grammar)
	{
		bool first = true;
		for (const auto &t : l)
		{
			if (first)
			{
				outfile << t << " -> ";
				first = false;
			}
			else outfile << t << " | ";
		}
		outfile << std::endl << std::endl;
	}
	outfile.close();
}

//Print the grammar after left-recursion elimination
void Parser::print_grammar1()
{
	std::ofstream outfile("D://cminus//grammar1.txt");
	for (const auto &l : grammar)
	{
		bool first = true;
		for (const auto &t : l)
		{
			if (first)
			{
				outfile << t << " -> ";
				first = false;
			}
			else outfile << t << " | ";
		}
		outfile << std::endl << std::endl;
	}
	outfile.close();
}

//Print the grammar after left factoring
void Parser::print_grammar2()
{
	std::ofstream outfile("D://cminus//grammar2.txt");

	for (const auto &l : grammar)
	{
		bool first = true;
		for (const auto &t : l)
		{
			if (first)
			{
				outfile << t << " -> ";
				first = false;
			}
			else outfile << t << " | ";
		}
		outfile << std::endl << std::endl;
	}
	outfile.close();
}

//Print the rewritten LL(1) grammar
void Parser::print_final_grammar()
{
	std::ofstream outfile("D://cminus//ll(1)grammar.txt");
	for (auto &gm : final_grammar)
	{
		outfile << gm[0][0] << " -> ";
		for (int i = 1; i < gm.size(); i++)
		{
			for (int j = 0; j < gm[i].size(); j++)
			{
				outfile << gm[i][j] << " ";
			}
			outfile << "| ";
		}
		outfile << std::endl << std::endl;
	}
	outfile.close();
}

//Print the FIRST set of every nonterminal
void Parser::print_FIRST()
{
	std::ofstream outfile("D://cminus//FIRST.txt");
	for (const auto &gm : final_grammar)
	{
		outfile << gm[0][0] << " : { ";
		for (const auto &str : FIRST[gm[0][0]])
		{
			outfile << str << "| ";
		}
		outfile << "}"<<std::endl << std::endl;
	}
	outfile.close();
}

//Print the FOLLOW set of every nonterminal
void Parser::print_FOLLOW()
{
	std::ofstream outfile("D://cminus//FOLLOW.txt");
	for (const auto &gm : final_grammar)
	{
		const auto &Vn = gm[0][0];
		outfile << Vn << " :{ ";
		for (auto &str : FOLLOW[Vn])
		{
			outfile << str << " | ";
		}
		outfile << "}" << std::endl << std::endl;
	}
	outfile.close();
}

//Print the predictive parsing table
void Parser::print_predictive_table()
{
	std::ofstream outfile("D://cminus//predictive_table.txt");
	for (auto &p : predictive_table)
	{
		outfile << p.first.first << "\t" << p.first.second << "\t";
		std::string pro;
		vector_to_string(pro, p.second);
		outfile << pro << std::endl << std::endl;
	}
	outfile.close();
}

void Parser::string_to_vector(std::string & s, std::vector<std::string>& v)
{
	std::string tmp;
	for (auto &c : s)
	{
		if (c == ' ')
		{
			v.push_back(tmp);
			tmp.clear();
		}
		else
		{
			tmp.push_back(c);
		}
	}
	if (tmp.length()) v.push_back(tmp);
}

void Parser::vector_to_string(std::string & s, std::vector<std::string>& v)
{
	for (int i = 0; i < v.size(); i++)
	{
		if (v[i] == "") continue;
		s += v[i];
		if (i != v.size() - 1)
		{
			s.push_back(' ');
		}
	}
}

//Check whether two productions share a left factor (their leading-symbol sets overlap)
bool Parser::has_common_prefix(std::vector<std::string>& gm1, std::vector<std::string>& gm2)
{
	std::set<std::string> S, s1, s2;
	s1 = get_left(gm1);
	s2 = get_left(gm2);
	S.insert(s1.begin(), s1.end());
	S.insert(s2.begin(), s2.end());
	return s1.size() + s2.size() != S.size();
}

//Return the FIRST set of a production, used to look for left factors
std::set<std::string> Parser::get_left(std::vector<std::string> & tmp)
{
	std::set<std::string> res;
	if (tmp.size() == 0)
	{
		return res;
	}
	int i = 0;
	do
	{
		if (is_Vn[tmp[i]])
		{
			for (auto &gm : grammar)
			{
				auto p = gm.begin();
				if (tmp[i] == *p)
				{
					for (p++; p != gm.end(); p++)
					{
						std::vector<std::string> vs;
						string_to_vector(*p, vs);
						auto s = get_left(vs);
						res.insert(s.begin(), s.end());
					}
					break;
				}
			}
		}
		else
		{
			res.insert(tmp[i]);
		}
		if (i == tmp.size() - 1 && can_produce_empty[tmp[i]])
		{
			res.insert("empty");
			break;
		}
	} while (can_produce_empty[tmp[i++]]);
	return res;
}

//Extract the attributes of a token written as <type,value>
void Parser::get_token_value(std::string & token, std::string & value, std::string & type)
{
	type.clear();
	value.clear();
	for (int i = 0; i < token.length(); i++)
	{
		auto c = token[i];
		if (c == '<' && i == 0) continue;
		//only the first comma separates type from value, so a value that is itself "," survives
		else if (c == ',' && type.empty())
		{
			type = value;
			value.clear();
			continue;
		}
		else if (c == '>' && i == token.length() - 1)
		{
			break;
		}
		else
		{
			value.push_back(c);
		}
	}
}

//Print the syntax tree recursively; the number of leading dashes shows the depth
void Parser::deep_print(std::ofstream & out, int r, node * t)
{
	for (int i = 0; i < r; i++)
	{
		out << "-";
	}
	std::string tmp;
	tmp = "type: " + t->type;
	out << std::setw(30) << tmp;
	tmp = "value: " + t->value;
	out << std::setw(30) << tmp;
	out << std::endl;
	for (auto &p : t->sons)
	{
		deep_print(out, r + 1, p);
	}
}

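//Read the original grammar into memory
void Parser::get_grammar()						
{
	f = fopen(grammar_file, "r");
	if (f == NULL)
	{
		std::cout << "cannot open grammar file: " << grammar_file << std::endl;
		exit(1);
	}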
	char input_buffer[BUFFERLENGTH];
	while (fgets(input_buffer, BUFFERLENGTH, f))
	{
		std::list<std::string> tmp;
		int len = strlen(input_buffer);
		std::string str;
		int pos = 0;
		for (int i = 0; i < len; i++)
		{
			if (input_buffer[i] == ' ' && str.length() == 0) continue;
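			else if (input_buffer[i] == '|')
			{
				while (!str.empty() && str.back() == ' ') str.pop_back();
				tmp.push_back(str);
				str.clear();
				pos = 0;
			}
			//0xA1 (-95 as a signed char) is presumably the first byte of the two-byte arrow "→" in a GBK-encoded grammar file;
			//comparing as unsigned char also works on platforms where plain char is unsigned
			else if (static_cast<unsigned char>(input_buffer[i]) == 0xA1)
			{
				while (!str.empty() && str.back() == ' ') str.pop_back();
				tmp.push_back(str);
				str.clear();
				pos = 0;
				i++;	//skip the second byte of the arrow
			}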
			else str.push_back(input_buffer[i]);
		}
		while (!str.empty() && (str.back() == ' ' || str.back() == '\n')) str.pop_back();
		tmp.push_back(str);
		grammar.push_back(tmp);
	}
	fclose(f);
}

//Eliminate left recursion
void Parser::Eliminate_left_recursion()
{
	//eliminate the left recursion in grammar
	for (auto p = grammar.begin(); p != grammar.end(); p++)
	{
		std::list<std::string> &A = *p;			//rule A
		//expand the current rule
		for (auto j = grammar.begin(); j != p ; j++)
		{
			//replace the leading Vn in the productions of A
			std::list<std::string> &B = *j;			//rule B, defined before A
			std::string &Vn = B.front();			//the nonterminal that rule B defines
			auto pA = A.begin(); pA++;
			//auto pB = B.begin(); pB++;
			for (; pA != A.end(); pA++)				
			{
				std::string &production = *pA;		//a production of rule A
				std::string item(production);
				std::string tmp;					//collects the leading symbol of production
				for (char c:production)
				{
					if (c != ' ')
					{
						tmp.push_back(c);
					}
					else							//a complete leading symbol has been read
					{
						if (Vn == tmp)				//it is the nonterminal of the earlier rule B
						{
							//std::string prefix = production.substr(0,subbg);
							//std::string suffix = production.substr(subend);						
							pA = A.erase(pA);		//remove the original production
							auto pB = B.begin(); pB++;
							for (; pB != B.end(); pB++)		//substitute each production of B for the leading Vn
							{
								std::string newprd(item);
								newprd.replace(0,Vn.length(), *pB);
								A.insert(pA, newprd);		//add the substituted production
							}
							pA = A.begin(); 
							break;
						}
						//std::cout << tmp << std::endl;
						break;
					}
				}
			}
		}
		//remove direct left recursion
		do
		{
			std::list<std::string>::iterator pA = A.begin();
			auto Vn = *pA;
			pA++;
			/*
			A -> Aα| β
			vs1 holds the α parts, vs2 holds the β parts
			*/
			std::vector<std::string> vs1, vs2;
			for (; pA != A.end(); pA++)
			{
				std::string &production = *pA;
				std::string first;				//leading symbol of the production, decides whether it goes to α or β
				bool flag = true;
				for (char c : production)
				{
					if (c == ' ')
					{
						if (first == Vn)		//of the form Aα
						{
							vs1.push_back(production.substr(Vn.length() + 1));
						}
						else					//a β
						{
							vs2.push_back(production);
						}
						flag = false;			//this production has been handled
						break;
					}
					else
					{
						first.push_back(c);
					}
				}
				if (flag) vs2.push_back(production);
			}
			if (vs1.empty()) continue;			//no α, so there is no left recursion
			pA = A.begin(); pA++;
			/*
			rewrite A -> Aα| β as
			A -> βA1
			A1 -> αA1 | empty
			*/
			while (pA != A.end()) pA = A.erase(pA);		//clear the productions of A
			//if the only β is empty, rewrite directly as A -> αA | empty
			if (vs2.size() == 1 && vs2[0] == "empty")
			{
				for (std::string s : vs1)
				{
					A.push_back(s + ' ' + Vn);
				}
				A.push_back("empty");
				continue;
			}
			std::string newprdt(Vn + "_1");				//the additional nonterminal A1
			//βA1
			for (std::string s : vs2)
			{
				A.push_back(s + ' ' + newprdt);
			}
			//A1 -> αA1 | empty
			std::list<std::string> newgrammar;
			newgrammar.push_back(newprdt);				//add A1
			for (std::string s : vs1)
			{
				newgrammar.push_back(s + ' ' + newprdt);
			}
			newgrammar.push_back("empty");
			p++;
			p = grammar.insert(p, newgrammar);
		} while (false);
		
	}
}

//Compute the FIRST sets
void Parser::get_FIRST()
{
	FIRST.clear();
	for (auto &gm : final_grammar)
	{
		std::string &Vn = gm[0][0];
		if (FIRST[Vn].empty())
		{
			FIRST[Vn] = cal_first(Vn);
		}
	}
}

//Recursively compute the FIRST set of the given nonterminal
std::set<std::string> Parser::cal_first(std::string Vn)
{
	std::set<std::string> res;
	for (auto &gm : final_grammar)
	{
		if (gm[0][0] == Vn)
		{
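			//FIRST set of each production
			for (int j = 1; j < gm.size(); j++)
			{
				auto &pro = gm[j];
				//start from the first symbol of the production; if its FIRST set contains empty, move on to the next symbol; if the last one contains empty too, add empty to the FIRST set
				for (int i = 0; i < pro.size(); i++)
				{
					//is it a nonterminal?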
					if (is_Vn[pro[i]])
					{
						FIRST[pro[i]] = cal_first(pro[i]);
						res.insert(FIRST[pro[i]].begin(), FIRST[pro[i]].end());
						if (FIRST[pro[i]].find("empty") != FIRST[pro[i]].end())
						{
							if (i == pro.size() - 1)
							{
								res.insert("empty");
								break;
							}
							else continue;
						}
						else break;
					}
					else
					{
						if (pro[i] != "empty")
						{
							res.insert(pro[i]);
						}
						//the production consists of empty alone
						else if (i == 0)
						{
							res.insert("empty");
						}
						break;
					}
				}
			}
			break;
		}
	}
	return res;
}

//Compute the FOLLOW sets
void Parser::get_FOLLOW()
{
	FOLLOW[final_grammar[0][0][0]].insert("$");
	int t = final_grammar.size();
	while (t--)
	{
		for (auto &gm : final_grammar)
		{
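			/*
			For each rule, keep looking for the pattern A -> αBβ and
			add every symbol of FIRST(β) except empty to FOLLOW(B)
			*/
			auto p = gm.begin();
			for (p++; p != gm.end(); p++)
			{
				auto &pro = *p;
				/*
				Tracks whether everything after the current symbol can derive empty,
				which decides whether this rule applies: for A -> αBβ,
				add the symbols of FOLLOW(A) to FOLLOW(B).
				flag is true when the rule applies and false when it does not
				*/
				bool flag = true;
				//scan the production from right to left; once a symbol cannot derive empty, set flag to false
				for (int i = pro.size() - 1; i >= 0; i--)
				{
					//is it a nonterminal?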
					if (is_Vn[pro[i]])
					{
						
						if (flag)
						{
							//FOLLOW(gm[0][0]) may still be incomplete, which is why the outer loop repeats the scan to fill in all the sets
							FOLLOW[pro[i]].insert(FOLLOW[gm[0][0]].begin(), FOLLOW[gm[0][0]].end());
							if (FIRST[pro[i]].find("empty") == FIRST[pro[i]].end()) flag = false;
						}
						//add the FIRST sets of the symbols that follow to FOLLOW
						for (int j = i + 1; j < pro.size(); j++)
						{
							if (is_Vn[pro[j]])
							{
								FOLLOW[pro[i]].insert(FIRST[pro[j]].begin(), FIRST[pro[j]].end());
								if (FIRST[pro[j]].find("empty") == FIRST[pro[j]].end())
								{
									flag = false;
									break;
								}
							}
							else
							{
								FOLLOW[pro[i]].insert(pro[j]);
								break;
							}
						}	
					}
					else
					{
						flag = false;
					}
				}
			}
		}
	}
	//FOLLOW sets never contain empty
	for (auto &gm : final_grammar)
	{
		auto &Vn = gm[0][0];
		auto p = FOLLOW[Vn].find("empty");
		if (p != FOLLOW[Vn].end()) FOLLOW[Vn].erase(p);
	}
}

//Check whether the grammar is LL(1)
bool Parser::judge_LL1_grammar()
{
	bool res = true;
	for (auto &gm : final_grammar)
	{
		auto &Vn = gm[0][0];
		bool flag = false;
		//FIRST(Vn) contains empty, so FOLLOW(Vn) must be compared with FIRST(A) for every production A whose FIRST set has no empty
		if (FIRST[Vn].find("empty") != FIRST[Vn].end())
		{
			flag = true;
		}
		for (int i = 1; i < gm.size(); i++)
		{
			//FIRST set of the whole production
			std::set<std::string> s1;
			for (auto &v : gm[i])
			{
				//a nonterminal
				if (is_Vn[v])
				{
					s1.insert(FIRST[v].begin(), FIRST[v].end());
					//it cannot derive empty
					if (FIRST[v].find("empty") == FIRST[v].end()) break;
				}
				else
				{
					s1.insert(v);
					break;
				}
			}
			if (flag && s1.find("empty") == s1.end()) res &= cmp_set(s1, FOLLOW[Vn]);
			for (int j = i + 1; j < gm.size(); j++)
			{
				std::set<std::string> s2;
				for (auto &v : gm[j])
				{
					if (is_Vn[v])
					{
						s2.insert(FIRST[v].begin(), FIRST[v].end());
						if (FIRST[v].find("empty") == FIRST[v].end()) break;
					}
					else
					{
						s2.insert(v);
						break;
					}
				}
				res &= cmp_set(s1, s2);
			}
		}
	}
	return res;
}

//Check two sets for overlap; returns true when they have no element in common
bool Parser::cmp_set(const std::set<std::string> s1, const std::set<std::string> s2)
{
	int l1 = s1.size();
	int l2 = s2.size();
	std::set<std::string> s;
	s.insert(s1.begin(), s1.end());
	s.insert(s2.begin(), s2.end());
	int l = s.size();
	if (l1 + l2 != l) {
		std::cout << "error" << std::endl;
	}
	return (l1 + l2) == l;
}

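//Build the predictive parsing table
void Parser::get_predict_table()
{
	/*
	For A -> α, store the production under pair(A, a) for every a in FIRST(α)
	If empty is in FIRST(α), store the production under pair(A, b) for every b in FOLLOW(A)
	*/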
	for (auto &gm : final_grammar)
	{
		auto &Vn = gm[0][0];
		for (int i = 1; i < gm.size(); i++)
		{
			//gm[i] is α
			auto &pro = gm[i];
			std::set<std::string> s;		//FIRST set of α
			bool flag = false;				//whether FIRST(α) contains empty
			for (auto &v : pro)
			{
				//a nonterminal
				if (is_Vn[v])
				{
					s.insert(FIRST[v].begin(), FIRST[v].end());
					//it cannot derive empty
					if (FIRST[v].find("empty") == FIRST[v].end()) break;
				}
				else
				{
					s.insert(v);
					break;
				}
			}
			if (s.find("empty") != s.end())
			{
				flag = true;
			}
			for (auto &Vt : s)
			{
				predictive_table[std::make_pair(Vn, Vt)] = pro;
			}
			if (flag)
			{
				for (auto &Vt : FOLLOW[Vn])
				{
					predictive_table[std::make_pair(Vn, Vt)] = pro;
				}
			}
		}

	}
}

//Extract left common factors
void Parser::get_left_common_factor()
{
	for (auto &gm : grammar)
	{
		int sign = 2;
		//reconsitution();
		//get_FIRST();		
	loop:
		auto p = gm.begin();
		p++;
		//two nested loops compare all productions of this nonterminal pairwise
		for (; p != gm.end(); p++)
		{
			
			auto p2 = p;
			for ( p2++; p2 != gm.end(); p2++)
			{
				
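				//store the symbols of each production in a vector for easy access
				std::vector<std::string> pro1, pro2;
				string_to_vector(*p, pro1);
				string_to_vector(*p2, pro2);
				//check whether the two productions share a left factor
				if (!has_common_prefix(pro1, pro2)) continue;
				else
				{	
					//the leading symbols are identical: factor them out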
					if (pro1[0] == pro2[0])
					{
						std::vector<std::string> common;
						int i = 0;
						while (i < pro1.size() && i < pro2.size() && pro1[i] == pro2[i])
						{
							common.push_back(pro1[i]);
							pro1[i] = "";
							pro2[i] = "";
							i++;
						}
						std::string common_fact;
						vector_to_string(common_fact, common);
						std::string npro1, npro2;
						vector_to_string(npro1, pro1);
						vector_to_string(npro2, pro2);
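						std::string nVn(gm.front());
						//name the new nonterminal <Vn>_<sign>; std::to_string stays correct once sign reaches 10
						nVn = nVn + '_' + std::to_string(sign);
						sign++;
						//remove the two productions that share the factor
						gm.erase(p);
						gm.erase(p2);
						//replace them with the factored production
						gm.push_back(common_fact + ' ' + nVn);
						//the new grammar rule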
						std::list<std::string> tmp;
						tmp.push_back(nVn);
						if (npro1 != "")
						{
							tmp.push_back(npro1);
						}
						else
						{
							tmp.push_back("empty");
						}
						if (npro2 != "")
						{
							tmp.push_back(npro2);
						}	
						else
						{
							tmp.push_back("empty");
						}
						grammar.push_back(tmp);
						is_Vn[nVn] = true;
					}
					//the common factor is hidden behind a leading nonterminal: expand first
					else
					{
						//expand the production that p points to
						if (is_Vn[pro1[0]])
						{
							for (auto & g: grammar)
							{
								if (g.front() == pro1[0])
								{
									std::string npro;
									pro1[0] = "";
									vector_to_string(npro, pro1);
									gm.erase(p);
									auto it = g.begin(); it++;
									for (; it != g.end(); it++)
									{
										if (npro != "")
										{
											gm.push_back(*it + ' ' + npro);
										}
										else
										{
											gm.push_back(*it);
										}
									}
									break;
								}
							}
						}
						//expand the production that p2 points to
						if (is_Vn[pro2[0]])
						{
							for (auto & g : grammar)
							{
								if (g.front() == pro2[0])
								{
									std::string npro;
									pro2[0] = "";
									vector_to_string(npro, pro2);
									gm.erase(p2);
									auto it = g.begin(); it++;
									for (; it != g.end(); it++)
									{
										if (npro != "")
										{
											gm.push_back(*it + ' ' + npro);
										}
										else
										{
											gm.push_back(*it);
										}
									}
									break;
								}
							}
						}
					}	
					goto loop;
				}
			}
		}
	}
}

//Mark every symbol as terminal or nonterminal
void Parser::get_all_Vn()
{
	//add every entry of the grammar to the map, initially as not a nonterminal
	for (auto gm : grammar)
	{
		for (auto V : gm)
		{
			is_Vn[V] = false;
		}
	}
	//mark the nonterminals as true
	for (auto gm : grammar)
	{
		is_Vn[*gm.begin()] = true;
	}
}

//The grammar no longer changes, so move it into a data structure that is easier to use later
void Parser::reconsitution()
{
	final_grammar.clear();
	std::vector<std::vector<std::string>> tmp;
	for (auto &gm : grammar)
	{
		tmp.clear();
		std::vector<std::string> vs;
		for (auto &pro : gm)
		{
			vs.clear();
			std::string sub;
			for (auto &c : pro)
			{
				if (c == ' ')
				{
					vs.push_back(sub);
					sub.clear();
				}
				else
				{
					sub.push_back(c);
				}
			}
			vs.push_back(sub);
			tmp.push_back(vs);
		}
		final_grammar.push_back(tmp);
	}
	//grammar.clear();
}


//Fetch the next token for building the syntax tree; returns "$" once the token file is exhausted
std::string Parser::get_next_token()
{
	std::string str;
	if (std::getline(tokenfile, str))
	{
		return str;
	}
	else
	{
		return std::string("$");
	}
}
//main.cpp
#include"scanner.h"
#include"Parser.h"

int main()
{
	const char *source_file = "D://cminus//in.c";
	Scanner s1(source_file);
	s1.GetToken();
	Parser p1;
	p1.get_LL1_grammar();
	p1.Parse();
	return 0;
}

Tests

Test code

int read(void){} 
void write(int x){}
int fact(int x)
{
	 
	if(x>1)
	{
		return x*fact(x-1);
	}		
	else
	{
		return 1;
	}		
}
void main(void)
{
	int x;
	x = read();
	if (x>0) 
	{
		write(fact(x));
	}
}

Original grammar

program → declaration-list
declaration-list → declaration-list declaration | declaration
declaration → var-declaration | fun-declaration
var-declaration → type-specifier ID ; | type-specifier ID [ NUM ] ;
type-specifier → int | void
fun-declaration → type-specifier ID ( params ) compound-stmt
params → param-list | void
param-list→ param-list , param | param
param → type-specifier ID | type-specifier ID [ ]
compound-stmt → { local-declarations statement-list }
local-declarations → local-declarations var-declaration | empty
statement-list → statement-list statement | empty
statement → expression-stmt | compound-stmt| selection-stmt | iteration-stmt | return-stmt
expression-stmt → expression ; | ;
selection-stmt → if ( expression ) { statement } | if ( expression ) { statement } else statement
iteration-stmt → while ( expression ) statement
return-stmt → return ; | return expression ;
expression → var = expression | simple-expression
var → ID | ID [ expression ]
simple-expression → additive-expression relop additive-expression | additive-expression
relop → <= | < | > | >= | == | !=
additive-expression → additive-expression addop term | term
addop → + | -
term → term mulop factor | factor
mulop → * | /
factor → ( expression ) | var | call | NUM
call → ID ( args )
args → arg-list | empty
arg-list → arg-list , expression | expression

The rewritten LL(1) grammar

program -> declaration-list | 

declaration-list -> declaration declaration-list_1 | 

declaration-list_1 -> declaration declaration-list_1 | empty | 

declaration -> int ID declaration_3 | void ID declaration_4 | 

var-declaration -> type-specifier ID var-declaration_2 | 

type-specifier -> int | void | 

fun-declaration -> int ID ( params ) compound-stmt | void ID ( params ) compound-stmt | 

params -> int ID params_3 | void params_4 | 

param-list -> param param-list_1 | 

param-list_1 -> , param param-list_1 | empty | 

param -> int ID param_2 | void ID param_3 | 

compound-stmt -> { local-declarations statement-list } | 

local-declarations -> var-declaration local-declarations | empty | 

statement-list -> statement statement-list | empty | 

statement -> expression-stmt | compound-stmt | selection-stmt | iteration-stmt | return-stmt | 

expression-stmt -> expression ; | ; | 

selection-stmt -> if ( expression ) { statement } selection-stmt_2 | 

iteration-stmt -> while ( expression ) statement | 

return-stmt -> return return-stmt_2 | 

expression -> ( expression ) term_1 additive-expression_1 expression_3 | NUM term_1 additive-expression_1 expression_3 | ID expression_6 | 

var -> ID var_2 | 

simple-expression -> additive-expression simple-expression_2 | 

relop -> <= | < | > | >= | == | != | 

additive-expression -> term additive-expression_1 | 

additive-expression_1 -> addop term additive-expression_1 | empty | 

addop -> + | - | 

term -> factor term_1 | 

term_1 -> mulop factor term_1 | empty | 

mulop -> * | / | 

factor -> ( expression ) | NUM | ID factor_2 | 

call -> ID ( args ) | 

args -> arg-list | empty | 

arg-list -> expression arg-list_1 | 

arg-list_1 -> , expression arg-list_1 | empty | 

declaration_2 -> ; | [ NUM ] ; | 

declaration_3 -> ( params ) compound-stmt | declaration_2 | 

declaration_4 -> ( params ) compound-stmt | declaration_2 | 

var-declaration_2 -> ; | [ NUM ] ; | 

params_2 -> empty | ID param-list_1 | 

params_3 -> param-list_1 | [ ] param-list_1 | 

params_4 -> empty | ID params_4_2 | 

param_2 -> empty | [ ] | 

param_3 -> empty | [ ] | 

selection-stmt_2 -> empty | else statement | 

return-stmt_2 -> ; | expression ; | 

expression_2 -> = expression | [ expression ] = expression | 

expression_3 -> relop additive-expression | empty | 

expression_4 -> expression_2 | term_1 additive-expression_1 expression_3 | 

expression_5 -> [ expression ] term_1 additive-expression_1 expression_3 | ( args ) term_1 additive-expression_1 expression_3 | 

expression_6 -> term_1 additive-expression_1 expression_3 | ( args ) term_1 additive-expression_1 expression_3 | = expression | [ expression ] expression_6_2 | 

var_2 -> empty | [ expression ] | 

simple-expression_2 -> relop additive-expression | empty | 

factor_2 -> var_2 | ( args ) | 

params_4_2 -> [ ] param-list_1 | param-list_1 | 

expression_6_2 -> term_1 additive-expression_1 expression_3 | = expression | 
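
Rules such as `return-stmt_2` come from left factoring. The following is a minimal sketch of one factoring step for the "obvious" case, where bodies start with the same symbols. It does not cover hidden common factors that only show up after expanding non-terminals. The `Grammar` layout, the function names, and passing in the fresh non-terminal name are all assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

using Symbols = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Symbols>>;  // head -> bodies (assumed layout)

// Length of the longest prefix shared by every body in the group.
std::size_t commonPrefixLength(const std::vector<Symbols>& group) {
    assert(!group.empty());
    std::size_t len = group.front().size();
    for (const Symbols& body : group) {
        std::size_t i = 0;
        while (i < len && i < body.size() && body[i] == group.front()[i]) ++i;
        len = i;
    }
    return len;
}

// One factoring step for `head`: bodies that begin with the same symbol are merged into
// "prefix fresh", and their differing tails become the bodies of `fresh`.
// Returns true if the grammar was changed.
bool factorOnce(Grammar& g, const std::string& head, const std::string& fresh) {
    std::map<std::string, std::vector<Symbols>> byFirst;
    for (const Symbols& body : g.at(head)) {
        assert(!body.empty());
        byFirst[body.front()].push_back(body);
    }
    for (const auto& entry : byFirst) {
        const std::string& first = entry.first;
        const std::vector<Symbols>& group = entry.second;
        if (group.size() < 2) continue;

        const std::size_t len = commonPrefixLength(group);
        Symbols prefix(group.front().begin(), group.front().begin() + len);
        std::vector<Symbols> kept, tails;
        for (const Symbols& body : g.at(head)) {
            if (body.front() != first) { kept.push_back(body); continue; }
            Symbols tail(body.begin() + len, body.end());
            if (tail.empty()) tail.push_back("empty");   // the prefix alone was a full body
            tails.push_back(tail);
        }
        prefix.push_back(fresh);
        kept.push_back(prefix);
        g[head] = kept;
        g[fresh] = tails;
        return true;
    }
    return false;
}
```

For `return-stmt -> return ; | return expression ;` with the fresh name `return-stmt_2`, one step gives `return-stmt -> return return-stmt_2` and `return-stmt_2 -> ; | expression ;`, which matches the listing. Calling it repeatedly until it returns false handles rules with several groups.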

FIRST sets

program : { int| void| }

declaration-list : { int| void| }

declaration-list_1 : { empty| int| void| }

declaration : { int| void| }

var-declaration : { int| void| }

type-specifier : { int| void| }

fun-declaration : { int| void| }

params : { int| void| }

param-list : { int| void| }

param-list_1 : { ,| empty| }

param : { int| void| }

compound-stmt : { {| }

local-declarations : { empty| int| void| }

statement-list : { (| ;| ID| NUM| empty| if| return| while| {| }

statement : { (| ;| ID| NUM| if| return| while| {| }

expression-stmt : { (| ;| ID| NUM| }

selection-stmt : { if| }

iteration-stmt : { while| }

return-stmt : { return| }

expression : { (| ID| NUM| }

var : { ID| }

simple-expression : { (| ID| NUM| }

relop : { !=| <| <=| ==| >| >=| }

additive-expression : { (| ID| NUM| }

additive-expression_1 : { +| -| empty| }

addop : { +| -| }

term : { (| ID| NUM| }

term_1 : { *| /| empty| }

mulop : { *| /| }

factor : { (| ID| NUM| }

call : { ID| }

args : { (| ID| NUM| empty| }

arg-list : { (| ID| NUM| }

arg-list_1 : { ,| empty| }

declaration_2 : { ;| [| }

declaration_3 : { (| ;| [| }

declaration_4 : { (| ;| [| }

var-declaration_2 : { ;| [| }

params_2 : { ID| empty| }

params_3 : { ,| [| empty| }

params_4 : { ID| empty| }

param_2 : { [| empty| }

param_3 : { [| empty| }

selection-stmt_2 : { else| empty| }

return-stmt_2 : { (| ;| ID| NUM| }

expression_2 : { =| [| }

expression_3 : { !=| <| <=| ==| >| >=| empty| }

expression_4 : { !=| *| +| -| /| <| <=| =| ==| >| >=| [| empty| }

expression_5 : { (| [| }

expression_6 : { !=| (| *| +| -| /| <| <=| =| ==| >| >=| [| empty| }

var_2 : { [| empty| }

simple-expression_2 : { !=| <| <=| ==| >| >=| empty| }

factor_2 : { (| [| empty| }

params_4_2 : { ,| [| empty| }

expression_6_2 : { !=| *| +| -| /| <| <=| =| ==| >| >=| empty| }
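
The following is a minimal sketch of how FIRST sets like the ones above can be computed. It uses fixed-point iteration rather than the recursive approach described earlier in this post, and it assumes that every symbol without a rule is a terminal and that the empty string is spelled `empty`. All names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

using Symbols = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Symbols>>;   // head -> bodies
using SetMap  = std::map<std::string, std::set<std::string>>;

const std::string kEmpty = "empty";

// FIRST of a symbol string, based on the FIRST sets known so far.
std::set<std::string> firstOfString(const Symbols& body, const Grammar& g, const SetMap& first) {
    assert(!body.empty());
    std::set<std::string> result;
    for (const std::string& sym : body) {
        if (sym == kEmpty) continue;                               // explicit empty body
        if (!g.count(sym)) { result.insert(sym); return result; }  // terminal ends the scan
        bool nullable = false;
        auto it = first.find(sym);
        if (it != first.end()) {
            for (const std::string& t : it->second) {
                if (t == kEmpty) nullable = true;
                else result.insert(t);
            }
        }
        if (!nullable) return result;
    }
    result.insert(kEmpty);   // every symbol in the body can vanish
    return result;
}

// Repeats the scan over all productions until no FIRST set grows any more.
SetMap computeFirst(const Grammar& g) {
    SetMap first;
    for (const auto& rule : g) first[rule.first];
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& rule : g)
            for (const Symbols& body : rule.second)
                for (const std::string& t : firstOfString(body, g, first))
                    if (first[rule.first].insert(t).second) changed = true;
    }
    return first;
}
```

On a cut-down fragment (`term -> factor term_1`, `term_1 -> mulop factor term_1 | empty`, `mulop -> * | /`, and a simplified `factor -> ( term ) | NUM | ID`), this gives FIRST(term) = { (, ID, NUM } and FIRST(term_1) = { *, /, empty }, in line with the listing.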

FOLLOW sets

program :{ $ | }

declaration-list :{ $ | }

declaration-list_1 :{ $ | }

declaration :{ $ | int | void | }

var-declaration :{ ( | ; | ID | NUM | if | int | return | void | while | { | } | }

type-specifier :{ ID | }

fun-declaration :{ }

params :{ ) | }

param-list :{ }

param-list_1 :{ ) | }

param :{ ) | , | }

compound-stmt :{ $ | ( | ; | ID | NUM | if | int | return | void | while | { | } | }

local-declarations :{ ( | ; | ID | NUM | if | return | while | { | } | }

statement-list :{ } | }

statement :{ ( | ; | ID | NUM | if | return | while | { | } | }

expression-stmt :{ ( | ; | ID | NUM | if | return | while | { | } | }

selection-stmt :{ ( | ; | ID | NUM | if | return | while | { | } | }

iteration-stmt :{ ( | ; | ID | NUM | if | return | while | { | } | }

return-stmt :{ ( | ; | ID | NUM | if | return | while | { | } | }

expression :{ ) | , | ; | ] | }

var :{ }

simple-expression :{ }

relop :{ ( | ID | NUM | }

additive-expression :{ != | ) | , | ; | < | <= | == | > | >= | ] | }

additive-expression_1 :{ != | ) | , | ; | < | <= | == | > | >= | ] | }

addop :{ ( | ID | NUM | }

term :{ != | ) | + | , | - | ; | < | <= | == | > | >= | ] | }

term_1 :{ != | ) | + | , | - | ; | < | <= | == | > | >= | ] | }

mulop :{ ( | ID | NUM | }

factor :{ != | ) | * | + | , | - | / | ; | < | <= | == | > | >= | ] | }

call :{ }

args :{ ) | }

arg-list :{ ) | }

arg-list_1 :{ ) | }

declaration_2 :{ $ | int | void | }

declaration_3 :{ $ | int | void | }

declaration_4 :{ $ | int | void | }

var-declaration_2 :{ ( | ; | ID | NUM | if | int | return | void | while | { | } | }

params_2 :{ }

params_3 :{ ) | }

params_4 :{ ) | }

param_2 :{ ) | , | }

param_3 :{ ) | , | }

selection-stmt_2 :{ ( | ; | ID | NUM | if | return | while | { | } | }

return-stmt_2 :{ ( | ; | ID | NUM | if | return | while | { | } | }

expression_2 :{ }

expression_3 :{ ) | , | ; | ] | }

expression_4 :{ }

expression_5 :{ }

expression_6 :{ ) | , | ; | ] | }

var_2 :{ != | ) | * | + | , | - | / | ; | < | <= | == | > | >= | ] | }

simple-expression_2 :{ }

factor_2 :{ != | ) | * | + | , | - | / | ; | < | <= | == | > | >= | ] | }

params_4_2 :{ ) | }

expression_6_2 :{ ) | , | ; | ] | }
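
The following is a minimal sketch of the three FOLLOW rules, given FIRST sets that are already known. It repeats the scan over the productions until nothing changes, because rule (3) can need more than one pass when FOLLOW(A) grows after B has already been visited. The function names and the `Grammar` layout are assumptions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <vector>

using Symbols = std::vector<std::string>;
using Grammar = std::map<std::string, std::vector<Symbols>>;
using SetMap  = std::map<std::string, std::set<std::string>>;

// FIRST of body[from..]; symbols without a rule are treated as terminals.
std::set<std::string> firstOfTail(const Symbols& body, std::size_t from,
                                  const Grammar& g, const SetMap& first) {
    std::set<std::string> out;
    for (std::size_t i = from; i < body.size(); ++i) {
        const std::string& sym = body[i];
        if (sym == "empty") continue;
        if (!g.count(sym)) { out.insert(sym); return out; }
        const std::set<std::string>& f = first.at(sym);
        for (const std::string& t : f) if (t != "empty") out.insert(t);
        if (!f.count("empty")) return out;
    }
    out.insert("empty");   // the whole tail can derive the empty string
    return out;
}

SetMap computeFollow(const Grammar& g, const SetMap& first, const std::string& start) {
    assert(g.count(start));
    SetMap follow;
    for (const auto& rule : g) follow[rule.first];
    follow[start].insert("$");                                    // rule (1)
    bool changed = true;
    while (changed) {
        changed = false;
        for (const auto& rule : g)
            for (const Symbols& body : rule.second)
                for (std::size_t i = 0; i < body.size(); ++i) {
                    if (!g.count(body[i])) continue;              // only non-terminals get FOLLOW
                    std::set<std::string> add = firstOfTail(body, i + 1, g, first);  // rule (2)
                    if (add.erase("empty")) {                     // rule (3): tail can vanish
                        const std::set<std::string>& fa = follow[rule.first];
                        add.insert(fa.begin(), fa.end());
                    }
                    for (const std::string& t : add)
                        if (follow[body[i]].insert(t).second) changed = true;
                }
    }
    return follow;
}
```

On the fragment `call -> ID ( args )`, `args -> arg-list | empty`, `arg-list -> expression arg-list_1`, `arg-list_1 -> , expression arg-list_1 | empty` with `call` as the start symbol, `args`, `arg-list` and `arg-list_1` all end up with FOLLOW = { ) }, as in the listing.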

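Predictive parsing table (partial)

Each row lists the non-terminal, the lookahead terminal, and the production body chosen for that pair.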

additive-expression	(	term additive-expression_1

additive-expression	ID	term additive-expression_1

additive-expression	NUM	term additive-expression_1

additive-expression_1	!=	empty

additive-expression_1	)	empty

additive-expression_1	+	addop term additive-expression_1

additive-expression_1	,	empty

additive-expression_1	-	addop term additive-expression_1

additive-expression_1	;	empty

additive-expression_1	<	empty

additive-expression_1	<=	empty

additive-expression_1	==	empty

additive-expression_1	>	empty

additive-expression_1	>=	empty

additive-expression_1	]	empty

additive-expression_1	empty	empty

addop	+	+

addop	-	-

arg-list	(	expression arg-list_1

arg-list	ID	expression arg-list_1

arg-list	NUM	expression arg-list_1

arg-list_1	)	empty

arg-list_1	,	, expression arg-list_1

arg-list_1	empty	empty

args	(	arg-list

args	)	empty

args	ID	arg-list

args	NUM	arg-list

args	empty	empty
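
The following is a minimal sketch of how the cells for one production can be filled from FIRST(α) and FOLLOW(A). It also reports a conflict when two different bodies land in the same cell, which means the grammar is not LL(1). The `Table` type and the function name are assumptions. The sketch never uses `empty` as a lookahead, so it would not produce rows like `arg-list_1 empty empty`.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Symbols = std::vector<std::string>;
using Table   = std::map<std::pair<std::string, std::string>, Symbols>;  // M[A, a] -> body

// Fills the cells for one production A -> body.
// Returns false if a cell is already taken by a different body.
bool addProduction(Table& table, const std::string& head, const Symbols& body,
                   const std::set<std::string>& firstOfBody,
                   const std::set<std::string>& followOfHead) {
    assert(!body.empty());
    bool ok = true;
    auto place = [&](const std::string& terminal) {
        auto inserted = table.emplace(std::make_pair(head, terminal), body);
        if (!inserted.second && inserted.first->second != body) ok = false;  // conflict
    };
    for (const std::string& a : firstOfBody)
        if (a != "empty") place(a);                          // rule (1)
    if (firstOfBody.count("empty"))
        for (const std::string& b : followOfHead) place(b);  // rule (2), "$" included
    return ok;
}
```

For `arg-list_1`, with FIRST(`, expression arg-list_1`) = { , } and FOLLOW(arg-list_1) = { ) }, the two productions fill M[arg-list_1, ,] and M[arg-list_1, )], the same cells as in the listing.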

Token matching process (partial)

Stack                         Input               Action
program ... $                 int ... $           output: program -> declaration-list
declaration-list ... $        int ... $           output: declaration-list -> declaration declaration-list_1
declaration ... $             int ... $           output: declaration -> int ID declaration_3
int ... $                     int ... $           match
ID ... $                      ID ... $            match
declaration_3 ... $           ( ... $             output: declaration_3 -> ( params ) compound-stmt
( ... $                       ( ... $             match
params ... $                  void ... $          output: params -> void params_4
void ... $                    void ... $          match
params_4 ... $                ) ... $             output: params_4 -> empty
) ... $                       ) ... $             match
compound-stmt ... $           { ... $             output: compound-stmt -> { local-declarations statement-list }
{ ... $                       { ... $             match
local-declarations ... $      } ... $             output: local-declarations -> empty
statement-list ... $          } ... $             output: statement-list -> empty
} ... $                       } ... $             match
declaration-list_1 ... $      void ... $          output: declaration-list_1 -> declaration declaration-list_1
declaration ... $             void ... $          output: declaration -> void ID declaration_4
void ... $                    void ... $          match
ID ... $                      ID ... $            match
declaration_4 ... $           ( ... $             output: declaration_4 -> ( params ) compound-stmt
( ... $                       ( ... $             match
params ... $                  int ... $           output: params -> int ID params_3
int ... $                     int ... $           match
ID ... $                      ID ... $            match
params_3 ... $                ) ... $             output: params_3 -> param-list_1
param-list_1 ... $            ) ... $             output: param-list_1 -> empty
) ... $                       ) ... $             match
compound-stmt ... $           { ... $             output: compound-stmt -> { local-declarations statement-list }
{ ... $                       { ... $             match
local-declarations ... $      } ... $             output: local-declarations -> empty
statement-list ... $          } ... $             output: statement-list -> empty
} ... $                       } ... $             match
declaration-list_1 ... $      int ... $           output: declaration-list_1 -> declaration declaration-list_1
declaration ... $             int ... $           output: declaration -> int ID declaration_3
int ... $                     int ... $           match
ID ... $                      ID ... $            match
declaration_3 ... $           ( ... $             output: declaration_3 -> ( params ) compound-stmt
( ... $                       ( ... $             match
params ... $                  int ... $           output: params -> int ID params_3
int ... $                     int ... $           match
ID ... $                      ID ... $            match
params_3 ... $                ) ... $             output: params_3 -> param-list_1
param-list_1 ... $            ) ... $             output: param-list_1 -> empty
) ... $                       ) ... $             match
compound-stmt ... $           { ... $             output: compound-stmt -> { local-declarations statement-list }
{ ... $                       { ... $             match
local-declarations ... $      if ... $            output: local-declarations -> empty
statement-list ... $          if ... $            output: statement-list -> statement statement-list
statement ... $               if ... $            output: statement -> selection-stmt
selection-stmt ... $          if ... $            output: selection-stmt -> if ( expression ) { statement } selection-stmt_2
if ... $                      if ... $            match
( ... $                       ( ... $             match
expression ... $              ID ... $            output: expression -> ID expression_6
ID ... $                      ID ... $            match
expression_6 ... $            > ... $             output: expression_6 -> term_1 additive-expression_1 expression_3
term_1 ... $                  > ... $             output: term_1 -> empty
additive-expression_1 ... $   > ... $             output: additive-expression_1 -> empty
expression_3 ... $            > ... $             output: expression_3 -> relop additive-expression
relop ... $                   > ... $             output: relop -> >
> ... $                       > ... $             match

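The following is a minimal sketch of the stack-driven loop behind a trace like the one above. It is not the parser from this post: it works on terminal names instead of real tokens, builds no tree, and simply stops at the first error. The function name, the `Table` type and the log format are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Symbols = std::vector<std::string>;
using Table   = std::map<std::pair<std::string, std::string>, Symbols>;  // M[A, a] -> body

// Runs the table-driven LL(1) loop over `tokens` (terminal names without the final "$").
// Appends one line per step to `log`; returns true if the whole input is accepted.
bool parse(const Table& table, const std::set<std::string>& nonTerminals,
           const std::string& start, std::vector<std::string> tokens,
           std::vector<std::string>& log) {
    assert(nonTerminals.count(start));
    tokens.push_back("$");
    std::vector<std::string> stack;          // back() is the top of the stack
    stack.push_back("$");
    stack.push_back(start);
    std::size_t pos = 0;
    while (!stack.empty()) {
        if (pos >= tokens.size()) return false;
        const std::string top = stack.back();
        const std::string& look = tokens[pos];
        stack.pop_back();
        if (!nonTerminals.count(top)) {      // terminal or "$": must equal the lookahead
            if (top != look) return false;
            log.push_back("match " + top);
            ++pos;
            continue;
        }
        auto cell = table.find(std::make_pair(top, look));
        if (cell == table.end()) return false;   // empty cell: syntax error
        std::string line = "output: " + top + " ->";
        for (const std::string& s : cell->second) line += " " + s;
        log.push_back(line);
        // Push the body right to left so that its first symbol ends up on top.
        for (auto it = cell->second.rbegin(); it != cell->second.rend(); ++it)
            if (*it != "empty") stack.push_back(*it);
    }
    return pos == tokens.size();
}
```

A production with body `empty` pushes nothing, which is why steps such as `params_4 -> empty` in the trace are followed directly by a match of the next stack symbol.
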
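Final syntax tree (partial)

(I don't know how to produce a graphical visualization, and because there are so many intermediate nodes the whole tree is quite large. I used indentation to show the levels, but it is still hard to read.)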

type: non-terminal            value: program                
-type: non-terminal            value: declaration-list       
--type: non-terminal            value: declaration-list_1     
---type: non-terminal            value: declaration-list_1     
----type: non-terminal            value: declaration-list_1     
-----type: non-terminal            value: declaration-list_1     
-----type: non-terminal            value: declaration            
------type: non-terminal            value: declaration_4          
-------type: non-terminal            value: compound-stmt          
--------type: SYMBOL                  value: }                      
--------type: non-terminal            value: statement-list         
---------type: non-terminal            value: statement-list         
----------type: non-terminal            value: statement-list         
----------type: non-terminal            value: statement              
-----------type: non-terminal            value: selection-stmt         
------------type: non-terminal            value: selection-stmt_2       
------------type: SYMBOL                  value: }                      
------------type: non-terminal            value: statement              
-------------type: non-terminal            value: expression-stmt        
--------------type: SYMBOL                  value: ;                      
--------------type: non-terminal            value: expression             
---------------type: non-terminal            value: expression_6           
----------------type: non-terminal            value: expression_3           
----------------type: non-terminal            value: additive-expression_1  
----------------type: non-terminal            value: term_1                 
----------------type: SYMBOL                  value: )                      
----------------type: non-terminal            value: args                   
-----------------type: non-terminal            value: arg-list               
------------------type: non-terminal            value: arg-list_1             
------------------type: non-terminal            value: expression             
-------------------type: non-terminal            value: expression_6           
--------------------type: non-terminal            value: expression_3           
--------------------type: non-terminal            value: additive-expression_1  
--------------------type: non-terminal            value: term_1                 
--------------------type: SYMBOL                  value: )                      
--------------------type: non-terminal            value: args                   
---------------------type: non-terminal            value: arg-list               
----------------------type: non-terminal            value: arg-list_1             
----------------------type: non-terminal            value: expression             
-----------------------type: non-terminal            value: expression_6           
------------------------type: non-terminal            value: expression_3           
------------------------type: non-terminal            value: additive-expression_1  
------------------------type: non-terminal            value: term_1                 
-----------------------type: ID                      value: x                      
--------------------type: SYMBOL                  value: (                      
-------------------type: ID                      value: fact                   
----------------type: SYMBOL                  value: (                      
---------------type: ID                      value: write                  
------------type: SYMBOL                  value: {                      
------------type: SYMBOL                  value: )                      
------------type: non-terminal            value: expression             
-------------type: non-terminal            value: expression_6           
--------------type: non-terminal            value: expression_3           
---------------type: non-terminal            value: additive-expression    
----------------type: non-terminal            value: additive-expression_1  
----------------type: non-terminal            value: term                   
-----------------type: non-terminal            value: term_1                 
-----------------type: non-terminal            value: factor                 
------------------type: NUM                     value: 0                      
---------------type: non-terminal            value: relop                  
----------------type: SYMBOL                  value: >                      
--------------type: non-terminal            value: additive-expression_1  
--------------type: non-terminal            value: term_1                 
-------------type: ID                      value: x                      
------------type: SYMBOL                  value: (                      
------------type: RESERVED WORD           value: if                     
---------type: non-terminal            value: statement              
----------type: non-terminal            value: expression-stmt        
-----------type: SYMBOL                  value: ;                      
-----------type: non-terminal            value: expression             
------------type: non-terminal            value: expression_6           
-------------type: non-terminal            value: expression             
--------------type: non-terminal            value: expression_6           
---------------type: non-terminal            value: expression_3           
---------------type: non-terminal            value: additive-expression_1  
---------------type: non-terminal            value: term_1                 
---------------type: SYMBOL                  value: )                      
---------------type: non-terminal            value: args                   
---------------type: SYMBOL                  value: (                      
--------------type: ID                      value: read                   
-------------type: SYMBOL                  value: =                      
------------type: ID                      value: x                      
--------type: non-terminal            value: local-declarations     
---------type: non-terminal            value: local-declarations     
---------type: non-terminal            value: var-declaration        
----------type: non-terminal            value: var-declaration_2      
-----------type: SYMBOL                  value: ;                      
----------type: ID                      value: x                      
----------type: non-terminal            value: type-specifier         
-----------type: RESERVED WORD           value: int                    
--------type: SYMBOL                  value: {                      
-------type: SYMBOL                  value: )                      
-------type: non-terminal            value: params                 
--------type: non-terminal            value: params_4               
--------type: RESERVED WORD           value: void                   
-------type: SYMBOL                  value: (                      
------type: ID                      value: main                   
------type: RESERVED WORD           value: void                   
----type: non-terminal            value: declaration            
-----type: non-terminal            value: declaration_3          
------type: non-terminal            value: compound-stmt          
-------type: SYMBOL                  value: }                      
-------type: non-terminal            value: statement-list         
--------type: non-terminal            value: statement-list         
--------type: non-terminal            value: statement              
---------type: non-terminal            value: selection-stmt         
----------type: non-terminal            value: selection-stmt_2       
-----------type: non-terminal            value: statement              
------------type: non-terminal            value: compound-stmt          
-------------type: SYMBOL                  value: }                      
-------------type: non-terminal            value: statement-list         
--------------type: non-terminal            value: statement-list         
--------------type: non-terminal            value: statement              
---------------type: non-terminal            value: return-stmt            
----------------type: non-terminal            value: return-stmt_2          
-----------------type: SYMBOL                  value: ;                      
-----------------type: non-terminal            value: expression             
------------------type: non-terminal            value: expression_3           
------------------type: non-terminal            value: additive-expression_1  
------------------type: non-terminal            value: term_1                 
------------------type: NUM                     value: 1                      
----------------type: RESERVED WORD           value: return                 
-------------type: non-terminal            value: local-declarations     
-------------type: SYMBOL                  value: {                      
-----------type: RESERVED WORD           value: else                   
----------type: SYMBOL                  value: }                      
----------type: non-terminal            value: statement              
-----------type: non-terminal            value: return-stmt            
------------type: non-terminal            value: return-stmt_2          
-------------type: SYMBOL                  value: ;                      
-------------type: non-terminal            value: expression             
--------------type: non-terminal            value: expression_6           
---------------type: non-terminal            value: expression_3           
---------------type: non-terminal            value: additive-expression_1  
---------------type: non-terminal            value: term_1                 
----------------type: non-terminal            value: term_1                 
----------------type: non-terminal            value: factor                 
-----------------type: non-terminal            value: factor_2               
------------------type: SYMBOL                  value: )                      
------------------type: non-terminal            value: args                   
-------------------type: non-terminal            value: arg-list               
--------------------type: non-terminal            value: arg-list_1             
--------------------type: non-terminal            value: expression             
---------------------type: non-terminal            value: expression_6           
----------------------type: non-terminal            value: expression_3           
----------------------type: non-terminal            value: additive-expression_1  
-----------------------type: non-terminal            value: additive-expression_1  
-----------------------type: non-terminal            value: term                   
------------------------type: non-terminal            value: term_1                 
------------------------type: non-terminal            value: factor                 
-------------------------type: NUM                     value: 1                      
-----------------------type: non-terminal            value: addop                  
------------------------type: SYMBOL                  value: -                      
----------------------type: non-terminal            value: term_1                 
---------------------type: ID                      value: x                      
------------------type: SYMBOL                  value: (                      
-----------------type: ID                      value: fact                   
----------------type: non-terminal            value: mulop                  
-----------------type: SYMBOL                  value: *                      
--------------type: ID                      value: x                      
------------type: RESERVED WORD           value: return                 
----------type: SYMBOL                  value: {                      
----------type: SYMBOL                  value: )                      
----------type: non-terminal            value: expression             
-----------type: non-terminal            value: expression_6           
------------type: non-terminal            value: expression_3           
-------------type: non-terminal            value: additive-expression    
--------------type: non-terminal            value: additive-expression_1  
--------------type: non-terminal            value: term                   
---------------type: non-terminal            value: term_1                 
---------------type: non-terminal            value: factor                 
----------------type: NUM                     value: 1                      
-------------type: non-terminal            value: relop                  
--------------type: SYMBOL                  value: >                      
------------type: non-terminal            value: additive-expression_1  
------------type: non-terminal            value: term_1                 
-----------type: ID                      value: x                      
----------type: SYMBOL                  value: (                      
----------type: RESERVED WORD           value: if                     
-------type: non-terminal            value: local-declarations     
-------type: SYMBOL                  value: {                      
------type: SYMBOL                  value: )                      
------type: non-terminal            value: params                 
-------type: non-terminal            value: params_3               
--------type: non-terminal            value: param-list_1           
-------type: ID                      value: x                      
-------type: RESERVED WORD           value: int                    
------type: SYMBOL                  value: (                      
-----type: ID                      value: fact                   
-----type: RESERVED WORD           value: int                    
---type: non-terminal            value: declaration            
----type: non-terminal            value: declaration_4          
-----type: non-terminal            value: compound-stmt          
------type: SYMBOL                  value: }                      
------type: non-terminal            value: statement-list         
------type: non-terminal            value: local-declarations     
------type: SYMBOL                  value: {                      
-----type: SYMBOL                  value: )                      
-----type: non-terminal            value: params                 
------type: non-terminal            value: params_3               
-------type: non-terminal            value: param-list_1           
------type: ID                      value: x                      
------type: RESERVED WORD           value: int                    
-----type: SYMBOL                  value: (                      
----type: ID                      value: write                  
----type: RESERVED WORD           value: void                   
--type: non-terminal            value: declaration            
---type: non-terminal            value: declaration_3          
----type: non-terminal            value: compound-stmt          
-----type: SYMBOL                  value: }                      
-----type: non-terminal            value: statement-list         
-----type: non-terminal            value: local-declarations     
-----type: SYMBOL                  value: {                      
----type: SYMBOL                  value: )                      
----type: non-terminal            value: params                 
-----type: non-terminal            value: params_4               
-----type: RESERVED WORD           value: void                   
----type: SYMBOL                  value: (                      
---type: ID                      value: read                   
---type: RESERVED WORD           value: int                    
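
The following is a minimal sketch of an indented dump like the one above, with one leading dash per tree level. The `Node` layout, the helper names and the column width are assumptions. The sketch prints children in the order they are stored, whereas the listing above appears to print each node's children from last to first.

```cpp
#include <cassert>
#include <cstddef>
#include <iomanip>
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Node {                        // hypothetical node layout
    std::string type;                // "non-terminal", "ID", "NUM", "SYMBOL", ...
    std::string value;
    std::vector<std::unique_ptr<Node>> children;
};

std::unique_ptr<Node> makeNode(std::string type, std::string value) {
    std::unique_ptr<Node> node(new Node);
    node->type = std::move(type);
    node->value = std::move(value);
    return node;
}

// Pre-order dump: one line per node, prefixed by `depth` dashes.
void dump(const Node& node, std::ostream& out, std::size_t depth = 0) {
    out << std::string(depth, '-')
        << "type: " << std::left << std::setw(24) << node.type
        << "value: " << node.value << '\n';
    for (const auto& child : node.children) {
        assert(child != nullptr);
        dump(*child, out, depth + 1);
    }
}

// Convenience wrapper that returns the dump as a string.
std::string dumpToString(const Node& root) {
    std::ostringstream out;
    dump(root, out);
    return out.str();
}
```

Skipping the nodes whose subtree derives only `empty` would already make the output much shorter, since many of the `term_1` and `additive-expression_1` lines carry no tokens.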

Full source code


Reposted from blog.csdn.net/IuSpet/article/details/101513465