[Compilation Principle] Experiment 2: Implement a recursive descent analyzer

Experiment content

Implement, in a high-level language, a recursive descent parser for the grammar of textbook section 3.2.
Textbook 3.2 grammar (the productions used in the analysis below):
E→TE'
E'→+TE' | ε
T→FT'
T'→*FT' | ε
F→(E) | i
Requirement: you can use the input string i1*(i2+i3) provided in the book, or other symbol strings you define yourself; output the full contents of the stack at each step and give the analysis result.

Development environment

Windows 10
Visual Studio 2019

Data structure

1 Symbol stack

A stack is a linear list with restricted access: insertions and deletions are allowed at one end only (the top), so it follows the "last in, first out" principle.
Figure: stack diagram
Recursive descent analysis is a top-down parsing method. Each non-terminal of the grammar corresponds to one (possibly recursive) function, and a stack can be used to show the progress of the analysis.
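
A stack that only inserts and deletes at one end can be sketched in C as follows. This is a minimal illustration, not the code used in the experiment: the names SymStack, stack_init, stack_push and stack_pop and the capacity STACK_CAP are assumptions made for the example.

```c
#include <assert.h>

#define STACK_CAP 32            /* assumed capacity, not taken from the article */

typedef struct {
    char data[STACK_CAP];
    int top;                    /* index of the top element, -1 when empty */
} SymStack;

static void stack_init(SymStack *s) { s->top = -1; }

/* Insert at the top end only. */
static void stack_push(SymStack *s, char c) {
    assert(s->top + 1 < STACK_CAP);   /* refuse to overflow the buffer */
    s->data[++s->top] = c;
}

/* Delete from the same end, so the last symbol pushed comes out first. */
static char stack_pop(SymStack *s) {
    assert(s->top >= 0);              /* refuse to pop an empty stack */
    return s->data[s->top--];
}
```

Pushing '#', 'E' and 'T' in that order and then popping three times yields 'T', 'E' and '#'.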

2 Analysis process

Step  Symbol stack  Remaining input  Production
0     #             i*(i+i)#
1     #E            i*(i+i)#         E→TE'
2     #E'T          i*(i+i)#         T→FT'
3     #E'T'F        i*(i+i)#         F→i
4     #E'T'         *(i+i)#          T'→*FT'
5     #E'T'F        (i+i)#           F→(E)
6     #E'T'E        i+i)#            E→TE'
7     #E'T'E'T      i+i)#            T→FT'
8     #E'T'E'T'F    i+i)#            F→i
9     #E'T'E'T'     +i)#             T'→ε
10    #E'T'E'       +i)#             E'→+TE'
11    #E'T'E'T      i)#              T→FT'
12    #E'T'E'T'F    i)#              F→i
13    #E'T'E'T'     )#               T'→ε
14    #E'T'E'       )#               E'→ε
15    #E'T'         #                T'→ε
16    #E'           #                E'→ε
17    #             #

Analysis stack trace for i*(i+i)
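
Each row of the trace is obtained from the previous one by replacing the non-terminal on top of the stack with the non-terminals of the chosen production, pushed right to left. The sketch below illustrates that single step on a string-shaped stack. The function name expand and its parameters are hypothetical, and the rule that terminals are matched immediately and never pushed is inferred from the trace rather than stated in the textbook.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Replace the non-terminal `lhs` on top of `stack` (a NUL-terminated string
 * such as "#E'T'F") with the non-terminals of `rhs`, pushed right to left.
 * Terminals in `rhs` are skipped because they are matched immediately. */
static void expand(char *stack, size_t cap, const char *lhs, const char *rhs) {
    size_t n = strlen(stack), l = strlen(lhs);
    assert(n >= l && strcmp(stack + n - l, lhs) == 0);  /* lhs must be on top */
    n -= l;                                             /* pop lhs */
    for (size_t i = strlen(rhs); i > 0; ) {             /* scan rhs right to left */
        size_t end = i;
        if (rhs[i - 1] == '\'' && i >= 2) i--;          /* a prime belongs to the letter before it */
        i--;                                            /* i now indexes the symbol's first char */
        if (isupper((unsigned char)rhs[i])) {           /* push non-terminals only */
            assert(n + (end - i) < cap);
            memcpy(stack + n, rhs + i, end - i);
            n += end - i;
        }
    }
    stack[n] = '\0';
}
```

Starting from "#E", applying E→TE' gives "#E'T", and applying T→FT' then gives "#E'T'F", which matches steps 1 to 3 of the trace.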

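3 Arrays

1. Input array char token[token_size]: stores the string to be analyzed.
2. Symbol stack char stack[10] = {'#'}: stores the contents of the symbol stack; '#' is pushed initially. Note that the deepest stack in the trace above, #E'T'E'T'F, holds 11 characters, so a capacity of 10 is one slot short for this input.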

Experimental steps

Analysis and design

The textbook's sample grammar has already been analyzed in detail:
1. Left recursion has been eliminated.
2. No backtracking is needed.
So we can proceed directly to constructing the recursive descent parser.
Figure: rough flowchart of the recursive descent analysis method
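
"No backtracking" means that one character of lookahead is enough to pick the production for every non-terminal. The sketch below illustrates this for the grammar used here. The enum NonTerminal and the function select_production are hypothetical names introduced for the example, and choosing the ε-production for any other lookahead mirrors the experiment code rather than a full FOLLOW-set check.

```c
#include <stddef.h>

/* Non-terminals of the grammar; NT_E1 and NT_T1 stand for E' and T'. */
typedef enum { NT_E, NT_E1, NT_T, NT_T1, NT_F } NonTerminal;

/* Return the right-hand side selected by one character of lookahead,
 * "" for the ε-production, or NULL when no production applies. */
static const char *select_production(NonTerminal nt, char lookahead) {
    switch (nt) {
    case NT_E:  return "TE'";                            /* only one alternative */
    case NT_E1: return lookahead == '+' ? "+TE'" : "";   /* '+' or ε */
    case NT_T:  return "FT'";                            /* only one alternative */
    case NT_T1: return lookahead == '*' ? "*FT'" : "";   /* '*' or ε */
    case NT_F:
        if (lookahead == 'i') return "i";
        if (lookahead == '(') return "(E)";
        return NULL;                                     /* syntax error */
    }
    return NULL;
}
```

Because the alternatives of each non-terminal start with different terminals, the choice never has to be undone later.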

Programming

1 Global variables

① token_len: int variable storing the length of the string to be analyzed.
② step: int variable recording the current analysis step number.
③ lookahead: int variable used in the code as the index of the character currently being analyzed (token[lookahead]).

2 Stack

① char stack[10] = {'#'}: char array storing the contents of the symbol stack; '#' is pushed initially.
② stack_top: int variable holding the index of the top of the symbol stack.

3 Array

token[token_size]: char array storing the string to be analyzed.

4 functions

① One function per non-terminal (E1 and T1 stand for E' and T'):
void E();
void E1();
void T();
void T1();
void F();
② Print function: void print().

5 pseudo code

See "Compilation Principle Tutorial (Fourth Edition)", pp. 52-53.

void match(token t){
	if (lookahead == t)
		lookahead = nexttoken;
	else
		error();
}

void E(){
	T();
	E'();
}

void E'(){
	if (lookahead == '+'){
		match('+');
		T();
		E'();
	}
}

void T(){
	F();
	T'();
}

void T'(){
	if (lookahead == '*'){
		match('*');
		F();
		T'();
	}
}

void F(){
	if (lookahead == 'i')
		match('i');
	else if (lookahead == '('){
		match('(');
		E();
		if (lookahead == ')')
			match(')');
		else
			error();
	}
	else
		error();
}

Run and debug

Modify the code as needed based on what debugging reveals.

Running results

i*(i+i) is a legal string
++ is an illegal string

Problems encountered and solutions

1 Pushing E' and T' onto the stack

E' is not a single character, so the original plan was to implement the stack as an array of strings. That caused many problems, so E' is instead stored directly as the two characters 'E' and '\''; T' is handled the same way.
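
One way to keep the two-character encoding manageable is to hide it behind a pair of helpers, so that callers push and pop whole grammar symbols. The sketch below is an illustration only: push_symbol, pop_symbol and STACK_SIZE are hypothetical names, and it assumes, as in the experiment code, that the stack is a char array whose slot 0 holds '#'.

```c
#include <assert.h>

#define STACK_SIZE 32   /* assumed capacity for this sketch */

/* Push a non-terminal; a primed one such as E' takes two slots: 'E' then '\''. */
static void push_symbol(char stack[], int *top, char name, int primed) {
    assert(*top + (primed ? 2 : 1) < STACK_SIZE);
    stack[++*top] = name;
    if (primed) stack[++*top] = '\'';
}

/* Pop one grammar symbol and return how many characters it occupied. */
static int pop_symbol(char stack[], int *top) {
    assert(*top >= 0);
    int width = (stack[*top] == '\'') ? 2 : 1;   /* a prime on top means a two-character symbol */
    assert(*top + 1 >= width);
    for (int k = 0; k < width; k++)
        stack[(*top)--] = '\0';                  /* clear each slot as it is popped */
    return width;
}
```

Pushing E' and then T leaves "#E'T" on the stack; the first pop removes one character (T) and the second removes two (E').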

2 Implementation of the stack

Experience

I gained a deeper understanding of the structure of a recursive descent parser and became more proficient at implementing a stack.

Experiment code

1 main function main()

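#include <stdio.h>
#include <stdlib.h>					//EXIT_FAILURE, exit()

#define token_size 100				//assumed capacity; the original value is not shown

char token[token_size];				//string to be analyzed
char stack[32] = {'#'};				//symbol stack; #E'T'E'T'F alone needs 11 slots, so 10 is too small
int token_len = 0;					//length of the string to be analyzed
int stack_top = 0;					//index of the top of the symbol stack
int lookahead = 0;					//index of the character currently being analyzed
int step = 0;						//current analysis step

void E(); void E1(); void T(); void T1(); void F(); void print();

int main(void){
	char scan_char = 0;						//character just scanned
	printf("Enter a string (end it with #): ");
	do{										//keep scanning until the terminator # is read
		if (scanf_s("%c", &scan_char, 1) != 1 || token_len >= token_size - 1){
			printf("\nInput must end with # and fit in the buffer\n");
			return EXIT_FAILURE;
		}
		token[token_len++] = scan_char;		//save the scanned character in the input array
	} while (scan_char != '#');
	printf("\n");
	printf("Step\t");
	printf("Symbol stack\t");
	printf("Remaining input\t");
	printf("Production\n");
	print();								//print the current step
	printf("\n");
	stack[++stack_top] = 'E';				//push the start symbol E and begin the analysis
	step++;									//step count + 1
	E();
	if (token[lookahead] == '#')			//only # is left, so the string conforms to the grammar
		printf("\nAnalysis succeeded: legal string!");
	else
		printf("\nAnalysis failed: illegal string!");
	return 0;
}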

2 E()

/*****************************************************************************
 *Function name: E()
 *Return type: void
 *Parameters: void
 *Purpose: apply E->TE' and print the step
 *****************************************************************************/
void E(){
	print();					//print the current step
	printf("E->TE'\n");			//print the production
	stack[stack_top] = 'E';		//push the right-hand side right to left, replacing E on top
	stack[++stack_top] = '\'';	//'E' followed by '\'' represents E'
	stack[++stack_top] = 'T';
	step++;						//step count + 1
	T();
	E1();
}

3 E1 ()

/*****************************************************************************
 *Function name: E1()
 *Return type: void
 *Parameters: void
 *Purpose: apply E'->+TE' or E'->ε and print the step
 *****************************************************************************/
void E1(){
	if (token[lookahead] == '+'){		//the lookahead character matches '+'
		print();						//print the current step
		printf("E'->+TE'\n");			//print the production
		stack[--stack_top] = 'E';		//push the right-hand side right to left, reusing the E' slots
		stack[++stack_top] = '\'';
		stack[++stack_top] = 'T';
		step++;							//step count + 1
		lookahead++;					//a terminal was matched, so move to the next character
		T();
		E1();
	}
	else{
		print();						//print the current step
		printf("E'->ε\n");				//print the production
		stack[stack_top--] = '\0';		//pop
		stack[stack_top--] = '\0';		//E' is one non-terminal but occupies two characters; same for T'
		step++;							//step count + 1
	}
}

4 T()

/*****************************************************************************
 *Function name: T()
 *Return type: void
 *Parameters: void
 *Purpose: apply T->FT' and print the step
 *****************************************************************************/
void T(){
	print();					//print the current step
	printf("T->FT'\n");			//print the production
	stack[stack_top] = 'T';		//push the right-hand side right to left, replacing T on top
	stack[++stack_top] = '\'';	//'T' followed by '\'' represents T'
	stack[++stack_top] = 'F';
	step++;						//step count + 1
	F();
	T1();
}

5 T1 ()

/*****************************************************************************
 *Function name: T1()
 *Return type: void
 *Parameters: void
 *Purpose: apply T'->*FT' or T'->ε and print the step
 *****************************************************************************/
void T1(){
	if (token[lookahead] == '*'){		//the lookahead character matches '*'
		print();						//print the current step
		printf("T'->*FT'\n");			//print the production
		stack[--stack_top] = 'T';		//push the right-hand side right to left, reusing the T' slots
		stack[++stack_top] = '\'';
		stack[++stack_top] = 'F';
		step++;							//step count + 1
		lookahead++;					//a terminal was matched, so move to the next character
		F();
		T1();
	}
	else{
		print();						//print the current step
		printf("T'->ε\n");				//print the production
		stack[stack_top--] = '\0';		//pop both characters of T'
		stack[stack_top--] = '\0';
		step++;							//step count + 1
	}
}

6 F()

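/*****************************************************************************
 *Function name: F()
 *Return type: void
 *Parameters: void
 *Purpose: apply F->i or F->(E) and print the step; stop on a syntax error
 *****************************************************************************/
void F(){
	if (token[lookahead] == 'i'){			//the lookahead character matches 'i'
		print();							//print the current step
		printf("F->i\n");					//print the production
		stack[stack_top--] = '\0';			//pop F
		step++;								//step count + 1
		lookahead++;						//a terminal was matched, so move to the next character
	}
	else if (token[lookahead] == '('){		//the lookahead character matches '('
		print();							//print the current step
		printf("F->(E)\n");					//print the production
		stack[stack_top] = 'E';				//replace F on top with E
		step++;								//step count + 1
		lookahead++;						//a terminal was matched, so move to the next character
		E();
		if (token[lookahead] == ')'){		//the lookahead character matches ')'
			lookahead++;					//a terminal was matched, so move to the next character
		}
		else{
			printf("No matching ')'!\n");
			printf("\nAnalysis failed: illegal string!");
			exit(1);						//stop here; returning would let the analysis continue after the error (needs <stdlib.h>)
		}
	}
	else{
		printf("error\n");
		printf("\nAnalysis failed: illegal string!");
		exit(1);							//stop here; returning would let the analysis continue after the error (needs <stdlib.h>)
	}
}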

7 print()

/*****************************************************************************
 *Function name: print()
 *Return type: void
 *Parameters: void
 *Purpose: print the step number, the symbol stack and the remaining input
 *****************************************************************************/
void print(){
	int i;
	printf("%d\t", step);						//print the step number
	for (i = 0; i <= stack_top; i++)			//print the contents of the analysis stack
		printf("%c", stack[i]);
	if (stack_top < 7)							//keep the columns aligned
		printf("\t\t");
	else
		printf("\t");
	for (i = 0; i < lookahead; i++)				//characters already analyzed are shown as blanks to keep the column aligned
		printf(" ");
	for (i = lookahead; i < token_len; i++)		//print the remaining input
		printf("%c", token[i]);
	printf("\t");								//leave room for the production
}

Origin blog.csdn.net/qq_44714521/article/details/106972173