[Compiler theory] 2. parsing (syntax analysis)

There are three general types of parsers for grammars:

  • universal: for example the Cocke-Younger-Kasami algorithm and Earley's algorithm; these can parse any grammar, but are inefficient
  • top-down: work only for subclasses of grammars
  • bottom-up: work only for subclasses of grammars

Syntactic errors: a compiler is expected to assist the programmer in locating and tracking down errors. The strategies for error recovery are panic-mode recovery, phrase-level recovery, error productions, and global correction.

The error handler in a parser has the following goals:

  • Report the presence of errors clearly and accurately.
  • Recover from each error quickly enough to detect subsequent errors.
  • Add minimal overhead to the processing of correct programs.

### One, context-free grammar

A context-free grammar is composed of:

  • terminal symbols
  • nonterminals
  • productions
  • a start symbol

#### Derivation

Key terms: sentential form, sentence, left-sentential form, leftmost derivation, rightmost derivation.

Every construct that can be described by a regular expression can be described by a grammar, but not vice-versa.

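Immediate left recursion can be eliminated by the following technique: replace \(A \to A\alpha \mid \beta\), where \(\beta\) does not begin with \(A\), with \(A \to \beta A'\) and \(A' \to \alpha A' \mid \epsilon\).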

Algorithm 4.19, below, systematically eliminates left recursion from a grammar.
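A minimal sketch of systematic left-recursion elimination in its usual textbook form. The grammar representation (a dict from nonterminal to a list of tuples, with the empty tuple standing for \(\epsilon\)), the function names, and the `'` suffix for new nonterminals are assumptions made for this illustration. The sketch also assumes the input grammar has no cycles and that the primed names do not clash with existing nonterminals.

```python
def eliminate_immediate(nt, prods):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | epsilon."""
    recursive = [p[1:] for p in prods if p and p[0] == nt]
    others = [p for p in prods if not p or p[0] != nt]
    if not recursive:
        return {nt: list(prods)}
    new_nt = nt + "'"  # hypothetical naming scheme for the fresh nonterminal
    return {
        nt: [p + (new_nt,) for p in others],
        new_nt: [p + (new_nt,) for p in recursive] + [()],
    }


def eliminate_left_recursion(grammar):
    """The dict order of `grammar` fixes the ordering A1, ..., An."""
    order = list(grammar)
    result = {}
    for i, a_i in enumerate(order):
        prods = list(grammar[a_i])
        # Substitute every earlier nonterminal that appears in first position.
        for a_j in order[:i]:
            expanded = []
            for p in prods:
                if p and p[0] == a_j:
                    expanded += [d + p[1:] for d in result[a_j]]
                else:
                    expanded.append(p)
            prods = expanded
        # Then remove the immediate left recursion that remains for A_i.
        result.update(eliminate_immediate(a_i, prods))
    return result


EXPR_GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}

for head, bodies in eliminate_left_recursion(EXPR_GRAMMAR).items():
    print(head, "->", " | ".join(" ".join(b) or "ε" for b in bodies))
```

Applied to the usual expression grammar, this yields the right-recursive form with the new nonterminals `E'` and `T'`.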

### Two, top-down parsing

There are three top-down parsing methods: recursive-descent parsing, predictive parsing, and nonrecursive predictive parsing. Predictive parsing is a special case of recursive-descent parsing, one that needs no backtracking.
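As an illustration, here is a small recursive-descent predictive parser, one procedure per nonterminal, choosing a production from a single lookahead token. The grammar (E → T E', E' → + T E' | ε, T → F T', T' → * F T' | ε, F → ( E ) | id), the class name `PredictiveParser`, and the helper `accepts` are assumptions chosen for this sketch.

```python
class PredictiveParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" marks the end of input
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, terminal):
        if self.lookahead != terminal:
            raise SyntaxError(f"expected {terminal!r}, found {self.lookahead!r}")
        self.pos += 1

    def parse(self):
        self.E()
        self.match("$")

    def E(self):            # E -> T E'
        self.T()
        self.E_prime()

    def E_prime(self):      # E' -> + T E' | epsilon
        if self.lookahead == "+":
            self.match("+")
            self.T()
            self.E_prime()
        # any other lookahead selects the epsilon production

    def T(self):            # T -> F T'
        self.F()
        self.T_prime()

    def T_prime(self):      # T' -> * F T' | epsilon
        if self.lookahead == "*":
            self.match("*")
            self.F()
            self.T_prime()

    def F(self):            # F -> ( E ) | id
        if self.lookahead == "(":
            self.match("(")
            self.E()
            self.match(")")
        else:
            self.match("id")


def accepts(tokens):
    """Return True if the token list is a sentence of the grammar."""
    try:
        PredictiveParser(tokens).parse()
        return True
    except SyntaxError:
        return False


print(accepts(["id", "+", "id", "*", "id"]))
print(accepts(["id", "+"]))
```

No procedure ever backtracks: each decision is made once, from the current lookahead alone.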

The class of grammars that predictive parsing can handle is called LL(1) grammars. A left-recursive grammar or an ambiguous grammar is never an LL(1) grammar.

A grammar \(G\) is an LL(1) grammar if and only if the following conditions are satisfied: whenever \(A \to \alpha \mid \beta\) are two different productions, then

1) \(FIRST(\alpha) \cap FIRST(\beta) = \emptyset\);
2) if \(\epsilon \in FIRST(\beta)\), then \(FIRST(\alpha) \cap FOLLOW(A) = \emptyset\); similarly, if \(\epsilon \in FIRST(\alpha)\), then \(FIRST(\beta) \cap FOLLOW(A) = \emptyset\).
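The two conditions can be checked mechanically once FIRST and FOLLOW are known. The following is a sketch under stated assumptions: the grammar is a dict from nonterminal to a list of tuples (empty tuple for \(\epsilon\)), any symbol that is not a key of the dict is treated as a terminal, and all function names are made up for this example.

```python
EPS, END = "ε", "$"


def first_of_string(symbols, first):
    """FIRST of a sequence of grammar symbols."""
    out = set()
    for s in symbols:
        f = first.get(s, {s})  # for a terminal a, FIRST(a) = {a}
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)  # every symbol can derive epsilon (or the sequence is empty)
    return out


def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate until no FIRST set grows any more
        changed = False
        for nt, prods in grammar.items():
            for p in prods:
                new = first_of_string(p, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first


def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, prods in grammar.items():
            for p in prods:
                for i, s in enumerate(p):
                    if s not in grammar:
                        continue  # FOLLOW is only defined for nonterminals
                    rest = first_of_string(p[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:
                        new |= follow[nt]
                    if not new <= follow[s]:
                        follow[s] |= new
                        changed = True
    return follow


def is_ll1(grammar, start):
    first = compute_first(grammar)
    follow = compute_follow(grammar, first, start)
    for nt, prods in grammar.items():
        for i in range(len(prods)):
            for j in range(i + 1, len(prods)):
                fa = first_of_string(prods[i], first)
                fb = first_of_string(prods[j], first)
                if fa & fb:                          # condition 1
                    return False
                if EPS in fb and fa & follow[nt]:    # condition 2
                    return False
                if EPS in fa and fb & follow[nt]:    # condition 2, symmetric case
                    return False
    return True


LL1_GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
LEFT_RECURSIVE = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
print(is_ll1(LL1_GRAMMAR, "E"), is_ll1(LEFT_RECURSIVE, "E"))
```

The left-recursive expression grammar fails condition 1, because both alternatives of `E` can begin with `(` or `id`.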

Construction algorithm for the predictive parsing table: the input is a grammar \(G\) and the output is the parsing table \(M\). For each production \(A \to \alpha\) of \(G\), perform the following two steps:

  1. For each terminal \(a \in FIRST(\alpha)\), add \(A \to \alpha\) to \(M[A, a]\).
  2. If \(\epsilon \in FIRST(\alpha)\), then for each \(b \in FOLLOW(A)\), add \(A \to \alpha\) to \(M[A, b]\); here \(b\) may also be the end marker \(\$\).

If, after these steps, there is an entry \(M[A, a]\) that is still empty, set \(M[A, a] = error\).
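The two steps can be sketched as follows, together with a table-driven (nonrecursive) predictive parser that consumes the resulting table. The example grammar, the hand-written `FIRST` and `FOLLOW` sets for it, and all names are assumptions for illustration. A missing key in the table plays the role of an error entry.

```python
EPS, END = "ε", "$"

GRAMMAR = {
    "E": [("T", "E'")],
    "E'": [("+", "T", "E'"), ()],
    "T": [("F", "T'")],
    "T'": [("*", "F", "T'"), ()],
    "F": [("(", "E", ")"), ("id",)],
}
# FIRST and FOLLOW sets for the nonterminals above, written out by hand.
FIRST = {"E": {"(", "id"}, "E'": {"+", EPS}, "T": {"(", "id"},
         "T'": {"*", EPS}, "F": {"(", "id"}}
FOLLOW = {"E": {")", END}, "E'": {")", END}, "T": {"+", ")", END},
          "T'": {"+", ")", END}, "F": {"+", "*", ")", END}}


def first_of_string(symbols):
    """FIRST of a production body; a terminal a has FIRST(a) = {a}."""
    out = set()
    for s in symbols:
        f = FIRST.get(s, {s})
        out |= f - {EPS}
        if EPS not in f:
            return out
    out.add(EPS)
    return out


def build_table(grammar):
    table = {}
    for a, prods in grammar.items():
        for alpha in prods:
            f = first_of_string(alpha)
            targets = f - {EPS}          # step 1: terminals in FIRST(alpha)
            if EPS in f:
                targets |= FOLLOW[a]     # step 2: symbols in FOLLOW(A)
            for t in targets:
                table.setdefault((a, t), []).append(alpha)
    return table


def parse(tokens, table, start="E"):
    """Nonrecursive predictive parsing with an explicit stack."""
    stack = [END, start]
    tokens = list(tokens) + [END]
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in GRAMMAR:               # nonterminal: consult M[top, look]
            entry = table.get((top, look))
            if not entry:
                return False             # empty entry means error
            stack.extend(reversed(entry[0]))
        elif top == look:                # terminal (or "$") matches the input
            pos += 1
        else:
            return False
    return pos == len(tokens)


TABLE = build_table(GRAMMAR)
print(all(len(v) == 1 for v in TABLE.values()))
print(parse(["id", "+", "id", "*", "id"], TABLE))
```

Storing a list per entry makes conflicts visible: for an LL(1) grammar every list has exactly one element.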

Note that for an LL(1) grammar, each entry of table \(M\) contains at most one production. Consider \(A \to \alpha \mid \beta\); we have to show that the two productions cannot appear in the same entry. By condition 1 of the LL(1) definition, the two productions cannot both reach the same entry through step 1. Also by condition 1, \(FIRST(\alpha)\) and \(FIRST(\beta)\) cannot both contain \(\epsilon\), so at most one of the two productions can be added to \(M\) through step 2. Finally, condition 2 ensures that the production added to \(M\) through step 2 cannot appear in the same entry as the other production.

### Three, bottom-up parsing

Here are three kinds of bottom-up method: SLR, canonical LR (LR for short) and LALR. They are all based on the shift-reduce method.

We first look at what the three kinds of LR parser have in common, and then at the differences between them.

#### 1. Construction process of an LR-type parser

\(item \to DFA \to parsing\ table\)

Each \(state\) of the \(DFA\) corresponds to a \(set\) made up of several \(item\)s.
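The step from items to DFA states can be sketched with the usual closure and goto operations on LR(0) items. An item is represented here as a tuple `(head, body, dot)`; this representation, the augmented start symbol name `S'`, and the function names are assumptions of this sketch.

```python
def closure(items, grammar):
    """Add every item B -> . gamma for each nonterminal B right after a dot."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                item = (body[dot], prod, 0)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """Move the dot over `symbol` and take the closure of the result."""
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def canonical_collection(grammar, start):
    """Return the DFA states (sets of items) and the transitions between them."""
    aug = dict(grammar)
    aug["S'"] = [(start,)]  # augmented start production S' -> start
    states = [closure({("S'", (start,), 0)}, aug)]
    transitions = {}
    i = 0
    while i < len(states):
        state = states[i]
        symbols = {b[d] for _, b, d in state if d < len(b)}
        for x in sorted(symbols):
            target = goto(state, x, aug)
            if target not in states:
                states.append(target)
            transitions[(i, x)] = states.index(target)
        i += 1
    return states, transitions


EXPR_GRAMMAR = {
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}
STATES, TRANSITIONS = canonical_collection(EXPR_GRAMMAR, "E")
print(len(STATES), "states,", len(STATES[0]), "items in the start state")
```

Each element of `STATES` is one DFA state, that is, one set of items; `TRANSITIONS` maps a pair (state index, grammar symbol) to the index of the next state.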

#### 2. Structure and working principle of an LR-type parser

An LR-type parser has a stack that stores states, a buffer that stores the input, and a parsing table that is used to make decisions.

1) At each step, the parser uses the state on top of the stack and the next terminal in the input buffer to query the action region of the parsing table, which determines whether the next move is shift, reduce, accept or error.
2) If the move is reduce, the parser pops from the stack as many states as there are symbols in the handle, then uses the new top state and the nonterminal that replaces the handle to query the goto region of the parsing table. The result is either the next state, which is pushed onto the stack, or error.

ps: every transition into a state j from any other state must be on the same grammar symbol X.

pss: every LR-type grammar is unambiguous.
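The driver loop described in the two steps above can be sketched as follows. The tiny grammar (1: S → ( S ), 2: S → x), the hand-built SLR tables `ACTION` and `GOTO`, and the function name `lr_parse` are assumptions made for this illustration, not tables taken from the original article.

```python
# Production number -> (head, length of body)
PRODUCTIONS = {1: ("S", 3), 2: ("S", 1)}

# Action region: (state, terminal) -> move. Missing entries mean error.
ACTION = {
    (0, "("): ("shift", 2), (0, "x"): ("shift", 3),
    (1, "$"): ("accept",),
    (2, "("): ("shift", 2), (2, "x"): ("shift", 3),
    (3, ")"): ("reduce", 2), (3, "$"): ("reduce", 2),
    (4, ")"): ("shift", 5),
    (5, ")"): ("reduce", 1), (5, "$"): ("reduce", 1),
}
# Goto region: (state, nonterminal) -> next state.
GOTO = {(0, "S"): 1, (2, "S"): 4}


def lr_parse(tokens):
    """Return the list of productions used for reductions, or None on error."""
    stack = [0]                       # the stack holds states only
    tokens = list(tokens) + ["$"]
    pos = 0
    reductions = []
    while True:
        act = ACTION.get((stack[-1], tokens[pos]))
        if act is None:
            return None               # error entry
        if act[0] == "shift":
            stack.append(act[1])
            pos += 1
        elif act[0] == "reduce":
            head, length = PRODUCTIONS[act[1]]
            del stack[len(stack) - length:]   # pop one state per handle symbol
            nxt = GOTO.get((stack[-1], head))
            if nxt is None:
                return None
            stack.append(nxt)
            reductions.append(act[1])
        else:                         # accept
            return reductions


print(lr_parse(["(", "(", "x", ")", ")"]))
print(lr_parse(["(", "x"]))
```

The returned list of production numbers, read in reverse, is a rightmost derivation of the input.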

#### 3. The difference between SLR, LR and LALR

The three parsers use different kinds of item:

| item | parser | grammar |
| --- | --- | --- |
| LR(0) | SLR | SLR |
| LR(1) | LR | LR |
| LALR(1) | LALR | LALR |


Origin www.cnblogs.com/-zyq/p/12355302.html