A review of grammars and languages: understanding and summary.
A language is described by its grammar, and a complete definition of a language covers both syntax and semantics. The syntax of a language is a set of rules from which well-formed programs can be built and generated. The tool most widely used today for describing the syntax of a programming language is the context-free grammar. For example, if A := B + C is a legal assignment statement, then A := B + is not.
What is semantics? Semantic analysis corresponds to semantics in the linguistics of human languages: it analyzes the meaning that a sentence expresses. Intermediate code or object code is then generated from that meaning.
A small grammar for sentences, written in EBNF (extended BNF):
<sentence> ::= <subject> <predicate>
<subject> ::= <pronoun> | <noun>
<pronoun> ::= I | you | he
<noun> ::= Wang | student | worker | English
<predicate> ::= <verb> <direct object>
<verb> ::= is | learns
<direct object> ::= <pronoun> | <noun>
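The sentence grammar above is small enough to enumerate mechanically. The following Python sketch assumes that <subject> is either a pronoun or a noun, and it uses illustrative English words for the terminals. The names GRAMMAR, expand, SENTENCES, and is_sentence are hypothetical and chosen only for this example.

```python
# Productions of the toy sentence grammar: each non-terminal maps to its alternatives.
# Assumption: the terminal words are illustrative English renderings.
GRAMMAR = {
    "sentence": [["subject", "predicate"]],
    "subject": [["pronoun"], ["noun"]],
    "pronoun": [["I"], ["you"], ["he"]],
    "noun": [["Wang"], ["student"], ["worker"], ["English"]],
    "predicate": [["verb", "direct_object"]],
    "verb": [["is"], ["learns"]],
    "direct_object": [["pronoun"], ["noun"]],
}


def expand(symbol):
    """Yield every list of terminal words derivable from `symbol`."""
    if symbol not in GRAMMAR:  # a terminal derives only itself
        yield [symbol]
        return
    for alternative in GRAMMAR[symbol]:
        # Combine the expansions of the symbols in this alternative, left to right.
        partials = [[]]
        for part in alternative:
            partials = [p + tail for p in partials for tail in expand(part)]
        yield from partials


# The whole (finite) language generated from <sentence>.
SENTENCES = {" ".join(words) for words in expand("sentence")}


def is_sentence(text):
    """True if `text` is one of the sentences the grammar generates."""
    return " ".join(text.split()) in SENTENCES
```

Because this grammar has no recursion, its language is finite and can be listed in full. A recursive grammar would need a parser instead of a lookup table.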
(1) Formal definition of a grammar:
- G = (VT, VN, P, S)
- VT: the set of terminal symbols. Terminals are the basic symbols of the language that the grammar defines; they are sometimes called tokens.
- VN: the set of non-terminal symbols. Non-terminals denote syntactic components and are sometimes called "syntactic variables"; other syntactic components can be introduced through them.
- P: the set of productions
- S: the start symbol
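The four components can be written down directly as a data structure. The sketch below is one possible encoding, not a standard one. It assumes context-free productions, so every left side is a single non-terminal. The Grammar class, its problems method, and the balanced-parentheses example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    terminals: frozenset     # VT
    nonterminals: frozenset  # VN
    productions: tuple       # P, as (left, right) pairs; right is a tuple of symbols
    start: str               # S

    def problems(self):
        """Return the violated well-formedness conditions (empty list if none)."""
        found = []
        if self.terminals & self.nonterminals:
            found.append("VT and VN must be disjoint")
        if self.start not in self.nonterminals:
            found.append("S must belong to VN")
        vocabulary = self.terminals | self.nonterminals
        for left, right in self.productions:
            # Assumption: context-free, so the left side is one non-terminal.
            if left not in self.nonterminals:
                found.append(f"left side {left!r} is not in VN")
            for symbol in right:
                if symbol not in vocabulary:
                    found.append(f"symbol {symbol!r} is in neither VT nor VN")
        return found


# Hypothetical example: balanced parentheses, S -> ( S ) S | ε
G = Grammar(
    terminals=frozenset({"(", ")"}),
    nonterminals=frozenset({"S"}),
    productions=(("S", ("(", "S", ")", "S")), ("S", ())),
    start="S",
)
```

Calling G.problems() on a well-formed grammar returns an empty list. The empty tuple on the right side of the second production stands for ε.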
(2) Operations on strings of symbols:
- Concatenation of strings: εx = xε = x;
- Product of sets: AB = {xy | x∈A, y∈B}; {ε}A = A{ε} = A;
- Powers of a string: x = abc, x^2 = abcabc;
- Powers of a set: A^0 = {ε}, A^n = A^(n-1) A;
- Positive closure A+ and closure A*: A* = {ε} ∪ A+
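These operations map closely onto Python strings and sets. The sketch below represents ε by the empty string and uses the usual convention A^0 = {ε}. Closures are infinite, so closure_up_to only builds the finite part up to a chosen power. All function names are hypothetical.

```python
EPSILON = ""  # the empty string ε


def product(a, b):
    """Product of two sets of strings: AB = {xy | x in A, y in B}."""
    return {x + y for x in a for y in b}


def power(a, n):
    """A^n, built by repeated product starting from A^0 = {ε}."""
    result = {EPSILON}
    for _ in range(n):
        result = product(result, a)
    return result


def closure_up_to(a, max_power, positive=False):
    """Finite part of A* (or of A+ when positive=True): union of A^k for k <= max_power."""
    first = 1 if positive else 0
    result = set()
    for k in range(first, max_power + 1):
        result |= power(a, k)
    return result


x = "abc"
A = {"a", "b"}
```

With these definitions, x * 2 is the string power x^2, and closure_up_to(A, 2) differs from closure_up_to(A, 2, positive=True) only by ε, which mirrors A* = {ε} ∪ A+.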
(3) Classification of grammars:
(4) Definition of a language and operations on languages
Language: any set of strings over a given alphabet. The empty set ∅ and the set {ε} both count as languages under this definition.
Examples of operations on languages, where L is the set of letters and D is the set of digits:
- L∪D: the set of all letters and digits
- LD: the set of all strings consisting of a letter followed by a digit
- L^4: the set of all four-letter strings
- L*: the set of all strings of letters, including ε
- L(L∪D)*: the set of all strings of letters and digits that begin with a letter
- D+: the set of all strings of one or more digits
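Each language in the list above is regular, so membership can be checked with a regular expression. The sketch below assumes that L is the ASCII letters and D is the decimal digits. In the dictionary keys, "|" stands for ∪. The names LANGUAGES and member are hypothetical.

```python
import re

# Assumption: L = ASCII letters, D = decimal digits.
LETTER = "[A-Za-z]"
DIGIT = "[0-9]"

# Each language from the list, written as a regular expression ("|" stands for ∪).
LANGUAGES = {
    "L|D": f"(?:{LETTER}|{DIGIT})",
    "LD": f"{LETTER}{DIGIT}",
    "L^4": f"{LETTER}{{4}}",
    "L*": f"{LETTER}*",
    "L(L|D)*": f"{LETTER}(?:{LETTER}|{DIGIT})*",
    "D+": f"{DIGIT}+",
}


def member(name, text):
    """True if `text` belongs to the language called `name`."""
    return re.fullmatch(LANGUAGES[name], text) is not None
```

The language L(L∪D)* has the same shape as the <identifier> rule of PL/0, and D+ has the same shape as <unsigned integer>.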
(5) Sentential forms, sentences, and languages:
- Sentential form: S =*> x with x ∈ (VN∪VT)*, where S =*> x is a generalized derivation (zero or more steps).
- Sentence: S =*> x with x ∈ VT*, where S =*> x is a generalized derivation; x must belong to the closure of the terminal set (and may be ε).
- Language: L(G[S]) = {x | S =+> x and x ∈ VT*}, where S =+> x means that x is derived using at least one rule.
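The difference between sentential forms and sentences shows up when derivations are enumerated. The sketch below uses a hypothetical grammar G[S] with S -> aSb | ab and follows leftmost derivations for a bounded number of steps. For a grammar with several non-terminals this would still reach every sentence, but only the leftmost sentential forms. The names PRODUCTIONS, START, and derive are assumptions of this example.

```python
from collections import deque

# Hypothetical grammar G[S]: S -> aSb | ab, with VN = {S} and VT = {a, b}.
PRODUCTIONS = {"S": ["aSb", "ab"]}
START = "S"


def derive(max_steps):
    """Return (sentential_forms, sentences) reachable from START in at most max_steps steps."""
    forms = {START}  # S =*> S holds with zero steps
    sentences = set()
    queue = deque([(START, 0)])
    while queue:
        form, steps = queue.popleft()
        # Position of the leftmost non-terminal, or None if only terminals remain.
        index = next((i for i, ch in enumerate(form) if ch in PRODUCTIONS), None)
        if index is None:
            sentences.add(form)  # x ∈ VT*: a sentence
            continue
        if steps == max_steps:
            continue
        for right in PRODUCTIONS[form[index]]:
            new_form = form[:index] + right + form[index + 1:]
            if new_form not in forms:
                forms.add(new_form)
                queue.append((new_form, steps + 1))
    return forms, sentences
```

Every sentence is also a sentential form, but forms such as aSb that still contain a non-terminal are not sentences. The start symbol S is itself a sentential form, reached in zero steps.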
(6) Finding phrases, simple phrases, and the handle from a syntax tree:
- Phrase: the string formed by the leaf nodes of a subtree.
- Simple subtree: a subtree with only one level of branching.
- Direct phrase (simple phrase): the string formed by the leaf nodes of a simple subtree.
- Handle: take the leftmost subtree that has only two generations (a parent and its children); its leaf nodes, read from left to right, form the handle of the sentential form.
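These definitions can be applied mechanically once a syntax tree is available. The sketch below assumes a hypothetical expression grammar (E -> E + T | T, T -> T * F | F) and the syntax tree of the sentential form T + T * F. A node is a (symbol, children) pair, and all function names are chosen for this example only.

```python
# A tree node is (symbol, children); a leaf has an empty children list.
def leaves(node):
    """Leaf symbols of the subtree rooted at `node`, from left to right."""
    symbol, children = node
    if not children:
        return [symbol]
    return [leaf for child in children for leaf in leaves(child)]


def internal_nodes(node):
    """Yield every node that has children, in left-to-right depth-first order."""
    _, children = node
    if children:
        yield node
        for child in children:
            yield from internal_nodes(child)


def is_simple(node):
    """A simple subtree has one level of branching: every child is a leaf."""
    return all(not grandchildren for _, grandchildren in node[1])


def phrases(tree):
    return [" ".join(leaves(n)) for n in internal_nodes(tree)]


def simple_phrases(tree):
    return [" ".join(leaves(n)) for n in internal_nodes(tree) if is_simple(n)]


def handle(tree):
    """The leftmost simple phrase."""
    return simple_phrases(tree)[0]


# Syntax tree of the sentential form T + T * F (hypothetical example).
TREE = ("E", [
    ("E", [("T", [])]),
    ("+", []),
    ("T", [("T", []), ("*", []), ("F", [])]),
])
```

Simple subtrees never overlap, so the first one met in a left-to-right traversal is the leftmost, and its leaves give the handle.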
(7) Ambiguity of a grammar
If some sentence of a grammar G has more than one syntax tree, that sentence is said to be ambiguous. If a grammar contains an ambiguous sentence, the grammar itself is said to be ambiguous.
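Ambiguity can be demonstrated by counting syntax trees. The sketch below uses a hypothetical grammar that is a common textbook example of ambiguity, E -> E + E | E * E | i, and counts the distinct trees for a given string of tokens. The function name count_trees is an assumption of this example.

```python
from functools import lru_cache


# Hypothetical ambiguous grammar: E -> E + E | E * E | i
def count_trees(tokens):
    """Number of distinct syntax trees that derive `tokens` from E."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of trees deriving tokens[lo:hi] from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "i" else 0
        total = 0
        # Try each operator position as the top-level production E -> E op E.
        for mid in range(lo + 1, hi - 1):
            if tokens[mid] in ("+", "*"):
                total += count(lo, mid) * count(mid + 1, hi)
        return total

    return count(0, len(tokens))
```

For example, count_trees("i+i*i") is 2: one tree has + at the root and the other has *. The sentence is therefore ambiguous, and so is the grammar.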
Try writing the grammar of the PL/0 language.
Explanation of the EBNF notation:
- '< >': text enclosed in left and right angle brackets denotes a syntactic construct, also called a syntactic unit or a non-terminal.
- '::=': the part to the left of this symbol is defined by the part to its right; read "is defined as".
- '|': means "or"; the left-hand part is defined by any one of the several right-hand alternatives.
- '{ }': the syntactic component inside braces may be repeated. With no bounds given, it may be repeated any number of times, including zero; when upper and lower bounds are given, they limit the number of repetitions.
- '[ ]': the component inside square brackets is optional.
- '( )': the components inside parentheses are grouped and take precedence.
- The symbols above are called metasymbols. When one of them is needed as a grammar symbol in a definition, it must be enclosed in quotation marks ' '.
The PL/0 grammar in EBNF:
- <program> ::= <block> .
- <block> ::= [<constant declaration part>] [<variable declaration part>] [<procedure declaration part>] <statement>
- <constant declaration part> ::= CONST <constant definition> {, <constant definition>} ;
- <constant definition> ::= <identifier> = <unsigned integer>
- <unsigned integer> ::= <digit> {<digit>}
- <variable declaration part> ::= VAR <identifier> {, <identifier>} ;
- <identifier> ::= <letter> {<letter> | <digit>}
- <procedure declaration part> ::= <procedure header> <block> {; <procedure declaration part>} ;
- <procedure header> ::= PROCEDURE <identifier> ;
- <statement> ::= <assignment statement> | <conditional statement> | <while statement> | <procedure call statement> | <read statement> | <write statement> | <compound statement> | <empty>
- <assignment statement> ::= <identifier> := <expression>
- <compound statement> ::= BEGIN <statement> {; <statement>} END
- <condition> ::= <expression> <relational operator> <expression> | ODD <expression>
- <expression> ::= [+ | -] <term> {<addition operator> <term>}
- <term> ::= <factor> {<multiplication operator> <factor>}
- <factor> ::= <identifier> | <unsigned integer> | '(' <expression> ')'
- <addition operator> ::= + | -
- <multiplication operator> ::= * | /
- <relational operator> ::= = | # | < | <= | > | >=
- <conditional statement> ::= IF <condition> THEN <statement>
- <procedure call statement> ::= CALL <identifier>
- <while statement> ::= WHILE <condition> DO <statement>
- <read statement> ::= READ '(' <identifier> {, <identifier>} ')'
- <write statement> ::= WRITE '(' <expression> {, <expression>} ')'
- <letter> ::= a | b | ... | X | Y | Z
- <digit> ::= 0 | 1 | 2 | ... | 8 | 9
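EBNF rules of this kind translate almost line by line into a recursive-descent recognizer: each non-terminal becomes a function, '[ ]' becomes an if, and '{ }' becomes a while loop. The sketch below covers only the <expression>, <term>, and <factor> rules. It is not a full PL/0 parser and does not treat keywords specially. The tokenizer, the class ExpressionRecognizer, and the function is_expression are hypothetical names for this example.

```python
import re

# One token per match: an identifier, an unsigned integer, or any other single character.
TOKEN = re.compile(r"\s*(?:([A-Za-z][A-Za-z0-9]*)|([0-9]+)|(.))")


def tokenize(text):
    """Split text into (kind, value) pairs; kind is 'id', 'num', or the symbol itself."""
    tokens = []
    for ident, number, symbol in TOKEN.findall(text.rstrip()):
        if ident:
            tokens.append(("id", ident))
        elif number:
            tokens.append(("num", number))
        else:
            tokens.append((symbol, symbol))
    return tokens


class ExpressionRecognizer:
    def __init__(self, text):
        self.tokens = tokenize(text) + [("end", "")]  # sentinel marks the end of input
        self.pos = 0

    def kind(self):
        return self.tokens[self.pos][0]

    def advance(self):
        self.pos += 1

    def expression(self):
        # <expression> ::= [+ | -] <term> {<addition operator> <term>}
        if self.kind() in ("+", "-"):
            self.advance()
        self.term()
        while self.kind() in ("+", "-"):
            self.advance()
            self.term()

    def term(self):
        # <term> ::= <factor> {<multiplication operator> <factor>}
        self.factor()
        while self.kind() in ("*", "/"):
            self.advance()
            self.factor()

    def factor(self):
        # <factor> ::= <identifier> | <unsigned integer> | '(' <expression> ')'
        if self.kind() in ("id", "num"):
            self.advance()
        elif self.kind() == "(":
            self.advance()
            self.expression()
            if self.kind() != ")":
                raise SyntaxError("expected ')'")
            self.advance()
        else:
            raise SyntaxError(f"unexpected token {self.tokens[self.pos][1]!r}")


def is_expression(text):
    """True if the whole of `text` is a PL/0 <expression>."""
    parser = ExpressionRecognizer(text)
    try:
        parser.expression()
    except SyntaxError:
        return False
    return parser.kind() == "end"
```

The grammar allows one optional leading sign only, so is_expression("--a") returns False. The remaining statement rules could be added in the same style, one function per non-terminal.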