Compiler technology outline (Chapter 3: Lexical analysis)
1. Overview of lexical analysis
2. The relationship between the lexical analyzer and the parser
(1) Lexical analysis as a separate pass
(2) Lexical analysis as a subroutine called by the parser
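The two arrangements can be contrasted with a minimal sketch. Everything here is hypothetical and chosen only for illustration: the names `next_tokens`, `lex_pass`, and `parse_sum`, and a toy input language in which tokens are separated by whitespace.

```python
def next_tokens(source):
    """Lexer as a subroutine: hands over one token each time the parser asks."""
    for word in source.split():
        yield word


def lex_pass(source):
    """Lexer as a separate pass: builds the whole token sequence up front."""
    return list(next_tokens(source))


def parse_sum(tokens):
    """Toy parser (hypothetical): evaluates integers separated by '+'."""
    it = iter(tokens)
    first = next(it, None)
    if first is None:
        raise SyntaxError("empty input")
    total = int(first)
    for op in it:
        if op != "+":
            raise SyntaxError(f"expected '+', got {op!r}")
        operand = next(it, None)
        if operand is None:
            raise SyntaxError("missing operand after '+'")
        total += int(operand)
    return total


# Separate pass: all tokens exist before parsing starts.
result_pass = parse_sum(lex_pass("1 + 2 + 3"))

# Subroutine: the parser pulls tokens from the lexer on demand.
result_subroutine = parse_sum(next_tokens("1 + 2 + 3"))
```

Both calls give the same result. The difference is only in when the tokens are produced: the separate pass stores the full token sequence, while the subroutine form produces each token at the moment the parser requests it.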
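3. The output of the lexical analyzer
A pair (two-tuple): (token class <integer code>, token attribute)
The set of token classes depends on the programming language being analyzed
Keywords: "one code per word" (each keyword has its own class code)
Punctuation: "one code per symbol"
Identifiers, constants, strings, and so on: "one code per class" (all tokens of a class share one code and are distinguished by their attribute)
Other irrelevant characters (comments, whitespace, etc.) are discarded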
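The output format described above can be sketched as a small tokenizer that returns (class code, attribute) pairs. The integer codes, the keyword and punctuation sets, and the `#` comment syntax are all assumptions made for illustration. They do not come from any particular language.

```python
import re

# Hypothetical integer class codes, chosen only for illustration.
KEYWORDS = {"if": 1, "else": 2, "while": 3}            # one code per keyword
PUNCTUATION = {"+": 10, "-": 11, "=": 12,
               "(": 13, ")": 14, ";": 15}              # one code per symbol
IDENT, NUMBER = 20, 21                                 # one code per class

TOKEN_RE = re.compile(r"""
    (?P<skip>\s+|\#[^\n]*)      # whitespace and line comments are discarded
  | (?P<word>[A-Za-z_]\w*)      # keyword or identifier
  | (?P<number>\d+)             # integer constant
  | (?P<punct>[-+=();])         # single-character punctuation
""", re.VERBOSE)


def tokenize(source):
    """Return a list of (class_code, attribute) pairs for `source`."""
    tokens = []
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise ValueError(f"unexpected character {source[pos]!r} at {pos}")
        pos = m.end()
        kind, text = m.lastgroup, m.group()
        if kind == "skip":
            continue                                   # comments, whitespace
        if kind == "word":
            if text in KEYWORDS:
                tokens.append((KEYWORDS[text], None))  # code alone identifies it
            else:
                tokens.append((IDENT, text))           # attribute holds the name
        elif kind == "number":
            tokens.append((NUMBER, int(text)))         # attribute holds the value
        else:
            tokens.append((PUNCTUATION[text], None))
    return tokens


pairs = tokenize("if x = 42 ; # note")
```

Keywords and punctuation need no attribute because the class code already identifies the token. Identifiers and constants share a class code, so the attribute carries the actual name or value.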
4. Implementing the lexical analyzer
1) Regular expression: a tool for describing sets of strings
2) Alphabet: a finite set of symbols
The set {0, 1} is a binary alphabet
3) A "string" or "sentence" over an alphabet: a finite sequence of symbols from that alphabet
The length of a string s, denoted |s|, is the number of symbols appearing in s
The empty string is the string of length 0, denoted ε
4) Language: any set of strings over a given alphabet
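These definitions map directly onto ordinary Python values. The sketch below assumes that symbols are single characters and strings are Python `str` objects. The helper names and the example language are hypothetical.

```python
from itertools import product

ALPHABET = {"0", "1"}   # the binary alphabet
EPSILON = ""            # the empty string, of length 0


def is_string_over(s, alphabet):
    """True if every symbol of s belongs to the alphabet."""
    return all(symbol in alphabet for symbol in s)


def strings_of_length(alphabet, n):
    """All strings of length n over the alphabet, in sorted order."""
    return ["".join(p) for p in product(sorted(alphabet), repeat=n)]


# A language is a set of strings over the alphabet.
# This finite example is chosen arbitrarily.
LANGUAGE = {EPSILON, "00", "01", "10", "11"}

length_of_s = len("0110")   # |s| for s = 0110
```

Note that the empty string is a string over every alphabet, and that the only string of length 0 is ε itself.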
5) Recursive definition of regular expressions
R* denotes zero or more repetitions of R; R+ denotes one or more repetitions of R
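The difference between the two operators can be checked with Python's standard `re` module, which uses the same `*` and `+` notation. The helper `matches` is a hypothetical name introduced for this sketch. It tests whether the whole string belongs to the language of the pattern.

```python
import re


def matches(pattern, text):
    """True if the entire text is matched by the pattern."""
    return re.fullmatch(pattern, text) is not None


star_accepts_empty = matches(r"(?:ab)*", "")     # zero repetitions allowed
plus_accepts_empty = matches(r"(?:ab)+", "")     # at least one required

# R+ describes the same strings as R followed by R*.
samples = ["", "ab", "abab", "aba", "ba"]
same_language = all(
    matches(r"(?:ab)+", s) == matches(r"(?:ab)(?:ab)*", s) for s in samples
)
```

Here `star_accepts_empty` is true and `plus_accepts_empty` is false, because only R* includes the empty string (unless R itself can match it).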