1. Main task
Scan the source program from left to right. After preprocessing, recognize each lexically correct word according to the lexical rules, convert it into the corresponding two-tuple (class code, internal code), and pass it to the syntax analyzer.
2. Preprocessing
- Remove comments, spaces, tabs, and line breaks
- Append a special character at the end of each line to mark the end of the statement
- Separate statement labels (for example, the label referenced by a GOTO statement) from the statement itself
- Output a source listing for review
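The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not a prescribed implementation: the `//` comment syntax, the `$` end-of-statement marker, the rule that a label is a leading integer, and the name `preprocess` are all assumptions made for the example.

```python
import re

END_MARK = "$"  # hypothetical end-of-statement marker (assumption)

def preprocess(source: str):
    """Return a list of (label, text) pairs, one per non-empty source line."""
    records = []
    for line in source.splitlines():
        line = line.split("//", 1)[0]             # drop a trailing comment (assumed syntax)
        line = re.sub(r"\s+", " ", line).strip()  # collapse runs of spaces and tabs
        if not line:
            continue                              # skip lines that are now empty
        m = re.match(r"(\d+) (.*)", line)         # leading numeric label, if any
        label, text = (m.group(1), m.group(2)) if m else (None, line)
        records.append((label, text + END_MARK))  # mark the end of the statement
    return records

records = preprocess("10  X = 1   // init\n\n\tGOTO 10\n")
# records == [("10", "X = 1$"), (None, "GOTO 10$")]
```

Printing `records` line by line would give a simple source listing for review.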
3. Lookahead
For languages whose keywords are not reserved, the scanner needs to look ahead in the input before it can decide what a word is
Note: most high-level languages do not require lookahead
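A classic case of looking ahead comes from fixed-form FORTRAN, where keywords are not reserved and blanks are not significant: `DO 10 I = 1, 5` starts a loop, while `DO 10 I = 1.5` assigns to a variable named `DO10I`. The sketch below shows the idea only; the function name `classify_do` is hypothetical, and string literals and other details of the real language are ignored.

```python
def classify_do(statement: str) -> str:
    """Classify a statement that starts with DO as 'do-loop' or 'assignment'."""
    text = statement.replace(" ", "").upper()  # blanks are not significant here
    if not text.startswith("DO") or "=" not in text:
        return "other"
    rhs = text.split("=", 1)[1]
    depth = 0
    for ch in rhs:                 # look ahead past '=' for a top-level comma
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return "do-loop"       # a comma outside parentheses means a loop header
    return "assignment"            # no top-level comma: DO... is just a variable name

print(classify_do("DO 10 I = 1, 5"))  # do-loop
print(classify_do("DO 10 I = 1.5"))   # assignment
```

The decision cannot be made when `DO` is first read; the scanner has to examine the rest of the statement, which is why languages with reserved keywords avoid this cost.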
4. Output format
Word classes: keywords (reserved words), identifiers, constants, operators, delimiters
Each word is output as a two-tuple: (class code, internal code)
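One possible encoding of the (class code, internal code) output is sketched below. The class numbers, the small keyword, operator and delimiter tables, and the choice to use a symbol-table index as the internal code of an identifier are all assumptions for illustration; a real scanner would define its own tables.

```python
from enum import IntEnum

class TokenClass(IntEnum):
    KEYWORD = 1
    IDENTIFIER = 2
    CONSTANT = 3
    OPERATOR = 4
    DELIMITER = 5

# Hypothetical internal codes for the fixed vocabularies
KEYWORDS = {"if": 1, "then": 2, "else": 3}
OPERATORS = {"+": 1, "-": 2, "*": 3, "=": 4}
DELIMITERS = {";": 1, "(": 2, ")": 3}

def encode(word: str, symbols: list) -> tuple:
    """Map one already-recognized word to (class code, internal code)."""
    if word in KEYWORDS:
        return (TokenClass.KEYWORD, KEYWORDS[word])
    if word in OPERATORS:
        return (TokenClass.OPERATOR, OPERATORS[word])
    if word in DELIMITERS:
        return (TokenClass.DELIMITER, DELIMITERS[word])
    if word.isdigit():
        return (TokenClass.CONSTANT, int(word))    # the value serves as the code
    if word not in symbols:
        symbols.append(word)                       # first occurrence: enter in table
    return (TokenClass.IDENTIFIER, symbols.index(word))

symbols = []
pairs = [encode(w, symbols) for w in ["if", "x", "=", "42", ";"]]
# pairs == [(1, 1), (2, 0), (4, 4), (3, 42), (5, 1)]
```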
5. Scanner design
- Write down the lexical rules of the language
- Convert the lexical rules into state transition diagrams
- Merge the initial states of all the transition diagrams into one; the result is the automaton that recognizes the words of the language
- Design the scanner from this automaton
- The scanner works as a subroutine of the parser: whenever the parser needs the next word, it calls the scanner.
- The scanner starts from the initial state; when it has recognized a word it enters a final state and returns the two-tuple (class code, internal code).
- Note: the state transition diagram can also be represented as a state transition matrix, which is easy to implement on a computer.
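The matrix form mentioned in the note can be sketched as a small table-driven scanner. The three states, the character classes, and the names `TABLE`, `next_token` and `scan` are assumptions for this example; it recognizes only identifiers and unsigned integers, and treats any other non-blank character as a one-character symbol.

```python
START, IN_ID, IN_NUM = 0, 1, 2

def char_class(ch: str) -> str:
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"

# State transition matrix: TABLE[state][character class] -> next state
TABLE = {
    START:  {"letter": IN_ID, "digit": IN_NUM},
    IN_ID:  {"letter": IN_ID, "digit": IN_ID},
    IN_NUM: {"digit": IN_NUM},
}
FINAL = {IN_ID: "identifier", IN_NUM: "constant"}  # final states and their classes

def next_token(text: str, pos: int):
    """Called on demand by the parser; returns (token, new_pos) or (None, pos)."""
    while pos < len(text) and text[pos].isspace():
        pos += 1
    if pos >= len(text):
        return None, pos
    state, start = START, pos
    while pos < len(text):
        nxt = TABLE[state].get(char_class(text[pos]))
        if nxt is None:            # no transition: the longest match has ended
            break
        state, pos = nxt, pos + 1
    if state in FINAL:
        return (FINAL[state], text[start:pos]), pos
    return ("symbol", text[pos]), pos + 1  # still in START: one-character symbol

def scan(text: str):
    tokens, pos = [], 0
    while True:
        tok, pos = next_token(text, pos)
        if tok is None:
            return tokens
        tokens.append(tok)

print(scan("x1 = 42+y"))
# [('identifier', 'x1'), ('symbol', '='), ('constant', '42'), ('symbol', '+'), ('identifier', 'y')]
```

Adding a new kind of word then means adding rows and columns to `TABLE` rather than writing new control flow, which is the main appeal of the matrix representation.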
6. Summary
- Relationship among regular grammars, regular sets, and regular expressions
- A regular grammar is a Chomsky type-3 grammar. It is a linear grammar, either left-linear or right-linear.
- A regular set is the set of all strings generated by a regular grammar.
- A regular expression is a compact formula that denotes a regular set, that is, it describes the language.
- Relationship: regular grammars and regular expressions are formal rules, whereas a regular set is a set of strings. Regular grammars and regular expressions are equivalent and can be converted into each other.
- Regular grammar -> regular expression
- NFA determinization: the subset construction
- DFA minimization
- Determinization of an NFA that contains ε-arcs
- Conversions among regular grammars, regular expressions, and automata
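The subset construction for an NFA with ε-arcs can be sketched as follows. The representation is an assumption made for the example: the NFA is a dictionary mapping `(state, symbol)` to a set of states, the empty string stands for ε, and the names `eps_closure`, `subset_construction` and `accepts` are hypothetical.

```python
EPS = ""  # label used for ε-arcs in this sketch (assumption)

def eps_closure(states, delta):
    """All states reachable from `states` using only ε-arcs."""
    stack, closure = list(states), set(states)
    while stack:
        s = stack.pop()
        for t in delta.get((s, EPS), ()):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return frozenset(closure)

def subset_construction(start, accepting, delta, alphabet):
    """Return (start, accepting, delta) of a DFA whose states are sets of NFA states."""
    d_start = eps_closure({start}, delta)
    d_delta, seen, work = {}, {d_start}, [d_start]
    while work:
        S = work.pop()
        for a in alphabet:
            moved = {t for s in S for t in delta.get((s, a), ())}
            if not moved:
                continue                     # no arc on `a`: leave it undefined
            T = eps_closure(moved, delta)
            d_delta[(S, a)] = T
            if T not in seen:
                seen.add(T)
                work.append(T)
    d_accepting = {S for S in seen if S & set(accepting)}
    return d_start, d_accepting, d_delta

def accepts(dfa, word):
    state, accepting, delta = dfa
    for ch in word:
        state = delta.get((state, ch))
        if state is None:
            return False
    return state in accepting

# NFA for strings over {a, b} that end in "ab", with one ε-arc from 0 to 1
nfa_delta = {(0, EPS): {1}, (1, "a"): {1, 2}, (1, "b"): {1}, (2, "b"): {3}}
dfa = subset_construction(0, {3}, nfa_delta, "ab")
print(accepts(dfa, "bab"), accepts(dfa, "abb"))  # True False
```

For this NFA the construction produces four DFA states: {0,1}, {1}, {1,2} and {1,3}. The states {0,1} and {1} are both non-accepting and have identical transitions, so a minimization pass would merge them and leave three states.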