Tokenizing source code and the role of the symbol table
The form of a word is closely tied to the context of the sentence it appears in and to the interpretation of the language itself; "lexicon" is another name for a dictionary.
In compilation, the purpose of lexical analysis is to determine the meaning of each individual word in the source code.
The input of the lexical analyzer is the source code of our program. To the lexical analyzer, this input is just one long string of text. The
output of the lexical analyzer is a stream of lexical units (tokens).
For this reason, the lexical analysis phase of compilation is sometimes called the tokenization phase.
The lexical analyzer scans the input one character at a time, from left to right. Its job is to identify and separate the elements of the input string, that is, to decompose the string into substrings, each of which is one word of the program. Each substring the analyzer splits off is then identified and translated into a lexical unit.
A lexical unit can be a keyword, for example if or then.
It can be a relational operator, such as less than or equal to.
It can be an assignment operator, such as the equals sign, or a two-character symbol such as := in some other languages.
When the lexical analyzer scans the input string, it first needs to find the separators, such as whitespace.
These determine where one lexeme ends and the next one begins. A lexeme is a run of adjacent characters that matches a pattern, usually written as a regular expression, defined in the lexical analyzer program. This is a form of pattern recognition.
For each lexeme, the analyzer generates a token that contains the lexeme's category and the lexeme itself.
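The scanning described above can be sketched in a few lines of Python. This is only a minimal illustration under assumptions, not a production lexer: the token categories (NUMBER, IDENT, RELOP, ASSIGN), the keyword set, and the function name tokenize are all made up for this example. Each token is a pair of the lexeme's category and the lexeme itself.

```python
import re

# Hypothetical token categories for a tiny illustrative language.
# Order matters: "<=" must be tried before "<" and before "=".
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT",  r"[A-Za-z_]\w*"),
    ("RELOP",  r"<=|>=|<|>"),
    ("ASSIGN", r"="),
    ("SKIP",   r"\s+"),          # separators between lexemes
]
KEYWORDS = {"if", "then"}        # assumed keyword set

# One combined pattern with a named group per category.
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Scan left to right and return a list of (category, lexeme) pairs."""
    tokens = []
    for match in MASTER.finditer(source):
        category = match.lastgroup
        lexeme = match.group()
        if category == "SKIP":
            continue                     # whitespace only separates lexemes
        if category == "IDENT" and lexeme in KEYWORDS:
            category = "KEYWORD"         # keywords look like identifiers
        tokens.append((category, lexeme))
    return tokens

tokens = tokenize("if x <= 10 then y = 1")
for token in tokens:
    print(token)
```

Keywords are matched with the identifier pattern first and reclassified afterwards, which is one common way to avoid writing a separate pattern per keyword.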
The lexical analyzer's ability to detect errors in the source code is very limited.
It may be able to find some illegal characters, but it cannot tell whether an individual keyword is spelled correctly: a misspelled keyword is simply recognized as an identifier.
Nor can the lexical analyzer check the order in which keywords are allowed to appear, or detect data type mismatches.
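A small sketch makes this limitation concrete. The scanner below is hypothetical (the names scan, TOKEN_RE and KEYWORDS are assumptions for this example): it reports a character that matches no pattern as illegal, but it accepts the misspelled keyword "thne" as an ordinary identifier, because at this level nothing distinguishes the two.

```python
import re

KEYWORDS = {"if", "then"}   # assumed keyword set

# Patterns for the lexemes this toy scanner understands.
TOKEN_RE = re.compile(r"(?P<IDENT>[A-Za-z_]\w*)|(?P<NUMBER>\d+)|(?P<OP><=|=)")

def scan(source):
    """Return (tokens, errors); errors holds (position, character) pairs."""
    tokens, errors = [], []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1                      # skip separators
            continue
        match = TOKEN_RE.match(source, pos)
        if match is None:
            errors.append((pos, source[pos]))   # illegal character
            pos += 1
            continue
        category, lexeme = match.lastgroup, match.group()
        if category == "IDENT" and lexeme in KEYWORDS:
            category = "KEYWORD"
        tokens.append((category, lexeme))
        pos = match.end()
    return tokens, errors

source = "if x <= 10 thne y = 1 @"
tokens, errors = scan(source)
print(tokens)   # "thne" appears as an IDENT token, not as an error
print(errors)   # only the "@" is reported
```

Catching the misspelling, or a keyword in the wrong place, is left to later phases of the compiler, which see the tokens in context.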
The purpose of the symbol table is to give the compiler quick access, throughout the compilation process, to the information associated with each symbol name used in the source code.
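One simple way to get this kind of quick access is a hash map keyed by symbol name. The sketch below assumes a single flat scope, and the recorded fields (kind, data_type, line) are illustrative guesses at what a compiler might store; the class and method names are likewise hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SymbolInfo:
    name: str
    kind: str        # e.g. "variable" or "function" (assumed categories)
    data_type: str   # e.g. "int"
    line: int        # line of the declaration in the source

class SymbolTable:
    """Flat, single-scope table backed by a dict for fast lookup by name."""

    def __init__(self):
        self._entries = {}

    def declare(self, info):
        # Reject a second declaration of the same name in this scope.
        if info.name in self._entries:
            raise KeyError(f"duplicate declaration of {info.name!r}")
        self._entries[info.name] = info

    def lookup(self, name):
        # Returns None when the name was never declared.
        return self._entries.get(name)

table = SymbolTable()
table.declare(SymbolInfo("count", "variable", "int", 3))
table.declare(SymbolInfo("total", "variable", "float", 4))

print(table.lookup("count"))
print(table.lookup("missing"))
```

A real compiler usually needs nested scopes as well, for example a stack of such tables, but the lookup-by-name idea stays the same.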