Parser for distributed database SQL engine

The SQL engine is one of the core systems of the Yunxi database. It consists of three main parts: a parser, an optimizer, and an executor, which together process the commands sent by the client. The parser parses and compiles each command into a form that the database can recognize and run. The optimizer optimizes the command, and the quality of that optimization directly affects database performance. The executor finally executes the command.

The life cycle of an SQL statement:

Figure 1 SQL execution flow

 

As can be seen from Figure 1, the execution process of a statement in the database is as follows:

(1) The database receives the statement text from the client;

(2) Lexical analysis splits the text into a set of tokens;

(3) Syntax analysis parses the tokens into a syntax tree (Abstract Syntax Tree, AST);

(4) Semantic analysis turns the AST into expressions for the optimizer to use;

(5) Rule-Based Optimization (RBO) is applied, mainly for query rewriting such as expression simplification and predicate pushdown;

(6) Cost-Based Optimization (CBO) produces the optimal query expression. This step enumerates all paths, calculates the cost of each, and selects the path with the lowest cost as the basis for plan construction;

(7) A logical plan is built, followed by a physical plan;

(8) The executor executes the plan;

(9) The result is returned.
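
Steps (5) and (6) can be illustrated with a minimal sketch. It is not Yunxi's optimizer: the tuple-based expression format, the `Path` class, and the cost numbers are all assumptions made up for this example. The rule-based part folds constant sub-expressions (one form of expression simplification), and the cost-based part simply picks the cheapest of several candidate paths.

```python
from dataclasses import dataclass

# Rule-based step: simplify an expression tree by folding constants.
# Expressions are nested tuples (op, left, right), column names (str), or ints.
# This representation is hypothetical and chosen only for brevity.
def simplify(expr):
    if not isinstance(expr, tuple):
        return expr
    op, left, right = expr
    left, right = simplify(left), simplify(right)
    if isinstance(left, int) and isinstance(right, int):
        if op == "+":
            return left + right
        if op == "*":
            return left * right
    return (op, left, right)


# Cost-based step: choose the candidate path with the lowest estimated cost.
@dataclass(frozen=True)
class Path:
    name: str
    cost: float  # hypothetical cost estimate, not a real statistic


def choose_cheapest(paths):
    return min(paths, key=lambda p: p.cost)


# WHERE a > 2 + 2 is rewritten to WHERE a > 4
rewritten = simplify((">", "a", ("+", 2, 2)))

# Two made-up access paths for the same query
best = choose_cheapest([Path("full_scan", 100.0), Path("index_scan", 12.5)])
```

A real optimizer derives its costs from table statistics and applies many more rewrite rules; the sketch only shows how the two phases differ in kind.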

 

SQL parser:

Every statement entering the database must be parsed before the database can recognize it. The parser consists of three main parts: lexical analysis, syntax analysis, and semantic analysis.

Lexical analysis:

The task of lexical analysis is to read the statement character by character from left to right, scan the character stream, and split it into tokens according to word-formation rules. The splitting rule is to cut whenever a space is encountered; lexical analysis ends when ";" is encountered.

Example: SELECT a FROM test WHERE a > 4;

After lexical analysis, the SQL statement is split into the following tokens: SELECT, a, FROM, test, WHERE, a, >, 4.
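
The splitting rule described in this section can be sketched in a few lines. The function name `tokenize` is an assumption for illustration, and the sketch follows only the rule stated in the text (cut on spaces, stop at ";"). A production lexer would also separate operators that are not surrounded by spaces, such as `a>4`, and would classify each token.

```python
def tokenize(sql):
    """Split a statement into tokens: cut on whitespace, stop at ';'."""
    tokens = []
    current = ""
    for ch in sql:  # scan character by character, left to right
        if ch == ";":
            break  # ';' ends lexical analysis
        if ch.isspace():
            if current:  # a space closes the token being built
                tokens.append(current)
                current = ""
        else:
            current += ch
    if current:  # flush the last token
        tokens.append(current)
    return tokens


tokens = tokenize("SELECT a FROM test WHERE a > 4;")
```

Runs of consecutive spaces produce no empty tokens, because a token is emitted only when `current` is non-empty.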

 

Syntax analysis:

The task of syntax analysis is to combine the token sequence produced by lexical analysis into grammatical phrases and match the resulting sentence against the established grammar rules. If the match succeeds, a syntax tree (Abstract Syntax Tree, AST) is generated; otherwise a syntax error is reported.

Example:

The following rule is defined for simple_select_clause:

Figure 2 simple_select_clause syntax rules

 

In Figure 2, the symbols marked in red are terminal symbols, generally uppercase keywords and punctuation. The lowercase names are non-terminal symbols, which generally serve as rule names.

During syntax analysis, the tokens produced by lexical analysis are shifted in one at a time, and after each shift the parser tries to match them against the rules. Once all tokens have been shifted and successfully reduced, the corresponding syntax tree is generated.

For example, suppose lexical analysis produces the tokens SELECT, a, FROM, test, WHERE, a, >, 4. Syntax analysis then proceeds as follows:

(1) Shift SELECT. Nothing can be reduced and tokens remain, so continue shifting;

(2) Shift a. a can be reduced to target_list, so the reduction is performed and target_list replaces a. Tokens remain, so continue shifting;

(3) Shift FROM. Nothing can be reduced and tokens remain, so continue shifting;

(4) Shift test. test can be reduced to from_list, so from_list replaces test. FROM and from_list can then be reduced to from_clause, so that reduction is performed as well. Tokens remain, so continue shifting;

(5) Shift WHERE. Nothing can be reduced and tokens remain, so continue shifting;

(6) Shift a. a can be reduced to expr, so expr replaces a. Tokens remain, so continue shifting;

(7) Shift >. Nothing can be reduced and tokens remain, so continue shifting;

(8) Shift 4. 4 can be reduced to expr, so expr replaces 4. Now expr > expr can be reduced to a_expr, and WHERE and a_expr are then reduced to where_clause. Finally, SELECT, target_list, from_clause, and where_clause are reduced to simple_select_clause.

At this point parsing is complete, and the corresponding syntax tree is generated.
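
The shift and reduce steps in this walkthrough can be reproduced with a small sketch. The rule table below is inferred from the walkthrough, not copied from Figure 2, and the names `RULES`, `classify`, `try_reduce`, and `parse` are hypothetical. To decide what an identifier reduces to, each single-token rule also checks the symbol just below it on the stack; a generated LR parser encodes that context in its state table instead.

```python
# (result, handle on top of the stack, symbols allowed just below the handle)
# Inferred from the walkthrough; an assumption, not the actual grammar file.
RULES = [
    ("target_list", ("IDENT",), {"SELECT"}),
    ("from_list", ("IDENT",), {"FROM"}),
    ("expr", ("IDENT",), {"WHERE", ">"}),
    ("expr", ("NUMBER",), {"WHERE", ">"}),
    ("from_clause", ("FROM", "from_list"), None),
    ("a_expr", ("expr", ">", "expr"), None),
    ("where_clause", ("WHERE", "a_expr"), None),
    ("simple_select_clause",
     ("SELECT", "target_list", "from_clause", "where_clause"), None),
]
KEYWORDS = {"SELECT", "FROM", "WHERE"}


def classify(token):
    """Map a raw token to a terminal symbol."""
    if token.upper() in KEYWORDS:
        return token.upper()
    if token == ">":
        return ">"
    if token.isdigit():
        return "NUMBER"
    if token.isidentifier():
        return "IDENT"
    raise SyntaxError(f"unexpected token {token!r}")


def try_reduce(stack):
    """Apply the first rule whose handle matches the top of the stack."""
    symbols = [sym for sym, _ in stack]
    for lhs, handle, below in RULES:
        n = len(handle)
        if n > len(symbols) or tuple(symbols[-n:]) != handle:
            continue
        if below is not None:
            if len(symbols) == n or symbols[-n - 1] not in below:
                continue
        children = [node for _, node in stack[-n:]]
        del stack[-n:]
        stack.append((lhs, (lhs, children)))  # replace handle with new node
        return True
    return False


def parse(tokens):
    stack = []  # entries are (symbol, tree node)
    trace = []
    for tok in tokens:
        stack.append((classify(tok), tok))  # shift
        trace.append(f"shift {tok}")
        while try_reduce(stack):  # reduce as long as a rule applies
            trace.append(f"reduce to {stack[-1][0]}")
    if len(stack) != 1 or stack[0][0] != "simple_select_clause":
        raise SyntaxError("statement does not match simple_select_clause")
    return stack[0][1], trace


ast, trace = parse(["SELECT", "a", "FROM", "test", "WHERE", "a", ">", "4"])
```

Here `trace` records the shifts and reductions in the order described above, and `ast` is a nested `(rule_name, children)` tuple rooted at simple_select_clause. A statement that cannot be fully reduced raises `SyntaxError`, which corresponds to the syntax error case.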

 

Semantic analysis:

The task of semantic analysis is to check the validity of the syntax tree (AST) produced by syntax analysis, including the tables, columns, column types, functions, and expressions it references.

For example, consider the query: SELECT a FROM test WHERE a > 4;

Three parts of this statement are checked:

1. from_clause: check whether the table test in the statement exists;

2. target_list: check whether column a is an attribute of a relation or view in the FROM clause;

3. where_clause: check whether column a is an attribute of a relation or view in the FROM clause, and whether the type of column a supports the comparison a > 4.

After semantic analysis completes, the corresponding expression is generated for the optimizer to use.
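
The three checks for the example query can be sketched as follows. Everything here is an assumption for illustration: the in-memory `CATALOG`, the type names, the `SemanticError` class, and the shape of the returned expression are hypothetical and do not reflect Yunxi's internal structures.

```python
# Hypothetical catalog: table name -> {column name -> type name}
CATALOG = {"test": {"a": "INT", "b": "STRING"}}

# Types assumed to support ">" against a numeric literal in this sketch
NUMERIC_TYPES = {"INT", "FLOAT", "DECIMAL"}


class SemanticError(Exception):
    pass


def check_select(table, target_column, where_column, where_literal):
    # 1. from_clause: the table must exist
    if table not in CATALOG:
        raise SemanticError(f"table {table!r} does not exist")
    columns = CATALOG[table]

    # 2. target_list: the selected column must belong to the table
    if target_column not in columns:
        raise SemanticError(f"column {target_column!r} not found in {table!r}")

    # 3. where_clause: the column must exist and its type must allow ">"
    if where_column not in columns:
        raise SemanticError(f"column {where_column!r} not found in {table!r}")
    if (isinstance(where_literal, (int, float))
            and columns[where_column] not in NUMERIC_TYPES):
        raise SemanticError(
            f"cannot compare {columns[where_column]} column with a number")

    # Made-up expression shape handed to the optimizer
    return {
        "table": table,
        "project": target_column,
        "filter": (">", where_column, where_literal),
    }


# SELECT a FROM test WHERE a > 4;
checked = check_select("test", "a", "a", 4)
```

Replacing the table name with one that is not in the catalog, or comparing the STRING column b with 4, raises `SemanticError` instead of returning an expression.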
