MYSQL04 Advanced_Logical architecture analysis, query cache, parser, optimizer, executor, storage engine

①. Logical architecture analysis

  • ①. The server processes client requests
  • ②. Connection layer
  1. Before the system (client) accesses the MySQL server, the first thing it does is to establish a TCP connection.
  2. After the three-way handshake succeeds, the MySQL server authenticates the username and password sent over the connection. If the username or password is incorrect, the client receives an "Access denied for user" error and the client program exits. If authentication passes, the server looks up the account's privileges in the privilege tables and associates them with the connection. All later permission checks on this connection rely on the privileges read at this point.
  3. Once the TCP connection is established, each request must be handled by a thread dedicated to interacting with this client, so a thread pool carries out the subsequent processing. Each connection obtains a thread from the pool, which avoids the overhead of creating and destroying threads.
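The thread reuse described in step 3 can be sketched in Python. This only illustrates the pooling idea and is not how the MySQL server is implemented; `handle_connection` and the connection ids are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

def handle_connection(conn_id):
    # Hypothetical stand-in for authenticating and serving one client connection.
    return conn_id, threading.current_thread().name

# A fixed pool of 4 worker threads serves 10 connections, so threads are
# reused instead of being created and destroyed once per connection.
with ThreadPoolExecutor(max_workers=4, thread_name_prefix="conn-worker") as pool:
    results = list(pool.map(handle_connection, range(10)))

served = [conn_id for conn_id, _ in results]
workers_used = {name for _, name in results}
print(served, len(workers_used))
```

However many connections arrive, `workers_used` never holds more than four distinct thread names.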
  • ③. Service layer - SQL Interface (SQL interface)
  1. Receives the user's SQL commands and returns the results the user asked for. For example, a SELECT ... FROM statement goes through the SQL Interface.
  2. MySQL supports multiple SQL language interfaces such as DML (data manipulation language), DDL (data definition language), stored procedures, views, triggers, and custom functions.
  • ④. Service layer - Parser (parser)
  1. The parser performs syntax analysis and semantic analysis on SQL statements. It decomposes the SQL statement into a data structure and passes this structure to the subsequent steps, which all work from it. If an error is encountered during decomposition, the SQL statement is invalid.
  2. When an SQL command reaches the parser, it is verified and parsed, and a syntax tree is created for it. The syntax tree is enriched using the data dictionary, and the server verifies whether the client has the authority to execute the query. After creating the syntax tree, MySQL also optimizes the syntax of the SQL query and rewrites the query.
  • ⑤. Service layer - Optimizer (query optimizer)
  1. After the SQL statement is parsed and before it is executed, the query optimizer determines the execution path of the SQL statement and generates an execution plan.
  2. The execution plan states which indexes the query should use (full table scan or index lookup) and the order in which tables are joined. The methods provided by the storage engine are then called according to the steps in the execution plan to actually execute the query, and the query results are returned to the user.
  3. It uses a "select-project-join" strategy for querying. For example:
SELECT id,name FROM student WHERE gender = '女';  -- '女' means female
  4. This SELECT query first performs selection based on the WHERE clause, instead of reading every row and filtering by gender afterwards. It then performs attribute projection on id and name, instead of extracting all attributes and filtering afterwards. Finally, it combines these two conditions to generate the final query result.
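As a rough illustration of "select, then project", the Python sketch below applies the two steps to an in-memory list. The rows, the extra `age` column, and the helper names `select_rows` and `project_columns` are made up for this example and say nothing about MySQL internals.

```python
# Hypothetical sample data standing in for the student table.
students = [
    {"id": 1, "name": "Li", "gender": "女", "age": 20},
    {"id": 2, "name": "Wang", "gender": "男", "age": 21},
    {"id": 3, "name": "Zhao", "gender": "女", "age": 22},
]

def select_rows(rows, predicate):
    # Selection: keep only the rows that satisfy the WHERE condition.
    return [row for row in rows if predicate(row)]

def project_columns(rows, columns):
    # Projection: keep only the requested attributes of each row.
    return [{col: row[col] for col in columns} for row in rows]

result = project_columns(
    select_rows(students, lambda row: row["gender"] == "女"),
    ["id", "name"],
)
print(result)
```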
  • ⑥. Service layer - Caches and Buffers (query cache component)
  1. MySQL maintains some caches and buffers internally. For example, the Query Cache caches the execution results of SELECT statements. If the result of a query can be found there, the whole process of parsing, optimization, and execution is skipped and the cached result is returned directly to the client.
  2. This caching mechanism is composed of a series of small caches. For example, table cache, record cache, key cache, permission cache, etc.
  3. This query cache can be shared between different clients
  4. The query cache is deprecated as of MySQL 5.7.20 and removed in MySQL 8.0
  • ⑦. Engine layer: The pluggable storage engine layer (Storage Engines) is what actually stores and retrieves data in MySQL, operating on the underlying data maintained at the physical server level. The server communicates with storage engines through an API. Different storage engines offer different features, so we can choose one according to our actual needs.

  • ⑧. Storage layer
    All data, including database and table definitions, the content of each table row, and indexes, is stored as files in the file system, which is where the interaction with the storage engine takes place. Some storage engines, such as InnoDB, also support managing raw devices directly without a file system, but modern file systems make this unnecessary. Beneath the file system you can use local disks or storage systems such as DAS, NAS, and SAN.

  • ⑨. In order to be familiar with the SQL execution process, we can simplify it as follows:

  1. Connection layer: The client and server establish a connection, and the client sends SQL to the server.
  2. SQL layer (service layer): Query processing of SQL statements, regardless of the storage method of database files
  3. Storage engine layer: Deals with database files and is responsible for data storage and reading

②. Service layer-query cache

  • ①. Query cache: If the server finds the SQL statement in the query cache, it returns the result directly to the client. If not, the statement enters the parser stage. Note that because the query cache is often inefficient, this feature was removed in MySQL 8.0.

  • ②. Query caching stores query results so that the next identical query can get them directly without being executed. Note that the MySQL query cache does not cache the query plan, only the results of the query. This makes query matching very brittle: only a byte-for-byte identical query hits the cache. Any character difference between two query requests (for example spaces, comments, or capitalization) causes a cache miss. As a result, the hit rate of MySQL's query cache is not high.
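The exact-text matching can be sketched with a dictionary keyed by the raw SQL string. This is a toy model, not MySQL's data structure; the `QueryCache` class and the cached row are hypothetical.

```python
class QueryCache:
    """Toy result cache keyed by the exact SQL text (an assumption for illustration)."""

    def __init__(self):
        self._results = {}

    def get(self, sql):
        # Only a character-for-character identical string can hit.
        return self._results.get(sql)

    def put(self, sql, rows):
        self._results[sql] = rows

cache = QueryCache()
cache.put("SELECT * FROM test WHERE ID=5", [(5, "a")])

hit = cache.get("SELECT * FROM test WHERE ID=5")
miss_case = cache.get("select * from test where ID=5")    # different capitalization
miss_space = cache.get("SELECT *  FROM test WHERE ID=5")  # one extra space
print(hit, miss_case, miss_space)
```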

  • ③. If a query request contains certain system functions, user-defined variables or functions, or system tables such as those in the mysql, information_schema, or performance_schema databases, the request is not cached. Take system functions as an example: two calls of the same function may produce different results. The function NOW, for instance, returns the current time on every call. If a query request calls this function, two queries issued at different times should get different results even though their text is identical. If the first query were cached, reusing its result for the second query would be wrong.

  • ④. Since it is a cache, its entries can become invalid. MySQL's cache system monitors every table involved. As soon as the structure or data of a table is modified, for example by an INSERT, UPDATE, DELETE, TRUNCATE TABLE, ALTER TABLE, DROP TABLE, or DROP DATABASE statement, all cached queries that use that table become invalid and are removed from the cache. For databases under heavy update pressure, the hit rate of the query cache is therefore very low.
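A minimal sketch of this per-table invalidation, assuming each cached query records the tables it reads. The class name `TableAwareCache`, the table names, and the rows are hypothetical.

```python
class TableAwareCache:
    """Toy cache that drops every cached query touching a modified table."""

    def __init__(self):
        self._results = {}   # SQL text -> cached rows
        self._by_table = {}  # table name -> set of SQL texts that read it

    def put(self, sql, tables, rows):
        self._results[sql] = rows
        for table in tables:
            self._by_table.setdefault(table, set()).add(sql)

    def get(self, sql):
        return self._results.get(sql)

    def invalidate(self, table):
        # Called after any INSERT/UPDATE/DELETE/DDL on the table.
        for sql in self._by_table.pop(table, set()):
            self._results.pop(sql, None)

cache = TableAwareCache()
cache.put("SELECT * FROM test WHERE ID=5", ["test"], [(5, "a")])
cache.put("SELECT * FROM dict", ["dict"], [("k", "v")])

cache.invalidate("test")  # e.g. after UPDATE test SET ...
after_update = cache.get("SELECT * FROM test WHERE ID=5")
untouched = cache.get("SELECT * FROM dict")
print(after_update, untouched)
```

A single write wipes every entry for the table, which is why a heavily updated table rarely keeps anything cached long enough to be hit.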

  • ⑤. In short, because cached entries are invalidated so frequently, the query cache often does more harm than good. It is generally recommended only for static tables. What is a static table? It is a table that is rarely updated, such as a system configuration table or a dictionary table. Queries on such tables are a good fit for the query cache. MySQL supports this "use on demand" approach: set the my.cnf parameter query_cache_type to DEMAND

my.cnf
# query_cache_type has 3 values: 0 (OFF) disables the query cache, 1 (ON) enables it, 2 (DEMAND) caches only statements that contain the SQL_CACHE keyword
query_cache_type=2
  • ⑥. With this setting, SQL statements do not use the query cache by default. For statements that you are sure should use it, specify SQL_CACHE explicitly, as in the following statement:
mysql> select SQL_CACHE * from test where ID=5;
# Check whether the query cache is enabled on the current MySQL instance
mysql> show global variables like "%query_cache_type%";
  • ⑦. Monitor the hit rate of query cache:
  1. Qcache_free_blocks: The number of free memory blocks in the query cache. A large value means the query cache memory is heavily fragmented and may need to be defragmented at some point.
  2. Qcache_free_memory: The amount of free memory in the query cache. This value shows whether the query cache memory of the current system is sufficient, excessive, or too small, so the DBA can adjust it according to the actual situation.
  3. Qcache_hits: The number of cache hits. This is the main value for verifying how effective the query cache is. The larger the number, the better the cache is working.
  4. Qcache_inserts: The number of misses followed by an insert, meaning a new SQL request was not found in the cache, had to be processed normally, and its result was then inserted into the query cache. The more often this happens, the less the query cache is being reused and the worse its effect. Right after the system starts the query cache is empty, so a high value at that point is normal.
  5. Qcache_lowmem_prunes: The number of queries removed from the query cache because of insufficient memory. Use this value to adjust the cache size appropriately.
  6. Qcache_not_cached: The number of queries that were not cached, either because they are not cacheable or because of the query_cache_type setting.
  7. Qcache_queries_in_cache: The number of queries currently held in the cache.
  8. Qcache_total_blocks: The total number of blocks in the query cache.
mysql> show status like '%Qcache%'; # View runtime query cache statistics
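One way to turn these counters into a hit rate is sketched below in Python. The formula treats hits, inserts, and non-cached queries together as the total number of SELECT lookups, which is an approximation chosen for this example. The sample counter values are invented, not taken from a real server.

```python
def query_cache_hit_rate(status):
    # Approximate total lookups as hits + misses that were inserted
    # + queries that were never cached (an assumption of this sketch).
    hits = status["Qcache_hits"]
    lookups = hits + status["Qcache_inserts"] + status["Qcache_not_cached"]
    return hits / lookups if lookups else 0.0

# Hypothetical values as they might be read from SHOW STATUS LIKE '%Qcache%'.
sample_status = {"Qcache_hits": 300, "Qcache_inserts": 600, "Qcache_not_cached": 100}
rate = query_cache_hit_rate(sample_status)
print(f"hit rate: {rate:.1%}")
```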


  • ⑧. MySQL 8.0 removed the query cache feature

③. Service layer - parser

  • ①. Parser: perform syntactic analysis and semantic analysis of SQL statements in the parser

  • ②. If the query cache is not hit, the actual execution of the statement will begin. First, MySQL needs to know what you want to do, so it needs to parse the SQL statement. The analysis of SQL statements is divided into lexical analysis and syntactic analysis.

  • ③. The parser first performs "lexical analysis". What you enter is an SQL statement made up of strings and spaces, and MySQL needs to identify what each string is and what it represents.

  • ④. MySQL recognizes that this is a query statement from the "select" keyword you entered. It also needs to recognize the string "T" as "table name T" and the string "ID" as "column ID".
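A small Python sketch of the lexical step follows. The keyword set, the regular expression, and the sample statement `select * from T where ID=10` are assumptions made for illustration. MySQL's real lexer is far more complete, and deciding that T is a table and ID is a column needs the grammar, which this tokenizer does not model.

```python
import re

KEYWORDS = {"SELECT", "FROM", "WHERE"}
# One token per match: a number, a word, or any other single character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|([A-Za-z_][A-Za-z_0-9]*)|(.))")

def tokenize(sql):
    tokens = []
    for number, word, symbol in TOKEN_RE.findall(sql):
        if number:
            tokens.append(("NUMBER", number))
        elif word:
            kind = "KEYWORD" if word.upper() in KEYWORDS else "IDENTIFIER"
            tokens.append((kind, word))
        elif symbol.strip():  # ignore stray whitespace
            tokens.append(("SYMBOL", symbol))
    return tokens

tokens = tokenize("select * from T where ID=10")
for token in tokens:
    print(token)
```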

  • ⑤. Next comes "syntactic analysis". Based on the results of lexical analysis, the syntax analyzer (such as Bison) uses the grammar rules to determine whether the SQL statement you entered conforms to MySQL syntax.

# Incorrect SQL: job_id is neither aggregated nor listed in GROUP BY
select department_id,job_id, avg(salary) from employees group by department_id;
  • ⑥. If the SQL statement is correct, a syntax tree is generated for it. At this point, the parser's task is essentially complete.

④. Service layer - optimizer

  • ①. Optimizer: The optimizer will determine the execution path of the SQL statement, such as whether it is based on full table retrieval or index retrieval, etc.

  • ②. After the parser stage, MySQL knows what you want to do. Before execution begins, the statement must be processed by the optimizer. A query can be executed in many ways that all return the same result, and the optimizer's role is to find the best execution plan.
    For example, the optimizer decides which index to use when a table has several indexes, and when a statement joins multiple tables it decides the order in which the tables are joined. It also performs expression simplification, converts subqueries to joins, converts outer joins to inner joins, and so on.

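# Example: the following statement joins two tables:
select * from test1 join test2 using(ID)
where test1.name='zhangwei' and test2.name='mysql高级课程';
Plan 1: first fetch from table test1 the ID values of the rows where name='zhangwei', then use those ID values to join to table test2, and then check whether name in test2 equals 'mysql高级课程'.

Plan 2: first fetch from table test2 the ID values of the rows where name='mysql高级课程', then use those ID values to join to test1, and then check whether name in test1 equals 'zhangwei'.

Both plans produce the same logical result, but their execution efficiency differs, and the optimizer's job is to decide which plan to use. Once the optimizer stage is complete, the execution plan for the statement is fixed and the statement enters the executor stage.
If you still have questions, such as how the optimizer chooses an index or whether it can choose the wrong one, we will come back to them when we discuss indexes.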
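The difference between the two plans can be made concrete with a toy Python model that counts how many lookups each plan makes into the second table. The table contents are invented, and the lookup by ID is assumed to behave like a unique index. Real optimizer cost estimation is much more involved.

```python
# Hypothetical table contents: one 'zhangwei' row, three course rows.
test1 = [{"ID": i, "name": "zhangwei" if i == 1 else "other"} for i in range(1, 6)]
test2 = [{"ID": i, "name": "mysql高级课程" if i <= 3 else "other"} for i in range(1, 6)]

def run_plan(driving, driving_name, probed, probed_name):
    # Assume the probed table can be looked up by a unique ID.
    probed_by_id = {row["ID"]: row for row in probed}
    lookups = 0
    matches = []
    for row in driving:
        if row["name"] != driving_name:
            continue  # filtered out before the join
        lookups += 1
        other = probed_by_id.get(row["ID"])
        if other is not None and other["name"] == probed_name:
            matches.append(row["ID"])
    return matches, lookups

plan1 = run_plan(test1, "zhangwei", test2, "mysql高级课程")  # start from test1
plan2 = run_plan(test2, "mysql高级课程", test1, "zhangwei")  # start from test2
print(plan1, plan2)
```

With this data both plans return the same matching ID, but plan 1 probes the other table once while plan 2 probes it three times.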
  • ③. Query optimization can be divided into a logical query optimization stage and a physical query optimization stage.
  1. Logical query optimization makes SQL queries more efficient by changing the content of the SQL statement, and it also provides more candidate execution plans for physical query optimization. The usual method is to apply equivalent transformations to the SQL statement and rewrite the query. The mathematical basis of query rewriting is relational algebra. Typical rewrites include equivalent predicate rewriting and condition simplification for conditional expressions, view rewriting, subquery optimization, and the elimination of outer joins and nested joins for join semantics.
  2. Physical query optimization works on the relational algebra produced by query rewriting, where each relational algebra step corresponds to a physical operation. There are often several algorithms for each physical operation, so the cost of each physical path must be calculated and the execution plan with the lowest cost chosen. At this stage, single-table and multi-table join operations need to use indexes efficiently to improve query performance.

⑤. Service layer-executor

  • ①. Before execution, the executor checks whether the user has permission to run the statement. If not, a permission error is returned. If so, the SQL query is executed and the results are returned. In versions before MySQL 8.0, if the query cache is enabled, the query results are also cached at this point.
    Take the statement below as an example, with the table scanned row by row (as happens when the id column has no usable index). The executor calls the InnoDB engine interface to fetch the first row of the table and checks whether the ID value is 1. If not, the row is skipped. If so, the row is stored in the result set. It then calls the engine interface to fetch the "next row" and repeats the same check until the last row of the table has been fetched. Finally, the executor returns the record set made up of all rows that met the condition during this traversal to the client as the result set.
select * from test where id=1;
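The row-by-row loop can be sketched in Python as follows. `ToyEngine` and its `read_first` and `read_next` methods are hypothetical stand-ins for the storage engine interface, and the rows are invented. This mirrors the description of the loop, not the actual InnoDB handler API.

```python
class ToyEngine:
    """Hypothetical stand-in for a storage engine's row-by-row read interface."""

    def __init__(self, rows):
        self._rows = rows
        self._pos = 0

    def read_first(self):
        self._pos = 0
        return self.read_next()

    def read_next(self):
        if self._pos >= len(self._rows):
            return None  # no more rows
        row = self._rows[self._pos]
        self._pos += 1
        return row

def execute_where_id_equals(engine, wanted_id):
    result_set = []
    row = engine.read_first()          # fetch the first row
    while row is not None:
        if row["id"] == wanted_id:     # keep matching rows, skip the rest
            result_set.append(row)
        row = engine.read_next()       # fetch the "next row"
    return result_set

engine = ToyEngine([{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 1, "name": "c"}])
rows = execute_where_id_equals(engine, 1)
print(rows)
```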


  • ②. The flow of SQL statements in MySQL is: SQL statement → query cache → parser → optimizer → executor

⑥. MySQL8 execution principle

  • ①. Confirm whether profiling is turned on
    To understand the underlying execution process of a query statement, use select @@profiling or show variables like '%profiling' to check whether profiling is turned on. Turning it on lets MySQL collect the resources used while executing SQL statements.
select @@profiling;
show variables like '%profiling';
mysql> select @@profiling;
+-------------+
| @@profiling |
+-------------+
|           0 |
+-------------+
1 row in set, 1 warning (0.00 sec)
  • ②. profiling=0 means profiling is off. Turn it on by setting it to 1
mysql> set profiling=1;
Query OK, 0 rows affected, 1 warning (0.00 sec)
  • ③. Execute the same SQL query multiple times
mysql> SELECT * FROM employees;
mysql> SELECT * FROM employees;
  • ④. View all profiles generated by the current session
mysql> show profiles; # Show the most recent queries
+----------+------------+-------------------------+
| Query_ID | Duration   | Query                   |
+----------+------------+-------------------------+
|        1 | 0.00085275 | SELECT * FROM employees |
|        2 | 0.00090275 | SELECT * FROM employees |
+----------+------------+-------------------------+
2 rows in set, 1 warning (0.00 sec)
  • ⑤. View the profile: show the execution stages of the most recent statement and the time spent in each
mysql> show profile;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000086 |
| Executing hook on transaction  | 0.000003 |
| starting                       | 0.000027 |
| checking permissions           | 0.000020 |
| Opening tables                 | 0.000307 |
| init                           | 0.000007 |
| System lock                    | 0.000007 |
| optimizing                     | 0.000004 |
| statistics                     | 0.000012 |
| preparing                      | 0.000012 |
| executing                      | 0.000267 |
| end                            | 0.000004 |
| query end                      | 0.000003 |
| waiting for handler commit     | 0.000006 |
| closing tables                 | 0.000007 |
| freeing items                  | 0.000125 |
| cleaning up                    | 0.000009 |
+--------------------------------+----------+
  • ⑥. Query the specified Query ID, such as:
mysql> show profile for query 2;
+--------------------------------+----------+
| Status                         | Duration |
+--------------------------------+----------+
| starting                       | 0.000086 |
| Executing hook on transaction  | 0.000003 |
| starting                       | 0.000027 |
| checking permissions           | 0.000020 |
| Opening tables                 | 0.000307 |
| init                           | 0.000007 |
| System lock                    | 0.000007 |
| optimizing                     | 0.000004 |
| statistics                     | 0.000012 |
| preparing                      | 0.000012 |
| executing                      | 0.000267 |
| end                            | 0.000004 |
| query end                      | 0.000003 |
| waiting for handler commit     | 0.000006 |
| closing tables                 | 0.000007 |
| freeing items                  | 0.000125 |
| cleaning up                    | 0.000009 |
+--------------------------------+----------+
17 rows in set, 1 warning (0.00 sec)
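To see at a glance where the time goes, the stage rows above can be pasted into a short Python script. The parsing helper `parse_profile` is a convenience written for this example and assumes the two-column layout shown in the output.

```python
profile_text = """
| starting                       | 0.000086 |
| Executing hook on transaction  | 0.000003 |
| starting                       | 0.000027 |
| checking permissions           | 0.000020 |
| Opening tables                 | 0.000307 |
| init                           | 0.000007 |
| System lock                    | 0.000007 |
| optimizing                     | 0.000004 |
| statistics                     | 0.000012 |
| preparing                      | 0.000012 |
| executing                      | 0.000267 |
| end                            | 0.000004 |
| query end                      | 0.000003 |
| waiting for handler commit     | 0.000006 |
| closing tables                 | 0.000007 |
| freeing items                  | 0.000125 |
| cleaning up                    | 0.000009 |
"""

def parse_profile(text):
    # Split each "| Status | Duration |" row into (status, seconds).
    stages = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        stages.append((cells[0], float(cells[1])))
    return stages

stages = parse_profile(profile_text)
slowest = max(stages, key=lambda stage: stage[1])
total = sum(duration for _, duration in stages)
print(f"slowest stage: {slowest[0]} ({slowest[1]:.6f}s), total: {total:.6f}s")
```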
  • ⑦. In addition to checking the CPU, IO blocking and other parameters, you can also check the utilization of the following parameters
Syntax:
SHOW PROFILE [type [, type] ... ]
	[FOR QUERY n]
	[LIMIT row_count [OFFSET offset]]

type: {
	| ALL -- show cost information for all types
	| BLOCK IO -- show block I/O costs
	| CONTEXT SWITCHES -- show context switch costs
	| CPU -- show CPU costs
	| IPC -- show costs of messages sent and received
	| MEMORY -- show memory costs
	| PAGE FAULTS -- show page fault costs
	| SOURCE -- show costs related to Source_function, Source_file, Source_line
	| SWAPS -- show costs related to swap counts
}
mysql> show profile cpu for query 1;
+--------------------------------+----------+----------+------------+
| Status                         | Duration | CPU_user | CPU_system |
+--------------------------------+----------+----------+------------+
| starting                       | 0.000046 | 0.000005 |   0.000037 |
| Executing hook on transaction  | 0.000004 | 0.000000 |   0.000003 |
| starting                       | 0.000007 | 0.000001 |   0.000006 |
| checking permissions           | 0.000005 | 0.000000 |   0.000005 |
| Opening tables                 | 0.000026 | 0.000003 |   0.000023 |
| init                           | 0.000005 | 0.000001 |   0.000003 |
| System lock                    | 0.000006 | 0.000000 |   0.000006 |
| optimizing                     | 0.000004 | 0.000001 |   0.000003 |
| statistics                     | 0.000158 | 0.000017 |   0.000141 |
| preparing                      | 0.000046 | 0.000005 |   0.000041 |
| executing                      | 0.000363 | 0.000040 |   0.000325 |
| end                            | 0.000006 | 0.000000 |   0.000004 |
| query end                      | 0.000003 | 0.000001 |   0.000002 |
| waiting for handler commit     | 0.000027 | 0.000003 |   0.000025 |
| closing tables                 | 0.000007 | 0.000000 |   0.000006 |
| freeing items                  | 0.000107 | 0.000012 |   0.000096 |
| cleaning up                    | 0.000034 | 0.000004 |   0.000030 |
+--------------------------------+----------+----------+------------+
17 rows in set, 1 warning (0.00 sec)
  • ⑧. MySQL 8.0 and later no longer support the query cache. In earlier versions, the cached queries for a table were cleared as soon as the table was updated, so the query cache was only valuable when tables were static or rarely changed. On frequently updated tables it added overhead and could increase SQL query time.


Origin blog.csdn.net/TZ845195485/article/details/131565381