MySQL query execution process

1. Query execution path

[Figure: overview of the MySQL query execution process]
The figure above gives an overview of the MySQL query execution process. The main steps are:

  1. The client sends the query request to the server
  2. The server checks whether the query cache is enabled; if it is and the query hits the cache, the server returns the cached result, otherwise it proceeds to the next step
  3. The server parses and preprocesses the SQL statement, and then the query optimizer generates the corresponding execution plan
  4. The execution engine calls the storage engine API to execute the query according to the execution plan
  5. The server sends the result back to the client

2. Detailed query execution steps

2.1 The client initiates a query request

  1. First, understand that the communication protocol between the MySQL client and server is half-duplex: at any given moment, either the client is sending data to the server or the server is sending data to the client, and the two cannot happen at the same time. This makes MySQL communication simple and fast, but it also means that once one end starts sending a message, the other end must receive the entire message before it can respond, so flow control is not possible.

  2. When the client initiates a query, it sends the query to the server in a single packet. Because the protocol is half-duplex, once the client has sent the query, all it can do is wait for the result.

    If a query is too large, a "MySQL server has gone away" error sometimes occurs. A likely cause is that the packet exceeds the size the server accepts, so the connection is closed. Run SHOW VARIABLES LIKE "%max_allowed_packet%" to see the maximum packet size the server allows. The value can be set in my.cnf (my.ini on Windows).

  3. The server's response usually contains a lot of data and consists of multiple packets. When the server sends a response, the client must receive the complete result set; it cannot fetch just the first few rows and then ask the server to stop sending the rest. This is why it is sometimes necessary to add LIMIT to restrict the number of rows returned.

    MySQL cannot release the resources held by a query until all of its data has been sent to the client. For this reason the client receives the whole result set and caches it, which reduces pressure on the server and lets it release resources sooner. The trade-off is that the memory pressure moves to the client, which caches the data and serves it from memory on demand.

Check status

At any point in time, each MySQL connection (thread) has a state that identifies what it is currently doing. Use the SHOW FULL PROCESSLIST command to see which threads are running and what each one is doing; the Command and State columns show this.

mysql> SHOW FULL PROCESSLIST;
+----+-----------------+-----------+------+---------+-------+------------------------+-----------------------+
| Id | User            | Host      | db   | Command | Time  | State                  | Info                  |
+----+-----------------+-----------+------+---------+-------+------------------------+-----------------------+
|  4 | event_scheduler | localhost | NULL | Daemon  | 86599 | Waiting on empty queue | NULL                  |
|  9 | root            | localhost | NULL | Query   |     0 | starting               | SHOW FULL PROCESSLIST |
+----+-----------------+-----------+------+---------+-------+------------------------+-----------------------+

Connection state | Description
Sleep | The thread is waiting for the client to send a new request
Query | The thread is executing a query or sending results to the client
Locked | The thread is waiting for a resource locked by another query, which usually means waiting for a table lock
Analyzing and statistics | The thread is collecting storage engine statistics and generating an execution plan for the query
Copying to tmp table [on disk] | The thread is executing the query and copying the result set into a temporary table. With the "on disk" suffix, the temporary result set has grown larger than tmp_table_size, so the thread is moving the temporary table from memory to disk
Sending data | The thread is processing rows for a SELECT statement and sending data to the client
Sorting for group | The thread is sorting rows to satisfy a GROUP BY
Sorting for order | The thread is sorting rows to satisfy an ORDER BY

2.2 Query cache

Before parsing an SQL statement, MySQL checks the query cache if it is enabled, performing a case-sensitive hash lookup. The matching rules are strict: if the query differs from the cached query by even a single byte, it does not match, and the query moves on to the next stage. Note that the query cache was deprecated in MySQL 5.7.20 and removed in MySQL 8.0, so this stage applies only to earlier versions.

The query cache stores the complete result that was returned to the client. On a cache hit, the server first checks permissions and, if the check passes, returns the stored result directly, skipping parsing, optimization, and execution. The query cache tracks every table a query uses; if the data in any of those tables changes, all cached entries related to that table are invalidated.

MySQL determines a cache hit in a simple way: the cache is stored in a lookup table indexed by a hash value. The hash covers the query text itself, the database being queried, the client protocol version, and any other information that may affect the returned result. A query that contains nondeterministic data, such as the functions NOW() or CURRENT_DATE(), is not cached. In addition, the query cache works on the complete SELECT statement and is checked only when the statement is first received, so neither subqueries nor stored procedures can use the query cache.
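
The lookup rules above can be sketched in Python as a toy model. This is not MySQL's implementation: the class name QueryCacheSketch, the SHA-256 key, the default protocol version 10, and the substring check for nondeterministic functions are all assumptions made for illustration.

```python
import hashlib

# Simplified, hypothetical stand-in for MySQL's rules about non-cacheable queries.
NONDETERMINISTIC_FUNCTIONS = ("NOW(", "CURRENT_DATE(")


class QueryCacheSketch:
    """Toy query cache keyed by a hash of query text, database and protocol."""

    def __init__(self):
        self._results = {}        # cache key -> stored result set
        self._keys_by_table = {}  # table name -> keys of entries that read it

    @staticmethod
    def _key(sql, database, protocol_version):
        # Byte-exact key: no case folding and no whitespace normalization.
        raw = "\x00".join((sql, database, str(protocol_version)))
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def get(self, sql, database, protocol_version=10):
        # Returns the stored result, or None on a cache miss.
        return self._results.get(self._key(sql, database, protocol_version))

    def put(self, sql, database, tables, result, protocol_version=10):
        # Refuse to cache queries that call a nondeterministic function.
        if any(fn in sql.upper() for fn in NONDETERMINISTIC_FUNCTIONS):
            return False
        key = self._key(sql, database, protocol_version)
        self._results[key] = result
        for table in tables:
            self._keys_by_table.setdefault(table, set()).add(key)
        return True

    def invalidate_table(self, table):
        # A write to the table invalidates every entry that depends on it.
        for key in self._keys_by_table.pop(table, set()):
            self._results.pop(key, None)
```

In this sketch, a lookup for "select id from users" misses an entry stored under "SELECT id FROM users", because the key is built from the exact bytes of the statement.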

2.3 SQL parsing and preprocessing

MySQL parses the SQL statement by its keywords, breaks the query into individual tokens, and generates a corresponding parse tree

  1. The parser ensures that the tokens in the query are valid. It uses MySQL's grammar rules to validate and parse the query, for example checking whether a wrong keyword is used, whether the keywords are in the correct order, and whether quoted strings are properly closed
  2. The preprocessor further checks, according to MySQL rules, whether the parse tree is legal, for example whether the tables and columns exist and whether column names and aliases are ambiguous. Finally, the preprocessor checks permissions

2.4 Query optimizer optimization

The query optimizer turns the parse tree into an execution plan. A query can usually be executed in many ways, and the optimizer's task is to find the best one.

MySQL's query optimizer is cost-based: it tries to predict the cost of executing a query with each candidate execution plan and chooses the plan with the lowest cost. Run SHOW STATUS LIKE "Last_query_cost" to see the cost that MySQL calculated for the last query in the current session.
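
As a rough illustration of cost-based selection, the Python sketch below scores candidate plans and keeps the cheapest one. The Plan fields, the weights row_cost and page_cost, and the example numbers are hypothetical; MySQL's real cost model is more detailed and is not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    rows_examined: int  # estimated number of rows the plan reads
    random_reads: int   # estimated number of random page reads


def estimated_cost(plan, row_cost=0.2, page_cost=1.0):
    # Hypothetical linear cost model: CPU work per row plus I/O per page.
    return plan.rows_examined * row_cost + plan.random_reads * page_cost


def choose_plan(plans):
    # The optimizer's job in miniature: pick the plan with the lowest estimate.
    return min(plans, key=estimated_cost)


candidates = [
    Plan("full table scan", rows_examined=100_000, random_reads=500),
    Plan("index range scan", rows_examined=1_200, random_reads=1_200),
]
best = choose_plan(candidates)
```

The choice is only as good as the estimates: if the row counts are wrong, the plan with the lowest estimated cost may not be the fastest in practice.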

MySQL uses many optimization strategies to generate an optimal execution plan. They fall into two types: static optimization and dynamic optimization.

  • Static optimization
    Static optimization works directly on the parse tree. For example, the optimizer can convert a WHERE condition into an equivalent form through simple algebraic transformations. Static optimization does not depend on specific values, such as the constants in a WHERE condition. Once performed it remains valid, even if the query is repeated with different parameters, so it can be regarded as a kind of compile-time optimization
  • Dynamic optimization
    Dynamic optimization depends on the context of the query and on many other factors, such as the values in the WHERE condition and the number of rows matching an index entry. These must be re-evaluated every time the query runs, so it can be regarded as runtime optimization
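
A minimal Python sketch of one static, algebraic rewrite: removing constant comparisons from an AND-ed condition list, so that (5=5 AND a>5) becomes a>5. The tuple representation (left, op, right) and the function name simplify_and are assumptions for illustration and do not reflect how MySQL represents its parse tree.

```python
import operator

# Comparison operators supported by this sketch.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def simplify_and(conditions):
    """Fold constant comparisons in a list of AND-ed (left, op, right) terms.

    Column names are strings and constants are ints. Returns the simplified
    list, or None when the whole conjunction can never be true.
    """
    kept = []
    for left, op, right in conditions:
        if isinstance(left, int) and isinstance(right, int):
            if OPS[op](left, right):
                continue      # always true: the term can be dropped
            return None       # always false: the whole AND is false
        kept.append((left, op, right))
    return kept


simplified = simplify_and([(5, "=", 5), ("a", ">", 5)])
```

Because the rewrite depends only on the shape of the condition and not on the data in the table, it can be done once, without consulting index statistics or row counts.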

The following are some of the optimization types that MySQL can handle:

Optimization type | Description
Reordering joins | Tables are not always joined in the order specified in the query
Converting outer joins to inner joins | Not every OUTER JOIN has to be executed as an outer join. Factors such as the WHERE conditions and the table schema can make an outer join equivalent to an inner join. MySQL recognizes this and rewrites the query, so that it can adjust the join order to suit other optimizations, such as sorting
Applying equivalence rules | MySQL uses equivalence transformations to simplify and normalize expressions. It can merge and reduce comparisons, and it can remove conditions that are always true or always false. For example, (5=5 AND a>5) is rewritten as a>5. Similarly, (a<b AND b=c) AND a=5 is rewritten as b>5 AND b=c AND a=5
Optimizing COUNT(), MIN(), and MAX() | Indexes and column nullability often help MySQL optimize these expressions. For example, to find the minimum value of a column, MySQL only needs to read the leftmost record of the corresponding B-tree index, that is, the first row of the index. The optimizer can do this while generating the execution plan and then treat the expression as a constant. Similarly, to find the maximum value it only needs to read the last record of the B-tree index. When MySQL uses this optimization, EXPLAIN shows "Select tables optimized away", which literally means that the optimizer has removed the table from the execution plan and replaced it with a constant
Evaluating and reducing constant expressions | When MySQL detects that an expression can be reduced to a constant, it treats the expression as a constant during optimization. Mathematical expressions are a typical example
Covering index scan | When an index contains all the columns the query needs, MySQL can return the data from the index without reading the corresponding data rows
Subquery optimization | In some cases MySQL can convert a subquery into a more efficient form, reducing the number of separate queries that access the data
Early termination | As soon as the query's requirements are met, MySQL can stop processing immediately. A typical example is a LIMIT clause. There are other cases too: for example, when MySQL detects an impossible condition, it can immediately return an empty result
Equality propagation | If two columns are related by an equality, MySQL can propagate a WHERE condition on one column to the other
IN() list comparison | MySQL sorts the values in the IN() list first and then uses binary search to check whether a value is in the list. This is an O(log n) operation, whereas the equivalent chain of OR conditions is O(n). With a large number of values in the IN() list, MySQL therefore processes IN() faster
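
The IN() list comparison can be sketched in Python with the standard bisect module. The function names make_in_matcher and or_chain_matches are hypothetical; the sketch only contrasts a sorted list probed by binary search with a linear chain of equality tests, and says nothing about MySQL's internal data structures.

```python
from bisect import bisect_left


def make_in_matcher(values):
    """Sort the IN() list once, then answer each probe with binary search."""
    sorted_values = sorted(values)

    def matches(x):
        # O(log n): find the leftmost position where x could be inserted.
        i = bisect_left(sorted_values, x)
        return i < len(sorted_values) and sorted_values[i] == x

    return matches


def or_chain_matches(values, x):
    # O(n): equivalent to x = v1 OR x = v2 OR ... evaluated left to right.
    return any(x == v for v in values)


in_list = [42, 7, 19, 3]
matches = make_in_matcher(in_list)
```

Both functions return the same answers; the difference is that the sort is paid once per query, while each row checked against the list costs a logarithmic rather than a linear number of comparisons.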

2.5 Query execution engine

In the optimization stage, MySQL generates an execution plan for the query, and the query execution engine carries out the instructions in that plan step by step to complete the query. During this process, many operations are completed by calling interfaces implemented by the storage engine, the so-called "handler API".

MySQL creates a handler instance for each table in the optimization phase. Through these instances the optimizer can obtain table-related information, including all of the table's column names, index statistics, and so on.

2.6 Results returned

The last stage of query execution is returning the results to the client. Even if the query does not return a result set, MySQL still returns some information about the query, such as the number of rows affected. If the query is cacheable, MySQL also stores the result in the query cache at this stage.

MySQL returns results to the client incrementally. For example, in a join, once the server has processed the last joined table and begins producing the first result row, MySQL can start returning the result set to the client step by step. This approach has two advantages:

  1. The server does not need to hold the whole result set, so returning a large result does not consume too much server memory
  2. The MySQL client gets the first results as early as possible
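
Incremental return can be illustrated with a Python generator that yields each joined row as soon as it is produced. This is a conceptual sketch only: the function nested_loop_join, the dictionary rows, and the sample data are assumptions, and a simple nested-loop join stands in for whatever the server actually executes.

```python
def nested_loop_join(outer_rows, inner_rows, key):
    """Yield each joined row as soon as it is found, without building a list."""
    for outer in outer_rows:
        for inner in inner_rows:
            if outer[key] == inner[key]:
                # Hand the row to the consumer immediately.
                yield {**outer, **inner}


users = [{"id": 1, "name": "ann"}, {"id": 2, "name": "bob"}]
orders = [{"id": 1, "total": 10}, {"id": 2, "total": 20}]

rows = nested_loop_join(iter(users), orders, "id")
first_row = next(rows)  # available before the second user has been read
```

The producer never holds more than one result row at a time, and the consumer receives the first row before the rest of the join has run, which mirrors the two advantages listed above.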

Origin blog.csdn.net/weixin_45505313/article/details/106540525