MySQL slow query exploration and analysis

Background:

  During performance testing, the database is often one of the bottlenecks, and the SQL statements behind that bottleneck are worth exploring and analyzing. Among them, slow queries are the key optimization targets. In MySQL, a slow query is a statement that takes a long time to execute or consumes a disproportionate amount of resources. Specifically, MySQL defines slow queries through a threshold, the long_query_time system variable. Its default value is 10 seconds, but many teams lower it (for example, to 2 seconds), and it can be adjusted according to specific circumstances.

The presence of a slow query can have a negative impact on the performance of a MySQL database because it takes up a lot of computing and I/O resources, causing other queries to have slower response times. Therefore, it is very important to detect and optimize slow queries in time.
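As a minimal sketch of how the threshold can be inspected and changed, the statements below use the long_query_time system variable. The value 2 is only an illustrative choice, and changing a global variable assumes an account with sufficient privileges.

```sql
-- Show the server-wide slow query threshold, in seconds
SHOW GLOBAL VARIABLES LIKE 'long_query_time';

-- Illustrative value: treat anything slower than 2 seconds as slow.
-- The new global value applies to connections opened after this statement.
SET GLOBAL long_query_time = 2;

-- Change the threshold for the current session only
SET SESSION long_query_time = 2;
```

A value set this way is lost when the server restarts unless it is also written to the configuration file.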

The overall structure of MySQL:

MySQL is a typical client-server architecture system, which mainly consists of the following components:

  1. Client: A client is a program or tool that connects to the MySQL server. It can communicate with the server over the network or through a local socket. Common clients include the mysql command-line tool, MySQL Workbench, and third-party tools such as phpMyAdmin.

  2. Connection Manager: The connection manager is responsible for managing client connections and sessions. It receives client connection requests and enforces the limits set in the configuration file, such as the maximum number of concurrent connections, to keep the MySQL server stable and secure.

  3. Query Parser: The query parser is responsible for parsing the SQL query statement submitted by the client and converting it into an internal data structure understandable by the MySQL server. During this process, the query parser checks whether the syntax and semantics of the query statement are correct, and whether the permissions are sufficient to execute the query.

  4. Optimizer: The optimizer is a key component of MySQL query execution. It is responsible for choosing the query execution plan with the best expected efficiency. The optimizer analyzes the query statement and selects the index, the table access order, the join method, and so on. MySQL's optimizer is cost-based: it estimates the cost of the candidate plans and picks the cheapest one.

  5. Storage Engine: The storage engine is the core component for storing and managing data in the MySQL database. MySQL supports multiple storage engines, such as InnoDB, MyISAM, MEMORY, etc. Each storage engine has its unique characteristics and applicable scenarios. For example, InnoDB is suitable for high concurrency and transactional operations, and MyISAM is suitable for read-intensive operations.

  6. Cache: Caching is one of the important means of MySQL performance optimization. MySQL provides several caching mechanisms, such as the table cache and the buffer pool. The table cache keeps opened tables available to speed up table access; the InnoDB buffer pool caches data and index pages from disk in memory to speed up data access. MySQL 5.7 and earlier also had a query cache that stored query results to reduce the cost of repeated queries, but it was deprecated and then removed in MySQL 8.0.

In general, the MySQL architecture is composed of components such as client, connection manager, query parser, optimizer, storage engine and cache. Each component has its own unique role and function, working together to achieve efficient and stable operation of the MySQL database system.
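Several of the components listed above can be observed from any client session. The statements below are a small sketch using standard server variables and status counters; the values returned depend entirely on the server being inspected.

```sql
-- Connection manager: configured limit and current usage
SHOW VARIABLES LIKE 'max_connections';
SHOW STATUS LIKE 'Threads_connected';

-- Storage engines: which are available and which is the default
SHOW ENGINES;

-- Cache: size of the InnoDB buffer pool, in bytes
SHOW VARIABLES LIKE 'innodb_buffer_pool_size';

-- Cache: number of open tables the table cache can hold
SHOW VARIABLES LIKE 'table_open_cache';
```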

What is the execution process of an SQL query statement:

  With the overall structure of MySQL in mind, a query statement is executed in the following stages:

  1. Connector: The client first connects to the database. The connector handles this connection and establishes the session.

  2. Query cache: If query caching is enabled, previously executed statements and their results are cached in memory as key-value pairs. On a hit, the cached value is returned directly to the client. On a miss, execution continues, and the result is stored in the query cache once execution completes. This stage exists only in MySQL 5.7 and earlier; the query cache was removed in MySQL 8.0.

  3. Analyzer: The analyzer performs lexical analysis and syntax analysis to parse the SQL statement. Syntax errors are reported at this stage.

  4. Optimizer: After the analyzer, MySQL knows what to do; the optimizer decides how to do it efficiently before execution begins. When a table has multiple indexes, it decides which index to use; when a statement joins multiple tables, it decides the join order.

  5. Executor: After optimization, the statement enters the executor stage. The executor interacts with the storage engine, obtains the results, and returns them to the client.
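The optimizer's decisions can be inspected without running the query. The sketch below assumes MySQL 5.7 or later and uses the students table from the index examples in this article.

```sql
-- Ask the optimizer for its plan; the SELECT itself is not executed
EXPLAIN SELECT * FROM students WHERE id = 10;

-- Run immediately after EXPLAIN: shows the statement as rewritten by the optimizer
SHOW WARNINGS;

-- Same plan in JSON form, including the optimizer's cost estimates
EXPLAIN FORMAT=JSON SELECT * FROM students WHERE id = 10;
```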

What is an index: 

  An index is a data structure used to speed up database queries. It can quickly locate the records that meet the query conditions, thereby improving query efficiency and performance. Put simply, an index plays the same role as the table of contents of a book: if you want to find a certain knowledge point without the table of contents, you will have to search for a while. For a database table, the index is its "table of contents". In MySQL, indexes are usually implemented as B-Trees (InnoDB uses the B+Tree variant) or as hash tables (for example, in the MEMORY engine).

  Indexes mainly include primary key indexes and non-primary key indexes (also called secondary indexes). The primary key index is built on the primary key column of the table, while a non-primary key index is built on other columns or column combinations. The two kinds differ in how a query uses them and in how efficient they are. With the primary key index, MySQL can quickly locate the specified row through the B-Tree structure. Because the primary key is unique, each value corresponds to exactly one row, so the matching row can be found directly. For example, to query the student record with ID 10, you can use the following SQL statement:

    SELECT * FROM students WHERE id = 10;

  MySQL will use the primary key index to quickly locate the row record with ID 10, which is very efficient. For non-primary key indexes, MySQL can also locate row records that meet the query conditions through the B-Tree index structure, but additional steps are required. First, MySQL will find the primary key value of the row record that meets the query conditions based on the non-primary key index, and then locate the actual row record through the primary key index. For example, if you need to query student records named "Tom", you can use the following SQL statement:

    SELECT * FROM students WHERE name = 'Tom';

  Assuming an index named idx_students_name exists on the name column, MySQL will use it to find the primary key values of all rows named "Tom", and then locate the actual rows through the primary key index. This process is called a "back to table" lookup. It requires additional I/O operations and CPU work, so it is less efficient than a primary key lookup. If the table holds a large amount of data, the cost of going back to the table becomes more significant.
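The back-to-table lookup can be avoided when the secondary index already contains every column the query asks for, which is usually called a covering index. The sketch below assumes an InnoDB table; the column definitions (name, age) are hypothetical, since the article does not give the schema of students.

```sql
-- Hypothetical definition of the students table used above
CREATE TABLE students (
    id   INT NOT NULL PRIMARY KEY,
    name VARCHAR(50) NOT NULL,
    age  INT,
    KEY idx_students_name (name)
) ENGINE = InnoDB;

-- Needs a back-to-table lookup: "age" is not stored in idx_students_name
SELECT * FROM students WHERE name = 'Tom';

-- Can be answered from the secondary index alone:
-- InnoDB secondary index entries hold the indexed column plus the primary key
SELECT id, name FROM students WHERE name = 'Tom';

-- When the index covers the query, EXPLAIN reports "Using index" in the Extra column
EXPLAIN SELECT id, name FROM students WHERE name = 'Tom';
```

Selecting only the columns that are actually needed, instead of SELECT *, is what makes this optimization possible.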

Is it better to have more indexes:

  More indexes are not always better, and too many can hurt database performance. First, indexes take up storage space. If you create too many indexes, the database occupies more disk space, and for large databases this can lead to insufficient disk space. Second, indexes affect the performance of insert, update, and delete operations. Each of these operations has to update the indexes as well as the data, so the more indexes there are, the longer these operations take.

  Finally, indexes affect the efficiency of query operations. Although indexes speed up queries, with too many of them MySQL has to choose among more candidate indexes, which increases the cost of planning the query and may lead MySQL to select an inappropriate index, thereby reducing query efficiency. Therefore, when building indexes, choose according to the specific circumstances and avoid excessive indexing. Typically, consider creating indexes on frequently used columns, or on the columns of a query that needs to be optimized. At the same time, monitor index usage to determine which indexes should be optimized or deleted.
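One way to monitor index usage is the sys schema that ships with MySQL 5.7 and later. The sketch below assumes the Performance Schema is enabled; 'shop' is a placeholder schema name, not something from this article. Usage counters start from zero at every server restart, so an index listed as unused may simply not have been needed yet.

```sql
-- All indexes currently defined on a table
SHOW INDEX FROM students;

-- Indexes with no recorded use since the server started
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = 'shop';   -- placeholder schema name

-- Indexes made redundant by another index on the same table
SELECT table_name, redundant_index_name, dominant_index_name
FROM sys.schema_redundant_indexes
WHERE table_schema = 'shop';    -- placeholder schema name
```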

How to spot slow queries:

  1. Enable the slow query log by setting the slow_query_log parameter, monitor the log, and send a notification as soon as a new slow query is recorded. (recommended)

  2. Slow query log analysis tools: Tools such as mysqldumpslow, which ships with MySQL, and the third-party mysqlsla can summarize the slow query log and report information such as the queries with the longest execution time and the queries that occur most frequently.
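A minimal sketch of enabling the slow query log at runtime is shown below. Writing the log to a table in addition to the file is optional and is used here only so that the entries can be queried with SQL; it assumes an account allowed to set global variables.

```sql
-- Turn the slow query log on (lost on restart unless also set in the config file)
SET GLOBAL slow_query_log = 'ON';

-- Where the log file is written when log_output includes FILE
SHOW VARIABLES LIKE 'slow_query_log_file';

-- Optionally write entries to the mysql.slow_log table as well
SET GLOBAL log_output = 'FILE,TABLE';

-- The ten slowest logged statements
SELECT start_time, query_time, rows_examined,
       CONVERT(sql_text USING utf8mb4) AS sql_text
FROM mysql.slow_log
ORDER BY query_time DESC
LIMIT 10;
```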

How to optimize slow queries:

Case 1: EXPLAIN shows that the SQL statement does not use any index at all, and the table already holds a very large amount of data.

Solution: Build a suitable index.

Case 2: The key field of the execution plan shown by EXPLAIN differs from the index you expected. In that case the optimizer has probably chosen the wrong index.

Solution: The fastest fix is FORCE INDEX, which forces the query to use the specified index. Alternatively, improve query efficiency by adding indexes, optimizing existing indexes, or restructuring the query statement.
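As a sketch of Case 2, the statements below use the orders table and the idx_orders_status index from the practice section of this article. Refreshing the statistics first is an added suggestion rather than part of the original advice, since outdated statistics can be one reason for a poor index choice.

```sql
-- Refresh the index statistics the optimizer relies on
ANALYZE TABLE orders;

-- Check which index the optimizer picks on its own
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND amount > 1000;

-- Force a specific index if the optimizer's choice is still wrong
SELECT * FROM orders FORCE INDEX (idx_orders_status)
WHERE status = 'pending' AND amount > 1000;
```

A forced index stays in the query even after the data distribution changes, so it is worth revisiting such hints from time to time.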

Case 3: The query statement is complex or contains a large number of subqueries.

Solution: Complex statements and large numbers of subqueries hurt query performance, so consider rewriting the SQL. For example, use a JOIN to replace multiple subqueries, or use a WHERE clause to limit the number of rows returned.
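The rewrite from a subquery to a JOIN might look like the sketch below. It uses the orders table from the practice section together with a hypothetical customers table (columns id and region) that does not appear in this article. Recent MySQL versions often perform a similar transformation automatically, so compare the EXPLAIN output of both forms before committing to the rewrite.

```sql
-- Subquery form: orders placed by customers in one region
SELECT o.*
FROM orders AS o
WHERE o.customer_id IN (
    SELECT c.id FROM customers AS c WHERE c.region = 'EU'
);

-- JOIN form: returns the same rows provided customers.id is unique
-- (for example, the primary key), so each order matches at most one customer
SELECT o.*
FROM orders AS o
JOIN customers AS c ON c.id = o.customer_id
WHERE c.region = 'EU';
```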

Analysis and optimization practice:

Suppose there is a table called "orders" with the following columns:

  • id: INT, primary key column
  • customer_id: INT, customer number
  • status: ENUM('pending', 'completed', 'cancelled'), order status
  • order_date: DATETIME, order date
  • amount: DECIMAL(10,2), order amount
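The column list above corresponds to a table definition along the following lines. This is a sketch: the storage engine and the NOT NULL constraint on id are assumptions, because the article only lists column names and types.

```sql
CREATE TABLE orders (
    id          INT NOT NULL PRIMARY KEY,                   -- primary key column
    customer_id INT,                                        -- customer number
    status      ENUM('pending', 'completed', 'cancelled'),  -- order status
    order_date  DATETIME,                                   -- order date
    amount      DECIMAL(10,2)                               -- order amount
) ENGINE = InnoDB;
```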

Now we need to query all unfinished orders with an order amount greater than 1,000 yuan. The query statement is as follows:

    SELECT * FROM orders WHERE status = 'pending' AND amount > 1000;

First, use the EXPLAIN statement to view the query plan and see how the query is executed:

    EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND amount > 1000;

The result shows that the value of the type column is ALL, meaning that a full table scan is performed, and that the key field is NULL, meaning that no index is used. Next, create indexes on the status and amount columns. The index creation statements are as follows:

    CREATE INDEX idx_orders_status ON orders (status);
    CREATE INDEX idx_orders_amount ON orders (amount);

Then execute the query again. It now uses an index instead of scanning the whole table, which makes it significantly faster on a large table.

The query plan before optimization is as follows:

+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
| id | select_type | table  | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra       |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | orders | NULL       | ALL  | NULL          | NULL | NULL    | NULL |   10 |    10.00 | Using where |
+----+-------------+--------+------------+------+---------------+------+---------+------+------+----------+-------------+

The optimized query plan looks like this:

+----+-------------+--------+------------+------+-------------------------------------+-------------------+---------+-------+------+----------+-----------------------+
| id | select_type | table  | partitions | type | possible_keys                       | key               | key_len | ref   | rows | filtered | Extra                 |
+----+-------------+--------+------------+------+-------------------------------------+-------------------+---------+-------+------+----------+-----------------------+
|  1 | SIMPLE      | orders | NULL       | ref  | idx_orders_status,idx_orders_amount | idx_orders_status | 2       | const |    5 |    50.00 | Using index condition |
+----+-------------+--------+------------+------+-------------------------------------+-------------------+---------+-------+------+----------+-----------------------+

It can be seen that the optimized query plan uses the idx_orders_status index: the access type changes from ALL to ref, and the estimated number of rows examined drops from 10 to 5. The idx_orders_amount index appears only in possible_keys and is not used for this query.

Therefore, creating suitable indexes on the columns used in the filter improves query efficiency and reduces the load and response time of the database. However, indexes need to be selected and applied according to specific circumstances. Too many indexes degrade the performance of insert, update, and delete operations, so the number of indexes and the way they are created need to be considered carefully.
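Since the plan above uses only one of the two single-column indexes, a single composite index is a common alternative for this query. The sketch below is a suggestion beyond the original walkthrough, and the index name idx_orders_status_amount is made up for illustration. The column tested for equality (status) goes first and the column tested with a range (amount) goes second, so both conditions can narrow the index scan.

```sql
-- Equality column first, range column second
CREATE INDEX idx_orders_status_amount ON orders (status, amount);

-- Re-check the plan to see whether the optimizer picks the new index
EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND amount > 1000;

-- If the composite index makes the single-column status index unnecessary,
-- it can be dropped to reduce write overhead:
-- DROP INDEX idx_orders_status ON orders;
```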

Origin blog.csdn.net/NHB456789/article/details/135171376