MySQL tuning: SQL query optimization

High database performance depends on how you design the table structure, how you build the right indexes, and how you scale out queries against the database. These alone are not enough, however: the queries themselves must also be well designed. If a query is badly designed, then no matter how many read-only replicas you add, how reasonable the table structure is, or how appropriate the indexes are, the query cannot take advantage of them and will not perform well. Query optimization, index optimization and table structure optimization therefore need to go hand in hand.

When designing the table structure, consider how later queries will use the tables. Likewise, when writing SQL statements, consider how to make use of the indexes that already exist, or how a new index could improve query performance.

To optimize queries that have performance problems, you first have to find them. The following sections look at how to find SQL with performance problems.

1. Finding SQL with performance problems

There are three ways to find SQL with performance problems:

  • From user feedback;
  • From the slow query log;
  • By checking, in real time, the SQL that is currently running;

1. Finding problem SQL through the slow query log

The MySQL slow query log is a relatively low-overhead way of finding SQL with performance problems. Its main costs are disk I/O and the disk space needed to store the log. Because the log is written sequentially, the disk I/O overhead is basically negligible, so the main thing to watch is disk space.

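MySQL provides the following parameters for controlling the slow query log:

slow_query_log: whether the slow query log is enabled. It is off by default; set it to ON to enable it;
slow_query_log_file: the path and file name of the slow query log. By default the file is stored in the MySQL data directory;
long_query_time: the execution-time threshold, in seconds, above which a statement is written to the slow query log. The default is 10 seconds; for a busy system, 0.001 seconds is usually a more suitable value;
log_queries_not_using_indexes: whether to also log SQL that does not use an index;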
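As a minimal sketch, the parameters above can be changed at runtime with SET GLOBAL. The log path below is only an assumed example, and the values are the ones suggested in the text rather than a recommendation for every system:

```sql
-- Enable the slow query log at runtime (requires a privileged account).
SET GLOBAL slow_query_log = 'ON';
-- Hypothetical path; by default the file lives in the data directory.
SET GLOBAL slow_query_log_file = '/var/lib/mysql/slow.log';
-- Log statements that run for longer than 1 ms.
SET GLOBAL long_query_time = 0.001;
-- Optionally also log statements that do not use an index.
SET GLOBAL log_queries_not_using_indexes = 'ON';

-- Check the values now in effect.
SHOW GLOBAL VARIABLES LIKE 'slow_query_log%';
SHOW GLOBAL VARIABLES LIKE 'long_query_time';
```

A global change to long_query_time only affects connections opened after the change, and values set with SET GLOBAL are lost on restart unless they are also written to the configuration file.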

Unlike the binary log, the slow query log records every statement that meets the conditions, including queries, data modification statements and statements that were rolled back.

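A slow query log record looks like this:

# Query_time: 0.000220             // execution time in seconds, with microsecond precision (here 220 microseconds)
# Lock_time: 0.000120              // time spent on locks, with microsecond precision
# Rows_sent: 1                     // number of rows returned
# Rows_examined: 1                 // number of rows scanned
SET timestamp=1538323200;          // timestamp at which the SQL was executed
SELECT c FROM test1 WHERE id =100; // the SQL statement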

On a busy system the slow query log can grow by several gigabytes in a short time, so inspecting it by hand is practically impossible. To analyze the slow query log quickly you need the help of tools.

Commonly used slow query log tools:

1. mysqldumpslow: a common slow query log analysis tool provided officially by MySQL and installed together with the MySQL server. It groups queries that are identical except for their parameter values and outputs the analysis sorted in the order specified by its options.

2. pt-query-digest: a tool for analyzing MySQL slow queries.

2. Finding problem SQL in real time

To discover current performance problems sooner, we can also look for problem SQL in real time. The most convenient way is to query the PROCESSLIST table in MySQL's information_schema database. For example, the following SQL lists the statements on the current server that have been executing for one second or longer:

SELECT id,user,host,db,command,time,state,info FROM information_schema.PROCESSLIST WHERE TIME>=1

We can then run this SQL periodically from a script to discover, in real time, which statements are running slowly.
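A slightly refined variant of the query above, as a sketch: it skips idle connections and lists the longest-running statements first. The one-second threshold is carried over from the text; the Sleep filter is an added assumption about what is usually of interest.

```sql
SELECT id, user, host, db, command, time, state, info
FROM information_schema.PROCESSLIST
WHERE command <> 'Sleep'   -- ignore idle connections
  AND time >= 1            -- seconds spent in the current state
ORDER BY time DESC;        -- longest-running statements first
```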

2. SQL parsing, preprocessing and execution plan generation

Now that we have found the SQL with performance problems, let's look at why those problems exist.

To understand this, we first look at the steps a SQL request goes through in the MySQL server:

1. The client sends the SQL request to the server through the MySQL interface. This step usually does not affect query performance;
2. The MySQL server checks whether the SQL hits the query cache. If it does, the result stored in the cache is returned immediately; otherwise processing moves on to the next stage;
3. The MySQL server parses and preprocesses the SQL, and the optimizer generates the corresponding execution plan;
4. Following the execution plan, the server calls the storage engine API to query the data;
5. The result is returned to the client.

This is the whole process by which the MySQL server handles a query request. Steps 2 to 5 can all affect the response time of a query. Let's look at the factors in each of these steps that influence response speed.

Before parsing a query, if the query cache is on, MySQL first checks whether the query hits data in the cache. The check is implemented as a case-sensitive hash lookup. Because a hash lookup only matches on the full value, a request that differs from the cached query by even one byte will not match the cached result, and the query moves on to the next stage. If the query does hit the cache, MySQL checks the user's privileges before returning the result. This does not require parsing the SQL statement either, because the query cache already stores the information about the tables the query needs to access. If the privileges are in order, MySQL skips all the other stages, fetches the result directly from the cache and returns it to the client. In this case the query is not parsed, no execution plan is generated and nothing is executed.

As you can see, even returning a result directly from the query cache is not trivial.

How the query cache affects SQL performance:

  • Once cached data is updated, the cache has to be refreshed, which affects performance;
  • Every check for a query cache hit has to lock the cache, which affects performance;

On a system with frequent reads and writes, the query cache is therefore likely to reduce query processing efficiency, and in that case it is best not to use it.

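System parameters that affect the query cache:

query_cache_type: whether the query cache is available. It can be set to ON, OFF or DEMAND. With DEMAND, caching is controlled per statement through SQL_CACHE and SQL_NO_CACHE in the query;
query_cache_size: the amount of memory for the query cache. It must be an integer multiple of 1024 bytes;
query_cache_limit: the maximum size of a result that the query cache will store. If you know a result is too large to be cached, adding SQL_NO_CACHE to the query improves efficiency;
query_cache_wlock_invalidate: whether data in the cache is still returned after the table has been locked. Off by default;
query_cache_min_res_unit: the minimum size of the memory blocks allocated by the query cache.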

For a system with frequent reads and writes, set query_cache_type to OFF and query_cache_size to 0.
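A minimal sketch of checking and applying these settings at runtime. It applies to MySQL 5.7 and earlier only, since the query cache and its variables were removed in MySQL 8.0; the test1 table is borrowed from the slow query log example above.

```sql
-- Inspect the current query cache settings and hit statistics.
SHOW GLOBAL VARIABLES LIKE 'query_cache%';
SHOW GLOBAL STATUS LIKE 'Qcache%';

-- Per-statement opt-out while the cache is still ON.
SELECT SQL_NO_CACHE c FROM test1 WHERE id = 100;

-- Turn the query cache off (0 = OFF) and release its memory.
SET GLOBAL query_cache_type = 0;
SET GLOBAL query_cache_size = 0;
```

To keep the cache disabled after a restart, the same two values also need to be set in the server configuration file.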

When the query cache is not enabled or is missed, processing moves on to the next stage, in which the SQL is converted into an execution plan; MySQL then interacts with the storage engine according to this plan. This stage includes several sub-steps: parsing the SQL, preprocessing it, and optimizing it into an execution plan. Any error during this process, such as a syntax error, can terminate the query.

In the parsing stage, MySQL parses the statement by keyword and generates the corresponding "parse tree". The parser uses MySQL's grammar rules to validate and parse the query, checking for example that the right keywords are used and that they appear in the correct order.

The preprocessing stage further checks that the parse tree is legal according to MySQL's rules, for example that the tables and columns referenced in the query exist and that names and aliases are not ambiguous.

If all these checks pass, the query optimizer can generate the query plan.

Reasons why MySQL may generate a wrong execution plan:

  • Statistics are inaccurate;
  • The estimated cost in the execution plan is not the same as the actual cost of executing it;
  • What the MySQL query optimizer considers optimal may differ from what you consider optimal;
  • MySQL never takes other concurrently running queries into account, although they may affect the speed of the current query;
  • MySQL sometimes generates the execution plan from fixed rules;
  • MySQL does not consider the cost of operations outside its control, such as stored procedures and user-defined functions.

Types of SQL optimization the MySQL query optimizer can perform:

  • Reordering joined tables: the optimizer decides the join order of the tables based on statistics;
  • Converting outer joins to inner joins, for example when the WHERE conditions and the table structure make an outer join equivalent to an inner join;
  • Applying equivalence transformation rules, for example (5 = 5 AND a > 5) is rewritten as a > 5;
  • Using indexes and column nullability to optimize aggregate functions such as COUNT(), MIN() and MAX();
  • Converting expressions into constant expressions;
  • Using covering index scans: when an index contains all the columns a query needs, MySQL can return the required data from the index alone;
  • Subquery optimization, for example converting a subquery into a join to reduce the number of table lookups;
  • Terminating the query early;
  • Optimizing IN() conditions.

These are some of the optimizations the MySQL query optimizer can apply automatically. After the optimizer has rewritten the SQL, it generates an execution plan; the MySQL server then calls the storage engine API according to that plan and accesses the data through the storage engine.
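To see what the optimizer actually decided, EXPLAIN shows the chosen plan, and a SHOW WARNINGS issued right after it shows the statement as rewritten by the optimizer. The sketch below reuses the test1 table from the slow query log example and assumes it has an indexed id column; ANALYZE TABLE refreshes the statistics the optimizer relies on.

```sql
-- Refresh index statistics so cost estimates are based on current data.
ANALYZE TABLE test1;

-- Show the execution plan chosen by the optimizer.
EXPLAIN SELECT c FROM test1 WHERE 5 = 5 AND id > 5;

-- Show the query as rewritten by the optimizer (returned as a Note).
SHOW WARNINGS;
```

On MySQL 5.6 and earlier, EXPLAIN EXTENDED is needed for SHOW WARNINGS to include the rewritten statement.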

3. Determining the time consumed in each stage of query processing

The main purpose of optimizing a SQL query is to reduce the time it consumes and speed up its response. The following describes how to measure the time consumed in each stage of query processing.

For SQL with an existing performance problem, we need to know in which stage the query spends the most time before we can optimize it in a targeted way. There are two common ways to measure the time consumed in each stage:

  • Use profiling (SHOW PROFILE);
  • Use performance_schema;
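Both methods can be sketched as follows. The statements are standard MySQL, but the measured query, the query ID 1 and the event ID 31 are placeholders: take the real values from the output of the preceding step. Note that SHOW PROFILE is deprecated in recent MySQL versions in favor of performance_schema.

```sql
-- Method 1: session-level profiling.
SET profiling = 1;
SELECT c FROM test1 WHERE id = 100;
SHOW PROFILES;               -- lists recent statements with their Query_ID
SHOW PROFILE FOR QUERY 1;    -- time per stage; replace 1 with the Query_ID

-- Method 2: performance_schema. Enable stage and statement instrumentation first.
UPDATE performance_schema.setup_instruments
   SET ENABLED = 'YES', TIMED = 'YES'
 WHERE NAME LIKE 'stage/%' OR NAME LIKE 'statement/%';
UPDATE performance_schema.setup_consumers
   SET ENABLED = 'YES'
 WHERE NAME LIKE 'events_stages_%' OR NAME LIKE 'events_statements_%';

-- After running the query, find its EVENT_ID ...
SELECT EVENT_ID, TRUNCATE(TIMER_WAIT / 1000000000000, 6) AS duration_s, SQL_TEXT
  FROM performance_schema.events_statements_history_long
 WHERE SQL_TEXT LIKE '%test1%';

-- ... and list the time spent in each stage (timers are in picoseconds).
SELECT EVENT_NAME AS stage, TRUNCATE(TIMER_WAIT / 1000000000000, 6) AS duration_s
  FROM performance_schema.events_stages_history_long
 WHERE NESTING_EVENT_ID = 31;  -- replace 31 with the EVENT_ID found above
```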

4. Optimizing specific SQL queries

With the methods described above we can find SQL with performance problems and the time it consumes in each stage of execution. With this information we can optimize the SQL in a targeted way. Here are a few specific cases:

1. Updating and deleting data in large tables

Data in large tables is best modified in batches. For example, to delete or update one million rows in a table of 10 million rows, it is better to work in many batches of only 5,000 rows each, which avoids blocking for a long time. To reduce the pressure on master-slave replication, pause for a few seconds after each batch. The following MySQL stored procedure is an example of how this can be done:

DELIMITER $$
USE `db_name`$$
DROP PROCEDURE IF EXISTS `p_delete_rows`$$
CREATE DEFINER=`mysql`@`127.0.0.1` PROCEDURE `p_delete_rows`()
BEGIN
    DECLARE v_rows INT;
    SET v_rows = 1;
    -- Repeat until a batch deletes no rows
    WHILE v_rows > 0 DO
        DELETE FROM table_name WHERE id >= 9000 AND id <= 290000 LIMIT 5000;
        -- Number of rows removed by the DELETE above
        SELECT ROW_COUNT() INTO v_rows;
        -- Pause between batches to give replication time to catch up
        DO SLEEP(5);
    END WHILE;
END$$
DELIMITER ;

You can adapt this stored procedure to your own situation, or implement the same process in a programming language you are familiar with. To reuse the stored procedure, you only need to change the statement DELETE FROM table_name WHERE id >= 9000 AND id <= 290000 LIMIT 5000;

2. How to modify the structure of a large table

With the InnoDB storage engine, changing a column's data type or width still locks the table, and it does not solve the problem of master-slave replication delay either.

Solution:

Create a new table on the master server whose structure is the structure after the modification. Then import the data from the old table into the new table, and create a set of triggers on the old table so that data modified in the old table is synchronized to the new table. When the old and new tables are in sync, take an exclusive lock on the old table, rename the new table to the old table's name, and finally drop the renamed old table. This completes the structural change to the large table. The benefit of this approach is that it minimizes master-slave delay and needs no locks before the rename, only a brief lock during the rename itself, which usually has no effect on the application. The drawback is that the procedure is relatively complex. Fortunately there is a tool that carries out this process for us. It is also part of Percona's MySQL toolkit and is called pt-online-schema-change:

pt-online-schema-change \
--alter="MODIFY c VARCHAR(150) NOT NULL DEFAULT ''" \
--user=root --password=password D=db_name,t=table_name \
--charset=utf8 --execute

This command changes the width of column c of table table_name in database db_name to VARCHAR(150).
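For illustration only, the manual procedure described above can be sketched in plain SQL. The sketch assumes that table_name has a primary key column id that is never updated and, besides it, only the column c; the _new and _old table names and the trigger names are hypothetical. It is not the exact sequence pt-online-schema-change runs.

```sql
-- 1. Create the new table with the modified structure.
CREATE TABLE table_name_new LIKE table_name;
ALTER TABLE table_name_new MODIFY c VARCHAR(150) NOT NULL DEFAULT '';

-- 2. Triggers on the old table forward later changes to the new table.
CREATE TRIGGER trg_table_name_ins AFTER INSERT ON table_name FOR EACH ROW
    REPLACE INTO table_name_new (id, c) VALUES (NEW.id, NEW.c);
CREATE TRIGGER trg_table_name_upd AFTER UPDATE ON table_name FOR EACH ROW
    REPLACE INTO table_name_new (id, c) VALUES (NEW.id, NEW.c);
CREATE TRIGGER trg_table_name_del AFTER DELETE ON table_name FOR EACH ROW
    DELETE FROM table_name_new WHERE id = OLD.id;

-- 3. Copy the existing rows in batches; repeat with the next id range.
--    INSERT IGNORE skips rows the triggers have already copied.
INSERT IGNORE INTO table_name_new (id, c)
SELECT id, c FROM table_name WHERE id BETWEEN 1 AND 5000;

-- 4. Swap the tables in one atomic statement, then drop the old table
--    (its triggers are dropped together with it).
RENAME TABLE table_name TO table_name_old, table_name_new TO table_name;
DROP TABLE table_name_old;
```

Copying in small id ranges follows the same reasoning as the batched deletes earlier: short transactions keep locking and replication delay low.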

3. How to optimize NOT IN and <> queries

The MySQL query optimizer can automatically convert some subqueries into joins, but it does not do so for subqueries used with NOT IN or <>. As a result the subquery table is looked up repeatedly to check whether each row meets the filter condition. If the subquery happens to run against a large table, this is very inefficient, so when writing SQL it is best to rewrite this kind of subquery as a join yourself.

Before rewrite:

SELECT id,name,email 
FROM customer 
WHERE id 
NOT IN(SELECT id FROM payment)

After the rewrite:

SELECT a.id,a.name,a.email 
FROM customer a 
LEFT JOIN payment b ON a.id=b.id 
WHERE b.id IS NULL

Replacing the NOT IN filter with a LEFT JOIN avoids querying the payment table multiple times. This is a very common way of optimizing NOT IN.
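One caveat worth checking before applying this rewrite: the two queries return the same rows only if payment.id never contains NULL, because NOT IN returns no rows at all once the subquery yields a NULL. As a hedged alternative that is not part of the original text, the same anti-join can also be written with NOT EXISTS, which behaves like the LEFT JOIN form:

```sql
-- Customers with no matching row in payment; unaffected by NULLs in payment.id.
SELECT a.id, a.name, a.email
FROM customer a
WHERE NOT EXISTS (SELECT 1 FROM payment b WHERE b.id = a.id);
```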

4. Optimizing queries with a summary table

The most common example is the number of comments on a product. If we fetch the number of comments in real time whenever a user opens the product page, the SQL will generally look like this:

SELECT COUNT(*) FROM product_comment WHERE product_id = 10001;

This SQL counts all comments with product_id = 10001. If the comment table has millions of records, the statement is slow to execute, and under heavy concurrent access it puts a lot of pressure on the database. In such cases we usually optimize with a summary table. A summary table stores data that has been aggregated ahead of time, ready for later queries to use. For this query we can optimize as follows:

CREATE TABLE product_comment_cnt(product_id INT, cnt INT);   -- create the summary table

-- Query the number of comments: the pre-aggregated count plus the comments added today
SELECT SUM(cnt) FROM (
    SELECT cnt FROM product_comment_cnt WHERE product_id = 10001
    UNION ALL
    SELECT COUNT(*) FROM product_comment WHERE product_id = 10001 AND timestr > DATE(NOW())
) AS t;
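The text does not show how the summary table is filled. One possible sketch, assuming a job that runs once a day shortly after midnight and that timestr holds the comment's creation time, rebuilds the counts for everything that the "today" branch of the query does not cover:

```sql
-- Rebuild the summary in one transaction so readers never see it empty (InnoDB).
START TRANSACTION;
DELETE FROM product_comment_cnt;
INSERT INTO product_comment_cnt (product_id, cnt)
SELECT product_id, COUNT(*)
FROM product_comment
WHERE timestr <= CURDATE()   -- complement of "timestr > DATE(NOW())"
GROUP BY product_id;
COMMIT;
```

With this design the total briefly omits the previous day's comments between midnight and the moment the job commits, so the job should run as close to midnight as practical.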

Origin www.cnblogs.com/AIPAOJIAO/p/11099602.html