MySQL Performance Optimization (Part 2): EXPLAIN in Detail

1 Introduction to the EXPLAIN tool

The EXPLAIN keyword makes the optimizer simulate the execution of an SQL statement, so you can analyze performance bottlenecks in the query or in the table structure.

Put the explain keyword in front of a select statement and MySQL sets a flag on the query; running it then returns the execution plan instead of executing the SQL.

Note: if the from clause contains a subquery, that subquery is still executed and its result is placed in a temporary table.

1.1 Explain analysis example

The MySQL version used in this blog is 5.7.
Refer to the official documentation: https://dev.mysql.com/doc/refman/5.7/en/explain-output.html

# Example tables:
DROP TABLE IF EXISTS `actor`;
CREATE TABLE `actor` (
  `id` int(11) NOT NULL,
  `name` varchar(45) DEFAULT NULL,
  `update_time` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `actor` (`id`, `name`, `update_time`) VALUES (1,'a','2017-12-22 15:27:18'), (2,'b','2017-12-22 15:27:18'), (3,'c','2017-12-22 15:27:18');

DROP TABLE IF EXISTS `film`;
CREATE TABLE `film` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(10) DEFAULT NULL,
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `film` (`id`, `name`) VALUES (3,'film0'),(1,'film1'),(2,'film2');

DROP TABLE IF EXISTS `film_actor`;
CREATE TABLE `film_actor` (
  `id` int(11) NOT NULL,
  `film_id` int(11) NOT NULL,
  `actor_id` int(11) NOT NULL,
  `remark` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`),
  # composite index
  KEY `idx_film_actor_id` (`film_id`,`actor_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

INSERT INTO `film_actor` (`id`, `film_id`, `actor_id`) VALUES (1,1,1),(2,1,2),(3,2,1);

explain select * from actor;

EXPLAIN outputs one row for each table in the query; a join of two tables therefore produces two rows.

1.2 Two variants of explain

explain extended
provides extra query optimization information on top of explain. Running show warnings immediately afterwards returns the optimized query, which shows what the optimizer rewrote. It also adds a filtered column, which is a percentage: rows * filtered / 100 estimates the number of rows that will be joined with the previous table in the plan (the previous table is the one whose id value in the explain output is smaller than the current table's id).

# From 5.7 on, extended is no longer needed: plain explain already shows the partitions and filtered columns
# partitions shows the partitions the query reads (NULL if the table is not partitioned)
explain extended select * from film where id = 1;
show warnings;
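
As a rough illustration of the rows * filtered / 100 estimate, the arithmetic can be written out in a few lines of Python. The function name and the sample numbers are made up for this sketch; they do not come from a real explain run.

```python
def estimated_joined_rows(rows, filtered):
    """Estimate how many rows join with the previous table: rows * filtered / 100."""
    return rows * filtered / 100


# Hypothetical explain values: 1000 rows examined, 10% expected to pass the condition.
print(estimated_joined_rows(1000, 10.0))
```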

The rewritten SQL returned by show warnings is not necessarily executable; it only gives a rough picture of what the optimizer did.

explain partitions
adds a partitions column to the explain output. If the query runs against a partitioned table, the column lists the partitions the query will access.

1.3 The meaning of all the columns in explain

Next we will show the information for each column in explain.

1.3.1 id column

The id column holds the sequence number of each select: there are as many ids as there are selects, and the ids increase in the order in which the selects appear.

The larger the id, the higher the execution priority. Rows with the same id are executed from top to bottom (so ids are not unique), and rows whose id is NULL are executed last.
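
The ordering rule above can be sketched as a small sort in Python. The helper name and the sample plan rows are illustrative assumptions, not output captured from MySQL.

```python
def execution_order(plan_rows):
    """Order explain rows: larger id first, equal ids top to bottom, NULL ids last."""
    def sort_key(item):
        position, row = item
        if row["id"] is None:
            return (1, 0, position)        # NULL ids go to the end
        return (0, -row["id"], position)   # larger id first, then original position
    return [row for _, row in sorted(enumerate(plan_rows), key=sort_key)]


# Hypothetical plan rows, listed in the order explain would print them.
plan = [
    {"id": 1, "table": "<derived3>"},
    {"id": 3, "table": "film"},
    {"id": 2, "table": "actor"},
    {"id": None, "table": "<union1,2>"},
]
print([row["table"] for row in execution_order(plan)])
```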

1.3.2 select_type Column

select_type indicates whether the row belongs to a simple or a complex query.

  1. simple: a simple query, containing no subquery and no union
 explain select * from film where id = 2;

  2. primary: the outermost select of a complex query
  3. subquery: a subquery in the select list (not in the from clause)
  4. derived: a subquery in the from clause. MySQL stores its result in a temporary table, also called a derived table

The following example shows the primary, subquery and derived types:

 # Turn off the derived-table merge optimization introduced in MySQL 5.7
 set session optimizer_switch='derived_merge=off';
 explain select (select 1 from actor where id = 1) from (select * from film where id = 1) der;

 # Restore the default setting
 set session optimizer_switch='derived_merge=on';

  5. union: the second and subsequent selects in a union
explain select 1 union all select 1;

1.3.3 table column

This column shows which table the explain row is accessing.
When the from clause contains a subquery, the table column has the form <derivedN>, meaning that the current query depends on the query with id=N, so the query with id=N is executed first.
When there is a union, the table column of the UNION RESULT row is <union1,2>, where 1 and 2 are the ids of the select rows participating in the union.

1.3.4 type column

This column shows the join type or access type, that is, how MySQL decides to find rows in the table and roughly how many rows it has to look through.

From best to worst: system > const > eq_ref > ref > range > index > ALL

As a rule, a query should reach at least the range level, and preferably ref.
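
The ranking above can be turned into a simple check, for example when reviewing explain output in a script. This is only a sketch: the list mirrors the order given in this section, and the function name is an assumption.

```python
# Access types from best to worst, as listed in this section.
ACCESS_TYPES = ["system", "const", "eq_ref", "ref", "range", "index", "ALL"]


def meets_level(actual, required="range"):
    """Return True if the access type is at least as good as the required level."""
    return ACCESS_TYPES.index(actual) <= ACCESS_TYPES.index(required)


print(meets_level("ref"))   # ref is better than range
print(meets_level("ALL"))   # a full table scan is worse than range
```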

NULL: MySQL splits query processing into an optimization phase and an execution phase. Some queries can be resolved entirely during optimization, so no table or index has to be accessed during execution. For example, the minimum value of an indexed column can be found from the index alone (go straight to the leftmost leaf node of the B+ tree), so the table is not accessed at execution time.

explain select min(id) from film;

const, system: MySQL can optimize part of the query and turn it into a constant (see the output of show warnings). When all columns of a primary key or unique key are compared with constants, the table has at most one matching row, which is read once, so this is fast. system is a special case of const that applies when the table contains exactly one row.

 explain extended select * from (select * from film where id = 1) tmp;

# This subquery looks up by primary key and yields a constant (const), which is very efficient
select * from film where id = 1;

For the outer select, the source is a table with exactly one row, so it is even cheaper and reaches the system level, the best of all.

show warnings;

eq_ref: a join on a primary key or unique key. All parts of the primary key or unique index are used by the join, so at most one matching row is returned. This is probably the best join type after const; it never appears for a simple select.

explain select * from film_actor left join film on film_actor.film_id = film.id;

ref: unlike eq_ref, the lookup does not use a unique index but an ordinary index, or only a prefix of a unique index. The index is compared against a value and several matching rows may be found.

  1. Simple select; name is an ordinary (non-unique) index
explain select * from film where name = 'film1';

  2. Join query; idx_film_actor_id is the composite index on film_id and actor_id, and here its left prefix film_id is used
explain select film_id from film left join film_actor on film.id = film_actor.film_id;

range: a range scan, typically seen with in(), between, >, <, >= and similar operators. An index is used to retrieve the rows in a given range.

explain select * from actor where id > 1;

index: the result is obtained by scanning an entire index, usually a secondary index rather than the primary key index. The scan does not descend from the root of the index tree for a fast lookup; it walks the leaf nodes of the secondary index one by one, so it is still fairly slow. Such queries usually rely on a covering index (described later), and because a secondary index is usually smaller than the clustered index, this is normally faster than ALL.

explain select * from film;

If the requested columns can all be read from a secondary index, MySQL prefers the secondary index.

ALL: a full table scan, which reads all leaf nodes of the clustered index. This usually calls for adding an index.

explain select * from actor;

1.3.5 possible_keys column

This column shows which indexes the query might use for the lookup.
possible_keys can list indexes while key is NULL. This happens when the table holds little data and MySQL decides the index would not help much (a full table scan is estimated to be faster), so it chooses the full table scan.

If this column is NULL, no relevant index exists. In that case, check the where clause to see whether a suitable index can be created, then run explain again to see the effect.

1.3.6 key column

This column shows which index MySQL actually uses to access the table.
It is NULL if no index is used. To force MySQL to use or ignore an index listed in possible_keys, add force index or ignore index to the query.

1.3.7 key_len column

This column shows the number of bytes of the index that MySQL uses. The value tells you which columns of a composite index are actually used.
For example, the composite index idx_film_actor_id of the film_actor table consists of two int columns, film_id and actor_id, each 4 bytes long. A key_len of 4 in the result means the query uses only the first column, film_id, for the index lookup.

# key_len=4
explain select * from film_actor where film_id = 2;
# key_len=8 here
explain select * from film_actor where film_id = 2 and actor_id = 2;

The key_len calculation rules are as follows:

  • Strings: for char(n) and varchar(n), n is the number of characters, not bytes, in versions after 5.0.3. In utf8 a digit or letter takes 1 byte and a Chinese character takes 3 bytes

    • char(n): 3n bytes under utf8, sized for 3-byte characters such as Chinese
    • varchar(n): 3n + 2 bytes under utf8; the extra 2 bytes store the string length, because varchar is variable-length
  • Numeric types

    • tinyint: 1 byte
    • smallint: 2 bytes
    • int: 4 bytes
    • bigint: 8 bytes
  • Date and time types

    • date: 3 bytes
    • timestamp: 4 bytes
    • datetime: 5 bytes since MySQL 5.6.4 (8 bytes in earlier versions)
  • If the column allows NULL, 1 extra byte records whether the value is NULL
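
The rules above can be applied mechanically. The following Python sketch computes key_len for the indexes of the example tables from section 1.1; the function names and the dictionary layout are assumptions made for this illustration, and only a few column types are covered.

```python
# Bytes per character; "utf8" here is MySQL's 3-byte utf8.
CHARSET_BYTES = {"latin1": 1, "utf8": 3, "utf8mb4": 4}
# Fixed-size types taken from the list above.
FIXED_BYTES = {"tinyint": 1, "smallint": 2, "int": 4, "bigint": 8, "date": 3, "timestamp": 4}


def column_key_len(col_type, length=0, nullable=False, charset="utf8"):
    """Bytes one indexed column contributes to key_len."""
    if col_type == "char":
        size = length * CHARSET_BYTES[charset]
    elif col_type == "varchar":
        size = length * CHARSET_BYTES[charset] + 2  # 2 bytes store the length
    else:
        size = FIXED_BYTES[col_type]
    return size + (1 if nullable else 0)            # 1 byte for the NULL flag


def key_len(columns):
    """Total key_len for the index columns actually used by a query."""
    return sum(column_key_len(**column) for column in columns)


# film_actor.idx_film_actor_id: (film_id int NOT NULL, actor_id int NOT NULL)
film_actor_index = [{"col_type": "int"}, {"col_type": "int"}]
print(key_len(film_actor_index[:1]))  # only film_id used
print(key_len(film_actor_index))      # film_id and actor_id used

# film.idx_name: name varchar(10) DEFAULT NULL, utf8
print(key_len([{"col_type": "varchar", "length": 10, "nullable": True}]))
```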

Index key length is limited: InnoDB allows at most 767 bytes per indexed column with the COMPACT or REDUNDANT row format (3072 bytes with DYNAMIC or COMPRESSED). When a string is too long, MySQL handles it like a left-prefix index and indexes only the leading characters.

1.3.8 ref column

This column shows the columns or constants that are compared against the index named in the key column to look up rows. Common values are const (a constant) and a column name (for example film.id).

1.3.9 rows column

This column is MySQL's estimate of the number of rows it has to read and examine. Note that this is not the number of rows in the result set.

1.3.10 Extra column

This column displays additional information. The most important common values are:

1.3.10.1 Using index: Using a covering index

Covering index: not a kind of index but a way a query is served. All columns the query needs are contained in the index tree (covered by it), so the result can be read from the index without going back to the table. In the explain output, key shows that an index is used; if every column in the select list can be read from that index tree, the query is said to use a covering index and Extra usually shows Using index. Covering indexes mostly concern secondary indexes: the whole result comes from the secondary index alone, with no need to take the primary key found in the secondary index tree and fetch the remaining columns from the primary key index tree.

explain select film_id from film_actor where film_id = 1;

1.3.10.2 Using where

The where clause is used to filter the results, and the queried columns are not covered by an index.

explain select * from actor where name = 'a';

1.3.10.3 Using index condition

The queried columns are not fully covered by the index, and the where condition is a range on a leading index column.

explain select * from film_actor where film_id > 1;

1.3.10.4 Using temporary

MySQL needs to create a temporary table to process the query. This usually calls for optimization, and the first thing to consider is an index.

  1. actor.name has no index, so a temporary table is created to deduplicate
explain select distinct name from actor;

  2. film.name has the idx_name index, so Extra shows Using index and no temporary table is used
explain select distinct name from film;

1.3.10.5 Using filesort

An external sort is used instead of reading rows in index order. Small data sets are sorted in memory; larger ones are sorted on disk. This also usually calls for an index.

  1. actor.name has no index, so MySQL scans the whole actor table, saves the sort key name with the corresponding id, sorts by name, and then retrieves the rows
explain select * from actor order by name;

  2. film.name has the idx_name index, so Extra shows Using index
explain select * from film order by name;

1.3.10.6 Select tables optimized away

This appears when an aggregate function (such as max or min) is applied to a column that exists in an index.

explain select min(id) from film;

2 Index best practices

# Example table:
CREATE TABLE `employees` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `name` varchar(24) NOT NULL DEFAULT '' COMMENT 'name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'age',
  `position` varchar(20) NOT NULL DEFAULT '' COMMENT 'position',
  `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'hire time',
  # primary key
  PRIMARY KEY (`id`),
  # composite index
  KEY `idx_name_age_position` (`name`,`age`,`position`) USING BTREE
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COMMENT='employee records';

INSERT INTO employees(name,age,position,hire_time) VALUES('LiLei',22,'manager',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('HanMeimei',23,'dev',NOW());
INSERT INTO employees(name,age,position,hire_time) VALUES('Lucy',23,'dev',NOW());

2.1 Full value matching

Full value matching means that all columns of the composite index are used.

# name is varchar(24): key_len = 3n + 2 = 3 * 24 + 2 = 74 (the 3n + 2 rule from section 1.3.7)
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei';

# 74 from above plus 4 bytes for the int column: key_len = 78
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22;

# Most efficient, because every index column is matched
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';

2.2 Leftmost prefix rule

When an index covers multiple columns, the leftmost prefix rule applies: a query must start from the leftmost column of the index and must not skip columns in the index.
Column order of the composite index:

KEY `idx_name_age_position` (`name`,`age`,`position`)

Queries to check:

EXPLAIN SELECT * FROM employees WHERE name = 'Bill' and age = 31;
EXPLAIN SELECT * FROM employees WHERE age = 30 AND position = 'dev';
EXPLAIN SELECT * FROM employees WHERE position = 'manager';
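
The leftmost prefix rule can be modelled with a short Python function that walks the index columns in order and stops at the first column the query does not constrain. This is a simplified sketch for equality conditions only, with a hypothetical function name; the real optimizer takes more factors into account.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the equality conditions can use."""
    used = []
    for column in index_columns:
        if column not in equality_columns:
            break  # a skipped column ends the usable prefix
        used.append(column)
    return used


index = ["name", "age", "position"]
print(usable_prefix(index, {"name", "age"}))      # starts at the leftmost column
print(usable_prefix(index, {"age", "position"}))  # name is skipped
print(usable_prefix(index, {"position"}))         # name and age are skipped
```

Under this model only the first of the three queries above can use idx_name_age_position for the lookup.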

2.3 Do not do any operations on the index column

Calculations, functions, and (automatic or manual) type conversions on an indexed column make the index unusable and lead to a full table scan.

# Uses the index
EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';
# Does not use the index: the result of left(name,3) cannot be looked up in the index tree
EXPLAIN SELECT * FROM employees WHERE left(name,3) = 'LiL';

Add an ordinary index on hire_time:

ALTER TABLE `employees` ADD INDEX `idx_hire_time` (`hire_time`) USING BTREE ;
# Does not use the index: the values produced by date() cannot reliably be found in the index tree
EXPLAIN select * from employees where date(hire_time) ='2018-09-30';

Rewritten as a date range query, the index may be used:

EXPLAIN select * from employees where hire_time >='2018-09-30 00:00:00' and hire_time <='2018-09-30 23:59:59';

Restore the original index state:

ALTER TABLE `employees` DROP INDEX `idx_hire_time`;
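
When the date comes from application code, the rewrite from date(hire_time) = ... to a range can be generated rather than typed by hand. The Python sketch below is one way to do it; it uses a half-open interval (>= start, < next day) instead of the <= 23:59:59 bound shown above, which is a deliberate variation. The helper name and the %s placeholders (as used by common Python MySQL drivers) are assumptions; the query is only built here, not executed.

```python
from datetime import date, datetime, time, timedelta


def day_bounds(day):
    """Return (start, end) so that start <= ts < end covers exactly one calendar day."""
    start = datetime.combine(day, time.min)
    return start, start + timedelta(days=1)


start, end = day_bounds(date(2018, 9, 30))
fmt = "%Y-%m-%d %H:%M:%S"
query = "select * from employees where hire_time >= %s and hire_time < %s"
params = (start.strftime(fmt), end.strftime(fmt))
print(query, params)
```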

2.4 The storage engine cannot use index columns to the right of a range condition

# All three index columns are used
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 22 AND position ='manager';
# Only the first two index columns are used: for equal name values age is ordered, but across different age values position is no longer ordered, so position cannot use the index
EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age > 22 AND position ='manager';
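
The effect of a range condition can be sketched in Python as follows: equality conditions let the lookup continue to the next index column, while a range condition is still used but ends the usable prefix. The function name and the "eq" / "range" labels are assumptions made for this illustration.

```python
def columns_used_for_lookup(index_columns, predicates):
    """predicates maps a column name to "eq" or "range"."""
    used = []
    for column in index_columns:
        kind = predicates.get(column)
        if kind is None:
            break              # no condition on this column: prefix ends
        used.append(column)
        if kind == "range":
            break              # columns to the right of a range cannot be used
    return used


index = ["name", "age", "position"]
print(columns_used_for_lookup(index, {"name": "eq", "age": "eq", "position": "eq"}))
print(columns_used_for_lookup(index, {"name": "eq", "age": "range", "position": "eq"}))
```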

2.5 Prefer covering indexes

Write queries that read only from the index (the indexed columns include every selected column), and cut down on select *.

EXPLAIN SELECT name,age FROM employees WHERE name= 'LiLei' AND age = 23 AND position='manager';

EXPLAIN SELECT * FROM employees WHERE name= 'LiLei' AND age = 23 AND position ='manager';

2.6 MySQL cannot use an index with not equal (!= or <>), not in, or not exists, which leads to a full table scan

For <, >, <= and >=, MySQL's optimizer decides whether to use the index based on factors such as the fraction of rows retrieved and the table size.

EXPLAIN SELECT * FROM employees WHERE name != 'LiLei';

2.7 is null and is not null generally cannot use an index

EXPLAIN SELECT * FROM employees WHERE name is null;

2.8 A like pattern that starts with a wildcard ('%abc...') cannot use the index and becomes a full table scan

EXPLAIN SELECT * FROM employees WHERE name like '%Lei';

# The leading characters are ordered in the index, so the index can be used
EXPLAIN SELECT * FROM employees WHERE name like 'Lei%';

Question: how can like '%string%' be made to use an index?
a) Use a covering index: the selected columns must all be columns of the index

EXPLAIN SELECT name,age,position FROM employees WHERE name like '%Lei%';

b) If a covering index is not an option, a search engine may be needed

2.9 The index fails when a string value is not enclosed in quotes

EXPLAIN SELECT * FROM employees WHERE name = '1000';
EXPLAIN SELECT * FROM employees WHERE name = 1000;

2.10 Use or and in sparingly

With or and in, MySQL does not necessarily use an index. The optimizer decides based on factors such as the fraction of rows retrieved and the table size. See 2.11 Range query optimization for details.

EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' or name = 'HanMeimei';

2.11 Range query optimization

Add a single-column index on age:

ALTER TABLE `employees` ADD INDEX `idx_age` (`age`) USING BTREE ;
EXPLAIN select * from employees where age >=1 and age <=2000;

Why the index is not used: MySQL's optimizer decides whether to use the index based on factors such as the fraction of rows retrieved and the table size. In this example the optimizer probably skips the index because a single query retrieves too large a share of the data.

Optimization: split the large range into several smaller ranges

 explain select * from employees where age >=1 and age <=1000;
 explain select * from employees where age >=1001 and age <=2000;

Restore the original index state:

ALTER TABLE `employees` DROP INDEX `idx_age`;
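
Splitting a large range into smaller ones is easy to automate. The Python sketch below produces the same two sub-ranges as the queries above; the function name and the chunk size of 1000 are illustrative assumptions, and the right chunk size depends on the data.

```python
def split_range(low, high, chunk_size):
    """Split the inclusive range [low, high] into consecutive inclusive sub-ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    ranges = []
    start = low
    while start <= high:
        end = min(start + chunk_size - 1, high)
        ranges.append((start, end))
        start = end + 1
    return ranges


for start, end in split_range(1, 2000, 1000):
    print(f"select * from employees where age >= {start} and age <= {end};")
```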

3 MySQL index usage summary

like 'KK%' behaves like = constant, while like '%KK' and like '%KK%' behave like a range

-- Disable the ONLY_FULL_GROUP_BY error in MySQL 5.7
select version(), @@sql_mode;
SET sql_mode=(SELECT REPLACE(@@sql_mode,'ONLY_FULL_GROUP_BY',''));


Origin blog.csdn.net/qq_33417321/article/details/121192241