EXPLAIN in Detail and Index Optimization in Practice

1. What is EXPLAIN?

The EXPLAIN keyword makes MySQL simulate how the optimizer would execute a SQL statement, so you can see how MySQL uses indexes to process the query and join tables. With that information you can analyze the performance bottlenecks of a query or table structure, choose better indexes, and write better-optimized queries. (Put plainly, it is a tool for optimizing SQL.)

2. How to use EXPLAIN

Put explain in front of the query, for example: explain select * from table. MySQL sets a flag on the query; when the query runs, it returns the execution plan instead of executing the SQL. (Before MySQL 5.6, if the from clause contains a subquery, that subquery is still executed and its result is put into a temporary table.)

3. EXPLAIN examples

The examples use three tables: actor (actors), film (movies), and film_actor (the association between films and actors).

CREATE TABLE `actor` (
  `id` int(11) NOT NULL COMMENT 'primary key id',
  `name` varchar(45) DEFAULT NULL COMMENT 'actor name',
  `update_time` datetime DEFAULT NULL COMMENT 'update time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into `actor` (`id`, `name`, `update_time`) values('1','a','2020-02-11 22:56:00');
insert into `actor` (`id`, `name`, `update_time`) values('2','b','2020-02-11 22:56:00');
insert into `actor` (`id`, `name`, `update_time`) values('3','c','2020-02-11 22:56:00');

CREATE TABLE `film` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key id',
  `name` varchar(10) DEFAULT NULL COMMENT 'film name',
  PRIMARY KEY (`id`),
  KEY `idx_name` (`name`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;

insert into `film` (`id`, `name`) values('3','film0');
insert into `film` (`id`, `name`) values('1','film1');
insert into `film` (`id`, `name`) values('2','film2');

CREATE TABLE `film_actor` (
  `id` int(11) NOT NULL COMMENT 'primary key id',
  `film_id` int(11) NOT NULL COMMENT 'film id',
  `actor_id` int(11) NOT NULL COMMENT 'actor id',
  `remark` varchar(255) DEFAULT NULL COMMENT 'remark',
  PRIMARY KEY (`id`),
  KEY `idx_film_actor_id` (`film_id`,`actor_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

insert into `film_actor` (`id`, `film_id`, `actor_id`, `remark`) values('1','1','1',NULL);
insert into `film_actor` (`id`, `film_id`, `actor_id`, `remark`) values('2','1','2',NULL);
insert into `film_actor` (`id`, `film_id`, `actor_id`, `remark`) values('3','2','1',NULL);

After running the SQL above, each of the three tables holds the three rows just inserted.

The columns of the EXPLAIN output are described below:

(1) id column

The id column is the sequence number of a select. There are as many ids as there are selects, and ids increase in the order in which the selects appear. The larger the id, the earlier that select is executed; rows with the same id are executed from top to bottom; rows whose id is NULL are executed last.
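The ordering rule for the id column can be sketched in a few lines of Python. This is only an illustrative model, not MySQL code: the function name execution_order and the (id, table) tuple layout are assumptions made for the example.

```python
from typing import List, Optional, Tuple

# One EXPLAIN row reduced to (id, table); id is None when EXPLAIN shows NULL.
# This tuple layout is an assumption made for the sketch.
PlanRow = Tuple[Optional[int], str]


def execution_order(rows: List[PlanRow]) -> List[str]:
    """Return table names in the order the id rule says they are executed."""
    with_id = [(pos, row) for pos, row in enumerate(rows) if row[0] is not None]
    without_id = [row for row in rows if row[0] is None]
    # Larger id first; rows sharing an id keep their top-to-bottom position.
    with_id.sort(key=lambda item: (-item[1][0], item[0]))
    # Rows whose id is NULL are executed last.
    return [row[1] for _, row in with_id] + [row[1] for row in without_id]


# Shape of the plan for: EXPLAIN SELECT (SELECT 1 FROM actor LIMIT 1) FROM film
print(execution_order([(1, "film"), (2, "actor")]))  # prints ['actor', 'film']
```

The real plan rows carry many more columns; only id and table matter for this rule.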

MySQL divides select queries into simple queries (SIMPLE) and complex queries (PRIMARY).

Complex queries fall into three categories: simple subqueries, derived tables (subqueries in the from clause), and union queries.

1) Simple subqueries

Execute the SQL statement: EXPLAIN SELECT (SELECT 1 FROM actor LIMIT 1) FROM film

2) Subquery in the from clause

Execute the SQL statement: EXPLAIN SELECT id FROM (SELECT id FROM film) AS der

Analysis: The subquery produces a temporary table with the alias der, and the outer select reads from that temporary table when the query is executed.

3) union query

Execute the SQL statement: EXPLAIN SELECT 1 UNION ALL SELECT 1

Analysis: The result of a union is placed in an anonymous temporary table. That temporary table does not appear in the SQL, so its id is NULL. (Avoid union where you can; it tends to perform poorly.)

(2) select_type column

This column shows whether the row belongs to a simple or a complex query and, if complex, which of the three kinds it is.

1) SIMPLE: a simple query. The query contains no subqueries and no union.

Execute the SQL statement: EXPLAIN SELECT * FROM film WHERE id = 2

2) PRIMARY: the outermost select of a complex query.

3) SUBQUERY: a subquery contained in the select list (not in the from clause).

4) DERIVED: a subquery contained in the from clause. MySQL stores its result in a temporary table, also called a derived table (hence the name DERIVED).

Execute the SQL statement: EXPLAIN SELECT (SELECT 1 FROM actor WHERE id = 1) FROM (SELECT * FROM film WHERE id = 1) der

5) UNION: the second and any later select in a union.

6) UNION RESULT: the select that reads the result from the union's temporary table.

Execute the SQL statement: EXPLAIN SELECT 1 UNION ALL SELECT 1

(3) table column

This column shows which table the row of the EXPLAIN output is accessing.

When the from clause contains a subquery, the table column has the form <derivedN>, meaning that the current query depends on the query with id = N, so the query with id = N is executed first.

When there is a union, the table column of the UNION RESULT row is <union1,2>, where 1 and 2 are the ids of the selects that take part in the union.
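As a small illustration of these two formats, the following Python sketch extracts the select ids that a table value refers to. The helper name referenced_ids is hypothetical, and the regular expressions assume the lowercase spellings <derivedN> and <unionM,N> that MySQL prints.

```python
import re
from typing import List, Tuple


def referenced_ids(table_value: str) -> Tuple[str, List[int]]:
    """Classify an EXPLAIN table value and list the select ids it refers to."""
    derived = re.fullmatch(r"<derived(\d+)>", table_value)
    if derived:
        # <derivedN>: the row depends on the select whose id is N.
        return "derived", [int(derived.group(1))]
    union = re.fullmatch(r"<union(\d+(?:,\d+)*)>", table_value)
    if union:
        # <unionM,N>: the ids of the selects combined by the union.
        return "union", [int(n) for n in union.group(1).split(",")]
    # Anything else is treated as an ordinary table name or alias.
    return "table", []


print(referenced_ids("<derived2>"))
print(referenced_ids("<union1,2>"))
```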

(4) type column

(Tip: the theory in the following sections may be confusing on first reading. That is fine; keep reading, practical examples follow.)

This column shows the join type or access type, that is, how MySQL decides to find rows in the table and roughly what range of records it examines.

From best to worst, the access types are: system > const > eq_ref > ref > range > index > ALL.

In general, a query should reach at least the range level, and preferably ref.
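The ranking can be turned into a quick check, for instance when reviewing plans by script. This is a minimal sketch; the list ACCESS_TYPES and the helper at_least are names invented for the example, and NULL is left out because it is described separately.

```python
# Access types from best to worst, in the order given in the text.
ACCESS_TYPES = ["system", "const", "eq_ref", "ref", "range", "index", "ALL"]


def at_least(actual: str, target: str = "range") -> bool:
    """True if `actual` is as good as or better than `target`."""
    return ACCESS_TYPES.index(actual) <= ACCESS_TYPES.index(target)


print(at_least("ref"))    # ref is better than range
print(at_least("index"))  # index is worse than range
```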

NULL: MySQL can resolve the query during the optimization phase, so the execution phase does not need to access the table or index at all. For example, selecting the minimum value of an indexed column can be done with a single index lookup, without reading the table during execution. This type does not appear often.

const, system: MySQL can optimize part of the query and convert it to a constant (the output of show warnings shows the result). When all columns of a primary key or unique index are compared with constants, the table has at most one matching row, which is read once, so this is fast. system is a special case of const: the table has exactly one row.

Execute the SQL statement: EXPLAIN EXTENDED SELECT * FROM (SELECT * FROM film WHERE id = 1) tmp

Analysis: The subquery SELECT * FROM film WHERE id = 1 filters on id, which is the primary key. The primary key is unique, so the result is at most one row; a lookup known to return a single row has type const, which already performs very well. The outer select reads from a derived table that holds exactly one row, so it too returns one row (the table behind the subquery may hold many rows, but the derived table does not). In this special case the type is system, the best there is.

Execute the SQL statement: EXPLAIN EXTENDED SELECT * FROM (SELECT * FROM film WHERE id = 1) tmp; SHOW WARNINGS;

Analysis: Compared with plain explain, explain extended adds a filtered column, which is a percentage. Multiplying rows by filtered / 100 estimates how many rows of this table will be joined with the next table in the plan. explain extended can also be combined with show warnings, which prints the statement as rewritten by the optimizer; this rewritten form is close to what is actually executed. For very complex SQL, however, the rewritten statement will not necessarily perform better than your original. (explain extended was deprecated in MySQL 5.7 and removed in 8.0; there, plain explain already shows filtered and can be followed by show warnings.)
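As a worked example of the rows and filtered arithmetic, here is a minimal Python sketch. The function name estimated_join_rows and the sample numbers are illustrative assumptions, not values taken from a real plan.

```python
def estimated_join_rows(rows: int, filtered: float) -> float:
    """Rows expected to survive the table's conditions: rows * filtered / 100."""
    if not 0.0 <= filtered <= 100.0:
        raise ValueError("filtered is a percentage between 0 and 100")
    return rows * filtered / 100.0


# A plan row with rows = 1000 and filtered = 50.00 suggests about 500 rows.
print(estimated_join_rows(1000, 50.0))  # prints 500.0
```

Both inputs are optimizer estimates, so the product is an estimate as well.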

eq_ref: all parts of a primary key or unique index are used in the join, so at most one matching row is returned. Apart from const, this is probably the best join type; it does not appear in a simple single-table select.

Execute the SQL statement: EXPLAIN SELECT * FROM film_actor LEFT JOIN film ON film_actor.film_id = film.id

Analysis: The output has two rows, so two tables are accessed. Their ids are equal, so they are executed from top to bottom: row 1 reads the film_actor table first, and row 2 reads the film table joined by the left join. The join condition uses film.id; because film.id is the primary key, each film_actor row can match only one row in film, so the type of row 2 is eq_ref.

ref: unlike eq_ref, this does not use a complete unique index. It uses an ordinary (non-unique) index or only a leading part of a unique index; the index is compared with a single value, and several matching rows may be found.

① Simple select query; name has an ordinary (non-unique) index.

Execute the SQL statement: EXPLAIN SELECT * FROM film WHERE NAME = "film1"

② Join query; idx_film_actor_id is a composite index on film_id and actor_id, and the query uses film_id, the leftmost prefix of that index on film_actor.

Execute the SQL statement: EXPLAIN SELECT * FROM film LEFT JOIN film_actor ON film.id = film_actor.film_id

range: a range scan usually occurs with in(), between, >, <, >= and similar operators. An index is used to retrieve the rows in a given range.

Execute the SQL statement: EXPLAIN SELECT * FROM actor WHERE id > 1

index: a full index scan, usually somewhat faster than ALL. (index scans the whole index, which is normally smaller than the table; ALL scans the table rows themselves.)

Execute the SQL statement: EXPLAIN SELECT * FROM film; (every column of the film table is part of an index)

ALL: a full table scan, meaning MySQL has to read the table from start to finish without using an index to find the rows. This usually calls for adding an index.

Execute the SQL statement: EXPLAIN SELECT * FROM actor; (the actor table has columns that are not indexed)

(5) possible_keys column

This column shows which indexes the query might use.

EXPLAIN may show a value in possible_keys while key is NULL. This happens when the table holds little data: MySQL judges that the index would help little and chooses a full table scan.

If this column is NULL, there is no relevant index. In that case, check the where clause to see whether an appropriate index can be created to improve performance, then verify the effect with explain again.

(6) key column

This column shows which index MySQL actually uses to optimize access to the table.

If no index is used, this column is NULL. To force MySQL to use or ignore an index listed in possible_keys, use force index or ignore index in the query.

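(7) key_len column

This column shows the number of bytes MySQL uses in the index. From this value you can work out which columns of the index are actually used.

For example, the composite index idx_film_actor_id of the film_actor table consists of the two int columns film_id and actor_id, and each int is 4 bytes. From key_len = 4 in the result of the query below you can infer that only the first column, film_id, is used for the index lookup.

Execute the SQL statement: EXPLAIN SELECT * FROM film_actor WHERE film_id = 2

key_len is calculated as follows:

① Strings

  • char(n): n bytes for a single-byte character set; 3n bytes with utf8
  • varchar(n): 2 extra bytes store the string length; with utf8 the total is 3n+2

② Numeric types

  • tinyint: 1 byte
  • smallint: 2 bytes
  • int: 4 bytes
  • bigint: 8 bytes

③ Date and time types

  • date: 3 bytes
  • timestamp: 4 bytes
  • datetime: 8 bytes (5 bytes, plus any fractional-seconds storage, from MySQL 5.6.4)

④ If the column allows NULL, 1 extra byte records whether the value is NULL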
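The key_len rules can be checked with a small Python calculator. This is a sketch under stated assumptions: the helper key_len and the table FIXED_BYTES are invented for the example, char is modeled as bytes_per_char × n (3 for utf8), and datetime uses the 8-byte figure from the list.

```python
from typing import Optional

# Bytes per fixed-width type, taken from the rules listed in the text.
FIXED_BYTES = {"tinyint": 1, "smallint": 2, "int": 4, "bigint": 8,
               "date": 3, "timestamp": 4, "datetime": 8}


def key_len(col_type: str, length: Optional[int] = None,
            nullable: bool = False, bytes_per_char: int = 3) -> int:
    """Bytes one column contributes to key_len (bytes_per_char=3 models utf8)."""
    if col_type in ("char", "varchar"):
        if length is None:
            raise ValueError("char/varchar need a declared length")
        size = length * bytes_per_char
        if col_type == "varchar":
            size += 2  # 2 extra bytes store the string length
    else:
        size = FIXED_BYTES[col_type]
    return size + (1 if nullable else 0)  # 1 extra byte if NULL is allowed


print(key_len("int"))                          # film_actor.film_id, prints 4
print(key_len("varchar", 10, nullable=True))   # film.name, prints 33
```

For a composite index, the key_len shown by EXPLAIN is the sum of these values over the columns actually used.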

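(8) ref column

This column shows which columns or constants are compared with the index named in the key column to look up values. Common values are const (a constant) and a column name (for example film.id).

(9) rows column

This column is MySQL's estimate of the number of rows it must read and examine. Note that this is not the number of rows in the result set.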

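(10) Extra column

This column shows additional information. Common, important values are:

Using index: the queried columns are covered by the index and the where condition is on the leading column of the index (similar to the leftmost-prefix rule for composite indexes). This is a sign of good performance. It generally means a covering index is used (the index contains every column the query needs). For InnoDB, the gain is considerable when the index is a secondary index, because the lookup back to the clustered index is avoided.

Execute the SQL statement: EXPLAIN SELECT film_id FROM film_actor WHERE film_id = 1

Using where: the queried columns are not fully covered by an index, and the where condition is not on the leading column of an index. (No index is used for the lookup, so performance is lower.)

Execute the SQL statement: EXPLAIN SELECT * FROM actor WHERE name = 'a'

Using where; Using index: the queried columns are covered by the index, and the where condition is on an index column that is not the leading column, so matching rows cannot be located directly through the index; the index is scanned instead.

Execute the SQL statement: EXPLAIN SELECT film_id FROM film_actor WHERE actor_id = 1

NULL: the queried columns are not covered by the index, but the where condition is on the leading column of the index. The index is used, but because some columns are not in it, MySQL must go back to the table for them. This is neither a pure index-only access nor a case of not using the index at all.

Execute the SQL statement: EXPLAIN SELECT * FROM film_actor WHERE film_id = 1

Using index condition: new in MySQL 5.6 (index condition pushdown). Similar to Using where: the queried columns are not fully covered by the index, and the where condition is a range on the leading column.

Execute the SQL statement: EXPLAIN SELECT * FROM film_actor WHERE film_id > 1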

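Using temporary: MySQL needs to create a temporary table to process the query. This usually calls for optimization, and an index is the first thing to consider.

① actor.name has no index, so a temporary table is created to perform the distinct. (distinct removes duplicate rows from the result.)

Execute the SQL statement: EXPLAIN SELECT DISTINCT NAME FROM actor

② film.name has the idx_name index, so Extra shows Using index and no temporary table is used.

Execute the SQL statement: EXPLAIN SELECT DISTINCT NAME FROM film

Using filesort: MySQL sorts the result in a separate sort pass instead of reading rows in index order. It goes through all matching rows according to the join type, saves the sort key and a row pointer for each, sorts by the key, and then retrieves the rows in sorted order. This, too, is usually a case for considering an index.

① actor.name has no index, so MySQL scans the whole actor table, saves the sort key name together with the corresponding id, sorts by name, and then retrieves the rows.

Execute the SQL statement: EXPLAIN SELECT * FROM actor ORDER BY name

② film.name has the idx_name index, so Extra shows Using index, because the underlying index structure is already sorted.

Execute the SQL statement: EXPLAIN SELECT * FROM film ORDER BY name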

4. Index optimization best practices

The examples use the employees table:

CREATE TABLE `employees` (
  `id` int(11) NOT NULL AUTO_INCREMENT COMMENT 'primary key id',
  `name` varchar(24) NOT NULL COMMENT 'employee name',
  `age` int(11) NOT NULL DEFAULT '0' COMMENT 'employee age',
  `position` varchar(20) NOT NULL COMMENT 'employee position',
  `hire_time` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'hire time',
  PRIMARY KEY (`id`),
  KEY `idx_name_age_position` (`name`,`age`,`position`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8;

insert into `employees` (`id`, `name`, `age`, `position`, `hire_time`) values('1','LiLei','22','manager','2020-02-13 14:22:55');
insert into `employees` (`id`, `name`, `age`, `position`, `hire_time`) values('2','HanMeimei','23','dev','2020-02-13 14:22:57');
insert into `employees` (`id`, `name`, `age`, `position`, `hire_time`) values('3','Lucy','23','dev','2020-02-13 14:22:59');

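(1) Full-value match

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name = 'LiLei'

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age =22

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age =22 AND position = 'manager'

(2) Leftmost-prefix rule

If an index covers several columns, follow the leftmost-prefix rule: the query must start from the leftmost column of the index and must not skip columns in the index.

Question: why does a composite index require the leftmost-prefix rule in order to be hit? (Hitting the index means the index is actually used.)

Many of the optimization rules below can be understood by picturing the underlying data structure of a composite index (a B+ tree). Taking an index entry such as (10002, Staff, 1996-08-03) as an example, entries are sorted first by the first column (10002), then by the second (Staff), and only last by the third (1996-08-03). Without a value for the first column there is nothing to compare against, so the index cannot be hit.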
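The sort order of a composite index can be imitated with a sorted list of tuples in Python. This is only a model of the ordering, not of a real B+ tree: the rows come from the employees table, and the helper names seek_by_name and scan_by_age are invented for the sketch.

```python
from bisect import bisect_left, bisect_right

# Index entries modeled as (name, age, position) tuples kept in sorted order,
# the way the leaf level of a composite index orders its keys.
entries = sorted([
    ("HanMeimei", 23, "dev"),
    ("LiLei", 22, "manager"),
    ("Lucy", 23, "dev"),
])


def seek_by_name(name):
    """Binary search works because the entries are ordered by name first."""
    lo = bisect_left(entries, (name,))
    hi = bisect_right(entries, (name, float("inf")))
    return entries[lo:hi]


def scan_by_age(age):
    """Ages are ordered only within one name, so every entry is checked."""
    return [e for e in entries if e[1] == age]


print(seek_by_name("LiLei"))
print([e[1] for e in entries])  # ages in index order are not sorted
print(scan_by_age(23))
```

Looking up by name can jump straight to the matching slice, while looking up by age alone has no ordering to exploit and degenerates into a scan.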

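Question: do the following SQL statements hit the index?

① EXPLAIN SELECT * FROM employees WHERE age = 22 AND position = 'manager';
② EXPLAIN SELECT * FROM employees WHERE position = 'manager';
③ EXPLAIN SELECT * FROM employees WHERE name = 'LiLei';
④ EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND position = 'manager';

Analysis:

In ①, age = 22 in the where clause is not the leftmost column of the index, so there is no need to look further: the index is not hit. The same applies to ②.

In ③, name is the leftmost column of idx_name_age_position, so the index is hit.

In ④, name hits the index but position does not: the age column is skipped, which breaks the chain, so the rows matched by name still have to be filtered on position.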
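The four cases can be replayed with a short Python sketch of the leftmost-prefix rule. The function usable_prefix is a hypothetical helper that only models which index columns can be used for the lookup; it does not reproduce the optimizer's cost decisions.

```python
from typing import List, Set

INDEX = ["name", "age", "position"]  # column order of idx_name_age_position


def usable_prefix(index: List[str], equality_cols: Set[str]) -> List[str]:
    """Index columns usable for lookup: the unbroken leading run of columns
    that have an equality condition. The walk stops at the first gap."""
    used = []
    for col in index:
        if col not in equality_cols:
            break
        used.append(col)
    return used


print(usable_prefix(INDEX, {"age", "position"}))   # case 1
print(usable_prefix(INDEX, {"position"}))          # case 2
print(usable_prefix(INDEX, {"name"}))              # case 3
print(usable_prefix(INDEX, {"name", "position"}))  # case 4
```

An empty result corresponds to the index not being hit; in case 4 only name is usable because age is missing from the conditions.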

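(3) Do not perform any operation on an indexed column (calculations, functions, automatic or manual type conversion); doing so makes the index unusable and falls back to a full table scan

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE LEFT(name, 3) = 'LiL'

(4) The storage engine cannot use index columns to the right of a range condition

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' AND age > 22 AND position = 'manager'

Analysis: key_len is 78: name accounts for 74 (3 × 24 + 2) and age, an int, for 4. So only name and age hit the index; position does not, because it is the index column to the right of the range condition on age.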
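The key_len of 78 can be reproduced with a small Python model of this rule. It is a sketch under assumptions: used_columns is a hypothetical helper, and KEY_BYTES holds the per-column byte counts for the NOT NULL utf8 columns of employees (varchar(n) counted as 3n + 2).

```python
from typing import Dict, List

INDEX = ["name", "age", "position"]  # column order of idx_name_age_position
KEY_BYTES = {"name": 74, "age": 4, "position": 62}  # 3*24+2, int, 3*20+2


def used_columns(index: List[str], conditions: Dict[str, str]) -> List[str]:
    """conditions maps column -> "eq" or "range". The lookup uses leading
    columns up to and including the first range condition, then stops."""
    used = []
    for col in index:
        kind = conditions.get(col)
        if kind is None:
            break
        used.append(col)
        if kind == "range":
            break
    return used


cols = used_columns(INDEX, {"name": "eq", "age": "range", "position": "eq"})
print(cols, sum(KEY_BYTES[c] for c in cols))  # prints ['name', 'age'] 78
```

With equality on all three columns the same model yields 74 + 4 + 62 = 140.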

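(5) Prefer covering indexes (queries that access only the index, where the index columns include the queried columns) and reduce the use of select *

Execute the SQL statement: EXPLAIN SELECT name,age FROM employees WHERE name = 'LiLei'

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name = 'LiLei'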
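Whether a select list is covered can be expressed as a simple set check. The following Python sketch uses a hypothetical helper is_covered; it assumes an InnoDB secondary index, which also stores the primary key columns, and expands select * by hand into the full column list of employees.

```python
from typing import Iterable, Sequence

INDEX = ["name", "age", "position"]  # columns of idx_name_age_position
ALL_COLUMNS = ["id", "name", "age", "position", "hire_time"]  # select *


def is_covered(selected: Iterable[str], index_cols: Sequence[str],
               primary_key: Sequence[str] = ("id",)) -> bool:
    """True if every selected column can be read from the index alone.
    InnoDB secondary indexes also carry the primary key columns."""
    available = set(index_cols) | set(primary_key)
    return set(selected) <= available


print(is_covered(["name", "age"], INDEX))  # covered by the index
print(is_covered(ALL_COLUMNS, INDEX))      # hire_time is not in the index
```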

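(6) With not-equal comparisons (!= or <>), MySQL often cannot use the index effectively and falls back to a full table scan

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name != 'LiLei'

(7) is null and is not null may likewise prevent use of the index

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name IS NULL

(8) A like pattern that starts with a wildcard ('%abc') makes the index unusable and causes a full table scan

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name LIKE '%Lei'

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name LIKE 'Lei%'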
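The difference between the two patterns comes down to whether there is literal text before the first wildcard. A minimal Python sketch, with hypothetical helper names like_prefix and can_seek_index, and without support for escaped wildcards:

```python
def like_prefix(pattern: str) -> str:
    """Literal text before the first wildcard (% or _) in a LIKE pattern.
    Escaped wildcards are not handled in this sketch."""
    for i, ch in enumerate(pattern):
        if ch in "%_":
            return pattern[:i]
    return pattern


def can_seek_index(pattern: str) -> bool:
    """An index range scan needs a non-empty literal prefix to start from."""
    return like_prefix(pattern) != ""


print(can_seek_index("Lei%"))  # has the prefix 'Lei'
print(can_seek_index("%Lei"))  # starts with a wildcard, no prefix
```

With a prefix, the sorted index can be entered at the first key that starts with that text; without one, every key has to be examined.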

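Question: how to deal with like '%string%' not hitting the index?

① Use a covering index: the queried columns must all be columns of the covering index.

Execute the SQL statement: EXPLAIN SELECT name,age,position FROM employees WHERE name LIKE '%Lei%'

② Note that when the column the covering index refers to is varchar(380) or larger, the covering index may stop working.

(9) A string compared without single quotes makes the index unusable (MySQL applies an implicit type conversion internally)

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name = 1000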
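Why the conversion defeats the index can be shown with a rough Python analogy. The helper to_number is only a stand-in for MySQL's string-to-number cast (an assumption: real MySQL parses a leading numeric prefix, this sketch maps any non-numeric text to 0), and the sample strings are made up.

```python
def to_number(text: str) -> float:
    """Rough stand-in for the string-to-number cast; non-numeric text -> 0."""
    try:
        return float(text)
    except ValueError:
        return 0.0


# The index keeps names in string order.
index_order = sorted(["1000", "1e3", "10000", "999", "1000.0", "LiLei"])
print(index_order)

# Comparing with the number 1000 converts every name first. The matching
# entries are not adjacent in string order, so no single seek finds them.
positions = [i for i, s in enumerate(index_order) if to_number(s) == 1000]
print(positions)  # prints [0, 1, 3]
```

Because several different strings convert to the same number and are scattered through the string ordering, each row has to be converted and checked, which amounts to a scan.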

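(10) Use or sparingly; conditions joined with or often make the index unusable

Execute the SQL statement: EXPLAIN SELECT * FROM employees WHERE name = 'LiLei' OR name = 'Hanmeimei'

Summary

like KK% behaves like an equality comparison with a constant, while %KK and %KK% behave like ranges.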

 

Origin www.cnblogs.com/ZekiChen/p/12295702.html