MySQL index extension

Problem

Recently, I ran into a rather confusing question in "45 Lectures on MySQL Practical Combat". The question is:

Given the following table:

CREATE TABLE `geek` ( 
  `a` int(11) NOT NULL, 
  `b` int(11) NOT NULL, 
  `c` int(11) NOT NULL, 
  `d` int(11) NOT NULL, 
  PRIMARY KEY (`a`,`b`), 
  KEY `c` (`c`), 
  KEY `ca` (`c`,`a`), 
  KEY `cb` (`c`,`b`) 
) ENGINE=InnoDB;

And the following two queries:

select * from geek where c=N order by a limit 1; 
select * from geek where c=N order by b limit 1;

Question: Which index is redundant?

The author's answer is that index c has the same effect as the composite index (c, a, b), so ca is redundant while cb must be kept.

At first I did not understand this. Why would the primary key columns a and b be involved when a query uses index c? We know that the leaf nodes of a secondary index store the primary key value. A query first finds the primary key value through the secondary index, and then goes back to the table, that is, it looks up the full row in the clustered index by that primary key value.

For example, let k be a normal index and id the primary key. For select * from T where k = 5, the execution process is: search the k index tree and find the entry k=5 with id 500; look up id=500 in the id index tree and return the row; then move to the next entry in the k index tree, k=6, which does not match, so the scan ends.


(Image source: "45 Lectures on MySQL Practical Combat")
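
The lookup described above can be tried on a small table. The sketch below assumes a table T with primary key id and a normal index on k. The third column s and the sample rows are made up for illustration and are not part of the original question.

```sql
-- Hypothetical table: id is the primary key, k has a normal (secondary) index.
CREATE TABLE T (
  id INT PRIMARY KEY,
  k  INT NOT NULL DEFAULT 0,
  s  VARCHAR(16) NOT NULL DEFAULT '',
  INDEX k (k)
) ENGINE=InnoDB;

-- Illustrative rows; the row with k=5 has id=500, as in the walk-through.
INSERT INTO T VALUES
  (100, 1, 'aa'), (200, 2, 'bb'), (300, 3, 'cc'),
  (500, 5, 'ee'), (600, 6, 'ff'), (700, 7, 'gg');

-- Needs column s, which is not stored in index k, so each match requires
-- a lookup by id in the clustered index (the "back to the table" step).
SELECT * FROM T WHERE k = 5;

-- Needs only id and k, both stored in the leaf entries of index k,
-- so it can be answered from index k alone.
SELECT id FROM T WHERE k = 5;
```

Running EXPLAIN on the two SELECT statements is a simple way to see whether the second one is reported as a covering-index read.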

Index extensions

The MySQL reference manual describes how index extensions work:

InnoDB automatically extends each secondary index by appending the primary key columns to it.

For example, table t1 has the primary key (i1, i2) and a secondary index k_d on column d. InnoDB internally extends k_d by appending the primary key columns, so the index is effectively (d, i1, i2). The optimizer can take these extra columns into account for ref and range access and for join and sorting optimization.

The use_index_extensions flag of the optimizer_switch system variable controls whether the optimizer takes the primary key columns into account when deciding how to use an InnoDB table's secondary indexes. By default, use_index_extensions is enabled. To disable it, run:

SET optimizer_switch = 'use_index_extensions=off';
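
A minimal sketch of toggling the flag for the current session only and checking its state. The statements use the standard optimizer_switch syntax; the column alias ext_on is only illustrative.

```sql
-- Affects only the current session; other connections keep their setting.
SET SESSION optimizer_switch = 'use_index_extensions=off';

-- Returns 1 if the flag is currently on for this session, 0 otherwise.
SELECT @@SESSION.optimizer_switch LIKE '%use_index_extensions=on%' AS ext_on;

-- Restore the flag to its default value when done experimenting.
SET SESSION optimizer_switch = 'use_index_extensions=default';
```

Keeping the change at session scope means the experiment does not alter plans for other connections on the same server.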

The following experiment verifies this behavior. Note that it was run on MySQL 5.7.

(1) Preparation: create the table and insert data

CREATE TABLE t1 (
  i1 INT NOT NULL DEFAULT 0,
  i2 INT NOT NULL DEFAULT 0,
  d DATE DEFAULT NULL,
  PRIMARY KEY (i1, i2),
  INDEX k_d (d)
) ENGINE = InnoDB;

Insert data

INSERT INTO t1 VALUES
(1, 1, '1998-01-01'), (1, 2, '1999-01-01'),
(1, 3, '2000-01-01'), (1, 4, '2001-01-01'),
(1, 5, '2002-01-01'), (2, 1, '1998-01-01'),
(2, 2, '1999-01-01'), (2, 3, '2000-01-01'),
(2, 4, '2001-01-01'), (2, 5, '2002-01-01'),
(3, 1, '1998-01-01'), (3, 2, '1999-01-01'),
(3, 3, '2000-01-01'), (3, 4, '2001-01-01'),
(3, 5, '2002-01-01'), (4, 1, '1998-01-01'),
(4, 2, '1999-01-01'), (4, 3, '2000-01-01'),
(4, 4, '2001-01-01'), (4, 5, '2002-01-01'),
(5, 1, '1998-01-01'), (5, 2, '1999-01-01'),
(5, 3, '2000-01-01'), (5, 4, '2001-01-01'),
(5, 5, '2002-01-01');
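
Before running EXPLAIN, it can help to confirm that the data loaded as expected and to refresh the index statistics. The counts in the comments follow directly from the 25 rows inserted above.

```sql
-- 5 values of i1 x 5 values of i2 = 25 rows.
SELECT COUNT(*) FROM t1;                                    -- 25

-- One row per i1 value has d = '2000-01-01' (the rows with i2 = 3).
SELECT COUNT(*) FROM t1 WHERE d = '2000-01-01';             -- 5

-- The query used in the experiment matches one row: (3, 3, '2000-01-01').
SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01';  -- 1

-- Refresh index statistics so the optimizer's estimates reflect the new rows.
ANALYZE TABLE t1;
```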

(2) Turn off index extensions

View the optimizer switches: SHOW VARIABLES LIKE '%optimizer_switch%';

You can see that use_index_extensions=on by default, so index extensions are enabled:

index_merge=on,index_merge_union=on,index_merge_sort_union=on,index_merge_intersection=on,engine_condition_pushdown=on,index_condition_pushdown=on,mrr=on,mrr_cost_based=on,block_nested_loop=on,batched_key_access=off,materialization=on,semijoin=on,loosescan=on,firstmatch=on,duplicateweedout=on,subquery_materialization_cost_based=on,use_index_extensions=on,condition_fanout_filter=on,derived_merge=on

Turn off index extensions: SET optimizer_switch = 'use_index_extensions=off';

Run EXPLAIN: EXPLAIN SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01';

Result:

  • The primary key index PRIMARY is used, key_len is 4, and Extra is Using where. MySQL uses the primary key column i1 to find the candidate rows and then filters them with the condition d = '2000-01-01'.

(3) Turn on index extensions

Turn on index extensions: SET optimizer_switch = 'use_index_extensions=on';

Run EXPLAIN: EXPLAIN SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01';

Result:

  • The key actually used is k_d, but key_len is 8 rather than 4, ref is const,const, and Extra is Using index, which means a covering index is used.
  • Although EXPLAIN reports the key as k_d, what is really used is the (d, i1) prefix of the extended index (d, i1, i2). That explains key_len = 8 = 4 + 4 (4 bytes for the nullable DATE column d and 4 bytes for the INT column i1), the two const values in ref, and the covering-index access in Extra.

These results show that MySQL uses the extended index (k_d on d, extended to (d, i1, i2)) to execute the query more efficiently.
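
Applying the same rule to the geek table from the opening problem gives the layout sketched in the comments below. This is a sketch under the assumption that index extensions apply to geek as they do to t1. The literal 1 stands in for N, and the EXPLAIN output is left for the reader to check rather than stated here.

```sql
-- With PRIMARY KEY (a, b), InnoDB stores the secondary indexes as:
--   c  (c)    -> (c, a, b)
--   ca (c, a) -> (c, a, b)   -- same column order as c
--   cb (c, b) -> (c, b, a)   -- differs: b sorts before a

-- Entries with equal c are already ordered by a in index c,
-- so this query is a candidate for using c without an extra sort.
EXPLAIN SELECT * FROM geek WHERE c = 1 ORDER BY a LIMIT 1;

-- Ordering by b within equal c matches cb, not c.
EXPLAIN SELECT * FROM geek WHERE c = 1 ORDER BY b LIMIT 1;

-- If the plans confirm it, ca duplicates c and could be dropped.
ALTER TABLE geek DROP INDEX ca;
```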

Note: a small error in the MySQL 5.7 documentation

In step (2) above, with index extensions turned off, EXPLAIN showed key = PRIMARY, key_len = 4, and Extra = Using where.

However, the official 5.7 documentation says that the single-column index k_d is used, with Using index (a covering index) in Extra, which is completely different from my results.

After repeated experiments, I found that the results under MySQL 5.6 match the documentation. So my guess is that the index-extensions-off result in the MySQL 5.7 document https://dev.mysql.com/doc/refman/5.7/en/index-extensions.html was copied directly from the 5.6 document without being updated, and is therefore wrong for 5.7.

The underlying implementation may have changed in 5.7: for this count(*) query, the optimizer no longer chooses the secondary index with Using index and instead goes straight to the primary key PRIMARY.
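
One way to probe this guess is to force the k_d index while index extensions are off and compare that plan with the one the optimizer picks on its own. This is only a sketch; the output depends on the server version and is not reproduced here.

```sql
SET SESSION optimizer_switch = 'use_index_extensions=off';

-- Plan chosen freely by the optimizer (PRIMARY in the 5.7 experiment above).
EXPLAIN SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01';

-- Same query, but restricted to k_d so its plan can be compared directly.
EXPLAIN SELECT COUNT(*) FROM t1 FORCE INDEX (k_d)
WHERE i1 = 3 AND d = '2000-01-01';

-- Record which server version produced the plans.
SELECT VERSION();

-- Restore the default setting for this session.
SET SESSION optimizer_switch = 'use_index_extensions=default';
```

Running the same statements on a 5.6 server and a 5.7 server would show whether the difference comes from the optimizer's choice or from the documentation.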


Reference: https://dev.mysql.com/doc/refman/5.7/en/index-extensions.html

https://coderbee.net/index.php/db/20190106/1708

Origin blog.csdn.net/noaman_wgs/article/details/110568257