Detailed explanation of MySQL index failure

Table of contents

B+ tree structure

Test Data

Cases of index failure

Index not used

Violation of the left prefix principle

Range query breaks the index

LIKE depends on the pattern

Result set exceeds half of the table


B+ tree structure

The root cause of index failure is that the query violates the structural characteristics of the B+ tree, so the search cannot keep descending the tree. So first of all, let's review the B+ tree data structure.

If you are not familiar with B trees and B+ trees, you can read the blogger's previous article, which introduces both data structures in detail: "Data structure (8) Tree structure - B tree, B+ tree (including the complete tree building process)" on BugMan's CSDN blog.

The B+ tree is a multiway (N-ary) search tree in which every node keeps its keys in order: smaller keys go to the left and larger keys to the right. All of the data is stored in the leaf nodes, and to make range queries easy, the leaf nodes are linked to each other with pointers.

Test Data

The following is the test table structure and data used in this article.

Table Structure:

create table school_timetable
(
    id   bigint primary key,
    tid bigint,
    cid bigint
)engine = innodb
 default charset = utf8;

Table data:

insert into school_timetable value(1,1,1);
insert into school_timetable value(2,2,2);
insert into school_timetable value(3,3,3);
insert into school_timetable value(4,4,4);

Cases of index failure

Index failure can be classified into the following categories:

  • Index not used
  • Violation of the left prefix principle
  • Range query breaks the index
  • LIKE depends on the pattern
  • Result set exceeds half of the table

Index not used

Naturally, an index cannot take effect if the query never uses one. For example, if no index exists on the field in the condition, MySQL can only scan the whole table from beginning to end looking for matching rows. In the SQL execution plan this shows up as type ALL.
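As a minimal sketch, the plan can be inspected with explain on the test table. It assumes the table holds only the test data above and has no secondary index yet, so the primary key on id is the only index:

```sql
-- tid has no index, so MySQL has to read every row
explain select * from school_timetable where tid = 1;

-- for comparison: a lookup on the primary key does use an index
explain select * from school_timetable where id = 1;
```

The first statement should report type ALL with key NULL, while the second should report type const, because id is the primary key.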

Violation of the left prefix principle

The left prefix principle means that a composite index can be fully used only when the query conditions cover a continuous leftmost segment of the index fields.

Note: The left prefix principle applies most strictly before MySQL 8. Starting with MySQL 8.0.13, the optimizer supports an index skip scan, which in some cases (for example, when the leading field has few distinct values and the query reads only indexed fields) lets a query use the composite index even when the condition omits the leading field. No separate single-field indexes are created; the same composite index is scanned once per distinct value of the leading field.

The left prefix principle is easy to understand from the structure of the B+ tree. In a composite index, the entries are sorted by the order of the indexed fields: first by the first field, then by the second field where the first is equal, and so on.

Taking the composite index above as an example, suppose our query condition is:

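number = 10001 and birthday = '2001-09-03'

After locating number = 10001, jumping directly to birthday no longer works. Within the entries for that number, the rows are ordered by the field that was skipped, so birthday is not in sorted order and the smaller-left, larger-right property cannot be used. MySQL can only scan all the remaining entries for number = 10001 and check birthday one by one. In the SQL execution plan, this shows up as the index being used only up to the place where the prefix is broken, which is visible as a shorter key_len.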

The following is a summary of the various violations of the left prefix principle:
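The cases can be tried on the test table with a sketch like the one below. The index name idx_tid_cid is an assumption made for illustration; the article does not define it.

```sql
-- hypothetical composite index on the test table
create index idx_tid_cid on school_timetable (tid, cid);

-- covers the whole index: both tid and cid can be used for positioning
explain select * from school_timetable where tid = 1 and cid = 1;

-- covers the leftmost prefix: only the tid part is used
explain select * from school_timetable where tid = 1;

-- skips tid, so the left prefix principle is violated
explain select * from school_timetable where cid = 1;

-- clean up so that later experiments start from the original table
drop index idx_tid_cid on school_timetable;
```

In the output, key_len shows how much of the index is used for positioning. Because InnoDB stores the primary key in every secondary index, idx_tid_cid happens to contain every column of this small table, so the last query may be reported as a full index scan (type index) instead of ALL, and MySQL 8.0.13 or later may choose a skip scan. Either way, the tree cannot be used to jump straight to the matching entries.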

Range query breaks the index

A range query breaks the index: if a range condition appears on a field in the middle of the composite index, the index fields after it can no longer be used for positioning. In the SQL execution plan this shows up as type range. The following uses our test data as an example:

index(tid, cid)

The reason is easy to understand from the structure of the B+ tree. A range condition yields a span of index entries rather than one specific position, and within that span the next field is not globally ordered, so it has to be compared against every entry in the range.
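A hedged sketch for comparing the two situations on the test table follows; the index name idx_tid_cid is made up for illustration.

```sql
-- hypothetical composite index matching index(tid, cid)
create index idx_tid_cid on school_timetable (tid, cid);

-- equality on both fields: both parts of the index can position the search
explain select * from school_timetable where tid = 2 and cid = 2;

-- range on tid: cid can only be checked entry by entry inside the range
explain select * from school_timetable where tid > 1 and cid = 2;

-- clean up
drop index idx_tid_cid on school_timetable;
```

If the optimizer picks idx_tid_cid for both statements, the second plan should show a smaller key_len than the first, which is the visible sign that cid no longer takes part in positioning. With only four rows the optimizer may choose a different plan, so a larger data set gives more stable results.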

LIKE depends on the pattern

Whether the index fails when using like depends on which of the following two situations applies:

  • The pattern does not start with %
  • The pattern starts with %

If the pattern does not start with the wildcard %, the query is a range query, and the type in the SQL execution plan is range. If the pattern starts with %, the type falls directly to ALL. This makes sense: with a leading wildcard the beginning of the value is unknown, so every row has to be compared one by one and the ordering of the B+ tree cannot be used at all.

If a leading % is unavoidable, you can use a covering index, which pulls the type back from a full table scan (ALL) to a full index scan (index). This is the main optimization available with an ordinary B+ tree index. The principle of covering indexes will be discussed in a follow-up article on SQL index optimization.
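The test table has no string field, so the sketch below uses a hypothetical table. The table teacher_demo, its fields, its index idx_name, and its rows are all assumptions introduced only for this illustration.

```sql
-- hypothetical table with an indexed string field
create table teacher_demo
(
    id     bigint primary key,
    name   varchar(32),
    remark varchar(64),
    index idx_name (name)
) engine = innodb
  default charset = utf8;

insert into teacher_demo values
    (1, 'alice', 'a'), (2, 'alan', 'b'), (3, 'bob', 'c'), (4, 'carol', 'd');

-- no leading %: the known prefix 'al' bounds a range in the index
explain select * from teacher_demo where name like 'al%';

-- leading %: no prefix to position with, and remark forces a table read
explain select * from teacher_demo where name like '%ol';

-- leading %, but only id and name are read, so idx_name covers the query
explain select id, name from teacher_demo where name like '%ol';
```

The three plans are expected to show range, ALL, and index respectively, with Using index in the Extra field of the last one. On a table this small the optimizer may decide otherwise for the first query, so treat the output as something to check, not a guarantee.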

Result set exceeds half of the table

When the number of matching rows exceeds roughly half of the total, MySQL usually abandons the index and performs a full table scan. The threshold is a rule of thumb, not a fixed number, because the optimizer decides by estimated cost. Past that point, a full table scan is cheaper than walking the index and then going back to the table for more than half of the rows.

A result set that large means the index filters poorly. Locating more than half of the rows through a secondary index and then looking up each one in the table can cost more disk I/O and CPU than simply reading the table in order.

Therefore, to improve query performance, MySQL usually chooses a full table scan and avoids the overhead of the lookups back to the table.
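The effect can be explored on the test table with a sketch like this one. The index name idx_tid is an assumption for illustration, and the plans the optimizer actually picks depend on its statistics and cost estimates.

```sql
-- hypothetical single-field index; select * still needs cid from the table
create index idx_tid on school_timetable (tid);

-- refresh the statistics the optimizer bases its estimates on
analyze table school_timetable;

-- matches one row out of four: the index is worth using
explain select * from school_timetable where tid = 4;

-- matches every row: the optimizer is likely to prefer a full table scan
explain select * from school_timetable where tid >= 1;

-- force the index to compare its plan with the one chosen above
explain select * from school_timetable force index (idx_tid) where tid >= 1;

-- clean up
drop index idx_tid on school_timetable;
```

Comparing the second and third plans shows what the optimizer gave up: idx_tid appears under possible_keys in the second plan even if key is NULL, which means the index was considered and rejected on cost.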

Reposted from: blog.csdn.net/Joker_ZJN/article/details/130591264