MySQL index features

This article follows on from the previous introduction to how MySQL Server works.

What are the characteristics of clustered indexes, secondary indexes, and joint indexes?

Clustered index

When the data and the index are stored together, the index is called a clustered index.

When the data and the index are stored separately, the index is called a non-clustered index.

In the InnoDB storage engine, data and indexes are placed together in the .ibd file, and the actual row data is bound to an index. If the table has a primary key, the data is bound to the primary key. If there is no primary key, it is bound to the first unique key whose columns are all NOT NULL. If there is no such unique key either, it is bound to a hidden 6-byte row ID (an internally generated, monotonically increasing value, not a random string).

The 6-byte row ID is hidden and cannot be seen by queries. An actual InnoDB data row contains several hidden fields of this kind.

MyISAM uses non-clustered indexes because its data file (.MYD) and index file (.MYI) are stored separately.

There are clustered indexes and non-clustered indexes in InnoDB.

A table can have N indexes. Each index is an independent B+ tree, so a single table can contain multiple B+ trees.

How many copies of the data are stored in the table?

Only one copy of each data row is stored, in the leaf nodes of the clustered index.

Storing a full copy in every index would be a problem, because a table can contain N B+ trees and every row would be duplicated N times.

So only one copy of the data is stored, and the leaf nodes of the other, non-clustered indexes store the key values of the clustered index.

This is why InnoDB also has non-clustered indexes.

Take a table in which id is the primary key and name has an ordinary index.

The id primary key index stores the row data in its leaf nodes, so id is the clustered index.

The name index is an ordinary index whose leaf nodes store the id value, so the name B+ tree is a non-clustered index, also called a secondary index or auxiliary index.
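The layout described above can be sketched in Python. This is only a toy model under stated assumptions: the three rows and the age column are made up for illustration, and a dict and a sorted list stand in for the two B+ trees, which is not how InnoDB is actually implemented.

```python
from bisect import bisect_left

# Clustered index on id: each "leaf" holds the full row (the only copy of the data).
clustered = {
    1: {"id": 1, "name": "zhangsan", "age": 20},
    2: {"id": 2, "name": "lisi", "age": 25},
    3: {"id": 3, "name": "wangwu", "age": 30},
}

# Secondary index on name: sorted (name, id) pairs. The leaves hold only
# the indexed value and the primary key, never the full row.
name_index = sorted((row["name"], pk) for pk, row in clustered.items())


def ids_for_name(name):
    """Return the primary-key values stored in the name index for `name`."""
    # (name,) sorts before every (name, id) pair, so this finds the first match.
    i = bisect_left(name_index, (name,))
    ids = []
    while i < len(name_index) and name_index[i][0] == name:
        ids.append(name_index[i][1])
        i += 1
    return ids
```

Note that `name_index` never contains `age`: anything beyond the indexed column and the primary key has to come from `clustered`.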

An index is a concrete physical structure.

If the table has several unique keys and no primary key, InnoDB builds the clustered index on the first unique key, in definition order, whose columns are all NOT NULL.
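The selection order for the clustered index can be written as a small decision function. This is a sketch of the rule only; the function name, its parameters, and the returned labels are hypothetical, and `DB_ROW_ID` is the hidden row ID column that InnoDB falls back to.

```python
def choose_clustered_key(primary_key, unique_indexes):
    """Pick the columns the clustered index would be built on.

    primary_key: tuple of column names, or None if the table has no primary key.
    unique_indexes: list of (columns, all_not_null) pairs in definition order.
    """
    # 1. A primary key always wins.
    if primary_key:
        return ("PRIMARY", tuple(primary_key))
    # 2. Otherwise the first unique index whose columns are all NOT NULL.
    for columns, all_not_null in unique_indexes:
        if all_not_null:
            return ("UNIQUE", tuple(columns))
    # 3. Otherwise the hidden 6-byte row ID.
    return ("HIDDEN", ("DB_ROW_ID",))
```

For example, `choose_clustered_key(None, [(("email",), False), (("code",), True)])` skips the nullable `email` index and picks `code`.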

An index built on multiple columns is called a joint index or composite index.

Usually a single column is chosen as the index field, but in some cases several columns need to be combined into one index, which is called a joint index.

MyISAM indexes are all auxiliary indexes. MyISAM has no clustered index, and its indexes only assist queries.

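What should you pay attention to when optimizing indexes?

  • Index fields should occupy as little storage space as possible

  • Use auto-incrementing keys where the business requirements allow it

  • Index fields should avoid NULL where possible, because NULL is not the same as an empty value and does not compare equal to anything, including itself

  • Choose index columns with as high a cardinality as possible. As a rule of thumb, a column is a good candidate for an index when DV/count >= 80%, where DV is the number of distinct values and count is the number of rows; the more non-repeating values, the better

  • Do not add an index to every field; more indexes is not always better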
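The DV/count rule of thumb is easy to compute. A minimal sketch, assuming the column values are already loaded into a Python list; the function names and the sample values are hypothetical, and the 0.8 threshold is the article's own guideline rather than a hard rule.

```python
def selectivity(values):
    """Return distinct-value count divided by row count (DV/count)."""
    values = list(values)
    if not values:
        return 0.0
    return len(set(values)) / len(values)


def is_good_index_candidate(values, threshold=0.8):
    """Apply the DV/count >= 80% rule of thumb to one column's values."""
    return selectivity(values) >= threshold


# A near-unique column versus a low-cardinality column.
emails = ["a@x.com", "b@x.com", "c@x.com", "d@x.com", "d@x.com"]
genders = ["m", "f", "m", "f", "m"]
```

Here `emails` has 4 distinct values in 5 rows and passes the threshold, while `genders` has only 2 distinct values in 5 rows and does not.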

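What causes an index to become ineffective?

  • Avoid frequently modifying indexed fields

  • In a LIKE query, do not put % on the left side of the pattern

  • Do not apply expressions to an indexed field

  • Avoid implicit type conversion when using an indexed field

  • Do not apply functions to an indexed field

  • When using a composite index, follow the leftmost matching principle

  • IN or OR can make an index ineffective in many cases, but this has to be judged against the actual situation

  • When a column in the middle of a composite index is used in a range query, the index columns after it cannot be used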
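The LIKE rule follows from how a sorted index is searched. The sketch below uses a sorted Python list as a stand-in for the index leaves; the sample names and the function names are made up for illustration. A known prefix lets a binary search jump to the first match, while a leading wildcard leaves nothing to search on, so every key has to be checked.

```python
from bisect import bisect_left

# Stand-in for the ordered leaf entries of an index on name.
names = sorted(["li", "lisi", "wangwu", "zhangsan", "zhao"])


def like_prefix(sorted_keys, prefix):
    """LIKE 'prefix%': binary-search to the first match, then read a contiguous run."""
    start = bisect_left(sorted_keys, prefix)
    end = start
    while end < len(sorted_keys) and sorted_keys[end].startswith(prefix):
        end += 1
    return sorted_keys[start:end]


def like_suffix(sorted_keys, suffix):
    """LIKE '%suffix': the sort order does not help, so every key is examined."""
    return [key for key in sorted_keys if key.endswith(suffix)]
```

The same reasoning applies to expressions and functions on an indexed field: the index is ordered by the stored value, not by the computed one.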

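Back-to-table lookup

select * from table where name='zhangsan';

id is the primary key and name has an ordinary index. MySQL first searches the name B+ tree for the name value, finds the matching leaf node, and takes the id value from it. It then searches the id B+ tree with that id to fetch the full row. This process is called a back-to-table lookup. It is relatively inefficient, because it needs a second B+ tree search to go back to the table's data records, so avoid it where possible.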
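The two-step lookup can be traced with a toy model. This is a sketch under stated assumptions: plain dicts stand in for the two B+ trees, the rows and the age column are hypothetical, and the returned trace simply records which structure was visited.

```python
# Clustered index on id: holds the full rows.
clustered = {
    1: {"id": 1, "name": "zhangsan", "age": 20},
    2: {"id": 2, "name": "lisi", "age": 25},
}

# Secondary index on name: maps each name to the matching primary keys.
name_index = {"zhangsan": [1], "lisi": [2]}


def select_all_by_name(name):
    """Model `select * ... where name = ?` and record every tree visited."""
    trace = ["name_index"]            # step 1: search the name B+ tree
    rows = []
    for pk in name_index.get(name, []):
        trace.append("clustered")     # step 2: go back to the table, once per id
        rows.append(clustered[pk])
    return rows, trace
```

For `select_all_by_name("zhangsan")` the trace is `["name_index", "clustered"]`: one search to find the id, one more to fetch the row.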

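Covering index

If the query needs only id and name, MySQL searches the name B+ tree and obtains both values directly from it, with no need to search the id B+ tree. This is called index covering: the leaf nodes of the index contain all the data the query asks for. Prefer covering indexes where possible.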
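Whether a query is covered comes down to a set comparison between the columns it selects and the columns available in the secondary index leaves. A minimal sketch; the function name and parameters are hypothetical, and the caller is assumed to expand `select *` into the explicit column list.

```python
def is_covered(selected_columns, index_columns, primary_key_columns):
    """Return True if the index alone can answer the query.

    A secondary index leaf holds the indexed columns plus the primary key,
    so those are the only columns available without a back-to-table lookup.
    """
    available = set(index_columns) | set(primary_key_columns)
    return set(selected_columns) <= available
```

With an index on name and primary key id, `is_covered(["id", "name"], ["name"], ["id"])` is true, while also selecting a hypothetical `age` column makes it false and forces a back-to-table lookup.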

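Leftmost matching principle

id is the primary key, and name and age form a composite index. Lookups must match the index columns from left to right. Of the example SQL statements, the first and third satisfy this principle.

Swapping the order of age and name in the WHERE clause does not change the outcome, because the optimizer reorders the conditions to match the index, so such a query also satisfies the principle. When index entries are compared, the first column is compared first and then the second, because the second column only matters when the first is equal.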
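The rule can be sketched as a function that walks the index columns from the left and stops at the first one without a condition. The function name is hypothetical and the model covers equality conditions only. Because the conditions are treated as a set, their order in the WHERE clause makes no difference, which mirrors the optimizer's reordering.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the given equality conditions can use."""
    conditions = set(equality_columns)   # order in the WHERE clause is irrelevant
    usable = []
    for column in index_columns:
        if column not in conditions:
            break                        # a gap ends the match; later columns are unusable
        usable.append(column)
    return usable
```

With a composite index on `["name", "age"]`, conditions on name and age in either order use both columns, a condition on name alone uses the first column, and a condition on age alone uses none.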

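Index condition pushdown

select * from table where name=? and age=?

Before index condition pushdown, the storage engine fetches the rows that match name, and the server layer then filters them by age. With index condition pushdown, the storage engine filters on both name and age using the index and returns only the matching rows to the server, so the server no longer needs to filter on age. The age filter that used to run in the server layer is pushed down to the storage engine layer.

Index condition pushdown has been enabled by default since MySQL 5.6. Because both fields are filtered together, fewer rows are fetched and the amount of I/O drops.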
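The saving can be shown by counting how many full rows the storage engine has to fetch. This toy model follows the article's simplified description; the rows are made up, the `query` function and its `pushdown` flag are hypothetical, and a linear scan over the index entries is used for brevity instead of a tree search.

```python
# Composite index on (name, age): entries are (name, age, id).
index_entries = sorted([
    ("zhangsan", 20, 1),
    ("zhangsan", 25, 2),
    ("zhangsan", 30, 3),
    ("lisi", 20, 4),
])

# Clustered index on id: holds the full rows.
clustered = {pk: {"id": pk, "name": n, "age": a} for n, a, pk in index_entries}


def query(name, age, pushdown):
    """Return (matching rows, number of full rows fetched from the clustered index)."""
    fetched = 0
    result = []
    for entry_name, entry_age, pk in index_entries:
        if entry_name != name:
            continue
        # With pushdown, the engine checks age on the index entry before fetching.
        if pushdown and entry_age != age:
            continue
        row = clustered[pk]
        fetched += 1
        # Server-layer filter on age (a no-op when the engine already filtered).
        if row["age"] == age:
            result.append(row)
    return result, fetched
```

Both modes return the same single row for `("zhangsan", 25)`, but without pushdown three full rows are fetched and two are thrown away, while with pushdown only one is fetched.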


Origin blog.csdn.net/qq_16485855/article/details/129413635