A brief look at understanding MySQL indexes

First, what is an index?

An index is a data structure that helps MySQL retrieve data efficiently.

Second, what can an index do?

Indexes are critical, and the larger a table grows, the more they matter for performance. A well-chosen index can easily improve query performance by several orders of magnitude; in short, it can significantly improve query efficiency.
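
To make "several orders of magnitude" concrete, here is a minimal sketch that counts the comparisons needed to find one id with a full scan and with a binary search over sorted keys. The sorted list only stands in for an index; it is not how MySQL stores one, and the function names are made up for this illustration.

```python
def scan_lookup(ids, target):
    """Full scan: compare every id in turn. Returns (position, comparisons)."""
    steps = 0
    for pos, value in enumerate(ids):
        steps += 1
        if value == target:
            return pos, steps
    return -1, steps


def index_lookup(sorted_ids, target):
    """Binary search over sorted ids. Returns (position, comparisons)."""
    steps = 0
    lo, hi = 0, len(sorted_ids) - 1
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_ids[mid] == target:
            return mid, steps
        if sorted_ids[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, steps


ids = list(range(1, 1_000_001))           # one million ids, already sorted
scan_pos, scan_steps = scan_lookup(ids, 1_000_000)
idx_pos, idx_steps = index_lookup(ids, 1_000_000)
print(scan_steps, idx_steps)              # a million comparisons versus about twenty
```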

Third, how are indexes classified?

1. By storage structure: B-Tree index (B-Tree or B+Tree index), Hash index, full-text index, R-Tree index. This describes the form in which the index is stored.

2. By application level: normal index, unique index, composite index.

3. By the relationship between the physical order of the data and the logical order of the keys (the index): clustered index, non-clustered index.

When people talk about index types, they usually mean the application-level classification.

It is like classifying phones: by operating system there are Android phones and iOS phones, and by brand there are Huawei phones, Apple phones, and OPPO phones.

Normal index: an index that contains only a single column; a table can have multiple single-column indexes.

Unique index: the values of the indexed column must be unique, but NULL values are allowed.

Composite index: one index built from the values of multiple columns, intended for searches on a combination of columns; it is more efficient than index merge.

Clustered index: not a separate index type but a way of storing data. The details depend on the implementation; InnoDB's clustered index actually stores the B-Tree index (technically a B+Tree) and the data rows in the same structure.

Non-clustered index: any index that is not a clustered index.
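
As a rough illustration of a composite index, the sketch below keeps entries sorted by (name, birthday) and finds a combination of both columns with a single binary search. The rows and the find helper are invented for this example, and a sorted Python list is only a stand-in for the real on-disk structure.

```python
import bisect

# Hypothetical rows: (id, name, birthday)
rows = [
    (1, "Tom", "1996-01-01"),
    (2, "Ray", "1996-01-08"),
    (3, "Tom", "1996-01-13"),
    (4, "Lily", "1996-01-25"),
]

# Composite index on (name, birthday): entries are sorted by both columns
# and each entry carries the id of the row it belongs to.
composite_index = sorted((name, birthday, row_id) for row_id, name, birthday in rows)


def find(name, birthday):
    """Return the ids of rows matching both columns, using one binary search."""
    key = (name, birthday)
    start = bisect.bisect_left(composite_index, key)
    ids = []
    for entry in composite_index[start:]:
        if entry[:2] != key:
            break
        ids.append(entry[2])
    return ids


print(find("Tom", "1996-01-13"))
```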

 

Fourth, the underlying implementation of indexes

InnoDB, MySQL's default storage engine, explicitly supports only B-Tree (technically B+Tree) indexes. For frequently accessed tables, InnoDB transparently builds an adaptive hash index, that is, a hash index built on top of the B-Tree index, which can significantly speed up lookups. This happens implicitly and is transparent to the client; the client does not create or manage it directly.

Hash index

A hash index is based on a hash table and is only effective for queries that exactly match all columns of the index. For each row, the storage engine computes a hash code over all the indexed columns. The hash index stores all of these hash codes, and the hash table keeps a pointer to each data row.
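
The behavior described above can be sketched with a toy class. This is an illustration under simplifying assumptions, not MySQL's implementation: the HashIndex class, its methods, and the sample rows are all hypothetical, and list positions play the role of row pointers.

```python
from collections import defaultdict


class HashIndex:
    """Toy hash index over one or more columns (illustrative only)."""

    def __init__(self, columns):
        self.columns = columns
        self.buckets = defaultdict(list)  # hash code -> row positions ("pointers")

    def add(self, position, row):
        # One hash code is computed over all indexed columns of the row.
        code = hash(tuple(row[c] for c in self.columns))
        self.buckets[code].append(position)

    def lookup(self, rows, **values):
        # A value is needed for every indexed column; a partial key cannot be hashed.
        if set(values) != set(self.columns):
            raise ValueError("hash index needs an exact value for every indexed column")
        key = tuple(values[c] for c in self.columns)
        candidates = self.buckets.get(hash(key), [])
        # Re-check the real values to rule out hash collisions.
        return [rows[p] for p in candidates
                if tuple(rows[p][c] for c in self.columns) == key]


rows = [
    {"name": "Tom", "birthday": "1996-01-01"},
    {"name": "Lily", "birthday": "1996-01-25"},
]
index = HashIndex(["name", "birthday"])
for pos, row in enumerate(rows):
    index.add(pos, row)

print(index.lookup(rows, name="Tom", birthday="1996-01-01"))
```

Because the hash codes carry no ordering, a lookup such as "all birthdays after a given date" has nothing to work with here, which is why the class only accepts exact values.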

B-Tree index (MySQL uses B+Tree)

A B-Tree speeds up data access because the storage engine no longer needs a full table scan to find the data; the data is distributed across the nodes of the tree.

B+Tree index

The B+Tree is an improved version of the B-Tree and is the structure databases use to store indexes. Data is held in the leaf nodes, and sequential access pointers are added: each leaf node points to the address of the adjacent leaf node. For a range search, a B+Tree only needs to find the two boundary nodes and can then traverse the leaves between them, whereas a B-Tree has to visit all the nodes in the range, so the B+Tree is more efficient.
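
A minimal sketch of the linked leaves and a range scan over them follows. It models only the leaf level: the Leaf class and helper functions are hypothetical, and find_leaf is a stand-in for the root-to-leaf descent that a real B+Tree would perform.

```python
import bisect


class Leaf:
    """A B+Tree leaf: sorted keys plus a pointer to the next leaf."""

    def __init__(self, keys):
        self.keys = keys
        self.next = None


def build_leaves(sorted_keys, per_leaf):
    """Cut sorted keys into leaves and link each leaf to its right neighbor."""
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def find_leaf(leaves, key):
    """Stand-in for the tree descent: pick the leaf whose key range covers `key`."""
    first_keys = [leaf.keys[0] for leaf in leaves]
    pos = max(bisect.bisect_right(first_keys, key) - 1, 0)
    return leaves[pos]


def range_scan(start_leaf, low, high):
    """Follow the leaf chain from `start_leaf` until a key exceeds `high`."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


leaves = build_leaves(list(range(1, 21)), per_leaf=4)
print(range_scan(find_leaf(leaves, 6), 6, 14))
```

Once the first leaf is located, the scan never goes back up the tree; it only follows the next pointers.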

Example: suppose there is a student table with id as the primary key.

id  name     birthday
1   Tom      1996-01-01
2   Jann     1996-01-04
3   Ray      1996-01-08
4   Michael  1996-01-10
5   Jack     1996-01-13
6   Steven   1996-01-23
7   Lily     1996-01-25

Implementation in the MyISAM engine (secondary indexes are implemented the same way)

Implementation in InnoDB
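
The two layouts can be sketched with the student table. In the MyISAM-style layout, every index maps a key to the position of the row in a separate data file; in the InnoDB-style layout, the primary (clustered) index holds the rows themselves and a secondary index stores primary key values. Python dicts stand in for the B+Trees here, and all variable and function names are made up for this illustration.

```python
students = [
    (1, "Tom", "1996-01-01"),
    (2, "Jann", "1996-01-04"),
    (3, "Ray", "1996-01-08"),
    (4, "Michael", "1996-01-10"),
    (5, "Jack", "1996-01-13"),
    (6, "Steven", "1996-01-23"),
    (7, "Lily", "1996-01-25"),
]

# MyISAM-style: rows live in a separate data file (a list here). Both the
# primary index and the secondary index map a key to the row's position.
myisam_data = list(students)
myisam_primary = {row[0]: pos for pos, row in enumerate(myisam_data)}
myisam_by_name = {row[1]: pos for pos, row in enumerate(myisam_data)}


def myisam_find_by_name(name):
    # One index lookup yields the row address directly.
    return myisam_data[myisam_by_name[name]]


# InnoDB-style: the clustered index holds the full rows, and the secondary
# index stores the primary key value instead of a row address.
innodb_clustered = {row[0]: row for row in students}
innodb_by_name = {row[1]: row[0] for row in students}


def innodb_find_by_name(name):
    primary_key = innodb_by_name[name]      # step 1: secondary index -> primary key
    return innodb_clustered[primary_key]    # step 2: clustered index -> row


print(myisam_find_by_name("Lily"))
print(innodb_find_by_name("Lily"))
```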

Fifth, why is B+Tree the default index structure, rather than Hash, a binary tree, or a red-black tree?

 

B-tree: in a B-tree, data is stored in both leaf and non-leaf nodes, so each non-leaf node can hold fewer pointers (some sources call this the fan-out). With fewer pointers per node, storing a large amount of data requires a taller tree, which means more I/O operations and lower query performance.

Hash: it can locate a single key quickly, but it keeps no order, so range and sorted queries cannot use it and their I/O cost is high.

Binary tree: the height of the tree is uneven and it does not self-balance; search efficiency depends on the data (the height of the tree), and the I/O cost is high.

Red-black tree: the height of the tree grows as the amount of data grows, so the I/O cost is high.
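
The link between fan-out and tree height can be checked with a little arithmetic. The sketch below assumes one node read costs one I/O and uses hypothetical fan-outs of 100 and 1000 purely for illustration; real values depend on page size and key size.

```python
def tree_height(num_keys, fanout):
    """Smallest number of levels such that fanout ** levels >= num_keys."""
    height, capacity = 1, fanout
    while capacity < num_keys:
        capacity *= fanout
        height += 1
    return height


rows = 10_000_000
print(tree_height(rows, 2))      # balanced binary tree: two children per node
print(tree_height(rows, 100))    # hypothetical fan-out when nodes also hold row data
print(tree_height(rows, 1000))   # hypothetical fan-out when nodes hold keys only
```

With ten million keys, the binary tree needs 24 levels while the wide nodes need only 4 or 3, which is the difference in node reads per lookup.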

 

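Sixth, why is an auto-increment primary key officially recommended?

Given how a B+Tree works, auto-increment primary keys are consecutive, so inserts cause as few page splits as possible, and when a page does have to split, only a small part is affected. It also reduces data movement, because every insert goes at the end. In short, it reduces how often pages are split and data is moved.

Inserting consecutive data:

Inserting non-consecutive data: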
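
The difference between the two insert patterns can be simulated with a toy model. The sketch assumes tiny pages of four keys, and it assumes that an insert landing at the very end of the last full page simply opens a new page, while an insert into the middle of a full page splits it and moves the upper half. That is a simplification for illustration, not InnoDB's actual page management, and the names are hypothetical.

```python
import bisect
import random

PAGE_CAPACITY = 4  # hypothetical: keys per page, kept tiny so splits are visible


def insert_all(keys):
    """Insert keys into an ordered chain of pages; return (pages used, keys moved)."""
    pages = [[]]   # pages kept in key order; each page is a sorted list
    moved = 0
    for key in keys:
        # The page that should hold the key is the last one whose first key <= key.
        first_keys = [page[0] for page in pages[1:]]
        idx = bisect.bisect_right(first_keys, key)
        page = pages[idx]
        if len(page) < PAGE_CAPACITY:
            bisect.insort(page, key)
        elif idx == len(pages) - 1 and key > page[-1]:
            # Append at the very end: open a new page, nothing is moved.
            pages.append([key])
        else:
            # Middle of a full page: split it and move the upper half.
            bisect.insort(page, key)
            half = len(page) // 2
            pages[idx:idx + 1] = [page[:half], page[half:]]
            moved += len(page) - half
    return len(pages), moved


sequential = list(range(1, 1001))
shuffled = list(sequential)
random.Random(0).shuffle(shuffled)

seq_pages, seq_moved = insert_all(sequential)
rnd_pages, rnd_moved = insert_all(shuffled)
print(seq_pages, seq_moved)
print(rnd_pages, rnd_moved)
```

In this model the sequential run fills every page completely and moves nothing, while the shuffled run moves keys on every split and leaves pages partly empty.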

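Seventh, a brief summary

1. MySQL uses B+Tree as its index data structure.

2. When new data is added, the existing B+Tree is adjusted according to the values of the indexed columns.

3. In terms of physical storage, both B-Tree and B+Tree size their nodes by pages (4 KB for a typical operating system page; InnoDB's default page size is 16 KB). Because the internal nodes of a B+Tree store no data, a B+Tree can hold more keys in a node of the same size, which improves lookup efficiency.

4. What mainly affects MySQL lookup performance is the number of disk I/O operations, and most of that time is spent moving the disk head to the target track.

5. Under the MyISAM storage engine, the index and the data are stored separately; under InnoDB, the index and the data are stored together.

6. Under the InnoDB storage engine, secondary indexes are all built on top of the primary index. The leaf nodes of a secondary index store the value of the primary key rather than the address of the data, so every lookup through a secondary index first finds the primary key value and then uses the primary index to find the data.

7. Because of how InnoDB indexes work, if the primary key is not auto-incrementing (with id as the primary key), each insert is likely to force a reorganization of the primary B+Tree index and hurt performance. Therefore, use an auto-increment id as the InnoDB primary key whenever possible.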

 

Reposted from https://mp.weixin.qq.com/s/gNmWY8ob-QN6ZVF7e-71cA, WeChat official account 程序员的私房菜


Origin www.cnblogs.com/zsh-blogs/p/10988726.html