The difference between MySQL's B-Tree index and Hash index (reproduced)

Because of the particular structure of a Hash index, its retrieval efficiency is very high. A lookup can locate the entry in a single step, whereas a B-Tree index needs several I/O accesses, from the root node through the branch nodes and finally to the leaf node. For a single-value lookup, the query efficiency of a Hash index is therefore much higher than that of a B-Tree index.

Many people may wonder: if a Hash index is so much more efficient than a B-Tree index, why doesn't everyone use Hash indexes instead? Everything has two sides, and the Hash index is no exception. Although it is efficient, its particular structure also brings a number of limitations and disadvantages, mainly the following.

(1) Hash indexes can only satisfy "=", "IN" and "<=>" queries and cannot be used for range queries.

Because a Hash index compares the Hash values produced by the Hash operation, it can only be used for equality filtering, not for range-based filtering: the ordering of the Hash values produced by the Hash algorithm is not guaranteed to match the ordering of the key values before the Hash operation.
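The point can be illustrated with a toy hash index. This is only a minimal sketch under stated assumptions: `ToyHashIndex`, `bucket_of` and the use of CRC32 are illustrative choices, not how any MySQL storage engine actually implements its Hash index.

```python
import zlib


def bucket_of(key: int, n_buckets: int) -> int:
    """Map a key to a bucket. CRC32 is an assumed stand-in for the engine's hash."""
    return zlib.crc32(str(key).encode()) % n_buckets


class ToyHashIndex:
    """Hypothetical hash index: buckets of (key, row_id) pairs."""

    def __init__(self, n_buckets: int = 8):
        self.n_buckets = n_buckets
        self.buckets = [[] for _ in range(n_buckets)]

    def insert(self, key: int, row_id: int) -> None:
        self.buckets[bucket_of(key, self.n_buckets)].append((key, row_id))

    def lookup_eq(self, key: int) -> list:
        """Equality lookup: hash once and inspect a single bucket."""
        bucket = self.buckets[bucket_of(key, self.n_buckets)]
        return [row_id for k, row_id in bucket if k == key]

    def lookup_range(self, lo: int, hi: int) -> list:
        """Range lookup: bucket order says nothing about key order,
        so every bucket has to be examined."""
        return sorted(
            row_id
            for bucket in self.buckets
            for k, row_id in bucket
            if lo <= k <= hi
        )


index = ToyHashIndex()
for row_id, key in enumerate([50, 10, 40, 20, 30]):
    index.insert(key, row_id)
```

Here `lookup_eq` touches one bucket, while `lookup_range` degenerates into a scan of the whole structure, which is why a range predicate gains nothing from the hash.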

(2) Hash indexes cannot be used to avoid sorting operations.

Because a Hash index stores the Hash values produced by the Hash calculation, and the ordering of those Hash values is not necessarily the same as the ordering of the key values before the Hash operation, the database cannot use the index data to avoid any sorting operation.

(3) Hash indexes cannot be used with only part of the index key.

For a composite index, the Hash value is calculated over all the composite index keys combined, not for each key separately. Therefore, when a query supplies only the first one or few keys of the composite index, the Hash index cannot be used.
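A small sketch of why a key prefix is useless to a composite Hash index. The column names, the `composite_hash` helper and the CRC32 hash are assumptions made for illustration only.

```python
import zlib


def composite_hash(*columns) -> int:
    """Hash all key columns together as one combined value (assumed scheme)."""
    joined = "\x00".join(str(c) for c in columns)
    return zlib.crc32(joined.encode())


# Hypothetical table with a composite hash index on (last_name, first_name).
rows = [("Smith", "Ann"), ("Smith", "Bob"), ("Jones", "Cat")]

hash_index = {}
for row_id, (last, first) in enumerate(rows):
    hash_index.setdefault(composite_hash(last, first), []).append(row_id)


def lookup_full(last: str, first: str) -> list:
    """Both columns known: the combined hash can be recomputed and probed."""
    candidates = hash_index.get(composite_hash(last, first), [])
    return [r for r in candidates if rows[r] == (last, first)]


def lookup_prefix(last: str) -> list:
    """Only the leading column known: the combined hash cannot be
    recomputed, so the index is bypassed and the rows are scanned."""
    return [r for r, row in enumerate(rows) if row[0] == last]
```

A B-Tree index on the same two columns keeps entries sorted by `last_name` first, so it can serve the prefix lookup directly; the hash of the combined value carries no such structure.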

(4) Hash indexes can never avoid scanning table data.

As we already know, a Hash index applies the Hash operation to the index key and stores the resulting Hash value, together with the corresponding row pointer information, in a Hash table. Because different index keys can have the same Hash value, a query cannot be answered directly from the Hash index, even if it only needs the number of records that match a given Hash key value. The actual data in the table still has to be accessed and compared to obtain the result.
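The following sketch shows the extra comparison step. The hash function is deliberately weak (string length) so that collisions are easy to see; it and the `table` contents are assumptions for illustration.

```python
def weak_hash(key: str) -> int:
    """Deliberately weak hash (the string length) to force collisions."""
    return len(key)


# Hypothetical table: the row pointer is simply the list position.
table = ["ann", "bob", "carol", "bob"]

# The index stores only hash value -> row pointers, not the original keys.
hash_index = {}
for row_ptr, key in enumerate(table):
    hash_index.setdefault(weak_hash(key), []).append(row_ptr)


def lookup(key: str):
    """Return (matching row pointers, number of table rows that had to be read)."""
    candidates = hash_index.get(weak_hash(key), [])
    # Every candidate row must be fetched and compared with the real key.
    matches = [ptr for ptr in candidates if table[ptr] == key]
    return matches, len(candidates)
```

Looking up `"bob"` finds three candidates with the same hash value but only two real matches, and a key that is not in the table at all (`"eve"`) still costs three row reads before it can be ruled out.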

(5) When a large number of Hash values are equal, the performance of a Hash index is not necessarily higher than that of a B-Tree index.

For index keys with low selectivity, a Hash index will have a large number of record pointers associated with the same Hash value. Locating a particular record then becomes expensive and wastes many accesses to table data, resulting in poor overall performance.
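A rough way to see the effect is to compare the longest pointer chain for a unique column and for a column with only two distinct values. The helper names and the data are hypothetical, and Python's built-in `hash` on small integers stands in for the real hash function.

```python
from collections import defaultdict


def build_hash_index(keys):
    """Map each hash value to the list of row pointers that share it."""
    index = defaultdict(list)
    for row_ptr, key in enumerate(keys):
        index[hash(key)].append(row_ptr)
    return index


def longest_chain(index) -> int:
    """Worst-case number of row pointers to walk for one hash value."""
    return max(len(ptrs) for ptrs in index.values())


n_rows = 1000
unique_keys = list(range(n_rows))              # high selectivity: all distinct
status_keys = [i % 2 for i in range(n_rows)]   # low selectivity: 2 distinct values

unique_chain = longest_chain(build_hash_index(unique_keys))
status_chain = longest_chain(build_hash_index(status_keys))
```

With the unique column each hash value leads to a single row pointer, while with the two-valued column each lookup has to walk 500 pointers, so the constant-time advantage of hashing disappears.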

 
 
2. B-Tree index

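The B-Tree index is the most frequently used index type in MySQL; every storage engine except Archive supports it. This is true not only of MySQL: in many other database management systems the B-Tree index is also the primary index type, mainly because its storage structure performs very well for data retrieval in databases.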
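Generally speaking, the physical files of B-Tree indexes in MySQL are mostly stored in a Balance Tree structure: all the data actually needed is stored in the Leaf Nodes of the tree, and the shortest path to any Leaf Node has exactly the same length, which is why we call it a B-Tree index. Of course, different databases (or different MySQL storage engines) may modify the storage structure slightly when storing their own B-Tree indexes. For example, the storage structure actually used by B-Tree indexes in the Innodb storage engine is a B+Tree, a small modification of the B-Tree data structure: besides the index key information, each Leaf Node also stores a pointer to the next adjacent Leaf Node, mainly to speed up retrieval across multiple adjacent Leaf Nodes.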
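The benefit of the next-leaf pointer can be sketched with a simplified leaf level. This is an illustrative model, not Innodb's page format: `Leaf`, `build_leaves` and `range_scan` are hypothetical names, keys are assumed to be unique, and a binary search over the first key of each leaf stands in for the descent from the root.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import Optional


@dataclass
class Leaf:
    keys: list
    next: Optional["Leaf"] = None  # pointer to the adjacent leaf on the right


def build_leaves(sorted_keys, capacity: int = 3):
    """Split sorted keys into leaves and chain them left to right."""
    leaves = [
        Leaf(sorted_keys[i:i + capacity])
        for i in range(0, len(sorted_keys), capacity)
    ]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves


def range_scan(leaves, lo, hi):
    """Find the starting leaf once, then follow next pointers."""
    first_keys = [leaf.keys[0] for leaf in leaves]
    # Stand-in for the root-to-leaf descent (assumes unique keys).
    start = max(bisect_right(first_keys, lo) - 1, 0)
    leaf = leaves[start]
    result = []
    while leaf is not None:
        for key in leaf.keys:
            if key > hi:
                return result
            if key >= lo:
                result.append(key)
        leaf = leaf.next  # no need to go back up to the branch nodes
    return result


leaves = build_leaves([1, 3, 5, 7, 9, 11, 13, 15])
```

Once the first matching leaf is found, the scan moves sideways through adjacent leaves and stops at the first key above the upper bound, which is what makes range queries and ordered reads cheap on this structure.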
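The Innodb storage engine has two different forms of index. One is the primary key index (Primary Key), stored in Cluster form; the other is an ordinary B-Tree index stored in essentially the same form as in other storage engines (such as MyISAM), which Innodb calls a Secondary Index. The figure below compares how these two kinds of index are stored.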

[Figure: The difference between MySQL's B-Tree index and Hash index]

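In the figure, the left side shows the Primary Key stored in Clustered form, and the right side shows an ordinary B-Tree index. The two are exactly the same as far as the Root Node and Branch Nodes are concerned; the difference appears in the Leaf Nodes. In the Primary Key, the Leaf Nodes store the actual table data, including not only the primary key column but also the other columns, arranged in order of primary key value. The Secondary Index differs little from other ordinary B-Tree indexes, except that its Leaf Nodes store the Innodb primary key value in addition to the index key information.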

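Therefore, in Innodb, accessing data through the primary key is very efficient. When data is accessed through a Secondary Index, Innodb first uses the index key to find the Leaf Node in the Secondary Index, and then uses the primary key value stored in that Leaf Node to fetch the corresponding row through the primary key index.

In the MyISAM storage engine, primary key indexes and non-primary-key indexes differ very little: the primary key index simply has a unique, non-null index key. A MyISAM index also has essentially the same storage structure as an Innodb Secondary Index. The main difference is that, besides the index key information, MyISAM Leaf Nodes store information that directly locates the corresponding row in the MyISAM data file (such as the Row Number), but not the primary key value.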
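The two lookup paths can be contrasted in a short sketch. Plain dictionaries and a list stand in for the B-Tree structures and the data file, and the table, column names and function names are all hypothetical.

```python
# Hypothetical rows of a small users table.
ROWS = [
    {"id": 10, "email": "a@example.com", "name": "Ann"},
    {"id": 20, "email": "b@example.com", "name": "Bob"},
    {"id": 30, "email": "c@example.com", "name": "Cat"},
]

# Innodb-style: the clustered primary key index holds the full rows,
# and the secondary index maps its key to the primary key value.
clustered_index = {row["id"]: row for row in ROWS}
secondary_index = {row["email"]: row["id"] for row in ROWS}


def innodb_lookup_by_email(email):
    pk = secondary_index.get(email)   # step 1: search the Secondary Index
    if pk is None:
        return None
    return clustered_index[pk]        # step 2: search the primary key index


# MyISAM-style: rows live in a separate data file, and every index
# (primary or not) maps its key to the row's position in that file.
data_file = list(ROWS)
myisam_email_index = {row["email"]: pos for pos, row in enumerate(data_file)}


def myisam_lookup_by_email(email):
    pos = myisam_email_index.get(email)  # one index search gives the row position
    return None if pos is None else data_file[pos]
```

Both functions return the same row, but the Innodb-style path performs two index searches for a non-primary-key lookup, while the MyISAM-style path goes from the index entry straight to the row's position.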
 
