Database -> MySQL index data structures and their advantages and disadvantages

Each data structure that can back a MySQL index has its own advantages and disadvantages

  • The default index implementation of the InnoDB storage engine is the B+ tree index.

  • A hash index is backed by a hash table. When most queries are equality lookups for a single record, a hash index can be chosen because it gives the fastest lookups; in most other scenarios, the BTree index is recommended.

B+ tree:

  • A B+ tree is a balanced multi-way tree. One node can hold many keys, so when the data set is huge the tree is much shorter than a binary tree holding the same data, and a lookup needs fewer node visits (and therefore fewer disk reads)
  • The leaf nodes of a B+ tree are linked by pointers and kept in index order, so each leaf has a reference to its neighbors. A range search can locate the first matching leaf and then move left or right along the leaves, which makes range searches much more efficient
  • Therefore, when the data is huge, B+ trees are widely used in scenarios such as databases and file systems.
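
The two points above (low height and linked leaves) can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not how InnoDB is implemented: `min_height`, `Leaf`, `build_leaves`, and `range_scan` are hypothetical names, the fan-out of 100 is an arbitrary example value, and the scan starts from the first leaf, whereas a real B+ tree would first descend from the root to the leaf that contains the lower bound.

```python
def min_height(num_keys, fanout):
    """Smallest number of levels such that fanout ** levels >= num_keys."""
    levels = 1
    capacity = fanout
    while capacity < num_keys:
        capacity *= fanout
        levels += 1
    return levels


class Leaf:
    """One leaf page: a sorted run of keys plus a pointer to the next leaf."""

    def __init__(self, keys):
        self.keys = keys
        self.next = None


def build_leaves(sorted_keys, per_leaf):
    """Split sorted keys into leaves and link each leaf to its right sibling."""
    leaves = [Leaf(sorted_keys[i:i + per_leaf])
              for i in range(0, len(sorted_keys), per_leaf)]
    for left, right in zip(leaves, leaves[1:]):
        left.next = right
    return leaves[0] if leaves else None


def range_scan(first_leaf, low, high):
    """Collect keys in [low, high] by walking the leaf chain left to right."""
    result = []
    leaf = first_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # keys are ordered, so nothing later can match
            if key >= low:
                result.append(key)
        leaf = leaf.next
    return result


binary_levels = min_height(1_000_000, 2)    # binary tree, best case
wide_levels = min_height(1_000_000, 100)    # multi-way tree, fan-out 100
head = build_leaves(list(range(1, 21)), per_leaf=4)
print(binary_levels, wide_levels, range_scan(head, 6, 11))
```

With one million keys, even a perfectly balanced binary tree needs about 20 levels, while a tree with 100 entries per node needs 3, and the range scan never has to go back up the tree once it is walking the leaves.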

Hash index:

  • A hash index uses a hash algorithm to convert the key value into a new hash value

  • Retrieval does not have to descend level by level from the root node to a leaf node as in a B+ tree; a single hash computation locates the corresponding position immediately, so it is very fast

  • For equality queries, the hash index has a clear advantage, because one hash computation is enough to find the corresponding key value;

  • The premise is that the key values are unique. If a key value is not unique, you first find the location of the key and then scan along the linked list until you find the matching data;

  • For range queries, the hash index is of no use, because key values that were originally ordered may no longer be contiguous after hashing, so the index cannot be used to complete the range query;

  • A hash index also cannot be used for sorting, or for prefix fuzzy queries such as LIKE 'xxx%' (this kind of partial fuzzy query is essentially a range query);

  • With a large number of duplicate key values, the hash index is also very inefficient, because of hash collisions.
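
The behaviour described in this list can be sketched with a toy hash index that uses chaining. The class `ToyHashIndex` and its methods are hypothetical names made up for this illustration, and Python's built-in `hash` stands in for whatever hash function a real engine uses; it is not MySQL's implementation.

```python
class ToyHashIndex:
    """Toy hash index: each bucket is a list (chain) of (key, row_id) pairs."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        # One hash computation picks the bucket directly.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, row_id):
        self._bucket(key).append((key, row_id))

    def find_equal(self, key):
        # Equality lookup: only one bucket is touched, but its chain must be
        # scanned, which gets slow when many entries share the bucket.
        return [row_id for k, row_id in self._bucket(key) if k == key]

    def find_range(self, low, high):
        # Buckets are not ordered by key, so every bucket has to be scanned.
        return sorted(row_id
                      for bucket in self.buckets
                      for k, row_id in bucket
                      if low <= k <= high)


index = ToyHashIndex()
for row_id, key in enumerate([10, 20, 30, 40, 50], start=1):
    index.insert(key, row_id)
index.insert(30, 6)  # duplicate key value lands in the same chain

print(index.find_equal(30))      # touches a single bucket
print(index.find_range(20, 40))  # has to visit all buckets
```

The equality lookup touches one bucket, while the range lookup degrades into a full scan, which is the reason a hash index cannot help with range queries, sorting, or prefix matches.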

Red-black tree:

  • A red-black tree can also be searched very quickly, but it is still a binary tree, so with a lot of data the tree becomes very tall
  • So it is not used for indexing

Ordinary binary tree:

  • This is even worse: on top of the height problem, the tree is not kept balanced
  • If 1, 2, 3, 4, 5, 6 are inserted in order, 1 stays at the root and every later value is attached to the right of the previous one
  • The tree becomes
  • 1
    • 2
      • 3
        • 4
          • 5
            • 6
  • That is effectively a linked list, so a search has to walk through the nodes one by one
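
The degenerate case above is easy to reproduce with a plain, unbalanced binary search tree. The following is a minimal sketch; `Node`, `insert`, `build`, and `height` are illustrative names, not taken from any library.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key into an unbalanced binary search tree and return the root."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


def height(node):
    """Number of levels in the tree (0 for an empty tree)."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


sorted_input = build([1, 2, 3, 4, 5, 6])   # every key goes to the right
mixed_input = build([4, 2, 6, 1, 3, 5])    # same keys, different order
print(height(sorted_input), height(mixed_input))
```

The same six keys give a tree of height 6 when inserted in sorted order and height 3 when inserted in a more favourable order, so the shape of an ordinary binary tree depends entirely on the insertion order.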

So basically B+ trees are used

  • The hash index is still available and can be chosen as an option, but only when the data and the queries suit it
  • It cannot serve range queries, which is a serious limitation, so it is rarely needed


Origin blog.csdn.net/rod0320/article/details/123489961