Data structure of MySQL indexes (InnoDB)

First of all, note that the data structure discussed here is the one stored on the hard disk, not an in-memory structure, so the number of disk I/O operations is the main cost to consider.

1. Unsuitable data structures:

1. Hash: Not suitable for range queries or fuzzy-match queries, because a hash table keeps no order among its keys. (Some database indexes do use hashing, but they support only exact matches.)

2. Red-black tree: It supports range queries and fuzzy matching, but it is a binary tree, so the tree is tall and each lookup needs many I/O operations against the hard disk.
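The difference between the two can be sketched in a few lines. This is only an illustration: a Python `dict` stands in for a hash index and a sorted list searched with `bisect` stands in for an ordered index. The sample data and the function names `hash_range` and `ordered_range` are made up for this example.

```python
import bisect

# Hypothetical sample data: indexed key -> row payload.
rows = {17: "q", 3: "a", 42: "z", 8: "c", 25: "m"}

def hash_range(index, low, high):
    # A hash index keeps no key order, so a range query
    # has to examine every key in the index.
    return sorted(k for k in index if low <= k <= high)

# Ordered index: keep the keys sorted once, up front.
sorted_keys = sorted(rows)

def ordered_range(keys, low, high):
    # Binary-search both ends of the range, then read only the slice between them.
    start = bisect.bisect_left(keys, low)
    stop = bisect.bisect_right(keys, high)
    return keys[start:stop]

print(rows[8])                            # exact match: hashing handles this well
print(hash_range(rows, 5, 30))            # range query: scans all keys
print(ordered_range(sorted_keys, 5, 30))  # range query: touches only matching keys
```

Both range functions return the same keys; the point is how much of the index each one has to read to get there.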

2. Data structure tailored for databases (B+ tree):

1. B tree (also known as B-tree):

 

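a) It is essentially an N-ary search tree: one node stores multiple keys, and a node with N keys has N+1 child nodes, because N keys divide the key space into N+1 intervals.

A lookup starts at the root node and descends one level at a time, following at each node the interval that contains the target key.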

b) Advantage over a red-black tree: each node can store multiple elements. For a fixed total number of elements, the number of nodes drops sharply and so does the height of the tree, so a query needs fewer I/O operations and runs faster.

c) Splitting and merging: a node can store multiple elements, but not an unlimited number. When an insertion pushes a node past its limit, the node must be split and some of its elements moved into new nodes. Conversely, when deletions leave a node with too few elements, it is merged with a neighboring node.
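The lookup and split steps described above can be sketched as follows. This is a minimal in-memory illustration, not how InnoDB lays out pages: the class `BTreeNode` and the functions `search` and `split` are hypothetical names, and each visited node is counted as one simulated disk read.

```python
import bisect

class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted keys stored in this node
        self.children = children or []  # empty for a leaf, else len(keys) + 1 subtrees

def search(node, key, nodes_read=1):
    """Return (found, nodes_read); every node visited models one disk read."""
    i = bisect.bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return True, nodes_read
    if not node.children:               # reached a leaf without finding the key
        return False, nodes_read
    # N keys split the key space into N + 1 intervals; descend into interval i.
    return search(node.children[i], key, nodes_read + 1)

def split(node):
    """Split an over-full node into (left, middle_key, right).

    The middle key is meant to be pushed up into the parent node.
    """
    mid = len(node.keys) // 2
    left = BTreeNode(node.keys[:mid], node.children[:mid + 1])
    right = BTreeNode(node.keys[mid + 1:], node.children[mid + 1:])
    return left, node.keys[mid], right

# A two-level tree: 2 keys in the root, so 3 children.
root = BTreeNode(
    [20, 40],
    [BTreeNode([5, 10]), BTreeNode([25, 30]), BTreeNode([45, 50, 60])],
)

print(search(root, 30))   # found in the second child after reading 2 nodes
print(search(root, 35))   # not present, also decided after 2 nodes

left, middle, right = split(BTreeNode([1, 2, 3, 4, 5]))
print(left.keys, middle, right.keys)
```

With a binary tree the same eight keys would need up to four levels, which is the height reduction point b) refers to.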

2. B+ tree:

  

Features: 

a) It is also an N-ary search tree, but a node with N keys has N child nodes (N intervals), and the last key of a node is the maximum value in its subtree.

b) The keys of a parent node are repeated in its child nodes (each one as the maximum value of a child), so the leaf level contains every element.

c) The leaf nodes are chained together as a doubly linked list, so the previous or next element can be found quickly, which makes range queries convenient.
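A range query over this layout can be sketched as below. It follows the variant described here, where each parent key is the maximum of one child. The names `Leaf`, `Internal`, `link`, `find_leaf` and `range_query` are hypothetical, and the tree is built by hand rather than by insertion to keep the example short.

```python
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted keys held by this leaf
        self.prev = None   # links that form the doubly linked list of leaves
        self.next = None

class Internal:
    def __init__(self, children):
        self.children = children
        # One key per child: the maximum key stored under that child.
        self.keys = [child.keys[-1] for child in children]

def link(leaves):
    """Chain the leaves together in key order."""
    for a, b in zip(leaves, leaves[1:]):
        a.next, b.prev = b, a

def find_leaf(node, key):
    """Descend to the leaf whose interval could contain `key`."""
    while isinstance(node, Internal):
        i = bisect.bisect_left(node.keys, key)  # first child whose maximum >= key
        if i == len(node.keys):                 # key exceeds every stored key
            return None
        node = node.children[i]
    return node

def range_query(root, low, high):
    """Collect keys in [low, high]: one descent, then a walk along the leaf list."""
    leaf = find_leaf(root, low)
    result = []
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return result
            if k >= low:
                result.append(k)
        leaf = leaf.next
    return result

leaves = [Leaf([1, 4, 7]), Leaf([10, 13, 16]), Leaf([19, 22, 25])]
link(leaves)
root = Internal(leaves)

print(root.keys)                   # the maximum of each leaf
print(range_query(root, 5, 20))    # crosses all three leaves without going back to the root
```

Only the first key of the range needs a descent from the root; the rest of the range is read by following `next` pointers.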

Advantages:

a) It is particularly good at range queries.

b) Every query ends at a leaf node, so the number of comparisons is about the same for any key and the query time is stable.

c) Since the leaf level holds the complete set of elements, the other columns of each table row can be stored in the leaf nodes, while the non-leaf nodes store only the key the index is built on (for example, the id). Non-leaf nodes therefore take up very little storage space and can be cached in memory, which reduces the number of hard disk I/O operations and improves query efficiency.
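A back-of-the-envelope estimate shows why this matters. The figures below are assumptions for illustration and do not come from the text above: a 16 KB page (the InnoDB default), an 8-byte key plus a roughly 6-byte child pointer per non-leaf entry, and about 1 KB per full row in a leaf. Page headers and fill factor are ignored.

```python
# Assumed sizes, for illustration only.
PAGE_SIZE = 16 * 1024   # bytes per page (InnoDB default page size)
ENTRY_SIZE = 8 + 6      # assumed: 8-byte key + ~6-byte child pointer
ROW_SIZE = 1024         # assumed: ~1 KB per full row stored in a leaf

fanout = PAGE_SIZE // ENTRY_SIZE        # entries that fit in one non-leaf page
rows_per_leaf = PAGE_SIZE // ROW_SIZE   # rows that fit in one leaf page

def capacity(height):
    """Rows addressable by a tree with `height` levels (the last level is leaves)."""
    return fanout ** (height - 1) * rows_per_leaf

for height in (1, 2, 3):
    print(height, capacity(height))

# Non-leaf pages of a full three-level tree: the root plus one page per root entry.
non_leaf_bytes = (1 + fanout) * PAGE_SIZE
print(round(non_leaf_bytes / (1024 * 1024), 1), "MB of non-leaf pages")
```

Under these assumptions a three-level tree addresses roughly 21.9 million rows, while all of its non-leaf pages together occupy only about 18 MB, small enough to keep in memory so that a lookup costs close to one disk read for the leaf.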

Origin blog.csdn.net/m0_73345579/article/details/132204817