Why does MySQL use B+ trees as indexes?

Hello everyone, my name is Xiaolin.

"Why does MySQL use a B+ tree as its index?" Does this question come up often in interviews?

To answer this question, we need to consider not only the data structure itself but also the number of disk I/O operations, because MySQL data is stored on disk.

This time, I will analyze this question with you layer by layer. The article contains a lot of animations to help everyone understand. I believe that after reading it, you will have a firm grasp of this topic!

picture

What makes a good index data structure?

MySQL data is persistent, which means that the data (indexes + records) is saved to disk, so that it is not lost even if the device is powered off.

Disk is a ridiculously slow storage device. How slow?

Memory access time is measured in nanoseconds, while disk access time is measured in milliseconds. That is to say, when reading data of the same size, reading from disk is tens of thousands of times slower than reading from memory, or even hundreds of thousands of times slower.

The minimum unit of reading and writing on a disk is a sector, and the size of a sector is only 512 B. The operating system reads and writes multiple sectors at a time, so the minimum read/write unit of the operating system is a block. The block size in Linux is 4 KB, that is, 8 sectors, which means that one disk I/O operation reads or writes 8 sectors directly.

Since the database index is saved on disk, when we search for a row of data through the index, we first need to read the index from disk into memory, then use the index to locate the row on disk, and then read that row into memory. That is to say, multiple disk I/Os occur during a query, and the more disk I/Os there are, the more time is consumed.

Therefore, we hope that the index data structure can complete a query in as few disk I/O operations as possible, because the fewer disk I/O operations there are, the less time is consumed.

In addition, MySQL supports range lookups, so the data structure of the index must not only be able to efficiently query a certain record, but also be able to perform range lookups efficiently.

Therefore, a data structure suitable for a MySQL index must meet at least the following requirements:

  • It can complete a query in as few disk I/O operations as possible;
  • It can efficiently query a single record and also efficiently perform range lookups.

After analyzing the requirements, we analyze each data structure.

What is binary search?

The index data is best arranged in order, so that the "binary search method" can be used to locate the data efficiently.

Suppose we now use an array to store the index. For example, there is a sorted array below. If we want to find the number 3, the easiest way is to traverse from the beginning. The time complexity of this method is O(n), so the query efficiency is not high. Because the array is ordered, we can use binary search instead, as in the following diagram of the search process:

picture

It can be seen that the binary search method halves the range of the query every time, so that the time complexity is reduced to O(logn), but each search needs to continuously calculate the middle position.
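
As a minimal sketch of the process above, here is binary search over a sorted array in Python. The array contents and the function name are illustrative assumptions and are not taken from the figure.

```python
def binary_search(sorted_values, target):
    """Return the index of target in sorted_values, or -1 if it is absent."""
    low, high = 0, len(sorted_values) - 1
    while low <= high:
        mid = (low + high) // 2  # the middle position is recomputed on every step
        if sorted_values[mid] == target:
            return mid
        if sorted_values[mid] < target:
            low = mid + 1   # discard the left half
        else:
            high = mid - 1  # discard the right half
    return -1


values = [1, 2, 3, 4, 5, 6, 7]  # hypothetical sorted index values
print(binary_search(values, 3))
```

Each pass through the loop halves the remaining range, which is where the O(logn) bound comes from.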

What is a binary search tree?

Using an array to hold linearly sorted data is simple and easy to use, but inserting new elements is too expensive.

Inserting an element requires shifting every element after it back by one position. What if this operation happens on disk? It would be catastrophic. Because disk is hundreds of thousands of times slower than memory, we cannot keep sorted data on disk in a linear structure.

Secondly, when binary search is used for sorted arrays, the middle position must be continuously calculated for each search.

So can we design a non-linear data structure that is naturally suitable for binary search?

Yes. Please look at the trick in the figure below: take all the middle nodes that binary search would visit, connect them with pointers, and use the middle node of the whole array as the root node.

picture

How about that? It has become a binary tree, but not an ordinary binary tree: it is a binary search tree.

The characteristic of a binary search tree is that all nodes in the left subtree of a node are smaller than this node, and all nodes in the right subtree are larger than this node. So when we query data, we do not need to calculate the position of a middle node; we just compare the lookup value with the node's data.

Suppose we are looking for the node whose index value is key:

  1. If the key is greater than the root node, search in the right subtree;
  2. If the key is less than the root node, search in the left subtree;
  3. If the key is equal to the root node, the node has been found and the root node can be returned.
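
The three rules can be sketched as a short loop. The `Node` class and the sample tree (the values 1 to 7 with 4 as the root) are assumptions made for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def bst_search(node, key):
    """Walk down from node, applying the three rules, until key is found or a subtree is empty."""
    while node is not None:
        if key > node.key:
            node = node.right   # rule 1: continue in the right subtree
        elif key < node.key:
            node = node.left    # rule 2: continue in the left subtree
        else:
            return node         # rule 3: found
    return None


# Hypothetical tree built from the sorted values 1..7, with the middle value 4 as the root.
root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
found = bst_search(root, 3)
print(found.key if found else "not found")
```

No middle position is computed anywhere: the tree's pointers already encode every midpoint that binary search would visit.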

The animation demonstration of finding a node in a binary search tree is as follows, for example, to find node 3:

picture

In addition, the binary search tree solves the problem of inserting new nodes, because it is a linked structure whose nodes do not have to be stored contiguously. When inserting, the new node can be stored at any free position and attached with a pointer, whereas inserting into a linear structure requires shifting all later elements back.

The following is an animation demonstration of inserting a node in a binary search tree:

picture

Therefore, the binary search tree solves the problem of the high cost of inserting new elements in a contiguous structure, while maintaining the natural binary structure.
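
A minimal sketch of insertion, assuming a simple `Node` class and a hypothetical insertion order. Note that attaching the new node changes exactly one pointer and shifts nothing.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def bst_insert(root, key):
    """Insert key and return the (possibly new) root. Only one pointer changes."""
    if root is None:
        return Node(key)
    node = root
    while True:
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root
            node = node.left
        elif key > node.key:
            if node.right is None:
                node.right = Node(key)
                return root
            node = node.right
        else:
            return root  # key already present; this sketch ignores duplicates


def in_order(node):
    """Return the keys in ascending order."""
    if node is None:
        return []
    return in_order(node.left) + [node.key] + in_order(node.right)


root = None
for key in [4, 2, 6, 1, 3, 5, 7]:  # hypothetical insertion order
    root = bst_insert(root, key)
print(in_order(root))
```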

So can a binary search tree be used as an index data structure?

No, no. There is an extreme case that cripples a binary search tree!

When the element inserted each time is the largest element in the binary search tree, the binary search tree will degenerate into a linked list, and the time complexity of finding data becomes O(n), as shown in the following animation:

picture

Since the tree is stored on disk, accessing each node corresponds to one disk I/O operation (assuming that the size of a node is smaller than a block, the operating system's minimum read/write unit). That is to say, the height of the tree equals the number of disk I/O operations needed for each query, so the taller the tree, the worse the query performance.

Because a binary search tree can degenerate into a linked list, the time complexity of the query operation can degrade from O(logn) to O(n).

Moreover, as more elements are inserted, the height of the tree keeps increasing, which means that more disk I/O operations are required and query performance drops seriously. In addition, it does not support efficient range queries, so it is not suitable as a database index data structure.
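
To see the degeneration concretely, the sketch below inserts the same seven keys in two different orders and compares the resulting tree heights. The helper names and the key values are assumptions for illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Plain binary search tree insertion with no rebalancing."""
    if root is None:
        return Node(key)
    node = root
    while True:
        branch = "left" if key < node.key else "right"
        child = getattr(node, branch)
        if child is None:
            setattr(node, branch, Node(key))
            return root
        node = child


def height(node):
    """Number of nodes on the longest root-to-leaf path."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


def build(keys):
    root = None
    for key in keys:
        root = insert(root, key)
    return root


ascending = build([1, 2, 3, 4, 5, 6, 7])  # each new key is the largest so far
mixed = build([4, 2, 6, 1, 3, 5, 7])      # same keys, different order
print(height(ascending), height(mixed))
```

With ascending input every node has only a right child, so the height equals the number of keys (7 here), while the mixed order gives a height of 3. Under the one-node-per-I/O assumption above, that is 7 disk reads versus 3 to reach the deepest key.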

What is a self-balancing binary tree?

To solve the problem that the binary search tree degenerates into a linked list in extreme cases, the balanced binary search tree (AVL tree) was proposed.

It mainly adds a constraint on top of the binary search tree: the height difference between the left subtree and the right subtree of every node cannot exceed 1. That is to say, the left and right subtrees of every node are themselves balanced binary trees, so the time complexity of the query operation always stays at O(logn).

The following figure shows what happens when each inserted element is the largest element in the balanced binary search tree. As you can see, the tree keeps itself balanced:

picture
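
The balance constraint and one rebalancing step can be sketched as follows. This is not a full AVL implementation: it only checks the height condition and applies a single left rotation to a hypothetical chain produced by inserting 1, 2, 3 in order.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def height(node):
    return 0 if node is None else 1 + max(height(node.left), height(node.right))


def is_balanced(node):
    """True if the left and right subtree heights of every node differ by at most 1."""
    if node is None:
        return True
    if abs(height(node.left) - height(node.right)) > 1:
        return False
    return is_balanced(node.left) and is_balanced(node.right)


def rotate_left(node):
    """Lift node's right child into node's place and return the new subtree root."""
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot


chain = Node(1, right=Node(2, right=Node(3)))  # inserting 1, 2, 3 without balancing
print(is_balanced(chain))
balanced = rotate_left(chain)
print(balanced.key, is_balanced(balanced))
```

After the rotation, 2 becomes the root with 1 and 3 as its children, so the height drops from 3 to 2 and the constraint holds again.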

In addition to balanced binary search trees, there are many other self-balancing binary trees, such as red-black trees, which also achieve self-balancing through some constraints. However, the constraints of red-black trees are more complicated and are not the focus of this article. You can consult data structure textbooks to learn the constraints of red-black trees.

The following is the process of inserting nodes into a red-black tree. The left and right rotations are performed to keep the tree balanced.

picture

Regardless of whether it is a balanced binary search tree or a red-black tree, as the number of inserted elements increases, the height of the tree increases, which means more disk I/O operations, and this affects the efficiency of the overall data query.

For example, if the height of the balanced binary search tree below is 5, then 5 disk I/O operations are required to access the bottom node.

picture

The fundamental reason is that they are all binary trees, that is, each node can have at most 2 child nodes. What if we change the binary tree to an M-ary tree (M>2)?

For example, when M=3, in the case of the same number of nodes, the tree height of the ternary tree is shorter than that of the binary tree.

picture

Therefore, the more nodes there are in the tree and the larger the branching factor M, the smaller the height of the M-ary tree is compared with the height of the binary tree.
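
A small calculation makes the effect of M visible. The function below is a sketch that computes the smallest possible height of a tree with a given number of nodes when every node may have up to M children; the node count of one million is an arbitrary assumption.

```python
def min_height(node_count, branching):
    """Smallest height (in levels) of a tree whose nodes have at most `branching` children."""
    levels, capacity, level_size = 0, 0, 1
    while capacity < node_count:
        capacity += level_size    # fill one more level completely
        level_size *= branching   # the next level can hold `branching` times as many nodes
        levels += 1
    return levels


for m in (2, 3, 100):
    print(m, min_height(1_000_000, m))
```

For one million nodes this gives 20 levels for a binary tree, 14 for a ternary tree, and 4 for M = 100.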

What is a B-tree?

Although the self-balancing binary tree can keep the time complexity of the query operation at O(logn), it is essentially a binary tree in which each node can have at most 2 child nodes. As the number of nodes grows, the height of the tree grows accordingly, which increases the number of disk I/Os and thereby affects the efficiency of data queries.

To reduce the height of the tree, the B-tree was introduced. It no longer restricts a node to only 2 child nodes, but allows M child nodes (M>2), thereby reducing the height of the tree.

Each node of the B-tree can have at most M child nodes, and M is called the order of the B-tree, so the B-tree is a multi-way tree.

Assuming M = 3, we have a B-tree of order 3, in which each node holds at most 2 (M-1) data items and has at most 3 (M) child nodes. If these limits are exceeded, the node is split, as in the following animation:

picture
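
The split rule can be sketched for a single leaf of a 3rd-order B-tree. This is a simplified illustration under stated assumptions: it handles only one node and does not propagate the promoted key into a parent, and the function name and return shape are hypothetical.

```python
import bisect

MAX_KEYS = 2  # order M = 3, so at most M - 1 keys per node


def insert_into_leaf(keys, new_key):
    """Insert new_key into a sorted leaf and split around the median if the leaf overflows.

    Returns (left_keys, promoted_key, right_keys); promoted_key is None when no split is needed.
    """
    keys = list(keys)
    bisect.insort(keys, new_key)
    if len(keys) <= MAX_KEYS:
        return keys, None, []
    mid = len(keys) // 2
    return keys[:mid], keys[mid], keys[mid + 1:]


print(insert_into_leaf([1], 2))
print(insert_into_leaf([1, 2], 3))
```

The first call fits within the limit, so nothing is promoted. The second call overflows the node, so the median key 2 moves up and the remaining keys form two nodes, [1] and [3].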

Let's take a look at the query process of a 3rd-order B-tree.

picture

Suppose we are looking for the record with index value 9 in the 3rd-order B-tree in the figure above. The query goes through the following steps:

  1. Compare with the indexes (4, 8) of the root node. Since 9 is greater than 8, go to the child node on the right;
  2. The indexes of that child node are (10, 12). Since 9 is less than 10, go to its left child node;
  3. Arrive at the node with index 9, so we have found the node with index value 9.
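
The steps above can be traced in code. The nodes (4, 8), (10, 12) and 9 come from the walk-through; all other nodes in the sample tree are assumptions added to make it complete, since the figure is not reproduced here.

```python
import bisect


class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                # sorted index values in this node
        self.children = children or []  # empty for leaf nodes


def btree_search(node, key):
    """Return (visited, found): the keys of each node read, and whether key exists."""
    visited = []
    while node is not None:
        visited.append(node.keys)       # one simulated disk I/O per node
        i = bisect.bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return visited, True
        node = node.children[i] if node.children else None
    return visited, False


root = BTreeNode([4, 8], [
    BTreeNode([2], [BTreeNode([1]), BTreeNode([3])]),    # hypothetical left subtree
    BTreeNode([6], [BTreeNode([5]), BTreeNode([7])]),    # hypothetical middle subtree
    BTreeNode([10, 12], [BTreeNode([9]), BTreeNode([11]), BTreeNode([13])]),
])
print(btree_search(root, 9))
```

The search reads three nodes, (4, 8), then (10, 12), then 9, matching the three steps. A key stored in the root, such as 8, is found after reading a single node.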

It can be seen that when a 3rd-order B tree queries the data in the leaf nodes, since the height of the tree is 3, three disk I/O operations will occur during the query process.

If the same number of nodes were stored in a balanced binary tree, the tree would be much taller, which means more disk I/O operations. Therefore, a B-tree is more efficient than a balanced binary tree for data queries.

However, each node of the B-tree contains data (index + record), and the size of the user's record data is likely to far exceed that of the index data, so more disk I/O operations are needed to read the "useful index data".

Moreover, in the process of querying a node at the bottom (such as record A), the record data in the nodes that do not hold record A is also loaded from disk into memory. That record data is useless to us: we only want to read the index data of those nodes for comparison. Loading it not only increases the number of disk I/O operations but also occupies memory.

In addition, if you use B-tree for range query, you need to use in-order traversal, which will involve disk I/O problems of multiple nodes, resulting in a decrease in overall speed.

What is a B+ tree?

The B+ tree is an upgrade of the B tree. The data structure of the index in MySQL uses the B+ tree. The B+ tree structure is as follows:

picture

The differences between B+ tree and B tree are mainly the following points:

  • Only the leaf node (the bottom node) will store the actual data (index + record), and the non-leaf node will only store the index;
  • All indexes will appear in the leaf nodes, and an ordered linked list is formed between the leaf nodes;
  • The index of the non-leaf node also exists in the child node, and is the largest (or smallest) of all indexes in the child node.
  • There are as many indexes as there are child nodes in a non-leaf node;

The following compares the performance of B+ trees and B-trees in three respects.

1. Single-point query

When a B-tree performs a single-index query, in the best case it finds the record in the root node, that is, at a time cost of O(1), and in terms of average time cost it is slightly faster than a B+ tree.

However, the query fluctuation of B-tree will be relatively large, because each node stores both the index and the record, so sometimes the non-leaf node can be accessed to find the index, and sometimes the leaf node needs to be accessed to find the index.

The non-leaf nodes of the B+ tree do not store the actual record data, but only the indexes. Therefore, when the amount of data is the same, the non-leaf nodes of the B+ tree can store more indexes than those of a B-tree, which stores both indexes and records. So the B+ tree can be "shorter and fatter" than the B-tree, and fewer disk I/Os are needed to reach the bottom nodes.
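
To make the height difference concrete, here is a rough estimate of both tree heights for the same data. Every size below (node size, key size, pointer size, record size, row count) is an assumption chosen for illustration, and the model ignores node headers and partially filled nodes.

```python
import math

NODE_SIZE = 16 * 1024   # bytes per node (assumed)
KEY_SIZE = 8            # bytes per index value (assumed)
POINTER_SIZE = 8        # bytes per child pointer (assumed)
RECORD_SIZE = 1024      # bytes per record (assumed)
RECORDS = 10_000_000    # rows to index (assumed)


def btree_height():
    """Every node stores key + record + pointer, so few entries fit in a node."""
    keys_per_node = NODE_SIZE // (KEY_SIZE + RECORD_SIZE + POINTER_SIZE)
    height, capacity, nodes_in_level = 0, 0, 1
    while capacity < RECORDS:
        capacity += nodes_in_level * keys_per_node
        nodes_in_level *= keys_per_node + 1
        height += 1
    return height


def bplus_height():
    """Non-leaf nodes store only key + pointer; the records live in the leaves."""
    fanout = NODE_SIZE // (KEY_SIZE + POINTER_SIZE)
    records_per_leaf = NODE_SIZE // (KEY_SIZE + RECORD_SIZE)
    nodes = math.ceil(RECORDS / records_per_leaf)  # number of leaf nodes
    height = 1
    while nodes > 1:
        nodes = math.ceil(nodes / fanout)          # parents needed for this level
        height += 1
    return height


print(btree_height(), bplus_height())
```

Under these assumptions the B-tree needs 6 levels and the B+ tree only 3, because a B+ tree non-leaf node fits 1024 index entries while a B-tree node fits 15.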

2. Insertion and deletion efficiency

The B+ tree has a large number of redundant nodes, so when a node is deleted, it can be deleted directly from the leaf nodes, often without even touching the non-leaf nodes, so deletion is very fast.

For example, the following animation is the process of deleting the 0004 node of the B+ tree, and the tree structure changes very little:

picture

Note: Definitions of the B+ tree differ on the number of child nodes and indexes in a non-leaf node. Some say that a non-leaf node of order M has M child nodes and M-1 indexes (this is the definition in Wikipedia), and my animations of B+ trees in this article are based on that definition. But when I introduced the differences between the B-tree and the B+ tree, I said that "a non-leaf node has as many indexes as it has child nodes", mainly because the B+ tree used by MySQL has this feature.

The following animation is the process of deleting the 0008 node of the B tree, which may lead to complex changes in the tree:

picture

Even when the B+ tree deletes the root node, due to the existence of redundant nodes, complex tree deformation will not occur. For example, the following animation is the process of deleting the root node of the B+ tree:

picture

The B tree is different. It has no redundant nodes, so deleting nodes is very complicated. Deleting the data in the root node, for instance, may involve complex tree deformation. The following animation is the process of deleting the root node of the B tree:

picture

The same is true for insertion into a B+ tree. Thanks to the redundant nodes, an insertion may cause node splits (if the nodes are full), but it involves at most one path of the tree. Moreover, the B+ tree balances itself automatically, and there is no need for more complex algorithms such as the rotation operations of red-black trees.

Therefore, insertion and deletion in B+ trees are more efficient.

3. Range query

The principle of equality queries in B-trees and B+ trees is basically the same: start from the root node, compare against the target value, and descend recursively into the appropriate child node.

Because there is also a linked list connecting all leaf nodes of the B+ tree, this design is very helpful for range searches. For example, suppose we want the orders between December 1st and December 12th. We can first find the leaf node where December 1st is located, and then traverse the linked list to the right until the node for December 12th is found. There is no need to start from the root node again for each value, which further saves query time.

The B-tree does not have a structure in which all leaf nodes are connected in a linked list, so range queries can only be completed through tree traversal, which involves disk I/O operations of multiple nodes, and the range query efficiency is not as efficient as the B+ tree.
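
The linked-leaf scan can be sketched as follows. The `Leaf` class, the sample keys (days of December), and the assumption that the starting leaf has already been located by a normal descent from the root are all illustrative.

```python
class Leaf:
    def __init__(self, keys):
        self.keys = keys  # sorted keys stored in this leaf
        self.next = None  # pointer to the leaf on the right


def range_scan(start_leaf, low, high):
    """Collect the keys in [low, high], starting at the leaf that contains low."""
    result = []
    leaf = start_leaf
    while leaf is not None:
        for key in leaf.keys:
            if key > high:
                return result  # past the end of the range, stop
            if key >= low:
                result.append(key)
        leaf = leaf.next       # follow the linked list instead of returning to the root
    return result


# Hypothetical leaves holding order dates as day-of-December numbers.
leaves = [Leaf([1, 3]), Leaf([5, 8]), Leaf([12, 15]), Leaf([20, 25])]
for left, right in zip(leaves, leaves[1:]):
    left.next = right

print(range_scan(leaves[0], 1, 12))
```

Only the leaves that overlap the range are read, one after another, and the scan stops as soon as a key beyond December 12th appears.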

Therefore, scenarios with a large number of range retrievals, such as databases, are suited to B+ trees. For scenarios with a large number of single-index queries, you can consider B-trees, as in the NoSQL database MongoDB.

B+ tree in MySQL

The storage method of MySQL varies according to the storage engine. The most commonly used storage engine is InnoDB, which uses the B+ tree as its index data structure.

The following figure is the B+ tree in InnoDB:

picture

But the B+ tree used by InnoDB has some special points, such as:

  • The leaf nodes of the B+ tree are connected by a "doubly linked list", which has the advantage of allowing traversal both to the right and to the left.
  • The content of each B+ tree node is a data page. The data page stores the user's records and various other information. The default size of each data page is 16 KB.

InnoDB indexes are divided into clustered indexes and secondary indexes. The difference between them is that the leaf nodes of the clustered index store the actual data, that is, all complete user records are stored in the leaf nodes of the clustered index, whereas the leaf nodes of a secondary index store the primary key value, not the actual data.

Because the data of the table is stored in the leaf nodes of the clustered index, the InnoDB storage engine must create a clustered index for every table. And because the data is physically stored only once, there can be only one clustered index, while multiple secondary indexes can be created.

For more information on InnoDB's B+ tree, see my previous article: B+ Tree from the Data Page Perspective.
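
Because a secondary index leaf holds only the primary key, fetching a full row through a secondary index takes two lookups. The sketch below illustrates only that lookup order: plain dicts stand in for the two B+ trees, and the table, column names and rows are hypothetical.

```python
# Plain dicts stand in for the two B+ trees; only the lookup order is illustrated.
clustered_index = {          # primary key -> complete row (leaf nodes hold the records)
    1: {"id": 1, "name": "alice", "age": 30},
    2: {"id": 2, "name": "bob", "age": 25},
}
secondary_index_on_name = {  # indexed column value -> primary key (no full row)
    "alice": 1,
    "bob": 2,
}


def find_by_name(name):
    primary_key = secondary_index_on_name.get(name)  # step 1: search the secondary index
    if primary_key is None:
        return None
    return clustered_index[primary_key]              # step 2: search the clustered index


print(find_by_name("bob"))
```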

Summary

MySQL persists data on the hard disk, and the storage function is implemented by MySQL's storage engine, so discussing which data structure MySQL uses as an index is actually discussing which data structure the storage engine uses as an index. InnoDB is MySQL's default storage engine, and it uses a B+ tree as its index data structure.

When designing a data structure for MySQL indexes, we must consider not only the time complexity of insertion, deletion and modification, but more importantly the number of disk I/O operations. Because indexes and records are stored on the hard disk, and the hard disk is a very slow storage device, a query should be completed in as few disk I/O operations as possible.

Although the binary search tree has a natural binary structure and can make good use of binary search to locate data quickly, it has an extreme case: whenever the inserted element is the largest element in the tree, the binary search tree degenerates into a linked list, and the query complexity degrades from O(logn) to O(n).

In order to solve the problem that the binary search tree degenerates into a linked list, a self-balancing binary tree appears, which ensures that the time complexity of the query operation will always be maintained at O(logn). But it is essentially a binary tree, and each node can only have 2 child nodes. As the number of elements increases, the height of the tree will become higher and higher.

The number of disk I/O operations depends on the height of the tree: because the tree is stored on disk, accessing each node corresponds to one disk I/O operation, which means that the height of the tree equals the number of disk I/O operations per query. So the taller the tree, the worse the query performance.

Both B-trees and B+ trees are multi-way trees, which shortens the height of the tree, so these two data structures are well suited to retrieving data stored on disk.

However, MySQL's default storage engine, InnoDB, uses a B+ tree as its index data structure for the following reasons:

  • The non-leaf nodes of the B+ tree do not store the actual record data, but only the indexes. Therefore, when the amount of data is the same, the non-leaf nodes of the B+ tree can store more indexes than those of a B-tree, which stores both indexes and records. So the B+ tree can be "shorter and fatter" than the B-tree, and fewer disk I/Os are needed to reach the bottom nodes;
  • The B+ tree has a large number of redundant nodes (all non-leaf nodes are redundant indexes). These redundant indexes make the B+ tree more efficient in insertion and deletion. For example, deleting the root node does not cause complex tree changes as it does in a B-tree;
  • The leaf nodes of the B+ tree are connected by a linked list, which benefits range queries. A B-tree can complete a range query only by traversing the tree, which involves disk I/O operations on multiple nodes, so its range queries are not as efficient as those of a B+ tree.

over!

Origin blog.csdn.net/qq_34827674/article/details/123447620