[MySQL] B tree and B+ tree

1 B tree

Before introducing the B+ tree, let's briefly introduce the B tree. The two data structures have similarities and differences, and we will compare them at the end.

1.1 B-tree concept

A B tree, also written B-tree, is a multi-way balanced search tree. Most readers are familiar with binary search trees; the B-tree and the B+ tree discussed later are both generalizations of that simple structure, so there is nothing mysterious about them. Let's take a look at the definition of a B-tree of order m.

  • Each node has at most m-1 keywords (key-value pairs that can be stored).
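  • The root node has at least one keyword.
  • Every non-root node has at least ⌈m/2⌉ - 1 keywords. For odd m this is the same as m/2 rounded down, which is what the shorthand "m/2" means in the rest of this article.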
  • The keywords in each node are arranged in ascending order, all keywords in the left subtree of each keyword are smaller than it, and all keywords in the right subtree are larger than it.
  • All leaf nodes are located in the same layer; that is, the path from the root node to every leaf node has the same length.
  • Each node stores index and data, that is, the corresponding key and value.

Therefore, the number of keywords k in the root node satisfies 1 <= k <= m-1, and the number of keywords in a non-root node satisfies ⌈m/2⌉ - 1 <= k <= m-1 (for odd m, the lower bound equals m/2 rounded down).

In addition, we need to pay attention to a concept. When describing a B-tree, you need to specify its order. The order indicates the maximum number of child nodes of a node. Generally, the letter m is used to indicate the order.

Here is an example to illustrate the concept. In a 5th-order B-tree, the number of keywords in the root node satisfies 1 <= k <= 4, and the number of keywords in a non-root node satisfies 2 <= k <= 4.
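The key-count limits above can be written as a small helper. This is only a sketch: the function name `key_range` is made up for illustration, and it uses the standard lower bound ⌈m/2⌉ - 1 for non-root nodes, which equals m/2 rounded down when m is odd.

```python
def key_range(m: int, is_root: bool) -> tuple[int, int]:
    """Return (min_keys, max_keys) for one node of an order-m B-tree."""
    if m < 3:
        raise ValueError("order m must be at least 3")
    max_keys = m - 1                               # at most m children, so m-1 keys
    min_keys = 1 if is_root else (m + 1) // 2 - 1  # ceil(m/2) - 1 for non-root nodes
    return min_keys, max_keys


print(key_range(5, is_root=True))   # (1, 4)
print(key_range(5, is_root=False))  # (2, 4)
```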

Below, we walk through the insertion process of the B-tree with an example, and then explain how keywords are deleted.

1.2 B-tree insertion

When inserting, remember one rule: insert the key into the appropriate leaf node, then check whether the number of keys in that node is still less than or equal to m-1. If it is, the insertion is done. If it is not, split the node around its middle key into a left part and a right part, and move the middle key up into the parent node. If the parent node then overflows, split it in the same way.

Example: in a 5th-order B-tree, a node has at most 4 keys and a non-root node has at least 2 keys (note: in the figures below, each entry stands for both the key and its value).

  • Insert 18, 70, 50, 40. All four keys fit in the root node, which now holds 18, 40, 50, 70.
    [Figure: the tree after inserting 18, 70, 50, 40]
  • Insert 22. The node now holds five keys (18, 22, 40, 50, 70), more than the maximum of 4, so it must be split according to the rule above. The middle key 40 moves up into a new root node, with 18, 22 in the left child and 50, 70 in the right child.
    [Figures: the tree before and after the split]
  • Then insert 23, 25, and 39. The left leaf node overflows when 39 arrives (18, 22, 23, 25, 39), so it splits and its middle key 23 moves up into the root node.
    [Figures: the tree after inserting 23, and after inserting 25 and 39]

The remaining insertions follow the same pattern, so they are not shown here.
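The insertion rule can be sketched in Python as follows. This is a minimal illustration, not a production implementation: the names `Node`, `BTree`, and `levels` are invented here, only keys are stored (no values), and keys are assumed to be distinct.

```python
from bisect import bisect_left, insort


class Node:
    def __init__(self, keys=None, children=None):
        self.keys = keys or []
        self.children = children or []   # an empty list means this is a leaf


class BTree:
    def __init__(self, order):
        self.order = order               # m: maximum number of children
        self.root = Node()

    def insert(self, key):
        split = self._insert(self.root, key)
        if split:                        # the root overflowed: grow one level
            mid, right = split
            self.root = Node([mid], [self.root, right])

    def _insert(self, node, key):
        if not node.children:            # leaf: insert in sorted position
            insort(node.keys, key)
        else:                            # internal: descend, absorb a child split
            i = bisect_left(node.keys, key)
            split = self._insert(node.children[i], key)
            if split:
                mid, right = split
                node.keys.insert(i, mid)
                node.children.insert(i + 1, right)
        if len(node.keys) > self.order - 1:
            return self._split(node)     # more than m-1 keys: split
        return None

    def _split(self, node):
        m = len(node.keys) // 2
        mid = node.keys[m]               # the middle key moves up to the parent
        right = Node(node.keys[m + 1:], node.children[m + 1:])
        node.keys = node.keys[:m]
        node.children = node.children[:m + 1]
        return mid, right


def levels(tree):
    """Return the keys of every node, level by level."""
    out, level = [], [tree.root]
    while level:
        out.append([n.keys for n in level])
        level = [c for n in level for c in n.children]
    return out


t = BTree(order=5)
for k in (18, 70, 50, 40, 22):
    t.insert(k)
print(levels(t))   # [[[40]], [[18, 22], [50, 70]]]
for k in (23, 25, 39):
    t.insert(k)
print(levels(t))   # [[[23, 40]], [[18, 22], [25, 39], [50, 70]]]
```

The sketch checks for overflow after placing the key, which matches the example: the node briefly holds five keys before it is split.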

1.3 Delete operation of B-tree

Deleting from a B-tree is more complicated than inserting, but it comes down to a few cases that are easy to keep in mind.

  • Suppose the B-tree starts in the state shown below; we then delete keys from it.
    [Figure: initial state of the B-tree]
  • Delete 15. Here the key is in a leaf node. If the node still has at least m/2 keys after the deletion, the key is simply removed.
    [Figures: the tree before and after deleting 15]
  • Next, delete 22. Here 22 is in a non-leaf node. To delete a key from a non-leaf node, overwrite it with its successor key, then delete the successor key from the child branch where it is located. For 22, the successor key 24 is moved into the position that 22 occupied.
    [Figures: the tree before and after deleting 22]

At this point the node containing 26 has only one key, fewer than the minimum of 2 (m/2), so it no longer meets the requirements. The rule here is to borrow a key from a sibling node: if a node is left with fewer than m/2 keys after a deletion, and a sibling node has more than m/2 keys (that is, more than the minimum), first move the separating key from the parent node down into this node, then move a key from the sibling node up into the parent node. Both nodes then meet the requirements.

The figures below show the process.
[Figures: borrowing a key from a sibling node through the parent node]

Then delete 28, which is also in a leaf node. After the deletion the node again has too few keys, so we first consider borrowing from a sibling node. This time, however, the sibling has no key to spare (it holds only 2 keys, the minimum). In this case, move the separating key from the parent node down into the current node, then merge the current node with its sibling node to form a single new node.
[Figure: the parent key moved down into the current node]

After the move, merge with the sibling node.
[Figure: the tree after merging with the sibling node]

These cases cover deletion; apply whichever one matches the situation at hand.
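The borrow and merge cases can be sketched as a single repair step. This is an illustration under stated assumptions: `Node` and `fix_underflow` are names invented here, and the keys in the usage example are hypothetical, not the ones from the figures. After a merge the parent loses a key, so a full implementation would apply the same repair to the parent if it underflows in turn.

```python
class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children or []


def fix_underflow(parent, i, min_keys):
    """Repair parent.children[i] after a deletion left it with too few keys."""
    node = parent.children[i]
    left = parent.children[i - 1] if i > 0 else None
    right = parent.children[i + 1] if i + 1 < len(parent.children) else None

    if left is not None and len(left.keys) > min_keys:
        # Borrow from the left: the parent key moves down, left's largest moves up.
        node.keys.insert(0, parent.keys[i - 1])
        parent.keys[i - 1] = left.keys.pop()
        if left.children:
            node.children.insert(0, left.children.pop())
        return "borrowed from left"

    if right is not None and len(right.keys) > min_keys:
        # Borrow from the right: the parent key moves down, right's smallest moves up.
        node.keys.append(parent.keys[i])
        parent.keys[i] = right.keys.pop(0)
        if right.children:
            node.children.append(right.children.pop(0))
        return "borrowed from right"

    # No sibling can spare a key: pull the parent key down and merge two nodes.
    j = i - 1 if left is not None else i          # merge children j and j+1
    a, b = parent.children[j], parent.children[j + 1]
    a.keys += [parent.keys.pop(j)] + b.keys
    a.children += b.children
    parent.children.pop(j + 1)
    return "merged"


# Hypothetical order-5 example (min_keys = 2): the middle leaf has underflowed.
p = Node([20, 30], [Node([10, 12, 15]), Node([26]), Node([35, 40])])
print(fix_underflow(p, 1, min_keys=2))            # borrowed from left
print(p.keys, [c.keys for c in p.children])       # [15, 30] [[10, 12], [20, 26], [35, 40]]

q = Node([20, 30], [Node([10, 12]), Node([26]), Node([35, 40])])
print(fix_underflow(q, 1, min_keys=2))            # merged
print(q.keys, [c.keys for c in q.children])       # [30] [[10, 12, 20, 26], [35, 40]]
```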

With this introduction you should have a working understanding of the B tree. The next part covers the B+ tree, and comparing the two should make both clearer.

2 B+ tree

2.1 Overview of B+ Tree

The B+ tree is very similar to the B tree. Let's first look at the similarities.

  • The root node has at least one element.
  • The number of elements k in a non-root node satisfies m/2 <= k <= m-1 (m/2 rounded down).

The differences:

  • B+ tree has two types of nodes: internal nodes (also called index nodes) and leaf nodes. Internal nodes are non-leaf nodes. Internal nodes do not store data, only indexes, and data is stored in leaf nodes.
  • The keys in the internal nodes are arranged in ascending order. For a key in an internal node, all keys in its left subtree are less than it, and all keys in its right subtree are greater than or equal to it. The records in the leaf nodes are also arranged in ascending key order.
  • Each leaf node stores pointers to adjacent leaf nodes, and the leaf nodes themselves are linked in order of the size of the key from small to large.
  • The parent node stores, as its index entry, a copy of the first key of the right child.

Here is an example of a B+ tree to make this concrete.
[Figure: an example B+ tree]
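The structural differences can be sketched in code. The class and function names below (`Leaf`, `Internal`, `find_leaf`, `range_scan`) and the sample keys and row values are assumptions made for illustration. Internal nodes hold keys only, leaf nodes hold the data and a pointer to the next leaf, and a range query walks the leaf list instead of going back up the tree.

```python
from bisect import bisect_right


class Leaf:
    def __init__(self, keys, values):
        self.keys, self.values = keys, values
        self.next = None                 # pointer to the adjacent leaf on the right


class Internal:
    def __init__(self, keys, children):
        self.keys, self.children = keys, children   # index only, no data


def find_leaf(node, key):
    """Descend from the root to the leaf that may contain key."""
    while isinstance(node, Internal):
        # Keys equal to a separator live in the right subtree, hence bisect_right.
        node = node.children[bisect_right(node.keys, key)]
    return node


def range_scan(root, lo, hi):
    """Return all (key, value) pairs with lo <= key <= hi, in key order."""
    leaf, out = find_leaf(root, lo), []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next                 # follow the leaf chain
    return out


# A small hand-built tree with illustrative data.
a = Leaf([5, 10], ["row5", "row10"])
b = Leaf([15, 20], ["row15", "row20"])
c = Leaf([25, 26, 30], ["row25", "row26", "row30"])
a.next, b.next = b, c
root = Internal([15, 25], [a, b, c])     # each index key = first key of the right child

print(range_scan(root, 10, 25))
# [(10, 'row10'), (15, 'row15'), (20, 'row20'), (25, 'row25')]
```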

2.2 Insert operation

Insertion is simple; just remember one rule: when the number of elements in a leaf node exceeds m-1, split the node at its middle element into a left part and a right part. The middle element is copied up into the parent node as an index entry, but the element itself stays in the right part. (When an internal node overflows, its middle key moves up to the parent as in a B tree, since internal nodes hold no data.)

The following takes the insertion process of a 5th-order B+ tree as an example. The nodes of the 5th-order B+ tree have at least 2 elements and at most 4 elements.

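  • Insert 5, 10, 15, 20. All four elements fit in a single leaf node.
    [Figure: the tree after inserting 5, 10, 15, 20]
  • Insert 25. The node now has 5 elements, more than 4, so it splits: 5, 10 go to the left leaf, 15, 20, 25 go to the right leaf, and 15 is copied up into the parent node as the index.
    [Figure: the tree after inserting 25]
  • Then insert 26 and 30. The right leaf overflows and splits into 15, 20 and 25, 26, 30, with 25 copied up into the parent node.
    [Figure: the tree after inserting 26 and 30]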
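The leaf split can be sketched as follows. To stay short, this sketch models only two levels, one list of index keys plus a list of leaves, and does not split the index itself; `split_leaf` and `insert` are names chosen for this illustration.

```python
from bisect import bisect_right, insort


def split_leaf(keys):
    """Split an overflowing leaf; return (left, right, separator)."""
    mid = len(keys) // 2
    left, right = keys[:mid], keys[mid:]   # the middle key stays in the right leaf
    return left, right, right[0]           # and a copy of it goes up as the index


def insert(index, leaves, key, order):
    """Insert key into a two-level B+ tree given as (index keys, list of leaves)."""
    i = bisect_right(index, key)           # choose the leaf that covers key
    insort(leaves[i], key)
    if len(leaves[i]) > order - 1:         # more than m-1 elements: split
        left, right, sep = split_leaf(leaves[i])
        leaves[i:i + 1] = [left, right]
        index.insert(i, sep)


index, leaves = [], [[]]
for k in (5, 10, 15, 20, 25, 26, 30):
    insert(index, leaves, k, order=5)
print(index)    # [15, 25]
print(leaves)   # [[5, 10], [15, 20], [25, 26, 30]]
```

Unlike the B-tree split, the separator is a copy: 15 and 25 appear both in the index and in the leaves.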

With these examples, the insert operation should be clear. Let's take a look at the delete operation.

2.3 Delete operation

Deletion is simpler than in the B-tree. Because the leaf nodes are linked by pointers, borrowing an element from a sibling node does not go through the parent node: the element moves directly from the sibling node (provided the sibling has more than m/2 elements), and then the index in the parent node is updated. If the sibling does not have more than m/2 elements (it has nothing to spare), the current node and the sibling node are merged, and the corresponding key in the parent node is deleted. Let's take a look at a specific example.

  • Initial state.
    [Figure: initial state of the B+ tree]
  • Delete 10. After the deletion the node does not meet the requirements. The left sibling node has a spare element, so borrow it, then update the index in the parent node.
    [Figure: the tree after deleting 10]
  • Delete 5. The node again does not meet the requirements, and neither the left nor the right sibling node has a spare element, so merge with a sibling node, then update the index in the parent node.
    [Figure: the tree after deleting 5]
  • The parent node now fails the requirements as well, so apply the same operation as in the previous step to it.
    [Figure: the tree after repairing the parent node]

This completes the deletion from the B+ tree. As you can see, it is quite simple.
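A leaf-level sketch of these deletion cases is shown below. It again uses a simplified two-level model (a list of index keys and a list of leaves), the function name `delete` is invented here, and the keys are hypothetical since the original figures are not available. When a deletion causes no underflow, the index is left untouched; a stale index key is still a valid separator for searching.

```python
from bisect import bisect_right


def delete(index, leaves, key, min_keys):
    """Delete key from a two-level B+ tree given as (index keys, list of leaves)."""
    i = bisect_right(index, key)
    leaf = leaves[i]
    leaf.remove(key)                           # raises ValueError if key is absent
    if len(leaf) >= min_keys or len(leaves) == 1:
        return
    if i > 0 and len(leaves[i - 1]) > min_keys:
        leaf.insert(0, leaves[i - 1].pop())    # borrow the left sibling's largest key
        index[i - 1] = leaf[0]                 # index = first key of the right leaf
    elif i + 1 < len(leaves) and len(leaves[i + 1]) > min_keys:
        leaf.append(leaves[i + 1].pop(0))      # borrow the right sibling's smallest key
        index[i] = leaves[i + 1][0]
    else:
        j = i - 1 if i > 0 else i              # merge leaves j and j+1
        merged_from = leaves.pop(j + 1)
        leaves[j].extend(merged_from)
        index.pop(j)                           # drop the key that separated them


# Hypothetical order-5 tree (min_keys = 2).
index = [10, 20]
leaves = [[3, 5, 7], [10, 12], [20, 25]]

delete(index, leaves, 10, min_keys=2)          # underflow, left sibling can lend
print(index, leaves)                           # [7, 20] [[3, 5], [7, 12], [20, 25]]

delete(index, leaves, 5, min_keys=2)           # underflow, no sibling can lend: merge
print(index, leaves)                           # [20] [[3, 7, 12], [20, 25]]
```

In a full tree, removing a key from the index can leave the parent node with too few keys, in which case the same borrow-or-merge step is applied one level up.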

3 Summary of B tree and B+ tree

The B+ tree has several advantages over the B tree, summarized in the following points.

  • Internal nodes store only keys, so a single node holds more elements. The tree is therefore shallower and a query needs fewer disk I/Os, which makes the B+ tree better suited as the underlying index structure of a database such as MySQL.
  • Every query must descend to a leaf node, so query performance is stable. In a B-tree the data may be found in any node, so the cost of a query varies with where the key happens to be stored.
  • All the leaf nodes form an ordered linked list, which makes range queries and ordered scans easy.
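A back-of-the-envelope calculation illustrates the first point. The numbers are assumptions, not measurements: 16 KB is InnoDB's default page size, while the 8-byte key, 6-byte child pointer, and 1 KB average row are illustrative guesses, and page headers and fill factor are ignored. The function names are made up for this sketch.

```python
PAGE_SIZE = 16 * 1024    # InnoDB default page size, in bytes
KEY_BYTES = 8            # assumption: BIGINT key
POINTER_BYTES = 6        # assumption: size of a child page pointer
ROW_BYTES = 1024         # assumption: average row size


def bplus_levels(rows):
    """Levels in a B+ tree: rows only in leaves, keys and pointers above them."""
    fanout = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES)
    rows_per_leaf = PAGE_SIZE // ROW_BYTES
    nodes = (rows + rows_per_leaf - 1) // rows_per_leaf   # number of leaf pages
    levels = 1
    while nodes > 1:
        nodes = (nodes + fanout - 1) // fanout            # pages one level up
        levels += 1
    return levels


def btree_levels(rows):
    """Levels in a B tree whose every node stores full rows next to its keys."""
    per_node = PAGE_SIZE // (KEY_BYTES + POINTER_BYTES + ROW_BYTES)
    levels, capacity = 1, per_node
    while capacity < rows:
        levels += 1
        capacity = (per_node + 1) ** levels - 1           # completely full tree
    return levels


rows = 20_000_000
print(bplus_levels(rows), btree_levels(rows))   # 3 7
```

Under these assumptions, a lookup among 20 million rows touches about 3 pages in the B+ tree against about 7 in the B tree, because an index page holding only keys and pointers fits over a thousand entries while a page that also carries rows fits about fifteen.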

Original link: If the interviewer asks you about B-tree and B+-tree, then throw this article to him

Origin blog.csdn.net/dl962454/article/details/114384363