B-tree and B+ tree in plain language

Introduction

As a back-end programmer, I often run into several tree-related terms, such as balanced binary tree, binary search tree, AVL tree, B-tree, and B+ tree. They are easy to confuse when they appear together. Today, let's work through these terms and see what each one means. This article goes from the simple to the more complex, starting with the binary tree and ending with the B+ tree. I will not cover insertion and deletion for the different trees; the goal is to help readers who are unsure of the concepts understand which data structure each name refers to. Red-black trees are left out for now. (That topic takes a while to write up, so I will cover it in a separate article next time.)

Tree:

First, let's look at the most basic definition. All the trees mentioned later belong to the broad category of trees. A tree is a finite set containing n (n>=0) nodes, where:
(1) Each element is called a node
(2) There is one specific node called the root node, or root
(3) The remaining elements, other than the root node, are divided into m (m>=0) disjoint sets T1, T2, ..., Tm, and each set Ti (1<=i<=m) is itself a tree, called a subtree of the original tree.

Next come some basic tree concepts. I assume most readers have seen them, but for those who have not, a few simple ones are listed below. If anything is still unclear, please look it up on Baidu:

Root node: the topmost node, the one node that has no parent. In plain terms, it is the top of the tree

Degree: the number of subtrees a node has. In plain terms, however many branches a node has, that is its degree

Leaf node: a node with degree 0. In plain terms, leaf nodes are the bottom of the tree: every node without branches is a leaf node

Height or depth of the tree: the maximum level of any node in the tree. In plain terms, it is how many levels the tree has, counted in nodes

Tree degree: the largest degree of any node in the tree. In plain terms, it is the degree of the node with the most branches. In the figure, the degree of the tree is 3, because the node with the largest degree is C, which has three branches

Subtree: the part under each node is a subtree, which itself meets the definition of a tree, such as T1 and T2 in the figure
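To make these terms concrete, here is a minimal Python sketch. The Node class, the function names, and the small example tree are all made up for illustration (the tree is not the one in the figure); it only assumes that a node keeps a list of its child subtrees.

```python
class Node:
    """A general tree node: a label plus a list of child subtrees."""
    def __init__(self, label, children=None):
        self.label = label
        self.children = children or []

def degree(node):
    # Degree of a node = number of subtrees (branches) it has.
    return len(node.children)

def is_leaf(node):
    # A leaf is a node with degree 0.
    return degree(node) == 0

def height(node):
    # Height counted in nodes: a single node has height 1.
    return 1 + max((height(c) for c in node.children), default=0)

def tree_degree(node):
    # Degree of the tree = largest degree of any node in it.
    return max([degree(node)] + [tree_degree(c) for c in node.children])

# Hypothetical tree: A is the root, B has two branches, C has three.
root = Node("A", [
    Node("B", [Node("D"), Node("E")]),
    Node("C", [Node("F"), Node("G"), Node("H")]),
])

print(degree(root))       # 2
print(tree_degree(root))  # 3, because C has three branches
print(height(root))       # 3 levels
```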

Binary tree:

This one should cause no trouble. A tree in which no node has a degree greater than 2 is a binary tree. In the figure, B and C are the nodes with the highest degree, which is 2.

Binary search tree:

Some books also call it a binary sort tree or simply a search tree; these names all refer to the same thing. It is a binary tree with one extra property: the values of all nodes in the left subtree are less than the value of the root, the values of all nodes in the right subtree are greater than or equal to the value of the root, and this rule holds at every node.

The advantage of a binary search tree is right there in its name: it makes searching convenient. When the tree is reasonably balanced, finding an element takes O(log n) time. In the worst case, when the tree degenerates into a chain, the search has to walk through every node and takes O(n) time.
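The gap between the logarithmic case and the linear case can be seen in a small sketch. The names BSTNode, insert, search, and build are hypothetical, and insert is included only so the example trees can be built; it follows the rule above that smaller keys go left and equal or larger keys go right.

```python
class BSTNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key and return the root (smaller keys go left, others right)."""
    new = BSTNode(key)
    if root is None:
        return new
    cur = root
    while True:
        if key < cur.key:
            if cur.left is None:
                cur.left = new
                return root
            cur = cur.left
        else:
            if cur.right is None:
                cur.right = new
                return root
            cur = cur.right

def search(root, key):
    """Return (found, steps), where steps is the number of nodes visited."""
    steps = 0
    cur = root
    while cur is not None:
        steps += 1
        if key == cur.key:
            return True, steps
        cur = cur.left if key < cur.key else cur.right
    return False, steps

def build(keys):
    root = None
    for k in keys:
        root = insert(root, k)
    return root

bushy = build([4, 2, 6, 1, 3, 5, 7])   # insertion order gives a balanced shape
chain = build([1, 2, 3, 4, 5, 6, 7])   # sorted input degenerates into a chain

print(search(bushy, 7))  # (True, 3)
print(search(chain, 7))  # (True, 7)
```

Both trees hold the same seven keys, but the chain behaves like a linked list: finding 7 visits every node.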

Balanced binary search tree:

The AVL tree is also a binary search tree, but its name is unusual. It is named after its inventors, G. M. Adelson-Velsky and E. M. Landis, and is called the AVL tree for short.
Because of the name, I often forget that it is a balanced binary search tree, so I remember it by association. Readers can try this to see if it works for them: when I see AVL, I think of a balanced (V is symmetrical) binary (L has two strokes, one horizontal and one vertical) search tree. Remember this sentence, and the three letters AVL will remind you of what the name stands for.
Now take a look at its characteristics:
1. First of all, it is a binary search tree.
2. It has a balance condition: for every node, the absolute value of the difference between the heights of its left and right subtrees (the balance factor) is at most 1.
In other words, an AVL tree is essentially a binary search tree (also called a binary sort tree) with a balancing property.
Suppose a tree looks like this:

It is a binary search tree, since it meets the definition. But if the element being searched for is in the 13->9 subtree, the query is about as efficient as searching a linked list, because this tree is unbalanced. If the height difference between the left and right subtrees is too large, queries become relatively slow. That is why the AVL tree was invented: the height difference between the left and right subtrees of every node does not exceed 1. The tree in the earlier figure and the binary search tree shown above are both balanced, so they are AVL trees.
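The two AVL conditions can be checked directly. The sketch below is only an illustration: the names are made up, the two example trees are hypothetical rather than copied from the figures, and the check recomputes heights repeatedly for clarity instead of speed.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def height(node):
    # Height counted in nodes; an empty subtree has height 0.
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def is_bst(node, lo=float("-inf"), hi=float("inf")):
    # Condition 1: left keys < node key <= right keys, at every node.
    if node is None:
        return True
    if not (lo <= node.key < hi):
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)

def is_balanced(node):
    # Condition 2: balance factor of every node is at most 1.
    if node is None:
        return True
    if abs(height(node.left) - height(node.right)) > 1:
        return False
    return is_balanced(node.left) and is_balanced(node.right)

def is_avl(node):
    return is_bst(node) and is_balanced(node)

# Hypothetical trees: the first leans heavily to the left.
lopsided = Node(20, Node(13, Node(9, Node(5))), Node(25))
balanced = Node(13, Node(9), Node(20))

print(is_bst(lopsided), is_avl(lopsided))  # True False
print(is_avl(balanced))                    # True
```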

B-tree:

First of all, it is important to be clear that btree and b-tree are the same thing. Many people used to read B-tree as "B minus tree" (perhaps influenced by "B plus tree"), but there is no B-minus tree. The character is just a hyphen, and the name is simply B-tree.
We know that the AVL tree is a balanced binary search tree, so the B-tree is easy to explain.
A B-tree is a balanced multi-way search tree. Compared with the AVL tree, it is not binary but multi-way: a node can have many branches. A B-tree of order m has a tree degree of m, that is, each node has at most m branches. It satisfies the following rules:
1. The root node has at least two child nodes (unless it is itself a leaf)
2. Each non-leaf node other than the root has at least ceil(m/2) children
3. Each node has no more than m children
4. All leaf nodes are at the same height
5. Each non-leaf node consists of n keys and n+1 pointers, where ceil(m/2)-1 <= n <= m-1 (ceil means rounding up)
These definitions do not need to be memorized. Let's understand them with a concrete example. This is a B-tree of order 4 that I found on the Internet, because I am lazy and drawing one is a hassle.

Okay, let's look at the picture and compare it with the rules above. Rules 1-4 should be no problem. For rule 5, m is 4, so 1 <= n <= 3: each such node has 1 to 3 keys and 2 to 4 pointers. In the figure, the keys are the boxes containing values, and the pointers are the white boxes.
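How keys and pointers work together during a lookup can be sketched in a few lines. This is a simplified illustration, not a full B-tree: BTreeNode, btree_search, and leaf_depths are hypothetical names, the small tree is made up rather than taken from the figure, and insertion and deletion are left out.

```python
from bisect import bisect_left

class BTreeNode:
    def __init__(self, keys, children=None):
        self.keys = keys                 # sorted keys stored in this node
        self.children = children or []   # empty for a leaf, else len(keys) + 1

def btree_search(node, key):
    """Return (found, visited), where visited is the number of nodes read."""
    visited = 0
    while node is not None:
        visited += 1
        i = bisect_left(node.keys, key)      # first position with keys[i] >= key
        if i < len(node.keys) and node.keys[i] == key:
            return True, visited             # key lives in this node
        if not node.children:
            return False, visited            # reached a leaf without finding it
        node = node.children[i]              # follow the pointer between keys
    return False, visited

def leaf_depths(node, depth=1):
    """Set of depths at which leaves occur (rule 4 expects exactly one value)."""
    if not node.children:
        return {depth}
    depths = set()
    for child in node.children:
        depths |= leaf_depths(child, depth + 1)
    return depths

# Hypothetical order-4 tree: 2 keys and 3 pointers in the root.
root = BTreeNode([20, 40], [
    BTreeNode([5, 10]),
    BTreeNode([25, 30]),
    BTreeNode([45, 50]),
])

print(btree_search(root, 30))  # (True, 2)
print(btree_search(root, 40))  # (True, 1), found in the root itself
print(leaf_depths(root))       # {2}
```

Note that 40 is found without descending at all, because in a B-tree the data can sit in internal nodes as well as in leaves.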

Advantages:
A B-tree is mainly used to search for data. Compared with a binary search tree, it is multi-way, so for the same amount of data its height is lower. In a binary tree, the taller the tree, the more nodes a query has to visit before it finds the data. Having multiple branches per node reduces the tree height and improves search efficiency.
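To put rough numbers on the height argument, the sketch below counts the fewest levels a search tree needs to hold a given number of keys. It is a best-case estimate under one assumption: every node is completely full, with order - 1 keys and order children, so a tree with h levels holds order**h - 1 keys. The function name min_levels and the order 128 are picked only for illustration.

```python
def min_levels(n_keys, order):
    """Fewest levels needed to hold n_keys when every node is full."""
    levels = 0
    capacity = 0
    while capacity < n_keys:
        levels += 1
        capacity = order ** levels - 1   # keys in a full tree with this many levels
    return levels

print(min_levels(1_000_000, 2))    # 20 levels for a binary tree
print(min_levels(1_000_000, 128))  # 3 levels for a tree of order 128
```

A lookup reads one node per level, so under these assumptions the multi-way tree needs about 3 node reads where the binary tree needs about 20.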

B+ tree:

After understanding the B-tree, the B+ tree is easier to understand. From the names we can tell they are similar, except that the B+ tree has an extra +. The B+ tree is in fact an optimized version of the B-tree.
Have you ever played spot the difference? Look at the figure and find the differences from the B-tree.

If you looked carefully you will have found the answer: there are 5 differences. That's right, and they are all about pointers.

So the + in B+ tree can be understood as the added pointers between leaf nodes. In addition, a B+ tree stores the detailed data only in leaf nodes, while non-leaf nodes store only the index (think of MySQL's clustered and non-clustered indexes; does it feel familiar? See my other blog post: mysql's clustered index and non-clustered index).

Why add pointers between leaf nodes? If you have used MySQL, you know that it has two index types, HASH and BTREE, and that the BTREE used by InnoDB is actually a B+ tree. MySQL often performs range searches on an index. To find the data in a certain range with a B-tree, wouldn't we have to search down from the root several times, descending toward the leaves each time, before we know how much data falls in the range?
A B+ tree is different. All leaf nodes are connected by pointers, so once the first matching leaf is found, the rest can be read in order by following the pointers. This is the advantage of the B+ tree.
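A range search over linked leaves can be sketched as follows. This is a toy illustration and says nothing about how MySQL or InnoDB is implemented internally: the classes Leaf and Internal, the functions find_leaf and range_scan, and the sample rows are all hypothetical, and the sketch assumes a separator key s sends keys smaller than s to the left child and keys equal to or larger than s to the right child.

```python
from bisect import bisect_right

class Leaf:
    def __init__(self, keys, values):
        self.keys = keys        # sorted keys
        self.values = values    # the actual data lives only in leaves
        self.next = None        # pointer to the next leaf

class Internal:
    def __init__(self, keys, children):
        self.keys = keys            # separator keys, used only as an index
        self.children = children    # len(keys) + 1 children

def find_leaf(node, key):
    # One descent from the root to the leaf that could contain key.
    while isinstance(node, Internal):
        node = node.children[bisect_right(node.keys, key)]
    return node

def range_scan(root, lo, hi):
    """Return all (key, value) pairs with lo <= key <= hi."""
    leaf = find_leaf(root, lo)
    out = []
    while leaf is not None:
        for k, v in zip(leaf.keys, leaf.values):
            if k > hi:
                return out          # past the range, stop
            if k >= lo:
                out.append((k, v))
        leaf = leaf.next            # follow the leaf pointer, no new descent
    return out

# Hypothetical tree: three linked leaves under one internal node.
l1 = Leaf([1, 4], ["r1", "r4"])
l2 = Leaf([7, 10], ["r7", "r10"])
l3 = Leaf([13, 16], ["r13", "r16"])
l1.next, l2.next = l2, l3
root = Internal([7, 13], [l1, l2, l3])

print(range_scan(root, 4, 13))
# [(4, 'r4'), (7, 'r7'), (10, 'r10'), (13, 'r13')]
```

The scan descends from the root only once and then walks sideways through the leaves. The keys 7 and 13 appear both in the internal node and in the leaves, which is an example of a B+ tree storing search keys redundantly.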

Here is a summary of the differences, collected from the Internet, for you to study:
Differences:
B-tree
1. Search keys are not stored redundantly.
2. Data can be stored in leaf nodes as well as internal nodes.
3. Searching for data is slower, because the data may be in an internal node or in a leaf node.
4. Deleting from internal nodes is complicated and time-consuming.
5. Leaf nodes are not linked together.

B+ tree
1. Search keys may be stored redundantly.
2. Data can only be stored in leaf nodes.
3. Searching is relatively fast, because the data is always found in leaf nodes.
4. Deletion is simpler, because elements are always deleted from leaf nodes.
5. Leaf nodes are linked together, which makes search operations more efficient.

Postscript:

1. After reading this article, you should know the concepts and characteristics of trees, binary trees, binary search trees, AVL trees, B-trees, and B+ trees, and have a general understanding of each.
2. There is much more to explore, however. For example, how do B-trees and B+ trees handle insertion, deletion, update, and lookup?
There is also the red-black tree, whose concept and operations offer plenty to talk about.


Origin blog.csdn.net/sc9018181134/article/details/104743176