Java data structures and algorithms (11): multi-way search trees

1. Binary tree and B tree

1.1 Problem analysis of binary tree

Binary trees support efficient operations, but they run into problems when the data set is very large.

  • A binary tree has to be loaded into memory before it can be used. With few nodes this is not a problem, but with many nodes (for example, 100 million) two problems arise:
    • Problem 1: Building the tree requires many I/O operations, because the massive data set is stored in a database or file, and the sheer number of nodes slows construction down;
    • Problem 2: The large number of nodes also makes the tree very tall, which slows down every operation;

1.2 Multi-way tree

  • In a binary tree, each node holds one data item and has at most two child nodes. If each node is allowed to hold more data items and have more child nodes, the tree is a multi-way tree;
  • 2-3 trees and 2-3-4 trees are multi-way trees. They improve on the binary tree by reorganizing nodes so that the tree becomes shorter;

1.3 Basic introduction of B-tree

A B-tree improves efficiency by reorganizing nodes so that the tree is shorter, which reduces the number of I/O reads and writes.

  • Designers of file systems and database systems exploit disk read-ahead and make the size of a node equal to one page (a page is usually 4 KB), so that each node can be fully loaded with a single I/O operation;
  • With the order (degree) M of the tree set to 1024, at most 4 I/O operations are needed to reach the desired element among 60 billion elements. B-trees and B+ trees are widely used in file storage systems and database systems;
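
The "4 I/O operations" figure can be checked with a little arithmetic. The sketch below uses an idealized model that is an assumption of this example, not something stated in the article: every node has exactly `fanOut` children, so `h` levels can address at most `fanOut^h` elements. The class name `TreeLevels` and the method `levelsNeeded` are hypothetical.

```java
// Hypothetical helper for illustration; idealized model where each level
// multiplies the number of reachable elements by fanOut.
class TreeLevels {
    // Smallest h such that fanOut^h >= elements.
    static int levelsNeeded(long elements, int fanOut) {
        int levels = 0;
        long capacity = 1;
        while (capacity < elements) {
            if (capacity > Long.MAX_VALUE / fanOut) {
                // One more level would exceed the long range, so it certainly suffices.
                return levels + 1;
            }
            capacity *= fanOut;
            levels++;
        }
        return levels;
    }

    public static void main(String[] args) {
        // 100 million nodes in a perfectly balanced binary tree (section 1.1)
        System.out.println(levelsNeeded(100_000_000L, 2));        // prints 27
        // 60 billion elements, binary fan-out versus order 1024
        System.out.println(levelsNeeded(60_000_000_000L, 2));     // prints 36
        System.out.println(levelsNeeded(60_000_000_000L, 1024));  // prints 4
    }
}
```

Since 1024^3 is about 1.07 billion and 1024^4 is about 1.1 trillion, three levels are not enough for 60 billion elements but four are, which matches the claim above under this model.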

2. 2-3 trees

2.1 2-3 tree is the simplest B-tree structure

  • All leaf nodes of a 2-3 tree are on the same level (every B-tree satisfies this condition);
  • A node with two child nodes is called a two-node (2-node). A two-node holds one data item and has either no children or exactly two children;
  • A node with three child nodes is called a three-node (3-node). A three-node holds two data items and has either no children or exactly three children;
  • A 2-3 tree is a tree made up of two-nodes and three-nodes;
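
The structural rules above can be written down as a small check. This is a minimal sketch, not the article's code: the class `Node23` and the method `checkedDepth` are hypothetical names, and the check covers only node shape and leaf depth, not key ordering.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical 2-3 tree node: 1 key = two-node, 2 keys = three-node.
class Node23 {
    final List<Integer> keys = new ArrayList<>();
    final List<Node23> children = new ArrayList<>(); // empty for a leaf

    Node23(int... ks) {
        for (int k : ks) keys.add(k);
    }

    Node23 withChildren(Node23... cs) {
        for (Node23 c : cs) children.add(c);
        return this;
    }

    boolean isLeaf() {
        return children.isEmpty();
    }

    // Returns the number of levels in the subtree if it obeys the rules
    // (1 or 2 keys per node, 0 or keys+1 children, all leaves on one level),
    // or -1 if any rule is violated.
    static int checkedDepth(Node23 n) {
        int k = n.keys.size();
        if (k < 1 || k > 2) return -1;
        if (n.isLeaf()) return 1;
        if (n.children.size() != k + 1) return -1;
        int depth = -1;
        for (Node23 c : n.children) {
            int d = checkedDepth(c);
            if (d == -1) return -1;                   // violation below
            if (depth != -1 && d != depth) return -1; // leaves on different levels
            depth = d;
        }
        return depth + 1;
    }

    public static void main(String[] args) {
        Node23 ok = new Node23(16).withChildren(new Node23(12), new Node23(24, 32));
        System.out.println(checkedDepth(ok));  // prints 2

        // The right child has its own children while the left child is a leaf.
        Node23 bad = new Node23(16).withChildren(
                new Node23(12),
                new Node23(24).withChildren(new Node23(20), new Node23(32)));
        System.out.println(checkedDepth(bad)); // prints -1
    }
}
```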

2.2 2-3 tree application case

Build a 2-3 tree from the sequence {16,24,12,32,14,26,34,10,8,28,38,20}, keeping the data in sorted order as each value is inserted.
Insertion rules:

  • All leaf nodes of a 2-3 tree are on the same level (every B-tree satisfies this condition);
  • A node with two child nodes is called a two-node (2-node), and a two-node has either no children or exactly two children;
  • A node with three child nodes is called a three-node (3-node), and a three-node has either no children or exactly three children;
  • If inserting a value into a node according to the rules would violate the three requirements above, the node must be split. Split upward first, moving a value into the parent; if the parent is already full, the current level is split as well. The three conditions above must still hold after splitting;
  • The values in the subtrees of a three-node still follow binary search tree (BST) ordering: the left subtree holds values smaller than both data items, the middle subtree holds values between them, and the right subtree holds values larger than both;
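
The insertion rules can be sketched in Java as follows. This is an illustration under stated assumptions rather than the article's implementation: it uses the common textbook scheme in which an overfull node (three data items) is split and its middle value is pushed up to the parent, so the shape of the resulting tree may differ from the article's figure. All class and method names (`TwoThreeTree`, `Split`, `inOrder`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class TwoThreeTree {
    static class Node {
        List<Integer> keys = new ArrayList<>();
        List<Node> children = new ArrayList<>();
        boolean isLeaf() { return children.isEmpty(); }
    }

    // Result of splitting an overfull node: 'middle' moves up, 'right' is the new sibling.
    static class Split {
        final int middle;
        final Node right;
        Split(int middle, Node right) { this.middle = middle; this.right = right; }
    }

    Node root;

    void insert(int key) {
        if (root == null) {
            root = new Node();
            root.keys.add(key);
            return;
        }
        Split s = insert(root, key);
        if (s != null) {                 // the root itself split: the tree grows by one level
            Node newRoot = new Node();
            newRoot.keys.add(s.middle);
            newRoot.children.add(root);
            newRoot.children.add(s.right);
            root = newRoot;
        }
    }

    private Split insert(Node n, int key) {
        int i = 0;
        while (i < n.keys.size() && key > n.keys.get(i)) i++;
        if (n.isLeaf()) {
            n.keys.add(i, key);          // new values always land in a leaf
        } else {
            Split s = insert(n.children.get(i), key);
            if (s == null) return null;  // child absorbed the value, nothing to do
            n.keys.add(i, s.middle);     // child split: take its middle value
            n.children.add(i + 1, s.right);
        }
        return n.keys.size() <= 2 ? null : split(n);
    }

    // n temporarily holds 3 keys (and 4 children if it is an internal node).
    private Split split(Node n) {
        Node right = new Node();
        right.keys.add(n.keys.remove(2));
        int middle = n.keys.remove(1);
        if (!n.isLeaf()) {
            right.children.add(n.children.remove(2));
            right.children.add(n.children.remove(2));
        }
        return new Split(middle, right);
    }

    // In-order traversal; yields the keys in ascending order if BST ordering holds.
    void inOrder(Node n, List<Integer> out) {
        if (n == null) return;
        for (int i = 0; i < n.keys.size(); i++) {
            if (!n.isLeaf()) inOrder(n.children.get(i), out);
            out.add(n.keys.get(i));
        }
        if (!n.isLeaf()) inOrder(n.children.get(n.keys.size()), out);
    }

    public static void main(String[] args) {
        TwoThreeTree tree = new TwoThreeTree();
        for (int k : new int[]{16, 24, 12, 32, 14, 26, 34, 10, 8, 28, 38, 20}) tree.insert(k);
        List<Integer> out = new ArrayList<>();
        tree.inOrder(tree.root, out);
        System.out.println(out); // prints [8, 10, 12, 14, 16, 20, 24, 26, 28, 32, 34, 38]
    }
}
```

Because a split always creates a sibling on the same level and only a root split adds a level, all leaves stay on the same level after every insertion.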

2.3 Other instructions

Besides 2-3 trees there are also 2-3-4 trees and so on. The concept is the same as for 2-3 trees, and they are also B-trees. As shown in the figure:
[Figure: example of a 2-3-4 tree]

3. B tree, B+ tree and B* tree

3.1 Introduction to B-tree

"B-tree" simply means B tree; the B is usually taken to stand for Balanced. Some people read the hyphen in "B-tree" as a minus sign and translate it as "B minus tree", which is easy to misunderstand: one might think that the "B- tree" is one kind of tree and the "B tree" is another. In fact, both names refer to the same structure. Both the 2-3 tree and the 2-3-4 tree are B-trees.
Description:

  • B-tree order: the maximum number of child nodes a node may have. For example, the order of a 2-3 tree is 3, and the order of a 2-3-4 tree is 4;
  • A B-tree search starts at the root node and runs a binary search over the (ordered) keyword sequence inside the node. On a hit the search ends; otherwise it descends into the child node whose range covers the search keyword. This repeats until the corresponding child pointer is null or the node is already a leaf;
  • The keyword set is distributed over the entire tree, that is, both leaf nodes and non-leaf nodes store data;
  • A search may end at a non-leaf node;
  • Its search performance is equivalent to a binary search over the full keyword set;
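
The search procedure described above can be sketched as follows. This is a simplified in-memory illustration with hypothetical names (`BTreeSearch`, `Node`, `contains`); a real B-tree would load each node from disk as one page. It relies on `java.util.Arrays.binarySearch`, which returns the index on a hit and `-(insertionPoint) - 1` on a miss.

```java
import java.util.Arrays;

class BTreeSearch {
    static class Node {
        final int[] keys;       // sorted keywords stored in this node
        final Node[] children;  // null for a leaf, otherwise keys.length + 1 subtrees

        Node(int[] keys, Node[] children) {
            this.keys = keys;
            this.children = children;
        }
    }

    // Returns true if key is stored anywhere in the tree rooted at node.
    static boolean contains(Node node, int key) {
        while (node != null) {
            int pos = Arrays.binarySearch(node.keys, key);
            if (pos >= 0) return true;               // hit, possibly in a non-leaf node
            if (node.children == null) return false; // leaf reached without a hit
            node = node.children[-(pos + 1)];        // insertion point picks the child range
        }
        return false;
    }

    public static void main(String[] args) {
        Node left = new Node(new int[]{8, 12}, null);
        Node right = new Node(new int[]{24, 32}, null);
        Node root = new Node(new int[]{16}, new Node[]{left, right});

        System.out.println(contains(root, 16)); // prints true  (ends at the root)
        System.out.println(contains(root, 24)); // prints true  (ends at a leaf)
        System.out.println(contains(root, 20)); // prints false
    }
}
```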

3.2 Introduction to B+ Tree

The B+ tree is a variant of the B-tree, and it is also a multi-way search tree.
Description:

  • A B+ tree search works basically the same way as a B-tree search. The difference is that a B+ tree search only hits when it reaches a leaf node (a B-tree search can hit at a non-leaf node). Its performance is likewise equivalent to a binary search over the full keyword set;
  • All keywords appear in the linked list of leaf nodes (that is, data is stored only in leaf nodes, which is also called a dense index), and the keywords (data) in the linked list are in sorted order;
  • The non-leaf nodes act as an index over the leaf nodes (a sparse index), and the leaf nodes act as the data layer that stores the (keyword) data;
  • B+ trees are better suited to file index systems;
  • B-trees and B+ trees each have their own application scenarios; neither is strictly better than the other;
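
One practical consequence of the sorted leaf linked list is that a range query can walk the leaves from left to right without going back up into the index. The sketch below shows only that leaf-level walk; the names `BPlusLeafScan`, `Leaf`, and `range` are hypothetical, and the scan starts from a given leaf, whereas a real B+ tree would first descend the index to find the starting leaf.

```java
import java.util.ArrayList;
import java.util.List;

class BPlusLeafScan {
    // Hypothetical leaf node: sorted keys plus a pointer to the next leaf.
    static class Leaf {
        final int[] keys;
        Leaf next;

        Leaf(int... keys) {
            this.keys = keys;
        }
    }

    // Collects all keys in [lo, hi] by following the leaf chain from 'start'.
    static List<Integer> range(Leaf start, int lo, int hi) {
        List<Integer> out = new ArrayList<>();
        for (Leaf leaf = start; leaf != null; leaf = leaf.next) {
            for (int k : leaf.keys) {
                if (k > hi) return out; // keys are sorted, so nothing further can match
                if (k >= lo) out.add(k);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Leaf a = new Leaf(8, 10, 12);
        Leaf b = new Leaf(14, 16, 20);
        Leaf c = new Leaf(24, 26, 28);
        a.next = b;
        b.next = c;

        System.out.println(range(a, 12, 24)); // prints [12, 14, 16, 20, 24]
    }
}
```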

3.3 Introduction to B* Tree

The B* tree is a variant of the B+ tree. It adds pointers to sibling nodes in the non-root, non-leaf nodes of the B+ tree.
Description of B* tree:

  • A B* tree requires each non-leaf node to hold at least (2/3)*M keywords, so the minimum block utilization is 2/3, whereas the minimum block utilization of a B+ tree is 1/2;
  • It follows that a B* tree allocates new nodes less often than a B+ tree and achieves higher space utilization;

Origin blog.csdn.net/houwanle/article/details/110693501