Binary search trees explained in detail with animated diagrams

In computer science, a binary search tree (sometimes called an ordered or sorted binary tree) is a container that stores data items of a particular type. It is a dynamic set that supports quickly finding, adding, and deleting nodes.
A binary search tree keeps its nodes in key order, so lookups and other operations can apply the principle of binary search: when looking up a node (or looking for the place to insert a new one), the operation walks from the root toward the leaves, compares the target key with the key of each node it visits, and uses the result to decide whether to continue in the left or the right subtree. On average, each comparison skips about half of the remaining elements, so a lookup, insertion, or deletion takes time proportional to the height of the tree, which is about the logarithm of the number of nodes when the tree is reasonably balanced. That is much better than the linear time a plain linear list needs.

1. Definitions

A binary search tree is organized as a binary tree in which each node is an object. Besides its key and satellite data, each node holds the attributes needed to maintain the tree structure: left, right, and parent (written p in the pseudocode below), which point to the left child, the right child, and the parent node respectively. A missing child or parent is represented by nil. The root is the only node in the tree whose parent is nil.

A binary search tree has the following properties:

1. If the left subtree of a node is not empty, the keys of all nodes in the left subtree are less than or equal to the key of that node;

2. If the right subtree of a node is not empty, the keys of all nodes in the right subtree are greater than or equal to the key of that node;

3. The left and right subtrees of any node are themselves binary search trees.

For example, in the figure above, the root has key 6; the left subtree holds keys 2, 4, and 5, none of which is greater than 6, and the right subtree holds keys 7 and 8, neither of which is less than 6. The property holds for every node in the tree; in other words, the definition of a binary search tree is recursive.
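As a rough illustration of the node layout and the three properties, here is a minimal Python sketch. The class name `Node`, the helper `is_bst`, and the exact shape of the example tree (4 as the left child of 6, 7 as the right child) are assumptions made for this sketch; only the keys 2, 4, 5, 6, 7, 8 come from the text.

```python
class Node:
    """A tree node: key plus left, right, and parent links (None plays the role of nil)."""

    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right
        self.parent = None
        # Keep the parent pointers of the children consistent.
        for child in (left, right):
            if child is not None:
                child.parent = self


def is_bst(node, lo=float("-inf"), hi=float("inf")):
    """Check the three properties recursively: every key must lie in [lo, hi]."""
    if node is None:
        return True
    if not (lo <= node.key <= hi):
        return False
    return is_bst(node.left, lo, node.key) and is_bst(node.right, node.key, hi)


# Assumed shape: 6 at the root, 4 (children 2 and 5) on the left, 7 (right child 8) on the right.
root = Node(6, Node(4, Node(2), Node(5)), Node(7, None, Node(8)))
print(is_bst(root))               # True
print(is_bst(Node(6, Node(7))))   # False: 7 sits in the left subtree of 6
```

Passing the allowed key range down the recursion checks each key against all of its ancestors, not only against its direct parent, which is what properties 1 and 2 require.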

Before discussing the operations on a binary search tree, let's look at how to traverse one. A binary search tree can be traversed with a preorder, inorder, or postorder tree walk. The names reflect where the key of a subtree's root is output relative to its left and right subtrees. Taking the inorder walk as an example, the pseudocode is as follows:

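INORDER-TREE-WALK(x)
    if x != nil  // the node is not empty
        INORDER-TREE-WALK(x.left)  // first walk the left subtree recursively, until the left child is empty
        print x.key     // output the current node (the first time this line runs it prints the minimum, because that node is the leftmost node of the whole tree)
        INORDER-TREE-WALK(x.right)  // then walk the right subtree recursively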

For the binary search tree in the figure above, the walk proceeds as animated below and outputs 2, 4, 5, 6, 7, 8, that is, the keys in ascending order. This is because the walk keeps descending into the left child before printing anything: the first node printed is the first one whose left child is empty, after which the recursion unwinds and printing continues from there.
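The inorder walk can be sketched in Python as follows. The names `Node` and `inorder` and the shape of the tree are assumptions for illustration; the walk collects keys in a list instead of printing them so the result is easy to inspect.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def inorder(x, out):
    """Append the keys of the subtree rooted at x to out in inorder."""
    if x is not None:
        inorder(x.left, out)    # everything in the left subtree comes first
        out.append(x.key)       # then the node itself
        inorder(x.right, out)   # then everything in the right subtree
    return out


# Assumed shape for the example keys 2, 4, 5, 6, 7, 8.
root = Node(6, Node(4, Node(2), Node(5)), Node(7, None, Node(8)))
keys = inorder(root, [])
print(keys)  # [2, 4, 5, 6, 7, 8]
```

Moving the `out.append` line before or after both recursive calls would give the preorder or postorder walk respectively.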

2. Queries

A binary search tree should also support the MINIMUM, MAXIMUM, SUCCESSOR, and PREDECESSOR operations, which find the minimum, the maximum, the successor, and the predecessor. Each of them runs in O(h) time, where h is the height of the tree; for a balanced tree that is O(lg n).

Finding a specified key

The TREE-SEARCH operation searches a binary search tree for a node with a given key. It takes a pointer to the root of the tree and a key k, and returns a pointer to a node with key k if one exists, or nil otherwise.

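TREE-SEARCH(x, k)
    if x == nil or k == x.key  // not present, or found: return immediately
        return x
    if k < x.key                    // k is smaller than the current key, so by the property search the left subtree
        return TREE-SEARCH(x.left, k)
    else                          // k is larger than the current key, so by the property search the right subtree
        return TREE-SEARCH(x.right, k)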

For example, to find the node with key 5, start at the root node 6 and compare it with 5. Because 5 is less than 6, the search continues in the left subtree of node 6. On reaching node 4, because 5 is greater than 4, the search continues in the right subtree of node 4, where node 5 is found, and the function returns a pointer to node 5. If the target node is not found, TREE-SEARCH returns nil. The entire search process is as follows:

Minimum and maximum keys

To find the minimum, start at the root of the tree and keep following left children until a node whose left child is nil is reached. By the binary search tree property, if a node x has no left subtree, every key in the right subtree of x is at least x.key, and every ancestor passed on the way down also has a key that is at least x.key, so x holds the minimum key of the whole tree.
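A minimal Python sketch of the search follows. `tree_search` mirrors the recursive pseudocode; `tree_search_iterative` is an equivalent loop added here as an illustration, not something taken from the original post. The tree shape is assumed so that the search for 5 visits 6, then 4, then 5, as in the walkthrough.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def tree_search(x, k):
    """Recursive search: return the node with key k in the subtree rooted at x, or None."""
    if x is None or k == x.key:
        return x
    if k < x.key:
        return tree_search(x.left, k)
    return tree_search(x.right, k)


def tree_search_iterative(x, k):
    """Same descent written as a loop, which avoids deep recursion on tall trees."""
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x


# Assumed shape for the example keys 2, 4, 5, 6, 7, 8.
root = Node(6, Node(4, Node(2), Node(5)), Node(7, None, Node(8)))
print(tree_search(root, 5).key)         # 5
print(tree_search_iterative(root, 3))   # None: key 3 is not in the tree
```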

TREE-MINIMUM(x)
    while x.left != nil  // keep descending along left children until a node with no left subtree is reached; that node is the minimum
        x = x.left
    return x

Similarly, the pseudocode for finding the maximum key is as follows:

TREE-MAXIMUM(x)
    while x.right != nil  // keep descending along right children until a node with no right subtree is reached; that node is the maximum
        x = x.right
    return x

The time complexity of finding the maximum or minimum key is O(h), where h is the height of the tree (O(lg n) for a balanced tree), because the search follows a single path from top to bottom and the length of that path is at most the height of the tree. For example, finding the minimum proceeds as follows:

Predecessor and successor

Given a node in a binary search tree, we sometimes need to find its successor in the order determined by an inorder traversal. If all keys are distinct, the successor of a node x is the node with the smallest key greater than x.key.
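The two loops translate almost directly into Python. This is only a sketch; the function names and the tree shape are assumptions for illustration.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right


def tree_minimum(x):
    """Follow left children from x; the last node reached holds the smallest key."""
    while x.left is not None:
        x = x.left
    return x


def tree_maximum(x):
    """Follow right children from x; the last node reached holds the largest key."""
    while x.right is not None:
        x = x.right
    return x


# Assumed shape for the example keys 2, 4, 5, 6, 7, 8.
root = Node(6, Node(4, Node(2), Node(5)), Node(7, None, Node(8)))
print(tree_minimum(root).key, tree_maximum(root).key)  # 2 8
```

Both functions assume that x is not None, just as the pseudocode assumes x != nil on entry.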

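TREE-SUCCESSOR(x)
    if x.right != nil   // case 1: the right subtree is not empty, so the successor is its minimum, the smallest key greater than x (every key in the right subtree is greater than x)
        return TREE-MINIMUM(x.right)
    y = x.p    // case 2: the right subtree is empty
    while y != nil and x == y.right
        x = y   // x now stands for an ancestor of the original x; the loop stops once x is the left child of its parent
        y = y.p   // y is the parent of x; when x is the left child of y the loop stops and y is returned
    return y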

1. The first case is relatively simple. If the right subtree of x is not empty, then its successor is the leftmost node of the right subtree, which corresponds to pseudocode case 1. The smallest node of the right subtree, 72, is also the leftmost node of the right subtree.

2. In the second case the right subtree of x is empty. The successor of x is always the smallest key greater than x.key (or it does not exist), so where is that key when x has no right subtree? We simply start at x and walk up the tree until we reach the first node that is either the root (its parent is nil, so x has no successor) or the left child of its parent; in the latter case that parent, whose left subtree contains x, is the successor. For example, in the following figure, to find the successor of 17 we walk up the tree and first pass nodes 13 and 11, which do not qualify because neither is the left child of its parent. When we reach node 10, x points to node 10 and y points to node 19; node 10 is the left child of node 19, so the condition is met and y, the successor of node 17, is returned.

As another example, in the following figure, to find the successor of 15 we again walk up the tree until we reach node 10 (the variable x in the pseudocode points to node 10 at this moment): it is an ancestor of 15 and it is a left child. The parent of node 10, node 19, is therefore returned as the successor of node 15.

In a binary search tree, every node except the one with the largest key has a successor. Finding the predecessor is symmetric to finding the successor and is not repeated here.
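Both cases can be sketched in Python with parent pointers. The `predecessor` function is the mirror image that the text leaves out, written here as an assumption of what "the same principle" means. The `insert` helper and the key 25 are hypothetical additions used only to build a small tree in which 17 hangs below 13, 11, and 10, with 10 as the left child of 19.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


def insert(root, key):
    """Hypothetical helper: insert key and return the node that was created."""
    z, y, x = Node(key), None, root
    while x is not None:
        y = x
        x = x.left if key < x.key else x.right
    z.parent = y
    if y is not None:
        if key < y.key:
            y.left = z
        else:
            y.right = z
    return z


def minimum(x):
    while x.left is not None:
        x = x.left
    return x


def maximum(x):
    while x.right is not None:
        x = x.right
    return x


def successor(x):
    if x.right is not None:                  # case 1: leftmost node of the right subtree
        return minimum(x.right)
    y = x.parent                             # case 2: climb until we arrive from a left child
    while y is not None and x is y.right:
        x, y = y, y.parent
    return y


def predecessor(x):
    if x.left is not None:                   # mirror of case 1
        return maximum(x.left)
    y = x.parent                             # mirror of case 2
    while y is not None and x is y.left:
        x, y = y, y.parent
    return y


root = insert(None, 19)
nodes = {19: root}
for k in (10, 11, 13, 17, 25):
    nodes[k] = insert(root, k)

print(successor(nodes[17]).key)    # 19
print(successor(nodes[25]))        # None: 25 is the largest key
print(predecessor(nodes[19]).key)  # 17
```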

3. Insert

Insertion changes the dynamic set represented by the binary search tree, so the tree must be modified in a way that preserves the binary search tree property. Because that property only requires keys on the left to be less than or equal to the parent's key and keys on the right to be greater than or equal to it, insertion is relatively simple.

Inserting a node into a binary search tree is done by calling TREE-INSERT, which takes the node z as input, where z.left = nil, z.right = nil, and z.key is the key to be inserted:

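TREE-INSERT(T, z)
    y = nil
    x = T.root
    while x != nil // when the loop ends x is nil, and that empty position is where z will be inserted
        y = x    // assign y here so that after the loop y is always the parent of x
        if z.key < x.key
            x = x.left
        else
            x = x.right
    z.p = y  // y is the parent of the empty slot x, so to insert z its parent pointer is set to y
    if y == nil  // y is nil, so the tree was empty and the root must point to z
        T.root = z
    elseif z.key < y.key   // decide whether z is the left or the right child of y
        y.left = z
    else
        y.right = z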

The pseudocode starts at the root of the tree, and the pointer x traces a simple downward path. The while loop compares z.key with x.key and moves the pointers x and y down the tree accordingly. When the loop ends, x is nil, which marks the empty slot where node z belongs, and y is the parent of that slot, so it is easy to decide whether z becomes the left or the right child of y. As an example:

In the figure above, to insert node 46 into the tree, x first points to the root, and 46 is compared with the root node 68 (node x). Because 46 is less than 68, x moves to the left child 62 of the root, and then keeps moving down. When x points to 45, 46 is greater than 45, so x moves to the right child of node 45. Now x is nil, the loop ends, and the position of node 46 has been found: the right child of node 45. The remaining lines then link node 46 into the tree.
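The insertion procedure can be sketched in Python as follows. The classes `Node` and `Tree` and the small tree built from 68, 62, and 45 are assumptions for illustration; the real figure presumably contains more nodes, but the path taken by 46 is the same.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class Tree:
    def __init__(self):
        self.root = None


def tree_insert(T, z):
    """Link node z into tree T at the empty slot found by walking down from the root."""
    y, x = None, T.root
    while x is not None:          # x walks down, y trails one step behind as its parent
        y = x
        x = x.left if z.key < x.key else x.right
    z.parent = y                  # done once, after the loop has found the slot
    if y is None:
        T.root = z                # the tree was empty
    elif z.key < y.key:
        y.left = z
    else:
        y.right = z


T = Tree()
for key in (68, 62, 45):
    tree_insert(T, Node(key))

z = Node(46)
tree_insert(T, z)
print(z.parent.key)               # 45
print(z.parent.right is z)        # True: 46 became the right child of 45
```

Note that the parent assignment and the final linking happen after the loop, not inside it; doing them on every iteration would attach z at each node along the path.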

4. Delete

Removing a node z from a binary search tree is a little tricky, but in general it can be divided into three cases:

1. If z has no children, simply delete it and modify its parent so that nil replaces z as the child.

2. If z has only one child, lift that child into the position of z and modify z's parent so that the child of z replaces z.

3. If z has two children, the successor y of z takes the position of z (here y must be in the right subtree of z, because the right child of z is not empty). The rest of z's original right subtree becomes the new right subtree of y, and the left subtree of z becomes the new left subtree of y. This case is slightly more troublesome because it also depends on whether y is the right child of z.

Case 1: Node z has no children

This case is relatively simple: node z can be deleted directly without affecting the binary search tree property.

The animation shows the process:

Case 2: Node z has only one child

This case is also relatively simple: node z is replaced directly by its only child. In fact, the first and second cases can be merged into one: when node z has fewer than two children, its child replaces it, and when z has no children, nil takes its place. They are listed as separate cases here only to make the explanation clearer.

For example, as shown in the figure below, node 42 has only a left child, so the right child pointer of its parent, node 6, is pointed at node 29, and the parent of node 29 is set to node 6:

When there is only a right child, the left child pointer of 94 is pointed at 78, and the parent of 78 is set to 94:

Case 3: Node z has two children

This case is a little more complicated, because we need to find the successor y of node z, and the procedure then depends on whether y is the direct right child of z.

1. The successor y of z is the right child of z.
In this case z can be replaced directly by its successor y. The left child of y must be empty (a successor found in the right subtree never has a left child), so the left child of z simply takes the place of y's empty left child.

The animation below deletes node 67:

2. The successor y of z is not the right child of z
In this case we first replace y with the right child x of y, and then replace z with y:

The animation below deletes node 50. First 73 is replaced by 74: the left child pointer of 82, the parent of 73, is pointed at 74, and the parent of 74 is set to 82. Then 73 takes the place of 50: the left child 31 of 50 becomes the left child of 73, and the right child 82 of 50 becomes the right child of 73:

To express the deletion in pseudocode, we first define a subroutine TRANSPLANT, which replaces the subtree rooted at u with the subtree rooted at v: the parent of u becomes the parent of v, and v becomes the corresponding child of u's parent:

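TRANSPLANT(T, u, v)
    if u.p == nil   // u is the root of the tree, so point the root directly at v
        T.root = v
    elseif u == u.p.left   // u is a left child, so the left child of u's parent points to v
        u.p.left = v
    else    // u is a right child, so the right child of u's parent points to v
        u.p.right = v
    if v != nil
        v.p = u.p    // set the parent of v to the parent of u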

Then implement the specific deletion process:

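TREE-DELETE(T, z)
    if z.left == nil  // no left child: replace z with its right child, whether or not that child is nil (nil corresponds to case 1, otherwise case 2)
        TRANSPLANT(T, z, z.right)
    elseif z.right == nil  // no right child: replace z with its left child
        TRANSPLANT(T, z, z.left)
    else y = TREE-MINIMUM(z.right)  // both children exist: find the successor y of z, the minimum of z's right subtree (case 3)
        if y.p != z   // the successor y is not the right child of z (case 3, sub-case 2)
            TRANSPLANT(T, y, y.right)  // replace y with its right child
            y.right = z.right    // the right child of y now points to the right child of z
            y.right.p = y     // set the parent of y's right child (the former right child of z) to y
        TRANSPLANT(T, z, y)  // replace z with y
        y.left = z.left
        y.left.p = y     // set the parent of y's left child (the former left child of z) to y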

So in general, delete operations can be divided into two categories:

1. When the total number of children of z is less than 2, the deletion of z is completed by directly replacing z with the children of z.

2. When z has two children:
2.1. The successor y of z is the right child of z: just replace z with y (don't forget to set the parent node of the left child of z to y).
2.2. The successor y of z is not the right child of z: first replace y with x, the right child of y, and then replace z with y.
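Putting the pieces together, the deletion cases can be sketched in Python as below. All class and helper names are assumptions for illustration. The demo tree is built from the keys mentioned in the walkthrough (50, 31, 82, 73, 74) plus a hypothetical extra key 90, so that deleting 50 exercises the sub-case where the successor 73 is not the right child of 50.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None


class Tree:
    def __init__(self, keys=()):
        self.root = None
        for key in keys:
            self.insert(key)

    def insert(self, key):
        z, y, x = Node(key), None, self.root
        while x is not None:
            y = x
            x = x.left if key < x.key else x.right
        z.parent = y
        if y is None:
            self.root = z
        elif key < y.key:
            y.left = z
        else:
            y.right = z


def search(x, k):
    while x is not None and k != x.key:
        x = x.left if k < x.key else x.right
    return x


def minimum(x):
    while x.left is not None:
        x = x.left
    return x


def transplant(T, u, v):
    """Replace the subtree rooted at u with the subtree rooted at v (v may be None)."""
    if u.parent is None:
        T.root = v
    elif u is u.parent.left:
        u.parent.left = v
    else:
        u.parent.right = v
    if v is not None:
        v.parent = u.parent


def tree_delete(T, z):
    if z.left is None:                 # zero children, or only a right child
        transplant(T, z, z.right)
    elif z.right is None:              # only a left child
        transplant(T, z, z.left)
    else:                              # two children: the successor y takes z's place
        y = minimum(z.right)
        if y.parent is not z:          # y is deeper in the right subtree
            transplant(T, y, y.right)
            y.right = z.right
            y.right.parent = y
        transplant(T, z, y)
        y.left = z.left
        y.left.parent = y              # the new parent is y


def inorder(x):
    return [] if x is None else inorder(x.left) + [x.key] + inorder(x.right)


T = Tree([50, 31, 82, 73, 90, 74])
tree_delete(T, search(T.root, 50))
print(T.root.key, inorder(T.root))     # 73 [31, 73, 74, 82, 90]
```

After the deletion, 73 is the root with 31 on its left and 82 on its right, and 74 has taken the place of 73 as the left child of 82.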

5. Summary

Because of the binary search tree property, each comparison can discard about half of the remaining data on average, so each operation takes time proportional to the height of the tree, which is O(lg n) on average. In the worst case, however, the binary search tree degenerates into a linked list and the time complexity degrades to O(n). Many improved variants of the binary search tree keep the height at O(lg n), such as the SBT (size balanced tree), the AVL tree, and the red-black tree.
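The degenerate case is easy to reproduce. The sketch below, whose helper names are made up for illustration, inserts the same keys once in sorted order and once in shuffled order and compares the resulting heights. Height is counted here as the number of nodes on the longest path from the root to a leaf.

```python
import random


class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None


def build(keys):
    """Insert keys one by one into an initially empty, unbalanced binary search tree."""
    root = None
    for key in keys:
        z, y, x = Node(key), None, root
        while x is not None:
            y = x
            x = x.left if key < x.key else x.right
        if y is None:
            root = z
        elif key < y.key:
            y.left = z
        else:
            y.right = z
    return root


def height(root):
    """Number of nodes on the longest root-to-leaf path, computed level by level."""
    levels = 0
    frontier = [] if root is None else [root]
    while frontier:
        levels += 1
        frontier = [c for node in frontier
                    for c in (node.left, node.right) if c is not None]
    return levels


n = 1000
sorted_height = height(build(range(n)))      # every key goes to the right: a linked list

shuffled = list(range(n))
random.Random(0).shuffle(shuffled)           # fixed seed so the run is repeatable
shuffled_height = height(build(shuffled))

print(sorted_height)                         # 1000
print(shuffled_height < sorted_height)       # True
```

Inserting sorted keys produces a chain of n nodes, which is the O(n) worst case, while a random insertion order gives a much shorter tree. Balanced variants such as AVL and red-black trees guarantee the short height regardless of insertion order.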

Reprint: Multi-dynamic graph explains binary search tree in detail | Congcong Li's Blog (lufficc.com) 

Origin blog.csdn.net/yangbindxj/article/details/123911981