Red-black tree (RBTree)

What is a red-black tree?

Red-black trees were invented by Rudolf Bayer in 1972 and were called symmetric binary B-trees at the time.

In 1978, Leo J. Guibas and Robert Sedgewick reworked them into today's "red-black tree".

Like the AVL tree, the red-black tree is a self-balancing binary search tree. It keeps the tree balanced through rotations during insertion and deletion, which keeps queries efficient.

Compared with AVL trees, red-black trees give up part of their balance in exchange for fewer rotations during insertion and deletion. In workloads with many insertions and deletions, this usually makes their overall performance better than that of AVL trees.

Although a red-black tree is complex, its worst-case running time is very good and it is efficient in practice: it can search, insert, and delete in O(log n) time, where n is the number of elements in the tree.

Characteristics of red-black trees

The red-black tree is the most commonly used balanced binary search tree in practice. It is not strictly balanced, but its average performance is very good.

In a red-black tree, every node is labeled either red or black.

A red-black tree must satisfy the following properties:
Property 1: Every node is either black or red
Property 2: The root node must be black
Property 3: Leaf nodes (NIL) must be black
Property 4: Both children of every red node are black. (No path from a leaf to the root may contain two consecutive red nodes)
Property 5: All paths from any node to each of its leaves contain the same number of black nodes

The red property says that the children of a red node must be black. The children of a black node, however, can be red or black.

The leaf property refers to the empty NIL nodes: in a red-black tree the leaves are these empty NIL nodes, whereas the leaves of an AVL tree are ordinary, non-empty nodes.
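The five properties can be checked mechanically. The following is a minimal sketch, not a reference implementation: the names `Node`, `black_height`, and `is_red_black` are made up for this example, and `None` is assumed to stand for a black NIL leaf.

```python
from dataclasses import dataclass
from typing import Optional

RED, BLACK = "red", "black"

@dataclass
class Node:
    key: int
    color: str
    left: Optional["Node"] = None    # None plays the role of a black NIL leaf
    right: Optional["Node"] = None

def black_height(node: Optional[Node]) -> int:
    """Return the black height of a subtree; raise ValueError on a violation."""
    if node is None:
        return 1                      # property 3: NIL leaves count as black
    if node.color not in (RED, BLACK):
        raise ValueError("property 1 violated")
    if node.color == RED:
        for child in (node.left, node.right):
            if child is not None and child.color == RED:
                raise ValueError("property 4 violated")
    left_bh = black_height(node.left)
    right_bh = black_height(node.right)
    if left_bh != right_bh:
        raise ValueError("property 5 violated")
    return left_bh + (1 if node.color == BLACK else 0)

def is_red_black(root: Optional[Node]) -> bool:
    if root is not None and root.color != BLACK:
        return False                  # property 2: the root must be black
    try:
        black_height(root)
    except ValueError:
        return False
    return True

good = Node(10, BLACK, Node(5, RED), Node(15, RED))
red_red = Node(10, BLACK, Node(5, RED, Node(1, RED)))   # breaks property 4
uneven = Node(10, BLACK, Node(5, BLACK))                 # breaks property 5
print(is_red_black(good), is_red_black(red_red), is_red_black(uneven))
```

The checker only validates colors and black counts; it assumes the keys already form a valid binary search tree.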

Based on the properties above, a newly inserted node is generally colored red.

**Reason:** See the last property. A red node is the least likely to break the rules. A black node would most likely give its branch one more black node than the other branches, which breaks the balance.

Memory points:
You can remember the properties of red-black trees by the category in brackets:
(Color property) Property 1: Every node is either black or red
(Root property) Property 2: The root node must be black
(Leaf property) Property 3: Leaf nodes (NIL) must be black
(Red property) Property 4: Both children of every red node are black. (No path from a leaf to the root may contain two consecutive red nodes)
(Black property) Property 5: All paths from any node to each of its leaves contain the same number of black nodes.

The black property can be understood as the balance condition: whenever it cannot be met, a balancing operation must be performed.

Trading space for time
A red-black tree is in some sense a space-for-time optimization: compared with a plain binary search tree node, each node stores an extra color attribute, which increases space consumption. In return, the color attribute reduces the number of balancing operations needed later.

Perfect black balance

A red-black tree is not height-balanced in the AVL sense. In the original figure, the left subtree of the root node P is clearly taller than the right subtree.

According to property 5 of a red-black tree, all paths from any node to each of its leaves contain the same number of black nodes, which means:
The left and right subtrees of a red-black tree contain the same number of levels of black nodes.
The balance condition of a red-black tree constrains the height counted in black nodes, not the overall height.
This kind of balance is therefore called perfect black balance.

To see the effect of perfect black balance, remove the red nodes from a red-black tree by merging each one into its black parent. The result is a 2-3-4 tree (each node has up to four children) in which every leaf is at the same depth. That depth is the number of black nodes on the path from the root of the red-black tree to a leaf.
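A small sketch makes the difference between ordinary height and black height concrete. The tree below is made up for illustration (it is not the tree from the original figure), and `black_height` assumes property 5 already holds, so it only follows the leftmost path.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    key: str
    red: bool = False
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def height(n: Optional[Node]) -> int:
    """Number of nodes on the longest path from n down to a leaf."""
    return 0 if n is None else 1 + max(height(n.left), height(n.right))

def black_height(n: Optional[Node]) -> int:
    """Number of black nodes on the leftmost path (NIL not counted)."""
    if n is None:
        return 0
    return black_height(n.left) + (0 if n.red else 1)

# P is black; its left child L is red with two black children; its right child C is black.
root = Node("P", False,
            Node("L", True, Node("A"), Node("B")),
            Node("C"))

print(height(root.left), height(root.right))              # ordinary heights differ
print(black_height(root.left), black_height(root.right))  # black heights are equal
```

The left subtree is one level taller than the right one, yet both contain exactly one level of black nodes.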

The three operations that restore balance in a red-black tree

As soon as any of the five properties is violated, the balance is considered broken. Balance is restored with three operations: recoloring, left rotation, and right rotation.

Recoloring

The color of a node changes from red to black or from black to red.

Left rotation

Left-rotating a node X uses X's right child as the pivot. X (the root of the subtree) moves down and becomes the left child of the pivot, and the pivot takes X's place. The pivot's original left subtree becomes the right subtree of X. The pivot's right subtree remains unchanged.

Right rotation

Right-rotating a node X uses X's left child as the pivot. X (the root of the subtree) moves down and becomes the right child of the pivot, and the pivot takes X's place. The pivot's original right subtree becomes the left subtree of X. The pivot's left subtree remains unchanged.

The left and right rotations of red-black trees are similar to the left and right rotations of AVL trees.
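The two rotations can be sketched as follows. This is a simplified illustration without parent pointers or colors; `Node`, `rotate_left`, `rotate_right`, and `inorder` are names chosen for this example. Each function returns the new root of the rotated subtree.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def rotate_left(x: Node) -> Node:
    """x's right child (the pivot) rises; x becomes the pivot's left child."""
    pivot = x.right
    x.right = pivot.left    # pivot's old left subtree becomes x's right subtree
    pivot.left = x
    return pivot

def rotate_right(x: Node) -> Node:
    """x's left child (the pivot) rises; x becomes the pivot's right child."""
    pivot = x.left
    x.left = pivot.right    # pivot's old right subtree becomes x's left subtree
    pivot.right = x
    return pivot

def inorder(n: Optional[Node]) -> List[int]:
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

root = Node(10, Node(5), Node(20, Node(15), Node(30)))
rotated = rotate_left(root)
print(rotated.key, inorder(rotated))
```

A rotation changes the shape of the subtree but never the in-order sequence of keys, which is why it is safe to use on a binary search tree.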

Red-black tree node insertion

By default, a newly inserted node is red:
A red node does not change the number of black nodes on any path, and because the parent node is often black, inserting the new node as red frequently causes no color conflict at all.

Scenario 1: The red-black tree is empty

Use the inserted node as the root node.

In addition, property 2 requires the root node to be black, so the inserted node must be set to black.

Scenario 2: The key of the inserted node already exists

Update the value of the existing node to the value of the inserted node.

Scenario 3: The parent node of the inserted node is black

Since the inserted node is red, a black parent means the balance of the red-black tree is not affected. The node is inserted directly and no self-balancing is needed.

Scenario 4: The parent node of the inserted node is red

According to property 2, the root node is black.
If the parent of the inserted node is red, the parent cannot be the root node, so the inserted node always has a grandparent node (three generations are involved).

According to property 4, both children of a red node must be black, so two red nodes can never be directly connected.

Two situations can arise:
the father and the uncle are both red;
the father is red and the uncle is black.

Scenario 4.1: The father and the uncle are both red

Below, F is the father, V is the uncle, and P is the grandparent.

According to property 4, red nodes cannot be connected, so the grandparent must be black.
The father is red, so from the grandparent down to the inserted node the three levels of this subtree are colored black, red, red.

Two connected red nodes are not allowed, so the colors must be changed.

Recoloring: black, red, red ==> red, black, red

  1. Change the F and V nodes to black
  2. Change P to red
  3. Set P as the current node for subsequent processing

After this step P is red.
If the parent of P is black, no further processing is required.
If the parent of P is red, property 4 is violated again, so P is set as the current node and the insertion fix-up is repeated from P until the whole tree is balanced. If P is the root, it is simply set back to black (scenario 1).
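The recoloring step can be sketched in a few lines. This is a hedged illustration only: `Node` and `recolor_red_uncle` are hypothetical names, and the small tree is constructed by hand so that the inserted node has a red father F and a red uncle V under a black grandparent P.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color=RED, left=None, right=None):
        self.key, self.color, self.parent = key, color, None
        self.left, self.right = left, right
        for child in (left, right):
            if child is not None:
                child.parent = self

def recolor_red_uncle(node):
    """Father and uncle are red: recolor and return the grandparent as the new current node."""
    father = node.parent
    grand = father.parent
    uncle = grand.right if father is grand.left else grand.left
    father.color = uncle.color = BLACK   # step 1: F and V become black
    grand.color = RED                    # step 2: P becomes red
    return grand                         # step 3: P is the new current node

n = Node(1)                              # newly inserted red node
f = Node(5, RED, left=n)                 # F: red father
v = Node(15, RED)                        # V: red uncle
p = Node(10, BLACK, left=f, right=v)     # P: black grandparent
root = Node(20, BLACK, left=p, right=Node(30, BLACK))

current = recolor_red_uncle(n)
print(current.key, current.color, current.parent.color)
```

Here the parent of P is black, so the fix-up stops after one step. With a red parent the same check would simply be repeated with `current` as the inserted node.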

Scenario 4.2: The uncle is black, the father is red, and the father is the left child of the grandparent

The uncle is black, or does not exist (a NIL uncle also counts as black), and the father of the inserted node is the left child of the grandparent.
Note: Directly after a plain insertion, the uncle is either red or NIL. A real (non-NIL) black uncle would violate property 5, because the path through the uncle would contain one more black node than the paths through the red father. A non-NIL black uncle can only appear while the recoloring of scenario 4.1 is being repeated further up the tree.

Scenario 4.2.1: LL type imbalance

The newly inserted node is the left child of its father (LL red case). After insertion there is an LL type imbalance.

Self-balancing processing:

  1. Recolor: set F to black and P to red
  2. Right-rotate P, so that F becomes the root of this subtree

Scenario 4.2.2: LR type imbalance

The newly inserted node is the right child of its father (LR red case). After insertion there is an LR type imbalance.

Self-balancing processing:

  1. Left-rotate F
  2. Set F as the current node, which gives the LL red case
  3. Process according to the LL red case (1. recolor, 2. right-rotate P)

Scenario 4.3: The uncle is black, the father is red, and the father is the right child of the grandparent

Scenario 4.3.1: RR type imbalance

The newly inserted node is the right child of its father (RR red case).

Self-balancing processing:

  1. Recolor: set F to black and P to red
  2. Left-rotate P, so that F becomes the root of this subtree

Scenario 4.3.2: RL type imbalance

The newly inserted node is the left child of its father (RL red case).

Self-balancing processing:

  1. Right-rotate F
  2. Set F as the current node, which gives the RR red case
  3. Process according to the RR red case (1. recolor, 2. left-rotate P)
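Putting scenarios 1 to 4 together, an insertion routine might look like the sketch below. It is one possible implementation under stated assumptions, not the original author's code: `RBTree`, `_rotate`, `_fix_insert`, and `black_height` are illustrative names, `None` stands for a black NIL leaf, and the left and right cases share one code path through the `to_left` flag.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, value, parent=None):
        self.key, self.value, self.parent = key, value, parent
        self.color = RED                      # new nodes start out red
        self.left = self.right = None         # None stands for a black NIL leaf

class RBTree:
    def __init__(self):
        self.root = None

    def _rotate(self, x, to_left):
        """Rotate x down; its right child (to_left) or left child rises."""
        p = x.right if to_left else x.left
        inner = p.left if to_left else p.right
        if to_left:
            x.right, p.left = inner, x
        else:
            x.left, p.right = inner, x
        if inner is not None:
            inner.parent = x
        p.parent = x.parent
        if x.parent is None:
            self.root = p
        elif x is x.parent.left:
            x.parent.left = p
        else:
            x.parent.right = p
        x.parent = p

    def insert(self, key, value=None):
        parent, cur = None, self.root
        while cur is not None:
            if key == cur.key:
                cur.value = value             # scenario 2: key exists, update value
                return
            parent, cur = cur, (cur.left if key < cur.key else cur.right)
        node = Node(key, value, parent)
        if parent is None:
            self.root = node                  # scenario 1: empty tree
        elif key < parent.key:
            parent.left = node
        else:
            parent.right = node
        self._fix_insert(node)

    def _fix_insert(self, n):
        # Scenario 3 (black parent) never enters the loop.
        while n.parent is not None and n.parent.color == RED:      # scenario 4
            f, g = n.parent, n.parent.parent
            f_is_left = f is g.left
            uncle = g.right if f_is_left else g.left
            if uncle is not None and uncle.color == RED:            # scenario 4.1
                f.color = uncle.color = BLACK
                g.color = RED
                n = g
                continue
            if n is (f.right if f_is_left else f.left):             # LR / RL
                self._rotate(f, to_left=f_is_left)
                n, f = f, n
            f.color, g.color = BLACK, RED                           # LL / RR
            self._rotate(g, to_left=not f_is_left)
        self.root.color = BLACK               # the root is always black

def black_height(n):
    """Checker: return the black height, asserting the red and black properties."""
    if n is None:
        return 1
    if n.color == RED:
        assert all(c is None or c.color == BLACK for c in (n.left, n.right))
    lh, rh = black_height(n.left), black_height(n.right)
    assert lh == rh
    return lh + (n.color == BLACK)

tree = RBTree()
for k in [10, 20, 30, 15, 25, 5, 1]:
    tree.insert(k)
print(tree.root.key, black_height(tree.root))
```

Inserting 30 triggers the RR case, while inserting 15 and 1 both trigger the recoloring of scenario 4.1.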

Red-black tree node deletion

Deleting a node from a red-black tree also consists of two parts:

  • Find the target node
  • Self-balance after deletion

If the target node does not exist, the operation is ignored. If it exists, self-balancing must be performed after the deletion. After a node is deleted, another node has to be found to take its place; otherwise its subtree would be disconnected from the parent. Only when the deleted node has no children is no replacement needed.

Note: R is the replacement node that will take the position of the deleted node. Before the deletion, R takes part in rebalancing the tree at its original position. Once balance is restored, R is moved to the position of the deleted node and the deletion is complete.
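Finding the replacement node R can be sketched as below. The choice of the in-order successor for a node with two children is an assumption of this example (the predecessor would work equally well), and `Node` and `replacement_for` are hypothetical names.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def replacement_for(node):
    """Return the node that would replace `node`, or None if it has no children."""
    if node.left is not None and node.right is not None:
        succ = node.right                 # in-order successor:
        while succ.left is not None:      # leftmost node of the right subtree
            succ = succ.left
        return succ
    # At most one child: that child (or nothing) takes the node's place.
    return node.left if node.left is not None else node.right

root = Node(10, Node(5), Node(20, Node(15), Node(30)))
print(replacement_for(root).key)
```

A leaf such as the node with key 5 has no replacement, which matches the remark above that a node without children does not need one.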

Case 1: The replacement node is red

When the replacement node is moved to the position of the deleted node, removing a red node from its original position does not affect the balance of the red-black tree. It is enough to set the color of the replacement node to the color of the deleted node to restore balance.

Processing: change the color to the color of the deleted node

Case 2: The replacement node is black

When the replacement node is black, self-balancing must be performed. Depending on whether the replacement node is the left or the right child of its parent, different rotations are used to rebalance the tree.

Case 2.1: The replacement node is the left child of its parent

Below, R is the replacement node, P is its parent, S is its sibling, and SL and SR are the left and right children of S.

Case 2.1.1: The sibling of the replacement node is red

If the sibling is red, then according to property 4 the parent and the children of the sibling must be black.

Processing:

  • Set S to black
  • Set P to red
  • Left-rotate P; the old SL becomes the new sibling of R and is black, which gives case 2.1.2
  • Continue with the matching sub-case of case 2.1.2

Case 2.1.2: The sibling of the replacement node is black

If the sibling is black, the colors of the parent and of the sibling's children cannot be determined in advance.

Case 2.1.2.1: The right child of the sibling is red and the left child is any color

The replacement node is black, so after the deletion the left subtree has one black node fewer. A black node must therefore be borrowed from the right subtree, which requires a left rotation.

Processing:

  • Set the color of S to the color of P
  • Set P to black
  • Set SR to black
  • Left-rotate P

After the left rotation, deleting R no longer affects the balance.

Case 2.1.2.2: The right child of the sibling is black and the left child is red

To delete R without affecting the balance, a red node must be borrowed from the right side, so case 2.1.2.2 is first converted into case 2.1.2.1.

Processing:

  • Set S to red
  • Set SL to black
  • Right-rotate S, which gives case 2.1.2.1
  • Process as in case 2.1.2.1

Case 2.1.2.3: Both children of the sibling are black

In this case the sibling has no red node to lend, so the missing black node has to be made up somewhere else. Because a red-black tree is repaired from the bottom up, just as during insertion, P can be treated as the new replacement node, even though the node actually being deleted is still R. The cases above are then repeated from P upward.

Processing:

  • Set S to red
  • Use P as the new replacement node
  • Re-run the deletion case analysis

Case 2.2: The replacement node is the right child of its parent

This is the mirror image of case 2.1, with left and right swapped.

Case 2.2.1: The sibling of the replacement node is red

Processing:

  • Set S to black
  • Set P to red
  • Right-rotate P; the old SR becomes the new sibling of R and is black, which gives case 2.2.2
  • Continue with the matching sub-case of case 2.2.2

Case 2.2.2: The sibling of the replacement node is black

Case 2.2.2.1: The left child of the sibling is red and the right child is any color

Processing:

  • Set the color of S to the color of P
  • Set P to black
  • Set SL to black
  • Right-rotate P

Case 2.2.2.2: The left child of the sibling is black and the right child is red

Processing:

  • Set S to red
  • Set SR to black
  • Left-rotate S, which gives case 2.2.2.1
  • Process as in case 2.2.2.1

Case 2.2.2.3: Both children of the sibling are black

Processing:

  • Set S to red
  • Use P as the new replacement node
  • Re-run the deletion case analysis

Summary of rules

  • Loop condition: x != root && x.color == BLACK, that is, x is not the root node and its color is black
  • Finishing operation: set x to black
  • x is either the left or the right child of its parent, and the two sets of operations are symmetrical, so it is enough to remember the operations for the left child and mirror them for the right child

x is the left child of its parent

  • The brother is red: change the brother to black and the parent to red; left-rotate the parent to restore the black height of the left subtree, and the left nephew becomes the new brother
  • The brother is black, and both nephews are black: the brother becomes red, x points to the parent, and the adjustment continues
  • The brother is black, the right nephew is black (the left nephew is red): the left nephew becomes black and the brother becomes red; right-rotate the brother to restore the black height of the right subtree, and the left nephew becomes the new brother
  • The brother is black and the right nephew is red: the brother takes the color of the parent, and the parent and the right nephew become black; left-rotate the parent, then point x to the root of the whole tree to end the loop
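The rules above can be written as one loop. The sketch below is an illustration under several assumptions: `None` stands for a black NIL leaf, so the parent of x is passed explicitly (x itself may be `None` right after a black leaf was removed); the names `Tree`, `rotate`, `color`, and `delete_fixup` are made up; and the mirrored right-hand case is handled by swapping the attribute names `near` and `far`.

```python
RED, BLACK = "red", "black"

class Node:
    def __init__(self, key, color, left=None, right=None):
        self.key, self.color, self.parent = key, color, None
        self.left, self.right = left, right
        for child in (left, right):
            if child is not None:
                child.parent = self

class Tree:
    def __init__(self, root):
        self.root = root

def color(n):
    return BLACK if n is None else n.color    # NIL leaves are black

def rotate(tree, x, to_left):
    """Rotate x down; its right child (to_left) or left child rises."""
    p = x.right if to_left else x.left
    inner = p.left if to_left else p.right
    if to_left:
        x.right, p.left = inner, x
    else:
        x.left, p.right = inner, x
    if inner is not None:
        inner.parent = x
    p.parent = x.parent
    if x.parent is None:
        tree.root = p
    elif x is x.parent.left:
        x.parent.left = p
    else:
        x.parent.right = p
    x.parent = p

def delete_fixup(tree, x, parent):
    while x is not tree.root and color(x) == BLACK:
        is_left = x is parent.left
        near, far = ("left", "right") if is_left else ("right", "left")
        s = getattr(parent, far)                       # the brother
        if color(s) == RED:                            # brother is red
            s.color, parent.color = BLACK, RED
            rotate(tree, parent, to_left=is_left)
            s = getattr(parent, far)                   # near nephew is the new brother
        if color(getattr(s, near)) == BLACK and color(getattr(s, far)) == BLACK:
            s.color = RED                              # both nephews black
            x, parent = parent, parent.parent          # continue one level up
            continue
        if color(getattr(s, far)) == BLACK:            # only the near nephew is red
            getattr(s, near).color = BLACK
            s.color = RED
            rotate(tree, s, to_left=not is_left)
            s = getattr(parent, far)
        s.color = parent.color                         # far nephew is red
        parent.color = BLACK
        getattr(s, far).color = BLACK
        rotate(tree, parent, to_left=is_left)
        x = tree.root                                  # ends the loop
    if x is not None:
        x.color = BLACK                                # finishing operation

# A black left leaf of P was just removed; the brother is black with a red right nephew.
p = Node(10, BLACK, None, Node(20, BLACK, None, Node(30, RED)))
tree = Tree(p)
delete_fixup(tree, None, p)
print(tree.root.key, tree.root.left.color, tree.root.right.color)
```

In this hand-built example a single left rotation of the parent repairs the tree: 20 becomes the root, with black children 10 and 30.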

RBT interview questions:

Question: With binary search trees, why do we need balanced binary trees?

A binary search tree easily degenerates into a chain.
When that happens, the time complexity of a search degrades from O(log n) to O(n).
The AVL balanced binary tree restricts the height difference between the left and right subtrees, which guarantees that the worst-case time complexity of a search is also O(log n).

Question: With balanced binary trees, why do we need red-black trees?

The height difference between the left and right subtrees of an AVL tree cannot exceed 1, so insertions and deletions frequently need rotations to maintain balance. In scenarios where insertions and deletions are frequent, these rotations noticeably reduce the performance of AVL trees.
Red-black trees sacrifice strict balance in exchange for only a small number of rotations during insertion and deletion.

Overall, their performance in such scenarios is better than that of AVL trees:

  • An imbalance caused by an insertion into a red-black tree is fixed with no more than two rotations; an imbalance caused by a deletion is fixed with no more than three rotations.

  • The red-black rules ensure that a search completes in O(log n) time even in the worst case
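
The worst-case bound follows from the red and black properties. As a short sketch of the standard argument: let n be the number of non-NIL nodes, h the height of the tree, and bh the number of black nodes on any path from the root down to a NIL leaf (not counting the root itself). A subtree with black height bh contains at least 2^bh - 1 nodes, and by property 4 at least half of the nodes on any downward path are black:

```latex
n \ge 2^{\mathrm{bh}} - 1, \qquad \mathrm{bh} \ge \frac{h}{2}
\quad\Longrightarrow\quad n \ge 2^{h/2} - 1
\quad\Longrightarrow\quad h \le 2\log_2(n+1)
```

So the height of a red-black tree is at most about twice that of a perfectly balanced tree, which keeps a search at O(log n).
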
Question: Do you still remember the properties of red-black trees?

You can remember the properties of red-black trees by the category in brackets:

  • (Color property) Every node is either black or red
  • (Root property) The root node must be black
  • (Leaf property) Leaf nodes (NIL) must be black
  • (Red property) Both children of every red node are black. (No path from a leaf to the root may contain two consecutive red nodes)
  • (Black property) All paths from any node to each of its leaves contain the same number of black nodes.

Question: What are the internal operations of the red-black tree?

Recoloring
Turning a red node black, or a black node red, is the recoloring of that node.
Rotation
Similar to the rotation operations of AVL balanced binary trees.

The difference between red-black trees and AVL trees

1. The mechanisms for adjusting the balance are different.
A red-black tree decides whether it is unbalanced by checking that every path contains the same number of black nodes. If it is unbalanced, balance is restored through recoloring and rotation.

An AVL tree decides whether it is unbalanced by the balance factor (the absolute value of the height difference between the left and right subtrees of every node must not exceed 1). If it is unbalanced, balance is restored through rotation.

2. Red-black tree insertion efficiency is higher.
Red-black trees use non-strict balance to reduce the number of rotations when adding and deleting nodes. Any imbalance is resolved within three rotations.

The red-black tree does not pursue "complete balance". It only requires partial balance, which reduces the need for rotations and thereby improves performance.

AVL is a strictly balanced tree (a height-balanced binary search tree), so when adding or deleting nodes it may need more rotations than a red-black tree, depending on the situation.

Therefore, the insertion efficiency of the red-black tree is higher.

3. The statistical performance of red-black trees is higher than that of AVL trees.
Red-black trees perform query, insertion, and deletion in O(log n) time.
AVL tree lookups, insertions, and deletions are O(log n) in both the average and the worst case.

The asymptotic time complexity of the red-black tree is the same as that of the AVL tree, but in practice, averaged over typical workloads, the red-black tree tends to perform better.

4. Applicability: AVL search efficiency is high.
If queries far outnumber insertions and deletions in your application, choose the AVL tree. If queries, insertions, and deletions occur about equally often, choose the red-black tree.

In other words, when the tree is used mainly for sorting (create, traverse, delete) with few or no searches, the red-black tree is more cost-effective.

Red-black tree VS AVL tree

Common balanced trees include red-black trees and AVL trees. Why do both STL implementations and the Linux kernel use red-black trees as their balanced tree? There are probably several reasons:

  1. In terms of implementation details, if inserting a node unbalances the tree, both the AVL tree and the red-black tree need at most 2 rotations, so both are O(1). If deleting a node unbalances the tree, however, an AVL tree in the worst case has to rebalance every node on the path from the deleted node to the root, so the number of rotations is O(log N), while an RB-Tree needs at most 3 rotations, which is O(1).

  2. In terms of balance requirements, the structure of an AVL tree is more strictly balanced than that of an RB-Tree, so inserting and deleting nodes is more likely to unbalance it. When a large amount of data is inserted or deleted, an AVL tree therefore has to rebalance more often, and the RB-Tree is more efficient in scenarios with many insertions and deletions. On the other hand, because the AVL tree is more strictly balanced, its search efficiency is higher.

  3. Generally speaking, the statistical performance of the RB-Tree is higher than that of AVL.

Origin blog.csdn.net/u010523811/article/details/132768469