Those years when the red-black tree tormented us in interviews

Interviewer: Xiao Guizi, right? Your resume says you are proficient in Java programming. You must have a solid grasp of Java, then?
Xiao Guizi: Yes, yes, I have always used Java to write bugs~
Interviewer: Then tell me about the underlying implementation of HashMap in JDK 1.7 and earlier, and why it can cause high CPU usage under high concurrency.
Xiao Guizi: Um... something like a red-black tree?
Interviewer: Oh? That is the design used since JDK 1.8. Since you brought it up, let's talk about the red-black tree as a data structure. Here is a sheet of blank paper and a pen. Write one out!
Xiao Guizi: Oh, oh, oh, teacher, my stomach suddenly hurts. I need to go to the toilet, I'll be back in a moment~~~



Interviews are full of traps... I wonder whether you have had an interview as embarrassing as Xiao Guizi's. If so, please leave a comment and share your story~

Now let's get to the topic and explore the red-black tree that the interviewer used to embarrass Xiao Guizi. For most people the red-black tree is both familiar and unfamiliar. It is familiar because we use it directly or indirectly in everyday coding, yet the complexity of its design and implementation makes many people shy away from how it works. The definition of a red-black tree is actually fairly simple: it only adds a few self-balancing rules to insertion and deletion, and there are fewer than ten of them. Stay with this article and I believe you will gain something. Let's get moving...

The rest of this article assumes that you already have some understanding of the structure, purpose, and drawbacks of binary trees and balanced binary trees.

image

The red-black tree (shown above, taken from Wikipedia) is a self-balancing binary search tree. Self-balancing means that during insertion and deletion the red-black tree applies a set of rules to adjust the shape of the tree, keeping its height as small as possible so that lookups stay fast. A red-black tree has the following properties:

  1. Every node is either red or black.
  2. The root node is always black.
  3. All leaf nodes (NIL nodes) are black.
  4. Both children of a red node are black (that is, no path from a leaf to the root contains two consecutive red nodes).
  5. Every simple path from a given node to any of its descendant leaves contains the same number of black nodes.

These properties guarantee that, while remaining a balanced binary tree, a red-black tree's longest path from the root to a leaf is never more than twice as long as its shortest one. To see why, consider the two extremes. From properties 4 and 5, the shortest possible path from the root to a leaf consists entirely of black nodes, while the longest possible path alternates between red and black nodes. Because both paths contain the same number of black nodes (property 5), the longest path has at most twice as many nodes as the shortest one.
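The properties and the resulting path bound can be checked mechanically. The following is a minimal Java sketch, not the implementation from the repository mentioned at the end of this article: the class name RbChecker, its Node type, and all method names are hypothetical, and a null reference stands in for a black NIL leaf.

```java
// A minimal sketch; RbChecker and its Node type are hypothetical names.
final class RbChecker {
    /** Immutable node; a boolean color makes property 1 hold by construction. */
    static final class Node {
        final int key;
        final boolean red;
        final Node left, right;

        Node(int key, boolean red, Node left, Node right) {
            this.key = key;
            this.red = red;
            this.left = left;
            this.right = right;
        }
    }

    static boolean isRed(Node n) {
        return n != null && n.red; // null stands for a black NIL leaf (property 3)
    }

    /** True if the tree satisfies properties 2, 4 and 5. */
    static boolean isValid(Node root) {
        return !isRed(root) && blackHeight(root) >= 0;
    }

    /**
     * Black nodes on any path from n down to a NIL leaf, counting the NIL
     * leaf itself, or -1 if property 4 or 5 is violated at or below n.
     */
    static int blackHeight(Node n) {
        if (n == null) return 1;
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1; // property 4
        int l = blackHeight(n.left);
        int r = blackHeight(n.right);
        if (l < 0 || l != r) return -1;                            // property 5
        return l + (n.red ? 0 : 1);
    }

    /** Number of nodes on the shortest root-to-NIL path. */
    static int shortest(Node n) {
        return n == null ? 0 : 1 + Math.min(shortest(n.left), shortest(n.right));
    }

    /** Number of nodes on the longest root-to-NIL path. */
    static int longest(Node n) {
        return n == null ? 0 : 1 + Math.max(longest(n.left), longest(n.right));
    }

    public static void main(String[] args) {
        Node root = new Node(10, false,
                new Node(5, false, null, null),
                new Node(20, true,
                        new Node(15, false, null, null),
                        new Node(25, false, null, null)));
        System.out.println("valid=" + isValid(root)
                + " shortest=" + shortest(root)
                + " longest=" + longest(root));
    }
}
```

For any tree accepted by isValid, longest(root) should never exceed 2 * shortest(root), which is the bound argued above.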

Self-balancing strategy

The basic operations on a red-black tree are insertion, deletion, update, and lookup. Neither lookup nor update changes the structure of the tree, so they work exactly as in an ordinary binary search tree. That leaves insertion and deletion. Both can break the red-black properties, but a balancing strategy restores them. The balancing strategy boils down to three operations: left rotation, right rotation, and color change. After inserting or deleting a node, applying these three operations along the path from that node to the root makes the tree satisfy the definition again.

  • Rotate left

For the current node, if the right child node is red and the left child node is black, a left rotation is performed, as shown in the following figure:

image
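A left rotation can be sketched in Java as follows. This is a minimal sketch under assumptions: the names LeftRotationSketch, Node, and rotateLeft are hypothetical, and the color hand-over inside the rotation follows the left-leaning formulation in Algorithms (4th Edition), which may differ in detail from the figure.

```java
// A minimal sketch; all names here are hypothetical.
final class LeftRotationSketch {
    static final class Node {
        int key;
        boolean red;
        Node left, right;

        Node(int key, boolean red) {
            this.key = key;
            this.red = red;
        }
    }

    /**
     * Rotates the subtree rooted at h to the left and returns its new root.
     * Assumes h.right is not null (it is the red right child).
     */
    static Node rotateLeft(Node h) {
        Node x = h.right;   // the red right child moves up
        h.right = x.left;   // x's left subtree sits between h and x in key order
        x.left = h;
        x.red = h.red;      // x takes over h's color (assumed convention)
        h.red = true;       // h becomes the red left child
        return x;
    }

    public static void main(String[] args) {
        Node h = new Node(10, false);
        h.right = new Node(20, true);
        Node top = rotateLeft(h);
        System.out.println("top=" + top.key + ", left child=" + top.left.key);
    }
}
```

The caller is expected to store the returned node where h used to be, for example parent.left = rotateLeft(parent.left).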

  • Rotate right

For the current node, if the left child and left grandchild nodes are both red, right rotation is performed, as shown in the following figure:

image
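The right rotation mirrors the left one. Again this is only a minimal Java sketch with hypothetical names (RightRotationSketch, Node, rotateRight), using the same assumed color hand-over as the left-leaning formulation in Algorithms (4th Edition).

```java
// A minimal sketch; all names here are hypothetical.
final class RightRotationSketch {
    static final class Node {
        int key;
        boolean red;
        Node left, right;

        Node(int key, boolean red) {
            this.key = key;
            this.red = red;
        }
    }

    /**
     * Rotates the subtree rooted at h to the right and returns its new root.
     * Assumes h.left is not null (it is the red left child).
     */
    static Node rotateRight(Node h) {
        Node x = h.left;    // the red left child moves up
        h.left = x.right;   // x's right subtree sits between x and h in key order
        x.right = h;
        x.red = h.red;      // x takes over h's color (assumed convention)
        h.red = true;       // h becomes the red right child
        return x;
    }

    public static void main(String[] args) {
        Node h = new Node(30, false);
        h.left = new Node(20, true);
        h.left.left = new Node(10, true);
        Node top = rotateRight(h);
        System.out.println("top=" + top.key
                + ", children=" + top.left.key + "," + top.right.key);
    }
}
```

After this rotation both children of the new top node are red, which is exactly the situation that the color change handles.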

  • Color change

For the current node, if the left and right child nodes are both red, the color change is performed, as shown in the following figure:

image
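The color change only touches three nodes and no links. A minimal Java sketch, with hypothetical names (ColorChangeSketch, Node, flipColors) and assuming the node has two non-null children:

```java
// A minimal sketch; all names here are hypothetical.
final class ColorChangeSketch {
    static final class Node {
        int key;
        boolean red;
        Node left, right;

        Node(int key, boolean red) {
            this.key = key;
            this.red = red;
        }
    }

    /**
     * Inverts the colors of h and both of its children.
     * Assumes h.left and h.right are not null.
     */
    static void flipColors(Node h) {
        h.red = !h.red;
        h.left.red = !h.left.red;
        h.right.red = !h.right.red;
    }

    public static void main(String[] args) {
        Node h = new Node(20, false);
        h.left = new Node(10, true);
        h.right = new Node(30, true);
        flipColors(h);
        System.out.println("parent red=" + h.red
                + ", children red=" + h.left.red + "," + h.right.red);
    }
}
```

Each path through h loses one black node at h and gains one at the child, so the number of black nodes per path is unchanged. The red color has simply moved one level up, which is why the check has to continue at h's parent.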

Insert operation

As a kind of balanced binary tree, the red-black tree also uses the search operation to locate the insertion point. The red-black tree stipulates that every newly inserted node is red, mainly to simplify rebalancing. For an empty tree, inserting a red node costs one extra color change. In every other case, inserting a black node would inevitably break property 5, whereas inserting a red node can only break property 4, and a simple adjustment is then enough to make the tree satisfy the definition again.

We agree that X is the inserted node, P is the parent node of X, G is the grandfather node of X, and U is the uncle node of X.

The following is a discussion of the insertion process in accordance with the above strategies and scenarios:

1. The newly inserted node X is the root node

At this time, the newly inserted node is red, which violates property 2, and only needs to be changed to black.

2. The parent node P of the newly inserted node X is black

There are two cases here, depending on how the value of X compares with that of its parent P. If X is smaller than P, simply insert X as the left child of P (left in the figure below). If X is greater than P, insert X as the right child of P and then perform a left rotation (right in the figure below).

image

3. The parent node P is red, and the uncle node U is also red.

Because P is red, G must be black by property 4. If X is smaller than P, it is inserted as the left child of P (as shown in the figure below), which violates property 4. A single color change is enough here: invert the colors of P, G, and U. G turning red removes one black node from every path through it, while P and U turning black adds one back, so the number of black nodes on each path is unchanged. However, G is now red, so the check must continue recursively upward from G.

image

If X is greater than P, it is inserted as the right child of P (as shown in the figure below), which again violates property 4. In this case, first perform a left rotation to reach the situation above, then apply the color change.

image

4. The parent node P is red, while the uncle node U is black or does not exist

Because P is red, G must be black by property 4. If X is smaller than P, it is inserted as the left child of P (as shown in the figure below), which violates property 4. First perform a right rotation. After the rotation, property 4 is still violated and the height of the left subtree has dropped by 1, so one more color change is needed to satisfy the definition.

image

If X is greater than P, it is inserted as the right child of P (as shown in the figure below), which violates property 4. Here we first perform a left rotation to reach the situation above, then perform a right rotation, and finally change colors.

image
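The insertion scenarios above can be folded into one recursive routine that applies the three balancing operations on the way back up from the insertion point. The sketch below follows the left-leaning insertion from Algorithms (4th Edition), which uses the same three trigger conditions as the rotation and color change rules described earlier. It is a sketch under assumptions rather than the implementation in the repository mentioned at the end of this article: every name in it is hypothetical, keys are plain ints, and duplicate keys are ignored.

```java
// A minimal sketch of left-leaning insertion; all names are hypothetical.
final class LlrbInsertSketch {
    static final class Node {
        int key;
        boolean red = true;   // new nodes are always red
        Node left, right;
        Node(int key) { this.key = key; }
    }

    Node root;

    static boolean isRed(Node n) { return n != null && n.red; }

    void insert(int key) {
        root = insert(root, key);
        root.red = false;     // scenario 1: the root is always recolored black
    }

    private static Node insert(Node h, int key) {
        if (h == null) return new Node(key);
        if (key < h.key) h.left = insert(h.left, key);
        else if (key > h.key) h.right = insert(h.right, key);
        // equal keys: nothing to do in this sketch

        // Repair on the way up, in this order.
        if (isRed(h.right) && !isRed(h.left)) h = rotateLeft(h);
        if (isRed(h.left) && isRed(h.left.left)) h = rotateRight(h);
        if (isRed(h.left) && isRed(h.right)) flipColors(h);
        return h;
    }

    static Node rotateLeft(Node h) {
        Node x = h.right;
        h.right = x.left;
        x.left = h;
        x.red = h.red;
        h.red = true;
        return x;
    }

    static Node rotateRight(Node h) {
        Node x = h.left;
        h.left = x.right;
        x.right = h;
        x.red = h.red;
        h.red = true;
        return x;
    }

    static void flipColors(Node h) {
        h.red = !h.red;
        h.left.red = !h.left.red;
        h.right.red = !h.right.red;
    }

    boolean contains(int key) {
        Node n = root;
        while (n != null) {
            if (key == n.key) return true;
            n = key < n.key ? n.left : n.right;
        }
        return false;
    }

    /** Black nodes per path (NIL not counted), or -1 if property 4 or 5 fails. */
    static int blackHeight(Node n) {
        if (n == null) return 0;
        if (n.red && (isRed(n.left) || isRed(n.right))) return -1;
        int l = blackHeight(n.left), r = blackHeight(n.right);
        if (l < 0 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }

    static int height(Node n) {
        return n == null ? 0 : 1 + Math.max(height(n.left), height(n.right));
    }

    public static void main(String[] args) {
        LlrbInsertSketch tree = new LlrbInsertSketch();
        for (int k = 1; k <= 100; k++) tree.insert(k);   // worst case for a plain BST
        System.out.println("height=" + height(tree.root)
                + ", blackHeight=" + blackHeight(tree.root));
    }
}
```

Inserting keys in ascending order would degrade a plain binary search tree into a linked list. Here the rotations and color changes keep the height logarithmic in the number of keys.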

Delete operation

As a kind of balanced binary tree, the red-black tree also uses the search operation to locate the node to delete. Before deleting, we need to determine how many children that node has. If it has two children, find either the node with the largest value in its left subtree or the node with the smallest value in its right subtree, and copy that node's value into the node to be deleted (as long as the target value disappears from the tree, it does not matter which node is physically removed). These two candidate nodes have one thing in common: each has at most one child, because each is already the largest or the smallest in its range (one mountain has no room for two tigers). The problem is thereby reduced to deleting a node with at most one child, which is relatively simple.
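This reduction step can be sketched independently of node colors. The following minimal Java sketch uses hypothetical names (DeleteReductionSketch, Node, min, nodeToUnlink) and arbitrarily picks the smallest node of the right subtree; using the largest node of the left subtree would work the same way.

```java
// A minimal sketch of the reduction step; all names are hypothetical.
final class DeleteReductionSketch {
    static final class Node {
        int key;
        Node left, right;
        Node(int key) { this.key = key; }
    }

    /** Smallest node in the subtree rooted at n (n must not be null). */
    static Node min(Node n) {
        while (n.left != null) n = n.left;
        return n;
    }

    /**
     * Returns the node that actually has to be unlinked in order to delete
     * target's value. If target has two children, the successor's key is
     * copied into target and the successor is returned instead.
     */
    static Node nodeToUnlink(Node target) {
        if (target.left != null && target.right != null) {
            Node successor = min(target.right);
            target.key = successor.key;   // the target value disappears here
            return successor;             // has no left child by construction
        }
        return target;                    // already has at most one child
    }

    public static void main(String[] args) {
        Node root = new Node(50);
        root.left = new Node(30);
        root.right = new Node(70);
        root.right.left = new Node(60);
        Node victim = nodeToUnlink(root);
        System.out.println("root now holds " + root.key
                + ", node to unlink holds " + victim.key);
    }
}
```

The returned node still has to be unlinked from its parent. In a red-black tree that is where the color scenarios below come in.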

We agree that X is the node to be deleted, P is the parent node of X, S is the child node of X, B is the sibling node of X, BL is the left child of B, and BR is the right child of B.

  1. If the node X to be deleted is a red node, it can be deleted directly without violating the definition.
  2. If the node X to be deleted is a black node, and its child node S is red, then it is only necessary to replace X with S, and change S from red to black at the same time.
  3. If the node X to be deleted is black, and its child node S is also black, this situation needs to be further discussed in different scenarios.

For the third case, we first replace X with S and rename it N. N inherits X's relatives, so P is now the parent of N and B is now the sibling of N. Note that X is the node actually removed here, and after the removal every path through N contains one fewer black node.

1. N is the new root

This case is relatively simple and does not require any further adjustment.

2. The parent node P of N, the sibling node B, and the child nodes of B are all black

As shown in the figure below, it is enough to recolor B red. Every path through B then has one fewer black node, which matches the paths through N. However, every path through P now has one fewer black node than before, so the adjustment must continue recursively upward, starting from P.

image

3. The sibling node B of N is red, and the rest of the nodes are black

As shown in the figure below, perform a left rotation and then exchange the colors of P and B. The number of black nodes on every path is the same before and after this adjustment, but the paths through N were already one short, so the definition is still not satisfied and the tree must be adjusted further according to the scenarios below.

image

4. The parent node P of N is red, while the sibling node B and the child nodes of B are all black

As shown in the figure below, we only need to exchange the colors of P and B. This has no effect on paths that do not pass through N, but it adds one black node to the paths that do, which exactly makes up for the loss caused by the deletion.

image

5. The sibling node B of N is black, the left child of B is red, and the right child of B is black

As shown in the figure below, perform a right rotation and then exchange the colors of B and BL. The number of black nodes on every path is unchanged, but N now has a new black sibling whose right child is red, so the adjustment can continue with the scenario introduced next.

image

Note: A white node means that the node can be either black or red, and the same is true for subsequent illustrations.

6. The sibling node B of N is black, and the right child of B is red

As shown in the figure below, first perform a left rotation, then swap the colors of P and B, and recolor the right child of B black. The number of black nodes on paths that do not pass through N is unchanged, while the paths through N gain one black node, which exactly makes up for the loss caused by the deletion.

image
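The bookkeeping in this last scenario can be verified on a tiny example. The sketch below is an illustration under assumptions, not a full deletion routine: the names DeleteScenario6Sketch, Node, fix, and blackHeight are hypothetical, N is assumed to be the left child of P, and N is represented by null (a NIL leaf left behind after X was removed).

```java
// A minimal sketch of delete scenario 6 only; all names are hypothetical.
final class DeleteScenario6Sketch {
    static final class Node {
        int key;
        boolean red;
        Node left, right;

        Node(int key, boolean red) {
            this.key = key;
            this.red = red;
        }
    }

    /**
     * Applies scenario 6 at p, where N is p's left child, the sibling
     * B = p.right is black and B's right child BR is red.
     * Returns the new root of this subtree.
     */
    static Node fix(Node p) {
        Node b = p.right;
        // plain left rotation around p (links only)
        p.right = b.left;
        b.left = p;
        // swap the colors of P and B
        boolean pWasRed = p.red;
        p.red = b.red;
        b.red = pWasRed;
        // recolor BR black
        b.right.red = false;
        return b;
    }

    /** Black nodes per path (NIL not counted), or -1 if paths disagree. */
    static int blackHeight(Node n) {
        if (n == null) return 0;
        int l = blackHeight(n.left), r = blackHeight(n.right);
        if (l < 0 || l != r) return -1;
        return l + (n.red ? 0 : 1);
    }

    public static void main(String[] args) {
        Node p = new Node(20, true);          // P, red in this example
        p.right = new Node(30, false);        // B, black
        p.right.right = new Node(40, true);   // BR, red
        // p.left is null: the black node X that used to be there was deleted
        System.out.println("before: " + blackHeight(p));
        Node top = fix(p);
        System.out.println("after:  " + blackHeight(top));
    }
}
```

Before the fix the paths through N are one black node short, so blackHeight reports a mismatch. After the fix every path in the subtree carries the same number of black nodes again, and the subtree root has the color that P had before.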

Summary

The main difficulty of the red-black tree lies in the self-balancing adjustments during insertion and deletion. The adjustments for insertion are relatively simple, while deletion has more scenarios to handle. For both insertion and deletion, readers are advised to lay all the figures side by side and study them together, which reveals how each scenario leads into the next. To keep this article short, I will not post the combined long picture here.

In addition, readers are encouraged to build a red-black tree by hand on paper following the process above, and then delete its nodes one by one to deepen their understanding. The interactive website provided by the University of San Francisco ( click here ) can also help. The related implementation is in the org.zhenchao.classic.search package, at:

https://github.com/plotor/algorithm-design

Robert Sedgewick, one of the authors of the red book "Algorithms", is one of the people who introduced the red-black tree. The red-black tree builds on the 2-3 tree, and the self-balancing strategy of the 2-3 tree is much easier to understand than that of the red-black tree. I recommend reading the relevant chapters of the book as you study.

References

  1. Red-black tree - Wikipedia
  2. Algorithms (4th Edition)
