Red black tree

Red black tree

 

The nature and definition of red-black trees

A red-black tree is a binary search tree that satisfies the following properties:

1. Each node is either red or black.

2. The root node is black.

3. All leaf nodes are black (actually all are Null pointers, indicated by NIL in the figure below). Leaf nodes do not contain any keyword information, and all query keywords are on non-terminals./

4. The two child nodes of each red node must be black. In other words: there cannot be two consecutive red nodes on all paths from each leaf to the root

5. All paths from any node to each leaf contain the same number of black nodes

 

 

Black depth-the number of black nodes on the path starting from a certain node x (not including the node x itself) to the leaf node (including the leaf node), called the black depth of the node x, remember For bd(x), the black depth of the root node is the black depth of the red-black tree. The black depth of the leaf node is 0. For example: in the above figure bd(13)=2, bd(8)=2, bd(1)=1

Internal node-the non-terminal point of the red-black tree

External nodes-leaf nodes of red-black trees

 

Red-black tree related theorems

1.  The longest possible path from root to leaf is no more than twice as long as the shortest possible path.

      According to the above property 5, we know that there are 3 black nodes on each path of the red-black tree in the above figure. Therefore, the shortest path length is 2 (the path without red nodes). According to property 4 (two red nodes cannot be connected) and property 1, 2 (leaves and roots must be black nodes). Then we can get that: a path with 3 black nodes can only have 2 red nodes at most (there is a red-black interval). That is to say, the longest path of a red-black tree with a black depth of 2 (the root node is also black) is 4, and the shortest path is 2. From this we can see that the red-black trees are roughly balanced. (Of course it is worse than the balanced binary tree, the balance factor of AVL is at most 1)

 

2. The height (h) of the red-black tree is not more than twice the black depth (bd) of the red-black tree, that is, h<=2bd

      According to Theorem 1, it is not difficult for us to explain this point. bd is the shortest path length of the red-black tree. The longest possible path length (the maximum tree height) is the red and black path, which is equal to 2bd. Therefore h<=2bd.

 

3. The height of a red-black tree with n internal nodes (not including leaf nodes) h<=2log(n+1)

      Below we first prove that a red-black tree with n internal nodes satisfies n>=2^bd-1. This can be proved by mathematical induction, which is summed up in tree height h. When h=0, this is equivalent to a leaf node, the black height bd is 0, and the number of internal nodes n is 0, at this time 0>=2^0-1 holds. Assuming that when the tree height is h<=t, n>=2^bd-1 is established, we remember that the number of internal nodes in the left subtree of the root node of a red-black tree with a tree height of t+1 is nl, and the right The number of internal nodes of the subtree is nr, remember the black height of the two subtrees as bd' (note that the black height of the two subtrees must be the same), obviously the height of the two subtrees is <=t, so There are nl>=2^bd'-1 and nr>=2^bd'-1, add these two inequalities to nl+nr>=2^(bd'+1)-2, add the inequality to the left and right 1. Obtain n>=2^(bd'+1)-1. Obviously bd'+1>=bd, so the previous inequality can be changed to n>=2^bd-1, which proves that one has The red-black tree with n internal nodes satisfies n>=2^bd-1.

        According to Theorem 2, h<=2bd. That is, n>=2^(h/2)-1, then h<=2log(n+1)

        From here, we can see that the search length of the red-black tree does not exceed 2log(n+1) at most, so the search time complexity is also O(log N) level.

 

Red-black tree operation

 

Because each red-black tree is also a specialized binary search tree, the search operation on the red-black tree is the same as the search operation on the ordinary binary search tree. However, inserting and deleting operations on the red-black tree will result in no longer conforming to the nature of the red-black tree. Restoring the attributes of the red-black tree requires a small amount of (O(log n)) color changes (actually very fast) and no more than three tree rotations (two for insert operations). Although insertion and deletion are complicated, the operation time can still be kept as O(log n) times.

 

Insert operation

We first increase the node with the binary search tree method and mark it as red. (If it is set to black, it will result in a path from the root to the leaf and an extra black node. This is difficult to adjust. But after setting it to a red node, it may cause two consecutive red nodes to appear. Conflict, then you can adjust it through color flips and tree rotation.) The next operation depends on the colors of other neighboring nodes. As in the human family tree, we will use the term uncle node to refer to the siblings of a node's parent.

 

Suppose the newly added node is N, the father node is P, the uncle node is Ui (the uncle node is the sibling node of a series of P), and the grandfather node G (the father of the father node P). Each situation will be given below, we will use the C sample code to demonstrate. Through the following functions, you can find the uncle and grandfather of a node:  

C code

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. node grandparent(node n) {    
  2.      return n->parent->parent;    
  3.  }    
  4.      
  5. node uncle(node n) {    
  6.      if (n->parent == grandparent(n)->left)    
  7.          return grandparent(n)->right;    
  8.      else    
  9.          return grandparent(n)->left;    
  10. }    

node grandparent(node n) {return n->parent->parent;
}

node uncle(node n) {
if (n->parent == grandparent(n)->left)
return grandparent(n)->right;
else
return grandparent(n)->left;
}  

 

Situation 1. The current red-black tree is empty, the new node N is located at the root of the tree, and there is no parent node.

 

       At this point, it is very simple. We will directly insert a black node N (satisfying property 2). In other cases, the inserted N is red (the reason is mentioned above).

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void insert_case1(node n) {    
  2.     if (n->parent == NULL)    
  3.         n->color = BLACK;    
  4.     else    
  5.         insert_case2(n);//insert case 2     
  6. }    

void insert_case1(node n) {if (n->parent == NULL)
n->color = BLACK;
else
insert_case2(n);//insert case 2
}  

Case 2. The parent node P of the new node N is black.

 

       In this case, we insert a red node N (satisfying property 5).

Java code 

view plain copy to clipboard print ?

  1. void insert_case2(node n) {    
  2.     if (n->parent->color == BLACK)    
  3.         return;//The tree is still valid     
  4.     else    
  5.         insert_case3(n);//insert case 3     
  6. }    

void insert_case2(node n) {if (n->parent->color == BLACK)
return;//the tree is still valid
else
insert_case3(n);//insert case 3
}  

 

Note: In cases 3, 4, and 5, we assume that the new node has a grandparent node because the parent node is red; and if it is the root, it should be black. So the new node always has an uncle node, although it may be a leaf in cases 4 and 5.

 

Case 3. If both the parent node P and the uncle node U are red.

 

        As shown in the figure below, because the newly added node N must be red, then we can redraw the parent node P (guaranteed property 4) and the uncle node U of N (guaranteed property 5) to black. If the grandfather node G is the root at this time, the change ends. If it is not the root, the grandfather node is redrawn in red (guaranteed property 5). However, G's father may also be red, in order to guarantee the nature4. We regard G recursion as a newly added node N for rechecking various situations.

      

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void insert_case3(node n) {    
  2.     if (uncle(n) != NULL && uncle(n)->color == RED) {    
  3.         n->parent->color = BLACK;    
  4.         uncle(n)->color = BLACK;    
  5.         grandparent(n)->color = RED;    
  6.         insert_case1(grandparent(n));    
  7.     }    
  8.     else    
  9.         insert_case4(n);    
  10. }    

void insert_case3(node n) {if (uncle(n) != NULL && uncle(n)->color == RED) {
n->parent->color = BLACK;
uncle(n)->color = BLACK;
grandparent (n)->color = RED;
insert_case1(grandparent(n));
}
else
insert_case4(n);
}  

 

Note: In cases 4 and 5, we assume that the parent node P is the left child of the grandfather node G. If it is a right child node, the left and right in case 4 and case 5 should be reversed.

 

Case 4. The parent node P is red and the uncle node U is black or missing; in addition, the new node N is the right child node of its parent node P, and the parent node P is the left child node of the grandfather node G.

 

       As shown in the figure below, in this case, we perform a left rotation to exchange the roles of the new node and its parent node (the same as the left rotation of the AVL tree); this causes some paths to pass through the new node N or parent node that they did not pass before. One of P, but both nodes are red, so the property 5 does not fail. But the current situation will violate the property 4, so then, we continue to deal with the previous parent node P according to the following situation 5.

 

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void insert_case4(node n) {    
  2.        
  3.       if (n == n->parent->right && n->parent == grandparent(n)->left) {    
  4.         rotate_left(n->parent);    
  5.         n = n->left;    
  6.     } else if (n == n->parent->left && n->parent == grandparent(n)->right) {    
  7.         rotate_right(n->parent);    
  8.         n = n->right;    
  9.     }    
  10.     insert_case5(n)    
  11. }    

void insert_case4(node n) {

if (n == n->parent->right && n->parent == grandparent(n)->left) { rotate_left(n->parent); n = n->left; } else if (n == n->parent->left && n->parent == grandparent(n)->right) { rotate_right(n->parent); n = n->right; } insert_case5(n) Copy code

}  

   

Case 5. The parent node P is red and the uncle node U is black or missing, the new node N is the left child node of its parent node, and the parent node P is the left child node of the grandfather node G.

 

       As shown in the figure below: In this case, we perform a right rotation for the grandparent node P; in the tree generated by the rotation, the previous parent node P is now the parent node of the new node N and the previous grandparent node G. We know that the previous grandparent node G is black, otherwise the parent node P cannot be red. We switch the colors of the previous parent node P and grandparent node G, and the resulting tree satisfies property 4 [3]. Property 5[4] also remains satisfied, because all paths through any of these three nodes previously passed through the grandparent node G, and now they all pass through the previous parent node P. In each case, these are the only black nodes among the three nodes.

         

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void insert_case5(node n) {    
  2.     n->parent->color = BLACK;    
  3.     grandparent(n)->color = RED;    
  4.     if (n == n->parent->left && n->parent == grandparent(n)->left) {    
  5.         rotate_right(grandparent(n));    
  6.     } else {    
  7.         /* Here, n == n->parent->right && n->parent == grandparent(n)->right */    
  8.         rotate_left(grandparent(n));    
  9.     }    
  10. }    

void insert_case5(node n) {n->parent->color = BLACK;
grandparent(n)->color = RED;
if (n == n->parent->left && n->parent == grandparent(n) ->left) {
rotate_right(grandparent(n));
} else {
/* Here, n == n->parent->right && n->parent == grandparent(n)->right */
rotate_left(grandparent( n));
}
}  

 

Delete operation

 

If the node to be deleted has two sons, then the problem can be transformed into a problem of deleting another node with only one son (for ease of presentation, the son referred to here is the son of a non-leaf node). For a binary search tree, when deleting a node with two non-leaf children, we find either the largest element in its left subtree or the smallest element in its right subtree, and put its The value is transferred to the node to be deleted (as shown here). We then delete the node from which we copied the value. It must have fewer than two non-leaf children. Because only a value is copied without violating any attributes, this reduces the problem to the problem of how to delete nodes that have at most one son. It doesn't care whether this node is the node to be deleted initially or the node from which we copied the value.

 

In the rest of this article, we only need to discuss deleting a node with only one son (if both of its sons are empty, that is, both are leaves, we arbitrarily consider one of them as its son). If we delete a red node, its father and son must be black. So we can simply replace it with its black son without destroying attributes 3 and 4. All paths through the deleted node are only missing a red node, so that attribute 5 can continue to be guaranteed. Another simple case is when the deleted node is black and its son is red. If we just remove this black node and replace it with its red son, attribute 4 will be destroyed, but if we redraw its son as black, all paths that have passed through it will pass through its black son, so that it can continue to be maintained. Property 4.

 

What needs further discussion is that this is a complicated situation when both the node to be deleted and its son are black. We first replace the node to be deleted with its son. For convenience, call this son N, and his brother (the other son of his father) S. In the following diagram, we still use P to call the father of N, SL to call the left son of S, and SR to call the right son of S. We will use the following function to find the sibling node:

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. struct node * sibling(struct node *n)    
  2. {    
  3.         if (n == n->parent->left)    
  4.                 return n->parent->right;    
  5.         else    
  6.                 return n->parent->left;    
  7. }    

struct node * sibling(struct node *n) {
if (n == n->parent->left)
return n->parent->right;
else
return n->parent->left;
}  

 We can use the following code to perform the above summary steps, where the function replace_node replaces child to the position of n in the tree. For convenience, the code in this chapter will assume that empty leaves are represented by actual node objects that are not NULL (the code in the inserted chapter can work with any representation).

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void delete_one_child(struct node *n)    
  2. {    
  3.         /*  
  4.          * Precondition: n has at most one non-null child.  
  5.          */    
  6.         struct node *child = is_leaf(n->right)? n->left: n->right;    
  7.      
  8.         replace_node(n, child);    
  9.         if (n->color == BLACK) {    
  10.                 if (child->color == RED)    
  11.                         child->color = BLACK;    
  12.                 else    
  13.                         delete_case1(child);    
  14.         }    
  15.         free(n);    
  16. }    

void delete_one_child(struct node *n) {
/* * Precondition: n has at most one non-null child. */
struct node *child = is_leaf(n->right)? n->left: n->right;

replace_node(n, child); if (n->color == BLACK) { if (child->color == RED) child->color = BLACK; else delete_case1(child); } free(n); Copy code

}  

 If N and its original parent are black, then deleting its parent will cause the path through N to have one black node less than the path not through it. Because this violates attribute 4, the tree needs to be rebalanced. There are several situations to consider:

 

**Case 1. **** N is the new root.

        In this case, we are done. We removed a black node from all paths, and the new root is black, so the attributes are maintained.

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void delete_case1(struct node *n)    
  2. {    
  3.         if (n->parent != NULL)    
  4.                 delete_case2(n);    
  5. }    

void delete_case1(struct node *n) {
if (n->parent != NULL)
delete_case2(n);
}

Note: In cases 2, 5, and 6, we assume that N is the left son of its father. If it is the right son, then the left and right should be reversed in these cases.

 

Case 2. S is red.

 

        In this case we do a left rotation on N's father, transforming the red brother into N's grandfather. We then swapped the colors of N's father and grandfather. Although all paths still have the same number of black nodes, now N has a black brother and a red father, so we can proceed according to 4, 5 or 6 cases. (Its new brother is black because it is a son of red S.)

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void delete_case2(struct node *n)    
  2. {    
  3.         struct node *s = sibling(n);    
  4.      
  5.         if (s->color == RED) {    
  6.                 n->parent->color = RED;    
  7.                 s->color = BLACK;    
  8.                 if (n == n->parent->left)    
  9.                         rotate_left(n->parent);    
  10.                 else    
  11.                         rotate_right(n->parent);    
  12.         }    
  13.         delete_case3(n);    
  14. }    
  15.    

void delete_case2(struct node *n) {
struct node *s = sibling(n);

if (s->color == RED) { n->parent->color = RED; s->color = BLACK; if (n == n->parent->left) rotate_left(n->parent); else rotate_right(n->parent); } delete_case3(n); Copy code

}

Situation 3:  N's father, S and S's sons are all black.

 

       In this case, we simply redraw S to red. The result is all the paths that pass through S, which are those paths that did not pass through N before, and all have one black node missing. Because deleting the initial parent of N makes all paths through N one less black node, this balances things out. However, all paths that pass through P now have one black node less than the paths that do not pass through P, so property 4 is still violated. To fix this problem, we have to start with Case 1, and do rebalancing on P.

 ,

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void delete_case3(struct node *n)    
  2. {    
  3.         struct node *s = sibling(n);    
  4.      
  5.         if ((n->parent->color == BLACK) &&    
  6.             (s->color == BLACK) &&    
  7.             (s->left->color == BLACK) &&    
  8.             (s->right->color == BLACK)) {    
  9.                 s->color = RED;    
  10.                 delete_case1(n->parent);    
  11.         } else    
  12.                 delete_case4(n);    
  13. }  

void delete_case3(struct node *n) {
struct node *s = sibling(n);

if ((n->parent->color == BLACK) && (s->color == BLACK) && (s->left->color == BLACK) && (s->right->color == BLACK)) { s->color = RED; delete_case1(n->parent); } else delete_case4(n); Copy code

}   

 

Situation 4. Both S and S's sons are black, but N's father is red.

 

       In this case, we simply exchange the colors of N's brother and father. This does not affect the number of black nodes on the path that does not pass N, but it adds one to the number of black nodes on the path that passes N, adding up the black nodes that are deleted on these paths.

Java code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void delete_case4(struct node *n)    
  2. {    
  3.         struct node *s = sibling(n);    
  4.      
  5.         if ((n->parent->color == RED) &&    
  6.             (s->color == BLACK) &&    
  7.             (s->left->color == BLACK) &&    
  8.             (s->right->color == BLACK)) {    
  9.                 s->color = RED;    
  10.                 n->parent->color = BLACK;    
  11.         } else    
  12.                 delete_case5(n);    
  13. }    

void delete_case4(struct node *n) {
struct node *s = sibling(n);

if ((n->parent->color == RED) && (s->color == BLACK) && (s->left->color == BLACK) && (s->right->color == BLACK)) { s->color = RED; n->parent->color = BLACK; } else delete_case5(n); Copy code

}  

 

Case 5. S is black, S's left son is red, S's right son is black, and N is its father's left son.

 

      In this case we do a right rotation on S, so that S's left son becomes S's father and N's new brother. We then exchanged the colors of S and its new father. All paths still have the same number of black nodes, but now N has a black brother whose right son is red, so we enter case 6. Neither N nor its father are affected by this transformation.

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void delete_case5(struct node *n)    

  2. {    

  3.         struct node *s = sibling(n);    

  4.      

  5.         if (s->color == BLACK)     

  6. /* this if statement is trivial,  

  7. due to Case 2 (even though Case two changed the sibling to a sibling's child,  

  8. the sibling's child can't be red, since no red parent can have a red child). */    

  9.     

  10. //the following statements just force the red to be on the left of the left of the parent,     

  11. //or right of the right, so case six will rotate correctly.     

  12.                 if ((n == n->parent->left) &&    

  13.                     (s->right->color == BLACK) &&    

  14.                     (s->left->color == RED)) {//this last test is trivial too due to cases 2-4.     

  15.                         s->color = RED;    

  16.                         s->left->color = BLACK;    

  17.                         rotate_right(s);    

  18.                 } else if ((n == n->parent->right) &&    

  19.                            (s->left->color == BLACK) &&    

  20.                            (s->right->color == RED)) {

    //this last test is trivial too due to cases 2-4.     

  21.                         s->color = RED;    

  22.                         s->right->color = BLACK;    

  23.                         rotate_left(s);    

  24.                 }    

  25.         }    

  26.         delete_case6(n);    

  27. }    

void delete_case5(struct node *n) {
struct node *s = sibling(n);

if (s-> color == BLACK) copy the code

/* this if statement is trivial, due to Case 2 (even though Case two changed the sibling to a sibling's child, the sibling's child can't be red, since no red parent can have a red child). */

//the following statements just force the red to be on the left of the left of the parent,
//or right of the right, so case six will rotate correctly.
if ((n == n->parent->left) &&
(s->right->color == BLACK) &&
(s->left->color == RED)) {//this last test is trivial too due to cases 2-4.
s->color = RED;
s->left->color = BLACK;
rotate_right(s);
} else if ((n == n->parent->right) &&
(s->left->color == BLACK) &&
(s->right ->color == RED)) {//this last test is trivial too due to cases 2-4.
s->color = RED;
s->right->color = BLACK;
rotate_left(s);
}
}
delete_case6( n);
}  

 

Case 6. S is black, S's right son is red, and N is its father's left son.

 

       In this case we do a left rotation on N's father, so that S becomes the father of N's father and S's right son. We then exchange the colors of N's father and S, and make S's right son black. The subtree still has the same color at its root, so attribute 3 is not violated. However, N now adds a black ancestor: either N's father becomes black, or it is black and S is added as a black grandfather. Therefore, a black node is added to the path through N.

       At this time, if a path does not pass through N, there are two possibilities:

      It passed N's new brother. Then it must pass through the fathers of S and N before and now, and they just exchanged colors. So the path maintains the same number of black nodes. 
It passed N's new uncle, S's right son. Then it used to pass S, S's father and S's right son, but now it only passes S, it is assumed to be the color of its previous father, and S's right son, it is changed from red to black. The composite effect is that this path passes through the same number of black nodes. 
In any case, the number of black nodes on these paths has not changed. So we restored attribute 4. The white nodes in the diagram can be red or black, but the same color must be specified before and after the transformation.

C code 

view plain copy to clipboard print ?

10 20 30 40 50 60 70 80 90 100 110 120 130 140 150

  1. void delete_case6(struct node *n)    
  2. {    
  3.         struct node *s = sibling(n);    
  4.      
  5.         s->color = n->parent->color;    
  6.         n->parent->color = BLACK;    
  7.      
  8.         if (n == n->parent->left) {    
  9.                 s->right->color = BLACK;    
  10.                 rotate_left(n->parent);    
  11.         } else {    
  12.                 s->left->color = BLACK;    
  13.                 rotate_right(n->parent);    
  14.         }    
  15. }    

void delete_case6(struct node *n) {
struct node *s = sibling(n);

s->color = n->parent->color; n->parent->color = BLACK; if (n == n->parent->left) { s->right->color = BLACK; rotate_left(n->parent); } else { s->left->color = BLACK; rotate_right(n->parent); } Copy code

}  

 

       Similarly, function calls all use tail recursion, so the algorithm is in-place. In addition, no recursive calls are made after the rotation, so a constant number (up to 3) of rotations are performed.

 

 

Advantages of red-black trees

 

Red-black trees can search, insert, and delete operations with O(log2(N)) time complexity. In addition, any imbalance will be resolved within 3 rotations. This is something that AVL does not have.

 

And in practical applications, many languages have implemented the red-black tree data structure. Such as TreeMap, TreeSet (Java), STL (C++), etc.