An article to learn about trees in data structures

1. Tree knowledge system construction

[Figure: tree knowledge map]

2. Tree basics

2.1 About the tree structure

The tree structure is a very important non-linear structure, which is a one-to-many hierarchical structure defined by the branch relationship.
A tree is a finite set of n (n≥0) nodes. When n=0, it is called an empty tree. Any non-empty tree satisfies:
a. There is exactly one distinguished node, called the root.
b. When n>1, the remaining nodes can be divided into m (m>0) disjoint finite sets T1, T2, ..., Tm, each of which is itself a tree and is called a subtree of the root.

A tree structure has the following properties:

  • The root node of the tree has no predecessor, and all nodes except the root node have one and only one predecessor.
  • All nodes in the tree can have zero or more successors.

2.2 Related concepts of trees

Node : A data element and its branches pointing to its subtrees.
Degree of a node : The number of subtrees owned by a node is called the degree of a node.
Degree of a tree : The maximum value of the degree of a node in a tree is called the degree of the tree.
Leaf node : A node with a degree of 0 (no branch down) in the tree is called a leaf node (or terminal node).
Non-leaf nodes : Nodes whose degree is not 0 in the tree (with branches below) are called non-leaf nodes (or non-terminal nodes or branch nodes).
Branch nodes other than the root node are also called internal nodes.
Child node : The root of a node's subtree is called a child node (or child) of that node.
Parent node : Correspondingly, a node is called the parent node (or parent) of the roots of its subtrees.
Sibling node : All child nodes of the same parent node are called sibling nodes.
Level : The level of the root node is defined as 1, and the level of every other node equals the level of its parent node plus 1.
Cousin nodes : Nodes on the same level whose parents are different are called cousin nodes.
Hierarchical path : Starting from the root node, all the nodes passed through to reach a certain node p are called the hierarchical path of node p (there is only one).
Ancestor node : All nodes (except p) on the hierarchical path of node p are called the ancestors of p.
Descendant node : Any node in the subtree rooted at a certain node (other than that node itself) is called a descendant node of that node.
Tree depth : The maximum level value of nodes in the tree, also known as the height of the tree.
Note : For any node, number of branches = number of children = degree.

2.3 Main properties of trees

Property 1 : The number of nodes in the tree is equal to the sum of the degrees of all nodes plus 1.
Property 2 : For a tree of degree m, let $n_0$ be the number of leaf nodes, $n_1$ the number of nodes of degree 1, ..., and $n_m$ the number of nodes of degree m. Then $n_0 = n_2 + 2n_3 + 3n_4 + \dots + (m-1)n_m + 1$.
Property 3 : In a non-empty tree of degree m, level i contains at most $m^{i-1}$ nodes (i≥1).
Property 4 : A tree of degree m (m>1) with height h has at most $\frac{m^{h}-1}{m-1}$ nodes.
Property 5 : The minimum height of a tree of degree m with n nodes is $\lceil \log_m[n(m-1)+1] \rceil$.
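
Property 2 follows from Property 1 in a few lines. The derivation below is a sketch that uses the symbols already defined, plus n for the total number of nodes (a symbol introduced here for convenience): count the nodes once by degree class and once via Property 1, then equate the two counts.

```latex
\begin{aligned}
n &= n_0 + n_1 + n_2 + \dots + n_m
  && \text{(every node has some degree } 0 \dots m\text{)} \\
n &= \left(0 \cdot n_0 + 1 \cdot n_1 + 2 \cdot n_2 + \dots + m \cdot n_m\right) + 1
  && \text{(Property 1)} \\
\Rightarrow\quad n_0 &= n_2 + 2n_3 + \dots + (m-1)\,n_m + 1
  && \text{(equate and cancel } n_1\text{)}
\end{aligned}
```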

2.4 Tree Representation

Parent Notation
Child Chain Notation
Child Sibling Notation
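
The article only names the three notations, so the following is a minimal sketch of two of them in C, under assumptions: the type names `PTNode`, `PTree`, `CSNode`, the constant `MAX_TREE_SIZE`, and the helper `pt_level` are all hypothetical and chosen for illustration. Child chain notation (one linked list of children per node) is left out to keep the example short.

```c
#include <assert.h>

#define MAX_TREE_SIZE 100

/* Parent notation: every node records the array index of its parent.
   The root stores -1 (an assumed convention). */
typedef struct {
    int data;
    int parent;
} PTNode;

typedef struct {
    PTNode nodes[MAX_TREE_SIZE];
    int n;                        /* number of nodes in use */
} PTree;

/* Child-sibling notation: each node points to its first child
   and to its next sibling on the right. */
typedef struct CSNode {
    int data;
    struct CSNode *firstchild, *nextsibling;
} CSNode;

/* Level of node i in a parent-notation tree (the root is level 1). */
int pt_level(const PTree *t, int i) {
    assert(0 <= i && i < t->n);
    int level = 1;
    while (t->nodes[i].parent != -1) {
        i = t->nodes[i].parent;   /* climb one step toward the root */
        level++;
    }
    return level;
}
```

Parent notation makes it cheap to walk upward (finding ancestors or the level), while child-sibling notation makes it cheap to walk downward and is the basis of the tree-to-binary-tree conversion in section 4.1.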

3. Binary tree

3.1 Definition and main features

The definition of a binary tree only needs to change the m branches in the definition of the tree to at most two branches.

3.1.1 Notes on binary trees
  • a. A binary tree distinguishes its left and right subtrees, so it is an ordered tree. This is what separates a binary tree from a tree of degree 2: a tree of degree 2 need not be ordered, but a binary tree is always ordered.
  • b. The binary tree has at most two branches, that is, the degree of the binary tree is at most 2, but it does not mean that the degree of each node is 2.
3.1.2 Properties of Binary Trees

Property 1 : The number of nodes in the tree is equal to the sum of the degrees of all nodes plus 1.
Property 2 : For any binary tree, let $n_0$ be the number of leaf nodes and $n_2$ the number of nodes of degree 2. Then $n_0 = n_2 + 1$.
Property 3 : In a non-empty binary tree, level i contains at most $2^{i-1}$ nodes (i≥1).
Property 4 : A binary tree of height k has at most $2^{k}-1$ nodes (k≥1).
Property 5 : The minimum height of a binary tree with n nodes is $\lceil \log_2(n+1) \rceil$.
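
Properties 4 and 5 are two views of the same bound, which a few lines of C can make concrete. This is a sketch, not part of the original article: the function names `max_nodes` and `min_height` are hypothetical, and the minimum height is found by searching for the smallest k with 2^k − 1 ≥ n rather than by calling a floating-point logarithm.

```c
#include <assert.h>

/* Property 4: a binary tree of height k has at most 2^k - 1 nodes. */
unsigned long max_nodes(unsigned k) {
    assert(k < 8 * sizeof(unsigned long));   /* keep the shift in range */
    return (1UL << k) - 1;
}

/* Property 5: the minimum height for n nodes is the smallest k
   with 2^k - 1 >= n, which equals ceil(log2(n + 1)). */
unsigned min_height(unsigned long n) {
    unsigned k = 0;
    while (max_nodes(k) < n) k++;
    return k;
}
```

For example, 7 nodes fit in height 3, but an eighth node forces height 4.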

3.1.3 Binary tree classification

Common kinds of binary tree: full binary tree, complete binary tree, binary sort tree, balanced binary tree, and heap.

  • Full binary tree : A binary tree of depth k with $2^{k}-1$ nodes is called a full binary tree.
  • Complete binary tree : A binary tree of depth k with n nodes is called a complete binary tree if and only if each of its nodes corresponds, position for position, to the nodes numbered 1 to n in the full binary tree of depth k (same positions, same numbers, no numbers skipped).
3.1.4 Features of full binary tree
  • The number of nodes on each layer is always the maximum number of nodes.
  • All non-leaf nodes have left and right subtrees.
  • The nodes of a full binary tree can be numbered consecutively: starting with 1 at the root, the numbering proceeds "top to bottom, left to right".
3.1.5 Properties of Complete Binary Trees
  • Property 1: In a non-empty binary tree, there are at most $2^{i-1}$ nodes on the i-th level (i≥1).

  • Property 2: A binary tree with a depth of k has at most $2^{k}-1$ nodes (k≥1).

  • Property 3: For any binary tree, if the number of leaf nodes is $n_0$ and the number of nodes with degree 2 is $n_2$, then $n_0 = n_2 + 1$.

  • Property 4: The number of nodes in the tree is equal to the sum of the degrees of all nodes plus 1.

  • Property 5: The first n nodes, numbered from 1 to n, of a full binary tree of depth k form a complete binary tree of depth k, where $2^{k-1} \le n \le 2^{k}-1$.

  • Property 6: The depth of a complete binary tree with n nodes is $\lfloor \log_2 n \rfloor + 1$, or equivalently $\lceil \log_2(n+1) \rceil$.

  • Property 7: If the depth of a complete binary tree is k, all leaf nodes appear in the kth or k-1th layer.

  • Property 8: For any node, if the maximum level of the descendants in its right subtree is l, then the maximum level of the descendants in its left subtree is l or l+1.

  • Property 9: If the nodes of a complete binary tree with n nodes (depth $\lfloor \log_2 n \rfloor + 1$) are numbered level by level (from level 1 to level $\lfloor \log_2 n \rfloor + 1$), left to right within each level, then for the node numbered i ($1 \le i \le n$):
    ① If i=1, node i is the root of the binary tree and has no parent node; otherwise (i>1), its parent is node $\lfloor i/2 \rfloor$.
    ② If 2i>n, node i is a leaf node and has no left child; otherwise, its left child is node 2i.
    ③ If 2i+1>n, node i has no right child; otherwise, its right child is node 2i+1.

  • Property 10: For a complete binary tree, the number of nodes with degree 1 is 1 or 0.
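
The index rules of Property 9 translate directly into integer arithmetic. The helpers below are a sketch with hypothetical names (`cbt_parent`, `cbt_left`, `cbt_right`, `cbt_is_leaf`); they assume the 1-based numbering used in the text and return 0 when the requested node does not exist.

```c
#include <assert.h>

/* Nodes of a complete binary tree are numbered 1..n, level by level.
   Each helper returns 0 when the requested node does not exist. */
int cbt_parent(int i, int n) {
    assert(1 <= i && i <= n);
    return i == 1 ? 0 : i / 2;            /* integer division = floor(i/2) */
}

int cbt_left(int i, int n) {
    assert(1 <= i && i <= n);
    return 2 * i > n ? 0 : 2 * i;
}

int cbt_right(int i, int n) {
    assert(1 <= i && i <= n);
    return 2 * i + 1 > n ? 0 : 2 * i + 1;
}

/* A node with no left child cannot have a right child either,
   so "no left child" is enough to identify a leaf. */
int cbt_is_leaf(int i, int n) {
    return cbt_left(i, n) == 0;
}
```

With n = 6, node 3 has a left child (node 6) but no right child, which is the single degree-1 node allowed by Property 10.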

3.2 Binary tree storage

3.2.1 Sequential structure storage

Sequential storage places each node in an array at the subscript given by its number in the level-by-level numbering; if the tree is not a complete binary tree, it is padded out to one by filling the missing positions with an empty marker.
In the worst case, a single-branch tree of depth k with only k nodes needs a one-dimensional array of length $2^{k}-1$; this happens when every node has only a right child. For this reason, sequential storage is generally used only for complete binary trees and is not suitable for general binary trees.

#define MAX_SIZE 100
typedef int sqbitree[MAX_SIZE];
3.2.2 Chain structure storage

Because the binary tree has left and right subtrees, each node in the chain structure of the binary tree has 3 fields: a data field, and two pointer fields pointing to the left and right child nodes respectively.
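
As a minimal sketch of the chain structure, the block below uses the same `BTNode` layout that the traversal code in section 3.3 uses. The helpers `new_node`, `count_degree`, and `free_tree` are hypothetical names added for illustration; `count_degree` also gives a quick way to check the relation n0 = n2 + 1 on a concrete tree.

```c
#include <assert.h>
#include <stdlib.h>

typedef int ElementType;
typedef struct BTNode {
    ElementType data;                  /* data field */
    struct BTNode *Lchild, *Rchild;    /* pointers to the left and right child */
} BTNode;

/* Allocate a node with no children; returns NULL if malloc fails. */
BTNode *new_node(ElementType data) {
    BTNode *p = malloc(sizeof *p);
    if (p != NULL) {
        p->data = data;
        p->Lchild = p->Rchild = NULL;
    }
    return p;
}

/* Count the nodes whose degree (number of non-empty children) equals d. */
int count_degree(const BTNode *t, int d) {
    if (t == NULL) return 0;
    int degree = (t->Lchild != NULL) + (t->Rchild != NULL);
    return (degree == d) + count_degree(t->Lchild, d) + count_degree(t->Rchild, d);
}

/* Release every node of the tree (children first, then the node itself). */
void free_tree(BTNode *t) {
    if (t == NULL) return;
    free_tree(t->Lchild);
    free_tree(t->Rchild);
    free(t);
}
```

A tree is built by linking nodes by hand, for example `root->Lchild = new_node(2);`, and released with `free_tree(root)` when it is no longer needed.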

3.3 Binary tree traversal

Binary tree traversal is divided into: pre-order traversal, in-order traversal, post-order traversal and level traversal.
Pre-order, in-order, and post-order traversal use a stack (the call stack when written recursively), while level-order traversal uses a queue. In pre-order, in-order, and post-order traversal, the leaf nodes are always visited in the same relative order.

3.3.1 Basic concepts

Visiting a node means processing it in some way, such as printing its information or modifying its value. Traversing a binary tree means visiting every node in the tree exactly once according to a given rule.

If L, D, and R represent traversing the left subtree, traversing the root node, and traversing the right subtree, respectively, there will be the following six traversal schemes: LDR, LRD, DLR, DRL, RLD, and RDL. If it is agreed in advance that the first left and then the right, there are only the following three situations: DLR - pre-order (root) traversal, LDR - in-order (root) traversal, LRD - post-order (root) traversal.

3.3.2 Preorder traversal

Preorder traversal is to start from the root node of the binary tree, output the node data when the node is reached for the first time, and visit in the direction of first going to the left and then going to the right. If the binary tree is empty, the traversal ends. Otherwise, there are the following situations:
a. Visit the root node.
b. Traverse the left subtree in preorder (call this algorithm recursively).
c. Traverse the right subtree in preorder (call this algorithm recursively).

If recursion is used for traversal, the system call stack does the bookkeeping. The code is as follows:

// Binary tree definition:
#include<stdio.h>
#include<stdbool.h>
typedef int ElementType;
typedef struct BTNode{
    ElementType data;
    struct BTNode * Lchild, *Rchild;
}BTNode;

void PreorderTraverse(BTNode * T){
    // Pre-order traversal, recursive version.
    // An empty tree ends the traversal; otherwise visit the root first.
    if(T == NULL) return;
    printf("%d\n", T->data);
    PreorderTraverse(T->Lchild);
    PreorderTraverse(T->Rchild);
}

If you use the user stack, you can use the following code:

// Types and functions needed by the stack-based version
#include<stdio.h>
#include<stdbool.h>
#define MAX_SIZE 100
typedef int ElementType;
typedef struct BTNode{
    ElementType data;
    struct BTNode * Lchild, *Rchild;
}BTNode;
typedef struct Stack{
    // Stack definition: an array of binary tree node pointers
    BTNode * data[MAX_SIZE];
    // Index of the next free slot; 0 means the stack is empty
    int top;
}SqStack;

void Init_stack(SqStack * S){
    S->top = 0;
}

int isEmpty(SqStack * S){
    if(S->top == 0) return 1;
    else return 0;
}

bool push(SqStack * S, BTNode * node){
    // Push fails when the stack is full
    if (S->top == MAX_SIZE) return false;
    S->data[S->top] = node;
    S->top++;
    return true;
}

BTNode * pop(SqStack * S){
    // Pop the top element; returns NULL when the stack is empty
    if (S->top == 0) return NULL;
    S->top--;
    return S->data[S->top];
}

void PreorderTraverse_user(BTNode * root){
    // Pre-order traversal using a user-managed stack.
    // The stack is a local object; it temporarily holds visited nodes
    // whose right subtrees are still to be traversed.
    SqStack stack;
    Init_stack(&stack);
    // Working pointer, starting at the root
    BTNode * node = root;
    // Stop once the current node is NULL and the stack is empty.
    while (node != NULL || !isEmpty(&stack)){
        while(node != NULL){
            printf("%d\n", node->data);
            // Save the node, then keep going left
            push(&stack, node);
            node = node->Lchild;
        }
        if (!isEmpty(&stack)){
            // The left subtree is exhausted: take the saved node back
            // and continue with its right subtree
            node = pop(&stack);
            node = node->Rchild;
        }
    }
}


3.3.3 Inorder traversal

In-order traversal starts from the root node of the binary tree, outputs node data when the node is reached for the second time, and visits in the direction of first going to the left and then going to the right. The recursive definition of the algorithm is: if the binary tree is empty, the traversal ends; otherwise, the following situations occur:
a. Inorder traversal of the left subtree (recursively call this algorithm).
b. Visit the root node.
c. Inorder traverse the right subtree (recursively call this algorithm).
The implementation code for calling the recursive method is as follows:

void InorderTraverse(BTNode * T){
    // In-order traversal, recursive version
    if (T == NULL) return;
    InorderTraverse(T->Lchild);
    printf("%d\n", T->data);
    InorderTraverse(T->Rchild);
}

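Using an explicit user stack, the implementation is as follows:

void midorderTraversal(BTNode * root){
    // In-order traversal using a user-managed stack (a local object)
    SqStack stack;
    Init_stack(&stack);
    BTNode * node = root;
    while(node != NULL || !isEmpty(&stack)){
        while(node != NULL){
            // Save the node and keep going left without printing
            push(&stack, node);
            node = node->Lchild;
        }
        if (!isEmpty(&stack)){
            // Second arrival at the node: print it, then go right
            node = pop(&stack);
            printf("%d\n", node->data);
            node = node->Rchild;
        }
    }
}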

3.3.4 Postorder traversal

Post-order traversal starts from the root node of the binary tree, outputs node data when the node is reached for the third time, and visits in the direction of first going left and then going right. The recursive definition of the algorithm is: if the binary tree is empty, the traversal ends; otherwise, the following situations occur:
a. Post-order traversal of the left subtree (recursively call this algorithm).
b. Post-order traverse the right subtree (recursively call this algorithm).
c. Visit the root node.
If the system stack is used, the recursive method is as follows:

void postTraverse(BTNode * T){
    // Post-order traversal, recursive version
    if (T == NULL) return;
    postTraverse(T->Lchild);
    postTraverse(T->Rchild);
    printf("%d\n", T->data);
}

If the user stack is used, the custom method is as follows:

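void postTraversal(BTNode * root){
    // Post-order traversal using a user-managed stack (a local object)
    SqStack stack;
    Init_stack(&stack);
    BTNode * node = root;
    // Most recently printed node; NULL until something has been printed
    BTNode * last = NULL;
    while(node != NULL || !isEmpty(&stack)){
        while (node != NULL){
            push(&stack, node);
            node = node->Lchild;
        }
        // Look at the top of the stack without popping it
        node = stack.data[stack.top - 1];
        // If its right subtree is empty or has already been visited,
        // print the node and pop it
        if (node->Rchild == NULL || node->Rchild == last){
            printf("%d\n", node->data);
            pop(&stack);
            last = node;
            node = NULL;
        }else{
            // Otherwise traverse the right subtree first
            node = node->Rchild;
        }
    }
}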

3.3.5 Hierarchy traversal

Level-order traversal of a binary tree starts from the root node and visits the nodes level by level, "from top to bottom, from left to right". To keep this order a queue is needed, which is empty when initialized. Let T be a pointer to the root node. The non-recursive method is: if the binary tree is empty, return; otherwise let p=T, put p into the queue, and repeat the following steps until the queue is empty:
a. Dequeue the front element of the queue into p.
b. Visit the node pointed to by p.
c. Put the left and right child nodes of the node pointed to by p into the queue, in that order, if they exist.
The implementation code is as follows:

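void leverTraverse(BTNode * T){
    // Array-based queue of node pointers. It is not circular, so it can
    // hold at most MAX_SIZE nodes in total; nodes beyond that are skipped.
    BTNode * Queue[MAX_SIZE];
    BTNode * p = T;
    int front = 0, rear = 0;
    // If the tree is not empty, enqueue the root and start processing
    if(p != NULL){
        Queue[rear++] = p;
        while(front < rear){
            // While the queue is not empty, dequeue a node and visit it
            p = Queue[front++];
            printf("%d\n", p->data);
            // Enqueue the left child, then the right child, if present
            if (p->Lchild != NULL && rear < MAX_SIZE)
                Queue[rear++] = p->Lchild;
            if (p->Rchild != NULL && rear < MAX_SIZE)
                Queue[rear++] = p->Rchild;
        }
    }
}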

3.3.6 Generation of binary tree

The following three combinations can uniquely determine a binary tree:

  • Preorder traversal + inorder traversal
  • post-order traversal + in-order traversal
  • Hierarchical traversal + inorder traversal
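
To illustrate the first combination, here is a sketch that rebuilds a tree from its pre-order and in-order sequences. It rests on assumptions not stated in the article: all node values are distinct, the two sequences really describe the same tree, and the function name `build_pre_in` is hypothetical. The idea is that the first pre-order element is the root, and its position in the in-order sequence splits the remaining elements into left and right subtrees.

```c
#include <assert.h>
#include <stdlib.h>

typedef int ElementType;
typedef struct BTNode {
    ElementType data;
    struct BTNode *Lchild, *Rchild;
} BTNode;

/* Rebuild the tree whose pre-order sequence is pre[0..n-1] and whose
   in-order sequence is in[0..n-1]. Assumes distinct values and matching
   sequences. Returns NULL for an empty range, if the root value cannot
   be found in the in-order range, or if allocation fails. */
BTNode *build_pre_in(const ElementType *pre, const ElementType *in, int n) {
    if (n <= 0) return NULL;
    int k = 0;                           /* position of the root in in[] */
    while (k < n && in[k] != pre[0]) k++;
    if (k == n) return NULL;             /* root value missing from in[] */
    BTNode *root = malloc(sizeof *root);
    if (root == NULL) return NULL;
    root->data = pre[0];
    /* in[0..k-1] is the left subtree, in[k+1..n-1] the right subtree;
       the pre-order sequence lists them in the same order after the root. */
    root->Lchild = build_pre_in(pre + 1, in, k);
    root->Rchild = build_pre_in(pre + 1 + k, in + k + 1, n - k - 1);
    return root;
}
```

For pre-order 1 2 4 5 3 and in-order 4 2 5 1 3, the result has root 1, left child 2 (with children 4 and 5), and right child 3. The post-order plus in-order case works the same way, except that the root is the last post-order element.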

3.4 Thread Binary Tree

A binary tree whose nodes carry threads (pointers to a node's predecessor or successor in some traversal order) is called a threaded binary tree. The process of traversing a binary tree in a given order (pre-order, in-order, post-order, or level-order) and turning it into a threaded binary tree is called threading the binary tree.

3.4.1 Marking method of threaded binary tree
  • If the node has a left child, Lchild points to its left child: otherwise, it points to its immediate predecessor.
  • If the node has a right child, Rchild points to its right child; otherwise, it points to its direct successor.
The node structure of a threaded binary tree is as follows:

| Lchild | Ltag | data | Rtag | Rchild |
| --- | --- | --- | --- | --- |
| left child or immediate predecessor | whether there is no left child | data | whether there is no right child | right child or immediate successor |

Taking Ltag as an example, the marking method is as follows:

  • Ltag is marked as 0, and Lchild points to the left child of the node
  • Ltag is marked as 1, and Lchild points to the immediate predecessor of the node
3.4.2 Construction of threaded binary tree

First write out the traversal sequence. Then, for each node, keep the pointer to its left or right subtree if that subtree exists; otherwise add a thread to its immediate predecessor or immediate successor in the sequence.
The following is an example of building a threaded binary tree:
[Figure: threaded binary tree]
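
The construction can also be done in code during a single in-order traversal. The block below is a sketch for in-order threading only, under assumptions: the names `ThreadNode`, `in_thread`, `create_in_thread`, and `next_in` are hypothetical, the right-hand tag is called `Rtag` by analogy with `Ltag`, and both tags must be 0 before threading starts.

```c
#include <assert.h>
#include <stddef.h>

typedef int ElementType;
typedef struct ThreadNode {
    ElementType data;
    struct ThreadNode *Lchild, *Rchild;
    int Ltag, Rtag;      /* 0: pointer is a child, 1: pointer is a thread */
} ThreadNode;

/* In-order threading. *pre is the node visited just before p. */
static void in_thread(ThreadNode *p, ThreadNode **pre) {
    if (p == NULL) return;
    in_thread(p->Lchild, pre);
    if (p->Lchild == NULL) {               /* no left child: thread back */
        p->Ltag = 1;
        p->Lchild = *pre;
    }
    if (*pre != NULL && (*pre)->Rchild == NULL) {
        (*pre)->Rtag = 1;                  /* predecessor threads forward to p */
        (*pre)->Rchild = p;
    }
    *pre = p;
    in_thread(p->Rchild, pre);
}

/* Thread a whole tree; the last node in in-order gets a NULL thread. */
void create_in_thread(ThreadNode *root) {
    ThreadNode *pre = NULL;
    in_thread(root, &pre);
    if (pre != NULL && pre->Rchild == NULL) pre->Rtag = 1;
}

/* In-order successor of p in a threaded tree, or NULL for the last node. */
ThreadNode *next_in(ThreadNode *p) {
    if (p->Rtag == 1) return p->Rchild;    /* follow the thread */
    p = p->Rchild;                         /* else: leftmost node on the right */
    while (p->Ltag == 0) p = p->Lchild;
    return p;
}
```

Once the tree is threaded, repeatedly calling `next_in` walks the in-order sequence without recursion or a stack, which is the main payoff of storing the threads.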

4. Trees, forests

4.1 Conversion of tree and binary tree

4.1.1 Convert the tree to a binary tree

Any general tree can be converted into a unique corresponding binary tree. The steps are as follows (this is the child-sibling notation):

  • ① Add dotted lines. At each level of the tree, connect adjacent sibling nodes with dotted lines, from left to right.
  • ② Remove links. For each parent node, keep only the link to its leftmost child and remove the links to all its other children.
  • ③ Rotate. Rotate the tree 45 degrees clockwise, so that the original solid lines slant to the left.
  • ④ Tidy up. Change all dotted lines in the rotated tree to solid lines, slanting to the right.

Here is an example:
[Figure: tree to binary tree conversion]

4.1.2 Binary tree to tree

To convert a binary tree into a tree, the steps are as follows:

  • ① Add dotted lines. If node i is the left child of its parent, connect the parent of i by dotted lines to the right child of i and to every node reached by following the right-child chain from it.
  • ② Remove links. Remove every link between a node and its right child in the binary tree.
  • ③ Tidy up. Rotate 45 degrees counterclockwise, arrange the nodes by level, and change all dotted lines into solid lines.

Here is an example:
[Figure: binary tree to tree]

4.2 Tree storage structure

4.3 Forest and Binary Tree Conversion

4.3.1 Forest to Binary Tree

The conversion steps of forest to binary tree are as follows:

  • ① Convert each tree in F = {T1, T2, …, Tn} into a binary tree.
  • ② Following the order of the trees in the forest and starting from the last binary tree, make each binary tree the right subtree of the root node of the binary tree before it. The root node of the first tree then becomes the root node of the resulting binary tree.

Here is an example of a forest converted into a binary tree:
[Figure: forest to binary tree]

4.3.2 Binary tree to forest

The steps to convert a binary tree into a forest are as follows:
① Remove links. Remove the link between the root node of the binary tree and its right child, and likewise every link along the right-child chain, to obtain several isolated binary trees. These correspond, in order, to the binary trees of the trees in the original forest.
② Restore the trees. Restore each isolated binary tree to a general tree by the binary-tree-to-tree method of section 4.1.2.
An example of converting a binary tree to a forest is as follows:
[Figure: binary tree to forest]

4.4 Tree and forest traversals

There are two types of tree traversal: preorder traversal and postorder traversal.

  • Preorder traversal: visit the root node first, and then traverse each subtree in order. The preorder traversal of the tree is the same as the preorder traversal of the binary tree after converting the tree to a binary tree.
  • Post-order traversal: first traverse each subtree in post-order, and then visit the root node. The post-order traversal of the tree is the same as the in-order traversal of the binary tree after converting the tree to a binary tree.
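
The correspondence with binary tree traversal can be checked on a tree stored in child-sibling notation. The block below is a sketch with hypothetical names (`CSNode`, `tree_preorder`, `tree_postorder`, `binary_inorder`); each function appends the visited values to `out[]` and advances `*len`, and the caller is assumed to supply a large enough array.

```c
#include <assert.h>
#include <stddef.h>

typedef struct CSNode {
    int data;
    struct CSNode *firstchild, *nextsibling;
} CSNode;

/* Tree pre-order: the root first, then each subtree from left to right. */
void tree_preorder(const CSNode *t, int *out, int *len) {
    if (t == NULL) return;
    out[(*len)++] = t->data;
    for (const CSNode *c = t->firstchild; c != NULL; c = c->nextsibling)
        tree_preorder(c, out, len);
}

/* Tree post-order: each subtree from left to right, then the root. */
void tree_postorder(const CSNode *t, int *out, int *len) {
    if (t == NULL) return;
    for (const CSNode *c = t->firstchild; c != NULL; c = c->nextsibling)
        tree_postorder(c, out, len);
    out[(*len)++] = t->data;
}

/* In-order traversal of the same nodes read as a binary tree
   (firstchild as the left child, nextsibling as the right child). */
void binary_inorder(const CSNode *t, int *out, int *len) {
    if (t == NULL) return;
    binary_inorder(t->firstchild, out, len);
    out[(*len)++] = t->data;
    binary_inorder(t->nextsibling, out, len);
}
```

For a root 1 with children 2, 3, 4, where node 2 has children 5 and 6, `tree_postorder` and `binary_inorder` both produce 5 6 2 3 4 1, while `tree_preorder` produces 1 2 5 6 3 4.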

5. The application of tree and binary tree

5.1 Binary Sorting Tree

5.2 Balanced Binary Tree

5.3 Huffman tree and Huffman coding

5.3.1 Related concepts
  • Path : The branches leading from one node to another node in the tree form the path between those two nodes.
  • Path length : The number of branches on a path is called the path length.
  • Path length of the tree : The sum of the path lengths from the root of the tree to each node.
  • Weighted path length of a node : The product of the path length from the node to the root node of the tree and the weight of the node. Here, weight is an abstract name for various expenses, costs, frequencies, etc.
  • Weighted path length of the tree : The sum of the weighted path lengths of all leaf nodes in the tree (WPL), written as
    $WPL = \omega_1\iota_1 + \omega_2\iota_2 + \dots + \omega_n\iota_n = \sum_{i=1}^{n}\omega_i\iota_i$
    where $\omega_i$ is the weight of the i-th leaf and $\iota_i$ is its path length from the root.
  • Huffman tree : More than one binary tree can be built with n leaf nodes whose weights are $\omega_1, \omega_2, \dots, \omega_n$. Among all these binary trees there must be one with the smallest WPL value, which is called the Huffman tree (or the optimal binary tree).

The following is an example of calculating WPL:
[Figure: WPL calculation]

5.3.2 Construction of Huffman tree

Construction principle: nodes with larger weights should have shorter paths to the root, and nodes with smaller weights may have longer paths.
Given n nodes with weights $w_1, w_2, \dots, w_n$, the algorithm for constructing a Huffman tree is as follows:

  1. The n nodes are respectively regarded as n binary trees containing only one node to form a forest F.
  2. Construct a new node, select from F the two trees whose root nodes have the smallest weights as the left and right subtrees of the new node, and set the weight of the new node to the sum of the weights of those two root nodes.
  3. Delete the two trees just selected from F, and add the newly obtained tree to F at the same time.
  4. Repeat steps 2 and 3 until only one tree remains in F.
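
The four steps can be sketched with an array of 2n − 1 nodes, where the first n entries are the leaves and each merge fills the next free entry. This is an illustrative sketch, not the article's own code: `HTNode`, `huffman_build`, and `huffman_wpl` are hypothetical names, the two smallest roots are found by a plain linear scan, and the caller is assumed to provide an array with room for 2n − 1 nodes.

```c
#include <assert.h>

typedef struct {
    int weight;
    int parent, lchild, rchild;    /* array indices, -1 when absent */
} HTNode;

/* Build a Huffman tree for w[0..n-1] in ht[0..2n-2]. The leaves occupy
   ht[0..n-1]. Returns the index of the root, or -1 if n < 1. */
int huffman_build(const int *w, int n, HTNode *ht) {
    if (n < 1) return -1;
    for (int i = 0; i < n; i++) {           /* step 1: n single-node trees */
        ht[i].weight = w[i];
        ht[i].parent = ht[i].lchild = ht[i].rchild = -1;
    }
    for (int k = n; k < 2 * n - 1; k++) {   /* step 4: repeat n - 1 times */
        int s1 = -1, s2 = -1;               /* smallest and second-smallest roots */
        for (int i = 0; i < k; i++) {
            if (ht[i].parent != -1) continue;   /* already merged: not a root */
            if (s1 == -1 || ht[i].weight < ht[s1].weight) { s2 = s1; s1 = i; }
            else if (s2 == -1 || ht[i].weight < ht[s2].weight) { s2 = i; }
        }
        /* step 2: the new node takes the two smallest roots as children */
        ht[k].weight = ht[s1].weight + ht[s2].weight;
        ht[k].lchild = s1;
        ht[k].rchild = s2;
        ht[k].parent = -1;
        /* step 3: the two trees leave the forest, the new tree joins it */
        ht[s1].parent = ht[s2].parent = k;
    }
    return 2 * n - 2;
}

/* Weighted path length: sum over the leaves of weight times depth. */
int huffman_wpl(const HTNode *ht, int n) {
    int wpl = 0;
    for (int i = 0; i < n; i++) {
        int len = 0;
        for (int p = ht[i].parent; p != -1; p = ht[p].parent) len++;
        wpl += ht[i].weight * len;
    }
    return wpl;
}
```

A priority queue would find the two smallest roots faster than the linear scan, but the scan keeps the correspondence with the four steps easy to see.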

Example:
Building a Huffman tree from w={8,3,4,6,5,5} proceeds as follows:
[Figure: Huffman tree construction, part 1]

[Figure: Huffman tree construction, part 2]
Its WPL value is as follows:

$WPL = 6 \times 2 + 3 \times 3 + 4 \times 3 + 8 \times 2 + 5 \times 3 + 5 \times 3 = 79$

5.3.3 Huffman coding

If no code in a set of codes is a prefix of any other code in the set, the set is called a prefix code.
It is a natural process to obtain Huffman codes from Huffman trees. First, each character that appears is regarded as an independent node, and its weight is the frequency (or times) of its appearance, and the corresponding Huffman tree is constructed. Obviously, all character nodes appear in leaf nodes. We can interpret a character's encoding as a sequence of edge labels on the path from the root to that character, where an edge label is 0 for "turn to the left child" and 1 for "turn to the right child".

How can you judge whether a given set of codes is a Huffman code?

  • Method 1: Check whether it satisfies the definition of a prefix code.
  • Method 2: Draw the Huffman tree and check whether the codes match it.
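
Method 1 is easy to mechanize. The function below is a sketch with a hypothetical name, `is_prefix_code`, and it assumes each code is a C string of '0' and '1' characters. Note that passing this check is necessary but not sufficient: a prefix code is a Huffman code only if it also matches a tree built from the given weights (Method 2).

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if no code in codes[0..n-1] is a prefix of (or equal to)
   another code in the set, and 0 otherwise. */
int is_prefix_code(const char *codes[], int n) {
    for (int i = 0; i < n; i++) {
        size_t len = strlen(codes[i]);
        for (int j = 0; j < n; j++) {
            /* codes[i] is a prefix of codes[j] when the first len
               characters of codes[j] equal codes[i] */
            if (i != j && strncmp(codes[i], codes[j], len) == 0)
                return 0;
        }
    }
    return 1;
}
```

For example, {00, 01, 10, 110, 111} passes, while {0, 01, 11} fails because 0 is a prefix of 01.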
5.3.4 Characteristics of Huffman tree
  • A Huffman tree has only degree 0 nodes and degree 2 nodes.
  • The Huffman tree has the smallest WPL value.
  • The Huffman tree is not unique, because the left and right subtrees of any node can be exchanged, but the WPL value is unique.
  • Huffman trees are not inherently binary, but we usually take a Huffman tree to be a binary tree.
  • The weight of the upper node of the Huffman tree is not less than the weight of the lower node.
  • Huffman coding only discusses the coding of leaves.

5.4 Red-black tree

5.5 Union search and its application

Union-find (the disjoint-set structure) is an elegant and practical data structure, mainly used for merge and query problems on disjoint sets. Common uses include finding connected subgraphs, Kruskal's algorithm for the minimum spanning tree, and finding the nearest common ancestor.
A union-find structure stores a collection of disjoint dynamic sets S={S1,S2,...,Sn}, and an integer is generally used to represent each element.

Union-find is a simple set representation that supports the following three operations:

  1. Initial(S): Initialize each element of the set S as a subset containing only that element.
  2. Union(S, Root1, Root2): Merge the subset Root2 of the set S into the subset Root1. Root1 and Root2 must be disjoint; otherwise no merge is performed.
  3. Find(S,x): Find the subset of S that contains the element x and return the root node of that subset.

Usually, the parent notation of a tree (forest) is used as the storage structure: each subset is a tree, and all the trees together form a forest representing the whole set, stored in one parent array. The subscript of an array element names the element, the subscript of a root node names its subset, and the parent entry of a root node is a negative number.
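
The three operations can be sketched on a plain parent array as follows. The function names follow the operations listed above; the constant `SIZE` and the choice of -1 as the root marker are assumptions made for illustration (the text only requires a negative number), and no path compression or union by size is applied.

```c
#include <assert.h>

#define SIZE 100

/* Initial: every element is the root of its own one-element subset. */
void Initial(int S[]) {
    for (int i = 0; i < SIZE; i++) S[i] = -1;
}

/* Find: follow parent links from x until a root (negative entry) is reached. */
int Find(int S[], int x) {
    assert(0 <= x && x < SIZE);
    while (S[x] >= 0) x = S[x];
    return x;
}

/* Union: hang the tree rooted at Root2 under Root1.
   Both arguments must be roots; equal roots mean nothing to merge. */
void Union(int S[], int Root1, int Root2) {
    if (Root1 == Root2) return;
    S[Root2] = Root1;
}
```

Two elements are in the same subset exactly when `Find` returns the same root for both, so a typical call is `Union(S, Find(S, a), Find(S, b))`.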

Origin blog.csdn.net/qq_41780234/article/details/127295174