Learning Data Structures, Chapter 4: Trees and Binary Trees (Binary Sort Tree, Balanced Binary Tree, Huffman Tree)

Chapter 4: Trees and Binary Trees (Applications of Trees and Binary Trees: Binary Sort Tree, Balanced Binary Tree, Huffman Tree)

1. Binary Sort Tree

Binary sort tree: abbreviated BST, also known as a binary search tree.

A binary sort tree is either an empty tree or a non-empty tree with the following properties:

  • If the left subtree is not empty, the keys of all nodes in the left subtree are less than the key of the root node.
  • If the right subtree is not empty, the keys of all nodes in the right subtree are greater than the key of the root node.
  • The left and right subtrees are themselves binary sort trees.

Note that the comparisons are strictly less than and strictly greater than, never equal, which means that a binary sort tree contains no two nodes with the same key.

[Figure: an example binary sort tree]

In-order traversal of this binary sort tree: 1 2 3 4 5 6 8 10 16

Notice that the in-order traversal visits the keys in increasing order. This holds for every binary sort tree:

The in-order traversal sequence of a binary sort tree is an increasing ordered sequence.

1.1 Searching a Binary Sort Tree

  • When the binary sort tree is not empty, compare the key with the root node first; if they are equal, the search succeeds.
  • If they are not equal, search the left subtree when the key is less than the value of the root node, and search the right subtree when it is greater.
  • If an empty subtree is reached without finding the key, the search fails.

[Figure: binary sort tree used in the search exercise]

Exercise:

Search for 5: 5 < 8, so go to the left subtree; 5 > 4, so go to the right subtree; 5 = 5, so the search succeeds.
Search for 6: 6 < 8, so go to the left subtree; 6 > 4, so go to the right subtree; 6 > 5, so go to the right subtree; 6 < 7, so go to the left subtree, which is empty, so the search fails.

From this process it is easy to see that the whole search can be written recursively; you can try that yourself. Here we use a non-recursive search:

Parameter 1 is the binary tree, parameter 2 is the key to search for, and parameter 3 receives the parent of the node that is found. Parameter 3 is a reference to a pointer: when the pointer is modified inside the function body, the pointer variable passed in by the caller is modified as well, not just the formal parameter.

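BSTNode *BST_Search(BiTree T,KeyType key,BSTNode *&p){
    //uses the same KeyType and key field as BST_Insert in 1.2
    p = NULL; //the parent starts as NULL (the root node has no parent)
    while(T != NULL && key != T->key){
       //the tree is not empty and the key does not match
       p = T; //p points to the current node
       if(key < T->key){
           T = T->lchild; //continue the search in the left subtree
       }else{
           T = T->rchild; //continue the search in the right subtree
       }
    }
    return T; //the matching node, or NULL if the search failed
}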

Time complexity : O(h) (h is the height of the binary sort tree)
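The recursive version mentioned earlier could look roughly like the sketch below. The article never shows its type definitions, so the node layout here (a key field plus lchild and rchild pointers) is an assumption, and the name BST_Search_Rec is made up for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumed node layout; the article does not show its own definitions.
typedef int KeyType;
typedef struct BSTNode {
    KeyType key;
    struct BSTNode *lchild, *rchild;
} BSTNode, *BiTree;

// Recursive search (hypothetical helper): returns the node holding key,
// or NULL when the search reaches an empty subtree.
BSTNode *BST_Search_Rec(BiTree T, KeyType key) {
    if (T == NULL || key == T->key) {
        return T;                               // failure (NULL) or match
    }
    if (key < T->key) {
        return BST_Search_Rec(T->lchild, key);  // continue in the left subtree
    }
    return BST_Search_Rec(T->rchild, key);      // continue in the right subtree
}
```

Each call descends one level, so this version also takes O(h) time; unlike the loop, it uses O(h) stack space and does not report the parent node.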

1.2 Insertion into a Binary Sort Tree

  • If the binary sort tree is empty, insert the node directly as the root.
  • If the binary sort tree is not empty: when the value is less than the root node, insert it into the left subtree; when the value is greater than the root node, insert it into the right subtree; when the value is equal to the root node, do not insert it.

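[Figure: binary sort tree before the insertion]

Exercise:

Insert 6: 6 < 8, so go into the left subtree; 6 > 4, so go into the right subtree; 6 > 5, so go into the right subtree of 5, and continue in the same way until an empty position is reached, where 6 is placed.

[Figure: binary sort tree after inserting 6]

Code: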

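#include <stdlib.h> //for malloc

//Parameter 1: the binary tree (note that it is a reference); parameter 2: the value to insert
//Returns 1 if the value was inserted, 0 if it was already present
int BST_Insert(BiTree &T,KeyType k){
	if(T==NULL){
		//the tree is empty: create the node here
		T = (BiTree)malloc(sizeof(BSTNode));
		T->key = k;
		T->lchild = T->rchild = NULL;
		return 1;
	}else if(k == T->key){
		//equal keys are not inserted
		return 0;
	}else if(k < T->key){
		//smaller: insert into the left subtree with a recursive call
		return BST_Insert(T->lchild,k);
	}else{
		//larger: insert into the right subtree with a recursive call
		return BST_Insert(T->rchild,k);
	}
}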

1.3 Constructing a Binary Sort Tree

Read in the elements one at a time and create a node for each. If the binary sort tree is empty, the node becomes the root; otherwise, insert it into the left subtree when its value is less than the root node, into the right subtree when its value is greater, and skip it when its value is equal to the root node.

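//Parameter 1: the tree to build; parameter 2: the values to insert; parameter 3: the number of values
void Create_BST(BiTree &T,KeyType str[],int n){
	T = NULL; //start from an empty tree
	int i=0;
	while(i<n){
		BST_Insert(T,str[i]); //insert the values one by one, in array order
		i++;
	}
}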

The shape of the binary sort tree depends on the insertion order: even if two arrays contain exactly the same values, inserting them in a different order can produce a different binary sort tree.
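To see the effect of insertion order, here is a small self-contained sketch. It repeats the insertion and construction routines from this chapter, assumes a node layout the article does not show, and adds a hypothetical Height helper so that the two resulting shapes can be compared.

```cpp
#include <cassert>
#include <cstdlib>

// Assumed node layout; the article does not show its own definitions.
typedef int KeyType;
typedef struct BSTNode {
    KeyType key;
    struct BSTNode *lchild, *rchild;
} BSTNode, *BiTree;

int BST_Insert(BiTree &T, KeyType k) {
    if (T == NULL) {
        T = (BiTree)malloc(sizeof(BSTNode));
        T->key = k;
        T->lchild = T->rchild = NULL;
        return 1;
    }
    if (k == T->key) return 0;                       // duplicates are skipped
    if (k < T->key) return BST_Insert(T->lchild, k);
    return BST_Insert(T->rchild, k);
}

void Create_BST(BiTree &T, KeyType str[], int n) {
    T = NULL;
    for (int i = 0; i < n; i++) BST_Insert(T, str[i]);
}

// Hypothetical helper: height of a tree, counting an empty tree as 0.
int Height(BiTree T) {
    if (T == NULL) return 0;
    int hl = Height(T->lchild), hr = Height(T->rchild);
    return (hl > hr ? hl : hr) + 1;
}
```

Building from {2, 1, 3} puts 2 at the root with one child on each side (height 2), while building from {1, 2, 3} produces a chain that leans to the right (height 3), even though both arrays hold the same values.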

1.4 Deletion from a Binary Sort Tree

Suppose we delete node 4. How should the left and right subtrees left behind by node 4 be reattached to form the left subtree of node 8?

Because the binary sort tree property must be preserved, deletion is more complicated than insertion. There are three cases:

  • 1. If the deleted node z is a leaf node, it can be removed directly; the binary sort tree property is not affected.
  • 2. If the deleted node z has only one subtree, that subtree becomes a subtree of z's parent, taking the place of z.

For example, we delete node 5.


  • 3. If the deleted node z has two subtrees, replace z with its immediate in-order successor, and then delete that successor node from its original position.

For example, suppose we delete node 4. The immediate successor of node 4 is the leftmost node of its right subtree, which here is 5. We replace node 4 with node 5 and then delete the original node 5. Why can the original node 5 be deleted directly? Because it is the leftmost node of that subtree, it has either no subtree at all (a leaf node) or only a right subtree (if it had a left subtree it would not be the leftmost node), so its deletion falls under case 1 or case 2 above.
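The three cases can be sketched in code as follows. This is one possible implementation rather than the author's: the node layout and the name BST_Delete are assumptions, and case 3 is handled by copying the successor's key into z and then deleting the successor from the right subtree.

```cpp
#include <cassert>
#include <cstdlib>

// Assumed node layout; the article does not show its own definitions.
typedef int KeyType;
typedef struct BSTNode {
    KeyType key;
    struct BSTNode *lchild, *rchild;
} BSTNode, *BiTree;

int BST_Insert(BiTree &T, KeyType k) {
    if (T == NULL) {
        T = (BiTree)malloc(sizeof(BSTNode));
        T->key = k;
        T->lchild = T->rchild = NULL;
        return 1;
    }
    if (k == T->key) return 0;
    if (k < T->key) return BST_Insert(T->lchild, k);
    return BST_Insert(T->rchild, k);
}

// Hypothetical deletion routine: returns 1 if k was deleted, 0 if not found.
int BST_Delete(BiTree &T, KeyType k) {
    if (T == NULL) return 0;
    if (k < T->key) return BST_Delete(T->lchild, k);
    if (k > T->key) return BST_Delete(T->rchild, k);
    if (T->lchild != NULL && T->rchild != NULL) {
        // Case 3: the successor is the leftmost node of the right subtree.
        BSTNode *s = T->rchild;
        while (s->lchild != NULL) s = s->lchild;
        T->key = s->key;                       // successor takes z's place
        return BST_Delete(T->rchild, s->key);  // falls into case 1 or 2
    }
    // Cases 1 and 2: link the only child (or NULL) to z's parent.
    BSTNode *old = T;
    T = (T->lchild != NULL) ? T->lchild : T->rchild;
    free(old);
    return 1;
}
```

Because T is a reference to the parent's child pointer, assigning to T relinks the parent without having to track it separately.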


Thinking:
If we delete a node from a binary sort tree and then insert it again, is the resulting tree the same as the original?

First we delete node 7:

Then we insert node 7 again:

The binary sort tree is the same as before. Node 7 is a leaf node, and deleting and re-inserting a leaf node restores the original tree. Is this always the case?

Suppose we delete node 5, which is a parent (non-leaf) node:

Then we insert node 5 again:

This time the binary sort tree after deletion and insertion is different from the original.

Therefore: after deleting a node from a binary sort tree and inserting it again, the resulting tree may or may not be the same as the original, depending on the kind of node that was deleted.

1.5 Search Efficiency

Search length: the number of nodes visited (compared) when searching for a given node.

Average search length (ASL): the average of the search lengths of all nodes. It depends on the height of the tree.

For example:

Search efficiency when the tree is balanced (its height is about log2n): O(log2n)

Search efficiency when the tree degenerates into a single chain (its height is n): O(n)
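The average search length of successful searches can be computed directly from the definition, as in the sketch below. The node layout and the helper names SumSearchLength, CountNodes and ASL are assumptions made for this illustration; a node at depth d (with the root at depth 1) has search length d.

```cpp
#include <cassert>
#include <cstdlib>

// Assumed node layout; the article does not show its own definitions.
typedef int KeyType;
typedef struct BSTNode {
    KeyType key;
    struct BSTNode *lchild, *rchild;
} BSTNode, *BiTree;

int BST_Insert(BiTree &T, KeyType k) {
    if (T == NULL) {
        T = (BiTree)malloc(sizeof(BSTNode));
        T->key = k;
        T->lchild = T->rchild = NULL;
        return 1;
    }
    if (k == T->key) return 0;
    if (k < T->key) return BST_Insert(T->lchild, k);
    return BST_Insert(T->rchild, k);
}

// Sum of the search lengths of all nodes; call with depth = 1 for the root.
int SumSearchLength(BiTree T, int depth) {
    if (T == NULL) return 0;
    return depth + SumSearchLength(T->lchild, depth + 1)
                 + SumSearchLength(T->rchild, depth + 1);
}

int CountNodes(BiTree T) {
    if (T == NULL) return 0;
    return 1 + CountNodes(T->lchild) + CountNodes(T->rchild);
}

// Average search length over all nodes (0 for an empty tree).
double ASL(BiTree T) {
    int n = CountNodes(T);
    return n == 0 ? 0.0 : (double)SumSearchLength(T, 1) / n;
}
```

For the seven keys 1 to 7, inserting in the order 4, 2, 6, 1, 3, 5, 7 gives a full tree with a total search length of 1 + 2*2 + 4*3 = 17, while inserting in sorted order gives a chain with a total of 1 + 2 + ... + 7 = 28.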

2. Balanced Binary Tree

Balanced binary tree: also called an AVL tree; the absolute value of the balance factor of every node is at most 1.

Balance factor: the height of the left subtree minus the height of the right subtree.

Is the binary tree shown in the figure above a balanced binary tree?

Computing the balance factor of every node shows that it is a balanced binary tree.

What is the number of nodes Nh of the smallest balanced binary tree (the one with the fewest nodes) of height h?

Suppose the left subtree of the root has height h-1. The height of the right subtree could then be h, h-1, or h-2. If it were h, then adding the level of the root node would give a total height of h+1, which contradicts the question. Both h-1 and h-2 are allowed, so why choose h-2? Because we want the smallest balanced binary tree, and a smallest tree of height h-1 certainly has more nodes than a smallest tree of height h-2. This gives Nh = N(h-1) + N(h-2) + 1, with N0 = 0 (a tree of height 0 has 0 nodes) and N1 = 1 (a tree of height 1 has 1 node).

For example, to calculate the number of nodes of the smallest balanced binary tree of height 3:

N3 = N2 + N1 + 1, where N2 = N1 + N0 + 1 = 2, so N3 = 2 + 1 + 1 = 4.
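The same calculation can be done for any height with a short loop. The function name MinAVLNodes is made up for this sketch; it simply applies the recurrence with N0 = 0 and N1 = 1, one node for the root plus the smallest subtrees of heights h-1 and h-2.

```cpp
#include <cassert>

// Fewest nodes in a balanced binary tree of height h (hypothetical helper).
long long MinAVLNodes(int h) {
    assert(h >= 0);
    if (h == 0) return 0;            // N0 = 0
    long long prev2 = 0, prev1 = 1;  // N(i-2) and N(i-1), starting at i = 2
    for (int i = 2; i <= h; i++) {
        long long cur = prev1 + prev2 + 1;  // root + both smallest subtrees
        prev2 = prev1;
        prev1 = cur;
    }
    return prev1;
}
```

Continuing the hand calculation, N4 = N3 + N2 + 1 = 7 and N5 = N4 + N3 + 1 = 12.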

2.1 Determining Whether a Binary Tree Is Balanced

Use a recursive post-order traversal:

  • Determine whether the left subtree is a balanced binary tree.
  • Determine whether the right subtree is a balanced binary tree.
  • Determine whether the binary tree rooted at this node is a balanced binary tree.
    Condition:
    If the left subtree and the right subtree are both balanced binary trees and the absolute value of the difference between their heights is less than or equal to 1, the tree is balanced.

From this condition we know that two values must be computed for each node: its balance flag (b: 1 for balanced, 0 for unbalanced) and its height (h).

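//Parameter 1: the root node of the tree; parameter 2: set to 1 if balanced, 0 otherwise; parameter 3: set to the height
void Judge_AVL(BiTree bt,int &balance,int &h){
	//balance flag and height of the left subtree, balance flag and height of the right subtree
	int bl=0,br=0,hl=0,hr=0;
	if(bt==NULL){
		//the tree is empty
		h=0; //its height is 0
		balance=1; //and it is balanced
	}else if(bt->lchild==NULL&&bt->rchild==NULL){
		//both subtrees are empty
		h=1; //the height is 1
		balance=1; //balanced
	}else{
		Judge_AVL(bt->lchild,bl,hl); //judge the left subtree
		Judge_AVL(bt->rchild,br,hr); //judge the right subtree
		//compute the height of the tree rooted at this node:
		//take the taller of the two subtrees and add 1
		if(hl>hr){
			h=hl+1;
		}else{
			h=hr+1;
		}
		//judge the balance
		//abs (from <stdlib.h>) returns the absolute value
		if(abs(hl-hr)<2&&bl==1&&br==1){
			balance=1;
		}else{
			balance=0;
		}
	}
}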

3. Insertion into a Balanced Binary Tree

Insertion into a balanced binary tree takes one more step than insertion into a binary sort tree. If we only follow the binary sort tree insertion procedure, the resulting tree is not necessarily balanced, so an adjustment is needed afterwards. In short: insert first, then adjust.

Adjustment principle: each time, adjust the minimal unbalanced subtree.

For example, in the figure above, an adjustment is needed after the insertion is completed. We check the balance factors starting from the inserted node and moving upward. First, node 4 has a balance factor of -1, which satisfies the balanced binary tree condition; next, node 6 above it has a balance factor of 2, which violates the condition, so the subtree rooted at node 6 must be adjusted.

3.1 LL balance rotation (single right rotation)

Cause of the imbalance: a new node is inserted into the left subtree of the left child of node A.

Adjustment method: right rotation. A's left child B moves up to replace A, A becomes the root of B's right subtree, and B's original right subtree becomes A's left subtree.

When this imbalance occurs, the tree below is adjusted into a balanced binary tree:

3.2 RR balance rotation (single left rotation)

Cause of the imbalance: a new node is inserted into the right subtree of the right child of node A.

Adjustment method: left rotation. A's right child B moves up to replace A, A becomes the root of B's left subtree, and B's original left subtree becomes A's right subtree.
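The two single rotations come down to a few pointer assignments. The sketch below assumes a node layout the article does not show, and the function names RotateRight and RotateLeft are made up for illustration; each takes a reference to the pointer that currently points at A, so the parent's link is updated automatically.

```cpp
#include <cassert>
#include <cstddef>

// Assumed node layout; the article does not show its own definitions.
typedef int KeyType;
typedef struct BSTNode {
    KeyType key;
    struct BSTNode *lchild, *rchild;
} BSTNode, *BiTree;

// LL case: right rotation around the node that root points to (node A).
void RotateRight(BiTree &root) {
    assert(root != NULL && root->lchild != NULL);
    BSTNode *A = root;
    BSTNode *B = A->lchild;
    A->lchild = B->rchild;  // B's original right subtree becomes A's left subtree
    B->rchild = A;          // A becomes the root of B's right subtree
    root = B;               // B replaces A
}

// RR case: left rotation, the mirror image of RotateRight.
void RotateLeft(BiTree &root) {
    assert(root != NULL && root->rchild != NULL);
    BSTNode *A = root;
    BSTNode *B = A->rchild;
    A->rchild = B->lchild;  // B's original left subtree becomes A's right subtree
    B->lchild = A;          // A becomes the root of B's left subtree
    root = B;               // B replaces A
}
```

Both rotations preserve the in-order sequence of the keys, so the result is still a binary sort tree.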

3.3 LR balance rotation (left rotation first, then right rotation)

Cause of the imbalance: a new node is inserted into the right subtree of the left child of node A.

Adjustment method: rotate left, then rotate right. First, the root C of the right subtree of A's left child B is rotated up to take B's position; then C is rotated up again to take A's position.

Note: the subtrees Cl and Cr of C may be empty, because the right subtree Br of B may have been empty before the insertion.

3.4 RL balance rotation (right rotation first, then left rotation)

Cause of the imbalance: a new node is inserted into the left subtree of the right child of node A.

Adjustment method: rotate right, then rotate left. First, the root C of the left subtree of A's right child B is rotated up to take B's position; then C is rotated up again to take A's position.
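Each double rotation is just two single rotations applied in sequence, which the following sketch makes explicit. The node layout and all four function names are assumptions made for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Assumed node layout; the article does not show its own definitions.
typedef int KeyType;
typedef struct BSTNode {
    KeyType key;
    struct BSTNode *lchild, *rchild;
} BSTNode, *BiTree;

// Single right rotation: the left child moves up.
void RotateRight(BiTree &root) {
    BSTNode *A = root, *B = A->lchild;
    A->lchild = B->rchild;
    B->rchild = A;
    root = B;
}

// Single left rotation: the right child moves up.
void RotateLeft(BiTree &root) {
    BSTNode *A = root, *B = A->rchild;
    A->rchild = B->lchild;
    B->lchild = A;
    root = B;
}

// LR case: C (right child of A's left child B) first replaces B, then A.
void RotateLeftRight(BiTree &root) {
    assert(root != NULL && root->lchild != NULL && root->lchild->rchild != NULL);
    RotateLeft(root->lchild);   // C moves up into B's position
    RotateRight(root);          // C moves up into A's position
}

// RL case: C (left child of A's right child B) first replaces B, then A.
void RotateRightLeft(BiTree &root) {
    assert(root != NULL && root->rchild != NULL && root->rchild->lchild != NULL);
    RotateRight(root->rchild);  // C moves up into B's position
    RotateLeft(root);           // C moves up into A's position
}
```

In both cases C ends up as the root of the subtree, with the smaller of A and B as its left child and the larger as its right child.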

4. Weighted Path Length

Before learning about the Huffman tree, we first need to understand the weighted path length.

Path length: the number of edges on a path.

Weight of a node: the value assigned to the node.

Weighted path length of a tree (WPL): the sum, over all leaf nodes of the tree, of each leaf's weight multiplied by the length of the path from the root to that leaf.

The weighted path length of the above binary tree is: WPL = 7*2 + 2*2 + 3*2 = 24

The weighted path length of the above binary tree is: WPL = 7*1 + 2*2 + 3*2 = 17

Although the leaf weights of the two binary trees are the same, their weighted path lengths are different, which leads us to the definition of the Huffman tree.
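The definition translates into a short recursive function. The WNode type and the WPL function below are assumptions made for this sketch; only leaf weights contribute, and depth counts the edges from the root.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical node type for a weighted binary tree.
struct WNode {
    int weight;            // only meaningful for leaf nodes
    WNode *left, *right;
};

// Weighted path length of the tree rooted at t; call with depth = 0.
int WPL(const WNode *t, int depth) {
    if (t == NULL) return 0;
    if (t->left == NULL && t->right == NULL) {
        return t->weight * depth;          // leaf: weight * path length
    }
    return WPL(t->left, depth + 1) + WPL(t->right, depth + 1);
}
```

A tree that places leaves with weights 7, 2 and 3 all at depth 2 gives 24, while a tree that places 7 at depth 1 and 2 and 3 at depth 2 gives 17, matching the two calculations above.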

5. Huffman Tree

Huffman tree: also called the optimal binary tree. Among all binary trees with the same n weighted leaf nodes, it is the one with the smallest weighted path length.

5.1 Construction of the Huffman Tree

Huffman tree construction algorithm:

  • Treat the n nodes as n binary trees that each contain only a root node; together they form a forest F.
  • Create a new node, and pick from F the two trees whose root nodes have the smallest weights as its left and right subtrees; the weight of the new node is the sum of the weights of the two subtree roots.
  • Delete these two trees from F and add the newly created tree to F.
  • Repeat the second and third steps until only one tree remains in F.

Note that the construction process does not specify which of the two trees becomes the left subtree and which becomes the right subtree, so the Huffman tree is not unique.

For example, using the example above:

We start with three weighted nodes A, B, and C.

Then we select the two trees with the smallest root weights, B and C, and combine them into a binary tree whose root weight is the sum of the weights of the two nodes, and put it back into the forest.

Then we again pick the two trees with the smallest root weights...
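The merge loop can be sketched with a min-heap. As a simplification, this version keeps only the root weight of each tree in the forest instead of building the trees, and it relies on the fact that the weighted path length of the final tree equals the sum of the weights of all newly created nodes. The function name HuffmanWPL is made up for this illustration.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <vector>

// Returns the weighted path length of a Huffman tree for the given leaf
// weights (hypothetical helper; tracks root weights only).
int HuffmanWPL(const std::vector<int> &weights) {
    // The forest F, ordered so that the smallest root weight is on top.
    std::priority_queue<int, std::vector<int>, std::greater<int> >
        forest(weights.begin(), weights.end());
    int wpl = 0;
    while (forest.size() > 1) {
        int a = forest.top(); forest.pop();   // smallest root weight
        int b = forest.top(); forest.pop();   // second smallest
        wpl += a + b;         // weight of the newly created node
        forest.push(a + b);   // put the merged tree back into F
    }
    return wpl;
}
```

Assuming the weights A = 7, B = 2 and C = 3, the loop first merges 2 and 3 into 5, then 5 and 7 into 12, for a total of 5 + 12 = 17. With n leaves the loop runs n-1 times, creating n-1 new nodes.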

5.2 Properties of the Huffman Tree

  • Every initial node becomes a leaf node, and every node with two children is a newly generated node.
  • The larger a leaf's weight, the closer it is to the root node; the smaller its weight, the farther it is from the root node.
  • No node in a Huffman tree has degree 1.
  • A Huffman tree with n leaf nodes has 2n-1 nodes in total, of which n-1 have degree 2.

5.3 Huffman Coding

Encoding: representing each character of a string as a binary sequence.

Fixed-length encoding:

For example, take HelloWorld and represent each character with three binary digits. The binary sequence for the string is then: 000001010010011100011101010110.

This raises two questions. 1. Why use three binary digits rather than something shorter, such as two? (HelloWorld contains 7 distinct characters, and two digits can distinguish only 4.) 2. The character l appears more often than the others; can frequently occurring characters be given shorter codes to make the binary sequence shorter? This leads to the second encoding method: variable-length encoding.

Variable-length encoding:

For example, take HelloWorld again and use shorter binary codes. The binary sequence for the string is then: 0001001101110000

This sequence is much shorter than the previous one, but it cannot be used as it stands. With fixed-length encoding, every three digits represent one character, so the string can be recovered by reading the sequence in order. With these variable-length codes, however, 00 can represent either the character H or two l characters, which is ambiguous. Prefix encoding solves this problem.

Prefix encoding: no code is a prefix of any other code.

We modify the code table above accordingly:

We change the code of l to 11, which is not a prefix of any other code, so the resulting binary sequence can be decoded back into the string.

So how do we obtain such a prefix encoding? We can take advantage of the Huffman tree. For example, given five letters and their numbers of occurrences, A:5 B:3 C:6 D:9 E:13, we use the Huffman tree construction algorithm. First, each letter becomes a node whose weight is its number of occurrences.

Then we construct the Huffman tree.

Then we label every left edge in the tree with 0 and every right edge with 1.

The sequence of edge labels on the path from the root node to a leaf gives the prefix code of that leaf: D 00, E 01, C 10, A 110, B 111. Characters that occur more often get shorter codes and characters that occur less often get longer codes, which achieves the goal of shortening the binary sequence.
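The whole procedure, building the tree and then reading the codes off the edges, can be sketched as follows. The HNode layout and the names AssignCodes and HuffmanCodes are assumptions made for this illustration. Nodes are stored in a vector and referred to by index, and the first tree taken from the forest always becomes the left subtree, which is an arbitrary choice.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical Huffman tree node; left and right are indices, -1 for none.
struct HNode { int weight; char symbol; int left, right; };

// Walk from node i downward: left edges append 0, right edges append 1.
static void AssignCodes(const std::vector<HNode> &nodes, int i,
                        const std::string &prefix,
                        std::map<char, std::string> &codes) {
    if (nodes[i].left < 0) {             // leaf (nodes have 0 or 2 children)
        codes[nodes[i].symbol] = prefix;
        return;
    }
    AssignCodes(nodes, nodes[i].left, prefix + "0", codes);
    AssignCodes(nodes, nodes[i].right, prefix + "1", codes);
}

std::map<char, std::string> HuffmanCodes(
        const std::vector<std::pair<char, int> > &freq) {
    typedef std::pair<int, int> Item;    // (root weight, node index)
    std::vector<HNode> nodes;
    std::priority_queue<Item, std::vector<Item>, std::greater<Item> > forest;
    for (std::size_t i = 0; i < freq.size(); i++) {
        HNode leaf = {freq[i].second, freq[i].first, -1, -1};
        nodes.push_back(leaf);
        forest.push(Item(leaf.weight, (int)i));
    }
    while (forest.size() > 1) {
        Item a = forest.top(); forest.pop();   // smallest root weight
        Item b = forest.top(); forest.pop();   // second smallest
        HNode parent = {a.first + b.first, '\0', a.second, b.second};
        nodes.push_back(parent);
        forest.push(Item(parent.weight, (int)nodes.size() - 1));
    }
    std::map<char, std::string> codes;
    if (!forest.empty()) AssignCodes(nodes, forest.top().second, "", codes);
    return codes;
}
```

For A:5 B:3 C:6 D:9 E:13 this gives three-digit codes for A and B and two-digit codes for C, D and E, the same code lengths as in the text. The exact digits may differ from those above because the left and right choice is arbitrary. A single-symbol input gets the empty string as its code, an edge case this sketch does not treat specially.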

Finally, note:

The Huffman tree is not unique, so the Huffman code of each character is not unique either, but every Huffman tree has the same, optimal weighted path length.

Origin blog.csdn.net/qq_41941875/article/details/106522994