ASL and Binary Sorted Trees

(1) Concept

Suppose a linear list (an array or a linked list) holds 100 elements and you want to find a particular element A among them. Looking for A is one search. The number of comparisons needed to find A is called its search length.

Each of the 100 elements has its own search length, so there are 100 search lengths in total.

For example, with a sequential scan, if A is the first element its search length is 1; if B is the third, its search length is 3; if C is the tenth, its search length is 10; and so on.

Add up the search lengths of all 100 elements and divide by the number of elements, 100, to get the average search length (ASL).

Put simply, ASL is the number of comparisons needed on average to find one element.

Average search length of a successful search (ASL success): the average search length over searches for elements that exist in the list.

Average search length of a failed search (ASL failure): the average number of comparisons needed to determine that the searched element does not exist.

ASL = P1*C1 + P2*C2 + ... + Pn*Cn (the sum of Pi*Ci for i from 1 to n)
Pi is the probability of searching for the i-th element. It is generally assumed that every element is searched with the same probability, 1/n; some problems give a different probability for each element, in which case Pi must be replaced by the given value.
Ci is the number of comparisons needed to find the i-th element, that is, how many comparisons the search algorithm given in the problem makes before it reaches that element.

ASL is the most important indicator to measure the efficiency of search algorithms.
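
The definition of ASL as the sum of Pi*Ci can be checked with a few lines of Python. This is only a minimal sketch: the function name average_search_length is made up for illustration, and equal probabilities of 1/n are assumed when no probabilities are passed in.

```python
from fractions import Fraction

def average_search_length(comparisons, probabilities=None):
    """Return the sum of P_i * C_i (hypothetical helper, not from the original post).

    comparisons[i] is C_i; if probabilities is omitted, every P_i is 1/n.
    """
    n = len(comparisons)
    if probabilities is None:
        probabilities = [Fraction(1, n)] * n
    return sum(p * c for p, c in zip(probabilities, comparisons))

# Sequential scan over 100 elements: the i-th element costs i comparisons.
asl = average_search_length(list(range(1, 101)))
print(asl)  # 101/2
```

Exact fractions are used instead of floats so that the result can be compared directly with hand-computed values such as (1+n)/2.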

(2) Example

1: Given an array or linked list of n elements that is searched sequentially, find:

ASL success:
Sequential search compares the elements one by one from the first to the last, so:
finding the 1st element takes 1 comparison, the 2nd takes 2, the 3rd takes 3, the 4th takes 4, ..., and the nth takes n.
By the arithmetic series formula, the total is n*(1+n)/2.
Because there are n elements, divide by n: n*(1+n)/2/n
which gives ASL success = (1+n)/2

ASL failure:
If the problem says the array has a sentinel (the search runs from back to front and the key is stored in a[0] as the sentinel), a failed search takes n+1 comparisons, the last one being against the sentinel. If the problem does not mention a sentinel, n comparisons are enough to determine that the element does not exist.
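
The two failure counts can be reproduced with a small Python sketch. Both function names are hypothetical, and the sentinel version assumes the layout described above: elements in a[1..n], the key copied into a[0], and a scan from back to front.

```python
def sequential_search_with_sentinel(items, key):
    """Scan a[n] down to a[0], where a[0] holds the key as a sentinel.

    Returns (index, comparisons); index 0 means the key was not found.
    """
    a = [key] + list(items)      # a[0] is the sentinel, a[1..n] are the elements
    i = len(a) - 1
    comparisons = 1
    while a[i] != key:           # no bounds check needed: a[0] always matches
        i -= 1
        comparisons += 1
    return i, comparisons

def sequential_search(items, key):
    """Plain front-to-back scan. Returns (index, comparisons); index -1 if absent."""
    comparisons = 0
    for index, item in enumerate(items):
        comparisons += 1
        if item == key:
            return index, comparisons
    return -1, comparisons

print(sequential_search_with_sentinel([5, 8, 3], 9))  # (0, 4): n + 1 comparisons
print(sequential_search([5, 8, 3], 9))                # (-1, 3): n comparisons
```

Only key comparisons are counted here. The sentinel saves the separate end-of-array test in each step, which is why it costs one extra key comparison on failure.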

2: Given an unordered array: [29, 13, 37, 7, 10, 16, 19, 32, 33, 41, 43], build a binary sorting tree from the array, search it, and find ASL success and ASL failure

The idea of generating a binary sorting tree (small on the left, large on the right) is as follows:
Loop over the unordered array:
(1) If the tree is empty, use the current element as the root node
(2) If the tree is not empty and the current element is smaller than the element of the current node, recursively look for a position to the left
(3) If the tree is not empty and the current element is larger than the element of the current node, recursively look for a position to the right
(4) Continue until the left or right child position being examined is empty, then place the current element there, following the rule of small on the left and large on the right
The code is as follows:

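#include <stdlib.h>

typedef struct TNode {
	int data;
	struct TNode *lchild, *rchild;
} TNode;

/* Insert value into the tree whose root pointer is *node.
   Returns 0 if a new node was created, 1 if the value already exists,
   -1 if memory allocation fails. */
int Create_Tree(TNode **node, int value) {
	if (*node == NULL) {
		/* Empty position: create a new node, which is always a leaf */
		*node = malloc(sizeof(TNode));
		if (*node == NULL) return -1;
		/* A leaf's left and right children are both NULL */
		(*node)->lchild = (*node)->rchild = NULL;
		(*node)->data = value;
		return 0;
	} else if (value < (*node)->data) {
		/* Value is smaller than the current node: look for a position on the left */
		return Create_Tree(&(*node)->lchild, value);
	} else if (value > (*node)->data) {
		/* Value is larger than the current node: look for a position on the right */
		return Create_Tree(&(*node)->rchild, value);
	}
	/* Duplicate value: nothing to insert */
	return 1;
}

/* Insert the elements of the unordered array a[0..n-1] into the tree one by one */
TNode *Build_Tree(const int a[], int n) {
	TNode *rootNode = NULL;   /* start with an empty tree */
	for (int i = 0; i < n; i++) {
		Create_Tree(&rootNode, a[i]);
	}
	return rootNode;
}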

The generated binary sorting tree, level by level:
Level 1: 29
Level 2: 13, 37 (left and right child of 29)
Level 3: 7, 16 (children of 13); 32, 41 (children of 37)
Level 4: 10, 19, 33, 43 (the right children of 7, 16, 32 and 41 respectively; none of those four nodes has a left child)

Next, consider binary (half-interval) search. It requires a monotonically ordered array, so first run an in-order traversal of the tree (left, root, right), which lists the keys from smallest to largest:
Subscript: 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
Data: 7, 10, 13, 16, 19, 29, 32, 33, 37, 41, 43
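
As a quick check of the ordered array, here is a minimal Python sketch of binary search that also counts how many elements it probes. The function name and the choice of the lower middle, (low + high) // 2, are assumptions; a different rounding rule would give a different decision tree.

```python
def binary_search(sorted_items, key):
    """Return (index, probes); index is -1 when the key is absent."""
    low, high = 0, len(sorted_items) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2          # lower middle (assumed rounding rule)
        probes += 1                      # one probe of sorted_items[mid]
        if sorted_items[mid] == key:
            return mid, probes
        if key < sorted_items[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return -1, probes

data = [7, 10, 13, 16, 19, 29, 32, 33, 37, 41, 43]
probes = [binary_search(data, x)[1] for x in data]
print(probes)                   # [3, 4, 2, 3, 4, 1, 3, 4, 2, 3, 4]
print(binary_search(data, 5))   # (-1, 3): a key smaller than 7 fails after 3 probes
```

With this rounding rule the probe counts are 1 for 29, 2 for 13 and 37, 3 for 7, 16, 32 and 41, and 4 for the rest, which is the same shape as the binary sorting tree built from the unordered array.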

In the tree, the level of a node equals the number of comparisons needed to find it. With 11 data elements and equal search probability (1/n each), the formula gives: ASL success = (1*1 + 2*2 + 3*4 + 4*4)/11 = 33/11 = 3

For failed searches, look at the tree again. If the searched element is smaller than 7 the search must fail: it ends at the empty left-child position of node 7, and reaching that position takes three comparisons (with 29, with 13, with 7, after which there is nothing left to compare). On the same level, the left child of 16 is also empty (data greater than 13 and less than 16 does not exist, but three comparisons are needed to confirm it), as are the left child of 32 (data greater than 29 and less than 32) and the left child of 41 (data greater than 37 and less than 41). So there are 4 kinds of missing data whose absence is confirmed with 3 comparisons.
In the same way, 8 kinds of missing data need 4 comparisons: between 7 and 10, between 10 and 13, between 16 and 19, between 19 and 29, between 32 and 33, between 33 and 37, between 41 and 43, and greater than 43.

That is a lot of analysis. A simpler way is to count the empty left and right child positions: an empty position on level n takes n-1 comparisons.

ASL failure = (3*4 + 4*8)/12 = 44/12 = 11/3
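
The counting of levels and empty child positions can be automated. The following Python sketch is only an illustration: the Node class and the names insert and search_costs are hypothetical. It builds the tree from the unordered array of example 2 and lists the comparison count for every key and for every empty position.

```python
from fractions import Fraction

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    """Insert key (smaller keys go left, larger keys go right); return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root                          # duplicates are ignored

def search_costs(node, level=1):
    """Return (success, failure): comparisons per key and per empty position."""
    if node is None:
        return [], [level - 1]           # empty position on level n costs n - 1
    left_ok, left_fail = search_costs(node.left, level + 1)
    right_ok, right_fail = search_costs(node.right, level + 1)
    return left_ok + [level] + right_ok, left_fail + right_fail

root = None
for key in [29, 13, 37, 7, 10, 16, 19, 32, 33, 41, 43]:
    root = insert(root, key)

success, failure = search_costs(root)
print(Fraction(sum(success), len(success)))   # 3
print(Fraction(sum(failure), len(failure)))   # 11/3
```

The failure list has 12 entries, one per empty child position, which is always one more than the number of keys.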

3: Perform a sequential search on a sequential list of length 3. The probability of searching for the first element is 1/2, for the second element 1/3, and for the third element 1/6. What is the average search length of a successful search?

Note that with unequal probabilities the formula must be applied directly: add up Pi (probability) * Ci (number of comparisons) for i from 1 to 3:
(1/2)*1 + (1/3)*2 + (1/6)*3 = 1/2 + 2/3 + 1/2 = 5/3

Deleting a node from a binary sorting tree:

To delete a node X:
(1) If X is a leaf node, delete it directly.
(2) If X is not a leaf and has only one subtree (either left or right, but not both), put that subtree in X's position.
(3) If X has both a left and a right subtree, fill X's position with the first node of the in-order traversal of X's right subtree, and remove that node from its old position.
Analysis: why use the first node of the in-order traversal of the right subtree to fill X?
Because the tree must still be small on the left and large on the right after X is deleted. The first node of the in-order traversal of X's right subtree is larger than every element of the left subtree and is also the smallest element of the right subtree, so placing it at X's position keeps the rule satisfied.
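
The three deletion cases can be sketched in Python as follows. The Node class, the function names and the six-key example tree are made up for illustration and are not the tree from the original figure.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def delete(root, key):
    """Delete key from the subtree rooted at root; return the new subtree root."""
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    elif root.left is None:              # case (1) leaf, or case (2) right subtree only
        return root.right
    elif root.right is None:             # case (2) left subtree only
        return root.left
    else:                                # case (3) two subtrees
        successor = root.right           # first in-order node of the right subtree
        while successor.left is not None:
            successor = successor.left
        root.key = successor.key
        root.right = delete(root.right, successor.key)
    return root

def inorder(node):
    return [] if node is None else inorder(node.left) + [node.key] + inorder(node.right)

# Hypothetical example tree: 50 has two subtrees, so deleting it is case (3).
tree = Node(50, Node(30, Node(20), Node(40)), Node(70, Node(60), Node(80)))
tree = delete(tree, 50)
print(tree.key)        # 60
print(inorder(tree))   # [20, 30, 40, 60, 70, 80]
```

In case (3) the sketch copies the successor's key into X and then deletes the successor from the right subtree; since the successor has no left child, that second deletion always falls into case (1) or (2).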

Time complexity of searching:

Searching a binary sorting tree is similar to binary search, but the same set of elements produces different binary sorting trees when the insertion order differs. In the extreme case the root node a is the smallest of all the data, so every other node piles up on the right side of a. When the inserted data is already ordered (each element is larger than the previous one, or each is smaller), the result is a one-sided binary sorting tree, and the worst-case search time complexity is O(n).
Binary search is generally performed directly on a given ordered array, so its decision tree is unique, and each round of the search eliminates half of the remaining data, so its time complexity is O(log2 n).
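
The effect of insertion order on the tree height can be seen with a short Python sketch. The helper insertion_depths is hypothetical; it records the level of each key without building node objects, and it assumes the keys are distinct.

```python
import math

def insertion_depths(keys):
    """Insert keys one by one into an empty binary sorting tree.

    Returns {key: level} with the root on level 1; duplicates are ignored.
    """
    left, right, depth = {}, {}, {}      # child links and levels, indexed by key
    root = None
    for key in keys:
        if root is None:
            root, depth[key] = key, 1
            continue
        current = root
        while key != current:
            branch = left if key < current else right
            if current not in branch:    # empty position found: place the key here
                branch[current] = key
                depth[key] = depth[current] + 1
            current = branch[current]
    return depth

ordered = list(range(1, 12))
shuffled = [29, 13, 37, 7, 10, 16, 19, 32, 33, 41, 43]
print(max(insertion_depths(ordered).values()))    # 11: one-sided tree, height n
print(max(insertion_depths(shuffled).values()))   # 4
print(math.ceil(math.log2(11 + 1)))               # 4: smallest possible height for 11 keys
```

For 11 keys the ordered input gives a chain of height 11, while the order used in example 2 happens to reach the minimum height of 4.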

[2013] In any non-empty binary sorting tree T1, delete a node V to form a binary sorting tree T2, then insert V into T2 to form a binary sorting tree T3. Which of the following statements about T1 and T3 are correct?

If V is a leaf node of T1, then T1 and T3 are the same.
Analysis: as the deletion rules above show, deleting a leaf changes nothing else in the tree, and re-inserting V puts it back at the same leaf position, so the binary sorting tree does not change at all.

If V is not a leaf node of T1, then T1 and T3 are different.
Analysis: when a non-leaf node is deleted, its position is taken over either by its only subtree or by the first node of the in-order traversal of its right subtree. A newly inserted element of a binary sorting tree always becomes a leaf, so when V comes back it cannot return to its original position. T1 and T3 are therefore always different.

Note ⚠️: There was a very similar question in 2019, but it is about balanced binary trees.

[2019] In any non-empty balanced binary tree (AVL tree) T1, delete a node V to form a balanced binary tree T2, then insert V into T2 to form a balanced binary tree T3. Which of the following statements about T1 and T3 are correct?

If V is a leaf node of T1, then T1 and T3 may be different.
Analysis: because the tree is balanced, deleting a leaf node may trigger a rebalancing rotation that changes the shape of the tree, in which case T1 and T3 can differ. If deleting the leaf V triggers no rotation, then T1 = T3. So they may or may not be the same.

If V is not a leaf node of T1, then T1 and T3 may be the same.
Analysis: V is not a leaf node, so the tree definitely changes after the deletion. Inserting V again may then trigger a rebalancing rotation that restores T1. For example, take the keys 1, 2, 3 with 2 as the root: deleting 2 leaves a two-node tree, re-inserting 2 unbalances it, and the rotation brings 2 back to the root with children 1 and 3, so T3 = T1.

The pre-order traversal of a binary sorting tree gives the sequence (50, 38, 30, 45, 40, 48, 70, 60, 75, 80). Draw the binary sorting tree and find the average search length of successful and failed searches under equal probability.

To restore the tree from the pre-order sequence: the first element, 50, must be the root node; everything smaller than 50 goes into the left subtree and everything larger than 50 goes into the right subtree.
In the left subtree, the first element visited, 38, must be its root; everything smaller than 38 goes to the left and everything larger than 38 goes to the right.
In the right subtree, the first element visited, 70, must be its root; everything smaller than 70 goes to the left and everything larger than 70 goes to the right.
Repeating this step restores the whole tree: level 1 holds 50; level 2 holds 38 and 70; level 3 holds 30, 45, 60 and 75; level 4 holds 40, 48 and 80.
The problem says equal probability, so apply Pi*Ci directly:
ASL success = (1*1 + 2*2 + 3*4 + 4*3)/10 = 29/10
ASL failure = (3*5 + 4*6)/11 = 39/11
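
The restoration from a pre-order sequence can be sketched in Python using value bounds. The function names are hypothetical, and the tree is represented as nested tuples (key, left, right) purely for brevity.

```python
from fractions import Fraction

def rebuild_from_preorder(preorder):
    """Rebuild a binary sorting tree from its pre-order sequence.

    Returns nested tuples (key, left, right); None marks an empty subtree.
    """
    position = 0

    def build(low, high):
        nonlocal position
        if position == len(preorder) or not (low < preorder[position] < high):
            return None
        key = preorder[position]
        position += 1
        left = build(low, key)       # keys smaller than this root go left
        right = build(key, high)     # keys larger than this root go right
        return (key, left, right)

    return build(float("-inf"), float("inf"))

def search_costs(tree, level=1):
    """Return (success, failure): comparisons per key and per empty position."""
    if tree is None:
        return [], [level - 1]
    _, left, right = tree
    left_ok, left_fail = search_costs(left, level + 1)
    right_ok, right_fail = search_costs(right, level + 1)
    return left_ok + [level] + right_ok, left_fail + right_fail

tree = rebuild_from_preorder([50, 38, 30, 45, 40, 48, 70, 60, 75, 80])
success, failure = search_costs(tree)
print(Fraction(sum(success), len(success)))   # 29/10
print(Fraction(sum(failure), len(failure)))   # 39/11
```

Each recursive call consumes the next element only if it fits between the bounds inherited from its ancestors, which is the "smaller goes left, larger goes right" rule applied one element at a time.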

Build a binary sorting tree from the sequence (40, 72, 38, 35, 67, 51, 90, 8, 55, 21), draw the tree, and find the average search length of a successful search under equal probability.

Take 40 as the root node and insert the remaining elements in order, following the default rule of the binary sorting tree (small on the left, large on the right). The result has 40 on level 1; 38 and 72 on level 2; 35, 67 and 90 on level 3; 8 and 51 on level 4; 21 and 55 on level 5. Then compute ASL success with the formula:
ASL success = (1*1 + 2*2 + 3*3 + 4*2 + 5*2)/10 = 32/10 = 3.2
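
The level of each key can also be recorded while inserting. Below is a minimal Python sketch with an iterative insert; the names are hypothetical, and since the sequence has no duplicates, equal keys are simply sent to the right.

```python
from fractions import Fraction

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def insert(root, key):
    """Insert key iteratively; return (root, level on which key was placed)."""
    if root is None:
        return Node(key), 1
    node, level = root, 1
    while True:
        level += 1                       # the new key lands below the current node
        if key < node.key:
            if node.left is None:
                node.left = Node(key)
                return root, level
            node = node.left
        else:
            if node.right is None:
                node.right = Node(key)
                return root, level
            node = node.right

root, levels = None, []
for key in [40, 72, 38, 35, 67, 51, 90, 8, 55, 21]:
    root, level = insert(root, key)
    levels.append(level)

print(levels)                              # [1, 2, 2, 3, 3, 4, 3, 4, 5, 5]
print(Fraction(sum(levels), len(levels)))  # 16/5, that is 3.2
```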

The best binary sorting tree: a binary sorting tree whose height is the smallest possible and in which the number of nodes in the left and right subtrees differs by at most 1. It is "best" because its query efficiency is high: it keeps the convenient search of a sorting tree and is at the same time a balanced binary tree.
Given a set of keywords {25, 18, 34, 9, 14, 27, 42, 51, 38}, and assuming every keyword is searched with the same probability, draw the best binary sorting tree.

Idea: first sort the keyword sequence (9, 14, 18, 25, 27, 34, 38, 42, 51) and take the middle element, 27, as the root node. Then keep adding the other elements to the tree following the rule of small on the left and large on the right. Because the root has already been fixed, any imbalance that appears in the left or right subtree is adjusted only inside that subtree; the root itself must not be adjusted, otherwise the numbers of elements on its left and right would differ by more than 1 and the tree would no longer be the best one.
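
A different way to reach a minimum-height tree is to pick the middle element recursively instead of inserting and rebalancing. The Python sketch below shows that swapped-in technique, not the procedure described above. The function names are hypothetical, and taking the upper middle for sublists of even length is an assumption, so the subtrees may differ from the original figure even though the height and the root agree.

```python
def build_balanced(sorted_keys):
    """Build a minimum-height binary sorting tree as nested tuples (key, left, right)."""
    if not sorted_keys:
        return None
    mid = len(sorted_keys) // 2          # upper middle for even lengths (assumed)
    return (sorted_keys[mid],
            build_balanced(sorted_keys[:mid]),
            build_balanced(sorted_keys[mid + 1:]))

def height(tree):
    return 0 if tree is None else 1 + max(height(tree[1]), height(tree[2]))

keys = sorted([25, 18, 34, 9, 14, 27, 42, 51, 38])
tree = build_balanced(keys)
print(tree[0])        # 27
print(height(tree))   # 4
```

With 9 keys no binary tree can be shorter than 4 levels, because 3 full levels hold at most 7 nodes.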

Write an algorithm to determine whether a given binary tree is a binary sorting tree / find the smallest and largest keywords in a binary sorting tree

Idea: run an in-order traversal and check whether the keys come out in increasing order. If any key is out of order, the tree is not a binary sorting tree. To find the minimum and maximum keywords, an in-order traversal is enough: the first key is the smallest and the last key is the largest. More directly, use a while loop that keeps following left children until there is none, which reaches the smallest element; a while loop that keeps following right children reaches the largest element.
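
A minimal Python sketch of both tasks follows. The check relies on the in-order (left, root, right) sequence of a binary sorting tree being strictly increasing; the Node class, the function names and the two small test trees are hypothetical.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def inorder(node):
    """Yield keys in left, root, right order."""
    if node is not None:
        yield from inorder(node.left)
        yield node.key
        yield from inorder(node.right)

def is_binary_sorting_tree(root):
    """True if the in-order sequence is strictly increasing."""
    previous = None
    for key in inorder(root):
        if previous is not None and key <= previous:
            return False
        previous = key
    return True

def minimum(root):
    while root.left is not None:         # keep going left
        root = root.left
    return root.key

def maximum(root):
    while root.right is not None:        # keep going right
        root = root.right
    return root.key

valid = Node(29, Node(13, Node(7), Node(16)), Node(37))
invalid = Node(29, Node(13, Node(7), Node(31)), Node(37))   # 31 sits left of 29
print(is_binary_sorting_tree(valid))     # True
print(is_binary_sorting_tree(invalid))   # False
print(minimum(valid), maximum(valid))    # 7 37
```

The invalid tree shows why comparing each node only with its own children is not enough: 31 is a correct right child of 13, yet it breaks the ordering with respect to the root 29.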

A binary tree that is both a large root heap (max-heap) and a binary sorting tree can only be a one-sided tree: a root with a single left child whose value is smaller than the value of the root.

The large root heap requires the root's value to be greater than the values of both its children, while the binary sorting tree is small on the left and large on the right. So only the left child can exist: a right child in a binary sorting tree would have to be larger than the root, which the heap does not allow.
Secondly, a large root heap is a complete binary tree. When a level is not full (here the right child is missing), no node can appear on the next level, so the height of such a one-sided tree can be at most 2.


Origin blog.csdn.net/whiteBearClimb/article/details/128074229