Trees and Binary Tree Heaps: The Meaning of Heaps

Table of contents

 The meaning of the heap: 

A heap has two main uses: the first is heap sort, and the second is top-k ranking.

Top-k ranking with a heap:

Top-k problems with large amounts of data:

 Implementing heap sort (ascending order as the example):

Method 1: Swap the first and last elements: 

Build a max-heap:

Swapping the root and tail nodes, combined with top-down adjustment:

Top-down adjustment function:

Bottom-up adjustment function:

Source file: 

Main function:

Method 2: Repeated horizontal jumps: 

Implementation: 

Top-K ranking problem: finding the K largest values in a large amount of data

Create the data and store it in a file: 

Create a min-heap of K numbers: 

 Exchange:

Print the heap: 

Complete code: 

Time complexity of top-down adjustment:

Time complexity of bottom-up adjustment: 

 


 The meaning of the heap: 

A heap has two main uses: the first is heap sort, and the second is top-k ranking.

Top-k ranking with a heap:

  • Top-k ranking relies on the property that the top of a max-heap or min-heap is always the maximum or minimum value. Repeatedly applying the heap's delete operation removes the elements in order of size, which produces a ranking in descending or ascending order.

Top-k problems with large amounts of data:

When facing a large amount of data, to find the k largest values you can first build a min-heap of K nodes from the first K values. Each remaining value is then compared with the element at the top of the heap. If it is larger than the top element, it replaces the top element and is then compared and exchanged downward with the nodes below it until the min-heap property is restored.

In the end, the heap holds the k largest values, with the smallest of them at the top of the heap. This method works like a last-place elimination system: the weakest member of the current top k is always the one knocked out.
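The elimination idea can be sketched in C on an in-memory array. This is only an illustration under stated assumptions: the goal is taken to be the k largest values, and the names Swap, AdjustDownMin and TopKLargest are hypothetical, not from the original code.

```c
#include <assert.h>
#include <stddef.h>

static void Swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Top-down adjustment for a min-heap of n elements, starting at index parent. */
static void AdjustDownMin(int *a, int n, int parent)
{
    int child = 2 * parent + 1;              /* left child */
    while (child < n) {
        /* pick the smaller of the two children */
        if (child + 1 < n && a[child + 1] < a[child])
            ++child;
        if (a[child] < a[parent]) {
            Swap(&a[child], &a[parent]);
            parent = child;
            child = 2 * parent + 1;
        } else {
            break;                           /* heap property holds */
        }
    }
}

/* Leaves the k largest values of data[0..n-1] in heap[0..k-1].
   heap[0] ends up as the k-th largest value. Requires 0 < k <= n. */
void TopKLargest(const int *data, int n, int *heap, int k)
{
    assert(data != NULL && heap != NULL && k > 0 && k <= n);

    /* 1. the first k values form the initial min-heap */
    for (int i = 0; i < k; ++i)
        heap[i] = data[i];
    for (int i = (k - 2) / 2; i >= 0; --i)
        AdjustDownMin(heap, k, i);

    /* 2. every later value challenges the current weakest (the heap top) */
    for (int i = k; i < n; ++i) {
        if (data[i] > heap[0]) {
            heap[0] = data[i];
            AdjustDownMin(heap, k, 0);
        }
    }
}
```

Only k values are held in the heap at any time, which is why the same idea carries over to data too large to fit in memory.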

 Implementing heap sort (ascending order as the example):

  • For ascending order, most people first think of a min-heap: in a min-heap every parent node is smaller than its children, and the underlying structure of the heap is an array, so a min-heap looks like the natural way to implement heap sort.
  • But this is wrong, because heap sort is really a form of selection sort: each round selects one extreme value and puts it in its final position.

If we build a min-heap for ascending order, we get the smallest number at the root right away. But what is the next step? How do we find the second smallest number?

If you leave the root in place and try to keep searching in the current heap, the relationships are completely messed up: once the root is set aside, the remaining elements no longer form a heap, because nodes that used to be siblings, a parent's siblings, and their children all change roles.

At that point the only option is to rebuild a heap from the remaining elements, but rebuilding costs O(N) every time, so selecting all N elements this way costs O(N^2).

Therefore, use a max-heap to sort in ascending order with heap sort!

Method 1: Swap the first and last elements: 

This follows the idea behind heap deletion: exchange!

In a max-heap every parent is larger than its children, so the root holds the maximum value of the heap, while the tail node is a leaf that holds one of the smaller values. When the two are exchanged, the maximum goes to the tail position, which is its final place in ascending order, and the smaller value goes to the root.

Then, in a loop, the tail node is shielded from the heap (the heap shrinks by one) and the new root is adjusted top-down to restore the max-heap. After all rounds of exchange, the array is sorted in ascending order, and an ascending array also satisfies the min-heap property.

Build a max-heap:

The previous way of building a heap is too cumbersome. Instead, you can build the max-heap in place on the existing array with bottom-up adjustment. 
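A minimal C sketch of building a max-heap in place with bottom-up adjustment, assuming an int array. The names AdjustUp and BuildMaxHeapUp are illustrative and may differ from the original source file.

```c
#include <assert.h>
#include <stddef.h>

static void Swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Bottom-up adjustment: move a[child] up while it is larger than its parent. */
void AdjustUp(int *a, int child)
{
    int parent = (child - 1) / 2;
    while (child > 0) {
        if (a[child] > a[parent]) {
            Swap(&a[child], &a[parent]);
            child = parent;
            parent = (child - 1) / 2;
        } else {
            break;                  /* parent is already at least as large */
        }
    }
}

/* Treat a[0..i-1] as a max-heap and "insert" a[i] for i = 1 .. n-1. */
void BuildMaxHeapUp(int *a, int n)
{
    assert(a != NULL || n == 0);
    for (int i = 1; i < n; ++i)
        AdjustUp(a, i);
}
```

No extra heap structure is allocated: the array itself becomes the heap, one element at a time.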

Swapping the root and tail nodes, combined with top-down adjustment:

--end shields the last node after each round. end is an index (subscript), and n is the array size.
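The swap loop might look like the following C sketch. It assumes the array is already a max-heap when the loop starts; AdjustDown and SortMaxHeap are illustrative names rather than the original code.

```c
#include <assert.h>
#include <stddef.h>

static void Swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Top-down adjustment for a max-heap of n elements, starting at index parent. */
void AdjustDown(int *a, int n, int parent)
{
    int child = 2 * parent + 1;              /* left child */
    while (child < n) {
        /* pick the larger of the two children */
        if (child + 1 < n && a[child + 1] > a[child])
            ++child;
        if (a[child] > a[parent]) {
            Swap(&a[child], &a[parent]);
            parent = child;
            child = 2 * parent + 1;
        } else {
            break;
        }
    }
}

/* Precondition: a[0..n-1] is a max-heap. Postcondition: ascending order. */
void SortMaxHeap(int *a, int n)
{
    assert(a != NULL || n == 0);
    int end = n - 1;                 /* index of the current tail node */
    while (end > 0) {
        Swap(&a[0], &a[end]);        /* maximum goes to its final position */
        AdjustDown(a, end, 0);       /* heap size is now end: the tail is shielded */
        --end;
    }
}
```

Passing end as the heap size to AdjustDown is what keeps the already placed maximum values out of later adjustments.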

 

Top-down adjustment function:

Bottom-up adjustment function:

Source file: 

Main function:

Method 2: Repeated horizontal jumps: 

The core of this method is to find the parent of the last node and perform a top-down adjustment there. Using the max-heap property needed for ascending order (a parent is larger than its children), the parent's element is exchanged with the larger child's element whenever that child is larger.

  • After each adjustment, the subtree rooted at that index is a heap. The index then jumps to the root of the next subtree and the same operation is repeated. Because the subtrees below have already been adjusted, the parent-child structure of the left and right subtrees is not destroyed, and once the index reaches the root the whole array is a max-heap, ready to be sorted in ascending order.

Implementation: 

i is the index of the parent node, and n is initially the index of the last node. 
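A hedged C sketch of this method follows. One assumption differs from the description: here n is the number of elements, so the last node has index n - 1 and its parent is (n - 1 - 1) / 2. HeapSort, AdjustDown and Swap are illustrative names.

```c
#include <assert.h>
#include <stddef.h>

static void Swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Top-down adjustment for a max-heap of n elements, starting at index parent. */
static void AdjustDown(int *a, int n, int parent)
{
    int child = 2 * parent + 1;
    while (child < n) {
        if (child + 1 < n && a[child + 1] > a[child])
            ++child;                         /* larger child */
        if (a[child] > a[parent]) {
            Swap(&a[child], &a[parent]);
            parent = child;
            child = 2 * parent + 1;
        } else {
            break;
        }
    }
}

/* Sorts a[0..n-1] in ascending order; n is the element count (assumption). */
void HeapSort(int *a, int n)
{
    assert(a != NULL || n == 0);

    /* Build: start at the parent of the last node and jump back one
       subtree root at a time until the whole array is a max-heap. */
    for (int i = (n - 1 - 1) / 2; i >= 0; --i)
        AdjustDown(a, n, i);

    /* Sort: swap root and tail, shrink the heap, re-adjust the root. */
    for (int end = n - 1; end > 0; --end) {
        Swap(&a[0], &a[end]);
        AdjustDown(a, end, 0);
    }
}
```

Leaves need no adjustment, which is why the build loop starts at the last parent instead of the last node.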


Top-K ranking problem: finding the K largest values in a large amount of data

The earlier discussion of the meaning of the heap covered the top-k problem. Here we work through it for a large data set.

Create the data and store it in a file: 

The data is written to a file mainly because printing this much data would freeze the terminal. 

  • Code interpretation:
  • srand seeds the random number generator, with time supplying the seed value, and then the file is opened in write ("w") mode.
  • Because 10 million values need to be generated and rand may produce only a little over 30,000 distinct numbers (RAND_MAX can be as small as 32767), the code adds i to each result and also brings in 10000000 to widen the range.
  • The data is then written to the file with fprintf, and finally the file is closed. fin is the file pointer variable.

file is a pointer to the file, and fin is the pointer used to operate on the file's contents. 
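The data generation step might be sketched in C as below. Several details are assumptions: file is treated as the file name, the function name CreateData is hypothetical, and the formula (rand() + i) % 10000000 is one plausible reading of the description, not a confirmed copy of the original code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Writes n pseudo-random integers, one per line, to the named file.
   Returns 0 on success and -1 if the file cannot be opened. */
int CreateData(const char *file, int n)
{
    assert(file != NULL && n >= 0);

    FILE *fin = fopen(file, "w");            /* write mode, as in the text */
    if (fin == NULL) {
        perror("fopen");
        return -1;
    }

    srand((unsigned int)time(NULL));         /* seed with the current time */
    for (int i = 0; i < n; ++i) {
        /* Assumed formula: adding i spreads values beyond RAND_MAX;
           long long avoids overflow where RAND_MAX equals INT_MAX. */
        int x = (int)(((long long)rand() + i) % 10000000);
        fprintf(fin, "%d\n", x);
    }

    fclose(fin);
    return 0;
}
```

A call such as CreateData("data.txt", 10000000) would produce the 10 million values; the file name here is only an example.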

 

Create a min-heap of K numbers: 

  • In a min-heap the root is the smallest number in the heap. To find the K largest numbers, each new number is first checked against the root element at the top of the heap.
  • A number larger than the top of the heap replaces the top element. Since a min-heap requires every parent to be smaller than its children, a top-down adjustment follows, which moves the larger element down the heap, where it waits for later rounds of comparison.
  • In the end, the heap holds the K largest numbers, and the root element at the top of the heap is the Kth largest among these millions of numbers.

  • Note that the heap is still built on an array here, and the elements entering the array are read from the previously created file with fscanf.
  • malloc allocates enough bytes to store K elements, and that space is used as the array.
  • fscanf reads from the stream you specify, while scanf reads from the standard input stream (the keyboard by default).
  • The three parameters of fscanf used here are: the FILE pointer to read from, the format string describing the data to read, and the address where the value read is stored.
  • fscanf returns EOF when it reaches the end of the file before reading any value.

 Exchange:

Because fscanf returns EOF once the data is exhausted, its return value serves as the loop condition. For each value read that is larger than the top of the heap, replace the top and perform a top-down adjustment so the heap keeps its min-heap structure. 
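Putting the heap creation and the exchange loop together, a C sketch could look like this. TopKFromFile and AdjustDownMin are hypothetical names, and the loop tests fscanf(...) == 1 instead of comparing against EOF, a small change of mine so that malformed input also ends the loop.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

static void Swap(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}

/* Top-down adjustment for a min-heap of n elements, starting at index parent. */
static void AdjustDownMin(int *a, int n, int parent)
{
    int child = 2 * parent + 1;
    while (child < n) {
        if (child + 1 < n && a[child + 1] < a[child])
            ++child;                         /* smaller child */
        if (a[child] < a[parent]) {
            Swap(&a[child], &a[parent]);
            parent = child;
            child = 2 * parent + 1;
        } else {
            break;
        }
    }
}

/* Reads integers from fin and returns a malloc'ed min-heap holding the
   k largest of them (heap[0] is the k-th largest). Returns NULL if the
   allocation fails or fewer than k values exist. Caller must free(). */
int *TopKFromFile(FILE *fin, int k)
{
    assert(fin != NULL && k > 0);

    int *heap = malloc(sizeof(int) * (size_t)k);
    if (heap == NULL)
        return NULL;

    /* the first k values fill the array, then become a min-heap */
    for (int i = 0; i < k; ++i) {
        if (fscanf(fin, "%d", &heap[i]) != 1) {
            free(heap);
            return NULL;
        }
    }
    for (int i = (k - 2) / 2; i >= 0; --i)
        AdjustDownMin(heap, k, i);

    /* every remaining value challenges the top of the heap */
    int x;
    while (fscanf(fin, "%d", &x) == 1) {
        if (x > heap[0]) {
            heap[0] = x;
            AdjustDownMin(heap, k, 0);
        }
    }
    return heap;
}
```

Memory use stays at K integers no matter how large the file is, since each value is read, compared once with the heap top, and then discarded or kept.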

Print the heap: 

  • Finally, print the heap. 

 

Complete code: 


Time complexity of top-down adjustment:
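The standard worst-case count can be written out as follows, assuming this heading refers to building a heap of N nodes with top-down adjustment starting from the last parent, and assuming a full binary tree with h levels so that N = 2^h - 1. Level j holds 2^(j-1) nodes, and each of them moves down at most h - j levels.

```latex
\begin{aligned}
T(N) &= \sum_{j=1}^{h-1} 2^{\,j-1}\,(h-j) \\
     &= 2^{h} - 1 - h \\
     &= N - \log_2(N+1) \\
     &= O(N)
\end{aligned}
```

Most nodes sit near the bottom and move only a short distance, which is why the total stays linear.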

Time complexity of bottom-up adjustment: 
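For comparison, here is the standard worst-case count when a heap of N nodes is built with bottom-up adjustment, assuming that is what this heading covers and again assuming a full binary tree with h levels, N = 2^h - 1. Level j holds 2^(j-1) nodes, and each of them moves up at most j - 1 levels.

```latex
\begin{aligned}
T(N) &= \sum_{j=2}^{h} 2^{\,j-1}\,(j-1) \\
     &= (h-2)\,2^{h} + 2 \\
     &= (N+1)\bigl(\log_2(N+1) - 2\bigr) + 2 \\
     &= O(N \log N)
\end{aligned}
```

Here the many nodes on the bottom levels are the ones that travel the farthest, so the total grows faster than linearly.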


 


Origin blog.csdn.net/2301_76445610/article/details/134699288