Sorting algorithms: heap sort

Heap sort

Previously, we applied divide-and-conquer algorithms to sequentially stored linear lists, for example by scanning from both ends at the same time (quick sort, merge sort, etc.), in order to improve the efficiency of the algorithm.
Now let's look at a less conventional sorting method. Heap sort stores a binary tree in a sequential list and derives an ordered sequence from it.
Does it feel like a different starting line? Until now we optimized the algorithm; who would have thought of optimizing the structure instead?

Key points

  • Using arrays to implement tree-related data structures may seem a bit weird, but it is very efficient in time and space.
  • Not every min-heap is a sorted array! To turn a heap into a sorted array, you need heap sort.
  • The root node of the heap stores the largest or smallest element, but the relative order of the other nodes is not fixed.
    For example, in a max-heap the largest element is always at index 0, but the smallest element is not necessarily the last element.
    The only guarantee is that the smallest element is a leaf node, but which leaf is not known in advance.

Step analysis

The basic idea of heap sort is:

  1. Build the sequence to be sorted into a max-heap. At this point the maximum value of the whole sequence is the root node at the top of the heap. Swap it with the last element, so that the end of the array now holds the maximum value. (This is equivalent to sinking the extreme value to the end of the array.)
  2. Rebuild the remaining n-1 elements into a heap, which yields the next-largest value of the n elements.
  3. Repeat these steps until the whole sequence is ordered.

A max-heap is used for ascending order; a min-heap is used for descending order.

The question is, how to build an unordered sequence into a heap?

Heap structure
{10,5,9,1,2,7,8}
    
          10 
    5             9
1       2     7       8

In memory:
index        0    1    2    3    4    5    6
node_v       10   5    9    1    2    7    8
tree_level   1    2    2    3    3    3    3
// The root node is level 1; the root's children are level 2

First, observe the relationships between the nodes in the heap structure shown above:

  • In a heap, the next level is not filled until every position on the current level has been filled.
  • In the sequential list, a parent node always comes before its child nodes.
  • The indices of a parent and its children in the sequential list are related: for the node at index i, the left child is at index 2i+1 and the right child is at index 2i+2.
  • In a max-heap, every node on level(i) is greater than or equal to its own child nodes on level(i+1). However, the two children of the same parent are not ordered relative to each other; either one may be the larger.

Therefore, the max-heap and the min-heap can be described by the following relations between array subscripts:

  • Max-heap: arr[i] >= arr[2i+1] && arr[i] >= arr[2i+2]
  • Min-heap: arr[i] <= arr[2i+1] && arr[i] <= arr[2i+2]

To build the heap:

  1. First treat the unordered sequence as an unordered heap, then find the last non-terminal node, which lies on the second-lowest level. (Why start there rather than at the bottom? A terminal node has no children, so on its own it already satisfies the heap relation and needs no adjustment.)
  2. Starting from that node, compare each node with its children and work back toward the top of the heap. This is a sifting process (the nodes on the path are compared pairwise, and the larger one rises). Repeating it for every non-terminal node builds the max-heap we want.

So how do we find the last non-terminal node in the unordered heap? According to the mathematical properties of the binary tree:
if the original sequence of n elements is regarded as a complete binary tree, the last non-terminal node is the ⌊n/2⌋-th element (counting from 1), which is index n/2 - 1 in a 0-based array. So the sifting needs to start from the ⌊n/2⌋-th element.

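void external_sort(int a[],int len)
{
    int index;
    int array_len = len;

    // array_len/2 - 1 is the array index of the last non-terminal node,
    // i.e. the first node to be sifted in the current unordered heap
    for( index = array_len/2 - 1; index >= 0 ; index-- ){

        // index : start index of the range to adjust (the current parent node)
        // array_len - 1 : end index of the range in the sequential list
        build_max_heap(a,index,array_len - 1);
    }
}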

Take the sequence {10,5,9,1,2,7,8} as an example for analysis.

Example

#include <stdio.h>
#define MAX_SIZE 10
int wait_sort[MAX_SIZE] = {11,15,20,13,17,65,27,49,99,18};

//#define MAX_SIZE 3
//int wait_sort[MAX_SIZE] = {15,11,20};

void show(int * s,int length)
{
    int i ;
    for(i=0;i<length;i++){
        printf(" %d ",s[i]);
    }

    printf("\n");
}

void swap(int * i,int * j)
{
    int temp;
    temp = *i;
    *i = *j;
    *j = temp;
}

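void build_max_heap(int arr[], int start, int end)
{
    // Set up the parent index and the child index
    int dad = start;
    int son = dad * 2 + 1;
    while (son <= end)  // Compare only while the child index is within range
    {
        // First compare the two children and pick the larger one
        if (son + 1 <= end && arr[son] < arr[son + 1])
            son++;

        // If the parent is not smaller than the child, the adjustment is done: leave the loop
        if (arr[dad] >= arr[son]) {
            break;
        // Otherwise swap parent and child, then continue with the child and the grandchild
        } else {
            swap(&arr[dad], &arr[son]);
            dad = son;
            son = dad * 2 + 1;
        }

    }
}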

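// adjust
// Keep the path from the current node down to the terminal nodes ordered, then move up level by level. No redundant comparisons.
// Note: len is the index of the last element in the range, not the element count.
// This min-heap variant is not called by heap_sort below.
void build_min_heap_sort(int a[],int pos,int len)
{
    int temp = a[pos];
    int child;

    // pos = child means moving downward (toward the leaf nodes, comparing pairwise)
    // until a suitable position for temp is found.
    for(; 2 * pos + 1 <= len ; pos = child )
    {
        // First compute the index of the current node's left child in the sequential list
        child = 2 * pos + 1;

        // Pick the smaller of the left and right children
        if(child < len && a[child] > a[child + 1]){
            child++;
        }

        // The smaller of parent and child moves up, because we are building a min-heap
        if(a[child] < temp){
            a[pos] = a[child];
        } else {
            break;
        }

    }

    // Put temp into its final position
    a[pos] = temp;

}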

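void heap_sort(int a[],int len)
{

    int i;

    // First build a max-heap, starting from the last non-terminal node
    for(i = len/2 - 1;i>=0;i--){
        build_max_heap(a,i,len-1);
    }

    // Show the array once the heap has been built
    show(a,len);
    for(i = len - 1;i>0;i--){

        // With the max-heap built, swap the root to the end of the unsorted range,
        // which yields the largest value, then the next largest, ...
        swap(&a[0],&a[i]);
        build_max_heap(a,0,i-1);
    }

}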

int main(void)
{
    heap_sort(wait_sort,MAX_SIZE);
    show(wait_sort,MAX_SIZE);
    return 0;
}

Complexity analysis

Compared with quick sort, heap sort keeps a time complexity of O(n log n) even in the worst case, which is its advantage. At the same time, its auxiliary storage is O(1). It is an unstable sort.
Heap sort is not recommended for files with only a few records.

Reference

https://www.jianshu.com/p/6b526aa481b1

Origin blog.csdn.net/qq_30549099/article/details/107070275