Chapter 9: C Language Data Structure and Algorithm Elementary Heap

Foreword

A heap is a complete binary tree.


1. Definition of the heap

We have already covered trees, binary trees and related concepts. The heap explained in this chapter is built on the complete binary tree. On top of being a complete binary tree, a heap also satisfies an ordering property: every child node is always less than or equal to its parent node, or every child node is always greater than or equal to its parent node.

If every parent node is less than or equal to its child nodes, we call it a small root heap.
If every parent node is greater than or equal to its child nodes, we call it a big root heap.

The logical structure and physical structure of the heap:

Logically, the heap is a complete binary tree. Physically, its nodes are stored level by level in a contiguous array. Because the physical structure is an array, the code that follows is array-based, and we use the idea of a dynamic sequence table to store the heap.
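The mapping between the tree and the array comes down to index arithmetic. The following is a minimal sketch, assuming 0-based indexing as in the code later in this chapter; the helper names ParentIndex, LeftChildIndex, RightChildIndex and IsBigHeap are made up for illustration and are not part of the heap interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Index arithmetic for a complete binary tree stored in an array (0-based). */
int ParentIndex(int child)      { return (child - 1) / 2; }
int LeftChildIndex(int parent)  { return parent * 2 + 1; }
int RightChildIndex(int parent) { return parent * 2 + 2; }

/* Returns true if a[0..n-1] satisfies the big root heap property,
   that is, no child is greater than its parent. */
bool IsBigHeap(const int* a, int n)
{
	assert(a);
	for (int child = 1; child < n; ++child)
	{
		if (a[ParentIndex(child)] < a[child])
			return false;
	}
	return true;
}
```

For the array {100, 70, 60, 30, 20, 50}, the children of index 1 (value 70) are at indexes 3 and 4 (values 30 and 20), and IsBigHeap returns true.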

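2. Implementation of the heap

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

typedef int HPDataType;

typedef struct Heap
{
	HPDataType* a;   // dynamically allocated array that stores the elements
	int size;        // number of elements currently in the heap
	int capacity;    // number of elements the array can hold
}HP;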

3. Heap interface functions

1. Initialization

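void HeapInit(HP* php)
{
	assert(php);
	php->size = 0;
	php->capacity = 4;
	// Allocate room for `capacity` elements; the element type is HPDataType, not HP
	HPDataType* cur = (HPDataType*)malloc(sizeof(HPDataType) * php->capacity);
	assert(cur);
	php->a = cur;
}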

2. Destroy

void HeapDestory(HP* php)
{
	assert(php);
	php->size = 0;
	php->capacity = 0;
	free(php->a);
	php->a = NULL;
}

3. Insert

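void HeapPush(HP* php, HPDataType x)
{
	assert(php);

	if (php->capacity == php->size)
	{
		// Grow: double the capacity (the element size is sizeof(HPDataType))
		int newCapacity = php->capacity * 2;
		HPDataType* cur = (HPDataType*)realloc(php->a, sizeof(HPDataType) * newCapacity);
		assert(cur);
		php->a = cur;
		php->capacity = newCapacity;
	}

	// Append at the tail, then move the new element up to restore the heap
	php->a[php->size++] = x;
	AdjustUp(php->a, php->size - 1);
}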

We insert the new value at the last position and then let it move up.
If, for example, 100 is inserted and needs to move up, it only has to be compared with its own ancestors on the path to the root, so at most O(logN) comparisons are needed. That is what the AdjustUp function does.

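// Helper assumed by the swap(&x, &y) calls in this chapter (its definition is not listed in the original)
void swap(HPDataType* x, HPDataType* y)
{
	HPDataType tmp = *x;
	*x = *y;
	*y = tmp;
}

// Upward adjustment for a big root heap
void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0)
	{
		if (a[parent] < a[child])
		{
			// The child is greater than its parent: swap and continue upward
			swap(&a[parent], &a[child]);
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}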
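To make the upward movement concrete, here is a small sketch that works on a plain int array instead of the HP struct. The names SwapInt, SiftUp and PushBigHeap are hypothetical and only serve this example; the caller is assumed to provide an array with room for one more element.

```c
#include <assert.h>

void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* Upward adjustment for a big root heap, same idea as AdjustUp. */
void SiftUp(int* a, int child)
{
	assert(a);
	int parent = (child - 1) / 2;
	while (child > 0 && a[parent] < a[child])
	{
		SwapInt(&a[parent], &a[child]);
		child = parent;
		parent = (child - 1) / 2;
	}
}

/* Appends x to a big root heap of n elements and returns the new size.
   Assumption: the array a has room for at least n + 1 elements. */
int PushBigHeap(int* a, int n, int x)
{
	assert(a);
	a[n] = x;
	SiftUp(a, n);
	return n + 1;
}
```

Pushing 100 into {70, 56, 30, 25, 15, 10} places it at index 6, swaps it with its parent 30 at index 2, then with the root 70, giving {100, 56, 70, 25, 15, 10, 30}. Only the two ancestors of the new element were touched.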

4. Delete

void HeapPop(HP* php)
{
	assert(php);
	assert(php->size > 0);
	// Move the top to the tail and shrink the heap by one
	swap(&php->a[0], &php->a[--php->size]);

	AdjustDown(php->a, 0, php->size);
}

What we delete here is the top of the heap. Deleting the first element of an array directly costs O(N), because every later element has to shift forward, and the shift would also scramble the parent-child relationships. Deleting the tail costs only O(1), so we first swap the top of the heap with the tail element, shrink the size by one, and then move the new top element down the heap.

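// Downward adjustment for a big root heap
void AdjustDown(HPDataType* a, int parent, int size)
{
	int child = parent * 2 + 1;
	while (child < size)
	{
		// Make child point to the larger of the two children
		if (child + 1 < size && a[child + 1] > a[child])
		{
			++child;
		}

		if (a[child] > a[parent])
		{
			// The child is greater than its parent: swap and continue downward
			swap(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			// The child is not greater than its parent: the heap is restored
			break;
		}
	}
}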
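The swap-then-sink deletion can be traced on a plain int array. This is a sketch under the same big root heap assumption; SwapInt, SiftDown and PopBigHeap are illustrative names, not functions from the heap interface.

```c
#include <assert.h>

void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* Downward adjustment for a big root heap of n elements. */
void SiftDown(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		/* pick the larger child */
		if (child + 1 < n && a[child + 1] > a[child])
			++child;
		if (a[child] <= a[parent])
			break;
		SwapInt(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Removes the top of a big root heap of n elements and returns the new size.
   The removed value is left at a[n - 1], just past the shrunken heap. */
int PopBigHeap(int* a, int n)
{
	assert(a && n > 0);
	SwapInt(&a[0], &a[n - 1]);
	SiftDown(a, n - 1, 0);
	return n - 1;
}
```

Starting from {100, 56, 70, 25, 15, 10, 30}, the top 100 is swapped with the tail 30, and 30 then sinks one level below 70. The heap part becomes {70, 56, 30, 25, 15, 10}, with 100 parked at index 6.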

5. Empty check

bool HeapEmpty(HP* php)
{
	assert(php);

	return php->size == 0;
}

6. Number of elements

int HeapSize(HP* php)
{
	assert(php);

	return php->size;
}
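The TopK function later in this chapter calls HeapTop, which is not listed among the interface functions. A plausible minimal sketch, assuming it simply returns the root element stored at index 0, could look like this (the struct is repeated only so that the snippet compiles on its own):

```c
#include <assert.h>

typedef int HPDataType;

typedef struct Heap
{
	HPDataType* a;
	int size;
	int capacity;
}HP;

/* Assumed behavior: return the root of the heap without removing it. */
HPDataType HeapTop(HP* php)
{
	assert(php);
	assert(php->size > 0);

	return php->a[0];
}
```

Reading the top costs O(1) because the root always sits at the first position of the array.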

4. The principle of heap application

Big heap: the root is the largest number in the heap
Small heap: the root is the smallest number in the heap
Each time we take the largest (or smallest) number from the top and then restore the heap, the second largest (or smallest) number rises to the top. Repeating this yields a decreasing or increasing sequence.
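The "take the root, restore the heap, repeat" loop can be sketched for a small heap as follows. SwapInt, SiftDownSmall and DrainSmallHeap are hypothetical names; the function consumes the input array and writes the values to a separate output array, which is an assumption made to keep the example short.

```c
#include <assert.h>

void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* Downward adjustment for a small root heap of n elements. */
void SiftDownSmall(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		/* pick the smaller child */
		if (child + 1 < n && a[child + 1] < a[child])
			++child;
		if (a[child] >= a[parent])
			break;
		SwapInt(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Builds a small root heap in place, then repeatedly takes the root.
   out[0..n-1] receives the values in increasing order; heap is consumed. */
void DrainSmallHeap(int* heap, int n, int* out)
{
	assert(heap && out);
	for (int i = (n - 2) / 2; i >= 0; --i)
		SiftDownSmall(heap, n, i);

	for (int k = 0; n > 0; ++k)
	{
		out[k] = heap[0];        /* the root is the current minimum */
		heap[0] = heap[--n];     /* move the tail to the root */
		SiftDownSmall(heap, n, 0);
	}
}
```

Draining {5, 3, 8, 1, 9, 2} this way fills out with 1, 2, 3, 5, 8, 9. With a big heap and the comparisons reversed, the same loop would produce a decreasing sequence.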

5. Heap sort

1. Build a heap

The basis of heap sort is to build the elements of the array into a heap:
Method 1: Tail insertion with upward adjustment
Starting from the first element, we treat each following element as newly inserted at the tail and adjust it upward to its proper position. The elements before the current one always form a heap, which is what makes the upward adjustment valid.
Number of adjustments: F(N) => O(N*logN)

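void AdjustUp(int* arr, int child)
{
    int parent = (child - 1) / 2;
    while (child > 0)
    {
        if (arr[child] > arr[parent])
        {
            // swap is the pointer-based helper used by the heap code above
            swap(&arr[child], &arr[parent]);
            child = parent;
            parent = (child - 1) / 2;
        }
        else break;
    }
}
void Heap_Sort(int* arr, int size)
{
    // Build the heap: arr[0..i-1] is already a heap when arr[i] is adjusted
    for (int i = 0; i < size; i++)
    {
        AdjustUp(arr, i);
    }

    //.....
}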

Method 2: Downward adjustment, starting from the last non-leaf node
Downward adjustment is normally applied at the root node, but it is only correct when the two subtrees directly below the adjusted node are already heaps. Therefore we start from the last non-leaf node, which is the parent of the last element at index (size-1-1)/2, and adjust each small subtree in turn, working from the smallest subtrees up to the whole tree.
Number of adjustments: F(N) = N - log2(N+1) => O(N)
We first make sure that the two subtrees are heaps, and only then adjust their parent.

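void AdjustDown(int* arr, int size, int parent)
{
    // Ascending order: build a big heap; descending order: build a small heap
    int child = parent * 2 + 1;
    while (child < size)
    {
        // Pick the larger child
        if (child + 1 < size && arr[child + 1] > arr[child]) child++;
        if (arr[child] > arr[parent])
        {
            // swap is the pointer-based helper used by the heap code above
            swap(&arr[child], &arr[parent]);
            parent = child;
            child = parent * 2 + 1;
        }
        else break;
    }
}
void Heap_Sort(int* arr, int size)
{
    // Build a big root heap, starting from the last non-leaf node
    for (int i = (size - 1 - 1) / 2; i >= 0; i--)
    {
        AdjustDown(arr, size, i);
    }
    //.........
}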
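The difference between the two build methods can be observed by counting swaps. The sketch below is only an illustration; SwapInt, BuildByAdjustUp and BuildByAdjustDown are hypothetical names, and both functions build a big root heap in place and return the number of swaps they performed.

```c
#include <assert.h>

void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* Method 1: treat each element as a tail insertion and adjust it upward. */
int BuildByAdjustUp(int* a, int n)
{
	assert(a);
	int swaps = 0;
	for (int i = 1; i < n; ++i)
	{
		int child = i;
		while (child > 0 && a[(child - 1) / 2] < a[child])
		{
			SwapInt(&a[(child - 1) / 2], &a[child]);
			child = (child - 1) / 2;
			++swaps;
		}
	}
	return swaps;
}

/* Method 2: adjust downward from the last non-leaf node back to the root. */
int BuildByAdjustDown(int* a, int n)
{
	assert(a);
	int swaps = 0;
	for (int i = (n - 2) / 2; i >= 0; --i)
	{
		int parent = i;
		int child = parent * 2 + 1;
		while (child < n)
		{
			if (child + 1 < n && a[child + 1] > a[child])
				++child;
			if (a[child] <= a[parent])
				break;
			SwapInt(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
			++swaps;
		}
	}
	return swaps;
}
```

On the already ascending input {1, 2, 3, 4, 5, 6, 7}, every inserted element has to climb to the root, so Method 1 needs 10 swaps, while Method 2 needs 4, which agrees with N - log2(N+1) = 7 - 3 for N = 7.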

2. Sorting

Suppose we want to sort in ascending order. If we build a small root heap and take out the root node every time, the remaining elements no longer form a heap once the root is removed, so the heap has to be rebuilt after every removal. Each rebuild costs O(N), and the time complexity of the whole sort becomes O(N^2).

So we change our thinking and build a big root heap, where the root node is the largest element. We swap the root node with the last element and then "delete" the last element by moving the tail index forward, so the maximum value is now stored in the last position of the array. Then we move the new root node down to restore the heap structure, after which the top of the heap is the second largest value. We swap again, so that the second largest element lands in the penultimate position. Continuing in the same way, all elements end up arranged in ascending order.

Each downward adjustment of the root node costs O(logN), and it is done for a total of N elements, so the time complexity of the sort is O(N*logN).

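// Exchange two ints through pointers
void swap(int* x, int* y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}
// Downward adjustment for a big root heap of `size` elements
void AdjustDown(int* arr, int size, int parent)
{
    int child = parent * 2 + 1;
    while (child < size)
    {
        // Pick the larger child
        if (child + 1 < size && arr[child + 1] > arr[child]) child++;
        if (arr[child] > arr[parent])
        {
            swap(&arr[child], &arr[parent]);
            parent = child;
            child = parent * 2 + 1;
        }
        else break;
    }
}
void Heap_Sort(int* arr, int size)
{
    // Build a big root heap, starting from the last non-leaf node
    for (int i = (size - 1 - 1) / 2; i >= 0; i--)
    {
        AdjustDown(arr, size, i);
    }

    // Move the current maximum to the end, then restore the heap on arr[0..end-1]
    for (int end = size - 1; end > 0; end--)
    {
        swap(&arr[0], &arr[end]);
        AdjustDown(arr, end, 0);
    }
}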
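The comment in the build step notes that descending order calls for a small heap. As a sketch of that variant, the following mirrors the ascending version with the comparisons reversed; SwapInt, SiftDownSmall and HeapSortDescending are names chosen for this example only.

```c
#include <assert.h>

void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* Downward adjustment for a small root heap of `size` elements. */
void SiftDownSmall(int* arr, int size, int parent)
{
	int child = parent * 2 + 1;
	while (child < size)
	{
		/* pick the smaller child */
		if (child + 1 < size && arr[child + 1] < arr[child])
			++child;
		if (arr[child] >= arr[parent])
			break;
		SwapInt(&arr[child], &arr[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Sorts arr[0..size-1] in descending order using a small root heap. */
void HeapSortDescending(int* arr, int size)
{
	assert(arr);
	/* build the small root heap from the last non-leaf node */
	for (int i = (size - 2) / 2; i >= 0; --i)
		SiftDownSmall(arr, size, i);

	/* move the current minimum to the end, then restore the heap */
	for (int end = size - 1; end > 0; --end)
	{
		SwapInt(&arr[0], &arr[end]);
		SiftDownSmall(arr, end, 0);
	}
}
```

Calling HeapSortDescending on {3, 1, 4, 1, 5, 9, 2, 6} rearranges it into 9, 6, 5, 4, 3, 2, 1, 1. Each pass parks the smallest remaining value at the end, so the array fills with small values from the back.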

6. Application of the heap: Top-K

1. What is the Top-K problem?

The Top-K problem asks us to select the K largest or the K smallest numbers from a large set of numbers.

2. Solution

Suppose there are one billion numbers, too many to hold in memory as a single heap. To find the K largest, we use the first K elements to build a small root heap with K elements, so the larger elements in the heap "sink to the bottom" and the smallest of the K candidates sits at the root. We then keep reading the remaining elements and compare each one with the root node. If it is larger than the root node, it replaces the root node, and the new root is then adjusted downward. Why compare with the root? Because in a small root heap the root is the smallest of the K candidates, it is the one most likely to be replaced. When all the data has been read, the K elements left in the heap are the answer.

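// Find the K largest of N numbers, or the K smallest of N numbers
// 1. Build a big/small heap from all N numbers, pop K times, taking the top each time
//    Time complexity: N + logN * K
//    Space complexity: O(1) extra, but all N numbers may not fit in memory
// 2. Build a small/big heap of K numbers; a number larger/smaller than the top replaces the top, then adjust downward
//    Time complexity: K + (N-K)*logK => O(N*logK)
//    Space complexity: O(K)
// Note: the HP heap implemented above is a big root heap, so this version keeps the K smallest numbers.
// For the K largest, use a small root heap and replace the top when a[i] is larger.
// HeapTop is not listed in this chapter; it is assumed to return php->a[0].
void TopK(int* a, int n, int k)
{
	HP hp;
	HeapInit(&hp);
	// Build a heap from the first K numbers
	for (int i = 0; i < k; ++i)
	{
		HeapPush(&hp, a[i]);
	}

	// Compare each of the remaining N-K numbers with the top; if it is smaller, it replaces the top
	for (int i = k; i < n; ++i)
	{
		if (a[i] < HeapTop(&hp))
		{
			HeapPop(&hp);
			HeapPush(&hp, a[i]);
		}
	}

	// Print the K numbers kept in the heap, taking the top each time
	while (k--)
	{
		printf("%d ", HeapTop(&hp));
		HeapPop(&hp);
	}

	// Release the array owned by the heap
	HeapDestory(&hp);
}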
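The small root heap approach described in the Solution paragraph, which keeps the K largest numbers, can be sketched directly on a K-element buffer without the HP struct. This is an illustration under stated assumptions: SwapInt, SiftDownSmall and TopKLargest are hypothetical names, the input is an in-memory array standing in for data that would normally be read in batches, and the caller provides the output buffer of K ints.

```c
#include <assert.h>

void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* Downward adjustment for a small root heap of n elements. */
void SiftDownSmall(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		/* pick the smaller child */
		if (child + 1 < n && a[child + 1] < a[child])
			++child;
		if (a[child] >= a[parent])
			break;
		SwapInt(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Stores the k largest values of a[0..n-1] in out[0..k-1].
   out is left arranged as a small root heap, so out[0] is the smallest of them. */
void TopKLargest(const int* a, int n, int k, int* out)
{
	assert(a && out && k > 0 && k <= n);

	/* small root heap from the first k elements */
	for (int i = 0; i < k; ++i)
		out[i] = a[i];
	for (int i = (k - 2) / 2; i >= 0; --i)
		SiftDownSmall(out, k, i);

	/* a larger element evicts the root, which is the smallest candidate */
	for (int i = k; i < n; ++i)
	{
		if (a[i] > out[0])
		{
			out[0] = a[i];
			SiftDownSmall(out, k, 0);
		}
	}
}
```

For the input {7, 2, 9, 4, 11, 5, 8, 3, 10, 6} with k = 3, the buffer ends up holding 9, 10 and 11, with 9 at the root. Replacing the root in place and sinking it avoids the separate pop and push, but the cost per element is still O(logK).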



Summary

The heap is a logically complete binary tree and physically a dynamic sequence table.
In the duel of hope and despair, victory will belong to hope if you hold it with courage and firm hands. — Pliny

Origin blog.csdn.net/yanyongfu523/article/details/129582526