[Data structure] Sequential structure and implementation of binary tree

Table of contents

1. Sequential structure of binary tree

2. Concept and structure of heap

3. Implementation of heap

3.1 Heap downward adjustment algorithm

3.2 Creation of heap

3.3 Time complexity of heap construction

3.4 Heap insertion

3.5 Heap deletion

3.6 Code implementation of heap

4. Application of heap

4.1 Heap sort

4.2 TOP-K problem


1. Sequential structure of binary tree

Ordinary binary trees are not well suited to array storage because many array slots may be wasted. A complete binary tree is a much better fit for sequential storage. In practice, the heap (a kind of binary tree) is usually stored in an array. Note that the heap discussed here and the heap in an operating system's virtual process address space are two different things: one is a data structure, the other is a region of memory managed by the operating system.

2. Concept and structure of heap

If there is a set of keys K=\left \{ k_{0},k_{1},k_{2},...,k_{n-1} \right \} whose elements are stored in a one-dimensional array in the order of a complete binary tree, and they satisfy K_{i}<=K_{2*i+1} and K_{i}<=K_{2*i+2} (or K_{i}>=K_{2*i+1} and K_{i}>=K_{2*i+2}) for i = 0, 1, 2, ..., wherever the child index is less than n, then it is called a small heap (or a big heap). The heap with the largest root node is called the maximum heap or large root heap, and the heap with the smallest root node is called the minimum heap or small root heap.

Properties of heap:

  • The value of a node in the heap is always no greater than the value of its parent node (big heap), or always no less than it (small heap);
  • The heap is always a complete binary tree.

3. Implementation of heap

3.1 Heap downward adjustment algorithm

Now we are given an array, which is logically regarded as a complete binary tree. We can turn it into a small heap by applying the downward adjustment algorithm starting from the root node. The downward adjustment algorithm has a precondition: the left and right subtrees of the node being adjusted must already be heaps.

int array[] = { 27,15,19,18,28,34,65,49,25,37 };

3.2 Creation of heap

Below we are given an array. It can be logically regarded as a complete binary tree, but it is not necessarily a heap yet, so we use an algorithm to build it into one. Because the left and right subtrees of the root node are not guaranteed to be heaps, a single downward adjustment from the root is not enough. Instead, we start from the subtree rooted at the last non-leaf node and apply downward adjustment to each node in turn, moving backward until the root node has been adjusted; the whole array is then a heap.

int a[] = { 1,5,3,8,7,6 };

3.3 Time complexity of heap construction

Because a heap is a complete binary tree, and a full binary tree is also a complete binary tree, we use a full binary tree in the proof for simplicity (time complexity is an asymptotic estimate, so a few extra nodes do not affect the final result):
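A sketch of the standard counting argument, under the following assumptions (the symbols h, i and T(N) are introduced here for illustration): the full binary tree has height h and N = 2^h - 1 nodes, levels are numbered from 1 at the root, level i holds 2^{i-1} nodes, and during bottom-up construction a node on level i moves down at most h - i levels.

```latex
\begin{aligned}
T(N)  &= 2^{0}(h-1) + 2^{1}(h-2) + \cdots + 2^{h-2}\cdot 1 \\
2T(N) &= 2^{1}(h-1) + 2^{2}(h-2) + \cdots + 2^{h-1}\cdot 1 \\
T(N)  &= 2T(N) - T(N) = -(h-1) + 2^{1} + 2^{2} + \cdots + 2^{h-1} = 2^{h} - 1 - h \\
T(N)  &= N - \log_{2}(N+1) \approx N \qquad \text{since } N = 2^{h}-1,\; h = \log_{2}(N+1)
\end{aligned}
```

As a quick check, for h = 3 (N = 7) the sum is 1*2 + 2*1 = 4, which matches 2^3 - 1 - 3 = 4.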

Therefore: the time complexity of building the heap by downward adjustment is O(N).

3.4 Heap insertion

First append the new value (10 in this example) to the end of the array, and then apply the upward adjustment algorithm until the heap property is satisfied again.

3.5 Heap deletion

Deleting from a heap means deleting the data at the top of the heap. Swap the top element with the last element in the array, remove the last element of the array, and then apply the downward adjustment algorithm starting from the root.

3.6 Code implementation of heap

// heap.h

#pragma once

#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#include <stdbool.h>
#include <string.h>
#include <time.h>

typedef int HPDataType;
typedef struct Heap
{
	HPDataType* a;
	int size;
	int capacity;
}HP;

// Upward adjustment
void AdjustUp(HPDataType* a, int child);
// Downward adjustment
void AdjustDown(HPDataType* a, int n, int parent);
// Swap two elements
void Swap(HPDataType* p1, HPDataType* p2);
// Print the heap
void HeapPrint(HP* php);
// Initialize an empty heap
void HeapInit(HP* php);
// Initialize the heap from an array
void HeapInitArray(HP* php, int* a, int n);
// Destroy the heap
void HeapDestroy(HP* php);
// Insert into the heap
void HeapPush(HP* php, HPDataType x);
// Delete the top of the heap
void HeapPop(HP* php);
// Get the data at the top of the heap
HPDataType HeapTop(HP* php);
// Check whether the heap is empty
bool HeapEmpty(HP* php);

// heap.c

#include "heap.h"

void HeapInit(HP* php)
{
	assert(php);
	php->a = NULL;
	php->size = 0;
	php->capacity = 0;
}

void HeapInitArray(HP* php, int* a, int n)
{
	assert(php);
	assert(a);
	php->a = (HPDataType*)malloc(sizeof(HPDataType) * n);
	if (php->a == NULL)
	{
		perror("malloc fail");
		exit(-1);
	}
	php->size = n;
	php->capacity = n;
	memcpy(php->a, a, sizeof(HPDataType) * n);
	// Build the heap by adjusting each element upward in turn
	for (int i = 1; i < n; i++)
	{
		AdjustUp(php->a, i);
	}
}

void HeapDestroy(HP* php)
{
	assert(php);
	free(php->a);
	php->a = NULL;
	php->size = php->capacity = 0;
}

void Swap(HPDataType* p1, HPDataType* p2)
{
	HPDataType tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0)
	{
		if (a[child] < a[parent])
		{
			Swap(&a[child], &a[parent]);
			child = parent;
			parent = (parent - 1) / 2;
		}
		else
		{
			break;
		}
	}
}

void AdjustDown(HPDataType* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		// Pick the smaller of the two children
		if (child + 1 < n && a[child + 1] < a[child])
		{
			++child;
		}
		if (a[child] < a[parent])
		{
			Swap(&a[child], &a[parent]);
			// Continue adjusting downward
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

void HeapPush(HP* php, HPDataType x)
{
	assert(php);
	// Grow the array when it is full
	if (php->size == php->capacity)
	{
		int newCapacity = php->capacity == 0 ? 4 : php->capacity * 2;
		HPDataType* tmp = (HPDataType*)realloc(php->a, sizeof(HPDataType) * newCapacity);
		if (tmp == NULL)
		{
			perror("realloc fail");
			exit(-1);
		}
		php->a = tmp;
		php->capacity = newCapacity;
	}
	php->a[php->size] = x;
	php->size++;
	AdjustUp(php->a, php->size - 1);
}

void HeapPrint(HP* php)
{
	assert(php);
	for (int i = 0; i < php->size; i++)
	{
		printf("%d ", php->a[i]);
	}
	printf("\n");
}

void HeapPop(HP* php)
{
	assert(php);
	assert(php->size > 0);
	Swap(&php->a[0], &php->a[php->size - 1]);
	--php->size;
	AdjustDown(php->a, php->size, 0);
}

HPDataType HeapTop(HP* php)
{
	assert(php);
	assert(php->size > 0);
	return php->a[0];
}

bool HeapEmpty(HP* php)
{
	assert(php);
	return php->size == 0;
}

4. Application of heap

4.1 Heap sort

Heap sort uses a heap to sort an array, and it is divided into two steps:

1. Build the heap

        1) Ascending order: build a big heap

        2) Descending order: build a small heap

2. Use the idea of heap deletion to sort: repeatedly swap the top of the heap with the last element of the unsorted part, shrink the heap by one, and adjust downward from the root

Downward adjustment is used in both heap creation and heap deletion, so once you master downward adjustment, you can complete heap sort.

// Heap sort (with the small-heap AdjustDown above, the result is in descending order)

void HeapSort(int* a, int n)
{
	// Build the heap by downward adjustment, starting from the last non-leaf node
	for (int i = (n - 1 - 1) / 2; i >= 0; i--)
	{
		AdjustDown(a, n, i);
	}
	int end = n - 1;
	while (end > 0)
	{
		Swap(&a[0], &a[end]);
		AdjustDown(a, end, 0);
		--end;
	}
}

4.2 TOP-K problem

TOP-K problem: find the K largest or K smallest elements in a data set. Generally, the amount of data is relatively large.

For example: the top 10 in a profession, the Fortune 500, a rich list, the top 100 active players in a game, etc.

For the Top-K problem, the simplest and most direct approach is sorting. However, if the amount of data is very large, sorting is not advisable (the data may not all fit in memory at once). A better way is to use a heap. The basic idea is as follows:

1. Use the first K elements in the data set to build a heap

        1) For the K largest elements, build a small heap

        2) For the K smallest elements, build a big heap

2. Compare each of the remaining N-K elements with the top element of the heap in sequence. If an element should be kept instead of the top (for example, it is larger than the top of the small heap when looking for the K largest), replace the top element with it and adjust downward

After the remaining N-K elements have been compared with the top element of the heap in turn, the K elements left in the heap are the K smallest or K largest elements required.

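// Top-K problem for data stored in a file
void PrintTopK(const char* filename, int k)
{
	assert(filename);
	assert(k > 0);
	FILE* fin = fopen(filename, "r");
	if (fin == NULL)
	{
		perror("fopen fail");
		return;
	}
	int* minheap = (int*)malloc(sizeof(int) * k);
	if (minheap == NULL)
	{
		perror("malloc fail");
		fclose(fin);
		return;
	}
	// Read the first k values from the file
	for (int i = 0; i < k; i++)
	{
		if (fscanf(fin, "%d", &minheap[i]) != 1)
		{
			fprintf(stderr, "fewer than %d values in %s\n", k, filename);
			free(minheap);
			fclose(fin);
			return;
		}
	}
	// Build a small heap from the first k values
	for (int i = (k - 2) / 2; i >= 0; --i)
	{
		AdjustDown(minheap, k, i);
	}
	// Compare each of the remaining n-k values with the top of the heap;
	// stop at end of file or at the first token that is not an integer
	int x = 0;
	while (fscanf(fin, "%d", &x) == 1)
	{
		if (x > minheap[0])
		{
			// Replace the top of the heap and adjust downward
			minheap[0] = x;
			AdjustDown(minheap, k, 0);
		}
	}

	for (int i = 0; i < k; i++)
	{
		printf("%d ", minheap[i]);
	}
	printf("\n");

	free(minheap);
	fclose(fin);
}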

void CreateNDate()
{
	// Generate test data
	int n = 10000000;
	srand((unsigned int)time(0));
	const char* file = "data.txt";
	FILE* fin = fopen(file, "w");
	if (fin == NULL)
	{
		perror("fopen error");
		return;
	}
	for (int i = 0; i < n; ++i)
	{
		int x = (rand() + i) % 10000000;
		fprintf(fin, "%d\n", x);
	}
	fclose(fin);
}

int main()
{
	CreateNDate();
	PrintTopK("data.txt", 5);
	return 0;
}

Origin blog.csdn.net/m0_73156359/article/details/133716227