[Data structure] Sequential storage structure of binary tree

Table of contents

The sequential storage structure of the binary tree:

1. The sequential structure of the binary tree
2. The concept and structure of the heap
3. Heap downward adjustment algorithm
4. Heap creation
5. Proof of the time complexity of building a heap
6. Heap insertion
7. Heap deletion
8. Heap code implementation
9. Heap sort
10. Top-K questions

The sequential storage structure of the binary tree:

Sequential structure of binary tree

Ordinary binary trees are poorly suited to array storage because many array slots may be wasted. A complete binary tree has no gaps, so it fits sequential storage well. In practice, the heap (a kind of binary tree) is usually stored in an array. Note that the heap discussed here and the heap in an operating system's virtual process address space are two different things: one is a data structure, the other is a region of memory managed by the operating system.
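Storing a complete binary tree in an array works because parent and child positions can be computed from indices alone. The sketch below shows that arithmetic with the root at index 0; the helper names ParentIndex, LeftChildIndex, RightChildIndex and PrintFamily are hypothetical and only serve this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Index arithmetic for a complete binary tree stored in an array,
   with the root at index 0. Helper names are illustrative only. */
int ParentIndex(int child)
{
	assert(child > 0); /* the root has no parent */
	return (child - 1) / 2;
}

int LeftChildIndex(int parent)
{
	return 2 * parent + 1;
}

int RightChildIndex(int parent)
{
	return 2 * parent + 2;
}

/* Prints each node of a[0..n-1] together with the children that exist. */
void PrintFamily(const int* a, int n)
{
	assert(a != NULL);
	for (int i = 0; i < n; ++i)
	{
		int left = LeftChildIndex(i);
		int right = RightChildIndex(i);
		printf("a[%d]=%d", i, a[i]);
		if (left < n)
			printf("  left=a[%d]=%d", left, a[left]);
		if (right < n)
			printf("  right=a[%d]=%d", right, a[right]);
		printf("\n");
	}
}
```

A child index that is greater than or equal to the array length simply means that child does not exist, which is why no pointers or sentinel values are needed.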

The concept and structure of the heap:

Given a set of keys K = {k_0, k_1, k_2, ..., k_{n-1}}, store all of its elements in a one-dimensional array in the level order of a complete binary tree. If k_i <= k_{2i+1} and k_i <= k_{2i+2} hold for every i = 0, 1, 2, ... whose child indices fall inside the array, the array is called a small heap (min heap); if k_i >= k_{2i+1} and k_i >= k_{2i+2} hold instead, it is called a large heap (max heap). A heap whose root is the largest node is also called a large root heap, and a heap whose root is the smallest node is also called a small root heap.

Properties of the heap:

1. The value of any node in the heap is never greater than the value of its parent (max heap), or never less than it (min heap).

2. The heap is always a complete binary tree.
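The first property can be checked mechanically: walk over every non-root index and compare the element with its parent. The following is a minimal sketch under that reading; IsMinHeap and IsMaxHeap are hypothetical names, not part of the heap implementation given later.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a[0..n-1] satisfies the min-heap property:
   no child is smaller than its parent. */
bool IsMinHeap(const int* a, int n)
{
	assert(a != NULL || n == 0);
	for (int child = 1; child < n; ++child)
	{
		if (a[child] < a[(child - 1) / 2])
			return false;
	}
	return true;
}

/* Returns true if a[0..n-1] satisfies the max-heap property:
   no child is larger than its parent. */
bool IsMaxHeap(const int* a, int n)
{
	assert(a != NULL || n == 0);
	for (int child = 1; child < n; ++child)
	{
		if (a[child] > a[(child - 1) / 2])
			return false;
	}
	return true;
}
```

For example, the sequence 100,60,70,50,32,65 passes IsMaxHeap, while 60,70,65,50,32,100 passes neither check. The second property needs no check here because an array with no gaps is always a complete binary tree.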

Multiple choice questions:

1. Which of the following key sequences is a heap? ( )
A 100,60,70,50,32,65
B 60,70,65,50,32,100
C 65,100,70,32,50,60
D 70,65,100,32,50,60
E 32,50,100,70,65,60
F 50,100,70,65,60,32
2. Given the min heap 8,15,10,21,34,16,12, the heap must be rebuilt after key 8 is deleted. How many comparisons between keys are made during this process? ( )
A 1
B 2
C 3
D 4
3. A set of records has the sort keys (5 11 7 2 3 17). The initial heap built by the heap sort method is ( )
A(11 5 7 2 3 17)
B(11 5 7 2 17 3)
C(17 11 7 2 3 5)
D(17 11 7 5 3 2)
E(17 7 11 3 5 2)
F(17 7 11 3 2 5)
4. After the top element 0 is deleted from the min heap [0,3,2,5,7,4,6,8], the result is ( )
A[3,2,5,7,4,6,8]
B[2,3,5,7,4,6,8]
C[2,3,4,5,7,8,6]
D[2,3,4,5,6,7,8]
Answers:
1.A
2.C
3.C
4.C

Implementation of the heap:

Heap downward adjustment algorithm:

Suppose we are given an array that is logically regarded as a complete binary tree. Starting from the root node, the downward adjustment algorithm can turn it into a min heap. The algorithm has one precondition: the left and right subtrees of the node being adjusted must already be heaps.
int array[] = {27,15,19,18,28,34,65,49,25,37};
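As a worked illustration, the sketch below applies a min-heap downward adjustment to the array above. It mirrors the AdjustDown routine in the heap implementation further down; SwapInt is a hypothetical local helper added so the block stands alone.

```c
#include <assert.h>
#include <stddef.h>

static void SwapInt(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Min-heap downward adjustment on a[0..n-1], starting at index parent.
   Precondition: both subtrees of parent are already min heaps. */
void AdjustDown(int* a, int n, int parent)
{
	assert(a != NULL);
	int minChild = parent * 2 + 1; /* start with the left child */
	while (minChild < n)
	{
		/* switch to the right child if it exists and is smaller */
		if (minChild + 1 < n && a[minChild + 1] < a[minChild])
			minChild++;
		if (a[minChild] < a[parent])
		{
			SwapInt(&a[minChild], &a[parent]);
			parent = minChild;
			minChild = parent * 2 + 1;
		}
		else
		{
			break; /* parent is no larger than both children */
		}
	}
}
```

Calling AdjustDown(array, 10, 0) moves 27 down three levels: it swaps with 15, then 18, then 25, leaving 15 18 19 25 28 34 65 49 27 37. Only the path from the root to one leaf is touched, which is why a single adjustment costs O(logN).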

Heap creation:

Below is an array that can logically be regarded as a complete binary tree but is not yet the heap we want. When the left and right subtrees of the root are not heaps, downward adjustment cannot be applied to the root directly. Instead, we start at the subtree rooted at the last non-leaf node (the parent of the last element) and apply downward adjustment to each node in turn, moving backward through the array until the root itself has been adjusted. The result is a heap.
int a[] = {1,5,3,8,7,6};
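The array above already happens to satisfy the min-heap property, so the sketch below builds a max heap from it instead, which makes every step visible. This choice is an assumption for illustration; AdjustDownMax, BuildMaxHeap and SwapInt are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

static void SwapInt(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Max-heap downward adjustment: the comparisons are the reverse
   of the min-heap version. */
static void AdjustDownMax(int* a, int n, int parent)
{
	int maxChild = parent * 2 + 1;
	while (maxChild < n)
	{
		/* switch to the right child if it exists and is larger */
		if (maxChild + 1 < n && a[maxChild + 1] > a[maxChild])
			maxChild++;
		if (a[maxChild] > a[parent])
		{
			SwapInt(&a[maxChild], &a[parent]);
			parent = maxChild;
			maxChild = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

/* Builds a max heap in place by adjusting every non-leaf node,
   from the last one back to the root. */
void BuildMaxHeap(int* a, int n)
{
	assert(a != NULL || n == 0);
	for (int i = (n - 2) / 2; i >= 0; --i)
	{
		AdjustDownMax(a, n, i);
	}
}
```

For {1,5,3,8,7,6} the loop visits indices 2, 1 and 0. The array becomes 1 5 6 8 7 3, then 1 8 6 5 7 3, and finally 8 7 6 5 1 3.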

  

Proof of the time complexity of building a heap: 
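A sketch of the usual argument, under the simplifying assumption that the tree is a perfect binary tree of height h (levels numbered from 1 at the root), so that N = 2^h - 1. Level i holds 2^{i-1} nodes, and a node on level i moves down at most h - i levels during downward adjustment. T(N) below denotes the total number of moves in the worst case.

```latex
\begin{align*}
T(N)  &= \sum_{i=1}^{h-1} 2^{\,i-1}(h-i)
       = 2^{0}(h-1) + 2^{1}(h-2) + \dots + 2^{h-2}\cdot 1 \\
2T(N) &= 2^{1}(h-1) + 2^{2}(h-2) + \dots + 2^{h-1}\cdot 1 \\
2T(N) - T(N) &= -(h-1) + \left(2^{1} + 2^{2} + \dots + 2^{h-1}\right) \\
T(N)  &= -(h-1) + \left(2^{h} - 2\right) = 2^{h} - 1 - h \\
      &= N - \log_2(N+1) = O(N)
\end{align*}
```

Building a heap by downward adjustment is therefore linear. The levels with the most nodes are near the bottom, and those nodes move the least.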

Insertion into the heap:

First append the new element (10 in this example) to the end of the array, then apply the upward adjustment algorithm until the heap property is restored.
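The sketch below shows that sequence on a fixed-size array. The starting heap values are an assumption chosen for illustration, and PushMinHeap and SwapInt are hypothetical names; AdjustUp follows the same logic as the routine in the heap implementation further down.

```c
#include <assert.h>
#include <stddef.h>

static void SwapInt(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Min-heap upward adjustment: move a[child] up while it is
   smaller than its parent. */
void AdjustUp(int* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0)
	{
		if (a[child] < a[parent])
		{
			SwapInt(&a[child], &a[parent]);
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}

/* Appends x to a min heap stored in a[0..*size-1].
   The caller guarantees that the array has room for one more element. */
void PushMinHeap(int* a, int* size, int capacity, int x)
{
	assert(a != NULL && size != NULL);
	assert(*size < capacity);
	a[*size] = x;
	(*size)++;
	AdjustUp(a, *size - 1);
}
```

Starting from the min heap 15 18 19 25 28 34 65 49 27 37, pushing 10 places it at index 10 and then swaps it with 28, 18 and 15 in turn, giving 10 15 19 25 18 34 65 49 27 37 28.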

Heap deletion:
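Deleting from a heap removes the top element. As the HeapPop routine in the implementation below does, the top is swapped with the last element, the size is reduced by one, and the new top is adjusted downward. The sketch below shows these steps on a plain array; PopMinHeap and SwapInt are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

static void SwapInt(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Min-heap downward adjustment on a[0..n-1], starting at index parent. */
static void AdjustDown(int* a, int n, int parent)
{
	int minChild = parent * 2 + 1;
	while (minChild < n)
	{
		if (minChild + 1 < n && a[minChild + 1] < a[minChild])
			minChild++;
		if (a[minChild] < a[parent])
		{
			SwapInt(&a[minChild], &a[parent]);
			parent = minChild;
			minChild = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

/* Removes and returns the smallest element of the min heap a[0..*size-1]. */
int PopMinHeap(int* a, int* size)
{
	assert(a != NULL && size != NULL);
	assert(*size > 0);
	SwapInt(&a[0], &a[*size - 1]); /* old top moves to the last slot */
	(*size)--;                     /* the last slot leaves the heap */
	AdjustDown(a, *size, 0);       /* restore the heap from the root */
	return a[*size];
}
```

Applied to the min heap [0,3,2,5,7,4,6,8], the call returns 0 and leaves [2,3,4,5,7,8,6]. Swapping with the last element, rather than shifting the whole array forward, keeps both subtrees of the root intact so that one downward adjustment is enough.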

The code implementation of the heap:
Heap.h

#pragma once
#include<stdio.h>
#include<assert.h>
#include<stdlib.h>
#include<stdbool.h>
#include<time.h>
typedef int HPDataType;
typedef struct Heap
{
	HPDataType* a;
	int size;
	int capacity;
}HP;
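//Defined in the header, so marked static inline to avoid
//multiple-definition link errors when several .c files include Heap.h
static inline void HeapPrint(HP* php)
{
	for (int i = 0; i < php->size; ++i)
	{
		printf("%d ", php->a[i]);
	}
	printf("\n");
}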
void AdjustUp(HPDataType* a, int child);
void AdjustDown(HPDataType* a, int n, int parent);
void HeapInit(HP* php);
void Swap(HPDataType* p1, HPDataType* p2);
void HeapDestory(HP* php);
//Insert x and keep the heap property
void HeapPush(HP* php, HPDataType x);
//Delete the top element of the heap
void HeapPop(HP* php);
//Return the top element of the heap
HPDataType HeapTop(HP* php);
bool HeapEmpty(HP* php);
int HeapSize(HP* php);

Heap.c

#include"Heap.h"
void HeapInit(HP* php)
{
	assert(php);
	php->a = NULL;
	php->size = php->capacity = 0;
}
void HeapDestory(HP* php)
{
	assert(php);
	free(php->a);
	php->a = NULL;
	php->capacity = php->size = 0;
}
void Swap(HPDataType* p1, HPDataType* p2)
{
	HPDataType tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}
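//Upward adjustment algorithm
//The number of upward moves is at most the number of levels of the complete binary tree,
//so upward adjustment (inserting x into the heap) runs in O(logN) time
void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	//Do not use while (parent >= 0) as the loop condition: when child == 0, parent is still 0,
	//so parent never becomes negative and the loop would end only through the break below
	while (child > 0)
	{
		//Change < to > to get a max heap
		if (a[child] < a[parent])
		{
			Swap(&a[child], &a[parent]);
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}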
//Insert x and keep the heap property
void HeapPush(HP* php, HPDataType x)
{
	assert(php);
	//Grow the array when it is full
	if (php->size == php->capacity)
	{
		int newCapacity = php->capacity == 0 ? 4 : php->capacity * 2;
		HPDataType* tmp = (HPDataType*)realloc(php->a, newCapacity * sizeof(HPDataType));
		if (tmp == NULL)
		{
			perror("realloc fail");
			exit(-1);
		}
		php->a = tmp;
		php->capacity = newCapacity;
	}
	//Append x at the end, then adjust upward from its position
	php->a[php->size] = x;
	php->size++;
	AdjustUp(php->a, php->size - 1);
}
//Precondition of downward adjustment: the left and right subtrees are both min heaps
void AdjustDown(HPDataType* a, int n, int parent)
{
	int minChild = parent * 2 + 1;
	while (minChild < n)
	{
		//Pick the smaller of the two children
		if (minChild + 1 < n && a[minChild + 1] < a[minChild])
		{
			minChild++;
		}
		if (a[minChild] < a[parent])
		{
			Swap(&a[minChild], &a[parent]);
			parent = minChild;
			minChild = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}
//Delete the top element, which brings the next smallest (or next largest) element to the top
//Time complexity is O(logN)
void HeapPop(HP* php)
{
	assert(php);
	assert(!HeapEmpty(php));
	Swap(&php->a[0], &php->a[php->size - 1]);
	php->size--;
	AdjustDown(php->a, php->size, 0);
}
//Return the top element of the heap
HPDataType HeapTop(HP* php)
{
	assert(php);
	assert(!HeapEmpty(php));
	return php->a[0];
}
bool HeapEmpty(HP* php)
{
	assert(php);
	return php->size == 0;
}
int HeapSize(HP* php)
{
	assert(php);
	return php->size;
}

Heap application:

Heap sort: 

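Heap sort code implementation:

//Heap sort: time complexity O(N*logN)
//Drawbacks of sorting through the Heap data structure (push everything, then pop):
//1. The heap data structure is complex to implement
//2. Popping the elements one by one into a new array costs O(N) extra space
//Overall idea: selection sort, selecting one value at a time and filling the array from back to front
//Ascending order: build a max heap
//Descending order: build a min heap
//To switch between ascending and descending, only flip the < and > comparisons in AdjustDown
//(the AdjustDown in Heap.c builds a min heap, so as written this function sorts in descending order)
//Why not build a min heap for ascending order?
//Once the first value is in place, the remaining values no longer form a heap (the parent-child
//relations are all broken), so the heap would have to be rebuilt for every selection, which is slower
void HeapSort(int* a, int n)
{
	//Build the heap by upward adjustment: O(N*logN)
	/*for (int i = 1; i < n; ++i)
	{
		AdjustUp(a, i);
	}*/
	//Build the heap by downward adjustment: O(N)
	//Both subtrees must already be heaps, so start from the last non-leaf node
	//(the parent of the last node) and adjust downward until the root has been adjusted
	//Why is this efficient? The last level is never adjusted, and the levels with the most nodes need the fewest moves
	for (int i = (n - 1 - 1) / 2; i >= 0; --i)//starting from i = n - 1 also works but wastes calls on leaves
	{
		AdjustDown(a, n, i);
	}
	//Selection phase (described for ascending order):
	//1. Build a max heap
	//2. Swap the first and last positions, exclude the last position from the heap,
	//   adjust downward to select the next largest, and repeat
	//After n-1 values have been selected, the smallest value is naturally left at the front
	int i = 1;
	while (i < n)
	{
		Swap(&a[0], &a[n - i]);
		AdjustDown(a, n - i, 0);
		++i;
	}
}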
int main()
{
	int a[] = { 15,1,19,25,8,34,65,4,27,7 };
	HeapSort(a, sizeof(a) / sizeof(int));
	for (size_t i = 0; i < sizeof(a) / sizeof(int); ++i)
	{
		printf("%d ",a[i]);
	}
	printf("\n");
	return 0;
}
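For comparison, here is a minimal self-contained sketch of the ascending variant, which needs the comparisons of the downward adjustment reversed so that it maintains a max heap. AdjustDownMax, HeapSortAscending and SwapInt are hypothetical names used only in this illustration.

```c
#include <assert.h>
#include <stddef.h>

static void SwapInt(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Max-heap downward adjustment on a[0..n-1], starting at index parent. */
static void AdjustDownMax(int* a, int n, int parent)
{
	int maxChild = parent * 2 + 1;
	while (maxChild < n)
	{
		if (maxChild + 1 < n && a[maxChild + 1] > a[maxChild])
			maxChild++;
		if (a[maxChild] > a[parent])
		{
			SwapInt(&a[maxChild], &a[parent]);
			parent = maxChild;
			maxChild = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

/* Sorts a[0..n-1] in ascending order, in place. */
void HeapSortAscending(int* a, int n)
{
	assert(a != NULL || n == 0);
	/* build a max heap in O(N) */
	for (int i = (n - 2) / 2; i >= 0; --i)
	{
		AdjustDownMax(a, n, i);
	}
	/* move the current maximum to the end, then shrink the heap by one */
	for (int end = n - 1; end > 0; --end)
	{
		SwapInt(&a[0], &a[end]);
		AdjustDownMax(a, end, 0);
	}
}
```

Each pass fixes one more element at the back of the array, so the sort needs no extra storage beyond a few local variables.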

Top-K questions:

Top-K problem: find the K largest or K smallest elements in a data set. The amount of data is usually large.
For example: the top 10 professional players, the world's top 500, the rich list, the 100 most active players in a game, and so on.
The simplest and most direct approach to the Top-K problem is sorting. However, if the amount of data is very large, sorting is not advisable (the data may not all fit in memory at once). A better way is to use a heap. The basic idea is as follows:
1. Build a heap from the first K elements of the data set
To find the K largest elements, build a min heap
To find the K smallest elements, build a max heap
2. Compare each of the remaining N-K elements with the top of the heap, and replace the top (then adjust downward) whenever the new element beats it: larger than the top of the min heap, or smaller than the top of the max heap
After all of the remaining N-K elements have been compared with the top of the heap, the K elements left in the heap are the K largest or K smallest elements sought.
//Top-K problem
//Write N random integers to the file named filename
void CreateDataFile(const char* filename, int N)
{
	FILE* fin = fopen(filename, "w");
	if (fin == NULL)
	{
		perror("fopen fail");
		return;
	}
	srand((unsigned int)time(NULL));
	for (int i = 0; i < N; ++i)
	{
		fprintf(fin, "%d ",rand());
	}
	fclose(fin);
}
void PrintTopK(const char* filename, int k)
{
	assert(filename);
	assert(k > 0);
	FILE* fout = fopen(filename, "r");
	if (fout == NULL)
	{
		perror("fopen fail");
		return;
	}
	int* minHeap = (int*)malloc(sizeof(int) * k);
	if (minHeap == NULL)
	{
		perror("malloc fail");
		fclose(fout);
		return;
	}
	//Read the first K values
	for (int i = 0; i < k; ++i)
	{
		if (fscanf(fout, "%d", &minHeap[i]) != 1)
		{
			fprintf(stderr, "the file holds fewer than %d values\n", k);
			free(minHeap);
			fclose(fout);
			return;
		}
	}
	//Build a min heap from the K values
	for (int j = (k - 2) / 2; j >= 0; --j)
	{
		AdjustDown(minHeap, k, j);
	}
	//Keep reading the remaining N-K values
	int val = 0;
	while (fscanf(fout, "%d", &val) == 1)
	{
		//A value larger than the heap top replaces it and sinks into place
		if (val > minHeap[0])
		{
			minHeap[0] = val;
			AdjustDown(minHeap, k, 0);
		}
	}
	for (int i = 0; i < k; ++i)
	{
		printf("%d ", minHeap[i]);
	}
	printf("\n");
	free(minHeap);
	fclose(fout);
}
int main()
{
	const char* filename = "Data.txt";
	int N = 10000;
	int K = 10;
	CreateDataFile(filename, N);
	PrintTopK(filename, K);
	return 0;
}
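The same idea can be tried without a file. The sketch below selects the K largest values from an in-memory array using a min heap of size K; TopKLargest and SwapInt are hypothetical names, and the result is left in heap order rather than sorted.

```c
#include <assert.h>
#include <stddef.h>

static void SwapInt(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Min-heap downward adjustment on a[0..n-1], starting at index parent. */
static void AdjustDown(int* a, int n, int parent)
{
	int minChild = parent * 2 + 1;
	while (minChild < n)
	{
		if (minChild + 1 < n && a[minChild + 1] < a[minChild])
			minChild++;
		if (a[minChild] < a[parent])
		{
			SwapInt(&a[minChild], &a[parent]);
			parent = minChild;
			minChild = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

/* Writes the k largest values of data[0..n-1] into out[0..k-1].
   out ends up as a min heap, so out[0] is the smallest of the k values.
   Requires 0 < k <= n. */
void TopKLargest(const int* data, int n, int* out, int k)
{
	assert(data != NULL && out != NULL);
	assert(k > 0 && k <= n);
	/* step 1: build a min heap from the first k values */
	for (int i = 0; i < k; ++i)
	{
		out[i] = data[i];
	}
	for (int j = (k - 2) / 2; j >= 0; --j)
	{
		AdjustDown(out, k, j);
	}
	/* step 2: a larger value replaces the heap top and sinks into place */
	for (int i = k; i < n; ++i)
	{
		if (data[i] > out[0])
		{
			out[0] = data[i];
			AdjustDown(out, k, 0);
		}
	}
}
```

Only K values are held at any time, and each of the N-K remaining values costs at most one O(logK) adjustment, so the whole pass takes O(N*logK) time.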

Origin blog.csdn.net/qq_66767938/article/details/129726236