[Data structure] Heap and heap sort (including the top-k problem and the adjustment methods) (concise, with code)

Table of contents

1. The logical structure and physical structure of the heap

    1. Array representation of a binary tree

    2. Parent-child relationship in the heap

    3. The basic concept of max-heaps and min-heaps

2. Two adjustment methods of the heap

    1. Upward adjustment (building a heap this way is O(nlogn))

    2. Downward adjustment (building a heap this way is O(n))

3. Building a heap

    1. Building a heap by downward adjustment

    2. Building a heap by upward adjustment

4. Heap sort (sorting with the idea of heap deletion)

    1. Ascending order: build a max-heap

    2. Descending order: build a min-heap

    3. Heap sort complexity analysis

5. Practical application (top-k problem)


1. The logical structure and physical structure of the heap

A heap satisfies two conditions:

1. The value of every node is never greater than the value of its parent (max-heap), or never less than the value of its parent (min-heap)

2. The heap is always a complete binary tree

    1. Array representation of a binary tree

Storing a binary tree in an array is only suitable for complete binary trees; for any other tree, a lot of space would be wasted on empty positions.

    2. Parent-child relationship in the heap
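
The index arithmetic used by the adjustment code later in this post (`parent = (child - 1) / 2`, `child = parent * 2 + 1`) can be sketched as three small helpers. The names `LeftChild`, `RightChild` and `Parent` are made up for this illustration and do not appear in the original code; the array is assumed to be 0-indexed.

```c
#include <assert.h>

/* Index of the left child of the node stored at index i. */
int LeftChild(int i)
{
	return 2 * i + 1;
}

/* Index of the right child of the node stored at index i. */
int RightChild(int i)
{
	return 2 * i + 2;
}

/* Index of the parent of the node stored at index i.
   The root (i == 0) has no parent, so it is rejected here. */
int Parent(int i)
{
	assert(i > 0);
	return (i - 1) / 2;
}
```

Both children of a node map back to the same parent because integer division discards the remainder: `(2*i + 1 - 1) / 2` and `(2*i + 2 - 1) / 2` are both `i`.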

    3. The basic concept of max-heaps and min-heaps

Max-heap (large root heap): every parent node in the tree is greater than or equal to its children

Min-heap (small root heap): every parent node in the tree is less than or equal to its children
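
As a small illustration of the definition, the max-heap property can be checked by comparing every non-root element with its parent. This is only a sketch; `IsMaxHeap` is a hypothetical helper name, not something from the original code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a[0..n-1], read as a complete binary tree,
   satisfies the max-heap property (parent >= child everywhere). */
bool IsMaxHeap(const int* a, int n)
{
	assert(a != NULL || n == 0);
	for (int child = 1; child < n; ++child)
	{
		int parent = (child - 1) / 2;
		if (a[child] > a[parent])
		{
			return false;
		}
	}
	return true;
}
```

Flipping the comparison to `a[child] < a[parent]` would give the corresponding min-heap check.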


2. Two adjustment methods of the heap

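PS: Both methods work on an array that is already a heap apart from one element, and adjust the element at the given index (parent or child) into its correct position. A single adjustment costs O(logn); the complexities in the headings below refer to building a whole heap with that method.

    1. Upward adjustment (building a heap this way is O(nlogn))

[parameters: array, index of the child]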

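// Assumed definitions (not shown in the original post)
typedef int HPDataType;

void Swap(HPDataType* p1, HPDataType* p2)
{
	HPDataType tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

// Precondition: everything before position child already forms a (max-)heap
void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	// Do not use "while (parent >= 0)": (0 - 1) / 2 is 0 in C, so parent never goes negative
	while (child > 0)
	{
		if (a[child] > a[parent])
		{
			Swap(&a[child], &a[parent]);
			child = parent;
			parent = (child - 1) / 2;
		}
		else
		{
			break;
		}
	}
}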
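
One typical use of upward adjustment is insertion: place the new value at the end of the array and adjust it upward. The sketch below assumes `HPDataType` is `int` and that the caller owns a fixed-capacity array; `HeapPush` and its parameters are hypothetical names chosen for this example.

```c
#include <assert.h>

typedef int HPDataType;  /* assumption: the element type is int */

void Swap(HPDataType* p1, HPDataType* p2)
{
	HPDataType tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Max-heap upward adjustment, same logic as in the text above. */
void AdjustUp(HPDataType* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0 && a[child] > a[parent])
	{
		Swap(&a[child], &a[parent]);
		child = parent;
		parent = (child - 1) / 2;
	}
}

/* Hypothetical helper: insert x into the max-heap a[0..*size-1]. */
void HeapPush(HPDataType* a, int* size, int capacity, HPDataType x)
{
	assert(*size < capacity);
	a[*size] = x;        /* new element goes to the end */
	AdjustUp(a, *size);  /* then moves up to its place  */
	++*size;
}
```

Pushing 3, 9, 5, 7 in that order into an empty array leaves 9 at index 0, since each new value only climbs while it is larger than its parent.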

    2. Downward adjustment (building a heap this way is O(n))

Note: Downward adjustment has a precondition: the left and right subtrees of the node being adjusted must already be max-heaps (or min-heaps).

[To meet this condition when building a heap, start adjusting from the last parent node at the bottom of the tree and work back toward the root; the adjustment itself is shown below.]

[parameters: array, number of elements, index of the parent]

Implementation:

// Precondition: the left and right subtrees of parent are both max-heaps
void AdjustDown(HPDataType* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		// Pick the larger of the left and right children
		if (child + 1 < n && a[child+1] > a[child])
		{
			++child;
		}

		if (a[child] > a[parent])
		{
			Swap(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}
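
Heap deletion, which heap sort later builds on, is the typical use of downward adjustment: swap the top with the last element, shrink the heap by one, then adjust the new top downward. The following is a sketch assuming `HPDataType` is `int`; `HeapPop` is a hypothetical name for this example.

```c
#include <assert.h>

typedef int HPDataType;  /* assumption: the element type is int */

void Swap(HPDataType* p1, HPDataType* p2)
{
	HPDataType tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Max-heap downward adjustment, same logic as in the text above. */
void AdjustDown(HPDataType* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (child + 1 < n && a[child + 1] > a[child])
		{
			++child;  /* use the larger child */
		}
		if (a[child] <= a[parent])
		{
			break;
		}
		Swap(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Hypothetical helper: remove and return the largest element. */
HPDataType HeapPop(HPDataType* a, int* size)
{
	assert(*size > 0);
	Swap(&a[0], &a[*size - 1]);  /* old top moves to the end   */
	--*size;                     /* and leaves the heap        */
	AdjustDown(a, *size, 0);     /* restore the heap from root */
	return a[*size];
}
```

Because the removed value stays in the array just past the shrunken heap, repeating this until the heap is empty leaves the array sorted in ascending order.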

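3. Building a heap

    1. Building a heap by downward adjustment

Downward adjustment requires that the left and right subtrees of the node being adjusted are already max-heaps (or min-heaps).

Therefore, to build a heap, start from the last parent node (the parent of the last element) and iterate backward to the root, adjusting each node downward.

void HeapSort(int* a, int n)
{
	// Build the heap by downward adjustment
	// (n - 1 - 1) / 2 is the parent of the last element, i.e. the last parent node
	for (int i = (n - 1 - 1) / 2; i >= 0; --i)
	{
		AdjustDown(a, n, i);
	}
	// Sorting step omitted here; try implementing it yourself first
}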

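    2. Building a heap by upward adjustment

Iterate forward from index 1, adjusting each element upward into the heap formed by the elements before it.

void HeapSort(int* a, int n)
{
	// Build the heap by upward adjustment
	for (int i = 1; i < n; ++i)
	{
		AdjustUp(a, i);
	}
	// Sorting step omitted here; try implementing it yourself first
}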
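
To compare the two construction methods concretely, one option is to count swaps. The sketch below instruments `Swap` with a global counter; `g_swaps`, `BuildUpSwaps` and `BuildDownSwaps` are names invented for this experiment and are not part of the original code.

```c
#include <assert.h>

long g_swaps = 0;  /* hypothetical counter, for the experiment only */

void Swap(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
	++g_swaps;
}

void AdjustUp(int* a, int child)
{
	int parent = (child - 1) / 2;
	while (child > 0 && a[child] > a[parent])
	{
		Swap(&a[child], &a[parent]);
		child = parent;
		parent = (child - 1) / 2;
	}
}

void AdjustDown(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (child + 1 < n && a[child + 1] > a[child])
		{
			++child;
		}
		if (a[child] <= a[parent])
		{
			break;
		}
		Swap(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Build a max-heap by upward adjustment; return the number of swaps. */
long BuildUpSwaps(int* a, int n)
{
	assert(n >= 0);
	g_swaps = 0;
	for (int i = 1; i < n; ++i)
	{
		AdjustUp(a, i);
	}
	return g_swaps;
}

/* Build a max-heap by downward adjustment; return the number of swaps. */
long BuildDownSwaps(int* a, int n)
{
	assert(n >= 0);
	g_swaps = 0;
	for (int i = (n - 1 - 1) / 2; i >= 0; --i)
	{
		AdjustDown(a, n, i);
	}
	return g_swaps;
}
```

On an ascending array of 1023 elements (a full tree of 10 levels), every element inserted by the upward build climbs all the way to the root, which gives 8194 swaps, while the downward build needs fewer than 1023.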

4. Heap sort (sorting with the idea of heap deletion)

    1. Ascending order: build a max-heap

Analysis:

1. Each swap of the heap top with the last element of the heap moves the current largest value to the end of the unsorted range.

2. After each swap, the last position is excluded and the rest of the tree is adjusted downward from the root, so the largest of the remaining values rises to the root.

// Ascending order -- build a max-heap -- O(N*logN)
void HeapSort(int* a, int n)
{
	// Build the heap by upward adjustment -- O(N*logN)
	/*for (int i = 1; i < n; ++i)
	{
		AdjustUp(a, i);
	}*/

	// Build the heap by downward adjustment -- O(N)
	for (int i = (n - 1 - 1) / 2; i >= 0; --i)
	{
		AdjustDown(a, n, i);
	}

	// Sort: repeatedly move the top to the end and shrink the heap -- O(N*logN)
	int end = n - 1;
	while (end > 0)
	{
		Swap(&a[end], &a[0]);
		AdjustDown(a, end, 0);

		--end;
	}
}

    2. Descending order: build a min-heap

       (same as above, with the comparisons in the adjustment function flipped so that the smaller value moves up)
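
A minimal sketch of the descending variant follows. It mirrors the ascending code with the two comparisons reversed; `AdjustDownMin` and `HeapSortDesc` are names chosen for this illustration.

```c
#include <assert.h>
#include <stddef.h>

void Swap(int* p1, int* p2)
{
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}

/* Min-heap downward adjustment: the smaller child moves up. */
void AdjustDownMin(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		/* Pick the smaller of the two children */
		if (child + 1 < n && a[child + 1] < a[child])
		{
			++child;
		}
		if (a[child] >= a[parent])
		{
			break;
		}
		Swap(&a[child], &a[parent]);
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Sort a[0..n-1] in descending order using a min-heap. */
void HeapSortDesc(int* a, int n)
{
	assert(a != NULL || n == 0);

	/* Build a min-heap by downward adjustment */
	for (int i = (n - 1 - 1) / 2; i >= 0; --i)
	{
		AdjustDownMin(a, n, i);
	}

	/* Move the current minimum to the end, then shrink the heap */
	for (int end = n - 1; end > 0; --end)
	{
		Swap(&a[end], &a[0]);
		AdjustDownMin(a, end, 0);
	}
}
```

Each pass parks the smallest remaining value at the end of the unsorted range, so the array ends up ordered from largest to smallest.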

    3. Heap sort complexity analysis

1. By calculation: building a heap by upward adjustment takes O(nlogn) time, while building a heap by downward adjustment takes O(n) time.

2. From the structure of the binary tree: in a tree of height h, the last level holds about 2^(h-1) nodes, and moving a node between the root and the last level costs up to h-1 steps, so the last level alone costs about 2^(h-1)*(h-1).

The last level holds about half of all n nodes, and h is about log2(n), so after dropping constant factors and lower-order terms the time complexity is O(nlogn).
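
The two heap-building bounds can be worked out explicitly. The derivation below is a sketch under a simplifying assumption that is not stated in the original: the tree is a full binary tree with h levels, so n = 2^h - 1, and every node travels the maximum possible distance.

```latex
% Assumption: full binary tree with h levels, n = 2^h - 1.
% Level i (counting from 1 at the root) holds 2^{i-1} nodes.

% Building by downward adjustment:
% a node on level i moves down at most h - i levels.
\[
T_{\mathrm{down}}(n) = \sum_{i=1}^{h} 2^{i-1}\,(h - i)
                     = 2^{h} - 1 - h
                     = n - \log_2(n+1)
                     = O(n)
\]

% Building by upward adjustment:
% a node on level i moves up at most i - 1 levels.
\[
T_{\mathrm{up}}(n) = \sum_{i=1}^{h} 2^{i-1}\,(i - 1)
                   = (h - 2)\,2^{h} + 2
                   = (n+1)\bigl(\log_2(n+1) - 2\bigr) + 2
                   = O(n \log n)
\]
```

The last term of the upward sum (i = h) is exactly the 2^(h-1)*(h-1) cost of the last level: the level with the most nodes also has the longest path, which is why the upward build is more expensive. In the downward sum the most numerous nodes have the shortest paths.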

See also the author's related post on binary trees: Basic introduction to (binary) trees (concise and easy to understand, including code), YYDsis' Blog, CSDN


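5. Practical application (top-k problem)

To find the k largest values among n values, keep a min-heap of the first k values, then compare every remaining value with the heap top: if it is larger, it replaces the top and is adjusted downward. When the input is exhausted, the heap holds the k largest values.
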
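#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

// Generate test data: write n random integers in [0, 9999] to data.txt
void CreateNDate()
{
	int n = 10000000;
	srand((unsigned int)time(NULL));
	const char* file = "data.txt";
	FILE* fin = fopen(file, "w");
	if (fin == NULL)
	{
		perror("fopen error");
		return;
	}

	for (int i = 0; i < n; ++i)
	{
		int x = rand() % 10000;
		fprintf(fin, "%d\n", x);
	}

	fclose(fin);
}

// Print the k largest values stored in file.
// NOTE: AdjustDown here must be the min-heap variant (both comparisons
// flipped to '<'); the max-heap version shown earlier would not keep
// the smallest of the k candidates on top.
void PrintTopK(const char* file, int k)
{
	// 1. Build a min-heap from the first k values
	int* topk = (int*)malloc(sizeof(int) * k);
	assert(topk);

	FILE* fout = fopen(file, "r");
	if (fout == NULL)
	{
		perror("fopen error");
		free(topk);
		return;
	}
	// Read the first k values
	for (int i = 0; i < k; ++i)
	{
		if (fscanf(fout, "%d", &topk[i]) != 1)
		{
			fprintf(stderr, "file has fewer than %d values\n", k);
			free(topk);
			fclose(fout);
			return;
		}
	}
	for (int i = (k-2)/2; i >= 0; --i)
	{
		AdjustDown(topk, k, i);
	}
	// 2. Compare each remaining value with the heap top;
	//    if it is larger, it replaces the top and is adjusted downward
	int val = 0;
	// Each successful fscanf call advances to the next value in the file
	while (fscanf(fout, "%d", &val) == 1)
	{
		if (val > topk[0])
		{
			topk[0] = val;
			AdjustDown(topk, k, 0);
		}
	}
	for (int i = 0; i < k; i++)
	{
		printf("%d ", topk[i]);
	}
	printf("\n");

	free(topk);
	fclose(fout);
}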
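
The same idea can be tried without any file I/O. The sketch below runs the top-k selection on an in-memory array with an explicit min-heap adjustment; `AdjustDownMin` and `TopK` are hypothetical names for this example, and the caller is assumed to free the returned array.

```c
#include <assert.h>
#include <stdlib.h>

/* Min-heap downward adjustment: the smaller child moves up. */
void AdjustDownMin(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (child + 1 < n && a[child + 1] < a[child])
		{
			++child;
		}
		if (a[child] >= a[parent])
		{
			break;
		}
		int tmp = a[child];
		a[child] = a[parent];
		a[parent] = tmp;
		parent = child;
		child = parent * 2 + 1;
	}
}

/* Hypothetical helper: returns a malloc'd array holding the k largest
   values of data[0..n-1] in min-heap order (smallest of them first),
   or NULL if allocation fails. The caller frees the result. */
int* TopK(const int* data, int n, int k)
{
	assert(data != NULL && k > 0 && k <= n);

	int* heap = (int*)malloc(sizeof(int) * (size_t)k);
	if (heap == NULL)
	{
		return NULL;
	}

	/* 1. Build a min-heap from the first k values */
	for (int i = 0; i < k; ++i)
	{
		heap[i] = data[i];
	}
	for (int i = (k - 2) / 2; i >= 0; --i)
	{
		AdjustDownMin(heap, k, i);
	}

	/* 2. A larger value evicts the current smallest candidate */
	for (int i = k; i < n; ++i)
	{
		if (data[i] > heap[0])
		{
			heap[0] = data[i];
			AdjustDownMin(heap, k, 0);
		}
	}
	return heap;
}
```

For the input {5, 1, 9, 3, 7, 8, 2} with k = 3, the heap ends up holding 7, 8 and 9, with 7 on top. A min-heap is used because the heap top is then always the weakest of the current candidates, so one comparison decides whether a new value belongs in the top k. The extra memory is O(k) and the running time is O(n log k).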

  

Origin blog.csdn.net/YYDsis/article/details/130109695