Data Structures: Hand-Writing the Seven Classic Sorts (with diagrams and animated demonstrations)


insertion sort

Insertion sort comes in two forms: direct insertion sort and Shell sort, and Shell sort is the one that is really worth studying.

Shell sort is built on top of direct insertion sort, so we learn direct insertion sort first.

direct insertion sort

Direct insertion sort works like arranging the cards in your hand before playing poker. Suppose our hand currently reads 2 4 5 3; how do we put it in order?
The method is very simple: insert the 3 between the 2 and the 4, and the hand is sorted again. Insertion sort is exactly this process.

The basic principle diagram of insertion sort is as follows

(figure: the basic principle of insertion sort)
Here end is defined as the index of the last element of the portion that has already been sorted, and tmp holds the element waiting to be inserted. In the figure, tmp is indeed smaller than a[end], so the next step is to move tmp to its proper position inside the sorted data in front of end.

The idea of the algorithm is: if tmp is smaller than a[end], shift a[end] one position backward and move end to the previous element, then compare and shift again, until tmp is no longer smaller than a[end] or end walks off the front of the array. If end goes off the array, tmp becomes the new first element; otherwise tmp is written into the slot just behind end.

(figure: shifting elements backward until tmp's position is found)

This completes a single insertion; repeating it for every element completes the sort:

(figure: a complete insertion sort pass)
As you can see, the overall idea of insertion sort is not complicated, and the code is correspondingly simple. The real value of direct insertion sort is that it lays the groundwork for Shell sort.

Insertion sort is implemented as follows:

void InsertSort(int* a, int n)
{
	for (int i = 0; i < n - 1; i++)
	{
		int end = i;         // last index of the already-sorted prefix
		int tmp = a[i + 1];  // the element to be inserted
		while (end >= 0)
		{
			if (a[end] > tmp)
			{
				// shift the larger element one slot backward
				a[end + 1] = a[end];
				end--;
			}
			else
			{
				break;
			}
		}

		a[end + 1] = tmp;
	}
}

The time complexity of direct insertion sort is easy to analyze: it is O(N^2), the same order as bubble sort, so it is not efficient on its own. Its main purpose here is to pave the way for
Shell sort, a very important sort whose efficiency on large data sets is comparable to quick sort.

Shell sort

Having learned insertion sort, we should summarize its two key properties

  1. Insertion sort is very efficient on data that is already almost sorted; in that case it approaches linear time.
  2. In general, however, insertion sort is inefficient, because it can only move data one position at a time.

Shell sort is designed to address exactly these two points:
insertion sort is extremely efficient on a sequence that is already nearly sorted, but it can only move one element by one position at a time. So Shell invented Shell sort: its core operation is almost identical to insertion sort, but it adds a pre-sorting phase in front. The pre-sorting is the crucial part; it brings the whole data set close to sorted order before the final pass.

How is the pre-sorting implemented?

First the data is split into groups; suppose we split it into 3 groups (gap = 3), so that every third element belongs to the same group. The following picture demonstrates this process

(figure: splitting the data into gap = 3 groups)

After grouping, perform insertion sort on each group of elements individually

(figure: each group after its own insertion sort)
At this point the sequence is already very close to sorted, and a final insertion sort over the whole data plays exactly to insertion sort's strength.

(figure: the final insertion sort over the nearly sorted data)

This is only the basic idea of Shell sort; the concrete code is a bit more involved.

The next question is: why is the gap 3? If the amount of data is particularly large, should it still be 3? How should the gap be chosen?
To answer this we need to understand what the gap is used for and how its size affects the final result.

The gap is the step size chosen for the pre-sorting stage; pre-sorting with a gap makes the data relatively ordered. The larger the gap, the more groups there are and the fewer elements each group contains, so elements can move long distances quickly but the result is only roughly ordered. The smaller the gap, the finer the pre-sort and the closer the result is to truly sorted. When the gap is 1 there is only a single group, and that pass is an ordinary insertion sort which produces the final order. For example, with gap = 3 one of the groups consists of a[0], a[3], a[6], ...

The code is implemented as follows:

void ShellSort(int* a, int n)
{
	int gap = n;
	while (gap > 1)
	{
		gap = gap / 3 + 1;  // shrink the gap; the +1 guarantees the last pass uses gap == 1
		for (int i = 0; i < n - gap; i++)
		{
			// gap-spaced insertion sort: insert a[end + gap] into its own group
			int end = i;
			int tmp = a[end + gap];
			while (end >= 0)
			{
				if (a[end] > tmp)
				{
					a[end + gap] = a[end];
					end = end - gap;
				}
				else
				{
					break;
				}
			}
			a[end + gap] = tmp;
		}
	}
}

There are two key points to understand here

gap = gap / 3 + 1;  // what is this line for?
gap == 1            // what does this case mean?

gap = gap / 3 + 1 guarantees that the final value of gap is exactly 1,
and when gap is 1 the pass is plain insertion sort on a sequence that is already close to sorted, which is exactly where insertion sort shines.
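
As a quick sanity check, the following small snippet (not from the original article, just an illustration) prints the gap sequence this update rule produces for a hypothetical n of 100 and shows that it always ends at exactly 1:

#include <stdio.h>

int main(void)
{
	int gap = 100;          // pretend the array has 100 elements
	while (gap > 1)
	{
		gap = gap / 3 + 1;  // the same update rule used in ShellSort
		printf("%d ", gap); // prints: 34 12 5 2 1
	}
	return 0;
}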

The time complexity of Shell sort is quite difficult to derive; it requires building a fairly complex mathematical model, so we simply state the conclusion here: the time complexity of Shell sort is generally quoted as about O(N^1.3). Its overall efficiency is high, and it is well worth studying in depth.

selection sort

direct selection sort

The basic version of selection sort is very simple to implement; the idea is as follows

(figures: selection sort picks the minimum and maximum of the unsorted range on each pass)

One thing to note here: maxi may initially coincide with begin, so after Swap(&a[begin], &a[mini]) the maximum has already been moved to mini's old position, and swapping a[end] with a[maxi] would then use a stale index. The small correction in the code below fixes this (Swap is the simple two-int helper defined in the heap sort section further down).

void SelectSort(int* a, int n)
{
	int begin = 0, end = n - 1;
	while (begin < end)
	{
		// find the indices of the min and max in [begin, end]
		int maxi = begin, mini = begin;
		for (int i = begin; i <= end; i++)
		{
			if (a[i] > a[maxi])
			{
				maxi = i;
			}

			if (a[i] < a[mini])
			{
				mini = i;
			}
		}

		Swap(&a[begin], &a[mini]);
		// if maxi overlapped with begin, the max was just moved to mini, so fix it up
		if (begin == maxi)
		{
			maxi = mini;
		}

		Swap(&a[end], &a[maxi]);

		++begin;
		--end;
	}
}

heap sort

Heap sort was explained in detail in a previous article (Data structures: hand-writing the heap, with diagrams), so I won't repeat the explanation here and will give the code directly.

void Swap(int* p, int* c)
{
	int tmp = *p;
	*p = *c;
	*c = tmp;
}

void AdjustDown(int* a, int n, int parent)
{
	// sift the parent down through a max heap of size n
	int child = parent * 2 + 1;
	while (child < n)
	{
		// pick the larger of the two children
		if (child + 1 < n && a[child + 1] > a[child])
		{
			child++;
		}
		if (a[parent] < a[child])
		{
			Swap(&a[parent], &a[child]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

void HeapSort(int* a, int n)
{
	// build a max heap, sifting down from the last non-leaf node
	for (int i = (n - 2) / 2; i >= 0; i--)
	{
		AdjustDown(a, n, i);
	}

	// repeatedly move the max to the end, then re-heapify the remaining range
	int end = n - 1;
	while (end)
	{
		Swap(&a[0], &a[end]);
		AdjustDown(a, end, 0);
		end--;
	}
}

swap sort

Bubble Sort

(figure: bubble sort)

Bubble sort is usually the first sort a beginner learns, and its efficiency is very low.

void BubbleSort(int* a, int n)
{
	for (int i = 0; i < n - 1; i++)
	{
		int flag = 0;  // records whether any swap happened in this pass
		for (int j = 0; j < n - i - 1; j++)
		{
			if (a[j] > a[j + 1])
			{
				flag = 1;
				int tmp = a[j];
				a[j] = a[j + 1];
				a[j + 1] = tmp;
			}
		}
		if (flag == 0)
		{
			// no swaps means the array is already sorted; stop early
			break;
		}
	}
}

The focus of what follows is quick sort, which is normally the most widely used sorting algorithm.

quick sort

Quick sort is in practice the fastest of the common sorting algorithms (given a large amount of data). For this reason, the sort routines that ship with many libraries use quick sort as the underlying implementation, for example the C library's qsort and the STL's sort, so it is well worth learning properly.
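
As a hedged illustration (not part of the original article), here is how qsort from <stdlib.h> is typically called; the caller only supplies a comparison function and the library handles the sorting internally:

#include <stdio.h>
#include <stdlib.h>

// comparison callback for qsort: negative / zero / positive for less / equal / greater
static int cmp_int(const void* p1, const void* p2)
{
	return (*(const int*)p1) - (*(const int*)p2);  // fine for small ints; beware overflow in general
}

int main(void)
{
	int a[] = { 5, 2, 9, 1, 7 };
	size_t n = sizeof(a) / sizeof(a[0]);

	qsort(a, n, sizeof(a[0]), cmp_int);

	for (size_t i = 0; i < n; i++)
		printf("%d ", a[i]);  // prints: 1 2 5 7 9
	printf("\n");
	return 0;
}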

First explain its basic idea

The basic idea is to select one element as the key and, through a series of steps, move all data smaller than the key to its left and all data larger than the key to its right. Then recurse into the part on the left of the key and the part on the right, repeating the same operation in each recursive call until the whole array is sorted. The first key question, then, is how to make the key a dividing point with smaller elements on its left and larger elements on its right.

There are several versions of this partition algorithm; we introduce them one by one.

Hoare version

The inventor of quick sort is Hoare, and as its creator he naturally also wrote the original partition algorithm, so of course we start with the most classic version.

The following shows a schematic diagram of the Hoare algorithm

(figures: walkthrough and flow chart of the Hoare partition)
After studying the walkthrough and the flow chart above, you can probably see the basic idea of the Hoare algorithm, but some questions remain. For example, is the element that is finally exchanged with the key (the 3 in the figure above) guaranteed to be smaller than the key? If it were larger, wouldn't a larger element end up on the left of the key?

Let's explain why this cannot happen.

The answer lies in the question of who moves first, left or right. In the algorithm above, right moves first. Why?
Suppose the element where they finally meet were not 3 but 8 (anything greater than the key). Then right, moving toward the front, would simply skip over the 8 and keep looking for a smaller value; in the worst case it keeps going until it runs into left, and the position of left holds either the key's own slot or a small value that was swapped there in an earlier exchange. So as long as right moves first, the position where right and left finally meet is guaranteed to hold a value no greater than the key!

This algorithm is actually not that easy to write correctly; there are several variables and edge cases to control. The implementation is as follows

int PartSort1(int* a, int left, int right)
{
	int keyi = left;  // the key stays at the left end
	while (left < right)
	{
		// right moves first, looking for a value smaller than the key
		while (left < right && a[right] >= a[keyi])
		{
			right--;
		}

		// then left looks for a value larger than the key
		while (left < right && a[left] <= a[keyi])
		{
			left++;
		}

		Swap(&a[left], &a[right]);
	}
	Swap(&a[keyi], &a[left]);

	return left;  // the key's final position
}

Important points

  1. keyi is initialized to left rather than 0, because in later recursive calls the leftmost element's subscript is not necessarily 0
  2. The inner while loops must keep checking that left is smaller than right, to prevent the two indices from crossing each other
  3. The function returns left, the key's final position, which becomes the boundary of the next two recursive ranges

Implementation of Quick Sort

void QuickSort(int* a, int begin, int end)
{
	if (begin >= end)
	{
		return;
	}
	int keyi = PartSort1(a, begin, end);

	QuickSort(a, begin, keyi - 1);
	QuickSort(a, keyi + 1, end);
}

The following variants only require replacing PartSort1 with a different partition function.

Hole-digging method

(figures: walkthrough of the hole-digging partition)
The code is implemented as follows:

int PartSort2(int* a, int left, int right)
{
	int key = a[left];  // save the key, leaving a "hole" at its position
	int hole = left;
	while (left < right)
	{
		// right looks for a value smaller than the key
		while (left < right && a[right] >= key)
		{
			right--;
		}

		// drop it into the hole; right's position becomes the new hole
		a[hole] = a[right];
		hole = right;

		// left looks for a value larger than the key
		while (left < right && a[left] <= key)
		{
			left++;
		}

		a[hole] = a[left];
		hole = left;
	}

	a[hole] = key;  // the key fills the final hole

	return hole;
}

void QuickSort(int* a, int begin, int end)
{
	if (begin >= end)
	{
		return;
	}
	int keyi = PartSort2(a, begin, end);

	QuickSort(a, begin, keyi - 1);
	QuickSort(a, keyi + 1, end);
}

This version is very simple to implement, requires no extra care, and is easier to understand than the first algorithm.

Front-and-back pointer method

The principle is illustrated in the figures below

(figures: walkthrough of the front-and-back pointer partition)
The code implementation logic is as follows

int PartSort3(int* a, int left, int right)
{
	int cur = left + 1;  // cur scans forward through the range
	int prev = left;     // prev marks the end of the "smaller than key" region
	int keyi = left;
	while (cur <= right)
	{
		if (a[cur] < a[keyi])
		{
			// grow the smaller-than-key region and pull a[cur] into it
			++prev;
			Swap(&a[prev], &a[cur]);
		}
		cur++;
	}

	Swap(&a[prev], &a[keyi]);  // put the key on the boundary

	return prev;
}

void QuickSort(int* a, int begin, int end)
{
	if (begin >= end)
	{
		return;
	}
	int keyi = PartSort3(a, begin, end);

	QuickSort(a, begin, keyi - 1);
	QuickSort(a, keyi + 1, end);
}

In effect, cur runs ahead looking for values smaller than the key; whenever it finds one, prev advances by one and the two positions are swapped, otherwise cur simply keeps moving forward. As a result, all the data between prev and cur is always larger than the key.

Recursion expansion diagram of quick sort

After understanding how quick sort works, it helps to draw its recursion expansion diagram yourself to consolidate that understanding; a small tracing sketch follows the figure below.

(figure: recursion expansion diagram of quick sort)
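
If you would rather let the program print the expansion for you, the following hypothetical sketch (not from the original article) adds a depth parameter to the recursion and prints each call's range, indented by depth; it assumes one of the PartSort functions above is available:

#include <stdio.h>

void QuickSortTrace(int* a, int begin, int end, int depth)
{
	// print the current sub-range, indented by recursion depth
	for (int i = 0; i < depth; i++)
		printf("  ");
	printf("[%d, %d]\n", begin, end);

	if (begin >= end)
		return;

	int keyi = PartSort1(a, begin, end);  // any of the partition versions above works
	QuickSortTrace(a, begin, keyi - 1, depth + 1);
	QuickSortTrace(a, keyi + 1, end, depth + 1);
}

// call it as QuickSortTrace(a, 0, n - 1, 0) to see the whole recursion tree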

Quick sort optimization

Quick sort is indeed excellent in many respects, but the code above alone does not solve every problem. Suppose the given sequence is already in ascending order: then the key we pick each time (the leftmost element) is the smallest value, every partition is completely unbalanced, and the time complexity degenerates to O(N^2). If, on the other hand, the element picked each time happens to be the median, the partitions are balanced and the complexity is the ideal O(N*logN). So how do we optimize?

Common optimizations are median-of-three key selection and switching to insertion sort for small sub-ranges of the recursion (a sketch of the latter follows below).
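
The small sub-range cutoff is not implemented in the original article, so here is only a minimal sketch of the idea, assuming the InsertSort and PartSort1 defined earlier: once a sub-range shrinks below a threshold, the recursion overhead outweighs its benefit, so we finish that range with direct insertion sort instead (the threshold of 10 is just an illustrative choice).

void QuickSortOpt(int* a, int begin, int end)
{
	if (begin >= end)
		return;

	if (end - begin + 1 <= 10)                   // small sub-range: stop recursing
	{
		InsertSort(a + begin, end - begin + 1);  // reuse the direct insertion sort above
	}
	else
	{
		int keyi = PartSort1(a, begin, end);
		QuickSortOpt(a, begin, keyi - 1);
		QuickSortOpt(a, keyi + 1, end);
	}
}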

Median of three

As the name suggests, take the three elements at the beginning, the end and the middle, pick whichever of the three is the median, and use that one as the key.

int GetMid(int* a, int left, int right)
{
	// return the index of the median of a[left], a[midi] and a[right]
	int midi = (left + right) / 2;
	if (a[left] < a[midi])
	{
		if (a[midi] < a[right])
		{
			return midi;
		}
		else if (a[left] > a[right])
		{
			return left;
		}
		else
		{
			return right;
		}
	}
	else  // a[left] >= a[midi]
	{
		if (a[midi] > a[right])
		{
			return midi;
		}
		else if (a[left] < a[right])
		{
			return left;
		}
		else
		{
			return right;
		}
	}
}

int PartSort1(int* a, int left, int right)
{
	// pick the median of three and move it to the left end to act as the key
	int midi = GetMid(a, left, right);
	Swap(&a[midi], &a[left]);

	int keyi = left;
	while (left < right)
	{
		while (left < right && a[right] >= a[keyi])
		{
			right--;
		}

		while (left < right && a[left] <= a[keyi])
		{
			left++;
		}

		Swap(&a[left], &a[right]);
	}
	Swap(&a[keyi], &a[left]);

	return left;
}

int PartSort2(int* a, int left, int right)
{
	int midi = GetMid(a, left, right);
	Swap(&a[midi], &a[left]);

	int key = a[left];
	int hole = left;
	while (left < right)
	{
		while (left < right && a[right] >= key)
		{
			right--;
		}

		a[hole] = a[right];
		hole = right;

		while (left < right && a[left] <= key)
		{
			left++;
		}

		a[hole] = a[left];
		hole = left;
	}

	a[hole] = key;

	return hole;
}

int PartSort3(int* a, int left, int right)
{
	int midi = GetMid(a, left, right);
	Swap(&a[midi], &a[left]);

	int cur = left + 1;
	int prev = left;
	int keyi = left;
	while (cur <= right)
	{
		if (a[cur] < a[keyi])
		{
			++prev;
			Swap(&a[prev], &a[cur]);
		}
		cur++;
	}

	Swap(&a[prev], &a[keyi]);

	return prev;
}

void QuickSort(int* a, int begin, int end)
{
	if (begin >= end)
	{
		return;
	}
	int keyi = PartSort1(a, begin, end);

	QuickSort(a, begin, keyi - 1);
	QuickSort(a, keyi + 1, end);
}

Non-recursive implementation of quicksort

Quick sort is written recursively, and any recursion can overflow the call stack on deep inputs, so here is a non-recursive implementation of quick sort.

The non-recursive version is implemented with the help of an explicit stack data structure, whose storage is obtained with malloc on the heap. The call stack is typically only a few MB in size, while the heap offers on the order of several GB, so doing the bookkeeping on the heap is no problem.

(figure: simulating the recursion with an explicit stack of intervals)

A left sub-interval [left, keyi - 1] is pushed only when left < keyi - 1, and a right sub-interval [keyi + 1, right] only when keyi + 1 < right.

As intervals keep being pushed and popped, the sub-ranges become smaller and smaller; eventually left reaches keyi - 1 (and similarly on the right side), so nothing new is pushed. Once intervals are only popped and never pushed, the stack eventually becomes empty, and when the stack is empty the sort is complete. A sketch of this follows.
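
Here is a minimal sketch of the non-recursive version, assuming the PartSort1 defined above; instead of the Stack data structure from the earlier article it uses a plain fixed-size int array as the stack of interval endpoints, which is enough to show the idea (a dynamically growing stack is safer for large inputs):

void QuickSortNonR(int* a, int begin, int end)
{
	if (begin >= end)
		return;

	int stack[64];  // crude fixed-size stack of [left, right] pairs, for illustration only
	int top = 0;

	stack[top++] = begin;
	stack[top++] = end;

	while (top > 0)
	{
		// pop one interval and partition it
		int right = stack[--top];
		int left = stack[--top];
		int keyi = PartSort1(a, left, right);

		// push a sub-interval only if it still holds at least two elements
		if (keyi + 1 < right)
		{
			stack[top++] = keyi + 1;
			stack[top++] = right;
		}
		if (left < keyi - 1)
		{
			stack[top++] = left;
			stack[top++] = keyi - 1;
		}
	}
}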

merge sort

The sorting principle of merge sort is as follows:

(figure: the overall principle of merge sort)

As the figure shows, merge sort first decomposes the large, unordered array into smaller and smaller sub-arrays until each contains only a single number. It then merges pairs of these (trivially sorted) sub-arrays into larger ordered arrays, merges those results with each other in turn, and so on back up the recursion; the splitting proceeds exactly like a binary tree.
(figure: recursive splitting and merging, shaped like a binary tree)
The code is implemented as follows:

void _MergeSort(int* a, int begin, int end, int* tmp)
{
	if (begin >= end)
		return;

	// sort the two halves recursively
	int mid = (begin + end) / 2;
	_MergeSort(a, begin, mid, tmp);
	_MergeSort(a, mid + 1, end, tmp);

	// merge the two sorted halves into tmp
	int begin1 = begin, end1 = mid;
	int begin2 = mid + 1, end2 = end;
	int i = begin;
	while (begin1 <= end1 && begin2 <= end2)
	{
		if (a[begin1] < a[begin2])
		{
			tmp[i++] = a[begin1++];
		}
		else
		{
			tmp[i++] = a[begin2++];
		}
	}
	while (begin1 <= end1)
	{
		tmp[i++] = a[begin1++];
	}

	while (begin2 <= end2)
	{
		tmp[i++] = a[begin2++];
	}

	// copy the merged range back into the original array
	memcpy(a + begin, tmp + begin, sizeof(int) * (end - begin + 1));
}

void MergeSort(int* a, int n)
{
	// tmp is the auxiliary buffer used for merging (malloc/memcpy need <stdlib.h> and <string.h>)
	int* tmp = (int*)malloc(sizeof(int) * n);
	if (tmp == NULL)
		return;

	_MergeSort(a, 0, n - 1, tmp);

	free(tmp);
}
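
Finally, a small hypothetical test driver (not part of the original article) that can be used to sanity-check any of the sorts above; just swap the call to MergeSort for InsertSort, ShellSort, HeapSort, BubbleSort, SelectSort or QuickSort:

#include <stdio.h>

int main(void)
{
	int a[] = { 9, 1, 2, 5, 7, 4, 8, 6, 3, 5 };
	int n = sizeof(a) / sizeof(a[0]);

	MergeSort(a, n);  // or InsertSort(a, n), ShellSort(a, n), QuickSort(a, 0, n - 1), ...

	for (int i = 0; i < n; i++)
		printf("%d ", a[i]);  // expected: 1 2 3 4 5 5 6 7 8 9
	printf("\n");

	return 0;
}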


Source: blog.csdn.net/qq_73899585/article/details/131876783