Data Structures - Eight Major Sorting Algorithms

One, direct insertion sort

Idea:
Insert a new element into data that is already sorted. Let end be the subscript of the last element of the sorted part, and let tmp be the element to insert. Compare tmp with the element at end: if tmp is smaller, move the element at end to position end+1 and decrement end (--end); otherwise stop. Also note that once end reaches -1 the comparison must stop, and tmp is inserted at subscript 0.
Look at the code below:

void InsertSort(int* a, int size)
{
	for (int i = 0; i < size - 1; i++)
	{
		int end = i;
		int tmp = a[end + 1];
		while (end >= 0)
		{
			if (tmp < a[end])
			{
				a[end + 1] = a[end];
				--end;
			}
			else
			{
				break;
			}
		}
		a[end + 1] = tmp;
	}
}

Time complexity of insertion sort: O(N^2)
Space complexity: O(1)
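
To see where the O(N^2) bound comes from, here is a minimal sketch that counts how many element shifts the insertion loop performs. InsertSortCountMoves is a hypothetical name used only for this illustration; the sorting logic is otherwise the same as InsertSort.

```c
/* Hypothetical variant of InsertSort that also returns the number of
   element shifts it performed. */
long InsertSortCountMoves(int* a, int size)
{
	long moves = 0;
	for (int i = 0; i < size - 1; i++)
	{
		int end = i;
		int tmp = a[end + 1];
		/* shift larger elements one step right until tmp fits */
		while (end >= 0 && tmp < a[end])
		{
			a[end + 1] = a[end];
			--end;
			++moves;
		}
		a[end + 1] = tmp;
	}
	return moves;
}
```

On a reversed array of 5 elements this performs 1+2+3+4 = 10 shifts, while an already sorted array needs none, which is why insertion sort is fast on nearly ordered data.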

Two, Shell sort

Shell sort is an upgraded version of direct insertion sort. Direct insertion sort has to move a large amount of data when a small value is inserted, so Shell sort adds several rounds of pre-sorting. Elements that are gap positions apart form one group, which divides the data into gap groups. When a group is pre-sorted, each move jumps gap positions instead of one position at a time as in direct insertion sort.
Therefore, a single pass of Shell sort is very similar to direct insertion sort.
Let's look at the code of a single pass:

for (int j = 0; j < gap; j++)
{
	for (int i = j; i < size - gap; i += gap)
	{
		int end = i;
		int tmp = a[end + gap];
		while (end >= 0)
		{
			if (tmp < a[end])
			{
				a[end + gap] = a[end];
				end -= gap;
			}
			else
			{
				break;
			}
		}
		a[end + gap] = tmp;
	}
}

The version above finishes one group before starting the next. We can simplify it and process all groups interleaved in a single loop:

for (int i = 0; i < size - gap; i++)
{
	int end = i;
	int tmp = a[end + gap];
	while (end >= 0)
	{
		if (tmp < a[end])
		{
			a[end + gap] = a[end];
			end -= gap;
		}
		else
		{
			break;
		}
	}
	a[end + gap] = tmp;
}

Here is the complete code for Shell sort:

void ShellSort(int* a, int size)
{
	int gap = size;
	while (gap > 1)
	{
		gap = gap / 3 + 1;
		for (int i = 0; i < size - gap; i++)
		{
			int end = i;
			int tmp = a[end + gap];
			while (end >= 0)
			{
				if (tmp < a[end])
				{
					a[end + gap] = a[end];
					end -= gap;
				}
				else
				{
					break;
				}
			}
			a[end + gap] = tmp;
		}
	}
}

The gap is reduced step by step. When it reaches 1, the pass is exactly direct insertion sort, but by then the data is already mostly in order, so few moves are needed.

Shell sort is an excellent sorting algorithm.
Time complexity: roughly O(N^1.3) (an empirical figure; the exact bound depends on the gap sequence)
Space complexity: O(1)
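
As a small illustration of how the gap shrinks, the sketch below lists the gaps produced by the gap = gap / 3 + 1 rule used in ShellSort. ShellGaps and its parameters are hypothetical names introduced for this example only.

```c
/* Hypothetical helper: writes the successive gaps that ShellSort would
   use for an array of the given size into out (at most cap entries)
   and returns how many gaps were written. */
int ShellGaps(int size, int* out, int cap)
{
	int n = 0;
	int gap = size;
	while (gap > 1 && n < cap)
	{
		gap = gap / 3 + 1;   /* same update rule as in ShellSort */
		out[n++] = gap;
	}
	return n;
}
```

For size = 10 the gaps are 4, 2, 1; for size = 100 they are 34, 12, 5, 2, 1. The + 1 guarantees that the last pass always runs with gap 1, that is, as a plain insertion sort.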

Three, selection sort

The idea of selection sort is:
On each pass over the unsorted range, record the subscripts of the largest and the smallest element, then swap the smallest with the first element of the range and the largest with the last element of the range.
Look at the code below:

void SelectSort(int* a, int size)
{
	int begin = 0;
	int end = size - 1;
	while (begin < end)
	{
		int mini = begin;
		int maxi = begin;
		for (int i = begin + 1; i <= end; i++)
		{
			if (a[i] > a[maxi])
			{
				maxi = i;
			}
			if (a[i] < a[mini])
			{
				mini = i;
			}
		}
		swep(&a[begin], &a[mini]);
		if (maxi == begin)
		{
			maxi = mini;
		}
		swep(&a[maxi], &a[end]);
		++begin;
		--end;
	}
}

Finding the maximum and the minimum in the same pass has one small pitfall: maxi may coincide with begin. After a[begin] and a[mini] are swapped, the largest value is no longer at begin but at mini, so maxi must be corrected to mini before a[maxi] and a[end] are swapped.
Time complexity of selection sort: O(N^2)
Space complexity: O(1)
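
The code in this post calls swep, which is never defined here; presumably it swaps two ints. A minimal sketch under that assumption (the body below is a guess, not the author's original):

```c
#include <assert.h>

/* Assumed helper: swaps the two ints that x and y point to.
   The post uses the name swep without showing its definition. */
void swep(int* x, int* y)
{
	assert(x && y);
	int tmp = *x;
	*x = *y;
	*y = tmp;
}
```

Because it goes through a temporary, this version is also safe when both pointers refer to the same element, as in swep(&a[i], &a[i]).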

Four, heap sort

The detailed process of heap sorting has been mentioned in the previous blog:
link: heap sorting blog

void AdjustDown(int* a, int size, int parent)
{
	int child = parent * 2 + 1;
	while (child < size)
	{
		if (child + 1 < size && a[child + 1] > a[child])
		{
			child++;
		}
		if (a[child] > a[parent])
		{
			swep(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}
void HeapSort(int* a, int size)
{
	for (int i = (size - 2) / 2; i >= 0; i--)
	{
		AdjustDown(a, size, i);
	}
	int end = size - 1;
	while (end > 0)
	{
		swep(&a[0], &a[end]);
		AdjustDown(a, end, 0);
		--end;
	}
}

Heap sort is also an excellent sorting algorithm:
Time complexity: O(N*logN)
Space complexity: O(1)

Five, bubble sort

Bubble sort is a true classic and appears everywhere in teaching.
Idea:
Compare each pair of adjacent elements and swap them if the front one is greater than the back one. After the first pass the maximum value has moved to the end; the second pass places the second largest value in the penultimate position, and so on.

void BubbleSort(int* a, int size)
{
	for (int i = 0; i < size; i++)
	{
		int flag = 1;
		for (int j = 0; j < size - 1 - i; j++)
		{
			if (a[j] > a[j + 1])
			{
				flag = 0;
				swep(&a[j], &a[j + 1]);
			}
		}
		if (flag)
		{
			break;
		}
	}
}

Bubble sort also allows a small optimization: define a flag, and if no swap happened during a pass, the data is already in order, so break out of the loop directly.

Although bubble sort is widely used in teaching, its performance is poor:
Time complexity: O(N^2)
Space complexity: O(1)

Six, quick sort

In practice, quick sort is the most widely used sorting method, and it lives up to its name.
Single-pass idea of quick sort: first select a key, move the key to its correct final position in the data, and then treat the intervals to the left and right of the key as sub-problems of the same form.

1. Recursive version

(1) Hoare method

Hoare method:
1. Select the key position, usually the first position of the data, that is, the left position.
2. If the key is at the left position, the right pointer moves first and stops when it finds data smaller than the key.
3. Then the left pointer moves and stops when it finds data larger than the key.
4. Swap the data at left and right, and repeat the previous steps until left and right meet.
5. After they meet, swap the key with the data at the meeting position. Now everything to the left of the meeting position is less than or equal to the key and everything to the right is greater than or equal to it, so the key is in its correct position.

int PartSort1(int* a, int begin, int end)
{
	int keyi = begin;
	int left = begin;
	int right = end;
	while (left < right)
	{
		// find a value smaller than the key
		while (left < right && a[right] >= a[keyi])
		{
			--right;
		}
		// find a value larger than the key
		while (left < right && a[left] <= a[keyi])
		{
			++left;
		}
		swep(&a[left], &a[right]);
	}
	swep(&a[keyi], &a[left]);
	keyi = left;
	return keyi;
}

At this point some readers may wonder: why must the data at the meeting position be less than or equal to the key?
The two pointers can meet in only two ways:
1. The right pointer runs into the left pointer.
2. The left pointer runs into the right pointer.
First analyze case 1:
If right is moving after an earlier swap, left points to data that was swapped there because it is smaller than the key, so the meeting position holds data smaller than the key. If no swap has happened yet and right moves all the way to the leftmost position on its first move, it meets left at the key position itself, so the data there equals the key.
Then analyze case 2:
Before left moves, right has already moved (right always moves first) and stopped at data smaller than the key. So when left runs into right, the data at the meeting position is smaller than the key.
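
The claim can also be checked mechanically. The sketch below repeats PartSort1 together with an assumed swep helper (not defined in this post) and adds IsPartitioned, a hypothetical checker written for this illustration.

```c
/* Assumed swap helper; the post calls it swep but does not define it. */
static void swep(int* x, int* y)
{
	int tmp = *x;
	*x = *y;
	*y = tmp;
}

/* Same logic as PartSort1 above. */
int PartSort1(int* a, int begin, int end)
{
	int keyi = begin;
	int left = begin;
	int right = end;
	while (left < right)
	{
		while (left < right && a[right] >= a[keyi])
			--right;
		while (left < right && a[left] <= a[keyi])
			++left;
		swep(&a[left], &a[right]);
	}
	swep(&a[keyi], &a[left]);
	return left;
}

/* Hypothetical checker: returns 1 if a[begin..end] is correctly
   partitioned around index keyi, otherwise 0. */
int IsPartitioned(const int* a, int begin, int end, int keyi)
{
	for (int i = begin; i < keyi; i++)
		if (a[i] > a[keyi])
			return 0;
	for (int i = keyi + 1; i <= end; i++)
		if (a[i] < a[keyi])
			return 0;
	return 1;
}
```

For the array {6, 1, 2, 7, 9, 3, 4, 5, 10, 8}, PartSort1(a, 0, 9) returns 5 and leaves 6 at subscript 5, with smaller values on its left and larger values on its right.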

(2) Digging method

The Hoare method has many small details to get right, so other partition methods appeared later, such as the digging (hole-filling) method:
1. Save the data at the left position in key. This forms the first hole, at the left position.
Note: here key is a value, no longer a subscript of the array.
2. The right pointer moves first and stops when it finds data smaller than key. That data is filled into the hole, and the right position becomes the new hole.
3. The left pointer then moves and stops when it finds data larger than key. That data is filled into the hole, and the left position becomes the new hole.
4. Stop when left and right meet, and fill key into the hole. At this point key is in its correct position.

int PartSort2(int* a, int begin, int end)
{
	int key = a[begin];
	int hole = begin;
	int left = begin;
	int right = end;
	while (left < right)
	{
		while (left < right && a[right] >= key)
		{
			--right;
		}
		a[hole] = a[right];
		hole = right;
		while (left < right && a[left] <= key)
		{
			++left;
		}
		a[hole] = a[left];
		hole = left;
	}
	a[hole] = key;
	return hole;
}

(3) Front and back pointer method (recommended)

Define keyi as the subscript of the first element of the array; prev points to the first element and cur points to the second element.
1. cur moves forward and stops when it finds data smaller than the data at subscript keyi.
2. ++prev, then swap the data at prev and cur.
3. The loop stops when cur moves past the last position of the array.
4. Swap the data at subscript keyi and subscript prev.

int PartSort3(int* a, int begin, int end)
{
	int keyi = begin;
	int prev = begin;
	int cur = begin + 1;
	while (cur <= end)
	{
		if (a[cur] < a[keyi] && ++prev != cur)
		{
			swep(&a[prev], &a[cur]);
		}
		++cur;
	}
	swep(&a[prev], &a[keyi]);
	keyi = prev;
	return keyi;
}

Some readers may wonder about the if condition in this code: why is it written like this?
Because && short-circuits, prev is incremented only when a[cur] is smaller than the key. Tracing a single pass shows that in the first few iterations, before any element greater than or equal to the key has been seen, ++prev makes prev equal to cur, so both point to the same position and there is no need to swap. The ++prev != cur test skips these self-swaps.

Here's the complete code for the recursive version:

int PartSort3(int* a, int begin, int end)
{
	int keyi = begin;
	int prev = begin;
	int cur = begin + 1;
	while (cur <= end)
	{
		if (a[cur] < a[keyi] && ++prev != cur)
		{
			swep(&a[prev], &a[cur]);
		}
		++cur;
	}
	swep(&a[prev], &a[keyi]);
	keyi = prev;
	return keyi;
}
void QuickSort1(int* a, int begin, int end)
{
	// an interval with fewer than two elements is already sorted
	if (begin >= end)
	{
		return;
	}
	int keyi = PartSort3(a, begin, end);
	QuickSort1(a, begin, keyi - 1);
	QuickSort1(a, keyi + 1, end);
}

2. Non-recursive version

The recursive version is simple to write, but it runs into problems when the recursion gets too deep, so there is also a non-recursive version, implemented with the stack data structure.
Quick sort keeps adjusting intervals of the same array, so the intervals to be processed are stored in the stack. Each time, the interval at the top of the stack is popped and partitioned; this produces two new sub-intervals, which are pushed onto the stack.
Note: to simulate the order of the recursive calls, push the right interval first and then the left interval, so that the left one is processed first. Also, a sub-interval with fewer than two elements does not need to be pushed.

void QuickSortNonR(int* a, int begin, int end)
{
	ST st;
	STInit(&st);
	// each interval is pushed as (right, left) so that left is popped first
	STPush(&st, end);
	STPush(&st, begin);
	while (!STEmpty(&st))
	{
		int left = STTop(&st);
		STPop(&st);
		int right = STTop(&st);
		STPop(&st);
		int keyi = left;
		int cur = left + 1;
		int prev = left;
		while (cur <= right)
		{
			if (a[cur] < a[keyi] && ++prev != cur)
			{
				swep(&a[prev], &a[cur]);
			}
			++cur;
		}
		swep(&a[prev], &a[keyi]);
		keyi = prev;
		// push the right sub-interval [keyi + 1, right] first
		if (keyi + 1 < right)
		{
			STPush(&st, right);
			STPush(&st, keyi + 1);
		}
		// then the left sub-interval [left, keyi - 1]
		if (left < keyi - 1)
		{
			STPush(&st, keyi - 1);
			STPush(&st, left);
		}
	}
	STDestroy(&st);
}
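
QuickSortNonR relies on a stack type ST with STInit, STPush, STTop, STPop, STEmpty and STDestroy, none of which are shown in this post. A minimal sketch of such a stack might look like the following; the signatures are inferred from the call sites above, and everything inside is an assumption rather than the author's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed layout: a growable array of ints. */
typedef struct Stack
{
	int* data;
	int top;        /* number of stored elements */
	int capacity;   /* allocated slots */
} ST;

void STInit(ST* ps)
{
	assert(ps);
	ps->data = NULL;
	ps->top = 0;
	ps->capacity = 0;
}

void STDestroy(ST* ps)
{
	assert(ps);
	free(ps->data);
	ps->data = NULL;
	ps->top = 0;
	ps->capacity = 0;
}

bool STEmpty(ST* ps)
{
	assert(ps);
	return ps->top == 0;
}

void STPush(ST* ps, int x)
{
	assert(ps);
	if (ps->top == ps->capacity)
	{
		/* grow geometrically so pushes stay cheap on average */
		int newCap = ps->capacity == 0 ? 4 : ps->capacity * 2;
		int* p = (int*)realloc(ps->data, sizeof(int) * newCap);
		if (!p)
		{
			perror("realloc fail");
			exit(-1);
		}
		ps->data = p;
		ps->capacity = newCap;
	}
	ps->data[ps->top++] = x;
}

void STPop(ST* ps)
{
	assert(ps);
	assert(!STEmpty(ps));
	--ps->top;
}

int STTop(ST* ps)
{
	assert(ps);
	assert(!STEmpty(ps));
	return ps->data[ps->top - 1];
}
```

Since the stack is last in, first out, pushing the right bound before the left bound means the left bound comes out first, which matches the pop order used in QuickSortNonR.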

3. Optimization of quick sorting

When the key selected each time ends up in the middle of its interval, the recursion of quick sort resembles a binary tree, and the time complexity is O(N*logN).

(1) Median of three

When the array is already sorted or nearly sorted, the key ends up at the first or last position every time, so quick sort no longer splits the interval in half and the time complexity degrades to O(N^2).
To prevent this, use median of three: among the data at begin, at end, and at a random position between them, pick the one with the middle value, swap it to the begin position, and use it as the key.
This makes the worst case described above very unlikely.

// requires <stdlib.h> for rand(); the caller must guarantee begin < end
int GetMidIndex(int* a, int begin, int end)
{
	// random position in [begin, end - 1]
	int mid = begin + rand() % (end - begin);
	if (a[begin] > a[end])
	{
		if (a[end] > a[mid])
		{
			return end;
		}
		else if (a[mid] > a[begin])
		{
			return begin;
		}
		else
		{
			return mid;
		}
	}
	else
	{
		if (a[mid] < a[begin])
		{
			return begin;
		}
		else if (a[mid] > a[end])
		{
			return end;
		}
		else
		{
			return mid;
		}
	}
}

(2) Small-interval optimization

The recursion of a normal quick sort resembles a binary tree: the last level holds about 1/2 of all calls, and the last two levels together hold about 3/4.
Since recursion has a cost, when an interval contains fewer than 10 elements we sort it with insertion sort instead of recursing further.

void QuickSort1(int* a, int begin, int end)
{
	if (begin >= end)
	{
		return;
	}
	if ((end - begin + 1) < 10)
	{
		// sort only this interval, which starts at a + begin, then stop recursing
		InsertSort(a + begin, end - begin + 1);
		return;
	}
	int mid = GetMidIndex(a, begin, end);
	swep(&a[begin], &a[mid]);
	int keyi = PartSort3(a, begin, end);
	QuickSort1(a, begin, keyi - 1);
	QuickSort1(a, keyi + 1, end);
}
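
To make the saving visible, here is a minimal sketch of a quick sort that reports how many calls it made. QuickSortCutoff, InsertRange and the cutoff parameter are hypothetical names for this illustration, swep is the assumed swap helper, and the first element is used as the key (no median of three) so the result is deterministic.

```c
/* Assumed swap helper; the post calls it swep but does not define it. */
static void swep(int* x, int* y)
{
	int tmp = *x;
	*x = *y;
	*y = tmp;
}

/* Insertion sort on the n elements starting at a. */
static void InsertRange(int* a, int n)
{
	for (int i = 0; i < n - 1; i++)
	{
		int end = i;
		int tmp = a[end + 1];
		while (end >= 0 && tmp < a[end])
		{
			a[end + 1] = a[end];
			--end;
		}
		a[end + 1] = tmp;
	}
}

/* Sorts a[begin..end] and returns the number of calls made, including
   this one. Intervals with fewer than cutoff elements are handed to
   insertion sort; cutoff <= 2 disables the optimization. */
long QuickSortCutoff(int* a, int begin, int end, int cutoff)
{
	if (begin >= end)
		return 1;
	if (end - begin + 1 < cutoff)
	{
		InsertRange(a + begin, end - begin + 1);
		return 1;
	}
	/* front and back pointer partition with a[begin] as the key */
	int keyi = begin;
	int prev = begin;
	for (int cur = begin + 1; cur <= end; ++cur)
	{
		if (a[cur] < a[keyi] && ++prev != cur)
			swep(&a[prev], &a[cur]);
	}
	swep(&a[prev], &a[keyi]);
	long calls = 1;
	calls += QuickSortCutoff(a, begin, prev - 1, cutoff);
	calls += QuickSortCutoff(a, prev + 1, end, cutoff);
	return calls;
}
```

Running it once with cutoff 0 and once with cutoff 10 on two copies of the same array gives the same sorted result, but the second run makes fewer calls, because every interval of 2 to 9 elements becomes a leaf of the recursion instead of splitting further.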

(3) Three-way partition

When all or most of the data to be sorted is equal, the efficiency of quick sort also drops sharply. Three-way partitioning handles this case:
1. left and right point to the first and last elements of the interval, and the first element is saved in key.
2. cur points to the element after the first element.
3. When the data at cur is smaller than key, swap the data at left and cur, then left++ and cur++.
4. When the data at cur is greater than key, swap the data at right and cur, then right--.
5. When the data at cur is equal to key, cur++.
6. Stop the loop once cur passes right.

This splits the interval into three segments:
begin to left-1 holds the data smaller than key,
left to right holds the data equal to key,
right+1 to end holds the data greater than key.
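
The six steps can be sketched as a standalone function. ThreeWayPartition and its out parameters are hypothetical names for this illustration, and swep is the assumed swap helper that this post uses without defining.

```c
/* Assumed swap helper; the post calls it swep but does not define it. */
static void swep(int* x, int* y)
{
	int tmp = *x;
	*x = *y;
	*y = tmp;
}

/* Partitions a[begin..end] around key = a[begin]. On return:
   a[begin..*outLeft-1] < key, a[*outLeft..*outRight] == key,
   a[*outRight+1..end] > key. */
void ThreeWayPartition(int* a, int begin, int end, int* outLeft, int* outRight)
{
	int left = begin;
	int right = end;
	int key = a[begin];
	int cur = begin + 1;
	while (cur <= right)
	{
		if (a[cur] < key)
		{
			swep(&a[left], &a[cur]);
			++left;
			++cur;
		}
		else if (a[cur] > key)
		{
			/* the value swapped in from right is unexamined, so cur stays */
			swep(&a[right], &a[cur]);
			--right;
		}
		else
		{
			++cur;
		}
	}
	*outLeft = left;
	*outRight = right;
}
```

For {3, 5, 3, 1, 3, 7, 3, 2} the result is {2, 1, 3, 3, 3, 3, 7, 5} with left = 2 and right = 5, so only the two outer segments still need sorting.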

void QuickSort2(int* a, int begin, int end)
{
	if (begin >= end)
	{
		return;
	}
	if ((end - begin + 1) < 10)
	{
		// sort only this interval, which starts at a + begin, then stop recursing
		InsertSort(a + begin, end - begin + 1);
		return;
	}
	int mid = GetMidIndex(a, begin, end);
	swep(&a[begin], &a[mid]);
	int left = begin;
	int right = end;
	int key = a[begin];
	int cur = begin + 1;
	while (cur <= right)
	{
		if (a[cur] < key)
		{
			swep(&a[left], &a[cur]);
			++left;
			++cur;
		}
		else if (a[cur] > key)
		{
			swep(&a[right], &a[cur]);
			--right;
		}
		else
		{
			++cur;
		}
	}
	QuickSort2(a, begin, left - 1);
	QuickSort2(a, right + 1, end);
}

Seven, merge sort

1. Recursive implementation

Idea: merge sort (MERGE-SORT) is an effective sorting algorithm built on the merge operation and a very typical application of divide and conquer. It merges already ordered subsequences to obtain a completely ordered sequence: first make each subsequence ordered, then merge the ordered subsequences. Merging two sorted lists into one sorted list is called a two-way merge. The core steps of merge sort are:

(1) Split: keep halving the interval until each piece holds a single element.
(2) Merge: merge the ordered pieces pairwise back into longer ordered pieces.

Recursive code implementation:

void _MergeSort(int* a, int begin, int end, int* tmp)
{
	if (begin >= end)
	{
		return;
	}
	int mid = begin + (end - begin) / 2;
	_MergeSort(a, begin, mid, tmp);
	_MergeSort(a, mid + 1, end, tmp);

	int begin1 = begin, end1 = mid;
	int begin2 = mid + 1, end2 = end;
	int i = begin;

	while (begin1 <= end1 && begin2 <= end2)
	{
		if (a[begin1] <= a[begin2])
		{
			tmp[i++] = a[begin1];
			++begin1;
		}
		else
		{
			tmp[i++] = a[begin2];
			++begin2;
		}
	}
	while (begin1 <= end1)
	{
		tmp[i++] = a[begin1++];
	}
	while (begin2 <= end2)
	{
		tmp[i++] = a[begin2++];
	}
	memcpy(a + begin, tmp + begin, sizeof(int) * (end - begin + 1));
}
void MergeSort(int* a, int size)
{
	int* tmp = (int*)malloc(sizeof(int) * size);
	if (!tmp)
	{
		perror("malloc fail");
		exit(-1);
	}
	_MergeSort(a, 0, size - 1, tmp);
	free(tmp);
	tmp = NULL;
}

2. Non-recursive implementation

Recursive calls occupy stack space, and every call opens a new stack frame, so the stack may overflow when the recursion is too deep.
The first step of merge sort is splitting, so an array of n elements can be treated directly as n sub-arrays of one element each. Adjacent sub-arrays are merged pairwise into the tmp array, which is then copied back. A variable rangeN records the number of elements in each sub-array being merged; it doubles after every pass.
In the ideal case the number of elements is a power of 2 and every pass pairs up complete sub-arrays.
When the number of elements is not a power of 2, the computed subscripts can go out of bounds, so the intervals must be corrected. There are three cases:
(1) end1 out of bounds
(2) begin2 out of bounds
(3) end2 out of bounds
They are handled as follows: in case (1) clamp end1 to size - 1 and make the second interval empty; in case (2) make the second interval empty; in case (3) clamp end2 to size - 1.

void MergeSortNonR(int* a, int size)
{
	int* tmp = (int*)malloc(sizeof(int) * size);
	if (!tmp)
	{
		perror("malloc fail");
		exit(-1);
	}
	int rangeN = 1;
	while (rangeN < size)
	{
		for (int j = 0; j < size; j += rangeN * 2)
		{
			int begin1 = j;
			int end1 = j + rangeN - 1;
			int begin2 = j + rangeN;
			int end2 = j + rangeN * 2 - 1;
			// correct intervals that run past the end of the array
			if (end1 >= size)
			{
				end1 = size - 1;
				begin2 = size;
				end2 = size - 1;
			}
			else if (begin2 >= size)
			{
				begin2 = size;
				end2 = size - 1;
			}
			else if (end2 >= size)
			{
				end2 = size - 1;
			}
			int i = j;
			while (begin1 <= end1 && begin2 <= end2)
			{
				if (a[begin1] <= a[begin2])
				{
					tmp[i++] = a[begin1];
					++begin1;
				}
				else
				{
					tmp[i++] = a[begin2];
					++begin2;
				}
			}
			while (begin1 <= end1)
			{
				tmp[i++] = a[begin1++];
			}
			while (begin2 <= end2)
			{
				tmp[i++] = a[begin2++];
			}
		}
		memcpy(a, tmp, sizeof(int) * size);
		rangeN *= 2;
	}
	// release the temporary buffer (missing in the original listing)
	free(tmp);
	tmp = NULL;
}
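
To see the three boundary cases in isolation, the sketch below pulls the interval computation out of MergeSortNonR into a helper. ClampRanges is a hypothetical name introduced for this illustration; the corrections are the same ones used in the function above.

```c
/* Hypothetical helper: computes the two intervals merged at offset j
   for sub-array length rangeN, with the same boundary corrections as
   MergeSortNonR. An interval whose begin is greater than its end is
   empty. */
void ClampRanges(int size, int rangeN, int j,
                 int* begin1, int* end1, int* begin2, int* end2)
{
	*begin1 = j;
	*end1 = j + rangeN - 1;
	*begin2 = j + rangeN;
	*end2 = j + rangeN * 2 - 1;
	if (*end1 >= size)          /* case (1): end1 out of bounds */
	{
		*end1 = size - 1;
		*begin2 = size;
		*end2 = size - 1;
	}
	else if (*begin2 >= size)   /* case (2): begin2 out of bounds */
	{
		*begin2 = size;
		*end2 = size - 1;
	}
	else if (*end2 >= size)     /* case (3): end2 out of bounds */
	{
		*end2 = size - 1;
	}
}
```

With size = 5, rangeN = 2 and j = 4, the first interval becomes [4, 4] and the second is empty (case 1). With size = 6, rangeN = 2 and j = 4, the first interval is [4, 5] and the second is empty (case 2). With size = 6, rangeN = 4 and j = 0, the intervals are [0, 3] and [4, 5], with end2 clamped from 7 (case 3).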

Merge sort time complexity: O(N*logN)
space complexity: O(N)

Eight, counting sort

Counting sort is a non-comparison sort. Its main idea is
to create a counting array whose size is the maximum value of the original array minus the minimum value plus one, and use it to record how many times each value occurs in the original array.
Note: a value i is not counted at position i of the counting array, because that wastes space and does not work for negative numbers. Instead, the count for value i is stored at position i - min of the counting array.
Finally, traverse the counting array and write the data back to the original array. When writing back, add min to the subscript of the counting array to recover the original value.

void CountSort(int* a, int size)
{
	int max = a[0];
	int min = a[0];
	for (int i = 1; i < size; ++i)
	{
		if (a[i] > max)
			max = a[i];
		if (a[i] < min)
			min = a[i];
	}
	int range = max - min + 1;
	int* tmp = (int*)calloc(range, sizeof(int));
	if (!tmp)
	{
		perror("calloc fail");
		exit(-1);
	}
	// count how many times each value occurs in the array to be sorted
	for (int i = 0; i < size; ++i)
	{
		tmp[a[i] - min]++;
	}
	// write the values back in ascending order
	int j = 0;
	for (int i = 0; i < range; ++i)
	{
		while (tmp[i]--)
		{
			a[j++] = i + min;
		}
	}
	// release the counting array (missing in the original listing)
	free(tmp);
	tmp = NULL;
}
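
The offset mapping is easiest to see on an array with negative numbers. The sketch below builds only the counting array; BuildCounts and its out parameters are hypothetical names for this illustration, and the logic mirrors the first half of CountSort.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a calloc'd counting array for
   a[0..size-1] (size must be at least 1) and reports min and range
   through the out parameters. The caller frees the result. */
int* BuildCounts(const int* a, int size, int* outMin, int* outRange)
{
	int max = a[0];
	int min = a[0];
	for (int i = 1; i < size; ++i)
	{
		if (a[i] > max)
			max = a[i];
		if (a[i] < min)
			min = a[i];
	}
	int range = max - min + 1;
	int* count = (int*)calloc(range, sizeof(int));
	if (!count)
	{
		perror("calloc fail");
		exit(-1);
	}
	/* value v is counted at subscript v - min */
	for (int i = 0; i < size; ++i)
	{
		count[a[i] - min]++;
	}
	*outMin = min;
	*outRange = range;
	return count;
}
```

For {-2, 3, -2, 0, 3}, min is -2 and range is 6, and the counting array is {2, 0, 1, 0, 0, 2}: the value -2 is counted at subscript 0 and the value 3 at subscript 5.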

Nine, summary of sorting stability

Stable Sort: Bubble Sort, Insertion Sort, Merge Sort
Unstable Sort: Selection Sort, Shell Sort, Heap Sort, Quick Sort

Examples of unstable sorting
1. Selection sort: the long-distance swap that puts the minimum (or maximum) in place can move an element past another element equal to it.
2. Shell sort: because of the multiple groups of pre-sorting, equal data can be assigned to different groups and moved independently, which can change their relative order.
3. Heap sort: consider a max-heap whose top is the first of several equal 8s. The first 8 is swapped with the last element of the heap and the heap size is reduced by 1; after adjusting down, another 8 reaches the top and is swapped to the position just in front of it, which breaks stability.
4. Quick sort: the swaps made during partitioning, including the final swap that moves the key into place, can move an element past another element equal to it.
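
Stability only becomes visible when equal keys carry extra data. The sketch below sorts small records by key with an insertion sort and with a simplified selection sort (minimum only, not the two-ended version shown earlier). Item, InsertSortItems and SelectSortItems are hypothetical names introduced for this illustration.

```c
/* A record with a sort key and a tag that remembers the input order. */
typedef struct Item
{
	int key;
	char tag;
} Item;

/* Insertion sort by key: equal keys are never moved past each other. */
void InsertSortItems(Item* a, int size)
{
	for (int i = 0; i < size - 1; i++)
	{
		int end = i;
		Item tmp = a[end + 1];
		while (end >= 0 && tmp.key < a[end].key)
		{
			a[end + 1] = a[end];
			--end;
		}
		a[end + 1] = tmp;
	}
}

/* Simplified selection sort by key (minimum only): the swap can carry
   a[begin] past an element with the same key. */
void SelectSortItems(Item* a, int size)
{
	for (int begin = 0; begin < size - 1; begin++)
	{
		int mini = begin;
		for (int i = begin + 1; i < size; i++)
		{
			if (a[i].key < a[mini].key)
				mini = i;
		}
		Item tmp = a[begin];
		a[begin] = a[mini];
		a[mini] = tmp;
	}
}
```

Sorting {5,'a'}, {5,'b'}, {2,'c'} by key gives the tag order c, a, b with InsertSortItems but c, b, a with SelectSortItems: the first swap moves {5,'a'} behind {5,'b'}, and nothing moves it back.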

Origin blog.csdn.net/Djsnxbjans/article/details/128207945