【Data Structures and Algorithms】An In-Depth Anatomy of the Eight Classic Sorts (direct insertion sort, Shell sort, selection sort, heap sort, bubble sort, quick sort, merge sort, counting sort)


1. Direct insertion sort

Let's start with how direct insertion sort works.
Direct insertion sort walks the array starting from the second element, comparing it with the element before it. If the previous element is larger, that element is shifted back one position and the comparison continues with the next element to the left, until an element less than or equal to the current value is found, or there is nothing left to compare; the current value is then inserted at that position.

Performance of direct insertion sort:
Time complexity: O(N^2); in the best case, when the data is already in order, it reaches O(N).
Space complexity: O(1), since direct insertion sort allocates no extra contiguous space.

The complete code is as follows. The condition in the while loop repeatedly shifts elements larger than tmp one slot to the right, until an element no larger than tmp is found or end drops below 0 (no data left to compare); the loop then exits and tmp is inserted.

void InsertSort(int* a, int size)
{
	assert(a);

	for (int i = 0; i < size - 1; i++)
	{
		int tmp = a[i + 1];
		int end = i;

		while (end >= 0)
		{
			if (tmp < a[end])
			{
				a[end + 1] = a[end];
				end--;
			}
			else
			{
				break;
			}
		}

		a[end + 1] = tmp;
	}
}

2. Shell sort

Shell sort is also known as diminishing-increment sort. It picks an integer gap to split the data into groups, runs a direct insertion sort within each group, then shrinks gap and regroups before sorting again; this process is called presorting. Finally, when gap shrinks to 1, the data is already close to ordered, so the concluding direct insertion sort runs efficiently.

Shell sort performance:
Time complexity: O(N^2) in the worst case; Yan Weimin's Data Structures gives an average of O(N^1.3).
Space complexity: O(1).

The complete code is as follows. If you have any questions, feel free to private message or ask in the comment area.

void ShellSort(int* a, int size)
{
	assert(a);
	int gap = size;

	while (gap > 1)
	{
		gap /= 2; // gap eventually reaches 1, so the last pass is a plain insertion sort
		// gap = gap / 3 + 1; // an alternative shrink schedule
		for (int i = 0; i < size - gap; i++) // i++ interleaves all gap groups in one pass
		{
			int end = i;
			int tmp = a[i + gap];

			while (end >= 0)
			{
				if (tmp < a[end])
				{
					a[end + gap] = a[end];
					end -= gap;
				}
				else
				{
					break;
				}
			}
			a[end + gap] = tmp;
		}
	}
}

3. Selection sort

Selection sort is the most "honest" of the eight sorts. Each round it scans the unsorted range, recording the indices of the minimum and maximum values, then puts the minimum at the left end and the maximum at the right end, and shrinks the range by one on each side (since one minimum and one maximum are now in place).
A first attempt at the code looks like this:

void SelectSort(int* a, int n)
{
	int left = 0;
	int right = n - 1;

	while (left < right)
	{
		int mini = left, maxi = left;
		for (int i = left + 1; i <= right; i++)
		{
			if (a[mini] > a[i])
			{
				mini = i;
			}

			if (a[maxi] < a[i])
			{
				maxi = i;
			}
		}
		Swap(&a[left], &a[mini]);
		Swap(&a[right], &a[maxi]);
		--right;
		++left;
	}
}

When I tested this code, I found inputs it cannot handle. Debugging shows why: suppose in some round the position left holds the maximum of the range, so maxi == left, while mini points elsewhere. Swap(&a[left], &a[mini]) then moves the maximum away from a[left] before the second swap runs, so Swap(&a[right], &a[maxi]) no longer swaps the true maximum into place and the result is wrong. The fix: when left and maxi overlap, reassign maxi = mini after the first swap, because the maximum has just been moved to where the minimum was.

Selection sort performance:
time complexity: O(N^2) ;
space complexity: O(1) ;

The correct code is as follows:

void SelectSort(int* a, int n)
{
	int left = 0;
	int right = n - 1;

	while (left < right)
	{
		int mini = left, maxi = left;
		for (int i = left + 1; i <= right; i++)
		{
			if (a[mini] > a[i])
			{
				mini = i;
			}

			if (a[maxi] < a[i])
			{
				maxi = i;
			}
		}
		Swap(&a[left], &a[mini]);
		if (left == maxi) // guard against left overlapping maxi: the max just moved to mini
		{
			maxi = mini;
		}
		Swap(&a[right], &a[maxi]);

		--right;
		++left;
	}
}

4. Heap sort

I have written a separate article on heap sort, so I won't repeat the details here. The key point: to sort ascending, build a max-heap; to sort descending, build a min-heap. See my earlier heap sort article for the full walkthrough.

5. Bubble sort

Bubble sort was already covered when learning C, so it is easy to implement. The code is below.

Bubble sort performance:
Time complexity: O(N^2);
Space complexity: O(1).

void BubbleSort(int* a, int size)
{
	for (int i = 0; i < size - 1; i++)
	{
		for (int j = 1; j < size - i; j++)
		{
			if (a[j - 1] > a[j])
			{
				Swap(&a[j - 1], &a[j]);
			}
		}
	}
}

6. Quick Sort

1. Hoare version

Quick sort lives up to its name: it is highly efficient. It is an exchange sort with a binary-tree-style recursive structure, proposed by Hoare in 1962. The idea is to pick a key (benchmark) value, move data smaller than the key to its left and data larger than the key to its right, then recurse on the left and right intervals.

Quick sort performance:
Time complexity: O(N*logN) on average, O(N^2) in the worst case;
Space complexity: O(logN) for the recursion stack.

One thing to note in this version: when the leftmost element is used as the key, the right pointer must move first; conversely, if the rightmost element were the key, the left pointer would have to move first. This guarantees that when the two pointers meet, the meeting position holds a value that can safely be swapped with the key.

void QuickSort11(int* a, int left, int right)
{
	if (left >= right)
		return;

	int begin = left;
	int end = right;

	int keyi = left;
	while (left < right)
	{
		while (left < right && a[keyi] <= a[right])
			--right;

		while (left < right && a[keyi] >= a[left])
			++left;

		Swap(&a[left], &a[right]);
	}

	Swap(&a[keyi], &a[left]);
	keyi = left;

	QuickSort11(a, begin, keyi - 1);
	QuickSort11(a, keyi + 1, end);
}

2. Hole-digging method

The hole-digging method brings no obvious optimization; its sorting idea is much the same as the Hoare version, except the key is lifted out first, leaving a "hole" that the partition fills and moves as it scans, so no final key swap is needed.
The code is as follows for your understanding:

void QuickSort22(int* a, int left, int right)
{
	if (left >= right)
		return;

	int begin = left;
	int end = right;

	int hole = left;
	int key = a[left];
	while (left < right)
	{
		while (left < right && key <= a[right])
			--right;

		a[hole] = a[right];
		hole = right;

		while (left < right && key >= a[left])
			++left;

		a[hole] = a[left];
		hole = left;
	}

	a[hole] = key;
	QuickSort22(a, begin, hole - 1);
	QuickSort22(a, hole + 1, end);
}

3. Front-and-back pointers

Front-and-back pointers are the third version of quick sort, and the approach is elegant: cur scans forward, and each time it meets a value no larger than the key it advances prev and swaps the two positions, so smaller values accumulate at the front. The complete code is as follows:

void QuickSort33(int* a, int left, int right)
{
	if (left >= right)
		return;

	int begin = left;
	int end = right;

	int keyi = left;

	int prev = left;
	int cur = prev + 1;

	while (cur <= right)
	{
		if (a[cur] <= a[keyi] && ++prev != cur)
			Swap(&a[prev], &a[cur]);

		++cur;
	}
	Swap(&a[keyi], &a[prev]);
	QuickSort33(a, begin, prev - 1);
	QuickSort33(a, prev + 1, end);
}

4. Optimizing the choice of key

Quick sort is very efficient on unordered data, but when the array is already ordered its time complexity degrades to O(N^2), because we always take the first element as the key. The remedy is the median-of-three: compare the values at left, mid, and right and take the median as the key, so that even ordered input does not produce degenerate partitions.

int GetMidNumi(int* a, int left, int right)
{
	int mid = (left + right) / 2;

	if (a[left] > a[mid])
	{
		if (a[mid] > a[right])
			return mid;
		else if (a[left] < a[right])
			return left;
		else
			return right;
	}
	else
	{
		if (a[mid] < a[right])
			return mid;
		else if (a[right] < a[left])
			return left;
		else
			return right;
	}
}

(1) Non-recursive quick sort

The three versions above all recurse. When the data volume is very large, the recursion can create too many stack frames and overflow the call stack. Converting recursion to iteration is therefore a skill worth learning.
The non-recursive quick sort uses an explicit stack to replace recursion with a loop: push the left and right bounds of the interval onto the stack (right first, then left, so they pop off in left-right order). While the stack is not empty, pop an interval, run one partition pass on it, and push any sub-interval that still holds more than one element.

void QuickSortNon(int* a, int left, int right)
{
	ST QS;
	InitST(&QS);
	Push(&QS, right);
	Push(&QS, left);

	while (!STEmpty(&QS))
	{
		int begin = STTop(&QS);
		Pop(&QS);

		int end = STTop(&QS);
		Pop(&QS);

		// QuickSort3 runs one partition pass and returns the key's final index
		int key = QuickSort3(a, begin, end);
		if (begin < key - 1)
		{
			Push(&QS, key - 1);
			Push(&QS, begin);
		}
		if (key + 1 < end)
		{
			Push(&QS, end);
			Push(&QS, key + 1);
		}
	}
	DestroyST(&QS);
}

7. Merge sort

Merge sort follows the divide-and-conquer idea: split the data into two intervals and recurse until each interval is ordered, then merge the two ordered intervals into a newly malloc'd auxiliary buffer.

Merge sort performance:
Time complexity: O(N*logN);
Space complexity: O(N). Because merge sort works through auxiliary space, it is also the classic choice for external sorting on disk.

The complete code is as follows:

void _MergeSort(int* a, int left, int right, int* tmp)
{
	if (left >= right)
		return;
	int mid = (left + right) / 2;

	_MergeSort(a, left, mid, tmp);
	_MergeSort(a, mid + 1, right, tmp);

	int begin1 = left, end1 = mid;
	int begin2 = mid + 1, end2 = right;
	int i = begin1;
	while (begin1 <= end1 && begin2 <= end2)
	{
		if (a[begin1] > a[begin2])
		{
			tmp[i++] = a[begin2++];
		}
		else
		{
			tmp[i++] = a[begin1++];
		}
	}

	while (begin1 <= end1) // flush whatever is left over in the first interval
		tmp[i++] = a[begin1++];
	while (begin2 <= end2) // flush whatever is left over in the second interval
		tmp[i++] = a[begin2++];

	memcpy(a + left, tmp + left, sizeof(int) * (right - left + 1));
}

(2) Non-recursive merge sort

From the recursive code above we can see that sorting first computes mid to split the data into two intervals, then recurses until an interval holds a single element (which is trivially sorted), and then merges pairs of intervals into the auxiliary buffer.
Can we control the interval sizes without recursion? Yes: define a gap that sets how many elements each interval holds. gap starts at 1, meaning every one-element interval is ordered, and doubles each pass:

	int gap = 1;
	while (gap < size)
	{
		gap *= 2;
	}

Without recursion there is no mid; gap alone determines each pair of intervals, which makes the bookkeeping simpler. The interval arithmetic deserves careful study:

	int gap = 1;
	while (gap < size)
	{
		for (int i = 0; i < size; i += 2 * gap)
		{
			int begin1 = i, end1 = i + gap - 1; // interval control
			int begin2 = i + gap, end2 = i + 2 * gap - 1; // worth studying carefully
			int j = i;
			while (begin1 <= end1 && begin2 <= end2)
			{
				if (a[begin1] > a[begin2])
				{
					tmp[j++] = a[begin2++];
				}
				else
				{
					tmp[j++] = a[begin1++];
				}
			}

			while (begin1 <= end1)
				tmp[j++] = a[begin1++];
			while (begin2 <= end2)
				tmp[j++] = a[begin2++];

			memcpy(a + i, tmp + i, sizeof(int) * (end2 - i + 1)); // copy this merged pair back
		}

At this point a question naturally arises: what if the element count is odd, or otherwise does not split into exact pairs of gap-sized intervals?
We must check whether each interval runs off the end of the array right after its bounds are computed.

	int begin1 = i, end1 = i + gap - 1; // interval control
	int begin2 = i + gap, end2 = i + 2 * gap - 1;

With the bounds as above, and i running from 0 to size - 1, any of end1, begin2, and end2 can go out of bounds (begin1 = i never can). Let's handle the cases one by one.
There are three situations:

  1. end1 is out of bounds; then begin2 and end2 must be out of bounds too. Only part of the left interval is valid, so there is nothing to merge; the pairing at this gap is done, and we simply break out of this pass.
  2. begin2 and end2 are out of bounds. Only the left interval is valid; again there is nothing to merge, so we break, as in the first case.
  3. Only end2 is out of bounds. We just clamp end2 back into the valid range.

Adjust the code as follows:
		int begin1 = i, end1 = i + gap - 1; // interval control
		int begin2 = i + gap, end2 = i + 2 * gap - 1;
		if (end1 >= size || begin2 >= size)
		{
			break;
		}
		else if (end2 >= size)
		{
			end2 = size - 1;
		}

The complete code is as follows:

void _MergeSortNon6(int* a, int size)
{
	int* tmp = (int*)malloc(sizeof(int) * size);
	if (tmp == NULL)
	{
		perror("malloc fail");
		return;
	}
	int gap = 1;
	while (gap < size)
	{
		for (int i = 0; i < size; i += 2 * gap)
		{
			int begin1 = i, end1 = i + gap - 1; // interval control
			int begin2 = i + gap, end2 = i + 2 * gap - 1;
			if (end1 >= size || begin2 >= size)
			{
				break;
			}
			else if (end2 >= size)
			{
				end2 = size - 1;
			}

			int j = i;
			while (begin1 <= end1 && begin2 <= end2)
			{
				if (a[begin1] > a[begin2])
				{
					tmp[j++] = a[begin2++];
				}
				else
				{
					tmp[j++] = a[begin1++];
				}
			}

			while (begin1 <= end1)
				tmp[j++] = a[begin1++];
			while (begin2 <= end2)
				tmp[j++] = a[begin2++];

			memcpy(a + i, tmp + i, sizeof(int) * (end2 - i + 1)); // copy this merged pair back
		}

		gap *= 2;
	}
	free(tmp);
}

The code above copies each merged pair back as it goes; alternatively, the whole pass can be copied back once outside the for loop, though controlling the interval for a single copy is fiddlier, so I won't go into it here. The full code is available in my Gitee repository.

8. Counting sort

Counting sort is a non-comparison sort, suited to integer arrays whose values fall within a small range.

Counting sort performance:
time complexity: O(N+range) ;
space complexity: O(range) ;

The complete code is as follows:

void CountSort(int* a, int n)
{
	int max = a[0], min = a[0];

	for (int i = 0; i < n; i++)
	{
		if (a[i] > max)
		{
			max = a[i];
		}
		if (a[i] < min)
		{
			min = a[i];
		}
	}

	int range = max - min + 1;
	int* countA = (int*)malloc(sizeof(int) * range);
	assert(countA);

	memset(countA, 0, sizeof(int) * range);

	for (int i = 0; i < n; i++)
	{
		countA[a[i] - min]++;
	}
	int j = 0;
	for (int i = 0; i < range; i++)
	{
		while (countA[i]--)
		{
			a[j++] = i + min;
		}
	}
	free(countA);
}

9. Stability analysis of the eight sorts

A sort is stable if elements with equal keys keep their original relative order after sorting.
Bubble sort settles one element into its final position per pass; by swapping only on strict inequality, equal elements never cross, so bubble sort can be made stable.
Selection sort's instability is easy to miss. Consider the sequence 2, 2, 1: the minimum 1 is swapped with the first 2, which moves that 2 behind the second 2 and changes the relative order of the two equal values.
Direct insertion sort can likewise avoid shifting on equality, so it can be made stable.
Shell sort may split equal values into different gap groups, which makes it unstable.
Heap sort may reorder equal values during heap adjustment, so it is unstable.
Merge sort, by taking from the left interval on ties when writing into the auxiliary buffer, preserves the original order, so it is stable.
Quick sort is like selection sort: the swaps around the key can reorder equal values, so it is unstable.

This is the end of the article, feel free to private message or leave a message in the comment area if you have any questions!


Origin blog.csdn.net/qq_43289447/article/details/130078910