Introduction to Data Structures (C Language): Hand-Writing the Eight Classic Sorts in One Article


The concept of sorting

Sorting : arranging a sequence of records in increasing or decreasing order according to the size of one or more of their keys.
Stability : assume the sequence to be sorted contains multiple records with equal keys. If, after sorting, the relative order of these records is unchanged — that is, if r[i] = r[j] and r[i] precedes r[j] in the original sequence, then r[i] still precedes r[j] in the sorted sequence — the sorting algorithm is called stable; otherwise it is called unstable.
Internal sorting : a sort in which all data elements are held in memory.
External sorting : there are too many data elements to hold in memory at once, so during the sort data is moved between internal memory and external storage as the sorting process requires.
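To make the stability definition concrete, here is a small illustrative sketch (the `Rec` struct with its `key` and `id` fields is invented for this example, not taken from the text): a stable insertion sort never moves a record past one with an equal key, so records with equal keys keep their original relative order.

```c
#include <assert.h>

typedef struct { int key; int id; } Rec;

/* Stable insertion sort by key: the strict > comparison means a record
   is never shifted past another record with an equal key. */
static void StableInsertSort(Rec* r, int n)
{
	for (int i = 0; i < n - 1; ++i)
	{
		int end = i;
		Rec x = r[end + 1];
		while (end >= 0 && r[end].key > x.key)  /* strict >, so equal keys stay put */
		{
			r[end + 1] = r[end];
			--end;
		}
		r[end + 1] = x;
	}
}
```

After sorting, records that shared a key still appear in their original input order, which is exactly what the stability definition requires.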

Common Sorting Algorithms


Implementation of the sorting algorithm

1. Direct insertion sort

Direct insertion sort is a simple insertion-sort method. Its basic idea is:
insert the records to be sorted one by one, by key value, into an already sorted sequence until all records have been inserted, yielding a new ordered sequence.
In everyday life, arranging a hand of poker cards uses exactly the idea of insertion sort.
When inserting the i-th element (i >= 1), the elements array[0], array[1], ..., array[i-1] are already sorted. Compare the key of array[i] with those of array[i-1], array[i-2], ... to find the insertion position, shift the elements behind that position back by one, and then insert array[i].

The code is as follows:

#include <assert.h>

void InsertSort(int* a, int n)
{
	assert(a);

	for (int i = 0; i < n - 1; ++i)
	{
		// insert x into the sorted interval [0, end]
		int end = i;
		int x = a[end + 1];
		while (end >= 0)
		{
			if (a[end] > x)
			{
				a[end + 1] = a[end];
				--end;
			}
			else
			{
				break;
			}
		}
		a[end + 1] = x;
	}
}

Direct insertion sort is fairly easy to understand, so I won't go into more detail here.
Summary of the characteristics of direct insertion sort:

  1. The closer the element set is to order, the higher the time efficiency of the direct insertion sort algorithm
  2. Time complexity: O(N^2)
  3. Space complexity: O(1)
  4. Stability: Stable
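As a quick sanity check of the characteristics above, here is a hedged sketch: the section's InsertSort (slightly condensed, with the comparison folded into the loop condition) plus an `IsSorted` helper invented for the test.

```c
#include <assert.h>

/* Same direct insertion sort as above, condensed. */
void InsertSort(int* a, int n)
{
	for (int i = 0; i < n - 1; ++i)
	{
		int end = i;
		int x = a[end + 1];
		while (end >= 0 && a[end] > x)
		{
			a[end + 1] = a[end];
			--end;
		}
		a[end + 1] = x;
	}
}

/* Helper for the test: returns 1 if a[0..n-1] is in non-decreasing order. */
int IsSorted(const int* a, int n)
{
	for (int i = 1; i < n; ++i)
		if (a[i - 1] > a[i])
			return 0;
	return 1;
}
```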

2. Hill sort

Hill sort is also known as the shrinking-increment method. Its basic idea is: first choose an integer gap, divide the records to be sorted into groups so that all records a distance of gap apart fall into the same group, and sort the records within each group. Then reduce the gap and repeat the grouping and sorting. When the gap reaches 1, all records form a single group and are fully sorted.
The code is as follows:

void ShellSort(int* a, int n)
{
	// pre-sort the data in groups, one group per stride of gap
	int gap = 3;

	for (int j = 0; j < gap; ++j)
	{
		for (int i = j; i < n - gap; i += gap)
		{
			int end = i;
			int x = a[end + gap];
			while (end >= 0)
			{
				if (a[end] > x)
				{
					a[end + gap] = a[end];
					end -= gap;
				}
				else
				{
					break;
				}
			}

			a[end + gap] = x;
		}
	}
}

or

void ShellSort(int* a, int n)
{
	// several rounds of pre-sorting (gap > 1) + direct insertion (gap == 1)
	int gap = n;
	while (gap > 1)
	{
		gap = gap / 3 + 1;

		for (int i = 0; i < n - gap; ++i)
		{
			int end = i;
			int x = a[end + gap];
			while (end >= 0)
			{
				if (a[end] > x)
				{
					a[end + gap] = a[end];
					end -= gap;
				}
				else
				{
					break;
				}
			}

			a[end + gap] = x;
		}
	}
}

In the first of the two writings the gap is a fixed constant, which is a defect; the second adjusts the gap as needed while the range shrinks. Notice that when gap = 1 it is exactly direct insertion sort, so Hill sort can be seen as an optimization of direct insertion sort.
Summary of the characteristics of Hill sorting:

  1. Hill sort is an optimization of direct insertion sort.
  2. When gap > 1 the passes are pre-sorting, whose purpose is to bring the array closer to order. When gap == 1 the array is already nearly ordered, so the final direct-insertion pass is very fast. Together this gives an overall speedup.
  3. The time complexity of Hill sort is hard to state exactly, because it depends on the chosen gap sequence. It is commonly quoted as roughly O(n^1.25) to O(1.6*n^1.25).
  4. Stability: Unstable
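The gap progression of the second writing can be made visible. A small sketch (the helper name `GapSequence` is mine; the values it records come straight from the `gap = gap / 3 + 1` update in the code above):

```c
#include <assert.h>

/* Record the sequence of gaps that the second ShellSort writing walks
   through for an array of size n. Returns how many gaps were used. */
int GapSequence(int n, int* out, int cap)
{
	int count = 0;
	int gap = n;
	while (gap > 1)
	{
		gap = gap / 3 + 1;   /* same update as in ShellSort */
		if (count < cap)
			out[count] = gap;
		++count;
	}
	return count;
}
```

Because `gap / 3 + 1` can never drop below 1, the sequence always ends with gap = 1, guaranteeing the final direct-insertion pass.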

3. Selection sort

The basic idea of selection sort : each time, select the smallest (or largest) element from the data elements to be sorted and store it at the start (or end) of the sequence, until all elements to be sorted are placed. Direct selection sort:

  1. In the set array[i]–array[n-1], select the elements with the largest and smallest keys.
  2. If they are not the last (first) element of the set, exchange them with the last (first) element of the set.
  3. In the remaining set array[i]–array[n-2] (array[i+1]–array[n-1]), repeat the above steps until one element remains.
First, here is an exchange (swap) function, which the later sorts will also use:

void Swap(int* px, int* py)
{
	int tmp = *px;
	*px = *py;
	*py = tmp;
}

The sorting code is as follows:

void SelectSort(int* a, int n)
{
	int begin = 0, end = n - 1;

	while (begin < end)
	{
		// select both the minimum and the maximum in one pass
		int mini = begin, maxi = begin;
		for (int i = begin; i <= end; ++i)
		{
			if (a[i] < a[mini])
				mini = i;

			if (a[i] > a[maxi])
				maxi = i;
		}
		Swap(&a[begin], &a[mini]);

		// if the maximum was at begin, the swap above just moved it to mini
		if (begin == maxi)
			maxi = mini;

		Swap(&a[end], &a[maxi]);

		++begin;
		--end;
	}
}

Summary of features of direct selection sort:

  1. Direct selection sort is very easy to understand, but its efficiency is poor; it is rarely used in practice
  2. Time complexity: O(N^2)
  3. Space complexity: O(1)
  4. Stability: Unstable
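One subtle point in the code above is the `if (begin == maxi) maxi = mini;` guard: when the maximum happens to sit at `begin`, the first swap moves it to `mini`, so `maxi` must be redirected before the second swap. A sketch reproducing the section's SelectSort and exercising exactly that case:

```c
#include <assert.h>

static void Swap(int* px, int* py)
{
	int tmp = *px;
	*px = *py;
	*py = tmp;
}

/* Same double-ended selection sort as in the section above. */
void SelectSort(int* a, int n)
{
	int begin = 0, end = n - 1;
	while (begin < end)
	{
		int mini = begin, maxi = begin;
		for (int i = begin; i <= end; ++i)
		{
			if (a[i] < a[mini])
				mini = i;
			if (a[i] > a[maxi])
				maxi = i;
		}
		Swap(&a[begin], &a[mini]);
		/* the maximum was at begin and has just been moved to mini */
		if (begin == maxi)
			maxi = mini;
		Swap(&a[end], &a[maxi]);
		++begin;
		--end;
	}
}
```

The test array starts with its maximum in the first slot; without the guard, the wrong element would be swapped to the end.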

4. Heap sort

**Heap sort (Heapsort)** is a sorting algorithm designed around the heap data structure; it is a kind of selection sort that uses the heap to select the data.
The code is as follows:

void AdjustDown(int* a, int n, int parent)	// sift down
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		// pick the larger of the two children (we are building a max heap)
		if (child + 1 < n && a[child + 1] > a[child])
		{
			++child;
		}

		// if the larger child is greater than the parent, swap and keep sifting down
		if (a[child] > a[parent])
		{
			Swap(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
		{
			break;
		}
	}
}

// heap sort -- O(N*logN)
void HeapSort(int* a, int n)
{
	// build the heap: O(N)
	for (int i = (n - 1 - 1) / 2; i >= 0; --i)
	{
		AdjustDown(a, n, i);
	}

	// repeatedly move the heap top to the end: O(N*logN)
	int end = n - 1;
	while (end > 0)
	{
		Swap(&a[0], &a[end]);
		AdjustDown(a, end, 0);
		--end;
	}
}

Note that ascending order requires building a large (max) heap, and descending order a small (min) heap; the code above sorts in ascending order.
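To illustrate the max-heap/min-heap remark, here is a hedged sketch of the descending-order variant: the comparisons in the sift-down are flipped so a min heap is built (the names `AdjustDownMin` and `HeapSortDesc` are mine):

```c
#include <assert.h>

static void Swap(int* px, int* py)
{
	int tmp = *px;
	*px = *py;
	*py = tmp;
}

/* Sift-down for a MIN heap: pick the smaller child, swap while child < parent. */
static void AdjustDownMin(int* a, int n, int parent)
{
	int child = parent * 2 + 1;
	while (child < n)
	{
		if (child + 1 < n && a[child + 1] < a[child])
			++child;
		if (a[child] < a[parent])
		{
			Swap(&a[child], &a[parent]);
			parent = child;
			child = parent * 2 + 1;
		}
		else
			break;
	}
}

/* Heap sort in DESCENDING order via a min heap: the minimum is
   repeatedly moved to the end, so the back fills up with small values. */
void HeapSortDesc(int* a, int n)
{
	for (int i = (n - 1 - 1) / 2; i >= 0; --i)
		AdjustDownMin(a, n, i);

	int end = n - 1;
	while (end > 0)
	{
		Swap(&a[0], &a[end]);
		AdjustDownMin(a, end, 0);
		--end;
	}
}
```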
Summary of the characteristics of heap sort:

  1. Heap sort uses the heap to select the next element, which makes it much more efficient.
  2. Time complexity: O(N*logN)
  3. Space complexity: O(1)
  4. Stability: Unstable

5. Bubble sort

The basic idea of exchange sorting : compare the keys of two records in the sequence and exchange the records' positions according to the result of the comparison. Its characteristic: records with larger keys move toward the tail of the sequence, while records with smaller keys move toward the front.
Bubble sort:
The code is as follows:

void BubbleSort(int* a, int n)
{
	int end = n;
	while (end > 0)
	{
		int exchange = 0;
		for (int i = 1; i < end; ++i)
		{
			if (a[i - 1] > a[i])
			{
				exchange = 1;
				Swap(&a[i - 1], &a[i]);
			}
		}
		--end;

		// no exchange in this pass: the array is already sorted
		if (exchange == 0)
		{
			break;
		}
	}
}

Summary of the characteristics of bubble sort:

  1. Bubble sort is a very easy sort to understand
  2. Time complexity: O(N^2)
  3. Space complexity: O(1)
  4. Stability: Stable
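The `exchange` flag is what gives bubble sort its O(N) best case: on an already sorted array the first pass performs no swap and the loop exits immediately. A sketch that counts passes to make this visible (the pass counter is added purely for illustration):

```c
#include <assert.h>

static void Swap(int* px, int* py)
{
	int tmp = *px;
	*px = *py;
	*py = tmp;
}

/* Same bubble sort as above, but returns how many passes actually ran. */
int BubbleSortPasses(int* a, int n)
{
	int passes = 0;
	int end = n;
	while (end > 0)
	{
		int exchange = 0;
		for (int i = 1; i < end; ++i)
		{
			if (a[i - 1] > a[i])
			{
				exchange = 1;
				Swap(&a[i - 1], &a[i]);
			}
		}
		++passes;
		--end;
		if (exchange == 0)
			break;
	}
	return passes;
}
```

On a sorted input the function returns 1 (one scan, no swaps); on a reversed input it runs the full quadratic number of passes.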

6. Quick sort

Quick sort is an exchange sort with a binary-tree structure, proposed by Hoare in 1962. Its basic idea: take some element of the sequence to be sorted as the reference value (key) and partition the set into two subsequences according to it, so that every element of the left subsequence is less than the reference value and every element of the right subsequence is greater. Then repeat the process on the left and right subsequences until every element is in its final position.

1. Recursive version

① Median-of-three function

The code is as follows:

int GetMidIndex(int* a, int left, int right)
{
	int mid = left + ((right - left) >> 1);
	if (a[left] < a[mid])
	{
		if (a[mid] < a[right])
		{
			return mid;
		}
		else if (a[left] > a[right])
		{
			return left;
		}
		else
		{
			return right;
		}
	}
	else
	{
		if (a[mid] > a[right])
		{
			return mid;
		}
		else if (a[left] < a[right])
		{
			return left;
		}
		else
		{
			return right;
		}
	}
}

The purpose of the median-of-three is to avoid the worst case: if the first element were always chosen as the key, an already ordered array would degenerate quick sort to O(N^2). Picking the median of the first, middle, and last elements turns that worst case into a good one.

② Hoare version


The code is as follows:

int Partion1(int* a, int left, int right)
{
	int mini = GetMidIndex(a, left, right);
	Swap(&a[mini], &a[left]);

	int keyi = left;
	while (left < right)
	{
		// the right side moves first, looking for an element smaller than the key
		while (left < right && a[right] >= a[keyi])
			--right;

		// then the left side moves, looking for an element larger than the key
		while (left < right && a[left] <= a[keyi])
			++left;

		Swap(&a[left], &a[right]);
	}

	Swap(&a[left], &a[keyi]);

	return left;
}

The main idea: take the first element as the key (after the median-of-three swap). The right pointer moves first from the end of the array until it finds an element smaller than the key; then the left pointer moves until it finds an element larger than the key; the two are exchanged. This repeats until the two pointers meet, at which point the key is exchanged with the element at the meeting position, completing one round of partitioning. The same operation is then applied to the values on both sides of the pivot until the whole array is sorted. Note that all three writings here use the median-of-three optimization.

③ Pit-digging method


The code is as follows:

int Partion2(int* a, int left, int right)
{
	int mini = GetMidIndex(a, left, right);
	Swap(&a[mini], &a[left]);

	int key = a[left];
	int pivot = left;
	while (left < right)
	{
		// find a smaller element from the right and drop it into the pit on the left
		while (left < right && a[right] >= key)
		{
			--right;
		}

		a[pivot] = a[right];
		pivot = right;

		// find a larger element from the left and drop it into the pit on the right
		while (left < right && a[left] <= key)
		{
			++left;
		}
		a[pivot] = a[left];
		pivot = left;
	}

	a[pivot] = key;
	return pivot;
}

The idea: first save the value of the first element as the key, and treat that first position as the initial pit. Scanning from the right, find an element smaller than the key and drop it into the pit; the position that element vacated becomes the new pit. Then scan from the left for an element larger than the key and drop it into the pit on the right. Repeat this process until the two sides meet; the last pit is filled with the value saved in key. The final recursion steps are the same as above.

④ Front and rear pointer version


The code is as follows:

int Partion3(int* a, int left, int right)
{
	int mini = GetMidIndex(a, left, right);
	Swap(&a[mini], &a[left]);

	int keyi = left;
	int prev = left;
	int cur = prev + 1;
	while (cur <= right)
	{
		if (a[cur] < a[keyi] && ++prev != cur)
		{
			Swap(&a[cur], &a[prev]);
		}

		++cur;
	}

	Swap(&a[prev], &a[keyi]);
	return prev;
}

The basic idea of this algorithm: take the first element as the key and keep two pointers, prev (at the first element) and cur (at the second). cur scans forward; every time it finds an element smaller than the key, prev advances one step and, if prev and cur are at different positions, their elements are exchanged. In effect the elements smaller than the key are pushed toward the front while the larger ones are rolled toward the back. When cur runs off the end of the range, the key is exchanged with the element at prev, which is the pivot position; the recursion is the same as above.

⑤ Quick sort main function

void QuickSort(int* a, int left, int right)
{
	if (left >= right)
		return;

	int keyi = Partion1(a, left, right);
	//int keyi = Partion2(a, left, right);
	//int keyi = Partion3(a, left, right);
	QuickSort(a, left, keyi - 1);
	QuickSort(a, keyi + 1, right);
}

Disadvantages of recursive programs:

  1. With early compilers, recursive programs performed worse than equivalent loops;
  2. If the recursion gets too deep, the call stack overflows (for example, when all the numbers in the array are equal).

2. Non-recursive version

The non-recursive version uses a stack. In C the stack has to be implemented by hand; here I reuse the stack code from my earlier blog post on the stack: its introduction and interface implementation.
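For readers who don't have that stack post at hand, here is a minimal array-based sketch matching the interface used below (`ST`, `StackInit`, `StackPush`, `StackPop`, `StackTop`, `StackEmpty`, `StackDestroy`); the growth policy and error handling are my own assumptions, not the original blog's code:

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
	int* data;
	int top;       /* number of elements currently stored */
	int capacity;
} ST;

void StackInit(ST* st)
{
	st->data = NULL;
	st->top = 0;
	st->capacity = 0;
}

void StackPush(ST* st, int x)
{
	if (st->top == st->capacity)
	{
		/* double the capacity, starting from 4 */
		int newcap = st->capacity == 0 ? 4 : st->capacity * 2;
		int* tmp = (int*)realloc(st->data, sizeof(int) * newcap);
		if (tmp == NULL)
			exit(-1);
		st->data = tmp;
		st->capacity = newcap;
	}
	st->data[st->top++] = x;
}

int StackTop(ST* st)
{
	assert(st->top > 0);
	return st->data[st->top - 1];
}

void StackPop(ST* st)
{
	assert(st->top > 0);
	--st->top;
}

int StackEmpty(ST* st)
{
	return st->top == 0;
}

void StackDestroy(ST* st)
{
	free(st->data);
	st->data = NULL;
	st->top = st->capacity = 0;
}
```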
The code is as follows:

void QuickSortNonR(int* a, int left, int right)
{
	ST st;
	StackInit(&st);
	StackPush(&st, left);
	StackPush(&st, right);

	while (!StackEmpty(&st))
	{
		int end = StackTop(&st);
		StackPop(&st);

		int begin = StackTop(&st);
		StackPop(&st);

		int keyi = Partion3(a, begin, end);
		if (keyi + 1 < end)
		{
			StackPush(&st, keyi + 1);
			StackPush(&st, end);
		}

		if (begin < keyi - 1)
		{
			StackPush(&st, begin);
			StackPush(&st, keyi - 1);
		}
	}

	StackDestroy(&st);
}

Summary of quick sort features:

  1. The all-round performance and range of usage scenarios of quick sort are both very good, which is why it deserves the name quick sort
  2. Time complexity: O(N*logN)
  3. Space complexity: O(logN)
  4. Stability: Unstable

7. Merge sort

The basic idea of merge sort:
Merge sort (MERGE-SORT) is an efficient sorting algorithm built on the merge operation and a very typical application of divide and conquer. First make each subsequence ordered, then merge the ordered subsequences to obtain a completely ordered sequence. Merging two sorted lists into one sorted list is called a two-way merge. The core steps of merge sort:

1. Recursive version

The code is as follows:

void _MergeSort(int* a, int left, int right, int* tmp)
{
	if (left >= right)
	{
		return;
	}

	int mid = (left + right) / 2;
	_MergeSort(a, left, mid, tmp);
	_MergeSort(a, mid + 1, right, tmp);

	// merge the two sorted halves [left, mid] and [mid+1, right] into tmp
	int begin1 = left, end1 = mid;
	int begin2 = mid + 1, end2 = right;
	int i = left;
	while (begin1 <= end1 && begin2 <= end2)
	{
		if (a[begin1] < a[begin2])
		{
			tmp[i++] = a[begin1++];
		}
		else
		{
			tmp[i++] = a[begin2++];
		}
	}

	while (begin1 <= end1)
	{
		tmp[i++] = a[begin1++];
	}

	while (begin2 <= end2)
	{
		tmp[i++] = a[begin2++];
	}

	// copy the merged range back to the original array
	for (int j = left; j <= right; ++j)
	{
		a[j] = tmp[j];
	}
}

void MergeSort(int* a, int n)
{
	int* tmp = (int*)malloc(sizeof(int) * n);
	if (tmp == NULL)
	{
		printf("malloc fail\n");
		exit(-1);
	}

	_MergeSort(a, 0, n - 1, tmp);

	free(tmp);
	tmp = NULL;
}

2. Non-recursive version

The code is as follows:

void MergeSortNonR(int* a, int n)
{
	int* tmp = (int*)malloc(sizeof(int) * n);
	if (tmp == NULL)
	{
		printf("malloc fail\n");
		exit(-1);
	}

	int gap = 1;
	while (gap < n)
	{
		for (int i = 0; i < n; i += 2 * gap)
		{
			int begin1 = i, end1 = i + gap - 1;
			int begin2 = i + gap, end2 = i + 2 * gap - 1;

			// the second half does not exist: nothing to merge
			if (end1 >= n || begin2 >= n)
			{
				break;
			}

			// end2 is out of range but a merge is still needed: clamp end2
			if (end2 >= n)
			{
				end2 = n - 1;
			}

			int index = i;
			while (begin1 <= end1 && begin2 <= end2)
			{
				if (a[begin1] < a[begin2])
				{
					tmp[index++] = a[begin1++];
				}
				else
				{
					tmp[index++] = a[begin2++];
				}
			}

			while (begin1 <= end1)
			{
				tmp[index++] = a[begin1++];
			}

			while (begin2 <= end2)
			{
				tmp[index++] = a[begin2++];
			}

			// copy the merged sub-range back to the original array
			for (int j = i; j <= end2; ++j)
			{
				a[j] = tmp[j];
			}
		}

		gap *= 2;
	}

	free(tmp);
	tmp = NULL;
}

Summary of the characteristics of merge sort:

  1. The drawback of merge sort is that it requires O(N) auxiliary space; the merge-sort way of thinking is more often used to solve the external-sorting problem on disk.
  2. Time complexity: O(N*logN)
  3. Space complexity: O(N)
  4. Stability: Stable

8. Non-comparative sorting

Idea: counting sort, also known as the pigeonhole principle, is a modified application of the direct-addressing hash method. Steps:

  1. Count the number of occurrences of each element
  2. Write the elements back into the original sequence according to the counts

1. Radix sort

Radix sort is defined by the maximum number of digits and the radix. In the example here the maximum number of digits is 3 and the digits 0-9 form the base, so the radix is 10. Take the array (278, 109, 63, 930, 589, 184, 505, 269, 8, 83): starting with the ones digit, each number is pushed into the bucket 0-9 matching that digit, and the buckets are then emptied back into the array in order 0-9. The same pass is then repeated on the tens digit, and then on the hundreds digit, each pass working exactly as before.
The final sorting result is (8, 63, 83, 109, 184, 269, 278, 505, 589, 930).
This version is written in C++ for convenient use of std::queue. Friends who want to write it in pure C can refer to the blogger's earlier post on queues; after adapting the queue calls, the steps are almost identical.
The code is as follows:

#define _CRT_SECURE_NO_WARNINGS 1
#include <iostream>
#include <stdio.h>
#include <queue>
using namespace std;

#define K 3        // maximum number of digits
#define RADIX 10   // radix: digits 0-9

// one queue (bucket) per radix digit
queue<int> Q[RADIX];

// extract the k-th digit of value (k = 0 is the ones digit)
int GetKey(int value, int k)
{
	int key = 0;
	while (k >= 0)
	{
		key = value % 10;
		value /= 10;
		k--;
	}
	return key;
}

// distribute the data into the buckets by the k-th digit
void Distribute(int arr[], int left, int right, int k)
{
	for (int i = left; i < right; ++i)
	{
		int key = GetKey(arr[i], k);
		Q[key].push(arr[i]);
	}
}

// collect the data back from the buckets in order 0-9,
// writing from index left so the sorted range stays [left, right)
void Collect(int arr[], int left)
{
	int k = left;
	for (int i = 0; i < RADIX; ++i)
	{
		while (!Q[i].empty())
		{
			arr[k++] = Q[i].front();
			Q[i].pop();
		}
	}
}

void RadixSort(int arr[], int left, int right)	// [left, right)
{
	for (int i = 0; i < K; ++i)
	{
		Distribute(arr, left, right, i);
		Collect(arr, left);
	}
}

Summary of the characteristics of radix sorting:

  1. Time complexity: O(d*n), where d is the number of key digits
  2. Space complexity: O(n)
  3. Stability: Stable

2. Counting sort

The idea of counting sort is a variant of the radix-sort idea. If the array values fall in a small fixed range such as 1-9, the count array can be an absolute mapping, same idea as radix sort. For an arbitrary range a relative mapping is used instead: the first counter corresponds to the minimum value of the array and the last counter to the maximum.

The code is as follows:

void CountSort(int* a, int n)
{
	// find the range [min, max] of the data
	int max = a[0], min = a[0];
	for (int i = 1; i < n; ++i)
	{
		if (a[i] > max)
		{
			max = a[i];
		}

		if (a[i] < min)
		{
			min = a[i];
		}
	}

	int range = max - min + 1;
	int* count = (int*)malloc(sizeof(int) * range);
	if (count == NULL)
	{
		printf("malloc fail\n");
		exit(-1);
	}
	memset(count, 0, sizeof(int) * range);

	// count occurrences: a[i] maps to bucket a[i] - min (relative mapping)
	for (int i = 0; i < n; ++i)
	{
		count[a[i] - min]++;
	}

	// write the values back in order
	int j = 0;
	for (int i = 0; i < range; ++i)
	{
		while (count[i]--)
		{
			a[j++] = i + min;
		}
	}

	free(count);
}

Summary of features of counting sort:

  1. Counting sort is highly efficient when the data range is concentrated, but its applicable scope and scenarios are limited.
  2. Time complexity: O(MAX(N, range)), i.e. O(N + range)
  3. Space complexity: O(range)
  4. Stability: Stable
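The relative mapping `a[i] - min` is what lets counting sort handle ranges that do not start at 0, including negative values. A sketch reproducing the section's CountSort (with the NULL check done before memset and the count buffer freed) on exactly such data:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

void CountSort(int* a, int n)
{
	/* find the range [min, max] of the data */
	int max = a[0], min = a[0];
	for (int i = 1; i < n; ++i)
	{
		if (a[i] > max)
			max = a[i];
		if (a[i] < min)
			min = a[i];
	}

	int range = max - min + 1;
	int* count = (int*)malloc(sizeof(int) * range);
	if (count == NULL)
	{
		printf("malloc fail\n");
		exit(-1);
	}
	memset(count, 0, sizeof(int) * range);

	/* relative mapping: a[i] goes into bucket a[i] - min */
	for (int i = 0; i < n; ++i)
		count[a[i] - min]++;

	/* write the values back in ascending order */
	int j = 0;
	for (int i = 0; i < range; ++i)
		while (count[i]--)
			a[j++] = i + min;

	free(count);
}
```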

Complexity and Stability Analysis of Sorting Algorithms

sorting method  | average case    | best case | worst case | auxiliary space | stability
selection sort  | O(n^2)          | O(n^2)    | O(n^2)     | O(1)            | unstable
Hill sort       | O(nlogn)~O(n^2) | O(n^1.3)  | O(n^2)     | O(1)            | unstable
insertion sort  | O(n^2)          | O(n)      | O(n^2)     | O(1)            | stable
bubble sort     | O(n^2)          | O(n)      | O(n^2)     | O(1)            | stable
heap sort       | O(nlogn)        | O(nlogn)  | O(nlogn)   | O(1)            | unstable
quick sort      | O(nlogn)        | O(nlogn)  | O(n^2)     | O(logn)         | unstable
merge sort      | O(nlogn)        | O(nlogn)  | O(nlogn)   | O(n)            | stable
radix sort      | O(d*n)          | O(d*n)    | O(d*n)     | O(n)            | stable
counting sort   | O(d+n)          | O(d+n)    | O(d+n)     | O(d)            | stable

Epilogue

Interested friends can follow the author, and if you think the content is good, a like, favorite, and share would be much appreciated!
Creating is not easy; please point out anything that is inaccurate.
Thank you for your visit — your reading is the motivation that keeps me going.
With time as the catalyst, may we all become better versions of ourselves!


Origin blog.csdn.net/kingxzq/article/details/130298502