Basic sorting algorithms: [insertion sort] and [Shell sort]

[Direct insertion sort]

> Basic idea:

Insert the elements to be sorted one by one into an already sorted sequence, in the required order (ascending or descending), until every element has been inserted and a new sorted sequence is obtained.

Arranging a hand of playing cards is an everyday use of the insertion sort idea.

1. Sort

When inserting element a[i], the preceding elements a[0], a[1], ..., a[i-1] are already sorted. To put a[i] in order, compare it with a[i-1], a[i-2], ... in turn: each element that is larger than a[i] moves back one position, and a[i] is inserted where the comparison stops.

By default, the first element of the array counts as already inserted, so a one-element prefix is trivially sorted. To insert the second element, compare it with the first: if it is smaller, the first element moves back one slot and the new element takes the first position. The third element is then the one to insert, and the first two are already in order. Compare it with the second element first: if it is smaller, the second element moves back, leaving the second slot vacant. That vacant slot is not necessarily where the new element goes, because it still has to be compared with the rest of the sorted part. If it is also smaller than the first element, the first element moves back into the second slot and the new element goes into the first. The remaining elements are handled the same way.

When writing a sort, it is best to write a single pass first and then the overall loop. This makes it easier to understand how the sort works.

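// Write a single pass first.
// end and tmp are left unset here; the outer loop in the full function sets them.
		int end;// index of the last element of the sorted part
		int tmp;// the value to insert
		while (end >= 0)// tmp may have to be compared with every sorted element before its position is known
		{
			if (a[end] > tmp)// the value to insert is smaller than a[end]
			{
				a[end + 1] = a[end];// shift a[end] back one slot, leaving a hole at end
				end--;// next, compare with the second-to-last, third-to-last, ...
			}
			else
			{
				// tmp is not smaller than a[end]; note that the hole is at end + 1
				break;
			}
		}
		// We get here in one of two ways: either a[end] <= tmp and the loop stopped at break,
		// or every sorted element was larger than tmp and end ran down to -1.
		// Either way, tmp belongs at end + 1.
		a[end + 1] = tmp;// one pass done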

The element being inserted always ends up at a[end + 1].

After writing the single pass, we can write the overall sort. To sort an array with insertion sort, first regard the first element as already ordered, then start inserting from the second element.

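// Insertion sort
void Insert(int* a,int n)
{
	for (int i = 1; i < n; i++)// start inserting from the second element; the first counts as already inserted
	{
		int end=i-1;// index of the last element of the sorted part (0 on the first pass)
		int tmp=a[i];// the value to insert, starting with the second element at index i
		while (end >= 0)
		{
			if (a[end] > tmp)
			{
				a[end + 1] = a[end];
				end--;
			}
			else
			{
				// a[end] is not larger than tmp, so the hole is at end + 1
				break;
			}
		}
		// tmp always ends up at end + 1
		a[end + 1] = tmp;// one pass done
	}
}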

That completes insertion sort. Let's try it out.

#include <stdio.h>

int main()
{
	int a[] = { 9,8,7,6,5,4,3,2,1,0 };
	Insert(a, 10);
	for (int i = 0; i < 10; i++)
	{
		printf("%d ", a[i]);
	}
	return 0;
}
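The basic idea mentions both ascending and descending order, while Insert sorts ascending only. As a minimal sketch, the descending variant could look like the following. The name InsertDescending is made up for illustration and is not part of the original code; the only real change is the direction of the comparison.

```c
/* Hypothetical descending counterpart of Insert: same loop, comparison flipped. */
void InsertDescending(int* a, int n)
{
	for (int i = 1; i < n; i++)
	{
		int end = i - 1;   /* last index of the sorted part */
		int tmp = a[i];    /* value to insert */
		while (end >= 0 && a[end] < tmp)
		{
			a[end + 1] = a[end];   /* shift smaller elements back one slot */
			end--;
		}
		a[end + 1] = tmp;          /* the hole is always at end + 1 */
	}
}
```

Calling InsertDescending on { 3,1,4,1,5,9,2,6 } leaves the array as 9 6 5 4 3 2 1 1.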

To test how efficient insertion sort is, randomly generate 100,000 numbers and measure how long the insertion sort algorithm takes to sort them.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
void InsertTest()
{
	srand((unsigned)time(0));
	const int N = 100000;
	int* a1 = (int*)malloc(sizeof(int) * N);
	if (a1 == NULL)// allocation failed, nothing to test
	{
		return;
	}
	for (int i = 0; i < N; ++i)
	{
		a1[i] = rand();
	}
	clock_t begin1 = clock();
	Insert(a1, N);
	clock_t end1 = clock();
	printf("InsertSort:%ld\n", (long)(end1 - begin1));// elapsed time in clock ticks
	free(a1);
}


2. Summary of features

1. The closer the input is to sorted order, the faster direct insertion sort runs

2. The time complexity is: O(N^2)

In the worst case the input is completely reversed, and the time complexity is O(N^2)
In the best case the input is already sorted, and the time complexity is O(N)

3. Space complexity: O(1)

4. Stability: stable
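The best and worst cases above can be checked by counting comparisons. The sketch below is an instrumented copy of Insert; the name InsertCountCompares and the counter are additions for illustration only, not part of the original code.

```c
/* Hypothetical instrumented copy of Insert:
   sorts a and returns how many times a[end] > tmp was evaluated. */
long InsertCountCompares(int* a, int n)
{
	long compares = 0;
	for (int i = 1; i < n; i++)
	{
		int end = i - 1;
		int tmp = a[i];
		while (end >= 0)
		{
			compares++;            /* one comparison against the sorted part */
			if (a[end] > tmp)
			{
				a[end + 1] = a[end];
				end--;
			}
			else
			{
				break;
			}
		}
		a[end + 1] = tmp;
	}
	return compares;
}
```

For N = 10, an already sorted array needs N - 1 = 9 comparisons, while a completely reversed array needs N(N-1)/2 = 45, which matches the O(N) and O(N^2) cases.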

[Shell sort]

1. Sort

> Basic idea:

Shell sort (希尔排序, sometimes rendered as "Hill sort") is also known as the diminishing increment method.
The overall process has two stages: pre-sorting, then the final sort.
First choose an integer gap and divide the data into gap groups, so that all elements a distance of gap apart fall into the same group, and sort the elements within each group. Then choose a smaller gap and repeat the grouping and sorting. When the gap reaches 1, all the data is in a single group and the pass is the real, final sort.
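The grouping rule can be made concrete with a few lines of C. This is only a sketch: AssignGroups is a hypothetical helper that labels each index with its group number, assuming groups are numbered by the index of their first element.

```c
/* Hypothetical helper: group[i] receives the group number (0 .. gap-1) of index i.
   Indices i, i + gap, i + 2*gap, ... all get the same number. */
void AssignGroups(int* group, int n, int gap)
{
	for (int i = 0; i < n; i++)
	{
		group[i] = i % gap;
	}
}
```

With n = 10 and gap = 3, indices 0, 3, 6, 9 form group 0, indices 1, 4, 7 form group 1, and indices 2, 5, 8 form group 2.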

Shell sort is actually an optimization of insertion sort. Insertion sort is slow in general, but analysis shows that it is very fast when the data is already close to sorted. Shell's idea was to first bring the data close to sorted and only then apply direct insertion sort, so that the overall sort is faster.

So Shell sort begins with pre-sorting, whose job is to make the data nearly sorted. How is that done?

Grouped pre-sorting lets large numbers jump toward the back faster and small numbers jump toward the front faster. After several rounds of grouped pre-sorting, the data becomes more and more ordered. The smaller the gap, the more elements each group holds.
The last gap must be 1.

Shell sort divides the data into gap groups, where elements a distance of gap apart belong to the same group, and then runs insertion sort within each group so that each group is ordered. Afterwards the data as a whole is closer to sorted than it was at the start. Why?
Setting a gap speeds up data movement. Without a gap, a large value may have to be moved n-1 times, one slot at a time; with a gap, it moves gap slots per step. The larger the gap, the faster values travel.

Let's write a single pass first, for one group only, for example the group that starts at index 0 (the red group):

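// Shell sort splits the data into gap groups; if gap is 3, there are 3 groups
			for (int i = 0; i < n - gap; i += gap)
			{
				// one pass first, over a single group (the red one)
				int end = i;
				int tmp = a[i + gap];// group members are gap apart, so the next value to insert is at i + gap
				while (end >= 0)
				{
					if (a[end] > tmp)
					{
						// shift a[end] back by gap; end steps by gap each time
						a[end + gap] = a[end];
						end -= gap;
					}
					else
					{
						break;
					}
				}
				// tmp always ends up at a[end + gap]
				a[end + gap] = tmp;
			}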

After writing one pass, the rest is simple. With gap = 3 the array is divided into 3 groups (red, blue, and purple), and each group is handled the same way.
So we wrap the single-group pass in a loop that runs three times, once each for red, blue, and purple.

		for (int j = 0; j < gap; j++)// the array has gap groups and each is handled the same way, so loop gap times
		{
			// Shell sort splits the data into gap groups; if gap is 3, there are 3 groups
			for (int i = j; i < n - gap; i += gap)
			{
				// one pass over the group that starts at index j
				int end = i;
				int tmp = a[i + gap];
				while (end >= 0)
				{
					if (a[end] > tmp)
					{
						// shift a[end] back by gap; end steps by gap each time
						a[end + gap] = a[end];
						end -= gap;
					}
					else
					{
						break;
					}
				}
				// tmp always ends up at a[end + gap]
				a[end + gap] = tmp;
			}
		}
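To see what one pre-sorting pass does to real data, the two loops can be wrapped in a function. This is a sketch only: the name GapPass and the idea of passing gap as a parameter are assumptions for illustration, and the inner while loop folds the break into its condition.

```c
/* Hypothetical wrapper: one pre-sorting pass over all groups for a fixed gap. */
void GapPass(int* a, int n, int gap)
{
	for (int j = 0; j < gap; j++)               /* one iteration per group */
	{
		for (int i = j; i < n - gap; i += gap)  /* insertion sort inside group j */
		{
			int end = i;
			int tmp = a[i + gap];
			while (end >= 0 && a[end] > tmp)
			{
				a[end + gap] = a[end];          /* shift back by gap */
				end -= gap;
			}
			a[end + gap] = tmp;
		}
	}
}
```

Applied to { 9,8,7,6,5,4,3,2,1,0 } with gap = 3, the pass produces 0 2 1 3 5 4 6 8 7 9. The array is not sorted yet, but every element is now at most one slot away from its final position, which is exactly the nearly sorted input that insertion sort handles quickly.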

However, Shell sort is not finished yet. We only assumed that the gap was 3, but what should the gap really be?

In fact, the value of the gap keeps changing, but one thing is certain: it must end at 1.
When the gap is 1, the pass is plain insertion sort. The earlier passes are pre-sorting that brings the data close to order so that the final insertion sort is efficient, which is why the last gap must be 1.
How do we make sure the gap ends at 1?
1. Start with gap = n and divide it by 2 each round (gap = gap / 2); the gap is guaranteed to reach 1.
2. Or divide it by 3 and add 1 each round (gap = gap / 3 + 1); the gap is also guaranteed to reach 1.
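The two rules can be compared by listing the gaps each one visits. The helpers below are a sketch; the names GapsHalving and GapsThirds and the output-array interface are assumptions made for illustration.

```c
/* Hypothetical helpers: store the gaps visited for n elements in out and
   return how many were stored. The caller must provide enough room
   (32 entries is more than enough for any int n). */
int GapsHalving(int n, int* out)
{
	int count = 0;
	int gap = n;
	while (gap > 1)
	{
		gap /= 2;               /* rule 1: halve each round */
		out[count++] = gap;
	}
	return count;
}

int GapsThirds(int n, int* out)
{
	int count = 0;
	int gap = n;
	while (gap > 1)
	{
		gap = gap / 3 + 1;      /* rule 2: divide by 3, then add 1 */
		out[count++] = gap;
	}
	return count;
}
```

For n = 10, halving visits 5, 2, 1 and the divide-by-3 rule visits 4, 2, 1. Note that with rule 2 the loop has to stop on gap > 1: once the gap is 1, gap / 3 + 1 is 1 again, so a loop that only stops when the gap reaches 0 would never end.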

void Shellsort(int *a,int n)
{
	int gap = n;
	while (gap > 1)// stop once the gap == 1 pass has run
	{
		gap/=2;// halve the gap each round; it is guaranteed to reach 1
		for (int j = 0; j < gap; j++)
		{
			for (int i = j; i < n - gap; i += gap)
			{
				// one pass over the group that starts at index j
				int end = i;
				int tmp = a[i + gap];
				while (end >= 0)
				{
					if (a[end] > tmp)
					{
						// shift a[end] back by gap; end steps by gap each time
						a[end + gap] = a[end];
						end -= gap;
					}
					else
					{
						break;
					}
				}
				// tmp always ends up at a[end + gap]
				a[end + gap] = tmp;
			}
		}
	}
}
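A common alternative way to write the same algorithm drops the loop over groups and walks i one step at a time, so the groups are processed interleaved rather than one after another. The sketch below also uses the gap = gap / 3 + 1 rule. The name ShellsortInterleaved is made up here, and this is a variant for comparison, not the article's version.

```c
/* Hypothetical variant of Shellsort: gap = gap / 3 + 1, groups handled interleaved. */
void ShellsortInterleaved(int* a, int n)
{
	int gap = n;
	while (gap > 1)
	{
		gap = gap / 3 + 1;                     /* always ends at 1 */
		for (int i = 0; i < n - gap; i++)      /* i++ visits every group in turn */
		{
			int end = i;
			int tmp = a[i + gap];
			while (end >= 0 && a[end] > tmp)
			{
				a[end + gap] = a[end];
				end -= gap;
			}
			a[end + gap] = tmp;
		}
	}
}
```

Each element is still only compared with members of its own group, because end moves in steps of gap. The work done is the same as with the explicit group loop; the code just has one loop level fewer.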

Here we test the efficiency of Shell sort:

void ShellTest()
{
	srand((unsigned)time(0));
	const int N = 100000;
	int* a2 = (int*)malloc(sizeof(int) * N);
	if (a2 == NULL)// allocation failed, nothing to test
	{
		return;
	}
	for (int i = 0; i < N; ++i)
	{
		a2[i] = rand();
	}
	clock_t begin1 = clock();
	Shellsort(a2, N);
	clock_t end1 = clock();
	printf("Shellsort:%ld\n", (long)(end1 - begin1));// elapsed time in clock ticks
	free(a2);
}

Let's compare the efficiency of insertion sort and Shell sort on the same data.

The comparison shows that Shell sort is far more efficient than insertion sort.
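The two test functions above each generate their own random array, so they do not literally sort the same data. One way to make the comparison fair is to fill one array, copy it, and time each sort on its own copy. The following is a self-contained sketch under that assumption; CompareOnSameData and the two static sort functions are hypothetical names, and the sorts are compact restatements of the ones in this article.

```c
#include <stdlib.h>
#include <string.h>
#include <time.h>

static void InsertionSortSketch(int* a, int n)
{
	for (int i = 1; i < n; i++)
	{
		int end = i - 1;
		int tmp = a[i];
		while (end >= 0 && a[end] > tmp)
		{
			a[end + 1] = a[end];
			end--;
		}
		a[end + 1] = tmp;
	}
}

static void ShellSortSketch(int* a, int n)
{
	int gap = n;
	while (gap > 1)
	{
		gap /= 2;
		for (int i = 0; i < n - gap; i++)
		{
			int end = i;
			int tmp = a[i + gap];
			while (end >= 0 && a[end] > tmp)
			{
				a[end + gap] = a[end];
				end -= gap;
			}
			a[end + gap] = tmp;
		}
	}
}

/* Sorts two copies of the same random data (n > 0) and reports CPU seconds for each.
   Returns 1 if both results are identical, 0 if they differ, -1 if allocation fails. */
int CompareOnSameData(int n, double* insert_seconds, double* shell_seconds)
{
	int* a1 = (int*)malloc(sizeof(int) * n);
	int* a2 = (int*)malloc(sizeof(int) * n);
	if (a1 == NULL || a2 == NULL)
	{
		free(a1);
		free(a2);
		return -1;
	}
	for (int i = 0; i < n; ++i)
	{
		a1[i] = rand();
	}
	memcpy(a2, a1, sizeof(int) * n);    /* both sorts see identical input */

	clock_t begin = clock();
	InsertionSortSketch(a1, n);
	*insert_seconds = (double)(clock() - begin) / CLOCKS_PER_SEC;

	begin = clock();
	ShellSortSketch(a2, n);
	*shell_seconds = (double)(clock() - begin) / CLOCKS_PER_SEC;

	int same = memcmp(a1, a2, sizeof(int) * n) == 0;
	free(a1);
	free(a2);
	return same;
}
```

A caller would seed rand() once with srand, call CompareOnSameData(100000, &t1, &t2), and print the two times. The actual numbers depend on the machine and compiler settings, so none are quoted here.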

2. Summary of features

1. Shell sort is an optimization of direct insertion sort
2. When gap > 1 the passes are pre-sorting, whose purpose is to bring the array closer to order. When gap == 1 the array is already close to order, so the final insertion sort is very fast. That is where the optimization comes from.
3. The time complexity of Shell sort is hard to calculate, because there are many ways to choose the gap values and the result depends on the choice. So Shell sort has no single fixed time complexity.

It is generally believed, however, that the time complexity of Shell sort is on the order of O(n^1.3), somewhere between O(n*logn) and O(n^2).
4. Stability: Unstable.
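The instability can be shown with a small example in which two records share a key. The sketch below sorts records by key only and uses a tag to remember the original order; the Record type and the name ShellsortRecords are assumptions made for this illustration.

```c
typedef struct
{
	int key;    /* the value the sort compares */
	char tag;   /* records the original order of equal keys */
} Record;

/* Hypothetical record version of Shell sort (gap halves each round), comparing keys only. */
void ShellsortRecords(Record* a, int n)
{
	int gap = n;
	while (gap > 1)
	{
		gap /= 2;
		for (int i = 0; i < n - gap; i++)
		{
			int end = i;
			Record tmp = a[i + gap];
			while (end >= 0 && a[end].key > tmp.key)
			{
				a[end + gap] = a[end];   /* this jump can skip over an equal key */
				end -= gap;
			}
			a[end + gap] = tmp;
		}
	}
}
```

Starting from (5,a) (5,b) (1,c) (9,d), the gap = 2 pass moves (5,a) from index 0 to index 2, jumping over (5,b), and the final result is (1,c) (5,b) (5,a) (9,d). The two records with key 5 have swapped their relative order. Direct insertion sort on the same input gives (1,c) (5,a) (5,b) (9,d), because it only shifts elements one slot at a time and never moves an element past an equal one.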

Origin blog.csdn.net/Extreme_wei/article/details/129971881