1. Introduction
Sorting algorithms play a vital role in computer science and underpin tasks such as data organization and search. Shell sort is an improved version of insertion sort: by introducing the concept of an increment, it can significantly improve sorting efficiency in some cases.
This article describes the principle and implementation of the Shell sort algorithm and analyzes its performance.
2. Principle of the Shell sort algorithm
Shell sort is an improved algorithm based on insertion sort, proposed by Donald L. Shell in 1959. The core idea is to group the records to be sorted by a fixed increment of their subscripts and to apply straight insertion sort to each group. As the increment gradually decreases, each group contains more and more records. When the increment reaches 1, the entire sequence forms a single group, and after that pass the sorting is complete.
2.1 Example description
Consider an input file containing 16 records, with the decreasing increment sequence $8, 4, 2, 1$. Initially, the input file is divided into 8 groups:
- Group 1: $R_1, R_9$
- Group 2: $R_2, R_{10}$
- Group 3: $R_3, R_{11}$
- …
- Group 8: $R_8, R_{16}$
Sort each group using straight insertion sort. Then reduce the increment to 4 and divide the file into 4 groups:
- Group 1: $R_1, R_5, R_9, R_{13}$
- Group 2: $R_2, R_6, R_{10}, R_{14}$
- Group 3: $R_3, R_7, R_{11}, R_{15}$
- Group 4: $R_4, R_8, R_{12}, R_{16}$
Apply straight insertion sort to each group again. Repeat this process with increments 2 and 1, after which the entire file is sorted.
2.2 Time complexity analysis
The performance of Shell sort is closely tied to the chosen increment sequence. The worst-case time complexity can be $O(n^2)$, but different increment sequences change both the bound and the actual running time of the algorithm.
- When the increment sequence is $\frac{n}{2^i}$, the worst-case time complexity is $O(n^2)$.
- In practical applications, using 2.2 as the reduction factor (dividing the increment by 2.2 at each step) is more effective.
- When the increment sequence is the set of all positive integers of the form $2^p 3^q$ less than $n$, the time complexity of Shell sort is $O(n \cdot (\log_2 n)^2)$.
In 1971, V. Pratt proved the above conclusion. The best currently known increment sequence is $\{1, 4, 10, 23, 57, 132, 301, 701, 1750, \ldots\}$, an experimentally derived sequence due to M. Ciura. Shell sort with this increment sequence is faster than insertion sort and heap sort. It is faster than quicksort on small arrays, but may be slower than quicksort on large arrays. In addition, Shell sort is an unstable sorting algorithm.
3. Experiment content
3.1 Experimental task
Implement the Shell sort algorithm ShellSort using {7, 5, 3, 1} as the decreasing increment sequence.
(1) Input requirements
The first set of input data:
{27, 32, 33, 21, 57, 96, 64, 87, 14, 43, 15, 62, 99, 11}
The second set of input data:
{99, 96, 87, 64, 62, 57, 43, 33, 32, 27, 21, 15, 14, 11}
The third set of input data:
{11, 14, 15, 21, 27, 32, 33, 43, 57, 62, 64, 87, 96, 99}
(2) Output requirements
For each set of input data, output the following information (with clear labels for the output data):
- The total number of key comparisons and the total number of record moves over the entire sorting process;
- The entire file, once each time a record insertion occurs;
- The number of key comparisons and the number of record moves for each of the increments 7, 5, 3, and 1.
3.2 Algorithm implementation
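#include <stdio.h>
#define n 14    /* number of records to sort */

void ShellSort(int R[n]){
    int r,i,j,k,Compare=0,Move=0;   // Compare and Move hold the totals over all passes
    int d=7;    // initial increment is 7
    while(d>0){
        // one pass per increment: split into groups and sort each group
        int compare=0,move=0;   // counters for the current increment only
        for(i=d;i<n;i++){
            // straight insertion sort within the group that contains R[i]
            r=R[i];
            j=i;
            while(j>d-1&&R[j-d]>r){
                compare++;      // comparison that found a larger record
                R[j]=R[j-d];    // shift that record d positions to the right
                move++;
                j-=d;
            }
            if(j>d-1){
                compare++;      // the comparison that ended the loop also counts
            }
            if(j!=i){
                R[j]=r;         // place the saved record in its slot
                move++;
                // print the whole file after every insertion
                for(k=0;k<n;k++){
                    printf("%d ", R[k]);
                }
                printf("\n");
            }
        }
        printf("\nIncrement %d: %d key comparisons, %d record moves\n\n",d,compare,move);
        d=d-2;  // compute the next increment, {7,5,3,1}
        Compare+=compare;
        Move+=move;
    }
    printf("Total key comparisons: %d, total record moves: %d\n",Compare,Move);
}

int main(){
    int i;
    // Uncomment exactly one of the three data sets.
    //int R[n]={27,32,33,21,57,96,64,87,14,43,15,62,99,11};
    int R[n]={
        11,14,15,21,27,32,33,43,57,62,64,87,96,99};
    //int R[n]={99,96,87,64,62,57,43,33,32,27,21,15,14,11};
    ShellSort(R);
    printf("Final result: ");
    for(i=0;i<n;i++){
        printf("%d ",R[i]);
    }
}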
3.3 Code analysis
- Macro definition
#define n 14
Defines the macro n, which gives the array length, 14. Later code can then use n for the array length instead of hard-coding the value.
- Shell sort function
The parameter is an integer array R, the array to be sorted. Inside the function, the data is insertion-sorted with a steadily decreasing increment: at the end of each pass the increment is reduced by 2, which gives the increment sequence {7, 5, 3, 1}.
int d = 7;
while (d > 0) {
    // ...
    d = d - 2; // compute the next increment, {7,5,3,1}
    // ...
}
The while loop keeps reducing the increment d and performs one round of insertion sort per iteration. The choice of increment is key; here it starts at 7 and then decreases step by step.
for (i = d; i < n; i++) {
    // ...
}
For each group, insertion sort starts from the element at index d.
while (j > d - 1 && R[j - d] > r) {
    // ...
}
During insertion, elements are compared and shifted so that the elements within each group end up in order.
- Output results
printf("\nIncrement %d: %d key comparisons, %d record moves\n\n", d, compare, move);
After each pass, the number of comparisons and the number of record moves for that pass are printed, which shows how the algorithm behaves at each increment.
printf("Total key comparisons: %d, total record moves: %d\n", Compare, Move);
After the entire sort is complete, the total number of comparisons and record moves is printed, summarizing the overall performance of the algorithm.
- main function
int main(){
    int i;
    // int R[n]={27,32,33,21,57,96,64,87,14,43,15,62,99,11};
    // int R[n]={11,14,15,21,27,32,33,43,57,62,64,87,96,99};
    int R[n]={
        99,96,87,64,62,57,43,33,32,27,21,15,14,11};
    ShellSort(R);
    printf("Final result: ");
    for(i=0;i<n;i++){
        printf("%d ",R[i]);
    }
}
Creates an array R of 14 elements and calls the ShellSort function to sort it, then prints the sorted array.
3.4 Experimental results
4. Experimental conclusion
Shell sort is an efficient sorting algorithm. By introducing increments, it can significantly improve on the performance of insertion sort in some cases. The choice of increment sequence has an important impact on the actual performance of the algorithm, and the best known sequence $\{1, 4, 10, 23, 57, 132, 301, 701, 1750, \ldots\}$ performs very well in practice.
Note that Shell sort is an unstable sorting algorithm. In practice, the sorting algorithm should be chosen according to the size and characteristics of the data, and Shell sort may be more suitable than other algorithms in some scenarios. Because its performance is very sensitive to the choice of increment sequence, the sequence needs to be tuned to the specific situation.