[Data Structure Experiment] Sorting (2): Detailed introduction and performance analysis of the Shell sort algorithm

1. Introduction

  Sorting algorithms play a vital role in computer science and have a profound impact on tasks such as data organization and search. Shell sort is an improved version of insertion sort. By introducing the concept of an increment, it can significantly improve sorting efficiency in some cases.

  This article introduces the principle and implementation of the Shell sort algorithm in detail and analyzes its performance.

2. Principle of the Shell sort algorithm

  Shell sort is an improved algorithm based on insertion sort, proposed by Donald L. Shell in 1959. The core idea is to group the records to be sorted by a fixed increment of their subscripts and to apply direct insertion sort to each group. As the increment gradually decreases, each group contains more and more records; when the increment reaches 1, the entire sequence forms a single group and the sorting is completed.

2.1 Example description

  Consider an input file containing 16 records, with the decreasing increment sequence $8, 4, 2, 1$. Initially, with increment 8, the input file is divided into 8 groups:

  • Group 1: $R_1, R_9$
  • Group 2: $R_2, R_{10}$
  • Group 3: $R_3, R_{11}$
  • ...
  • Group 8: $R_8, R_{16}$

  Sort each group using the direct insertion sort algorithm. Then reduce the increment to 4 and divide the file into 4 groups:

  • Group 1: $R_1, R_5, R_9, R_{13}$
  • Group 2: $R_2, R_6, R_{10}, R_{14}$
  • Group 3: $R_3, R_7, R_{11}, R_{15}$
  • Group 4: $R_4, R_8, R_{12}, R_{16}$

  Apply direct insertion sort to each group again. Repeat this process with increments 2 and 1 to complete the sort.
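
  The 16-record example can be sketched in C as follows. This is a minimal illustration rather than the experiment's own code: the names gap_pass, is_d_sorted and shell_sort_16 are hypothetical, and indices are 0-based, so group 1 above corresponds to a[0] and a[8].

```c
#include <stddef.h>

/* One pass of Shell sort: insertion-sort every group of elements
   that lie d positions apart (indices i, i+d, i+2d, ...). */
void gap_pass(int *a, size_t len, size_t d)
{
    for (size_t i = d; i < len; i++) {
        int r = a[i];
        size_t j = i;
        /* Shift larger members of the same group right by d. */
        while (j >= d && a[j - d] > r) {
            a[j] = a[j - d];
            j -= d;
        }
        a[j] = r;
    }
}

/* Returns 1 if every group with spacing d is in non-decreasing order. */
int is_d_sorted(const int *a, size_t len, size_t d)
{
    for (size_t i = d; i < len; i++) {
        if (a[i - d] > a[i]) {
            return 0;
        }
    }
    return 1;
}

/* Sort 16 records with the increments 8, 4, 2, 1 from the example. */
void shell_sort_16(int a[16])
{
    static const size_t increments[] = {8, 4, 2, 1};
    for (size_t k = 0; k < 4; k++) {
        gap_pass(a, 16, increments[k]);
    }
}
```

  After gap_pass(a, 16, 8), each of the 8 two-record groups is in order; the final pass with increment 1 is an ordinary insertion sort over an array that is already nearly sorted.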

2.2 Time complexity analysis

  The performance of Shell sort is closely tied to the chosen increment sequence. The worst-case time complexity is $O(n^2)$, but different increment sequences affect the actual performance of the algorithm.

  • When the increment sequence is $\frac{n}{2^i}$, the worst-case time complexity is $O(n^2)$.
  • In practical applications, dividing the increment by a reduction factor of 2.2 at each step is more effective.
  • When the increment sequence consists of all positive integers of the form $2^p 3^q$ that are less than $n$, the time complexity of Shell sort is $O(n \cdot (\log_2 n)^2)$.
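
  To make the $2^p 3^q$ sequence concrete, the sketch below enumerates those increments for a given $n$. The function name pratt_increments and its parameters are hypothetical, and the code assumes $n$ is far below ULONG_MAX so that the multiplications cannot overflow.

```c
#include <stddef.h>

/* Fill out[] with all numbers of the form 2^p * 3^q that are < n,
   in decreasing order, and return how many were written.
   Values beyond cap entries are dropped. */
size_t pratt_increments(unsigned long n, unsigned long *out, size_t cap)
{
    size_t count = 0;
    /* Enumerate 3^q, then multiply by successive powers of 2. */
    for (unsigned long p3 = 1; p3 < n; p3 *= 3) {
        for (unsigned long v = p3; v < n; v *= 2) {
            if (count < cap) {
                out[count++] = v;
            }
        }
    }
    /* Insertion sort into decreasing order (the list is short). */
    for (size_t i = 1; i < count; i++) {
        unsigned long x = out[i];
        size_t j = i;
        while (j > 0 && out[j - 1] < x) {
            out[j] = out[j - 1];
            j--;
        }
        out[j] = x;
    }
    return count;
}
```

  For $n = 20$ this yields the ten increments 18, 16, 12, 9, 8, 6, 4, 3, 2, 1. The sequence is much denser than $\frac{n}{2^i}$, so the better asymptotic bound comes at the cost of many more passes.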

  In 1969, V. Pratt proved the $O(n \cdot (\log_2 n)^2)$ bound above. The best increment sequence currently known in practice is $\{1, 4, 10, 23, 57, 132, 301, 701, 1750, \ldots\}$. Shell sort with this sequence is faster than insertion sort and heap sort; it is faster than quicksort on small arrays but may be slower than quicksort on large arrays. In addition, Shell sort is an unstable sorting algorithm.
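
  A sketch of Shell sort driven by the increments listed above is shown below. Only the nine terms quoted in the text are used; how the sequence continues beyond 1750 is not assumed here, so very large arrays simply start from 1750. The names GAPS and shell_sort_listed_gaps are hypothetical.

```c
#include <stddef.h>

/* Increment sequence quoted in the text, largest first. */
static const size_t GAPS[] = {1750, 701, 301, 132, 57, 23, 10, 4, 1};

void shell_sort_listed_gaps(int *a, size_t len)
{
    for (size_t g = 0; g < sizeof GAPS / sizeof GAPS[0]; g++) {
        size_t d = GAPS[g];
        if (d >= len) {
            continue;   /* no two elements are d apart */
        }
        /* Insertion sort on each group with spacing d. */
        for (size_t i = d; i < len; i++) {
            int r = a[i];
            size_t j = i;
            while (j >= d && a[j - d] > r) {
                a[j] = a[j - d];
                j -= d;
            }
            a[j] = r;
        }
    }
}
```

  Because the last increment is 1, the result is always fully sorted regardless of which larger increments were skipped.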

3. Experiment content

3.1 Experimental questions

Implement the Shell sort algorithm ShellSort using {7, 5, 3, 1} as the decreasing increment sequence.

(1) Input requirements

The first set of input data:
{27, 32, 33, 21, 57, 96, 64, 87, 14, 43, 15, 62, 99, 11}
The second set of input data:
{99, 96, 87, 64, 62, 57, 43, 33, 32, 27, 21, 15, 14, 11}
The third set of input data:
{11, 14, 15, 21, 27, 32, 33, 43, 57, 62, 64, 87, 96, 99}


(2) Output requirements

For each set of input data, output the following information (each output must be clearly labeled):

  1. The total number of key comparisons and the total number of record moves over the entire sort;
  2. The entire file, printed once each time a record insertion occurs;
  3. The number of key comparisons and the number of record moves for each of the increments 7, 5, 3, and 1.

3.2 Algorithm implementation

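#include <stdio.h>
#define n 14

void ShellSort(int R[n]){
    int r,i,j,k,Compare=0,Move=0;   // totals over all passes
    int d=7;    // initial increment is 7
    while(d>0){
        // regroup with the current increment and sort each group
        int compare=0,move=0;   // counters for this pass
        for(i=d;i<n;i++){
            // direct insertion sort within the group that contains R[i]
            r=R[i];
            j=i;
            while(j>d-1&&R[j-d]>r){
                compare++;      // comparison that found R[j-d] > r
                move++;         // one record shifted d positions to the right
                R[j]=R[j-d];
                j-=d;
            }
            if(j>d-1){
                compare++;      // the failed comparison that ended the loop
            }
            if(j!=i){
                move++;         // placing r into its final slot
                R[j]=r;
                for(k=0;k<n;k++){
                    printf("%d ", R[k]);
                }
                printf("\n");
            }
        }
        printf("\nIncrement %d: key comparisons = %d, record moves = %d\n\n",d,compare,move);
        d=d-2;  // next increment in {7,5,3,1}
        Compare+=compare;
        Move+=move;
    }
    printf("Total key comparisons = %d, total record moves = %d\n",Compare,Move);
}

int main(){
    int i;
    //int R[n]={27,32,33,21,57,96,64,87,14,43,15,62,99,11};
    int R[n]={11,14,15,21,27,32,33,43,57,62,64,87,96,99};
    //int R[n]={99,96,87,64,62,57,43,33,32,27,21,15,14,11};
    ShellSort(R);
    printf("Final result: ");
    for(i=0;i<n;i++){
        printf("%d ",R[i]);
    }
    printf("\n");
    return 0;
}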

3.3 Code analysis

  • Macro definition
#define n 14

  This defines the macro n, indicating that the array length is 14. Subsequent code can use n for the array length instead of hard-coding it.

  • Shell sort function
      The parameter is an integer array R, the array to be sorted. Inside the function, the data is insertion-sorted repeatedly while the increment is reduced: after each pass, the increment is updated to a smaller value. Here the increment decreases by 2 each time, giving the sequence {7, 5, 3, 1}.

    int d = 7;
    while (d > 0) {
    	// ...
    	d=d-2; 	// next increment in {7,5,3,1}
    	// ...
    }

      A while loop repeatedly reduces the increment d and performs one insertion-sort pass per iteration. The choice of increments is key; here d starts at 7 and decreases by 2 each time, and the loop ends once d is no longer positive.

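    for (i = d; i < n; i++) {
    // ...
    }

    For each increment, insertion starts from the element at index d. Each R[i] is inserted into the already sorted part of its own group, that is, the elements at indices i-d, i-2d, and so on.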

    while (j > d - 1 && R[j - d] > r) {
    // ...
    }

    During insertion, elements of the same group that are larger than r are shifted d positions to the right until the correct slot for r is found, which keeps each group in order.
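
    The comparing and moving described above can be tallied per pass. The sketch below uses one possible convention, stated here as an assumption: every evaluation of the key comparison counts (including the one that ends the loop), and every assignment of a record into the array counts as a move. The names pass_stats and counted_gap_pass are hypothetical.

```c
#include <stddef.h>

struct pass_stats {
    long comparisons;  /* evaluations of a[j - d] > r */
    long moves;        /* assignments of a record into the array */
};

/* One Shell sort pass with increment d, returning the tallies. */
struct pass_stats counted_gap_pass(int *a, size_t len, size_t d)
{
    struct pass_stats s = {0, 0};
    for (size_t i = d; i < len; i++) {
        int r = a[i];
        size_t j = i;
        while (j >= d) {
            s.comparisons++;        /* counted whether it succeeds or fails */
            if (a[j - d] <= r) {
                break;
            }
            a[j] = a[j - d];        /* shift one record right by d */
            s.moves++;
            j -= d;
        }
        if (j != i) {
            a[j] = r;               /* place r in its slot */
            s.moves++;
        }
    }
    return s;
}
```

    Under this convention, the already sorted third input set costs 7, 9, 11 and 13 comparisons for the increments 7, 5, 3 and 1 (one per inserted element, 40 in total) and no moves at all.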

  • Output results

printf("\nIncrement %d: key comparisons = %d, record moves = %d\n\n", d, compare, move);

  After each pass, the number of comparisons and the number of record moves for that pass are printed, which shows how the algorithm behaves at each increment.

printf("Total key comparisons = %d, total record moves = %d\n", Compare, Move);

  After the entire sort is completed, the total number of comparisons and the total number of record moves are printed, giving a picture of the overall performance of the algorithm.

  • main function
int main(){
    int i;
    // int R[n]={27,32,33,21,57,96,64,87,14,43,15,62,99,11};
    // int R[n]={11,14,15,21,27,32,33,43,57,62,64,87,96,99};
    int R[n]={99,96,87,64,62,57,43,33,32,27,21,15,14,11};
    ShellSort(R);
    printf("Final result: ");
    for(i=0;i<n;i++){
        printf("%d ",R[i]);
    }
    return 0;
}

  Create a 14-element array R and call ShellSort to sort it. Finally, print the sorted array. To run another input set, comment out the active initializer and uncomment the desired one.

3.4 Experimental results

[Program output screenshots for the three sets of input data; images not available]

4. Experimental conclusion

  Shell sort is an efficient sorting algorithm. By introducing increments, it can significantly improve on the performance of insertion sort in some cases. The choice of increment sequence has an important impact on the actual performance of the algorithm, and the best known sequence $\{1, 4, 10, 23, 57, 132, 301, 701, 1750, \ldots\}$ performs very well in practice.

  Note that Shell sort is an unstable sorting algorithm. In practical applications, the sorting algorithm should be chosen according to the size and characteristics of the data, and Shell sort may suit some scenarios better than other algorithms. Because its performance is very sensitive to the increment sequence, the sequence needs to be tuned to the specific situation.
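
  The instability can be seen on a very small input. The sketch below sorts records that carry a tag in addition to the key, so that records with equal keys can be told apart; the record type, the function name shell_sort_records and the increments {2, 1} are illustrative assumptions, not part of the experiment.

```c
#include <stddef.h>

struct record {
    int key;
    char tag;   /* distinguishes records with equal keys */
};

/* Shell sort on records, comparing keys only. */
void shell_sort_records(struct record *a, size_t len,
                        const size_t *gaps, size_t ngaps)
{
    for (size_t g = 0; g < ngaps; g++) {
        size_t d = gaps[g];
        for (size_t i = d; i < len; i++) {
            struct record r = a[i];
            size_t j = i;
            while (j >= d && a[j - d].key > r.key) {
                a[j] = a[j - d];
                j -= d;
            }
            a[j] = r;
        }
    }
}
```

  With the input (2,'a'), (2,'b'), (1,'c') and increments {2, 1}, the increment-2 pass exchanges the first and third records, which carries (2,'a') past (2,'b'). The result is (1,'c'), (2,'b'), (2,'a'): the two records with key 2 have swapped their original relative order.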

Origin blog.csdn.net/m0_63834988/article/details/134700897