[Data structure] Insertion sort: a summary of direct insertion sort, binary insertion sort, and Shell sort

Table of contents

1. Basic concepts of sorting

2. Direct insertion sort

2.1 Algorithm idea

2.2 Code implementation

3. Binary insertion sort

3.1 Algorithm idea

3.2 Code implementation

4. Shell sort

4.1 Algorithm idea

4.2 Code implementation

1. Basic concepts of sorting

Sorting is the process of arranging a set of data in a predetermined order. The basic concepts of sorting include the following:

Key: The field on which the sort is based is called the key (or sort key).
Sort order: Ascending or descending. Ascending order runs from smallest to largest; descending order runs from largest to smallest.
Stability: A sorting algorithm is stable if elements with equal keys keep the same relative order after sorting as they had before.
Time complexity: How the running time of the sorting algorithm grows with the number of elements.
Space complexity: The additional memory the sorting algorithm needs beyond the input itself.
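
To make the terms above concrete, here is a minimal sketch in C. The `Record` type, its `key` and `tag` fields, and the function name `sort_records` are hypothetical names chosen for illustration; a simple insertion sort is used only because it happens to be stable.

```c
#include <stddef.h>

/* Hypothetical record type: `key` is the sort key, `tag` is payload that
   lets us tell apart records whose keys are equal. */
typedef struct {
    int key;
    char tag;
} Record;

/* Sort n records by key; descending != 0 selects descending order.
   Only records that are strictly out of order are shifted, so records
   with equal keys keep their original relative order (stable). */
void sort_records(Record *a, size_t n, int descending) {
    for (size_t i = 1; i < n; i++) {
        Record cur = a[i];
        size_t j = i;
        while (j > 0 && (descending ? a[j - 1].key < cur.key
                                    : a[j - 1].key > cur.key)) {
            a[j] = a[j - 1];   /* shift the out-of-order record right */
            j--;
        }
        a[j] = cur;
    }
}
```

For the input (2,a), (1,b), (2,c), (1,d), an ascending sort gives (1,b), (1,d), (2,a), (2,c): within each key, the tags appear in their original order, which is what stability means.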

2. Direct insertion sort

2.1 Algorithm idea

The idea of direct insertion sort is to insert each element to be sorted into an already sorted sequence, producing a new ordered sequence that is one element longer.

Specifically, the algorithm traverses the sequence starting from the second element and inserts the current element into its correct position within the sorted prefix, so that the prefix remains ordered after the insertion. The first element on its own forms the initial sorted prefix. To insert an element, the algorithm compares it with the elements of the prefix from right to left, shifts every larger element one position to the right, and places the element in the gap that opens up. Elements that are not larger stay where they are, so the order of the prefix is preserved. When the whole sequence has been traversed, every element has been inserted and the sort is complete.

The time complexity of direct insertion sort is O(n^2) in the worst and average cases and O(n) when the input is already sorted; the space complexity is O(1). Despite the quadratic worst case, it is efficient for small or nearly sorted inputs, and it is stable.
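
The difference between favourable and unfavourable inputs can be checked by counting key comparisons. The following is a minimal sketch; the helper name `count_comparisons` is an assumption made for this example. It sorts the array in place and returns the number of comparisons performed.

```c
/* Sorts a[0..n-1] in ascending order with direct insertion sort and
   returns how many key comparisons were made. */
long count_comparisons(int a[], int n) {
    long cnt = 0;
    for (int i = 1; i < n; i++) {
        int temp = a[i];
        int j = i - 1;
        while (j >= 0) {
            cnt++;                 /* one comparison of a[j] with temp */
            if (a[j] <= temp)
                break;             /* insertion position found */
            a[j + 1] = a[j];       /* shift the larger element right */
            j--;
        }
        a[j + 1] = temp;
    }
    return cnt;
}
```

On an already sorted array of n elements each insertion needs a single comparison, n - 1 in total. On a reversed array the i-th insertion needs i comparisons, n(n - 1)/2 in total. For n = 5 this gives 4 and 10 comparisons respectively.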

2.2 Code implementation

The following C program implements direct insertion sort and counts the number of comparisons:

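#include <stdio.h>
#define MAXSIZE 100

/* Sort A[0..n-1] in ascending order; *cnt accumulates key comparisons. */
void InsertionSort(int A[], int n, int *cnt) {
    int i, j, temp;
    for(i = 1; i < n; i++) {
        temp = A[i];                  /* element to insert */
        for(j = i - 1; j >= 0; j--) {
            (*cnt)++;
            if(A[j] > temp)
                A[j + 1] = A[j];      /* shift larger element right */
            else
                break;
        }
        A[j + 1] = temp;
    }
}

int main(void) {
    int A[MAXSIZE];
    int n, cnt = 0;
    printf("Enter the number of elements to sort (at most %d): ", MAXSIZE);
    /* Reject unreadable or out-of-range sizes so A[] cannot overflow. */
    if(scanf("%d", &n) != 1 || n < 0 || n > MAXSIZE)
        return 1;
    printf("Enter the elements to sort: ");
    for(int i = 0; i < n; i++)
        if(scanf("%d", &A[i]) != 1)
            return 1;
    InsertionSort(A, n, &cnt);
    printf("Sorted result: ");
    for(int i = 0; i < n; i++)
        printf("%d ", A[i]);
    printf("\nNumber of comparisons: %d\n", cnt);
    return 0;
}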

The function InsertionSort implements direct insertion sort and uses the pointer parameter cnt to count the number of comparisons. The main function first reads the number of elements and the elements themselves, then calls InsertionSort to sort them, and finally prints the sorted result and the number of comparisons.

The following C sample code implements direct insertion sort on a linked (chained) storage structure:

#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
}Node;

void insertSort(Node **head) {
    if (*head == NULL || (*head)->next == NULL) {
        return;
    }
    Node *p = (*head)->next;
    (*head)->next = NULL; // the old head becomes a sorted list of length one
    while (p != NULL) {
        Node *q = p->next; // remember the rest of the unsorted list
        Node *prev = NULL;
        Node *cur = *head;
        // <= skips past equal keys, so equal elements keep their input order (stable)
        while (cur != NULL && cur->data <= p->data) {
            prev = cur;
            cur = cur->next;
        }
        if (prev == NULL) { // insert before the current head node
            p->next = *head;
            *head = p;
        } else { // insert between prev and cur
            prev->next = p;
            p->next = cur;
        }
        p = q;
    }
}

This code first checks the head of the linked list: if the list is empty or has only one node, no sorting is needed. Otherwise the pointer p is set to the successor of the head node and the head node's next pointer is set to NULL, so that the original head node becomes a sorted list of length one. Each remaining node p is then detached in turn, its insertion position in the sorted list is located, and it is linked in there. The function returns nothing; the caller's head pointer is updated through the head parameter whenever a node is inserted at the front.
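
One possible variation, shown here only as a sketch, is to place a temporary dummy node in front of the sorted list so that inserting at the front needs no special case. The function name `insert_sort_dummy` and the choice to return the new head (rather than update it through a double pointer) are assumptions of this example, not part of the code above.

```c
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
} Node;

/* Insertion sort on a singly linked list; returns the new head.
   A stack-allocated dummy node precedes the sorted part, so every
   insertion is "after prev", including insertion at the front. */
Node *insert_sort_dummy(Node *head) {
    Node dummy = {0, NULL};          /* dummy.next is the sorted list */
    while (head != NULL) {
        Node *next = head->next;     /* remember the rest of the input */
        Node *prev = &dummy;
        /* Skip nodes <= head->data so equal keys stay in input order. */
        while (prev->next != NULL && prev->next->data <= head->data)
            prev = prev->next;
        head->next = prev->next;     /* splice head in after prev */
        prev->next = head;
        head = next;
    }
    return dummy.next;
}
```

A caller would write `list = insert_sort_dummy(list);`. The trade-off is one extra node on the stack in exchange for a single insertion path.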

3. Binary insertion sort

3.1 Algorithm idea

Binary insertion sort (also translated as half-insertion sort) is a variant of insertion sort. Its basic idea is to treat the sequence as two parts: a sorted part at the front and an unsorted part behind it. In each step, one element is taken from the unsorted part, binary search is used to find where it belongs in the sorted part, and the element is inserted there.

The specific implementation steps are as follows:

1. Treat the first element of the sequence to be sorted as the sorted part, and the remaining elements as the unsorted part.

2. Select an element from the unsorted part and use binary search to find the position where it should be inserted into the sorted part.

3. Insert the element into the sorted part, while increasing the length of the sorted part by 1 and decreasing the length of the unsorted part by 1.

4. Repeat steps 2 and 3 until the unsorted section is empty.

Compared with direct insertion sort, binary insertion sort reduces the cost of finding each insertion position from O(n) to O(log n) comparisons. The elements after the insertion position must still be shifted one by one, so the overall time complexity remains O(n^2). The saving matters most when comparisons are expensive relative to moves.
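
The effect of the binary search on comparisons, and its lack of effect on moves, can be measured directly. The sketch below is an illustration under assumed names: the `SortCounts` type and the function `binary_insertion_counts` are hypothetical and exist only for this example.

```c
/* Hypothetical counter pair for this sketch. */
typedef struct {
    long comparisons;  /* key comparisons made by the binary search */
    long moves;        /* element shifts made to open the slot */
} SortCounts;

/* Binary insertion sort on a[0..n-1] that also counts its work. */
SortCounts binary_insertion_counts(int a[], int n) {
    SortCounts c = {0, 0};
    for (int i = 1; i < n; i++) {
        int tmp = a[i];
        int left = 0, right = i - 1;
        /* Binary search for the first position whose element is > tmp. */
        while (left <= right) {
            int mid = left + (right - left) / 2;
            c.comparisons++;
            if (tmp < a[mid])
                right = mid - 1;
            else
                left = mid + 1;
        }
        /* Shift a[left..i-1] one step right, then drop tmp into place. */
        for (int j = i - 1; j >= left; j--) {
            a[j + 1] = a[j];
            c.moves++;
        }
        a[left] = tmp;
    }
    return c;
}
```

On the reversed input 8, 7, 6, 5, 4, 3, 2, 1 this sketch performs 13 comparisons but still 28 moves, the same n(n - 1)/2 moves that direct insertion sort needs on that input.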

3.2 Code implementation

Binary insertion sort is an optimization of insertion sort. It uses binary search to determine the insertion position, which reduces the number of comparisons; the number of element moves is unchanged.

The following is example C code for binary insertion sort:

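void binary_insertion_sort(int arr[], int len) {
    int i, j, left, right, mid, tmp;
    for (i = 1; i < len; i++) {
        tmp = arr[i];
        left = 0;
        right = i - 1;
        // find the insertion position with binary search
        while (left <= right) {
            mid = left + (right - left) / 2; // avoids overflow of left + right
            if (tmp < arr[mid]) {
                right = mid - 1;
            } else {
                left = mid + 1; // equal keys go to the right, keeping the sort stable
            }
        }
        // shift arr[left..i-1] one position to the right
        for (j = i - 1; j >= left; j--) {
            arr[j + 1] = arr[j];
        }
        // insert the element
        arr[left] = tmp;
    }
}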

This algorithm performs O(n log n) comparisons, but its overall time complexity is still O(n^2) in the worst case because of the element moves; its space complexity is O(1).

4. Shell sort

4.1 Algorithm idea

Shell sort is an improved form of insertion sort, also known as diminishing increment sort. Its basic idea is to divide the sequence into several subsequences of elements that are a fixed step (gap) apart, perform insertion sort on each subsequence, and then keep reducing the step until it reaches 1, at which point the sort completes.

In a concrete implementation, first choose an increment, divide the sequence into groups according to that increment, and perform insertion sort on each group. Then reduce the increment and repeat until the increment is 1, when one final insertion sort over the whole sequence is performed. The choice of increment sequence affects the efficiency of Shell sort; the Hibbard and Sedgewick increment sequences are common choices that improve on simple halving.
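
As an illustration of a non-halving increment sequence, the following sketch uses Hibbard's increments, which have the form 2^k - 1 (..., 15, 7, 3, 1). The function name `shell_sort_hibbard` is an assumption of this example; the inner loop is an ordinary gapped insertion sort.

```c
/* Shell sort using Hibbard's increments 2^k - 1 (..., 15, 7, 3, 1). */
void shell_sort_hibbard(int arr[], int len) {
    int gap = 1;
    /* Grow gap to the largest 2^k - 1 that is still smaller than len. */
    while (2 * gap + 1 < len)
        gap = 2 * gap + 1;
    /* (gap - 1) / 2 maps 2^k - 1 to 2^(k-1) - 1, and finally 1 to 0. */
    for (; gap >= 1; gap = (gap - 1) / 2) {
        for (int i = gap; i < len; i++) {
            int temp = arr[i];
            int j = i - gap;
            while (j >= 0 && arr[j] > temp) {
                arr[j + gap] = arr[j];   /* shift right by one gap */
                j -= gap;
            }
            arr[j + gap] = temp;
        }
    }
}
```

For an array of 10 elements the gaps used are 7, 3, and 1. Because the last gap is always 1, the final pass is a plain insertion sort and the result is fully sorted regardless of the earlier gaps.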

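The time complexity of Shell sort lies between O(n log n) and O(n^2), depending on the increment sequence; for example, the worst case is O(n^2) with the simple halving sequence and O(n^(3/2)) with Hibbard's increments. Shell sort is generally much faster than direct insertion sort on large, strongly disordered sequences, but it is not stable, because elements with equal keys can be moved past each other when they fall into different subsequences.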
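
Whether Shell sort preserves the relative order of equal keys can be checked on a small input. The sketch below applies a gap-halving Shell sort to records that carry a tag next to the key; the `Item` type and the function name `shell_sort_items` are hypothetical names used only for this check.

```c
/* Hypothetical record: `key` is compared, `tag` records input order. */
typedef struct {
    int key;
    char tag;
} Item;

/* Shell sort by key with the gap halved on every pass. */
void shell_sort_items(Item arr[], int len) {
    for (int gap = len / 2; gap > 0; gap /= 2) {
        for (int i = gap; i < len; i++) {
            Item temp = arr[i];
            int j = i - gap;
            while (j >= 0 && arr[j].key > temp.key) {
                arr[j + gap] = arr[j];   /* shift right by one gap */
                j -= gap;
            }
            arr[j + gap] = temp;
        }
    }
}
```

For the input (5,a), (5,b), (1,c), (9,d), the pass with gap 2 moves (5,a) from position 0 to position 2, past (5,b). The final result is (1,c), (5,b), (5,a), (9,d): the two records with key 5 have swapped their relative order, so Shell sort is not stable on this input.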

4.2 Code implementation

The following C code implements Shell sort:

void shell_sort(int arr[], int len)
{
    int gap, i, j, temp;
    for (gap = len / 2; gap > 0; gap /= 2) {  // gap is the step size, halved each pass until it reaches 1
        for (i = gap; i < len; i++) {
            temp = arr[i];
            for (j = i - gap; j >= 0 && arr[j] > temp; j -= gap) {
                arr[j + gap] = arr[j];  // shift the element gap positions to the right
            }
            arr[j + gap] = temp;  // insert at the correct position
        }
    }
}

This code first treats the sequence as several interleaved subsequences whose elements are gap positions apart and insertion-sorts them. The gap is then halved repeatedly; the final pass with gap 1 is an ordinary insertion sort, which completes the sort.