A basic introduction to several common sorting methods, with performance analysis and C implementations

This article describes several common sorting algorithms, their principles, their performance, and their implementation in C.

For consistency, all algorithms and explanations in this article sort in ascending order.

 

First prepare an unsorted array arr[], the length of the array length, and a swap function swap.

The main function calls the sort function and prints the sorted result:

#include <stdio.h>

void swap(int *x, int *y) {
    int temp = *x;
    *x = *y;
    *y = temp;
}
int main() {
    int arr[] = { 1, 8, 5, 7, 4, 6, 2, 3 };
    int length = sizeof(arr) / sizeof(int);
    sort(arr, length); // sort stands for any of the sort functions below
    for (int i = 0; i < length; i++) {
        printf("%d\n", arr[i]);
    }
    return 0;
}
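To check the result of a sort without reading the printed numbers, a small helper can be added next to swap. This is only a sketch: the name is_sorted is an assumption made for illustration and is not part of the original program.

```c
/* Hypothetical helper (not part of the original program):
 * returns 1 if arr[0..length-1] is in ascending order, 0 otherwise. */
int is_sorted(const int arr[], int length) {
    for (int i = 1; i < length; i++) {
        if (arr[i - 1] > arr[i]) {
            return 0; /* found a neighbouring pair that is out of order */
        }
    }
    return 1; /* also true for empty and one-element arrays */
}
```

Calling is_sorted(arr, length) in main after the sort should return 1 if the sort worked.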

 


 

Insertion Sort

[Figures: the first, second, and third passes of the outer loop]

Each pass of the outer loop takes one element, arr[i], from the unsorted region and inserts it into the sorted region.

  The inner loop compares the element being inserted with the element immediately before it; if the inserted element is smaller, the preceding element is shifted one position to the right.

  ... This repeats until an element arr[j] that is not greater than arr[i] is found. Every element after arr[j] has already been shifted by one position, so arr[i] goes straight into the free slot arr[j + 1].

 

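C implementation:

void CRsort(int arr[], int length) {
    int temp, j;
    for (int i = 1; i < length; i++) {
        temp = arr[i];
        // shift every larger element of the sorted region one slot to the right
        for (j = i - 1; j >= 0 && arr[j] > temp; j--) {
            arr[j + 1] = arr[j];
        }
        // j + 1 is the free slot (0 when temp is smaller than everything before it)
        arr[j + 1] = temp;
    }
}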

 

Performance Analysis

Stability: Stable

  -> The inner loop shifts only elements greater than arr[i]; elements equal to arr[i] are not shifted, so equal elements keep their relative order.

Time complexity: (worst n², best n, average n²)

  -> With n elements the outer loop runs n times and the inner loop up to n times per pass; if the data is already sorted, the inner loop makes only one comparison per pass.

   The more ordered the data, the more efficient insertion sort is.

Space complexity: 1

  -> The program does not use recursion and needs only a fixed number of temporary variables, so the space complexity is 1.
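The best and worst cases can be made concrete by counting comparisons. The sketch below is an insertion sort with a counter added; the function name count_insertion_comparisons is an assumption made for illustration.

```c
/* Insertion sort that also counts element comparisons.
 * count_insertion_comparisons is a hypothetical name used for illustration. */
long count_insertion_comparisons(int arr[], int length) {
    long comparisons = 0;
    for (int i = 1; i < length; i++) {
        int temp = arr[i];
        int j = i - 1;
        while (j >= 0) {
            comparisons++;          /* one comparison of arr[j] with temp */
            if (arr[j] <= temp) {
                break;              /* insertion point found */
            }
            arr[j + 1] = arr[j];    /* shift the larger element right */
            j--;
        }
        arr[j + 1] = temp;
    }
    return comparisons;
}
```

For 8 elements this returns 7 (n - 1) on already sorted input and 28 (n(n - 1)/2) on input in descending order.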

 


 

 Shell sort

Shell sort, also called "diminishing increment sort", is an improved version of insertion sort.

Insertion sort has the advantage of handling nearly sorted data well, and the disadvantage that each step moves an element by only one position. Shell sort makes use of both characteristics:

Unlike ordinary insertion sort, Shell sort works with an increment r. Its initial value is usually int(length / 2), i.e. the number of elements divided by 2 and rounded down.

All elements that are r positions apart form a group, and the elements are sorted within each group.

After each round the increment is halved. The last round uses an increment of 1, after which the elements are guaranteed to be sorted.
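The sequence of increments can be listed on its own. The sketch below only computes the increments; the helper name shell_gaps and the output array gaps are assumptions made for illustration.

```c
/* Fills gaps[] with the increments used for an array of the given length
 * (length / 2, then halved until 1) and returns how many there are.
 * shell_gaps is a hypothetical helper name; gaps[] must be large enough
 * (32 entries cover any int length). */
int shell_gaps(int length, int gaps[]) {
    int count = 0;
    for (int r = length / 2; r >= 1; r = r / 2) {
        gaps[count++] = r;
    }
    return count;
}
```

For the 8-element array used in this article the increments are 4, 2, 1. With r = 4 the groups are (arr[0], arr[4]), (arr[1], arr[5]), (arr[2], arr[6]) and (arr[3], arr[7]).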

 

The outermost loop controls the increment: each iteration uses the next increment r.

  Inside it are the two loops of a basic insertion sort, except that the next element added to the sorted part of a group is arr[i + r] rather than arr[i + 1].

C implementation:

void ShellSort(int arr[], int length) {
    int r, temp, j;
    for (r = length / 2;r >= 1;r = r / 2) {
        for (int i = r;i < length;i++) {
            temp = arr[i];
            j = i - r;
            while (j >= 0 && temp < arr[j]) {
                arr[j + r] = arr[j];
                j = j - r;
            }
            arr[j + r] = temp;
        }
    }
}

Performance Analysis

Stability: Unstable

  -> Each element is sorted within its own group, so two elements with the same value that sit in different groups may end up in a different relative order overall.

Time complexity: (worst n², best n, average about n^1.3)

  -> Analysing Shell sort is a hard problem: its running time is a function of the increment sequence chosen, which involves some unsolved mathematical problems. The figures above are taken from the Internet.

    As with ordinary insertion sort, the best case is already sorted data, where the inner loop makes only one comparison per element in each round.

   It also shares the main property of insertion sort: the more ordered the data, the more efficiently it runs.

Space complexity: 1

  -> The program does not use recursion and needs only a fixed number of temporary variables, so the space complexity is 1.
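The instability can be observed by sorting records that carry a tag recording their original order. The sketch below applies the same halving increments to a small struct; struct Item and ShellSortItems are names assumed for illustration only.

```c
/* A record with a sort key and a tag that remembers the original order. */
struct Item {
    int key;
    char tag;
};

/* Shell sort with halving increments, comparing only the key field.
 * ShellSortItems is a hypothetical name used for this illustration. */
void ShellSortItems(struct Item arr[], int length) {
    for (int r = length / 2; r >= 1; r = r / 2) {
        for (int i = r; i < length; i++) {
            struct Item temp = arr[i];
            int j = i - r;
            while (j >= 0 && temp.key < arr[j].key) {
                arr[j + r] = arr[j];   /* shift within the group */
                j = j - r;
            }
            arr[j + r] = temp;
        }
    }
}
```

Sorting {2,'a'}, {2,'b'}, {1,'c'}, {3,'d'} gives the tag order c, b, a, d: in the round with r = 2 the record {2,'a'} is moved behind {2,'b'}, and the final round does not move it back because equal keys are never shifted.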

 


 

Bubble Sort


 

 

Each pass of the outer loop places one element into the sorted region.

  The inner loop moves that element into place: set an index j and compare arr[j] with arr[j + 1]:

  If arr[j] > arr[j + 1], swap arr[j] and arr[j + 1]; then move on with j++, until the whole unsorted region has been compared.

It is worth mentioning that bubble sort can be optimized:

Keep a status flag, change, and set it whenever a swap takes place.

If a full pass finishes without any swap, the data is already sorted and the loop can end immediately.

C implementation:

int MPsort(int arr[], int length) {
    for (int i = 0; i < length; i++) {
        int change = 0; // reset the flag at the start of every pass
        for (int j = 0; j < length - i - 1; j++) {
            if (arr[j] > arr[j + 1]) {
                swap(&arr[j], &arr[j + 1]);
                change = 1;
            }
        }
        if (change == 0) { break; } // no swap in this pass: already sorted
    }
    return 0;
}

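Performance Analysis

Stability: Stable

  -> Two elements are swapped only when one is strictly greater than the other, never when they are equal, so the relative order of equal elements does not change.

Time complexity: (worst n², best n, average n²)

  -> When the data is already fully sorted no swap takes place, so with the optimization above the flag change is never set and the outer loop ends after a single pass. The cost is about n + 1, i.e. n.

    If the data is very disordered, the outer loop runs all n passes and the time complexity is n².

Space complexity: 1

  -> The program does not use recursion and needs only a fixed number of temporary variables, so the space complexity is 1.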
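The effect of the early exit can be measured by counting passes. The following is a sketch under assumptions: the function name bubble_passes is hypothetical, and the outer loop stops at length - 1 because the last pass of a full run would have nothing left to compare.

```c
/* Bubble sort with the early-exit flag; returns the number of passes made.
 * bubble_passes is a hypothetical name used for illustration. */
int bubble_passes(int arr[], int length) {
    int passes = 0;
    for (int i = 0; i < length - 1; i++) {
        int change = 0;                 /* reset for every pass */
        passes++;
        for (int j = 0; j < length - i - 1; j++) {
            if (arr[j] > arr[j + 1]) {
                int temp = arr[j];      /* swap the adjacent pair */
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;
                change = 1;
            }
        }
        if (change == 0) {
            break;                      /* a pass without swaps: sorted */
        }
    }
    return passes;
}
```

With 8 elements, already sorted input needs 1 pass, while input in descending order needs all 7.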

 


 

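Quick Sort

Quick sort is a fairly efficient sorting method. I think this picture explains it very clearly, so I have borrowed it here; the source is listed at the end of the article:

C implementation: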

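void KSSort(int arr[], int left, int right) {
    if (left < right) {
        int key = arr[left]; // pivot; arr[left] becomes the first free slot
        int l = left;
        int r = right;
        while (l < r) {
            while (l < r && arr[r] >= key) { r--; } // from the right, find an element smaller than key
            if (l < r) { arr[l] = arr[r]; l++; } else { break; }
            while (l < r && arr[l] <= key) { l++; } // from the left, find an element larger than key
            if (l < r) { arr[r] = arr[l]; r--; } else { break; }
        }
        arr[l] = key; // l == r is the final position of the pivot
        KSSort(arr, left, l - 1);
        KSSort(arr, l + 1, right);
    }
}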

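Performance Analysis

Stability: Unstable

  -> Suppose it were stable; here is a counterexample:

    5 | 3 1 2 | 9 7 8 9 | 4 6 3

    The scan of the unvisited part has just reached 4 (array[8]).

    Clearly 4 < 5, so 4 should move from the unvisited part to the lower part. The first element of the higher part, 9 (array[4]), is therefore swapped with 4, which gives:

    5 | 3 1 2 4 | 7 8 9 9 | 6 3

    The 9 that used to come first now comes after the other 9, so their relative order has changed.

Time complexity: (worst n², best nlogn, average nlogn)

  -> The worst case for quick sort is when the pivot (base) is always an extreme value of the range being sorted (either the smallest or the largest). The data then has to be partitioned n times before everything has been compared, and the time complexity is n².

    With the first element as the pivot, as in the code above, this mostly happens with data that is already sorted (or nearly sorted), so this version of quick sort should be avoided for such data.

   The best case is when the pivot is always close to the median of the range, so the data is fully split in the fewest rounds: the n elements are halved each time, so after log(2)(N) rounds the data is fully split, and the time complexity is nlogn.

   The worst case is not easy to hit, so the average time complexity is taken as nlogn.

Space complexity: logn

  -> The program above is recursive. When the pivots split the data evenly the recursion is about log(2)(N) levels deep, so the space complexity is logn; in the worst case described above the depth can grow to n.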
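How deep the recursion goes depends on the input, and it can be measured directly. The sketch below uses the same first-element pivot and adds a depth counter; the name quick_depth and the extra depth parameter are assumptions made for illustration. The first call should pass depth = 1.

```c
/* Quick sort with the first element as pivot, extended to report the
 * deepest recursion level at which a partition took place.
 * quick_depth and its depth parameter are hypothetical additions. */
int quick_depth(int arr[], int left, int right, int depth) {
    if (left >= right) {
        return depth - 1;       /* nothing to partition at this level */
    }
    int key = arr[left];
    int l = left;
    int r = right;
    while (l < r) {
        while (l < r && arr[r] >= key) { r--; }
        if (l < r) { arr[l] = arr[r]; l++; }
        while (l < r && arr[l] <= key) { l++; }
        if (l < r) { arr[r] = arr[l]; r--; }
    }
    arr[l] = key;               /* pivot reaches its final position */
    int a = quick_depth(arr, left, l - 1, depth + 1);
    int b = quick_depth(arr, l + 1, right, depth + 1);
    int deepest = a > b ? a : b;
    return deepest > depth ? deepest : depth;
}
```

For the 7 elements {1, 2, 3, 4, 5, 6, 7} the call quick_depth(arr, 0, 6, 1) returns 6, one level per element, while for {4, 2, 6, 1, 3, 5, 7} it returns 3.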

[Figure: comparison chart of the various sorting algorithms]

 


 

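Selection Sort

Selection sort is the simplest sorting method, and also one of the less efficient:

[Figures: the first, second, and third passes of the outer loop]

Each pass of the outer loop adds one element to the sorted region.

  The inner loop scans the unsorted region for its smallest element, which is then swapped into the next position of the sorted region, arr[i].

... until all the data has been added to the sorted region.

C implementation: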

void XZsort(int arr[] , int length) {
    int check;
    for (int i = 0;i < length - 1;i++) {
        check = i;
        for (int j = i + 1;j < length;j++) {
            if (arr[j] < arr[check]) {
                check = j;
            }
        }
        if (i != check) {
            swap(&arr[i],&arr[check]);
        }
    }
}

 

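Performance Analysis

Stability: Unstable

  -> The swap can carry an element past another element with the same value. For example, with {5a, 5b, 2} the first pass swaps 5a with 2 and gives {2, 5b, 5a}: the two 5s have changed their relative order.

Time complexity: (worst n², best n², average n²)

  -> The outer loop has to run about n times to add all n elements to the sorted region.

   Whether or not the data is already sorted, the inner loop has to compare all the remaining elements to find the smallest one.

Space complexity: 1

  -> The program does not use recursion and needs only a fixed number of temporary variables, so the space complexity is 1.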
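That the number of comparisons does not depend on the input order can be checked with a counter. This is a sketch; the function name count_selection_comparisons is an assumption made for illustration.

```c
/* Selection sort that also counts element comparisons.
 * count_selection_comparisons is a hypothetical name used for illustration. */
long count_selection_comparisons(int arr[], int length) {
    long comparisons = 0;
    for (int i = 0; i < length - 1; i++) {
        int check = i;                  /* index of the smallest value so far */
        for (int j = i + 1; j < length; j++) {
            comparisons++;
            if (arr[j] < arr[check]) {
                check = j;
            }
        }
        if (i != check) {
            int temp = arr[i];          /* move the smallest value to position i */
            arr[i] = arr[check];
            arr[check] = temp;
        }
    }
    return comparisons;
}
```

For 8 elements the function returns 28 (n(n - 1)/2) for sorted input and for input in descending order alike.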

 


 

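Heap Sort

Heap sort is an improved version of selection sort, and it performs much better than selection sort.

To implement heap sort, you first need to know the following:

Max-heap: the value of every node is not less than the values of its left and right children (a min-heap is the opposite).

Complete binary tree: every level except the last is completely filled, and all nodes in the last level are packed to the left.

 In a complete binary tree with n nodes there are (n+1)/2 leaf nodes and n/2 non-leaf nodes (using integer division).

 In a heap stored in an array, the node at index n has its left child at index 2n+1 and its right child at index 2n+2.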
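The index relations can be written as small helper functions. These are a sketch for 0-based arrays; all function names are assumptions made for illustration.

```c
/* Index arithmetic for a heap stored in a 0-based array.
 * The function names are hypothetical, chosen for this illustration. */
int left_child(int i)  { return 2 * i + 1; }
int right_child(int i) { return 2 * i + 2; }
int parent(int i)      { return (i - 1) / 2; }  /* valid for i >= 1 */

/* Index of the last node that has at least one child (n >= 2). */
int last_non_leaf(int n) { return n / 2 - 1; }

/* Number of leaf nodes in a complete binary tree with n nodes. */
int leaf_count(int n) { return (n + 1) / 2; }
```

For the 8-element array in this article, the last non-leaf node is at index 3, its only child is at index 7, and there are 4 leaves (indices 4 to 7).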

 

First we need to be able to arrange an unordered set of numbers into a max-heap. The max-heap is built as follows:

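The outer loop walks from the last non-leaf node (length/2-1) towards the root, and for each node it visits it does the following:

Set a root index i that points at the root of the subtree being processed.

Compare the values of the left and right children and point the index bigger at the larger child.

If that child is larger than the root (arr[bigger] > arr[i]), swap the two values and move i to that child, which becomes the new root.

  ... Repeat the previous step recursively until a root is not smaller than either of its children.

 Once the max-heap has been built, repeatedly take the root of the whole heap (i.e. the first element), store it in the sorted region, and make the last element of the heap the new root.

The remaining elements are rebuilt into a max-heap, and this continues until all the data has been stored in the sorted region.

C implementation: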

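void MkHeap(int arr[], int i, int length) {
    int bigger = 2 * i + 1; // left child
    if (bigger < length) {
        // choose the right child only if it exists and is larger
        if (bigger + 1 < length && arr[bigger] < arr[bigger + 1]) {
            bigger++;
        }
        if (arr[i] < arr[bigger]) { // child larger than parent: swap them, then continue from the child
            swap(&arr[i], &arr[bigger]);
            MkHeap(arr, bigger, length);
        }
    }
}
void HeapSort(int arr[], int length) {
    // start from the last non-leaf node and work towards the root
    for (int i = length / 2 - 1; i >= 0; i--) {
        MkHeap(arr, i, length);
    }
    for (int j = length - 1; j > 0; j--) {
        swap(&arr[j], &arr[0]); // move the current maximum into the sorted region
        MkHeap(arr, 0, j);      // the heap now occupies arr[0..j-1]
    }
}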

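Performance Analysis

Stability: Unstable

  -> Suppose it were stable; here is a counterexample:

   Take {6, 7a, 7b}, where 7a comes before 7b. Building the heap lifts 7a to the root, giving {7a, 6, 7b}. The first extraction then swaps the root with the last element, which puts 7a at the end of the array, behind 7b. The final result is {6, 7b, 7a}, so the order of the two 7s has been reversed.

Time complexity: (worst nlogn, best nlogn, average nlogn   <the base may be written as 2 or left out; by the properties of time complexity the result is the same whatever the base>   )

  -> The outer loop runs n times (n being the number of elements to sort) to add all the data to the sorted region.

   In the inner recursive function the index moves from 0 towards n, going from i to 2i+1 at each step; ignoring the constant 1, it therefore runs about log(2)(N) times.

   That gives N·log(2)(N) steps in total; dropping the base 2, the time complexity is nlogn.

   Insertion sort and Shell sort suit nearly sorted data, while heap sort suits very disordered data, because however disordered the data is, the time complexity of heap sort is nlogn.

     (To be safe: log(2)(N) means the logarithm of n to base 2, not log 2 times n; it is written this way because the subscript cannot be typed.)

Space complexity: log(2)(N)

  -> The program recurses up to log(2)(N) levels deep and every level of recursion uses memory, so the space complexity is log(2)(N).

   It can of course also be implemented with a loop, but the resulting structure is a little messy and not as clear as the recursive version.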
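For comparison, a loop-based variant might look like the sketch below. It needs no recursion, so its extra space is constant. The names SiftDownLoop and HeapSortLoop are assumptions made for illustration, and the swap is written inline so that the block stands on its own.

```c
/* Loop-based version of the sift-down step, so no recursion is needed.
 * SiftDownLoop is a hypothetical name used for illustration. */
void SiftDownLoop(int arr[], int i, int length) {
    for (;;) {
        int bigger = 2 * i + 1;             /* left child */
        if (bigger >= length) {
            break;                          /* i is a leaf */
        }
        if (bigger + 1 < length && arr[bigger] < arr[bigger + 1]) {
            bigger++;                       /* right child is larger */
        }
        if (arr[i] >= arr[bigger]) {
            break;                          /* heap property holds here */
        }
        int temp = arr[i];                  /* swap parent and child */
        arr[i] = arr[bigger];
        arr[bigger] = temp;
        i = bigger;                         /* continue from the child */
    }
}

/* Heap sort built on the loop-based sift-down. */
void HeapSortLoop(int arr[], int length) {
    for (int i = length / 2 - 1; i >= 0; i--) {
        SiftDownLoop(arr, i, length);       /* build the max-heap */
    }
    for (int j = length - 1; j > 0; j--) {
        int temp = arr[j];                  /* move the maximum to the end */
        arr[j] = arr[0];
        arr[0] = temp;
        SiftDownLoop(arr, 0, j);            /* heap is now arr[0..j-1] */
    }
}
```

The loop replaces each recursive call with the assignment i = bigger, which is the only state the recursion carried.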

 

 

 Mnemonic for the unstable algorithms: 快些选队 (quick, Shell, selection, heap)

 

References (some of the pictures and ideas in this article come from these materials):

1. 《新编数据结构习题与解析》

2. Articles

https://www.cnblogs.com/skywang12345/p/3603935.html

https://www.cnblogs.com/jingmoxukong/p/4302891.html

http://www.sohu.com/a/341037266_115128

https://www.toutiao.com/a6593273307280179715/?iid=6593273307280179715

3. "关于快速排序算法的稳定性?" (On the stability of the quick sort algorithm), answer by 知遥其实是德鲁伊: https://www.zhihu.com/question/45929062/answer/262452296


Origin www.cnblogs.com/iszhangk/p/11942617.html