《基础数据结构-排序》之交换排序

#PS:这个只是《我自己》理解，如果和你的

#原则相冲突，请谅解，勿喷

前提

此系列大概会有三类文章，包括我在学校不太掌握的三种类型的相关算法，分别为排序、树、图中的基础算法。这里主要对这些算法进行一个巩固，期望使自己的相关技术有一定的提高。

实验平台和数据

MEM:2GB
CPU:Intel® Core™ i5-7400 CPU @ 3.00GHz × 4
SYSTEM:Linux 4.13.0-38-generic #43~16.04.1-Ubuntu SMP Wed Mar 14 17:48:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux（虚拟机）
测试数据（100000条数据，数据范围1-100000）下载链接：https://download.csdn.net/download/u011728480/10399194

简介

排序算法要解决的问题是：把一系列已存在的数据（乱序），按照一定规律（有序）进行存放。我们常见的就是数字的按照其大小进行排序。排序按照其实现的核心原理分为约4中类型：交换排序、插入排序、选择排序和归并排序。下面依次对这些排序进行说明和总结。
ps：上面内容都是扯淡，大佬直接忽略。

交换排序

交换排序有两种常见算法：冒泡排序和快速排序

冒泡排序

就是遍历，交换。把符合某规律的数据元素交换到数据顶端（有条件的末端），重复此过程，直到数据序列有序为止。
使用上文数据进行排序，结果如图：
这里写图片描述
平均时间复杂度：O(n ^ 2)
空间复杂度：O(1)
实现代码（下面算法是未做任何优化的，一种优化方法为：如果为发生数据交换，则数据有序且退出整个循环）：

/*

worst time:
surrounding loop=n-1
compare=n(n-1)/2  (等差数列求和，1,2....n-1)
exchange=3n(n-1)/2 (等差数列求和，1,2....n-1,每次交换3次)

best time:(针对此代码，未优化)
surrounding loop=n-1
compare=n(n-1)/2
exchange=0


average-time:O(n ^ 2)
space:O(1)

*/


#include <iostream>
#include <fstream>
#include <chrono>
#include <string>
#include <assert.h>
#include <functional>


template <typename T, typename T1>
int bubble_sort(T * const head, const T1 &len, std::function<void(T&, T&)> swap_func = nullptr){
//int bubble_sort(T * const head, const T1 &len, void (*swap_func)(T&,T&) = nullptr){

    //data_len data_tmp must init for type deduction
    auto data_len = len;//
    auto data_tmp = *head;//
    auto i = len,j = len;
    //len-1 traverse
    for (j = 1; j < data_len; j ++){//n-1

        for (i = 0; i < data_len -j; i ++){

            if ( nullptr != swap_func )
                swap_func(*(head+i), *(head+i+1));
            else{

                if ( *(head+i) > *(head+i+1) ){

                    data_tmp = *(head+i);
                    *(head+i) = *(head+i+1);
                    *(head+i+1) = data_tmp;
                }
            }
        }
    } 
    return 0;
}

#define DATA_LEN 100000

int main(int argv, char * argc[]){

    long * buffer;
    std::filebuf *pbuf;  
    std::ifstream in;
    std::ofstream out;
    in.open("data.txt");
    assert(in.is_open());
    long idx = 0;

    buffer = new long[DATA_LEN];

    std::string data;
    for ( ; idx < DATA_LEN; idx++ ){

        getline(in, data);  
        *(buffer+idx) = atol(data.c_str());
    }
    in.close();

    std::chrono::time_point<std::chrono::high_resolution_clock> p0 = std::chrono::high_resolution_clock::now();
    bubble_sort(buffer, DATA_LEN);
    std::chrono::time_point<std::chrono::high_resolution_clock> p1 = std::chrono::high_resolution_clock::now();
    std::cout << "stitch high_resolution_clock time:" << (float)std::chrono::duration_cast<std::chrono::microseconds>(p1 - p0).count() / 1000 << "ms" << std::endl;

    out.open("bubble_sort.txt");
    assert(out.is_open());

    for (idx = 0 ; idx < DATA_LEN; idx++ ){

        out<<*(buffer+idx);
        out<<"\n";
    }
    out.close();

    delete []buffer;  
    return 0;
}

快速排序

选定一个基准数（一般为第一个数），然后把小于此数的数放到其左边，大于此数放到右边，这时，此数组被分为了两组。然后分别对这两组数进行同样的操作，直到数无法继续分组为止。
使用上文数据进行排序，结果如图：
这里写图片描述
平均时间复杂度：O(nlog(n))
空间复杂度：O(log(n)~n)
实现代码：

/*
worst time:O(n^2)
(每次我们指定的分割数是数列里面最大，或者最小，这样导致只会把一个数放到了正确的位置，类似选择排序或者冒泡排序，由于是递归调用，可参考二叉树结构分析)

（此时二叉树结构为最不平衡的二叉树，退化为一条线）
compare=n(n-1)/2(第一次比较n-1,第二次比较n-2  ...   1)
exchange=n(n-1)/2

best time: nlog(n) (注意，取的是数量级)
(参考平衡二叉树)
tree-deepth=log(n)
compare=(n/2)(1-(0.5^n))/(1-0.5) (等比数列求和,n/2 , n/4, ... n/2^n)
exchange=0


average-time:O(nlog(n)) (这里要用什么来证明一下，我搞不清楚原理了)
space:O(log(n)~n)(最坏时间为n(递归深度n)，最优时间为log(n)(递归深度log(n)))

*/


#include <iostream>
#include <fstream>
#include <chrono>
#include <string>
#include <assert.h>
#include <functional>


template <typename T, typename T1>
int quick_sort(T * const head, T1 low, T1 high, std::function<T1(T * const, T1 , T1 )> split_func){
//int bubble_sort(T * const head, const T1 &len, void (*swap_func)(T&,T&) = nullptr){

    //split must init for type deduction
    auto split = low;
    if (low < high){

        split =  split_func(head, low, high);//split into two arrary and get split val 
        quick_sort(head, low, split-1, split_func);//deal < split 
        quick_sort(head, split+1, high, split_func);//deal > split 

    }


    return 0;
}


long int split_func_imp(long int * const head, long int low, long int high){

    auto split_val = head[low];
    for (; low < high ;){

        for ( ; low < high && head[high] <= split_val ; )
            high--;
        head[low] = head[high];

        for ( ; low < high && head[low] >= split_val ; )
            low++;
        head[high] = head[low];
    }
    head[low] = split_val;
    return low;
}


#define DATA_LEN 100000

int main(int argv, char * argc[]){

    long * buffer;
    std::filebuf *pbuf;  
    std::ifstream in;
    std::ofstream out;
    in.open("data.txt");
    assert(in.is_open());
    long idx = 0;

    buffer = new long[DATA_LEN];

    std::string data;
    for ( ; idx < DATA_LEN; idx++ ){

        getline(in, data);  
        *(buffer+idx) = atol(data.c_str());
    }
    in.close();
    long low = 0;
    long high = DATA_LEN;
    std::function<long int (long int * const, long int , long int )> func = split_func_imp;
    std::chrono::time_point<std::chrono::high_resolution_clock> p0 = std::chrono::high_resolution_clock::now();
    quick_sort(buffer, low, high, func);
    std::chrono::time_point<std::chrono::high_resolution_clock> p1 = std::chrono::high_resolution_clock::now();
    std::cout << "stitch high_resolution_clock time:" << (float)std::chrono::duration_cast<std::chrono::microseconds>(p1 - p0).count() / 1000 << "ms" << std::endl;

    out.open("quick_sort.txt");
    assert(out.is_open());

    for (idx = 0 ; idx < DATA_LEN; idx++ ){

        out<<*(buffer+idx);
        out<<"\n";
    }
    out.close();

    delete []buffer;  
    return 0;
}

#PS：请尊重原创，不喜勿喷

有问题请留言，看到后我会第一时间回复