Ten Common Sorting Algorithms

1. Classification of common sorting algorithms

Ten common sorting algorithms are generally divided into the following categories: 
1) Non-linear-time comparison sorts: exchange sorts (quick sort and bubble sort), insertion sorts (direct insertion sort and Shell sort), selection sorts (simple selection sort and heap sort), and merge sorts (two-way merge sort and multi-way merge sort);

2) Linear time non-comparative sorting: counting sort, radix sort, and bucket sort.

Summary: 
1) Among the comparison sorts, merge sort is the fastest, followed by quick sort and heap sort, which are comparable to each other. One thing to note: the initial order of the data has little impact on heap sort, while quick sort is exactly the opposite.

2) Linear-time non-comparison sorts generally outperform non-linear-time comparison sorts, but they place stricter requirements on the elements to be sorted. For example, counting sort requires that the maximum value of the numbers to be sorted not be too large, and bucket sort requires that elements, after being scattered into buckets by a hash-like mapping, be evenly distributed across the buckets. The typical characteristic of linear-time non-comparison sorts is that they trade space for time.

Note: the sample code in this post sorts in ascending order unless otherwise noted.


2. Algorithm description and implementation

2.1 Exchange Class Sorting

The basic method of exchange sorting is: compare the keys of the records to be sorted pairwise, and exchange any pair that violates the required order, until no pair does. Bubble sort and quick sort both belong to the exchange sorts.

2.1.1 Bubble sort

Algorithm idea: 
start from the first number in the array and traverse it, comparing and exchanging adjacent elements; each round finds the largest number among the remaining unsorted numbers and "bubbles" it to the end of the unsorted portion.

Algorithm steps: 
1) Starting from the first number in the array, compare each element with the next in turn, swapping whenever the pair is out of order, until the last number. If any swap occurred, continue with the next pass; if no swap occurred, the array is already in order and the sort ends — in that case the time complexity is O(n).
2) After each round of "bubbling", the largest remaining number ends up in the last position of the unsorted part. Repeat step 1) on the remaining part.

Stability: Stable sorting.

Time complexity: O(n) to O(n^{2}), the average time complexity is O(n^{2}).

Best case: if the data to be sorted is already in ascending order, the sort finishes in a single bubbling pass with n-1 key comparisons and no moves, so the time complexity is O(n).

Worst case: if the data to be sorted is in reverse order, bubble sort needs n-1 passes, and pass i performs n-i comparisons and moves of the keys, so the number of comparisons reaches the maximum:
C_{max} = \sum_{i=1}^{n-1}(n-i) = n(n-1)/2 = O(n^{2})
 
The number of moves is equal to the number of comparisons, so the worst time complexity is O(n^{2}).

Sample code:

void bubbleSort(int array[],int len){
    // number of passes is len-1; the last remaining element needs no sorting
    for(int i=0;i<len-1;++i){
        bool noswap=true;
        // inner loop: compare adjacent pairs in the unsorted part
        for(int j=0;j<len-i-1;++j){
            if(array[j]>array[j+1]){
                // swap without a temporary (note: may overflow for large ints)
                array[j]=array[j]+array[j+1];
                array[j+1]=array[j]-array[j+1];
                array[j]=array[j]-array[j+1];
                // alternatively, swap with XOR:
                //a=a^b;
                //b=b^a;
                //a=a^b;
                noswap=false;
            }
        }
        if(noswap) break;
    }
}

2.1.2 Quick Sort

Bubble sort compares and exchanges adjacent records, so each exchange moves a record only one position up or down, resulting in many comparisons and moves overall. Quick sort, also known as partition-exchange sort, is an improvement on bubble sort based on divide and conquer.

Algorithm principle: 
1) Randomly select a record (usually the first record) from the n records to be sorted as the partition standard;

2) Move all records whose keys are smaller than the partition record to its left and all records with larger keys to its right, placing the partition record in the middle; this is called one pass of partitioning;

3) Then repeat the above process for the two subsequences before and after, until all records are sorted.

Stability: Unstable sorting.

Time complexity: O(nlog_{2}n) to O(n^{2}); the average time complexity is O(nlog_{2}n).

Best case: every partition splits the sequence into two sub-sequences of roughly equal length, and the time complexity is O(nlog_{2}n).

Worst case: the records to be sorted are already sorted. The first pass takes n-1 comparisons to leave the first record in place, producing a sub-sequence of n-1 records; the second pass takes n-2 comparisons to locate the second record, leaving a sub-sequence of n-2 records; and so on. The total number of comparisons is: C_{max} = \sum_{i=1}^{n-1}(n-i) = n(n-1)/2 = O(n^{2})

Sample code:

//a: array to sort; low: lowest index; high: highest index
void quickSort(int a[],int low, int high)
{
    if(low>=high)
    {
        return;
    }
    int left=low;
    int right=high;
    int key=a[left];    /* use the first record as the partition element */
    while(left!=right){
        while(left<right&&a[right]>=key)    /* scan right-to-left for the first key smaller than the pivot, move it to the left side */
            --right;
        a[left]=a[right];
        while(left<right&&a[left]<=key)     /* scan left-to-right for the first key larger than the pivot, move it to the right side */
            ++left;
        a[right]=a[left];
    }
    a[left]=key;    /* put the partition element in its final position */
    quickSort(a,low,left-1);
    quickSort(a,left+1,high);
}

2.2 Insertion class sorting

The basic method of insertion sorting is: at each step, insert one record into the proper position of the previously sorted sub-file according to its key, until all records have been inserted.

2.2.1 Direct insertion sort

Principle: starting from the second of the n records, compare each record with the preceding records in turn to find its insertion position; at the end of each outer-loop iteration, the current record has been inserted into the appropriate place.

Stability: Stable sorting.

Time complexity: O(n) to O(n^{2}), the average time complexity is O(n^{2}).

Best case: When the records to be sorted are already sorted, the number of comparisons required is C_{min} = n-1=O(n).

Worst case: If the records to be sorted are in reverse order, the maximum number of comparisons is C_{max}=\sum_{i=1}^{n-1}(i)=\frac{n(n-1)}{2}=O(n^{2}).

Sample code:

//A: input array; len: array length
void insertSort(int A[],int len)
{
    int temp;
    for(int i=1;i<len;i++)
    {
      int j=i-1;
      temp=A[i]; 
      // search for the insertion position, shifting larger elements right
      while(j>=0&&A[j]>temp)
      {
          A[j+1]=A[j];
          j--;
      }
      if(j!=i-1)
        A[j+1]=temp;
    }
}

2.2.2 Shell sorting

Shell sort, also known as diminishing-increment sort, was proposed by D. L. Shell in 1959 and is an improvement on direct insertion sort.

Principle: Shell sort compares elements that are a specified distance apart (called the increment), and repeatedly reduces the increment until it reaches 1, completing the sort.

At the beginning of Shell sort, the increment is large, so there are many groups with few records each, and direct insertion sort within each group is fast. As the increment d_{i} gradually decreases, the number of groups decreases and each group grows; however, because the file has already been sorted by groups using the previous increment d_{i-1}, it is close to ordered, so the new passes are also fast. Shell sort therefore improves considerably on direct insertion sort.

Shell sort can be obtained from direct insertion sort by replacing every step size of 1 with the increment d; since the increment d of the final round is 1, the last pass is an ordinary direct insertion sort.

Stability: Unstable sorting.

Time complexity: O(n^{1.3}) to O(n^{2}). The time complexity analysis of Shell sort is relatively complicated; the actual time depends on the number and values of the increments in each pass. Research shows that with a well-chosen increment sequence, the time complexity is approximately O(n^{1.3}).

For the choice of increments, Shell originally suggested starting at n/2 and halving until 1; Professor D. Knuth suggested the sequence d_{i+1} = \left \lfloor \frac{d_{i}-1}{3} \right \rfloor.

//A: input array; len: array length; d: initial increment (number of groups)
void shellSort(int A[],int len, int d)
{
    for(int inc=d;inc>0;inc/=2){        // shrink the increment until it reaches 1
        for(int i=inc;i<len;++i){       // from the second element of the first group to the end of the array
            int j=i-inc;
            int temp=A[i];
            while(j>=0&&A[j]>temp)
            {
                A[j+inc]=A[j];
                j=j-inc;
            }
            if((j+inc)!=i)// avoid self-insertion
                A[j+inc]=temp;// insert the record
        }
    }
}

Note: as the code shows, each new increment is half of the previous one; when the increment d reaches 1, Shell sort degenerates into direct insertion sort.


2.3 Selection class sorting

The basic method of selection sorting is: at each step, select the record with the smallest key from the records still to be sorted and append it to the sequence of already-sorted records, until all records are sorted.

2.3.1 Simple selection sort (also known as direct selection sort)

Principle: select the smallest element from all records and exchange it with the record in the first position; then find the smallest among the remaining records and exchange it with the record in the second position; repeat until only the last element remains.

Stability: Unstable sorting.

Time complexity: the best, worst, and average cases are all O(n^{2}), so simple selection sort is among the worst-performing of the common sorting algorithms. The number of comparisons is independent of the initial state of the file: selecting the record with the smallest key in the i-th pass requires n-i comparisons, so the total number of comparisons is: \sum_{i=1}^{n-1}(n-i)=n(n-1)/2=O(n^{2}).

Sample code:

void selectSort(int A[],int len)
{
    int i,j,k;
    for(i=0;i<len;i++){
       k=i;                     // k tracks the index of the smallest remaining element
       for(j=i+1;j<len;j++){
           if(A[j]<A[k])
               k=j;
       }
       if(i!=k){                // swap without a temporary (note: may overflow for large ints)
           A[i]=A[i]+A[k];
           A[k]=A[i]-A[k];
           A[i]=A[i]-A[k];
       }
    }
}

2.3.2 Heap sort

In direct selection sort, the first pass makes n-1 comparisons but only selects the single smallest key; the intermediate comparison results are not saved, so many comparisons must be repeated in later passes, which reduces efficiency. J. W. J. Williams and R. W. Floyd proposed heap sort in 1964 to avoid this shortcoming.

2.3.2.1 Properties of the heap

1) Property: a heap is a complete binary tree, but a complete binary tree is not necessarily a heap; 
2) Classification: in a max-heap, the parent's key is not less than its children's keys; in a min-heap, the parent's key is not greater than its children's keys. The following figure shows a min-heap:
(figure omitted: example of a min-heap)

3) Left and right children: there is no required order between them. 
4) Heap storage: a heap is generally stored in an array; the parent of node i has index (i-1)/2, and its left and right children have indexes 2*i+1 and 2*i+2 respectively. For example, the left and right children of node 0 are nodes 1 and 2.

2.3.2.2 Basic operation of the heap

1) Building: take a min-heap as an example. If an array is used to store the elements, the array corresponds to a tree, but that tree generally does not satisfy the heap condition; the elements must be rearranged to establish a "heapified" tree.


2) Insertion: a new element is inserted at the end of the table, i.e. the end of the array. If the resulting binary tree no longer satisfies the heap property, the elements must be rearranged. The following figure demonstrates the heap adjustment when inserting 15.

(figure omitted: heap adjustment when inserting 15)

3) Deletion: in heap sort, deletion always occurs at the top of the heap, because the top element is the smallest (in a min-heap). The last element of the table fills the vacated position, and the resulting tree is then adjusted to satisfy the heap condition again.


2.3.2.3 Heap operation implementation

1) Insertion, code implementation: 
each insertion places the new datum at the end of the array. Observe that the path from the new datum's parent up to the root must be an ordered sequence; the task is to insert the new datum into this ordered sequence, much like one step of direct insertion sort. This is the "float up" adjustment of a node. The adjustment code for inserting a new datum is not hard to write:

// a new node is added at index i; its parent is at (i-1)/2
// parameters: a: array; i: index of the newly inserted element
void minHeapFixUp(int a[], int i)  
{  
    int j, temp;  
    temp = a[i];  
    j = (i-1)/2;      // parent node  
    while (j >= 0 && i != 0)  
    {  
        if (a[j] <= temp)// stop when the parent is not larger than the new element  
            break;  
        a[i]=a[j];     // move the larger parent down into the child's slot  
        i = j;  
        j = (i-1)/2;  
    }  
    a[i] = temp;  
}  

Therefore, when inserting data into the min heap:

// insert new data into the min-heap
// a: array; index: insertion position; data: value to insert
void minHeapAddNumber(int a[], int index, int data)  
{  
    a[index] = data;  
    minHeapFixUp(a, index);  
} 

2) Deletion, code implementation: 
by definition, only element 0 can be deleted from the heap at a time. To make rebuilding the heap convenient, the actual operation moves the last element of the array to the root, and then performs a top-down adjustment starting from the root.

When adjusting, first find the smaller of the left and right children. If the parent is not larger than that child, no adjustment is needed. Otherwise, move the smaller child up into the parent's position. The parent does not actually need to be written into the child's slot yet, because that may not be its final position; logically, though, it has replaced the smaller child, and the process continues with its effect on the nodes below. This is equivalent to "sinking" a datum down from the root. The code is given below:

// a: array; len: number of nodes; index: node from which to adjust downward.
// Counting from 0, the children of node index are 2*index+1 and 2*index+2,
// and len/2-1 is the last non-leaf node.
void minHeapFixDown(int a[],int len,int index)
{
    if(index>(len/2-1))// index is a leaf: nothing to adjust
        return;
    int tmp=a[index];
    int lastIndex=index;
    while(index<=len/2-1)        // stop once we have sunk to a leaf
    { 
        if(a[2*index+1]<tmp)     // left child is smaller than the node being adjusted
        {
            lastIndex = 2*index+1;
        }
        // right child exists and is smaller than both the left child and the node
        if(2*index+2<len && a[2*index+2]<a[2*index+1]&& a[2*index+2]<tmp)
        {
            lastIndex=2*index+2;
        }
        // if either child is smaller, move the smallest child up
        if(lastIndex!=index) 
        {  
            a[index]=a[lastIndex];
            index=lastIndex;
        }
        else break;             // otherwise no further sinking is needed
    }
    a[lastIndex]=tmp;           // place the adjusted node in its final position
}

Code implementations of the same idea can vary. From personal experience, I suggest writing your own code based on your understanding of the heap-adjustment process: do not learn the algorithm from sample code, but understand the idea and then write the code yourself, or you will soon forget it.

3) Building a heap: 
with insertion and deletion in hand, consider how to heapify an existing block of data. Taking the elements out of the array one by one and inserting them into a new heap is unnecessary. First look at an array, as shown below:
(figure omitted: the array viewed as a complete binary tree)

Obviously, each leaf node by itself is already a legal heap; that is, 20, 60, 65, 4, and 49 are each legal heaps. So just start from A[4]=50 and adjust downward, then do the same for A[3]=30, A[2]=17, A[1]=12, and A[0]=9 in turn. The following diagram illustrates these steps:
(figure omitted: successive downward adjustments while building the heap)

Write the code to heap the array:

// build a min-heap
// a: array; n: array length
void makeMinHeap(int a[], int n)  
{  
    for (int i = n/2-1; i >= 0; i--)  
        minHeapFixDown(a, n, i);  // arguments: array, length, node to adjust
}  

2.3.2.4 Implementation of heap sort

Since the heap is stored in an array, after the array is heapified, A[0] is first exchanged with A[n-1] and the heap property is restored for A[0...n-2]; then A[0] is exchanged with A[n-2] and the heap restored for A[0...n-3]; and so on, until A[0] is exchanged with A[1]. Since the smallest remaining element is moved to the ordered tail each time, the whole array is ordered when the process completes (note that with a min-heap this yields descending order). It is somewhat similar to direct selection sort.

Therefore, heap sort does not use the insertion operation described above; only heap construction and downward adjustment of nodes are needed. Heap sort is implemented as follows:

//array: array to sort; len: array length
void heapSort(int array[],int len)
{
    // build the heap
    makeMinHeap(array,len); 

    // swap the root with the last leaf of the heap and re-adjust; len-1 swaps in total
    for(int i=len-1;i>0;--i)
    {
        // swap root and last leaf without a temporary (note: may overflow for large ints)
        array[i]=array[i]+array[0];
        array[0]=array[i]-array[0];
        array[i]=array[i]-array[0];

        // re-adjust the shrunken heap of i elements, starting from the root
        minHeapFixDown(array, i, 0);  
    }
}  

1) Stability: Unstable sorting.

2) Heap sort performance analysis 
Each heap restoration costs O(logN), and there are N-1 adjustment operations in total, plus about N/2 downward adjustments during the earlier heap construction, each also O(logN). Adding the two still gives O(N*logN), so the time complexity of heap sort is O(N*logN).

Worst case: even if the array to be sorted is already ordered, O(N*logN) comparison operations are still required, although some move operations are avoided;

Best case: even if the array to be sorted is in reverse order, O(N*logN) comparison operations and O(N*logN) exchange operations are still required. The total time complexity remains O(N*logN).

Therefore, heap sort and quick sort are similar in efficiency, but an important advantage of heap sort over quick sort is that the initial distribution of the data has no major impact on its efficiency.

2.4 Merge sort

2.4.1 Algorithm idea 

Merge sort is a non-linear-time comparison sort; it has the best worst-case performance among the comparison sorts and is widely used.

Merge sort is a typical application of divide and conquer: first make each subsequence ordered, then merge the ordered subsequences to obtain a completely ordered sequence. Merging two sorted lists into one sorted list is called a two-way merge; in general, "merge sort" refers to two-way merge sort.

2.4.2 Two-way merge sort process description

Given the sequence {6, 23, 100, 3, 38, 128, 23}: 
initial state: 6, 23, 100, 3, 38, 128, 23 
after the first merge: {6, 23}, {3, 100}, {38, 128}, {23}; 
after the second merge: {3, 6, 23, 100}, {23, 38, 128}; 
after the third merge: {3, 6, 23, 23, 38, 100, 128}. 
Sorting is finished.

2.4.3 Two-way merge complexity analysis

Time complexity: the worst, best, and average time complexities are all O(nlog_{2}n). The sorting performance is unaffected by how disordered the input is, which is an advantage over quick sort. 
Space complexity: O(n). 
Stability: a stable sorting algorithm; as the sorting process above shows, the 23 that initially appeared earlier always stays ahead of the other 23.

2.4.4 Implementation of two-way merge

2.4.4.1 C/C++ serial implementation

/************************************************
* Function: mergearray
* Parameters: a: array to merge; first: start index; mid: middle index;
*     last: end index; temp: temporary array
* Description: merge the two sorted ranges a[first...mid] and a[mid+1...last]
*************************************************/
void mergearray(int a[], int first, int mid, int last, int temp[])  
{  
    int i = first, j = mid + 1,k =0;    
    while (i <= mid && j <= last)  
    {  
        if (a[i] <= a[j])  
            temp[k++] = a[i++];  
        else  
            temp[k++] = a[j++];  
    }    
    while (i<= mid)  
        temp[k++] = a[i++];  

    while (j <= last)  
        temp[k++] = a[j++];   
    for (i=0; i < k; i++)  
        a[first+i] = temp[i];  
}  
/************************************************
* Function: mergesort
* Parameters: a: array to sort; first: start index;
*     last: end index; temp: temporary array
* Description: two-way merge sort of the given array range
*************************************************/
void mergesort(int a[], int first, int last, int temp[])  
{  

    if (first < last)  
    {  
        int mid = (first + last) / 2;       
        mergesort(a, first, mid, temp);    // sort the left half  
        mergesort(a, mid + 1, last, temp); // sort the right half  
        mergearray(a, first, mid, last, temp); // merge the two sorted halves      
    }  
}

 

2.4.4.2 C/C++ Parallel Implementation

1) Parallel thinking

The array to be sorted is logically divided into multiple blocks by offset; each block is handed to a thread that calls the two-way merge sort function on it. Once every block is sorted, the blocks are merged into a single ordered sequence.

2) Parallel code

Thread function, for the created thread to call.

/*******************************************
* Function: merge_exec
* Parameter: para: pointer receiving the thread index (which thread this is)
* Description: calls two-way merge sort on this thread's block
*              (DataNum, threadNum, and randInt are globals)
*******************************************/
void* merge_exec(void *para)
{
  int threadIndex=*(int*)para; 
  int blockLen=DataNum/threadNum;
  int* temp=new int[blockLen];
  int offset=threadIndex*blockLen;
  mergesort(randInt,offset,offset+blockLen-1,temp);
  delete[] temp;
  return NULL;
}

Merging the sorted blocks. The code is as follows:

/***********************************************
* Function: mergeBlocks
* Parameters: pDataArray: array whose blocks are each sorted; arrayLen: array length
*        blockNum: number of blocks; resultArray: receives the sorted result
* Description: merge the sorted blocks
************************************************/
inline void mergeBlocks(int* const pDataArray,int arrayLen,const int blockNum,int* const resultArray)
{
    int blockLen=arrayLen/blockNum;
    int blockIndex[blockNum];// current position of each block (VC may not support variable-length arrays; use a macro constant instead)
    for(int i=0;i<blockNum;++i)// initialize each block's starting index
    {
        blockIndex[i]=i*blockLen;
    }
    int smallest=0;
    for(int i=0;i<arrayLen;++i)// one output element per iteration
    {  
      for(int j=0;j<blockNum;++j)// seed the minimum with the first unexhausted block's element
      {
       if(blockIndex[j]<(j*blockLen+blockLen))
       {
        smallest=pDataArray[blockIndex[j]];
        break;
       }
      }
      for(int j=0;j<blockNum;++j)// scan every block for the true minimum
      {
        if((blockIndex[j]<(j*blockLen+blockLen))&&(pDataArray[blockIndex[j]]<smallest))
        {
          smallest=pDataArray[blockIndex[j]];
        }
      }
      for(int j=0;j<blockNum;++j)// advance the index of the block the minimum came from
      {
        if((blockIndex[j]<(j*blockLen+blockLen))&&(pDataArray[blockIndex[j]]==smallest))
        {
          ++blockIndex[j];
          break;
        }
      }
      resultArray[i]=smallest;// store this iteration's minimum in the result
    }
}

Create multiple threads in the main function to perform the parallel sort; the code is as follows:

int main(int argc,char* argv[])
{
    // DataNum, threadNum, randInt, and resultInt are assumed to be globals
    int blockNum=threadNum;
    struct timeval ts,te;
    srand(time(NULL));
    for(int i=0;i<DataNum;++i)
    {
      randInt[i]=rand();
    }
    pthread_t tid[blockNum];
    int ret[blockNum],threadIndex[blockNum];

    //-------- two-way merge sort -------
    gettimeofday(&ts,NULL);
    for(int i = 0; i < threadNum; ++i)
    {
        threadIndex[i]=i;
        ret[i] = pthread_create(&tid[i], NULL,merge_exec,(void *)(threadIndex+i));
        if(ret[i] != 0){
             cout<<"thread "<<i<<" create error!"<<endl;
             break;
         }
    }
    for(int i = 0; i <threadNum; ++i)
    {
         pthread_join(tid[i], NULL);
    }
    mergeBlocks(randInt,DataNum,threadNum,resultInt);
    gettimeofday(&te,NULL);
    cout<<"MergeSort time: "<<(te.tv_sec-ts.tv_sec)*1000+(te.tv_usec-ts.tv_usec)/1000<<"ms"<<endl;
    return 0;
}

 


2.5 Linear time non-comparative sorting

2.5.1 Counting sort

Counting sort is a non-comparison-based sorting algorithm proposed by Harold H. Seward in 1954. Its advantage lies in sorting integers over a small range: its complexity is O(n+k) (where k is the range of the numbers to be sorted), faster than any comparison sort. The disadvantage is the large space consumption; clearly, when O(k) > O(n*log(n)), its efficiency falls behind comparison-based sorts such as heap sort, merge sort, and quick sort.

Algorithm principle: 
The basic idea is that for each element x in a given input sequence, determine the number of elements in the sequence whose value is less than x. Once we have this information, we can store x directly in the correct position of the final output sequence. For example, if there are only 17 elements in the input sequence whose value is less than the value of x, then x can be directly stored in the 18th position of the output sequence. Of course, if there are multiple elements with the same value, we cannot place these elements at the same position in the output sequence, and just make appropriate modifications in the code.

Algorithm steps: 
1) Find the largest element in the array to be sorted; 
2) Count the number of occurrences of each value i in the array, storing it in item i of an auxiliary array C; 
3) Accumulate the counts (starting from the first element of C, add each item to the previous one); 
4) Fill the target array in reverse: place each element i in the C(i)-th item of the new array, then subtract 1 from C(i).

Time complexity: O(n+k).

Space complexity: O(k).

Requirements: the maximum value among the numbers to be sorted cannot be too large.

Stability: Stable.

Code example:

#define MAXNUM 20    // maximum count of numbers to sort
#define MAX    100   // maximum value of the numbers to sort
int sorted_arr[MAXNUM]={0};

// counting sort
// arr: array to sort; sorted_arr: sorted output; n: length of arr
void countSort(int *arr, int *sorted_arr, int n)  
{   
    int i;   
    int *count_arr = (int *)malloc(sizeof(int) * (MAX+1));  

    // initialize the count array   
    memset(count_arr,0,sizeof(int) * (MAX+1));

    // count the occurrences of each value i   
    for(i = 0;i<n;i++)  
        count_arr[arr[i]]++;  
    // accumulate the counts: count_arr[v] becomes the number of elements <= v
    for(i = 1; i<=MAX; i++)  
        count_arr[i] += count_arr[i-1];   
    // traverse the source array backwards (to preserve stability) and place each element according to its count   
    for(i = n-1; i>=0; i--)  
    {  
        // count_arr[arr[i]] is the number of elements <= arr[i]
        sorted_arr[count_arr[arr[i]]-1] = arr[i];  

        // if there are duplicates, the next equal element goes one slot earlier
        count_arr[arr[i]]--;    
    }
    free(count_arr);
}

Note: counting sort is a typical algorithm that trades space for time, and it places strict requirements on the data to be sorted: the values must not contain negative numbers, and the maximum value is limited. Use it with caution.

2.5.2 Radix sort

2.5.2.1 Algorithm idea

Radix sort is a "distribution sort", a kind of non-comparative linear-time sort, sometimes loosely described as a "bucket sort". As the name implies, it distributes the elements to be sorted into "buckets" according to part of the key, thereby achieving the sorting effect.

2.5.2.2  Algorithm process description

Radix sort (taking integers as an example) splits a decimal integer digit by digit and compares the digits in turn from low to high. It consists of two processes: 
1) Distribute: starting from the ones digit, place each number into bucket 0~9 according to that digit's value (for example, 64 has ones digit 4, so it goes into bucket 4); 
2) Collect: copy the data in buckets 0~9 back into the array in order. 
Repeat processes 1) and 2) from the ones digit up to the highest digit (for example, the largest 32-bit unsigned integer, 4294967295, has 10 decimal digits). Radix sort can be LSD (least significant digit first) or MSD (most significant digit first): LSD starts from the rightmost digit of the key, while MSD starts from the leftmost.
Taking the sequence [520 350 72 383 15 442 352 86 158 352] as an example, the sorting process is described as follows:
(figures omitted: the distribute and collect passes for the ones, tens, and hundreds digits)

Sort done!

2.5.2.3 Complexity Analysis

Average time complexity: O(dn) (d is the number of digits of the largest integer). 
Space complexity: O(10n) (10 for the buckets 0~9 that hold the temporary sequence). 
Stability: Stable. (For a code implementation, see the introduction to radix sort and its parallelization.)

2.5.3 Bucket Sort

Bucket sorting is also a type of allocation sorting, but it is based on comparison sorting, which is the biggest difference from radix sorting.

Idea: bucket sort resembles a hash table. It assumes the input conforms to some uniform distribution, e.g. the data is evenly distributed over [0,1); that interval can then be divided into 10 sub-intervals, called buckets, and the elements scattered into each bucket are sorted separately.

Requirement: the range of the numbers to be sorted is known, so that a mapping from key to bucket can be defined.

Sorting process: 
1) Set a quantitative array as an empty bucket; 
2) Search the sequence, and put the records one by one into the corresponding bucket; 
3) Sort each bucket that is not empty. 
4) Put items back into the original sequence from buckets that are not empty.

For example, take the list to be sorted K={49, 38, 35, 97, 76, 73, 27, 49}, whose values all lie between 1 and 100. We define 10 buckets and the mapping function f(k)=k/10, so the first key, 49, lands in bucket 4 (49/10=4). All keys are distributed into buckets in turn, and each non-empty bucket is then sorted, e.g. with quick sort.

Time complexity: 
The time complexity of bucket sorting N keywords is divided into two parts: 
1) The bucket mapping function of each keyword is calculated in a loop, and the time complexity is O(N).

2) Use a comparison sorting algorithm to sort the data within each bucket. For N data items and M buckets, with an average of N/M items per bucket, the in-bucket sorting time is \sum_{i=1}^{M}O(N_{i}\log N_{i}) = O(N\log\frac{N}{M}), where N_{i} is the amount of data in the i-th bucket.

Therefore, the average time complexity is linear O(N+C), where C is the time spent sorting in the bucket. When each bucket has only one number, the best time complexity is: O(N).

Sample code:

typedef struct node
 { 
     int keyNum;// number of elements in the bucket
     int key;   // stored element
     struct node * next;  
 }KeyNode;    

 //keys: array to sort; size: array length; bucket_size: number of buckets
 void inc_sort(int keys[],int size,int bucket_size)
 { 
     KeyNode* k;// iterator used when printing
     int i,j,b;
     KeyNode **bucket_table=(KeyNode **)malloc(bucket_size*sizeof(KeyNode *)); 
     for(i=0;i<bucket_size;i++)
     {  
         bucket_table[i]=(KeyNode *)malloc(sizeof(KeyNode)); 
         bucket_table[i]->keyNum=0;// records whether the bucket currently holds data
         bucket_table[i]->key=0;   // the bucket's data  
         bucket_table[i]->next=NULL; 
     }    

     for(j=0;j<size;j++)
     {   
         int index;
         KeyNode *p;
         KeyNode *node=(KeyNode *)malloc(sizeof(KeyNode));   
         node->key=keys[j];  
         node->next=NULL;  

         index=keys[j]/10;        // mapping function computes the bucket number  
         p=bucket_table[index];   // p points at the head of the bucket's list  
         if(p->keyNum==0)// the bucket has no data yet 
         {    
             bucket_table[index]->next=node;    
             (bucket_table[index]->keyNum)++;  // the head node counts the bucket's elements
         }
         else// the bucket already has data 
         {   
             // insertion sort within the linked list 
             while(p->next!=NULL&&p->next->key<=node->key)   
                 p=p->next;    
             node->next=p->next;     
             p->next=node;      
             (bucket_table[index]->keyNum)++;   
         }
     }
     // print the result
     for(b=0;b<bucket_size;b++)   
         // skip each bucket's head node; the following nodes hold the elements
         for(k=bucket_table[b];k->next!=NULL;k=k->next)  
         {
             printf("%d ",k->next->key);
         }
 }  

--------------------- 
Reposted from:
https://blog.csdn.net/K346K346/article/details/50791102
