DSAA: Quicksort (Part 1)

1. Basic Principle

  • The basic algorithm to sort an array S consists of the following four easy steps:
    • If the number of elements in S is 0 or 1, then return.
    • Pick any element v in S. This is called the pivot.
    • Partition S - {v} (the remaining elements in S) into two disjoint groups: $S_1 = \{ x \in S - \{v\} \mid x \le v \}$ and $S_2 = \{ x \in S - \{v\} \mid x \ge v \}$.
    • Return { quicksort(S1) followed by v followed by quicksort(S2)}.
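
The four steps can be written down almost literally. The following is only a minimal sketch under stated assumptions: the name `simple_quicksort`, the choice of the first element as the pivot, sending elements equal to the pivot to S2, and the use of scratch buffers are all illustrative choices, not something DSAA prescribes.

```c
#include <stdlib.h>
#include <string.h>

/* Sketch of the four basic steps. Sorts s[0..n-1] in place.
 * Returns 0 on success, -1 if a scratch buffer cannot be allocated.
 * Assumption: the pivot is simply the first element. */
int simple_quicksort(int *s, size_t n) {
    if (n <= 1)                       /* step 1: 0 or 1 elements */
        return 0;

    int v = s[0];                     /* step 2: pick a pivot v */
    int *s1 = malloc((n - 1) * sizeof *s1);
    int *s2 = malloc((n - 1) * sizeof *s2);
    if (s1 == NULL || s2 == NULL) {
        free(s1);
        free(s2);
        return -1;
    }

    size_t n1 = 0, n2 = 0;
    for (size_t k = 1; k < n; k++) {  /* step 3: partition S - {v} */
        if (s[k] < v)
            s1[n1++] = s[k];
        else
            s2[n2++] = s[k];          /* elements equal to v go to S2 here */
    }

    int rc = 0;
    if (simple_quicksort(s1, n1) != 0 || simple_quicksort(s2, n2) != 0)
        rc = -1;

    if (rc == 0) {                    /* step 4: S1, then v, then S2 */
        memcpy(s, s1, n1 * sizeof *s);
        s[n1] = v;
        memcpy(s + n1 + 1, s2, n2 * sizeof *s);
    }
    free(s1);
    free(s2);
    return rc;
}
```

This version spends extra memory on every call, which is one reason practical implementations partition in place instead.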

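 Those are the basic steps. Choosing v and partitioning S have both been discussed at length; here the author simply records the choices that DSAA recommends: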

  • Median-of-Three Partitioning: the common course is to use as pivot the median of the left, right and center elements. The author's code below simply uses the center element as the pivot.
  • Partitioning Strategy:
    • The first step is to get the pivot element out of the way by swapping it with the last element.
    • i starts at the first element and j starts at the next-to-last element.
    • While i is to the left of j, we move i right, skipping over elements that are smaller than the pivot. We move j left, skipping over elements that are larger than the pivot. When i and j have stopped, i is pointing at a large element and j is pointing at a small element. If i is to the left of j, those elements are swapped.
    • This scan-and-swap is repeated until i and j cross. Once i and j have crossed, no swap is performed. The final part of the partitioning is to swap the pivot element with the element pointed to by i.
      With these steps you can write a quicksort by hand without looking at any concrete code. As for why it is done this way, it is enough for now to accept that theory and practice have shown these to be good choices, both for picking the pivot and for partitioning the array.
  • Small Files: For very small files ($n \le 20$), quicksort does not perform as well as insertion sort.
    • A common solution is not to use quicksort recursively for small files, but instead use a sorting algorithm that is efficient for small files, such as insertion sort.
      For small inputs, using insertion sort directly beats quicksort; DSAA gives the cutoff as 20.
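
The pivot choice and the partitioning strategy described above can be sketched as follows. This is a hedged illustration, not the book's code: the names `swap_ints`, `median3_index` and `partition_last` are hypothetical, and ordering the three sampled elements in place is just one way to obtain their median.

```c
/* Hypothetical helper: exchange two ints through pointers. */
static void swap_ints(int *x, int *y) {
    int t = *x;
    *x = *y;
    *y = t;
}

/* Orders array[left], array[center], array[right] so that
 * array[left] <= array[center] <= array[right], then returns the
 * index of the median, which is center. */
int median3_index(int *array, int left, int right) {
    int center = left + (right - left) / 2;
    if (array[center] < array[left])
        swap_ints(&array[left], &array[center]);
    if (array[right] < array[left])
        swap_ints(&array[left], &array[right]);
    if (array[right] < array[center])
        swap_ints(&array[center], &array[right]);
    return center;
}

/* Partitions array[left..right] around array[pivot_index] and returns
 * the final position of the pivot. */
int partition_last(int *array, int left, int right, int pivot_index) {
    /* Get the pivot out of the way by moving it to the last slot. */
    swap_ints(&array[pivot_index], &array[right]);
    int pivot = array[right];
    int i = left;        /* starts at the first element */
    int j = right - 1;   /* starts at the next-to-last element */

    for (;;) {
        while (i < right && array[i] < pivot)  /* skip smaller elements */
            i++;
        while (j > left && array[j] > pivot)   /* skip larger elements */
            j--;
        if (i >= j)                            /* i and j met or crossed */
            break;
        swap_ints(&array[i], &array[j]);
        i++;
        j--;
    }
    swap_ints(&array[i], &array[right]);       /* put the pivot in place */
    return i;
}
```

Both scans stop on elements equal to the pivot, so a run of equal keys is split between the two sides rather than piling up on one of them.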

2. Implementation

  The heart of quicksort is the partitioning strategy. Once the partition is implemented correctly, the rest of the logic follows quickly:

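void swap(int * array,int left, int right);
void insert_sort(int * array,int left,int right);

void quick_sort(int * array,int left,int right){
  int i,j,center;
  //base case: small subarrays are handed to insertion sort
  if(right-left+1 < 20){
      insert_sort(array,left,right);
      return ;
   }
  //the author simply takes the middle element as the pivot
  center=left+(right-left)/2;
  swap(array,center,right);
  //core part: the partition (which also partially orders the elements)
  for(i=left,j=right-1;i<=j && i<right && j>=left;){
    if(array[i]<array[right])
        i++;
    else if(array[j]>array[right])
        j--;
    else if(array[i] == array[right] && array[j] == array[right])
        //both equal the pivot: advance both so the loop cannot stall
        i++,j--;
    else 
        //this case requires a swap
        swap(array,i,j);
  } 
  swap(array,i,right);
  quick_sort(array,left,i-1);
  quick_sort(array,i+1,right);
}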

void swap(int * array,int left, int right){
  int tmp;
  tmp=array[left];
  array[left]=array[right];
  array[right]=tmp;
}

void insert_sort( int * array, int left,int right ){
    //signed indices: an unsigned p would compare incorrectly when right is -1
    int j, p;
    int tmp;
    for( p=left+1; p <= right; p++ ){
        tmp = array[p];
        for( j = p;  j>left; j-- )
            if(tmp<array[j-1])
                array[j] = array[j-1];
            else
                break;
        array[j] = tmp;
    }
}

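  The author's implementation differs somewhat from the book's. Without careful thought, for example when writing quicksort by hand in under 20 minutes, many people would end up with roughly this version. There are, however, inputs on which a version like this may not fare well: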

  • Suppose the input is 2,3,4, …,n -1, n, 1. What is the running time of this version of quicksort?
  • Suppose the input is in reverse order. What is the running time of this version of quicksort?
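
To experiment with these two questions, the inputs can be generated with small helpers like the ones below. The names `fill_rotated` and `fill_reversed` are hypothetical, and the sketch only builds the inputs; it does not answer either question.

```c
/* Fills out[0..n-1] with 2, 3, ..., n, 1. Assumes n >= 1. */
void fill_rotated(int *out, int n) {
    for (int k = 0; k < n - 1; k++)
        out[k] = k + 2;
    out[n - 1] = 1;
}

/* Fills out[0..n-1] with n, n-1, ..., 1 (reverse order). */
void fill_reversed(int *out, int n) {
    for (int k = 0; k < n; k++)
        out[k] = n - k;
}
```

Feeding both inputs to a quicksort that counts its comparisons, at several values of n, is one way to see how a given pivot rule behaves.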

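  Thinking through these two questions settles the issue raised above: in median-of-three partitioning, do the three sampled values need to be sorted so that their median can be taken, or is it enough to take the value at the middle position? The author sets this aside for now.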

3. Time Complexity

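  From the steps above we get $T(n) = T(i) + T(n - i - 1) + Cn$, where $i$ is the number of elements that land on one side of the partition (the pivot itself is excluded, hence the $-1$). Recurrences like this fall out of divide and conquer directly; the strategy pays off here because the partitioning work at each level is only linear.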
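  The worst case is an extremely unbalanced partition: at every level of the recursion the pivot is the smallest element. The recurrence then becomes $T(n) = T(n-1) + Cn$, and summing it up gives a final time complexity of $O(n^2)$.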
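
The summation for the worst case can be spelled out as below. This is a sketch that treats $T(1)$ as a constant and ignores the cost of choosing the pivot.

```latex
\begin{align*}
T(n)   &= T(n-1) + Cn \\
T(n-1) &= T(n-2) + C(n-1) \\
       &\;\;\vdots \\
T(2)   &= T(1) + 2C \\[4pt]
\text{Adding all lines:}\quad
T(n)   &= T(1) + C\sum_{k=2}^{n} k
        = T(1) + C\left(\frac{n(n+1)}{2} - 1\right)
        = O(n^2)
\end{align*}
```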
  The best case is when the array is split evenly every time. The analysis is then the same as for merge sort, giving $O(n \log n)$.
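
For the best case, a sketch of the derivation under the simplifying assumptions that $n$ is a power of 2 and that each half has exactly $n/2$ elements (the pivot is ignored):

```latex
\begin{align*}
T(n) &= 2\,T(n/2) + Cn \\
\frac{T(n)}{n} &= \frac{T(n/2)}{n/2} + C
   \qquad\text{(divide both sides by } n\text{)} \\
\frac{T(n)}{n} &= \frac{T(1)}{1} + C\log_2 n
   \qquad\text{(telescope over } \log_2 n \text{ levels)} \\
T(n) &= Cn\log_2 n + n\,T(1) = O(n \log n)
\end{align*}
```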
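  The average case has to account for the possibly unbalanced partitions at each level of recursion. For quick-study purposes, just remember the result: $O(n \log n)$.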

4. Closing Notes

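  This section skips submitting the program above to an online judge (getting it AC). The author has run it locally against a few self-made test cases, including ones with duplicate elements, and it passed. Implementing quicksort is not complicated, but the problems that some of its optimization details solve are not easy to spot. The question of whether Median-of-Three Partitioning needs to sort the three elements is left open.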


Reposted from blog.csdn.net/lovestackover/article/details/80389738