1. Basic principle
- Let |Si| denote the number of elements in Si. The steps of quickselect are:
- If |S| = 1, then k = 1 and return the element in S as the answer. If a cutoff for small files is being used and |S| ≤ CUTOFF, then sort S and return the kth smallest element.
- Pick a pivot element, v ∈ S.
- Partition S - {v} into S1 and S2, as was done with quicksort.
- If k ≤ |S1|, then the kth smallest element must be in S1. In this case, return quickselect(S1, k). If k = 1 + |S1|, then the pivot is the kth smallest element and we can return it as the answer. Otherwise, the kth smallest element lies in S2, and it is the (k - |S1| - 1)st smallest element in S2. We make a recursive call and return quickselect(S2, k - |S1| - 1).
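One point deserves emphasis: how do we decide which subset contains the kth element? Since |Si| is the number of elements in Si, we compare k with |S1|. If k ≤ |S1|, the kth smallest element is in S1. The remaining cases follow the same reasoning and are not restated here.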
From the description above, we can see that, unlike quicksort, quickselect makes only one recursive call per step.
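Because only one side is ever pursued, the single recursive call can be replaced by a loop. The following is a minimal sketch of that idea, not the implementation given in the next section. It assumes a Lomuto-style partition with the middle element as the pivot, and the names quickselect_iter and swap_ints are hypothetical.

```c
/* Hypothetical helper: exchange a[i] and a[j]. */
static void swap_ints(int *a, int i, int j) {
    int tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
}

/*
 * Return the kth smallest element (k is 1-based) of a[0..n-1].
 * The array is reordered in place. Assumes 1 <= k <= n.
 */
int quickselect_iter(int *a, int n, int k) {
    int lo = 0, hi = n - 1;
    int target = k - 1;              /* 0-based index where the answer ends up */

    while (lo < hi) {
        /* Use the middle element as the pivot v and park it at the right end. */
        swap_ints(a, lo + (hi - lo) / 2, hi);
        int v = a[hi];

        /* Lomuto partition: a[lo..store-1] < v and a[store..i-1] >= v. */
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] < v) {
                swap_ints(a, i, store);
                store++;
            }
        }
        swap_ints(a, store, hi);     /* the pivot lands at its final index */

        if (target < store)
            hi = store - 1;          /* the answer lies in S1 */
        else if (target > store)
            lo = store + 1;          /* the answer lies in S2 */
        else
            return a[store];         /* the pivot is the answer */
    }
    return a[lo];                    /* one element left, i.e. |S| = 1 */
}
```

For example, calling quickselect_iter on the array {7, 2, 9, 4, 1, 8, 3} with n = 7 and k = 3 returns 3. Tracking the answer as an absolute index (target) avoids having to adjust k when the search moves into S2.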
2. Implementation
Unlike the approach in the book, I have quick_sort return the kth element directly as its return value, which keeps things fairly simple:
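void swap(int *array, int left, int right);
void insert_sort(int *array, int left, int right);

//Returns the kth smallest element of array[left..right].
//k is the 1-based rank within the whole array, so the answer ends up at array[k-1].
//Assumes the initial call is quick_sort(array, k, 0, n-1) with 1 <= k <= n.
int quick_sort(int *array, int k, int left, int right){
    int i, j, center;
    //Base case of the recursion: sort small ranges directly
    if(right-left+1 < 20){
        insert_sort(array, left, right);
        return array[k-1];
    }
    //I simply take the middle element as the pivot and move it to the right end
    center = (left+right)/2;
    swap(array, center, right);
    //Core step: partitioning (which also partially sorts the array)
    for(i=left, j=right-1; i<=j;){
        if(array[i] < array[right])
            i++;
        else if(array[j] > array[right])
            j--;
        else if(array[i] == array[right] && array[j] == array[right])
            //Both equal the pivot: advance both so the loop cannot stall
            i++, j--;
        else
            //array[i] >= pivot and array[j] <= pivot, so they must be swapped
            swap(array, i, j);
    }
    //Move the pivot into place; i is now the index of the pivot
    swap(array, i, right);
    if(k < i+1)
        return quick_sort(array, k, left, i-1);
    else if(k == i+1)
        return array[i];
    else
        //k is a rank in the whole array, so it is passed on unchanged
        return quick_sort(array, k, i+1, right);
}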
void swap(int *array, int left, int right){
    int tmp;
    tmp = array[left];
    array[left] = array[right];
    array[right] = tmp;
}
//Insertion sort on array[left..right]
void insert_sort(int *array, int left, int right){
    int j, p;
    int tmp;
    for(p=left+1; p<=right; p++){
        tmp = array[p];
        for(j=p; j>left; j--)
            if(tmp < array[j-1])
                array[j] = array[j-1];
            else
                break;
        array[j] = tmp;
    }
}
3. Time complexity
- In contrast to quicksort, quickselect makes only one recursive call instead of two.
- The worst case of quickselect is identical to that of quicksort and is O(n^2).
- Intuitively, this is because quicksort’s worst case is when one of S1 and S2 is empty; thus, quickselect is not really saving a recursive call.
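In quicksort's extreme case, each step effectively makes only one recursive call, so the worst-case time complexity of quickselect is also O(n^2).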
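The quadratic worst case can be observed by counting comparisons. The sketch below is an illustration under stated assumptions, not the implementation from section 2: it deliberately uses the last element as the pivot (a poor choice on sorted input), and the names comparisons, quickselect_last_pivot and worst_case_comparisons are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical global counter of element-to-pivot comparisons. */
static long comparisons = 0;

/*
 * Quickselect with a deliberately naive pivot: always the last element.
 * Returns the kth smallest (1-based) of a[0..n-1]; assumes 1 <= k <= n.
 */
int quickselect_last_pivot(int *a, int n, int k) {
    int lo = 0, hi = n - 1, target = k - 1;
    while (lo < hi) {
        int v = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            comparisons++;                     /* one comparison against the pivot */
            if (a[i] < v) {
                int t = a[i]; a[i] = a[store]; a[store] = t;
                store++;
            }
        }
        int t = a[hi]; a[hi] = a[store]; a[store] = t;
        if (target < store)      hi = store - 1;
        else if (target > store) lo = store + 1;
        else                     return a[store];
    }
    return a[lo];
}

/*
 * Select the minimum of the sorted array 0, 1, ..., n-1 and report how many
 * comparisons were made. Returns -1 if memory allocation fails.
 */
long worst_case_comparisons(int n) {
    int *a = malloc((size_t)n * sizeof *a);
    if (a == NULL)
        return -1;
    for (int i = 0; i < n; i++)
        a[i] = i;
    comparisons = 0;
    quickselect_last_pivot(a, n, 1);
    free(a);
    return comparisons;
}
```

On sorted input with k = 1, every partition leaves S2 empty and shrinks the range by only one element, so the count is (n-1) + (n-2) + ... + 1 = n(n-1)/2. For n = 100 the function returns 4950.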
- The average running time, however, is O(n).
For now, the most practical choice is simply to remember this as a known result.
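For readers who want some intuition behind these bounds, here is a rough sketch rather than a proof. It assumes that partitioning n elements costs cn for some constant c, that n is a power of two, and, for the first recurrence, that an idealized pivot always halves the problem. The rigorous average-case analysis instead averages over all pivot positions.

```latex
% Idealized balanced case: one recursive call on half the elements
\begin{align*}
T(n) &= T(n/2) + cn \\
     &= cn + \frac{cn}{2} + \frac{cn}{4} + \cdots + 2c + T(1) \\
     &\le cn \sum_{i=0}^{\infty} 2^{-i} + T(1) \\
     &= 2cn + T(1) = O(n).
\end{align*}

% Worst case: one of S_1, S_2 is empty, so the problem shrinks by one element
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= c\,\bigl(n + (n-1) + \cdots + 2\bigr) + T(1) = O(n^2).
\end{align*}
```

The geometric series is what separates quickselect from quicksort: with two recursive calls the balanced recurrence becomes T(n) = 2T(n/2) + cn, which gives O(n log n) instead of O(n).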