Introduction to Algorithms: Quicksort and Its Optimizations

Plain quicksort: the time complexity is O(n lg n) in the best case and O(n^2) in the worst case.

Quicksort follows the divide-and-conquer paradigm:

Decomposition: The array A[p..r] is partitioned into two (possibly empty) subarrays A[p..q-1] and A[q+1..r] such that every element of A[p..q-1] is less than or equal to A[q], which in turn is less than or equal to every element of A[q+1..r]. The index q is also computed as part of this partitioning.

Solution: Sort the subarrays A[p..q-1] and A[q+1..r] by recursively calling quicksort.

Merge: Since the two subarrays are sorted in place, no work is needed to combine them: the entire array A[p..r] is now sorted.

// Sort arr[low..high] (inclusive) in place
void quicksort(int arr[], int low, int high)
{
    if(low < high)
    {
        int pi = partition(arr, low, high); // final position of the pivot
        
        quicksort(arr, low, pi-1);  // sort the left subarray
        quicksort(arr, pi+1, high); // sort the right subarray
    }
}

The running time of quicksort depends on how balanced the splits produced by Partition are.

The worst case occurs when the input array is already sorted: every partition then yields subproblems of size n-1 and 0, and the running time degrades to O(n^2).

For any split of constant proportion, even one as lopsided as 9:1, quicksort is asymptotically as fast as with a perfect middle split (see Introduction to Algorithms for the detailed analysis).

That is, for any constant-proportion split, the total running time is O(n lg n).
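To see why, here is a sketch of the recursion-tree argument for the 9:1 case (with c an assumed constant bounding the per-level partitioning cost):

```latex
T(n) \le T\!\left(\tfrac{9n}{10}\right) + T\!\left(\tfrac{n}{10}\right) + cn
% Every level of the recursion tree costs at most cn in total,
% and the longest root-to-leaf path has length \log_{10/9} n, so
% T(n) \le cn \cdot \log_{10/9} n = O(n \lg n).
```

The same argument works for any fixed ratio a : (1-a), only the base of the logarithm (and hence the constant factor) changes.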

 

Randomized version of quicksort:

Because Partition may produce a "bad" split, and the quality of the split hinges on the choice of the pivot A[r], we can apply a randomization technique called random sampling: swap A[r] with a randomly chosen element of A[p..r]. The pivot is then no longer fixed to be the last element A[r], but is drawn uniformly at random from positions p, ..., r. In expectation, this makes the splits produced by Partition reasonably balanced.
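As a minimal sketch of this idea (function names here are illustrative; the Lomuto partition scheme matches the code later in this post):

```cpp
#include <cstdlib>
#include <utility>

// Lomuto partition around A[r]: returns the pivot's final index.
int Partition(int *A, int p, int r) {
    int x = A[r], i = p - 1;
    for (int j = p; j < r; ++j)
        if (A[j] <= x) std::swap(A[++i], A[j]);
    std::swap(A[++i], A[r]);
    return i;
}

// Random sampling: swap a uniformly chosen element of A[p..r]
// into position r, so the pivot is no longer fixed.
int RandomizedPartition(int *A, int p, int r) {
    int i = p + std::rand() % (r - p + 1); // random index in [p, r]
    std::swap(A[i], A[r]);
    return Partition(A, p, r);
}

void RandomizedQuicksort(int *A, int p, int r) {
    if (p < r) {
        int q = RandomizedPartition(A, p, r);
        RandomizedQuicksort(A, p, q - 1);
        RandomizedQuicksort(A, q + 1, r);
    }
}
```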

 

Discussion of further optimization of quicksort:

 1. Tail recursion:

Traditional recursion is often regarded as a monstrous beast whose reputation is forever tied to inefficiency. Tail recursion matters a great deal here: without tail-call elimination, the stack consumption of deeply recursive functions is unbounded, since a large number of intermediate stack frames must be saved.

Stack depth in quicksort:

The QUICKSORT algorithm makes two recursive calls to itself: after PARTITION, it recursively sorts the left subarray and then the right subarray. The second recursive call is not strictly necessary; it can be replaced by an iterative control structure. This technique is called tail-recursion elimination, and most compilers apply it automatically.

The following version simulates tail recursion:

 

QUICKSORT'(A, p, r)

1  while p < r

2        do ▸ Partition and sort left subarray.

3             q ← PARTITION(A, p, r)

4             QUICKSORT'(A, p, q - 1)

5             p ← q + 1

 

Note that line 1 uses while instead of if.
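A direct C++ rendering of this pseudocode might look as follows (a sketch; the function names are illustrative, and Partition is the same Lomuto scheme used later in this post):

```cpp
#include <utility>

// Lomuto partition around A[r]: returns the pivot's final index.
int Partition(int *A, int p, int r) {
    int x = A[r], i = p - 1;
    for (int j = p; j < r; ++j)
        if (A[j] <= x) std::swap(A[++i], A[j]);
    std::swap(A[++i], A[r]);
    return i;
}

// Tail recursion eliminated on the right subarray: the while loop
// replaces the second recursive call by resetting p to q + 1.
void Quicksort2(int *A, int p, int r) {
    while (p < r) {
        int q = Partition(A, p, r);
        Quicksort2(A, p, q - 1); // recurse only on the left subarray
        p = q + 1;               // iterate on the right subarray
    }
}
```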

But in the worst case this version still has recursion depth O(n) when the splits are bad. Can it be optimized further so that the stack depth is O(lg n)?

 

Using a halving idea: to make the worst-case stack depth Θ(lg n), we must ensure that the left subarray produced by PARTITION is at most half the size of the original array, so that the recursion depth is at most Θ(lg n).

One possible algorithm is to first find the median of A[p..r] and use it as the pivot for PARTITION, which guarantees that the two sides are as balanced as possible.

Since the median can be found in Θ(n) time, the O(n lg n) expected running time of the algorithm is preserved.
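A sketch of this median-pivot variant, using std::nth_element as a stand-in for median selection (note the assumption: nth_element is linear only on average; a worst-case Θ(n) guarantee would need a selection algorithm such as median-of-medians; the function name is illustrative):

```cpp
#include <algorithm>
#include <utility>

// Partition around the median so the left side holds at most about
// half the elements; combined with tail-recursion elimination on the
// right side, this bounds the recursion depth at roughly lg n.
void MedianPivotQuicksort(int *A, int p, int r) {
    while (p < r) {
        int mid = p + (r - p) / 2;
        std::nth_element(A + p, A + mid, A + r + 1); // median moved to A[mid]
        std::swap(A[mid], A[r]);                     // use the median as pivot
        int x = A[r], i = p - 1;                     // Lomuto partition
        for (int j = p; j < r; ++j)
            if (A[j] <= x) std::swap(A[++i], A[j]);
        std::swap(A[++i], A[r]);
        MedianPivotQuicksort(A, p, i - 1); // left half: about n/2 elements
        p = i + 1;                         // iterate on the right half
    }
}
```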

 

2. "Median of three" partition

The so-called "median of three" means picking three elements at random from the subarray and using the middle one as the pivot. This is an upgraded version of the randomized version above; even so, it only affects the constant factor in quicksort's O(n lg n) running time.

The following gives a final quicksort that combines the optimized tail recursion with the median-of-three partition:

// Optimized tail recursion + median-of-three quicksort
#include<cstdio>
#include<ctime>
#include<cstdlib>
  
 
void Swap(int &a,int &b){ if(a!=b){a^=b;b^=a;a^=b;} } // The a!=b guard matters when a and b refer to the same variable: XOR-swapping a value with itself would zero it
 
 
int Partition(int *A,int p,int r){
    int x, i;
    x=A[r];
    i=p-1;
    for(int j=p; j<=r-1; ++j){
        if(A[j]<=x) {
            Swap(A[++i], A[j]);    
        }
    }
    Swap(A[++i],A[r]);
    return i;
}

// Return a random integer in [m, n]. Do not call srand here: reseeding
// with time(NULL) on every call returns identical values within the
// same second. Seed once, e.g. at the start of main.
inline int Random(int m,int n){
    return m+(rand()%(n-m+1));
}

// Return the median (middle value) of three numbers
inline int MidNum(int a,int b,int c){
	if(c<b) Swap(c,b);
	if(b<a) Swap(b,a); // after these two swaps, a is the smallest of the three
	return b<c?b:c;
}

int ThreeOne_Partition(int *A,int p,int r){
	int i,j,k,mid;
	
	// randomly select three numbers
	i=Random(p,r);
	j=Random(p,r);
	k=Random(p,r);

	// get the "middle number"
	mid=MidNum(A[i],A[j],A[k]);
	
	// Swap "middle number" with A[r]
	if(A[i]==mid) Swap(A[i],A[r]);
	else if(A[j]==mid) Swap(A[j],A[r]);
	else if(A[k]==mid) Swap(A[k],A[r]);

	return Partition(A,p,r);
}

// Recurse on the smaller side and loop on the larger: stack depth O(lg n)
void Final_QuickSort(int *A,int p,int r){
	while(p<r){
		int q=ThreeOne_Partition(A,p,r);
		if(q-p<r-q){
			Final_QuickSort(A,p,q-1);
			p=q+1;
		}
		else{
			Final_QuickSort(A,q+1,r);
			r=q-1;
		}
	}
}

int main()
{
    srand((unsigned)time(NULL)); // seed the random generator once
    int arr[12]={2,7,4,9,8,5,7,8,2,0,7,-4};
    Final_QuickSort(arr,0,11);
    for(int i=0; i<12; ++i)
        printf("%d ",arr[i]);
    putchar('\n');
    return 0;
}

In addition, quicksort can be optimized in other ways:

Non-recursive method: simulate the recursion with an explicit stack, so that recursive calls are eliminated entirely.

Three-way quicksort: the basic idea is to partition the array A[p..r] into three segments around the pivot V = A[r]: a left segment A[p..j] whose elements are less than V, a middle segment A[j+1..q-1] whose elements are equal to V, and a right segment A[q..r] whose elements are greater than V. The algorithm then recursively sorts only the left and right segments. This greatly improves sorting efficiency on arrays containing many duplicate keys, and even without many duplicates it does not reduce the efficiency of the original quicksort.

I may implement the code for these two in a future post.
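In the meantime, here is a minimal sketch of the three-way idea using the Dutch-national-flag partition (the function name is illustrative; the exact segment indices differ slightly from the j/q description above but the three segments are the same):

```cpp
#include <utility>

// Three-way quicksort: partition around v = A[r] so that on exit
// A[p..lt-1] < v, A[lt..gt] == v, A[gt+1..r] > v, then recurse
// only on the strictly-smaller and strictly-larger segments.
void ThreeWayQuicksort(int *A, int p, int r) {
    if (p >= r) return;
    int v = A[r];
    int lt = p, gt = r, i = p;
    while (i <= gt) {
        if (A[i] < v)      std::swap(A[lt++], A[i++]);
        else if (A[i] > v) std::swap(A[i], A[gt--]);
        else               ++i; // equal to the pivot: leave in the middle
    }
    ThreeWayQuicksort(A, p, lt - 1); // elements < v
    ThreeWayQuicksort(A, gt + 1, r); // elements > v
}
```

Because the middle segment (all keys equal to the pivot) is never touched again, arrays dominated by duplicates are sorted in close to linear time.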

 

This concludes the quicksort summary.

So far, the five major sorting algorithms have been summarized. All of them are comparison-based, and the fastest is only O(n lg n). Is there anything faster?

The next article will summarize linear-time sorting, whose speed breaks through this bottleneck.

Reference for this article: http://blog.csdn.net/shuangde800/article/details/7599509

 

-- My abilities are limited; if there are any errors or omissions, please help supplement and improve!

