【Data Structure】 | Quick Sort

Foreword

Quick sort is one of the best general-purpose sorting algorithms in practice, and well-known IT companies such as Tencent and Microsoft like to test it. Interviewers often ask candidates to write quick sort by hand. So today I will walk you through the common ways of writing quick sort (the Hoare version, the digging method, and the front and back pointer method), the non-recursive version (a regular in campus recruitment exams), the time complexity analysis, the weaknesses of quick sort, and how to optimize it.


1. Introduction to Quick Sort

Core idea

  • Choose a reference value, the key.
  • Rearrange the interval so that every number to the left of the key is less than or equal to it, and every number to the right of the key is greater than or equal to it.
  • Treat the parts to the left and right of the key as new intervals, and repeat the first two steps until the entire interval is in order.

2. Code implementation

1. Hoare version

Let's first look at the Hoare version, which is also the original version of quick sort.

Full code

void QuickSort1(int* a, int left, int right)
{
	// left == right: the interval holds a single element, nothing to sort
	// left > right: L and R have crossed, the interval is empty, nothing to sort
	if (left >= right)
		return;

	int begin = left, end = right;
	// take the left endpoint as the key
	int keyi = left;

	while (left < right)
	{
		// If keyi is the left endpoint, the right pointer must move first;
		// otherwise the left pointer must move first.
		// Only then is the value where L and R meet guaranteed to be
		// no greater than a[keyi].

		while (left < right && a[right] >= a[keyi])
			right--;

		while (left < right && a[left] <= a[keyi])
			left++;

		Swap(&a[left], &a[right]);
	}

	Swap(&a[left], &a[keyi]);
	keyi = left;

	// [begin, keyi-1] keyi [keyi+1, end]
	// recurse into both sides
	QuickSort1(a, begin, keyi - 1);
	QuickSort1(a, keyi + 1, end);
}
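
The code above calls a Swap helper that is not shown in this article. A minimal sketch, assuming it simply exchanges two ints through pointers (the body here is an assumption, not the original author's code):

```c
#include <assert.h>
#include <stddef.h>

/* Exchange the two ints that p1 and p2 point to.
   Assumed helper: the article uses Swap but never defines it. */
void Swap(int* p1, int* p2)
{
	assert(p1 != NULL && p2 != NULL);
	int tmp = *p1;
	*p1 = *p2;
	*p2 = tmp;
}
```

Going through a temporary variable keeps a call such as Swap(&a[i], &a[i]) harmless, which matters because the partition loop swaps an element with itself whenever left and right meet.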

Note that the whole process is carried out in place in the original array; it is only easier to follow if you picture the sub-intervals as separate pieces. Have you noticed how similar this is to a preorder traversal of a binary tree? It is a two-way recursion: handle the root first (one partition pass), then the left subtree (the left sub-interval), then the right subtree (the right sub-interval), and return when the interval is empty.

Note: When the key is taken from the left endpoint, the right pointer must move first; conversely, when the key is taken from the right endpoint, the left pointer must move first. This guarantees that the value at the position where the two pointers meet is no greater than the key (or no less than the key, when the key is on the right), so the final swap with the key keeps the partition valid.
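
To see why the order matters, here is a small sketch that runs one Hoare pass in either order. HoarePassDemo, SwapInt and the rightFirst flag are hypothetical names introduced only for this illustration:

```c
#include <assert.h>
#include <stddef.h>

static void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* One Hoare pass over a[left..right] with the key at the left end.
   rightFirst != 0: the right pointer scans first (correct for a left key).
   rightFirst == 0: the left pointer scans first (wrong for a left key).
   Returns the index where the key ends up. */
int HoarePassDemo(int* a, int left, int right, int rightFirst)
{
	assert(a != NULL && left <= right);
	int keyi = left;
	while (left < right)
	{
		if (rightFirst)
		{
			while (left < right && a[right] >= a[keyi]) right--;
			while (left < right && a[left] <= a[keyi]) left++;
		}
		else
		{
			while (left < right && a[left] <= a[keyi]) left++;
			while (left < right && a[right] >= a[keyi]) right--;
		}
		SwapInt(&a[left], &a[right]);
	}
	SwapInt(&a[left], &a[keyi]);
	return left;
}
```

On {6, 1, 2, 7, 9} the right-first order yields {2, 1, 6, 7, 9} with the key at index 2. The left-first order stops both pointers on the 7, and the final swap produces {7, 1, 2, 6, 9}, which leaves a value larger than the key on its left side.
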
Next, let's test the speed of quick sort. On 1 million randomly generated values, quick sort took 75 milliseconds, heap sort took 109 milliseconds, and Shell sort took 110 milliseconds in our run.

2. Digging method

Full code

void QuickSort2(int* a, int left, int right)
{
	if (left >= right)
		return;

	int begin = left, end = right;
	// take the left endpoint as the key; its position becomes the first hole
	int key = a[left];
	int hole = left;

	while (left < right)
	{
		// the right pointer looks for a value smaller than key
		while (left < right && a[right] >= key)
			right--;

		Swap(&a[right], &a[hole]);
		hole = right;

		// the left pointer looks for a value larger than key
		while (left < right && a[left] <= key)
			left++;

		Swap(&a[left], &a[hole]);
		hole = left;
	}

	a[hole] = key;

	// [begin, hole-1] hole [hole+1, end]
	// recurse into both sides
	QuickSort2(a, begin, hole - 1);
	QuickSort2(a, hole + 1, end);
}

Define a temporary variable key to save the value at the left endpoint. The left endpoint now forms a hole, and the right pointer moves left looking for a value smaller than key.
Once it is found, that value is placed in the hole and the right pointer's position becomes the new hole; the left pointer then moves right looking for a value larger than key, and so on. When the left and right pointers meet, the saved key is written into the meeting position. (The code above fills the hole with Swap; because the hole is overwritten later anyway, this has the same effect as a plain assignment.)
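
The description above can also be written with plain assignments instead of Swap. This is only a sketch of a single pass; HolePartition is a hypothetical name, and it returns the final position of the key rather than recursing:

```c
#include <assert.h>
#include <stddef.h>

/* One hole-digging pass over a[left..right]; returns the final key index.
   Values are moved into the hole by plain assignment. */
int HolePartition(int* a, int left, int right)
{
	assert(a != NULL && left <= right);
	int key = a[left];   /* saving the key leaves a hole at left */
	int hole = left;
	while (left < right)
	{
		/* right pointer: find a value smaller than key, drop it into the hole */
		while (left < right && a[right] >= key) right--;
		a[hole] = a[right];
		hole = right;

		/* left pointer: find a value larger than key, drop it into the hole */
		while (left < right && a[left] <= key) left++;
		a[hole] = a[left];
		hole = left;
	}
	a[hole] = key;       /* the pointers met: the key fills the last hole */
	return hole;
}
```

For the input {6, 1, 2, 7, 9, 3, 4, 5, 10, 8} one pass leaves {5, 1, 2, 4, 3, 6, 9, 7, 10, 8} and returns index 5.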


3. Front and back pointer method

  • Define two indexes, cur and prev.
  • Initially, prev points to the left end of the interval, and cur points to the position right after prev.
  • Move cur to the right. When cur finds a value smaller than key, do ++prev, swap the values at cur and prev, and then ++cur.
  • When cur finds a value greater than or equal to key, just ++cur.
  • The loop ends when cur moves past the last element of the interval.
  • Swap the value at prev with key.
  • This completes a single pass of quick sort.
  • Then process the left and right intervals recursively.

The operation here can be understood as pushing the values larger than the key to the right and the values smaller than the key to the left.
Explanation: at any moment, one of two things holds:
1. cur is immediately after prev, or
2. prev and cur are separated by a run of values that are greater than or equal to the key.

Full code

void QuickSort3(int* a, int left, int right)
{
	if (left >= right)
		return;

	int begin = left, end = right;
	// take the left endpoint as the key
	int keyi = left;
	int prev = left;
	int cur = left + 1;

	while (cur <= right)
	{
		// values smaller than the key are moved behind prev
		if (a[cur] < a[keyi])
		{
			++prev;
			Swap(&a[cur], &a[prev]);
		}

		cur++;
	}

	Swap(&a[prev], &a[left]);
	keyi = prev;

	// [begin, keyi-1] keyi [keyi+1, end]
	// recurse into both sides
	QuickSort3(a, begin, keyi - 1);
	QuickSort3(a, keyi + 1, end);
}

3. Time complexity

1. The average time complexity of quick sort is O(nlogn).
The single passes on one level of the recursion process about n elements in total, and in the ideal case the recursion forms a balanced binary tree, so the recursion depth is logn and the overall time complexity is O(nlogn).
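
Under the ideal assumption that every pass splits the interval into two equal halves, the argument above can be written as a recurrence. Here c is an assumed constant for the per-element cost of one pass:

```latex
\begin{aligned}
T(n) &= 2\,T\!\left(\tfrac{n}{2}\right) + cn \\
     &= 4\,T\!\left(\tfrac{n}{4}\right) + 2cn \\
     &= \dots = 2^{k}\,T\!\left(\tfrac{n}{2^{k}}\right) + k\,cn \\
     &= n\,T(1) + cn\log_2 n = O(n\log n) \qquad (k = \log_2 n)
\end{aligned}
```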

Disadvantages of quick sort

In the worst case quick sort is O(n^2).
When the data is already in order and the key is taken from an endpoint, the key is always the minimum or maximum of the interval. Each single pass then splits off only one non-empty sub-interval, which is just one element shorter than before. The recursion depth grows to n levels, and the time complexity becomes O(n^2).
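
The worst case can be observed by counting comparisons. The sketch below instruments the front and back pointer version with a counter; CountingQuickSort, ComparesOnSortedInput and g_compares are hypothetical names used only for this experiment:

```c
#include <assert.h>

static long long g_compares = 0;   /* hypothetical counter, illustration only */

static void SwapInt(int* x, int* y) { int t = *x; *x = *y; *y = t; }

/* Front and back pointer quick sort with the left endpoint as key,
   instrumented to count comparisons against the key. */
void CountingQuickSort(int* a, int left, int right)
{
	if (left >= right)
		return;
	int keyi = left, prev = left;
	for (int cur = left + 1; cur <= right; cur++)
	{
		g_compares++;
		if (a[cur] < a[keyi])
		{
			++prev;
			SwapInt(&a[cur], &a[prev]);
		}
	}
	SwapInt(&a[prev], &a[keyi]);
	CountingQuickSort(a, left, prev - 1);
	CountingQuickSort(a, prev + 1, right);
}

/* Sort the already ordered array 0..n-1 and return the comparison count. */
long long ComparesOnSortedInput(int n)
{
	assert(n > 0 && n <= 1000);
	int a[1000];
	for (int i = 0; i < n; i++)
		a[i] = i;
	g_compares = 0;
	CountingQuickSort(a, 0, n - 1);
	return g_compares;
}
```

For 1000 ordered values this counts 1000 * 999 / 2 = 499500 comparisons, while a balanced split would need roughly n * log2(n), about 10000.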

Optimization

1. Random key selection

Use left + rand() % (right - left + 1) to generate a random subscript inside [left, right], swap that element with the left endpoint, and keep using the left endpoint as the key. The chance that the key is the maximum or minimum of the interval is then very low, because the data sets we sort are large, typically 1 million or 10 million values. Note the left + in front: the interval may be the right sub-interval of an earlier key (we always operate on the whole array a), so the offset left must be added for the subscript to land inside [left, right].

void QuickSort3(int* a, int left, int right)
{
	if (left >= right)
		return;

	int begin = left, end = right;

	// pick the key at random (rand() needs <stdlib.h>; seed once with srand())
	int randi = left + rand() % (right - left + 1);
	Swap(&a[left], &a[randi]);

	// take the left endpoint as the key
	int keyi = left;
	int prev = left;
	int cur = left + 1;

	while (cur <= right)
	{
		if (a[cur] < a[keyi])
		{
			++prev;
			Swap(&a[cur], &a[prev]);
		}

		cur++;
	}

	Swap(&a[prev], &a[left]);
	keyi = prev;

	// [begin, keyi-1] keyi [keyi+1, end]
	// recurse into both sides
	QuickSort3(a, begin, keyi - 1);
	QuickSort3(a, keyi + 1, end);
}

2. Median of three

Median of three, as the name implies, means picking the middle value (the second largest) of three numbers. We usually compare the values at the left endpoint, the right endpoint, and the midpoint of the interval. The chosen key is the median of those three, so for ordered data it is the true median of the interval and the worst case above is avoided.
Finding it takes a few pairwise comparisons; the specific code is below.

// Returns the subscript of the median of a[left], a[mid] and a[right].
int GetMidNumi(int* a, int left, int right)
{
	// written this way to avoid overflow of left + right
	int mid = left + (right - left) / 2;
	if (a[left] < a[mid])
	{
		if (a[mid] < a[right])
		{
			return mid;
		}
		// a[mid] >= a[right]
		else if (a[left] < a[right])
		{
			return right;
		}
		// a[left] >= a[right]
		else
		{
			return left;
		}
	}
	// a[left] >= a[mid]
	else
	{
		if (a[mid] > a[right])
		{
			return mid;
		}
		// a[mid] <= a[right]
		else if (a[left] < a[right])
		{
			return left;
		}
		else
		{
			return right;
		}
	}
}

void QuickSort3(int* a, int left, int right)
{
	if (left >= right)
		return;

	int begin = left, end = right;

	// median of three: move the median to the left endpoint
	int midi = GetMidNumi(a, left, right);
	if (midi != left)
		Swap(&a[left], &a[midi]);

	// take the left endpoint as the key
	int keyi = left;
	int prev = left;
	int cur = left + 1;

	while (cur <= right)
	{
		if (a[cur] < a[keyi])
		{
			++prev;
			Swap(&a[cur], &a[prev]);
		}

		cur++;
	}

	Swap(&a[prev], &a[left]);
	keyi = prev;

	// [begin, keyi-1] keyi [keyi+1, end]
	// recurse into both sides
	QuickSort3(a, begin, keyi - 1);
	QuickSort3(a, keyi + 1, end);
}

Since the pairwise comparison logic here is a bit convoluted, if you need to write quick sort during an interview, you can use random key selection instead.

3. Small-interval optimization

When only five numbers are left in an interval, putting them in order by recursion still takes 6 more recursive calls. Every function call has to set up a stack frame, so the overhead is large compared with the work done. For such small intervals we can use direct insertion sort instead of recursing further; the code below switches over once an interval holds 10 or fewer values.

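// small-interval optimization
int PartSort1(int* a, int left, int right);  // single partition pass, defined below
void InsertSort(int* a, int n);              // direct insertion sort over n values

void QuickSort(int* a, int left, int right)
{
	if (left >= right)
		return;

	if (right - left + 1 > 10)
	{
		int keyi = PartSort1(a, left, right);
		QuickSort(a, left, keyi - 1);
		QuickSort(a, keyi + 1, right);
	}
	else
	{
		// 10 or fewer values: insertion sort is cheaper than more recursion
		InsertSort(a + left, right - left + 1);
	}
}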

// One front and back pointer pass over a[left..right].
// Returns the final subscript of the key.
int PartSort1(int* a, int left, int right)
{
	// median of three: move the median to the left endpoint
	int midi = GetMidNumi(a, left, right);
	if (midi != left)
		Swap(&a[left], &a[midi]);

	// take the left endpoint as the key
	int keyi = left;
	int prev = left;
	int cur = left + 1;

	while (cur <= right)
	{
		if (a[cur] < a[keyi])
		{
			++prev;
			Swap(&a[cur], &a[prev]);
		}

		cur++;
	}

	Swap(&a[prev], &a[left]);
	keyi = prev;

	return keyi;
}
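
The small-interval code calls InsertSort, which is not shown in this article. A minimal sketch of direct insertion sort, assuming the signature implied by the call InsertSort(a + left, right - left + 1):

```c
#include <assert.h>
#include <stddef.h>

/* Direct insertion sort of a[0..n-1] in ascending order.
   Assumed helper: the signature is inferred from how the article calls it. */
void InsertSort(int* a, int n)
{
	assert(a != NULL || n == 0);
	for (int i = 1; i < n; i++)
	{
		int tmp = a[i];      /* value to insert into the sorted prefix */
		int end = i - 1;     /* last index of the sorted prefix */
		while (end >= 0 && a[end] > tmp)
		{
			a[end + 1] = a[end];   /* shift larger values one step right */
			end--;
		}
		a[end + 1] = tmp;
	}
}
```

Because the function takes a start pointer and a length, passing a + left sorts only the sub-interval and leaves the rest of the array untouched.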

4. Non-recursive quick sort

We use a stack to simulate the recursion. The stack stores the subscripts of the left and right endpoints of each pending interval.

  • Pop an interval from the stack and run a single partition pass on it.
  • Push the sub-intervals produced by that pass onto the stack.
  • If a sub-interval holds only one value or does not exist, do not push it.
  • Exit the loop when the stack is empty.

void QuickSortNonR(int* a, int left, int right)
{
	// nothing to sort for an empty or single-value interval
	if (left >= right)
		return;

	ST st;
	STInit(&st);
	// push right first so that left comes off the stack first
	STPush(&st, right);
	STPush(&st, left);

	while (!STEmpty(&st))
	{
		int begin = STTop(&st);
		STPop(&st);
		int end = STTop(&st);
		STPop(&st);

		int keyi = PartSort1(a, begin, end);
		// [begin, keyi-1] keyi [keyi+1, end]
		if (keyi + 1 < end)
		{
			STPush(&st, end);
			STPush(&st, keyi + 1);
		}

		if (begin < keyi - 1)
		{
			STPush(&st, keyi - 1);
			STPush(&st, begin);
		}
	}

	STDestroy(&st);
}
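
The ST type and the ST* functions used above come from a stack implementation that is not part of this article. A minimal sketch of an int stack with the same function names, assuming a growable array underneath (the struct layout and growth policy are assumptions):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Assumed layout: a dynamic array plus element count and capacity. */
typedef struct Stack
{
	int* data;
	int top;        /* number of stored values */
	int capacity;   /* allocated slots */
} ST;

void STInit(ST* st)
{
	assert(st != NULL);
	st->data = NULL;
	st->top = 0;
	st->capacity = 0;
}

void STDestroy(ST* st)
{
	assert(st != NULL);
	free(st->data);
	st->data = NULL;
	st->top = 0;
	st->capacity = 0;
}

bool STEmpty(const ST* st)
{
	assert(st != NULL);
	return st->top == 0;
}

void STPush(ST* st, int x)
{
	assert(st != NULL);
	if (st->top == st->capacity)
	{
		/* grow geometrically so pushes stay amortized O(1) */
		int newCap = st->capacity == 0 ? 16 : st->capacity * 2;
		int* tmp = (int*)realloc(st->data, sizeof(int) * (size_t)newCap);
		if (tmp == NULL)
			abort();
		st->data = tmp;
		st->capacity = newCap;
	}
	st->data[st->top++] = x;
}

int STTop(const ST* st)
{
	assert(st != NULL && st->top > 0);
	return st->data[st->top - 1];
}

void STPop(ST* st)
{
	assert(st != NULL && st->top > 0);
	st->top--;
}
```

Each interval costs two pushes (right endpoint, then left endpoint), so the matching two pops return the left endpoint first, exactly as QuickSortNonR expects.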

Origin blog.csdn.net/2301_77412625/article/details/129998840