Sorting algorithms: why is quick sort so fast?

This article introduces quick sort, one of the most commonly used and most important sorting algorithms.

Table of contents

1 Quick sort example

2 Code analysis

3 Efficiency analysis



1 Quick sort example

Quick sort was proposed by C. A. R. Hoare in 1960 and can be seen as an improvement on bubble sort. Quick sort, just like its name, is very efficient. It uses essentially the same amount of space as bubble sort and selection sort, sorting the array in place, but it achieves an average time complexity of O(n log n), much faster than their O(n^2).

So how does quick sort work? Its core idea is to pick a base number (the pivot) and move it to where it belongs, so that every number to its left is smaller than it and every number to its right is larger. Each pass therefore fixes at least the base number in its final position, guarantees that the numbers on the right are greater than the numbers on the left, and leaves the whole array roughly ordered.

How do we then handle the parts to the left and right of the base number? Very simply: recurse the same process on each side, and we are done. This is quick sort. Every pass settles at least one number for good; the numbers in the left region only ever need to be sorted within the left region, and likewise on the right, so most numbers quickly land near their correct positions, which greatly reduces the number of swaps required. The full routine is sketched in the code analysis below.

Let's look at an example. Given the array [4,5,1,10,3,6,9,2], how do we sort it with quick sort?

First, select the first number, 4, as the base number.

Set pointers i and j to point to the leftmost and rightmost elements respectively.

Pointer j moves to the left, looking for the first element less than the base number 4. Since the 2 it already points to is smaller than 4, pointer j does not move.

Pointer i moves to the right and finds the first element greater than the base number 4, which is 5.

Swap the elements at pointers i and j (5 and 2).

Pointer j continues to move to the left and finds 3, which is smaller than the base number.

Pointer i continues to move to the right and finds 10, which is greater than the base number.

Swap the elements at pointers i and j (10 and 3).

Pointer j continues to move left, meets pointer i, and stops.

Swap the element at pointer i (now 3) with the base number 4.

This pass of quick sort is over, and the base number 4 has reached the position where it belongs: the numbers to its left are all smaller than 4, and the numbers to its right are all larger. Next we just need to recurse the same process on each side.

2 Code analysis

The heart of the routine is the partition loop, which performs exactly the operations from the example just now; the details are noted in the code comments.

The two recursive calls at the end then process the left and right parts respectively, completing the overall sort.
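Here is a minimal C++ sketch of the routine described above; the function name quick_sort and the std::vector interface are illustrative choices, not the article's original listing. The base number is the leftmost element and pointer j moves first, exactly as in the example.

```cpp
#include <iostream>
#include <vector>

// Minimal sketch: the base number (pivot) is the leftmost element.
void quick_sort(std::vector<int>& a, int left, int right) {
    if (left >= right) return;                // 0 or 1 element: already sorted
    int pivot = a[left];                      // pick the base number
    int i = left, j = right;
    while (i < j) {
        // j moves left first, looking for an element smaller than the pivot;
        // moving j first guarantees the meeting point holds a value <= pivot
        while (i < j && a[j] >= pivot) --j;
        // i moves right, looking for an element greater than the pivot
        while (i < j && a[i] <= pivot) ++i;
        if (i < j) std::swap(a[i], a[j]);     // put each on its correct side
    }
    std::swap(a[left], a[i]);                 // drop the pivot into its final slot
    quick_sort(a, left, i - 1);               // recurse on the left part
    quick_sort(a, i + 1, right);              // recurse on the right part
}

int main() {
    std::vector<int> a = {4, 5, 1, 10, 3, 6, 9, 2};
    quick_sort(a, 0, static_cast<int>(a.size()) - 1);
    for (int x : a) std::cout << x << ' ';    // prints: 1 2 3 4 5 6 9 10
    std::cout << '\n';
}
```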

There is another question here: why must we always start searching from the right side (moving j first) rather than from the left?

The pointers must finally meet on an element that is no greater than the base number, because that element is swapped to the left end when it trades places with the base number, and only moving j first guarantees this. To see what goes wrong otherwise, continue the example above: after the second swap the array is [4,2,1,3,10,6,9,5], with i pointing at 3 and j pointing at 10. If i moved first, it would take one step to the right, stop on 10, and meet j there; swapping the base number with the element at i would then put 10 on the left side, violating the rule of quick sort that "the numbers to the left of the base number are smaller than it". The sketch below shows this failure concretely.
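The following sketch runs a single partition pass with the pointer order reversed; one_bad_pass is a hypothetical name used only for this contrast and is not part of the real algorithm.

```cpp
#include <iostream>
#include <vector>

// Deliberately WRONG variant: pointer i moves first. For illustration only.
void one_bad_pass(std::vector<int>& a, int left, int right) {
    int pivot = a[left];
    int i = left, j = right;
    while (i < j) {
        while (i < j && a[i] <= pivot) ++i;   // i moves first: the mistake
        while (i < j && a[j] >= pivot) --j;
        if (i < j) std::swap(a[i], a[j]);
    }
    std::swap(a[left], a[i]);                 // pointers met on 10, not on a small value
}

int main() {
    std::vector<int> a = {4, 5, 1, 10, 3, 6, 9, 2};
    one_bad_pass(a, 0, static_cast<int>(a.size()) - 1);
    for (int x : a) std::cout << x << ' ';    // prints: 10 2 1 3 4 6 9 5
    std::cout << '\n';                        // 10 ended up LEFT of the base number 4
}
```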

3 Efficiency analysis

The efficiency of quick sort depends heavily on the choice of the base number.

If the base number is chosen well, it lands exactly in the middle every time, so the two sub-problems are balanced at each level of recursion. Dividing the array in half repeatedly in this way gives a final time complexity of

T(n) = T(n/2) + T(n/2) + O(n) = O(n log n)
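Unrolling the recurrence shows where the n log n comes from: each level of recursion does O(n) total partitioning work across all sub-arrays, and halving the size means there are about log2(n) levels. Writing the O(n) term as cn:

T(n) = 2T(n/2) + cn
     = 4T(n/4) + 2cn
     = ...
     = 2^k * T(n/2^k) + k*cn

With k = log2(n) this becomes n*T(1) + cn*log2(n) = O(n log n).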

If the base number is poorly chosen, so that it happens to be the maximum or minimum every time, the sub-problem shrinks by only 1 at each step. This undoubtedly makes the efficiency much worse, and the final time complexity is

T(n) = T(n-1) + T(1) + O(n) = O(n^2)
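Expanding this recurrence shows the quadratic cost directly: every level still scans the whole remaining range, but the range shrinks by only one element per level, so the work forms an arithmetic series:

T(n) = cn + c(n-1) + ... + c*1 = c * n(n+1)/2 = O(n^2)

This worst case occurs, for example, when the array is already sorted and the first element is always chosen as the base number.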

Origin: blog.csdn.net/Hemk340200600/article/details/104284970