Data Structures and Algorithms (Golang Implementation) (25): Sorting Algorithms - Quick Sort

Quick sort

Quick sort is a sorting algorithm based on the divide-and-conquer strategy. It was invented by the British computer scientist Tony Hoare and published in 1961 in Communications of the ACM, the monthly journal of the Association for Computing Machinery.

Note: ACM = Association for Computing Machinery, a worldwide professional organization for computing practitioners. Founded in 1947, it is the world's first scientific and educational computing society.

Quick sort is an improvement on bubble sort and, like it, belongs to the exchange class of sorting algorithms.

1. Algorithm introduction

Quick sort splits the data to be sorted into two independent parts in one pass, so that every element in one part is no larger than every element in the other part. The same method is then applied to each of the two parts. The whole process can be carried out recursively until the entire data set becomes an ordered sequence.

The steps are as follows:

  1. First take a number from the sequence as the reference number (the pivot). Generally the first number is taken.
  2. In the partitioning process, all numbers larger than the reference number are placed on its right, and all numbers smaller than or equal to it are placed on its left.
  3. Repeat the second step for the left and right intervals until each interval contains only one number.

As an example, take the sequence: 5 9 1 6 8 14 6 49 25 4 6 3.

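Generally the first number, 5, is taken as the reference number. The number right after it and the last number of the sequence are marked with [].

If the left marked number is greater than the reference number, it has to be thrown to the right: the two [] numbers are swapped, so the larger number ends up on the right, and then the right [] moves left. Otherwise the left [] moves right.

5 [9] 1 6 8 14 6 49 25 4 6 [3]  9 > 5: swap the two [], then move the right [] left
5 [3] 1 6 8 14 6 49 25 4 [6] 9  3 is not > 5: no swap, move the left [] right
5 3 [1] 6 8 14 6 49 25 4 [6] 9  1 is not > 5: no swap, move the left [] right
5 3 1 [6] 8 14 6 49 25 4 [6] 9  6 > 5: swap the two [] (both are 6), then move the right [] left
5 3 1 [6] 8 14 6 49 25 [4] 6 9  6 > 5: swap the two [], then move the right [] left
5 3 1 [4] 8 14 6 49 [25] 6 6 9  4 is not > 5: no swap, move the left [] right
5 3 1 4 [8] 14 6 49 [25] 6 6 9  8 > 5: swap the two [], then move the right [] left
5 3 1 4 [25] 14 6 [49] 8 6 6 9  25 > 5: swap the two [], then move the right [] left
5 3 1 4 [49] 14 [6] 25 8 6 6 9  49 > 5: swap the two [], then move the right [] left
5 3 1 4 [6] [14] 49 25 8 6 6 9  6 > 5: swap the two [], then move the right [] left
5 3 1 4 [14] 6 49 25 8 6 6 9  the two [] have met; because 14 > 5, swap 5 with 4, the number just before the []
Result of the first partition round: 4 3 1 5 14 6 49 25 8 6 6 9

The first round of quick sort has now divided the sequence into two parts:

4 3 1 and 14 6 49 25 8 6 6 9

Every number in the left sequence is smaller than 5, and every number in the right sequence is greater than 5.

The two sequences are then quick sorted recursively.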

Quick sort relies on the reference number to divide the sequence into two parts: one part no larger than the reference number, and one part larger than it.

In the best case, every round splits the sequence evenly. One partition round only needs to traverse the elements once, which takes O(n) time. Because each split halves the problem size and both halves are then partitioned recursively, the total time satisfies T(n) = 2*T(n/2) + O(n). By the master theorem, the time complexity is O(nlogn). We can also work it out directly:

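Let us analyze the best case, counting n/2 element visits for each partition round (the constant factor does not change the result; log is base 2):

T(n) = 2*T(n/2) + n/2
T(n/2) = 2*T(n/4) + n/4
T(n/4) = 2*T(n/8) + n/8
T(n/8) = 2*T(n/16) + n/16
...
T(4) = 2*T(2) + 2
T(2) = 2*T(1) + 1
T(1) = 1

Substituting each line into the one above it gives:

T(n) = 2*T(n/2) + n/2
     = 2^2*T(n/4) + n/2 + n/2
     = 2^3*T(n/8) + n/2 + n/2 + n/2
     = 2^4*T(n/16) + n/2 + n/2 + n/2 + n/2
     = ...
     = 2^logn*T(1) + logn * n/2
     = 2^logn + 1/2*nlogn
     = n + 1/2*nlogn

Because nlogn grows faster than n as the problem size n tends to infinity, T(n) = O(nlogn).

The best-case time complexity is O(nlogn).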
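The closed form above can be checked numerically. The following is a minimal sketch, assuming n is a power of two so that every split is exactly even; the function names bestCase and closedForm are made up for this illustration.

```go
package main

import "fmt"

// bestCase evaluates the recurrence T(n) = 2*T(n/2) + n/2 with T(1) = 1.
// Assumption: n is a power of two.
func bestCase(n int) int {
    if n <= 1 {
        return 1
    }
    return 2*bestCase(n/2) + n/2
}

// closedForm returns n + (n/2)*log2(n) for n a power of two.
func closedForm(n int) int {
    log2 := 0
    for m := n; m > 1; m >>= 1 {
        log2++
    }
    return n + n/2*log2
}

func main() {
    // Each line prints n, the recurrence value, and the closed form.
    for _, n := range []int{2, 8, 1024} {
        fmt.Println(n, bestCase(n), closedForm(n))
    }
}
```

For n = 8 both functions give 20, and for n = 1024 both give 6144, which matches n + 1/2*nlogn.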

In the worst case, the split is never even: each time the reference number is the largest or the smallest element, so the sequence cannot be divided into two parts. The recurrence becomes T(n) = T(n-1) + O(n). The master theorem does not cover this form, but expanding the recurrence shows that the time complexity is O(n^2):

Let us analyze the worst case, counting n element visits for each partition round:

T(n) = T(n-1) + n
     = T(n-2) + n-1 + n
     = T(n-3) + n-2 + n-1 + n
     = ...
     = T(1) + 2 + 3 + ... + n-2 + n-1 + n
     = O(n^2)

The worst-case time complexity is O(n^2).
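As a quick sanity check of the expansion, the sum can be evaluated directly. This is only a sketch: worstCase is a name invented here, and T(1) = 1 is assumed as in the best-case analysis.

```go
package main

import "fmt"

// worstCase evaluates T(n) = T(n-1) + n with T(1) = 1 iteratively.
func worstCase(n int) int {
    t := 1
    for k := 2; k <= n; k++ {
        t += k
    }
    return t
}

func main() {
    // Each line prints n, the recurrence value, and n*(n+1)/2.
    for _, n := range []int{1, 10, 1000} {
        fmt.Println(n, worstCase(n), n*(n+1)/2)
    }
}
```

The recurrence equals n*(n+1)/2, for example 500500 for n = 1000, which grows as n^2.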

According to the concept of entropy, the larger a collection is, the more random and spontaneously disordered it tends to be. So when the data to be sorted is very large, the worst case becomes less likely. Taken overall, the average time complexity of quick sort is O(nlogn). This is better than the basic sorting algorithms introduced earlier, whose time complexity is quadratic.

The result of partitioning greatly affects the performance of quick sort. To avoid uneven splits, there are several ways to improve:

  1. Before each quick sort, shuffle the sequence randomly and then partition it. This adds a random perturbation that reduces the chance of uneven splits. Alternatively, choose the reference number at random instead of always taking the first number.
  2. Each time, take three numbers from the head, middle, and tail of the sequence and use the median of the three as the reference number.

Method 1 is relatively good, while method 2 introduces additional comparison operations. In general, we can simply choose a reference number at random.
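Choosing the reference number at random could look like the sketch below. It reuses the two-index partition from this article and only adds a swap in front of it; randomPartition and QuickSortRandom are hypothetical names introduced for this illustration.

```go
package main

import (
    "fmt"
    "math/rand"
)

// partition is the two-index partition from this article:
// array[begin] is the reference number.
func partition(array []int, begin, end int) int {
    i, j := begin+1, end
    for i < j {
        if array[i] > array[begin] {
            array[i], array[j] = array[j], array[i]
            j--
        } else {
            i++
        }
    }
    if array[i] >= array[begin] {
        i--
    }
    array[begin], array[i] = array[i], array[begin]
    return i
}

// randomPartition (hypothetical helper) moves a randomly chosen element
// to the front so that it becomes the reference number, then partitions.
func randomPartition(array []int, begin, end int) int {
    k := begin + rand.Intn(end-begin+1)
    array[begin], array[k] = array[k], array[begin]
    return partition(array, begin, end)
}

// QuickSortRandom is quick sort with a random reference number.
func QuickSortRandom(array []int, begin, end int) {
    if begin < end {
        loc := randomPartition(array, begin, end)
        QuickSortRandom(array, begin, loc-1)
        QuickSortRandom(array, loc+1, end)
    }
}

func main() {
    list := []int{5, 9, 1, 6, 8, 14, 6, 49, 25, 4, 6, 3}
    QuickSortRandom(list, 0, len(list)-1)
    fmt.Println(list)
}
```

The sorted result is the same on every run; only the sequence of splits differs, which is what makes a consistently bad split unlikely.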

Quick sort sorts in place, so its storage space complexity is O(1). The recursion, however, uses the program stack: the number of stack layers ranges from logn to n, so the space complexity of the recursive stack is O(logn)~O(n), with O(n) in the worst case. When there are many elements, the program stack may overflow. By improving the algorithm with pseudo-tail recursion, the space complexity of the recursive stack can be reduced to O(logn); see the algorithm improvement section below.

Quick sort is unstable: elements are swapped during partitioning, so elements with the same value may change their relative positions.
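A small sketch can make the instability visible. It applies the same partition scheme to records that carry a tag remembering their original order; the record type and the function names are assumptions made for this example.

```go
package main

import "fmt"

// record pairs a sort key with a tag that remembers the original order.
type record struct {
    key int
    tag string
}

// partitionRecords is the article's two-index partition, comparing keys only.
func partitionRecords(a []record, begin, end int) int {
    i, j := begin+1, end
    for i < j {
        if a[i].key > a[begin].key {
            a[i], a[j] = a[j], a[i]
            j--
        } else {
            i++
        }
    }
    if a[i].key >= a[begin].key {
        i--
    }
    a[begin], a[i] = a[i], a[begin]
    return i
}

func quickSortRecords(a []record, begin, end int) {
    if begin < end {
        loc := partitionRecords(a, begin, end)
        quickSortRecords(a, begin, loc-1)
        quickSortRecords(a, loc+1, end)
    }
}

func main() {
    // "a" comes before "b" in the input; both have key 5.
    list := []record{{5, "a"}, {5, "b"}, {3, "c"}}
    quickSortRecords(list, 0, len(list)-1)
    fmt.Println(list) // prints [{3 c} {5 b} {5 a}]
}
```

The first partition swaps the reference record {5 a} with {3 c} at the far end, which carries it past {5 b}, so the two equal keys come out in reversed order.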

2. Algorithm implementation

package main

import "fmt"

// Plain quick sort
func QuickSort(array []int, begin, end int) {
    if begin < end {
        // partition the range
        loc := partition(array, begin, end)
        // quick sort the left part
        QuickSort(array, begin, loc-1)
        // quick sort the right part
        QuickSort(array, loc+1, end)
    }
}

// partition splits the range and returns the index of the reference number
func partition(array []int, begin, end int) int {
    i := begin + 1 // array[begin] is the reference number, so comparison starts at array[begin+1]
    j := end       // array[end] is the last element of the range

    // until the two indexes meet
    for i < j {
        if array[i] > array[begin] {
            array[i], array[j] = array[j], array[i] // swap
            j--
        } else {
            i++
        }
    }

    /* After the loop exits, i = j.
     * The range is now split into two parts  -->  array[begin+1] ~ array[i-1] <= array[begin]
     *                                        -->  array[i+1] ~ array[end] > array[begin]
     * array[i] itself has not been compared yet, so compare it with array[begin] to decide where it belongs.
     * Finally swap array[i] with array[begin]; the two parts are then sorted in the same way.
     */
    if array[i] >= array[begin] { // ">=" is required here; otherwise arrays made up of identical values cause problems
        i--
    }

    array[begin], array[i] = array[i], array[begin]
    return i
}

func main() {
    list := []int{5}
    QuickSort(list, 0, len(list)-1)
    fmt.Println(list)

    list1 := []int{5, 9}
    QuickSort(list1, 0, len(list1)-1)
    fmt.Println(list1)

    list2 := []int{5, 9, 1}
    QuickSort(list2, 0, len(list2)-1)
    fmt.Println(list2)

    list3 := []int{5, 9, 1, 6, 8, 14, 6, 49, 25, 4, 6, 3}
    QuickSort(list3, 0, len(list3)-1)
    fmt.Println(list3)
}

Output:

[5]
[5 9]
[1 5 9]
[1 3 4 5 6 6 6 8 9 14 25 49]

Illustration: in each partition round, quick sort maintains two indexes that advance toward each other and finally divide the sequence into two parts.

3. Algorithm improvement

Quick sort can be improved further.

  1. On small arrays, direct insertion sort is very efficient. When the recursion of quick sort reaches a small enough range, it can switch to direct insertion sort.
  2. The sequence to be sorted may contain many duplicate values. Three-way partitioning splits the array into three parts: less than, equal to, and greater than the reference number. Three indexes need to be maintained for this.
  3. Pseudo-tail recursion reduces the program stack space used, so that the stack space complexity changes from O(logn)~O(n) to O(logn).

3.1 Improvement: Small-scale arrays use direct insertion sort

func QuickSort1(array []int, begin, end int) {
    if begin < end {
        // use direct insertion sort when the range holds at most 5 elements
        if end-begin <= 4 {
            InsertSort(array[begin : end+1])
            return
        }

        // partition the range
        loc := partition(array, begin, end)
        // quick sort the left part
        QuickSort1(array, begin, loc-1)
        // quick sort the right part
        QuickSort1(array, loc+1, end)
    }
}

Direct insertion sort is very efficient on small arrays. When end-begin <= 4, the range counts as a small array, and we simply replace the recursive part with direct insertion sort.
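QuickSort1 calls InsertSort, which is not defined in this article. The sketch below fills that gap with an assumed minimal direct insertion sort so that the whole thing can be run; the InsertSort body is an illustration, not necessarily the version used elsewhere in this series.

```go
package main

import "fmt"

// InsertSort is an assumed minimal direct insertion sort.
// It sorts the slice it receives in place.
func InsertSort(list []int) {
    for i := 1; i < len(list); i++ {
        v := list[i]
        j := i - 1
        // shift larger elements one slot to the right
        for j >= 0 && list[j] > v {
            list[j+1] = list[j]
            j--
        }
        list[j+1] = v
    }
}

// partition is the two-index partition from this article.
func partition(array []int, begin, end int) int {
    i, j := begin+1, end
    for i < j {
        if array[i] > array[begin] {
            array[i], array[j] = array[j], array[i]
            j--
        } else {
            i++
        }
    }
    if array[i] >= array[begin] {
        i--
    }
    array[begin], array[i] = array[i], array[begin]
    return i
}

// QuickSort1 switches to insertion sort on ranges of at most 5 elements.
func QuickSort1(array []int, begin, end int) {
    if begin < end {
        if end-begin <= 4 {
            // the sub-slice shares memory with array, so sorting it sorts the range
            InsertSort(array[begin : end+1])
            return
        }
        loc := partition(array, begin, end)
        QuickSort1(array, begin, loc-1)
        QuickSort1(array, loc+1, end)
    }
}

func main() {
    list := []int{5, 9, 1, 6, 8, 14, 6, 49, 25, 4, 6, 3}
    QuickSort1(list, 0, len(list)-1)
    fmt.Println(list)
}
```

Passing array[begin : end+1] works because a Go sub-slice refers to the same underlying array, so no copying back is needed.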

3.2 Improvement: Three-way segmentation

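package main

import "fmt"

// Quick sort with three-way partitioning
func QuickSort2(array []int, begin, end int) {
    if begin < end {
        // three-way partition; returns the left and right indexes
        lt, gt := partition3(array, begin, end)
        // array[lt] ~ array[gt] is the middle sequence, equal to the reference number
        // three-way quick sort on the left
        QuickSort2(array, begin, lt-1)
        // three-way quick sort on the right
        QuickSort2(array, gt+1, end)
    }
}

// partition3 performs the three-way partition and returns the bounds of the middle sequence
func partition3(array []int, begin, end int) (int, int) {
    lt := begin       // left index, starts at the first element
    gt := end         // right index, starts at the last element
    i := begin + 1    // middle index, starts at the second element
    v := array[begin] // reference number

    // driven by the middle index
    for i <= gt {
        if array[i] > v { // greater than the reference number: swap with the right, move the right index left
            array[i], array[gt] = array[gt], array[i]
            gt--
        } else if array[i] < v { // less than the reference number: swap with the left, move the left and middle indexes right
            array[i], array[lt] = array[lt], array[i]
            lt++
            i++
        } else {
            i++
        }
    }

    return lt, gt
}

func main() {
    list := []int{4, 8, 2, 4, 4, 4, 7, 9}
    QuickSort2(list, 0, len(list)-1)
    fmt.Println(list)
}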

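Demo:

Sequence: 4 8 2 4 4 4 7 9, with 4 as the reference number

[4] [8] 2 4 4 4 7 [9]  from the middle []: 8 > 4, swap the middle and right [], move the right [] left
[4] [9] 2 4 4 4 [7] 8  from the middle []: 9 > 4, swap the middle and right [], move the right [] left
[4] [7] 2 4 4 [4] 9 8  from the middle []: 7 > 4, swap the middle and right [], move the right [] left
[4] [4] 2 4 [4] 7 9 8  from the middle []: 4 == 4, no swap, move the middle [] right
[4] 4 [2] 4 [4] 7 9 8  from the middle []: 2 < 4, swap the middle and left [], move the middle and left [] right
2 [4] 4 [4] [4] 7 9 8  from the middle []: 4 == 4, no swap, move the middle [] right
2 [4] 4 4 [[4]] 7 9 8  from the middle []: 4 == 4, no swap, move the middle [] right; the middle and right [] now overlap, so the round ends
Result of the first round: 2 4 4 4 4 7 9 8

The sequence is split into three sequences:

2
4 4 4 4 (equal elements gather in the middle sequence)
7 9 8

Then simply recurse on the first and the last sequence.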

Illustration: three-way partitioning throws numbers less than the reference number to the left and numbers greater than it to the right, while equal elements gather in the middle.

If there are a large number of repeated elements, sorting becomes much faster and can approach linear time, because equal elements gather in the middle and never enter the next round of recursion.
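To illustrate the effect, the sketch below sorts an array of identical values with both variants and counts the partition rounds each one needs. The counter rounds and the names quickSortTwoWay, quickSortThreeWay and countRounds are instrumentation invented for this example.

```go
package main

import "fmt"

var rounds int // number of partition rounds performed by the sort being measured

// partition is the two-index partition from this article.
func partition(array []int, begin, end int) int {
    i, j := begin+1, end
    for i < j {
        if array[i] > array[begin] {
            array[i], array[j] = array[j], array[i]
            j--
        } else {
            i++
        }
    }
    if array[i] >= array[begin] {
        i--
    }
    array[begin], array[i] = array[i], array[begin]
    return i
}

func quickSortTwoWay(array []int, begin, end int) {
    if begin < end {
        rounds++
        loc := partition(array, begin, end)
        quickSortTwoWay(array, begin, loc-1)
        quickSortTwoWay(array, loc+1, end)
    }
}

// partition3 is the three-way partition from this article.
func partition3(array []int, begin, end int) (int, int) {
    lt, gt, i, v := begin, end, begin+1, array[begin]
    for i <= gt {
        if array[i] > v {
            array[i], array[gt] = array[gt], array[i]
            gt--
        } else if array[i] < v {
            array[i], array[lt] = array[lt], array[i]
            lt++
            i++
        } else {
            i++
        }
    }
    return lt, gt
}

func quickSortThreeWay(array []int, begin, end int) {
    if begin < end {
        rounds++
        lt, gt := partition3(array, begin, end)
        quickSortThreeWay(array, begin, lt-1)
        quickSortThreeWay(array, gt+1, end)
    }
}

// countRounds sorts n identical values with both variants and returns
// how many partition rounds each one needed.
func countRounds(n int) (twoWay, threeWay int) {
    a, b := make([]int, n), make([]int, n)
    for i := range a {
        a[i], b[i] = 7, 7
    }
    rounds = 0
    quickSortTwoWay(a, 0, n-1)
    twoWay = rounds
    rounds = 0
    quickSortThreeWay(b, 0, n-1)
    threeWay = rounds
    return twoWay, threeWay
}

func main() {
    twoWay, threeWay := countRounds(1000)
    fmt.Println("two-way rounds:", twoWay, "three-way rounds:", threeWay)
}
```

With all values equal, the three-way version finishes after a single partition round, because the whole array lands in the middle sequence, while the two-way version keeps partitioning ranges that shrink only slightly each time.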

Three-way partitioning mainly comes from the Dutch national flag problem with its three colors, a problem posed by Dijkstra.

Suppose there is a rope with red, white, and blue flags on it. At first the flags are in no particular order, and you want to arrange them in the order blue, white, red. How should you move them so that the number of moves is smallest? Note that you can only work on the rope, and you can only swap two flags at a time.

The solution above is equivalent to a single three-way partition. Set the value of the white flag to 100, the blue flag to 0, and the red flag to 200, and use 100 as the reference number. After the first three-way partition the three colors are already arranged in order, because blue (0) < white (100) < red (200).
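The flag arrangement can be sketched as one pass of three-way partitioning. Unlike partition3 above, this variant compares against the fixed value 100 instead of array[begin], so the first flag does not have to be white; arrangeFlags and the constants are names assumed for this illustration.

```go
package main

import "fmt"

// Flag values as suggested in the text.
const (
    blue  = 0
    white = 100
    red   = 200
)

// arrangeFlags performs one three-way partition pass with white as the
// reference value. It returns the bounds of the white section.
func arrangeFlags(flags []int) (lt, gt int) {
    lt, gt = 0, len(flags)-1
    i := 0
    for i <= gt {
        switch {
        case flags[i] > white: // red: swap to the right end
            flags[i], flags[gt] = flags[gt], flags[i]
            gt--
        case flags[i] < white: // blue: swap to the left end
            flags[i], flags[lt] = flags[lt], flags[i]
            lt++
            i++
        default: // white: stays in the middle
            i++
        }
    }
    return lt, gt
}

func main() {
    flags := []int{red, blue, white, blue, red, white}
    lt, gt := arrangeFlags(flags)
    fmt.Println(flags, lt, gt) // prints [0 0 100 100 200 200] 2 3
}
```

Each step swaps at most two flags, and every flag is examined once, so a single pass over the rope is enough.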

Note: Edsger Wybe Dijkstra (May 11, 1930 ~ August 6, 2002) was a Dutch computer scientist and a Turing Award winner.

3.3 Improvement: pseudo-tail recursive optimization

// Pseudo-tail recursive quick sort
func QuickSort3(array []int, begin, end int) {
    for begin < end {
        // partition the range
        loc := partition(array, begin, end)

        // sort whichever side has fewer elements first
        if loc-begin < end-loc {
            // sort the left side first
            QuickSort3(array, begin, loc-1)
            begin = loc + 1
        } else {
            // sort the right side first
            QuickSort3(array, loc+1, end)
            end = loc - 1
        }
    }
}

Many people think this is tail recursion. In fact, this way of writing quick sort is pseudo-tail recursion, not real tail recursion: because there is a for loop rather than a direct return QuickSort, the recursion still keeps pushing frames onto the stack, and the stack depth still grows.

However, because the smaller part is sorted first, the stack depth is greatly reduced: the program stack never goes deeper than about logn layers, so the worst-case stack space complexity is reduced from O(n) to O(logn).

This is a worthwhile optimization because it reduces the number of stack layers. To sort one billion integers, since log2(1,000,000,000) = 29.897, the stack holds at most about 30 layers, which is much better than the unoptimized version, where O(n) layers may appear.
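The depth bound can be observed with a small instrumented copy of QuickSort3. In this sketch the extra parameters depth and deepest are instrumentation added for illustration only, and sortWithDepth is a made-up name.

```go
package main

import "fmt"

// partition is the two-index partition from this article.
func partition(array []int, begin, end int) int {
    i, j := begin+1, end
    for i < j {
        if array[i] > array[begin] {
            array[i], array[j] = array[j], array[i]
            j--
        } else {
            i++
        }
    }
    if array[i] >= array[begin] {
        i--
    }
    array[begin], array[i] = array[i], array[begin]
    return i
}

// sortWithDepth mirrors QuickSort3 and records the deepest call level reached.
// depth is the level of the current call; deepest collects the maximum.
func sortWithDepth(array []int, begin, end, depth int, deepest *int) {
    if depth > *deepest {
        *deepest = depth
    }
    for begin < end {
        loc := partition(array, begin, end)
        if loc-begin < end-loc {
            // the left side is smaller: recurse into it, loop on the right
            sortWithDepth(array, begin, loc-1, depth+1, deepest)
            begin = loc + 1
        } else {
            // the right side is smaller or equal: recurse into it, loop on the left
            sortWithDepth(array, loc+1, end, depth+1, deepest)
            end = loc - 1
        }
    }
}

func main() {
    n := 1000
    list := make([]int, n)
    for i := range list {
        list[i] = i
    }
    deepest := 0
    sortWithDepth(list, 0, n-1, 1, &deepest)
    fmt.Println("deepest call level:", deepest)
}
```

Every recursive call receives at most half of the current range, so for n = 1000 the deepest level cannot exceed floor(log2(1000)) + 1 = 10, whatever the input order.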

4. Supplement: non-recursive implementation

The non-recursive implementation simply replaces the recursive program stack with a manual stack that we maintain ourselves.

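// Non-recursive quick sort
func QuickSort5(array []int) {
    // fewer than two elements: nothing to sort (partition needs at least two)
    if len(array) < 2 {
        return
    }

    // manual stack
    helpStack := new(LinkStack)

    // initialize the stack with the indexes 0 and len(array)-1: the first partition covers the whole array
    helpStack.Push(len(array) - 1)
    helpStack.Push(0)

    // a non-empty stack means some range is still unsorted
    for !helpStack.IsEmpty() {
        // pop a begin-end range and partition it
        begin := helpStack.Pop() // left end of the range
        end := helpStack.Pop()   // right end of the range

        // partition the range
        loc := partition(array, begin, end)

        // push the right range
        if loc+1 < end {
            helpStack.Push(end)
            helpStack.Push(loc + 1)
        }

        // push the left range
        if begin < loc-1 {
            helpStack.Push(loc - 1)
            helpStack.Push(begin)
        }
    }
}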

The array ranges begin, end that originally needed recursion are instead pushed onto our own manual stack in turn, and the manual stack is then processed in a loop.

Without recursion, the space complexity of the program stack becomes O(1), but additional storage space is used.

The auxiliary manual stack helpStack occupies extra space, so the storage space changes from O(1) for in-place sorting to O(logn)~O(n).

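Following the pseudo-tail recursive version above, the non-recursive version can be optimized further: handle the shorter range first, which means pushing the longer range onto the stack before the shorter one. The space complexity of the manual stack then becomes O(logn). For example:

// Optimized non-recursive quick sort
func QuickSort6(array []int) {
    // fewer than two elements: nothing to sort (partition needs at least two)
    if len(array) < 2 {
        return
    }

    // manual stack
    helpStack := new(LinkStack)

    // initialize the stack with the indexes 0 and len(array)-1: the first partition covers the whole array
    helpStack.Push(len(array) - 1)
    helpStack.Push(0)

    // a non-empty stack means some range is still unsorted
    for !helpStack.IsEmpty() {
        // pop a begin-end range and partition it
        begin := helpStack.Pop() // left end of the range
        end := helpStack.Pop()   // right end of the range

        // partition the range
        loc := partition(array, begin, end)

        // size of the right range after partitioning (-1 means nothing to sort)
        rSize := -1
        // size of the left range after partitioning (-1 means nothing to sort)
        lSize := -1

        // the right range needs sorting
        if loc+1 < end {
            rSize = end - (loc + 1)
        }

        // the left range needs sorting
        if begin < loc-1 {
            lSize = loc - 1 - begin
        }

        // with two ranges, push the larger one first so that the smaller one
        // ends up on top and is processed next; this keeps the manual stack small
        if rSize != -1 && lSize != -1 {
            if lSize > rSize {
                helpStack.Push(loc - 1)
                helpStack.Push(begin)
                helpStack.Push(end)
                helpStack.Push(loc + 1)
            } else {
                helpStack.Push(end)
                helpStack.Push(loc + 1)
                helpStack.Push(loc - 1)
                helpStack.Push(begin)
            }
        } else {
            if rSize != -1 {
                helpStack.Push(end)
                helpStack.Push(loc + 1)
            }

            if lSize != -1 {
                helpStack.Push(loc - 1)
                helpStack.Push(begin)
            }
        }
    }
}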

The complete program is as follows:

package main

import (
    "fmt"
    "sync"
)

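// Linked-list stack, last in first out
type LinkStack struct {
    root *LinkNode  // head of the linked list
    size int        // number of elements in the stack
    lock sync.Mutex // lock for concurrency safety
}

// Linked-list node
type LinkNode struct {
    Next  *LinkNode
    Value int
}

// Push adds an element to the top of the stack
func (stack *LinkStack) Push(v int) {
    stack.lock.Lock()
    defer stack.lock.Unlock()

    // if the stack is empty, create the first node
    if stack.root == nil {
        stack.root = new(LinkNode)
        stack.root.Value = v
    } else {
        // otherwise insert the new element at the head of the list
        // the existing list
        preNode := stack.root

        // the new node
        newNode := new(LinkNode)
        newNode.Value = v

        // link the existing list after the new node
        newNode.Next = preNode

        // put the new node at the head
        stack.root = newNode
    }

    // element count +1
    stack.size = stack.size + 1
}

// Pop removes and returns the top element of the stack
func (stack *LinkStack) Pop() int {
    stack.lock.Lock()
    defer stack.lock.Unlock()

    // the stack is already empty
    if stack.size == 0 {
        panic("empty")
    }

    // the top element is popped
    topNode := stack.root
    v := topNode.Value

    // the successor of the top element becomes the new head
    stack.root = topNode.Next

    // element count -1
    stack.size = stack.size - 1

    return v
}

// IsEmpty reports whether the stack is empty
func (stack *LinkStack) IsEmpty() bool {
    return stack.size == 0
}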

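// Non-recursive quick sort
func QuickSort5(array []int) {
    // fewer than two elements: nothing to sort (partition needs at least two)
    if len(array) < 2 {
        return
    }

    // manual stack
    helpStack := new(LinkStack)

    // initialize the stack with the indexes 0 and len(array)-1: the first partition covers the whole array
    helpStack.Push(len(array) - 1)
    helpStack.Push(0)

    // a non-empty stack means some range is still unsorted
    for !helpStack.IsEmpty() {
        // pop a begin-end range and partition it
        begin := helpStack.Pop() // left end of the range
        end := helpStack.Pop()   // right end of the range

        // partition the range
        loc := partition(array, begin, end)

        // push the right range
        if loc+1 < end {
            helpStack.Push(end)
            helpStack.Push(loc + 1)
        }

        // push the left range
        if begin < loc-1 {
            helpStack.Push(loc - 1)
            helpStack.Push(begin)
        }
    }
}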

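// Optimized non-recursive quick sort
func QuickSort6(array []int) {
    // fewer than two elements: nothing to sort (partition needs at least two)
    if len(array) < 2 {
        return
    }

    // manual stack
    helpStack := new(LinkStack)

    // initialize the stack with the indexes 0 and len(array)-1: the first partition covers the whole array
    helpStack.Push(len(array) - 1)
    helpStack.Push(0)

    // a non-empty stack means some range is still unsorted
    for !helpStack.IsEmpty() {
        // pop a begin-end range and partition it
        begin := helpStack.Pop() // left end of the range
        end := helpStack.Pop()   // right end of the range

        // partition the range
        loc := partition(array, begin, end)

        // size of the right range after partitioning (-1 means nothing to sort)
        rSize := -1
        // size of the left range after partitioning (-1 means nothing to sort)
        lSize := -1

        // the right range needs sorting
        if loc+1 < end {
            rSize = end - (loc + 1)
        }

        // the left range needs sorting
        if begin < loc-1 {
            lSize = loc - 1 - begin
        }

        // with two ranges, push the larger one first so that the smaller one
        // ends up on top and is processed next; this keeps the manual stack small
        if rSize != -1 && lSize != -1 {
            if lSize > rSize {
                helpStack.Push(loc - 1)
                helpStack.Push(begin)
                helpStack.Push(end)
                helpStack.Push(loc + 1)
            } else {
                helpStack.Push(end)
                helpStack.Push(loc + 1)
                helpStack.Push(loc - 1)
                helpStack.Push(begin)
            }
        } else {
            if rSize != -1 {
                helpStack.Push(end)
                helpStack.Push(loc + 1)
            }

            if lSize != -1 {
                helpStack.Push(loc - 1)
                helpStack.Push(begin)
            }
        }
    }
}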

// partition splits the range and returns the index of the reference number
func partition(array []int, begin, end int) int {
    i := begin + 1 // array[begin] is the reference number, so comparison starts at array[begin+1]
    j := end       // array[end] is the last element of the range

    // until the two indexes meet
    for i < j {
        if array[i] > array[begin] {
            array[i], array[j] = array[j], array[i] // swap
            j--
        } else {
            i++
        }
    }

    /* After the loop exits, i = j.
     * The range is now split into two parts  -->  array[begin+1] ~ array[i-1] <= array[begin]
     *                                        -->  array[i+1] ~ array[end] > array[begin]
     * array[i] itself has not been compared yet, so compare it with array[begin] to decide where it belongs.
     * Finally swap array[i] with array[begin]; the two parts are then sorted in the same way.
     */
    if array[i] >= array[begin] { // ">=" is required here; otherwise arrays made up of identical values cause problems
        i--
    }

    array[begin], array[i] = array[i], array[begin]
    return i
}

func main() {
    list3 := []int{5, 9, 1, 6, 8, 14, 6, 49, 25, 4, 6, 3}
    QuickSort5(list3)
    fmt.Println(list3)

    list4 := []int{5, 9, 1, 6, 8, 14, 6, 49, 25, 4, 6, 3}
    QuickSort6(list4)
    fmt.Println(list4)
}

Output:

[1 3 4 5 6 6 6 8 9 14 25 49]
[1 3 4 5 6 6 6 8 9 14 25 49]

The artificial stack is used instead of the recursive program stack. There is no change in the speed, but the code readability is reduced.
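The manual stack does not have to be a linked list. As a design alternative, the sketch below keeps the pending ranges in a plain Go slice, which avoids the lock and the per-node allocations; the span type and the name QuickSortSliceStack are assumptions made for this example.

```go
package main

import "fmt"

// partition is the two-index partition from this article.
func partition(array []int, begin, end int) int {
    i, j := begin+1, end
    for i < j {
        if array[i] > array[begin] {
            array[i], array[j] = array[j], array[i]
            j--
        } else {
            i++
        }
    }
    if array[i] >= array[begin] {
        i--
    }
    array[begin], array[i] = array[i], array[begin]
    return i
}

// span is a hypothetical helper type holding one pending index range.
type span struct{ begin, end int }

// QuickSortSliceStack is a non-recursive quick sort whose manual stack is a slice.
func QuickSortSliceStack(array []int) {
    if len(array) < 2 {
        return // nothing to sort
    }
    stack := []span{{0, len(array) - 1}}
    for len(stack) > 0 {
        // pop the top range
        top := stack[len(stack)-1]
        stack = stack[:len(stack)-1]

        loc := partition(array, top.begin, top.end)

        // push the sub-ranges that still hold at least two elements
        if loc+1 < top.end {
            stack = append(stack, span{loc + 1, top.end})
        }
        if top.begin < loc-1 {
            stack = append(stack, span{top.begin, loc - 1})
        }
    }
}

func main() {
    list := []int{5, 9, 1, 6, 8, 14, 6, 49, 25, 4, 6, 3}
    QuickSortSliceStack(list)
    fmt.Println(list)
}
```

Storing begin and end together in one value also removes the need to remember the push and pop order of the two indexes.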

5. Supplement: why built-in libraries use quick sort

Heap sort and merge sort both have a best and worst time complexity of O(nlogn), while the worst time complexity of quick sort is O(n^2). Yet the built-in sorting algorithms of many programming languages still use quick sort. Why?

  1. The question itself is somewhat one-sided. The choice of sorting algorithm depends on the specific scenario. The sorting algorithm used by the Linux kernel is heap sort, and Java's built-in sort uses merge sort for large numbers of complex objects, but in general quick sort is faster.
  2. Merge sort is stable in two senses. First, equal elements keep their relative positions after sorting. Second, every split is perfectly even and the data is read sequentially, which makes good use of memory caches, for example when sorting data read from disk. Sorting requires an additional auxiliary array, which comes at a cost in space, although the in-place, rotation-based merge sort overcomes this defect.
  3. In complexity analysis, big-O notation omits a constant factor. After taking the maximum each time, heap sort has to sift nodes down to restore the heap property, which does a lot of extra work. Its constant factor is larger than that of quick sort, and in most cases it is much slower than quick sort. However, the running time of heap sort is relatively stable, it has no O(n^2) worst case like quick sort, and it saves space: it needs neither extra storage nor stack space.
  4. When more than 16000 elements are to be sorted, bottom-up heap sort is faster than quick sort; see here: https://core.ac.uk/download/pdf/82350265.pdf .
  5. The worst-case complexity of quick sort is high mainly because its splits are not even as in merge sort and depend heavily on the reference number. With improvements such as random reference numbers and three-way partitioning, the chance of hitting the worst case is greatly reduced. In most cases it does not go that badly, and quick sort really is fast.
  6. Merge sort and quick sort are both divide-and-conquer methods, and the data they compare is adjacent. The comparisons of heap sort may span a large range, which lowers the locality hit rate, so heap sort cannot exploit modern memory caches and loses performance while loading data.

If stability is required, that is, equal elements must keep their relative positions after sorting, merge sort can be used. Java uses merge sort for complex object types, which require that positions do not change.

If stack and storage space are constrained, heap sort can be used. For example, the Linux kernel stack is small, and quick sort takes up too much of the program stack and may cause a stack overflow, so heap sort is used.

In Golang, the sort package of the standard library sorts slices stably as follows:

func SliceStable(slice interface{}, less func(i, j int) bool) {
    rv := reflectValueOf(slice)
    swap := reflectSwapper(slice)
    stable_func(lessSwap{less, swap}, rv.Len())
}

func stable_func(data lessSwap, n int) {
    blockSize := 20
    a, b := 0, blockSize
    for b <= n {
        insertionSort_func(data, a, b)
        a = b
        b += blockSize
    }
    insertionSort_func(data, a, n)
    for blockSize < n {
        a, b = 0, 2*blockSize
        for b <= n {
            symMerge_func(data, a, a+blockSize, b)
            a = b
            b += 2 * blockSize
        }
        if m := a + blockSize; m < n {
            symMerge_func(data, a, m, n)
        }
        blockSize *= 2
    }
}

First, the slice is divided into blocks of 20 elements, and each block is sorted with insertion sort, because insertion sort is efficient on small arrays. These sorted blocks are then merged. The merge is also done in place, which saves auxiliary space.
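The block structure of this approach can be sketched in a simplified form. Note the deliberate difference: the sketch merges through an auxiliary buffer, whereas the library code above merges in place with symMerge_func. All function names below are made up for this illustration.

```go
package main

import "fmt"

// insertionSort sorts a[lo:hi] in place.
func insertionSort(a []int, lo, hi int) {
    for i := lo + 1; i < hi; i++ {
        for j := i; j > lo && a[j] < a[j-1]; j-- {
            a[j], a[j-1] = a[j-1], a[j]
        }
    }
}

// merge combines the sorted runs a[lo:mid] and a[mid:hi], using buf as scratch space.
// Taking from the left run on ties keeps the merge stable.
func merge(a, buf []int, lo, mid, hi int) {
    copy(buf[lo:hi], a[lo:hi])
    i, j := lo, mid
    for k := lo; k < hi; k++ {
        if i < mid && (j >= hi || buf[i] <= buf[j]) {
            a[k] = buf[i]
            i++
        } else {
            a[k] = buf[j]
            j++
        }
    }
}

// blockMergeSort sorts each block with insertion sort, then merges
// neighbouring blocks, doubling the block size every round.
// Assumption: blockSize is positive.
func blockMergeSort(a []int, blockSize int) {
    n := len(a)
    for lo := 0; lo < n; lo += blockSize {
        hi := lo + blockSize
        if hi > n {
            hi = n
        }
        insertionSort(a, lo, hi)
    }
    buf := make([]int, n)
    for size := blockSize; size < n; size *= 2 {
        for lo := 0; lo+size < n; lo += 2 * size {
            hi := lo + 2*size
            if hi > n {
                hi = n
            }
            merge(a, buf, lo, lo+size, hi)
        }
    }
}

func main() {
    list := make([]int, 100)
    for i := range list {
        list[i] = (i * 37) % 101
    }
    blockMergeSort(list, 20)
    fmt.Println(list)
}
```

With a block size of 20 and 100 elements, the first phase sorts five blocks, and the merge rounds then work on runs of 20, 40 and 80 elements.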

And the general sort:

func Slice(slice interface{}, less func(i, j int) bool) {
    rv := reflectValueOf(slice)
    swap := reflectSwapper(slice)
    length := rv.Len()
    quickSort_func(lessSwap{less, swap}, 0, length, maxDepth(length))
}

func quickSort_func(data lessSwap, a, b, maxDepth int) {
    for b-a > 12 {
        if maxDepth == 0 {
            heapSort_func(data, a, b)
            return
        }
        maxDepth--
        mlo, mhi := doPivot_func(data, a, b)
        if mlo-a < b-mhi {
            quickSort_func(data, a, mlo, maxDepth)
            a = mhi
        } else {
            quickSort_func(data, mhi, b, maxDepth)
            b = mlo
        }
    }
    if b-a > 1 {
        for i := a + 6; i < b; i++ {
            if data.Less(i, i-6) {
                data.Swap(i, i-6)
            }
        }
        insertionSort_func(data, a, b)
    }
}

func doPivot_func(data lessSwap, lo, hi int) (midlo, midhi int) {
    m := int(uint(lo+hi) >> 1)
    if hi-lo > 40 {
        s := (hi - lo) / 8
        medianOfThree_func(data, lo, lo+s, lo+2*s)
        medianOfThree_func(data, m, m-s, m+s)
        medianOfThree_func(data, hi-1, hi-1-s, hi-1-2*s)
    }
    medianOfThree_func(data, lo, m, hi-1)
    pivot := lo
    a, c := lo+1, hi-1
    for ; a < c && data.Less(a, pivot); a++ {
    }
    b := a
    for {
        for ; b < c && !data.Less(pivot, b); b++ {
        }
        for ; b < c && data.Less(pivot, c-1); c-- {
        }
        if b >= c {
            break
        }
        data.Swap(b, c-1)
        b++
        c--
    }
    protect := hi-c < 5
    if !protect && hi-c < (hi-lo)/4 {
        dups := 0
        if !data.Less(pivot, hi-1) {
            data.Swap(c, hi-1)
            c++
            dups++
        }
        if !data.Less(b-1, pivot) {
            b--
            dups++
        }
        if !data.Less(m, pivot) {
            data.Swap(m, b-1)
            b--
            dups++
        }
        protect = dups > 1
    }
    if protect {
        for {
            for ; a < b && !data.Less(b-1, pivot); b-- {
            }
            for ; a < b && data.Less(a, pivot); a++ {
            }
            if a >= b {
                break
            }
            data.Swap(a, b-1)
            a++
            b--
        }
    }
    data.Swap(pivot, b-1)
    return b - 1, c
}

Quick sort limits the number of program stack layers to 2*ceil(log(n+1)). When the recursion goes deeper than this, the program stack is considered too deep, and the sort switches to heap sort.
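One way such a limit could be computed is shown below, assuming log is base 2. The body of the maxDepth helper called in the library code is not shown in this article, so depthLimit is only a sketch of the formula 2*ceil(log(n+1)) under a made-up name.

```go
package main

import "fmt"

// depthLimit returns 2*ceil(log2(n+1)).
// ceil(log2(n+1)) equals the number of bits needed to write n in binary,
// so the loop simply counts those bits.
func depthLimit(n int) int {
    depth := 0
    for i := n; i > 0; i >>= 1 {
        depth++
    }
    return depth * 2
}

func main() {
    for _, n := range []int{1, 1000, 1024} {
        fmt.Println(n, depthLimit(n))
    }
}
```

For 1000 elements the limit is 20 layers, and for 1024 elements it is 22.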

The quick sort above also uses three optimizations. The first is switching to insertion sort when the recursion reaches a small array, the second is choosing the reference number by median, and the third is three-way partitioning.

Series entry

I am the star Chen. Welcome to read the series I wrote myself, Data Structures and Algorithms (Golang Implementation). The articles are first published on GitBook, which is friendlier to read.

Origin www.cnblogs.com/nima/p/12724868.html