Ten classic sorting algorithms for data structures


(For a Java implementation, see https://program.blog.csdn.net/article/details/83785159)

Contents:

1. Bubble sort

2. Selection sort

3. Insertion sort

4. Shell sort

5. Merge sort

6. Quick sort

7. Heap sort

8. Counting sort

9. Bucket sort

10. Radix sort


Sorting algorithms can be divided into internal sorting and external sorting. Internal sorting sorts the data records entirely in memory. External sorting is used when the data set is too large to hold all records in memory at once, so external storage must be accessed during the sort. Common internal sorting algorithms include insertion sort, Shell sort, selection sort, bubble sort, merge sort, quick sort, heap sort, and radix sort.
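The idea behind external sorting can be sketched in a few lines. This is only an illustration under simplifying assumptions: the function name `external_sort_sketch` and the `chunk_size` parameter are hypothetical, and the sorted runs are kept in Python lists where a real external sort would write them to disk.

```python
import heapq

def external_sort_sketch(records, chunk_size):
    """Sort records by sorting small runs and then merging them."""
    # Phase 1: cut the input into runs of at most chunk_size items and
    # sort each run on its own (a real external sort would store each
    # sorted run in a file).
    runs = [sorted(records[i:i + chunk_size])
            for i in range(0, len(records), chunk_size)]
    # Phase 2: k-way merge of the sorted runs. heapq.merge is lazy and
    # only compares the current head of every run.
    return list(heapq.merge(*runs))

print(external_sort_sketch([5, 2, 15, 6, 1, 3, 9, 0], chunk_size=3))
# [0, 1, 2, 3, 5, 6, 9, 15]
```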

 

Glossary:

n: data size

k: the number of "buckets"

In-place: uses only a constant amount of extra memory

Out-of-place: requires additional memory

1. Bubble sort

Bubble sort is a simple and intuitive sorting algorithm. It repeatedly walks through the sequence to be sorted, compares two adjacent elements at a time, and swaps them if they are in the wrong order.

The passes are repeated until no more swaps are needed, which means the sequence is sorted.
The name comes from the way smaller elements slowly "float" to the top of the sequence through swapping.

Bubble sort makes n-1 passes over the data, and each pass settles one maximum (or minimum) value.

A pass compares and swaps adjacent pairs, so each pass carries one extreme value to one end of the sequence, like a bubble rising.

(1) Algorithm steps

1. Compare adjacent elements. If the first is larger than the second, swap them.

2. Do the same for every pair of adjacent elements, from the first pair at the beginning to the last pair at the end. After this step, the last element is the largest.

3. Repeat the above steps for all elements except the last one.

4. Keep repeating the above steps on fewer and fewer elements until no pair of numbers is left to compare.

(2) Animated demonstration (image not available)

(3) Pseudocode

for i = 1 to A.length - 1
    for j = A.length downto i + 1
        if A[j] < A[j-1]
            exchange A[j] with A[j-1]

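(4) Python code

def bubbleSort(A):
    # After pass i, the last i+1 positions hold their final values.
    for i in range(0, len(A)):
        # Compare each adjacent pair in the still-unsorted prefix.
        for j in range(1, len(A)-i):
            if A[j] < A[j-1]:
                A[j], A[j - 1] = A[j - 1], A[j]
    return A
A=[5,2,15,6,1,3]
print( bubbleSort(A))
[1, 2, 3, 5, 6, 15]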
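
The description says the passes stop once no exchange is needed, while the listing above always runs every pass. A minimal sketch of that early exit follows; the function name `bubble_sort_early_exit` and the `swapped` flag are illustrative and not part of the original code.

```python
def bubble_sort_early_exit(A):
    n = len(A)
    for i in range(n - 1):
        swapped = False
        # One pass over the unsorted prefix; the largest value in it
        # ends up at index n - 1 - i.
        for j in range(1, n - i):
            if A[j] < A[j - 1]:
                A[j], A[j - 1] = A[j - 1], A[j]
                swapped = True
        # A pass without any swap means the sequence is already sorted.
        if not swapped:
            break
    return A

print(bubble_sort_early_exit([5, 2, 15, 6, 1, 3]))
# [1, 2, 3, 5, 6, 15]
```

On input that is already sorted, this version finishes after a single pass.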

2. Selection sort

(1) Algorithm steps

1. Find the smallest (or largest) element in the unsorted sequence and place it at the beginning of the sorted sequence.

2. Find the smallest (or largest) element among the remaining unsorted elements and append it to the end of the sorted sequence.

3. Repeat step 2 until all elements are sorted.

(2) Animated demonstration (image not available)

(3) Pseudocode (image not available)

(4) Python code

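def select_sort(A):
    n=len(A)
    for i in range(n-1):  # i runs from 0 to n-2
        min_index=i
        # Scan the unsorted part A[i+1..n-1]; the upper bound must be n
        # so that the last element is included.
        for j in range(i+1,n):
            if A[j]<A[min_index]:
                min_index=j
        A[i],A[min_index]=A[min_index],A[i]
    return A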

3. Insertion sort

(1) Algorithm steps

1. Treat the first element of the sequence to be sorted as an ordered sequence, and treat the second through last elements as the unsorted sequence.

2. Scan the unsorted sequence from beginning to end, and insert each scanned element into its proper position in the ordered sequence.
(If the element to be inserted equals an element in the ordered sequence, insert it after the equal element.)
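
Inserting an element after its equals is what makes insertion sort stable. The sketch below shows this on (key, label) pairs; the function `insertion_sort_by_key` and its `key` parameter are hypothetical names introduced only for this illustration.

```python
def insertion_sort_by_key(records, key):
    for j in range(1, len(records)):
        current = records[j]
        i = j - 1
        # The strict ">" never shifts a record whose key equals the
        # current one, so the current record lands after its equals.
        while i >= 0 and key(records[i]) > key(current):
            records[i + 1] = records[i]
            i -= 1
        records[i + 1] = current
    return records

pairs = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
print(insertion_sort_by_key(pairs, key=lambda r: r[0]))
# [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Records with equal keys keep their input order: "b" stays ahead of "d", and "a" stays ahead of "c".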

(2) Animated demonstration (image not available)

(3) Pseudocode (image not available)

(4) Python code

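def insertion_sort(A):
    for j in range(1,len(A)):
        key=A[j]  # element to insert into the sorted prefix A[0..j-1]
        i=j-1
        # Shift elements larger than key one position to the right.
        while i>=0 and A[i]>key:
            A[i+1]=A[i]
            i-=1
        A[i+1]=key  # key lands after any elements equal to it
    return A
A=[5,2,4,6,1,3]
insertion_sort(A)
[1, 2, 3, 4, 5, 6]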

4. Shell sort

Shell sort, also known as diminishing increment sort, is a more efficient variant of insertion sort that reduces the number of element moves. Unlike insertion sort, however, Shell sort is not stable.

Plain insertion sort moves elements one position at a time, so an element far from its final position needs many moves, and many of those moves are repeated across successive insertions. Moving an element a long distance in a single step is much more efficient.

If the sequence is already nearly sorted, insertion sort needs few moves and is very efficient.

Shell sort divides the sequence into several subsequences whose elements are a fixed interval (gap) apart and runs a simple insertion sort within each subsequence, so elements first move long distances and the sequence becomes roughly ordered. The gap is then narrowed and the process repeated; the final gap is 1, which is an ordinary insertion sort.

(1) Algorithm steps

1. Select an increment sequence t1, t2, ..., tk, where ti > tj for i < j and tk = 1;

2. Sort the sequence k times, once for each increment in the sequence;

3. In each pass, use the corresponding increment ti to divide the sequence to be sorted into several subsequences whose elements are ti positions apart, and sort each subsequence with straight insertion sort. When the increment is 1, the entire sequence is treated as a single subsequence whose length is the length of the whole sequence.
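
To make the steps concrete, the sketch below lists one possible increment sequence and the subsequences that a single pass sorts. The halving sequence n//2, n//4, ..., 1 is just one common choice, assumed here, and the helper names `gap_sequence` and `gap_subsequences` are hypothetical.

```python
def gap_sequence(n):
    """Halving increments n//2, n//4, ..., 1 (an assumed, common choice)."""
    gaps = []
    d = n // 2
    while d >= 1:
        gaps.append(d)
        d //= 2
    return gaps

def gap_subsequences(A, d):
    """The d interleaved subsequences that a pass with increment d sorts."""
    return [A[start::d] for start in range(d)]

A = [5, 2, 4, 6, 1, 3, 17, 11, 23, 42]
print(gap_sequence(len(A)))    # [5, 2, 1]
print(gap_subsequences(A, 5))  # [[5, 3], [2, 17], [4, 11], [6, 23], [1, 42]]
```

With increment 5, the value 3 can jump from index 5 to index 0 in a single move, which is the long-distance movement that plain insertion sort lacks.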

(2) Animated demonstration (image not available)

(3) Pseudocode (not available)

(4) Python code

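def ShellSort(A):
    def shellinsert(A,d):
        # One insertion-sort pass over elements that are d positions apart.
        n=len(A)
        for i in range(d,n):
            j=i-d
            temp=A[i]                     # the value to insert
            while(j>=0 and A[j]>temp):    # scan backwards for the first value not larger than temp
                A[j+d]=A[j]             # shift the larger value d positions to the right
                j-=d
            if j!=i-d:                    # write only if something was shifted
                A[j+d]=temp
    n=len(A)
    if n<=1:
        return A
    d=n//2
    while d>=1:
        shellinsert(A,d)
        d=d//2
    return A
A=[5,2,4,6,1,3,17,11,23,42]
ShellSort(A)
[1, 2, 3, 4, 5, 6, 11, 17, 23, 42]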

5. Merge sort

Merge sort is an efficient sorting algorithm built on the merge operation. It is a typical application of divide and conquer.

Divide and conquer:

1. Divide the problem into a number of subproblems that are smaller instances of the same problem.

2. Conquer the subproblems by solving them recursively. If they are small enough, solve the subproblems as base cases.

3. Combine the solutions to the subproblems into the solution for the original problem.

(1) Algorithm steps

1. Allocate space whose size is the sum of the sizes of the two sorted sequences; this space holds the merged sequence;

2. Set two pointers, initially at the starting positions of the two sorted sequences;

3. Compare the elements the two pointers refer to, put the smaller one into the merge space, and advance that pointer to the next position;

4. Repeat step 3 until one pointer reaches the end of its sequence;

5. Copy all remaining elements of the other sequence directly to the end of the merged sequence.
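
The five steps map directly onto a two-pointer merge. The sketch below uses index variables as the "pointers"; the name `merge_sorted` is illustrative, and both inputs are assumed to be sorted already.

```python
def merge_sorted(left, right):
    merged = []          # step 1: space for the merged sequence
    i = j = 0            # step 2: one pointer per input sequence
    # Steps 3 and 4: take the smaller head element until one side runs out.
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # Step 5: copy whatever remains; at most one of these is non-empty.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sorted([1, 4, 5], [2, 3, 6]))
# [1, 2, 3, 4, 5, 6]
```

Advancing an index costs constant time, whereas removing the first element of a Python list with `pop(0)` shifts all remaining elements.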

(2) Animated demonstration (image not available)

(3) Pseudocode (images not available)

(4) Python code

def mergeSort(A):
    # A list of length 0 or 1 is already sorted.
    if(len(A)<2):
        return A
    middle = len(A)//2
    left, right = A[0:middle], A[middle:]
    return merge(mergeSort(left), mergeSort(right))

def merge(left,right):
    result = []
    # Repeatedly take the smaller front element; "<=" keeps the sort stable.
    while left and right:
        if left[0] <= right[0]:
            result.append(left.pop(0))
        else:
            result.append(right.pop(0))
    # At most one of the two lists still has elements; append them.
    while left:
        result.append(left.pop(0))
    while right:
        result.append(right.pop(0))
    return result
A=[5,2,4,6,1,3,17,11,23,42]
mergeSort(A)
[1, 2, 3, 4, 5, 6, 11, 17, 23, 42]

6. Quick sort

Quick sort uses a divide-and-conquer strategy: it partitions a list into two sub-lists around a chosen element and then sorts each sub-list recursively.

(1) Algorithm steps

1. Pick an element from the sequence and call it the "pivot";

2. Reorder the sequence so that all elements smaller than the pivot come before it and all elements larger than the pivot come after it (elements equal to the pivot can go to either side).
When this step finishes, the pivot sits between the two groups, in its final position. This is called the partition operation;

3. Recursively sort the sub-sequence of elements smaller than the pivot and the sub-sequence of elements larger than the pivot.
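
Before the in-place version, the three steps can be seen in a short out-of-place sketch. This is an illustration only: it builds new lists instead of partitioning within the array, the middle element is an arbitrary pivot choice, and the name `quick_sort_simple` is hypothetical.

```python
def quick_sort_simple(seq):
    # A sequence of length 0 or 1 is already sorted.
    if len(seq) <= 1:
        return list(seq)
    pivot = seq[len(seq) // 2]                 # step 1: pick a pivot
    smaller = [x for x in seq if x < pivot]    # step 2: partition
    equal = [x for x in seq if x == pivot]
    larger = [x for x in seq if x > pivot]
    # Step 3: sort both sides recursively and join them around the pivot.
    return quick_sort_simple(smaller) + equal + quick_sort_simple(larger)

print(quick_sort_simple([3, 2, 9, 5, 6, 4, 1]))
# [1, 2, 3, 4, 5, 6, 9]
```

This form is easy to read but uses extra memory for the sub-lists, which an in-place partition avoids.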

(2) Animated demonstration (image not available)

(3) Pseudocode (images not available)

(4) Python code
# Version 1: use the last element as the pivot
def quick_sort_final(A,p,r):
    if p>=r:
        return(A)
    else:
        q=PARTITION(A,p,r)
        quick_sort_final(A,p,q-1)
        quick_sort_final(A,q+1,r)
    return(A)
def PARTITION(A,p,r):
    q=p  # boundary between the left (<= pivot) and right (> pivot) sub-arrays
    pivot=A[r]  # pivot value
    for u in range(p,r):  # range excludes the right bound, so A[r] is skipped
        if A[u]<=pivot:
            A[q],A[u]=A[u],A[q]
            q=q+1
    A[r]=A[q]
    A[q]=pivot
    return(q)
# Version 2: use a randomly chosen element as the pivot
import random
def RANDOM_PARTITION(A,p,r):
    m=random.randint(p,r)  # randint returns a random index in [p, r], both ends included
    A[m],A[r]=A[r],A[m]  # move the random pivot to the end, then partition as in version 1
    return PARTITION(A,p,r)
def quick_sort_random(A,p,r):
    if p>=r:
        return(A)
    else:
        q=RANDOM_PARTITION(A,p,r)
        quick_sort_random(A,p,q-1)
        quick_sort_random(A,q+1,r)
    return(A)
A=[3,2,9,5,6,4,1]
quick_sort_random(A,0,len(A)-1)
[1, 2, 3, 4, 5, 6, 9]

7. Heap sort

A (binary) heap is an array that can be viewed as a nearly complete binary tree; each node of the tree corresponds to an element of the array.
Binary heaps come in two forms: the max-heap (largest element on top) and the min-heap (smallest element on top).

Max-heap: a complete binary tree in which no parent node is smaller than its left or right child, so the root holds the largest element.

Min-heap: a complete binary tree in which no parent node is greater than its left or right child, so the root holds the smallest element.

Heap sort uses a max-heap; a min-heap is commonly used to build a priority queue.
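
As an illustration of the priority-queue use, Python's standard `heapq` module maintains a min-heap on top of a plain list. The task names and priority numbers below are made up for the example.

```python
import heapq

tasks = []  # heapq keeps this list arranged as a min-heap
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix bug"))
heapq.heappush(tasks, (3, "refactor"))

# heappop always removes the smallest item, here the lowest priority number.
order = [heapq.heappop(tasks)[1] for _ in range(3)]
print(order)
# ['fix bug', 'write report', 'refactor']
```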

Three main procedures:

MAX-HEAPIFY: runs in O(lg n) time and is the key to maintaining the max-heap property

BUILD-MAX-HEAP: runs in linear time and builds a max-heap from an unordered input array

HEAPSORT: runs in O(n lg n) time and sorts an array in place

When an array (list) is used to implement a max-heap, the nodes are numbered from top to bottom and from left to right, starting at 0. If a parent node has index n, its left and right children have indices 2n+1 and 2n+2 respectively.
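
The index relations can be checked with a few helper functions. This is a small sketch assuming 0-based indexing as in the Python listing; the names `children`, `parent` and `is_max_heap` are illustrative.

```python
def children(n):
    """Indices of the left and right children of node n."""
    return 2 * n + 1, 2 * n + 2

def parent(i):
    """Index of the parent of node i (valid for i >= 1)."""
    return (i - 1) // 2

def is_max_heap(A):
    """True if no parent is smaller than either of its children."""
    return all(A[parent(i)] >= A[i] for i in range(1, len(A)))

print(children(1))                               # (3, 4)
print(is_max_heap([9, 8, 7, 4, 5, 6, 3, 1, 2]))  # True
print(is_max_heap([9, 5, 6, 8]))                 # False
```

In the last list, the node at index 3 (value 8) is larger than its parent at index 1 (value 5), so the max-heap property fails.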

(1) Algorithm steps

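Heap sort first builds a max-heap, which puts the maximum value at the root. It then swaps the root with the last leaf, shrinks the heap by one so that the maximum stays in its final position, and re-adjusts the remaining elements into a max-heap (which finds the next maximum). This is repeated until the heap holds a single element.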

(2) Animated demonstration (image not available)

(3) Pseudocode (images not available)

(4) Python code

# Restore the max-heap property. A: the sequence; heap_size: number of elements still in the heap; i: index of the node to fix
def max_heapify(A,heap_size,i):
    # indices of the left and right children
    largest = i
    left=2*i + 1
    right=2*i + 2
    # find the index of the largest value among the node and its children
    if (left < heap_size) and (A[left] > A[largest]):
        largest = left
    if (right < heap_size) and (A[right] > A[largest]):
        largest = right
    # if largest != i, the parent is not the maximum: swap, then fix the affected subtree
    if (largest != i):
        A[i],A[largest] = A[largest],A[i]
        max_heapify(A,heap_size,largest)
# Build the heap
def build_max_heap(A):
    # nodes after index len(A)//2 - 1 are leaves, so start from the last parent
    for i in range(len(A)//2 - 1,-1,-1):
        max_heapify(A,len(A),i)
def heapsort(A):
    # first build a max-heap so that the maximum is at the root
    build_max_heap(A)
    # Loop: 1. swap the heap top to the end of the heap (positions len-1, len-2, len-3, ...)
    #       2. re-heapify the remaining i elements, so the heap shrinks by one each time
    for i in range(len(A)-1,0,-1):
        A[i],A[0] = A[0],A[i]
        max_heapify(A,i,0)
    return A
A = [9,5,6,8,2,7,3,4,1]
heapsort(A)
[1, 2, 3, 4, 5, 6, 7, 8, 9]

8. Counting sort

The core idea of counting sort is to turn the input values into keys (indices) and store their counts in an additional array.

To run in linear time, counting sort requires the input to be integers within a known range.

(1) Algorithm steps

Counting sort uses each value to be sorted as an index into a count array (list), counts how many times each value occurs, and then outputs the values in order.

(2) Animated demonstration (image not available)

(3) Pseudocode (images not available)

(4) Python code

import numpy as np
def count_key_equal(A):
    m=np.max(A)  # largest value in A
    equal=np.zeros(m+1,int)  # indices start at 0, so m+1 zeros are needed; equal[v] counts occurrences of v
    for i in range(len(A)):  # for each value in A, increment its counter
        key=A[i]
        equal[key]+=1
    return equal
def count_key_less(equal,A):
    m=np.max(A)
    less=np.zeros(m+1,int)
    for j in range(1,m+1):  # range includes the start but excludes the end
        less[j]=less[j-1]+equal[j-1]  # (number of values < j) = (number < j-1) + (number == j-1)
    return less
def rearrange(A,less):
    m=np.max(A)
    Next=np.zeros(m+1,int)
    B=np.zeros(len(A),int)
    for j in range(m+1):
        Next[j]=less[j]+1  # 1-based position where the next copy of value j goes
    for i in range(len(A)):
        key = A[i]
        index = Next[key]
        B[index-1]=A[i]
        Next[key]+=1
    return B
A=[4,1,5,0,1,6,5,1,5,3]
equal=count_key_equal(A)
less=count_key_less(equal,A)
B=rearrange(A,less)
print(B)
[0 1 1 1 3 4 5 5 5 6]

A more compact version that does the counting and the output in one function:

def count_sort(A):
    min_num=min(A)
    max_num=max(A)
    # one counter per possible value; offsetting by min_num lets the range start anywhere
    count_list=[0]*(max_num-min_num+1)
    print(count_list)

    for i in A:
        count_list[i-min_num] +=1
    print(count_list)
    A.clear()

    # write each value back as many times as it was counted
    for ind,i in enumerate(count_list):
        while i !=0:
            A.append(ind+min_num)
            i-=1
    return A
A=[4,1,5,0,1,6,5,1,5,3]
result=count_sort(A)
print(result)
[0, 0, 0, 0, 0, 0, 0]
[1, 3, 0, 1, 1, 3, 1]
[0, 1, 1, 1, 3, 4, 5, 5, 5, 6]

 

9. Bucket sort

Bucket sort is a generalization of counting sort, though its implementation is considerably more involved.

Bucket sort first uses a mapping function to distribute the data into a series of ordered ranges (buckets), then sorts the data within each bucket, and finally outputs the buckets in order.

When every distinct value gets its own bucket, bucket sort is equivalent to counting sort.
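
The mapping function depends on the data. As one illustration, the sketch below assumes floating-point values in the interval [0, 1) and maps a value x to bucket int(x * k). The function name `bucket_sort_unit_interval` and the default k=10 are assumptions made for this example, and Python's built-in `sorted` stands in for the per-bucket sort.

```python
def bucket_sort_unit_interval(values, k=10):
    """Bucket sort sketch; assumes every value x satisfies 0 <= x < 1."""
    buckets = [[] for _ in range(k)]
    for x in values:
        # Mapping function: bucket b covers [b/k, (b+1)/k).
        # min() guards against x * k rounding up to k.
        buckets[min(int(x * k), k - 1)].append(x)
    result = []
    for bucket in buckets:
        # Sort inside each bucket, then output the buckets in order.
        result.extend(sorted(bucket))
    return result

print(bucket_sort_unit_interval([0.42, 0.05, 0.91, 0.47, 0.13, 0.05]))
# [0.05, 0.05, 0.13, 0.42, 0.47, 0.91]
```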

(2) Animated demonstration (image not available)

(3) Pseudocode (image not available)

(4) Python code

def BucketSort(ls):
    # Quick sort is used inside each bucket
    def QuickSort(ls):
        def partition(arr,left,right):
            key=left                 # index of the pivot; defaults to the first element (could be optimized)
            while left<right:
                while left<right and arr[right]>=arr[key]:
                    right-=1
                while left<right and arr[left]<=arr[key]:
                    left+=1
                (arr[left],arr[right])=(arr[right],arr[left])
            (arr[left],arr[key])=(arr[key],arr[left])
            return left

        # recursive calls
        def quicksort(arr,left,right):
            if left>=right:
                return
            mid=partition(arr,left,right)
            quicksort(arr,left,mid-1)
            quicksort(arr,mid+1,right)

        # entry point
        n=len(ls)
        if n<=1:
            return ls
        quicksort(ls,0,n-1)
        return ls

    # Assumes non-negative integers; each bucket covers a range of 10 values
    n=len(ls)
    if n==0:
        return ls
    big=max(ls)
    num=big//10+1
    buckets=[[] for i in range(0,num)]
    for i in ls:
        buckets[i//10].append(i)      # distribute into buckets
    for i in buckets:                 # sort each bucket in place
        QuickSort(i)
    arr=[]
    for i in buckets:                 # concatenate the buckets in order
        arr.extend(i)
    for i in range(0,n):
        ls[i]=arr[i]
    return ls
A=[4,1,5,0,1,6,5,1,5,3]
BucketSort(A)
[0, 1, 1, 1, 3, 4, 5, 5, 5, 6]

10. Radix sort

Radix sort is a non-comparative integer sorting algorithm. It splits each integer into its digits and sorts the numbers one digit position at a time.

Since strings (such as names or dates) and floating-point numbers in specific formats can also be expressed as integers, radix sort is not limited to integers.

(1) Algorithm steps

Radix sort performs several rounds of digit-by-digit sorting; the number of rounds is the number of digits in the largest value.

It sorts first by the ones digit, then by the tens digit, the hundreds digit, and so on, from the least significant to the most significant digit. Each round keeps numbers with the same digit in their existing order, so a later round does not undo the order that earlier rounds established among numbers that tie on the current digit.

Sorting by one digit is essentially a distribution into "buckets". For example, the first round looks at the ones digit and places the numbers into 10 buckets ordered by that digit. Numbers in the same bucket share the same ones digit and are treated as equal for that round. The buckets are then output in order, and the resulting sequence is handed to the next round, relay-style, until the sort is complete.
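
The same relay works for strings when every position is treated as a "digit". The sketch below assumes ASCII strings that all have the same length, such as dates written as YYYYMMDD; the function name `radix_sort_strings` and its `width` parameter are hypothetical.

```python
def radix_sort_strings(words, width):
    """LSD radix sort sketch for ASCII strings of equal length `width`."""
    # Process character positions from last (least significant) to first.
    for pos in range(width - 1, -1, -1):
        buckets = [[] for _ in range(128)]   # one bucket per ASCII code
        for w in words:
            buckets[ord(w[pos])].append(w)   # appending keeps ties in order
        # Read the buckets back in order to form the input of the next round.
        words = [w for bucket in buckets for w in bucket]
    return words

dates = ["20210315", "20191201", "20210101", "20200630"]
print(radix_sort_strings(dates, 8))
# ['20191201', '20200630', '20210101', '20210315']
```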

(2) Animated demonstration (image not available)

(3) Pseudocode (not available)

(4) Python code

'''
1. Treat all values to be compared (positive integers) as having the same number of digits, padding shorter numbers with leading zeros.
2. Starting from the lowest digit, sort by each digit in turn.
3. Once the passes from the lowest digit to the highest digit are complete, the sequence is sorted.
'''
def radix_sort(nums):
    i=0  # position of the digit being sorted; i=0 is the ones digit
    max_num=max(nums)  # largest value
    j=len(str(max_num))  # number of digits in the largest value
    while i <j:
        bucket_list=[[] for _ in range(10)]  # one bucket per digit 0-9
        for x in nums:
            bucket_list[(x//(10**i)) % 10].append(x)  # integer division avoids float rounding
        print("Round {}:".format(i+1))
        print(bucket_list)
        nums.clear()
        for x in bucket_list:
            for y in x:
                nums.append(y)
        print(nums)
        i +=1
    return nums

a=[334,5,67,345,7,5345,99,4,23,78]
result=radix_sort(a)

Round 1:
[[], [], [], [23], [334, 4], [5, 345, 5345], [], [67, 7], [78], [99]]
[23, 334, 4, 5, 345, 5345, 67, 7, 78, 99]
Round 2:
[[4, 5, 7], [], [23], [334], [345, 5345], [], [67], [78], [], [99]]
[4, 5, 7, 23, 334, 345, 5345, 67, 78, 99]
Round 3:
[[4, 5, 7, 23, 67, 78, 99], [], [], [334, 345, 5345], [], [], [], [], [], []]
[4, 5, 7, 23, 67, 78, 99, 334, 345, 5345]
Round 4:
[[4, 5, 7, 23, 67, 78, 99, 334, 345], [], [], [], [], [5345], [], [], [], []]
[4, 5, 7, 23, 67, 78, 99, 334, 345, 5345]

Radix sort vs. counting sort vs. bucket sort

Radix sort can proceed in two directions: from the least significant digit (LSD), as in the code above, or from the most significant digit (MSD).

All three sorting algorithms use the concept of buckets, but they use buckets in clearly different ways:

Radix sort: assigns buckets according to each digit of the key;
Counting sort: each bucket stores only a single key value;
Bucket sort: each bucket stores a range of values.


Origin blog.csdn.net/qq_36816848/article/details/112861978