Python: data structures and algorithms

First, what is an algorithm

Algorithm: a computational procedure, a method for solving a problem.

Second, time complexity and space complexity

Ⅰ, time complexity

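Time complexity is a function that quantitatively describes an algorithm's running time. It is usually written in big-O notation and is asymptotic: it looks at what happens as the input size tends to infinity. It captures how the number of executed steps grows, written T(n) = O(f(n)). Exponential time: the execution time m(n) needed to solve a problem grows exponentially with the input size n.

Time complexity is a formula used to estimate an algorithm's running time in units of basic operations. In general, an algorithm with higher time complexity is slower than one with lower time complexity.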

print('Hello world')  # O(1)
 
 
# O(1)
print('Hello World')
print('Hello Python')
print('Hello Algorithm')
 
 
for i in range(n):  # O(n)
    print('Hello world')
 
 
for i in range(n):  # O(n^2)
    for j in range(n):
        print('Hello world')
 
 
for i in range(n):  # O(n^2)
    print('Hello World')
    for j in range(n):
        print('Hello World')
 
 
for i in range(n):  # O(n^2)
    for j in range(i):
        print('Hello World')
 
 
for i in range(n):
    for j in range(n):
        for k in range(n):
            print('Hello World')  # O(n^3)
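To see where these labels come from, the loops can be made to count their own iterations. The sketch below is only an illustration; the helper names `count_single`, `count_square` and `count_triangle` are made up for this example. It shows that the loop whose inner range is `range(i)` runs n(n-1)/2 times, which is still O(n^2) because constant factors and lower-order terms are dropped.

```python
def count_single(n):
    """Iterations of a single loop over range(n)."""
    count = 0
    for _ in range(n):
        count += 1
    return count


def count_square(n):
    """Iterations of two nested loops, each over range(n)."""
    count = 0
    for _ in range(n):
        for _ in range(n):
            count += 1
    return count


def count_triangle(n):
    """Iterations when the inner loop runs over range(i)."""
    count = 0
    for i in range(n):
        for _ in range(i):
            count += 1
    return count


for n in (10, 100):
    # Columns: n, single loop, square loop, triangular loop
    print(n, count_single(n), count_square(n), count_triangle(n))
```

Multiplying n by 10 multiplies the single-loop count by 10 and both nested counts by roughly 100.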

Ⅱ, space complexity

Space complexity: a formula used to estimate how much memory an algorithm occupies.
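As a small illustration, here are two ways to reverse a list. The function names are invented for this sketch. The first needs only a couple of index variables no matter how long the list is, so its extra space is O(1). The second builds a whole new list, so its extra space is O(n).

```python
def reverse_in_place(li):
    """O(1) extra space: swaps elements inside the same list."""
    left, right = 0, len(li) - 1
    while left < right:
        li[left], li[right] = li[right], li[left]
        left += 1
        right -= 1
    return li


def reverse_copy(li):
    """O(n) extra space: builds a second list of the same length."""
    result = []
    for item in reversed(li):
        result.append(item)
    return result


data = [1, 2, 3, 4]
print(reverse_copy(data))      # new list, data itself is unchanged
print(reverse_in_place(data))  # same list object, now reversed
```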

Third, sorting lists

List sorting: turning an unordered list into an ordered one.

1, bubble sort

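"Bubbling" means comparing elements pairwise and moving the larger one toward the back, until the largest element ends up in the last position. Then another pass starts from the beginning and compares pairwise again, but the element placed by the previous pass no longer needs to be compared. (In the figure, elements that are already sorted are drawn as yellow bars.)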

Time complexity: O(n^2)
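To follow the passes without the figure, the list can be recorded after every pass. This is just a tracing sketch. `bubble_passes` is a name made up for this example, and it works on a copy so the input stays untouched.

```python
def bubble_passes(li):
    """Return a snapshot of the list after each bubble sort pass."""
    li = list(li)  # work on a copy so the caller's list is untouched
    snapshots = []
    for i in range(len(li) - 1):
        # After pass i the last i + 1 positions hold their final values
        for j in range(len(li) - 1 - i):
            if li[j] > li[j + 1]:
                li[j], li[j + 1] = li[j + 1], li[j]
        snapshots.append(list(li))
    return snapshots


for number, snapshot in enumerate(bubble_passes([5, 3, 8, 1]), start=1):
    print(number, snapshot)
# 1 [3, 5, 1, 8]
# 2 [3, 1, 5, 8]
# 3 [1, 3, 5, 8]
```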

Basic version:

def bubble_sort(li):
    for i in range(len(li) - 1):
        for j in range(len(li) - 1 - i):
            if li[j] > li[j + 1]:
                li[j], li[j + 1] = li[j + 1], li[j]
    return li


# Named data rather than list, so the built-in list() used later is not shadowed
data = [3, 1, 5.78, 8, 63, 96, 21, 0, 2]
print(bubble_sort(data))

Bubble sort optimization 1

# Set a flag to False before each pass; if any two elements are swapped, set it to True.
# If the flag is still False when the pass ends, break out of the loop and stop sorting:
# a pass with no swaps means the list is already in order.
def bubble_sort1(li):
    for i in range(len(li) - 1):
        flag = False
        for j in range(len(li) - 1 - i):
            if li[j] > li[j + 1]:
                li[j], li[j + 1] = li[j + 1], li[j]
                flag = True
        print("Bubble pass %s" % i)
        if not flag:
            print("yes")
            break
    return li


# data = [3, 1, 5.78, 8, 63, 96, 21, 0, 2]
data = [1, 2, 3, 4, 5, 8, 6, 9, 7]
print(bubble_sort1(data))

Bubble sort optimization 2

Bidirectional bubbling improves efficiency: when a left-to-right pass of comparisons finishes, a right-to-left pass of comparisons follows immediately.

# Bidirectional bubble sort (cal_time is the timing decorator defined in the next section)
@cal_time
def bubble_sort2(li):
    for i in range(len(li) - 1):
        flag = False
        for j in range(len(li) - 1 - i):
            if li[j] > li[j + 1]:
                li[j], li[j + 1] = li[j + 1], li[j]
                flag = True
        if flag:
            for j in range(len(li) - 1 - i, 0, -1):
                if li[j - 1] > li[j]:
                    li[j - 1], li[j] = li[j], li[j - 1]
                    flag = True
        if not flag:
            break
    return li

Execution time comparison of the three bubble sort versions

import random
import sys
import time

sys.setrecursionlimit(1000000)


# Custom timing decorator
def cal_time(func):
    def wrapper(*args, **kwargs):
        t1 = time.time()
        res = func(*args, **kwargs)
        t2 = time.time()
        print('%s took %s seconds' % (func.__name__, t2 - t1))
        return res
    return wrapper


# Build a shuffled list of the integers 1 to 9999
li = list(range(1, 10000))
random.shuffle(li)
# All three calls share the same list, so once bubble_sort returns,
# the other two versions receive input that is already sorted.
bubble_sort(li)
bubble_sort1(li)
bubble_sort2(li)
---------------------------------------------------------------------
# Times:
bubble_sort took 9.232983350753784 seconds
bubble_sort1 took 0.0010197162628173828 seconds
bubble_sort2 took 0.0009989738464355469 seconds
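The three calls above all receive the same list object, so the second and third functions start from data the first one has already sorted, which explains their tiny times. A fairer comparison gives every function its own copy of the shuffled data. The sketch below is one way to do that. `time_on_copy` is a hypothetical helper and `simple_bubble` is a stand-in for any of the in-place sort functions in this article. Neither is part of the original code.

```python
import random
import time


def time_on_copy(func, data):
    """Run func on a fresh copy of data; return (sorted copy, seconds)."""
    work = list(data)  # copy, so every function sees the same shuffled input
    start = time.perf_counter()
    func(work)
    elapsed = time.perf_counter() - start
    return work, elapsed


def simple_bubble(li):
    """Plain in-place bubble sort, used here only as something to time."""
    for i in range(len(li) - 1):
        for j in range(len(li) - 1 - i):
            if li[j] > li[j + 1]:
                li[j], li[j + 1] = li[j + 1], li[j]


data = list(range(1, 1000))
random.shuffle(data)
original = list(data)  # kept only to show that data is never modified

for func in (simple_bubble, list.sort):
    result, seconds = time_on_copy(func, data)
    print(func.__name__, seconds)
```

`time.perf_counter()` is used instead of `time.time()` because it is meant for measuring short intervals. The measured numbers depend on the machine, so none are shown here.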

2, selection sort

The core idea: fix a position, then select the element that belongs there. Starting from the beginning of the sequence, find the smallest element and put it in the first position, then find the second smallest and put it in the second position, and so on until the whole list is sorted.

# Time complexity: O(n^2)  Space complexity: O(1)
@cal_time
def select_sort(li):
    for i in range(len(li) - 1):
        min_li = i
        for j in range(i + 1, len(li)):
            if li[j] < li[min_li]:
                min_li = j
        if min_li != i:
            li[i], li[min_li] = li[min_li], li[i]


li = list(range(1, 10000))
random.shuffle(li)
select_sort(li)

3, insertion sort

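# The list is divided into a sorted region and an unsorted region. At first the sorted region holds one element.
# Each step takes one element from the unsorted region and inserts it at its place in the sorted region, until the unsorted region is empty.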
@cal_time
def insert_sort(li):
    for i in range(1, len(li)):
        tmp = li[i]
        # j marks the sorted-region element currently compared with tmp
        j = i - 1
        while j >= 0 and tmp < li[j]:
            li[j + 1] = li[j]
            j = j - 1
        li[j + 1] = tmp
li = list(range(1, 100))
random.shuffle(li)
insert_sort(li)
print(li)

The slow sorting group

Insertion sort, bubble sort, selection sort
Time complexity: O(n^2)
Space complexity: O(1)
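The O(n^2) figure describes the general case. For insertion sort the actual work depends heavily on the input order, which can be checked by counting how many elements get shifted. The sketch below is an illustration only, and `insertion_shifts` is a name invented for it. It uses the same loop as `insert_sort` with a counter added.

```python
def insertion_shifts(li):
    """Insertion sort a copy of li and return how many elements were shifted."""
    li = list(li)
    shifts = 0
    for i in range(1, len(li)):
        tmp = li[i]
        j = i - 1
        while j >= 0 and tmp < li[j]:
            li[j + 1] = li[j]
            j -= 1
            shifts += 1
        li[j + 1] = tmp
    return shifts


n = 100
print(insertion_shifts(range(n)))         # already sorted input: 0 shifts
print(insertion_shifts(range(n, 0, -1)))  # reversed input: n(n-1)/2 = 4950 shifts
```

On sorted input the inner loop never runs, so the whole sort takes about n steps. On reversed input every element moves past all the elements before it.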

4, quick sort

The idea of quick sort: first pick any element (usually the first element of the array) as the key, then move every value smaller than it in front of it and every value larger than it behind it. This process is called one pass of quick sort.

One pass of quick sort works as follows:

1) Set two variables i and j. When the pass starts, i = 0 and j = N-1;

2) Take the first array element as the key, i.e. key = A[0];

3) Search toward the front starting from j (j--) until the first value A[j] smaller than key is found, then swap A[j] and A[i];

4) Search toward the back starting from i (i++) until the first value A[i] larger than key is found, then swap A[i] and A[j];

5) Repeat steps 3 and 4 until i = j. (If step 3 or 4 finds no qualifying value, that is, A[j] in step 3 is not smaller than key or A[i] in step 4 is not larger than key, keep moving the pointer with j = j-1 or i = i+1 until one is found. When a swap happens, the positions of i and j stay unchanged. Also, i == j can only become true right after an i++ or j-- step, and at that moment the loop ends.)
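The five steps above translate almost line by line into code. The sketch below is one possible reading of them, not the author's implementation. The names `partition_by_swapping` and `quick_sort_by_swapping` are made up, and equal values are treated as "not qualifying" so the scans skip over them.

```python
def partition_by_swapping(a, left, right):
    """One quick sort pass on a[left..right]; returns the key's final index."""
    i, j = left, right          # step 1
    key = a[left]               # step 2
    while i < j:                # step 5: repeat until i == j
        # Step 3: scan from the back for a value smaller than key
        while i < j and a[j] >= key:
            j -= 1
        a[i], a[j] = a[j], a[i]
        # Step 4: scan from the front for a value larger than key
        while i < j and a[i] <= key:
            i += 1
        a[i], a[j] = a[j], a[i]
    return i                    # key now sits at index i == j


def quick_sort_by_swapping(a, left=0, right=None):
    """Sort a in place by partitioning around the key, then recursing."""
    if right is None:
        right = len(a) - 1
    if left < right:
        mid = partition_by_swapping(a, left, right)
        quick_sort_by_swapping(a, left, mid - 1)
        quick_sort_by_swapping(a, mid + 1, right)
    return a


data = [5, 3, 8, 1, 9, 2]
print(partition_by_swapping(data, 0, len(data) - 1), data)
# 3 [2, 3, 1, 5, 9, 8]
```

When the two pointers meet, the swaps exchange an element with itself and change nothing, which is why no extra check is needed before them.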

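# Time complexity: O(n log n) on average, O(n^2) in the worst case
# Space complexity: O(log n) on average, used by the recursion stack
def partition(data, left, right):
    tmp = data[left]  # the key; its slot at data[left] is now free
    while left < right:
        # Scan from the right for a value smaller than the key
        while left < right and data[right] >= tmp:
            right -= 1
        data[left] = data[right]
        # Scan from the left for a value larger than the key
        while left < right and data[left] <= tmp:
            left += 1
        data[right] = data[left]
    data[left] = tmp
    # Return the key's final index (not its value) so the caller can split around it
    return left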


def _quick_sort(data, left, right):
    if left < right:
        mid = partition(data, left, right)
        _quick_sort(data, left, mid - 1)
        _quick_sort(data, mid + 1, right)


@cal_time
def quick_sort(li, left, right):
    _quick_sort(li, left, right)

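5, merge sort

**Merge sort** also follows the shape of a complete binary tree. It is an efficient sorting algorithm built on the merge operation, and a very typical application of divide and conquer. Merging subsequences that are already sorted yields a fully sorted sequence.

**Basic process**: suppose the initial sequence contains n records. It can be viewed as n sorted subsequences, each of length 1. Merge them in pairs to get about n/2 sorted subsequences of length 2 or 1, then merge in pairs again, and so on until a single sorted sequence of length n remains. This is called 2-way merge sort.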
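The basic process just described is the bottom-up form of merge sort, whereas the article's own code takes the recursive top-down route. As a rough sketch of the bottom-up reading, with the made-up helper names `merge_two` and `bottom_up_merge_sort`, and returning a new list rather than sorting in place:

```python
def merge_two(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # <= takes from the left run on ties, which keeps the sort stable
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def bottom_up_merge_sort(li):
    """Return a sorted copy of li by merging runs in pairs."""
    runs = [[item] for item in li]  # n sorted runs of length 1
    while len(runs) > 1:
        next_runs = []
        for k in range(0, len(runs), 2):
            if k + 1 < len(runs):
                next_runs.append(merge_two(runs[k], runs[k + 1]))
            else:
                next_runs.append(runs[k])  # odd run out, carried to the next round
        runs = next_runs
    return runs[0] if runs else []


print(bottom_up_merge_sort([3, 1, 5, 8, 2]))  # [1, 2, 3, 5, 8]
```

Each round halves the number of runs, so there are about log2(n) rounds, and each round touches all n elements. That is where O(n log n) comes from.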

The code example has two main parts: one splits the original sequence, the other merges the two sorted parts back together.

def merge(li, low, mid, high):
    i = low
    j = mid + 1
    ltmp = []
    while i <= mid and j <= high:
        if li[i] <= li[j]:  # <= takes the left element on ties, keeping the sort stable
            ltmp.append(li[i])
            i = i + 1
        else:
            ltmp.append(li[j])
            j = j + 1

    while i <= mid:
        ltmp.append(li[i])
        i = i + 1

    while j <= high:
        ltmp.append(li[j])
        j = j + 1

    li[low:high + 1] = ltmp


def _mergesort(li, low, high):
    if low < high:
        mid = (low + high) // 2
        _mergesort(li, low, mid)
        _mergesort(li, mid + 1, high)
        merge(li, low, mid, high)


@cal_time
def mergesort(li, low, high):
    _mergesort(li, low, high)

Running the various sorting algorithms under the same conditions

li = list(range(1, 10000))
random.shuffle(li)
mergesort(li, 0, len(li) - 1)

li = list(range(1, 10000))
random.shuffle(li)
bubble_sort1(li)

li = list(range(1, 10000))
random.shuffle(li)
bubble_sort2(li)

li = list(range(1, 10000))
random.shuffle(li)
select_sort(li)

li = list(range(1, 10000))
random.shuffle(li)
insert_sort(li)

li = list(range(1, 10000))
random.shuffle(li)
quick_sort(li, 0, len(li) - 1)
---------------------------------------------------------------------
mergesort took 0.03596305847167969 seconds
bubble_sort1 took 12.642452716827393 seconds
bubble_sort2 took 10.469623804092407 seconds
select_sort took 4.044940233230591 seconds
insert_sort took 5.539592027664185 seconds
quick_sort took 0.032910823822021484 seconds

6, Shell sort

Shell sort is a grouped insertion sort.

First take an integer d1 = n / 2 and divide the elements into d1 groups, where neighbouring elements of a group are d1 positions apart. Run direct insertion sort inside each group;

Then take a second integer d2 = d1 / 2 and repeat the grouped sorting, and so on until di = 1, at which point all elements are in one group and direct insertion sort runs over the whole list.

A single pass of Shell sort does not put any particular element in its final place, but it brings the data as a whole closer to sorted order. The last pass leaves all the data sorted.
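To make the grouping concrete, the sketch below lists which indices fall into the same group for each gap d1, d2, and so on. It only illustrates the grouping and does not sort anything. `shell_groups` is a name made up for this example.

```python
def shell_groups(n):
    """For each gap n//2, n//4, ..., 1, list the index groups sorted together."""
    result = []
    gap = n // 2
    while gap > 0:
        # Group number `start` holds indices start, start + gap, start + 2*gap, ...
        groups = [list(range(start, n, gap)) for start in range(gap)]
        result.append((gap, groups))
        gap //= 2
    return result


for gap, groups in shell_groups(8):
    print(gap, groups)
# 4 [[0, 4], [1, 5], [2, 6], [3, 7]]
# 2 [[0, 2, 4, 6], [1, 3, 5, 7]]
# 1 [[0, 1, 2, 3, 4, 5, 6, 7]]
```

With 8 elements the gaps are 4, 2 and 1. The groups get fewer and longer until the last pass covers the whole list.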

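# Shell sort is in essence a grouped insertion sort. It is also called diminishing increment sort and is named after D. L. Shell, who proposed it in 1959.
# It is a more efficient improvement on insertion sort. Shell sort is not a stable sorting algorithm.
# The improvement builds on two properties of insertion sort:
# 1. Insertion sort is efficient on data that is almost sorted, where it can reach linear time.
# 2. Insertion sort is inefficient in general, because each step moves a value by only one position.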

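def shell_sort(li):
    gap = len(li) // 2
    while gap > 0:
        # Insertion sort over elements that are gap positions apart
        for i in range(gap, len(li)):
            tmp = li[i]
            j = i - gap
            while j >= 0 and tmp < li[j]:
                li[j + gap] = li[j]
                j -= gap
            # Place tmp once the shifting is finished
            li[j + gap] = tmp
        # Halve the gap after each full pass; // keeps it an integer
        gap //= 2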

Summary:

Sort method	Worst-case time	Average-case time	Best-case time	Stability	Code complexity
Bubble sort	O(n^2)	O(n^2)	O(n)	Stable	Simple
Direct selection sort	O(n^2)	O(n^2)	O(n^2)	Unstable	Simple
Direct insertion sort	O(n^2)	O(n^2)	O(n)	Stable	Simple
Quick sort	O(n^2)	O(n log n)	O(n log n)	Unstable	More complex
Heap sort	O(n log n)	O(n log n)	O(n log n)	Unstable	Complex
Merge sort	O(n log n)	O(n log n)	O(n log n)	Stable	More complex
Shell sort	-	O(n^1.3)	-	Unstable	More complex
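The stability column says whether elements with equal keys keep their original relative order. A small sketch makes the difference visible by sorting (key, label) pairs on the key alone. The two functions are key-only variants of the selection and insertion sorts above, written for this illustration, and they work on copies.

```python
def select_sort_by_key(records):
    """Selection sort on records[k][0]; returns a sorted copy."""
    records = list(records)
    for i in range(len(records) - 1):
        min_index = i
        for j in range(i + 1, len(records)):
            if records[j][0] < records[min_index][0]:
                min_index = j
        if min_index != i:
            # This long-distance swap can jump an element over an equal key
            records[i], records[min_index] = records[min_index], records[i]
    return records


def insert_sort_by_key(records):
    """Insertion sort on records[k][0]; returns a sorted copy."""
    records = list(records)
    for i in range(1, len(records)):
        tmp = records[i]
        j = i - 1
        # Strict < never moves tmp past an element with an equal key
        while j >= 0 and tmp[0] < records[j][0]:
            records[j + 1] = records[j]
            j -= 1
        records[j + 1] = tmp
    return records


records = [(2, 'a'), (2, 'b'), (1, 'c')]
print(select_sort_by_key(records))  # [(1, 'c'), (2, 'b'), (2, 'a')]
print(insert_sort_by_key(records))  # [(1, 'c'), (2, 'a'), (2, 'b')]
```

Selection sort puts 'b' ahead of 'a' even though 'a' came first in the input, while insertion sort preserves the input order of the two records with key 2.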