nowcoder basic algorithm chapter 2

Partition problems and quicksort

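Basic partition

Given an array arr and a number num, move the numbers less than or equal to num to the left part of the array and the numbers greater than num to the right part. Required: O(1) extra space and O(N) time.

The idea is to maintain an index such that every number at or before that index is less than or equal to num.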

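def split_array(arr, num):
    """Maintain index i, the end of the "less than or equal" region.
    If a value is less than or equal to num, grow the region by one and swap the value into it.
    If a value is greater than num, just move on to the next one.
    """
    i = -1
    for j in range(len(arr)):
        if arr[j] <= num:
            i += 1
            arr[j], arr[i] = arr[i], arr[j]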
    return arr
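
As a quick sanity check of the idea above, the sketch below restates the same single-pass partition and also returns the boundary index so the result can be verified. The name `partition_leq` and the return value are illustrative assumptions, not part of the original notes.

```python
def partition_leq(arr, num):
    """Move values <= num to the left of arr, in place.

    Returns the index of the last element of the "<= num" region
    (-1 if that region is empty). Hypothetical helper for illustration.
    """
    boundary = -1  # arr[0..boundary] holds the values <= num
    for j in range(len(arr)):
        if arr[j] <= num:
            boundary += 1
            arr[boundary], arr[j] = arr[j], arr[boundary]
    return boundary


data = [7, 6, 4, 1, 2, 5, 4]
b = partition_leq(data, 4)
# data is now [4, 1, 2, 4, 7, 5, 6] and b is 3
```

Only the split is guaranteed; the order inside each region is not.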

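Dutch national flag problem

Given an array arr and a number num, move the numbers less than num to the left of the array, the numbers equal to num to the middle, and the numbers greater than num to the right. Required: O(1) extra space and O(N) time.

The idea is to maintain two indices, less and more, which mark the boundaries of the "less than" and "greater than" regions.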

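def split_array(arr, num):
    """Overall strategy: move smaller values toward the front, move larger values toward the back, and leave equal values where they are.
    Three variables drive the swaps: less, more and current.

    more must start at len(arr) and be decremented before the swap.
    If it starts at len(arr) - 1 and is decremented after the swap, the loop condition
    current < more stops one element early:
    [7,6,4,1,2,5,4] with num = 4 ends as [2, 4, 4, 1, 5, 6, 7], where the 1 in the middle was never examined.
    """
    current = 0
    less = -1
    more = len(arr)
    while(current < more):
        if arr[current] < num:      # smaller: swap into the "less" region, then advance both less and current
            less += 1
            arr[less], arr[current] = arr[current], arr[less]
            current += 1
        elif arr[current] == num:   # equal: just move on
            current += 1
        else:                       # larger: swap into the "more" region; current stays, the swapped-in value is still unexamined
            more -= 1
            arr[current], arr[more] = arr[more], arr[current]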
    return arr
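
The example array mentioned in the docstring can be traced with a small variant that also reports where the "equal" block ends up. This is a sketch under assumptions: the name `three_way_partition` and the returned tuple are not in the original notes.

```python
def three_way_partition(arr, num):
    """Partition arr in place; return (first, last) index of the values equal to num."""
    less, current, more = -1, 0, len(arr)
    while current < more:
        if arr[current] < num:
            less += 1
            arr[less], arr[current] = arr[current], arr[less]
            current += 1
        elif arr[current] == num:
            current += 1
        else:
            more -= 1
            arr[current], arr[more] = arr[more], arr[current]
    return less + 1, more - 1


data = [7, 6, 4, 1, 2, 5, 4]
first, last = three_way_partition(data, 4)
# data is now [2, 1, 4, 4, 5, 6, 7]; the 4s occupy indices 2..3
```

If num does not occur in the array, the returned pair has first > last, which marks an empty middle block.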

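Quicksort

Quicksort follows from the partition problem: a pivot value splits the array into two parts, and each part is then partitioned again in the same way.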

import random
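def quick_sort(arr):
    """Quicksort (sorts arr in place).

    Time complexity: randomized quicksort is a randomized algorithm, so the bound is an expectation: O(N log N)
    Space complexity: O(log N) expected, for the recursion stack

    Stability: not stable. Partitioning swaps elements over long distances, so two equal values can end up in the opposite order.
    """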
    sort_range(arr, 0, len(arr)-1)


def sort_range(arr, left, right):
    if left < right:
        position = split_array(arr, left, right)
        sort_range(arr, left, position[0]-1)
        sort_range(arr, position[1]+1, right)

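def split_array(arr, left, right):
    """Three-way partition of arr[left..right] around a randomly chosen value num.

    Values smaller than num are moved toward the front, larger values are moved toward the back, and values equal to num are skipped over.

    less starts at left - 1 and more at right + 1.
    Every value at or to the left of index less is smaller than num;
    every value at or to the right of index more is larger than num.

    Randomized quicksort: O(n log n) time and O(log n) space.
    Both bounds are expectations over the random choice of num.
    Args:
        left: left end of the range to partition, inclusive
        right: right end of the range to partition, inclusive
    Returns:
        [less+1, more-1], a two-element list with the first and last index of the values equal to num. For example, if the range ends up as [2,1,4,4,5,6,7] with num = 4, the result is [2,3].
    """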
    num = arr[random.randint(left, right)]
    current = left
    less = left -1
    more = right + 1
    while(current < more):
        if arr[current] < num:      # smaller: swap into the "less" region, then advance both less and current
            less += 1
            arr[less], arr[current] = arr[current], arr[less]
            current += 1
        elif arr[current] == num:   # equal: just move on
            current += 1
        else:
            more -= 1
            arr[current], arr[more] = arr[more], arr[current]
    return [less+1 , more-1]
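
A compact way to gain confidence in a randomized quicksort is to compare it with Python's built-in `sorted` on many random inputs. The sketch below is self-contained and folds the partition into one function; the name `quick_sort_3way` and the default arguments are assumptions made for this example.

```python
import random


def quick_sort_3way(arr, left=0, right=None):
    """Sort arr[left..right] in place with a random pivot and a three-way partition."""
    if right is None:
        right = len(arr) - 1
    if left >= right:
        return
    pivot = arr[random.randint(left, right)]
    less, current, more = left - 1, left, right + 1
    while current < more:
        if arr[current] < pivot:
            less += 1
            arr[less], arr[current] = arr[current], arr[less]
            current += 1
        elif arr[current] == pivot:
            current += 1
        else:
            more -= 1
            arr[current], arr[more] = arr[more], arr[current]
    # arr[left..less] < pivot and arr[more..right] > pivot; the middle is done.
    quick_sort_3way(arr, left, less)
    quick_sort_3way(arr, more, right)


# Randomized comparison against the built-in sort.
for _ in range(200):
    sample = [random.randint(0, 20) for _ in range(random.randint(0, 30))]
    expected = sorted(sample)
    quick_sort_3way(sample)
    assert sample == expected
```

Small value ranges such as 0..20 produce many duplicates, which exercises the "equal" branch of the partition.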

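Heap structure

An array can be viewed as a complete binary tree. Nodes are numbered from 0, and parent and child indices are related as follows: if the parent is at index i, its left child is at 2*i+1 and its right child is at 2*i+2. If a child is at index i, its parent is at (i-1)/2. This also works for i = 0, provided the division rounds toward zero.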
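
The "rounds toward zero" condition matters in Python, because the floor-division operator `//` rounds toward negative infinity. The helper names below are illustrative, not from the notes.

```python
def parent(i):
    # int() truncates toward zero, so parent(0) is 0 rather than -1.
    return int((i - 1) / 2)


def left_child(i):
    return 2 * i + 1


def right_child(i):
    return 2 * i + 2


# Python's floor division would give -1 for the root:
assert (0 - 1) // 2 == -1
assert parent(0) == 0

# Every child maps back to its parent.
for i in range(10):
    assert parent(left_child(i)) == i
    assert parent(right_child(i)) == i
```

Because the root is its own parent under this convention, a sift-up loop that compares a node with its parent stops at the root without a separate bounds check.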

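Max-heap and min-heap: in a max-heap, the value of every node is greater than or equal to all values below it, so the root holds the maximum. A min-heap is the opposite.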
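
The max-heap property can be checked directly on the array form by comparing every non-root element with its parent. `is_max_heap` is a hypothetical helper added here for illustration.

```python
def is_max_heap(arr):
    """Return True if every element is <= its parent (the root has no parent)."""
    # For i >= 1, floor division gives the same parent index as truncation.
    return all(arr[(i - 1) // 2] >= arr[i] for i in range(1, len(arr)))


assert is_max_heap([6, 3, 4, 1, 0, 2])
assert not is_max_heap([2, 1, 3, 6, 0, 4])
```

An empty array and a single-element array both count as heaps under this check.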

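How to turn an array into a max-heap:

The idea: adjust each element of the array in turn. If it is larger than its parent, swap it with the parent, and repeat until it is no longer larger than its parent or it has reached the root.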

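def heap_insert(arr, current_index):
    """Sift the value at current_index up while it is larger than its parent."""
    # int() truncates toward zero, so the parent of index 0 is 0 and the loop stops at the root.
    father_index = int((current_index - 1) / 2)
    while(arr[father_index] < arr[current_index]):
        arr[father_index], arr[current_index] = arr[current_index], arr[father_index]
        current_index = father_index
        father_index = int((current_index - 1) / 2)


# Build a max-heap by inserting the elements one at a time.
x = [2, 1, 3, 6, 0, 4]
for i in range(len(x)):
    heap_insert(x, i)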

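Sift-down in a max-heap:

Suppose the value of a node has become smaller. How should the array be adjusted so that it is still a heap? This process is called sift-down. The idea: if the node has no left child, it is a leaf and nothing needs to be done. Otherwise check whether the right child exists and pick the larger of the children. If the current value is not smaller than that child, stop; otherwise swap the two and repeat from the child's position.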

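def heapipy(arr, current_index, heap_size):
    """Sift-down for a max-heap: restore the heap after the value at current_index has become smaller.

    Args:
        arr: the array to adjust
        current_index: index of the node to sift down
        heap_size: number of elements of arr that belong to the heap
    """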
    left_index = 2 * current_index + 1
    while(left_index < heap_size):
        right_index = left_index + 1
        if right_index < heap_size:
            largest =  left_index if arr[left_index] >= arr[right_index] else right_index
        else:
            largest = left_index

        largest = largest if arr[largest] > arr[current_index] else current_index
        if largest == current_index:
            break
        else:
            arr[largest], arr[current_index] = arr[current_index], arr[largest]
            current_index = largest
            left_index = 2*current_index + 1
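
Sift-down also allows a different way to build a heap: start from the last non-leaf node and sift each node down, working back to the root. This is the standard bottom-up construction and runs in O(N); it is not the method used in these notes, and the names `sift_down` and `build_max_heap` are assumptions for this sketch.

```python
def sift_down(arr, i, heap_size):
    """Move arr[i] down until neither child inside the heap is larger."""
    while True:
        left = 2 * i + 1
        if left >= heap_size:
            return  # no children: arr[i] is a leaf
        largest = left
        right = left + 1
        if right < heap_size and arr[right] > arr[left]:
            largest = right
        if arr[largest] <= arr[i]:
            return  # heap property already holds here
        arr[i], arr[largest] = arr[largest], arr[i]
        i = largest


def build_max_heap(arr):
    """Bottom-up heap construction, in place."""
    # Indices len(arr) // 2 and above are leaves, so start just before them.
    for i in range(len(arr) // 2 - 1, -1, -1):
        sift_down(arr, i, len(arr))


x = [2, 1, 3, 6, 0, 4]
build_max_heap(x)
# x is now [6, 2, 4, 1, 0, 3]
```

The resulting heap can differ from the one produced by repeated insertion; both satisfy the max-heap property.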

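Heap sort:

Using the two operations above: first build the heap, then repeatedly swap the last element of the heap with the first one, shrink the heap by one, and sift the new first element down.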

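def heapipy(arr, current_index, heap_size):
    """Sift-down for a max-heap: restore the heap after the value at current_index has become smaller.

    Args:
        arr: the array to adjust
        current_index: index of the node to sift down
        heap_size: number of elements of arr that belong to the heap
    """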
    left_index = 2 * current_index + 1
    while(left_index < heap_size):
        right_index = left_index + 1
        if right_index < heap_size:
            largest =  left_index if arr[left_index] >= arr[right_index] else right_index
        else:
            largest = left_index

        largest = largest if arr[largest] > arr[current_index] else current_index
        if largest == current_index:
            break
        else:
            arr[largest], arr[current_index] = arr[current_index], arr[largest]
            current_index = largest
            left_index = 2*current_index + 1

def heap_insert(arr, current_index):
    """Sift the value at current_index up (usually the last position); repeating this for each element builds the max-heap."""
    father_index = int((current_index - 1) / 2)
    while(arr[father_index] < arr[current_index]):
        arr[father_index], arr[current_index] = arr[current_index], arr[father_index]
        current_index = father_index
        father_index = int((current_index - 1) / 2)

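def heap_sort(arr):
    """Heap sort (in place, ascending).

    Time complexity: building the heap by repeated heap_insert is O(n log n) in the worst case
            (a bottom-up build would be O(n)); the extraction phase is O(n log n), so the total is O(n log n).
    Space complexity: O(1), only swaps are used.

    Stability: not stable. For example [4a,4b,4c,5] becomes the heap [5,4a,4c,4b],
            and the sorted result is [4a,4c,4b,5]: 4b and 4c have changed order.
    """
    # Build the max-heap by inserting the elements one at a time.
    for i in range(len(arr)):
        heap_insert(arr, i)

    # Move the current maximum to the end of the heap, shrink the heap, and sift the new root down.
    heap_size = len(arr)
    while(heap_size > 1):
        heap_size -= 1
        arr[0], arr[heap_size] = arr[heap_size], arr[0]
        heapipy(arr, 0, heap_size)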
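
For cross-checking a hand-written heap sort, Python's standard `heapq` module offers a short reference. Note the assumptions: `heapq` implements a min-heap, so values are popped in ascending order directly, and `heap_sort_reference` is a name invented for this sketch.

```python
import heapq


def heap_sort_reference(values):
    """Return a new ascending list, using the standard-library min-heap."""
    heap = list(values)   # copy, so the input is left untouched
    heapq.heapify(heap)   # build the heap in place
    return [heapq.heappop(heap) for _ in range(len(heap))]


assert heap_sort_reference([2, 1, 3, 6, 0, 4]) == [0, 1, 2, 3, 4, 6]
```

Unlike the in-place version in these notes, this reference returns a new list and uses O(N) extra space.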

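Heap application: median of a stream of numbers

A generator keeps producing numbers, and the median of all numbers produced so far is wanted at any point during the process.

Brute force: after each new number, sort the array and read off the median, which costs O(N log N) per number.

Using heaps: keep a max-heap and a min-heap whose sizes differ by at most 1, so that every number in the max-heap is less than or equal to every number in the min-heap. The median is then at the top of the max-heap or the top of the min-heap (or the average of the two tops when the sizes are equal).

How to pop the maximum (or minimum) of a heap: (1) swap the first element with the last one, (2) pop the last element, (3) sift the new first element down.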
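
The three pop steps can be sketched for a max-heap stored in a Python list as follows. `pop_max` is a hypothetical helper; it assumes the list is non-empty and already satisfies the max-heap property.

```python
def pop_max(heap):
    """Remove and return the largest value of a max-heap stored in a list."""
    # Step 1: swap the first and the last element.
    heap[0], heap[-1] = heap[-1], heap[0]
    # Step 2: pop the last element (the old maximum).
    top = heap.pop()
    # Step 3: sift the new first element down.
    i, n = 0, len(heap)
    while 2 * i + 1 < n:
        child = 2 * i + 1
        if child + 1 < n and heap[child + 1] > heap[child]:
            child += 1  # the right child is the larger one
        if heap[child] <= heap[i]:
            break
        heap[i], heap[child] = heap[child], heap[i]
        i = child
    return top


heap = [6, 3, 4, 1, 0, 2]
popped = [pop_max(heap) for _ in range(6)]
# popped is [6, 4, 3, 2, 1, 0]
```

Calling it on an empty list raises IndexError, so callers need to check the size first.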

The code is as follows:

import random

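def heapipy(arr, current_index, heap_size):
    """Sift-down for a max-heap: restore the heap after the value at current_index has become smaller.
    Args:
        arr: the array to adjust
        current_index: index of the node to sift down
        heap_size: number of elements of arr that belong to the heap
    """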
    left_index = 2 * current_index + 1
    while(left_index < heap_size):
        right_index = left_index + 1
        if right_index < heap_size:
            largest =  left_index if arr[left_index] >= arr[right_index] else right_index
        else:
            largest = left_index

        largest = largest if arr[largest] > arr[current_index] else current_index
        if largest == current_index:
            break
        else:
            arr[largest], arr[current_index] = arr[current_index], arr[largest]
            current_index = largest
            left_index = 2*current_index + 1

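def small_heapipy(arr, current_index, heap_size):
    """Sift-down for a min-heap: restore the heap after the value at current_index has become larger.
    Args:
        arr: the array to adjust
        current_index: index of the node to sift down
        heap_size: number of elements of arr that belong to the heap
    """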
    left_index = 2 * current_index + 1
    while(left_index < heap_size):
        right_index = left_index + 1
        if right_index < heap_size:
            small =  left_index if arr[left_index] <= arr[right_index] else right_index
        else:
            small = left_index

        small = small if arr[small] < arr[current_index] else current_index
        if small == current_index:
            break
        else:
            arr[small], arr[current_index] = arr[current_index], arr[small]
            current_index = small
            left_index = 2*current_index + 1


def big_heap_insert(arr, current_index):
    """Sift-up for the max-heap: move the value up while it is larger than its parent."""
    father_index = int((current_index - 1) / 2)
    while(arr[father_index] < arr[current_index]):
        arr[father_index], arr[current_index] = arr[current_index], arr[father_index]
        current_index = father_index
        father_index = int((current_index - 1) / 2)


def small_heap_insert(arr, current_index):
    """Sift-up for the min-heap: move the value up while it is smaller than its parent."""
    father_index = int((current_index - 1) / 2)
    while(arr[father_index] > arr[current_index]):
        arr[father_index], arr[current_index] = arr[current_index], arr[father_index]
        current_index = father_index
        father_index = int((current_index - 1) / 2)


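small_heap = []  # min-heap holding the larger half of the numbers
big_heap = []    # max-heap holding the smaller half of the numbers
for i in range(100):
    x = random.randint(1,100)
    if not big_heap:
        big_heap.append(x)
    else:
        # Values up to the top of the max-heap go into the max-heap, larger ones into the min-heap.
        if x <= big_heap[0]:
            big_heap.append(x)
            big_heap_insert(big_heap,len(big_heap)-1)
        else:
            small_heap.append(x)
            small_heap_insert(small_heap, len(small_heap)-1)
        # Rebalance when the sizes differ by 2: pop the top of the larger heap and insert it into the other one.
        if len(big_heap) - len(small_heap) == 2:
            big_heap[0], big_heap[-1] = big_heap[-1], big_heap[0]
            tmp = big_heap.pop()
            heapipy(big_heap, 0, len(big_heap))
            small_heap.append(tmp)  # this repeats the insertion code above
            small_heap_insert(small_heap, len(small_heap)-1)
        elif len(small_heap) - len(big_heap) == 2:
            small_heap[0], small_heap[-1] = small_heap[-1], small_heap[0]
            tmp = small_heap.pop()
            small_heapipy(small_heap,0, len(small_heap))
            big_heap.append(tmp)
            big_heap_insert(big_heap,len(big_heap)-1)
    # The median is the top of the larger heap; with equal sizes, average the two tops.
    if len(big_heap) > len(small_heap):
        median = big_heap[0]
    elif len(small_heap) > len(big_heap):
        median = small_heap[0]
    else:
        median = (big_heap[0] + small_heap[0]) / 2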
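
The same two-heap scheme can be written more compactly with the standard `heapq` module. This is a sketch under assumptions: `heapq` only provides a min-heap, so the max-heap is simulated by storing negated values, and the class name `RunningMedian` is invented for this example.

```python
import heapq


class RunningMedian:
    def __init__(self):
        self.low = []   # max-heap of the smaller half, stored as negated values
        self.high = []  # min-heap of the larger half

    def add(self, x):
        if not self.low or x <= -self.low[0]:
            heapq.heappush(self.low, -x)
        else:
            heapq.heappush(self.high, x)
        # Rebalance so that the sizes differ by at most 1.
        if len(self.low) - len(self.high) == 2:
            heapq.heappush(self.high, -heapq.heappop(self.low))
        elif len(self.high) - len(self.low) == 2:
            heapq.heappush(self.low, -heapq.heappop(self.high))

    def median(self):
        """Median of the numbers added so far (at least one number is required)."""
        if len(self.low) > len(self.high):
            return -self.low[0]
        if len(self.high) > len(self.low):
            return self.high[0]
        return (-self.low[0] + self.high[0]) / 2


stream = RunningMedian()
medians = []
for value in [5, 2, 8, 1]:
    stream.add(value)
    medians.append(stream.median())
# medians is [5, 3.5, 5, 3.5]
```

Each `add` costs O(log N) and each `median` call costs O(1), compared with O(N log N) per number for the sort-every-time approach.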

Bucket sort:

Counting sort

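def counting_sort(arr, arr_size):
    """Counting sort.

    A non-comparison sort for integers in the range 1..arr_size. N = len(arr), K = arr_size

    Time complexity: O(N+K)
    Space complexity: O(N+K)

    Stability: counting sort is a stable algorithm in general; this version rebuilds the values
            from their counts, so it only suits bare integers with no attached data.
    """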
    buck_count = [0] * arr_size
    for i in arr:
        buck_count[i-1] += 1

    result = []
    for i in range(arr_size):
        result += [i+1] * buck_count[i]
    return result
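
Stability only becomes visible when each key carries extra data. The sketch below shows the usual prefix-sum form of counting sort on records; it differs from the version above in two assumed ways: keys lie in the range 0..k-1, and a `key` function extracts the integer key from each record.

```python
def counting_sort_stable(records, key, k):
    """Stable counting sort of records whose key(record) is an int in [0, k)."""
    counts = [0] * (k + 1)
    for record in records:
        counts[key(record) + 1] += 1
    # After the prefix sums, counts[v] is the first output slot for key v.
    for v in range(k):
        counts[v + 1] += counts[v]
    out = [None] * len(records)
    for record in records:          # left-to-right scan keeps equal keys in order
        v = key(record)
        out[counts[v]] = record
        counts[v] += 1
    return out


records = [(2, 'a'), (1, 'b'), (2, 'c'), (0, 'd'), (1, 'e')]
result = counting_sort_stable(records, key=lambda r: r[0], k=3)
# result is [(0, 'd'), (1, 'b'), (1, 'e'), (2, 'a'), (2, 'c')]
```

Records with the same key keep their input order ('b' before 'e', 'a' before 'c'), which is what radix sort relies on in each pass.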

Radix sort

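def radix_sort(arr, digit):
    """Radix sort (least significant digit first) for non-negative integers.

    digit is the number of decimal digits of the largest value.

    Time complexity: O(digit * N); digit is usually small, so this is treated as O(N)
    Space complexity: O(N) for the buckets and the rebuilt list
            (the code rebinds arr to a new list instead of using a separate result variable)

    Stability: stable
    """
    for i in range(digit):
        bucket = [[] for _ in range(10)]
        for x in arr:
            number_of_digit = (x % (10**(i+1))) // (10 ** i)  # the i-th decimal digit of x, counting from 0
            bucket[number_of_digit].append(x)
        arr = []
        for j in range(10):
            arr += bucket[j]
    return arr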
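
To see how the order emerges digit by digit, the following sketch records the list after every pass. `radix_passes` is a hypothetical helper; it uses the equivalent digit formula `(x // 10 ** d) % 10` and assumes non-negative integers.

```python
def radix_passes(arr, digit):
    """Return the list as it looks after each of the `digit` bucket passes."""
    passes = []
    for d in range(digit):
        buckets = [[] for _ in range(10)]
        for x in arr:
            buckets[(x // 10 ** d) % 10].append(x)
        # Concatenate the buckets in order; appending keeps each pass stable.
        arr = [x for bucket in buckets for x in bucket]
        passes.append(arr)
    return passes


passes = radix_passes([170, 45, 75, 90, 802, 24, 2, 66], 3)
# after the ones digit:     [170, 90, 802, 2, 24, 45, 75, 66]
# after the tens digit:     [802, 2, 24, 45, 66, 170, 75, 90]
# after the hundreds digit: [2, 24, 45, 66, 75, 90, 170, 802]
```

Each pass must be stable; otherwise the ordering established by the lower digits would be destroyed.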

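Maximum gap between neighbours, in O(N) time

Given an array, find the largest difference between two adjacent numbers after sorting. The required time complexity is O(N), and non-comparison-based sorting must not be used.

The idea is to borrow the bucket concept: numbers that fall into the same sub-range of values are put into the same bucket. With N numbers and N + 1 equal-width buckets, at least one bucket stays empty, so the largest gap is never found inside a single bucket.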
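
A small worked example makes the bucket assignment concrete. The helper `bucket_index` and its integer formula are assumptions for this sketch: the minimum is mapped to bucket 0 and the maximum to bucket n, giving n + 1 buckets for n numbers.

```python
def bucket_index(num, n, min_value, max_value):
    # Maps min_value to bucket 0 and max_value to bucket n.
    return (num - min_value) * n // (max_value - min_value)


arr = [1, 9, 3, 4]
n = len(arr)
buckets = [[] for _ in range(n + 1)]
for value in arr:
    buckets[bucket_index(value, n, min(arr), max(arr))].append(value)

# buckets is now [[1], [3, 4], [], [], [9]]
```

Here the largest gap, 9 - 4 = 5, lies between the maximum of bucket 1 and the minimum of bucket 4, so only the minimum and maximum of each bucket need to be stored.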

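def get_index_by_num(num, arr_length, max_value, min_value):
    """Map a number to its bucket index.
    The value range [min_value, max_value] is spread evenly over the bucket indices 0..arr_length:
    min_value maps to bucket 0 and max_value maps to bucket arr_length.
    Integer arithmetic keeps the result exact for integer inputs.
    """
    return int((num - min_value) * arr_length // (max_value - min_value))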


def get_max_sub_min(arr):
    """Return the largest difference between neighbours in the sorted order of arr."""
    if len(arr) < 2:
        return 0

    max_value = float('-inf')
    min_value = float('inf')

    for i in arr:
        max_value = i if i > max_value else max_value
        min_value = i if i < min_value else min_value
    if max_value == min_value:
        return 0

    # One bucket more than there are numbers; each bucket only keeps its minimum and maximum.
    arr_length = len(arr)
    bucket_max = [float('-inf')] * (arr_length + 1)
    bucket_min = [float('inf')] * (arr_length + 1)
    has_num = [False] * (arr_length + 1)

    for i in arr:
        index = get_index_by_num(i, arr_length, max_value, min_value)
        has_num[index] = True
        bucket_min[index] = i if i < bucket_min[index] else bucket_min[index]
        bucket_max[index] = i if i > bucket_max[index] else bucket_max[index]

    # The largest gap never lies inside one bucket. It lies between the maximum of one non-empty
    # bucket and the minimum of the next non-empty bucket, not necessarily across an empty bucket.
    res = 0
    last_max = bucket_max[0]

    for index in range(1, arr_length + 1):
        if has_num[index]:
            res = max(res, bucket_min[index] - last_max)
            last_max = bucket_max[index]
    return res
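
For testing the bucket-based function, a straightforward O(N log N) reference is handy. It does not meet the O(N) requirement of the problem and is meant only for comparison; the name `max_gap_by_sorting` is an assumption.

```python
def max_gap_by_sorting(arr):
    """Reference answer: sort, then scan neighbouring pairs."""
    ordered = sorted(arr)
    # default=0 covers arrays with fewer than two elements.
    return max((b - a for a, b in zip(ordered, ordered[1:])), default=0)


assert max_gap_by_sorting([3, 6, 9, 1]) == 3
assert max_gap_by_sorting([1, 9, 3, 4]) == 5
```

Comparing both functions on random integer arrays is a quick way to catch off-by-one errors in the bucket indexing.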