Implementing Bucket Sort in Python

1. Introduction to Bucket Sorting

Bucket sort, also known as bin sort, is a sorting algorithm that works by distributing data into buckets and then merging the buckets.

Bucket sort first divides the data into a limited number of buckets, then sorts the data in each bucket (in-bucket sorting can use any sorting algorithm, such as quick sort), and finally merges all the sorted buckets into one ordered sequence, at which point the list is sorted.

Bucket sort takes up a lot of extra space, and the choice of in-bucket sorting algorithm has a large effect on performance. Bucket sort itself fits few scenarios; counting sort and radix sort, which are both based on the idea of bucket sort, are used more often.

2. The Principle of Bucket Sorting

The principle of bucket sorting is as follows:

1. Find the maximum and minimum values in the list to be sorted to get the data range.

2. Based on the data range, choose a suitable bucket size, construct a limited number of buckets, and determine the data range of each bucket. For example, if the data range is [0,100) and the data is divided into 10 buckets, the first bucket is [0,10), the second bucket is [10,20), and so on.

3. Assign the data in the list to be sorted to the corresponding buckets.

4. Sort the data in each bucket. Any sorting algorithm can be used here; one with low time complexity is recommended.

5. Take the data out of the buckets in order, one bucket at a time, and add it to a new ordered sequence. The list is then sorted.
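The mapping from a value to its bucket in step 2 can be sketched as follows. This is a minimal illustration, not code from the original article: the function name bucket_index and its parameters are assumptions, and it assumes equal-width buckets over a half-open range [low, high).

```python
def bucket_index(value, low, high, bucket_count):
    """Return the index of the equal-width bucket that holds value.

    Hypothetical helper: assumes low <= value < high, as in the range [0, 100).
    """
    if not low <= value < high:
        raise ValueError("value is outside [low, high)")
    width = (high - low) / bucket_count  # data range covered by one bucket
    return int((value - low) // width)


# With the range [0, 100) split into 10 buckets, each bucket spans 10 values.
print(bucket_index(0, 0, 100, 10))   # 0, the first bucket [0, 10)
print(bucket_index(10, 0, 100, 10))  # 1, the second bucket [10, 20)
print(bucket_index(99, 0, 100, 10))  # 9, the last bucket [90, 100)
```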

Take sorting the list [5, 7, 3, 7, 2, 3, 2, 5, 9, 5, 7, 8] into ascending order as an example. The initial state of the list is shown in the figure below.

1. Find the maximum and minimum values in the list to be sorted and choose the number of buckets. In the example the maximum value is 9 and the minimum value is 2, and three buckets are allocated. Each bucket covers a range of 3, so the buckets are [2,5), [5,8), and [8,11).

2. Visit the list to be sorted and assign each value to its bucket in turn. 5 belongs to the range of the second bucket and is placed in the second bucket.

3. Continue visiting the list and distributing values. 7 belongs to the range of the second bucket and is placed in the second bucket.

4. Continue visiting the list and distributing values. 3 belongs to the range of the first bucket and is placed in the first bucket.

5. Continue visiting the list and distributing values. 7 belongs to the range of the second bucket and is placed in the second bucket.

6. Continue until the whole list has been visited and every value has been placed in its bucket.

7. Sort the data within each bucket. The list is to be sorted in ascending order, so each bucket is sorted in ascending order.

8. Take the data out of the buckets in turn and add it to the sorted sequence. First take out the data in the first bucket: 2,2,3,3.

9. Next take out the data in the second bucket: 5,5,5,7,7,7.

10. Finally take out the data in the third bucket, 8,9, and add it to the sorted sequence. The list is sorted. The sorting result is shown in the figure below.
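The walkthrough above can be traced step by step with a short script. This is only a sketch: it assumes a bucket width of 3 (the value used in the article's code), and the variable names are illustrative.

```python
array = [5, 7, 3, 7, 2, 3, 2, 5, 9, 5, 7, 8]
bucket_width = 3  # assumed width; gives three buckets for the range 2..9

low = min(array)
bucket_count = (max(array) - low) // bucket_width + 1
buckets = [[] for _ in range(bucket_count)]

# Distribution phase: each value goes to bucket (value - low) // bucket_width.
for value in array:
    buckets[(value - low) // bucket_width].append(value)
print(buckets)  # [[3, 2, 3, 2], [5, 7, 7, 5, 5, 7], [9, 8]]

# In-bucket sorting phase.
sorted_buckets = [sorted(bucket) for bucket in buckets]
print(sorted_buckets)  # [[2, 2, 3, 3], [5, 5, 5, 7, 7, 7], [8, 9]]

# Merge phase: concatenate the buckets in order.
result = [value for bucket in sorted_buckets for value in bucket]
print(result)  # [2, 2, 3, 3, 5, 5, 5, 7, 7, 7, 8, 9]
```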

3. Implementing Bucket Sort in Python

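# coding=utf-8
def bucket_sort(array):
    # An empty list has no minimum or maximum, so return early.
    if not array:
        return []
    min_num, max_num = min(array), max(array)
    # Each bucket covers a data range of 3.
    bucket_num = (max_num-min_num)//3 + 1
    buckets = [[] for _ in range(int(bucket_num))]
    # Put each number into the bucket that covers its value.
    for num in array:
        buckets[int((num-min_num)//3)].append(num)
    new_array = list()
    # Sort each bucket, then append its data to the result in bucket order.
    for i in buckets:
        for j in sorted(i):
            new_array.append(j)
    return new_array


if __name__ == '__main__':
    array = [5, 7, 3, 7, 2, 3, 2, 5, 9, 5, 7, 8]
    print(bucket_sort(array))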

Output:

[2, 2, 3, 3, 5, 5, 5, 7, 7, 7, 8, 9]

The code uses Python's built-in functions max() and min() to find the maximum and minimum values in the list to be sorted. It sets the data range of each bucket to 3, which yields three buckets for the example list, and adds each value to its bucket. It then sorts the data in each bucket with Python's built-in function sorted(); another sorting algorithm, such as quick sort, could be used instead. Finally, the sorted data from each bucket is appended in bucket order to a new sequence, and the list is sorted.

In the code, i is the current bucket (a list), and j is each value taken from that bucket in sorted order.
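Because any in-bucket sorting algorithm can be used, the sorter can be passed in as a parameter. The sketch below is a variant of the function above, not the article's own code: the name bucket_sort_with and the parameters bucket_size and sort_bucket are assumptions, and insertion sort is swapped in for illustration where the article mentions quick sort.

```python
def insertion_sort(values):
    """Return a new ascending list (stable, O(n^2) in the worst case)."""
    result = []
    for value in values:
        pos = len(result)
        # Walk left past strictly larger items so equal items keep their order.
        while pos > 0 and result[pos - 1] > value:
            pos -= 1
        result.insert(pos, value)
    return result


def bucket_sort_with(array, bucket_size=3, sort_bucket=sorted):
    """Bucket sort with a configurable bucket width and in-bucket sorter."""
    if not array:
        return []
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    min_num, max_num = min(array), max(array)
    bucket_num = int((max_num - min_num) // bucket_size) + 1
    buckets = [[] for _ in range(bucket_num)]
    for num in array:
        buckets[int((num - min_num) // bucket_size)].append(num)
    new_array = []
    for bucket in buckets:
        new_array.extend(sort_bucket(bucket))  # any sorter returning a list works
    return new_array


data = [5, 7, 3, 7, 2, 3, 2, 5, 9, 5, 7, 8]
print(bucket_sort_with(data, sort_bucket=insertion_sort))
# [2, 2, 3, 3, 5, 5, 5, 7, 7, 7, 8, 9]
```

Insertion sort is a common choice here because buckets are usually small, but that is a design preference rather than a requirement.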

4. Time Complexity and Stability of Bucket Sorting

1. Time complexity

Bucket sort visits every element of the list once to distribute it, which takes n steps for a list of length n, and then sorts each bucket. If the i-th of the k buckets holds n_i elements, sorting it takes O(n_i^2) in the worst case, so the total cost is n plus the cost of sorting every bucket. In the worst case all the data falls into a single bucket, so n_i = n and T(n) = n + n^2. Constant factors from the distribution and sorting steps do not affect big O notation, so the worst-case time complexity of bucket sort is O(n^2).

In the best case the data is spread evenly over the buckets: with k buckets, each bucket holds n/k elements. Sorting one bucket takes O((n/k) * log(n/k)) on average, so the whole sort takes T(n) = n + k * (n/k) * log(n/k) = n + n * log(n/k). When k = n, each bucket holds a single element and needs no in-bucket sorting, so log(n/k) = log 1 = 0 and the time complexity is O(n).
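How the data distribution drives the worst case can be seen by counting how many values land in each bucket. The sketch below is illustrative only: bucket_sizes is a hypothetical helper, and it assumes a fixed number of equal-width buckets between the minimum and maximum (unlike the article's code, which fixes the bucket width instead).

```python
def bucket_sizes(values, bucket_count):
    """Count how many values land in each of bucket_count equal-width buckets."""
    low, high = min(values), max(values)
    sizes = [0] * bucket_count
    if high == low:
        sizes[0] = len(values)  # identical values all share one bucket
        return sizes
    for value in values:
        index = (value - low) * bucket_count // (high - low)
        sizes[min(index, bucket_count - 1)] += 1  # the maximum goes in the last bucket
    return sizes


uniform = list(range(100))              # evenly spread values
skewed = list(range(99)) + [1_000_000]  # one outlier stretches the data range

print(bucket_sizes(uniform, 10))  # [10, 10, 10, 10, 10, 10, 10, 10, 10, 10]
print(bucket_sizes(skewed, 10))   # [99, 0, 0, 0, 0, 0, 0, 0, 0, 1]
```

In the skewed case almost all of the in-bucket sorting work falls on a single bucket, which is the situation the worst-case analysis describes.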
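The best-case formula can be evaluated for a few bucket counts to see the cost fall toward n as k approaches n. This is a rough cost model under the even-distribution assumption, not a benchmark; the function name estimated_cost and the choice of base-2 logarithm are assumptions made for the illustration.

```python
import math


def estimated_cost(n, k):
    """Evaluate T(n) = n + k * (n/k) * log2(n/k) for k evenly filled buckets."""
    per_bucket = n / k  # elements in each bucket when the data is spread evenly
    return n + k * per_bucket * math.log2(per_bucket)


n = 1024
for k in (1, 32, 1024):
    print(k, round(estimated_cost(n, k)))
# 1 11264
# 32 6144
# 1024 1024
```

With a single bucket the model reduces to an ordinary n log n sort plus one pass over the data; with k = n only the single pass remains.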

2. Stability

Bucket sort divides the list into buckets, sorts within each bucket, and merges the buckets. Distribution and merging keep equal elements in their original relative order, but the in-bucket sort may not: different sorting algorithms can be used for it, some stable and some unstable. The stability of bucket sort therefore depends on the stability of the in-bucket sorting algorithm.
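The effect of the in-bucket algorithm on stability can be shown with records that share a key. The sketch below is an illustration under stated assumptions: bucket_sort_by_key, stable_sort, and selection_sort are hypothetical names, keys are assumed to be integers, and the bucket width of 3 follows the article's code.

```python
from operator import itemgetter


def bucket_sort_by_key(records, key, sort_bucket, bucket_size=3):
    """Bucket sort records by key(record); sort_bucket(bucket, key) sorts one bucket."""
    if not records:
        return []
    keys = [key(record) for record in records]
    low = min(keys)
    buckets = [[] for _ in range((max(keys) - low) // bucket_size + 1)]
    for record in records:  # distribution keeps input order inside each bucket
        buckets[(key(record) - low) // bucket_size].append(record)
    result = []
    for bucket in buckets:
        result.extend(sort_bucket(bucket, key))
    return result


def stable_sort(bucket, key):
    return sorted(bucket, key=key)  # Python's sorted() is guaranteed to be stable


def selection_sort(bucket, key):
    """Unstable in-bucket sort: long-distance swaps can reorder equal keys."""
    items = list(bucket)
    for i in range(len(items)):
        smallest = min(range(i, len(items)), key=lambda j: key(items[j]))
        items[i], items[smallest] = items[smallest], items[i]
    return items


records = [(5, "a"), (5, "b"), (3, "c")]  # (key, label); two records share key 5

print(bucket_sort_by_key(records, itemgetter(0), stable_sort))
# [(3, 'c'), (5, 'a'), (5, 'b')]  "a" still precedes "b"
print(bucket_sort_by_key(records, itemgetter(0), selection_sort))
# [(3, 'c'), (5, 'b'), (5, 'a')]  the equal keys were reordered
```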

Origin blog.csdn.net/weixin_43790276/article/details/107398295