Bucket-Based Sorting: Radix Sort and a Summary of Sorting Algorithms

Description #

There are two bucket-based sorts: counting sort and radix sort.

Both have a limited range of application, because the sample data must be of a form that can be divided into buckets.

For a description of the counting sort algorithm, see: Counting sort for bucket-based sorting

Radix Sort #

Generally speaking, radix sort requires the samples to be non-negative decimal integers. The process is as follows.

Step 1: Find the maximum value and count how many digits it has. Any number with fewer digits is padded on the left with 0.

For example:

The original array is

arr = {17,210,3065,40,71,2}

The maximum value is 3065, which has four digits. The other numbers have fewer than four digits, so they are padded on the left with 0, and the array becomes

arr = {0017,0210,3065,0040,0071,0002}
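Step 1 can be sketched as below. This is only an illustration: the class name PadDemo and the helpers digitCount and padded are hypothetical, and the padding is done on strings purely for display. A real implementation does not need to pad anything, because a missing high digit already reads as 0 under integer division.

```java
import java.util.Arrays;

class PadDemo {
    // Number of decimal digits in a non-negative value (0 counts as 1 digit in this sketch).
    static int digitCount(int value) {
        int digits = 1;
        while (value >= 10) {
            digits++;
            value /= 10;
        }
        return digits;
    }

    // Zero-padded view of every element, padded to the width of the maximum.
    static String[] padded(int[] arr) {
        int max = Arrays.stream(arr).max().orElse(0);
        int width = digitCount(max);
        String[] out = new String[arr.length];
        for (int i = 0; i < arr.length; i++) {
            out[i] = String.format("%0" + width + "d", arr[i]);
        }
        return out;
    }

    public static void main(String[] args) {
        int[] arr = {17, 210, 3065, 40, 71, 2};
        System.out.println(String.join(",", padded(arr)));
        // prints 0017,0210,3065,0040,0071,0002
    }
}
```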

Step 2: Prepare 10 buckets, numbered 0 to 9; each bucket is a queue.

Step 3: Starting from the ones digit, put each number into the bucket that matches its digit (the elements in one bucket are held in a queue), then pour the buckets out in order from bucket 0 to bucket 9. Next, do the same for the tens digit, and so on up to the highest digit. The order produced by the last pour is the sorted result.

Taking the above array as an example, the whole process of radix sort is as follows.

In the first round, the ones digits are 7, 0, 5, 0, 1, 2. After entering the buckets, bucket 0 holds 0210 and 0040, bucket 1 holds 0071, bucket 2 holds 0002, bucket 5 holds 3065, and bucket 7 holds 0017.

Pouring the buckets out gives the first-round result {0210,0040,0071,0002,3065,0017}.

In the second round, the tens digits are 1, 4, 7, 0, 6, 1. After entering the buckets, bucket 0 holds 0002, bucket 1 holds 0210 and 0017, bucket 4 holds 0040, bucket 6 holds 3065, and bucket 7 holds 0071.

Pouring the buckets out gives the second-round result {0002,0210,0017,0040,3065,0071}.

In the third round, the hundreds digits are 0, 2, 0, 0, 0, 0. After entering the buckets, bucket 0 holds 0002, 0017, 0040, 3065 and 0071, and bucket 2 holds 0210, so the third-round result is {0002,0017,0040,3065,0071,0210}.

The fourth round is also the last. The thousands digits are 0, 0, 0, 3, 0, 0. After entering the buckets, bucket 0 holds 0002, 0017, 0040, 0071 and 0210, and bucket 3 holds 3065.

Pouring the buckets out gives the final result {0002,0017,0040,0071,0210,3065}, which is sorted.
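The queue-based procedure walked through above can be sketched directly in Java. This is a minimal illustration under stated assumptions, not the article's final code: the class name QueueRadixSort is hypothetical, the input is assumed to contain only non-negative integers, and the digit is extracted with a running weight exp (1, 10, 100, ...) instead of padding.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

class QueueRadixSort {
    // Sorts non-negative ints using ten FIFO queues as buckets.
    static void sort(int[] arr) {
        if (arr == null || arr.length <= 1) {
            return;
        }
        int max = Arrays.stream(arr).max().getAsInt();
        List<Queue<Integer>> buckets = new ArrayList<>();
        for (int d = 0; d < 10; d++) {
            buckets.add(new ArrayDeque<>());
        }
        // exp is the weight of the digit handled in this round; long avoids overflow.
        for (long exp = 1; max / exp > 0; exp *= 10) {
            // Enter the buckets in array order.
            for (int num : arr) {
                buckets.get((int) (num / exp % 10)).add(num);
            }
            // Pour out buckets 0..9; each queue yields first in, first out.
            int i = 0;
            for (Queue<Integer> bucket : buckets) {
                while (!bucket.isEmpty()) {
                    arr[i++] = bucket.poll();
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] arr = {17, 210, 3065, 40, 71, 2};
        sort(arr);
        System.out.println(Arrays.toString(arr)); // prints [2, 17, 40, 71, 210, 3065]
    }
}
```

Because every queue is first in, first out, numbers with the same digit keep their relative order from the previous round, which is what makes the later rounds build on the earlier ones.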

The above is the process of radix sort, but the implementation can be optimized.

The code keeps the same overall process but drops the queues entirely: a ten-element array count stands in for the buckets, and entering and leaving the buckets are both done with count alone. The code is analyzed piece by piece below, where the help array stores the result of each round and bits is the number of decimal digits in the maximum value:

// From the ones digit up to the highest digit, repeatedly enter and leave the buckets
for (int bit = 1; bit <= bits; bit++) {
    // enter the buckets
    // leave the buckets
}

Consider entering the buckets first. Originally each value had to be stored in the queue of its bucket. That is no longer necessary: we only need to record how many values fall into each bucket, which is the following logic

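// From the ones digit up to the highest digit, repeatedly enter and leave the buckets
for (int bit = 1; bit <= bits; bit++) {
    // enter the buckets: count how many numbers have each digit value
    int[] count = new int[10];
    for (int num : arr) {
        count[digit(num, bit)]++;
    }
    // leave the buckets
}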

Using the example array {0017,0210,3065,0040,0071,0002}: after the first round, which buckets by the ones digit, the count array becomes {2,1,1,0,0,1,0,1,0,0}.

Original algorithm: bucket 0 holds 0210 and 0040, bucket 1 holds 0071, bucket 2 holds 0002, bucket 5 holds 3065, and bucket 7 holds 0017.

After optimizing with count: count = {2,1,1,0,0,1,0,1,0,0}.

As can be seen, count stores only how many numbers in the array have each digit value at the current position.

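For example:

count[0] = 2; // Bucket 0 received two numbers in the first round, i.e. two numbers have 0 as their ones digit.
count[5] = 1; // Bucket 5 received one number in the first round, i.e. one number has 5 as its ones digit.
......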
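The counting step can be tried in isolation with the sketch below. The class name CountDemo and the helper countDigits are hypothetical names used only for this illustration, and digit here uses integer arithmetic rather than Math.pow, which is a choice of this sketch rather than the article's code.

```java
import java.util.Arrays;

class CountDemo {
    // Value of the bit-th decimal digit of num; bit = 1 is the ones digit.
    static int digit(int num, int bit) {
        int divisor = 1;
        for (int i = 1; i < bit; i++) {
            divisor *= 10;
        }
        return num / divisor % 10;
    }

    // count[d] = how many elements have digit d at position bit.
    static int[] countDigits(int[] arr, int bit) {
        int[] count = new int[10];
        for (int num : arr) {
            count[digit(num, bit)]++;
        }
        return count;
    }

    public static void main(String[] args) {
        int[] arr = {17, 210, 3065, 40, 71, 2};
        System.out.println(Arrays.toString(countDigits(arr, 1)));
        // prints [2, 1, 1, 0, 0, 1, 0, 1, 0, 0]
    }
}
```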

Next comes leaving the buckets. In the original algorithm the values are stored in queues, so the buckets are visited from left to right and each bucket's elements come out in queue order. After the optimization there is only the count array, which records nothing but counts. How can the values leave the buckets using count alone? In the first round the order in which values leave the buckets is {0210,0040,0071,0002,3065,0017}: buckets are emptied from the lowest number to the highest, and within a bucket the values come out in the order they went in.

From the first-round count = {2,1,1,0,0,1,0,1,0,0} we obtain the prefix-sum array {2,3,4,4,4,5,5,6,6,6}, where entry d is the number of elements whose digit is at most d. The highest-numbered non-empty bucket is bucket 7, and it holds a single number, 0017, so that number must leave last (index 5, because the prefix sum for bucket 7 is 6). The next non-empty bucket is bucket 5, which also holds a single number, 3065, so it must leave second to last. Traversing the array from right to left and placing each number at the last free position of its bucket keeps numbers with the same digit in their original relative order. Continuing in this way yields every number of the first round in order. The core code is as follows:

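      // prefix sums
      for (int j = 1; j < 10; j++) {
        count[j] = count[j - 1] + count[j];
      }
      // traverse the array from right to left
      for (int i = arr.length - 1; i >= 0; i--) {
        int pos = digit(arr[i], bit);
        // For a number whose current digit is pos, this is the position
        // it should occupy when it leaves the buckets in this round.
        help[--count[pos]] = arr[i];
      }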
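To see the prefix sums and the right-to-left traversal at work, a single round can be run on its own. In this sketch the class name OneRoundDemo and the method onePass are hypothetical, and the digit is selected with a weight exp (1 for the ones digit) instead of the article's digit helper.

```java
import java.util.Arrays;

class OneRoundDemo {
    // Performs a single stable pass on the digit with the given weight (1, 10, 100, ...).
    static int[] onePass(int[] arr, int exp) {
        int[] count = new int[10];
        for (int num : arr) {
            count[num / exp % 10]++;
        }
        // After this loop, count[d] = number of elements whose digit is <= d.
        for (int j = 1; j < 10; j++) {
            count[j] += count[j - 1];
        }
        int[] help = new int[arr.length];
        // Right to left: the last element with digit d takes the last slot of bucket d.
        for (int i = arr.length - 1; i >= 0; i--) {
            int d = arr[i] / exp % 10;
            help[--count[d]] = arr[i];
        }
        return help;
    }

    public static void main(String[] args) {
        int[] arr = {17, 210, 3065, 40, 71, 2};
        System.out.println(Arrays.toString(onePass(arr, 1)));
        // prints [210, 40, 71, 2, 3065, 17]
    }
}
```

The output matches the first-round result of the queue-based walkthrough, {0210,0040,0071,0002,3065,0017}.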

The complete code is as follows:

public class Code_RadixSort {

  // Sorts non-negative integers only
  public static void radixSort(int[] arr) {
    if (arr == null || arr.length <= 1) {
      return;
    }
    int max = arr[0];
    for (int i = 1; i < arr.length; i++) {
      max = Math.max(arr[i], max);
    }
    // number of decimal digits in the maximum value
    int bits = 0;
    while (max != 0) {
      bits++;
      max /= 10;
    }
    int[] help = new int[arr.length];
    for (int bit = 1; bit <= bits; bit++) {
      int[] count = new int[10];
      for (int num : arr) {
        count[digit(num, bit)]++;
      }
      // prefix sums
      for (int j = 1; j < 10; j++) {
        count[j] = count[j - 1] + count[j];
      }
      // traverse the array from right to left
      for (int i = arr.length - 1; i >= 0; i--) {
        int pos = digit(arr[i], bit);
        help[--count[pos]] = arr[i];
      }
      int m = 0;
      for (int num : help) {
        arr[m++] = num;
      }
    }
  }

  // Returns the value of num at the given digit position.
  // Positions are 1-based, starting from the ones digit.
  public static int digit(int num, int digit) {
    return ((num / (int) Math.pow(10, digit - 1)) % 10);
  }
}
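One way to gain confidence in an implementation like this is to compare it with the library sort on random inputs. The sketch below is an assumption-laden illustration: the class name RadixSortCheck and the method agreesWithLibrarySort are hypothetical, and radixSort is a compact restatement of the count-array approach that uses a long weight exp instead of Math.pow, so that the check is self-contained.

```java
import java.util.Arrays;
import java.util.Random;

class RadixSortCheck {
    // Compact restatement of the count-array radix sort for non-negative ints.
    static void radixSort(int[] arr) {
        if (arr == null || arr.length <= 1) {
            return;
        }
        int max = Arrays.stream(arr).max().getAsInt();
        int[] help = new int[arr.length];
        for (long exp = 1; max / exp > 0; exp *= 10) {
            int[] count = new int[10];
            for (int num : arr) {
                count[(int) (num / exp % 10)]++;
            }
            for (int j = 1; j < 10; j++) {
                count[j] += count[j - 1];
            }
            for (int i = arr.length - 1; i >= 0; i--) {
                help[--count[(int) (arr[i] / exp % 10)]] = arr[i];
            }
            System.arraycopy(help, 0, arr, 0, arr.length);
        }
    }

    // Returns true if radixSort agrees with Arrays.sort on `rounds` random arrays.
    static boolean agreesWithLibrarySort(int rounds, int maxLen, int maxValue, long seed) {
        Random random = new Random(seed);
        for (int r = 0; r < rounds; r++) {
            int[] a = new int[random.nextInt(maxLen + 1)];
            for (int i = 0; i < a.length; i++) {
                a[i] = random.nextInt(maxValue + 1); // non-negative values only
            }
            int[] expected = a.clone();
            Arrays.sort(expected);
            radixSort(a);
            if (!Arrays.equals(a, expected)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreesWithLibrarySort(1000, 50, 100_000, 42L));
    }
}
```

The generator deliberately produces only non-negative values, since the sort as written does not handle negative numbers.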

Sorting Summary #

| Algorithm | Time complexity | Additional space complexity | Stable |
| --- | --- | --- | --- |
| Selection sort | O(N^2) | O(1) | No |
| Bubble sort | O(N^2) | O(1) | Yes |
| Insertion sort | O(N^2) | O(1) | Yes |
| Merge sort | O(N*logN) | O(N) | Yes |
| Randomized quicksort | O(N*logN) | O(logN) | No |
| Heap sort | O(N*logN) | O(1) | No |
| Counting sort | O(N) | O(M) | Yes |
| Radix sort | O(N) | O(N) | Yes |

Selection sort is not stable. For example, in 5, 5, 5, 5, 5, 3, 5, 5, 5 the first round swaps 3 with the first 5, which moves that 5 behind the other 5s.

Bubble sort can be made stable: when two adjacent elements are equal, do not swap them.

Insertion sort can be made stable: when the element to the left is equal, stop swapping leftward.

Quicksort is not stable, because the partition step cannot be made stable: a swap at the boundary of the less-than-or-equal region can carry an element past others equal to it.
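The difference between the selection sort and insertion sort cases can be observed by tagging each key with its original index. This is a hedged sketch: the class name StabilityDemo and the helpers tagged and isStable are hypothetical, and each item is represented as a two-element array {key, originalIndex} of which only the key is compared.

```java
class StabilityDemo {
    // Each item is {key, originalIndex}; only the key is compared.
    static int[][] tagged(int... keys) {
        int[][] items = new int[keys.length][];
        for (int i = 0; i < keys.length; i++) {
            items[i] = new int[] {keys[i], i};
        }
        return items;
    }

    static void selectionSort(int[][] a) {
        for (int i = 0; i < a.length - 1; i++) {
            int min = i;
            for (int j = i + 1; j < a.length; j++) {
                if (a[j][0] < a[min][0]) {
                    min = j;
                }
            }
            int[] tmp = a[i]; // long-distance swap: may jump over equal keys
            a[i] = a[min];
            a[min] = tmp;
        }
    }

    static void insertionSort(int[][] a) {
        for (int i = 1; i < a.length; i++) {
            // Strict '>' means an element never moves past an equal key.
            for (int j = i; j > 0 && a[j - 1][0] > a[j][0]; j--) {
                int[] tmp = a[j];
                a[j] = a[j - 1];
                a[j - 1] = tmp;
            }
        }
    }

    // Stable means: equal keys still appear in increasing original index.
    static boolean isStable(int[][] a) {
        for (int i = 1; i < a.length; i++) {
            if (a[i - 1][0] == a[i][0] && a[i - 1][1] > a[i][1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[][] a = tagged(5, 5, 5, 5, 5, 3, 5, 5, 5);
        selectionSort(a);
        System.out.println(isStable(a)); // prints false: the first 5 ends up behind later 5s
        int[][] b = tagged(5, 5, 5, 5, 5, 3, 5, 5, 5);
        insertionSort(b);
        System.out.println(isStable(b)); // prints true
    }
}
```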

0) The key to sorting stability is how equal elements are handled.

1) Sorts that are not based on comparison place strict requirements on the sample data and are hard to adapt to other data.

2) Sorts based on comparison can be reused directly as long as the rule for comparing two samples is specified.

3) For comparison-based sorts, the lower bound on time complexity is O(N*logN).

4) Among the commonly used comparison-based sorts, none has O(N*logN) time complexity, additional space complexity below O(N), and stability all at once.

5) Choose quicksort for raw speed, heap sort to save space, and merge sort for stability.
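Point 2 can be illustrated with the standard library: the same sort routine is reused with different comparison rules. In this sketch the Student type, the class name ComparatorDemo and the helper namesSortedBy are hypothetical and exist only for the example. Arrays.sort on object arrays is documented as stable, which is why ties keep their original order here.

```java
import java.util.Arrays;
import java.util.Comparator;

class ComparatorDemo {
    // Hypothetical sample type used only for this illustration.
    static final class Student {
        final String name;
        final int age;

        Student(String name, int age) {
            this.name = name;
            this.age = age;
        }
    }

    // Sorts a copy with the given rule and returns the names concatenated in sorted order.
    static String namesSortedBy(Student[] students, Comparator<Student> rule) {
        Student[] copy = students.clone();
        Arrays.sort(copy, rule); // stable sort for object arrays
        StringBuilder sb = new StringBuilder();
        for (Student s : copy) {
            sb.append(s.name);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Student[] students = {
            new Student("A", 20), new Student("B", 18), new Student("C", 20), new Student("D", 18)
        };
        // Same sort routine, two different comparison rules.
        System.out.println(namesSortedBy(students, Comparator.comparingInt(s -> s.age)));
        // prints BDAC (ties keep their original order)
        System.out.println(
            namesSortedBy(students, Comparator.comparing((Student s) -> s.name).reversed()));
        // prints DCBA
    }
}
```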

Origin blog.csdn.net/jh035512/article/details/128109768