Understanding the bucket sort algorithm

    In this tutorial, you will learn how the bucket sort algorithm works, with an example in C.
    Bucket sort is a sorting technique that first divides the elements into several groups called buckets. Each bucket is then sorted on its own, either with a different sorting algorithm or by applying bucket sort recursively.
    Several buckets are created, and each one holds the elements that fall within a specific range of values. The elements in each bucket can be sorted using any algorithm. Finally, the buckets are visited in order and their elements are collected to form the sorted array.
    Bucket sort can be understood as a scatter-gather method: the elements are first scattered into the buckets, the elements within each bucket are then sorted, and finally the sorted elements are gathered in bucket order. (I feel that these two paragraphs of the original text are long-winded.)

How does bucket sort work?
  1. Suppose the input array holds floating-point values between 0 and 1 (the figure showing it is not reproduced here).
    Create an array of size 10. Each slot of this array is used as a bucket for storing elements.
  2. Insert the elements of the array into the buckets. Each element goes into the bucket whose range contains it.
    In the sample code, the buckets cover the ranges 0 to 1, 1 to 2, 2 to 3, ..., (n-1) to n.
    Suppose the input element is 0.23. Multiply it by the number of buckets, 10 (0.23 × 10 = 2.3), then truncate the result to an integer (⌊2.3⌋ = 2). The element 0.23 is therefore inserted into bucket 2.
    Similarly, 0.25 is inserted into the same bucket. In every case, the bucket index is the floor of the scaled floating-point value.
    If the input consists of integers, divide each one by the interval (10 here) and take the floor to get the bucket index.
    The other elements are inserted into their respective buckets in the same way.
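  3. Sort the elements of each bucket. Any sorting algorithm can be used; choose a stable one if equal elements must keep their relative order. Here, a built-in quick sort function is used, which is not guaranteed to be stable. (The C example further down uses insertion sort instead.)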
  4. Gather the elements from each bucket.
    Traverse the buckets in order and copy their elements back into the original array, one element per loop iteration. Once an element has been copied into the original array, it is removed from its bucket.
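    The four steps above can be sketched in C for floating-point input. This is a minimal sketch, not the original sample code: it assumes every value lies in the range 0 (inclusive) to 1 (exclusive), uses ten buckets as in the walk-through, and sorts each bucket with the standard library's qsort. The names bucket_sort_unit, bucket_of, cmp_double and NBUCKETS are hypothetical, chosen only for this illustration.

```c
#include <stdlib.h>

#define NBUCKETS 10 /* assumed: ten buckets, as in the walk-through */

/* qsort callback: ascending order of doubles */
static int cmp_double(const void *a, const void *b) {
  double x = *(const double *)a;
  double y = *(const double *)b;
  return (x > y) - (x < y);
}

/* Bucket index of x: x * NBUCKETS truncated to an integer (the floor for
 * non-negative x), clamped so that it is always a valid index */
static size_t bucket_of(double x) {
  int b = (int)(x * NBUCKETS);
  if (b < 0) b = 0;
  if (b >= NBUCKETS) b = NBUCKETS - 1;
  return (size_t)b;
}

/* Sorts n values in ascending order, assuming each lies in [0, 1).
 * Returns 0 on success and -1 if memory could not be allocated. */
int bucket_sort_unit(double *arr, size_t n) {
  size_t count[NBUCKETS] = {0}; /* number of elements per bucket */
  size_t start[NBUCKETS];       /* first slot of each bucket in tmp */
  size_t next[NBUCKETS];        /* next free slot of each bucket */
  size_t i, b;
  double *tmp;

  if (n < 2) return 0;
  tmp = malloc(n * sizeof *tmp);
  if (tmp == NULL) return -1;

  /* Step 1: size the buckets by counting the elements that fall in each */
  for (i = 0; i < n; ++i) count[bucket_of(arr[i])]++;
  start[0] = 0;
  for (b = 1; b < NBUCKETS; ++b) start[b] = start[b - 1] + count[b - 1];
  for (b = 0; b < NBUCKETS; ++b) next[b] = start[b];

  /* Step 2: scatter every element into its bucket */
  for (i = 0; i < n; ++i) {
    b = bucket_of(arr[i]);
    tmp[next[b]++] = arr[i];
  }

  /* Step 3: sort each bucket that holds more than one element */
  for (b = 0; b < NBUCKETS; ++b) {
    if (count[b] > 1) {
      qsort(tmp + start[b], count[b], sizeof *tmp, cmp_double);
    }
  }

  /* Step 4: gather; the buckets are already laid out in order in tmp */
  for (i = 0; i < n; ++i) arr[i] = tmp[i];
  free(tmp);
  return 0;
}
```

    Here the buckets are consecutive slices of one temporary array rather than linked lists, which keeps the sketch short at the cost of one extra counting pass. Calling bucket_sort_unit on an array holding 0.42, 0.32, 0.23, 0.52, 0.25, 0.47, 0.51 (illustrative values) places 0.23 and 0.25 in bucket 2, as described in step 2, and leaves the array in ascending order.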
Bucket sort algorithm
bucketSort()
  create N buckets each of which can hold a range of values
  for all the buckets
    initialize each bucket with 0 values
  for all the elements
    put each element into the bucket matching its range
  for all the buckets
    sort elements in each bucket
  gather elements from each bucket
end bucketSort
C example
// Bucket sort in C

#include <stdio.h>
#include <stdlib.h>

#define NARRAY 7     // Number of elements in the array
#define NBUCKET 6    // Number of buckets
#define INTERVAL 10  // Width of the value range covered by each bucket

struct Node {
  int data;
  struct Node *next;
};

void BucketSort(int arr[]);
struct Node *InsertionSort(struct Node *list);
void print(int arr[]);
void printBuckets(struct Node *list);
int getBucketIndex(int value);

// Sorting function
void BucketSort(int arr[]) {
  int i, j;
  struct Node **buckets;

  // Allocate the array of bucket head pointers
  buckets = (struct Node **)malloc(sizeof(struct Node *) * NBUCKET);
  if (buckets == NULL) {
    fprintf(stderr, "Out of memory\n");
    exit(EXIT_FAILURE);
  }

  // Initialize empty buckets
  for (i = 0; i < NBUCKET; ++i) {
    buckets[i] = NULL;
  }

  // Fill the buckets with respective elements
  for (i = 0; i < NARRAY; ++i) {
    struct Node *current;
    int pos = getBucketIndex(arr[i]);
    current = (struct Node *)malloc(sizeof(struct Node));
    if (current == NULL) {
      fprintf(stderr, "Out of memory\n");
      exit(EXIT_FAILURE);
    }
    current->data = arr[i];
    current->next = buckets[pos];  // Push onto the front of the bucket's list
    buckets[pos] = current;
  }

  // Print the buckets along with their elements
  for (i = 0; i < NBUCKET; i++) {
    printf("Bucket[%d]: ", i);
    printBuckets(buckets[i]);
    printf("\n");
  }

  // Sort the elements of each bucket
  for (i = 0; i < NBUCKET; ++i) {
    buckets[i] = InsertionSort(buckets[i]);
  }

  printf("-------------\n");
  printf("Buckets after sorting\n");
  for (i = 0; i < NBUCKET; i++) {
    printf("Bucket[%d]: ", i);
    printBuckets(buckets[i]);
    printf("\n");
  }

  // Put sorted elements back on arr and release each node once it is copied
  for (j = 0, i = 0; i < NBUCKET; ++i) {
    struct Node *node = buckets[i];
    while (node) {
      struct Node *next = node->next;
      arr[j++] = node->data;
      free(node);
      node = next;
    }
  }

  free(buckets);
}

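// Function to sort the elements of each bucket (insertion sort on a linked list)
struct Node *InsertionSort(struct Node *list) {
  struct Node *k, *nodeList;
  if (list == NULL || list->next == NULL) {
    return list;  // Zero or one element: already sorted
  }

  nodeList = list;  // Head of the sorted part
  k = list->next;   // Next node still to be inserted
  nodeList->next = NULL;
  while (k != NULL) {
    struct Node *ptr;
    // k is smaller than the current head: insert it at the front
    if (nodeList->data > k->data) {
      struct Node *tmp;
      tmp = k;
      k = k->next;
      tmp->next = nodeList;
      nodeList = tmp;
      continue;
    }

    // Walk to the node after which k must be inserted
    for (ptr = nodeList; ptr->next != NULL; ptr = ptr->next) {
      if (ptr->next->data > k->data)
        break;
    }

    if (ptr->next != NULL) {
      // Insert k between ptr and ptr->next
      struct Node *tmp;
      tmp = k;
      k = k->next;
      tmp->next = ptr->next;
      ptr->next = tmp;
      continue;
    } else {
      // No larger element found: append k at the end
      ptr->next = k;
      k = k->next;
      ptr->next->next = NULL;
      continue;
    }
  }
  return nodeList;
}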

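// Map a value to its bucket; out-of-range values are clamped to the first
// or last bucket so that the index never leaves the buckets array
int getBucketIndex(int value) {
  int index = value / INTERVAL;
  if (index < 0) return 0;
  if (index >= NBUCKET) return NBUCKET - 1;
  return index;
}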

void print(int ar[]) {
  int i;
  for (i = 0; i < NARRAY; ++i) {
    printf("%d ", ar[i]);
  }
  printf("\n");
}

// Print buckets
void printBuckets(struct Node *list) {
  struct Node *cur = list;
  while (cur) {
    printf("%d ", cur->data);
    cur = cur->next;
  }
}

// Driver code
int main(void) {
  int array[NARRAY] = {42, 32, 33, 52, 37, 47, 51};

  printf("Initial array: ");
  print(array);
  printf("-------------\n");

  BucketSort(array);
  printf("-------------\n");
  printf("Sorted array: ");
  print(array);
  return 0;
}
Complexity
  • Worst case complexity: O(n^2)
    When the array contains elements with close values, they are likely to be placed in the same bucket, so some buckets end up with many more elements than others.
    The complexity then depends on the sorting algorithm used to sort the elements within each bucket.
    It becomes worse still when the elements arrive in reverse order. If insertion sort is used to sort the elements in the bucket, the time complexity becomes O(n^2).
  • Best case complexity: O(n+k)
    This happens when the elements are evenly distributed among the buckets, so that each bucket holds almost the same number of elements.
    The complexity improves further if the elements inside each bucket are already sorted.
    If insertion sort is used to sort the elements in the bucket, the overall best-case complexity is linear, that is, O(n+k), where n is the number of elements and k is the number of buckets. O(n) is the cost of building the buckets, and O(k) is the cost of sorting the elements of the buckets with an algorithm that runs in linear time in the best case.
  • Average case complexity: O(n)
    This happens when the elements are randomly distributed in the array. Bucket sort can run in linear time even if the elements are not uniformly distributed, provided that the sum of the squares of the bucket sizes is linear in the total number of elements.
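    The gap between the worst and average cases can be made concrete. Insertion sort on a bucket of size m takes on the order of m^2 steps in the worst case, so the sum of the squared bucket sizes is a rough proxy for the total sorting work. The sketch below computes that sum for integer input, using the same value / interval mapping as the C example. The function name sum_squared_bucket_sizes is hypothetical, and clamping out-of-range values into the first or last bucket is an assumption made for this illustration.

```c
#include <stdlib.h>

/* Returns the sum of squared bucket sizes for the given input, or -1 if the
 * arguments are invalid or memory cannot be allocated. Values that map
 * outside [0, nbuckets - 1] are clamped into the first or last bucket. */
long sum_squared_bucket_sizes(const int *arr, size_t n, int interval,
                              int nbuckets) {
  long *count;
  long total = 0;
  size_t i;
  int b;

  if (interval <= 0 || nbuckets <= 0) return -1;
  count = calloc((size_t)nbuckets, sizeof *count);
  if (count == NULL) return -1;

  /* Count how many elements land in each bucket */
  for (i = 0; i < n; ++i) {
    b = arr[i] / interval;
    if (b < 0) b = 0;
    if (b >= nbuckets) b = nbuckets - 1;
    count[b]++;
  }

  /* Add up the squared sizes */
  for (b = 0; b < nbuckets; ++b) total += count[b] * count[b];

  free(count);
  return total;
}
```

    For the array 42, 32, 33, 52, 37, 47, 51 with an interval of 10 and 6 buckets, the buckets hold 3, 2 and 2 elements, giving 9 + 4 + 4 = 17. For seven values that all fall between 30 and 39, a single bucket holds everything and the sum is 49, which is n^2.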
Applications of bucket sort

    Use bucket sort in the following situations:

  • The input is evenly distributed over a known range
  • The input consists of floating-point values
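    When the input is spread over a known range that does not start at 0, the bucket index has to be computed relative to that range. The sketch below shows one way to do this for integers. The function name bucket_index and its parameters are hypothetical; it assumes that min <= value <= max and that nbuckets is at least 1 and small enough for the multiplication not to overflow.

```c
#include <stddef.h>

/* Maps value, assumed to lie in [min, max], to a bucket index in
 * [0, nbuckets - 1]. Each bucket covers an almost equal share of the range,
 * and larger values never map to a smaller index. */
size_t bucket_index(int value, int min, int max, size_t nbuckets) {
  /* Widen before subtracting so that max - min cannot overflow an int */
  unsigned long long offset = (unsigned long long)((long long)value - min);
  unsigned long long span = (unsigned long long)((long long)max - min) + 1;
  return (size_t)(offset * nbuckets / span);
}
```

    With min = 0, max = 59 and 6 buckets, this reduces to the value / 10 mapping of the C example: 42 goes to bucket 4 and 59 goes to bucket 5. With min = -10, max = 9 and 4 buckets, the value -5 goes to bucket 1.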
Reference documents

[1] Parewa Labs Pvt. Ltd. Bucket Sort Algorithm [EB/OL]. https://www.programiz.com/dsa/bucket-sort, 2020-01-01.

Origin blog.csdn.net/zsx0728/article/details/114923186