[Repost] Classic sorting algorithms, part 5: comparing the linear-time sorts (counting sort, radix sort, bucket sort)

Reprinted from: http://blog.csdn.net/touch_2011/article/details/6787127

 

1. Counting sort

          1.1 Introduction

            In the previous four posts, every sorting algorithm relied on comparing elements, so they can all be called "comparison sorts". Any comparison sort needs Ω(n log n) comparisons in the worst case. So is there a sorting algorithm that runs in linear time, O(n)? Counting sort is a very basic linear-time sort and is the building block of radix sort. The basic idea is: for each element x, determine the number of elements smaller than x; x can then be placed directly at its position in the sorted sequence. Process: assume that the values in the sequence a to be sorted lie in the range [0, k], where k is the maximum value in the sequence. First, an auxiliary array count records how many times each value appears in a, so that count[i] is the number of occurrences of i in a. Next, the entries of count are accumulated from left to right, so that count[i] becomes the number of elements in a that are not greater than i. Then a is scanned from back to front, and each element is placed directly into the auxiliary array b at the position given by count. Finally, the sorted sequence b is copied back to a.
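To make the role of count concrete, here is a minimal sketch that builds only the cumulative count array. It is an illustration rather than part of the original listing: the helper name `cumulativeCounts` is hypothetical, and it uses 0-based indexing for the input.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a freshly allocated array c of k+1 ints in which
   c[v] is the number of elements of a[0..n-1] that are <= v.
   Assumes every a[i] lies in [0, k]; returns NULL if allocation fails. */
int *cumulativeCounts(const int *a, int n, int k)
{
    int *c = calloc((size_t)k + 1, sizeof *c);  /* all counts start at 0 */
    if (c == NULL)
        return NULL;
    for (int i = 0; i < n; i++)
        c[a[i]]++;              /* c[v] = occurrences of v */
    for (int v = 1; v <= k; v++)
        c[v] += c[v - 1];       /* c[v] = number of elements <= v */
    return c;
}
```

For the keys {3,5,8,9,1,2} with k = 9 this gives c = {0,1,2,3,3,4,4,4,5,6}. For instance c[8] = 5, so the value 8 belongs in the fifth position of the sorted sequence.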

#include<stdio.h>
#include<stdlib.h>

//Counting sort, n is the number of records in array a, k is the maximum value in the record
void countingSort(int *a,int n,int k)
{
	int i;
	int *count=(int *)malloc(sizeof(int)*(k+1));
	int *b=(int *)malloc(sizeof(int)*(n+1));
	//Initialize the count array count
	for(i=0;i<=k;i++)
		*(count+i)=0;
	//Calculate the number of records equal to a[i]
	for(i=1;i<=n;i++)
		(*(count+a[i]))++;
	//Calculate the number of records less than or equal to a[i]
	for(i=1;i<=k;i++)
        *(count+i) += *(count+i-1);
	//Scan the a array and place each element at the corresponding position in the ordered sequence
	for(i=n;i>=1;i--){
		*(b + *(count + a[i]))=a[i];
        (*(count+a[i]))--;
	}
	for(i=1;i<=n;i++)
		a[i]=*(b+i);
	free(count);
	free(b);
}

int main(void)
{
	int i;
	int a[7]={0,3,5,8,9,1,2};//a[0] is not used
	countingSort(a,6,9);
	for(i=1;i<=6;i++)
		printf("%-4d",a[i]);
	printf("\n");
}

 

 1.2 Efficiency Analysis

As the code shows, counting sort has five for loops: three run about n times and two run about k times. The total running time is therefore T(n) = 3n + 2k, i.e. the time complexity is O(n+k), and this is the same in the best and the worst case. Counting sort is also stable. It needs n+k auxiliary space, which is relatively large, and it constrains the input: the element values must be non-negative integers in the range [0, k], where k is the maximum value in the sequence to be sorted. If k is too large, efficiency drops sharply. One point deserves attention: in the step "scan the a array and place each element at the corresponding position in the ordered sequence", why is a scanned from back to front? Working through the process shows that scanning from front to back would place equal elements in reverse order, which would make counting sort unstable. As mentioned earlier, counting sort is the basis of radix sort, so its stability directly affects the stability of radix sort.
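The effect of the scan direction can be checked with a small sketch. It sorts records that carry a key plus a tag, so that equal keys remain distinguishable. The `Record` type, the function name `countingSortRecords` and the `backToFront` flag are all hypothetical names introduced for this illustration, and the arrays are 0-based.

```c
#include <stdlib.h>

/* Hypothetical record type: sorted by key, tag only identifies the record. */
typedef struct {
    int key;
    char tag;
} Record;

/* Counting sort of in[0..n-1] into out[0..n-1] by key, keys assumed in [0, k].
   backToFront != 0 scans the input from the last element (stable);
   backToFront == 0 scans from the first element (equal keys end up reversed).
   Returns 0 on success, -1 if allocation fails. */
int countingSortRecords(const Record *in, Record *out, int n, int k, int backToFront)
{
    int *count = calloc((size_t)k + 1, sizeof *count);
    if (count == NULL)
        return -1;
    for (int i = 0; i < n; i++)
        count[in[i].key]++;
    for (int v = 1; v <= k; v++)
        count[v] += count[v - 1];       /* count[v] = records with key <= v */
    for (int s = 0; s < n; s++) {
        int i = backToFront ? n - 1 - s : s;
        /* count[key] is the last free 1-based slot for this key, so the
           0-based index is count[key] - 1, obtained by the pre-decrement. */
        out[--count[in[i].key]] = in[i];
    }
    free(count);
    return 0;
}
```

With the input (2,a) (1,b) (2,c) (1,d), the back-to-front scan yields the tags b d a c, keeping equal keys in their original order, while the front-to-back scan yields d b c a.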

2. Radix sort

          2.1 Introduction

            In counting sort, the time and space overhead grows as k becomes large (consider sorting the sequence {8888,1234,9999} with counting sort: it wastes a lot of space and is also slower than a comparison sort). Instead, each record to be sorted can be decomposed into its ones digit (the first digit), tens digit (the second digit), and so on. The whole sequence is then sorted with counting sort by the first digit, then by the second digit, and so on. Each digit obtained this way is at most 9, so the maximum value k seen by each counting sort pass is 9.
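The decomposition into digits can be sketched with integer arithmetic alone, as one possible alternative to a floating-point pow call. The helper name `digitAt` is hypothetical and is not part of the original listing.

```c
/* Hypothetical helper: returns the d-th decimal digit of x, where d = 1 is the
   ones digit, d = 2 the tens digit, and so on. Assumes x >= 0 and d >= 1.
   Uses only integer division, so no floating-point rounding is involved. */
int digitAt(int x, int d)
{
    for (int i = 1; i < d; i++)
        x /= 10;                /* drop the d-1 lowest digits */
    return x % 10;
}
```

For example, digitAt(1234, 1) is 4 and digitAt(1234, 4) is 1. A digit position beyond the length of the number, such as digitAt(1234, 5), gives 0, so shorter numbers behave as if padded with leading zeros.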

          2.2 Code

#include<stdio.h>
#include<stdlib.h>
#include<math.h>


//Counting sort on the d-th digit: n is the number of records in array a, k is the maximum digit value
void countingSort(int *a,int n,int k,int d)
{
	int i;
	int *count=(int *)malloc(sizeof(int)*(k+1));
	int *b=(int *)malloc(sizeof(int)*(n+1));
	//Initialize the count array count
	for(i=0;i<=k;i++)
		*(count+i)=0;
	//Count how many records have each value as their d-th digit; the d-th digit of a[i] is a[i]/(int)pow(10,d-1)%10
	for(i=1;i<=n;i++)
        (*(count+a[i]/(int)pow(10,d-1)%10))++;

	//Accumulate, so that count[i] is the number of records whose d-th digit is less than or equal to i
	for(i=1;i<=k;i++)
        *(count+i) += *(count+i-1);
	//Scan the a array and place each element at the corresponding position in the ordered sequence
	for(i=n;i>=1;i--){
		*(b + *(count + a[i]/(int)pow(10,d-1)%10))=a[i];
        (*(count+a[i]/(int)pow(10,d-1)%10))--;
	}
	for(i=1;i<=n;i++)
		a[i]=*(b+i);
	free(count);
	free(b);
}


// Radix sort, n is the number of records in array a, each record has d digits
void radixSort(int *a,int n,int d)
{
	int i;
	for(i=1;i<=d;i++){
	    countingSort(a,n,9,i);
	}
}

int main(void)
{
	int i;
	int a[7]={0,114,118,152,114,111,132};//a[0] is not used
	radixSort(a,6,3);
	for(i=1;i<=6;i++)
		printf("%-4d",a[i]);
	printf("\n");
}

 

 2.3 Efficiency Analysis

The running time of radix sort is T(n) = d*(2k+3n), where d is the number of digits in the record values and (2k+3n) is the time of each counting sort pass. As analyzed above, k does not exceed 9 and d is generally small, so k and d can be treated as small constants and the time complexity is O(n). The best and worst cases have the same time complexity. Radix sort is stable. The auxiliary space is the same as for counting sort, n+k.
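Since the cost grows linearly with d, it helps to know d for a given input. The following sketch derives it from the largest record. The helper name `maxDigits` is hypothetical; it keeps this article's convention of leaving a[0] unused.

```c
/* Hypothetical helper: number of decimal digits of the largest value in
   a[1..n] (a[0] is unused, matching the 1-based convention of this article).
   Assumes n >= 1 and all values are non-negative. */
int maxDigits(const int *a, int n)
{
    int max = a[1];
    for (int i = 2; i <= n; i++)
        if (a[i] > max)
            max = a[i];
    int d = 1;                  /* even 0 has one digit */
    while (max >= 10) {
        max /= 10;
        d++;
    }
    return d;
}
```

For the records {114,118,152,114,111,132} this gives d = 3, so with n = 6 and k = 9 the formula d*(2k+3n) evaluates to 3*(18+18) = 108 loop iterations.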

3. Bucket sorting

          3.1 Introduction

            Like counting sort, bucket sort makes assumptions about the sequence to be sorted: it assumes the sequence is generated by a random process that distributes the elements uniformly and independently over the interval [0, 1). The basic idea is to divide [0, 1) into n sub-intervals of equal size, called buckets, and distribute the n records into them. If a bucket receives more than one record, the records within that bucket must be sorted. Finally, the records of each bucket are listed in bucket order to obtain the sorted sequence.

          3.2 Code

#include<stdio.h>
#include<stdlib.h>

// bucket sort
void bucketSort(double* a,int n)
{
	//Description of linked list nodes
	typedef struct Node{
		double key;
        struct Node * next; 
	}Node;
	//Description of the auxiliary array elements (bucket heads)
	typedef struct{
         Node * next;
	}Head;
	int i,j;
    Head head[10]={NULL};
	Node * p;
	Node * q;
	Node * node;
	for(i=1;i<=n;i++){
		node=(Node*)malloc(sizeof(Node));
		node->key=a[i];
		node->next=NULL;
		//Walk bucket (int)(a[i]*10), keeping q one node behind p
		q=NULL;
		p=head[(int)(a[i]*10)].next;
		while(p){
			if(node->key < p->key)
				break;
			q=p;
			p=p->next;
		}
		//Insert node between q and p; q == NULL means node becomes the first node of the bucket
		node->next=p;
		if(q == NULL){
			head[(int)(a[i]*10)].next=node;
		}else{
			q->next=node;
		}
	}
	j=1;
	for(i=0;i<10;i++){
		p=head[i].next;
		while(p){
			a[j++]=p->key;
			q=p->next;
			free(p);//Release each node once its key has been copied back
			p=q;
		}
	}
}

int main(void)
{
	int i;
	double a[13]={0,0.13,0.25,0.18,0.29,0.81,0.52,0.52,0.83,0.52,0.69,0.13,0.16};//a[0] is not used
	bucketSort(a,12);
	for(i=1;i<=12;i++)
		printf("%-6.2f",a[i]);
	printf("\n");
}

 

   3.3 Efficiency Analysis

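When the records are evenly distributed across the buckets, i.e. each bucket holds only one element, the time complexity is O(n). Bucket sort is therefore well suited to sorting records with few duplicates. The auxiliary space is 2n. Bucket sort is a stable sort, but it is relatively complex to implement.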
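As a complement to the linked-list listing, which uses ten fixed buckets, here is a minimal sketch that uses n buckets as in the description in 3.1. It is a different technique from the listing: keys are scattered into a temporary array by bucket index and then finished with a single insertion sort. The names `bucketOf` and `bucketSortN` are hypothetical, and the array is 0-based.

```c
#include <stdlib.h>

/* Bucket index of x in [0, 1) when the interval is split into n buckets. */
static int bucketOf(double x, int n)
{
    int b = (int)(x * n);
    return b < n ? b : n - 1;   /* guard against rounding up to n */
}

/* Sketch: bucket sort of a[0..n-1] using n buckets, keys assumed in [0, 1).
   Returns 0 on success, -1 if allocation fails. */
int bucketSortN(double *a, int n)
{
    if (n <= 1)
        return 0;
    int *start = calloc((size_t)n + 1, sizeof *start);
    double *tmp = malloc((size_t)n * sizeof *tmp);
    if (start == NULL || tmp == NULL) {
        free(start);
        free(tmp);
        return -1;
    }
    /* start[b + 1] = number of keys that fall into bucket b */
    for (int i = 0; i < n; i++)
        start[bucketOf(a[i], n) + 1]++;
    /* Prefix sums: start[b] = index in tmp where bucket b begins */
    for (int b = 0; b < n; b++)
        start[b + 1] += start[b];
    /* Scatter keys into their buckets, preserving input order per bucket */
    for (int i = 0; i < n; i++)
        tmp[start[bucketOf(a[i], n)]++] = a[i];
    /* Buckets are already ordered relative to each other, so one insertion
       sort over the whole array only moves keys within their own bucket. */
    for (int i = 1; i < n; i++) {
        double key = tmp[i];
        int j = i - 1;
        while (j >= 0 && tmp[j] > key) {
            tmp[j + 1] = tmp[j];
            j--;
        }
        tmp[j + 1] = key;
    }
    for (int i = 0; i < n; i++)
        a[i] = tmp[i];
    free(start);
    free(tmp);
    return 0;
}
```

The auxiliary space here is one array of n keys plus n+1 counters, in line with the 2n figure above. When the keys are spread evenly, each bucket holds about one key and the final insertion sort does very little work.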
