Data Structures - Introduction to Merge Sort and Counting Sort

Merge sort

Merge sort (MERGE-SORT) is an efficient sorting algorithm built on the merge operation, and it is a typical application of divide and conquer. It first puts each subsequence in order and then merges the sorted subsequences into one fully sorted sequence. Merging two sorted lists into one sorted list is called a two-way merge.
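
To make the two-way merge concrete, here is a minimal sketch in C. The function name merge_two and its parameters are my own choices for illustration, and it assumes the caller provides an output array with room for both inputs.

```c
#include <stddef.h>

/* Two-way merge: combine the sorted arrays a (length na) and b (length nb)
 * into out, which must have room for na + nb elements.
 * merge_two is a hypothetical name used only for this sketch. */
void merge_two(const int *a, size_t na, const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;

    /* Repeatedly take the smaller front element; <= prefers a on ties. */
    while (i < na && j < nb)
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];

    /* One input is exhausted; append whatever remains of the other. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
}
```

Each element is looked at once, so merging takes time proportional to na + nb.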

The idea of merge sort

First, create a temporary array with the same length as the array to be sorted; it holds the data produced by each merge. Then keep dividing the interval to be sorted in two until an interval holds only one element or does not exist. As the recursion returns, merge each pair of subintervals, write the merged sequence into the temporary array, and copy the merged content from the temporary array back to the original array.

Implementation of a single merge pass

To merge two sorted intervals, repeatedly compare their front elements and write the smaller one into the tmp array. Once one interval has been written out completely, append the rest of the other interval to tmp. Finally, use memcpy to copy the merged range from tmp back to the original array.

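#include <string.h> // memcpy

// Merges the sorted halves arr[begin..mid] and arr[mid+1..end] through tmp.
void _MergeSort(int* arr, int begin, int end, int* tmp)
{
    int mid = begin + (end - begin) / 2;
    // Bounds of the two subintervals to merge
    int begin1 = begin, end1 = mid;
    int begin2 = mid + 1, end2 = end;

    int i = begin;

    while (begin1 <= end1 && begin2 <= end2)
    {
        // <= takes the left element on ties, which keeps the sort stable
        if (arr[begin1] <= arr[begin2])
        {
            tmp[i++] = arr[begin1++];
        }
        else
        {
            tmp[i++] = arr[begin2++];
        }
    }

    // Append whatever remains directly to the tmp array
    while (begin1 <= end1)
    {
        tmp[i++] = arr[begin1++];
    }

    while (begin2 <= end2)
    {
        tmp[i++] = arr[begin2++];
    }

    // Copy the merged content back to the original array
    memcpy(arr + begin, tmp + begin, (end - begin + 1) * sizeof(int));
}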

Merge sort implementation

The full sort recursively divides the interval on top of the single merge pass: the recursion returns once an interval holds at most one element, and the merging happens on the way back up. Drawing a recursive expansion diagram can also help us understand the recursive implementation of merge sort.
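
Putting the division and the merge together, a recursive version might look like the sketch below. The names merge_sort_rec and merge_sort are hypothetical, the bounds are inclusive, and returning -1 on allocation failure is my own convention rather than anything fixed by the algorithm.

```c
#include <stdlib.h>
#include <string.h>

/* Sorts arr[begin..end] (inclusive bounds), using tmp as scratch space. */
static void merge_sort_rec(int *arr, int begin, int end, int *tmp)
{
    if (begin >= end)                        /* zero or one element */
        return;

    int mid = begin + (end - begin) / 2;
    merge_sort_rec(arr, begin, mid, tmp);    /* sort the left half  */
    merge_sort_rec(arr, mid + 1, end, tmp);  /* sort the right half */

    /* Merge the two sorted halves into tmp[begin..end]. */
    int i = begin, l = begin, r = mid + 1;
    while (l <= mid && r <= end)
        tmp[i++] = (arr[l] <= arr[r]) ? arr[l++] : arr[r++];
    while (l <= mid)
        tmp[i++] = arr[l++];
    while (r <= end)
        tmp[i++] = arr[r++];

    /* Copy the merged range back to the original array. */
    memcpy(arr + begin, tmp + begin, (size_t)(end - begin + 1) * sizeof(int));
}

/* Entry point: allocates the temporary array once for the whole sort.
 * Returns 0 on success, -1 if the allocation fails. */
int merge_sort(int *arr, int n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc((size_t)n * sizeof(int));
    if (tmp == NULL)
        return -1;
    merge_sort_rec(arr, 0, n - 1, tmp);
    free(tmp);
    return 0;
}
```

Allocating tmp once in the entry point, instead of inside every recursive call, keeps the extra space at O(N).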

Implementation of the non-recursive version

The implementation idea is as follows. Changing the recursion into a loop is similar to rewriting the recursive Fibonacci sequence iteratively: start from the smallest cases and build upward. For merging, that means merging runs of 1 with 1, then 2 with 2, then 4 with 4, and so on. First, create a temporary array. Then start merging with a run length of 1 and double it after each pass. The single merge pass itself is the same as above, so I won't go into details here.
Since the data cannot always be divided evenly, we need to check the interval bounds. begin1 will never go out of bounds, but end1, begin2, and end2 all can. Here each group is copied back as soon as it is merged. When end1 or begin2 is out of bounds, there is no second interval to merge, so we break out of the loop without processing the group. When only end2 is out of bounds, set end2 to the last valid index, the array length - 1.
Next, I will first show the version that copies each group back right after merging it.
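
A minimal sketch of this per-group-copy version is below. The function name merge_sort_nonrec and the variable gap (the current run length) are my own choices, and the code assumes n is small enough that 2 * gap does not overflow an int.

```c
#include <stdlib.h>
#include <string.h>

/* Non-recursive merge sort that copies each group back right after merging.
 * Returns 0 on success, -1 if the temporary array cannot be allocated.
 * Assumes n is well below INT_MAX / 2 so that 2 * gap cannot overflow. */
int merge_sort_nonrec(int *arr, int n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc((size_t)n * sizeof(int));
    if (tmp == NULL)
        return -1;

    /* gap is the length of each sorted run: 1, 2, 4, ... */
    for (int gap = 1; gap < n; gap *= 2)
    {
        for (int i = 0; i < n; i += 2 * gap)
        {
            int begin1 = i, end1 = i + gap - 1;
            int begin2 = i + gap, end2 = i + 2 * gap - 1;

            /* No second interval: the rest is already one sorted run. */
            if (end1 >= n || begin2 >= n)
                break;
            /* The second interval is cut short by the end of the array. */
            if (end2 >= n)
                end2 = n - 1;

            int j = i;
            while (begin1 <= end1 && begin2 <= end2)
                tmp[j++] = (arr[begin1] <= arr[begin2]) ? arr[begin1++]
                                                        : arr[begin2++];
            while (begin1 <= end1)
                tmp[j++] = arr[begin1++];
            while (begin2 <= end2)
                tmp[j++] = arr[begin2++];

            /* Copy back only the group that was just merged. */
            memcpy(arr + i, tmp + i, (size_t)(end2 - i + 1) * sizeof(int));
        }
    }

    free(tmp);
    return 0;
}
```

Skipping a group is safe here because nothing is copied for it, so its elements simply stay where they are in arr.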

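Printing the bounds of each group's intervals makes the boundary handling easy to check. For example, with 6 elements, the pass with run length 2 merges [0,1] with [2,3] and then stops, because the next group's begin2 would be 6. The pass with run length 4 merges [0,3] with [4,5] after end2 is corrected from 7 to 5.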

That was the version that copies each group back after it is merged. Of course, this is not the only way. Next, let me introduce how to deal with the boundary problem when the whole array is copied back in one go after each pass.
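Since the whole array is copied back in one go, an out-of-bounds group can't simply be skipped: its intervals have to be corrected so that every element still gets written to the temporary array. For example, end1 can be set to the array length - 1 when it is out of bounds, and the second interval can be made empty (begin2 greater than end2) when begin2 is out of bounds. Printing the corrected bounds is an easy way to check the result.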
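
Here is a minimal sketch of the whole-pass-copy version. As before, merge_sort_nonrec_whole and gap are hypothetical names, and the way the bounds are corrected is one reasonable choice, not the only one.

```c
#include <stdlib.h>
#include <string.h>

/* Non-recursive merge sort that copies the whole array back once per pass.
 * Returns 0 on success, -1 if the temporary array cannot be allocated.
 * Assumes n is well below INT_MAX / 2 so that 2 * gap cannot overflow. */
int merge_sort_nonrec_whole(int *arr, int n)
{
    if (n < 2)
        return 0;
    int *tmp = malloc((size_t)n * sizeof(int));
    if (tmp == NULL)
        return -1;

    for (int gap = 1; gap < n; gap *= 2)
    {
        int j = 0;  /* next write position in tmp for this pass */
        for (int i = 0; i < n; i += 2 * gap)
        {
            int begin1 = i, end1 = i + gap - 1;
            int begin2 = i + gap, end2 = i + 2 * gap - 1;

            /* Correct the bounds instead of skipping the group, so that
             * every element is still written to tmp. */
            if (end1 >= n)
                end1 = n - 1;
            if (begin2 >= n)
            {
                begin2 = n;         /* empty second interval: */
                end2 = n - 1;       /* begin2 > end2          */
            }
            else if (end2 >= n)
            {
                end2 = n - 1;
            }

            while (begin1 <= end1 && begin2 <= end2)
                tmp[j++] = (arr[begin1] <= arr[begin2]) ? arr[begin1++]
                                                        : arr[begin2++];
            while (begin1 <= end1)
                tmp[j++] = arr[begin1++];
            while (begin2 <= end2)
                tmp[j++] = arr[begin2++];
        }

        /* One copy for the whole pass. */
        memcpy(arr, tmp, (size_t)n * sizeof(int));
    }

    free(tmp);
    return 0;
}
```

If a leftover group were skipped in this version, the final memcpy would overwrite its elements with stale data from tmp, which is why the bounds are corrected instead.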

Feature Summary

1. The time complexity of merge sort is O(N*logN), and the space complexity is O(N).
2. Merge sort can solve the problem of external sorting on disk. When sorting a large amount of data, the memory may not be able to hold it all, so the data has to be sorted in files on disk. Merge sort suits this because a merge only reads its inputs from front to back and writes its output from front to back, so sorted chunks stored in files can be merged without loading them fully into memory. Quick sort and heap sort cannot achieve this as easily: heap sort relies on random access by index into an array, and quick sort moves left and right pointers across the array.
3. Merge sort is a stable sort, provided the merge takes the element from the left interval first when two values are equal.
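
The O(N*logN) bound in point 1 can be sketched with a recurrence. Assume N is a power of two and that merging N elements costs cN for some constant c (both are simplifying assumptions for this sketch):

```latex
T(N) = 2\,T\!\left(\frac{N}{2}\right) + cN, \qquad T(1) = c
\quad\Longrightarrow\quad
T(N) = cN\log_2 N + cN = O(N \log N)
```

There are log2(N) levels of division, and the merges on each level touch all N elements once, which gives the N*logN term.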

Counting sort

Counting sort is fundamentally different from the sorts introduced earlier. It is a non-comparison sort, whereas the sorts introduced before are comparison sorts, which order elements by comparing their values.

The idea of counting sort

Map each value to a slot in a counting array relative to the minimum value of the array (relative mapping), and count how many times each value occurs. Then traverse the counting array and write the mapped values back to the original array in order to achieve the sorting function.

Implementation of counting sort

First, traverse the array to find the maximum and minimum values, which give the value range of the array. Allocate a counting array with a length of maximum value - minimum value + 1, and count each value at index value - minimum value (the relative mapping). Then traverse the counting array and write each index + minimum value back to the original array as many times as it was counted.
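
The steps above can be sketched in C as follows. The name counting_sort and the -1 return on allocation failure are my own conventions, and the sketch assumes the value range is small enough for the counting array to fit in memory.

```c
#include <stdlib.h>

/* Counting sort with relative mapping (index = value - min).
 * Returns 0 on success, -1 if the counting array cannot be allocated.
 * Assumes max - min + 1 is small enough to allocate. */
int counting_sort(int *arr, int n)
{
    if (n < 2)
        return 0;

    /* Find the minimum and maximum to get the value range. */
    int min = arr[0], max = arr[0];
    for (int i = 1; i < n; i++)
    {
        if (arr[i] < min) min = arr[i];
        if (arr[i] > max) max = arr[i];
    }

    /* long long avoids int overflow when computing the range. */
    size_t range = (size_t)((long long)max - min + 1);
    int *count = calloc(range, sizeof(int));   /* zero-initialized */
    if (count == NULL)
        return -1;

    /* Count each value at its relative position. */
    for (int i = 0; i < n; i++)
        count[(long long)arr[i] - min]++;

    /* Walk the counting array and write the values back in order. */
    int j = 0;
    for (size_t v = 0; v < range; v++)
        while (count[v]-- > 0)
            arr[j++] = (int)((long long)v + min);

    free(count);
    return 0;
}
```

Because the index is value - min, negative values work without any special handling.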

Feature Summary

1. Counting sort only applies to integer data whose values are concentrated in a small range, so its application scenarios are relatively limited. Absolute mapping (using the value itself as the index) is not suitable: it fails for negative integers, and when the values are large but close together it wastes a lot of space. So it is more appropriate to use relative mapping.
2. The time complexity is O(N+range), and the space complexity is O(range).

Origin blog.csdn.net/m0_71927622/article/details/131395632