Data structures 5: Sorting

1 Simple sorting (bubble, insertion)

Once the number of elements to sort exceeds about 10,000, efficiency becomes very important.

1.1 Prerequisite

The elements to be sorted are stored in an array A[], and N is the number of elements to sort.
Internal sorting: we assume memory is large enough to hold all the data, so every element is loaded into memory and the whole sort is done there.

1.2 Bubble sort

The goal is to move the smallest bubble to the top and the largest to the bottom. Compare the first two bubbles (the 0th and the 1st) from the top. If the smaller one is on top and the larger one is below, do nothing; otherwise swap them. Then move down and compare the next adjacent pair (the 1st and the 2nd), and so on to the end. That completes the first pass.
After the first pass, the largest element is guaranteed to be at the bottom. The second pass repeats the same procedure on the remaining N-1 elements.
With some luck the array is already in order after some pass, which shows up as a pass in which no swap is performed (the swap function is never called). The sort can stop right there.
Best case: the input is already sorted, so a single scan is enough, O(N).
Worst case: ascending order is wanted but the input is given in descending order, O(N^2).
Advantages:
1. Simple.
2. It works on linked lists as well as arrays (some other sorting algorithms cannot do this).
3. Swapping only when one element is strictly greater than the next keeps the algorithm stable.
Disadvantage: the worst-case time complexity of O(N^2) is unacceptable for large N.
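The passes described above can be sketched as follows. This is a minimal illustration in Python, not the code from the original figures; the function name bubble_sort and the flag name swapped are chosen here for illustration.

```python
def bubble_sort(a):
    """Sort list a in place in ascending order and return it."""
    n = len(a)
    for p in range(n - 1, 0, -1):      # after this pass a[p] holds the largest of a[0..p]
        swapped = False
        for i in range(p):
            if a[i] > a[i + 1]:        # strictly greater, so equal keys keep their order
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        if not swapped:                # a whole pass without a swap: already sorted
            break
    return a
```

On already sorted input the outer loop runs once and stops, which is the best case mentioned above.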

1.3 Insertion sort

Think of arranging a hand of playing cards.
Assume the 0th card is already in your hand at the beginning. For each new card, compare from back to front: every card larger than the new one moves back by one position, and the new card goes into the gap that opens up.
Advantages:
1. Simple.
2. A swap in bubble sort takes three steps, while each move in insertion sort takes only one, so the inner step is cheaper.
3. Stable.
Disadvantage: like bubble sort, the worst case (input in reverse order) is still O(N^2).
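The card analogy can be sketched as below. This is an illustrative Python version, assuming ascending order; the names insertion_sort and tmp are not taken from the original figures.

```python
def insertion_sort(a):
    """Sort list a in place in ascending order and return it."""
    for p in range(1, len(a)):         # card 0 is already in hand
        tmp = a[p]                     # draw the next card
        i = p
        while i > 0 and a[i - 1] > tmp:
            a[i] = a[i - 1]            # shift the larger card back by one position
            i -= 1
        a[i] = tmp                     # drop the new card into the gap
    return a
```

Each shift is a single assignment, which is the "one step instead of three" advantage over a bubble sort swap.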

1.4 Lower bound of time complexity

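An inversion is a pair of positions i < j with A[i] > A[j].
In the example, bubble sort and insertion sort both perform 9 swaps, and each swap of two adjacent elements eliminates exactly one inversion. With I inversions, insertion sort therefore runs in O(N + I).
If the sequence is basically ordered, that is, I is small, the sort will be very fast.
N distinct elements have N(N-1)/4 inversions on average, so any algorithm that sorts only by swapping adjacent elements has an average time complexity of Omega(N^2), where Omega denotes a lower bound. To do better, each swap has to eliminate more than one inversion, which means swapping elements that are far apart.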
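The link between inversions and element moves can be checked with a short experiment. The sample list below is an assumption: it was picked because it contains exactly 9 inversions, matching the count quoted in the text, and may not be the list shown in the original figure. The helper names are illustrative.

```python
def count_inversions(a):
    """Brute-force count of pairs i < j with a[i] > a[j]."""
    n = len(a)
    return sum(1 for i in range(n) for j in range(i + 1, n) if a[i] > a[j])


def insertion_sort_moves(a):
    """Insertion sort on a copy; returns (sorted list, number of shifts)."""
    a = list(a)
    moves = 0
    for p in range(1, len(a)):
        tmp = a[p]
        i = p
        while i > 0 and a[i - 1] > tmp:
            a[i] = a[i - 1]            # each shift removes exactly one inversion
            i -= 1
            moves += 1
        a[i] = tmp
    return a, moves


sample = [34, 8, 64, 51, 32, 21]       # assumed example with 9 inversions
```

For this sample both count_inversions(sample) and the shift count returned by insertion_sort_moves(sample) come out as 9.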

2 Shell sort

Shell sort keeps the simplicity of insertion sort but overcomes its weakness of eliminating only one inversion per swap.
Perform an a-interval sort first (sort the elements that are a positions apart), then a b-interval sort, then a c-interval sort, and so on, finishing with a 1-interval sort, where a > b > c > ... > 1.
A very important property: after sorting with a smaller interval, the sequence is still sorted with respect to the previous, larger interval.
Plain insertion sort starts with the 0th card already in hand and compares cards 1 position apart. A D-interval pass starts at the D-th card and compares cards D positions apart.
With Shell's original increments (N/2, N/4, ..., 1) the worst case is Theta(N^2); Theta means the bound is both an upper and a lower bound.
For tens of thousands of elements, Shell sort with the Sedgewick increment sequence performs well in practice.
Shell sort is unstable, because elements move in jumps across the array.
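A minimal Shell sort sketch follows. It assumes Shell's original increments (N/2, N/4, ..., 1) for simplicity; the Sedgewick sequence mentioned in the text is not implemented here, and the function name is illustrative.

```python
def shell_sort(a):
    """Sort list a in place using increments n//2, n//4, ..., 1."""
    n = len(a)
    d = n // 2
    while d > 0:
        # d-interval insertion sort: same as plain insertion sort with 1 replaced by d
        for p in range(d, n):
            tmp = a[p]
            i = p
            while i >= d and a[i - d] > tmp:
                a[i] = a[i - d]        # one jump of d positions can remove several inversions
                i -= d
            a[i] = tmp
        d //= 2
    return a
```

The final pass with d = 1 is an ordinary insertion sort, so the result is always sorted; the earlier passes only reduce the work left for it.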

3 Heap sort

3.1 Selection sort

Scan the array for the smallest element and put it at the front. Then find the smallest element among the rest and put it right after the one just placed, and so on.
Swap(A[i], A[MinP]): the two elements being swapped are not necessarily adjacent. In the worst case a swap is needed on every pass, so the total number of swaps is linear, O(N).
Finding the smallest element, ScanForMin(), is itself a for loop, so both the best case and the worst case take O(N^2).
How can the smallest element be found quickly? Use a heap, which leads to heap sort.
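Selection sort as described can be sketched like this. The inner loop plays the role of ScanForMin() and the swap corresponds to Swap(A[i], A[MinP]); the Python names are illustrative, not the original code.

```python
def selection_sort(a):
    """Sort list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n - 1):
        min_p = i
        for j in range(i + 1, n):      # scan a[i..n-1] for the smallest element
            if a[j] < a[min_p]:
                min_p = j
        if min_p != i:
            a[i], a[min_p] = a[min_p], a[i]   # at most one swap per pass
    return a
```

The scan always visits every remaining element, which is why the running time is quadratic even on sorted input.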

3.2 Heap sort

Heap sort is an improvement on selection sort.

Algorithm 1

The naive approach: build a min-heap, then repeatedly remove the minimum and output the elements in order. This works, but it needs an extra array to hold the output.

Algorithm 2

Start from the original array.
Algorithm 2 adjusts the array into a max-heap rather than a min-heap, adjusting each subtree in turn.
Because the sort is ascending, the largest element d (at the root) is swapped with the last node of the heap. Once d is at the end, reduce the heap size by 1 (excluding d), because d is now in its final position, and adjust the rest back into a max-heap.
Next, c (the new root) is swapped with the last node still in the heap (that is, a), and c is cut off from the heap.
Then adjust a and b into a max-heap and do the swap once more. At that point heap sort is complete.
When heaps were introduced, A[0] did not hold an element but a sentinel. When sorting, however, A[0] does hold an element, so the children of node i are at positions 2i+1 and 2i+2.
PercDown(A, i, N): the percolate-down helper function; i is the position of the root of the subtree being adjusted, and N is the current heap size.
Note: "Question 2" in the figure should read "Algorithm 2": first adjust the array into a max-heap.
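Algorithm 2 can be sketched as follows. perc_down mirrors the PercDown(A, i, N) helper named in the text, but the body is a reconstruction under the assumption that the heap starts at index 0; it is not copied from the original figures.

```python
def perc_down(a, root, n):
    """Percolate a[root] down inside the max-heap stored in a[0..n-1]."""
    tmp = a[root]
    parent = root
    while 2 * parent + 1 < n:
        child = 2 * parent + 1                     # left child (heap starts at index 0)
        if child + 1 < n and a[child + 1] > a[child]:
            child += 1                             # pick the larger child
        if tmp >= a[child]:
            break
        a[parent] = a[child]
        parent = child
    a[parent] = tmp


def heap_sort(a):
    """Sort list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):            # build the max-heap bottom-up
        perc_down(a, i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]                # move the current maximum to its final slot
        perc_down(a, 0, end)                       # heap size shrinks to end
    return a
```

No extra array is needed, which is the improvement over Algorithm 1.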

4 Merge sort

4.1 Merging ordered subsequences

The pointers here (Aptr, Bptr, Cptr) are positions in the arrays.
Compare the element Aptr points to with the element Bptr points to, copy the smaller one to position Cptr, and advance the pointers involved.
Each element is scanned once, so the time complexity is clearly O(N).
A[] holds both sorted subsequences: the left one first, the right one second, stored next to each other.
In the last step the merged result in TmpA[] must be copied back to A[]. By then L no longer points to the start of the left subsequence, but RightEnd has not moved, so the copy runs from back to front starting at RightEnd.
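A sketch of the merge step, assuming the parameter layout suggested by the text (L, the start of the right run, and RightEnd). The Python names are illustrative; the copy-back loop runs from right_end backwards for the reason given above.

```python
def merge(a, tmp, left, right, right_end):
    """Merge sorted runs a[left..right-1] and a[right..right_end] via tmp."""
    left_end = right - 1
    num = right_end - left + 1         # total number of elements to merge
    k = left                           # write position in tmp
    while left <= left_end and right <= right_end:
        if a[left] <= a[right]:        # <= keeps equal keys in order (stable)
            tmp[k] = a[left]
            left += 1
        else:
            tmp[k] = a[right]
            right += 1
        k += 1
    while left <= left_end:            # copy what remains of the left run
        tmp[k] = a[left]
        left += 1
        k += 1
    while right <= right_end:          # copy what remains of the right run
        tmp[k] = a[right]
        right += 1
        k += 1
    for _ in range(num):               # left has moved, so copy back from right_end
        a[right_end] = tmp[right_end]
        right_end -= 1
```

Usage: with a = [1, 13, 24, 26, 2, 15, 27, 38] and tmp = [0] * 8, calling merge(a, tmp, 0, 4, 7) leaves a fully sorted.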

4.2 Recursive Algorithm

Merge sort has no separate best, worst, or average case: the time complexity is always O(N log N).
Why is it better to declare TmpA once in Merge_sort() instead of inside the Merge function?
If it is declared in advance: when the first segment of A is sorted, the matching part of the Tmp array is used for merging, and the result is then copied back up into A. The space of the Tmp array is reused again and again.
If it is not declared in advance: every call allocates and releases its own temporary array, so the sort performs a lot of malloc and free calls.
So it is more cost-effective to allocate one array up front and pass a pointer to it into each call.
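The recursive version with a single shared temporary array might look like this. It is a Python sketch: Python has no malloc and free, so the one-time allocation of tmp stands in for the single TmpA allocation discussed above. The names msort and merge_sort are illustrative.

```python
def merge(a, tmp, left, right, right_end):
    """Merge a[left..right-1] and a[right..right_end] via tmp, result back in a."""
    start, k = left, left
    left_end = right - 1
    while left <= left_end and right <= right_end:
        if a[left] <= a[right]:
            tmp[k] = a[left]
            left += 1
        else:
            tmp[k] = a[right]
            right += 1
        k += 1
    while left <= left_end:
        tmp[k] = a[left]
        left += 1
        k += 1
    while right <= right_end:
        tmp[k] = a[right]
        right += 1
        k += 1
    a[start:right_end + 1] = tmp[start:right_end + 1]


def msort(a, tmp, left, right_end):
    """Recursively sort a[left..right_end], reusing the same tmp everywhere."""
    if left < right_end:
        center = (left + right_end) // 2
        msort(a, tmp, left, center)
        msort(a, tmp, center + 1, right_end)
        merge(a, tmp, left, center + 1, right_end)


def merge_sort(a):
    """Unified entry point: allocate the temporary array exactly once."""
    tmp = [None] * len(a)
    msort(a, tmp, 0, len(a) - 1)
    return a
```

Every recursive call only touches the slice of tmp that matches its own range, so one allocation serves the whole sort.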

4.3 Non-recursive algorithm

Recursion uses the system stack, and the extra bookkeeping makes the recursive version slower.
The non-recursive version merges adjacent elements in pairs into sorted subsequences, then merges adjacent subsequences again, until one complete sorted sequence is produced. Keeping a separate array for every one of the log N levels would take a lot of space.
The minimum extra space is O(N): only one temporary array is needed. Merge A into the temporary array, then merge the temporary array back into A on the next pass, and so on. The final result may end up in A, or it may end up in the temporary array.
In the for loop of one pass, i <= N - 2*length, so the loop only goes as far as the second-to-last pair of subsequences. The last pair may not have the same length as the earlier ones (or only one subsequence may be left), so it needs special handling.
Merge() copies the Tmp array back to A at the end, but that is not wanted here, so Merge1 is used instead.
Merge1(A, TmpA, i, i+length, i+2*length-1):

  • i: the starting position of the first segment
  • i+length: the starting position of the second segment
  • i+2*length-1: the end position of the second segment

Both the best and the worst time complexity of merge sort are O(N log N), but it requires additional space.
When all the data is already in memory, merge sort is usually not used. It is very useful for external sorting.
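A non-recursive sketch under the scheme described above. merge1 leaves its result in the destination array, as the text says Merge1 does; running two passes per loop iteration is one way (an assumption of this sketch) to guarantee the final result lands back in a. All names are illustrative.

```python
def merge1(src, dst, left, right, right_end):
    """Merge src[left..right-1] and src[right..right_end] into dst, no copy back."""
    left_end = right - 1
    k = left
    while left <= left_end and right <= right_end:
        if src[left] <= src[right]:
            dst[k] = src[left]
            left += 1
        else:
            dst[k] = src[right]
            right += 1
        k += 1
    while left <= left_end:
        dst[k] = src[left]
        left += 1
        k += 1
    while right <= right_end:
        dst[k] = src[right]
        right += 1
        k += 1


def merge_pass(src, dst, n, length):
    """One pass: merge adjacent runs of the given length from src into dst."""
    i = 0
    while i <= n - 2 * length:                 # all complete pairs of runs
        merge1(src, dst, i, i + length, i + 2 * length - 1)
        i += 2 * length
    if i + length < n:                         # two runs left, the last one shorter
        merge1(src, dst, i, i + length, n - 1)
    else:                                      # only one run left: copy it over
        for j in range(i, n):
            dst[j] = src[j]


def merge_sort_iter(a):
    n = len(a)
    tmp = [None] * n                           # the single O(N) temporary array
    length = 1
    while length < n:
        merge_pass(a, tmp, n, length)          # a -> tmp
        length *= 2
        merge_pass(tmp, a, n, length)          # tmp -> a, so the result ends in a
        length *= 2
    return a
```

If the second pass of an iteration has nothing left to merge, it simply copies tmp back into a.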

5 Quick sort

5.1 Algorithm overview

Quick sort is the fastest sorting algorithm in practice, but it still has a worst case. If you write it yourself and get some details wrong, the result can easily stop being quick.
Like merge sort, quick sort uses divide and conquer.
Choose a pivot, put the elements larger than it on the right and the smaller ones on the left, then sort the left part and the right part recursively.
When does the recursion end? When only one element is left, that part is done.
Questions:
1. How to choose the pivot.
2. How to partition into subsets.
If these two problems are not solved well, quick sort will not be quick.
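The divide-and-conquer idea alone can be sketched in a few lines. This version is deliberately naive: it is not in place, and it uses the first element as the pivot, which is an assumption made for brevity and a poor choice on sorted input.

```python
def quick_sort_simple(items):
    """Return a new sorted list; illustrates the idea only."""
    if len(items) <= 1:                # zero or one element: nothing to do
        return list(items)
    pivot = items[0]                   # naive pivot choice, for illustration only
    smaller = [x for x in items[1:] if x < pivot]
    larger = [x for x in items[1:] if x >= pivot]
    return quick_sort_simple(smaller) + [pivot] + quick_sort_simple(larger)
```

The two questions listed above are exactly what this sketch glosses over: the pivot choice and an efficient in-place partition.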

5.2 Choosing the pivot

The pivot is the median of three elements: the leftmost, the rightmost, and the middle one. After the three are compared and swapped, a[left] holds the minimum of the three and a[right] holds the maximum, so these two are already on the correct sides of the pivot. Then the median (that is, a[center]) is moved to a[right-1]. This leaves three fewer elements to consider: only a[left+1] through a[right-2] need to be partitioned.
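Median-of-three selection as described can be sketched like this. The function name median3 and the sample list are assumptions for illustration; the range must contain at least two elements.

```python
def median3(a, left, right):
    """Order a[left], a[center], a[right], hide the median at a[right-1], return it."""
    center = (left + right) // 2
    if a[left] > a[center]:
        a[left], a[center] = a[center], a[left]
    if a[left] > a[right]:
        a[left], a[right] = a[right], a[left]
    if a[center] > a[right]:
        a[center], a[right] = a[right], a[center]
    # now a[left] <= a[center] <= a[right]
    a[center], a[right - 1] = a[right - 1], a[center]   # park the pivot at right-1
    return a[right - 1]
```

For the assumed sample [8, 1, 4, 9, 0, 3, 5, 2, 7, 6], the three candidates are 8, 0 and 6, so the pivot is 6, with 0 left at a[left] and 8 at a[right].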

5.3 Subset division

The range to partition is a[left+1] through a[right-2].
In the example, 6 is the chosen pivot. The goal is to make every element on the left < 6 and every element on the right > 6.
While i points to an element < 6, i moves to the right; while j points to an element > 6, j moves to the left.
When i points to an element > 6 and j points to an element < 6, the two elements are swapped.
After the swap, i moves right again until it points to 9, which is > 6. Then j starts to move left until it points to an element < 6. These two elements are swapped.
This continues until i and j cross. The pivot is then swapped with the element at i, which puts it in its final position.
Why is quick sort fast? Because each partition puts the pivot in its final position in one go, and it never moves again. In insertion sort, by contrast, an element's position after a pass is only temporary.
What if an element is equal to the pivot?

  • Stop and swap?
  • Ignore it and keep moving the pointer?

Consider the extreme case where all values in the array are equal:

  • If you stop and swap, many pointless swaps are performed. The advantage is that every partition splits the array into two sequences of about equal size, so the recursion has complexity O(N log N).
  • If you ignore equal elements, i moves all the way to the right end and j has no chance to move, so the pivot ends up at one end. This is the bad case, and the time complexity becomes O(N^2).
  • So it is better to stop and swap.
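The effect of the two policies on an array of equal keys can be measured with a small experiment. This is a simplified sketch, not the partition from the figures: it takes the last element as the pivot (an assumption made to keep the code short) and exposes a stop_on_equal flag so both behaviours can be compared by recursion depth.

```python
def partition(a, left, right, stop_on_equal):
    """Partition a[left..right] around pivot a[right]; return the pivot's final index."""
    pivot = a[right]
    i, j = left - 1, right
    while True:
        i += 1
        while i < right and (a[i] < pivot or (a[i] == pivot and not stop_on_equal)):
            i += 1
        j -= 1
        while j > left and (a[j] > pivot or (a[j] == pivot and not stop_on_equal)):
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            break
    a[i], a[right] = a[right], a[i]    # pivot goes to its final position
    return i


def quick_sort_depth(a, left, right, stop_on_equal, depth=1):
    """Sort a[left..right] in place and return the deepest recursion level reached."""
    if left >= right:
        return depth
    p = partition(a, left, right, stop_on_equal)
    return max(quick_sort_depth(a, left, p - 1, stop_on_equal, depth + 1),
               quick_sort_depth(a, p + 1, right, stop_on_equal, depth + 1))
```

On 64 equal keys the ignoring variant recurses roughly N levels deep, because every partition leaves the pivot at the end, while the stopping variant stays near log2 N levels.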

5.4 Algorithm implementation

If an element is exactly equal to the pivot it is still swapped, which is what makes quick sort unstable.
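Putting the pieces together, an in-place quick sort might look like the sketch below. It is a reconstruction, not the code from the original figures. The CUTOFF threshold and the fallback to insertion sort on small ranges are assumptions of this sketch; they also guarantee that median3 always sees at least three elements.

```python
CUTOFF = 10   # assumed threshold: smaller ranges are handed to insertion sort


def insertion_sort_range(a, left, right):
    for p in range(left + 1, right + 1):
        tmp = a[p]
        i = p
        while i > left and a[i - 1] > tmp:
            a[i] = a[i - 1]
            i -= 1
        a[i] = tmp


def median3(a, left, right):
    center = (left + right) // 2
    if a[left] > a[center]:
        a[left], a[center] = a[center], a[left]
    if a[left] > a[right]:
        a[left], a[right] = a[right], a[left]
    if a[center] > a[right]:
        a[center], a[right] = a[right], a[center]
    a[center], a[right - 1] = a[right - 1], a[center]   # pivot parked at right-1
    return a[right - 1]


def qsort(a, left, right):
    if right - left + 1 <= CUTOFF:
        insertion_sort_range(a, left, right)
        return
    pivot = median3(a, left, right)
    i, j = left, right - 1
    while True:
        i += 1
        while a[i] < pivot:            # stops on equal keys; a[right-1] is a sentinel
            i += 1
        j -= 1
        while a[j] > pivot:            # stops on equal keys; a[left] is a sentinel
            j -= 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            break
    a[i], a[right - 1] = a[right - 1], a[i]   # pivot into its final position
    qsort(a, left, i - 1)
    qsort(a, i + 1, right)


def quick_sort(a):
    """Unified entry point: sort list a in place and return it."""
    qsort(a, 0, len(a) - 1)
    return a
```

Both scanning loops use strict comparisons, so they stop and swap on keys equal to the pivot, which is the policy recommended in 5.3.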

6 Table sorting

6.1 Algorithm overview

If what you are sorting is not a simple element but a large structure, the time needed to move the structure cannot be ignored.
All of the algorithms above have to move or swap the elements themselves.
Table sort moves only pointers: keep a table of pointers to the records, compare the keys, and instead of swapping whole records swap only the entries of table.
After an insertion sort carried out on the pointers, table lists the records in sorted order: A[table[0]], A[table[1]], ... is the sorted sequence.
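Table sort can be sketched as an insertion sort over indices. The sample keys below are an assumption: they were chosen so that the resulting table has table[0] = 3 and table[3] = 1, consistent with the values quoted in 6.2, and may differ from the original figure.

```python
def table_sort(records, key):
    """Return a table of indices such that records[table[k]] is in sorted order."""
    table = list(range(len(records)))          # table[i] = i initially
    for p in range(1, len(table)):
        tmp = table[p]
        i = p
        # compare keys, but move only the small index entries
        while i > 0 and key(records[table[i - 1]]) > key(records[tmp]):
            table[i] = table[i - 1]
            i -= 1
        table[i] = tmp
    return table


# assumed sample: (key, payload) records; the payload stands in for a large structure
records = [("f", "rec0"), ("d", "rec1"), ("c", "rec2"), ("a", "rec3"),
           ("g", "rec4"), ("b", "rec5"), ("h", "rec6"), ("e", "rec7")]
table = table_sort(records, key=lambda r: r[0])
```

The records list itself is never modified; only table changes.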

6.2 Physical ordering

Sometimes the structures themselves must be physically put in order.
From the final table values, determine the rings (cycles): a permutation of N numbers is made up of several independent rings.
For example: table[0] = 3, so table[3] is put into the ring; table[3] = 1, so table[1] is put into the ring; and so on until the ring closes.
Swapping 2 records takes 3 moves, so the worst case is N/2 rings of two elements each, which needs about 3N/2 record moves.
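Following the rings to rearrange the records can be sketched like this. The function name, the move counter, and the sample data are assumptions for illustration; the sample table matches table[0] = 3 and table[3] = 1 from the text.

```python
def physical_sort(records, table):
    """Rearrange records in place according to table; return the number of record moves."""
    table = list(table)                        # work on a copy of the table
    moves = 0
    for start in range(len(table)):
        if table[start] == start:              # already in place (or ring already done)
            continue
        tmp = records[start]                   # lift the first record out of the ring
        moves += 1
        i = start
        while table[i] != start:
            nxt = table[i]
            records[i] = records[nxt]          # pull the correct record into slot i
            moves += 1
            table[i] = i                       # mark slot i as finished
            i = nxt
        records[i] = tmp                       # close the ring
        moves += 1
        table[i] = i
    return moves


keys = list("fdcagbhe")                        # assumed sample records (keys only)
moves = physical_sort(keys, [3, 5, 2, 1, 7, 0, 4, 6])
```

A ring of k records costs k + 1 moves, which is where the figure of 3 moves for a ring of 2 comes from.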

7 Radix sort

7.1 Bucket sort

Inserting the N student scores takes O(N), and scanning all M buckets for output takes O(M), so the total is O(M + N).
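A bucket sort sketch for the student-score setting. It assumes integer scores between 0 and 100, so M = 101 buckets; that range is an assumption of this example, not something stated in the text.

```python
def bucket_sort(scores, m=101):
    """Sort integer scores in the range 0..m-1 using one bucket per possible value."""
    buckets = [[] for _ in range(m)]
    for s in scores:                   # N insertions
        buckets[s].append(s)
    result = []
    for bucket in buckets:             # scan all M buckets in order
        result.extend(bucket)
    return result
```

When M is much larger than N, scanning the buckets dominates, which motivates radix sort.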

7.2 Radix sort

A total of P passes are executed. Each pass distributes the N elements and visits the B buckets, so the total is O(P(N + B)).
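A least-significant-digit radix sort sketch. It assumes non-negative integers and base B = 10; the function name and the sample values in the usage note are illustrative.

```python
def radix_sort(nums, base=10):
    """LSD radix sort for non-negative integers; returns a new sorted list."""
    result = list(nums)
    largest = max(result, default=0)
    digit = 1
    while largest // digit > 0:                    # one pass per digit: P passes
        buckets = [[] for _ in range(base)]        # B buckets
        for x in result:                           # distribute N elements
            buckets[(x // digit) % base].append(x)
        result = [x for bucket in buckets for x in bucket]   # collect in bucket order
        digit *= base
    return result
```

Usage: radix_sort([64, 8, 216, 512, 27, 729, 0, 1, 343, 125]) needs three passes, one for each decimal digit of the largest value.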

7.3 Sorting by multiple keys

For playing cards, sorting by the least significant key first (LSD) is smarter than sorting by the most significant key first (MSD).
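An LSD sort on two keys can be sketched with two bucket passes. The card representation (suit, rank), the suit order, and treating the suit as the more significant key are all assumptions of this example.

```python
SUITS = ["clubs", "diamonds", "hearts", "spades"]   # assumed suit order, lowest first


def sort_cards_lsd(cards):
    """Sort (suit, rank) cards by suit, then rank, handling the rank key first."""
    rank_buckets = [[] for _ in range(14)]          # ranks 1..13; index 0 unused
    for card in cards:                              # pass 1: least significant key
        rank_buckets[card[1]].append(card)
    suit_buckets = {suit: [] for suit in SUITS}
    for bucket in rank_buckets:                     # pass 2: most significant key
        for card in bucket:
            suit_buckets[card[0]].append(card)      # appending keeps the rank order
    return [card for suit in SUITS for card in suit_buckets[suit]]
```

Because the second pass preserves the order produced by the first, no recursive sorting inside each suit is needed, which is what makes LSD the simpler choice here.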

8 Comparison of sorting algorithms

Selection sort swaps elements that are far apart, so it is unstable.
Bubble sort and straight insertion sort only swap adjacent elements, so they are stable. None of these needs extra space.
2. Straight insertion sort: if the input is in reverse order, then before the last pass no element is in its final position, because the last number inserted is the smallest; it must move to the first position and all the other numbers must move back.
3. Heap sort adjusts the data into a min-heap first, so its efficiency is not bad even when the input is basically ordered.
4. Bubble sort: after 2 passes, 3 could not be in that position. Insertion sort: after 2 is inserted, the front becomes 2, 3. Selection sort: the first number would have to be the smallest value.
Selection sort and heap sort can both produce the top 10 numbers without sorting everything. Selection sort, however, has to traverse all the remaining numbers to find each maximum, which is time-consuming.
(The code has not been written yet.)
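Getting only the largest few elements with a heap can be sketched with Python's standard heapq module. heapq provides a min-heap, so the values are negated here to simulate a max-heap; the function name is illustrative.

```python
import heapq


def top_k_largest(data, k=10):
    """Return the k largest values in descending order without a full sort."""
    heap = [-x for x in data]          # negate so the min-heap behaves as a max-heap
    heapq.heapify(heap)                # build the heap in O(N)
    count = min(k, len(heap))
    return [-heapq.heappop(heap) for _ in range(count)]   # k pops, O(log N) each
```

This costs O(N + k log N), whereas k passes of selection sort cost O(kN).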

Origin blog.csdn.net/qq_42713936/article/details/106045220