8 Sorts and 3 Searches That Programmers Must Know

Every day you chase new technologies and frameworks, but do you realize you are also being confused by them? .NET, XML and the like are tempting, but if your foundation is not solid, it is like walking in clouds and fog: you can see only what is directly in front of you, not farther. These new technologies cover up many underlying principles. If you want to truly master technology, you should come down out of the clouds and learn the fundamentals solidly. With that foundation, picking up new technologies becomes easy.

 

Writing excellent code also requires a solid foundation. If you have not learned sorting and searching algorithms well, how can you optimize the performance of your programs? Without further ado: the sorting algorithms introduced in this article are the foundation of the foundation, and programmers must know them!




1. Direct insertion sort

(1) Basic idea: in a group of numbers to be sorted, assume the first (n-1) numbers (n >= 2) are already in order; now insert the nth number into the ordered part in front of it, so that all n numbers are in order. Repeat this until the whole sequence is ordered.

(2) Examples
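The idea above can be sketched in Python (the article itself shows no code, so the function name and in-place style here are my own choices):

```python
def insertion_sort(a):
    """Sort list a in place by direct insertion and return it."""
    for i in range(1, len(a)):
        key = a[i]              # the nth number, to be inserted into the ordered prefix
        j = i - 1
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]     # shift larger elements one position to the right
            j -= 1
        a[j + 1] = key          # drop the key into its place
    return a
```

For example, `insertion_sort([46, 79, 56, 38, 40, 84])` yields `[38, 40, 46, 56, 79, 84]`.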




2. Shell sort (also known as diminishing increment sort)

 

(1) Basic idea: the algorithm first divides the numbers to be sorted into several groups by some increment d (d = n/2, where n is the number of elements to be sorted); the subscripts of the records in each group differ by d. Insertion sort is applied to all elements within each group, then the numbers are regrouped by a smaller increment (d/2) and insertion-sorted within each group again. When the increment has been reduced to 1, a final direct insertion sort completes the ordering.

(2) Examples:
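A minimal Python sketch of the scheme above, halving the increment each round as described (the function name is mine):

```python
def shell_sort(a):
    """Sort list a in place by Shell sort with increments n/2, n/4, ..., 1."""
    gap = len(a) // 2
    while gap > 0:
        # insertion sort within each group of elements gap apart
        for i in range(gap, len(a)):
            key = a[i]
            j = i - gap
            while j >= 0 and a[j] > key:
                a[j + gap] = a[j]
                j -= gap
            a[j + gap] = key
        gap //= 2               # shrink the increment; the last pass has gap = 1
    return a
```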

 

3. Simple selection sort

 

(1) Basic idea: in a group of numbers to be sorted, select the smallest number and exchange it with the number in the first position; then find the smallest among the remaining numbers and exchange it with the number in the second position, and so on, until the second-to-last number is compared with the last.

(2) Examples:
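A short Python sketch of simple selection sort as described (names are my own):

```python
def selection_sort(a):
    """Sort list a in place by simple selection sort."""
    n = len(a)
    for i in range(n - 1):
        m = i                       # index of the smallest number seen so far
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]     # exchange it with the number in position i
    return a
```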

 

4. Heap sort

 

(1) Basic idea: heap sort is a tree-shaped selection sort and an effective improvement on direct selection sort.

A heap is defined as follows: a sequence of n elements (h1, h2, ..., hn) is called a heap if and only if (hi >= h2i and hi >= h2i+1) or (hi <= h2i and hi <= h2i+1) for i = 1, 2, ..., n/2. Only heaps satisfying the former condition are discussed here. From the definition, the top element of the heap (i.e. the first element) must be the largest item (a big-top heap). A complete binary tree represents the structure of a heap very intuitively: the top of the heap is the root, and the rest form its left and right subtrees. Initially, the sequence of numbers to be sorted is regarded as a complete binary tree stored sequentially, and its storage order is adjusted to make it a heap; at this point the root node holds the largest number. Then the root node is swapped with the last node of the heap, and the first (n-1) numbers are readjusted into a heap. This continues until only two nodes remain in the heap; they are swapped, and finally an ordered sequence of n nodes is obtained. From this description, heap sorting requires two processes: one builds the heap, and the other swaps the top of the heap with the last element of the heap. So heap sort consists of two functions: a percolation (sift-down) function used to build the heap, and a function that repeatedly calls the percolation function to carry out the sort.

(2) Examples:

Initial sequence: 46,79,56,38,40,84

Build heap:

Swap the top with the last node, kicking the maximum number out of the heap.

The remaining nodes are built into a heap again, then swapped to kick out the new maximum.

And so on: the last two nodes remaining in the heap are swapped, one is kicked out, and the sort is complete.
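The two functions described above can be sketched in Python, using 0-based indexing (children of node i are 2i+1 and 2i+2) rather than the 1-based indexing of the definition; names are mine:

```python
def sift_down(a, root, end):
    """Percolate a[root] down so the subtree rooted there (up to end) is a big-top heap."""
    while 2 * root + 1 <= end:
        child = 2 * root + 1
        if child + 1 <= end and a[child + 1] > a[child]:
            child += 1                      # pick the larger child
        if a[root] >= a[child]:
            break                           # heap property already holds
        a[root], a[child] = a[child], a[root]
        root = child

def heap_sort(a):
    """Sort list a in place: build a big-top heap, then repeatedly kick out the max."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):     # build the heap from the last parent up
        sift_down(a, i, n - 1)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]         # swap the top with the last node
        sift_down(a, 0, end - 1)            # re-heap the remaining nodes
    return a
```

Running `heap_sort([46, 79, 56, 38, 40, 84])` on the initial sequence above gives `[38, 40, 46, 56, 79, 84]`.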

5. Bubble sort

 

(1) Basic idea: in the range of numbers not yet sorted, compare and adjust each pair of adjacent numbers from top to bottom, so that larger numbers sink and smaller ones rise. That is, whenever two adjacent numbers are compared and their order is found to be opposite to the required order, they are exchanged.

(2) Examples:
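A Python sketch of bubble sort as described, with an early exit when a pass makes no exchanges (that small optimization is my addition):

```python
def bubble_sort(a):
    """Sort list a in place by bubble sort."""
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):          # the last i elements are already in place
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]   # larger number sinks
                swapped = True
        if not swapped:                     # no exchange means already sorted
            break
    return a
```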





 

6. Quick Sort

(1) Basic idea: select a reference (pivot) element, usually the first or the last element. One scan divides the sequence to be sorted into two parts: one smaller than the pivot, the other greater than or equal to it. The pivot is then in its correct sorted position. Recursively sort the two divided parts in the same way.

(2) Examples:

 

In the figure above, the sequence to be sorted is divided into two parts, one smaller than the pivot and the other larger; the same process is then repeated on each of the two parts.

(This is just one implementation of quick sort, which I personally find easier to understand.)
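One possible Python sketch of the variant described, taking the first element as the pivot and filling it into its final position after the scan (names are mine):

```python
def quick_sort(a, low=0, high=None):
    """Sort list a in place by quick sort with the first element as pivot."""
    if high is None:
        high = len(a) - 1
    if low < high:
        pivot = a[low]                      # reference element
        i, j = low, high
        while i < j:
            while i < j and a[j] >= pivot:  # find an element smaller than the pivot
                j -= 1
            a[i] = a[j]
            while i < j and a[i] <= pivot:  # find an element larger than the pivot
                i += 1
            a[j] = a[i]
        a[i] = pivot                        # pivot lands in its correct position
        quick_sort(a, low, i - 1)           # recurse on the smaller part
        quick_sort(a, i + 1, high)          # recurse on the larger part
    return a
```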

 

7. Merge sort

 

(1) Basic idea: merge sort merges two (or more) ordered lists into one new ordered list. That is, the sequence to be sorted is divided into several subsequences, each of which is ordered; the ordered subsequences are then merged into an overall ordered sequence.

(2) Examples:
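A Python sketch of top-down merge sort as described; taking the left element on ties is what keeps it stable (names are mine):

```python
def merge_sort(a):
    """Return a new sorted list built by recursively splitting and merging a."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = merge_sort(a[:mid])
    right = merge_sort(a[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:             # <= keeps equal elements in order (stable)
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]    # append the leftover tail
```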


 

8. Radix sort

 

(1) Basic idea: pad all the values (positive integers) to be compared to the same number of digits, filling the shorter numbers with leading zeros. Then, starting from the lowest digit, sort by each digit in turn. After sorting from the lowest digit up through the highest, the sequence is ordered.

(2) Examples:
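A Python sketch of least-significant-digit radix sort for non-negative integers, distributing into ten buckets per digit and collecting in order (names are mine):

```python
def radix_sort(a):
    """Return a sorted copy of a (non-negative integers) by LSD radix sort, base 10."""
    exp = 1
    while a and max(a) // exp > 0:
        buckets = [[] for _ in range(10)]
        for x in a:
            buckets[x // exp % 10].append(x)   # distribute by the current digit
        a = [x for b in buckets for x in b]    # collect, preserving order (stable)
        exp *= 10                              # move to the next higher digit
    return a
```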


Stability note: a sort is stable if any 2 (or more) equal numbers keep the same relative order in the sequence after sorting as they had before sorting.

 

Example:

Sequence to be sorted: 5, 4, 8(a), 6, 1, 8(b), 7, 9 (the two equal 8s are marked to tell them apart)

Stable result: 1, 4, 5, 6, 7, 8(a), 8(b), 9

Unstable result: 1, 4, 5, 6, 7, 8(b), 8(a), 9

Explanation: compare the positions of 8(a) and 8(b) before and after sorting. Before sorting, 8(a) is in front of 8(b). If 8(a) is still in front of 8(b) after sorting, the sorting algorithm is stable; otherwise it is unstable.

 

Now we analyze the stability of the eight sorting algorithms.

(Please combine this with the basic ideas of the sorts above to understand their stability.)

(1) Direct insertion sort: in general, the comparison starts from the last element of the ordered sequence; if the new element is larger, it is inserted directly behind it, otherwise comparison continues forward. If an equal element is found, the new element is inserted after it. So insertion sort is stable.

(2) Shell sort: Shell sort performs insertion sort on elements according to different increments. A single insertion sort is stable and does not change the relative order of equal elements, but across the different insertion-sort passes, equal elements may each move within their own groups, and stability is broken. So Shell sort is not stable.

(3) Simple selection sort: in one pass, if a smaller element appears after an element that is equal to the current element, the exchange destroys stability. That may sound vague, so here is a small example: for 8 5 8 4 1 0, the first scan swaps the first element 8 with 0, so the relative order of the two 8s no longer matches the original sequence. So selection sort is not stable.

(4) Heap sort: heap sorting selects the largest (big-top heap) or smallest (small-top heap) element from a parent node (the n/2th) and its children, and a selection among those three elements by itself does not break stability. But while selecting for the parent nodes n/2-1, n/2-2, ..., it is possible that the n/2th parent swaps with a later equal element while the (n/2-1)th parent does not swap with its own equal later element, so the relative order of equal elements can change. So heap sort is not stable.

(5) Bubble sort: as seen above, bubble sort compares two adjacent elements, and exchanges also happen only between those two elements. If the two elements are equal, no exchange is needed. So bubble sort is stable.

(6) Quick sort: when the pivot element is exchanged with an element inside the sequence, it can easily disturb the order of earlier equal elements. A small example: for 6 4 4 5 4 7 8 9, the first pass exchanges the pivot 6 with the third 4, destroying the original order of the 4s. So quick sort is not stable.

(7) Merge sort: in the decomposed subsequences of 1 or 2 elements, a single element is never exchanged, and 2 equal elements need not be exchanged. When merging, if the two current elements are equal, the element from the earlier subsequence is placed first in the result. So merge sort is also stable.

(8) Radix sort: sort by the low-order digit first, then collect; then sort by the next higher digit and collect; and so on, up to the highest digit. Sometimes attributes have a priority order: sort first by the low-priority attribute, then by the high-priority one, and the final order puts higher priority first, with equal high-priority elements ordered by lower priority. Radix sort is based on sorting and collecting separately per digit, so it is stable.

The classification, stability, time complexity and space complexity of the 8 sorts are summarized (n is the number of elements; for radix sort, d is the number of digits and r the radix):

Sort               Category    Stable   Average time   Worst time    Space
Direct insertion   Insertion   Yes      O(n^2)         O(n^2)        O(1)
Shell sort         Insertion   No       O(n^1.3)       O(n^2)        O(1)
Simple selection   Selection   No       O(n^2)         O(n^2)        O(1)
Heap sort          Selection   No       O(n log n)     O(n log n)    O(1)
Bubble sort        Exchange    Yes      O(n^2)         O(n^2)        O(1)
Quick sort         Exchange    No       O(n log n)     O(n^2)        O(log n)
Merge sort         Merge       Yes     O(n log n)     O(n log n)    O(n)
Radix sort         Radix       Yes      O(d(n+r))      O(d(n+r))     O(r)

Three search algorithms: sequential search, binary search (half-interval search), and block search; hash tables are discussed later.

 

 

First, the basic idea of sequential search:

Starting from one end of the table, scan it sequentially, comparing each scanned node's key with the given value (call it a) in turn. If the current node's key equals a, the search succeeds; if the scan finishes and no node with key equal to a has been found, the search fails.

To put it bluntly: compare one by one from beginning to end; if you find a match you succeed, and if you reach the end without one you fail. The obvious disadvantage is low search efficiency.

It is suitable for linear tables in either sequential or linked storage.

 

 

Calculating the average search length:

For example, in a table of 16 nodes, it takes 1 comparison to find the 1st node and 2 to find the 2nd; continuing in this way, it takes 16 comparisons to find the 16th.

So we just sum these search counts (using the junior-high trapezoid formula: (top base + bottom base) × height / 2) and divide by the number of nodes; that is the average search length.

Let n = the number of nodes.

Average search length = (n+1)/2
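A Python sketch of sequential search; the function returns the index of the match, with -1 standing in for failure (names are mine):

```python
def sequential_search(table, a):
    """Scan table from one end, comparing each key with a; return its index or -1."""
    for i, key in enumerate(table):
        if key == a:
            return i        # search succeeds
    return -1               # scan is over without a match: search fails
```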

Second, the basic idea of binary search (half-interval search):

 

Premise: the table is sorted by key. Steps:

(1) Determine the midpoint of the interval: mid = (low + high) / 2, where mid is the position of the middle node of the interval, low the position of the leftmost node in the interval, and high the position of the rightmost node.

(2) Compare the value a to be found with the key of node mid (written R[mid].key below). If they are equal, the search succeeds; otherwise determine a new search interval:

If R[mid].key > a, then by the ordering of the table, everything to the right of R[mid] is also greater than a, so if a key equal to a exists, it must be in the part of the table to the left of R[mid]. Set high = mid - 1.

If R[mid].key < a, then a key equal to a, if it exists, must be in the part of the table to the right of R[mid]. Set low = mid + 1.

If R[mid].key = a, the search succeeds.

(3) The next search is for a new search interval, repeat steps (1) and (2)

(4) During the search, low gradually increases and high gradually decreases; if high < low, the search fails.
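The four steps above can be sketched in Python; the parameter names mirror the R[mid].key notation used here, and returning -1 for failure is my convention:

```python
def binary_search(r, a):
    """Search sorted list r for value a; return its index, or -1 on failure."""
    low, high = 0, len(r) - 1
    while low <= high:              # high < low means the search fails
        mid = (low + high) // 2     # midpoint of the current interval
        if r[mid] == a:
            return mid              # search succeeds
        elif r[mid] > a:
            high = mid - 1          # a can only be in the left part
        else:
            low = mid + 1           # a can only be in the right part
    return -1
```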

 

Average search length = log2(n+1) - 1

Note: although binary search is more efficient, the table must be sorted by key, and sorting itself is a very time-consuming operation. Binary search is also only suitable for sequential storage structures, and to keep the table ordered, insertions and deletions in a sequential structure must move many nodes. So binary search is especially suitable for linear tables that, once built, rarely change but frequently need to be searched.

3. The basic idea of block search:

 

Block search requires the linear table to be ordered by blocks, plus an index table (formed by extracting the largest key in each block together with the block's starting position). Since the table is ordered between blocks, the index table is an increasingly ordered table, so a sequential or binary search of the index table determines which block the target node lies in. Since the elements within a block are unordered, only sequential search can be used inside the block.

 

 

Let the table have n nodes in total, divided into b blocks, with s = n/b nodes per block.

(Binary search of the index table) Average search length = log2(n/s + 1) + s/2

(Sequential search of the index table) Average search length = (s² + 2s + n) / (2s)
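A Python sketch of the two-stage process above, using a binary search of the index table (via the standard `bisect` module) and then a sequential search inside the block; the parameter layout is my own assumption:

```python
import bisect

def block_search(table, block_max, block_start, key):
    """Block (index-sequential) search.

    table:       the full list, ordered between blocks but unordered within each block
    block_max:   the largest key of each block, in ascending order (the index table)
    block_start: the starting position of each block in table
    Returns the index of key in table, or -1 on failure.
    """
    b = bisect.bisect_left(block_max, key)   # binary-search the index table
    if b == len(block_max):
        return -1                            # key exceeds every block's maximum
    start = block_start[b]
    end = block_start[b + 1] if b + 1 < len(block_start) else len(table)
    for i in range(start, end):              # sequential search inside the block
        if table[i] == key:
            return i
    return -1
```

For example, with `table = [3, 1, 2, 6, 4, 5, 9, 7, 8]`, `block_max = [3, 6, 9]` and `block_start = [0, 3, 6]`, searching for 5 first locates the second block and then scans it sequentially.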

Note: the advantage of block search is that when inserting or deleting a record, as long as the block it belongs to is found, the insertion or deletion happens within that block (and because the block is unordered, there is no need to move many records). Its main cost is the extra auxiliary index array and the initial work of sorting the table into ordered blocks.

Its performance is between sequential search and binary search.

 

4. I have been busy recently; I will cover hash table techniques in the future. I hope you will stay tuned!

Hash table search differs from sequential, binary and block search: it does not take key comparison as the basic operation but uses direct addressing. Ideally, the target key can be found without any comparison, and the expected search time is O(1).
