Top 10 programming algorithms that programmers must know

An algorithm is a precise and complete description of a problem-solving scheme: a finite series of unambiguous instructions for solving a problem. An algorithm represents a systematic strategy for solving a class of problems, in the sense that for any valid input it produces the required output within a finite amount of time. If an algorithm is flawed, or unsuited to a particular problem, executing it will not solve that problem. Different algorithms may accomplish the same task with different amounts of time and space, and their relative merits are usually measured by time complexity and space complexity.

When executed, the instructions of an algorithm describe a computation that starts from an initial state and a (possibly empty) initial input, passes through a finite, well-defined series of states, and finally produces output and halts in a final state. The transition from one state to the next is not necessarily deterministic: some algorithms, known as randomized algorithms, incorporate random input.

Algorithm 1: Quicksort

Quicksort is a sorting algorithm developed by Tony Hoare. On average, it takes O(n log n) comparisons to sort n items; in the worst case O(n²) comparisons are required, but that case is uncommon. In practice, quicksort is usually significantly faster than other O(n log n) algorithms, because its inner loop can be implemented efficiently on most architectures.

Quicksort uses a divide and conquer strategy to divide a list into two sub-lists.

Algorithm steps:

1. Pick an element from the sequence, called the "pivot".

2. Reorder the sequence so that all elements smaller than the pivot come before it and all elements larger than the pivot come after it (equal elements can go to either side). When this pass finishes, the pivot is in its final position. This is called the partition operation.

3. Recursively sort the sub-sequence of elements smaller than the pivot and the sub-sequence of elements larger than the pivot.

The base case of the recursion is a sequence of size zero or one, which is already sorted. Although the algorithm keeps recursing, it always terminates, because each iteration puts at least one element (the pivot) into its final position.
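The three steps above can be sketched in Python. This is a minimal illustration, not an optimized implementation: it picks the middle element as the pivot and builds new lists rather than partitioning in place, trading memory for clarity.

```python
def quicksort(seq):
    """Sort a list by the pick-pivot / partition / recurse scheme."""
    if len(seq) <= 1:              # base case: size zero or one is already sorted
        return list(seq)
    pivot = seq[len(seq) // 2]     # step 1: pick a pivot
    smaller = [x for x in seq if x < pivot]   # step 2: partition
    equal   = [x for x in seq if x == pivot]
    larger  = [x for x in seq if x > pivot]
    # step 3: recursively sort both sides
    return quicksort(smaller) + equal + quicksort(larger)
```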


Algorithm 2: Heapsort

Heapsort is a sorting algorithm designed around the heap data structure. A heap is a structure that approximates a complete binary tree and satisfies the heap property: the key of each child node is always no greater than (in a max-heap) or no less than (in a min-heap) that of its parent.

The average time complexity of heap sort is Ο(nlogn).

Algorithm steps:

1. Build a heap H[0..n-1] from the array

2. Swap the head (maximum) and the tail

3. Reduce the size of the heap by 1 and call shift_down(0) to move the new top element down to its correct position

4. Repeat from step 2 until the size of the heap is 1
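The four steps can be sketched as follows; the helper name `shift_down` follows the text above. This is a bare-bones max-heap version operating in place.

```python
def heapsort(a):
    """In-place heapsort on list a; returns a for convenience."""
    n = len(a)

    def shift_down(root, end):
        # Push a[root] down until the max-heap property holds in a[:end].
        while 2 * root + 1 < end:
            child = 2 * root + 1
            # pick the larger of the two children
            if child + 1 < end and a[child] < a[child + 1]:
                child += 1
            if a[root] < a[child]:
                a[root], a[child] = a[child], a[root]
                root = child
            else:
                return

    # step 1: build the heap H[0..n-1]
    for start in range(n // 2 - 1, -1, -1):
        shift_down(start, n)
    # steps 2-4: swap head (maximum) and tail, shrink heap, shift down
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        shift_down(0, end)
    return a
```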

Algorithm 3: Merge sort

Merge sort is an efficient sorting algorithm built on the merge operation. It is a classic application of divide and conquer.

Algorithm steps:

1. Allocate a space whose size is the sum of the two sorted sequences; this space holds the merged sequence

2. Set two pointers, initially at the starting positions of the two sorted sequences

3. Compare the elements the two pointers point to, copy the smaller one into the merge space, and advance that pointer

4. Repeat step 3 until one pointer reaches the end of its sequence

5. Copy all remaining elements of the other sequence directly to the end of the merged sequence
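The five merge steps map directly onto the `merge` helper below; `merge_sort` adds the recursive splitting. A minimal sketch, with function names chosen here for illustration.

```python
def merge(left, right):
    """Merge two sorted lists into one sorted list."""
    merged = []                    # step 1: space for both sequences
    i = j = 0                      # step 2: two pointers at the start
    while i < len(left) and j < len(right):   # steps 3-4
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    merged.extend(left[i:])        # step 5: copy the remainder of whichever
    merged.extend(right[j:])       # sequence was not exhausted
    return merged

def merge_sort(seq):
    """Divide-and-conquer sort built on merge()."""
    if len(seq) <= 1:
        return list(seq)
    mid = len(seq) // 2
    return merge(merge_sort(seq[:mid]), merge_sort(seq[mid:]))
```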

Algorithm 4: Binary search

Binary search is an algorithm for finding a specific element in a sorted array. The search starts from the middle element of the array: if the middle element happens to be the target, the search ends; otherwise, depending on whether the target is greater or less than the middle element, the search continues in the upper or lower half of the array, again starting from that half's middle element. If at some step the remaining range is empty, the target is not in the array. Because each comparison halves the search range, the time complexity is O(log n).

Algorithm 5: BFPRT (linear-time selection)

The problem BFPRT solves is a classic one: selecting the k-th largest (or k-th smallest) element from a sequence of n elements. Through a clever analysis, BFPRT guarantees linear time complexity even in the worst case. Its idea is similar to that of quicksort; to keep the worst-case running time at O(n), the algorithm's five authors (Blum, Floyd, Pratt, Rivest, and Tarjan) applied some delicate processing.

Algorithm steps:

1. Divide the n elements into ⌈n/5⌉ groups of at most 5 elements each.

2. Find the median of each group, using any sorting method (e.g. insertion sort).

3. Recursively call the selection algorithm to find the median of all the medians from the previous step; call it x (with an even number of medians, take the smaller of the two middle ones).

4. Partition the array around x; let k be the number of elements less than or equal to x, so n−k elements are greater than x.

5. If i == k, return x; if i < k, recursively find the i-th smallest element among the elements smaller than x; if i > k, recursively find the (i−k)-th smallest element among the elements larger than x.

Termination condition: when n = 1, return the single remaining element, which is the i-th smallest.
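The steps above can be sketched as a recursive `select` function; here i counts from 1, and the equal-elements bookkeeping is a detail added to make duplicates work correctly.

```python
def select(seq, i):
    """Return the i-th smallest element of seq (i counts from 1)."""
    if len(seq) == 1:              # termination: n == 1
        return seq[0]
    # steps 1-2: groups of at most 5, median of each by sorting
    groups = [sorted(seq[j:j + 5]) for j in range(0, len(seq), 5)]
    medians = [g[len(g) // 2] for g in groups]
    # step 3: recursively find the median of medians, x
    # (with an even count, (len+1)//2 picks the smaller middle one)
    x = select(medians, (len(medians) + 1) // 2)
    # step 4: partition around x; k counts elements <= x
    smaller = [e for e in seq if e < x]
    equal = [e for e in seq if e == x]
    larger = [e for e in seq if e > x]
    k = len(smaller)
    # step 5: recurse into the side that contains the answer
    if i <= k:
        return select(smaller, i)
    elif i <= k + len(equal):
        return x
    else:
        return select(larger, i - k - len(equal))
```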

Algorithm 6: DFS (depth-first search)

Depth-first search (DFS) is a search algorithm. It traverses the nodes of a tree along its depth, exploring each branch as deeply as possible. When all edges of a node v have been explored, the search backtracks to the node from whose edge v was discovered. This continues until all nodes reachable from the source node have been found. If undiscovered nodes remain, one of them is chosen as a new source and the process repeats, until every node has been visited. DFS is a blind (uninformed) search.

Depth-first search is a classic algorithm in graph theory. It can generate a topological ordering of the target graph, which in turn conveniently solves many related graph-theory problems, such as the maximum path problem. A stack data structure (often the implicit call stack of recursion) is generally used to implement DFS.

Depth-first traversal graph algorithm steps:

1. Visit vertex v;

2. In turn, starting from each unvisited neighbour of v, perform a depth-first traversal of the graph, until every vertex that has a path to v has been visited;

3. If unvisited vertices remain in the graph, start from one of them and repeat the depth-first traversal until all vertices in the graph have been visited.

The above description may seem abstract, so take an example:

After visiting a starting vertex v of the graph, DFS visits any one of its adjacent vertices, w1; from w1 it visits a vertex w2 that is adjacent to w1 but not yet visited; from w2 it proceeds similarly, continuing this way until it reaches a vertex u all of whose adjacent vertices have been visited.

Then DFS steps back to the vertex visited just before u and checks whether it has other adjacent vertices that have not been visited. If so, it visits that vertex and then proceeds from it as before; if not, it steps back again. This repeats until all vertices of the connected graph have been visited.
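The traversal (including the implicit backtracking) falls out naturally from recursion, since the call stack plays the role of the stack mentioned above. A minimal sketch on an adjacency-list graph:

```python
def dfs(graph, v, visited=None):
    """Depth-first traversal from v; returns vertices in visit order.

    graph is a dict mapping each vertex to a list of its neighbours.
    """
    if visited is None:
        visited = []
    visited.append(v)                  # step 1: visit vertex v
    for w in graph.get(v, []):         # step 2: each unvisited neighbour
        if w not in visited:
            dfs(graph, w, visited)     # recursion backtracks automatically
    return visited
```

Covering disconnected graphs (step 3) just means calling `dfs` again from any vertex not yet in `visited`.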

Algorithm 7: BFS (breadth-first search)

Breadth-first search (BFS) is a graph search algorithm. Simply put, BFS starts from the root node and traverses the nodes of the tree (or graph) level by level, along its breadth. The algorithm stops once every node has been visited. BFS is also a blind search. A queue data structure is generally used to implement BFS.

Algorithm steps:

1. First put the root node in the queue.

2. Take the first node from the queue and check if it is the target.

If the target is found, the search ends and the result is returned.

Otherwise, all of its direct child nodes that have not been checked are added to the queue.

3. If the queue is empty, the whole graph has been checked, i.e. the target is not in the graph. End the search and return "target not found".

4. Otherwise, repeat from step 2.
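The steps above can be sketched with Python's `collections.deque` as the queue; the boolean return value stands in for "found" / "target not found".

```python
from collections import deque

def bfs(graph, root, target):
    """Return True if target is reachable from root, else False.

    graph is a dict mapping each node to a list of its children.
    """
    queue = deque([root])              # step 1: root goes into the queue
    seen = {root}                      # avoid re-checking nodes
    while queue:                       # step 3: empty queue => not found
        node = queue.popleft()         # step 2: take the first node
        if node == target:
            return True                # target found: end the search
        for child in graph.get(node, []):
            if child not in seen:      # enqueue unchecked children
                seen.add(child)
                queue.append(child)
    return False
```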

Algorithm 8: Dijkstra's algorithm

Dijkstra's algorithm was proposed by the Dutch computer scientist Edsger W. Dijkstra. Using a greedy, breadth-first-like strategy, it solves the single-source shortest-path problem on a directed graph with non-negative edge weights, finally producing a shortest-path tree. The algorithm is often used in routing, or as a sub-module of other graph algorithms.

The input of the algorithm consists of a weighted directed graph G and a source vertex s in G. Let V denote the set of all vertices in G. Each edge of the graph is an ordered pair of vertices: (u, v) means there is an edge from vertex u to vertex v. Let E denote the set of all edges in G; edge weights are defined by the weight function w: E → [0, ∞). Thus w(u, v) is the non-negative weight of the edge from vertex u to vertex v, which can be thought of as the distance between the two vertices. The weight of a path between any two vertices is the sum of the weights of all edges on that path. Given vertices s and t in V, Dijkstra's algorithm finds the lowest-weight path from s to t (i.e. the shortest path); it can likewise find the shortest paths from a vertex s to every other vertex in the graph. For directed graphs without negative weights, Dijkstra's algorithm is the fastest known single-source shortest-path algorithm.

Algorithm steps:

1. Initially S = {V0} and T = {the remaining vertices}. The distance value of each vertex Vi in T is:

d(V0, Vi) = the weight on the arc <V0, Vi>, if that arc exists;

d(V0, Vi) = ∞, if it does not.

2. From T, select the vertex W not in S with the smallest distance value, and add it to S.

3. Update the distance values of the vertices remaining in T: if going through W as an intermediate vertex shortens the distance from V0 to some Vi, update that distance value.

4. Repeat steps 2 and 3 until S contains all vertices.
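The steps maintain the set S explicitly; the sketch below instead uses Python's `heapq` as a priority queue, the usual way to pick the smallest-distance vertex efficiently. The nested-dict graph representation is an assumption for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Shortest distances from source in a non-negatively weighted digraph.

    graph: {u: {v: weight, ...}, ...}; returns {vertex: distance}.
    """
    dist = {source: 0}                 # step 1: only the source is settled
    pq = [(0, source)]                 # priority queue of (distance, vertex)
    while pq:
        d, u = heapq.heappop(pq)       # step 2: smallest tentative distance
        if d > dist.get(u, float('inf')):
            continue                   # stale queue entry; skip it
        for v, w in graph.get(u, {}).items():
            nd = d + w                 # step 3: relax edges through u
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist                        # step 4: loop ends when queue empties
```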

Algorithm 9: Dynamic programming

Dynamic programming is a method used in mathematics, computer science, and economics for solving complex problems by decomposing the original problem into relatively simple sub-problems. It is well suited to problems that exhibit overlapping sub-problems and optimal substructure, and it often takes far less time than naive approaches.

The basic idea behind dynamic programming is very simple. Roughly speaking, to solve a given problem we solve its different parts (the sub-problems), then combine the sub-problem solutions to obtain the solution of the original problem. Often many of these sub-problems are very similar; for this reason, dynamic programming tries to solve each sub-problem only once, reducing the amount of computation: once the solution of a given sub-problem has been computed, it is stored, so that the next time the same sub-problem is needed the stored result is simply looked up. This approach is especially useful when the number of repeated sub-problems grows exponentially with the size of the input.

The most classic dynamic programming problem is the knapsack problem.

Algorithm steps:

1. Optimal substructure. If the solutions of the sub-problems contained in an optimal solution of the problem are themselves optimal, we say the problem has optimal substructure (i.e. it satisfies the principle of optimality). Optimal substructure is an important clue that a problem may be solvable by dynamic programming.

2. Overlapping sub-problems. When a recursive algorithm solves the problem top-down, the sub-problems it generates are not always new: some are computed many times over. Dynamic programming exploits this overlap by computing each sub-problem only once and saving the result in a table; when the same sub-problem is needed again, the result is simply looked up, yielding much higher efficiency.
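Since the text names the knapsack problem as the classic example, here is a sketch of the 0/1 knapsack with integer weights. The one-dimensional `dp` table is the standard space-saving formulation; each entry is a sub-problem solved once and reused, illustrating both properties above.

```python
def knapsack(weights, values, capacity):
    """0/1 knapsack: maximum total value within the weight capacity."""
    # dp[c] = best value achievable with capacity c (overlapping sub-problems,
    # each solved once and stored in the table)
    dp = [0] * (capacity + 1)
    for w, v in zip(weights, values):
        # iterate capacity downwards so each item is used at most once
        for c in range(capacity, w - 1, -1):
            # optimal substructure: best of skipping vs. taking the item
            dp[c] = max(dp[c], dp[c - w] + v)
    return dp[capacity]
```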

Algorithm 10: Naive Bayes classification

The naive Bayes classification algorithm is a simple probabilistic classifier based on Bayes' theorem. Bayesian classification rests on probabilistic reasoning: carrying out inference and decision-making when the conditions involved are uncertain and only their probabilities are known. Probabilistic reasoning is the counterpart of deterministic reasoning. The naive Bayes classifier is based on the independence assumption: each feature of a sample is assumed to be unrelated to the others.

Given an accurate underlying probability model, the naive Bayes classifier achieves very good results on supervised learning sample sets. In many practical applications, naive Bayes model parameters are estimated by maximum likelihood; in other words, a naive Bayes model can be used without committing to Bayesian probability or any other Bayesian method.
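A toy categorical naive Bayes can be built from nothing but counting, which makes the independence assumption and the maximum-likelihood estimation concrete. The add-one smoothing and the data layout (each sample is a tuple of categorical features) are choices made here for illustration, not part of the text above.

```python
from collections import Counter, defaultdict

def train_nb(samples, labels):
    """Fit a categorical naive Bayes model by counting (MLE with
    add-one smoothing to avoid zero probabilities)."""
    label_counts = Counter(labels)
    feat_counts = defaultdict(Counter)   # (label, position) -> value counts
    for feats, y in zip(samples, labels):
        for pos, val in enumerate(feats):
            feat_counts[(y, pos)][val] += 1
    return label_counts, feat_counts

def predict_nb(model, feats):
    """Return the label with the highest naive-Bayes score for feats."""
    label_counts, feat_counts = model
    total = sum(label_counts.values())
    best, best_p = None, -1.0
    for y, ny in label_counts.items():
        p = ny / total                   # prior P(y)
        for pos, val in enumerate(feats):
            c = feat_counts[(y, pos)]
            # independence assumption: multiply per-feature likelihoods
            p *= (c[val] + 1) / (ny + len(c) + 1)
        if p > best_p:
            best, best_p = y, p
    return best
```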

Quicksort, merge sort, and heapsort: how to choose

So how do we choose among them in a real application? Here are some selection criteria:

    If n is small, use direct insertion sort or simple selection sort. Since direct insertion sort performs more record moves than simple selection sort, prefer simple selection sort when the records themselves carry a large amount of data.
    If the sequence to be sorted is already mostly in order, direct insertion sort or bubble sort can be used.
    If n is large, use an algorithm with O(n log n) time complexity: quicksort, heapsort, or merge sort.

        More specifically: when the data are randomly distributed, quicksort is considered the best choice (this is related to hardware-level optimizations that favor quicksort, mentioned in an earlier post).
        Heapsort needs only a single auxiliary cell and has no quicksort-style bad cases.
        Quicksort and heapsort are both unstable; if stability is required, use merge sort. Direct insertion sort can also be combined with merging: first use direct insertion to produce ordered runs, then merge them. The result is stable, because both direct insertion and merging are stable.
 

Origin blog.csdn.net/wangletiancsdn/article/details/105163503