The Front End Doesn't Know Algorithms (1): Bubble Sort

This article draws on the articles listed below and on Geek Time's "Data Structures and Algorithms" course. If any infringement is involved, please contact me and I will delete it.

Preface

  When learning to program, the first algorithm most people encounter is sorting. Many programming languages also provide built-in sorting functions, and in everyday projects we run into sorting all the time. So this series, "The Front End Doesn't Know Algorithms", starts with sorting; let's study the classic sorting algorithms together.
  There are many kinds of sorting algorithms, and each has its own suitable use cases. We can choose among them based on their characteristics, implementation principles, time complexity, and space complexity.
  Because there are so many methods, there are plenty of sorts you may never have heard of, such as spaghetti sort, trotter sort, and hot pot sort. In this series we will study the most commonly used ones: bubble sort, insertion sort, selection sort, quick sort, merge sort, bucket sort, and radix sort.
  In this section we first look at bubble, insertion, and selection sort.
  Learning with a question in mind is the most efficient, so let's look at one first: insertion sort and bubble sort have the same time complexity, O(n²). In real software development, why do we tend to prefer insertion sort over bubble sort?
  Think about it for a minute, and then we learn together.

How to analyze a "sorting algorithm"?

  Learning an algorithm means not only understanding its principle and implementation, but also knowing how to analyze it. For sorting algorithms, we should look at the following aspects.

Execution efficiency of sorting algorithm

  The execution efficiency of a sorting algorithm can be broken down as follows.

1. Best-case, worst-case, and average time complexity

  When analyzing the time complexity of a sorting algorithm, we should give the best-case, worst-case, and average-case time complexity separately. In addition, we should also state what the input data looks like in the best and worst cases.

Why distinguish these three cases?

  • 1. Some sorting algorithms behave differently in the three cases; distinguishing them lets us compare algorithms on equal footing.
  • 2. The data to be sorted may be nearly ordered or completely unordered. Data with different degrees of order will certainly affect sorting time, so we need to know how each algorithm performs on different kinds of data.

2. Coefficients, constants, and low-order terms of the time complexity

  We know that time complexity reflects a growth trend as the data size n becomes large, so coefficients, constants, and low-order terms are ignored in its notation. But in real software development we may be sorting small data sets of 10, 100, or 1,000 elements. So when comparing sorting algorithms in the same complexity class, the coefficients, constants, and low-order terms also have to be taken into account.

3. Number of comparisons and number of exchanges (or moves)

  Bubble, insertion, and selection sort are all comparison-based sorting algorithms, whose execution involves two operations: comparing and moving. So when analyzing execution efficiency, we need to account for both the number of comparisons and the number of exchanges (or moves).

Memory consumption of sorting algorithm

  The memory consumption of a sorting algorithm is measured by its space complexity. Here we also introduce a new concept: an in-place sorting algorithm is one whose space complexity is O(1). The three sorts discussed in this article are all in-place sorting algorithms.

The stability of the sorting algorithm

  Measuring a sorting algorithm by execution efficiency and memory consumption alone is not enough. There is another indicator: stability. It means that if the sequence to be sorted contains elements of equal value, the relative order of those equal elements does not change after sorting.

  For example: the array [1, 3, 7, 3, 5, 4] becomes [1, 3, 3, 4, 5, 7] after sorting.

  This data set contains two 3s. If the relative order of the two 3s does not change after sorting, we call the algorithm a stable sorting algorithm; if their order can change, it is an unstable sorting algorithm.

  You may ask: for these two 3s, does it matter which comes first and which comes second? What does this have to do with stability?

  Why do we need to consider the stability of a sorting algorithm? When illustrating sorting we always use integers as examples, but in real development the objects we want to sort are usually not bare integers but a set of objects, sorted by one of their keys.

For example: now we need to sort the "orders" in an e-commerce trading system. An order has two attributes: the order time and the order amount. Given 100,000 orders, we want to sort them by amount from small to large, and for orders with the same amount, we want them ordered by order time. How do we meet such a requirement?

  The first method one might think of is to sort by amount first and then, for each group of orders with the same amount, sort separately by time. The idea is easy to understand, but the implementation gets complicated.

  If we use a stable sorting algorithm, then this problem can be easily solved.

The idea is as follows:

  • Sort by order time first
  • Then use a stable algorithm to sort again according to the amount

When this is done, orders with the same amount end up in chronological order. The question is: why?
  Why does sorting by time first and then by amount work?

  A stable sort guarantees that two objects with the same amount keep their relative order after sorting. After the first pass, all orders are sorted by order time. In the second pass we use a stable sort, so objects with the same amount do not change their relative order, and the time ordering we established earlier is preserved among them.
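As a small sketch of the two-pass idea (the `time` and `amount` field names are my own illustration, not from the original): JavaScript's built-in `Array.prototype.sort` is guaranteed stable since ES2019, so it can stand in for the stable sort here.

```javascript
// Hypothetical order objects; the field names are illustrative.
const orders = [
  { time: 3, amount: 100 },
  { time: 1, amount: 200 },
  { time: 2, amount: 100 },
];

// Pass 1: sort by order time.
orders.sort((a, b) => a.time - b.time);

// Pass 2: stable sort by amount; equal amounts keep their time order.
orders.sort((a, b) => a.amount - b.amount);

console.log(orders.map(o => o.time)); // [2, 3, 1]
// The two amount-100 orders stay in time order: time 2 before time 3.
```

The second pass never reorders equal amounts, which is exactly the stability guarantee the text relies on.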

Bubble Sort

  Bubble sort only operates on adjacent pairs of data. Each bubbling pass compares adjacent elements and checks whether they satisfy the required order; if not, it swaps them. One pass moves at least one element to its final position, so repeating the pass n times sorts n elements.

  For example, suppose we want to sort the data 4, 5, 6, 3, 2, 1 from small to large. The first bubbling pass looks like this:

  After one bubbling pass, the element 6 has been moved to its correct position. To sort all of the data, we only need to repeat the bubbling pass 6 times.

  Bubble sort can actually be optimized: when a bubbling pass performs no swaps, the array is already fully ordered and the remaining passes can be skipped. The following example needs only 4 passes to finish.
  The principle of bubble sort is fairly clear and easy to understand. The code is below; you can read it alongside the diagrams.

function bubbleSorting(arr) {
    let n = arr.length;
    if (n <= 1) return;

    for (let i = 0; i < n; ++i) {
        // Early-exit flag: set when a swap happens in this pass
        let flag = false;
        for (let j = 0; j < n - i - 1; ++j) {
            // Swap adjacent elements that are out of order
            if (arr[j] > arr[j + 1]) {
                let temp = arr[j];
                arr[j] = arr[j + 1];
                arr[j + 1] = temp;

                // A swap happened in this pass
                flag = true;
            }
        }
        // No swaps in this pass: the array is sorted, exit early
        if (!flag) break;
    }
}
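A quick check of the function on the example data from earlier (the sort is repeated here so the snippet runs on its own):

```javascript
// Same bubbleSorting as above, condensed for a standalone run.
function bubbleSorting(arr) {
    let n = arr.length;
    if (n <= 1) return;
    for (let i = 0; i < n; ++i) {
        let flag = false; // early-exit flag
        for (let j = 0; j < n - i - 1; ++j) {
            if (arr[j] > arr[j + 1]) {
                [arr[j], arr[j + 1]] = [arr[j + 1], arr[j]];
                flag = true;
            }
        }
        if (!flag) break; // no swaps: already sorted
    }
}

const data = [4, 5, 6, 3, 2, 1];
bubbleSorting(data);
console.log(data); // [1, 2, 3, 4, 5, 6]
```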

  Combining the three analysis aspects above, let's answer three questions.

Q1. Is bubble sort an in-place sort?

  • Bubble sort only swaps adjacent elements and needs only constant extra space, so its space complexity is O(1); it is an in-place sort.

Q2. Is bubble sorting a stable sorting algorithm?

  • In bubble sort, only a swap can change the order of two elements. To keep bubble sort stable, we do not swap when two adjacent elements are equal. Elements of equal value therefore keep their relative order, so bubble sort is a stable sort.
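To see the stability in action, here is a sketch using tagged objects (the `value`/`id` fields are my own illustration) and a bubble sort that compares only values with a strict `>`:

```javascript
// Bubble sort on objects, comparing only .value.
function bubbleByValue(arr) {
    for (let i = 0; i < arr.length; ++i) {
        let swapped = false;
        for (let j = 0; j < arr.length - i - 1; ++j) {
            // Strict '>' only: equal values are never swapped,
            // so their original relative order is preserved.
            if (arr[j].value > arr[j + 1].value) {
                [arr[j], arr[j + 1]] = [arr[j + 1], arr[j]];
                swapped = true;
            }
        }
        if (!swapped) break;
    }
}

const items = [
    { value: 3, id: 'first 3' },
    { value: 1, id: 'a' },
    { value: 3, id: 'second 3' },
];
bubbleByValue(items);
console.log(items.map(x => x.id)); // ['a', 'first 3', 'second 3']
```

The two 3s finish in the same relative order they started in, which is exactly the definition of a stable sort.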

Q3. What is the time complexity of bubble sort?

  • In the best case the data to be sorted is already in order and a single pass suffices, so the best-case time complexity is O(n). In the worst case the data is completely reversed and we need n passes, so the worst-case time complexity is O(n²).

  The best- and worst-case complexities are easy to tell apart. What is the average-case time complexity?

  Here we need a little probability theory: the average time complexity is the weighted (expected) time complexity.

  For an array of n elements there are n! possible orderings, and different orderings take bubble sort different amounts of time. Analyzing the average time complexity rigorously with probability theory is quite involved (honestly, I can't do it yet; once I've worked it out I'll walk you through it o(╥﹏╥)o).

  Instead, we take a different route and analyze it through the concepts of "degree of order" and "degree of reverse order".

  The degree of order is the number of pairs of elements in the array that are in the correct relative order. Formally, an ordered pair is: a[i] <= a[j] with i < j.
  For a completely reversed array such as 6, 5, 4, 3, 2, 1, the degree of order is 0; for a completely ordered array such as 1, 2, 3, 4, 5, 6, the degree of order is n*(n-1)/2, here 15. The degree of order of a fully ordered array is called the full degree of order.

  The degree of reverse order is just the opposite. A reversed pair is: a[i] > a[j] with i < j.

  These concepts are related by the formula: degree of reverse order = full degree of order - degree of order. Sorting is the process of increasing the degree of order and decreasing the degree of reverse order; when the full degree of order is reached, sorting is complete.

  In the earlier example 4, 5, 6, 3, 2, 1, the ordered pairs are (4, 5), (4, 6), (5, 6), so the degree of order is 3. With n = 6, the full degree of order after sorting is n*(n-1)/2 = 15.
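The degree of order can be counted directly from its definition; a small sketch:

```javascript
// Count ordered pairs: i < j with a[i] <= a[j].
function orderDegree(a) {
    let count = 0;
    for (let i = 0; i < a.length; ++i) {
        for (let j = i + 1; j < a.length; ++j) {
            if (a[i] <= a[j]) count++;
        }
    }
    return count;
}

const n = 6;
const fullOrderDegree = (n * (n - 1)) / 2;          // 15

console.log(orderDegree([4, 5, 6, 3, 2, 1]));       // 3
// reverse order degree = full order degree - order degree
console.log(fullOrderDegree - orderDegree([4, 5, 6, 3, 2, 1])); // 12
```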

  Bubble sort consists of two atomic operations: compare and swap. Each swap increases the degree of order by 1. No matter how the algorithm is improved, the number of swaps is fixed: it equals the degree of reverse order, i.e. n*(n-1)/2 minus the initial degree of order. In this example, 15 - 3 = 12 swaps are needed.

  What is the average number of swaps for bubble sorting an array of n elements? In the worst case the initial degree of order is 0, so n*(n-1)/2 swaps are needed; in the best case, none are needed. We can take the midpoint, n*(n-1)/4, to represent data whose degree of order is neither high nor low.

  In other words, the average case requires n*(n-1)/4 swap operations. The number of comparisons is certainly no less than the number of swaps, and the upper bound of the complexity is O(n²), so the average time complexity is O(n²).
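The n*(n-1)/4 estimate can be checked by brute force for a small n. Bubble sort's swap count equals the number of reversed pairs, and averaging over all permutations of [1, 2, 3, 4] (so n = 4, and 4·3/4 = 3) gives exactly that value:

```javascript
// Count the swaps bubble sort performs on a copy of the array.
function countSwaps(arr) {
    const a = arr.slice();
    let swaps = 0;
    for (let i = 0; i < a.length; ++i) {
        for (let j = 0; j < a.length - i - 1; ++j) {
            if (a[j] > a[j + 1]) {
                [a[j], a[j + 1]] = [a[j + 1], a[j]];
                swaps++;
            }
        }
    }
    return swaps;
}

// Generate all permutations of an array.
function permutations(a) {
    if (a.length <= 1) return [a];
    const result = [];
    for (let i = 0; i < a.length; ++i) {
        const rest = a.slice(0, i).concat(a.slice(i + 1));
        for (const p of permutations(rest)) result.push([a[i], ...p]);
    }
    return result;
}

const perms = permutations([1, 2, 3, 4]); // all 4! = 24 orderings
const avg = perms.reduce((s, p) => s + countSwaps(p), 0) / perms.length;
console.log(avg); // 3, matching n * (n - 1) / 4 for n = 4
```

Averaged over every possible input, the swap count lands exactly on the midpoint estimate; for this small n the "non-strict" reasoning happens to be exact.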

  This derivation of the average time complexity is not rigorous, but it is very practical; the rigorous probabilistic analysis is too complicated to be handy. A later post in this series will cover quick sort, and we will use this "non-strict" method again to analyze its average time complexity.

This is the first article in my series "The Front End Doesn't Know Algorithms"; summaries of other algorithms will follow. Feedback is welcome.



Origin blog.csdn.net/EcbJS/article/details/105761748