Front-end advancement: a summary of several commonly used JS search algorithms and their performance

Preface

Today, let's continue with JS algorithms. The discussion below covers the basic implementation of a search algorithm and the performance of several ways of writing it, shows the performance differences among the for loop, forEach, and while, and explains how sharding an algorithm across web workers can greatly improve its performance.

I will also briefly introduce the classic binary search and hash table lookup algorithms, but they are not the focus of this chapter. I will publish follow-up articles that cover these more advanced algorithms in detail. Interested readers can follow my column or join us to explore together.

To measure performance, we will again use the getFnRunTime function from the previous chapter, "Front-end Algorithm Series: How to increase the front-end code speed by 60 times". If you are interested, you can read about it there; I won't explain it further here.

In that chapter we simulated 19,000 records. In this chapter, to make the differences more obvious, I will generate 1.7 million records for testing. Believe me, for JS this is nothing.
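Since getFnRunTime itself is not reproduced in this chapter, here is a minimal sketch of what such a timing setup might look like under Node. The names buildTestData and measureRunTime are hypothetical and are not the author's helper; the test data is a simple repeating pattern chosen only so that results are predictable.

```javascript
// Hypothetical stand-in for a timing helper; NOT the author's getFnRunTime.

// Build an array of `size` numbers cycling through 0 .. maxValue - 1.
function buildTestData(size, maxValue) {
  const data = new Array(size);
  for (let i = 0; i < size; i++) {
    data[i] = i % maxValue;
  }
  return data;
}

// Run fn(...args) once and report its result and elapsed time in milliseconds.
function measureRunTime(fn, ...args) {
  const start = process.hrtime.bigint();
  const result = fn(...args);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { result, elapsedMs };
}

// Example: time a trivial count over 1.7 million records.
const data = buildTestData(1700000, 10);
const { result, elapsedMs } = measureRunTime(
  (arr, value) => arr.filter((item) => item === value).length,
  data,
  6
);
console.log(`matches: ${result}, time: ${elapsedMs.toFixed(2)}ms`);
```

A single run like this is noisy, so in practice it makes sense to repeat the measurement several times and look at where the numbers settle.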

1. for loop search

Basic idea: traverse the array with a for loop and push the index of every element equal to the search value into a new array.

The code is implemented as follows:

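const getFnRunTime = require('./getRuntime');

/**
 * Basic algorithm: for loop version
 * @param {Array} arr array to search
 * @param {*} value value to look for
 * @returns {number[]} indices of all matching elements
 * Time taken: 7-9ms
 */
function searchBy(arr, value) {
    const result = [];
    for (let i = 0, len = arr.length; i < len; i++) {
        if (arr[i] === value) {
            result.push(i);
        }
    }
    return result;
}
getFnRunTime(searchBy, 6);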

After repeated test runs, the time stabilizes at 7-9 ms.

2. forEach loop

The basic idea is similar to the for loop:

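/**
 * Basic algorithm: forEach version
 * @param {Array} arr array to search
 * @param {*} value value to look for
 * @returns {number[]} indices of all matching elements
 * Time taken: 21-24ms
 */
function searchByForEach(arr, value) {
    const result = [];
    arr.forEach((item, i) => {
        if (item === value) {
            result.push(i);
        }
    });
    return result;
}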

It takes 21-24 ms, so its performance is not as good as the for loop's, even though the two do essentially the same thing.

3. while loop

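The code is as follows:

/**
 * Basic algorithm: while loop version
 * @param {Array} arr array to search
 * @param {*} value value to look for
 * @returns {number[]} indices of all matching elements, in descending order
 * Time taken: 11ms
 */
function searchByWhile(arr, value) {
    let i = arr.length;
    const result = [];
    // i-- is tested before the body runs, so the body sees the indices
    // arr.length - 1 down to 0 (index 0 included, arr.length excluded).
    while (i--) {
        if (arr[i] === value) {
            result.push(i);
        }
    }
    return result;
}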

As you can see, the while and for loops have similar performance, and both are excellent. That does not mean forEach performs badly and should be avoided. Compared with a for loop, forEach needs less code, but it invokes a callback function for every element, so at runtime it is less efficient than a plain for loop. It is still more convenient when you simply want to visit every element without managing the index and loop bound yourself. And once the JavaScript engine has optimized the callback, forEach can come close to the for loop.

4. Binary search

Binary search is mostly used on arrays whose values are ordered and unique. We will not compare its performance with for/while/forEach here.

Basic idea: start from the middle position of the sequence. If the value at the current position equals the value being searched for, the search succeeds. If the value being searched for is less than the current value, continue in the first half of the sequence; if it is greater, continue in the second half. Repeat until the value is found or the remaining range is empty.

The code is as follows:

/**
   * Binary search
   * @param {Array} arr sorted array (ascending)
   * @param {*} value value to look for
   */
  function binarySearch(arr, value) {
    let min = 0;
    let max = arr.length - 1;
    
    while (min <= max) {
      const mid = Math.floor((min + max) / 2);
  
      if (arr[mid] === value) {
        return mid;
      } else if (arr[mid] > value) {
        max = mid - 1;
      } else {
        min = mid + 1;
      }
    }
  
    return 'Not Found';
  }

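With large amounts of data, binary search is very efficient, but its result is not stable: when the array contains duplicate values, it returns whichever matching index it happens to land on, not necessarily the first one. This is a small disadvantage when querying large data sets.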
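If the sorted array may contain duplicates, one common way to make the result predictable is a lower-bound variant that always returns the first matching index. The following is a sketch only; the name binarySearchFirst and the choice to return -1 when nothing is found are assumptions made for illustration, not part of the code above.

```javascript
// Lower-bound binary search: returns the index of the FIRST element equal to
// `value` in an ascending sorted array, or -1 if it is absent.
// (Hypothetical variant; binarySearch above returns 'Not Found' instead of -1.)
function binarySearchFirst(arr, value) {
  let low = 0;
  let high = arr.length; // exclusive upper bound

  // Invariant: every index < low holds a value < `value`,
  // every index >= high holds a value >= `value`.
  while (low < high) {
    const mid = low + Math.floor((high - low) / 2);
    if (arr[mid] < value) {
      low = mid + 1;
    } else {
      high = mid;
    }
  }

  return low < arr.length && arr[low] === value ? low : -1;
}

console.log(binarySearchFirst([1, 2, 2, 2, 3], 2)); // 1
console.log(binarySearchFirst([1, 3], 2)); // -1
```

The loop never exits early on a match, so it costs the same O(log n) comparisons whether or not duplicates exist.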

5. Hash table lookup

Hash table lookup is also called hash lookup. The storage location of a record is computed directly from its key, without any comparisons. It establishes a fixed correspondence f between a record's key and its storage location, so that each key maps to the storage location f(key).

Use scenarios for hash table lookup:

A hash table is best suited to finding records that are equal to a given value
Hash lookup is not suitable when the same key corresponds to multiple records
It is not suitable for range searches, such as finding classmates aged 18-22

Here is the simplest possible version of a hash table, to make hashing easier to understand:

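/**
 * Hash table
 * Note: this version can overwrite data when two keys hash to the same position
 */
function HashTable() {
  var table = [];

  // Hash function: sum of the key's character codes, modulo 37
  var loseloseHashCode = function(key) {
    var hash = 0;
    for (var i = 0; i < key.length; i++) {
      hash += key.charCodeAt(i);
    }
    return hash % 37;
  };

  // put: store value at the position computed from key
  this.put = function(key, value) {
    var position = loseloseHashCode(key);
    table[position] = value;
  };

  // get: read the value stored at the position computed from key
  this.get = function(key) {
    return table[loseloseHashCode(key)];
  };

  // remove: clear the position computed from key
  this.remove = function(key) {
    table[loseloseHashCode(key)] = undefined;
  };
}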

This approach can cause collisions, where different keys hash to the same position and overwrite each other, but there are solutions. Because quite a lot of material is involved, I will cover them in a later article:

Open addressing
Quadratic probing
Random probing
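To give a feel for the idea before that article, here is a minimal sketch of linear probing, the simplest form of open addressing: when a slot is already taken by a different key, step to the next slot until a free one is found. The class name LinearProbingHashTable, the fixed capacity of 37, and the omission of remove (which needs tombstone markers to be correct) are simplifying assumptions for illustration.

```javascript
// Illustrative sketch of open addressing with linear probing.
// Assumes string keys and a fixed capacity; no resizing, no remove.
class LinearProbingHashTable {
  constructor(capacity = 37) {
    this.capacity = capacity;
    this.keys = new Array(capacity);
    this.values = new Array(capacity);
  }

  // Same "lose lose" idea as above: sum of char codes modulo capacity.
  hash(key) {
    let sum = 0;
    for (let i = 0; i < key.length; i++) {
      sum += key.charCodeAt(i);
    }
    return sum % this.capacity;
  }

  // Return the slot holding `key`, or the first empty slot on its probe
  // path, or -1 if the table is full and the key is absent.
  findSlot(key) {
    let index = this.hash(key);
    for (let step = 0; step < this.capacity; step++) {
      const existing = this.keys[index];
      if (existing === undefined || existing === key) {
        return index;
      }
      index = (index + 1) % this.capacity; // probe the next slot
    }
    return -1;
  }

  put(key, value) {
    const slot = this.findSlot(key);
    if (slot === -1) {
      throw new Error('hash table is full');
    }
    this.keys[slot] = key;
    this.values[slot] = value;
  }

  get(key) {
    const slot = this.findSlot(key);
    return slot !== -1 && this.keys[slot] === key ? this.values[slot] : undefined;
  }
}

// 'ab' and 'ba' have the same character-code sum, so they collide.
const table = new LinearProbingHashTable();
table.put('ab', 1);
table.put('ba', 2);
console.log(table.get('ab'), table.get('ba')); // 1 2
```

Storing the key next to the value is what makes collisions detectable: the simple HashTable above keeps only values, so it cannot tell whose value sits in a slot.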

Optimizing with web workers

From the sections above we now know the performance and usage scenarios of each algorithm. When we use these algorithms, we can also optimize them with web workers so that the work is processed in parallel. For example, we can split a large array into several chunks, let worker threads compute the result for each chunk, then merge the results and pass them back to the main thread through the worker's message event mechanism. The effect is very significant.
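The split, search, and merge steps described above can be sketched as follows. This sketch uses Node's worker_threads module as a stand-in, since the examples in this chapter run under Node; in a browser the same shape applies with new Worker(url), postMessage, and onmessage. The names splitRanges, searchInWorker, and parallelSearch, and the default of 4 chunks, are assumptions for illustration.

```javascript
const { Worker } = require('worker_threads');

// Code run inside each worker: scan one chunk and report matching indices,
// shifted by the chunk's offset so they refer to the original array.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  const { chunk, offset, value } = workerData;
  const hits = [];
  for (let i = 0, len = chunk.length; i < len; i++) {
    if (chunk[i] === value) hits.push(i + offset);
  }
  parentPort.postMessage(hits);
`;

// Split [0, length) into at most `parts` contiguous [start, end) ranges.
function splitRanges(length, parts) {
  const size = Math.ceil(length / parts);
  const ranges = [];
  for (let start = 0; start < length; start += size) {
    ranges.push([start, Math.min(start + size, length)]);
  }
  return ranges;
}

// Search one chunk in its own worker; resolves with the matching indices.
function searchInWorker(chunk, offset, value) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, {
      eval: true,
      workerData: { chunk, offset, value },
    });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

// Shard the array, search the shards in parallel, and merge the results.
async function parallelSearch(arr, value, parts = 4) {
  const jobs = splitRanges(arr.length, parts).map(([start, end]) =>
    searchInWorker(arr.slice(start, end), start, value)
  );
  const partialResults = await Promise.all(jobs);
  return partialResults.flat(); // chunks are in order, so indices stay ascending
}

parallelSearch([6, 1, 6, 2, 6, 3, 6], 6, 3).then((indices) => {
  console.log(indices); // [ 0, 2, 4, 6 ]
});
```

Each chunk is copied when it is handed to a worker, and starting a worker has its own cost, so sharding is only likely to pay off for large arrays or for per-element work that is more expensive than a simple comparison.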

Summary

For complex array queries, for/while perform better than forEach and other array methods.
Binary search runs in O(log n), which is very efficient, but its shortcomings are also obvious. First, the data must be ordered, and it is hard to guarantee that our arrays are ordered. The array can of course be kept sorted while it is being built, but that runs into the second bottleneck: the data must be stored in an array. Reading an array element is O(1), but inserting or deleting an element is O(n), so building an ordered array is itself inefficient.
Hash table lookup: its basic usage and the scenarios it suits.
If conditions permit, we can use web workers to optimize an algorithm and let it run in parallel in the background.

Although this article is relatively simple, the topic is very important. I hope it gives everyone a more intuitive understanding of search algorithms, and if you have a better approach, I hope you will share it so we can discuss it together.