Learning JavaScript array deduplication

Foreword

This article is 2,895 words long and takes about 12 minutes to read.

In summary: this article covers 10 common array deduplication methods and compares them.


The past fades like smoke; a selfless heart finds the world wide.

Main text

Array deduplication is not a common front-end requirement; it is usually handed to the back end. It is still an interesting problem, and interviewers often use it to test a candidate's grasp of JS. This article approaches the problem from the perspective of data types: first we deduplicate arrays that contain only primitive data types, then we deal with objects. Here is our test data:

var meta = [
    0,
    '0', 
    true,
    false,
    'true',
    'false',
    null,
    undefined,
    Infinity,
    {},
    [],
    function(){},
    { a: 1, b: 2 },
    { b: 2, a: 1 },
];
var meta2 = [
    NaN,
    NaN,
    Infinity,
    {},
    [],
    function(){},
    { a: 1, b: 2 },
    { b: 2, a: 1 },
];
var sourceArr = [
    ...meta,
    ...Array(1000000)
        .fill({})
        .map(() => meta[Math.floor(Math.random() * meta.length)]),
    ...meta2,
];

From here on, every reference to sourceArr means the variable above. sourceArr contains 1000022 items (14 from meta, 1000000 random picks from meta, and 8 from meta2). Note that NaN is the only value in JS that is not strictly equal to itself.
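Because the handling of NaN decides the outcome of several methods below, here is a minimal sketch of how the built-in comparisons treat it. The variable names are only illustrative.

```javascript
// NaN is never strictly equal to itself, so ===-based lookups miss it.
const strictEqual = NaN === NaN;               // false
const sameValue = Object.is(NaN, NaN);         // true
const foundByIndexOf = [NaN].indexOf(NaN);     // -1, indexOf compares with ===
const foundByIncludes = [NaN].includes(NaN);   // true, includes uses SameValueZero
const setSize = new Set([NaN, NaN]).size;      // 1, Set also uses SameValueZero

console.log(strictEqual, sameValue, foundByIndexOf, foundByIncludes, setSize);
// prints: false true -1 true 1
```

This difference between === and SameValueZero is why the indexOf-based methods either keep every NaN or drop NaN altogether, while Set and includes treat all NaN values as one.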

Our goal is to deduplicate the sourceArr array above and obtain:

// An array of length 14
[false, "true", Infinity, true, 0, [], {}, "false", "0", null, undefined, {a: 1, b: 2}, NaN, function(){}]

Basic data types

1. ES6 Set

This is a very common ES6 approach. To deduplicate simple data types you can use it directly, with the spread operator + Set:

console.time('ES6 Set elapsed');
var res = [...new Set(sourceArr)];
console.timeEnd('ES6 Set elapsed');
// ES6 Set elapsed: 28.736328125ms
console.log(res);
// Prints an array of length 20: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]

Or use Array.from + Set:

console.time('ES6 Set elapsed');
var res = Array.from(new Set(sourceArr));
console.timeEnd('ES6 Set elapsed');
// ES6 Set elapsed: 28.538818359375ms
console.log(res);
// Prints an array of length 20: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]

Advantages: simple and convenient, and NaN is deduplicated correctly;

Disadvantages: cannot recognize structurally identical objects and arrays as duplicates;

This method is recommended for deduplication in simple scenarios.
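To see why Set leaves identical-looking objects in place, the following sketch compares the same reference added twice with two separate but structurally equal objects. The variable names are assumptions made for illustration.

```javascript
const shared = { a: 1 };

// The same reference added twice collapses into a single entry.
const byReference = [...new Set([shared, shared])];

// Two separate objects with equal contents are different references, so both stay.
const byStructure = [...new Set([{ a: 1 }, { a: 1 }])];

// Primitives are compared by value; 0 and '0' differ, NaN equals NaN.
const primitives = [...new Set([0, '0', 0, NaN, NaN])];

console.log(byReference.length, byStructure.length, primitives);
// prints: 1 2 [ 0, '0', NaN ]
```

Set compares objects by identity, not by content, which explains the length-20 result above: each object literal in meta and meta2 is its own reference.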

2. Using indexOf

Use the built-in indexOf method to look up each element:

function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
    for (var i = 0; i < arr.length; i++) {
        if (result.indexOf(arr[i]) === -1) {
            result.push(arr[i])
        }
    }
    return result;
}
console.time('indexOf elapsed');
var res = unique(sourceArr);
console.timeEnd('indexOf elapsed');
// indexOf elapsed: 23.376953125ms
console.log(res);
// Prints an array of length 21: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, NaN, function(){}, function(){}]

Advantages: a common approach for ES5 environments, highly compatible and easy to understand;

Drawback: NaN is not deduplicated and needs special handling;

Suitable for environments that do not support ES6.
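The "special handling" for NaN could look like the following sketch. It is one possible ES5-compatible variant, not the article's own code; the name uniqueWithNaN and the seenNaN flag are hypothetical.

```javascript
// Hypothetical ES5-style variant of the indexOf approach that also deduplicates NaN.
function uniqueWithNaN(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
    var seenNaN = false; // indexOf can never find NaN, so track it separately
    for (var i = 0; i < arr.length; i++) {
        var item = arr[i];
        if (item !== item) { // only NaN is not equal to itself
            if (!seenNaN) {
                seenNaN = true;
                result.push(item);
            }
        } else if (result.indexOf(item) === -1) {
            result.push(item);
        }
    }
    return result;
}

console.log(uniqueWithNaN([NaN, 1, NaN, '1', 1]));
// prints: [ NaN, 1, '1' ]
```

The self-comparison test (item !== item) avoids Number.isNaN, which is an ES6 addition and would defeat the purpose of an ES5-only solution.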

3. Using the includes method

This is similar to indexOf, but includes is a new API introduced in ES7 (ES2016):

function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
    for (var i = 0; i < arr.length; i++) {
        if (!result.includes(arr[i])) {
            result.push(arr[i])
        }
    }
    return result;
}
console.time('includes elapsed');
var res = unique(sourceArr);
console.timeEnd('includes elapsed');
// includes elapsed: 32.412841796875ms
console.log(res);
// Prints an array of length 20: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]

Advantages: NaN is deduplicated correctly;

Disadvantages: requires a newer ES version, and is slower than the indexOf method;

4. Using filter and indexOf

This method is ingenious: it keeps an element only if the index of its first occurrence equals the current index:

function unique(arr) {
    if (!Array.isArray(arr)) return;
    return arr.filter(function(item, index, arr) {
        // Keep the element only if its first index in the original array equals the current index
        return arr.indexOf(item, 0) === index;
    });
}
console.time('filter and indexOf elapsed');
var res = unique(sourceArr);
console.timeEnd('filter and indexOf elapsed');
// filter and indexOf elapsed: 24.135009765625ms
console.log(res);
// Prints an array of length 19: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, function(){}, function(){}]

Advantages: a higher-order function keeps the code short;

Drawback: because indexOf cannot find NaN, every NaN is dropped.

This method is elegant and needs very little code, but compared with Set deduplication it still falls short.
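The index comparison is easier to follow on a tiny array. The sketch below, with illustrative variable names, lists the first-occurrence index of every element next to the filtered result.

```javascript
const sample = ['a', 'b', 'a', NaN];

// Index of the first occurrence of each element; NaN is never found, so it maps to -1.
const firstIndexes = sample.map((item) => sample.indexOf(item));

// An element survives only when its first-occurrence index is its own index.
const kept = sample.filter((item, index) => sample.indexOf(item) === index);

console.log(firstIndexes, kept);
// prints: [ 0, 1, 0, -1 ] [ 'a', 'b' ]
```

The second 'a' is removed because its first occurrence is at index 0, not 2. NaN is removed because -1 never matches a real index, which is why the result above has length 19 instead of 20.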

5. Using reduce + includes

This approach also cleverly combines two array methods:

var unique = (arr) =>  {
   if (!Array.isArray(arr)) return;
   return arr.reduce((prev,cur) => prev.includes(cur) ? prev : [...prev,cur],[]);
}
console.time('reduce and includes elapsed');
var res = unique(sourceArr);
console.timeEnd('reduce and includes elapsed');
// reduce and includes elapsed: 100.47802734375ms
console.log(res);
// Prints an array of length 20: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]

Advantages: a higher-order function keeps the code short;

Disadvantages: requires a newer ES version, and is slower;

This is also elegant, but any environment that supports this method also supports Set deduplication.

6. Using the Map structure

Use a Map to implement it:

function unique(arr) {
  if (!Array.isArray(arr)) return;
  let map = new Map();
  let result = [];
  for (let i = 0; i < arr.length; i++) {
    if (map.has(arr[i])) {
      map.set(arr[i], true); // seen before: mark it as a duplicate
    } else {
      map.set(arr[i], false); // first occurrence: record it and keep it
      result.push(arr[i]);
    }
  } 
  return result;
}
console.time('Map elapsed');
var res = unique(sourceArr);
console.timeEnd('Map elapsed');
// Map elapsed: 41.483154296875ms
console.log(res);
// Prints an array of length 20: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]

It takes longer than Set deduplication and produces the same result, so it is not recommended.

7. Double nested loops, removing duplicates with splice

This is a fairly common approach: traverse the array with two nested loops and remove the repeated elements:

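function unique(arr) {
    if (!Array.isArray(arr)) return;
    for (var i = 0; i < arr.length; i++) {
        for (var j = i + 1; j < arr.length; j++) {
            // arr[i] is the same value as arr[j], so remove arr[j] with splice
            if (Object.is(arr[i], arr[j])) {
                arr.splice(j, 1);
                j--; // step back so the element that shifted into position j is checked too
            }
        }
    }
    return arr;
}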
console.time('double nested loop elapsed');
var res = unique(sourceArr);
console.timeEnd('double nested loop elapsed');
// double nested loop elapsed: 41500.452880859375ms
console.log(res);
// Prints an array of length 20: [false, "true", Infinity, true, 0, [],  [], {b: 2, a: 1}, {b: 2, a: 1}, {}, {}, "false", "0", null, undefined, {a: 1, b: 2}, {a: 1, b: 2}, NaN, function(){}, function(){}]

Advantages: high compatibility.

Disadvantages: low performance, high time complexity.

Not recommended.
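One detail of the splice approach that is easy to get wrong is the j-- after each removal. The sketch below runs the same loop with and without that step on a small array; the function name dedupeInPlace and its stepBack flag are hypothetical and exist only for this comparison.

```javascript
// Same double loop as above; stepBack toggles the j-- after each splice.
function dedupeInPlace(arr, stepBack) {
    for (let i = 0; i < arr.length; i++) {
        for (let j = i + 1; j < arr.length; j++) {
            if (Object.is(arr[i], arr[j])) {
                arr.splice(j, 1); // later elements shift one position to the left
                if (stepBack) j--; // re-check the element that moved into position j
            }
        }
    }
    return arr;
}

const withStepBack = dedupeInPlace([1, 1, 1], true);
const withoutStepBack = dedupeInPlace([1, 1, 1], false);
console.log(withStepBack, withoutStepBack);
// prints: [ 1 ] [ 1, 1 ]
```

Without the step back, the element that slides into the freed slot is skipped, so runs of consecutive duplicates survive. Note also that splice modifies the array that was passed in.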

8. Using the sort method

The idea is simple: sort the array with the sort method, then loop through it and pick out each element that differs from its neighbor:

 function unique(arr) {
   if (!Array.isArray(arr)) return;
   arr = arr.sort((a, b) => a - b);
   var result = [arr[0]];
   for (var i = 1; i < arr.length; i++) {
     if (arr[i] !== arr[i-1]) {
       result.push(arr[i]);
     }
   }
   return result;
 }
console.time('sort elapsed');
var res = unique(sourceArr);
console.timeEnd('sort elapsed');
// sort elapsed: 936.071044921875ms
console.log(res);
// Array length 357770; the rest is omitted
// Prints: (357770) [Array(0), Array(0), 0...]

Advantages: none;

Disadvantages: time-consuming, and the sorted order is unpredictable;

Not recommended. The comparator a - b cannot order mixed types: it treats the number 0 and the string '0' as equal, for example, so equal values do not end up next to each other and a large amount of redundant data remains.

The methods above target primitive data types only and do not account for objects, arrays, or functions. Next, let's look at how to deduplicate identical objects.
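The failure of the sort approach on mixed data can be reproduced on a four-element array. This is a minimal sketch under the assumption that the engine's sort is stable (guaranteed since ES2019); uniqueBySort is a hypothetical, non-mutating copy of the function above.

```javascript
// Adjacent-comparison dedupe, as in method 8, on a copy of the input.
function uniqueBySort(arr) {
    const sorted = [...arr].sort((a, b) => a - b); // copy so the input is not mutated
    const result = [sorted[0]];
    for (let i = 1; i < sorted.length; i++) {
        if (sorted[i] !== sorted[i - 1]) result.push(sorted[i]);
    }
    return result;
}

const comparatorResult = 0 - '0';          // 0: the comparator sees 0 and '0' as equal
const mixedComparison = 'true' - 'false';  // NaN: no usable ordering at all
const mixed = uniqueBySort([0, '0', 0, '0']);
const numbersOnly = uniqueBySort([3, 1, 3, 2, 1]);

console.log(mixed.length, numbersOnly);
// prints: 4 [ 1, 2, 3 ]
```

With numbers only, the approach works. With 0 and '0' interleaved, the comparator never moves anything, the neighbors always differ under !==, and nothing is removed.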

Object

The following implementations are similar in spirit to the Map approach; they rely on the fact that the keys of an object cannot repeat.

9. Using filter and hasOwnProperty

Use the filter and hasOwnProperty methods:

function unique(arr) {
    if (!Array.isArray(arr)) return;
    var obj = {};
    return arr.filter(function(item, index, arr) {
        return obj.hasOwnProperty(typeof item + item) ? false : (obj[typeof item + item] = true)
    })
}
console.time('hasOwnProperty elapsed');
var res = unique(sourceArr);
console.timeEnd('hasOwnProperty elapsed');
// hasOwnProperty elapsed: 258.528076171875ms
console.log(res);
// Prints an array of length 13: [false, "true", Infinity, true, 0, [], {}, "false", "0", null, undefined, NaN, function(){}]

Advantages: simple code; identical objects, arrays, and functions are deduplicated;

Disadvantages: requires ES5 or later for filter, and is slower than Set in the measurement above, likely because a string key has to be built for every element (hasOwnProperty itself only checks own properties and does not walk the prototype chain);

This method relies on the fact that object keys cannot repeat in order to deduplicate objects and arrays. However, the key is built as type + value, so {a: 1, b: 2} and {} are treated as the same data. This method therefore also has shortcomings.
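The collision is visible as soon as the keys are printed. In this sketch, keyOf is a hypothetical helper that builds the same type + value key as the filter callback above.

```javascript
// Builds the same key as the filter callback: type name followed by the string form.
const keyOf = (item) => typeof item + item;

const emptyObjectKey = keyOf({});            // 'object[object Object]'
const filledObjectKey = keyOf({ a: 1, b: 2 }); // 'object[object Object]' as well
const emptyArrayKey = keyOf([]);             // 'object' (an empty array stringifies to '')
const numberKey = keyOf(1);                  // 'number1'
const stringKey = keyOf('1');                // 'string1'

console.log(emptyObjectKey === filledObjectKey, emptyArrayKey, numberKey, stringKey);
// prints: true object number1 string1
```

The typeof prefix keeps 1 and '1' apart, but every plain object stringifies to [object Object], so all plain objects share one key regardless of their contents.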

10. Using the uniqueness of object keys

This method is similar to the Map approach, but the key is composed differently:

function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
    var obj = {};
    for (var i = 0; i < arr.length; i++) {
        var key = typeof arr[i] + JSON.stringify(arr[i]) + arr[i];
        if (!obj[key]) {
            result.push(arr[i]);
            obj[key] = 1;
        } else {
            obj[key]++;
        }
    }
    return result;
}
console.time('object key elapsed');
var res = unique(sourceArr);
console.timeEnd('object key elapsed');
// object key elapsed: 585.744873046875ms
console.log(res);
// Prints an array of length 15: [false, "true", Infinity, true, 0, [], {b: 2, a: 1}, {}, "false", "0", null, undefined, {a: 1, b: 2}, NaN, function(){}]

This method is more complete and removes duplicate objects and duplicate arrays. However, objects such as {a: 1, b: 2} and {b: 2, a: 1} are still not recognized as duplicates, because JSON.stringify() turns them into {"a":1,"b":2} and {"b":2,"a":1} respectively, so the two computed keys differ. Adding a step that normalizes objects before the key is built fixes this:

function isObject(obj) {
    return Object.prototype.toString.call(obj) === '[object Object]';
}
function unique(arr) {
    if (!Array.isArray(arr)) return;
    var result = [];
    var obj = {};
    for (var i = 0; i < arr.length; i++) {
        // Normalize arrays and objects before building the key
        if (Array.isArray(arr[i])) {
            arr[i] = arr[i].sort((a, b) => a - b);
        }
        if (isObject(arr[i])) {
            let newObj = {};
            // Copy the properties in sorted key order so that key order no longer matters
            Object.keys(arr[i]).sort().forEach(key => {
                newObj[key] = arr[i][key];
            });
            arr[i] = newObj;
        }
        var key = typeof arr[i] + JSON.stringify(arr[i]) + arr[i];
        if (!obj[key]) {
            result.push(arr[i]);
            obj[key] = 1;
        } else {
            obj[key]++;
        }
    }
    return result;
}
console.time('object key elapsed');
var res = unique(sourceArr);
console.timeEnd('object key elapsed');
// object key elapsed: 793.142822265625ms
console.log(res);
// Prints an array of length 14: [false, "true", Infinity, true, 0, [], {a: 1, b: 2}, {}, "false", "0", null, undefined, NaN, function(){}]
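The effect of sorting the keys can be checked in isolation. The sketch below uses a hypothetical sortKeys helper that mirrors the normalization step for plain objects; it only handles top-level keys, as the code above does.

```javascript
// Hypothetical helper: copy a plain object with its keys in sorted order.
function sortKeys(obj) {
    const sorted = {};
    Object.keys(obj).sort().forEach((key) => {
        sorted[key] = obj[key];
    });
    return sorted;
}

const first = { a: 1, b: 2 };
const second = { b: 2, a: 1 };

// JSON.stringify follows property insertion order, so the raw strings differ.
const rawEqual = JSON.stringify(first) === JSON.stringify(second);

// After key sorting both objects serialize to the same string.
const sortedEqual = JSON.stringify(sortKeys(first)) === JSON.stringify(sortKeys(second));

console.log(rawEqual, sortedEqual, JSON.stringify(sortKeys(second)));
// prints: false true {"a":1,"b":2}
```

Objects nested inside other objects would need the same treatment applied recursively, which this sketch does not attempt.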

Conclusion

| Method | Advantages | Disadvantages |
| --- | --- | --- |
| ES6 Set | Simple and elegant, fast. Recommended for primitive types. | Requires a newer ES version; does not deduplicate identical objects and arrays |
| indexOf | Common approach for ES5 environments, highly compatible, easy to understand | NaN is not deduplicated and needs special handling |
| includes | Deduplicates NaN | Requires a newer ES version; slower than the indexOf method |
| filter and indexOf | A higher-order function keeps the code short | indexOf cannot find NaN, so NaN is dropped |
| reduce + includes | A higher-order function keeps the code short | Requires ES7 or later; slower |
| Map structure | No significant advantage | Requires ES6 or later |
| Double nested loops with splice | High compatibility | Low performance, high time complexity, extremely slow; without Object.is, NaN needs special handling |
| sort | None | Time-consuming; the sorted order is unpredictable |
| filter and hasOwnProperty | Simple code; identical objects, arrays, and functions are deduplicated | Requires ES5 or later; slower than Set, and all plain objects are treated as the same data |
| Uniqueness of object keys | Elegant, handles a wide range of data. Recommended for objects. | The code is more complex |

My ability and experience are limited; corrections are welcome and appreciated.


Origin www.cnblogs.com/jztan/p/12444477.html