Four common ways to deduplicate a Java List, with performance comparison data (written in Kotlin)

Deduplicating the elements of a List is something we do often in day-to-day project work. Here are four commonly used deduplication approaches, along with performance comparison data.

Algorithm source code

package i

import java.util.*
import kotlin.collections.HashSet

/**
 * @author: Jack
 * 2020-03-28 13:33
 */


/**
 * Loop over the list and use List.contains to skip values already collected.
 * Keeps the order of first occurrence.
 */
fun uniqList1(list: List<Int>): List<Int> {
    val result = mutableListOf<Int>()
    for (e in list) {
        if (!result.contains(e)) {
            result.add(e)
        }
    }
    return result
}

/**
 * Use a HashSet. The order of the result is not guaranteed.
 */
fun uniqList2(list: List<Int>): List<Int> {
    val set = HashSet(list)
    val result = mutableListOf<Int>()
    result.addAll(set)
    return result
}

/**
 * Use a TreeSet. The result comes back sorted in ascending order.
 */
fun uniqList3(list: List<Int>): List<Int> {
    val set = TreeSet(list)
    val result = mutableListOf<Int>()
    result.addAll(set)
    return result
}

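/**
 * Use distinct(). The original comment calls this the "stream API", but
 * list.distinct() is Kotlin's standard-library extension function, not
 * java.util.stream. Keeps the order of first occurrence.
 */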
fun uniqList4(list: List<Int>): List<Int> {
    return list.distinct()
}


/**
 * Performance test code
 */
fun main() {
    val list1 = mutableListOf<Int>()
    val list2 = mutableListOf<Int>()
    val list3 = mutableListOf<Int>()
    val list4 = mutableListOf<Int>()

    val random = Random()

    // Note: 0..10000000 is an inclusive range, so each list holds 10,000,001 values.
    for (i in 0..10000000) {
        val value = random.nextInt(500)
        list1.add(value)
        list2.add(value)
        list3.add(value)
        list4.add(value)
    }

    var startTime = 0L
    var endTime = 0L

    startTime = System.currentTimeMillis()
    uniqList1(list1)
    endTime = System.currentTimeMillis()
    println("contains method uniq1:${endTime - startTime}")

    startTime = System.currentTimeMillis()
    uniqList2(list2)
    endTime = System.currentTimeMillis()
    println("HashSet uniq2:${endTime - startTime}")

    startTime = System.currentTimeMillis()
    uniqList3(list3)
    endTime = System.currentTimeMillis()
    println("TreeSet uniq3:${endTime - startTime}")

    startTime = System.currentTimeMillis()
    uniqList4(list4)
    endTime = System.currentTimeMillis()
    println("stream API uniq4:${endTime - startTime}")

}
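
The four approaches do not return the same list: they differ in the order of the result. The sketch below is a small illustration, not part of the original test. The function names (dedupContains, dedupHashSet, dedupTreeSet, dedupDistinct, dedupStream) are illustrative stand-ins, and dedupStream is an added variant that really does go through java.util.stream, for comparison with Kotlin's distinct().

```kotlin
import java.util.TreeSet
import java.util.stream.Collectors

// Illustrative stand-ins for the four approaches in the article.
fun dedupContains(list: List<Int>): List<Int> {
    val result = mutableListOf<Int>()
    for (e in list) if (e !in result) result.add(e)
    return result
}

fun dedupHashSet(list: List<Int>): List<Int> = HashSet(list).toList()

fun dedupTreeSet(list: List<Int>): List<Int> = TreeSet(list).toList()

fun dedupDistinct(list: List<Int>): List<Int> = list.distinct()

// Added for comparison (not in the article): the actual Java Stream API.
fun dedupStream(list: List<Int>): List<Int> =
    list.stream().distinct().collect(Collectors.toList())

fun main() {
    val input = listOf(30, 10, 30, 20, 10)

    println(dedupContains(input)) // [30, 10, 20]  first-occurrence order
    println(dedupDistinct(input)) // [30, 10, 20]  first-occurrence order
    println(dedupStream(input))   // [30, 10, 20]  first-occurrence order
    println(dedupTreeSet(input))  // [10, 20, 30]  sorted ascending
    println(dedupHashSet(input))  // same three values, order not guaranteed
}
```

If the caller depends on the original order, only the contains loop, distinct() and the stream variant are drop-in replacements for each other; the HashSet and TreeSet versions answer a slightly different question.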

Performance test results

Performance data (elapsed time in milliseconds):

contains method uniq1:1648
HashSet uniq2:344
TreeSet uniq3:598
stream API uniq4:177

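Conclusion: on a list of about 10 million elements, uniqList4 (labelled "stream API" in the output, although the code calls Kotlin's list.distinct()) was the fastest in this test. Each method was timed once with System.currentTimeMillis(), so the figures are a rough indication rather than a precise benchmark.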
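
A single timed call on the JVM includes JIT compilation and other one-off costs, so the ranking can shift between runs. The following is a minimal sketch of a more repeatable measurement, under stated assumptions: timeBest and Timing are hypothetical helpers, not part of the original test, and the warm-up and run counts (3 and 5), the fixed random seed and the list size of 1,000,000 are arbitrary choices. It runs each candidate a few times untimed, then reports the best of several timed runs using System.nanoTime().

```kotlin
import java.util.Random
import java.util.TreeSet

// Hypothetical result holder: best elapsed time and size of the deduplicated list.
data class Timing(val bestMillis: Long, val resultSize: Int)

// Hypothetical helper: run `block` untimed `warmups` times,
// then return the fastest of `runs` timed executions.
fun timeBest(warmups: Int, runs: Int, block: () -> List<Int>): Timing {
    repeat(warmups) { block() }
    var bestNanos = Long.MAX_VALUE
    var size = 0
    repeat(runs) {
        val start = System.nanoTime()
        val result = block()
        val elapsed = System.nanoTime() - start
        size = result.size
        bestNanos = minOf(bestNanos, elapsed)
    }
    return Timing(bestNanos / 1_000_000, size)
}

fun main() {
    val random = Random(42) // fixed seed so every run sees the same data
    val data = List(1_000_000) { random.nextInt(500) }

    val candidates = listOf(
        "HashSet" to { l: List<Int> -> HashSet(l).toList() },
        "TreeSet" to { l: List<Int> -> TreeSet(l).toList() },
        "distinct()" to { l: List<Int> -> l.distinct() }
    )

    for ((name, dedup) in candidates) {
        val t = timeBest(warmups = 3, runs = 5) { dedup(data) }
        println("$name: best ${t.bestMillis} ms, ${t.resultSize} unique values")
    }
}
```

For results meant to be published or compared across machines, a dedicated harness such as JMH is the more dependable route; this sketch only removes the most obvious sources of noise.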

Test machine configuration:

Model Name:   MacBook Pro
  Model Identifier:    MacBookPro15,1
  Processor Name:    Intel Core i7
  Processor Speed:    2.6 GHz
  Number of Processors:    1
  Total Number of Cores:  6
  L2 Cache (per Core):   256 KB
  L3 Cache:    12 MB
  Hyper-Threading Technology:    Enabled
  Memory:   16 GB
  Boot ROM Version:  220.260.171.0.0 (iBridge: 16.16.5200.0.0,0)
  Serial Number (system):  C02Z43JXLVCF
  Hardware UUID:  FAAEE2DB-8F7C-54B1-A0B7-F286C27EA35F

Memory Slots:

  ECC:  Disabled
  Upgradeable Memory:    No

BANK 0/ChannelA-DIMM0:

  Size:   8 GB
  Type:   DDR4
  Speed:   2400 MHz
  Status:   OK
  Manufacturer:  Micron
  Part Number:  8ATF1G64HZ-2G6E1
  Serial Number:  -

BANK 2/ChannelB-DIMM0:

  Size:   8 GB
  Type:   DDR4
  Speed:   2400 MHz
  Status:   OK
  Manufacturer:  Micron
  Part Number:  8ATF1G64HZ-2G6E1
  Serial Number:  -



Origin blog.csdn.net/universsky2015/article/details/105175751