Four ways to deduplicate a Java List and how their efficiency compares

Almost every developer uses List collections in day-to-day development. Sooner or later a List ends up holding duplicate data that has to be removed. There are several ways to deduplicate, so which one do you use, and is it the most efficient one? This article walks through four common ways to deduplicate a List and then compares their efficiency.

01

Implementation idea: use two nested for loops to walk through all elements of the collection, compare them pairwise, and remove any later element that equals an earlier one. This is the first approach most people think of and the simplest to implement. It also keeps the original order of the List unchanged.

Code:

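/**
 * Deduplicates a List using two nested for loops.
 * @param list the list to deduplicate (modified in place)
 * @return the same list with duplicates removed
 */
public static List<String> repeatListWayOne(List<String> list){
   for(int i = 0;i < list.size();i++){
       for(int j = i+1;j < list.size();j++){
           if(list.get(i).equals(list.get(j))){
               list.remove(j);
               //remove(j) shifts later elements left, so recheck index j
               j--;
           }
       }
   }
   return list;
}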
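
The index handling in the inner loop is easy to get wrong: after `remove(j)`, the next element slides into position `j` and must be compared as well. The following is a minimal sketch of that detail, assuming a hypothetical class name `NestedLoopDedupDemo`, a hypothetical method name `dedupInPlace`, and a made-up sample list with three equal elements in a row.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; the names are illustrative and not from the article.
class NestedLoopDedupDemo {

    // Removes later duplicates in place, keeping the first occurrence of each value.
    static List<String> dedupInPlace(List<String> list) {
        for (int i = 0; i < list.size(); i++) {
            for (int j = i + 1; j < list.size(); j++) {
                if (list.get(i).equals(list.get(j))) {
                    list.remove(j);
                    // Step back so the element that shifted into index j is checked too.
                    j--;
                }
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> sample =
                new ArrayList<>(Arrays.asList("a", "a", "a", "b", "a", "c", "b"));
        // First occurrences survive, in their original order.
        System.out.println(dedupInPlace(sample)); // prints [a, b, c]
    }
}
```

Without the `j--` step, the second of three consecutive equal elements would be skipped and a duplicate would remain in the list.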

 

02

Implementation idea: HashSet implements the Set interface, which does not allow duplicate elements. Building on that, we can put all elements of the List into a HashSet, clear the List, and then add all elements of the HashSet back to the List, which guarantees that no element is duplicated. HashSet also has a constructor that takes a collection, so the elements can be added directly at initialization. Note that HashSet does not guarantee any iteration order, so this way cannot keep the original order of the List.

Code:

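/**
 * Deduplicates a List using a HashSet.
 * @param list the list to deduplicate (modified in place)
 * @return the same list with duplicates removed, in no particular order
 */
public static List<String> repeatListWayTwo(List<String> list){
  //create the HashSet and initialize it with the elements of the list
  Set<String> set = new HashSet<>(list);
  //remove all elements from the List
  list.clear();
  //add the elements of the HashSet back to the List
  list.addAll(set);
  return list;
}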
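
If the original order matters, one common alternative is `LinkedHashSet`, which iterates in insertion order. It is not one of the four ways compared in this article, so it has no timing here. A minimal sketch, assuming a hypothetical class `LinkedHashSetDedupDemo` and a hypothetical helper `dedupKeepOrder`:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; not part of the article's comparison.
class LinkedHashSetDedupDemo {

    // Same three steps as the HashSet way, but the set remembers insertion order.
    static List<String> dedupKeepOrder(List<String> list) {
        Set<String> set = new LinkedHashSet<>(list);
        list.clear();
        list.addAll(set);
        return list;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>(Arrays.asList("c", "a", "c", "b", "a"));
        System.out.println(dedupKeepOrder(sample)); // prints [c, a, b]
    }
}
```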

 

03

Implementation idea: TreeSet also implements the Set interface; it is a sorted collection with no duplicate elements. We can therefore deduplicate with the same idea as the second way. Note that TreeSet orders its elements by their natural ordering (or by a supplied Comparator), so the deduplicated List comes back sorted rather than in its original order.

Code:

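/**
 * Deduplicates a List using a TreeSet.
 * @param list the list to deduplicate (modified in place)
 * @return the same list with duplicates removed, in sorted order
 */
public static List<String> repeatListWayThird(List<String> list){
   //create the TreeSet and initialize it with the elements of the list
   Set<String> set = new TreeSet<>(list);
   //remove all elements from the List
   list.clear();
   //add the elements of the TreeSet back to the List
   list.addAll(set);
   return list;
}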
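
To make the ordering behaviour of TreeSet concrete, here is a small sketch, assuming a hypothetical class `TreeSetOrderDemo`, a hypothetical method `dedupSorted`, and made-up sample values. Strings are compared lexicographically, so the result follows neither the original order nor numeric order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; the sample values are illustrative only.
class TreeSetOrderDemo {

    // Deduplicates through a TreeSet, so the result is in natural (sorted) order.
    static List<String> dedupSorted(List<String> list) {
        Set<String> set = new TreeSet<>(list);
        list.clear();
        list.addAll(set);
        return list;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>(
                Arrays.asList("geshan2", "geshan10", "geshan2", "geshan1"));
        // "geshan10" sorts before "geshan2" because '1' < '2' character by character.
        System.out.println(dedupSorted(sample)); // prints [geshan1, geshan10, geshan2]
    }
}
```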

 

04

Implementation idea: loop through the List and use its contains method. First create a new List, then loop through the original List and check whether the new List already contains each element. If it does, skip the element; otherwise add it. Finally, clear the old List and add the elements of the new List back to it.

Code:

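/**
 * Deduplicates a List by looping and using the List contains method.
 * @param list the list to deduplicate (modified in place)
 * @return the same list with duplicates removed, in original order
 */
public static List<String> repeatListWayFourth(List<String> list){
   //create a new List to hold the deduplicated elements
   List<String> newList = new ArrayList<String>();
   //loop through the elements of the old List
   for(String item : list){
       //if the new List does not contain the element yet, store it
       if(!newList.contains(item)){
           newList.add(item);
       }
   }
   //remove all elements from the old List
   list.clear();
   //add the elements of the new List back to the old List
   list.addAll(newList);
   return list;
}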
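
`newList.contains` scans the new List for every element, so the work grows with the number of distinct values. A possible variant, which is an assumption of this sketch and not part of the comparison in this article, tracks the values already seen in a `HashSet` and relies on `Set.add` returning `false` when the value is already present. The class name `SeenSetDedupDemo` and method name `dedupWithSeenSet` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical demo class; a variant of the contains-based loop.
class SeenSetDedupDemo {

    // Keeps the first occurrence of each value, preserving the original order.
    static List<String> dedupWithSeenSet(List<String> list) {
        Set<String> seen = new HashSet<>();
        List<String> result = new ArrayList<>();
        for (String item : list) {
            // add returns true only the first time a value is inserted
            if (seen.add(item)) {
                result.add(item);
            }
        }
        list.clear();
        list.addAll(result);
        return list;
    }

    public static void main(String[] args) {
        List<String> sample = new ArrayList<>(Arrays.asList("b", "a", "b", "c", "a"));
        System.out.println(dedupWithSeenSet(sample)); // prints [b, a, c]
    }
}
```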

Those are four ways to deduplicate a List. So which one is the most efficient? The comparison below shows it.

For the demonstration, we randomly generate 20,000 strings, each built from an integer in the range [0, 500), store them in a List, and print the time each way takes. The code that generates the random List is as follows:

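/**
 * Randomly generates 20000 strings built from integers in [0,500) and stores them in a List.
 * @return the generated list
 */
public static List<String> getRandomList(){
   List<String> list = new ArrayList<String>();
   //create one Random instance and reuse it for every element
   Random random = new Random();
   //randomly generate 20000 integer strings
   for(int i = 1; i <= 20000; i++){
       //any integer in [0,500): 0 is included, 500 is excluded
       int number = random.nextInt(500);
       String numberStr = "geshan"+number;
       list.add(numberStr);
   }
   return list;
}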
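
Because the List is random, every program run deduplicates different data. For repeatable comparisons, a fixed seed can be passed to `Random`. A sketch of that idea, assuming a hypothetical class `SeededListDemo`, a hypothetical `getRandomList(long seed)` variant, and an arbitrary seed value:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical demo class; the seed parameter is an addition for illustration.
class SeededListDemo {

    // Generates 20000 strings of the form "geshan" + n, with n in [0, 500).
    static List<String> getRandomList(long seed) {
        Random random = new Random(seed);
        List<String> list = new ArrayList<>();
        for (int i = 0; i < 20000; i++) {
            list.add("geshan" + random.nextInt(500));
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> first = getRandomList(42L);
        List<String> second = getRandomList(42L);
        // The same seed produces the same sequence of values.
        System.out.println(first.equals(second)); // prints true
        System.out.println(first.size());         // prints 20000
    }
}
```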

To make sure every way works on the same elements, we create four copies of the List, one for each deduplication way. The efficiency comparison code is as follows:

public static void main(String[] args){
   //randomly generate a List of 20000 strings built from integers in [0,500)
   List<String> list = getRandomList();

   //to compare the four ways, create four copies so that each way sees the same elements
   //List for way one
   List<String> oneList = new ArrayList<>(list);
   //List for way two
   List<String> twoList = new ArrayList<>(list);
   //List for way three
   List<String> thirdList = new ArrayList<>(list);
   //List for way four
   List<String> fourthList = new ArrayList<>(list);

   //note: in each block the end time is taken after the println,
   //so the measured time also includes printing the deduplicated list
   System.out.println("Way one: deduplicate the List with two for loops");
   System.out.println("Original size: "+oneList.size()+", elements>>"+oneList);
   Date oneDateBegin = new Date();
   repeatListWayOne(oneList);
   System.out.println("Size after deduplication: "+oneList.size()+", elements>>"+oneList);
   Date oneDateEnd = new Date();
   System.out.println("Time taken: "+(oneDateEnd.getTime()-oneDateBegin.getTime())+" ms");

   System.out.println("Way two: deduplicate the List with HashSet");
   System.out.println("Original size: "+twoList.size()+", elements>>"+twoList);
   Date twoDateBegin = new Date();
   repeatListWayTwo(twoList);
   System.out.println("Size after deduplication: "+twoList.size()+", elements>>"+twoList);
   Date twoDateEnd = new Date();
   System.out.println("Time taken: "+(twoDateEnd.getTime()-twoDateBegin.getTime())+" ms");

   System.out.println("Way three: deduplicate the List with TreeSet");
   System.out.println("Original size: "+thirdList.size()+", elements>>"+thirdList);
   Date thirdDateBegin = new Date();
   repeatListWayThird(thirdList);
   System.out.println("Size after deduplication: "+thirdList.size()+", elements>>"+thirdList);
   Date thirdDateEnd = new Date();
   System.out.println("Time taken: "+(thirdDateEnd.getTime()-thirdDateBegin.getTime())+" ms");

   System.out.println("Way four: deduplicate the List by looping with the contains method");
   System.out.println("Original size: "+fourthList.size()+", elements>>"+fourthList);
   Date fourthDateBegin = new Date();
   repeatListWayFourth(fourthList);
   System.out.println("Size after deduplication: "+fourthList.size()+", elements>>"+fourthList);
   Date fourthDateEnd = new Date();
   System.out.println("Time taken: "+(fourthDateEnd.getTime()-fourthDateBegin.getTime())+" ms");
}
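
Console output is slow compared with deduplicating a small List, so it helps to keep `println` calls outside the timed region. The following is a hedged sketch of a tighter measurement, assuming a hypothetical class `DedupTimingDemo` and a hypothetical `timeMillis` helper that wraps only the call in `System.nanoTime()`. It is still a rough single-run measurement without JVM warm-up, not a rigorous benchmark.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical demo class; helper names are illustrative and not from the article.
class DedupTimingDemo {

    // Runs the action on the list and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Consumer<List<String>> action, List<String> list) {
        long start = System.nanoTime();
        action.accept(list);
        return (System.nanoTime() - start) / 1_000_000;
    }

    // HashSet-based deduplication, used here only as the code under measurement.
    static void dedupWithHashSet(List<String> list) {
        HashSet<String> set = new HashSet<>(list);
        list.clear();
        list.addAll(set);
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>(Arrays.asList("a", "b", "a", "c", "b"));
        long elapsed = timeMillis(DedupTimingDemo::dedupWithHashSet, data);
        // Printing happens after the timer has stopped.
        System.out.println("Size after deduplication: " + data.size()
                + ", time taken: " + elapsed + " ms");
    }
}
```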

The results of several runs were as follows (times in milliseconds, for ways one to four in order):

Run 1: 223, 10, 16, 30

Run 2: 164, 10, 17, 43

Run 3: 164, 9, 16, 37

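Taking both the code and the run times into account, the second way (HashSet) comes out best in this test: its code is the simplest and its run time is the shortest. Keep in mind that it does not preserve the original order of the List. Which way do you usually use to deduplicate a List?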

Origin www.cnblogs.com/geshanzsq/p/10392781.html