The scenario behind this measurement: results from multiple data sources have to be merged (sometimes it is just the result of a single sub-query), and at that point handing the processing to an SQL database is not necessarily reasonable (there is network latency as well).
Test 1: 100,000 rows of test data, 1,000 result rows.
Elapsed time for limit 20 offset 0, with the code as follows:

    package com.hundsun.ta.base.service;

    import com.hundsun.ta.utils.JsonUtils;
    import lombok.AllArgsConstructor;
    import lombok.NoArgsConstructor;

    import java.math.BigDecimal;
    import java.util.*;
    import java.util.stream.Collectors;

    import static java.util.stream.Collectors.*;

    /**
     * @author zjhua
     * @description
     * @date 2019/10/3 15:35
     */
    public class JavaStreamCommonSQLTest {
        public static void main(String[] args) {
            // 100,000 rows; name and age change together, so there are 1,000 groups
            List<Person> persons = new ArrayList<>();
            for (int i = 100000; i > 0; i--) {
                persons.add(new Person("Person " + (i + 1) % 1000, i % 100, i % 1000, new BigDecimal(i), i));
            }
            System.out.println(System.currentTimeMillis());
            // group by name, age -> avg(quantity), sum(quantity)
            Map<String, Map<Integer, Data>> result = persons.stream().collect(
                    groupingBy(Person::getName, Collectors.groupingBy(Person::getAge,
                            collectingAndThen(summarizingDouble(Person::getQuantity),
                                    dss -> new Data((long) dss.getAverage(), (long) dss.getSum())))));
            // flatten the two-level map into one row per (name, age) pair
            List<ResultGroup> list = new ArrayList<>();
            result.forEach((k, v) -> {
                v.forEach((ik, iv) -> {
                    ResultGroup e = new ResultGroup(k, ik, iv.average, iv.sum);
                    list.add(e);
                });
            });
            // order by sum, average
            list.sort(Comparator.comparing(ResultGroup::getSum).thenComparing(ResultGroup::getAverage));
            // limit 20 offset 0; subList returns a view, so the result has to be kept
            List<ResultGroup> page = list.subList(0, 20);
            System.out.println(System.currentTimeMillis());
            System.out.println(JsonUtils.toJson(page));
        }
    }

    @lombok.Data
    @NoArgsConstructor
    @AllArgsConstructor
    class Person {
        String name;
        int group;
        int age;
        BigDecimal balance;
        double quantity;
    }

    @lombok.Data
    @NoArgsConstructor
    @AllArgsConstructor
    @Deprecated
    class ResultGroup {
        String name;
        int group;
        long average;
        long sum;
    }

    class Data {
        long average;
        long sum;

        public Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }
    }

Start: 1570093479002
End: 1570093479235 (a little over 200 ms)
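The `limit 20 offset 0` step can also be written on the stream itself with `sorted`, `skip` and `limit`. The following is only a minimal sketch: the `StreamPaging` class and its `page` method are hypothetical names introduced for illustration and are not part of the test class above.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not part of the original test class.
class StreamPaging {

    // Rough equivalent of "ORDER BY <order> LIMIT <limit> OFFSET <offset>".
    static <T> List<T> page(List<T> rows, Comparator<? super T> order, long offset, long limit) {
        return rows.stream()
                .sorted(order)   // ORDER BY
                .skip(offset)    // OFFSET
                .limit(limit)    // LIMIT
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Values 100 down to 1, mirroring the descending loop used in the test.
        List<Integer> sums = new ArrayList<>();
        for (int i = 100; i > 0; i--) {
            sums.add(i);
        }
        // Ascending order, skip 10 rows, take 5.
        System.out.println(page(sums, Comparator.<Integer>naturalOrder(), 10, 5));
        // prints [11, 12, 13, 14, 15]
    }
}
```

Like the `subList` approach, this still sorts every row before the page is cut, so it should not be expected to change the measured time much.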
Test 2: 100,000 rows of test data, about 90,000 result rows.
Elapsed time for limit 20 offset 10000, with the code as follows:

    package com.hundsun.ta.base.service;

    import com.hundsun.ta.utils.JsonUtils;
    import lombok.AllArgsConstructor;
    import lombok.NoArgsConstructor;

    import java.math.BigDecimal;
    import java.util.*;
    import java.util.stream.Collectors;

    import static java.util.stream.Collectors.*;

    /**
     * @author zjhua
     * @description
     * @date 2019/10/3 15:35
     */
    public class JavaStreamCommonSQLTest {
        public static void main(String[] args) {
            // 100,000 rows; group is unique for i <= 90000, so there are roughly 90,000 groups
            List<Person> persons = new ArrayList<>();
            for (int i = 100000; i > 0; i--) {
                persons.add(new Person("Person " + (i + 1) % 1000, i > 90000 ? i % 10000 : i, i % 1000, new BigDecimal(i), i));
            }
            System.out.println(System.currentTimeMillis());
            // group by name, group -> avg(quantity), sum(quantity)
            Map<String, Map<Integer, Data>> result = persons.stream().collect(
                    groupingBy(Person::getName, Collectors.groupingBy(Person::getGroup,
                            collectingAndThen(summarizingDouble(Person::getQuantity),
                                    dss -> new Data((long) dss.getAverage(), (long) dss.getSum())))));
            // flatten the two-level map into one row per (name, group) pair
            List<ResultGroup> list = new ArrayList<>();
            result.forEach((k, v) -> {
                v.forEach((ik, iv) -> {
                    ResultGroup e = new ResultGroup(k, ik, iv.average, iv.sum);
                    list.add(e);
                });
            });
            // order by sum, average
            list.sort(Comparator.comparing(ResultGroup::getSum).thenComparing(ResultGroup::getAverage));
            System.out.println(list.size());
            // limit 20 offset 10000; subList returns a view, so the result has to be kept
            List<ResultGroup> page = list.subList(10000, 10020);
            System.out.println(System.currentTimeMillis());
            System.out.println(JsonUtils.toJson(page));
        }
    }

    @lombok.Data
    @NoArgsConstructor
    @AllArgsConstructor
    class Person {
        String name;
        int group;
        int age;
        BigDecimal balance;
        double quantity;
    }

    @lombok.Data
    @NoArgsConstructor
    @AllArgsConstructor
    @Deprecated
    class ResultGroup {
        String name;
        int group;
        long average;
        long sum;
    }

    class Data {
        long average;
        long sum;

        public Data(long average, long sum) {
            this.average = average;
            this.sum = sum;
        }
    }

Start: 1570093823404
End: 1570093823758 (a little over 350 ms)
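The elapsed times follow from subtracting the printed timestamps: 233 ms for the first run and 354 ms for the second. For repeated measurements, `System.nanoTime()` is usually a better fit than `System.currentTimeMillis()` because it is monotonic. The sketch below is only illustrative: `ElapsedTimer` and `elapsedMillis` are hypothetical names, and a single run like the ones above also includes JVM warm-up, so the figures are rough.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration; not part of the original test class.
class ElapsedTimer {

    // Runs the work once and returns the elapsed time in milliseconds.
    static long elapsedMillis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // The two runs above, computed from their printed timestamps.
        System.out.println(1570093479235L - 1570093479002L); // prints 233
        System.out.println(1570093823758L - 1570093823404L); // prints 354

        // Timing an arbitrary piece of work; the value varies from run to run.
        long ms = elapsedMillis(() -> {
            long total = 0;
            for (int i = 0; i < 1_000_000; i++) {
                total += i;
            }
        });
        System.out.println("elapsed ms: " + ms);
    }
}
```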
Overall, as things stand, Java streams cannot directly replace SQL at low cost. For example, a typical group by on more than one field is not supported directly and needs a multi-level map (which is not only complex but also performs poorly), and the statistics produced by a group by also have to be carried in a separate result class. The development cost would be too high.
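For comparison, a two-field group by can also be written with a composite key, which gives a single-level map instead of a nested one. This is a minimal sketch under assumptions, not a claim about performance: `CompositeKeyGrouping`, `Row` and `groupByNameAndAge` are hypothetical names, with `Row` standing in for a trimmed-down `Person`, and the key is simply a `List` of the grouped values. A separate holder for the statistics is still needed, here the JDK's `DoubleSummaryStatistics`.

```java
import java.util.Arrays;
import java.util.DoubleSummaryStatistics;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical example class; not part of the original test.
class CompositeKeyGrouping {

    // Hypothetical stand-in for Person with only the fields needed here.
    static class Row {
        final String name;
        final int age;
        final double quantity;

        Row(String name, int age, double quantity) {
            this.name = name;
            this.age = age;
            this.quantity = quantity;
        }
    }

    // Roughly "SELECT name, age, avg(quantity), sum(quantity) ... GROUP BY name, age".
    // The key is a List of (name, age); List.equals/hashCode compare by content.
    static Map<List<Object>, DoubleSummaryStatistics> groupByNameAndAge(List<Row> rows) {
        return rows.stream().collect(Collectors.groupingBy(
                (Row r) -> Arrays.<Object>asList(r.name, r.age),
                Collectors.summarizingDouble((Row r) -> r.quantity)));
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("Person 1", 30, 10),
                new Row("Person 1", 30, 20),
                new Row("Person 2", 40, 5));
        // Iteration order of the resulting map is not guaranteed.
        groupByNameAndAge(rows).forEach((key, stats) ->
                System.out.println(key + " avg=" + stats.getAverage() + " sum=" + stats.getSum()));
    }
}
```

Whether this lowers the development cost enough is a separate question; it only removes the nesting, and the sort and paging steps stay the same.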
Reference: https://stackoverflow.com/questions/32071726/java-8-stream-groupingby-with-multiple-collectors