I have the following HashSet, whose elements must be TreeSets:
HashSet<TreeSet<Integer>> hash=new HashSet<TreeSet<Integer>>();
I want to store the elements of each TreeSet in a defined order; order matters, hence the choice of a TreeSet. The elements need to be ordered, but not necessarily sorted. However, I must return:
List<List<Integer>>
What is the most efficient way, in terms of performance, to convert my hash into a List of Lists of Integers?
Thank you
There is no shortcut: you still need to iterate over the outer collection and convert each inner set into a list. If performance is of the greatest importance, though, consider third-party collection libraries, some of which provide optimized copy operations.
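For reference, the plain Java conversion can be sketched as follows. This is a minimal example rather than a tuned implementation; the class name SetsToLists and the method name toLists are made up for illustration. Each inner list keeps the iteration order of its TreeSet, while the order of the outer list follows the HashSet's iteration order, which is unspecified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, named only for this example.
public class SetsToLists {

    // Copies every inner set into a new ArrayList.
    // The outer list is presized so it never has to grow.
    public static List<List<Integer>> toLists(Set<? extends Set<Integer>> sets) {
        List<List<Integer>> result = new ArrayList<>(sets.size());
        for (Set<Integer> inner : sets) {
            // The copy constructor preserves the inner set's iteration order.
            result.add(new ArrayList<>(inner));
        }
        return result;
    }

    public static void main(String[] args) {
        HashSet<TreeSet<Integer>> hash = new HashSet<>();
        hash.add(new TreeSet<>(Arrays.asList(3, 1, 2)));
        hash.add(new TreeSet<>(Arrays.asList(9, 7)));

        List<List<Integer>> lists = toLists(hash);
        System.out.println(lists);
    }
}
```

This is the same approach the plainJava benchmarks below measure.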
1. Benchmarks
I have written a couple of simple JMH benchmarks to compare those libraries with the plain Java solution:
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

// Assumed packages: Apache Commons Collections 4 and Eclipse Collections.
import org.apache.commons.collections4.CollectionUtils;
import org.eclipse.collections.impl.list.mutable.FastList;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class Set2ListTest {

    private static final int SMALL_SET_SIZE = 10;
    private static final int LARGE_SET_SIZE = 1000;

    public static void main(String[] args) throws RunnerException {
        // One warmup and one measurement iteration keep the run short,
        // but they also make the scores noisy.
        Options options = new OptionsBuilder()
                .include(Set2ListTest.class.getSimpleName())
                .forks(1)
                .threads(8)
                .warmupIterations(1)
                .measurementIterations(1)
                .build();
        new Runner(options).run();
    }

    // Shared input data: a set of `count` TreeSets, each filled with `count` random ints.
    @State(Scope.Benchmark)
    public static class Provider {

        Set<Set<Integer>> smallSet = new HashSet<>(SMALL_SET_SIZE);
        Set<Set<Integer>> largeSet = new HashSet<>(LARGE_SET_SIZE);

        @Setup
        public void setup() {
            fillSet(smallSet, SMALL_SET_SIZE);
            fillSet(largeSet, LARGE_SET_SIZE);
        }

        private void fillSet(Set<Set<Integer>> set, int count) {
            Random random = new Random();
            for (int i = 0; i < count; i++) {
                Set<Integer> innerSet = new TreeSet<>();
                for (int j = 0; j < count; j++) {
                    innerSet.add(random.nextInt(Integer.MAX_VALUE));
                }
                set.add(innerSet);
            }
        }
    }

    // Plain Java: ArrayList copy constructor.
    @Benchmark
    public void small_plainJava(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = new ArrayList<>(SMALL_SET_SIZE);
        for (Set<Integer> set : provider.smallSet) {
            list.add(new ArrayList<>(set));
        }
        blackhole.consume(list);
    }

    @Benchmark
    public void large_plainJava(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = new ArrayList<>(LARGE_SET_SIZE);
        for (Set<Integer> set : provider.largeSet) {
            list.add(new ArrayList<>(set));
        }
        blackhole.consume(list);
    }

    // Guava: Lists.newArrayList.
    @Benchmark
    public void small_guava(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = new ArrayList<>(SMALL_SET_SIZE);
        for (Set<Integer> set : provider.smallSet) {
            list.add(com.google.common.collect.Lists.newArrayList(set));
        }
        blackhole.consume(list);
    }

    @Benchmark
    public void large_guava(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = new ArrayList<>(LARGE_SET_SIZE);
        for (Set<Integer> set : provider.largeSet) {
            list.add(com.google.common.collect.Lists.newArrayList(set));
        }
        blackhole.consume(list);
    }

    // Apache Commons Collections: CollectionUtils.addAll into a presized ArrayList.
    @Benchmark
    public void small_commons(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = new ArrayList<>(SMALL_SET_SIZE);
        for (Set<Integer> set : provider.smallSet) {
            List<Integer> innerList = new ArrayList<>(SMALL_SET_SIZE);
            CollectionUtils.addAll(innerList, set);
            list.add(innerList);
        }
        blackhole.consume(list);
    }

    @Benchmark
    public void large_commons(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = new ArrayList<>(LARGE_SET_SIZE);
        for (Set<Integer> set : provider.largeSet) {
            List<Integer> innerList = new ArrayList<>(LARGE_SET_SIZE);
            CollectionUtils.addAll(innerList, set);
            list.add(innerList);
        }
        blackhole.consume(list);
    }

    // Eclipse Collections: FastList.newList.
    @Benchmark
    public void small_eclipse(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = FastList.newList();
        for (Set<Integer> set : provider.smallSet) {
            list.add(FastList.newList(set));
        }
        blackhole.consume(list);
    }

    @Benchmark
    public void large_eclipse(Provider provider, Blackhole blackhole) {
        List<List<Integer>> list = FastList.newList();
        for (Set<Integer> set : provider.largeSet) {
            list.add(FastList.newList(set));
        }
        blackhole.consume(list);
    }
}
2. Results
The results are as follows:
Benchmark                       Mode   Cnt        Score  Error  Units
Set2ListTest.large_commons      thrpt           183.205         ops/s
Set2ListTest.large_eclipse      thrpt           219.068         ops/s
Set2ListTest.large_guava        thrpt           178.478         ops/s
Set2ListTest.large_plainJava    thrpt           148.058         ops/s
Set2ListTest.small_commons      thrpt       2140560.198         ops/s
Set2ListTest.small_eclipse      thrpt       2619862.935         ops/s
Set2ListTest.small_guava        thrpt       2720692.868         ops/s
Set2ListTest.small_plainJava    thrpt       2472748.566         ops/s
3. Discussion
Two different sizes of Sets were used to measure performance: 10 sets of 10 elements (100 elements in total) and 1000 sets of 1000 elements (1,000,000 elements in total). The scores are throughput, so higher is better.
For the large Set, Eclipse Collections outperformed the other tools, and plain Java was the slowest solution.
For the small Set, Guava gave the best results and Commons Collections the worst, with plain Java in between.
The size of the collections clearly affects performance and can change which library is the best choice. I don't know the size of your data, so my goal is not to name the best collections library but to give you a tool for finding one.
4. Additional notes
I do not consider myself a specialist in any of these collection frameworks, and there might be better ways to use them.
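If those results are not satisfactory, you can use primitive collections, so that much less memory has to be allocated and no Integer boxing is needed. This can give a substantial performance boost, but it only helps if the required return type is allowed to change.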
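To illustrate the primitive idea without pulling in a library, here is a minimal sketch that unboxes one inner set into an int[]. The plain array stands in for a real primitive collection type, and the names PrimitiveCopy and toIntArray are hypothetical. Note that the result is no longer a List<Integer>, so this applies only when the caller can accept a different type.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class, named only for this example.
public class PrimitiveCopy {

    // Unboxes a set into an int[]; the set's iteration order is preserved.
    public static int[] toIntArray(Set<Integer> set) {
        int[] out = new int[set.size()];
        int i = 0;
        for (int value : set) {
            out[i++] = value;
        }
        return out;
    }

    public static void main(String[] args) {
        TreeSet<Integer> set = new TreeSet<>(Arrays.asList(5, 1, 3));
        System.out.println(Arrays.toString(toIntArray(set))); // prints [1, 3, 5]
    }
}
```

Whether this is actually faster for your data is something to measure with the benchmark above rather than assume.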
Note that if you experiment with these benchmarks, you should compare the implementations against each other on your own machine, not against the numbers I got, because absolute results depend on hardware.