I'm comparing files in two folders (acceptor and sender) using JCIFS. During the comparison, two situations may occur:
- the file does not exist at the acceptor
- the file exists at the acceptor
I need to get a map in which the compared files are grouped by these two cases, so I can copy the non-existing files or check the size and modification date of the existing ones.
I want to do it with lambdas and streams, because I will use parallel streams in the near future, and it's also convenient.
I've managed to make a working prototype method that checks whether file exists and creates a map:
    // assumes static imports of Collectors.collectingAndThen and Collectors.toMap
    private Map<String, Boolean> compareFiles(String[] acceptor, String[] sender) {
        return Arrays.stream(sender)
            .map(s -> new AbstractMap.SimpleEntry<>(s, Stream.of(acceptor).anyMatch(s::equals)))
            .collect(collectingAndThen(
                toMap(Map.Entry::getKey, Map.Entry::getValue),
                Collections::<String, Boolean>unmodifiableMap));
    }
but I can't add a higher-level grouping by the map value.
Here is my non-working attempt:
    private Map<String, Boolean> compareFiles(String[] acceptor, String[] sender) {
        return Arrays.stream(sender)
            .map(s -> new AbstractMap.SimpleEntry<>(s, Stream.of(acceptor).anyMatch(s::equals)))
            .collect(groupingBy(
                Map.Entry::getValue,
                groupingBy(Map.Entry::getKey, Map.Entry::getValue)));
    }
My code doesn't compile because I'm missing something important. Could anyone help me and explain how to make this lambda correct?
P.S. The arrays passed as method parameters come from SmbFile Samba directories:
    private final String master = "smb://192.168.1.118/mastershare/";
    private final String node = "smb://192.168.1.118/nodeshare/";
    SmbFile masterDir = new SmbFile(master);
    SmbFile nodeDir = new SmbFile(node);
    Map<Boolean, Map<String, Boolean>> resultingMap = compareFiles(masterDir, nodeDir);
Collecting into nested maps with the same values is not very useful. The resulting Map<Boolean, Map<String, Boolean>> can only have two keys, true and false. When you call get(true) on it, you’ll get a Map<String, Boolean> where all string keys redundantly map to true. Likewise, get(false) will give you a map where all values are false.
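For illustration only, here is a sketch of how the nested collection could be made to compile, so that the redundancy becomes visible: the inner collector has to be toMap rather than a second groupingBy, and the return type has to match the nested structure. The class name NestedMapDemo, the method name compareFilesNested and the file names are made up for this example.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.Map;
import java.util.stream.Collectors;

class NestedMapDemo {

    // Groups the entries by their boolean value; the inner map is built with toMap.
    static Map<Boolean, Map<String, Boolean>> compareFilesNested(String[] acceptor, String[] sender) {
        return Arrays.stream(sender)
                .map(s -> new AbstractMap.SimpleEntry<>(s, Arrays.asList(acceptor).contains(s)))
                .collect(Collectors.groupingBy(
                        Map.Entry::getValue,
                        Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        // Hypothetical directory listings, for illustration only.
        String[] acceptor = {"a.txt", "b.txt"};
        String[] sender = {"a.txt", "c.txt", "d.txt"};

        Map<Boolean, Map<String, Boolean>> nested = compareFilesNested(acceptor, sender);
        // Every value in the inner map simply repeats the outer key.
        System.out.println(nested.get(true));
        System.out.println(nested.get(false));
    }
}
```

Unlike partitioningBy, groupingBy only creates a key when at least one element falls into that group, so get(true) or get(false) may return null here.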
To me, it looks like you actually want
    private Map<Boolean, Set<String>> compareFiles(String[] acceptor, String[] sender) {
        return Arrays.stream(sender)
            .collect(partitioningBy(Arrays.asList(acceptor)::contains, toSet()));
    }
where get(true) gives you a set of all strings where the predicate evaluated to true and vice versa.
partitioningBy is an optimized version of groupingBy for boolean keys.
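A self-contained sketch of the partitioning approach, in case you want to try it outside your class. The class name PartitionDemo and the file names are assumptions made up for the example; only the compareFiles body follows the snippet above.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class PartitionDemo {

    // Splits the sender's names into those present at the acceptor (true)
    // and those missing there (false).
    static Map<Boolean, Set<String>> compareFiles(String[] acceptor, String[] sender) {
        return Arrays.stream(sender)
                .collect(Collectors.partitioningBy(
                        Arrays.asList(acceptor)::contains,
                        Collectors.toSet()));
    }

    public static void main(String[] args) {
        // Hypothetical directory listings, for illustration only.
        String[] acceptor = {"a.txt", "b.txt"};
        String[] sender = {"a.txt", "c.txt", "d.txt"};

        Map<Boolean, Set<String>> result = compareFiles(acceptor, sender);
        System.out.println("exists at acceptor:  " + result.get(true));
        System.out.println("missing at acceptor: " + result.get(false));
    }
}
```

With this sample input, get(true) holds only a.txt and get(false) holds c.txt and d.txt; the iteration order inside each set is not guaranteed.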
Note that Stream.of(acceptor).anyMatch(s::equals) is an overuse of Stream features. Arrays.asList(acceptor).contains(s) is simpler, and when it is used as a predicate like Arrays.asList(acceptor)::contains, the expression Arrays.asList(acceptor) will get evaluated only once and a function calling contains on each evaluation is passed to the collector.
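The evaluate-once behaviour can be observed with a small experiment. This is only a sketch: the class BoundReceiverDemo, the counter BUILDS and the helper asListCounted are invented for the demonstration and are not part of your code.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Predicate;

class BoundReceiverDemo {

    // Counts how often the list is built (hypothetical helper for this demo).
    static final AtomicInteger BUILDS = new AtomicInteger();

    static List<String> asListCounted(String[] names) {
        BUILDS.incrementAndGet();
        return Arrays.asList(names);
    }

    // Method reference: the receiver expression is evaluated once, up front.
    static int buildsWithMethodRef(String[] acceptor, String[] sender) {
        BUILDS.set(0);
        Predicate<String> bound = asListCounted(acceptor)::contains;
        Arrays.stream(sender).filter(bound).count();
        return BUILDS.get();
    }

    // Lambda: the whole body, including the list construction, runs per element.
    static int buildsWithLambda(String[] acceptor, String[] sender) {
        BUILDS.set(0);
        Predicate<String> lambda = s -> asListCounted(acceptor).contains(s);
        Arrays.stream(sender).filter(lambda).count();
        return BUILDS.get();
    }

    public static void main(String[] args) {
        String[] acceptor = {"a.txt", "b.txt"};
        String[] sender = {"a.txt", "c.txt", "d.txt"};
        System.out.println("method reference builds: " + buildsWithMethodRef(acceptor, sender));
        System.out.println("lambda builds:           " + buildsWithLambda(acceptor, sender));
    }
}
```

With three sender entries, the method reference variant builds the list once, while the lambda variant builds it three times.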
When acceptor gets large, you should consider not parallel processing but replacing the linear search with a hash lookup:
    private Map<Boolean, Set<String>> compareFiles(String[] acceptor, String[] sender) {
        return Arrays.stream(sender)
            .collect(partitioningBy(new HashSet<>(Arrays.asList(acceptor))::contains, toSet()));
    }
Again, the preparation work of new HashSet<>(Arrays.asList(acceptor)) is only done once, whereas the contains invocation, done for every element of sender, will not depend on the size of acceptor anymore.
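To connect this back to what you want to do with the two groups, here is a rough sketch that turns the partitioned result into a list of planned actions. The class SyncPlanDemo, the plan method and the action strings are placeholders I made up; the real copy and size/date checks would go where the strings are produced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class SyncPlanDemo {

    // Hash-based variant: lookup cost no longer depends on the acceptor size.
    static Map<Boolean, Set<String>> compareFiles(String[] acceptor, String[] sender) {
        return Arrays.stream(sender)
                .collect(Collectors.partitioningBy(
                        new HashSet<>(Arrays.asList(acceptor))::contains,
                        Collectors.toSet()));
    }

    // Builds a list of placeholder actions; names are sorted for stable output.
    static List<String> plan(String[] acceptor, String[] sender) {
        Map<Boolean, Set<String>> byExistence = compareFiles(acceptor, sender);
        List<String> actions = new ArrayList<>();
        byExistence.get(false).stream().sorted()
                .forEach(name -> actions.add("copy " + name));
        byExistence.get(true).stream().sorted()
                .forEach(name -> actions.add("check size and date of " + name));
        return actions;
    }

    public static void main(String[] args) {
        // Hypothetical directory listings, for illustration only.
        String[] acceptor = {"a.txt", "b.txt"};
        String[] sender = {"a.txt", "c.txt", "d.txt"};
        plan(acceptor, sender).forEach(System.out::println);
    }
}
```

Because partitioningBy always provides both the true and the false key, neither get call needs a null check.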