Extracting Map<K, Multiset<V>> from Stream of Streams in Java 8

Anoop :

I have a Stream of Streams of words (this format is not set by me and cannot be changed). For example:

Stream<String> doc1 = Stream.of("how", "are", "you", "doing", "doing", "doing");
Stream<String> doc2 = Stream.of("what", "what", "you", "upto");
Stream<String> doc3 = Stream.of("how", "are", "what", "how");
Stream<Stream<String>> docs = Stream.of(doc1, doc2, doc3);

I'm trying to get this into a Map<String, Multiset<Integer>> (or its corresponding stream, as I want to process it further), where the key String is the word itself and the Multiset<Integer> holds the number of times that word appears in each document (zeros should be excluded). Multiset is a Google Guava class (not from java.util).

For example:

how   -> {1, 2}  // once in doc1, twice in doc3, none in doc2 (so doc2's count is not included)
are   -> {1, 1}  // once in doc1 and once in doc3
you   -> {1, 1}  // once in doc1 and once in doc2
doing -> {3}     // three times in doc1, none in the others
what  -> {2, 1}  // twice in doc2 and once in doc3
upto  -> {1}     // once in doc2

What is a good way to do this in Java 8?

I tried using flatMap, but the inner Stream greatly limits the options I have.

Eugene :
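// Needs java.util.List, java.util.Map, java.util.Map.Entry,
// java.util.function.Function and java.util.stream.Collectors.
Map<String, List<Long>> map = docs
        // count the words of each document, then stream that document's (word, count) entries
        .flatMap(inner -> inner
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()))
                .entrySet()
                .stream())
        // group the per-document counts by word
        .collect(Collectors.groupingBy(
                Entry::getKey,
                Collectors.mapping(Entry::getValue, Collectors.toList())));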

System.out.println(map);

// {upto=[1], how=[1, 2], doing=[3], what=[2, 1], are=[1, 1], you=[1, 1]}
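
The snippet above produces List<Long> values, while the question asked for Integer counts. As a rough, JDK-only sketch of the same two-step grouping (count per document, then group the counts by word), the version below swaps Collectors.counting() for Collectors.summingInt(w -> 1) so the counts come out as Integer. The class name WordCountsPerDoc and the method name countsPerDoc are made up for illustration and do not come from the thread.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical class name, used only for this sketch.
class WordCountsPerDoc {

    // Counts each word inside a single document, then groups those
    // per-document counts by word across all documents. A word that is
    // absent from a document contributes no entry, so zeros never appear.
    static Map<String, List<Integer>> countsPerDoc(Stream<Stream<String>> docs) {
        return docs
                .flatMap(doc -> doc
                        // summingInt(w -> 1) is a substitute for counting() that yields Integer
                        .collect(Collectors.groupingBy(
                                Function.identity(),
                                Collectors.summingInt(w -> 1)))
                        .entrySet()
                        .stream())
                .collect(Collectors.groupingBy(
                        Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        Stream<String> doc1 = Stream.of("how", "are", "you", "doing", "doing", "doing");
        Stream<String> doc2 = Stream.of("what", "what", "you", "upto");
        Stream<String> doc3 = Stream.of("how", "are", "what", "how");

        // Key order in the printed map is not guaranteed (HashMap).
        System.out.println(countsPerDoc(Stream.of(doc1, doc2, doc3)));
    }
}
```

If Guava 21 or later is on the classpath, replacing Collectors.toList() with ImmutableMultiset.toImmutableMultiset() should give Multiset<Integer> values as originally requested. That variant is left out here to keep the sketch free of external dependencies.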
