I have two lists of strings and need to build a map that counts, for each string in the first list, how many strings in the second list contain it. If a string appears more than once within a single string of the second list, it should still be counted as one occurrence.
For example:
String[] listA={"the", "you" , "how"};
String[] listB = {"the dog ate the food", "how is the weather" , "how are you"};
The Map<String, Integer> map takes the strings from listA as keys and the number of occurrences as values. So the map will have these key-value pairs: ("the", 2), ("you", 1), ("how", 2).
Note: although "the" is repeated twice in "the dog ate the food", it is counted as only one occurrence because both appear in the same string.
How do I write this using Java streams? I tried the following approach, but it does not work:
Set<String> sentenceSet = Stream.of(listB).collect(Collectors.toSet());
Map<String, Long> frequency1 = Stream.of(listA)
        .filter(e -> sentenceSet.contains(e))
        .collect(Collectors.groupingBy(t -> t, Collectors.counting()));
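A likely reason the attempt above yields an empty map: sentenceSet holds whole sentences, so sentenceSet.contains(e) tests whether a single word equals an entire sentence, which is never true for this input. The sketch below wraps the same pipeline in a runnable class to show that; the class name AttemptDemo and the method name attempt are illustrative and not part of the original code.

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class AttemptDemo {
    // Same pipeline as the attempt in the question (hypothetical wrapper method).
    static Map<String, Long> attempt(String[] listA, String[] listB) {
        // The set elements are full sentences, not individual words.
        Set<String> sentenceSet = Stream.of(listB).collect(Collectors.toSet());
        return Stream.of(listA)
                // Set.contains checks equality with a whole sentence,
                // so none of the single words pass this filter.
                .filter(sentenceSet::contains)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    public static void main(String[] args) {
        String[] listA = {"the", "you", "how"};
        String[] listB = {"the dog ate the food", "how is the weather", "how are you"};
        System.out.println(attempt(listA, listB)); // prints {}
    }
}
```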
You need to extract all the words from listB and keep only those that are also listed in listA. Then you simply collect the word -> count pairs into a Map<String, Long>:
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

String[] listA = {"the", "you", "how"};
String[] listB = {"the dog ate the food", "how is the weather", "how are you"};
Set<String> qualified = new HashSet<>(Arrays.asList(listA)); // make searching easier
Map<String, Long> map = Arrays.stream(listB)           // stream the sentences
        .map(sentence -> sentence.split("\\s+"))       // split into words: Stream<String[]>
        .flatMap(words -> Arrays.stream(words)         // flatMap to Stream<String>
                .distinct())                           // ... as distinct words per sentence
        .filter(qualified::contains)                   // keep only the qualified words
        .collect(Collectors.groupingBy(                // collect to the Map
                Function.identity(),                   // ... the key is the word itself
                Collectors.counting()));               // ... the value is its frequency
Output:
{the=2, how=2, you=1}
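The question asks for a Map<String, Integer>, while Collectors.counting() produces Long values. As a minimal sketch of one way to get Integer values, the downstream collector can be swapped for Collectors.summingInt; the class name WordOccurrences and the method name countSentencesContaining are hypothetical and only used here for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.function.Function;
import java.util.stream.Collectors;

public class WordOccurrences {
    // For each word of listA, counts the sentences of listB that contain it
    // as a whitespace-separated word (hypothetical helper).
    static Map<String, Integer> countSentencesContaining(String[] listA, String[] listB) {
        Set<String> qualified = new HashSet<>(Arrays.asList(listA));
        return Arrays.stream(listB)
                // distinct() per sentence, so a repeated word counts once per sentence
                .flatMap(sentence -> Arrays.stream(sentence.split("\\s+")).distinct())
                .filter(qualified::contains)
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        Collectors.summingInt(word -> 1))); // Integer instead of Long
    }

    public static void main(String[] args) {
        String[] listA = {"the", "you", "how"};
        String[] listB = {"the dog ate the food", "how is the weather", "how are you"};
        // Copy into a TreeMap only to print the entries in sorted key order.
        System.out.println(new TreeMap<>(countSentencesContaining(listA, listB)));
        // prints {how=2, the=2, you=1}
    }
}
```

Note that a word from listA that occurs in no sentence is simply absent from the resulting map rather than mapped to 0, and that matching is exact and case-sensitive on whitespace-separated tokens.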