Inner join in Java using streams

nirvair :

I have two lists of maps:

orders

[
    {
        item_id=1, 
        item=item-1, 
        user_id=1
    },
    {
        item_id=2, 
        item=item-2, 
        user_id=2
    }, 
    {
        item_id=3, 
        item=item-3, 
        user_id=3
    }
]

users

[
    {
        user_id=1, 
        name=abh, 
        [email protected]
    }, 
    {
        user_id=2, 
        name=pol, 
        [email protected]
    }, 
    {
        user_id=3, 
        name=tre, 
        [email protected]
    }
]

Each of them is declared as

List<Map<String, String>> data

I want to do the equivalent of an SQL inner join on these lists of maps using streams.

I tried this:

List<Map<String, String>> collect = leftData.stream().flatMap(t1 -> rightData.stream())
                .filter(t -> t.get(joinColumnTableLeft).equals(t.get(joinColumnTableRight)))
                .collect(Collectors.toList());

This gives me a result of size size(users) * size(orders), which is 9, and the collected list contains only orders.

But I want each matching pair of maps to be merged into a single map, and then to collect those merged maps into a list.

I cannot use any external library for now.

ernest_k :

Assuming that you don't have duplicate entries (by the merge column key), you can use a method like this to merge.

This builds a map from the join-column value to the full row map for one of the lists, then uses it for lookup while iterating through the other list.

import java.util.*;
import java.util.function.Function;
import java.util.stream.Collectors;

static List<Map<String, String>> merge(List<Map<String, String>> left, 
       List<Map<String, String>> right, String joinColumnTableLeft,
       String joinColumnTableRight) {

    Map<String, Map<String, String>> rightById = right.stream()
            .collect(Collectors.toMap(m -> m.get(joinColumnTableRight), 
                                      Function.identity()));

    return left.stream()
               .filter(e -> rightById.containsKey(e.get(joinColumnTableLeft)))
               .map(l -> {
                 Map<String, String> all = new HashMap<>();
                 all.putAll(l);
                 all.putAll(rightById.get(l.get(joinColumnTableLeft)));

                 return all;
               })
               .collect(Collectors.toList());
}

As a test:

Map<String, String> left1 = new HashMap<>(), right1 = new HashMap<>();
left1.put("a", "A");
left1.put("b", "B");
left1.put("c", "C");

right1.put("a", "A");
right1.put("d", "B");

Map<String, String> left2 = new HashMap<>(), right2 = new HashMap<>();
left2.put("a", "AA");
left2.put("b", "BB");
left2.put("c", "CC");

right2.put("a", "AA");
right2.put("d", "BB");

System.out.println(merge(Arrays.asList(left1, left2), 
        Arrays.asList(right1, right2), "a", "a"));

The output is: [{a=A, b=B, c=C, d=B}, {a=AA, b=BB, c=CC, d=BB}]

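The order of entries within each map isn't important, though. Just note that this assumes there are no overlapping keys other than the join column; if there are, the value from the right map overwrites the one from the left, because it is put last. In that case, you may want to collect pairs of maps instead of calling putAll on a new map.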
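One way to collect pairs rather than merged maps is sketched below. The class PairJoin, the holder type JoinedRow and the method joinPairs are names made up for this illustration, and the sample rows are invented; like the merge above, it assumes the join value is unique on the right side.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class PairJoin {

    // Hypothetical holder: keeps both rows side by side, so a key that
    // exists in both maps keeps both of its values.
    static final class JoinedRow {
        final Map<String, String> left;
        final Map<String, String> right;

        JoinedRow(Map<String, String> left, Map<String, String> right) {
            this.left = left;
            this.right = right;
        }
    }

    static List<JoinedRow> joinPairs(List<Map<String, String>> left,
            List<Map<String, String>> right,
            String joinColumnTableLeft, String joinColumnTableRight) {

        // Index the right rows by join value (toMap throws
        // IllegalStateException if a join value appears twice).
        Map<String, Map<String, String>> rightById = right.stream()
                .collect(Collectors.toMap(m -> m.get(joinColumnTableRight),
                                          Function.identity()));

        return left.stream()
                .filter(l -> rightById.containsKey(l.get(joinColumnTableLeft)))
                .map(l -> new JoinedRow(l, rightById.get(l.get(joinColumnTableLeft))))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Invented sample rows: both sides have a "name" key.
        Map<String, String> order = new HashMap<>();
        order.put("user_id", "1");
        order.put("name", "item-1");

        Map<String, String> user = new HashMap<>();
        user.put("user_id", "1");
        user.put("name", "abh");

        List<JoinedRow> rows = joinPairs(Arrays.asList(order),
                Arrays.asList(user), "user_id", "user_id");

        // Both "name" values survive because the maps are not merged.
        System.out.println(rows.get(0).left.get("name"));   // item-1
        System.out.println(rows.get(0).right.get("name"));  // abh
    }
}
```

The caller then decides how to resolve or rename the clashing keys, instead of having putAll silently pick the right-hand value.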


The following will support duplicate join keys (and will produce a Cartesian product of all matching entries per key):

static List<Map<String, String>> merge(List<Map<String, String>> left, 
        List<Map<String, String>> right,
        String joinColumnTableLeft, String joinColumnTableRight) {

    Map<String, List<Map<String, String>>> rightById = right.stream()
            .collect(Collectors.groupingBy(m -> m.get(joinColumnTableRight)));

    return left.stream()
            .filter(e -> rightById.containsKey(e.get(joinColumnTableLeft)))
            .flatMap(l -> rightById.get(l.get(joinColumnTableLeft)).stream()
                    .map(r -> {
                        Map<String, String> all = new HashMap<>();
                        all.putAll(l);
                        all.putAll(r);

                        return all;
                    }))
            .collect(Collectors.toList());
}
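As a rough illustration of the duplicate-key behaviour, the sketch below runs the same grouping-based join on data shaped like the question's users and orders. The class name JoinDemo and the helper row are made up for this example, the sample rows are adapted (user 1 is given two orders, user 3 none) and the email field is left out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class JoinDemo {

    // Same grouping-based join as above, repeated so the example compiles on its own.
    static List<Map<String, String>> merge(List<Map<String, String>> left,
            List<Map<String, String>> right,
            String joinColumnTableLeft, String joinColumnTableRight) {

        // Group right rows by join value; one key may map to several rows.
        Map<String, List<Map<String, String>>> rightById = right.stream()
                .collect(Collectors.groupingBy(m -> m.get(joinColumnTableRight)));

        return left.stream()
                .filter(l -> rightById.containsKey(l.get(joinColumnTableLeft)))
                .flatMap(l -> rightById.get(l.get(joinColumnTableLeft)).stream()
                        .map(r -> {
                            Map<String, String> all = new HashMap<>(l);
                            all.putAll(r);
                            return all;
                        }))
                .collect(Collectors.toList());
    }

    // Hypothetical helper: builds a row from alternating key/value arguments.
    static Map<String, String> row(String... kv) {
        Map<String, String> m = new HashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put(kv[i], kv[i + 1]);
        }
        return m;
    }

    public static void main(String[] args) {
        List<Map<String, String>> users = Arrays.asList(
                row("user_id", "1", "name", "abh"),
                row("user_id", "2", "name", "pol"),
                row("user_id", "3", "name", "tre"));

        // Adapted sample data: user 1 has two orders, user 3 has none.
        List<Map<String, String>> orders = Arrays.asList(
                row("item_id", "1", "item", "item-1", "user_id", "1"),
                row("item_id", "2", "item", "item-2", "user_id", "1"),
                row("item_id", "3", "item", "item-3", "user_id", "2"));

        List<Map<String, String>> joined = merge(users, orders, "user_id", "user_id");

        for (Map<String, String> r : joined) {
            System.out.println(r.get("name") + " -> " + r.get("item"));
        }
        // prints:
        // abh -> item-1
        // abh -> item-2
        // pol -> item-3
    }
}
```

User 3 is dropped because no order refers to it, which is the inner-join behaviour, and user 1 appears once per matching order.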

