Java Stream reduce unexplained behaviour

tangokhi :

Can anyone point me in the right direction? I cannot understand the behaviour below.

I am executing the following method.

private static void reduce_parallelStream() {
    List<String> vals = Arrays.asList("a", "b");

    List<String> join = vals.parallelStream().reduce(new ArrayList<String>(),
            (List<String> l, String v) -> {
                l.add(v);
                return l;
            }, (a, b) -> {
                a.addAll(b);
                return a;
            }
    );

    System.out.println(join);
}

It prints

[null, a, null, a]

I cannot understand why it puts two nulls in the resulting list. I expected the answer to be

[a, b]

because it is a parallel stream, so the first parameter to reduce,

new ArrayList()

would probably be evaluated twice, once for each of the input values a and b.

Then the accumulator function would probably be called twice, once per input value, each time with one of the lists created from the seed value. So a is added to list 1 and b is added to list 2 (or vice versa). Afterwards the combiner should merge both lists, but that is not what happens.

Interestingly, if I put print statements inside my accumulator to print the list before and after the add, the output changes. The following

private static void reduce_parallelStream() {
    List<String> vals = Arrays.asList("a", "b");

    List<String> join = vals.parallelStream().reduce(new ArrayList<String>(),
            (List<String> l, String v) -> {
                System.out.printf("l is %s", l);
                l.add(v);
                System.out.printf("l is %s", l);
                return l;
            }, (a, b) -> {
                a.addAll(b);
                return a;
            }
    );

    System.out.println(join);
}

results in this output

l is []l is [b]l is [b, a]l is [b, a][b, a, b, a]

Can anyone please explain?

Mushif Ali Nawaz :

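You should use Collections.synchronizedList() when you mutate a list from a parallelStream(). ArrayList is not thread-safe, so you get unexpected behaviour when it is accessed concurrently, and that is what happens here: reduce() does not create a new list per element. The identity argument new ArrayList<String>() is evaluated once, and that single instance is handed to the accumulator on every thread.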
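To check that the identity really is one shared instance, here is a minimal sketch. The class name SharedIdentityDemo and the helper foreignListsSeen are made up for illustration. The lambdas deliberately do not mutate the list, so there is no data race; they only count how often the accumulator or combiner receives a list that is not the identity instance.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

class SharedIdentityDemo {

    // Counts the accumulator/combiner calls that received a list
    // other than the identity instance passed to reduce().
    static int foreignListsSeen(List<String> vals) {
        List<String> seed = new ArrayList<>();        // created once, before reduce() runs
        AtomicInteger foreign = new AtomicInteger();  // thread-safe counter

        vals.parallelStream().reduce(seed,
                (l, v) -> {
                    if (l != seed) {                  // reference comparison on purpose
                        foreign.incrementAndGet();
                    }
                    return l;                         // no mutation, so no race in this demo
                },
                (a, b) -> {
                    if (a != seed || b != seed) {
                        foreign.incrementAndGet();
                    }
                    return a;
                });
        return foreign.get();
    }

    public static void main(String[] args) {
        // Every call sees the same list, so no "foreign" list is ever counted.
        System.out.println(foreignListsSeen(Arrays.asList("a", "b"))); // prints 0
    }
}
```

Because both the accumulator and the combiner always work on that one list, a.addAll(b) in the question's combiner appends the list to itself, which is why the elements show up twice.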

I have modified your code and now it's working correctly:

private static void reduce_parallelStream() {
    List<String> vals = Arrays.asList("a", "b");

    // Use a synchronized list, because every thread adds to the same identity instance
    List<String> join = vals.parallelStream().reduce(Collections.synchronizedList(new ArrayList<>()),
            (l, v) -> {
                l.add(v);
                return l;
            }, (a, b) -> a // a and b are the same list, so addAll() here would duplicate the output, e.g. [a, b, a, b]
    );
    System.out.println(join);
}

Output:

Sometimes you'll get this output:

[a, b]

And sometimes this one:

[b, a]

The reason is that it's a parallelStream(), so you can't be sure about the order of execution.
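As an alternative to the synchronized-list approach above (this is not the code from this answer, just a sketch of a different technique), the stream API offers collect() for building a mutable container. With collect() each thread gets its own container from the supplier and the combiner merges them, so no synchronization is needed, and for an ordered source such as a List the encounter order is preserved. The class name CollectDemo and its method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class CollectDemo {

    // Three-argument collect(): supplier, accumulator, combiner.
    // The supplier is called once per partial result, so nothing is shared between threads.
    static List<String> joinWithCollect(List<String> vals) {
        return vals.parallelStream()
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
    }

    // The same mutable reduction, expressed with a ready-made collector.
    static List<String> joinWithToList(List<String> vals) {
        return vals.parallelStream().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> vals = Arrays.asList("a", "b");
        System.out.println(joinWithCollect(vals)); // prints [a, b]
        System.out.println(joinWithToList(vals));  // prints [a, b]
    }
}
```

Unlike the reduce() version, the combiner here can safely call addAll(), because the two lists it receives are distinct partial results.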
