Why do I have to chain Stream operations in Java?

Koray Tugay :

I think all of the resources I have studied one way or another emphasize that a stream can be consumed only once, and the consumption is done by so-called terminal operations (which is very clear to me).

Just out of curiosity I tried this:

import java.util.stream.IntStream;

class App {
    public static void main(String[] args) {
        IntStream is = IntStream.of(1, 2, 3, 4);
        is.map(i -> i + 1);
        int sum = is.sum();
    }
}

which ends up throwing a runtime exception:

Exception in thread "main" java.lang.IllegalStateException: stream has already been operated upon or closed
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:229)
    at java.util.stream.IntPipeline.reduce(IntPipeline.java:456)
    at java.util.stream.IntPipeline.sum(IntPipeline.java:414)
    at App.main(scratch.java:10)

As usual, I am probably missing something, but I still want to ask: as far as I know, map is an intermediate (and lazy) operation and does nothing on the Stream by itself. Only when the terminal operation sum (which is an eager operation) is called does the Stream get consumed and operated on.

But why do I have to chain them?

What is the difference between

is.map(i -> i + 1);
is.sum();

and

is.map(i -> i + 1).sum();

?

Federico Peralta Schaffner :

When you do this:

int sum = IntStream.of(1, 2, 3, 4).map(i -> i + 1).sum();

Every chained method is being invoked on the return value of the previous method in the chain.

So map is invoked on what IntStream.of(1, 2, 3, 4) returns and sum on what map(i -> i + 1) returns.

You don't have to chain stream methods, but it's more readable and less error-prone than using this equivalent code:

IntStream is = IntStream.of(1, 2, 3, 4);
is = is.map(i -> i + 1);
int sum = is.sum();

Which is not the same as the code you've shown in your question:

IntStream is = IntStream.of(1, 2, 3, 4);
is.map(i -> i + 1);
int sum = is.sum();

As you see, you're disregarding the reference returned by map. This is the cause of the error.


EDIT (as per the comments, thanks to @IanKemp for pointing this out): Actually, this is the external cause of the error. If you stop to think about it, map must be doing something internally to the stream itself; otherwise, how would the terminal operation trigger the transformation passed to map on each element? I agree that intermediate operations are lazy, i.e. when invoked, they do nothing to the elements of the stream. But internally, they must configure some state in the stream pipeline itself, so that they can be applied later.

Although I'm not aware of the full details, what happens is that, conceptually, map is doing at least 2 things:

  1. It's creating and returning a new stream that holds the function passed as an argument somewhere, so that it can be applied to elements later, when the terminal operation is invoked.

  2. It is also setting a flag on the old stream instance, i.e. the one it has been called on, indicating that this stream instance no longer represents a valid state for the pipeline. This is because the new, updated state which holds the function passed to map is now encapsulated by the instance it has returned. (I believe the JDK team might have made this decision so that errors appear as early as possible, i.e. by throwing an exception early instead of letting the pipeline go on with an invalid/old state that doesn't hold the function to be applied, which would let the terminal operation return unexpected results.)

Later on, when a terminal operation is invoked on this instance flagged as invalid, you get that IllegalStateException. The two items above constitute the deep, internal cause of the error.
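
The two points above can be checked with a small experiment. This is only a sketch: the class name StreamLinkDemo and its helper methods are made up for illustration, and the exception is what I would expect on OpenJDK (the Javadoc only says an implementation may throw it).

```java
import java.util.stream.IntStream;

// Hypothetical demo class, not part of the question's code.
class StreamLinkDemo {
    // Keeps the stream returned by map and runs the terminal operation on it.
    static int chainedSum() {
        IntStream source = IntStream.of(1, 2, 3, 4);
        IntStream mapped = source.map(i -> i + 1); // new pipeline stage
        return mapped.sum();
    }

    // Ignores the stream returned by map, as in the question, and reports
    // whether calling sum() on the old instance throws.
    static boolean reuseThrows() {
        IntStream source = IntStream.of(1, 2, 3, 4);
        source.map(i -> i + 1); // return value discarded
        try {
            source.sum();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(chainedSum());  // prints 14
        System.out.println(reuseThrows()); // prints true on OpenJDK
    }
}
```

The only difference between the two methods is which instance the terminal operation is called on: the one returned by map, or the one map was called on.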


Another way to look at this is that a Stream instance must be operated on only once, by either an intermediate or a terminal operation. Here you are violating this requirement, because you are calling both map and sum on the same instance.

In fact, the Javadoc for Stream states it clearly:

A stream should be operated on (invoking an intermediate or terminal stream operation) only once. This rules out, for example, "forked" streams, where the same source feeds two or more pipelines, or multiple traversals of the same stream. A stream implementation may throw IllegalStateException if it detects that the stream is being reused. However, since some stream operations may return their receiver rather than a new stream object, it may not be possible to detect reuse in all cases.
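
To illustrate the "forked" case mentioned in that quote, here is a minimal sketch. The class ForkDemo and its methods are hypothetical names, the exception on the second intermediate call is what I would expect on OpenJDK rather than a guarantee, and using a Supplier to create a fresh stream per pipeline is just one common way around the restriction.

```java
import java.util.function.Supplier;
import java.util.stream.IntStream;

// Hypothetical demo class for the "forked streams" restriction.
class ForkDemo {
    // Tries to attach two intermediate operations to the same instance.
    static boolean forkThrows() {
        IntStream source = IntStream.of(1, 2, 3, 4);
        source.map(i -> i + 1); // first pipeline linked to source
        try {
            source.filter(i -> i % 2 == 0); // second link on the same instance
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    // Builds each pipeline from its own freshly created stream.
    static int[] twoPipelines() {
        Supplier<IntStream> fresh = () -> IntStream.of(1, 2, 3, 4);
        int sumPlusOne = fresh.get().map(i -> i + 1).sum();
        int evenSum = fresh.get().filter(i -> i % 2 == 0).sum();
        return new int[] {sumPlusOne, evenSum};
    }

    public static void main(String[] args) {
        System.out.println(forkThrows()); // prints true on OpenJDK
        int[] sums = twoPipelines();
        System.out.println(sums[0] + " " + sums[1]); // prints 14 6
    }
}
```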
