Java 8: Stream Programming with the Stream API

What is a Stream?

A Stream is not a container. It enhances what containers can do by adding many convenient operations, such as finding, filtering, grouping, and sorting. It offers two execution modes, serial and parallel. The parallel mode takes advantage of multi-core processors by splitting the work with the fork/join framework, which can speed up execution. In short, Stream provides an efficient and easy-to-use way of processing data.

Features:

A Stream does not store elements itself.
Stream operations do not change the source object. Instead, they return a new Stream that holds the result.
Stream operations are executed lazily. They wait until a result is actually needed, that is, until a terminal operation is performed.
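The features above can be checked with a small sketch. The class and method names here (StreamFeaturesDemo, evens, inspectedBeforeTerminal) are made up for illustration and do not come from the original article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamFeaturesDemo {

    // Collects the even numbers into a new list; the source list is not modified.
    static List<Integer> evens(List<Integer> source) {
        return source.stream()
                .filter(n -> n % 2 == 0)
                .collect(Collectors.toList());
    }

    // Builds a pipeline without a terminal operation and reports how many
    // elements the filter lambda has inspected so far.
    static int inspectedBeforeTerminal(List<Integer> source) {
        int[] inspected = {0};
        Stream<Integer> pending = source.stream()
                .filter(n -> { inspected[0]++; return n % 2 == 0; });
        // No terminal operation was called on 'pending', so the lambda never ran.
        return inspected[0];
    }

    public static void main(String[] args) {
        List<Integer> numbers = new ArrayList<>(Arrays.asList(1, 2, 3, 4));
        System.out.println(evens(numbers));                   // [2, 4]
        System.out.println(numbers);                          // [1, 2, 3, 4]
        System.out.println(inspectedBeforeTerminal(numbers)); // 0
    }
}
```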

Diagram:

As the diagram shows, a Stream operation runs inside a pipeline and is divided into three steps. The first step creates the Stream by obtaining a stream from a collection or an array. The second step is a chain of intermediate operations that process the data. The third step is the terminal operation, which executes the intermediate operation chain and returns the result.
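A minimal sketch of the three steps, with each step marked in a comment. The class name, method name, and sample data are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class PipelineStepsDemo {

    static List<String> shortNamesUpperCased(List<String> names) {
        return names.stream()                        // step 1: create the stream
                .filter(name -> name.length() <= 3)  // step 2: intermediate operations
                .map(String::toUpperCase)
                .collect(Collectors.toList());       // step 3: terminal operation
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("ann", "robert", "bo");
        System.out.println(shortNamesUpperCased(names)); // [ANN, BO]
    }
}
```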

How to create a Stream?

Created by collection:

Java 8 extends the Collection interface with two methods for obtaining a stream. Both are default methods, so every class that implements Collection can use them directly without implementing them itself:

default Stream<E> stream(): returns a sequential stream.
default Stream<E> parallelStream(): returns a parallel stream.

For example:
    List<Integer> integerList = new ArrayList<>();
    integerList.add(1);
    integerList.add(2);
    Stream<Integer> stream = integerList.stream();
    Stream<Integer> stream1 = integerList.parallelStream();

Created by an array:

In Java 8, the static method Arrays.stream() returns a stream over an array:

static <T> Stream<T> stream(T[] array): returns a stream.
Overloads handle arrays of the corresponding primitive types:

public static IntStream stream(int[] array)
public static LongStream stream(long[] array)
public static DoubleStream stream(double[] array)

For example:
    int[] intArray = {1,2,3};
    IntStream stream = Arrays.stream(intArray);

Created by value:

Use the static method Stream.of() to create a stream from explicitly given values. It accepts any number of arguments.

public static <T> Stream<T> of(T... values): returns a stream.

For example:
    Stream<Integer> integerStream = Stream.of(1, 2, 3, 4, 5, 6, 7, 8);

Created by function: creating an infinite stream

You can use the static methods Stream.iterate() and Stream.generate() to create an infinite stream.

Iterate:
public static<T> Stream<T> iterate(final T seed, final UnaryOperator<T> f)
Generate:
public static<T> Stream<T> generate(Supplier<T> s)

For example:
 Stream.generate(Math::random).limit(5).forEach(System.out::print); 
 List<Integer> collect = Stream.iterate(0,i -> i + 1).limit(5).collect(Collectors.toList());

Note: an infinite stream must be truncated with limit; otherwise it keeps producing elements without end.

Stream intermediate operations

If a Stream has only intermediate operations, nothing is executed. The intermediate operations run only when a terminal operation is executed; this is called lazy loading or lazy evaluation. Several intermediate operations form an intermediate operation chain, and the chain runs only when a terminal operation is performed. The reason is explained later. The intermediate operations of Stream are listed below.
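The following sketch records the order in which the lambdas of a sequential pipeline run. It suggests that elements flow through the whole chain one at a time, and that a short-circuiting terminal operation such as findFirst stops the traversal early. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class LazyEvaluationDemo {

    // Returns a log of every lambda call made while the pipeline runs.
    static List<String> trace() {
        List<String> log = new ArrayList<>();
        Stream.of(1, 2, 3, 4)
                .filter(n -> { log.add("filter " + n); return n % 2 == 0; })
                .map(n -> { log.add("map " + n); return n * 10; })
                .findFirst(); // terminal operation: this call starts the traversal
        return log;
    }

    public static void main(String[] args) {
        // Elements 3 and 4 are never visited because findFirst already has its result.
        System.out.println(trace()); // [filter 1, filter 2, map 2]
    }
}
```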

Stream<T> distinct():

Deduplication. Removes duplicate elements, using the hashCode() and equals() of the elements in the stream.

Stream<T> filter(Predicate<? super T> predicate):

Predicate, covered previously, is the assertion-type functional interface. The filter method accepts a lambda expression that matches Predicate and returns a boolean, and it excludes from the stream the elements that do not satisfy it.

Stream<T> sorted(Comparator<? super T> comparator):

Sorts the elements according to the specified comparison rule.

Stream<T> limit(long maxSize):

Truncates the stream so that it contains no more than the given number of elements. If the stream has fewer than maxSize elements, all of them are kept.

Stream<T> skip(long n):

Skips elements. Returns a stream with the first n elements discarded. If the stream has fewer than n elements, an empty stream is returned. It is the complement of limit(n).
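The operations above can be chained. The sketch below combines distinct, filter, sorted, skip, and limit on hypothetical sample data; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

class IntermediateOpsDemo {

    // Returns the second and third largest distinct even numbers.
    static List<Integer> secondAndThirdLargestEven(List<Integer> numbers) {
        return numbers.stream()
                .distinct()                         // drop duplicates
                .filter(n -> n % 2 == 0)            // keep even numbers
                .sorted(Comparator.reverseOrder())  // largest first
                .skip(1)                            // discard the largest
                .limit(2)                           // keep at most two
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(4, 8, 2, 8, 5, 6, 4, 10);
        System.out.println(secondAndThirdLargestEven(numbers)); // [8, 6]
    }
}
```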

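<R> Stream<R> map(Function<? super T, ? extends R> mapper):

Takes a Function as its argument. The function is applied to every element and maps it to a new element; in other words, map is a conversion operation. map also has three variants for specific primitive types: mapToInt, mapToLong, and mapToDouble. They are easy to understand: mapToInt, for example, converts the original Stream into a new stream whose elements are all of type int. These three methods avoid the extra cost of autoboxing and unboxing.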
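As a rough illustration of the difference between map and mapToInt, the sketch below maps words to their lengths. The class name, method names, and sample words are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class MapDemo {

    // map: Stream<String> -> Stream<Integer> (boxed values)
    static List<Integer> lengths(List<String> words) {
        return words.stream()
                .map(String::length)
                .collect(Collectors.toList());
    }

    // mapToInt: Stream<String> -> IntStream (primitive ints, with sum() built in)
    static int totalLength(List<String> words) {
        return words.stream()
                .mapToInt(String::length)
                .sum();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("stream", "map", "java");
        System.out.println(lengths(words));     // [6, 3, 4]
        System.out.println(totalLength(words)); // 13
    }
}
```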

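<R> Stream<R> flatMap(Function<? super T, ? extends Stream<? extends R>> mapper):

Takes a Function as its argument, converts every value in the stream into another stream, and then concatenates all of those streams into a single stream. flatMap also has three variants for specific primitive types: flatMapToInt, flatMapToLong, and flatMapToDouble. They serve the same purpose as the three derived methods of map.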
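A small sketch of flattening nested lists with flatMap, plus flatMapToInt for the primitive case. The names and the sample data are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapDemo {

    // Each inner list becomes a stream; flatMap concatenates them into one stream.
    static List<String> flatten(List<List<String>> nested) {
        return nested.stream()
                .flatMap(List::stream)
                .collect(Collectors.toList());
    }

    // flatMapToInt flattens into an IntStream, avoiding boxed Integer values.
    static int totalCharacters(List<List<String>> nested) {
        return nested.stream()
                .flatMapToInt(list -> list.stream().mapToInt(String::length))
                .sum();
    }

    public static void main(String[] args) {
        List<List<String>> nested = Arrays.asList(
                Arrays.asList("a", "bb"),
                Arrays.asList("ccc"));
        System.out.println(flatten(nested));         // [a, bb, ccc]
        System.out.println(totalCharacters(nested)); // 6
    }
}
```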

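Stream terminal operations

A terminal operation executes the intermediate operation chain and returns a result. We will not cover every terminal operation here, only the commonly used ones. For the full list, see the methods of the java.util.stream.Stream interface.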

void forEach(Consumer<? super T> action):

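Internal iteration. (Iteration that the user has to write is called external iteration. In contrast, the Stream API uses internal iteration and does the iterating for you.)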

users.stream().forEach(user -> System.out.println(user.getName()));

<R, A> R collect(Collector<? super T, A, R> collector):

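Collects the stream and converts it into another form, such as a List, Set, or Map. The collect method takes a Collector as its argument, and the implementation of the Collector interface's methods determines how the stream is collected (for example into a List, Set, or Map). The Collectors utility class provides many static methods that make it easy to create common collector instances. Some frequently used ones: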

    // Lists and ImmutableList come from Guava
    List<User> users = Lists.newArrayList();
    users.add(new User(15, "A", ImmutableList.of("1元", "5元")));
    users.add(new User(25, "B", ImmutableList.of("10元", "50元")));
    users.add(new User(21, "C", ImmutableList.of("100元")));
    // Collect the names into a List
    List<String> nameList = users.stream().map(User::getName).collect(Collectors.toList());
    // Collect the names into a Set
    Set<String> nameSet = users.stream().map(User::getName).collect(Collectors.toSet());
    // Collect into a Map with the name as key and the User object as value;
    // when two users share a name, the merge function keeps the later one
    Map<String, User> userMap = users.stream()
            .collect(Collectors.toMap(User::getName, Function.identity(), (u1, u2) -> u2));
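The third argument of Collectors.toMap is a merge function that is called with the two values that collide on the same key. The self-contained sketch below illustrates this without Guava. The nested User class is a hypothetical stand-in, since the article does not show its User class, and Collectors.joining is included as one more of the ready-made collectors.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class CollectDemo {

    // Hypothetical minimal User; the article's own User class is not shown.
    static final class User {
        private final int age;
        private final String name;

        User(int age, String name) {
            this.age = age;
            this.name = name;
        }

        int getAge() { return age; }
        String getName() { return name; }
    }

    // On a duplicate name, the merge function keeps the value seen later.
    static Map<String, Integer> ageByName(List<User> users) {
        return users.stream()
                .collect(Collectors.toMap(User::getName, User::getAge,
                        (first, second) -> second));
    }

    // Concatenates all names, separated by ", ".
    static String joinedNames(List<User> users) {
        return users.stream()
                .map(User::getName)
                .collect(Collectors.joining(", "));
    }

    public static void main(String[] args) {
        List<User> users = Arrays.asList(
                new User(15, "A"), new User(25, "B"), new User(30, "A"));
        System.out.println(ageByName(users).get("A")); // 30
        System.out.println(joinedNames(users));        // A, B, A
    }
}
```

Without a merge function, toMap throws an IllegalStateException when two elements produce the same key.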

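Other terminal operations:
1. boolean allMatch(Predicate<? super T> predicate); checks whether all elements match.
2. boolean anyMatch(Predicate<? super T> predicate); checks whether at least one element matches.
3. boolean noneMatch(Predicate<? super T> predicate); checks whether no element matches.
4. Optional<T> findFirst(); returns the first element of the stream.
5. Optional<T> findAny(); returns any element of the stream.
6. long count(); returns the total number of elements in the stream.
7. Optional<T> max(Comparator<? super T> comparator); returns the maximum element of the stream.
8. Optional<T> min(Comparator<? super T> comparator); returns the minimum element of the stream.
9. T reduce(T identity, BinaryOperator<T> accumulator); repeatedly combines the elements of the stream to produce a single value of type T. This is a reduction operation.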
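A short sketch exercising several of the terminal operations listed above. The class name, method names, and sample numbers are hypothetical.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class TerminalOpsDemo {

    static boolean allPositive(List<Integer> numbers) {
        return numbers.stream().allMatch(n -> n > 0);
    }

    // Empty input yields Optional.empty() rather than null.
    static Optional<Integer> largest(List<Integer> numbers) {
        return numbers.stream().max(Comparator.naturalOrder());
    }

    // reduce starts from the identity 0 and adds each element in turn.
    static int sum(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(3, 1, 4, 1, 5);
        System.out.println(allPositive(numbers));                         // true
        System.out.println(numbers.stream().anyMatch(n -> n > 4));        // true
        System.out.println(numbers.stream().noneMatch(n -> n < 0));       // true
        System.out.println(numbers.stream().findFirst().get());           // 3
        System.out.println(numbers.stream().count());                     // 5
        System.out.println(largest(numbers).get());                       // 5
        System.out.println(sum(numbers));                                 // 14
    }
}
```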

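The Fork/Join framework

As mentioned above, the parallel mode of Stream uses the Fork/Join framework. So what is it? Fork/Join is a parallel task framework added in Java 7. It splits a task into multiple smaller tasks and then merges the results of those subtasks into a single result. While tasks execute, it uses a work-stealing algorithm to reduce contention between threads.

Fork/Join diagram
Work-stealing diagram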
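To make the split-and-merge idea concrete, here is a minimal sketch of a Fork/Join task that sums an array. It is not how Stream implements parallel execution internally; the class names and the THRESHOLD value are assumptions chosen for illustration.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

class ForkJoinSumDemo {

    static final class SumTask extends RecursiveTask<Long> {
        // Assumed cut-off: ranges at or below this size are summed directly.
        private static final int THRESHOLD = 1_000;

        private final long[] values;
        private final int from; // inclusive
        private final int to;   // exclusive

        SumTask(long[] values, int from, int to) {
            this.values = values;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= THRESHOLD) {
                long sum = 0;
                for (int i = from; i < to; i++) {
                    sum += values[i];
                }
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(values, from, mid);
            SumTask right = new SumTask(values, mid, to);
            left.fork();                     // schedule the left half; an idle worker may steal it
            long rightSum = right.compute(); // compute the right half in this thread
            return left.join() + rightSum;   // merge the two partial results
        }
    }

    static long parallelSum(long[] values) {
        return ForkJoinPool.commonPool().invoke(new SumTask(values, 0, values.length));
    }

    public static void main(String[] args) {
        long[] values = LongStream.rangeClosed(1, 10_000).toArray();
        System.out.println(parallelSum(values)); // 50005000
    }
}
```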

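How Stream is implemented

First, look at the overall class diagram: blue arrows denote inheritance, green arrows denote implementation, and red arrows denote inner classes.

A Stream really has only two kinds of operations: intermediate operations and terminal operations. An intermediate operation is only a marker; only a terminal operation actually triggers execution. So a Stream pipeline must record the intermediate operations in some way, and only when a terminal operation is called are all the intermediate operations stacked together and executed in a single iteration. This is only a brief introduction; for details, see the links in the references below.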

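How are operations recorded?

Stream operations are recorded by ReferencePipeline, which has three inner classes: Head, StatelessOp, and StatefulOp. Stream uses the concept of a Stage to describe one complete operation and represents each Stage with some instantiated ReferencePipeline. Head represents the first Stage, that is, the Stage produced by calling a method such as Collection.stream(); this Stage clearly contains no operation. StatelessOp and StatefulOp represent stateless and stateful Stages respectively, corresponding to stateless and stateful intermediate operations.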

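How are operations stacked?

The operations have been recorded, but an earlier Stage does not know which operation the following Stage performs or what form its callback takes. Some protocol is needed to coordinate the calls between adjacent Stages.
This protocol is provided by the Sink interface, which contains the following methods:

void begin(long size): called before traversal starts, to tell the Sink to get ready.
void end(): called after all elements have been traversed, to tell the Sink that there are no more elements.
boolean cancellationRequested(): whether the operation can stop, which lets short-circuiting operations finish as early as possible.
void accept(T t): called during traversal; it receives an element to process and processes it. A Stage wraps its own operation and callback in this method, so the previous Stage only needs to call the current Stage's accept(T t).

Every Stage wraps its operation in a Sink. The previous Stage only needs to call the accept() method of the next Stage and does not need to know how that Stage handles the element internally. With operations wrapped in Sinks, the problem of calls between Stages is solved: at execution time, starting from the head of the pipeline, the Sink.{begin(), accept(), cancellationRequested(), end()} methods of each Stage are called in turn on the data source.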
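The chaining described above can be imitated with a toy version of the protocol. The JDK's Sink is an internal interface, so the MiniSink interface below is a hypothetical, simplified stand-in written for this example, and filterSink, mapSink, and copyInto are illustrative helpers rather than JDK code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.function.Predicate;

class MiniSinkDemo {

    // Hypothetical, simplified stand-in for the JDK's internal Sink interface.
    interface MiniSink<T> {
        default void begin(long size) {}
        void accept(T t);
        default void end() {}
        default boolean cancellationRequested() { return false; }
    }

    // A filter stage: processes the element, then forwards it downstream if it passes.
    static <T> MiniSink<T> filterSink(Predicate<T> predicate, MiniSink<T> downstream) {
        return new MiniSink<T>() {
            @Override public void begin(long size) { downstream.begin(-1); } // size unknown after filtering
            @Override public void accept(T t) { if (predicate.test(t)) downstream.accept(t); }
            @Override public void end() { downstream.end(); }
            @Override public boolean cancellationRequested() { return downstream.cancellationRequested(); }
        };
    }

    // A map stage: transforms the element and forwards the result downstream.
    static <T, R> MiniSink<T> mapSink(Function<T, R> mapper, MiniSink<R> downstream) {
        return new MiniSink<T>() {
            @Override public void begin(long size) { downstream.begin(size); }
            @Override public void accept(T t) { downstream.accept(mapper.apply(t)); }
            @Override public void end() { downstream.end(); }
            @Override public boolean cancellationRequested() { return downstream.cancellationRequested(); }
        };
    }

    // Drives the chain from its head: begin, accept per element, end.
    static <T> void copyInto(List<T> source, MiniSink<T> head) {
        head.begin(source.size());
        for (T item : source) {
            if (head.cancellationRequested()) {
                break;
            }
            head.accept(item);
        }
        head.end();
    }

    // Equivalent in spirit to: source.stream().filter(even).map(square).collect(toList())
    static List<Integer> evenSquares(List<Integer> source) {
        List<Integer> result = new ArrayList<>();
        MiniSink<Integer> collector = result::add;                      // last stage
        MiniSink<Integer> square = mapSink(n -> n * n, collector);      // wraps the collector
        MiniSink<Integer> evenOnly = filterSink(n -> n % 2 == 0, square); // head of the chain
        copyInto(source, evenOnly);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(evenSquares(Arrays.asList(1, 2, 3, 4))); // [4, 16]
    }
}
```

Note that the sinks are wired from the last stage back to the first, so each stage only holds a reference to its downstream neighbor.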

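How are operations executed?

Sink neatly encapsulates each step of a Stream and provides a [process -> forward] pattern for stacking operations. The gears are now meshed; all that remains is the final push that sets them turning. What starts this chain of operations? As you may have guessed, the driving force is the terminal operation: once a terminal operation is called, it triggers execution of the whole pipeline.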