Java 8 learning: a deep dive into Stream from the source code (study notes)

A deep dive into Stream from the source code

When learning, the official documentation is the most important resource.

The key point: we must not only know how to use streams, but also why they are used that way and how they are implemented underneath.

- Personal note: as the source shows, newer JDK versions give developers a great deal of convenience, but the underlying implementation is genuinely complex.
- The interface exposed to the outside is very simple to use (only because the framework has encapsulated the complexity underneath).

When you run into a problem, you can then solve it by digging into the underlying implementation.

When learning a skill, learn to use it first, then dig into the deeper content (the underlying code and how it works).

Introduction: an example.

public class StudentTest1 {
    public static void main(String[] args) {
        Student student1 = new Student("zhangsan", 80);
        Student student2 = new Student("lisi", 90);
        Student student3 = new Student("wangwu", 100);
        Student student4 = new Student("zhaoliu", 90);

        List<Student> students = Arrays.asList(student1, student2, student3, student4);

        // A detailed look at collect() from the source code
        // op1: turn the collection into a stream, then collect the stream back into a List
        List<Student> students1 = students.stream().collect(Collectors.toList());
        students1.forEach(System.out::println);

        System.out.println("----------");
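        System.out.println("count: "+ students.stream().collect(counting())); // counting() from the Collectors class (static import)
        System.out.println("count: "+ students.stream().count()); // method on Stream itself; in JDK 8 it is implemented as mapToLong(e -> 1L).sum()

        // When the JDK offers both a general-purpose method and a more specific one, prefer the more specific one.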
    }
}
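
The example above uses a Student class that the post does not show. A minimal sketch of a self-contained version follows; the Student fields and the class name StudentCountSketch are assumptions made for illustration, not the author's original code. It compares the two counting routes from the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StudentCountSketch {
    // Hypothetical Student class; the post uses it without showing its source.
    static final class Student {
        private final String name;
        private final int score;

        Student(String name, int score) {
            this.name = name;
            this.score = score;
        }

        @Override
        public String toString() {
            return "Student{name='" + name + "', score=" + score + "}";
        }
    }

    static List<Student> sampleStudents() {
        return Arrays.asList(
                new Student("zhangsan", 80),
                new Student("lisi", 90),
                new Student("wangwu", 100),
                new Student("zhaoliu", 90));
    }

    // General-purpose route: a Collector obtained from the Collectors factory.
    static long countWithCollector(List<Student> students) {
        return students.stream().collect(Collectors.counting());
    }

    // Specific route: the terminal operation defined on Stream itself.
    static long countWithStream(List<Student> students) {
        return students.stream().count();
    }

    public static void main(String[] args) {
        List<Student> students = sampleStudents();
        students.forEach(System.out::println);
        System.out.println("count: " + countWithCollector(students));
        System.out.println("count: " + countWithStream(students));
    }
}
```

Both routes report the same count; the second simply skips the Collector machinery.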

Static import (brings the static methods of the named class directly into scope)

import static java.util.stream.Collectors.*;
  • collect takes a Collector as its argument.
  • Collector is an interface, and a particularly important one.

Reading the Collector interface source code

Digression: the JDK provides many Collector implementations, but many people never get past simply using them.

We will read this Javadoc line by line, because it really is important.

/**
 * A <a href="package-summary.html#Reduction">mutable reduction operation</a> that
 * accumulates input elements into a mutable result container, optionally transforming
 * the accumulated result into a final representation after all input elements
 * have been processed.  Reduction operations can be performed either sequentially
 * or in parallel.
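 In other words: a mutable reduction accumulates the input elements into a mutable result container and, once every element has been processed, optionally converts the accumulated result into a final representation. It can run sequentially or in parallel.
 
 -- An ArrayList, for example, is a mutable container.
 -- Supporting parallel execution is hard: the data must stay correct while threads run concurrently. Parallel is also not automatically faster than sequential, because it carries extra overhead.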
 
 *
 * <p>Examples of mutable reduction operations include:
 * accumulating elements into a {@code Collection}; concatenating
 * strings using a {@code StringBuilder}; computing summary information about
 * elements such as sum, min, max, or average; computing "pivot table" summaries
 * such as "maximum valued transaction by seller", etc.  The class {@link Collectors}
 * provides implementations of many common mutable reductions.
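 In other words: mutable reductions include accumulating elements into a collection, concatenating strings with a StringBuilder, computing sum, min, max or average, and building "pivot table" summaries such as the largest transaction per seller. The Collectors class provides many ready-made mutable reductions.
 
 -- Collectors itself is effectively a factory.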
 
 *
 * <p>A {@code Collector} is specified by four functions that work together to
 * accumulate entries into a mutable result container, and optionally perform
 * a final transform on the result.  They are: <ul>
 *     <li>creation of a new result container ({@link #supplier()})</li>
 *     <li>incorporating a new data element into a result container ({@link #accumulator()})</li>
 *     <li>combining two result containers into one ({@link #combiner()})</li>
 *     <li>performing an optional final transform on the container ({@link #finisher()})</li>
 * </ul>
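 A Collector is specified by four functions that together accumulate entries into a mutable result container and optionally apply a final transform to the result.
 The four methods are:
 1. supplier(): creates a new result container (think "new")
 2. accumulator(): incorporates a new data element into a result container (think "add")
 3. combiner(): merges two result containers into one (think "+")
 4. finisher(): converts the intermediate accumulation type into the result type (think "result")
 Each of these methods returns a functional interface.
 -- When learning, the official documentation is the most important resource.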
 *
 * <p>Collectors also have a set of characteristics, such as
 * {@link Characteristics#CONCURRENT}, that provide hints that can be used by a
 * reduction implementation to provide better performance.
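 Collectors also carry a set of characteristics, such as Characteristics#CONCURRENT (a value of the enum nested in this interface), which act as hints for better performance.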
 
 *
 * <p>A sequential implementation of a reduction using a collector would
 * create a single result container using the supplier function, and invoke the
 * accumulator function once for each input element. 
 
 * A parallel implementation
 * would partition the input, create a result container for each partition,
 * accumulate the contents of each partition into a subresult for that partition,
 * and then use the combiner function to merge the subresults into a combined
 * result.
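 A sequential implementation creates a single result container with the supplier function and calls the accumulator function once for each input element.
 
 A parallel implementation partitions the input, creates one result container per partition, accumulates each partition into its own subresult, and then merges the subresults with combiner().
 -- Explanation:
 About the combiner function: if four threads run at once, four partial results are produced.
 Call them 1, 2, 3 and 4.
 The merge may create new containers:
 1, 2 -> 5
 5, 3 -> 6
 6, 4 -> 7
 Here 5, 6 and 7 are newly created collections; this is the "return a new result container" case.
 
 Or the merge may fold one argument into the other:
 1, 2 -> 1 (the contents of 2 are added to 1)
 1, 3 -> 1 (the contents of 3 are added to 1)
 This is the "fold state from one argument into the other" case.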
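
The parallel strategy described above can be imitated on a single thread. The following is a minimal sketch, not the JDK's actual implementation: the class PartitionedCollectSketch and the helper collectInPartitions are hypothetical names, and the partitions are supplied by hand rather than produced by a spliterator.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class PartitionedCollectSketch {
    // Mimics the parallel strategy on one thread: one container per
    // partition, then the combiner merges the subresults from left to right.
    static <T, A, R> R collectInPartitions(List<List<T>> partitions,
                                           Collector<T, A, R> collector) {
        Supplier<A> supplier = collector.supplier();
        BiConsumer<A, T> accumulator = collector.accumulator();
        BinaryOperator<A> combiner = collector.combiner();

        List<A> subresults = new ArrayList<>();
        for (List<T> partition : partitions) {
            A container = supplier.get();              // one container per partition
            for (T element : partition) {
                accumulator.accept(container, element);
            }
            subresults.add(container);
        }

        A merged = supplier.get();                     // empty start; relies on the identity constraint
        for (A subresult : subresults) {
            merged = combiner.apply(merged, subresult);
        }
        return collector.finisher().apply(merged);
    }

    static List<String> demo() {
        List<List<String>> partitions = Arrays.asList(
                Arrays.asList("a", "b"),
                Arrays.asList("c"),
                Arrays.asList("d", "e"));
        Collector<String, ?, List<String>> toList = Collectors.toList();
        return collectInPartitions(partitions, toList);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Starting the merge from an empty container is only safe because of the identity constraint that the Javadoc introduces next.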
 
 *
 * <p>To ensure that sequential and parallel executions produce equivalent
 * results, the collector functions must satisfy an <em>identity</em> and an
 * <a href="package-summary.html#Associativity">associativity</a> constraints.
 To guarantee that sequential and parallel executions give equivalent results, the collector functions must satisfy two constraints:
 1. identity
 2. associativity
 
 *
 * <p>The identity constraint says that for any partially accumulated result,
 * combining it with an empty result container must produce an equivalent
 * result.  That is, for a partially accumulated result {@code a} that is the
 * result of any series of accumulator and combiner invocations, {@code a} must
 * be equivalent to {@code combiner.apply(a, supplier.get())}.
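 The identity constraint: combining any partially accumulated result with an empty result container must produce a result equivalent to that partial result.
 That is, for a partial result a produced by any series of accumulator and combiner invocations:
 a is equivalent to combiner.apply(a, supplier.get())
    supplier.get() returns an empty result container.
    Merging a with that empty container must give a result equivalent to a.
    
    That property is the identity constraint.
    
 -- A partially accumulated result is an intermediate result produced while the stream is being processed.
 -- Why the equation holds: a is the partial result from one branch (thread), and the right-hand side is a call to BinaryOperator.apply(), for example:
 (List<String> list1, List<String> list2) -> { list1.addAll(list2); return list1; }
 This is the folding case described earlier: the second container is folded into the first, and the first is returned. Adding an empty list changes nothing.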
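
The identity constraint can be checked mechanically for a given collector and sample input. The sketch below is an illustration under assumptions: IdentityConstraintSketch and identityHolds are hypothetical names, and the "broken" collector is deliberately constructed to violate the rule.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class IdentityConstraintSketch {
    // True when merging a partial result with a fresh empty container gives
    // the same finished result as the partial result on its own.
    static <T, A, R> boolean identityHolds(Collector<T, A, R> collector, List<T> elements) {
        A alone = collector.supplier().get();
        A combined = collector.supplier().get();
        for (T element : elements) {
            collector.accumulator().accept(alone, element);
            collector.accumulator().accept(combined, element);
        }
        combined = collector.combiner().apply(combined, collector.supplier().get());
        R expected = collector.finisher().apply(alone);
        R actual = collector.finisher().apply(combined);
        return expected.equals(actual);
    }

    static boolean lawfulCollectorPasses() {
        Collector<String, ?, List<String>> toList = Collectors.toList();
        return identityHolds(toList, Arrays.asList("a", "b"));
    }

    // Deliberately broken combiner: it discards the left argument.
    static boolean brokenCollectorPasses() {
        Collector<String, List<String>, List<String>> broken =
                Collector.of(ArrayList::new, List::add, (left, right) -> right);
        return identityHolds(broken, Arrays.asList("a", "b"));
    }

    public static void main(String[] args) {
        System.out.println("toList satisfies identity: " + lawfulCollectorPasses());
        System.out.println("broken collector satisfies identity: " + brokenCollectorPasses());
    }
}
```

A check like this covers only the inputs it is given; it can expose a violation but cannot prove the constraint in general.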
 
 *
 * <p>The associativity constraint says that splitting the computation must
 * produce an equivalent result.  That is, for any input elements {@code t1}
 * and {@code t2}, the results {@code r1} and {@code r2} in the computation
 * below must be equivalent:
  The associativity constraint: splitting the computation must also produce an equivalent result, however the input is divided.
 
 * <pre>{@code
 *     A a1 = supplier.get();        // obtain result container a1
 *     accumulator.accept(a1, t1);   // a1: the intermediate result so far; t1: the next stream element to accumulate
 *     accumulator.accept(a1, t2);   // a1 already holds t1; now t2 is added as well
 *     R r1 = finisher.apply(a1);  // result without splitting
 *
 *     A a2 = supplier.get();        // a second thread gets its own container
 *     accumulator.accept(a2, t1);   // t1 is accumulated into a2
 *     A a3 = supplier.get();        // a third thread gets its own container
 *     accumulator.accept(a3, t2);   // t2 is accumulated into a3
 *     R r2 = finisher.apply(combiner.apply(a2, a3));  // result with splitting
 * } </pre>
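So the result must be the same whether the work runs on one thread or on several (sequential or parallel).
That is the associativity constraint.
 
 -- Personal note: as this shows, newer JDK versions give developers a great deal of convenience, but the underlying implementation is genuinely complex.
 -- The interface exposed to the outside is very simple to use (only because the framework has encapsulated the complexity underneath).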
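
The r1/r2 computation from the Javadoc can be run directly against a real collector. This is a minimal sketch with hypothetical names (AssociativitySketch, withoutSplitting, withSplitting); it uses Collectors.joining only as a convenient example.

```java
import java.util.stream.Collector;
import java.util.stream.Collectors;

class AssociativitySketch {
    // r1 in the Javadoc: both elements go into one container.
    static <T, A, R> R withoutSplitting(Collector<T, A, R> c, T t1, T t2) {
        A a1 = c.supplier().get();
        c.accumulator().accept(a1, t1);
        c.accumulator().accept(a1, t2);
        return c.finisher().apply(a1);
    }

    // r2 in the Javadoc: each element gets its own container, then they are combined.
    static <T, A, R> R withSplitting(Collector<T, A, R> c, T t1, T t2) {
        A a2 = c.supplier().get();
        c.accumulator().accept(a2, t1);
        A a3 = c.supplier().get();
        c.accumulator().accept(a3, t2);
        return c.finisher().apply(c.combiner().apply(a2, a3));
    }

    static String joinedWithoutSplitting() {
        Collector<CharSequence, ?, String> joining = Collectors.joining("-");
        return withoutSplitting(joining, "x", "y");
    }

    static String joinedWithSplitting() {
        Collector<CharSequence, ?, String> joining = Collectors.joining("-");
        return withSplitting(joining, "x", "y");
    }

    public static void main(String[] args) {
        System.out.println(joinedWithoutSplitting()); // x-y
        System.out.println(joinedWithSplitting());    // x-y
    }
}
```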
 
 *
 * <p>For collectors that do not have the {@code UNORDERED} characteristic,
 * two accumulated results {@code a1} and {@code a2} are equivalent if
 * {@code finisher.apply(a1).equals(finisher.apply(a2))}.  For unordered
 * collectors, equivalence is relaxed to allow for non-equality related to
 * differences in order.  (For example, an unordered collector that accumulated
 * elements to a {@code List} would consider two lists equivalent if they
 * contained the same elements, ignoring order.)
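For a collector without the UNORDERED characteristic, a1 and a2 are equivalent if finisher.apply(a1).equals(finisher.apply(a2)).
For an unordered collector the equivalence is relaxed: two results that differ only in element order still count as equivalent.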

 *
 * <p>Libraries that implement reduction based on {@code Collector}, such as
 * {@link Stream#collect(Collector)}, must adhere to the following constraints:
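 Libraries that implement reduction on top of Collector must follow the rules below.
 
 - Note: reduction has several implementations,
 such as reducing() in Collectors
 and the three overloaded reduce() methods on the Stream interface.
 
 The two differ fundamentally (mind the effect in single-threaded versus multi-threaded execution):
 reduce: requires immutability
 Collectors-based collection: uses a mutable result container.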
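
The contrast between immutable reduction and mutable collection can be seen side by side. This is a minimal sketch; the class and method names are hypothetical, and the three-argument collect call mirrors the form shown in the Stream Javadoc.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class ReduceVersusCollectSketch {
    // reduce: every step yields a new immutable value (an Integer here).
    static int sumWithReduce(List<Integer> numbers) {
        return numbers.stream().reduce(0, Integer::sum);
    }

    // Collectors.reducing wraps the same reduce-style function as a Collector.
    static int sumWithReducingCollector(List<Integer> numbers) {
        return numbers.stream().collect(Collectors.reducing(0, Integer::sum));
    }

    // collect: elements are folded into a mutable container that is updated in place.
    static List<Integer> copyWithCollect(List<Integer> numbers) {
        List<Integer> copy = numbers.stream()
                .collect(ArrayList::new, ArrayList::add, ArrayList::addAll);
        return copy;
    }

    public static void main(String[] args) {
        List<Integer> numbers = Arrays.asList(1, 2, 3, 4);
        System.out.println(sumWithReduce(numbers));
        System.out.println(sumWithReducingCollector(numbers));
        System.out.println(copyWithCollect(numbers));
    }
}
```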
 
 * <ul>
 *     <li>The first argument passed to the accumulator function, both
 *     arguments passed to the combiner function, and the argument passed to the
 *     finisher function must be the result of a previous invocation of the
 *     result supplier, accumulator, or combiner functions.</li>
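            1. The first argument passed to the accumulator function, both arguments passed to the combiner, and the argument passed to the finisher
                must each be the result of a previous call to the supplier, accumulator, or combiner functions (that is, a container of type A).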
 
 *     <li>The implementation should not do anything with the result of any of
 *     the result supplier, accumulator, or combiner functions other than to
 *     pass them again to the accumulator, combiner, or finisher functions,
 *     or return them to the caller of the reduction operation.</li>
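            2. The implementation must not do anything with the results of the supplier, accumulator, or combiner functions except pass them on to the accumulator, combiner, or finisher functions, or return them to the caller.
            (Do nothing else with them in between, or the state will be corrupted.)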
            
 *     <li>If a result is passed to the combiner or finisher
 *     function, and the same object is not returned from that function, it is
 *     never used again.</li>
            3. If a result is passed to the combiner or finisher function and the same object is not returned from that function,
                it is never used again. (It has been consumed.)
                
 *     <li>Once a result is passed to the combiner or finisher function, it
 *     is never passed to the accumulator function again.</li>
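            4. Once a result has been passed to the combiner or finisher function, it is never passed to the accumulator function again.
                (Once combining or finishing has started on a container, no more elements are accumulated into it.)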
 
 *     <li>For non-concurrent collectors, any result returned from the result
 *     supplier, accumulator, or combiner functions must be serially
 *     thread-confined.  This enables collection to occur in parallel without
 *     the {@code Collector} needing to implement any additional synchronization.
 *     The reduction implementation must manage that the input is properly
 *     partitioned, that partitions are processed in isolation, and combining
 *     happens only after accumulation is complete.</li>
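            5. For non-concurrent collectors, any result returned from the supplier, accumulator, or combiner functions must be confined to one thread at a time. That is what lets collection run in parallel without extra synchronization in the Collector.
            The reduction implementation must make sure the input is properly partitioned (for example, four threads working on four partitions), that partitions are processed in isolation, and that the intermediate containers are combined only after accumulation is complete.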
 
 *     <li>For concurrent collectors, an implementation is free to (but not
 *     required to) implement reduction concurrently.  A concurrent reduction
 *     is one where the accumulator function is called concurrently from
 *     multiple threads, using the same concurrently-modifiable result container,
 *     rather than keeping the result isolated during accumulation.
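            6. For concurrent collectors, the implementation is free (but not required) to reduce concurrently. In contrast to rule 5,
            the accumulator is called from several threads on one shared, concurrently modifiable container, so results need not be kept isolated during accumulation.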
 
 *     A concurrent reduction should only be applied if the collector has the
 *     {@link Characteristics#UNORDERED} characteristics or if the
 *     originating data is unordered.</li>
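 A concurrent reduction should be applied only if the collector has the UNORDERED characteristic or the source data is unordered.
    
-- That completes the important concepts.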

 * </ul>
 
 *
 * <p>In addition to the predefined implementations in {@link Collectors}, the
 * static factory methods {@link #of(Supplier, BiConsumer, BinaryOperator, Characteristics...)}
 * can be used to construct collectors.  For example, you could create a collector
 * that accumulates widgets into a {@code TreeSet} with:
 *
 * <pre>{@code
 *     Collector<Widget, ?, TreeSet<Widget>> intoSet =
 *         Collector.of(TreeSet::new, TreeSet::add,
 *                      (left, right) -> { left.addAll(right); return left; });
 * }</pre>

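        This uses the three-argument of() method.
        The three arguments are:
        1. the supplier, which creates the result container (here a TreeSet)
        2. the accumulator, which adds each data element to the result container
        3. the combiner, which merges the right container into the left one and returns the left one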
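
A runnable variant of the Javadoc snippet, using String in place of the undefined Widget type. This is a sketch under that assumption; CollectorOfSketch and its method names are hypothetical. It also shows the predefined Collectors.toCollection route for comparison.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorOfSketch {
    // Hand-built collector: supplier, accumulator, combiner.
    static TreeSet<String> withCollectorOf(List<String> words) {
        Collector<String, ?, TreeSet<String>> intoSet =
                Collector.of(TreeSet::new, TreeSet::add,
                        (left, right) -> { left.addAll(right); return left; });
        return words.stream().collect(intoSet);
    }

    // Predefined collector with the same behavior.
    static TreeSet<String> withToCollection(List<String> words) {
        return words.stream().collect(Collectors.toCollection(TreeSet::new));
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("cherry", "apple", "banana", "apple");
        System.out.println(withCollectorOf(words));
        System.out.println(withToCollection(words));
    }
}
```

Both calls yield a sorted set without duplicates, so in everyday code the predefined collector is the shorter choice.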

 *
 * (This behavior is also implemented by the predefined collector
 * {@link Collectors#toCollection(Supplier)}).
 *
 * @apiNote
 * Performing a reduction operation with a {@code Collector} should produce a
 * result equivalent to:
 * <pre>{@code
 *     R container = collector.supplier().get();
 *     for (T t : data)
 *         collector.accumulator().accept(container, t);
 *     return collector.finisher().apply(container);
 * }</pre>
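 The snippet above shows how a collector is applied sequentially:
 1. create a container
 2. accumulate each element into it
 3. apply the finisher and return the result.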
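
The @apiNote loop can be written as a generic method. In this sketch the container is declared with the accumulation type A rather than R, because supplier() returns an A and the two types only coincide for identity-finish collectors. SequentialCollectSketch and collectSequentially are hypothetical names.

```java
import java.util.Arrays;
import java.util.function.BiConsumer;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class SequentialCollectSketch {
    // Sequential application of a collector: create, accumulate, finish.
    static <T, A, R> R collectSequentially(Iterable<T> data, Collector<T, A, R> collector) {
        A container = collector.supplier().get();
        BiConsumer<A, T> accumulator = collector.accumulator();
        for (T t : data) {
            accumulator.accept(container, t);
        }
        return collector.finisher().apply(container);
    }

    // averagingInt has a finisher that is not the identity, so A and R differ.
    static double averageScore() {
        Collector<Integer, ?, Double> averaging = Collectors.averagingInt(Integer::intValue);
        return collectSequentially(Arrays.asList(80, 90, 100, 90), averaging);
    }

    public static void main(String[] args) {
        System.out.println(averageScore());
    }
}
```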
 
 *
 * <p>However, the library is free to partition the input, perform the reduction
 * on the partitions, and then use the combiner function to combine the partial
 * results to achieve a parallel reduction.  (Depending on the specific reduction
 * operation, this may perform better or worse, depending on the relative cost
 * of the accumulator and combiner functions.)
 Whether this performs better or worse depends on the actual workload.
 (Parallel is not necessarily faster than sequential.)
 
 *
 * <p>Collectors are designed to be <em>composed</em>; many of the methods
 * in {@link Collectors} are functions that take a collector and produce
 * a new collector.  For example, given the following collector that computes
 * the sum of the salaries of a stream of employees:
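 Collectors are designed to be composed: many Collectors methods take a collector and return a new collector, as in the example that follows.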
 
 *
 * <pre>{@code
 *     Collector<Employee, ?, Integer> summingSalaries
 *         = Collectors.summingInt(Employee::getSalary))
 * }</pre>
 Collector has three type parameters.
 
 *
 * If we wanted to create a collector to tabulate the sum of salaries by
 * department, we could reuse the "sum of salaries" logic using
 * {@link Collectors#groupingBy(Function, Collector)}:
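 To build a composed collector,
 reuse the existing one through groupingBy(), the classification function used earlier, as in the example that follows.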
 
 *
 * <pre>{@code
 *     Collector<Employee, ?, Map<Department, Integer>> summingSalariesByDept
 *         = Collectors.groupingBy(Employee::getDepartment, summingSalaries);
 * }</pre>
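 Group first, then sum within each group.
 This is a two-level operation: the downstream collector runs inside every group.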
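
The Javadoc's Employee and Department types are not defined anywhere in the post. A minimal runnable sketch follows, assuming a hypothetical Employee class whose department is a plain String; GroupingBySketch and its method names are also made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class GroupingBySketch {
    // Hypothetical Employee; the department is a plain String to keep the sketch short.
    static final class Employee {
        private final String department;
        private final int salary;

        Employee(String department, int salary) {
            this.department = department;
            this.salary = salary;
        }

        String getDepartment() { return department; }
        int getSalary() { return salary; }
    }

    static Map<String, Integer> salariesByDepartment(List<Employee> employees) {
        // The "sum of salaries" logic, defined once...
        Collector<Employee, ?, Integer> summingSalaries =
                Collectors.summingInt(Employee::getSalary);
        // ...and reused as the downstream collector of groupingBy.
        Collector<Employee, ?, Map<String, Integer>> summingSalariesByDept =
                Collectors.groupingBy(Employee::getDepartment, summingSalaries);
        return employees.stream().collect(summingSalariesByDept);
    }

    static List<Employee> sampleEmployees() {
        return Arrays.asList(
                new Employee("eng", 100),
                new Employee("eng", 150),
                new Employee("ops", 90));
    }

    public static void main(String[] args) {
        System.out.println(salariesByDepartment(sampleEmployees()));
    }
}
```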
 
 *
 * @see Stream#collect(Collector)
 * @see Collectors
 *
 * @param <T> the type of input elements to the reduction operation
 * @param <A> the mutable accumulation type of the reduction operation (often
 *            hidden as an implementation detail)
 * @param <R> the result type of the reduction operation
 * @since 1.8
 */

Understanding everything up to this point pays off.

The Collector interface in detail

The three type parameters of Collector, <T, A, R>, in detail

 * @param <T> the type of input elements to the reduction operation
 * @param <A> the mutable accumulation type of the reduction operation (often
 *            hidden as an implementation detail)
 * @param <R> the result type of the reduction operation
  • T: the type of the input elements to the reduction (that is, the type of each element in the stream).
  • A: the mutable accumulation type of the reduction (the type of the intermediate result container).
  • R: the result type of the reduction.
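
A collector in which all three type parameters differ makes the roles easier to see. This is a minimal sketch built with the four-argument Collector.of; TypeParametersSketch and CONCAT are hypothetical names, and in practice Collectors.joining() already covers this case.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;

class TypeParametersSketch {
    // T = String (input element), A = StringBuilder (mutable container), R = String (result)
    static final Collector<String, StringBuilder, String> CONCAT = Collector.of(
            StringBuilder::new,                        // supplier: creates an A
            (builder, word) -> builder.append(word),   // accumulator: folds a T into an A
            (left, right) -> left.append(right),       // combiner: merges two A values
            StringBuilder::toString);                  // finisher: turns the A into an R

    static String concat(List<String> words) {
        return words.stream().collect(CONCAT);
    }

    public static void main(String[] args) {
        System.out.println(concat(Arrays.asList("a", "b", "c")));
    }
}
```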

supplier()

    /**
     * A function that creates and returns a new mutable result container.
     * Creates a new mutable result container; the method returns a Supplier functional interface.
     * @return a function which returns a new, mutable result container
            Type parameter A: the type of the mutable container.
     */
    Supplier<A> supplier();

accumulator()

/**
 * A function that folds a value into a mutable result container.
 * Folds (accumulates) a new data element into a result container. The return type is the BiConsumer functional interface.
 * @return a function which folds a value into a mutable result container
 Type parameter A: the type of the intermediate result container
 Type parameter T: the type of the next stream element to process (the source type)
 */
BiConsumer<A, T> accumulator();

combiner()

/**
Closely tied to parallel streams.
 * A function that accepts two partial results and merges them.  The
 * combiner function may fold state from one argument into the other and
 * return that, or may return a new result container.
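 * Accepts two partial results and merges them, either by folding the state of one argument into the other and returning it, or by returning a new result container. (This is a little hard to grasp at first.) The return type is the BinaryOperator functional interface.
 
 -- Explanation:
 About the combiner function: if four threads run at once, four partial results are produced.
 Call them 1, 2, 3 and 4.
 The merge may create new containers:
 1, 2 -> 5
 5, 3 -> 6
 6, 4 -> 7
 Here 5, 6 and 7 are newly created collections; this is the "return a new result container" case.
 
 Or the merge may fold one argument into the other:
 1, 2 -> 1 (the contents of 2 are added to 1)
 1, 3 -> 1 (the contents of 3 are added to 1)
 This is the "fold state from one argument into the other" case.
 
 So the combiner exists specifically for parallel streams.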
 
 * @return a function which combines two partial results into a combined
 * result
 Type parameter A: the type of the intermediate result container.
 */
BinaryOperator<A> combiner();
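
The two combiner styles that the Javadoc allows can be written out explicitly. This is a minimal sketch with hypothetical names (CombinerStylesSketch, FOLD_INTO_LEFT, NEW_CONTAINER); whether and how often the combiner is actually called depends on how the runtime splits the parallel stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;
import java.util.stream.Collector;

class CombinerStylesSketch {
    // Style 1: fold the right container into the left one and return the left.
    static final BinaryOperator<List<String>> FOLD_INTO_LEFT = (left, right) -> {
        left.addAll(right);
        return left;
    };

    // Style 2: leave both inputs untouched and return a brand-new container.
    static final BinaryOperator<List<String>> NEW_CONTAINER = (left, right) -> {
        List<String> merged = new ArrayList<>(left);
        merged.addAll(right);
        return merged;
    };

    static List<String> collectWith(BinaryOperator<List<String>> combiner, List<String> input) {
        Collector<String, List<String>, List<String>> collector =
                Collector.of(ArrayList::new, List::add, combiner);
        return input.parallelStream().collect(collector);
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "c", "d");
        System.out.println(collectWith(FOLD_INTO_LEFT, input));
        System.out.println(collectWith(NEW_CONTAINER, input));
    }
}
```

Both styles give the same list in encounter order; the difference is only whether an existing container is reused or a new one is allocated at each merge.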

finisher()

/**
 * Perform the final transformation from the intermediate accumulation type
 * {@code A} to the final result type {@code R}.
 * Takes the intermediate container and returns the final result object.
 * <p>If the characteristic {@code IDENTITY_TRANSFORM} is
 * set, this function may be presumed to be an identity transform with an
 * unchecked cast from {@code A} to {@code R}.
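 * If this characteristic is set (the enum constant is named IDENTITY_FINISH), the finisher is presumed to be an identity transform, that is, an unchecked cast from A to R. The return type is the Function functional interface.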
 * @return a function which transforms the intermediate result to the final
 * result
 Type parameter A: the type of the result container
 Type parameter R: the type of the final result that is returned.
 */
Function<A, R> finisher();

The Characteristics enum

        /**
     * Characteristics indicating properties of a {@code Collector}, which can
     * be used to optimize reduction implementations.
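      The characteristics listed in this enum are used to optimize reduction implementations.
      
      -- Explanation: the enum tells the reduction implementation which kinds of execution this collector can safely support.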
     */
    enum Characteristics {
        /**
         * Indicates that this collector is <em>concurrent</em>, meaning that
         * the result container can support the accumulator function being
         * called concurrently with the same result container from multiple
         * threads.
         *
         * <p>If a {@code CONCURRENT} collector is not also {@code UNORDERED},
         * then it should only be evaluated concurrently if applied to an
         * unordered data source.
         */
        CONCURRENT, // the collector supports concurrent accumulation

        /**
         * Indicates that the collection operation does not commit to preserving
         * the encounter order of input elements.  (This might be true if the
         * result container has no intrinsic order, such as a {@link Set}.)
         */
        UNORDERED,

        /**
         * Indicates that the finisher function is the identity function and
         * can be elided.  If set, it must be the case that an unchecked cast
         * from A to R will succeed.
         */
        IDENTITY_FINISH
    }
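
Putting the four functions and the characteristics together, a complete Collector might look like the sketch below. It is an illustration only, not how the JDK implements Collectors.toSet(); HashSetCollectorSketch is a hypothetical name. CONCURRENT is left out on purpose because HashSet is not thread-safe.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.stream.Collector;

class HashSetCollectorSketch<T> implements Collector<T, Set<T>, Set<T>> {
    @Override
    public Supplier<Set<T>> supplier() {
        return HashSet::new;                 // new, empty result container
    }

    @Override
    public BiConsumer<Set<T>, T> accumulator() {
        return Set::add;                     // fold one element into the container
    }

    @Override
    public BinaryOperator<Set<T>> combiner() {
        return (left, right) -> {            // fold the right container into the left
            left.addAll(right);
            return left;
        };
    }

    @Override
    public Function<Set<T>, Set<T>> finisher() {
        return Function.identity();          // A and R are the same type here
    }

    @Override
    public Set<Characteristics> characteristics() {
        // A set has no intrinsic order, and the finisher is the identity.
        return Collections.unmodifiableSet(
                EnumSet.of(Characteristics.UNORDERED, Characteristics.IDENTITY_FINISH));
    }

    static Set<String> demo() {
        return Arrays.asList("a", "b", "a").stream()
                .collect(new HashSetCollectorSketch<String>());
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```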


Origin www.cnblogs.com/bigbaby/p/12150706.html