Parallel data processing and performance
Before Java 7, parallel processing of data collections was very cumbersome. First, you had to explicitly divide the data structure containing the data into subparts. Second, you had to assign a separate thread to each subpart. Third, you had to synchronize the threads at the right moments to avoid unwanted race conditions, wait for all threads to complete, and finally merge the partial results. Java 7 introduced a framework called fork/join to make these operations more stable and less error-prone.
In this chapter, you'll learn how the Stream interface lets you perform parallel operations on a data set without much effort: it allows you to declaratively turn a sequential stream into a parallel stream. You will also see how Java makes this magic happen or, more practically speaking, how parallel streams work behind the scenes using the fork/join framework introduced in Java 7. It's important to understand how parallel streams work internally, because if you ignore this aspect, you may get unexpected (and probably erroneous) results through misuse. In particular, we will demonstrate that the way a parallel stream is divided into chunks of data before those chunks are processed in parallel is, in some cases, the exact source of these erroneous and unexplained results. Therefore, you will also learn how to control this splitting process by implementing and using your own Spliterator.
Parallel data processing
We briefly mentioned that the Stream interface allows you to process its elements very conveniently: you can convert a collection into a parallel stream by calling the parallelStream method on the collection source. A parallel stream is a stream that splits its content into multiple chunks and processes each chunk with a different thread. This way, you can automatically distribute the workload of a given operation over all the cores of a multicore processor, keeping them all busy.
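As a minimal sketch of this idea (the list contents here are illustrative), converting a collection into a parallel stream is a single method call:

```java
import java.util.List;

public class ParallelStreamDemo {
    public static void main(String[] args) {
        // Illustrative data; any Collection exposes parallelStream() the same way
        List<Integer> numbers = List.of(1, 2, 3, 4, 5);

        // The pipeline below may be executed by multiple threads
        // drawn from the common fork/join pool
        int sum = numbers.parallelStream()
                         .mapToInt(Integer::intValue)
                         .sum();

        System.out.println(sum); // 15
    }
}
```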
public void demo1() {
    // Generate an infinite stream of natural numbers and sum the first 1000
    Long reduce = Stream.iterate(1L, i -> i + 1)
            .limit(1000)
            .reduce(0L, Long::sum);
    System.out.println(reduce);
    // In more traditional Java terms, this code is equivalent to the following loop:
    // public static long iterativeSum(long n) {
    //     long result = 0;
    //     for (long i = 1L; i <= n; i++) {
    //         result += i;
    //     }
    //     return result;
    // }
    // You can turn the stream into a parallel stream, making the preceding functional
    // reduction (here, the sum) run in parallel, by calling the parallel method on the
    // sequential stream:
    Long reduceParallel = Stream.iterate(1L, i -> i + 1)
            .limit(1000)
            .parallel()
            .reduce(0L, Long::sum);
    System.out.println(reduceParallel);
    /*
     * Note that, in reality, calling the parallel method on a sequential stream doesn't
     * imply any concrete transformation of the stream itself. Internally, it just sets
     * a boolean flag signaling that you want all operations following the call to
     * parallel to be executed in parallel. Similarly, you can turn a parallel stream
     * into a sequential one simply by invoking the sequential method on it. You might
     * think that by combining these two methods you could achieve finer-grained control
     * over which operations run in parallel and which sequentially while traversing the
     * stream. For example, you could do something like this:
     */
    ArrayList<User> userList = Lists.newArrayList();
    userList.add(User.builder().sex("female").name("Xiaohong").type(true).uId(10).build());
    userList.add(User.builder().sex("female").name("Xiaohua").type(false).uId(11).build());
    userList.add(User.builder().sex("male").name("Xiaozhang").type(true).uId(12).build());
    userList.add(User.builder().sex("male").name("Xiaowang").type(false).uId(13).build());
    userList.add(User.builder().sex("male").name("Xiaoli").type(true).uId(14).build());
    Integer reduceSequential = userList.stream().parallel()
            .filter(val -> val.getUId() > 10)
            .sequential()
            .map(user -> user.getUId())
            .limit(3)
            .parallel()
            .reduce(0, Integer::sum);
    // In reality, however, the last call to parallel or sequential wins: because
    // parallel() is invoked last here, the whole pipeline runs in parallel.
    System.out.println(reduceSequential);
    /*
     * Watch out for boxing. Automatic boxing and unboxing can hurt performance
     * significantly. Java 8 provides primitive streams (IntStream, LongStream,
     * DoubleStream) to avoid such operations; use them whenever possible.
     *
     * Also consider whether the data structure behind the stream decomposes well.
     * For instance, an ArrayList can be split much more efficiently than a
     * LinkedList, because the former can be divided evenly without traversing it,
     * whereas the latter must be traversed.
     */
}
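The boxing advice above can be made concrete with a short sketch: using LongStream.rangeClosed avoids boxing entirely and produces ranges of numbers that can be split evenly for parallel execution (this is a minimal illustration, not a listing from the original text):

```java
import java.util.stream.LongStream;

public class RangedSumDemo {
    // Sums 1..n on a primitive stream: no boxing costs, and rangeClosed
    // yields chunks that decompose evenly across worker threads
    public static long parallelRangedSum(long n) {
        return LongStream.rangeClosed(1, n)
                         .parallel()
                         .reduce(0L, Long::sum);
    }

    public static void main(String[] args) {
        System.out.println(parallelRangedSum(1_000)); // prints 500500
    }
}
```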
Figure 7-3 The fork/join process
You may have noticed that this is nothing more than a parallel version of the well-known divide-and-conquer algorithm. Here is a practical example of using the fork/join framework. Building on the previous example, let's use this framework to sum a range of numbers (here represented by a long[] array). As mentioned before, you first need to provide an implementation of the RecursiveTask class, which is the ForkJoinSumCalculator in the code listing below.
public static long forkJoinSum(long n) {
    long[] numbers = LongStream.rangeClosed(1, n).toArray();
    ForkJoinTask<Long> task = new ForkJoinSumCalculator(numbers);
    return new ForkJoinPool().invoke(task);
}
Run ForkJoinSumCalculator
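The forkJoinSum method relies on a ForkJoinSumCalculator class that this excerpt doesn't show. Below is a sketch of how such a RecursiveTask subclass is typically written; the 10,000-element threshold is an illustrative choice, not a prescribed value:

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;
import java.util.stream.LongStream;

// A RecursiveTask that sums a slice of an array, splitting the work in two
// until the slice is small enough to be summed sequentially
public class ForkJoinSumCalculator extends RecursiveTask<Long> {

    private static final long THRESHOLD = 10_000; // illustrative cutoff

    private final long[] numbers;
    private final int start;
    private final int end;

    public ForkJoinSumCalculator(long[] numbers) {
        this(numbers, 0, numbers.length);
    }

    private ForkJoinSumCalculator(long[] numbers, int start, int end) {
        this.numbers = numbers;
        this.start = start;
        this.end = end;
    }

    @Override
    protected Long compute() {
        int length = end - start;
        if (length <= THRESHOLD) {
            return computeSequentially();
        }
        ForkJoinSumCalculator leftTask =
                new ForkJoinSumCalculator(numbers, start, start + length / 2);
        leftTask.fork(); // schedule the first half asynchronously on another thread
        ForkJoinSumCalculator rightTask =
                new ForkJoinSumCalculator(numbers, start + length / 2, end);
        Long rightResult = rightTask.compute(); // reuse the current thread for the second half
        Long leftResult = leftTask.join();      // wait for the forked half to finish
        return leftResult + rightResult;
    }

    private long computeSequentially() {
        long sum = 0;
        for (int i = start; i < end; i++) {
            sum += numbers[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] numbers = LongStream.rangeClosed(1, 1_000_000).toArray();
        ForkJoinTask<Long> task = new ForkJoinSumCalculator(numbers);
        System.out.println(new ForkJoinPool().invoke(task)); // 500000500000
    }
}
```

Calling compute on the right task (instead of forking both halves) lets the current thread do useful work rather than merely waiting, which is a common fork/join idiom.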
Spliterator interface
public interface Spliterator<T> {
    boolean tryAdvance(Consumer<? super T> action);
    Spliterator<T> trySplit();
    long estimateSize();
    int characteristics();
}
As always, T is the type of the elements traversed by the Spliterator. The tryAdvance method behaves much like a normal Iterator, in that it's used to consume the elements of the Spliterator one by one, in order, returning true if there are still other elements to be traversed. But the trySplit method is specific to the Spliterator interface: it partitions off some of the elements to a second Spliterator (the one returned by the method), allowing the two to be processed in parallel.
A Spliterator can also estimate how many elements remain to be traversed through its estimateSize method, because even an inexact but cheap-to-compute value can help split the structure more or less evenly. It is important to understand how this splitting process is performed internally so that you can take control of it when the need arises.
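To make these methods concrete, here is a small sketch that obtains a Spliterator from a list and splits it once (the element values are arbitrary):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Spliterator;

public class SpliteratorDemo {
    public static void main(String[] args) {
        List<Integer> list = Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8);

        Spliterator<Integer> second = list.spliterator();
        // trySplit hands roughly the first half of the elements
        // to a newly created Spliterator
        Spliterator<Integer> first = second.trySplit();

        System.out.println(first.estimateSize());  // 4
        System.out.println(second.estimateSize()); // 4

        // tryAdvance consumes one element at a time, like an Iterator
        first.tryAdvance(n -> System.out.println("first element: " + n));
    }
}
```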
Finally, let's see how you can implement your own Spliterator.
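As a sketch of what such an implementation can look like, here is a Spliterator over the characters of a String that only splits at whitespace, so that a parallel computation never cuts a word in half. The class name, the minimum-size threshold of 10, and the demo sentence are all illustrative choices:

```java
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.StreamSupport;

// A custom Spliterator that traverses a String character by character
// and refuses to split anywhere except at a whitespace boundary
public class WordCounterSpliterator implements Spliterator<Character> {

    private final String string;
    private int currentChar = 0;

    public WordCounterSpliterator(String string) {
        this.string = string;
    }

    @Override
    public boolean tryAdvance(Consumer<? super Character> action) {
        action.accept(string.charAt(currentChar++)); // consume one character
        return currentChar < string.length();        // more characters left?
    }

    @Override
    public Spliterator<Character> trySplit() {
        int currentSize = string.length() - currentChar;
        if (currentSize < 10) {
            return null; // too small to be worth splitting further
        }
        // look for the first whitespace at or after the midpoint and split there
        for (int splitPos = currentSize / 2 + currentChar;
                 splitPos < string.length(); splitPos++) {
            if (Character.isWhitespace(string.charAt(splitPos))) {
                Spliterator<Character> spliterator =
                        new WordCounterSpliterator(string.substring(currentChar, splitPos));
                currentChar = splitPos;
                return spliterator;
            }
        }
        return null;
    }

    @Override
    public long estimateSize() {
        return string.length() - currentChar;
    }

    @Override
    public int characteristics() {
        return ORDERED + SIZED + SUBSIZED + NONNULL + IMMUTABLE;
    }

    public static void main(String[] args) {
        String sentence = "Nel mezzo del cammin di nostra vita";
        // the second argument (true) asks StreamSupport for a parallel stream
        long chars = StreamSupport
                .stream(new WordCounterSpliterator(sentence), true)
                .count();
        System.out.println(chars); // 35: every character is visited exactly once
    }
}
```

Because trySplit only ever returns a boundary at whitespace, each chunk handed to a worker thread contains whole words, which is exactly the control over the splitting process that the text above refers to.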