Thoroughly understanding Java Stream operations


In Java, operations on the elements of arrays and Collection types are usually handled either by looping over them one by one or by using a Stream.

For example, now there is such a requirement:

From a given sentence, return a list of the words whose length is greater than 5, sorted by length in descending order, returning at most 3 words.

In Java 7 and earlier, we could implement it like this:

public List<String> sortGetTop3LongWords(@NotNull String sentence) {
    // Split the sentence to get the individual words
    String[] words = sentence.split(" ");
    List<String> wordList = new ArrayList<>();
    // Loop over the words and keep only those that meet the length requirement
    for (String word : words) {
        if (word.length() > 5) {
            wordList.add(word);
        }
    }
    // Sort the qualifying words by length, longest first
    wordList.sort((o1, o2) -> o2.length() - o1.length());
    // If the result contains more than 3 words, return a sub-list of the first three
    if (wordList.size() > 3) {
        wordList = wordList.subList(0, 3);
    }
    return wordList;
}

In Java 8 and later versions, with the help of Stream, we can write the same logic more elegantly:

public List<String> sortGetTop3LongWordsByStream(@NotNull String sentence) {
    return Arrays.stream(sentence.split(" "))
            .filter(word -> word.length() > 5)
            .sorted((o1, o2) -> o2.length() - o1.length())
            .limit(3)
            .collect(Collectors.toList());
}

Intuitively, the Stream implementation is more concise and reads in one pass. Many developers use Stream in their code, but their understanding of it is often limited to a few simple operations such as filter, map and collect; the applicable scenarios and capabilities of Java Stream go far beyond these.

So here comes the question: compared with the traditional loop-and-forEach approach, what advantages does Stream have?

Let's set this question aside for now, build a comprehensive understanding of Stream first, and then come back to it.

A first look at Stream and its APIs

In a nutshell, Stream operations can be divided into 3 types:

  • Create Stream
  • Stream intermediate processing
  • Terminate the Stream
Each of these pipeline stages contains several API methods; a brief description of each is given below.
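To make the three stages concrete, here is a minimal sketch (the class name and sample numbers are just for illustration, not from the original article) that labels each stage of a single pipeline:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PipelineStagesDemo {
    public static void main(String[] args) {
        List<Integer> result = Arrays.asList(3, 1, 4, 1, 5, 9).stream() // 1. create the Stream
                .filter(n -> n > 1)                                     // 2. intermediate processing
                .sorted()                                               // 2. intermediate processing
                .collect(Collectors.toList());                          // 3. terminate the Stream
        System.out.println(result); // prints [3, 4, 5, 9]
    }
}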

Start the pipeline

This stage is responsible for creating a new Stream, either from scratch or from an existing array, List, Set, Map or other collection-type object.

API Function Description
stream() Creates a new serial Stream object
parallelStream() Creates a Stream object that can be executed in parallel
Stream.of() Creates a new Stream object from the given sequence of elements
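As a minimal sketch of these three creation methods (the class name and sample data are illustrative assumptions, not from the original article):

import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

public class CreateStreamDemo {
    public static void main(String[] args) {
        List<String> ids = Arrays.asList("205", "105", "308");

        // Serial stream created from an existing collection
        Stream<String> serial = ids.stream();
        // Parallel stream created from the same collection
        Stream<String> parallel = ids.parallelStream();
        // Stream built directly from a given sequence of elements
        Stream<String> fromValues = Stream.of("205", "105", "308");

        System.out.println(serial.count());     // 3
        System.out.println(parallel.count());   // 3
        System.out.println(fromValues.count()); // 3
    }
}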

Intermediate pipeline

This stage processes the Stream and returns a new Stream object, and intermediate pipeline operations can be chained one after another.

API Function Description
filter() Keeps only the elements that match the given condition and returns a new stream
map() Converts each element into another object type, one-to-one, and returns a new stream
flatMap() Converts each element into another object type, one-to-many, i.e. one original element may become one or more elements of the new type, and returns a new stream
limit() Keeps only the specified number of leading elements and returns a new stream
skip() Skips the specified number of leading elements and returns a new stream
concat() Merges the data of two streams into one and returns the new stream
distinct() Removes duplicate elements from the Stream and returns a new stream
sorted() Sorts all elements in the stream according to the specified rule and returns a new stream
peek() Visits each element in the stream one by one and returns the stream for further processing
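Most of these methods are demonstrated later in this article; skip() and concat() are not, so here is a small sketch of them (the class name and sample data are illustrative assumptions):

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MiddlePipeDemo {
    public static void main(String[] args) {
        // skip(): drop the first 2 elements and keep the rest
        List<Integer> skipped = Stream.of(1, 2, 3, 4, 5)
                .skip(2)
                .collect(Collectors.toList());
        System.out.println(skipped); // [3, 4, 5]

        // Stream.concat(): merge two streams into a new stream
        List<String> merged = Stream.concat(Stream.of("a", "b"), Stream.of("c"))
                .collect(Collectors.toList());
        System.out.println(merged); // [a, b, c]
    }
}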

Terminate the pipeline

As the name implies, once the terminal pipeline operation runs, the Stream ends; some final logic may be executed, or result data may be returned as required.

API Function Description
count() Returns the number of elements remaining after stream processing
max() Returns the maximum element after stream processing
min() Returns the minimum element after stream processing
findFirst() Stops stream processing as soon as the first matching element is found
findAny() Stops stream processing as soon as any matching element is found. For serial streams this behaves like findFirst; for parallel streams it is more efficient, since a match found in any fragment terminates the remaining computation
anyMatch() Returns a boolean, similar to contains(), used to determine whether any element meets the condition
allMatch() Returns a boolean used to determine whether all elements meet the condition
noneMatch() Returns a boolean used to determine whether no element meets the condition
collect() Converts the stream into the specified type, as directed by a Collector
toArray() Converts the stream into an array
iterator() Converts the stream into an Iterator object
forEach() No return value; visits the elements one by one and applies the given processing logic
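Two of these terminal methods, max() and toArray(), do not appear in the later examples, so here is a small sketch of them (the class name and sample data are illustrative assumptions):

import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TerminateDemo {
    public static void main(String[] args) {
        List<String> ids = Arrays.asList("205", "10", "308", "49", "627");

        // max(): the numerically largest element, wrapped in an Optional
        ids.stream()
                .max(Comparator.comparingInt(Integer::parseInt))
                .ifPresent(s -> System.out.println("max: " + s)); // max: 627

        // toArray(): collect the remaining elements into an array
        String[] longIds = ids.stream()
                .filter(s -> s.length() > 2)
                .toArray(String[]::new);
        System.out.println(Arrays.toString(longIds)); // [205, 308, 627]
    }
}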

Using the Stream methods

map and flatMap

Both map and flatMap are used to convert existing elements into other elements, the difference is:

  • map must be one-to-one, that is, each element can only be transformed into exactly 1 new element
  • flatMap can be one-to-many, that is, each element can be converted into one or more new elements.
For example: there is a list of string IDs, and it needs to be converted into a list of User objects. You can use map to achieve this:
/**
 * Demonstrates map: one-to-one conversion
 */
public void stringToIntMap() {
    List<String> ids = Arrays.asList("205", "105", "308", "469", "627", "193", "111");
    // Use a stream operation
    List<User> results = ids.stream()
            .map(id -> {
                User user = new User();
                user.setId(id);
                return user;
            })
            .collect(Collectors.toList());
    System.out.println(results);
}

After execution, you will find that each element is converted into a corresponding new element, but the total number of elements before and after is the same:

[User{id='205'}, User{id='105'}, User{id='308'}, User{id='469'}, User{id='627'}, User{id='193'}, User{id='111'}]

Another example: there is a list of sentences, and each word in every sentence needs to be extracted to obtain a single list of all words. map cannot handle this; flatMap is needed:

public void stringToIntFlatmap() {
    List<String> sentences = Arrays.asList("hello world","Jia Gou Wu Dao");
    // Use a stream operation
    List<String> results = sentences.stream()
            .flatMap(sentence -> Arrays.stream(sentence.split(" ")))
            .collect(Collectors.toList());
    System.out.println(results);
}

The execution result is as follows; you can see that the result list contains more elements than the original list:

[hello, world, Jia, Gou, Wu, Dao]

One thing to add here: a flatMap operation first maps each element to a new Stream, and then the resulting Streams are flattened and merged into one complete new Stream.

peek and forEach methods

Both peek and forEach can be used to visit the elements one by one and process them.
However, as introduced earlier, peek is an intermediate method, while forEach is a terminal method. This means that peek can only act as a processing step in the middle of the pipeline and cannot produce a result by itself; it is executed only when some terminal operation follows it. forEach, on the other hand, is a terminal method with no return value and can perform the processing directly.

public void testPeekAndforeach() {
    List<String> sentences = Arrays.asList("hello world","Jia Gou Wu Dao");
    // Demo 1: peek alone is never executed
    System.out.println("----before peek----");
    sentences.stream().peek(sentence -> System.out.println(sentence));
    System.out.println("----after peek----");
    // Demo 2: forEach alone is executed
    System.out.println("----before foreach----");
    sentences.stream().forEach(sentence -> System.out.println(sentence));
    System.out.println("----after foreach----");
    // Demo 3: peek followed by a terminal operation is executed
    System.out.println("----before peek and count----");
    sentences.stream().peek(sentence -> System.out.println(sentence)).count();
    System.out.println("----after peek and count----");
}

From the output you can see that peek alone is not executed, that it does execute once a terminal operation follows it, and that forEach executes directly:

----before peek----
----after peek----
----before foreach----
hello world
Jia Gou Wu Dao
----after foreach----
----before peek and count----
hello world
Jia Gou Wu Dao
----after peek and count----

filter, sorted, distinct, limit

These are the most commonly used intermediate operations of Stream; their meanings are explained in the table above. In practice you can pick one or combine several as needed, and the same method may even appear more than once in the pipeline:

public void testGetTargetUsers() {
    List<String> ids = Arrays.asList("205","10","308","49","627","193","111", "193");
    // Use a stream operation
    List<Dept> results = ids.stream()
            .filter(s -> s.length() > 2)
            .distinct()
            .map(Integer::valueOf)
            .sorted(Comparator.comparingInt(o -> o))
            .limit(3)
            .map(id -> new Dept(id))
            .collect(Collectors.toList());
    System.out.println(results);
}

The processing logic of the above code snippet is clear:

  1. Use filter to remove the elements that do not meet the condition
  2. Use distinct to deduplicate the remaining elements
  3. Use map to convert the strings to Integer type
  4. Use sorted to arrange them in ascending numeric order
  5. Use limit to take only the first 3 elements
  6. Use map again to convert each id into a Dept object
  7. Use the terminal operation collect to gather the final data into a list

Output result:

[Dept{id=111}, Dept{id=193}, Dept{id=205}]

Simple result termination method

According to the earlier introduction, terminal methods such as count, max, min, findAny, findFirst, anyMatch, allMatch and noneMatch belong to the simple-result terminal methods discussed here. "Simple" means the result is a number, a boolean value, or an Optional value.

public void testSimpleStopOptions() {
    List<String> ids = Arrays.asList("205", "10", "308", "49", "627", "193", "111", "193");
    // Count the elements remaining after the stream operations
    System.out.println(ids.stream().filter(s -> s.length() > 2).count());
    // Check whether any element equals "205"
    System.out.println(ids.stream().filter(s -> s.length() > 2).anyMatch("205"::equals));
    // findFirst operation
    ids.stream().filter(s -> s.length() > 2)
            .findFirst()
            .ifPresent(s -> System.out.println("findFirst:" + s));
}

The result after execution is:

6
true
findFirst:205

A pitfall to watch out for

One reminder here: once a Stream has been terminated, it cannot be read again or have further operations performed on it, otherwise an error is thrown. See the following example:

public void testHandleStreamAfterClosed() {
    List<String> ids = Arrays.asList("205", "10", "308", "49", "627", "193", "111", "193");
    Stream<String> stream = ids.stream().filter(s -> s.length() > 2);
    // Count the elements remaining after the stream operations
    System.out.println(stream.count());
    System.out.println("----- the next call will fail -----");
    // Check whether any element equals "205"
    try {
        System.out.println(stream.anyMatch("205"::equals));
    } catch (Exception e) {
        e.printStackTrace();
    }
    System.out.println("----- the call above failed -----");
}

When executed, the result is as follows:

6
----- the next call will fail -----
java.lang.IllegalStateException: stream has already been operated upon or closed
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:229)
	at java.util.stream.ReferencePipeline.anyMatch(ReferencePipeline.java:449)
	at com.veezean.skills.stream.StreamService.testHandleStreamAfterClosed(StreamService.java:153)
	at com.veezean.skills.stream.StreamService.main(StreamService.java:176)
----- the call above failed -----

Because the stream was already terminated by the count() method, executing anyMatch on it throws the error stream has already been operated upon or closed, which requires special attention when using Stream.
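If the same filtered stream is needed for several terminal operations, one common workaround is simply to build a fresh stream each time, for example via a Supplier. The sketch below is an illustrative assumption, not code from the original article:

import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;
import java.util.stream.Stream;

public class ReuseStreamDemo {
    public static void main(String[] args) {
        List<String> ids = Arrays.asList("205", "10", "308", "49", "627", "193", "111", "193");

        // Re-create the stream for every terminal operation instead of reusing one instance
        Supplier<Stream<String>> filtered = () -> ids.stream().filter(s -> s.length() > 2);

        System.out.println(filtered.get().count());                 // 6
        System.out.println(filtered.get().anyMatch("205"::equals)); // true
    }
}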

Result Collection Termination Method

Because Stream is mainly used to process collection data, beyond the terminal methods above that return simple results, the more common scenario is to obtain a collection-type result object, such as a List, Set or HashMap. This is where the collect method comes in, and it can produce the following kinds of result data:

  • A collection class, such as List, Set or HashMap
  • A joined String, built by concatenating multiple strings and returning the combined result
  • An object that records counts or calculates sums (batch statistics over the data)

Generate a collection

This is probably one of the most common uses of collect:

public void testCollectStopOptions() {
    List<Dept> ids = Arrays.asList(new Dept(17), new Dept(22), new Dept(23));
    // Collect into a List
    List<Dept> collectList = ids.stream().filter(dept -> dept.getId() > 20)
            .collect(Collectors.toList());
    System.out.println("collectList:" + collectList);
    // Collect into a Set
    Set<Dept> collectSet = ids.stream().filter(dept -> dept.getId() > 20)
            .collect(Collectors.toSet());
    System.out.println("collectSet:" + collectSet);
    // Collect into a HashMap with the id as key and the Dept object as value
    Map<Integer, Dept> collectMap = ids.stream().filter(dept -> dept.getId() > 20)
            .collect(Collectors.toMap(Dept::getId, dept -> dept, (e1, e2) -> e1)); // (e1, e2) -> e1 resolves key collisions
    System.out.println("collectMap:" + collectMap);
}

The result is as follows:

collectList:[Dept{id=22}, Dept{id=23}]
collectSet:[Dept{id=23}, Dept{id=22}]
collectMap:{22=Dept{id=22}, 23=Dept{id=23}}

Generate a concatenated string

Concatenating the values in a List or array into a single comma-separated string is a scene everyone is familiar with, right?
If you use a for loop and a StringBuilder to do the concatenation, you have to worry about handling the trailing comma, which is quite cumbersome:

public void testForJoinStrings() {
    List<String> ids = Arrays.asList("205", "10", "308", "49", "627", "193", "111", "193");
    StringBuilder builder = new StringBuilder();
    for (String id : ids) {
        builder.append(id).append(',');
    }
    // Remove the extra trailing comma
    builder.deleteCharAt(builder.length() - 1);
    System.out.println("Joined: " + builder.toString());
}

But now with Stream, it can be easily implemented using collect:

public void testCollectJoinStrings() {
    List<String> ids = Arrays.asList("205", "10", "308", "49", "627", "193", "111", "193");
    String joinResult = ids.stream().collect(Collectors.joining(","));
    System.out.println("Joined: " + joinResult);
}

You can get the exact same result either way, but the Stream way is more elegant:

Joined: 205,10,308,49,627,193,111,193

Key point:
To be clear, this particular scenario could also be handled with String.join(), with no need for a stream at all. The real charm of Stream is that the joining can be combined with other business logic, so the code reads naturally in one pass. If the requirement is purely string concatenation, there is no need to reach for Stream; that would be using a sledgehammer to crack a nut. The kind of example sketched below gives a feel for where joining strings with Stream really shines.
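The original illustration is not reproduced here; as a rough sketch of the idea (the SQL-style IN clause is an assumed use case, not from the original article), joining becomes genuinely attractive once it is combined with other stream logic in the same pipeline:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JoinWithLogicDemo {
    public static void main(String[] args) {
        List<String> ids = Arrays.asList("205", "10", "308", "49", "627", "193", "111", "193");

        // Filter, deduplicate, quote and join in a single pipeline,
        // e.g. to build the body of a SQL IN (...) clause
        String joined = ids.stream()
                .filter(s -> s.length() > 2)
                .distinct()
                .map(id -> "'" + id + "'")
                .collect(Collectors.joining(",", "(", ")"));
        System.out.println(joined); // ('205','308','627','193','111')
    }
}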

Data Batch Math Operations

There is another scenario, perhaps less common in practice, where collect is used to produce statistics over numeric data. The implementation is worth knowing as well:

public void testNumberCalculate() {
    List<Integer> ids = Arrays.asList(10, 20, 30, 40, 50);
    // Calculate the average value
    Double average = ids.stream().collect(Collectors.averagingInt(value -> value));
    System.out.println("Average: " + average);
    // Summary statistics of the data
    IntSummaryStatistics summary = ids.stream().collect(Collectors.summarizingInt(value -> value));
    System.out.println("Summary statistics: " + summary);
}

In the example above, the collect method is used to perform mathematical operations on the element values in the list, and the results are as follows:

Average: 30.0
Summary statistics: IntSummaryStatistics{count=5, sum=150, min=10, average=30.000000, max=50}

Parallel Stream

Mechanism Description

Parallel streams make effective use of multi-core CPUs and speed up the processing logic. A parallel stream splits the whole stream into multiple fragments, runs the processing logic on each fragment in parallel, and finally merges the results of the fragments back into one overall stream.
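As a minimal sketch of turning a pipeline parallel (the class name and numbers are illustrative; real speed-ups depend on the workload and data size):

import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelStreamDemo {
    public static void main(String[] args) {
        // The elements are processed in parallel fragments; collect() merges the
        // fragment results back into a single list in encounter order
        List<Integer> squares = IntStream.rangeClosed(1, 10)
                .parallel()
                .map(i -> i * i)
                .boxed()
                .collect(Collectors.toList());
        System.out.println(squares); // [1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
    }
}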

Constraints and Restrictions

Parallel streams are similar to multi-threaded processing, so the usual concerns of multi-threaded scenarios, such as deadlocks, also apply; any function executed inside a parallel stream must be thread-safe.
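The sketch below (an illustrative assumption, not from the original article) shows why: mutating a shared, non-thread-safe list from a parallel stream can lose elements or throw, while letting collect() do the accumulation stays safe:

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class ParallelSafetyDemo {
    public static void main(String[] args) {
        // Risky: concurrently adding to a shared ArrayList, which is not thread-safe
        List<Integer> unsafe = new ArrayList<>();
        IntStream.rangeClosed(1, 1000).parallel().forEach(unsafe::add);
        System.out.println("unsafe size (may be < 1000, or the add may throw): " + unsafe.size());

        // Safe: collect() accumulates per fragment and merges the fragments for us
        List<Integer> safe = IntStream.rangeClosed(1, 1000).parallel()
                .boxed()
                .collect(Collectors.toList());
        System.out.println("safe size: " + safe.size()); // 1000
    }
}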

Answering the original question

At this point, the introduction to the concepts and usage of Java Stream is basically complete. Let's switch our focus back to the question raised at the beginning of this article:

Compared with the traditional loop-and-forEach way of processing collections, what advantages does Stream have?

According to the previous introduction, we should be able to draw the following answers:

  • The code is more concise; the declarative coding style makes the logical intent of the code easier to see
  • The processing steps are decoupled: each intermediate operation only needs to implement its own logic according to the contract, without caring about what comes before or after it
  • In parallel-stream scenarios it is more efficient than iterating over elements one by one
  • Thanks to functional interfaces and lazy evaluation, intermediate pipeline operations are not executed immediately no matter how many there are; they run only when a terminal operation is reached, which avoids unnecessary intermediate work

Of course, Stream is not all advantages; it also has its drawbacks:

  • Debugging the code is less convenient
  • Developers switching from the traditional style of writing need some time to adapt to Stream


Origin blog.csdn.net/doublepg13/article/details/128577400