Java stream - collect combiner

Bojan Vukasovic :

Why does the following code:

StringBuilder sb22 = IntStream
     .range(1, 101)
     .filter(x -> x > 50)
     .boxed()
     .parallel()
     .collect(// object that is used in accumulator to do accumulating on
              StringBuilder::new, 
              // use object from above and call append on it with each stream element as argument
              (sb, a) -> sb.append(":" + a),  
              // (executes only when using parallel!)
              (sb1, sb2) -> { 
                     System.out.println(Thread.currentThread().getId() + "  " + "sb1=" + sb1 + " AND " + "sb2=" + sb2); 
                     sb1.append("-"+sb2);
              });

produces this result:

------------------:51:52:53-:54:55:56-:57:58:59-:60:61:62-:63:64:65-:66:67:68-:69:70:71-:72:73-:74:75-:76:77:78-:79:80:81-:82:83:84-:85:86:87-:88:89:90-:91:92:93-:94:95:96-:97:98-:99:100

shouldn't first part (------------------) be excluded from the output?

Also, I understood that combiner in collect can potentially be called out of order so it is possible to have instead :76:77:78-:79:80:81 e.g. :63:64:65-:79:80:81?

UPDATE (after @Holger response)

This is the tree generated using code he linked for this case:

                                                                                                                                                    [51..100]                                                                                                                                                     
                                                                        _________________________________________________________________________________/\______________________________________________________________________                                                                                 
                                                                       |                                                                                                                                                         |                                                                                
                                                                    (empty)                                                                                                                                                 [51..100]                                                                             
                                    ___________________________________/\__________________________________                                                                              ________________________________________/\______________________________________                                         
                                   |                                                                       |                                                                            |                                                                                |                                        
                                (empty)                                                                 (empty)                                                                     [51..75]                                                                         [76..100]                                    
                ___________________/\______________                                     ___________________/\______________                                       ______________________/\________________                                         ______________________/\________________                       
               |                                   |                                   |                                   |                                     |                                        |                                       |                                        |                      
            (empty)                             (empty)                             (empty)                             (empty)                              [51..62]                                 [63..75]                                [76..87]                                 [88..100]                  
        _______/\______                 ___________/\______                     _______/\______                 ___________/\______                      ________/\_______                   _____________/\_______                       ________/\_______                   _____________/\_______              
       |               |               |                   |                   |               |               |                   |                    |                 |                 |                      |                     |                 |                 |                      |             
    (empty)         (empty)         (empty)             (empty)             (empty)         (empty)         (empty)             (empty)             [51..56]          [57..62]          [63..68]               [69..75]              [76..81]          [82..87]          [88..93]               [94..100]         
    ___/\__         ___/\__         ___/\__         _______/\__             ___/\__         ___/\__         ___/\__         _______/\__              ___/\___          ___/\___          ___/\___          ________/\__               ___/\___          ___/\___          ___/\___          ________/\___         
   |       |       |       |       |       |       |           |           |       |       |       |       |       |       |           |            |        |        |        |        |        |        |            |             |        |        |        |        |        |        |             |        
(empty) (empty) (empty) (empty) (empty) (empty) (empty)     (empty)     (empty) (empty) (empty) (empty) (empty) (empty) (empty)     (empty)     [51..53] [54..56] [57..59] [60..62] [63..65] [66..68] [69..71]     [72..75]      [76..78] [79..81] [82..84] [85..87] [88..90] [91..93] [94..96]     [97..100]     
                                                            ___/\__                                                                 ___/\__                                                                         ___/\___                                                                         ____/\__     
                                                           |       |                                                               |       |                                                                       |        |                                                                       |        |    
                                                        (empty) (empty)                                                         (empty) (empty)                                                                [72..73] [74..75]                                                                [97..98] [99..100]
Holger :

The workload splitting happens before anything has been processed, hence, the Stream implementation will split the range [1, 101] into subranges to be processed. At this point, it doesn’t know that the filter will remove the first half entirely, it can’t know without evaluating the predicate and that should already happen in parallel, hence, after the workload splitting.

Therefore, each subrange gets processed the same way, including collecting the results into a container and combining those containers afterwards, even if they happen to be empty. The specification doesn’t say that the combining step will be skipped when no elements arrived at the collector, hence, you shouldn’t expect that. While in theory, it would be possible to track whether any elements reached the collector, this tracking would only serve a specific case and it’s not even clear whether combining a container with an empty container (like adding an empty List or appending an empty StringBuilder) is more expensive than this tracking.

Of course, nothing stops you from optimizing your combiner, if it retains the semantics, e.g. instead of (sb1, sb2) -> sb1.append(sb2), you could use (sb1, sb2) -> sb1.length()==0? sb2: sb1.append(sb2)

You can look at this Q&A, “Visualization of Java Stream parallelization” for more details.

Guess you like

Origin http://10.200.1.11:23101/article/api/json?id=467481&siteId=1