Flink 1.15 source code analysis: OperatorChain

Table of contents

How does Flink determine whether operators can form an operator chain?

Analysis using the WordCount code

Summary


This article first summarizes the conditions under which operators can form an operator chain, then steps through the isChainable check in the source code (Flink 1.15.2), using a WordCount job as the example.

How does Flink determine whether operators can form an operator chain?

Two operators can be chained when all of the following conditions are met:

  • The downstream operator has exactly one incoming edge, so it cannot be the target of a connect, union, or join.
  • The upstream operator and the downstream operator are in the same SlotSharingGroup.
  • Neither the upstream operator nor the downstream operator is null.
  • It is not the case that downStreamOperator is a YieldingOperatorFactory while the head operator of the upstream chain is a legacy source.
  • upStreamOperator's ChainingStrategy is ALWAYS, HEAD, or HEAD_WITH_SOURCES.
  • downStreamOperator's ChainingStrategy is ALWAYS, or HEAD_WITH_SOURCES (only when the upstream operator is a source).
  • The edge's outputPartitioner is a ForwardPartitioner.
  • The StreamEdge's ExchangeMode is not BATCH.
  • The upstream operator and the downstream operator have the same parallelism.
  • streamGraph.isChainingEnabled() is true, that is, the job has not called disableOperatorChaining() on the execution environment.
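
The conditions above can be condensed into a single predicate. Below is a minimal sketch in plain Java, not Flink's actual code: Node and Edge are hypothetical stand-ins for StreamNode and StreamEdge, the partitioner is reduced to a name string, and the ChainingStrategy check is reduced to a precomputed boolean. It also compares parallelism, which isChainableInput appears to do in 1.15 as far as I can tell from the source.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified model of the chaining conditions. All types are hypothetical stand-ins. */
public class ChainConditionsSketch {

    /** Stand-in for StreamNode. */
    static class Node {
        final String name;
        final String slotSharingGroup;
        final int parallelism;
        final List<Edge> inEdges = new ArrayList<>();

        Node(String name, String slotSharingGroup, int parallelism) {
            this.name = name;
            this.slotSharingGroup = slotSharingGroup;
            this.parallelism = parallelism;
        }
    }

    /** Stand-in for StreamEdge. Creating an edge registers it as an in-edge of the target. */
    static class Edge {
        final Node source;
        final Node target;
        final String partitioner;              // e.g. "FORWARD", "REBALANCE", "HASH"
        final boolean batchExchange;           // true if the exchange mode is BATCH
        final boolean strategiesAllowChaining; // outcome of the ChainingStrategy check

        Edge(Node source, Node target, String partitioner,
             boolean batchExchange, boolean strategiesAllowChaining) {
            this.source = source;
            this.target = target;
            this.partitioner = partitioner;
            this.batchExchange = batchExchange;
            this.strategiesAllowChaining = strategiesAllowChaining;
            target.inEdges.add(this);
        }
    }

    /** Returns true only if every chaining condition holds for the given edge. */
    static boolean isChainable(Edge edge, boolean chainingEnabled) {
        Node up = edge.source;
        Node down = edge.target;
        return down.inEdges.size() == 1
                && up.slotSharingGroup.equals(down.slotSharingGroup)
                && edge.strategiesAllowChaining
                && "FORWARD".equals(edge.partitioner)
                && !edge.batchExchange
                && up.parallelism == down.parallelism
                && chainingEnabled;
    }

    public static void main(String[] args) {
        Node window = new Node("window", "default", 4);
        Node sink = new Node("sink", "default", 4);
        Edge forward = new Edge(window, sink, "FORWARD", false, true);
        System.out.println(isChainable(forward, true));  // true
        System.out.println(isChainable(forward, false)); // false: chaining disabled for the job
    }
}
```

Because the checks are joined with &&, a single failing condition is enough to stop two operators from being chained.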

Analysis using the WordCount code

 WordCount.java

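import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class WordCount {
    public static void main(String[] args) throws Exception {

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Read comma-separated words from a socket
        env.socketTextStream("localhost", 9999)
                .flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
                    @Override
                    public void flatMap(String value, Collector<Tuple2<String, Integer>> out) throws Exception {
                        // Emit (word, 1) for every word on the line
                        String[] words = value.split(",");
                        for (String word : words) {
                            out.collect(new Tuple2<>(word, 1));
                        }
                    }
                })
                // Key by the word (tuple field 0); keyBy(int) is deprecated but still available in 1.15
                .keyBy(0)
                .window(TumblingProcessingTimeWindows.of(Time.seconds(10)))
                .sum(1)
                .print();
        env.execute();
    }
}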

The isChainable check is made in the createChain() method of StreamingJobGraphGenerator.java.

Step into the isChainable(outEdge, streamGraph) method:

    public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) {
        StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);

        return downStreamVertex.getInEdges().size() == 1 && isChainableInput(edge, streamGraph);
    }

Taking the WordCount code as an example, the first edge to be checked is source->flatmap.

downStreamVertex is the downstream StreamNode, here flatmap. The method first checks that flatmap has exactly one incoming edge; once that condition holds, it goes on to isChainableInput(edge, streamGraph).

The areOperatorsChainable() method mainly performs the ChainingStrategy check.
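
To make the ChainingStrategy check concrete, here is a minimal sketch of how the upstream and downstream strategies combine, following the rules listed earlier in this article. It is plain Java and not Flink's actual class: the enum is redeclared locally, and the upstreamIsSource flag is a hypothetical stand-in for "the upstream operator is a source".

```java
/** Sketch of how upstream and downstream chaining strategies combine. Not Flink's actual class. */
public class ChainingStrategySketch {

    /** Local copy of the strategy names, for illustration only. */
    enum ChainingStrategy { ALWAYS, NEVER, HEAD, HEAD_WITH_SOURCES }

    static boolean strategiesAllowChaining(
            ChainingStrategy upstream, ChainingStrategy downstream, boolean upstreamIsSource) {
        // Upstream side: any strategy except NEVER lets a successor be chained after it.
        boolean chainable = upstream != ChainingStrategy.NEVER;

        // Downstream side: decides whether the operator may be appended to its predecessor.
        switch (downstream) {
            case ALWAYS:
                // Keep the verdict from the upstream side.
                break;
            case HEAD_WITH_SOURCES:
                // May only be chained directly behind a source.
                chainable = chainable && upstreamIsSource;
                break;
            case HEAD:
            case NEVER:
            default:
                // A HEAD operator always starts a new chain; NEVER is never chained.
                chainable = false;
                break;
        }
        return chainable;
    }

    public static void main(String[] args) {
        // window (ALWAYS) -> sink (ALWAYS)
        System.out.println(strategiesAllowChaining(
                ChainingStrategy.ALWAYS, ChainingStrategy.ALWAYS, false)); // true
        // Downstream operator marked as HEAD cannot be appended to its predecessor.
        System.out.println(strategiesAllowChaining(
                ChainingStrategy.ALWAYS, ChainingStrategy.HEAD, false)); // false
    }
}
```

Note that HEAD is accepted on the upstream side but rejected on the downstream side: a HEAD operator can begin a chain but cannot be appended to one.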

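For source->flatmap, the edge's outputPartitioner is REBALANCE, which is not a ForwardPartitioner, so source and flatmap cannot form an operator chain. This is most likely because the socket source runs with parallelism 1 while flatmap uses the default parallelism.

For flatmap->window, the partitioner is KeyGroupStreamPartitioner (introduced by keyBy), which is not a ForwardPartitioner either, so these two operators cannot form an operator chain.

For window->sink, the partitioner is FORWARD, so window and sink can be merged into an operator chain.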
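
The walk over the three edges can be replayed with a small simulation. This is only a sketch in plain Java under stated assumptions: the class and method names are hypothetical, the parallelism values (1 for the socket source, 4 for everything else) are made up for illustration, and the rule "forward when parallelism is equal, rebalance otherwise" is my understanding of how an edge gets its partitioner when none is set explicitly. All other chaining conditions are assumed to hold.

```java
import java.util.ArrayList;
import java.util.List;

/** Replays the WordCount edge walk on a simplified linear pipeline. Hypothetical names throughout. */
public class WordCountChainsSketch {

    /** Picks a partitioner name for an edge; 'explicit' is null when the job did not set one. */
    static String choosePartitioner(String explicit, int upParallelism, int downParallelism) {
        if (explicit != null) {
            return explicit;
        }
        return upParallelism == downParallelism ? "FORWARD" : "REBALANCE";
    }

    /** Groups a linear pipeline into chains, starting a new chain at every non-FORWARD edge. */
    static List<List<String>> buildChains(String[] names, int[] parallelism, String[] explicit) {
        List<List<String>> chains = new ArrayList<>();
        List<String> current = new ArrayList<>();
        current.add(names[0]);
        for (int i = 1; i < names.length; i++) {
            String partitioner =
                    choosePartitioner(explicit[i - 1], parallelism[i - 1], parallelism[i]);
            if (!"FORWARD".equals(partitioner)) {
                // The edge breaks the chain: close the current one and start a new one.
                chains.add(current);
                current = new ArrayList<>();
            }
            current.add(names[i]);
        }
        chains.add(current);
        return chains;
    }

    public static void main(String[] args) {
        String[] names = {"source", "flatMap", "window", "sink"};
        int[] parallelism = {1, 4, 4, 4};         // assumed values
        String[] explicit = {null, "HASH", null}; // keyBy sets the partitioner on flatMap->window
        System.out.println(buildChains(names, parallelism, explicit));
        // prints [[source], [flatMap], [window, sink]]
    }
}
```

Only window and sink end up in the same chain, which matches the result of the analysis above.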

Of course, you can also assign operators to different slot sharing groups or call disableChaining() on an individual operator to test the other conditions.

Summary

When the downstream operator has a single input (it is not the target of a union, connect, or join), the edge between the two operators uses a ForwardPartitioner, both operators are in the same slot sharing group, and chaining has not been disabled, the two operators are merged into an operator chain.

Origin blog.csdn.net/qq_24186017/article/details/127027744