Flink task scheduling source code analysis 2 (JobGraph construction and submission source code analysis)

JobGraph: the data structure produced by optimizing the StreamGraph; it is what gets submitted to the JobManager.
Its main abstractions are:
1. JobVertex: after optimization, multiple StreamNodes that satisfy the chaining conditions may be chained together into
a single JobVertex, so a JobVertex contains one or more operators. The input of a JobVertex is a JobEdge, and its
output is an IntermediateDataSet.
2. IntermediateDataSet: represents the output of a JobVertex, that is, the data set produced by operator processing.
Its producer is a JobVertex, and its consumer is a JobEdge.
3. JobEdge: represents a data transmission channel in the job graph. Its source is an IntermediateDataSet, and its
target is a JobVertex. Data flows from the IntermediateDataSet through the JobEdge to the target JobVertex.
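
The producer/consumer links between these three concepts can be sketched with a few plain Java classes. This is only an illustrative model: Vertex, DataSet, Edge and connect are hypothetical stand-ins, not Flink's actual JobVertex, IntermediateDataSet and JobEdge classes, and everything except the links themselves is left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of the JobGraph structure (not Flink classes).
final class JobGraphModelSketch {

    // Stand-in for JobVertex: consumes edges, produces data sets.
    static final class Vertex {
        final String name;
        final List<Edge> inputs = new ArrayList<>();
        final List<DataSet> producedDataSets = new ArrayList<>();

        Vertex(String name) {
            this.name = name;
        }
    }

    // Stand-in for IntermediateDataSet: produced by a vertex, consumed by edges.
    static final class DataSet {
        final Vertex producer;
        final List<Edge> consumers = new ArrayList<>();

        DataSet(Vertex producer) {
            this.producer = producer;
        }
    }

    // Stand-in for JobEdge: its source is a data set, its target is a vertex.
    static final class Edge {
        final DataSet source;
        final Vertex target;

        Edge(DataSet source, Vertex target) {
            this.source = source;
            this.target = target;
        }
    }

    // Link upstream to downstream: create the data set first, then the edge that consumes it.
    static Edge connect(Vertex upstream, Vertex downstream) {
        DataSet dataSet = new DataSet(upstream);
        upstream.producedDataSets.add(dataSet);
        Edge edge = new Edge(dataSet, downstream);
        dataSet.consumers.add(edge);
        downstream.inputs.add(edge);
        return edge;
    }

    public static void main(String[] args) {
        Vertex source = new Vertex("Source -> Map");
        Vertex sink = new Vertex("Keyed Aggregation -> Sink");
        Edge edge = connect(source, sink);
        System.out.println(edge.source.producer.name + " => " + edge.target.name);
    }
}
```

Walking from an Edge back to its producer always passes through a DataSet, which mirrors the "JobVertex -> IntermediateDataSet -> JobEdge -> JobVertex" path described in the list.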

The conversion from StreamGraph to JobGraph also happens in the client, and it does three main things:
⚫ Each StreamNode is converted to a JobVertex.
⚫ Each StreamEdge is converted to a JobEdge.
⚫ An IntermediateDataSet is created to connect each JobEdge with its JobVertex.

Entry point: AbstractJobClusterExecutor.execute
final JobGraph jobGraph = PipelineExecutorUtils.getJobGraph(pipeline, configuration);
-> getJobGraph -> pipelineTranslator.translateToJobGraph -> StreamGraphTranslator.translateToJobGraph
 -> streamGraph.getJobGraph ->  StreamingJobGraphGenerator.createJobGraph
   -> new StreamingJobGraphGenerator(streamGraph, jobID).createJobGraph();
	private JobGraph createJobGraph() {
		preValidate();

		// make sure that all vertices start immediately
		/*TODO In streaming mode, the schedule mode starts all vertices together; this corresponds to the EAGER schedule mode */
		jobGraph.setScheduleMode(streamGraph.getScheduleMode());
		jobGraph.enableApproximateLocalRecovery(streamGraph.getCheckpointConfig().isApproximateLocalRecoveryEnabled());

		// Generate deterministic hashes for the nodes in order to identify them across
		// submission iff they didn't change.
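		// Traverse the StreamGraph breadth-first and generate a hash id for each StreamNode,
		// so that the generated hashes are identical on every submission as long as the submitted topology has not changed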
		Map<Integer, byte[]> hashes = defaultStreamGraphHasher.traverseStreamGraphAndGenerateHashes(streamGraph);

		// Generate legacy version hashes for backwards compatibility
		List<Map<Integer, byte[]>> legacyHashes = new ArrayList<>(legacyStreamGraphHashers.size());
		for (StreamGraphHasher hasher : legacyStreamGraphHashers) {
			legacyHashes.add(hasher.traverseStreamGraphAndGenerateHashes(streamGraph));
		}

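		/* TODO The most important function: generates the JobVertex, JobEdge, etc., and chains as many nodes together as possible */
		setChaining(hashes, legacyHashes);

		/*TODO Also serialize each JobVertex's in-edge set into that JobVertex's StreamConfig (the out-edge set was already written during setChaining)*/
		setPhysicalEdges();

		/*TODO Assign each JobVertex to its SlotSharingGroup by group name, and set the CoLocationGroup for the heads and tails of iterations*/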
		setSlotSharingAndCoLocation();

		setManagedMemoryFraction(
			Collections.unmodifiableMap(jobVertices),
			Collections.unmodifiableMap(vertexConfigs),
			Collections.unmodifiableMap(chainedConfigs),
			id -> streamGraph.getStreamNode(id).getManagedMemoryOperatorScopeUseCaseWeights(),
			id -> streamGraph.getStreamNode(id).getManagedMemorySlotScopeUseCases());

		// TODO Configure checkpointing
		configureCheckpointing();

		jobGraph.setSavepointRestoreSettings(streamGraph.getSavepointRestoreSettings());

		JobGraphUtils.addUserArtifactEntries(streamGraph.getUserArtifacts(), jobGraph);

		// set the ExecutionConfig last when it has been finalized
		try {
			/*TODO Serialize the StreamGraph's ExecutionConfig into the JobGraph's configuration*/
			jobGraph.setExecutionConfig(streamGraph.getExecutionConfig());
		}
		catch (IOException e) {
			throw new IllegalConfigurationException("Could not serialize the ExecutionConfig." +
					"This indicates that non-serializable types (like custom serializers) were registered");
		}

		return jobGraph;
	}
  -> setChaining() 
		/*TODO Build node chains starting from the sources*/
		for (OperatorChainInfo info : initialEntryPoints) {
			/*TODO Build the node chain and return the physical out-edges of the current node; when startNodeId != currentNodeId, currentNode is a child node within the chain*/
			createChain(
					info.getStartNodeId(),
					1,  // operators start at position 1 because 0 is for chained source inputs
					info,
					chainEntryPoints);
		}
  --> createChain
  	private List<StreamEdge> createChain(
			final Integer currentNodeId,
			final int chainIndex,
			final OperatorChainInfo chainInfo,
			final Map<Integer, OperatorChainInfo> chainEntryPoints) {

		Integer startNodeId = chainInfo.getStartNodeId();
		if (!builtVertices.contains(startNodeId)) {
			/*TODO Transitive out-edge collection used to generate the final JobEdges; note that it excludes edges internal to the chain*/
			List<StreamEdge> transitiveOutEdges = new ArrayList<StreamEdge>();

			List<StreamEdge> chainableOutputs = new ArrayList<StreamEdge>();
			List<StreamEdge> nonChainableOutputs = new ArrayList<StreamEdge>();

			StreamNode currentNode = streamGraph.getStreamNode(currentNodeId);

			/*TODO Split the out-edges of the current node into two groups: chainable and nonChainable*/
			for (StreamEdge outEdge : currentNode.getOutEdges()) {
				// TODO isChainable
				if (isChainable(outEdge, streamGraph)) {
					chainableOutputs.add(outEdge);
				} else {
					nonChainableOutputs.add(outEdge);
				}
			}

			for (StreamEdge chainable : chainableOutputs) {
				transitiveOutEdges.addAll(
						createChain(chainable.getTargetId(), chainIndex + 1, chainInfo, chainEntryPoints));
			}

			/*TODO Recursively call createChain*/
			for (StreamEdge nonChainable : nonChainableOutputs) {
				transitiveOutEdges.add(nonChainable);
				createChain(
						nonChainable.getTargetId(),
						1, // operators start at position 1 because 0 is for chained source inputs
						chainEntryPoints.computeIfAbsent(
							nonChainable.getTargetId(),
							(k) -> chainInfo.newChain(nonChainable.getTargetId())),
						chainEntryPoints);
			}

			/*TODO Generate the display name of the current node, e.g. "Keyed Aggregation -> Sink: Unnamed"*/
			chainedNames.put(currentNodeId, createChainedName(currentNodeId, chainableOutputs, Optional.ofNullable(chainEntryPoints.get(currentNodeId))));
			chainedMinResources.put(currentNodeId, createChainedMinResources(currentNodeId, chainableOutputs));
			chainedPreferredResources.put(currentNodeId, createChainedPreferredResources(currentNodeId, chainableOutputs));

			OperatorID currentOperatorId = chainInfo.addNodeToChain(currentNodeId, chainedNames.get(currentNodeId));

			if (currentNode.getInputFormat() != null) {
				getOrCreateFormatContainer(startNodeId).addInputFormat(currentOperatorId, currentNode.getInputFormat());
			}

			if (currentNode.getOutputFormat() != null) {
				getOrCreateFormatContainer(startNodeId).addOutputFormat(currentOperatorId, currentNode.getOutputFormat());
			}

			/*TODO If the current node is the start node, create the JobVertex directly and return its StreamConfig; otherwise create an empty StreamConfig first */
			StreamConfig config = currentNodeId.equals(startNodeId)
					? createJobVertex(startNodeId, chainInfo)
					: new StreamConfig(new Configuration());

			/*TODO Set the JobVertex's StreamConfig; this essentially serializes the configuration in the StreamNode into the StreamConfig.*/
			setVertexConfig(currentNodeId, config, chainableOutputs, nonChainableOutputs, chainInfo.getChainedSources());

			if (currentNodeId.equals(startNodeId)) {
				/*TODO If this is the start node of a chain, mark it as chain start (a node that is not part of any chain is also marked as chain start)*/
				config.setChainStart();
				config.setChainIndex(chainIndex);
				config.setOperatorName(streamGraph.getStreamNode(currentNodeId).getOperatorName());

				/*TODO Connect the current node (headOfChain) to all of its out-edges*/
				for (StreamEdge edge : transitiveOutEdges) {
					/*TODO Build a JobEdge from the StreamEdge and create an IntermediateDataSet that connects the JobVertex and the JobEdge*/
					connect(startNodeId, edge);
				}

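				/*TODO Write the physical out-edges into the config; they are used at deployment time*/
				config.setOutEdgesInOrder(transitiveOutEdges);
				/*TODO Write the StreamConfigs of all child nodes in the chain into the CHAINED_TASK_CONFIG entry of the headOfChain node*/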
				config.setTransitiveChainedTaskConfigs(chainedConfigs.get(startNodeId));

			} else {
				/*TODO This is a child node within the chain*/
				chainedConfigs.computeIfAbsent(startNodeId, k -> new HashMap<Integer, StreamConfig>());

				config.setChainIndex(chainIndex);
				StreamNode node = streamGraph.getStreamNode(currentNodeId);
				config.setOperatorName(node.getOperatorName());
				/*TODO Add the current node's StreamConfig to the config collection of this chain*/
				chainedConfigs.get(startNodeId).put(currentNodeId, config);
			}

			config.setOperatorID(currentOperatorId);

			if (chainableOutputs.isEmpty()) {
				config.setChainEnd();
			}
			/*TODO Return the set of out-edges that lead outside the chain*/
			return transitiveOutEdges;

		} else {
			return new ArrayList<>();
		}
	}

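	--> isChainable(): conditions under which an edge can be chained
	// 1. The downstream node has an in-degree of 1 (it receives no input from any other node)
	downStreamVertex.getInEdges().size() == 1;
	// 2. The upstream and downstream nodes are in the same slot sharing group
	upStreamVertex.isSameSlotSharingGroup(downStreamVertex);
	// 3. Neither the upstream nor the downstream operator is null
	!(downStreamOperator == null || upStreamOperator == null);
	// 4. The upstream chaining strategy is ALWAYS or HEAD (HEAD can chain only with downstream operators, not upstream ones; sources default to HEAD)
	upStreamOperator.getChainingStrategy() != ChainingStrategy.NEVER;
	// 5. The downstream chaining strategy is ALWAYS (can chain with both upstream and downstream; map, flatMap, filter, etc. default to ALWAYS)
	downStreamOperator.getChainingStrategy() == ChainingStrategy.ALWAYS;
	// 6. The partitioner between the two nodes is a ForwardPartitioner
	(edge.getPartitioner() instanceof ForwardPartitioner);
	// 7. The shuffle mode between the two operators is not BATCH
	edge.getShuffleMode() != ShuffleMode.BATCH;
	// 8. The upstream and downstream parallelism are the same
	upStreamVertex.getParallelism() == downStreamVertex.getParallelism();
	// 9. The user has not disabled chaining
	streamGraph.isChainingEnabled();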
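
The conditions above can be combined into a single predicate. The following is a minimal sketch on simplified, hypothetical types (Node and Strategy are not Flink classes), assuming the edge properties are passed in as plain booleans; condition 3 (non-null operators) is omitted because the sketch has no operator objects.

```java
// Hypothetical, simplified sketch of the chaining check (not Flink's implementation).
final class ChainableSketch {

    // Stand-in for the chaining strategies mentioned in the list above.
    enum Strategy { ALWAYS, HEAD, NEVER }

    // Stand-in for a StreamNode, reduced to the fields the check needs.
    static final class Node {
        final String slotSharingGroup;
        final Strategy strategy;
        final int parallelism;
        final int inDegree;

        Node(String slotSharingGroup, Strategy strategy, int parallelism, int inDegree) {
            this.slotSharingGroup = slotSharingGroup;
            this.strategy = strategy;
            this.parallelism = parallelism;
            this.inDegree = inDegree;
        }
    }

    // Returns true only if every listed condition holds for the edge up -> down.
    static boolean isChainable(
            Node up,
            Node down,
            boolean forwardPartitioner,
            boolean batchShuffle,
            boolean chainingEnabled) {
        return down.inDegree == 1                                    // condition 1
                && up.slotSharingGroup.equals(down.slotSharingGroup) // condition 2
                && up.strategy != Strategy.NEVER                     // condition 4
                && down.strategy == Strategy.ALWAYS                  // condition 5
                && forwardPartitioner                                // condition 6
                && !batchShuffle                                     // condition 7
                && up.parallelism == down.parallelism                // condition 8
                && chainingEnabled;                                  // condition 9
    }

    public static void main(String[] args) {
        Node source = new Node("default", Strategy.HEAD, 2, 0);
        Node map = new Node("default", Strategy.ALWAYS, 2, 1);
        Node widerMap = new Node("default", Strategy.ALWAYS, 4, 1);
        System.out.println(isChainable(source, map, true, false, true));
        System.out.println(isChainable(source, widerMap, true, false, true));
    }
}
```

In this sketch a HEAD source chains with a downstream ALWAYS map of equal parallelism, while a parallelism mismatch alone is enough to break the chain.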

Summary
Each JobVertex has a corresponding serializable StreamConfig, which is sent to the JobManager and the
TaskManager. When the TaskManager finally starts the Task, it deserializes the configuration it needs from this
StreamConfig, including the StreamOperator that contains the user code.
setChaining calls createChain on each source, and createChain recurses into the downstream nodes to build the
node chains. createChain examines the out-edges of the current node, splits them into two groups, chainable and
nonChainable, according to the isChainable conditions for operator chains, and calls itself recursively on both groups.
It then serializes the configuration in the StreamNode into a StreamConfig. If the current node is not a child node in
a chain (it is the head of the chain), a JobVertex is constructed and connected to its JobEdges. If it is a child node
in a chain, its StreamConfig is added to the config collection of that chain. So for a node chain, only the headOfChain
node generates a JobVertex; the remaining nodes are serialized as StreamConfigs and stored in the
CHAINED_TASK_CONFIG configuration item of the headOfChain. The corresponding chained operators are only retrieved and
instantiated at deployment time.
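
The recursion described in this summary can be tried on a toy graph. The sketch below is an assumption-laden simplification, not Flink's createChain: nodes are plain integers, chainability is declared per edge instead of being computed, and the class and field names (CreateChainSketch, chains, builtHeads) are hypothetical. It only shows how nodes get grouped under a chain head and which edges leave each chain.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy version of the chain-building recursion (not Flink code).
final class CreateChainSketch {
    // node id -> downstream node ids
    private final Map<Integer, List<Integer>> outEdges = new HashMap<>();
    // edges declared chainable, encoded as "source->target"
    private final Set<String> chainableEdges = new HashSet<>();
    // chain heads that are already fully built (plays the role of builtVertices)
    private final Set<Integer> builtHeads = new HashSet<>();
    // chain head id -> ids of all nodes in that chain, head first
    final Map<Integer, List<Integer>> chains = new LinkedHashMap<>();

    void addEdge(int from, int to, boolean chainable) {
        outEdges.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        if (chainable) {
            chainableEdges.add(from + "->" + to);
        }
    }

    // Returns the edges that leave the chain headed by startNodeId.
    List<String> createChain(int currentNodeId, int startNodeId) {
        if (builtHeads.contains(startNodeId)) {
            return new ArrayList<>();
        }
        List<String> transitiveOutEdges = new ArrayList<>();
        List<Integer> chainable = new ArrayList<>();
        List<Integer> nonChainable = new ArrayList<>();
        for (int target : outEdges.getOrDefault(currentNodeId, Collections.emptyList())) {
            if (chainableEdges.contains(currentNodeId + "->" + target)) {
                chainable.add(target);
            } else {
                nonChainable.add(target);
            }
        }
        // Record membership before recursing so the head is listed first.
        chains.computeIfAbsent(startNodeId, k -> new ArrayList<>()).add(currentNodeId);
        for (int target : chainable) {
            // Same chain: keep the start node and collect the child's outgoing edges.
            transitiveOutEdges.addAll(createChain(target, startNodeId));
        }
        for (int target : nonChainable) {
            // Edge leaves the chain; its target becomes the head of a new chain.
            transitiveOutEdges.add(currentNodeId + "->" + target);
            createChain(target, target);
        }
        if (currentNodeId == startNodeId) {
            builtHeads.add(startNodeId); // the head is where a JobVertex would be created
        }
        return transitiveOutEdges;
    }

    public static void main(String[] args) {
        CreateChainSketch sketch = new CreateChainSketch();
        sketch.addEdge(1, 2, true);  // source -> map, chainable
        sketch.addEdge(2, 3, false); // map -> keyed aggregation, not chainable
        sketch.addEdge(3, 4, true);  // keyed aggregation -> sink, chainable
        System.out.println(sketch.createChain(1, 1));
        System.out.println(sketch.chains);
    }
}
```

With the four-node pipeline in main, nodes 1 and 2 end up under head 1, nodes 3 and 4 under head 3, and the only edge crossing a chain boundary is 2->3, which is the edge that would become a JobEdge.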

Origin blog.csdn.net/m0_46449152/article/details/113789931