Flink Job Startup on the Driver Side (Source Code Analysis)

The whole Flink job starts from the user's call to Environment.execute() on the driver side, where the user's operators are converted into a StreamGraph.

The resulting StreamGraph is then turned into a JobGraph, which is submitted to the JobManager through the corresponding RPC interface.

The JobManager converts it into an ExecutionGraph and calls executionGraph.deploy(), which generates TDDs (TaskDeploymentDescriptors) and distributes them to the TaskManagers; with that, the whole job is up and running.

Let's trace this from the driver side, taking the user's Environment.execute() method as the entry point.
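For orientation, here is a minimal sketch of such a user program (the job itself is hypothetical; the only part that matters for this walkthrough is the env.execute() call at the end):

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SocketUppercaseJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.socketTextStream("localhost", 9999)   // source operator
           .map(String::toUpperCase)              // map operator
           .print();                              // sink operator

        // Nothing has run yet: the calls above only registered transformations.
        // execute() is where the StreamGraph is built and the job is submitted.
        env.execute("socket-uppercase");
    }
}
```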

The Environment here comes in two implementations:

RemoteStreamEnvironment

LocalStreamEnvironment

Since the local mode is comparatively simple, we will not cover it here; the main thing is to look at the execute() method of RemoteStreamEnvironment.
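Its body reads roughly as below (a simplified sketch based on the Flink source of about the 1.7/1.8 era this post describes; exact signatures vary by version):

```java
// Simplified sketch of RemoteStreamEnvironment.execute(String jobName).
@Override
public JobExecutionResult execute(String jobName) throws ProgramInvocationException {
    // 1. Turn the user's registered transformations into a logical StreamGraph.
    StreamGraph streamGraph = getStreamGraph();
    streamGraph.setJobName(jobName);
    transformations.clear();
    // 2. Convert it to a JobGraph and submit it to the remote cluster.
    return executeRemotely(streamGraph, jarFiles);
}
```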

We can see that it first obtains the StreamGraph; let's look at the concrete implementation of that step.

A list of transformations is passed in here, which contains all of the user's operators.

This is where all of the user's operators are traversed to generate the StreamGraph; each operator visited is converted into a node of the logical StreamGraph.

At (1) it recurses into the inputs until every input has been transformed, and then obtains the upstream ids.

The operator is then added to the StreamGraph as a node via the addNode() call; the node carries information such as the output type, the degree of parallelism, and the slot sharing group.

Finally, the collected upstream ids are traversed and the corresponding edges are added to the StreamGraph being created.
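Put together, the traversal looks roughly like the following sketch (names condensed from StreamGraphGenerator; signatures are simplified, so treat this as annotated pseudocode rather than the exact source):

```java
private Collection<Integer> transform(StreamTransformation<?> transform) {
    if (alreadyTransformed.containsKey(transform)) {
        return alreadyTransformed.get(transform);  // each operator is visited once
    }
    // ... dispatch on the concrete type: SourceTransformation,
    // OneInputTransformation, TwoInputTransformation, PartitionTransformation, ...
}

private <IN, OUT> Collection<Integer> transformOneInputTransform(OneInputTransformation<IN, OUT> transform) {
    // (1) Recurse into the input first, so every upstream node already exists.
    Collection<Integer> inputIds = transform(transform.getInput());

    // Add this operator to the StreamGraph as a node, carrying the operator
    // itself, its input/output types, name, parallelism, slot group, etc.
    streamGraph.addOperator(transform.getId(), slotSharingGroup,
            transform.getOperator(), transform.getInputType(),
            transform.getOutputType(), transform.getName());
    streamGraph.setParallelism(transform.getId(), transform.getParallelism());

    // Finally, add an edge from every upstream id to this new node.
    for (Integer inputId : inputIds) {
        streamGraph.addEdge(inputId, transform.getId(), 0);
    }
    return Collections.singleton(transform.getId());
}
```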

At this point the StreamGraph has been created.

Back at the starting point: once the StreamGraph has been created, it is passed into the executeRemotely(streamGraph, jarFiles) method, and this is where the logical StreamGraph is converted into a JobGraph.

Inside, a RestClusterClient is created.

We can see that the StreamGraph is converted into a JobGraph through the getJobGraph() method.

Then submitJob() submits this JobGraph to the JobManager.
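Condensed, the submission path inside executeRemotely() looks roughly like this (a simplified sketch; the real signatures take additional arguments and differ across Flink versions):

```java
// Create a REST-based client for talking to the remote cluster.
ClusterClient<?> client = new RestClusterClient<>(configuration, "RemoteStreamEnvironment");

// StreamGraph -> JobGraph: the logical graph the JobManager understands.
JobGraph jobGraph = getJobGraph(streamGraph, jarFiles);

// Ship the JobGraph (plus the user jars) to the JobManager over REST/RPC.
client.submitJob(jobGraph, userCodeClassLoader);
```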

Let's look at how the StreamGraph is converted into the JobGraph.

Follow the getJobGraph() method down.

The primary conversion logic lives in the createJobGraph() method.
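Its high-level shape is roughly the following (a simplified outline; the helper names follow the Flink source of about this era and may differ in other versions):

```java
// Rough outline of StreamingJobGraphGenerator.createJobGraph().
private JobGraph createJobGraph() {
    // Breadth-first: generate a deterministic hash for every StreamNode, so
    // an unchanged program keeps the same operator ids across submissions.
    Map<Integer, byte[]> hashes =
            defaultStreamGraphHasher.traverseStreamGraphAndGenerateHashes(streamGraph);

    // Chain operators wherever the chaining conditions hold; each chain
    // collapses into a single JobVertex.
    setChaining(hashes, legacyHashes, chainedOperatorHashes);

    // Wire the physical edges between the resulting vertices, then apply
    // slot sharing and checkpoint configuration.
    setPhysicalEdges();
    setSlotSharingAndCoLocation();
    configureCheckpointing();

    return jobGraph;
}
```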

First, a breadth-first traversal over all nodes of the StreamGraph generates a hash value for each operator. Why generate a hash for every operator?

Because each operator needs a hash that uniquely identifies it. Flink uses these hashes when recovering each operator's state from a checkpoint: as long as the user code is not modified, the hash values do not change.
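This is also why it is good practice to give stateful operators an explicit uid(), which pins the identifier regardless of how the rest of the graph changes. A small sketch, reusing the hypothetical job from above:

```java
// An explicit uid pins the operator's identifier, so state can still be
// matched to the operator when restoring from a checkpoint/savepoint,
// even if surrounding operators are added or removed.
env.socketTextStream("localhost", 9999)
   .map(String::toUpperCase)
   .uid("uppercase-map")   // used in place of the auto-generated hash
   .print();
```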

Next, Flink chains upstream and downstream operators together wherever the chaining conditions are satisfied; this happens in createChain().

The isChainable() method determines whether the chaining conditions hold (see the sketch after this list):

1. The downstream node has only one input edge.

2. The downstream operator is not null.

3. The upstream operator is not null.

4. The upstream and downstream nodes are in the same slot sharing group.

5. The downstream operator's chaining strategy is ALWAYS.

6. The upstream operator's chaining strategy is HEAD or ALWAYS.

7. The edge's partitioner is a ForwardPartitioner.

8. The upstream and downstream parallelism are the same.

9. Operator chaining is enabled in the StreamGraph.
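Condensed, the check reads roughly like this (simplified from the Flink source of about this era):

```java
// Sketch of StreamingJobGraphGenerator.isChainable(edge, streamGraph).
public static boolean isChainable(StreamEdge edge, StreamGraph streamGraph) {
    StreamNode upStreamVertex = streamGraph.getSourceVertex(edge);
    StreamNode downStreamVertex = streamGraph.getTargetVertex(edge);

    StreamOperator<?> headOperator = upStreamVertex.getOperator();
    StreamOperator<?> outOperator = downStreamVertex.getOperator();

    return downStreamVertex.getInEdges().size() == 1                      // (1)
        && outOperator != null                                            // (2)
        && headOperator != null                                           // (3)
        && upStreamVertex.isSameSlotSharingGroup(downStreamVertex)        // (4)
        && outOperator.getChainingStrategy() == ChainingStrategy.ALWAYS   // (5)
        && (headOperator.getChainingStrategy() == ChainingStrategy.HEAD
            || headOperator.getChainingStrategy() == ChainingStrategy.ALWAYS) // (6)
        && (edge.getPartitioner() instanceof ForwardPartitioner)          // (7)
        && upStreamVertex.getParallelism() == downStreamVertex.getParallelism() // (8)
        && streamGraph.isChainingEnabled();                               // (9)
}
```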

 

The StreamNodes that can be chained are then merged together, each chain becoming a single JobVertex of the JobGraph.

The RestClusterClient then sends this JobGraph through the RPC interface mentioned above to the Dispatcher on the JobManager side.

With that, the driver side's part in starting the whole job is finished.

 

To sum up:

  On the driver side, the operators the user created are turned into a StreamGraph, which contains the edges and vertices along with information such as output types and degrees of parallelism.

  Then, vertices that satisfy the chaining conditions are chained together, and the StreamGraph is converted into a JobGraph.

  StreamEdges become JobEdges, and chained StreamNodes become JobVertices.

  Finally, the entire JobGraph is submitted to the JobManager via RPC.


Original post: www.cnblogs.com/ljygz/p/11419943.html