The entire Flink job starts from the execute() method of the environment in the end user's driver program: the user's operators are converted into a StreamGraph.
The resulting StreamGraph is converted into a JobGraph, and the JobGraph is submitted to the JobManager through the corresponding RPC interface.
The JobManager converts it into an ExecutionGraph, calls executionGraph.deploy(), generates TDDs (TaskDeploymentDescriptors) and distributes them to the TaskManagers, and with that the entire job is started.
On the driver side, the entry point is the end user's Environment.execute() method.
The Environment here is one of:
RemoteStreamEnvironment
LocalStreamEnvironment
Since the local mode is relatively simple, we will not cover it here; let's look at the execute method of RemoteStreamEnvironment.
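As a rough illustration, the two environments differ mainly in where execute() sends the job. The classes below are simplified stand-ins of my own, not the real Flink classes:

```java
// Simplified stand-ins, not the real Flink classes: both environments expose
// execute(), but only the remote one carries a target host and port.
interface StreamEnvironment {
    String execute(String jobName);
}

class LocalStreamEnvironment implements StreamEnvironment {
    @Override
    public String execute(String jobName) {
        // the real class would start an in-process mini cluster here
        return "local:" + jobName;
    }
}

class RemoteStreamEnvironment implements StreamEnvironment {
    private final String host;
    private final int port;

    RemoteStreamEnvironment(String host, int port) {
        this.host = host;
        this.port = port;
    }

    @Override
    public String execute(String jobName) {
        // the real class builds the StreamGraph and calls executeRemotely(...)
        return "remote:" + host + ":" + port + ":" + jobName;
    }
}

public class EnvironmentDemo {
    public static void main(String[] args) {
        StreamEnvironment env = new RemoteStreamEnvironment("jobmanager", 8081);
        System.out.println(env.execute("wordcount"));
    }
}
```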
We can see that it first obtains the StreamGraph; let's look at the concrete implementation.
A list of transformations is passed in, containing all of the user's operators.
The code traverses all of the operators and generates the StreamGraph; each operator visited is converted into a node of the logical StreamGraph.
For each operator it recurses into the inputs until an input has already been transformed, and then obtains the upstream IDs.
The operator is then added to the StreamGraph as a node by calling the addNode() method; the node carries information such as the downstream output type, the parallelism, and the slot sharing group.
Finally it traverses the upstream IDs and adds the edges to the StreamGraph being created.
At this point the StreamGraph has been created.
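The traversal described above can be sketched roughly like this. The Transformation and graph-builder types are hypothetical stand-ins, not Flink's actual StreamGraphGenerator:

```java
import java.util.*;

// Hypothetical types, not Flink's actual StreamGraphGenerator: each
// transformation is visited once, its inputs are transformed first, then the
// operator is added as a node and edges are drawn from the upstream ids.
class Transformation {
    final int id;
    final String name;
    final List<Transformation> inputs;

    Transformation(int id, String name, List<Transformation> inputs) {
        this.id = id;
        this.name = name;
        this.inputs = inputs;
    }
}

public class StreamGraphDemo {
    final Map<Integer, String> nodes = new LinkedHashMap<>(); // id -> operator name
    final List<int[]> edges = new ArrayList<>();              // {upstreamId, id}
    private final Set<Integer> alreadyTransformed = new HashSet<>();

    // Returns the id of the node created for this transformation.
    int transform(Transformation t) {
        if (alreadyTransformed.contains(t.id)) {
            return t.id; // this input has already been transformed
        }
        List<Integer> upstreamIds = new ArrayList<>();
        for (Transformation input : t.inputs) {
            upstreamIds.add(transform(input)); // recurse into the inputs first
        }
        nodes.put(t.id, t.name); // addNode(): the operator becomes a node
        for (int upstream : upstreamIds) {
            edges.add(new int[] {upstream, t.id}); // finally add the upstream edges
        }
        alreadyTransformed.add(t.id);
        return t.id;
    }

    public static void main(String[] args) {
        Transformation source = new Transformation(1, "Source", List.of());
        Transformation map = new Transformation(2, "Map", List.of(source));
        Transformation sink = new Transformation(3, "Sink", List.of(map));
        StreamGraphDemo graph = new StreamGraphDemo();
        graph.transform(sink);
        System.out.println(graph.nodes); // {1=Source, 2=Map, 3=Sink}
    }
}
```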
Back at the starting point: once the StreamGraph has been created, it is passed into the executeRemotely(streamGraph, jarFiles) method, which converts the StreamGraph into the logical JobGraph.
Inside, a RestClusterClient is created.
We can see here that the StreamGraph is converted into a JobGraph by the getJobGraph method.
Then submitJob submits this JobGraph to the JobManager.
Let's look at how the StreamGraph is converted into the JobGraph.
First, the getJobGraph method.
createJobGraph is the method that contains the main conversion logic.
It first does a breadth-first traversal of all the StreamGraph nodes and generates a hash value for every operator. Why generate a hash for each operator?
Because each operator needs a unique identifier: when Flink recovers from a checkpoint, the hash value of an operator does not change as long as the user code has not been modified, so state can be mapped back to the right operator.
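The idea can be sketched as follows. This is a simplified stand-in for Flink's actual StreamGraphHasher (which also factors in chaining and other details): visit the nodes in breadth-first order and derive each node's hash from its own stable properties plus the hashes of its inputs, so the value is reproducible as long as code and topology are unchanged:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.*;

// A simplified stand-in for Flink's StreamGraphHasher: visit the nodes in
// breadth-first order from the sources and derive each node's hash from its
// own properties plus the hashes of its inputs, so the value survives
// restarts as long as the user code and topology are unchanged.
public class NodeHashDemo {

    // upstream: node id -> ids of its input nodes; names: node id -> operator name
    static Map<Integer, String> hashNodes(Map<Integer, List<Integer>> upstream,
                                          Map<Integer, String> names) throws Exception {
        Map<Integer, Integer> indegree = new HashMap<>();
        Map<Integer, List<Integer>> downstream = new HashMap<>();
        for (int id : names.keySet()) {
            indegree.put(id, upstream.getOrDefault(id, List.of()).size());
        }
        for (Map.Entry<Integer, List<Integer>> e : upstream.entrySet()) {
            for (int in : e.getValue()) {
                downstream.computeIfAbsent(in, k -> new ArrayList<>()).add(e.getKey());
            }
        }
        // breadth-first from the sources; a node is hashed once all inputs are hashed
        Deque<Integer> queue = new ArrayDeque<>();
        indegree.forEach((id, deg) -> { if (deg == 0) queue.add(id); });
        Map<Integer, String> hashes = new LinkedHashMap<>();
        while (!queue.isEmpty()) {
            int id = queue.poll();
            StringBuilder material = new StringBuilder(names.get(id));
            for (int in : upstream.getOrDefault(id, List.of())) {
                material.append(':').append(hashes.get(in)); // mix in upstream hashes
            }
            hashes.put(id, md5Hex(material.toString()));
            for (int next : downstream.getOrDefault(id, List.of())) {
                if (indegree.merge(next, -1, Integer::sum) == 0) {
                    queue.add(next);
                }
            }
        }
        return hashes;
    }

    static String md5Hex(String s) throws Exception {
        byte[] digest = MessageDigest.getInstance("MD5")
                .digest(s.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder();
        for (byte b : digest) hex.append(String.format("%02x", b));
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        Map<Integer, List<Integer>> upstream = Map.of(1, List.of(), 2, List.of(1));
        Map<Integer, String> names = Map.of(1, "Source", 2, "Map");
        System.out.println(hashNodes(upstream, names));
    }
}
```

Note how changing an upstream operator also changes every downstream hash, which is exactly why unmodified user code keeps its checkpoint state mappable.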
Next, Flink chains operators together with their downstream operators when the chaining conditions are satisfied; this happens in createChain.
The isChainable() method decides whether the chaining conditions hold:
1. The downstream node has only one input edge.
2. The downstream node's operator is not null.
3. The upstream node's operator is not null.
4. Upstream and downstream are in the same slot sharing group.
5. The downstream operator's chaining strategy is ALWAYS.
6. The upstream operator's chaining strategy is HEAD or ALWAYS.
7. The edge's partitioner is a ForwardPartitioner.
8. Upstream and downstream have the same parallelism.
9. Operator chaining is enabled in the StreamGraph (the user code has not disabled it).
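The nine conditions can be written as a single predicate. The types below are simplified stand-ins of my own for Flink's StreamNode and StreamEdge, not the real classes:

```java
import java.util.Objects;

// Simplified stand-ins for Flink's StreamNode/StreamEdge; the real check
// lives in StreamingJobGraphGenerator.isChainable.
enum ChainingStrategy { ALWAYS, HEAD, NEVER }

class StreamNode {
    final boolean hasOperator;       // conditions 2 and 3
    final ChainingStrategy strategy; // conditions 5 and 6
    final String slotSharingGroup;   // condition 4
    final int parallelism;           // condition 8
    final int inEdgeCount;           // condition 1

    StreamNode(boolean hasOperator, ChainingStrategy strategy,
               String slotSharingGroup, int parallelism, int inEdgeCount) {
        this.hasOperator = hasOperator;
        this.strategy = strategy;
        this.slotSharingGroup = slotSharingGroup;
        this.parallelism = parallelism;
        this.inEdgeCount = inEdgeCount;
    }
}

class StreamEdge {
    final StreamNode upstream;
    final StreamNode downstream;
    final boolean forwardPartitioner; // condition 7

    StreamEdge(StreamNode upstream, StreamNode downstream, boolean forwardPartitioner) {
        this.upstream = upstream;
        this.downstream = downstream;
        this.forwardPartitioner = forwardPartitioner;
    }
}

public class IsChainableDemo {
    static boolean isChainable(StreamEdge edge, boolean chainingEnabled) {
        StreamNode up = edge.upstream;
        StreamNode down = edge.downstream;
        return down.inEdgeCount == 1                                          // 1
                && down.hasOperator                                           // 2
                && up.hasOperator                                             // 3
                && Objects.equals(up.slotSharingGroup, down.slotSharingGroup) // 4
                && down.strategy == ChainingStrategy.ALWAYS                   // 5
                && (up.strategy == ChainingStrategy.HEAD
                        || up.strategy == ChainingStrategy.ALWAYS)            // 6
                && edge.forwardPartitioner                                    // 7
                && up.parallelism == down.parallelism                         // 8
                && chainingEnabled;                                           // 9
    }

    public static void main(String[] args) {
        StreamNode map = new StreamNode(true, ChainingStrategy.ALWAYS, "default", 4, 1);
        StreamNode filter = new StreamNode(true, ChainingStrategy.ALWAYS, "default", 4, 1);
        System.out.println(isChainable(new StreamEdge(map, filter, true), true)); // true
    }
}
```

A single false condition, e.g. a keyBy (non-forward partitioner) or a parallelism change between the two operators, is enough to break the chain.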
The StreamNodes that can be chained are then chained together to become the JobVertices of the JobGraph.
The RestClusterClient then sends this JobGraph through the corresponding RPC interface to the Dispatcher of the JobManager.
With that, the driver side's work of starting the entire job is finished.
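Collapsing chained nodes into vertices can be sketched with a small union-find: nodes joined by a chainable edge end up in the same group, and each group becomes one JobVertex. This is a toy model, not Flink's actual createChain logic:

```java
import java.util.*;

// A toy model, not Flink's actual createChain logic: nodes joined by a
// chainable edge are merged into one group, and each group becomes a JobVertex.
public class ChainMergeDemo {

    // edges: {upstreamId, downstreamId}; chainable[i] says whether edges[i]
    // satisfied isChainable; returns the number of resulting JobVertices.
    static int countJobVertices(int nodeCount, int[][] edges, boolean[] chainable) {
        int[] parent = new int[nodeCount];
        for (int i = 0; i < nodeCount; i++) parent[i] = i;
        for (int i = 0; i < edges.length; i++) {
            if (chainable[i]) union(parent, edges[i][0], edges[i][1]);
        }
        Set<Integer> representatives = new HashSet<>();
        for (int i = 0; i < nodeCount; i++) representatives.add(find(parent, i));
        return representatives.size(); // one JobVertex per chain
    }

    static int find(int[] parent, int x) {
        while (parent[x] != x) x = parent[x] = parent[parent[x]]; // path halving
        return x;
    }

    static void union(int[] parent, int a, int b) {
        parent[find(parent, a)] = find(parent, b);
    }

    public static void main(String[] args) {
        // source -> map is chainable, map -> sink is not (e.g. a rebalance)
        int[][] edges = {{0, 1}, {1, 2}};
        boolean[] chainable = {true, false};
        System.out.println(countJobVertices(3, edges, chainable)); // 2
    }
}
```

The non-chainable edges that remain between groups are what become the JobEdges of the JobGraph.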
To sum up:
On the driver side, the operators created by the end user become a StreamGraph, which contains the edges and vertices along with information such as the downstream output types and the parallelism.
Then, for the vertices that satisfy the chaining conditions, the StreamGraph chains them together and is converted into a JobGraph:
StreamEdges become JobEdges, and StreamNodes chained together become a JobVertex.
Finally, the entire JobGraph is submitted to the JobManager via RPC.