Flink | Runtime architecture

 

1. Flink runtime components

 Job Manager (JobManager)

 Task Manager (TaskManager)

 Resource Manager (ResourceManager)

 Dispatcher

2. The job submission process

   

 

   

  After a Flink job is submitted, the Client uploads the Flink Jar package and configuration to HDFS and then submits the job to the YARN ResourceManager.

The YARN ResourceManager allocates Container resources and notifies the corresponding NodeManager to start the ApplicationMaster. After the ApplicationMaster starts, it loads the Flink Jar package and configuration, builds the environment, and then starts the JobManager. Next, the ApplicationMaster applies to the ResourceManager for the resources needed to start the TaskManagers, and the ResourceManager allocates Container resources.

The ApplicationMaster then notifies the NodeManagers on the nodes holding those resources to start the TaskManagers. Each NodeManager loads the Flink Jar package and configuration, builds the environment, and starts a TaskManager. Once started, the TaskManager sends heartbeats to the JobManager and waits for the JobManager to assign tasks.

3. Task scheduling principles

   

  The Client is not part of the runtime and does not execute the program; it is used to prepare the dataflow (JobGraph) and send it to the Master (the JobManager). After that, the client can disconnect, or keep the connection open to receive the computation results.

When a Flink cluster starts, it first starts a JobManager and one or more TaskManagers. The Client submits the job to the JobManager, the JobManager then schedules tasks to the individual TaskManagers for execution, and the TaskManagers report heartbeats and statistics back to the JobManager. Data is transferred between TaskManagers in the form of streams. All three of the above are independent JVM processes.

Client  the client that submits the Job; it can run on any machine that can communicate with the JobManager. After submitting the Job, the Client process may exit (typical for streaming jobs), or it may stay alive and wait for the results to be returned.

JobManager  responsible for scheduling the Job and coordinating the Tasks to do checkpoints, much like the Nimbus role in Storm. After receiving the Job, the JAR package, and other resources from the Client, it generates an optimized execution plan and schedules the Tasks to the individual TaskManagers for execution.

TaskManager  the number of slots (Slot) is configured at startup; each slot can run one Task, and a Task is a thread. The TaskManager receives the Tasks to be deployed from the JobManager; after deployment it establishes Netty connections with its upstream tasks, receives data, and processes it.

 Execution graphs:

  The Flink execution graph can be divided into four layers: StreamGraph -> JobGraph -> ExecutionGraph -> physical execution graph.

  StreamGraph : the initial graph generated from the code the user writes with the Stream API. It represents the topology of the program.

  JobGraph : the StreamGraph is optimized into the JobGraph, the data structure that is submitted to the JobManager. The main optimization is chaining multiple qualifying nodes together into a single node, which reduces the serialization / deserialization / transfer overhead of data flowing between nodes.

  ExecutionGraph : the JobManager generates the ExecutionGraph from the JobGraph. The ExecutionGraph is the parallelized version of the JobGraph and is the core data structure of the scheduling layer.

  Physical execution graph : the "graph" formed after the JobManager schedules the Job based on the ExecutionGraph and deploys the Tasks onto the individual TaskManagers. It is not a concrete data structure.
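  To make the client-side part of this pipeline concrete, here is a minimal sketch (Java DataStream API, not from the original post) that builds a trivial source -> map -> sink program and prints the JSON plan derived from the StreamGraph; calling env.execute(...) would then turn the plan into a JobGraph and submit it to the JobManager.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ExecutionPlanDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            // A trivial dataflow: source -> map -> sink.
            env.fromElements("a", "b", "c")
               .map(new MapFunction<String, String>() {
                   @Override
                   public String map(String value) {
                       return value.toUpperCase();
                   }
               })
               .print();

            // Prints a JSON description of the client-side plan (built from the StreamGraph).
            // env.execute("demo") would instead ship a JobGraph to the JobManager.
            System.out.println(env.getExecutionPlan());
        }
    }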

 TaskManager and Slots

   

   Each TaskManager in Flink is an independent JVM process, and it may execute one or more subtasks, each in a separate thread.

   To control how many Tasks a TaskManager accepts, the TaskManager uses task slots (every TaskManager has at least one slot).

 Each task slot represents a fixed-size subset of the TaskManager's resources. If a TaskManager has three slots, it divides its managed memory into three parts, one per slot. Slotting the resources means a subtask does not have to compete with subtasks from other jobs for managed memory; instead, it has a certain amount of memory reserved for it. Note that this does not involve CPU isolation; currently slots only isolate the managed memory of tasks.

    By adjusting the number of task slots, users can define how subtasks are isolated from each other. If a TaskManager has one slot, each task group runs in a separate JVM (which may be started inside a dedicated container); having multiple slots per TaskManager means more subtasks can share the same JVM. Tasks in the same JVM process share TCP connections (via multiplexing) and heartbeat messages. They may also share data sets and data structures, which reduces the per-task overhead.

   

 

   By default, Flink allows subtasks to share slots, even if they are subtasks of different tasks. The result is that one slot can hold an entire pipeline of the job.
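  As an illustration (not from the original post), slot sharing can be influenced in the DataStream API with slotSharingGroup(...); the group name "isolated" and the localhost:9999 socket source below are placeholder choices.

    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SlotSharingDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            DataStream<String> lines = env.socketTextStream("localhost", 9999); // placeholder source

            lines
                // Stays in the "default" slot-sharing group, so its subtasks may end up
                // in the same slots as the source's subtasks.
                .filter(value -> !value.isEmpty())
                // Moving an operator into its own group ("isolated" is a made-up name)
                // keeps its subtasks out of the slots used by the default group.
                .filter(value -> value.length() > 1).slotSharingGroup("isolated")
                .print();

            env.execute("slot-sharing-demo");
        }
    }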

  A Task Slot is a static concept: it describes the concurrent execution capacity a TaskManager has. It can be configured with the parameter taskmanager.numberOfTaskSlots.

       Parallelism is a dynamic concept: it is the concurrency actually used by the TaskManagers when the program runs. It can be configured with the parameter parallelism.default.

That is, assume there are three TaskManagers and each is assigned 3 TaskSlots, so each TaskManager can accept three tasks and there are nine TaskSlots in total. If we set parallelism.default = 1, the program runs with a default parallelism of 1, so only one of the nine TaskSlots is used and eight are idle. Setting an appropriate degree of parallelism is therefore needed to improve efficiency.
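A small sketch of how these two knobs interact in practice (assumptions: the cluster-side slot count comes from taskmanager.numberOfTaskSlots in flink-conf.yaml; the socket host and port are placeholders):

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ParallelismDemo {
        public static void main(String[] args) throws Exception {
            // taskmanager.numberOfTaskSlots (flink-conf.yaml) is the static capacity:
            // how many tasks each TaskManager *can* run. parallelism.default is the
            // dynamic default actually used at runtime; it can be overridden below.
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(3); // job-wide default, overrides parallelism.default

            env.socketTextStream("localhost", 9999) // this source is non-parallel (parallelism 1)
               .filter(line -> !line.isEmpty())     // runs with the job default (3)
               .print().setParallelism(1);          // per-operator override for the sink

            env.execute("parallelism-demo");
        }
    }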


Program and data flow


  Every Flink program is composed of three parts:  Source , Transformation and Sink .

  •   The Source is responsible for reading data from the data source, Transformations process the data with various operators, and the Sink is responsible for output (see the sketch after this list).
  •   At runtime, a Flink program is mapped to streaming dataflows, each of which contains these three parts.
  •   Each dataflow starts with one or more sources and ends with one or more sinks. A dataflow resembles an arbitrary directed acyclic graph (DAG), although special forms of cycles can of course be constructed via iteration.
  •   In most cases, the transformations in a program correspond one-to-one to the operators in the dataflow, but sometimes one transformation may correspond to multiple operators.
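A minimal sketch of a program with all three parts (the localhost:9999 socket source is a placeholder; any source would do):

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class SourceTransformSinkDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            env.socketTextStream("localhost", 9999)      // Source: reads lines from a socket
               .map(new MapFunction<String, String>() {  // Transformation: one processing operator
                   @Override
                   public String map(String line) {
                       return line.toLowerCase();
                   }
               })
               .print();                                 // Sink: writes the results to stdout

            env.execute("source-transform-sink-demo");
        }
    }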

Parallel data streams

  Flink programs execute in a parallel, distributed fashion. During execution, a stream consists of one or more stream partitions, and each operator consists of one or more operator subtasks. These operator subtasks execute independently of one another, in different threads, possibly on different physical machines or in different containers.

  The number of subtasks of a particular operator is called its Parallelism (degree of parallelism). The parallelism of a stream is always equal to the parallelism of the operator that produces it. Within one program, different operators may have different degrees of parallelism.

Streams can transport data between two operators in a one-to-one (forwarding) pattern or in a redistributing pattern; which one applies depends on the type of operator.

  One-to-One ① : the stream (for example between the source and a map operator) maintains the partitioning and the order of the elements. This means that the subtasks of the map operator see the elements in the same number and order as they are produced by the subtasks of the source operator. Operators such as map, filter, and flatMap all use this one-to-one pattern. It is similar to a narrow dependency in Spark.

  Redistributing ② : the partitioning of the stream changes (for example between map() and keyBy/window, or between keyBy/window and the sink). Each operator subtask sends data to different target subtasks, depending on the selected transformation. For example, keyBy() repartitions by the hash of the key, while broadcast() and rebalance() repartition the stream in other ways (rebalance redistributes randomly). These operators trigger a redistribution, a process similar to the shuffle in Spark; it corresponds to a wide dependency in Spark.
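A short sketch showing both patterns in one job (the tuple layout and socket source are illustrative choices, not from the original post): map keeps the one-to-one forwarding pattern, while keyBy triggers a hash-based redistribution.

    import org.apache.flink.api.common.functions.MapFunction;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class PartitioningDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            env.socketTextStream("localhost", 9999)
               // One-to-one (forwarding): map keeps the partitioning and element order
               // of its input, like a narrow dependency in Spark.
               .map(new MapFunction<String, Tuple2<String, Integer>>() {
                   @Override
                   public Tuple2<String, Integer> map(String word) {
                       return Tuple2.of(word, 1);
                   }
               })
               // Redistributing: keyBy repartitions the stream by the hash of the key,
               // comparable to a shuffle (wide dependency) in Spark.
               .keyBy(t -> t.f0)
               .sum(1)
               .print();

            env.execute("partitioning-demo");
        }
    }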

Task chain

 

    Flink uses an optimization technique called task chaining, which can reduce local communication overhead under certain conditions. To be eligible for chaining, two or more operators must have the same degree of parallelism and be connected in a local-forward (one-to-one) manner.

Operators with the same parallelism that are connected by one-to-one operations are linked together by Flink into a single task, and the original operators become subtasks inside it.

The same degree of parallelism and the one-to-one connection are both required; neither condition can be omitted.


      Chaining operators together into tasks is a very effective optimization: it reduces thread-to-thread handover and buffer-based data exchange, which lowers latency while increasing throughput. The chaining behavior can be specified explicitly in the programming API, as sketched below.
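A sketch of how chaining can be controlled from the API (the two filters are placeholders): by default Flink chains the eligible operators; startNewChain() and disableChaining() override that per operator, and disableOperatorChaining() turns it off for the whole job.

    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class ChainingDemo {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // env.disableOperatorChaining(); // would switch chaining off for the whole job

            env.socketTextStream("localhost", 9999)
               // Same parallelism + one-to-one connection: Flink would normally chain
               // these two filters into a single task.
               .filter(line -> !line.isEmpty())
               .filter(line -> line.length() > 3)
               // startNewChain() makes the second filter begin a new chain instead;
               // disableChaining() would exclude it from any chain at all.
               .startNewChain()
               .print();

            env.execute("chaining-demo");
        }
    }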

Origin: www.cnblogs.com/shengyang17/p/11785169.html