Spark architecture and operating mechanism (2) - Spark job execution process

The process a Spark program goes through, from submission to the cluster to execution, is as follows:


(1) The user writes a Spark program (such as the WordCount program written earlier) that contains a user-defined main function, and inside that main function implements operations such as RDD creation, RDD transformation, and RDD storage. The user then submits the complete Spark program to the cluster, applies for Spark cluster resources, and runs the program; a minimal sketch of such a program follows this step. After the cluster receives the submitted Spark program, it starts a corresponding Driver process, which is responsible for executing the user-defined main function. There are two ways to start the Driver process:
    1. The Driver process is started on the client;
    2. The Master node designates a Worker node to start the Driver process and act as the Driver node;
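
For reference, a minimal Scala sketch of such a user program is shown below. It is illustrative only: the object name and the paths are placeholders, not the exact WordCount program referred to above.

    import org.apache.spark.{SparkConf, SparkContext}

    object MyApp {
      def main(args: Array[String]): Unit = {
        // The main function defines the Driver-side logic of the application.
        val conf = new SparkConf().setAppName("MyApp")
        val sc   = new SparkContext(conf)

        val data    = sc.parallelize(1 to 100)      // RDD creation
        val squares = data.map(x => x * x)          // RDD transformation
        squares.saveAsTextFile("hdfs://.../out")    // RDD storage (an action; placeholder path)

        sc.stop()
      }
    }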

(2) While executing the user-defined main function, the Driver process communicates with the Master node and applies through it for the resources required to run the program, as sketched below;
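
How many resources are requested is driven by the application's configuration. A sketch of typical standalone-mode settings (the keys are real Spark configuration properties, but the values here are illustrative only):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .setAppName("MyApp")
      .set("spark.executor.memory", "2g")   // memory requested per Executor (illustrative value)
      .set("spark.cores.max", "8")          // total cores the app may claim in standalone mode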

(3) All Worker nodes in the Spark cluster register with the Master node, and the Master node checks through heartbeats whether each registered Worker node is still alive (see the simplified sketch below). After receiving the resource request from the Driver process, the Master node instructs the registered Worker nodes to start Executor processes;
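
The liveness check can be pictured with the following simplified, hypothetical sketch; WorkerInfo and the timeout value are made up for illustration, and this is not Spark's actual Master code:

    // Hypothetical illustration of heartbeat-based liveness checking.
    case class WorkerInfo(id: String, lastHeartbeatMs: Long)

    val workerTimeoutMs = 60 * 1000L  // illustrative timeout

    def liveWorkers(workers: Seq[WorkerInfo], nowMs: Long): Seq[WorkerInfo] =
      // A Worker whose last heartbeat is too old is treated as dead
      // and excluded from further scheduling.
      workers.filter(w => nowMs - w.lastHeartbeatMs <= workerTimeoutMs)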

(4) If a Worker node is healthy and the Executor process on it starts successfully, the Master node notifies the Driver process of these resources, so that the Driver process can use them to run the Spark program in parallel across the cluster;

(5) The Driver node divides the program into tasks according to the RDD transformations and actions it contains, and sends the divided tasks to the multiple Executors whose resources were applied for; each Executor process is responsible for independently completing the part of the tasks assigned to it, and feeds the execution results back to the Driver;

(6) The Executor processes running on the Worker nodes are the real executors of the job. Multiple Executor processes can be started on each Worker node; each Executor runs in its own JVM process, and each scheduled task runs in a thread inside that Executor, as sketched below.
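
The "one task per thread" model can be sketched as follows; this is a simplified, hypothetical illustration, and Spark's real Executor is considerably more involved:

    import java.util.concurrent.Executors

    // An Executor-like JVM process runs each scheduled task in its own
    // thread taken from a pool.
    val taskPool = Executors.newCachedThreadPool()

    def launchTask(task: Runnable): Unit =
      taskPool.execute(task)  // one scheduled task <-> one thread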

(7) The Driver notifies the Client that the execution of the application is completed;


In the above process, the three most important steps are the following:


(1) Step 1: Generate RDDs
    Spark evaluates RDD operations lazily. Under lazy computing, although each operator call converts an RDD into a new RDD, and logically this series of computations happens in sequence, the operations are not executed immediately: the whole RDD operation sequence is only triggered when an action operator appears. All the preceding operator operations are assembled into a Directed Acyclic Graph (DAG), and each DAG triggers the execution of one job (Job). For example, in the WordCount program, only when execution reaches saveAsTextFile does Spark actually process the RDDs: it connects the preceding operators flatMap, map and reduceByKey together with saveAsTextFile into a directed acyclic graph and submits the job to Spark (see the sketch after the advantages list below).
    Advantages of lazy computing:
    1. A related sequence of operations can be computed continuously, without allocating separate, independent memory space for storing the intermediate results;
    2. It saves storage space, which creates the conditions for subsequent optimization of the RDD transformation operations;
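
    A sketch of how lazy computing plays out in the WordCount program, assuming an existing SparkContext sc (paths are placeholders):

        // Nothing is computed on these three lines: each call only records a
        // new RDD and its dependency on the previous one (the lineage/DAG).
        val words  = sc.textFile("hdfs://.../input.txt").flatMap(_.split(" "))
        val pairs  = words.map(word => (word, 1))
        val counts = pairs.reduceByKey(_ + _)

        // saveAsTextFile is an action: only here is the whole DAG
        // (flatMap -> map -> reduceByKey -> saveAsTextFile) submitted as a job.
        counts.saveAsTextFile("hdfs://.../output")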

(2) Step 2: Generate Stages
    The DAGScheduler instance in the Driver node traverses the dependencies between the nodes of the directed acyclic graph and divides all the operations into multiple scheduling stages (Stage);
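
    The Stage boundaries sit at shuffle (wide) dependencies in the DAG; for example, reduceByKey requires a shuffle, so the WordCount lineage is split into two stages. One way to inspect the division is RDD.toDebugString, shown here reusing the counts RDD from the previous sketch:

        // The indentation shift in the printed lineage marks the shuffle
        // dependency where the DAGScheduler places the Stage boundary.
        println(counts.toDebugString)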

(3) Step 3: Generate Tasks
    Each Stage must be converted into tasks and executed on the Worker nodes in the cluster. The TaskScheduler instance in the Driver node therefore converts each Stage into tasks and submits them to the Executor processes on the Worker nodes for execution;
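
    Each Stage runs as one task per partition of its RDD. A small sketch, reusing the pairs RDD from the earlier sketch (the partition count 4 is illustrative):

        // Giving reduceByKey an explicit partition count of 4 means the
        // resulting shuffle stage is executed as 4 tasks spread over the
        // Executors that were applied for.
        val counts4 = pairs.reduceByKey(_ + _, 4)
        println(counts4.getNumPartitions)  // prints 4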

