The Spark Distributed Computing Framework: Hands-On with Clusters

1. The overall structure of Spark

1. Spark cluster architecture

From the perspective of cluster deployment, a Spark cluster consists of a cluster manager (Cluster Manager), worker nodes (Worker), executors (Executor), a driver (Driver), and an application (Application). Their overall relationship is shown in the following figure:

1. Cluster Manager

Spark's cluster manager is mainly responsible for allocating and managing the resources of the entire cluster. The Cluster Manager is the ResourceManager in YARN deployment mode, the Mesos Master in Mesos deployment mode, and the Master in Standalone deployment mode. The resources allocated by the Cluster Manager constitute the first-level allocation: it assigns resources such as memory and CPU on each Worker to the Application, but it is not responsible for allocating resources to individual Executors. (The Master in Standalone deployment mode is an exception: it directly allocates memory, CPU, and Executors to the Application.) Currently Standalone, YARN, Mesos, EC2, and others can serve as Spark cluster managers.
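The choice of Cluster Manager is expressed through the `--master` option of `spark-submit`. As a minimal sketch (host names, ports, and the application script `my_app.py` are placeholders for illustration):

```shell
# Standalone mode: the Spark Master is the Cluster Manager.
spark-submit --master spark://master-host:7077 my_app.py

# YARN mode: the ResourceManager is the Cluster Manager;
# the cluster location is read from HADOOP_CONF_DIR/YARN_CONF_DIR.
spark-submit --master yarn --deploy-mode cluster my_app.py

# Mesos mode: the Mesos Master is the Cluster Manager.
spark-submit --master mesos://mesos-host:5050 my_app.py
```

In each case the same application code runs unchanged; only the resource-negotiation backend differs.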

2. Worker

The Worker is a Spark worker node; in YARN deployment mode this role is filled by the NodeManager. The Worker node is mainly responsible for the following tasks:

  1. Informing the Cluster Manager of its memory, CPU, and other resources through the registration mechanism;
  2. Creating Executors and further assigning resources and tasks to them;
  3. Synchronizing resource information with the Cluster Manager.
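The two levels of allocation described above can be seen in the resource options of `spark-submit`. A minimal sketch (the master URL and `my_app.py` are placeholders; `--total-executor-cores` applies in Standalone/Mesos modes):

```shell
# First-level allocation: the Cluster Manager grants memory/CPU on
# Workers to the Application as a whole (bounded by --total-executor-cores).
# Second-level allocation: each Worker launches Executors sized by
# --executor-cores and --executor-memory and assigns tasks to them.
spark-submit \
  --master spark://master-host:7077 \
  --total-executor-cores 8 \
  --executor-cores 2 \
  --executor-memory 4g \
  my_app.py
```

With these settings, the Application as a whole may use 8 cores, so the Workers would launch up to 4 Executors, each holding 2 cores and 4 GiB of memory.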

Origin blog.csdn.net/qq_35029061/article/details/132252544