The relationship between Executor, Task and Container in Spark

In Spark, a node can have one or more Executors, and the relationship among Executors, Tasks, and Containers is as follows:

1. Executor

  • An Executor is a process in a Spark application that runs on a Worker Node. Each Spark application has its own set of Executor processes.
  • A node can run one or more Executor processes. Each Executor has its own JVM instance, so every Executor runs as an independent Java process, isolated from the others.
  • The Executor process is responsible for running the application's tasks, the basic units of distributed computation.
  • Executors live for the lifetime of the application and can keep state in memory (such as cached RDD partitions and broadcast variables), which lets tasks share data efficiently. A configuration sketch follows this list.
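
To make this concrete, here is a minimal sketch of how an application can size its Executors; the application name and the resource values are illustrative, not from the original post:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative values: 4 Executor processes, each a separate JVM
// with 2 task slots (threads) and a 4 GB heap.
val spark = SparkSession.builder()
  .appName("executor-demo")                 // hypothetical application name
  .config("spark.executor.instances", "4")  // number of Executor processes
  .config("spark.executor.cores", "2")      // concurrent task slots per Executor
  .config("spark.executor.memory", "4g")    // JVM heap size per Executor
  .getOrCreate()
```

Because each Executor is a single JVM, `spark.executor.memory` sets the heap of that process, and `spark.executor.cores` bounds how many Tasks it can run at once.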

2. Task

  • A Task is the smallest unit of work in a Spark application: a computation applied to one partition of an RDD.
  • Each Task is an independent unit of computation and runs on a single thread inside an Executor, so an Executor executes as many Tasks in parallel as it has task slots (see the sketch below).
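
A short sketch of the partition-to-Task mapping, reusing the `spark` session from the sketch above; the data and the partition count are illustrative:

```scala
// One Task is created per partition of the stage's final RDD.
val rdd = spark.sparkContext.parallelize(1 to 1000, numSlices = 8) // 8 partitions

// This action runs as a single stage of 8 Tasks, one per partition,
// scheduled onto the available Executor threads.
val total = rdd.map(_ * 2).reduce(_ + _)
```

With the 4 Executors × 2 cores from the earlier sketch, all 8 Tasks can run at the same time.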

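3. Container

  • A Container is the resource-allocation unit of the cluster manager (most commonly YARN): a fixed grant of memory and CPU cores on one node.
  • When Spark runs on YARN, each Executor process is launched inside its own Container; in cluster deploy mode the driver (the ApplicationMaster) occupies a Container as well.
  • The size of an Executor's Container is therefore derived from the Executor's settings: roughly spark.executor.memory plus spark.executor.memoryOverhead for memory, and spark.executor.cores for CPU.
  • In short: one Container holds one Executor (a JVM process), and one Executor runs many Tasks, one per thread slot.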

Origin blog.csdn.net/m0_47256162/article/details/132376865