In Spark, a node can run one or more Executors. The relationships among Executors, Tasks, and Containers are as follows:
1. Executor
- An Executor is a process in a Spark application that runs on a Worker Node. Each Spark application has its own set of Executor processes.
- A node can run one or more Executor processes. Each Executor has its own JVM instance, so each Executor runs as an independent Java process, isolated from the others.
- The Executor process is responsible for running the application's tasks (the basic units of distributed computing).
- Executor processes maintain state and are reused for the lifetime of the application, which allows data (for example, cached RDD partitions) to be shared efficiently between tasks.
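The executor layout described above is typically fixed at submission time. A minimal `spark-submit` sketch (the numbers here are illustrative values, not recommendations):

```shell
# Request 4 executors, each a separate JVM with 2 task threads and a 4 GiB heap.
# A single node may host several of these executor JVMs, resources permitting.
spark-submit \
  --master yarn \
  --num-executors 4 \
  --executor-cores 2 \
  --executor-memory 4g \
  my_app.py
```

With dynamic allocation enabled, `--num-executors` becomes an initial value rather than a fixed count.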
2. Task
- A Task is the unit of work in a Spark application: a computation over a single partition of an RDD.
- Each Task is an independent unit of computation and runs on a thread within an Executor.
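The partition-to-task mapping can be sketched in plain Python (this is a conceptual simulation, not Spark itself): a list of partitions stands in for an RDD, and a thread pool plays the role of an Executor's task slots.

```python
from concurrent.futures import ThreadPoolExecutor

# An "RDD" with 3 partitions -> Spark would schedule 3 tasks, one per partition.
partitions = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]

def task(partition):
    # Each task computes independently over exactly one partition.
    return sum(partition)

# 2 workers ~ an Executor with 2 task slots: at most 2 tasks run concurrently,
# and the third task waits for a free thread.
with ThreadPoolExecutor(max_workers=2) as executor:
    partial_sums = list(executor.map(task, partitions))

print(partial_sums)       # [6, 9, 30] -- one result per partition/task
print(sum(partial_sums))  # 45 -- driver-side combine of the task results
```

In real Spark, the scheduler performs this fan-out across many Executor JVMs on many nodes, but the unit of parallelism is the same: one task per partition.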