The working mechanism (execution process) of Yarn, the distributed resource management framework

Yarn working mechanism flowchart

[Figure: Yarn working mechanism flowchart]



Yarn working mechanism in detail

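1. The client submits the job to the ResourceManager and applies for a job ID. The request is handled by the ApplicationManager, the ResourceManager component that processes client requests (the Hadoop documentation calls it the ApplicationsManager; it is not the per-job ApplicationMaster)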

2. The ResourceManager returns a job ID and an HDFS path for the job's resources to the client

3. The client uploads the resources required to run the job (Jar package, configuration information, input split information, etc.) to the returned HDFS path

4. The client then sends a request to the ResourceManager to run the job

5. The ApplicationManager forwards the request to the scheduler (ResourceScheduler)

6. The scheduler puts the job in its scheduling queue. When the request reaches its turn, the scheduler notifies the ApplicationManager to allocate a Container opened up on a NodeManager and to create the ApplicationMaster corresponding to the job

7. The ApplicationManager opens up resources (a Container) on a NodeManager and then starts the ApplicationMaster in that Container

8. The ApplicationMaster fetches the files submitted to HDFS and creates the MapTasks and ReduceTasks based on the input split information

9. The ApplicationMaster applies to the scheduler (the ResourceScheduler in the ResourceManager) for the resources needed to run the MapTasks and ReduceTasks

10. The scheduler returns execution information to the ApplicationMaster, telling it which NodeManagers have the required resources

11. The ResourceManager assigns the MapTasks to multiple idle NodeManagers, and each NodeManager that receives a task creates a Container for it

12. The ApplicationMaster sends the program startup scripts to the NodeManagers that received tasks. Each of these NodeManagers starts its MapTask, and the MapTask processes the data, partitioning and sorting it.

13. After all MapTasks have finished running, the ApplicationMaster applies to the ResourceManager for a Container and runs the ReduceTask in it

14. After the program finishes running, the ApplicationMaster sends a request to the ResourceManager to release its resources

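15. Progress and status updates: while the job runs, the tasks report their progress and status to the ApplicationMaster, and the client can query the ApplicationMaster for the job's progress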
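
The steps above can be sketched as a small simulation. This is only a toy model in plain Python, not the real Yarn API: the class and function names (`NodeManager`, `ResourceManager`, `run_application_master`, `run_job`), the slot counts, and the `hdfs:///staging/...` path are all assumptions made up for illustration. It shows the order of events: job ID and path, upload, queueing, ApplicationMaster start, MapTasks placed on NodeManagers with free Containers, the ReduceTask after all MapTasks finish, and the final release of resources.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """A worker node with a fixed number of Container slots (hypothetical model)."""
    name: str
    free_slots: int
    running: list = field(default_factory=list)

    def launch(self, task):
        # Steps 7 and 11: open a Container and run something in it.
        self.free_slots -= 1
        self.running.append(task)

    def release(self, task):
        # Give the Container back once the work is done.
        self.running.remove(task)
        self.free_slots += 1


class ResourceManager:
    """Toy stand-in for the ResourceManager and its scheduling queue."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.queue = deque()  # scheduling queue (step 6)
        self.next_id = 1

    def new_job(self):
        # Steps 1-2: hand out a job ID and an HDFS path (path is invented here).
        job_id = f"job_{self.next_id:04d}"
        self.next_id += 1
        return job_id, f"hdfs:///staging/{job_id}"

    def allocate(self):
        # Steps 9-10: report which NodeManager has a free Container.
        for node in self.nodes:
            if node.free_slots > 0:
                return node
        raise RuntimeError("no free Container")

    def submit(self, job_id, splits, log):
        # Steps 4-6: queue the job, then take it from the queue when its turn comes.
        self.queue.append((job_id, splits))
        job_id, splits = self.queue.popleft()
        am_node = self.allocate()
        am_node.launch(f"AM({job_id})")  # step 7
        log.append(f"AM for {job_id} on {am_node.name}")
        run_application_master(self, job_id, splits, log)
        am_node.release(f"AM({job_id})")  # step 14


def run_application_master(rm, job_id, splits, log):
    # Step 8: one MapTask per input split.
    map_tasks = [f"map_{i}" for i in range(len(splits))]
    placed = []
    for task in map_tasks:  # steps 9-12
        node = rm.allocate()
        node.launch(task)
        placed.append((node, task))
        log.append(f"{task} on {node.name}")
    for node, task in placed:  # all MapTasks finish
        node.release(task)
    node = rm.allocate()  # step 13: Container for the ReduceTask
    node.launch("reduce_0")
    log.append(f"reduce_0 on {node.name}")
    node.release("reduce_0")


def run_job(splits):
    nodes = [NodeManager("nm1", 2), NodeManager("nm2", 2), NodeManager("nm3", 2)]
    rm = ResourceManager(nodes)
    log = []
    job_id, path = rm.new_job()
    log.append(f"upload resources to {path}")  # step 3
    rm.submit(job_id, splits, log)
    return job_id, log, nodes


if __name__ == "__main__":
    for line in run_job(["split_0", "split_1", "split_2"])[1]:
        print(line)
```

In this sketch the scheduler simply picks the first NodeManager with a free slot. Yarn's real schedulers apply their own queueing and placement policies, so the placement order here should not be read as Yarn's actual behavior.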


Origin blog.csdn.net/qq_42578036/article/details/110826261