Hadoop detailed notes (18): a closer look at how a MapReduce program runs on YARN


The two pictures above show the detailed steps; please read them carefully. The same steps are described in text below.

1. The client submits a job to the ResourceManager, the YARN master node. An example job: counting how many times each word appears in a 4 GB file.
The command for submitting the job in a Hadoop environment: bin/yarn jar XXX.jar MainClass args
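
To make the example job concrete, here is a minimal sketch of word counting written as map, shuffle, and reduce phases in plain Python. It is not the Hadoop API: the function names map_phase, shuffle, and reduce_phase are invented for illustration, and everything runs in one process rather than in YARN Containers.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated word in a line."""
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    """Group the values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence over an iterable of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["hello yarn", "hello mapreduce"]))
```

In a real job the map calls run in parallel on different machines, one per input split, and the shuffle moves data across the network. The sketch only shows the data flow.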

2. The ResourceManager starts a Container on a worker node (a machine running a NodeManager, usually the same machine as an HDFS DataNode) and runs the ApplicationMaster, the manager of this one application, inside it.

Note: A Container is a bundle of resources reserved on one node for running a task (memory and CPU cores by default). The ApplicationMaster is a temporary manager that exists only for one job, comparable to a project manager. It requests resources such as memory and CPU from the ResourceManager according to the size of the job.
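
As a rough illustration of "according to the size of the job": MapReduce normally runs one map task per input split, so the number of map tasks can be estimated from the file size. The sketch below assumes a 128 MB split size (the default HDFS block size in recent Hadoop releases, but configurable per cluster) and one reduce task; the helper names are hypothetical.

```python
import math

MB = 1024 * 1024
GB = 1024 * MB


def estimate_map_tasks(file_size_bytes, split_size_bytes=128 * MB):
    """One map task per input split; the 128 MB split size is an assumption."""
    return math.ceil(file_size_bytes / split_size_bytes)


def estimate_containers(file_size_bytes, num_reducers=1, split_size_bytes=128 * MB):
    """Containers used over the job's lifetime, not necessarily all at once:
    one per map task, one per reduce task, plus one for the ApplicationMaster."""
    return estimate_map_tasks(file_size_bytes, split_size_bytes) + num_reducers + 1


if __name__ == "__main__":
    print(estimate_map_tasks(4 * GB))   # map tasks for the 4 GB example file
    print(estimate_containers(4 * GB))  # maps + 1 reducer + 1 ApplicationMaster
```

Under these assumptions the 4 GB file from step 1 yields 32 map tasks, so the job uses 34 Containers in total.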

3. The ApplicationMaster requests resources from the ResourceManager in order to run all the tasks of the MapReduce job.
The ResourceManager allocates Containers on NodeManagers and tells the ApplicationMaster which machines the granted resources are on.
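
The allocation step can be pictured with a toy model. The sketch below hands out Containers greedily to whichever node has the most free memory. This is only an illustration: the real YARN schedulers (for example the Capacity and Fair schedulers) are far more elaborate, and the function name and node names are hypothetical.

```python
def allocate_containers(free_memory_mb, num_containers, container_mb):
    """Assign containers to nodes greedily, most free memory first.

    free_memory_mb maps a node name to its free memory in MB.
    Returns a list of node names, one per granted container. The list is
    shorter than num_containers if the cluster runs out of memory.
    """
    remaining = dict(free_memory_mb)  # copy, so the caller's dict is untouched
    granted = []
    for _ in range(num_containers):
        # Node with the most free memory; ties go to the alphabetically first name.
        node = max(sorted(remaining), key=lambda n: remaining[n], default=None)
        if node is None or remaining[node] < container_mb:
            break  # no node can host another container of this size
        remaining[node] -= container_mb
        granted.append(node)
    return granted


if __name__ == "__main__":
    cluster = {"node1": 3072, "node2": 2048}
    print(allocate_containers(cluster, num_containers=4, container_mb=1024))
```

The returned list plays the role of the ResourceManager's answer: it tells the ApplicationMaster which machine each granted Container lives on.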

4. The ApplicationMaster contacts those NodeManagers, which start the tasks (map tasks and reduce tasks) in the corresponding Containers.

5. Each running task reports its status to the ApplicationMaster at regular intervals (the heartbeat mechanism), so that the ApplicationMaster can monitor the whole application.
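
The heartbeat idea can be sketched in a few lines. The class below is a toy model, not Hadoop code: the class name, the task ids, and the 10-second timeout are assumptions made for illustration, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
class HeartbeatMonitor:
    """Tracks the latest heartbeat of each task, as a job manager might."""

    def __init__(self, timeout_seconds):
        self.timeout_seconds = timeout_seconds
        self.last_seen = {}  # task id -> time of the most recent heartbeat
        self.progress = {}   # task id -> last reported progress (0.0 to 1.0)

    def heartbeat(self, task_id, progress, now):
        """Record a status report from a task."""
        self.last_seen[task_id] = now
        self.progress[task_id] = progress

    def timed_out(self, now):
        """Return the ids of tasks that have been silent longer than the timeout."""
        return sorted(
            task_id
            for task_id, seen in self.last_seen.items()
            if now - seen > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(timeout_seconds=10)
    monitor.heartbeat("map_0", 0.5, now=0)
    monitor.heartbeat("map_1", 0.2, now=0)
    monitor.heartbeat("map_0", 0.9, now=8)
    print(monitor.timed_out(now=12))  # map_1 has been silent for 12 seconds
```

A manager that notices a silent task can treat it as failed and ask for the task to be run again elsewhere.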

6. When all tasks are completed (the reduce tasks finish last), the ApplicationMaster informs the ResourceManager, which then shuts down the ApplicationMaster and releases its Container.
 


Origin blog.csdn.net/qq_37933018/article/details/107337897