Summary of the operating principle of YARNRunner

After reading some of the source code, I now have a general understanding of how Yarn runs as a whole. To sum up:

Each Yarn cluster has one Resource Manager and several NodeManagers.

The Resource Manager mainly has two components: the Scheduler and the Applications Manager (ASM).

The Scheduler comes in several flavors, such as FIFO and Fair. Its main function is to allocate containers according to the resource usage of each Node Manager.
There is also an Uber mode: once 7 conditions are met, the entire Job runs in just one container.
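The 7 conditions for Uber mode are not listed here. As a rough sketch, the check can be written as a conjunction of seven tests. The sketch assumes they are the usual MapReduce ones: the feature is enabled, there are few maps, few reduces, the input is small, task memory and CPU fit inside the AM container, and the job is not a chain job. The `JobProfile` class, its field names, and every threshold value below are illustrative assumptions, not values read from a real cluster.

```python
from dataclasses import dataclass


@dataclass
class JobProfile:
    """Hypothetical summary of a job; all field names are illustrative."""
    uber_enabled: bool     # assumed to mirror mapreduce.job.ubertask.enable
    num_maps: int
    num_reduces: int
    input_bytes: int
    task_memory_mb: int    # largest memory request of any task
    task_vcores: int       # largest vcore request of any task
    is_chain_job: bool


def is_uber(job, max_maps=9, max_reduces=1,
            max_bytes=128 * 1024 * 1024,
            am_memory_mb=1536, am_vcores=1):
    """Return True only if all seven assumed conditions hold.

    The default thresholds are assumptions chosen for illustration.
    """
    conditions = [
        job.uber_enabled,                    # 1. feature switched on
        job.num_maps <= max_maps,            # 2. few map tasks
        job.num_reduces <= max_reduces,      # 3. few reduce tasks
        job.input_bytes <= max_bytes,        # 4. small input
        job.task_memory_mb <= am_memory_mb,  # 5. tasks fit in AM memory
        job.task_vcores <= am_vcores,        # 6. tasks fit in AM vcores
        not job.is_chain_job,                # 7. not a chain job
    ]
    return all(conditions)


small = JobProfile(True, 2, 1, 10 * 1024 * 1024, 1024, 1, False)
large = JobProfile(True, 50, 4, 10 * 1024 ** 3, 1024, 1, False)
print(is_uber(small))  # True
print(is_uber(large))  # False
```

A job that fails any single test falls back to the normal path, where each task gets its own container.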

ASM, the Applications Manager, mainly manages all the Application Masters (AMs). After an AM is created, it registers with the ASM and later issues its various resource requests.

The AppMaster is launched on a NodeManager through the ASM. In fact, the AM is a special Container: after a Job is submitted, an AM container is created first, and that AM container then coordinates the running of the entire Job.
Each NM sends heartbeats to the Resource Manager, so the RM knows the resource status of every NM. When a container is requested, the RM looks for a suitable NM to allocate it on, in this order of preference:
1. An NM that holds the data has priority.
2. If no such NM is available, an NM on the same rack is used.
3. If neither 1 nor 2 is possible, a random NM in the cluster is chosen.
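The three-level preference above can be sketched as a small function. This is a toy model, not the scheduler's real code: `pick_node`, its parameters, and the labels it returns are hypothetical names chosen for illustration.

```python
import random


def pick_node(data_nodes, rack_of, free_nodes, rng=random):
    """Pick a NodeManager for one container request (toy model).

    data_nodes: set of node names that hold the task's input data
    rack_of:    dict mapping node name -> rack name
    free_nodes: list of nodes that currently have room for the container
    Returns a (node, locality_label) tuple.
    """
    # 1. Prefer a node that holds the data.
    local = [n for n in free_nodes if n in data_nodes]
    if local:
        return local[0], "node-local"

    # 2. Otherwise prefer a node on the same rack as the data.
    data_racks = {rack_of[n] for n in data_nodes}
    same_rack = [n for n in free_nodes if rack_of[n] in data_racks]
    if same_rack:
        return same_rack[0], "rack-local"

    # 3. Otherwise take a random node anywhere in the cluster.
    if free_nodes:
        return rng.choice(free_nodes), "any-node"
    return None, "no-capacity"


rack_of = {"nm1": "r1", "nm2": "r1", "nm3": "r2"}
print(pick_node({"nm1"}, rack_of, ["nm1", "nm2", "nm3"]))  # ('nm1', 'node-local')
print(pick_node({"nm1"}, rack_of, ["nm2", "nm3"]))         # ('nm2', 'rack-local')
print(pick_node({"nm1"}, rack_of, ["nm3"]))                # ('nm3', 'any-node')
```

The point of the ordering is to move computation to the data where possible, and to pay for network transfer only when no closer node has room.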

The ApplicationMaster is a management container created for each Job. It handles resource requests, container management, and task management (communicating with the NMs to start, stop, or rerun tasks, etc.) according to the state of the job.
It also communicates with the RM to request the various resources it needs.

As for a Container itself, it is actually a package of resources on a Node Manager: it is allocated by the Resource Manager, started by the AM, and runs on the NM.
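To make the idea of a container as a resource package concrete, here is a minimal sketch in which the RM carves a container out of an NM's free memory and vcores, and the container is then started on that NM on behalf of the AM. The class names, fields, and methods are simplified assumptions for illustration and do not mirror the real YARN classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """A bundle of resources pinned to one node (illustrative fields)."""
    container_id: str
    node: str
    memory_mb: int
    vcores: int


class NodeManager:
    def __init__(self, name, memory_mb, vcores):
        self.name = name
        self.free_memory_mb = memory_mb
        self.free_vcores = vcores
        self.running = set()

    def start(self, container):
        """Launch an already allocated container (requested by the AM)."""
        self.running.add(container.container_id)


class ResourceManager:
    def __init__(self, nodes):
        self.nodes = {n.name: n for n in nodes}
        self._next_id = 0

    def allocate(self, node_name, memory_mb, vcores):
        """Carve a container out of the node's free resources, or return None."""
        node = self.nodes[node_name]
        if memory_mb > node.free_memory_mb or vcores > node.free_vcores:
            return None
        node.free_memory_mb -= memory_mb
        node.free_vcores -= vcores
        self._next_id += 1
        return Container(f"container_{self._next_id}", node_name, memory_mb, vcores)


nm = NodeManager("nm1", memory_mb=4096, vcores=4)
rm = ResourceManager([nm])

c = rm.allocate("nm1", memory_mb=1024, vcores=1)  # allocated by the RM
nm.start(c)                                       # started by the AM, runs on the NM
print(c.container_id, nm.free_memory_mb)          # container_1 3072
print(rm.allocate("nm1", memory_mb=8192, vcores=1))  # None
```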

The entire YarnRunner process is illustrated by the following diagram (borrowed from the web):


1. The Client submits the Job, and the Resource Manager receives it.
2. The Resource Manager starts the first Container, the Application Master, which manages the entire Job along with its subsequent resource requests and task control.
3. After the Application Master starts, it registers with the ASM.
4. The AM calculates the amount of resources required and applies to the RM through the ASM.
5. After the RM provides the locations of all containers, the AM asks the NMs to start them.
6. The NMs start the containers.
7. Each task reports its running status and progress to the AM.
8. After the job is finished, the AM unregisters from the ASM and shuts down.
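
The eight steps above can be traced with a toy simulation that only records who talks to whom and in what order. Nothing here calls a real YARN API: `run_job`, the message strings, and the identifier `am_1` are made up for illustration.

```python
def run_job(num_tasks):
    """Simulate one job and return (event_log, asm_registry)."""
    log = []
    asm_registry = set()  # AMs currently registered with the ASM

    log.append("1 client -> RM: submit job")
    log.append("2 RM: start AM container")

    asm_registry.add("am_1")
    log.append("3 AM -> ASM: register")

    log.append(f"4 AM -> RM: request {num_tasks} containers")
    containers = [f"container_{i}" for i in range(1, num_tasks + 1)]
    log.append(f"5 RM -> AM: allocated {len(containers)} containers")

    for c in containers:
        log.append(f"6 NM: start {c}")
    for c in containers:
        log.append(f"7 task in {c} -> AM: status SUCCEEDED")

    asm_registry.discard("am_1")
    log.append("8 AM -> ASM: unregister")
    return log, asm_registry


events, registry = run_job(num_tasks=2)
for line in events:
    print(line)
print(len(events), registry)  # 10 set()
```

With two tasks, steps 6 and 7 each appear twice, so the log has ten entries, and the ASM registry is empty again once the AM has unregistered.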
