My understanding of YARN

YARN is the resource management and scheduling layer introduced in Hadoop 2. Its basic idea is to split the two functions of the old JobTracker, resource management and job scheduling/monitoring, into separate components: one global ResourceManager (RM) and one ApplicationMaster (AM) per application. (An application here is either a single MR job or a DAG of jobs.)
The main components of YARN are the ResourceManager (RM), ApplicationMaster (AM), NodeManager (NM), and Container. Their main functions are introduced below:

RM: The ResourceManager is the global resource manager, responsible for resource management and allocation across the entire system. It schedules resources, starts and monitors each AM, and restarts an AM when it fails. (The RM is not responsible for fault tolerance of the tasks inside an application; that is the AM's job.) It consists of two main components: the Scheduler and the ApplicationsManager (ASM).
The Scheduler is mainly responsible for coordinating resource allocation among the applications in the cluster and keeping the cluster as a whole running efficiently. It is a pure scheduler: it only allocates Containers and does not monitor applications or track their status. Likewise, it does not restart tasks that fail because of application errors or hardware faults.
The other component, the ApplicationsManager, is mainly responsible for accepting job submissions, allocating the first Container of an application to run its ApplicationMaster, monitoring the ApplicationMaster, and restarting the Container that runs the ApplicationMaster on failure.
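The "pure scheduler" role described above can be illustrated with a toy model. This is only a sketch: `Node`, `ToyScheduler`, and the first-fit policy are hypothetical names and assumptions made for illustration, not YARN's actual classes or scheduling algorithm.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical record of the free resources a node has reported.
    name: str
    free_mem_mb: int
    free_vcores: int


class ToyScheduler:
    """Hands out containers only; it never tracks task status or restarts anything."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.next_id = 1

    def allocate(self, mem_mb, vcores):
        """Return (container_id, node_name), or None if no node can fit the request."""
        for node in self.nodes:  # first-fit policy, assumed for simplicity
            if node.free_mem_mb >= mem_mb and node.free_vcores >= vcores:
                node.free_mem_mb -= mem_mb
                node.free_vcores -= vcores
                container_id = f"container_{self.next_id}"
                self.next_id += 1
                return container_id, node.name
        return None


scheduler = ToyScheduler([Node("node-a", 2048, 2), Node("node-b", 4096, 4)])
first = scheduler.allocate(1024, 1)   # fits on node-a
second = scheduler.allocate(3072, 2)  # node-a is too small now, so node-b
third = scheduler.allocate(8192, 1)   # no node is large enough
```

Note that the scheduler returns a grant and then forgets about it; deciding what to do when a task in that container fails is left to the caller, which in YARN is the ApplicationMaster.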

AM: The ApplicationMaster requests resources (Containers) from the ResourceManager's Scheduler and works with the NodeManagers to run the application's tasks, tracking their status and monitoring their execution. It is also responsible for restarting tasks that fail.
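The AM's task-level fault tolerance can be sketched as a simple retry loop. The function name `run_with_retries`, the attempt limit, and the simulated failure are all assumptions for illustration; a real ApplicationMaster would typically request a fresh Container for each new attempt.

```python
def run_with_retries(task, max_attempts=3):
    """Run task() until it succeeds or max_attempts is reached.

    Returns (result, attempts_used). Re-raises the last error if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return task(), attempt
        except Exception:
            if attempt == max_attempts:
                raise


calls = {"n": 0}


def flaky_task():
    # Hypothetical task that fails twice (e.g. its container was lost), then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("container lost")
    return "done"


result, attempts = run_with_retries(flaky_task)
```

Here the retry decision lives entirely in the application-side code, which matches the division of labor above: the RM restarts a failed AM, while the AM restarts its own failed tasks.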

NM: A NodeManager process runs on every node in the cluster. It carries out the ResourceManager's resource allocations on its node by providing the specific Containers to applications, and it monitors Container resource usage and reports it to the ResourceManager. Whereas the ResourceManager is a single global process, each NodeManager manages resource allocation only on its own node and monitors that node's health. When a node starts, its NodeManager registers with the ResourceManager and reports how many resources it has available. At runtime the NodeManager and ResourceManager keep this information up to date, which lets the ResourceManager keep the whole cluster in good shape. A NodeManager manages only its own Containers and knows nothing about the applications running in them; managing application information is the ApplicationMaster's job. The NodeManager also receives and handles the AM's requests to start and stop Containers.
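The register-then-report cycle between a NodeManager and the ResourceManager might look roughly like this. `ToyResourceManager` and its methods are hypothetical names for this sketch, and the real protocol carries far more information than free memory and cores.

```python
class ToyResourceManager:
    """Keeps the latest resource report from each registered node."""

    def __init__(self):
        self.nodes = {}  # node name -> most recently reported free resources

    def register(self, name, mem_mb, vcores):
        # Called once when a node starts: it announces what it has available.
        self.nodes[name] = {"mem_mb": mem_mb, "vcores": vcores}

    def heartbeat(self, name, mem_mb, vcores):
        # Called periodically at runtime to refresh the node's report.
        if name not in self.nodes:
            raise KeyError(f"{name} has not registered")
        self.nodes[name] = {"mem_mb": mem_mb, "vcores": vcores}

    def cluster_free_mem(self):
        return sum(report["mem_mb"] for report in self.nodes.values())


rm = ToyResourceManager()
rm.register("node-a", 8192, 4)
rm.register("node-b", 4096, 2)
free_after_registration = rm.cluster_free_mem()

# node-a has started a 2048 MB, 1-core container and reports the new totals.
rm.heartbeat("node-a", 6144, 3)
free_after_heartbeat = rm.cluster_free_mem()
```

The ResourceManager's global view is only as fresh as the last report from each node, which is why the updates have to continue for as long as the node is running.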

Container: A Container is a bundle of system resources on a node, currently CPU and memory (resources such as disk and network may be added in the future). It is the basic unit of task execution in YARN: every job or application runs in one or more Containers. In the YARN framework, the ResourceManager only tells the ApplicationMaster which Containers are available; the ApplicationMaster must then contact the NodeManager to request the specific Container.
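Because a Container bundles both memory and CPU, the number of Containers a node can host is limited by whichever resource runs out first. The following is a small worked example; the `Resource` type and the numbers are made up for illustration, and it assumes both dimensions are enforced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    # Hypothetical two-dimensional resource: memory in MB and virtual cores.
    mem_mb: int
    vcores: int


def max_containers(node: Resource, container: Resource) -> int:
    """How many containers of the given size fit on the node at once."""
    by_memory = node.mem_mb // container.mem_mb
    by_cpu = node.vcores // container.vcores
    return min(by_memory, by_cpu)


node = Resource(mem_mb=8192, vcores=4)
small = Resource(mem_mb=1024, vcores=1)
large = Resource(mem_mb=3072, vcores=1)

print(max_containers(node, small))  # CPU-bound: memory allows 8, cores allow 4
print(max_containers(node, large))  # memory-bound: memory allows 2, cores allow 4
```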

Analysis of the application submission process:
1. The client submits the application to the ResourceManager and requests an ApplicationMaster instance.
2. The ResourceManager finds a NodeManager that can run a Container and starts the ApplicationMaster instance in that Container.
3. The ApplicationMaster registers with the ResourceManager. After registration, the client can query the ResourceManager for the details of its ApplicationMaster and then interact with the ApplicationMaster directly.
4. The ApplicationMaster requests Containers from the ResourceManager. Once a Container has been allocated, the ApplicationMaster sends a container-launch-specification to the NodeManager to start the Container. (This specification includes the information the Container needs to communicate with the ApplicationMaster.)
5. The application code runs in the started Container and reports progress, status, and similar information to the ApplicationMaster through an application-specific protocol.
6. While the application is running, the client that submitted it communicates directly with the ApplicationMaster to obtain status and progress updates, again through an application-specific protocol.
7. Once the application has finished and all related work is complete, the ApplicationMaster unregisters from the ResourceManager and shuts down, and all Containers it used are returned to the system.
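The submission steps above can be traced as an ordered list of messages between the components. This is a deliberately simplified sketch: `trace_application` is a hypothetical helper, nothing is actually scheduled or executed, and the per-task messages are shown sequentially although real tasks run concurrently.

```python
def trace_application(num_tasks):
    """Return the sequence of messages exchanged for one application run."""
    log = [
        "client -> RM: submit application",  # step 1
        "RM -> NM: start AM container",      # step 2
        "AM -> RM: register",                # step 3
    ]
    for i in range(num_tasks):
        log.append(f"AM -> RM: request container for task {i}")          # step 4
        log.append(f"AM -> NM: launch task {i} with launch specification")  # step 4
        log.append(f"task {i} -> AM: report progress")                   # step 5
    log.append("client -> AM: query status")  # step 6
    log.append("AM -> RM: unregister")        # step 7
    return log


messages = trace_application(num_tasks=2)
for line in messages:
    print(line)
```

Reading the trace, the ResourceManager appears only at the start, when Containers are requested, and at the end; status reporting and client queries go through the ApplicationMaster.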
