Hadoop YARN Framework Principles and Operating Mechanism

1.1 YARN Basic Architecture

YARN is the resource management system in Hadoop 2.0. Its basic design idea is to split the JobTracker of MRv1 into two independent services: a global ResourceManager and a per-application ApplicationMaster.

The ResourceManager is responsible for the resource management and allocation of the entire system, while the ApplicationMaster is responsible for the management of a single application.

 

1.2 Basic Structure of YARN

Overall, YARN still has a Master/Slave structure. Within the resource management framework, the ResourceManager is the Master and the NodeManagers are the Slaves. The ResourceManager is responsible for unified management and scheduling of the resources on every NodeManager. When a user submits an application, the user must supply an ApplicationMaster to track and manage it. The ApplicationMaster requests resources from the ResourceManager and asks NodeManagers to start tasks that occupy a given amount of resources. Because different ApplicationMasters are distributed across different nodes, they do not affect one another. This section introduces the basic structure of YARN.

Figure 2-9 shows the basic structure of YARN. YARN consists mainly of the ResourceManager, the NodeManager, the ApplicationMaster (the figure shows the ApplicationMasters of the MapReduce and MPI computing frameworks, labeled MR AppMstr and MPI AppMstr respectively), and the Container.

 

 

1. ResourceManager (RM)

The RM is a global resource manager responsible for resource management and allocation across the entire system. It consists mainly of two components: the Scheduler and the Applications Manager (ASM).

(1) Scheduler

The scheduler allocates the system's resources to running applications subject to constraints such as capacity and queues (for example, each queue is assigned a certain amount of resources and may run at most a certain number of jobs).

Note that the scheduler is a "pure scheduler": it no longer does any work tied to a specific application. It does not monitor or track application execution status, and it does not restart tasks that fail because of application errors or hardware faults; all of that is handled by the application's own ApplicationMaster. The scheduler allocates resources solely according to each application's resource requirements, and the unit of allocation is an abstraction called the Resource Container (Container for short). A Container is a dynamic resource allocation unit that encapsulates memory, CPU, disk, network, and other resources together, thereby limiting the amount of resources each task can use. In addition, the scheduler is a pluggable component, so users can design new schedulers to suit their needs. YARN ships with several ready-to-use schedulers, such as the Fair Scheduler and the Capacity Scheduler.
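The idea of a "pure scheduler" that only hands out Containers under per-queue limits can be sketched in a few lines of Python. This is a toy model, not YARN code: the class names `Container` and `ToyScheduler`, the `queue_share` parameter, and the first-fit placement rule are all assumptions made for illustration, and real schedulers such as the Capacity Scheduler are far more elaborate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Container:
    """Toy stand-in for a YARN Container: a bundle of resources on one node."""
    node: str
    memory_mb: int
    vcores: int


class ToyScheduler:
    """Hypothetical 'pure scheduler': hands out resources and tracks nothing else."""

    def __init__(self, nodes, queue_share):
        # nodes: {node_name: (free_memory_mb, free_vcores)}
        self.free = {name: list(res) for name, res in nodes.items()}
        total_mem = sum(mem for mem, _ in nodes.values())
        # Assumption: a queue may use at most its share of total cluster memory.
        self.queue_limit = {q: int(total_mem * s) for q, s in queue_share.items()}
        self.queue_used = {q: 0 for q in queue_share}

    def allocate(self, queue, memory_mb, vcores):
        """Return a Container, or None if the queue or the cluster cannot fit it."""
        if self.queue_used[queue] + memory_mb > self.queue_limit[queue]:
            return None
        # First-fit placement: take the first node with enough free resources.
        for node, (mem, cores) in self.free.items():
            if mem >= memory_mb and cores >= vcores:
                self.free[node][0] -= memory_mb
                self.free[node][1] -= vcores
                self.queue_used[queue] += memory_mb
                return Container(node, memory_mb, vcores)
        return None
```

Note that `allocate` never inspects what the application will run in the Container and never retries anything; in this sketch, as in the description above, failure handling is left entirely to the caller.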

 

(2) Applications Manager

The applications manager is responsible for managing all applications in the system. This includes accepting application submissions, negotiating with the scheduler for the resources needed to start each ApplicationMaster, monitoring the running status of each ApplicationMaster, and restarting it if it fails.

 

2. ApplicationMaster (AM)

Each application submitted by a user includes one AM, whose main functions are to:

negotiate with the RM scheduler to obtain resources (represented as Containers);

further assign the obtained resources to its internal tasks;

communicate with NMs to start and stop tasks;

monitor the running status of all tasks, and re-request resources to restart a task when it fails.

YARN currently ships with two AM implementations. One is distributedshell, a sample program that demonstrates how to write an AM; it can request a number of Containers and run a shell command or shell script in them in parallel. The other is MRAppMaster, the AM that runs MapReduce applications, which we will cover in Chapter 8. In addition, AMs for some other computing frameworks, such as Open MPI and Spark, are under development.
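The AM's four responsibilities (obtain resources, assign them to tasks, ask an NM to launch, monitor and retry) can be illustrated with a small Python sketch. It is a toy model under stated assumptions, not the YARN API: `ToyApplicationMaster`, the `request_container` and `launch` callables, and the `max_attempts` limit are hypothetical names standing in for the RM scheduler, an NM launch call, and a retry policy.

```python
class ToyApplicationMaster:
    """Hypothetical AM: gets a container per task and retries failed tasks."""

    def __init__(self, request_container, launch, max_attempts=3):
        self.request_container = request_container  # stands in for the RM scheduler
        self.launch = launch                          # stands in for an NM launch call
        self.max_attempts = max_attempts              # assumed retry policy

    def run(self, tasks):
        """Run every task; return {task: number_of_attempts_used}."""
        attempts = {}
        for task in tasks:
            for attempt in range(1, self.max_attempts + 1):
                # Re-request resources on every attempt, including retries.
                container = self.request_container(task)
                attempts[task] = attempt
                if self.launch(container, task):
                    break  # task succeeded, move on to the next one
            else:
                raise RuntimeError(f"task {task} failed {self.max_attempts} times")
        return attempts
```

In this sketch the retry loop lives in the AM rather than in the scheduler, which mirrors the division of labor described above.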

3. NodeManager (NM)

The NM is the resource and task manager on each node. On the one hand, it periodically reports the node's resource usage and the running status of each Container to the RM; on the other hand, it receives and handles requests from AMs, such as Container start and stop requests.
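The NM's two roles can be modeled with a short Python sketch: handling start and stop requests, and building the periodic report for the RM. Everything here is an illustrative assumption, including the class name `ToyNodeManager`, the report's dictionary layout, and the state labels; it does not reproduce the real heartbeat protocol.

```python
class ToyNodeManager:
    """Hypothetical NM: tracks containers on one node and reports to the RM."""

    def __init__(self, name, memory_mb, vcores):
        self.name = name
        self.capacity = (memory_mb, vcores)
        self.containers = {}  # container_id -> (memory_mb, vcores, state)

    def start_container(self, container_id, memory_mb, vcores):
        """Handle a start request coming from an AM."""
        self.containers[container_id] = (memory_mb, vcores, "RUNNING")

    def stop_container(self, container_id):
        """Handle a stop request coming from an AM."""
        mem, cores, _ = self.containers[container_id]
        self.containers[container_id] = (mem, cores, "COMPLETE")

    def heartbeat(self):
        """Build the periodic report: free resources plus per-container state."""
        running = [c for c in self.containers.values() if c[2] == "RUNNING"]
        return {
            "node": self.name,
            "free_memory_mb": self.capacity[0] - sum(c[0] for c in running),
            "free_vcores": self.capacity[1] - sum(c[1] for c in running),
            "container_states": {cid: c[2] for cid, c in self.containers.items()},
        }
```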

4. Container

A Container is the resource abstraction in YARN. It encapsulates multi-dimensional resources on a node, such as memory, CPU, disk, and network. When an AM requests resources from the RM, the resources the RM returns are represented as Containers. YARN assigns a Container to each task, and the task can use only the resources described in that Container.

Note that a Container differs from a slot in MRv1: it is a dynamic unit of resource division, generated on demand according to the application's needs. As of the completion of this book, YARN supports only two resource types, CPU and memory, and uses the lightweight Cgroups mechanism for resource isolation.
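The difference between a fixed slot and a dynamically sized Container can be made concrete with a small calculation. The sketch below is a simplification: it treats an MRv1 slot as a fixed share of node memory and considers memory only, and the function names and numbers are made up for illustration.

```python
def tasks_with_slots(node_memory_mb, slot_mb, task_demands_mb):
    """Slot model: every task occupies one whole fixed-size slot."""
    slots = node_memory_mb // slot_mb
    # A task larger than a slot cannot run at all in this simplified model.
    fitting = [d for d in task_demands_mb if d <= slot_mb]
    return min(slots, len(fitting))


def tasks_with_containers(node_memory_mb, task_demands_mb):
    """Container model: each task gets a container sized to its own demand."""
    free, started = node_memory_mb, 0
    for demand in task_demands_mb:
        if demand <= free:
            free -= demand
            started += 1
    return started


# Eight small tasks and one large task on a node with 8192 MB of memory.
demands = [512] * 8 + [3072]
print(tasks_with_slots(8192, 2048, demands))   # 4 slots, so 4 tasks run
print(tasks_with_containers(8192, demands))    # all 9 tasks fit
```

With fixed 2048 MB slots, small tasks waste most of each slot and the large task does not fit anywhere; with demand-sized containers the same node runs all nine tasks.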

 

1.3 YARN Workflow

When a user submits an application to YARN, YARN will run the application in two stages:

The first stage is to start ApplicationMaster;

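The second stage is for the ApplicationMaster to create the application, request resources for it, and monitor its entire run until it completes.

As shown in Figure 2-11, the YARN workflow consists of the following steps: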

     

 

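Step 1 The user submits an application to YARN, including the ApplicationMaster program, the command to start the ApplicationMaster, the user program, and so on.

Step 2 The ResourceManager allocates the first Container for the application and communicates with the corresponding NodeManager, asking it to start the application's ApplicationMaster in that Container.

Step 3 The ApplicationMaster first registers with the ResourceManager, so that the user can view the application's running status directly through the ResourceManager. It then requests resources for each task and monitors the tasks until they finish, that is, it repeats steps 4 to 7.

Step 4 The ApplicationMaster requests and receives resources from the ResourceManager by polling over an RPC protocol.

Step 5 Once the ApplicationMaster has obtained resources, it communicates with the corresponding NodeManager and asks it to start the task.

Step 6 After setting up the task's runtime environment (environment variables, JAR files, binaries, and so on), the NodeManager writes the task launch command into a script and starts the task by running that script.

Step 7 Each task reports its status and progress to the ApplicationMaster over an RPC protocol, so that the ApplicationMaster always knows the state of every task and can restart a task when it fails.

     While the application is running, the user can query the ApplicationMaster over RPC at any time for the application's current status.

Step 8 After the application finishes, the ApplicationMaster unregisters from the ResourceManager and shuts itself down.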
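The eight steps above can be traced with a short Python simulation. It is only a sketch of the ordering of events: the function `run_application`, the `fail_once` parameter, and the wording of the trace lines are assumptions for illustration, and no real RPC or resource accounting takes place.

```python
def run_application(tasks, fail_once=()):
    """Return the ordered event trace for one toy application run.

    tasks:     task names to execute
    fail_once: task names whose first attempt fails (hypothetical fault injection)
    """
    trace = [
        "1 client submits application to RM",
        "2 RM allocates first container; NM starts AM",
        "3 AM registers with RM",
    ]
    pending_failures = set(fail_once)
    queue = list(tasks)
    while queue:  # steps 4 to 7 repeat until every task has succeeded
        task = queue.pop(0)
        trace.append(f"4 AM requests container for {task}")
        trace.append(f"5 AM asks NM to start {task}")
        trace.append(f"6 NM prepares environment and launches {task}")
        if task in pending_failures:
            pending_failures.discard(task)
            trace.append(f"7 {task} reports FAILED; AM will retry")
            queue.append(task)  # the AM re-requests resources for this task later
        else:
            trace.append(f"7 {task} reports SUCCEEDED")
    trace.append("8 AM unregisters from RM and exits")
    return trace


for line in run_application(["map-0", "map-1"], fail_once={"map-0"}):
    print(line)
```

Running it shows steps 1 to 3 once, steps 4 to 7 once per task attempt (three attempts here, because `map-0` fails the first time), and step 8 once at the end.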

 

1.4 Understanding YARN from Multiple Perspectives

YARN can be regarded as a cloud operating system. It is responsible for starting the ApplicationMaster (equivalent to the main thread) for an application; the ApplicationMaster is then responsible for data splitting, task assignment, task startup, monitoring, and so on, while each task started by the ApplicationMaster (equivalent to a child thread) is responsible only for its own computation. When all tasks have finished, the ApplicationMaster considers the application complete and exits.

   
