Hadoop Big Data Development Fundamentals Series, Part 6: Understanding YARN

Learn YARN

 

First, what is YARN?

Apache Hadoop YARN (Yet Another Resource Negotiator) is Hadoop's new resource manager. It is a general-purpose resource management system that provides unified resource management and scheduling for the applications running on top of it. Its introduction brought clusters major benefits in resource utilization, unified management, and data sharing.

 

Second, background

1. MRv1 architecture:

2. MRv1 (Hadoop 1.0) has several limitations:

(1) Poor scalability: in MRv1, the JobTracker handles both resource management and job control, which limits the scalability of a Hadoop cluster.

(2) Poor reliability: MRv1 uses a master/slave architecture in which the master is a single point of failure. Once it fails, the entire cluster becomes unavailable.

(3) Low resource utilization: MRv1 allocates resources in slots, a unit that is too coarse and often leaves a node's resources either over-used or under-used. Hadoop also divides slots into Map slots and Reduce slots and does not allow them to be shared, so one kind of slot is often scarce while the other sits idle (for example, right after a job is submitted only Map tasks run, so the Reduce slots are idle).

(4) No effective resource isolation: MRv1 relies only on JVM-based isolation, which is too coarse. Many resources, such as CPU, are not isolated at all, so tasks on the same node can interfere with each other severely.

(5) No support for multiple computing frameworks: MRv1 cannot effectively support other frameworks such as Storm and Spark.
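Limitation (3) above can be made concrete with a toy comparison. This is only an illustrative sketch: the function names and the slot counts are hypothetical and are not taken from Hadoop itself.

```python
def runnable_with_slots(map_slots, reduce_slots, pending_maps, pending_reduces):
    """MRv1-style: map tasks may only use map slots, reduce tasks only reduce slots."""
    return min(map_slots, pending_maps) + min(reduce_slots, pending_reduces)


def runnable_with_shared_pool(total_slots, pending_maps, pending_reduces):
    """Shared pool: any free unit can run any pending task."""
    return min(total_slots, pending_maps + pending_reduces)


# A freshly submitted job: 10 map tasks pending, no reduce tasks yet.
fixed = runnable_with_slots(map_slots=4, reduce_slots=4, pending_maps=10, pending_reduces=0)
shared = runnable_with_shared_pool(total_slots=8, pending_maps=10, pending_reduces=0)
print(fixed, shared)  # 4 8
```

With separate slot types, half of the node's capacity sits idle while map tasks are still waiting.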

3. YARN improvements

(1) The "one computing framework per cluster" model is replaced by a model in which all computing frameworks share the same resources. This improves resource utilization; a shared cluster can be managed in a unified way by a small number of administrators; and no resource scheduling across clusters is needed, which saves cost.

(2) In Hadoop 1.0 the JobTracker performs two main functions: resource management and job control. In very large clusters the JobTracker is overloaded, so YARN's design separates resource management from job control. The JobTracker is replaced by three components: ResourceManager, ApplicationMaster, and NodeManager.

 

Third, the main advantages

1. It greatly reduces the resource consumption of the JobTracker (now the ResourceManager), and the programs that monitor the state of each job's subtasks (tasks) are now distributed, which is safer and more elegant.

2. In the new YARN, the ApplicationMaster is a replaceable component. Users can write their own ApplicationMaster for different programming models, so more kinds of programming models can run on a Hadoop cluster.

3. Resources are expressed in units of memory, which is more reasonable than the earlier count of remaining slots.

4. The ApplicationMaster takes over the task-monitoring work that the JobTracker used to do. The Applications Manager (ASM) inside the ResourceManager monitors the health of each ApplicationMaster and restarts it if a problem occurs.

5. YARN uses the Container as its resource isolation component, which can be used to schedule and control resources. A Container is the unit of resource allocation; it encapsulates resources of multiple dimensions on a node.

 

Fourth, the basic architecture of YARN

1. ResourceManager (RM):

(1) Responsible for resource management and allocation across the whole system. It consists of two parts: the Scheduler and the Applications Manager (ASM).

(2) The Scheduler is a pure scheduler: it is only responsible for allocating resources.

(3) ASM: ① accepts submitted applications and requests resources to start their ApplicationMaster; ② starts the AM; ③ monitors the running AM and restarts it when it fails.

2. NodeManager (NM):

(1) The resource and task manager on each node.

(2) Specific functions: ① periodically reports the node's resource usage and the running state of each Container to the RM; ② receives and handles requests from the AM to start and stop Containers (the various tasks).

3. ApplicationMaster (AM):

(1) There is one AM per application.

(2) Specific functions: ① requests resources from the RM; ② communicates with the NMs to start and stop tasks; ③ monitors the running state of all tasks, and if a task fails, re-applies for resources and restarts the task.

4. Container:

The Container is YARN's abstraction of resources and its unit of resource allocation. It encapsulates multi-dimensional resources on a node, such as memory and CPU.
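To illustrate the idea of a Container as a multi-dimensional bundle of resources, here is a minimal sketch. The class and field names (`Resource`, `memory_mb`, `vcores`, `fits_in`) are hypothetical stand-ins, not YARN's real Java API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    memory_mb: int
    vcores: int

    def fits_in(self, other: "Resource") -> bool:
        """True if this request fits within `other` on every dimension."""
        return self.memory_mb <= other.memory_mb and self.vcores <= other.vcores


@dataclass(frozen=True)
class Container:
    container_id: str
    node: str
    resource: Resource


node_free = Resource(memory_mb=8192, vcores=4)
request = Resource(memory_mb=2048, vcores=1)
too_big = Resource(memory_mb=4096, vcores=8)

# A container is only granted when the request fits on every dimension.
container = Container("container_01", "node-1", request) if request.fits_in(node_free) else None
print(container)
print(too_big.fits_in(node_free))  # False: not enough vcores
```

Checking every dimension at once is what distinguishes this from a slot, which treats a node as a fixed number of identical units.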

 

Fifth, the YARN job scheduling process

1. The client sends an application submission request to the RM (including the ApplicationMaster program, the command to start the AM, the user program, and so on).

2. The RM allocates a Container for the application and communicates with the corresponding NM, asking it to start the application's AM in that Container.

3. After starting, the AM registers with the RM, so the user can view the application's running status directly through the RM.

4. The AM requests and receives resources from the RM by polling.

5. Once the AM has obtained resources, it communicates with the corresponding NM and asks it to start the task.

6. The NM sets up the environment the task needs (environment variables, JAR packages, and so on), writes the task's start command into a script, and starts the task by running that script.

7. Each task reports its own status to the AM over an RPC protocol, so the AM can keep track of every task and restart a task if it fails.

8. When the application finishes, the AM unregisters from the RM and shuts itself down.
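The eight steps above can be traced with a small in-memory simulation. It is a simplified sketch under stated assumptions: the classes, method names, and round-robin node selection are all hypothetical and only mirror the order of the steps, not YARN's real protocols.

```python
class NodeManager:
    def __init__(self, name, log):
        self.name, self.log = name, log

    def launch(self, what):
        # Step 6: the NM prepares the environment and starts the process.
        self.log.append(f"{self.name} launches {what}")


class ResourceManager:
    def __init__(self, nodes, log):
        self.nodes, self.log = nodes, log
        self.registered = set()

    def submit(self, app_id):
        # Steps 1-2: accept the application, pick a node, start the AM there.
        self.log.append(f"RM accepts {app_id}")
        self.nodes[0].launch(f"AM for {app_id}")

    def register(self, app_id):
        # Step 3: the AM registers with the RM.
        self.registered.add(app_id)
        self.log.append(f"AM for {app_id} registers")

    def allocate(self, count):
        # Step 4: hand out nodes round-robin as stand-ins for containers.
        return [self.nodes[i % len(self.nodes)] for i in range(count)]

    def unregister(self, app_id):
        # Step 8: the AM unregisters when the application is done.
        self.registered.discard(app_id)
        self.log.append(f"AM for {app_id} unregisters")


def run_application(app_id, tasks, node_names):
    log = []
    nodes = [NodeManager(n, log) for n in node_names]
    rm = ResourceManager(nodes, log)
    rm.submit(app_id)
    rm.register(app_id)
    for task, node in zip(tasks, rm.allocate(len(tasks))):
        node.launch(task)                        # step 5
        log.append(f"{task} reports SUCCEEDED")  # step 7
    rm.unregister(app_id)
    return log


events = run_application("app_1", ["task_a", "task_b"], ["nm-1", "nm-2"])
print("\n".join(events))
```

The resulting log shows the AM being launched by an NM before it registers, and tasks being launched only after the AM has obtained resources from the RM.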

 

Sixth, YARN fault tolerance

1. RM single point of failure

The ResourceManager has a standby node. When the active node fails, the standby node takes over and continues the work.

2. NM

(1) After a failure, the RM reports the failed tasks to the corresponding AM;

(2) The AM decides how to handle the failed tasks.

3. AM

(1) After an AM fails, the RM is responsible for restarting it.

(2) The AM itself has to handle fault tolerance for its internal tasks.

(3) Tasks that have already run are recorded, so they do not need to be re-run after the restart.

 

Seventh, the YARN scheduler

The resource scheduler is one of YARN's most crucial components, and it is pluggable: YARN defines a set of interface specifications so that users can implement their own scheduler as needed. YARN ships with three commonly used resource schedulers: FIFO, Capacity Scheduler (the default; within a queue, tasks are executed in FIFO order by default), and Fair Scheduler. Users can also write a new resource scheduler that follows the interface specification and get it running with simple configuration.
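The plug-in idea can be sketched as an abstract interface with interchangeable implementations. The interface below (`Scheduler`, `add_application`, `next_application`) and the `SCHEDULERS` registry are hypothetical simplifications; in a real cluster the implementation is a Java class selected through the `yarn.resourcemanager.scheduler.class` property.

```python
from abc import ABC, abstractmethod
from collections import deque


class Scheduler(ABC):
    """Hypothetical plug-in interface: any class with these methods can be swapped in."""

    @abstractmethod
    def add_application(self, app_id: str, demand_mb: int) -> None: ...

    @abstractmethod
    def next_application(self, free_mb: int): ...


class FifoScheduler(Scheduler):
    def __init__(self):
        self._queue = deque()

    def add_application(self, app_id, demand_mb):
        self._queue.append((app_id, demand_mb))

    def next_application(self, free_mb):
        # Strict FIFO: only the head of the queue is considered.
        if self._queue and self._queue[0][1] <= free_mb:
            return self._queue.popleft()[0]
        return None


SCHEDULERS = {"fifo": FifoScheduler}  # a config value selects the implementation

scheduler = SCHEDULERS["fifo"]()
scheduler.add_application("app_1", 4096)
scheduler.add_application("app_2", 1024)
first = scheduler.next_application(free_mb=2048)   # app_1 is at the head and does not fit
second = scheduler.next_application(free_mb=4096)
print(first, second)  # None app_1
```

Note that the small `app_2` is not scheduled while `app_1` is waiting, which is the head-of-line blocking that motivates the Capacity and Fair schedulers.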

1. Resource Scheduling Model:

(1) YARN uses a two-level resource scheduling model. At the first level, the ResourceManager's resource scheduler allocates resources to the individual ApplicationMasters; at the second level, each ApplicationMaster further allocates those resources to its own tasks. The resource scheduler discussed here is concerned mainly with the first level. The second-level scheduling strategy is left entirely to the user application.

(2) YARN uses a pull-based communication model: after the resource scheduler allocates resources to an application, it does not immediately push them to the corresponding ApplicationMaster. Instead it places them temporarily in a buffer and waits for the ApplicationMaster to fetch them through its periodic heartbeat.
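The pull-based model in (2) can be sketched in a few lines. The class and method names here are hypothetical; the point is only that allocation writes to a buffer and the heartbeat drains it.

```python
from collections import defaultdict


class PullScheduler:
    def __init__(self):
        self._buffer = defaultdict(list)  # app_id -> containers awaiting pickup

    def allocate(self, app_id, container_id):
        # Allocation only records the grant; nothing is pushed to the AM.
        self._buffer[app_id].append(container_id)

    def heartbeat(self, app_id):
        # The AM's periodic heartbeat pulls everything granted since the last one.
        return self._buffer.pop(app_id, [])


rm = PullScheduler()
rm.allocate("app_1", "c1")
rm.allocate("app_1", "c2")
first = rm.heartbeat("app_1")
second = rm.heartbeat("app_1")
print(first, second)  # ['c1', 'c2'] []
```

A pull model keeps the scheduler simple: it never has to track whether an ApplicationMaster is reachable at the moment of allocation.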

2. Resource scheduling process

3. Resource assurance mechanism

(1) Incremental resource allocation: when the resources an application requests cannot be satisfied for the moment, the scheduler reserves resources on a node for that application and accumulates the resources released there until they are enough to meet the request. Reserving resources in this way wastes some of them and lowers cluster resource utilization.

(2) YARN uses the incremental resource allocation mechanism. Although it wastes some resources, it does not cause starvation (assuming that no application holds a resource forever and that every application releases its resources after some time).
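A minimal sketch of the reservation behaviour described above, assuming a single node and memory as the only resource. The `Node` class and its methods are hypothetical illustrations, not YARN code.

```python
class Node:
    def __init__(self, free_mb):
        self.free_mb = free_mb
        self.reserved_for = None  # app that has reserved this node, if any

    def try_allocate(self, app_id, need_mb):
        """Allocate if possible; otherwise reserve the node for app_id."""
        if self.reserved_for not in (None, app_id):
            return False                  # someone else holds the reservation
        if self.free_mb >= need_mb:
            self.free_mb -= need_mb
            self.reserved_for = None
            return True
        self.reserved_for = app_id        # freed resources now accumulate for app_id
        return False

    def release(self, mb):
        self.free_mb += mb


node = Node(free_mb=1024)
results = [node.try_allocate("big_app", 4096)]        # not enough: node is reserved
results.append(node.try_allocate("small_app", 512))   # blocked by the reservation (the "waste")
node.release(2048)
results.append(node.try_allocate("big_app", 4096))    # 3072 MB free: still waiting
node.release(1024)
results.append(node.try_allocate("big_app", 4096))    # 4096 MB free: allocated
print(results)  # [False, False, False, True]
```

The second call shows the cost of reserving (a small request is refused even though memory is free), and the last call shows why the large request cannot starve as long as resources keep being released.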

4. Pre-emption mechanism

(1) In the resource scheduler, each queue can be configured with a minimum and a maximum amount of resources. The minimum is the amount the queue is guaranteed when resources are scarce; the maximum is the amount the queue may not exceed even in extreme cases.

(2) Preemption arises entirely from the concept of the "minimum amount of resources". In general, to improve resource utilization, the resource scheduler (both Capacity Scheduler and Fair Scheduler) temporarily lends the resources of lightly loaded queues to heavily loaded queues. In other words, the minimum is not a hard guarantee: when a queue needs no resources, the scheduler does not hold its minimum for it but temporarily allocates the idle resources to other queues that need them. Only when the lightly loaded queue suddenly receives newly submitted applications does the scheduler give the queue's own resources back to it. Because those resources may be in use by another queue at that moment, the scheduler has to wait for the other queue to release them before it can return them, which usually takes an uncertain amount of time. To keep applications from waiting too long, the scheduler preempts the resources if it finds they have still not been released after waiting for a while.
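The wait-then-preempt rule can be sketched as a small planning function. This is an assumption-laden illustration: the function name, the queue dictionary layout, and the timeout values are hypothetical, and the real schedulers use more elaborate policies.

```python
def plan_preemption(queues, waited_s, timeout_s):
    """Return {queue: mb_to_reclaim} for queues holding more than their minimum.

    queues: {name: {"min": mb, "used": mb, "pending": mb}}
    """
    if waited_s < timeout_s:
        return {}  # give borrowers a chance to release resources on their own

    # How much the starved queues are still owed, up to their guaranteed minimum.
    owed = sum(
        min(q["pending"], max(q["min"] - q["used"], 0)) for q in queues.values()
    )
    plan = {}
    for name, q in queues.items():
        if owed <= 0:
            break
        surplus = q["used"] - q["min"]  # resources borrowed beyond the minimum
        if surplus > 0:
            take = min(surplus, owed)
            plan[name] = take
            owed -= take
    return plan


queues = {
    "batch": {"min": 4096, "used": 7168, "pending": 0},     # borrowed 3072 MB
    "adhoc": {"min": 4096, "used": 1024, "pending": 2048},  # new apps just arrived
}
early = plan_preemption(queues, waited_s=5, timeout_s=30)
late = plan_preemption(queues, waited_s=40, timeout_s=30)
print(early, late)  # {} {'batch': 2048}
```

Only the borrowed surplus is ever reclaimed, so a queue is never pushed below its own minimum.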

5. Capacity Scheduler

Resources are divided by queue. Each queue can be configured with a guaranteed minimum share and an upper limit on its resource usage, and each user can also be given a usage limit to prevent resource abuse. When a queue has spare resources, they can be shared temporarily with other queues.

(1) Features: capacity guarantees, flexibility, multi-tenancy (multiple users and multiple applications can share one cluster), security, and dynamic resource updates.

(2) Sub-queues:

① Queues can be nested; each queue can contain sub-queues.

② Users can submit applications only to the lowest-level queues (leaf queues).

③ Each queue has a minimum and a maximum capacity.

④ The scheduler always chooses the queue with the lowest current resource usage and allocates resources to it.
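The sub-queue rules above can be sketched with a small queue tree. The `Queue` class, the queue names, and the capacities are hypothetical; the sketch only demonstrates nesting, leaf-only submission, and picking the least-used queue at each level.

```python
class Queue:
    def __init__(self, name, capacity_mb, children=()):
        self.name = name
        self.capacity_mb = capacity_mb
        self.children = list(children)
        self._own_used_mb = 0

    @property
    def is_leaf(self):
        return not self.children

    @property
    def used_mb(self):
        # A parent's usage is the sum of its children's usage.
        if self.is_leaf:
            return self._own_used_mb
        return sum(child.used_mb for child in self.children)

    def usage_ratio(self):
        return self.used_mb / self.capacity_mb

    def submit(self, app_mb):
        # Rule 2: applications may only be submitted to leaf queues.
        if not self.is_leaf:
            raise ValueError(f"{self.name} is not a leaf queue")
        self._own_used_mb += app_mb


def pick_leaf(queue):
    """Rule 4: walk down from the root, always taking the least-used child."""
    while not queue.is_leaf:
        queue = min(queue.children, key=Queue.usage_ratio)
    return queue


etl = Queue("etl", 4096)
reports = Queue("reports", 2048)
prod = Queue("prod", 6144, [etl, reports])
dev = Queue("dev", 2048)
root = Queue("root", 8192, [prod, dev])

etl.submit(2048)      # etl is 50% used
reports.submit(512)   # reports is 25% used
dev.submit(1536)      # dev is 75% used

chosen = pick_leaf(root)
print(chosen.name)  # reports
```

Here `prod` (about 42% used) wins over `dev` at the top level, and `reports` wins over `etl` one level down.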


Origin blog.csdn.net/weixin_45678149/article/details/104943364