Hadoop's distributed resource scheduling framework: YARN

1. Introduction to YARN

Apache YARN (Yet Another Resource Negotiator) is a cluster resource management system introduced in Hadoop 2.0. Its core idea is to separate resource management from job scheduling and monitoring. Users can deploy a variety of computing frameworks on YARN, and YARN manages and allocates the cluster's resources for all of them uniformly.


2. YARN basic architecture

2.1. Master/slave architecture

  • Master: ResourceManager
    Responsible for global, unified resource management, scheduling, and allocation across the cluster.
  • Slave: NodeManager
    Responsible for managing the resources of a single node and reporting their usage.

2.2. Core components

[Figure: YARN core components]

2.2.1. ResourceManager

  • ResourceManager components:
    Application Manager
    The application manager handles job submissions from clients, allocates the first Container for each application to run its ApplicationMaster, monitors the ApplicationMaster, and restarts the ApplicationMaster's Container when it fails.
    ResourceScheduler
    The resource scheduler allocates resources to applications according to each node's resource availability and the queue configuration; it is only responsible
    for allocating Containers (resources) to the applications users submit.

  • ResourceManager functions:
    1. Processing client requests (see the client-side sketch below)
    2. Monitoring NodeManagers via their heartbeats and sending them instructions
    3. Starting and monitoring ApplicationMasters
    4. Allocating and scheduling the resources of the entire system
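
To make function 1 concrete, here is a minimal client-side sketch, assuming only the public YarnClient API: it asks the ResourceManager for a new application id, which is the first thing any job submission does. The class name is made up for illustration.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.ApplicationId;
    import org.apache.hadoop.yarn.client.api.YarnClient;
    import org.apache.hadoop.yarn.client.api.YarnClientApplication;

    public class YarnClientSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            YarnClient yarnClient = YarnClient.createYarnClient();
            yarnClient.init(conf);
            yarnClient.start();

            // "Processing client requests": ask the ResourceManager
            // for a new application id.
            YarnClientApplication app = yarnClient.createApplication();
            ApplicationId appId = app.getNewApplicationResponse().getApplicationId();
            System.out.println("Got application id: " + appId);

            yarnClient.stop();
        }
    }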


2.2.2. NodeManager

  • NodeManager functions:
    1. Managing the resources and tasks on its own node
    2. Periodically reporting the node's resource usage and the state of its Containers to the ResourceManager
    3. Processing instructions from the ResourceManager, such as starting and stopping Containers
    4. Processing instructions from the ApplicationMaster, such as starting MapTasks and ReduceTasks

2.2.3. ApplicationMaster

  • ApplicationMaster functions:
    1. One ApplicationMaster manages one application.
    2. It is responsible for splitting the input data; for example, a MapReduce
    job splits its data according to the InputSplit information.
    3. It applies to the ResourceScheduler for resources (Containers) for the application and assigns tasks (MapTask/ReduceTask) to them internally (see the sketch after this list).
    4. It communicates with NodeManagers to start and stop tasks; the tasks all run inside Containers.
    5. It monitors the tasks running in its Containers and handles their fault tolerance.
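
A minimal sketch of functions 3 and 4 from the ApplicationMaster's side, assuming the public AMRMClient API; the host, port, and tracking URL passed to registerApplicationMaster are placeholder values, and in reality this code only works inside a running ApplicationMaster container:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.Priority;
    import org.apache.hadoop.yarn.api.records.Resource;
    import org.apache.hadoop.yarn.client.api.AMRMClient;
    import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

    public class AppMasterSketch {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
            rmClient.init(conf);
            rmClient.start();

            // Register with the ResourceManager (placeholder host/port/URL).
            rmClient.registerApplicationMaster("localhost", 0, "");

            // Ask the ResourceScheduler for one Container with
            // 1 GB of memory and 1 virtual core.
            Resource capability = Resource.newInstance(1024, 1);
            ContainerRequest request =
                    new ContainerRequest(capability, null, null, Priority.newInstance(0));
            rmClient.addContainerRequest(request);

            // After allocate() grants a Container, launching the task on it
            // would go through the NMClient API; unregister when done.
        }
    }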

2.2.4. Container

A node can run multiple Containers, but note that a single Container cannot span machines.

  • Container functions:
    1. Container is YARN's resource abstraction; it encapsulates a node's multi-dimensional resources, such as memory, CPU, disk, and network (see the sketch below).
    2. A Container is similar to a small virtual machine: tasks execute inside it.
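
A small sketch of the resource abstraction, assuming the public Resource API; the numbers are arbitrary example values:

    import org.apache.hadoop.yarn.api.records.Resource;

    public class ContainerResourceSketch {
        public static void main(String[] args) {
            // One Container encapsulating 2 GB of memory and 2 virtual cores.
            Resource containerResource = Resource.newInstance(2048, 2);
            System.out.println("memory(MB)=" + containerResource.getMemory()
                    + ", vcores=" + containerResource.getVirtualCores());
        }
    }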

3. YARN working mechanism

[Figure: YARN working mechanism]

3.1. Job submission

(1) The client calls the job.waitForCompletion method to submit a MapReduce job to the cluster (a driver sketch follows below).

(2) The client applies to the ResourceManager for a job ID.

(3) The ResourceManager returns the resource submission path (an HDFS path) and the job ID to the client. Each job has a unique ID.

(4) The client uploads the jar package, split information, and configuration files to the specified resource submission path.

(5) After uploading the resources, the client applies to the ResourceManager to run MrAppMaster (the ApplicationMaster for the job).
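
As an illustration of step (1), a minimal word-count driver that triggers this whole submission flow through job.waitForCompletion might look like the following sketch; the class names and the use of args[0]/args[1] as input and output paths are illustrative:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCountDriver {

        // Minimal mapper: emit (word, 1) for every word in a line.
        public static class WcMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
            private static final IntWritable ONE = new IntWritable(1);
            private final Text word = new Text();
            @Override
            protected void map(LongWritable key, Text value, Context ctx)
                    throws IOException, InterruptedException {
                for (String token : value.toString().split("\\s+")) {
                    if (!token.isEmpty()) {
                        word.set(token);
                        ctx.write(word, ONE);
                    }
                }
            }
        }

        // Minimal reducer: sum the counts for each word.
        public static class WcReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
            @Override
            protected void reduce(Text key, Iterable<IntWritable> values, Context ctx)
                    throws IOException, InterruptedException {
                int sum = 0;
                for (IntWritable v : values) {
                    sum += v.get();
                }
                ctx.write(key, new IntWritable(sum));
            }
        }

        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            Job job = Job.getInstance(conf, "word count");
            job.setJarByClass(WordCountDriver.class);
            job.setMapperClass(WcMapper.class);
            job.setReducerClass(WcReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
            // Step (1): submit the job and wait; this drives steps (2)-(5).
            System.exit(job.waitForCompletion(true) ? 0 : 1);
        }
    }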


3.2. Job initialization

(6) When the ResourceManager receives the client's request, it adds the job to the ResourceScheduler.

(7) An idle NodeManager receives the job.

(8) That NodeManager creates a Container and starts MrAppMaster in it.

(9) MrAppMaster downloads the resources the client submitted to the local node and plans MapTasks and ReduceTasks according to the split information.


3.3. Task allocation

(10) MrAppMaster applies to the ResourceManager for resources to run multiple MapTasks.

(11) The ResourceManager assigns the MapTasks to several idle NodeManagers, and each of those NodeManagers accepts its task and creates a Container for it.


3.4. Task execution

(12) MrAppMaster sends the program startup script to the NodeManagers that received tasks; each of them starts its MapTask, and each MapTask processes its data and sorts it by partition.

(13) After all MapTasks have finished, MrAppMaster applies to the ResourceManager for Containers to run the ReduceTasks.

(14) After the program finishes, MrAppMaster applies to the ResourceManager to unregister itself.

(15) Progress and status updates

Tasks in YARN return their progress and status (including counters) to the ApplicationMaster, and the client requests progress updates from the ApplicationMaster every second (set by mapreduce.client.progressmonitor.pollinterval) and displays them to the user. The YARN web UI can also be used to watch task execution.


3.5. Job completion

In addition to polling the ApplicationMaster for progress, the client checks whether the job is complete by calling waitForCompletion() every 5 seconds; the interval can be set by mapreduce.client.completion.pollinterval. After the job completes, the ApplicationMaster and the Containers clean up their working state, and the job information is stored by the job history server for later inspection by the user. Both poll intervals can be tuned in the job configuration, as in the sketch below.
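
A small sketch of tuning both poll intervals through the job configuration, using the property names given above; the values (in milliseconds) are arbitrary examples:

    import org.apache.hadoop.conf.Configuration;

    public class PollIntervalSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Poll the ApplicationMaster for progress every 2 seconds
            // instead of the 1-second default.
            conf.setLong("mapreduce.client.progressmonitor.pollinterval", 2000);
            // Check for job completion every 10 seconds instead of
            // the 5-second default.
            conf.setLong("mapreduce.client.completion.pollinterval", 10000);
            System.out.println(conf.get("mapreduce.client.progressmonitor.pollinterval"));
        }
    }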


4. YARN resource schedulers

4.1. First-in-first-out scheduler (FIFO)

The FIFO Scheduler arranges applications into a queue in the order they are submitted: a first-in, first-out queue. When allocating resources, it first satisfies the application at the head of the queue, then moves on to the next application, and so on.
The FIFO Scheduler is the simplest scheduler and requires no configuration, but it is not suitable for shared clusters: a large application can occupy all cluster resources, blocking every other application. In a shared cluster the Capacity Scheduler or the Fair Scheduler is a better fit, since both allow large and small jobs to obtain a share of system resources while running side by side.


4.2. Capacity Scheduler

The Capacity Scheduler allows multiple organizations to share the whole cluster, with each organization receiving a portion of the cluster's computing capacity. Each organization is given a dedicated queue configured with a share of the cluster resources, so by setting up multiple queues the cluster can serve multiple organizations. A queue can in turn be subdivided, so that several members of one organization share the queue's resources. Within a queue, resource scheduling follows a first-in-first-out (FIFO) policy.
In short, it supports multiple queues, each queue can be configured with a certain amount of resources, and each queue schedules its jobs FIFO.
To prevent the jobs of a single user from monopolizing the resources in a queue, the scheduler limits the amount of resources that jobs submitted by the same user may occupy (the sketch below shows the main queue properties).
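
A sketch of the main queue properties, with hypothetical queue names dev and prod; in a real cluster these properties are set in capacity-scheduler.xml rather than in code:

    import org.apache.hadoop.conf.Configuration;

    public class CapacityQueueSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Two top-level queues under root (hypothetical names).
            conf.set("yarn.scheduler.capacity.root.queues", "dev,prod");
            // dev gets 40% of the cluster, prod gets 60%.
            conf.set("yarn.scheduler.capacity.root.dev.capacity", "40");
            conf.set("yarn.scheduler.capacity.root.prod.capacity", "60");
            // Limit how much of the dev queue a single user may occupy.
            conf.set("yarn.scheduler.capacity.root.dev.user-limit-factor", "1");
        }
    }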


4.3. Fair Scheduler

The Fair Scheduler supports multiple queues and multiple users; the amount of resources in each queue is configurable, and jobs in the same queue share the queue's resources fairly.
For example, suppose there are three queues: queue1, queue2, and queue3. Jobs within each queue are allocated resources according to priority: the higher the priority, the more resources a job is allocated, but every job receives some resources so that fairness is preserved. When resources are limited, there is a gap between the resources a job would obtain under ideal conditions and the resources it actually obtains; this gap is called the deficit. Within the same queue, the job with the larger deficit acquires resources first, so jobs are executed one after another in order of their deficits. With the Fair Scheduler there is no need to reserve system resources in advance; the scheduler dynamically balances resources among all running jobs (the sketch below shows how the Fair Scheduler is selected).
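
A sketch of selecting the Fair Scheduler; in a real cluster this property is set in yarn-site.xml rather than in code, and the queue configuration itself lives in the fair-scheduler.xml allocation file:

    import org.apache.hadoop.conf.Configuration;

    public class FairSchedulerSketch {
        public static void main(String[] args) {
            Configuration conf = new Configuration();
            // Tell the ResourceManager to use the Fair Scheduler.
            conf.set("yarn.resourcemanager.scheduler.class",
                "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler");
            // Queue weights and placement rules would then be described
            // in the fair-scheduler.xml allocation file.
        }
    }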


5. Commonly used YARN commands

5.1. yarn command list

[Figure: yarn command list]


5.2. yarn application command

[Figure: yarn application command options]


5.3. View the list of running applications

Commonly used at work.

yarn application -list

5.4. Kill a running application

Commonly used in interviews and at work.

yarn application -kill <application-id>

5.5. View the node list

yarn node -list

5.6. Check node status

yarn node -status <node-id>

5.7. Print the classpath of the jars and libraries that yarn depends on

yarn classpath
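
The commands above also have programmatic equivalents in the YarnClient API; a sketch (the kill call is commented out because the application id must come from a real running application):

    import java.util.List;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.yarn.api.records.ApplicationReport;
    import org.apache.hadoop.yarn.api.records.NodeReport;
    import org.apache.hadoop.yarn.api.records.NodeState;
    import org.apache.hadoop.yarn.client.api.YarnClient;

    public class YarnCliEquivalents {
        public static void main(String[] args) throws Exception {
            YarnClient client = YarnClient.createYarnClient();
            client.init(new Configuration());
            client.start();

            // Equivalent of "yarn application -list".
            List<ApplicationReport> apps = client.getApplications();
            for (ApplicationReport app : apps) {
                System.out.println(app.getApplicationId() + " " + app.getYarnApplicationState());
            }

            // Equivalent of "yarn node -list" (running nodes only).
            List<NodeReport> nodes = client.getNodeReports(NodeState.RUNNING);
            for (NodeReport node : nodes) {
                System.out.println(node.getNodeId() + " " + node.getNodeState());
            }

            // Equivalent of "yarn application -kill <application-id>":
            // client.killApplication(appId);

            client.stop();
        }
    }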

Origin: blog.csdn.net/weixin_48482704/article/details/110961396