YARN Summary

1. Introduction to YARN
  • YARN is a resource management and job scheduling framework. It mainly consists of three components:

    • ResourceManager (RM): responsible for the monitoring, allocation, and management of all cluster resources

    • ApplicationMaster (AM): responsible for the scheduling and coordination of a single application (one AM per application)

    • NodeManager (NM): responsible for the maintenance of each individual node

  • The RM has ultimate control over resource allocation for all applications. Each AM negotiates resources with the RM and, at the same time, communicates with NodeManagers to launch and monitor tasks, as the sketch below illustrates.
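
To make this division of labor concrete, here is a minimal sketch (assuming a reachable cluster and the Hadoop client libraries on the classpath) that uses the YarnClient API to ask the RM for the list of applications; each report carries the host where that application's AM is running:

```java
import java.util.List;

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ListApplications {
    public static void main(String[] args) throws Exception {
        // The client talks only to the RM, the cluster-wide authority on resources.
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // One AM per application: each report includes the host its AM runs on.
        List<ApplicationReport> apps = yarnClient.getApplications();
        for (ApplicationReport app : apps) {
            System.out.println(app.getApplicationId() + " -> AM on " + app.getHost()
                    + " state=" + app.getYarnApplicationState());
        }
        yarnClient.stop();
    }
}
```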

2. The three major components of YARN
  • ResourceManager

  1. The ResourceManager is responsible for resource management and allocation across the entire cluster; it is a global resource management system.

  2. NodeManagers report their resource usage (currently mainly CPU and memory) to the ResourceManager via heartbeat. The RM only receives these resource reports; the actual handling of the resources is left to each NM itself.

  3. The YARN Scheduler allocates resources according to applications' requests; it is not responsible for monitoring or tracking application jobs, reporting their running status, or starting them.
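
The RM's global view described above can be observed from any client. A minimal sketch (again assuming a reachable cluster): the node reports returned by the RM are exactly the CPU/memory figures that the NMs deliver by heartbeat:

```java
import org.apache.hadoop.yarn.api.records.NodeReport;
import org.apache.hadoop.yarn.api.records.NodeState;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ClusterResources {
    public static void main(String[] args) throws Exception {
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(new YarnConfiguration());
        yarnClient.start();

        // Aggregate view the RM builds from NM heartbeats.
        System.out.println("NodeManagers: "
                + yarnClient.getYarnClusterMetrics().getNumNodeManagers());

        // Per-node capacity and usage (CPU and memory) as reported by each NM.
        for (NodeReport node : yarnClient.getNodeReports(NodeState.RUNNING)) {
            System.out.println(node.getNodeId()
                    + " used=" + node.getUsed()
                    + " capability=" + node.getCapability());
        }
        yarnClient.stop();
    }
}
```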

  • NodeManager

  1. The NodeManager is the resource and task manager on each node. It acts as the agent for its machine: it runs the node's programs and manages and monitors the node's resources. Every node of a YARN cluster runs one NodeManager.

  2. The NodeManager regularly reports the node's resource usage (CPU, memory) and the running status of its Containers to the ResourceManager via heartbeat. If the active ResourceManager goes down, the NodeManager automatically connects to the standby RM node.

  3. The NodeManager receives and processes requests from ApplicationMasters, such as starting and stopping Containers.
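
The resources an NM advertises in those heartbeats are configured per node, normally in yarn-site.xml. Here is a sketch of the equivalent settings through the Java Configuration API; the values are illustrative, not defaults you must use:

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class NodeManagerResources {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        // Memory (MB) and vcores this NM offers to the RM; in practice these
        // live in yarn-site.xml on each node. Values here are examples only.
        conf.setInt("yarn.nodemanager.resource.memory-mb", 8192);
        conf.setInt("yarn.nodemanager.resource.cpu-vcores", 4);
        // Interval (ms) at which NMs heartbeat their status to the RM.
        conf.setInt("yarn.resourcemanager.nodemanagers.heartbeat-interval-ms", 1000);
        System.out.println(conf.get("yarn.nodemanager.resource.memory-mb"));
    }
}
```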

  • ApplicationMaster

  1. Each application submitted by a user contains one ApplicationMaster, which can run on any machine other than the one hosting the ResourceManager.

  2. It negotiates resources (represented as Containers) with the RM scheduler.

  3. It further assigns the obtained resources to its internal tasks (a secondary allocation of resources).

  4. It communicates with NMs to start and stop tasks.

  5. It monitors the running status of all tasks and, when a task fails, re-applies for resources to restart it.

  6. At present, YARN ships with two ApplicationMaster implementations: DistributedShell, an example program that demonstrates how to write an AM and can request a number of Containers to run a shell command or script in parallel; and MRAppMaster, the AM that runs MapReduce applications.

  • The RM is only responsible for monitoring the AM and restarting it if it fails. The RM is not responsible for fault tolerance of the AM's internal tasks; task-level fault tolerance is handled by the AM itself.
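
Responsibilities 2–5 above map directly onto the AMRMClient API. A minimal sketch of an AM's lifecycle follows; the container size, priority, and empty host/tracking-URL strings are illustrative assumptions:

```java
import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.FinalApplicationStatus;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class MinimalAppMaster {
    public static void main(String[] args) throws Exception {
        AMRMClient<ContainerRequest> rmClient = AMRMClient.createAMRMClient();
        rmClient.init(new YarnConfiguration());
        rmClient.start();

        // Register with the RM and keep communicating with it.
        rmClient.registerApplicationMaster("", 0, "");

        // Negotiate resources (Containers) with the RM scheduler:
        // here one 1024 MB / 1 vcore container (illustrative sizes).
        ContainerRequest ask = new ContainerRequest(
                Resource.newInstance(1024, 1), null, null, Priority.newInstance(0));
        rmClient.addContainerRequest(ask);

        // Heartbeat/allocate until containers arrive; on task failure a real AM
        // would add a fresh ContainerRequest here to restart the task.
        AllocateResponse response = rmClient.allocate(0.0f);
        for (Container container : response.getAllocatedContainers()) {
            // A real AM would hand the container to an NMClient to start a task.
            System.out.println("Allocated " + container.getId());
        }

        // Deregister once all tasks are done.
        rmClient.unregisterApplicationMaster(FinalApplicationStatus.SUCCEEDED, "", "");
        rmClient.stop();
    }
}
```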

3. YARN operation process

  • The client (in fact the YARNRunner class) sends a request to the RM to submit a job and apply for resources.

  • The RM checks whether the request conforms to the application specification, assigns a job ID, and returns to the client a path for the job's submission resources (the mapred-default.xml property yarn.app.mapreduce.am.staging-dir, default /tmp/hadoop-yarn/staging).

  • The client submits the job resources to that path: the job's .jar program, the slice plan job.split, and the job configuration job.xml, and then applies for resources to run this job's ApplicationMaster (MRAppMaster).

  • Combining the cluster resource reports from the NMs, the RM packages resources in the form of a Container, picks an NM on which to start the container, and returns the IP of the node hosting the container to the client.

  • The client starts MRAppMaster in the designated container; the MRAppMaster registers itself with the RM and maintains communication with it.

  • From the slice plan, MRAppMaster knows the number of MapTasks in this job; if there are, say, three, it applies to the RM for three containers to run them.

  • The RM receives the request, consults the cluster resource usage reported by the NMs, selects machines on which to start the 3 containers, and returns the IP information of the 3 containers to the MRAppMaster.

  • MRAppMaster takes the container locations, starts the job's first-stage MapTasks in those containers, tracks the tasks' execution, and reports to the RM when resources are no longer needed so they can be reclaimed.

  • When the MapTasks finish running, MRAppMaster applies for the containers required for the next stage (the ReduceTasks).

  • When all tasks are finished, MRAppMaster applies to the RM to deregister itself. (A client-side sketch of this whole flow follows below.)
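
From the client's point of view, the entire flow above hides behind a single job submission. A minimal sketch of a MapReduce driver (the mapper/reducer classes and paths are placeholders); waitForCompletion() triggers the submission, staging-directory upload, and MRAppMaster launch just described:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SubmitJob {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.set("mapreduce.framework.name", "yarn"); // run on YARN rather than locally

        Job job = Job.getInstance(conf, "example-job");
        job.setJarByClass(SubmitJob.class);
        // job.setMapperClass(...);  // placeholder: your mapper (identity if omitted)
        // job.setReducerClass(...); // placeholder: your reducer (identity if omitted)
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // waitForCompletion() drives the whole flow above: YARNRunner submits the
        // job, the .jar / job.split / job.xml land in the staging directory, and
        // the RM launches MRAppMaster in a container.
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```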

4. The YARN Scheduler

Ideally, an application's requests for YARN resources would be satisfied immediately. In reality, resources are limited, especially in a very busy cluster, and a request often has to wait some time before the corresponding resources are granted. In YARN, the Scheduler is responsible for allocating resources to applications. Scheduling is itself a hard problem with no single perfect policy, which is why YARN provides a choice of schedulers and configurable policies.

There are three schedulers to choose from in YARN: the FIFO Scheduler, the Capacity Scheduler (the default), and the Fair Scheduler.

1. FIFO Scheduler
  • The FIFO Scheduler arranges applications into a queue in the order they were submitted; this is a first-in, first-out queue. When allocating resources, the scheduler first allocates to the application at the head of the queue; once that application's requirements are satisfied, it moves on to the next application, and so on.

  • The FIFO Scheduler is the simplest scheduler and the easiest to understand, and it requires no configuration, but it is not suitable for shared clusters: a large application may consume all cluster resources and block every other application. In a shared cluster, the Capacity Scheduler or Fair Scheduler is more suitable; both allow large and small jobs to obtain a share of system resources when they are submitted.
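
Which scheduler the RM uses is controlled by a single property, normally set in yarn-site.xml on the ResourceManager. A sketch of the equivalent setting via the Configuration API, with all three scheduler classes shown for reference:

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class SchedulerChoice {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        // The three available scheduler implementations:
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
        //   org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
        conf.set("yarn.resourcemanager.scheduler.class",
                "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler");
        System.out.println(conf.get("yarn.resourcemanager.scheduler.class"));
    }
}
```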

2. Capacity Scheduler (default scheduler)
  • The Capacity Scheduler allows multiple organizations to share the entire cluster, with each organization getting a portion of the cluster's computing capacity. By assigning a dedicated queue to each organization, and then assigning certain cluster resources to each queue, the cluster can serve multiple organizations through multiple queues. In addition, a queue can be subdivided internally, so that multiple members of an organization share its resources. Within a queue, resource scheduling follows a first-in, first-out (FIFO) strategy.

  • In effect, the cluster's resources are divided into n queues, for example two queues with a large-to-small capacity ratio of 8:2. Large applications go to the large queue and small applications to the small queue. However, FIFO is still used inside each queue, so a relatively large application can still be blocked there; and if only small applications arrive and all go to the small queue, the large queue sits empty and its resources are wasted.
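
The 8:2 split just described corresponds to queue capacities normally defined in capacity-scheduler.xml. A sketch of the equivalent properties (the queue names "big" and "small" are made up for this example), plus how a job picks its queue at submission time:

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class CapacityQueues {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        // Normally defined in capacity-scheduler.xml; queue names are illustrative.
        conf.set("yarn.scheduler.capacity.root.queues", "big,small");
        conf.set("yarn.scheduler.capacity.root.big.capacity", "80");   // 80% of cluster
        conf.set("yarn.scheduler.capacity.root.small.capacity", "20"); // 20% of cluster

        // A job chooses its queue when it is submitted:
        conf.set("mapreduce.job.queuename", "small");
        System.out.println("queue=" + conf.get("mapreduce.job.queuename"));
    }
}
```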

3. Fair Scheduler
  • With the Fair Scheduler, there is no need to reserve a fixed amount of system resources in advance; the scheduler dynamically rebalances resources across all running jobs. When the first (large) job is submitted and is the only one running, it obtains all cluster resources; when a second (small) job is submitted, the Fair Scheduler allocates half of the resources to it, so that the two jobs share the cluster fairly.

  • Note that there is a certain delay between the second job's submission and its receipt of resources, because the scheduler must wait for the first job to release the Containers it occupies. After the small job completes, it releases the resources it occupied, and the large job once again obtains all the system resources. The net effect is that the Fair Scheduler achieves high resource utilization while ensuring that small jobs complete in a timely manner.
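
A sketch of the properties that enable this behavior, normally set in yarn-site.xml (the allocation-file path is an illustrative assumption). Preemption bounds the delay mentioned above by reclaiming Containers from jobs over their fair share rather than waiting for them to finish:

```java
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class FairSchedulerConfig {
    public static void main(String[] args) {
        YarnConfiguration conf = new YarnConfiguration();
        conf.set("yarn.resourcemanager.scheduler.class",
                "org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler");
        // Queue definitions and weights live in an XML allocation file
        // (path here is an example only).
        conf.set("yarn.scheduler.fair.allocation.file", "/etc/hadoop/fair-scheduler.xml");
        // Let the scheduler reclaim Containers from jobs exceeding their fair
        // share, shortening the wait described above.
        conf.setBoolean("yarn.scheduler.fair.preemption", true);
        System.out.println(conf.get("yarn.resourcemanager.scheduler.class"));
    }
}
```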
