The three resource schedulers in YARN

YARN offers three schedulers to choose from: FIFO Scheduler, Capacity Scheduler, and Fair Scheduler.

Apache Hadoop uses the Capacity Scheduler by default.
CDH uses the Fair Scheduler by default.

1. FIFO Scheduler (first come, first served):

The FIFO Scheduler places applications in a first-in, first-out queue in the order they were submitted. When allocating resources, it serves the application at the head of the queue first. Once that application's requirements are met, it moves on to the next one, and so on.
The FIFO Scheduler is the simplest scheduler to understand and requires no configuration, but it is not suitable for shared clusters. A large application may occupy all cluster resources and block every other application. For example, if a large task is running and holds all resources when a small task is submitted, the small task stays blocked until the large task finishes.
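
The blocking effect can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not YARN code: the function name `fifo_finish_times`, the job sizes, and the idea of measuring work in "container-time units" are all made up for illustration, and the head of the queue is assumed to get the whole cluster until it finishes.

```python
from collections import deque


def fifo_finish_times(apps, capacity):
    """Toy FIFO model (hypothetical, not the YARN implementation).

    apps: list of (name, submit_time, work), where work is measured in
    container-time units. The application at the head of the queue is
    assumed to use all `capacity` containers until it is done.
    """
    queue = deque(sorted(apps, key=lambda a: a[1]))  # order of submission
    clock = 0.0
    finish = {}
    while queue:
        name, submit, work = queue.popleft()
        start = max(clock, submit)        # wait for the previous app to finish
        clock = start + work / capacity   # runs alone on the whole cluster
        finish[name] = clock
    return finish


# A large job submitted at t=0 and a small job submitted at t=1 (made-up sizes).
fifo_apps = [("large", 0, 1000), ("small", 1, 10)]
fifo_result = fifo_finish_times(fifo_apps, capacity=10)
print(fifo_result)  # {'large': 100.0, 'small': 101.0}
```

In this toy run the small job needs only one time unit of cluster time, yet it finishes at t=101 because it cannot start until the large job is done.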

2. Capacity Scheduler:

The Capacity Scheduler sets aside a dedicated queue for small tasks, so a small task can start as soon as it is submitted. The cost is that this queue reserves part of the cluster in advance, so a large task has fewer resources available and finishes later than it would under the FIFO Scheduler.
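
The trade-off can be made concrete with a toy model. This sketch assumes a strictly static split of the cluster between two queues (no borrowing of idle capacity between queues); the queue names `default` and `small_jobs`, the 8/2 split, and the job sizes are hypothetical values chosen for illustration, not YARN defaults.

```python
def capacity_finish_times(apps, queue_slots):
    """Toy model of statically partitioned queues (not the YARN implementation).

    apps: list of (name, queue, submit_time, work), work in container-time units.
    queue_slots: dict mapping queue name -> containers reserved for that queue.
    Each queue runs its own applications FIFO on its reserved containers only.
    """
    clocks = {queue: 0.0 for queue in queue_slots}  # when each queue is next free
    finish = {}
    for name, queue, submit, work in sorted(apps, key=lambda a: a[2]):
        start = max(clocks[queue], submit)
        clocks[queue] = start + work / queue_slots[queue]
        finish[name] = clocks[queue]
    return finish


# 10 containers in total: 8 for the default queue, 2 reserved for small jobs.
cap_apps = [("large", "default", 0, 1000), ("small", "small_jobs", 1, 10)]
cap_result = capacity_finish_times(cap_apps, {"default": 8, "small_jobs": 2})
print(cap_result)  # {'large': 125.0, 'small': 6.0}
```

With these numbers the small job finishes at t=6 instead of waiting behind the large one, while the large job takes 125 time units on 8 containers instead of the 100 it would need with all 10.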

3. Fair Scheduler:

With the Fair Scheduler, no resources need to be reserved in advance. Instead, the scheduler dynamically rebalances resources among all running jobs.

For example, when the first large job is submitted and is the only job running, it obtains all the cluster resources. When a second, small task is submitted, the Fair Scheduler allocates half of the resources to it, so that the two tasks share the cluster fairly.

Note that in the Fair Scheduler there is some delay between the submission of the second task and the moment it obtains resources, because it has to wait for the first task to release the containers it occupies. After the small task completes, it releases its resources and the large task again obtains all cluster resources. The net effect is that the Fair Scheduler achieves high resource utilization while still letting small tasks finish in time.
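
The sharing behavior described above can be sketched as a small simulation. It is a simplified model, not YARN code: running jobs are assumed to split the cluster equally, rebalancing is assumed to be instantaneous (so the container-release delay mentioned above is ignored), and the function name `fair_finish_times` and the job sizes are invented for illustration.

```python
def fair_finish_times(apps, capacity):
    """Toy equal-share model (hypothetical, not the YARN implementation).

    apps: list of (name, submit_time, work), work in container-time units.
    All running applications split `capacity` equally at every moment.
    """
    pending = sorted(apps, key=lambda a: a[1])
    remaining = {}   # name -> work still to do
    finish = {}
    clock = 0.0
    i = 0
    while i < len(pending) or remaining:
        if not remaining:
            clock = max(clock, pending[i][1])  # idle until the next submission
        # Admit every application submitted by now.
        while i < len(pending) and pending[i][1] <= clock:
            remaining[pending[i][0]] = float(pending[i][2])
            i += 1
        rate = capacity / len(remaining)       # equal share per running app
        dt = min(remaining.values()) / rate    # time until the next completion
        if i < len(pending):
            dt = min(dt, pending[i][1] - clock)  # or until the next arrival
        clock += dt
        for name in list(remaining):
            remaining[name] -= rate * dt
            if remaining[name] <= 1e-9:
                finish[name] = clock
                del remaining[name]
    return finish


fair_apps = [("large", 0, 1000), ("small", 1, 10)]
fair_result = fair_finish_times(fair_apps, capacity=10)
print(fair_result)  # {'small': 3.0, 'large': 101.0}
```

In this run the small job gets half of the 10 containers from t=1 and finishes at t=3, after which the large job takes the whole cluster back and finishes at t=101, only one time unit later than if it had run alone.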


See you next time, bye!
