Hadoop Series (9): Resource Scheduling in YARN, One of Hadoop's Three Core Components

The YARN Scheduling Process in Detail

  1. The client submits a job to the ApplicationsManager component of the ResourceManager and requests a job ID (a unique application ID).
  2. The RM returns the job ID along with a temporary HDFS path, and requires the client to upload the job's files to that temporary directory. (After the job finishes, the temporary directory is deleted.)
  3. The client uploads the resources needed to run the job (the jar package, input split information, configuration files, etc.) to the specified HDFS path.
  4. After the upload succeeds, the client sends a request to the RM's ApplicationsManager to run the job.
  5. The ApplicationsManager forwards the request to the scheduler, and the scheduler begins processing it.
  6. The scheduler places the task in a queue. When the request is executed, the scheduler notifies the ApplicationsManager that a container can be allocated, and tells it which NodeManager should open the container.
  7. The ApplicationsManager commands that NodeManager to create a container and launch the job's ApplicationMaster in it. The ApplicationMaster registers itself with the ApplicationsManager, so that the ApplicationsManager can monitor the job's execution state and the client can control the job through the ApplicationsManager.
  8. The ApplicationMaster reads the temporary HDFS path, obtains the jar package and configuration files, and creates the map and reduce tasks.
  9. The ApplicationMaster asks the scheduler to allocate resources for running the map and reduce tasks.
  10. The scheduler returns execution information, including which NodeManagers have resources available.
  11. The ApplicationMaster notifies those NodeManagers to set aside resources (containers) and start the map and reduce tasks.
  12. Each NodeManager starts the tasks in its own containers.
  13. The map and reduce tasks read their data from the temporary HDFS path and begin execution. The ApplicationMaster monitors and manages the map and reduce tasks in real time; if a task fails, it asks the scheduler for a container on a new node and re-runs the failed task there. Meanwhile, the ApplicationsManager monitors the ApplicationMaster itself in real time; if the ApplicationMaster goes down, it creates a new ApplicationMaster, which takes over monitoring the original map and reduce tasks, so the running tasks are not affected in the meantime.
  14. When the job finishes successfully, the resources are released.

YARN Scheduling Policies

  • FIFO Scheduler
  • Capacity Scheduler
  • Fair Scheduler
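Which of the three schedulers is active is selected in `yarn-site.xml` via the `yarn.resourcemanager.scheduler.class` property. A minimal sketch (the exact default depends on your Hadoop release):

```xml
<!-- yarn-site.xml: choose the scheduler implementation -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <!-- Capacity Scheduler (the default in recent Hadoop releases) -->
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
  <!-- For fair scheduling, use instead:
       org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler
       For FIFO:
       org.apache.hadoop.yarn.server.resourcemanager.scheduler.fifo.FifoScheduler -->
</property>
```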

1. FIFO Scheduler (first in, first out)

This is the simplest scheduler. All submitted applications are placed in a single queue in the order they arrive; resources are allocated to each app first by job priority and then by arrival time. If the first app's resource needs are fully met and the remaining resources are enough for the second app, the second app is allocated resources as well, and so on down the queue.

Advantages: simple, requires no configuration.

Disadvantages: not suitable for shared clusters. If one app needs a large amount of resources, the other apps may be left waiting indefinitely.

2. Capacity Scheduler

The Capacity Scheduler is designed for running multiple applications on a shared cluster (a cluster shared by several organizations), with the goal of maximizing cluster throughput and utilization.

The Capacity Scheduler divides the cluster's resources into several parts, one for each organization: each organization gets a dedicated queue, and each organization's queue can be further subdivided into a hierarchy (Hierarchical Queues), allowing different user groups within the organization to use it.

Within each queue, applications are scheduled in FIFO order. When a queue has idle resources, it can share its spare capacity with other queues.

Example:

Suppose there is a dedicated queue that lets small apps execute as soon as possible after submission. job1 is submitted to queueA, but it does not take up all of the cluster's resources (if job1 needs 100G of memory and the whole cluster has only 100G, job1 is still allocated only 80G, leaving capacity for the other queue). job2, submitted to queueB, is therefore executed quickly.
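The example above could be expressed with a `capacity-scheduler.xml` fragment like the following; the queue names queueA/queueB and the 80/20 split are illustrative assumptions, not values from the original post:

```xml
<!-- capacity-scheduler.xml: two organization queues under root -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>queueA,queueB</value>
</property>
<property>
  <!-- queueA is guaranteed 80% of the cluster's resources -->
  <name>yarn.scheduler.capacity.root.queueA.capacity</name>
  <value>80</value>
</property>
<property>
  <!-- queueB keeps 20%, so small jobs can start quickly -->
  <name>yarn.scheduler.capacity.root.queueB.capacity</name>
  <value>20</value>
</property>
<property>
  <!-- elasticity: queueA may borrow idle capacity up to this cap -->
  <name>yarn.scheduler.capacity.root.queueA.maximum-capacity</name>
  <value>90</value>
</property>
```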

3. Fair Scheduler

The FairScheduler lets applications share cluster resources fairly. By default its fair scheduling is based only on memory, but it can also be configured to consider both memory and CPU. When the cluster runs only one app, that app has exclusive use of the cluster's resources. As new apps are submitted, idle resources are given to them, so eventually every app receives roughly the same share. Apps can also be assigned different priorities, which determine the percentage of resources each app occupies. The FairScheduler allows short jobs to finish within a reasonable time without having to wait for long jobs to complete.

Fair sharing: the scheduler organizes apps into queues and distributes resources fairly among those queues. By default, all apps join a queue named "default"; an app can also specify which queue to join. Within a queue, the default scheduling policy is memory-based fair sharing, but FIFO or multi-resource scheduling with Dominant Resource Fairness can also be configured.

Minimum shares: besides fair shares, the FairScheduler also lets you specify minimum shares for queues, to guarantee that particular users and apps always get sufficient resources. If an app does not use all of its specified minimum share, the excess may be allocated to other apps.

By default the FairScheduler lets all apps run, but you can also assign a weight to each queue through the configuration file; with weights of 1:1, two queues executing tasks at the same time are each allocated half of the resources.
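The minimum shares and weights described above go in the fair-scheduler allocation file. A sketch, assuming two queues with 1:1 weights; the queue names and resource figures are illustrative:

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: allocation file for the FairScheduler -->
<allocations>
  <queue name="queueA">
    <!-- guaranteed minimum share for this queue -->
    <minResources>10000 mb,10 vcores</minResources>
    <weight>1.0</weight>
    <!-- policy inside the queue: fair, fifo, or drf -->
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
  <queue name="queueB">
    <minResources>10000 mb,10 vcores</minResources>
    <weight>1.0</weight>
    <!-- drf = Dominant Resource Fairness (memory and CPU) -->
    <schedulingPolicy>drf</schedulingPolicy>
  </queue>
</allocations>
```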

Example:

There are two users, A and B. User A submits job1 while no other app is running in the cluster, so job1 has exclusive use of the cluster's resources. When user B submits job2, job1 releases half of its containers and job2 starts. Before job2 finishes executing, user B submits job3; job2 then releases half of the containers it occupies, and job3 starts with those resources.


Source: www.cnblogs.com/valjeanshaw/p/11809713.html