Hadoop (8): Introduction to the YARN Framework

YARN Component Diagram

 

 

  • Container: YARN's abstraction of resources. It encapsulates a share of the resources (memory + CPU) of the node on which it runs.

  • NodeManager: maintains the state of the Containers on its node and reports resource information (remaining CPU and memory) to the ResourceManager through heartbeats.

  • ResourceManager: uses the information fed back by heartbeats to maintain resource state for the entire cluster and to decide which resources can be allocated.

    • When a client submits a job, it is submitted to the ResourceManager.

    • The ResourceManager allocates resources for the job from the resources it manages.

    • Once the resources are allocated, the ApplicationMaster is started.

    • The ResourceManager monitors all ApplicationMasters.

  • ApplicationMaster: main responsibilities:

    • Job scheduling: an ApplicationMaster is started for each job in response to the client's request (resources are requested before it starts).

    • The ApplicationMaster keeps monitoring its Containers and the tasks running in them, tracking the application's status and progress.

    • It handles failed tasks and retries them.

    • Different kinds of jobs have different ApplicationMasters; for example, running a MapReduce job starts the MapReduce ApplicationMaster (MRAppMaster).

      • If Hadoop does not provide an ApplicationMaster that fits, we can implement our own ApplicationMaster to meet our needs.
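The relationship between the three components above can be sketched with a small toy model. This is only a simulation under stated assumptions, not the YARN API: the class names mirror the YARN components, but every method and field (`launch`, `heartbeat`, `on_heartbeat`, `pick_node`, the report dictionary) is hypothetical and invented for illustration. It shows a NodeManager reporting its remaining memory and CPU, and a ResourceManager deciding where a Container can go using only those heartbeat reports.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Container:
    """A slice of one node's resources (memory + CPU)."""
    node: str
    memory_mb: int
    vcores: int


@dataclass
class NodeManager:
    """Tracks the containers on one node and reports what is left."""
    name: str
    total_memory_mb: int
    total_vcores: int
    containers: list = field(default_factory=list)

    def free(self) -> tuple:
        used_mem = sum(c.memory_mb for c in self.containers)
        used_cpu = sum(c.vcores for c in self.containers)
        return self.total_memory_mb - used_mem, self.total_vcores - used_cpu

    def launch(self, memory_mb: int, vcores: int) -> Container:
        free_mem, free_cpu = self.free()
        if memory_mb > free_mem or vcores > free_cpu:
            raise RuntimeError(f"{self.name}: not enough free resources")
        container = Container(self.name, memory_mb, vcores)
        self.containers.append(container)
        return container

    def heartbeat(self) -> dict:
        """Hypothetical report format: remaining memory and CPU on this node."""
        free_mem, free_cpu = self.free()
        return {"node": self.name, "free_memory_mb": free_mem, "free_vcores": free_cpu}


class ResourceManager:
    """Keeps a cluster-wide view built only from heartbeat reports."""

    def __init__(self) -> None:
        self.cluster_state = {}

    def on_heartbeat(self, report: dict) -> None:
        self.cluster_state[report["node"]] = report

    def pick_node(self, memory_mb: int, vcores: int):
        """Return the name of a node that can host the request, or None."""
        for name, report in sorted(self.cluster_state.items()):
            if report["free_memory_mb"] >= memory_mb and report["free_vcores"] >= vcores:
                return name
        return None


rm = ResourceManager()
nodes = {n.name: n for n in (NodeManager("node1", 4096, 4), NodeManager("node2", 8192, 8))}
for nm in nodes.values():
    rm.on_heartbeat(nm.heartbeat())

# A client submits a job: the ResourceManager picks a node for the ApplicationMaster.
am_node = rm.pick_node(memory_mb=6144, vcores=2)
am_container = nodes[am_node].launch(memory_mb=6144, vcores=2)
rm.on_heartbeat(nodes[am_node].heartbeat())  # the next heartbeat refreshes the cluster view
print(am_node, rm.cluster_state[am_node])
```

The ResourceManager here never inspects a node directly; it only sees what the last heartbeat told it, which is the point the component list makes.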

Job Execution Flow

 

 

  1. A client submits a job to the ResourceManager, which allocates resources for it from the resources it manages.

  2. Once the resources are allocated, the ResourceManager sends the job information to a NodeManager and instructs it to use its own resources (a Container) to start the ApplicationMaster.

  3. Based on the job information, the ApplicationMaster requests from the ResourceManager only the resources needed for the current stage; for example, during the Map stage of a MapReduce job it does not request resources for the Reduce stage.

    • The format of a resource request is as follows:

       [Figure: resource request format]

    • The ResourceManager tells the ApplicationMaster which resources it may use; if only one is available, it returns just that one first.

    • The ApplicationMaster must keep requesting the remaining resources it needs.

  4. Once the ApplicationMaster obtains a resource, it immediately directs the corresponding NodeManager to start a task (Task) on it.

    • A Task is the actual unit of computation, such as a Mapper.

    • While running, a Task reports its progress and its success or failure to the ApplicationMaster.

    • Based on this feedback, the ApplicationMaster manages each Task, deciding whether to terminate it, let it continue, restart it, and so on.

    • When a Task finishes, the ApplicationMaster marks it as succeeded or failed.

  5. After all Tasks have finished, the ApplicationMaster marks the Job as succeeded or failed.
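The flow above can be sketched as a small simulation. Again, this is a toy model under assumptions and not the YARN API: `allocate`, `run_stage`, `run_job`, `per_round`, `max_attempts` and the container ID strings are all hypothetical names chosen for illustration. It assumes the ResourceManager grants at most one Container per request, so the ApplicationMaster has to keep asking, requests Containers one stage at a time, retries a failed Task, and finally marks the Job.

```python
from collections import deque


class ResourceManager:
    """Grants at most `per_round` containers per allocate call (toy policy)."""

    def __init__(self, per_round: int = 1) -> None:
        self.per_round = per_round
        self.next_id = 0

    def allocate(self, wanted: int) -> list:
        granted = min(wanted, self.per_round)
        ids = [f"container_{self.next_id + i}" for i in range(granted)]
        self.next_id += granted
        return ids


class ApplicationMaster:
    def __init__(self, rm, run_task, max_attempts: int = 2) -> None:
        self.rm = rm
        self.run_task = run_task          # callable(task, attempt) -> bool
        self.max_attempts = max_attempts
        self.task_status = {}

    def run_stage(self, tasks: list) -> None:
        """Request containers only for this stage's tasks and run them."""
        pending = deque((task, 1) for task in tasks)
        while pending:
            # Ask for everything still needed; the RM may grant only part of it.
            for _container in self.rm.allocate(len(pending)):
                task, attempt = pending.popleft()
                if self.run_task(task, attempt):
                    self.task_status[task] = "SUCCEEDED"
                elif attempt < self.max_attempts:
                    pending.append((task, attempt + 1))  # retry in a new container
                else:
                    self.task_status[task] = "FAILED"

    def run_job(self, stages: list) -> str:
        """Run the stages in order; later stages are not requested early."""
        for tasks in stages:
            self.run_stage(tasks)
            if any(self.task_status[t] == "FAILED" for t in tasks):
                return "FAILED"
        return "SUCCEEDED"


def flaky_task(task: str, attempt: int) -> bool:
    # Hypothetical behaviour: "map-1" fails on its first attempt only.
    return not (task == "map-1" and attempt == 1)


rm = ResourceManager(per_round=1)
am = ApplicationMaster(rm, flaky_task)
job_status = am.run_job([["map-0", "map-1"], ["reduce-0"]])
print(job_status, am.task_status)
```

In this run the ApplicationMaster uses four Containers for three Tasks, because the retried Mapper needs a second one. Stopping the Job as soon as a stage has a failed Task is a simplification of this sketch, not a statement about how MapReduce handles failures.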


Origin www.cnblogs.com/renzhongpei/p/12635241.html