YARN: Yet Another Resource Negotiator
YARN was introduced in Hadoop 2.0. Its main functions are:
- It is the cluster's resource management system
- It manages and schedules cluster resources in a unified way
- It interacts with clients and handles client requests
First, the basic architecture
YARN uses a master/slave architecture.
It is made up of a ResourceManager (the master) and NodeManagers (the slaves).
Each worker machine in the cluster runs a NodeManager process, generally on the same nodes as the HDFS DataNodes.
The ResourceManager manages the resources of all machines in a unified way and generally does not run on the same node as an ordinary NodeManager. Each NodeManager monitors and manages the resources of its own node, and reports the node's resource usage and the status of the tasks running on it back to the ResourceManager. This feedback is sent periodically in the form of heartbeats.
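The heartbeat reporting described above can be illustrated with a small Python sketch. This is a toy in-memory model, not the Hadoop API: the names ToyResourceManager and NodeReport, the report fields, and the 10-second expiry are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class NodeReport:
    """What a NodeManager sends in one heartbeat (hypothetical fields)."""
    node_id: str
    used_memory_mb: int
    used_vcores: int
    running_tasks: int


class ToyResourceManager:
    """Toy model: remembers each node's latest report and when it arrived."""

    def __init__(self, expiry_seconds=10.0):
        self.expiry_seconds = expiry_seconds  # assumed timeout, not a YARN default
        self.last_seen = {}  # node_id -> time of the last heartbeat
        self.reports = {}    # node_id -> latest NodeReport

    def heartbeat(self, report, now):
        """Record a periodic report from a NodeManager."""
        self.last_seen[report.node_id] = now
        self.reports[report.node_id] = report

    def live_nodes(self, now):
        """Nodes whose last heartbeat is recent enough to count as alive."""
        return sorted(
            node for node, seen in self.last_seen.items()
            if now - seen <= self.expiry_seconds
        )


rm = ToyResourceManager(expiry_seconds=10.0)
rm.heartbeat(NodeReport("node1", 2048, 2, 3), now=0.0)
rm.heartbeat(NodeReport("node2", 1024, 1, 1), now=0.0)
rm.heartbeat(NodeReport("node1", 4096, 4, 5), now=8.0)  # node2 stays silent

# node2 last reported 12 s ago, which exceeds the 10 s expiry.
print(rm.live_nodes(now=12.0))  # ['node1']
```

Passing the current time in as an argument, rather than reading a clock, keeps the sketch deterministic; a real system would use timers and network calls.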
The ResourceManager receives the job requests submitted by clients. These are computing tasks, not file uploads; file uploads are handled by HDFS through the NameNode.
After receiving these tasks, the ResourceManager schedules them to run in the cluster. In YARN, what actually executes a task is a container. A container is an abstraction over computing resources such as CPU and memory, and the real task runs inside it.
When the ResourceManager schedules a job, it first communicates with a NodeManager and asks it to start a container. The container packages the resources the program needs in order to run, such as memory, CPU, and network.
Once the container has started, an ApplicationMaster is started in it for the program. Every program has its own ApplicationMaster, which is responsible for requesting resources for the program's tasks and for scheduling and monitoring them. The ApplicationMaster requests resources from the ResourceManager, then contacts the NodeManagers and asks them to start the containers that run the actual tasks.
While the tasks are executing, they communicate continuously with the ApplicationMaster, reporting their status and progress.
The client obtains the ApplicationMaster's connection information from the ResourceManager, establishes a connection with the ApplicationMaster, and keeps communicating with it to get the latest progress of the job. When the job ends, the occupied resources are released and the containers are closed.
There are generally two ResourceManagers: one active and one standby.
Second, the core components
1. ResourceManager
Only one ResourceManager is active in the cluster at any time.
Functions:
- Handles client requests
- Starts and monitors ApplicationMasters
- Monitors the health of NodeManagers
- Allocates and schedules resources
2. NodeManager
NodeManagers are started on the nodes other than the master node, generally on the same nodes as the DataNodes.
Functions:
- Monitors and manages the resources of a single node
- Periodically reports the node's resource usage to the ResourceManager
- Handles requests from the ResourceManager and allocates Containers for job execution
- Handles requests from ApplicationMasters to start and stop Containers
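The per-node bookkeeping listed above can be sketched as a toy Python class. Everything here is hypothetical (ToyNodeManager, its method names, the resource figures) and is not the Hadoop API. In real YARN the scheduling decision belongs to the ResourceManager; the capacity check below only illustrates how a node's used and free resources could be tracked.

```python
class ToyNodeManager:
    """Toy model of per-node resource accounting (not the Hadoop API)."""

    def __init__(self, node_id, memory_mb, vcores):
        self.node_id = node_id
        self.total = {"memory_mb": memory_mb, "vcores": vcores}
        self.containers = {}  # container_id -> (memory_mb, vcores)

    def used(self):
        """Resources currently held by running containers."""
        return {
            "memory_mb": sum(m for m, _ in self.containers.values()),
            "vcores": sum(v for _, v in self.containers.values()),
        }

    def start_container(self, container_id, memory_mb, vcores):
        """Start a container only if the node still has room for it."""
        used = self.used()
        fits = (used["memory_mb"] + memory_mb <= self.total["memory_mb"]
                and used["vcores"] + vcores <= self.total["vcores"])
        if fits:
            self.containers[container_id] = (memory_mb, vcores)
        return fits

    def stop_container(self, container_id):
        """Stop a container and free its resources."""
        self.containers.pop(container_id, None)

    def usage_report(self):
        """The summary that would be sent to the ResourceManager."""
        return {"node_id": self.node_id, "total": self.total, "used": self.used()}


nm = ToyNodeManager("node1", memory_mb=4096, vcores=4)
print(nm.start_container("c1", 2048, 2))  # True
print(nm.start_container("c2", 3072, 1))  # False: 2048 + 3072 exceeds 4096
nm.stop_container("c1")
print(nm.start_container("c2", 3072, 1))  # True, c1's memory was freed
print(nm.usage_report()["used"])          # {'memory_mb': 3072, 'vcores': 1}
```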
3. ApplicationMaster
Each application has exactly one ApplicationMaster, which manages the application and is responsible for its resource requests and task scheduling.
Functions:
- Negotiates with the ResourceManager to obtain resources for the application
- Communicates with NodeManagers to start and stop tasks
- Monitors the running status of tasks and handles failures
4. Container
A Container is an abstraction of the environment a task runs in. It is created only when a task is assigned.
It encapsulates:
- The resources the task runs on (node, memory, CPU)
- The task's start command
- The task's runtime environment
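As a rough illustration, the three items a Container carries can be written as a small Python record. ContainerSpec, its field names, and the example values (map_task.py, TASK_ID) are hypothetical and do not correspond to YARN's own classes.

```python
from dataclasses import dataclass, field


@dataclass
class ContainerSpec:
    """Hypothetical record of what a container carries."""
    node: str        # resources: the node the task runs on
    memory_mb: int   # resources: memory granted to the task
    vcores: int      # resources: CPU granted to the task
    command: list    # the task's start command
    environment: dict = field(default_factory=dict)  # the task's runtime environment


spec = ContainerSpec(
    node="node1",
    memory_mb=1024,
    vcores=1,
    command=["python", "map_task.py"],   # made-up command
    environment={"TASK_ID": "map_0"},    # made-up variable
)
print(spec.node, spec.memory_mb, spec.vcores)
print(" ".join(spec.command))
```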
Third, YARN fault tolerance
ResourceManager
- High availability is achieved by starting multiple ResourceManagers, coordinated through ZooKeeper
NodeManager
- When a NodeManager fails, the tasks running on that node fail; the ResourceManager then notifies the corresponding ApplicationMaster of the failed tasks
- The ApplicationMaster decides how to handle the failed tasks
ApplicationMaster
- When an ApplicationMaster fails, the ResourceManager is responsible for restarting it
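The NodeManager failure path can be sketched in Python as follows. This is a toy model under stated assumptions: ToyApplicationMaster, on_node_lost, and the "at most two attempts per task" policy are invented for illustration. YARN leaves the actual policy to each ApplicationMaster, and restarting a failed ApplicationMaster is not modelled here.

```python
class ToyApplicationMaster:
    """Toy AM that retries a failed task up to max_attempts times."""

    def __init__(self, tasks, max_attempts=2):
        self.max_attempts = max_attempts           # assumed policy
        self.attempts = {t: 1 for t in tasks}      # task -> attempts so far
        self.abandoned = set()

    def on_task_failed(self, task):
        """Decide what to do with a failed task: retry it or give up."""
        if self.attempts[task] < self.max_attempts:
            self.attempts[task] += 1
            return "retry"
        self.abandoned.add(task)
        return "give up"


def on_node_lost(placement, lost_node, am):
    """ResourceManager side: report every task on the lost node to the AM."""
    return {task: am.on_task_failed(task)
            for task, node in placement.items() if node == lost_node}


placement = {"map_0": "node1", "map_1": "node2", "reduce_0": "node1"}
am = ToyApplicationMaster(placement, max_attempts=2)

first = on_node_lost(placement, "node1", am)
print(first)   # {'map_0': 'retry', 'reduce_0': 'retry'}

# The retried tasks are placed on node2, which then fails as well.
for task, decision in first.items():
    if decision == "retry":
        placement[task] = "node2"
second = on_node_lost(placement, "node2", am)
print(second)  # {'map_0': 'give up', 'map_1': 'retry', 'reduce_0': 'give up'}
```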
Fourth, the job execution flow
1. The client sends a job submission request to the ResourceManager. After receiving the request, the ResourceManager returns a job ID to the client.
2. The ResourceManager picks one of the NodeManagers and sends it a request to allocate a container. The NodeManager starts the container and starts an ApplicationMaster in it. The ResourceManager monitors the running status of the ApplicationMaster.
After it starts, the ApplicationMaster works out what the job needs: how many map tasks and reduce tasks to launch, and how much CPU and memory those tasks require.
3. The ApplicationMaster registers with the ResourceManager. After registering successfully, it starts requesting the resources the job needs. When the ResourceManager allocates resources, the ApplicationMaster receives them together with information about which NodeManagers provide them.
4. The ApplicationMaster communicates with the corresponding NodeManagers, which allocate the containers, and asks them to start the corresponding tasks in those containers.
The tasks running in the containers report their status and progress to the ApplicationMaster.
While the job is running, the client obtains the ApplicationMaster's communication address from the ResourceManager and communicates with the ApplicationMaster to fetch the job's running status periodically. The tasks, for their part, periodically report their status and progress to the ApplicationMaster.
After all tasks are complete, the job is finished. The ApplicationMaster (AM) unregisters from the ResourceManager (RM) and releases the resources it occupied.
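The steps above can be walked through with a toy Python simulation that records events instead of doing real work. ToyCluster, the event strings, and the round-robin placement are assumptions made for illustration; they stand in for the real scheduler and RPC calls and are not part of the Hadoop API.

```python
import itertools


class ToyCluster:
    """Toy walk-through of the four steps; it only records events."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.events = []
        self._ids = itertools.count(1)

    def submit(self, num_maps, num_reduces):
        # Step 1: the client submits a job and gets a job ID back.
        job_id = f"job_{next(self._ids)}"
        self.events.append(f"RM: accepted {job_id}")

        # Step 2: the RM asks one NodeManager to start a container for the AM.
        am_node = self.nodes[0]
        self.events.append(f"{am_node}: started ApplicationMaster for {job_id}")

        # The AM works out how many task containers the job needs.
        tasks = [f"map_{i}" for i in range(num_maps)]
        tasks += [f"reduce_{i}" for i in range(num_reduces)]

        # Step 3: the AM registers and the RM answers with node assignments
        # (round-robin here, purely a placeholder for a real scheduler).
        self.events.append(f"AM: registered, requested {len(tasks)} containers")
        assignment = {t: self.nodes[i % len(self.nodes)]
                      for i, t in enumerate(tasks)}

        # Step 4: the AM asks each assigned NodeManager to start its task,
        # and each task reports its progress back to the AM.
        for task, node in assignment.items():
            self.events.append(f"{node}: started {task}")
            self.events.append(f"{task}: reported completion to AM")

        # Finally the AM unregisters and the resources are released.
        self.events.append(f"AM: unregistered {job_id}, resources released")
        return job_id, assignment


cluster = ToyCluster(["node1", "node2"])
job_id, assignment = cluster.submit(num_maps=2, num_reduces=1)
print(job_id)      # job_1
print(assignment)  # {'map_0': 'node1', 'map_1': 'node2', 'reduce_0': 'node1'}
print("\n".join(cluster.events))
```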