Yarn: working mechanism

Yarn execution process


client->ResourceManager->app Manager->client->hdfs->app Manager->scheduler->app Manager->app Master->hdfs->scheduler->app Manager->container->hdfs->run->app Master->app Manager->hdfs->client
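The chain above is easier to check when each hop is listed on its own. The following is only a small sketch: it copies the chain into a string (with the "run" step written in English) and splits it into sender/receiver pairs. The names `FLOW` and `hops` are made up for this illustration and are not part of Hadoop.

```python
# The hop chain from the text, copied as a plain string.
# FLOW and hops() are hypothetical names used only for this illustration.
FLOW = (
    "client->ResourceManager->app Manager->client->hdfs->app Manager->"
    "scheduler->app Manager->app Master->hdfs->scheduler->app Manager->"
    "container->hdfs->run->app Master->app Manager->hdfs->client"
)


def hops(flow):
    """Split the chain into (sender, receiver) pairs, lowercasing the names."""
    nodes = [name.strip().lower() for name in flow.split("->")]
    return list(zip(nodes, nodes[1:]))


# Print every hop with its position in the chain.
for i, (src, dst) in enumerate(hops(FLOW), start=1):
    print(f"{i:2d}. {src} -> {dst}")
```

Listing the hops this way makes it plain that the request both starts and ends at the client, and that HDFS is touched several times along the way.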

Yarn detailed process:

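1. The Client sends a computation job request to the ResourceManager (in practice it is sent to the NameNode first, which cannot handle it and hands it over to the RM).
2. The ResourceManager creates an ApplicationManager and returns a JobID and an HDFS directory to the client.
3. The Client uploads the job's jar, temporary data, and target data information to that HDFS directory.
4. Once the upload succeeds, a confirmation is returned to the ApplicationManager.
5. The ResourceManager obtains metadata from the NameNode, creates a ResourceScheduler, and uses the metadata to generate a rough schedule file that records how many DataNodes are needed and how many data blocks each DataNode must handle.
6. A NodeManager with low resource consumption creates the ApplicationMaster, which takes responsibility for the schedule file.
7. The ApplicationMaster first fetches the corresponding files from HDFS, calculates how many resources each DataNode actually needs, reports this to the ResourceScheduler so that the schedule file is updated, and at the same time requests resource permissions from the ApplicationManager.
8. Once the permissions are granted, it sends Container-creation tasks to the corresponding NodeManagers according to the schedule file.
9. Each Container creates a TaskTracker, downloads the files from HDFS, and creates the concrete tasks such as MapTask and ReduceTask.
10. The jar starts executing; during execution, progress is continuously reported to the ApplicationMaster and heartbeats are sent to the NodeManager.
11. When execution completes, the results are returned step by step to the Client.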
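Steps 6 to 11 can be walked through with a toy simulation. This is only a sketch under stated assumptions, not the Hadoop API: the `NodeManager` class, `pick_am_host`, `run_job`, and the memory figures are all invented for illustration. Placing each task on the currently least-loaded node is a simplifying assumption that stands in for the schedule file described above.

```python
from dataclasses import dataclass, field


@dataclass
class NodeManager:
    """Toy stand-in for a NodeManager (hypothetical, not the Hadoop class)."""
    name: str
    used_mb: int                      # memory already in use on this node
    containers: list = field(default_factory=list)

    def launch_container(self, task, mb):
        """Step 8: start a container for one task on this node."""
        self.used_mb += mb
        self.containers.append(task)


def pick_am_host(nodes):
    """Step 6: host the ApplicationMaster on the least-loaded node."""
    return min(nodes, key=lambda n: n.used_mb)


def run_job(nodes, tasks, mb_per_task=1024):
    """Walk steps 6-11 for a list of task names and return a progress log."""
    log = []
    am_host = pick_am_host(nodes)
    am_host.used_mb += mb_per_task    # the ApplicationMaster also uses memory
    log.append(f"ApplicationMaster started on {am_host.name}")
    for task in tasks:
        # Assumption: least-loaded placement replaces the schedule file.
        node = min(nodes, key=lambda n: n.used_mb)
        node.launch_container(task, mb_per_task)
        log.append(f"{task} -> container on {node.name}")
    for done, task in enumerate(tasks, start=1):
        # Step 10: each finished task reports progress to the ApplicationMaster.
        log.append(f"progress {done}/{len(tasks)} after {task}")
    log.append("result returned to client")   # step 11
    return log


nodes = [NodeManager("nm1", 4096), NodeManager("nm2", 1024), NodeManager("nm3", 2048)]
for line in run_job(nodes, ["map-0", "map-1", "reduce-0"]):
    print(line)
```

The sketch leaves out the HDFS upload and download steps entirely; it only shows how the ApplicationMaster host is chosen, how containers are spread over nodes, and how progress flows back before the result reaches the client.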

Hard to follow? That does not matter; the simple example below illustrates the general process.
[Image: simple example of the general process]


Origin blog.csdn.net/qq_43288259/article/details/115127466