Big Data Study Notes (5): The YARN Resource Scheduling Framework

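I: Background of YARN

Characteristics of MapReduce 1.x:
master/slave architecture: JobTracker/TaskTracker
JobTracker: a single point of failure and heavily loaded, because it handles both resource management and job scheduling/monitoring
Only MapReduce jobs are supported

These problems are what led to YARN.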

II: YARN Overview

From the official documentation:

http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.15.1/hadoop-yarn/hadoop-yarn-site/YARN.html

MapReduce has undergone a complete overhaul in hadoop-0.23 and we now have, what we call, MapReduce 2.0 (MRv2) or YARN. (In other words, MapReduce 2.0 is YARN.)

The fundamental idea of MRv2 is to split up the two major functionalities of the JobTracker, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and per-application ApplicationMaster (AM). An application is either a single job in the classical sense of Map-Reduce jobs or a DAG of jobs.
The ResourceManager and per-node slave, the NodeManager (NM), form the data-computation framework. The ResourceManager is the ultimate authority that arbitrates resources among all the applications in the system.
The per-application ApplicationMaster is, in effect, a framework specific library and is tasked with negotiating resources from the ResourceManager and working with the NodeManager(s) to execute and monitor the tasks.

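III: YARN Architecture

[Figure: YARN architecture]
YARN consists of the following components:
Client: submits jobs to the RM, kills jobs, and so on
ApplicationMaster (AM):
	One AM per application
	The AM requests resources from the RM in order to start the corresponding tasks on NMs
	Splits the input data
	Requests resources (containers) from the RM for each task
	Communicates with the NodeManagers
	Monitors the tasks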

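NodeManager (NM): many per cluster
	Does the actual work of running tasks
	Sends heartbeats and task execution status to the RM
	Receives requests from the RM to start tasks
	Handles commands from the AM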

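ResourceManager (RM): only one RM serves requests in a cluster at any given time; it is responsible for everything resource-related
	Handles client requests: submit, kill
	Starts and monitors AMs
	Monitors NMs
	Allocates and schedules resources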

Container: the abstraction in which a task runs
	Bundles resources such as memory and CPU
	Tasks run inside containers
	A container can run an AM as well as a map/reduce task
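
To make the container idea concrete, here is a minimal toy model in Python. It is only a sketch: the `Resource` and `Node` classes and all the numbers are made up for illustration and are not YARN APIs. It shows a container as a bundle of memory and CPU that a node can grant only while it still has enough free capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A container-sized bundle of resources (hypothetical model, not a YARN class)."""
    memory_mb: int
    vcores: int


@dataclass
class Node:
    """A stand-in for a NodeManager that tracks its remaining free capacity."""
    name: str
    free: Resource

    def can_fit(self, ask: Resource) -> bool:
        # A container fits only if both memory and CPU are available.
        return ask.memory_mb <= self.free.memory_mb and ask.vcores <= self.free.vcores

    def allocate(self, ask: Resource) -> bool:
        # Grant the container and shrink the free capacity, or refuse.
        if not self.can_fit(ask):
            return False
        self.free = Resource(self.free.memory_mb - ask.memory_mb,
                             self.free.vcores - ask.vcores)
        return True


if __name__ == "__main__":
    node = Node("nm1", Resource(memory_mb=8192, vcores=4))
    print(node.allocate(Resource(1024, 1)))  # a container for an AM
    print(node.allocate(Resource(2048, 1)))  # a container for a map/reduce task
    print(node.free)
```

The same container type serves both the AM and ordinary tasks; only the requested size differs.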

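IV: YARN Execution Flow

[Figure: YARN execution flow]
1. The Client submits a job to the RM.
2. The RM assigns a first NM to the application.
3. That NM starts a Container and runs the Application Master in it.
4. The Application Master contacts the RM and registers with it,
so that the Client can learn how the job is running through the communication between the AM and the RM.
The AM then requests resources from the RM, and the RM returns the available resources.
5. With the allocated resources, the Application Master starts Containers on the corresponding NMs.
6. Each Container starts a task.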
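
The six steps above can be traced with a small in-memory simulation. This is a sketch for intuition only: the class names mirror the YARN roles, but every method here is hypothetical, and the round-robin node choice is a stand-in for a real scheduler.

```python
class NodeManager:
    def __init__(self, name):
        self.name = name
        self.containers = []

    def start_container(self, payload):
        # Launch a container for the given payload and return a made-up id.
        self.containers.append(payload)
        return f"{self.name}:container_{len(self.containers)}"


class ApplicationMaster:
    def __init__(self, app_id, num_tasks):
        self.app_id = app_id
        self.num_tasks = num_tasks

    def run(self, rm, log):
        # Step 4: register with the RM, then ask it for resources.
        rm.register(self)
        log.append(f"4 AM of {self.app_id} registers with RM and requests {self.num_tasks} containers")
        for i, node in enumerate(rm.allocate(self.num_tasks)):
            # Step 5: start a container on the NM the RM handed out.
            cid = node.start_container(f"task_{i}")
            log.append(f"5 AM starts {cid}")
            # Step 6: the container runs a task.
            log.append(f"6 {cid} runs task_{i}")


class ResourceManager:
    def __init__(self, nodes):
        self.nodes = nodes
        self.registered = {}
        self._next = 0

    def _pick_node(self):
        # Round-robin placement, an assumption made to keep the sketch short.
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node

    def register(self, am):
        self.registered[am.app_id] = am

    def allocate(self, count):
        return [self._pick_node() for _ in range(count)]

    def submit(self, app_id, num_tasks, log):
        log.append(f"1 client submits {app_id} to RM")
        node = self._pick_node()
        log.append(f"2 RM assigns {node.name} to host the AM")
        am = ApplicationMaster(app_id, num_tasks)
        cid = node.start_container(am)
        log.append(f"3 {cid} runs the AM")
        am.run(self, log)
        return am


if __name__ == "__main__":
    events = []
    rm = ResourceManager([NodeManager("nm1"), NodeManager("nm2")])
    rm.submit("app_1", num_tasks=2, log=events)
    print("\n".join(events))
```

Note that the AM itself occupies a container, so a job with two tasks uses three containers in total.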

V: Setting Up a YARN Environment

Official setup guide: http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.6.0-cdh5.12.1/hadoop-project-dist/hadoop-common/SingleCluster.html

YARN on Single Node
You can run a MapReduce job on YARN in a pseudo-distributed mode by setting a few parameters and running ResourceManager daemon and NodeManager daemon in addition.
The following instructions assume that 1. ~ 4. steps of the above instructions are already executed. (In other words, the Hadoop environment from those steps is already up and running before you continue.)

// Two more configuration files need to be edited.
// 1. Configure parameters as follows:
// Path: etc/hadoop/mapred-site.xml:
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

// Path: etc/hadoop/yarn-site.xml:
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>


2. Start the ResourceManager daemon and NodeManager daemon:
  $ sbin/start-yarn.sh

3. Browse the web interface for the ResourceManager; by default it is available at:
ResourceManager - http://localhost:8088/   (from another machine, use http://IP:8088/)

The YARN web console:
[Figure: YARN web console]

4. Run a MapReduce job.

5. Stop YARN when you are done:
  $ sbin/stop-yarn.sh
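
Besides the HTML console from step 3, the ResourceManager web port also serves a REST API, so job status can be checked from a script while YARN is running. The sketch below assumes the `/ws/v1/cluster/apps` endpoint and its `states` filter as described in the Hadoop ResourceManager REST documentation. The helper names and the `SAMPLE` payload are invented for illustration, and it is worth confirming the response fields against your own cluster version.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def apps_url(host="localhost", port=8088, states=None):
    """Build the URL of the RM applications endpoint (8088 is the default web port)."""
    url = f"http://{host}:{port}/ws/v1/cluster/apps"
    if states:
        url += "?" + urlencode({"states": ",".join(states)})
    return url


def summarize_apps(payload):
    """Reduce the JSON response to (id, name, state, finalStatus) tuples."""
    # "apps" can be null when the cluster has no applications to report.
    apps = (payload.get("apps") or {}).get("app") or []
    return [(a["id"], a["name"], a["state"], a["finalStatus"]) for a in apps]


def fetch_apps(host="localhost", port=8088, states=None):
    """Query a live ResourceManager; needs a running YARN to succeed."""
    with urlopen(apps_url(host, port, states), timeout=5) as resp:
        return summarize_apps(json.load(resp))


# Made-up payload in the shape the endpoint is assumed to return.
SAMPLE = {"apps": {"app": [
    {"id": "application_0000000000000_0001", "name": "word count",
     "state": "FINISHED", "finalStatus": "SUCCEEDED"},
]}}

if __name__ == "__main__":
    print(apps_url(states=["RUNNING"]))
    print(summarize_apps(SAMPLE))
```

Calling `fetch_apps()` after `sbin/start-yarn.sh` should return an empty list until the first job is submitted.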

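VI: Submitting Jobs to YARN

The example jar bundled with Hadoop:
1. cd to the directory that holds the bundled examples: hadoop-2.6.0-cdh5.15.1/share/hadoop/mapreduce

2. hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.15.1.jar (lists the example programs the jar provides)

3. hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.15.1.jar wordcount (runs wordcount without arguments, which prints a message saying which parameters are required)

4. hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.15.1.jar wordcount /wc/input/data.txt /wc/output (runs the wordcount example: counts word frequencies in /wc/input/data.txt and writes the result to /wc/output)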
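
For readers new to the wordcount example, the computation it performs can be sketched in a few lines of plain Python. This is not how Hadoop implements it. It only mirrors the map, shuffle, and reduce phases on an in-memory list, assuming words are separated by whitespace.

```python
from collections import defaultdict


def map_phase(line):
    # Map: emit a (word, 1) pair for every whitespace-separated word.
    return [(word, 1) for word in line.split()]


def shuffle(pairs):
    # Shuffle: group all values that share the same key.
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(word, counts):
    # Reduce: sum the counts collected for one word.
    return word, sum(counts)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    # Sort by word so the result is listed in key order.
    return dict(sorted(reduce_phase(w, c) for w, c in shuffle(pairs).items()))


if __name__ == "__main__":
    print(word_count(["hello world", "hello yarn"]))
```

On a cluster the map and reduce calls run as separate tasks inside containers, and the shuffle moves data between nodes.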

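How the web UI looks after the command is run:
[Figure: the job in the YARN web UI]
After the job has completed:
[Figure: the finished job in the YARN web UI]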
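Custom jobs:
Steps for submitting an MR job you developed yourself to run on YARN:

1) mvn clean package -DskipTests  (builds the jar with Maven; in Eclipse you can also export a jar directly)
	Windows/Mac/Linux ==> Maven works on all of them
	
2) Upload the built jar (project root/target/...jar) and the test data to the server
	scp xxxx hadoop@hostname:directory
	
3) Upload the data to HDFS
	hadoop fs -put xxx hdfspath
	
4) Run the job
	hadoop jar xxx.jar fully-qualified class name (package + class name) args.....
	hadoop jar hadoop-train-v2.jar com.imooc.bigdata.hadoop.mr.access.AccessYARNApp /access/input/access.log /access/output
	
5) Watch the job's progress in the YARN UI (port 8088)
6) Check the results in the output directory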
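
If the submission in step 4 is scripted, the command can be assembled as an argument list rather than a hand-typed string, which avoids quoting mistakes in paths. The helper below is a hypothetical convenience and not part of Hadoop. It only builds the same `hadoop jar <jar> <main class> <args...>` line shown above.

```python
import shlex


def hadoop_jar_command(jar, main_class, *args):
    """Return the argv list for `hadoop jar <jar> <main_class> <args...>`."""
    if not jar.endswith(".jar"):
        raise ValueError(f"expected a .jar file, got {jar!r}")
    return ["hadoop", "jar", jar, main_class, *args]


if __name__ == "__main__":
    cmd = hadoop_jar_command(
        "hadoop-train-v2.jar",
        "com.imooc.bigdata.hadoop.mr.access.AccessYARNApp",
        "/access/input/access.log",
        "/access/output",
    )
    # Print the shell-safe form; on a machine with Hadoop installed,
    # subprocess.run(cmd, check=True) would submit the job.
    print(shlex.join(cmd))
```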

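[Figure: screenshot]
When submitting to YARN, change the part shown in the screenshot so that its values can be passed in as arguments.

[Figure: screenshot]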

Reposted from blog.csdn.net/sinat_34979884/article/details/113242894