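Hadoop Basics Tutorial - Chapter 5 YARN: Resource Scheduling Platform (5.2 YARN Parameter Reference and Tuning)

Chapter 5 YARN: Resource Scheduling Platform

5.2 YARN Parameter Reference and Tuning

Default parameters for yarn-site.xml:
http://hadoop.apache.org/docs/r2.7.3/hadoop-yarn/hadoop-yarn-common/yarn-default.xml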

5.2.1 ResourceManager Configuration Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| yarn.resourcemanager.address | ${yarn.resourcemanager.hostname}:8032 | Address the ResourceManager exposes to clients |
| yarn.resourcemanager.scheduler.address | ${yarn.resourcemanager.hostname}:8030 | Address the ResourceManager exposes to ApplicationMasters |
| yarn.resourcemanager.resource-tracker.address | ${yarn.resourcemanager.hostname}:8031 | Address the ResourceManager exposes to NodeManagers |
| yarn.resourcemanager.admin.address | ${yarn.resourcemanager.hostname}:8033 | Address the ResourceManager exposes to administrators |
| yarn.resourcemanager.webapp.address | ${yarn.resourcemanager.hostname}:8088 | Address of the ResourceManager web UI |
| yarn.resourcemanager.scheduler.class | ..capacity.CapacityScheduler | Main class of the resource scheduler to use; the available schedulers are FIFO, Capacity Scheduler, and Fair Scheduler |
| yarn.resourcemanager.resource-tracker.client.thread-count | 50 | Number of handlers that process RPC requests from NodeManagers |
| yarn.resourcemanager.scheduler.client.thread-count | 50 | Number of handlers that process RPC requests from ApplicationMasters |
| yarn.scheduler.minimum-allocation-mb | 1024 | Minimum memory a single container can request, in MB |
| yarn.scheduler.maximum-allocation-mb | 8192 | Maximum memory a single container can request, in MB |
| yarn.scheduler.minimum-allocation-vcores | 1 | Minimum number of virtual CPU cores a single container can request |
| yarn.scheduler.maximum-allocation-vcores | 32 | Maximum number of virtual CPU cores a single container can request |
| yarn.resourcemanager.nodemanagers.heartbeat-interval-ms | 1000 (ms) | NodeManager heartbeat interval |

Full name of ..capacity.CapacityScheduler: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler
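
To make the minimum and maximum allocation parameters above more concrete, here is a minimal sketch of how a container memory request might be normalized. It assumes the scheduler rounds each request up to a multiple of yarn.scheduler.minimum-allocation-mb and rejects requests above yarn.scheduler.maximum-allocation-mb. The function name normalize_request_mb is hypothetical, and a real scheduler may round by a separate increment setting instead.

```python
def normalize_request_mb(requested_mb, min_mb=1024, max_mb=8192):
    """Model how a memory request could be rounded (assumed behavior, not YARN code)."""
    if requested_mb > max_mb:
        raise ValueError(
            f"request of {requested_mb} MB exceeds the maximum of {max_mb} MB"
        )
    # Never grant less than the minimum allocation.
    granted = max(requested_mb, min_mb)
    # Round up to the next multiple of the minimum allocation (ceiling division).
    multiples = -(-granted // min_mb)
    return min(multiples * min_mb, max_mb)


if __name__ == "__main__":
    for request in (200, 1024, 1500, 8000):
        print(request, "MB requested ->", normalize_request_mb(request), "MB granted")
```

Under this model a 1500 MB request occupies 2048 MB with the defaults, which is why the minimum allocation is worth choosing in step with the container sizes that jobs actually ask for.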

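5.2.2 NodeManager Configuration Parameters

| Parameter | Default | Description |
| --- | --- | --- |
| yarn.nodemanager.resource.memory-mb | 8192 | Total physical memory available to the NodeManager, in MB (this value should normally be set explicitly) |
| yarn.nodemanager.vmem-pmem-ratio | 2.1 | Maximum virtual memory that may be used for each 1 MB of physical memory |
| yarn.nodemanager.resource.cpu-vcores | 8 | Total number of virtual CPU cores available to the NodeManager |
| yarn.nodemanager.local-dirs | ${hadoop.tmp.dir}/nm-local-dir | Where intermediate results are stored; usually set to several directories to spread the disk I/O load |
| yarn.nodemanager.log-dirs | ${yarn.log.dir}/userlogs | Where logs are stored (several directories may be configured) |
| yarn.nodemanager.log.retain-seconds | 10800 | Maximum time logs are kept on the NodeManager, in seconds (only applies when log aggregation is disabled) |
| yarn.nodemanager.aux-services | (empty) | Auxiliary services run on the NodeManager; must be set to mapreduce_shuffle to run MapReduce programs |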
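
The arithmetic behind the NodeManager limits above can be sketched as follows. This is an illustration only: both helper names are hypothetical, and the container count assumes that memory and vcores are both enforced, which depends on the scheduler configuration.

```python
def virtual_memory_limit_mb(container_mb, vmem_pmem_ratio=2.1):
    """Virtual memory a container may use, given its physical memory size."""
    return container_mb * vmem_pmem_ratio


def containers_per_node(node_memory_mb=8192, node_vcores=8,
                        container_mb=1024, container_vcores=1):
    """How many equally sized containers fit on one NodeManager (assumed model)."""
    by_memory = node_memory_mb // container_mb
    by_cpu = node_vcores // container_vcores
    # The scarcer resource decides how many containers can run at once.
    return min(by_memory, by_cpu)


if __name__ == "__main__":
    print("vmem limit for a 1024 MB container:", virtual_memory_limit_mb(1024), "MB")
    print("1024 MB containers per default node:", containers_per_node())
    print("2048 MB containers per default node:", containers_per_node(container_mb=2048))
```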

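5.2.3 mapred-site.xml

| Parameter | Default | Description |
| --- | --- | --- |
| mapreduce.job.reduces | 1 | Default number of reduce tasks per job |
| mapreduce.job.maps | 2 | Default number of map tasks per job |
| mapreduce.task.io.sort.factor | 10 | Number of files merged at once when a Reduce Task merges small files |
| mapreduce.task.io.sort.mb | 100 | Memory used by the Map Task sort buffer, in MB |
| mapred.child.java.opts | -Xmx200m | JVM options for the child process launched for each task; -Xmx caps the memory it can use |
| mapreduce.jobtracker.handler.count | 10 | Number of JobTracker threads that handle RPC requests from TaskTrackers concurrently; typically about 4% of the number of TaskTracker nodes |
| mapreduce.reduce.shuffle.parallelcopies | 5 | Number of parallel transfers a reduce runs during the shuffle phase |
| mapreduce.tasktracker.http.threads | 40 | Map output is transferred to reduces over HTTP; this sets the number of threads serving those transfers |
| mapreduce.map.output.compress | false | Whether to compress map output; compression costs more CPU but reduces transfer time, while uncompressed output needs more network bandwidth |
| mapreduce.reduce.shuffle.merge.percent | 0.66 | Fraction of the memory holding fetched map outputs that may fill up before the reduce starts merging |
| mapreduce.reduce.shuffle.memory.limit.percent | 0.25 | Maximum share of that memory a single shuffle may use |
| mapreduce.job.jvm.numtasks | 1 | Number of tasks of the same type one JVM may run in succession; -1 means no limit |
| mapreduce.tasktracker.reduce.tasks.maximum | 2 | Number of reduce tasks a TaskTracker runs concurrently; the number of CPU cores is a suggested value |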
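
As a rough illustration of why mapreduce.task.io.sort.factor matters, the sketch below counts merge passes in a simplified model where every pass merges up to sort_factor files into one. The function merge_passes is hypothetical, and Hadoop's actual merge schedule is more nuanced, so treat the numbers as an intuition aid rather than a prediction.

```python
def merge_passes(num_files, sort_factor=10):
    """Count merge passes needed to reduce num_files to one (simplified model)."""
    if sort_factor < 2:
        raise ValueError("sort_factor must be at least 2")
    passes = 0
    remaining = num_files
    while remaining > 1:
        # Each pass merges groups of up to sort_factor files (ceiling division).
        remaining = -(-remaining // sort_factor)
        passes += 1
    return passes


if __name__ == "__main__":
    for files in (1, 10, 100, 101):
        print(files, "files ->", merge_passes(files), "pass(es) with factor 10")
    print("101 files ->", merge_passes(101, sort_factor=100), "pass(es) with factor 100")
```

In this model, raising the factor from 10 to 100 cuts the passes for 101 files from three to two, at the cost of more files open at once during each merge.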

5.2.4 Parameter Tuning

Reference:
http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.0/index.html

Click the PDF icon next to the "Command Line Installation" link to open the HDP installation guide.

Click the "1.10. Determining HDP Memory Configuration Settings" entry to jump to that page.

Click the "Download Companion Files" link, which shows two commands:

wget http://public-repo-1.hortonworks.com/HDP/tools/2.6.0.3/hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz

tar zxvf hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz

Download:

[root@node1 ~]# wget http://public-repo-1.hortonworks.com/HDP/tools/2.6.0.3/hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
--2017-05-23 23:26:13--  http://public-repo-1.hortonworks.com/HDP/tools/2.6.0.3/hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
Resolving public-repo-1.hortonworks.com (public-repo-1.hortonworks.com)... 52.84.167.222, 52.84.167.38, 52.84.167.49, ...
Connecting to public-repo-1.hortonworks.com (public-repo-1.hortonworks.com)|52.84.167.222|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 85173 (83K) [application/x-tar]
Saving to: ‘hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz’

100%[==================================================================================================================================>] 85,173       132KB/s   in 0.6s   

2017-05-23 23:26:14 (132 KB/s) - ‘hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz’ saved [85173/85173]

Extract:

[root@node1 ~]# tar -zxvf hdp_manual_install_rpm_helper_files-2.6.0.3.8.tar.gz
hdp_manual_install_rpm_helper_files-2.6.0.3.8/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/
.....
2.6.0.3.8/configuration_files/zookeeper/configuration.xsl
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/zookeeper/log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/zookeeper/zoo.cfg
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/pig-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/pig.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/pig/log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-site.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/adminusers.txt
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/hadoop-config.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/oozie/oozie-default.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hbase-policy.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/regionservers
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hadoop-metrics.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hbase-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hadoop-metrics.properties.master-GANGLIA
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hadoop-metrics.properties.regionservers-GANGLIA
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/hbase-site.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/hbase/log4j.properties
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/webhcat/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/webhcat/webhcat-env.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/configuration_files/webhcat/webhcat-site.xml
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/directories.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/yarn-utils.py
hdp_manual_install_rpm_helper_files-2.6.0.3.8/scripts/usersAndGroups.sh
hdp_manual_install_rpm_helper_files-2.6.0.3.8/readme.txt
hdp_manual_install_rpm_helper_files-2.6.0.3.8/HDP-CHANGES.txt
[root@node1 ~]# 
[root@node1 ~]# cd hdp_manual_install_rpm_helper_files-2.6.0.3.8
[root@node1 hdp_manual_install_rpm_helper_files-2.6.0.3.8]# ls
configuration_files  HDP-CHANGES.txt  readme.txt  scripts
[root@nb0 hdp_manual_install_rpm_helper_files-2.6.0.3.8]# cd scripts/
[root@nb0 scripts]# ls
directories.sh  usersAndGroups.sh  yarn-utils.py

Suppose a node has 4 cores, 8 GB of memory, and 1 disk, and HBase is also installed on it. The following command prints the suggested settings:

[root@nb0 scripts]# python yarn-utils.py -c 4 -m 8 -d 1 -k True
 Using cores=4 memory=8GB disks=1 hbase=True
 Profile: cores=4 memory=5120MB reserved=3GB usableMem=5GB disks=1
 Num Container=3
 Container Ram=1536MB
 Used Ram=4GB
 Unused Ram=3GB
 yarn.scheduler.minimum-allocation-mb=1536
 yarn.scheduler.maximum-allocation-mb=4608
 yarn.nodemanager.resource.memory-mb=4608
 mapreduce.map.memory.mb=1536
 mapreduce.map.java.opts=-Xmx1228m
 mapreduce.reduce.memory.mb=3072
 mapreduce.reduce.java.opts=-Xmx2457m
 yarn.app.mapreduce.am.resource.mb=3072
 yarn.app.mapreduce.am.command-opts=-Xmx2457m
 mapreduce.task.io.sort.mb=614
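
The numbers in this output can be reproduced with the sketch below. It is not the source of yarn-utils.py: the rules are assumptions chosen to agree with the sample output (at least 3 containers, rounding down to a multiple of 512 MB, heap at 80% of the container, sort buffer at 40% of the map container), and the reserved memory (3 GB) and minimum container size (512 MB) are passed in as explicit assumptions rather than looked up.

```python
import math


def suggest_settings(cores, memory_gb, disks, reserved_gb, min_container_mb):
    """Approximate the settings printed above. Every rule here is an assumption."""
    usable_mb = (memory_gb - reserved_gb) * 1024
    # Containers are limited by cores, by disks, and by usable memory.
    containers = min(2 * cores, math.ceil(1.8 * disks), usable_mb // min_container_mb)
    # Assumption: use at least 3 containers (agrees with "Num Container=3").
    containers = max(containers, 3)
    container_mb = usable_mb // containers
    # Assumption: sizes above 1 GB are rounded down to a multiple of 512 MB.
    if container_mb > 1024:
        container_mb = (container_mb // 512) * 512

    def heap_opts(mb):
        # Heap is set to 80% of the container, using integer arithmetic.
        return f"-Xmx{mb * 8 // 10}m"

    map_mb = container_mb
    reduce_mb = 2 * container_mb  # ratio observed in the sample output
    return {
        "yarn.scheduler.minimum-allocation-mb": container_mb,
        "yarn.scheduler.maximum-allocation-mb": containers * container_mb,
        "yarn.nodemanager.resource.memory-mb": containers * container_mb,
        "mapreduce.map.memory.mb": map_mb,
        "mapreduce.map.java.opts": heap_opts(map_mb),
        "mapreduce.reduce.memory.mb": reduce_mb,
        "mapreduce.reduce.java.opts": heap_opts(reduce_mb),
        "yarn.app.mapreduce.am.resource.mb": reduce_mb,
        "yarn.app.mapreduce.am.command-opts": heap_opts(reduce_mb),
        "mapreduce.task.io.sort.mb": map_mb * 4 // 10,
    }


if __name__ == "__main__":
    for name, value in suggest_settings(4, 8, 1, reserved_gb=3, min_container_mb=512).items():
        print(f"{name}={value}")
```

Leaving the heap below the container size gives the JVM room for non-heap memory, so the process is less likely to exceed its container limit.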

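Option reference

| Option | Description |
| --- | --- |
| -c | Number of cores on each host |
| -m | Total memory on each host, in GB |
| -d | Number of disks on each host |
| -k | "True" if HBase is installed, otherwise "False" |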
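
The key=value lines that the script prints still have to be copied into the Hadoop configuration files. As a minimal sketch, the snippet below parses such output and renders each setting as a property element; the helper names parse_settings and to_property_xml are hypothetical, and the sample text is an excerpt of the output shown earlier.

```python
import xml.etree.ElementTree as ET

SAMPLE_OUTPUT = """\
 Using cores=4 memory=8GB disks=1 hbase=True
 Num Container=3
 yarn.scheduler.minimum-allocation-mb=1536
 yarn.nodemanager.resource.memory-mb=4608
 mapreduce.map.memory.mb=1536
 mapreduce.map.java.opts=-Xmx1228m
"""


def parse_settings(text):
    """Collect the lines that look like Hadoop property assignments."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition("=")
        # Skip summary lines such as "Num Container=3".
        if sep and key.startswith(("yarn.", "mapreduce.")):
            settings[key] = value
    return settings


def to_property_xml(name, value):
    """Render one setting as a <property> element string."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    for name, value in parse_settings(SAMPLE_OUTPUT).items():
        print(to_property_xml(name, value))
```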
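
Many thanks to the original author for the hard work. This post is reproduced in case the original is deleted; it is for study only and not for any commercial use. Original: https://blog.csdn.net/chengyuqiang/article/details/72642407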

Reposted from blog.csdn.net/airufengye/article/details/80867711