Hive (13): Hive and MapReduce-Related Parameters

I. Purpose

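If the cluster has few cores but plenty of memory, give each reduce task more memory so that each core can process more data.
If cores are plentiful, each reduce task can be given less memory, so that more cores work on the job in parallel and it finishes faster.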

II. Configuration

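1. In order to change the average load for a reducer (in bytes):
The amount of data each reducer handles, in bytes. The default is 1 GB (1000000000) in older Hive releases; Hive 0.14 and later default to 256 MB.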

set hive.exec.reducers.bytes.per.reducer=<number>;
For example:
set hive.exec.reducers.bytes.per.reducer=1000000000;

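2. In order to limit the maximum number of reducers:
Sets the upper bound on the number of reducers a job may run. The default is 999 in older Hive releases (1009 in Hive 0.14 and later).

set hive.exec.reducers.max=<number>;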

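3. In order to set a constant number of reducers:
Sets the actual number of reducers to run. Hadoop's default is 1, although it does not appear in the configuration file. Hive itself defaults this property to -1, which tells it to work out the reducer count automatically from the two settings above.

set mapreduce.job.reduces=<number>;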

4. Configuration in hive-site.xml:

<property>
  <name>hive.exec.reducers.bytes.per.reducer</name>
  <value>1000000000</value>
  <description>size per reducer. The default is 1G, i.e. if the input size is 10G, it will use 10 reducers.</description>
</property>

<property>
  <name>hive.exec.reducers.max</name>
  <value>999</value>
  <description>max number of reducers will be used. If the one
    specified in the configuration parameter mapred.reduce.tasks is
    negative, Hive will use this one as the max number of reducers when
    automatically determine number of reducers.</description>
</property>
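
The way these three settings interact can be sketched in a few lines of Python. This is only a simplified model based on the property descriptions above, not Hive's actual code: the function name estimate_reducers and its parameter names are made up for illustration, and real Hive may take other factors into account when it plans a job.

```python
def estimate_reducers(input_bytes,
                      bytes_per_reducer=1_000_000_000,  # hive.exec.reducers.bytes.per.reducer
                      max_reducers=999,                 # hive.exec.reducers.max
                      fixed_reducers=-1):               # mapreduce.job.reduces
    """Simplified, hypothetical model of how the reducer count is chosen."""
    # A positive constant number of reducers overrides the estimate.
    if fixed_reducers > 0:
        return fixed_reducers
    # Otherwise divide the input size by the per-reducer load, rounding up.
    estimated = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Assume at least one reducer, and never more than the configured maximum.
    return max(1, min(max_reducers, estimated))


if __name__ == "__main__":
    print(estimate_reducers(10_000_000_000))                     # 10 GB of input
    print(estimate_reducers(5_000_000_000_000))                  # 5 TB, capped by the maximum
    print(estimate_reducers(10_000_000_000, fixed_reducers=20))  # constant count wins
```

With the older defaults, 10 GB of input gives 10 reducers, which matches the example in the property description; very large inputs are capped at hive.exec.reducers.max, and a positive mapreduce.job.reduces bypasses the estimate entirely.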

Reposted from blog.csdn.net/u010886217/article/details/83890799