Hive performance optimization on Linux (personal notes)

Updated: 2020-01-03

-- Enable automatic local-mode MapReduce
set hive.exec.mode.local.auto=true;
-- Maximum input size for local mode: local MR is used when the input data is smaller than this value (default 134217728, i.e. 128 MB)
set hive.exec.mode.local.auto.inputbytes.max=50000000;
-- Maximum number of input files for local mode: local MR is used when the number of input files is smaller than this value (default 4)
set hive.exec.mode.local.auto.input.files.max=10;
-- Enable parallel execution of tasks
set hive.exec.parallel=true;
-- Maximum number of threads, i.e. how many tasks of the same SQL statement may run concurrently
set hive.exec.parallel.thread.number=8;
-- Adjust hive.exec.reducers.bytes.per.reducer, the amount of data handled by each reducer (500 MB here)
set hive.exec.reducers.bytes.per.reducer=500000000;
-- Set the number of reducers explicitly (replace Number with the desired count)
set mapred.reduce.tasks=Number;
-- Automatically convert a join between a small table and a large table into a map join
set hive.auto.convert.join=true;
-- Merge small files on map input: maximum size of each split
set mapred.max.split.size=256000000;
-- Minimum split size per node (determines whether files spread across multiple DataNodes need to be merged)
set mapred.min.split.size.per.node=100000000;
-- Minimum split size per rack, i.e. per switch (determines whether files spread across multiple switches need to be merged)
set mapred.min.split.size.per.rack=100000000;
-- Merge small files before the map phase runs
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
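
To get a feel for how two of the settings above interact, here is a minimal Python sketch. It is not Hive code. It assumes Hive estimates the reducer count as the input size divided by hive.exec.reducers.bytes.per.reducer, capped by hive.exec.reducers.max, and that local mode is chosen only when the input size and file count are below the configured limits and at most one reducer is needed. The function names, the cap of 1009, and the exact boundary handling are assumptions made for illustration and may differ between Hive versions.

```python
import math


def estimate_reducers(total_input_bytes,
                      bytes_per_reducer=500_000_000,
                      max_reducers=1009,    # assumed value of hive.exec.reducers.max
                      fixed_reducers=-1):   # mapred.reduce.tasks; -1 means "not set"
    """Rough estimate of how many reducers a job would get."""
    if fixed_reducers > 0:
        # An explicit mapred.reduce.tasks overrides the size-based estimate.
        return fixed_reducers
    estimated = math.ceil(total_input_bytes / bytes_per_reducer)
    # At least one reducer, and never more than the configured cap.
    return max(1, min(max_reducers, estimated))


def may_run_locally(input_bytes, input_files, reducers,
                    max_bytes=50_000_000, max_files=10):
    """Local-mode check following the description in the settings above."""
    return (input_bytes < max_bytes
            and input_files < max_files
            and reducers <= 1)


if __name__ == "__main__":
    # 10 GB of input at 500 MB per reducer
    print(estimate_reducers(10 * 10**9))
    # The same input with mapred.reduce.tasks forced to 4
    print(estimate_reducers(10 * 10**9, fixed_reducers=4))
    # A 30 MB job made of 5 files with a single reducer
    print(may_run_locally(30_000_000, 5, 1))
```

With the values used above, 10 GB of input maps to 20 reducers. Raising hive.exec.reducers.bytes.per.reducer lowers that number, and setting mapred.reduce.tasks bypasses the estimate entirely.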


Origin www.cnblogs.com/suhaohao/p/12144219.html