Hive MapJoin Tuning

  • If MapJoin is not specified, or its conditions are not met, the Hive parser converts the join into a Common Join, meaning the join is completed in the Reduce stage, where data skew easily occurs. With MapJoin, the small table is loaded entirely into memory and the join is performed on the map side, which avoids reducer processing.

  • Parameter settings for enabling MapJoin

    • Enable automatic conversion to MapJoin (default: true): set hive.auto.convert.join = true;

    • Size threshold separating small tables from large tables (by default, a table under about 25 MB is treated as a small table): set hive.mapjoin.smalltable.filesize = 25000000;
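
As a rough sketch of how these two settings are used together, the session below enables automatic conversion and then joins a large table to a small one. The table and column names (orders, dim_city, city_id, city_name, amount) are hypothetical and only for illustration; whether the join is actually converted depends on the size of the small table's files and on the Hive version.

```sql
-- Let Hive convert eligible joins to MapJoin automatically.
set hive.auto.convert.join = true;

-- Tables whose files are smaller than this many bytes count as "small".
set hive.mapjoin.smalltable.filesize = 25000000;

-- Hypothetical tables: orders is large, dim_city is assumed to be under the threshold.
-- EXPLAIN prints the plan; a "Map Join Operator" in the plan indicates that the
-- join was converted and runs on the map side.
EXPLAIN
SELECT o.city_id, c.city_name, SUM(o.amount) AS total_amount
FROM orders o
JOIN dim_city c
  ON o.city_id = c.city_id
GROUP BY o.city_id, c.city_name;
```

If the plan shows an ordinary reduce-side join instead, the small table is probably larger than the threshold, in which case the threshold can be raised as long as the table still fits in the memory available to each map task.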

Origin www.cnblogs.com/xiangyuguan/p/11411265.html