-
By default, after the Map stage, all rows with the same key are sent to a single reducer, so a key with too much data causes data skew. Not every aggregation has to be completed on the Reduce side: many aggregations can be partially performed on the Map side first, with the final result produced on the Reduce side.
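The idea of partial aggregation can be sketched in a few lines of Python. This is only an illustration of the general technique, not Hive's implementation; the function names `map_side_aggregate` and `reduce_side_merge` and the sample splits are hypothetical.

```python
from collections import Counter, defaultdict

def map_side_aggregate(split):
    """Partially aggregate one map task's input: count rows per key locally."""
    return Counter(split)

def reduce_side_merge(partials):
    """Merge the partial counts from every map task into the final counts."""
    totals = defaultdict(int)
    for partial in partials:
        for key, count in partial.items():
            totals[key] += count
    return dict(totals)

# Two hypothetical input splits; key "a" is heavily skewed.
splits = [["a", "a", "a", "b"], ["a", "a", "c"]]

partials = [map_side_aggregate(s) for s in splits]
final_counts = reduce_side_merge(partials)

# Records sent to the Reduce side without and with map-side aggregation.
shuffled_without_combine = sum(len(s) for s in splits)
shuffled_with_combine = sum(len(p) for p in partials)
```

With map-side aggregation, each map task sends at most one record per distinct key, so the skewed key "a" no longer dominates what the reducer receives.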
-
Parameters for enabling Map-side aggregation
-
Whether to aggregate on the Map side (default: true): hive.map.aggr = true
-
Number of entries on which Map-side aggregation is performed: hive.groupby.mapaggr.checkinterval = 100000
-
Whether to load-balance when the data is skewed (default: false): hive.groupby.skewindata = true
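One common way to balance a skewed aggregation is to do it in two stages: first spread the rows randomly across reducers and aggregate partially, then merge the partial results by key. The Python sketch below illustrates that general idea only. It is an assumption that this is comparable to what the skew setting does, and the function `two_stage_count`, its parameters, and the sample data are hypothetical.

```python
import random
from collections import Counter

def two_stage_count(keys, num_reducers, seed=0):
    """Count keys in two stages so that one hot key is shared by many reducers."""
    rng = random.Random(seed)

    # Stage 1: distribute rows randomly, then aggregate inside each bucket.
    buckets = [[] for _ in range(num_reducers)]
    for key in keys:
        buckets[rng.randrange(num_reducers)].append(key)
    partials = [Counter(bucket) for bucket in buckets]

    # Stage 2: merge the partial counts by key to get the final result.
    final = Counter()
    for partial in partials:
        final.update(partial)
    return partials, dict(final)

# Hypothetical skewed input: one hot key and two rare keys.
keys = ["hot"] * 1000 + ["x", "y"]
partials, totals = two_stage_count(keys, num_reducers=4)
```

In stage 1 no single bucket holds all rows of the hot key, and stage 2 only has to merge a handful of partial counts per key.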
-
-