Hive Parallel Sorting

set hive.optimize.sampling.orderby=true;
set hive.optimize.sampling.orderby.number=10000;
set hive.optimize.sampling.orderby.percent=0.1f;

A quick note on the Hive parameters that enable parallel sorting for ORDER BY:

hive.optimize.sampling.orderby
    Default Value: false
    Added In: Hive 0.12.0 with HIVE-1402
Uses sampling on order-by clause for parallel execution.


hive.optimize.sampling.orderby.number
    Default Value: 1000
    Added In: Hive 0.12.0 with HIVE-1402
With hive.optimize.sampling.orderby=true, total number of samples to be obtained to calculate partition keys.


hive.optimize.sampling.orderby.percent
    Default Value: 0.1
    Added In: Hive 0.12.0 with HIVE-1402
With hive.optimize.sampling.orderby=true, probability with which a row will be chosen.
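
To make the roles of the three parameters more concrete, here is a minimal conceptual sketch in Python of sampling-based parallel sorting. It is not Hive's actual implementation: the function names (`sample_rows`, `partition_keys`, `parallel_sort`) and the `num_reducers` argument are hypothetical, and the exact way Hive applies the sample cap and picks split points may differ. The idea shown is only the general one: sample rows with some probability (the role of `percent`), cap the sample size (the role of `number`), derive range boundaries from the sorted sample, then let each "reducer" sort its own range so that the concatenated output is globally ordered.

```python
import bisect
import random


def sample_rows(rows, percent, max_samples, rng):
    """Pick each row with probability `percent`, stopping once `max_samples` are collected."""
    samples = []
    for row in rows:
        if len(samples) >= max_samples:
            break
        if rng.random() < percent:
            samples.append(row)
    return samples


def partition_keys(samples, num_reducers):
    """Choose num_reducers - 1 split points at evenly spaced positions of the sorted sample."""
    ordered = sorted(samples)
    if not ordered or num_reducers <= 1:
        return []
    step = len(ordered) / num_reducers
    return [ordered[int(step * i)] for i in range(1, num_reducers)]


def parallel_sort(rows, num_reducers, percent=0.1, max_samples=1000, seed=0):
    """Range-partition `rows` by sampled keys, sort each partition, and concatenate."""
    rng = random.Random(seed)
    keys = partition_keys(sample_rows(rows, percent, max_samples, rng), num_reducers)

    # Route every row to the bucket whose key range contains it.
    buckets = [[] for _ in range(len(keys) + 1)]
    for row in rows:
        buckets[bisect.bisect_right(keys, row)].append(row)

    # Each bucket stands in for one reducer sorting its own range independently.
    result = []
    for bucket in buckets:
        result.extend(sorted(bucket))
    return result
```

Because the buckets cover disjoint, ordered key ranges, sorting them independently and concatenating the results gives a total order without funnelling every row through a single reducer. A larger sample generally gives more evenly sized ranges, at the cost of more sampling work up front.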

Reposted from superlxw1234.iteye.com/blog/2155436