Hash modulo

This is the hash partitioning code that Hadoop MapReduce applies to the output of a MapTask:


package org.apache.hadoop.mapreduce.lib.partition;

import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;
import org.apache.hadoop.mapreduce.Partitioner;

/** Partition keys by their {@link Object#hashCode()}. */
@InterfaceAudience.Public
@InterfaceStability.Stable
public class HashPartitioner<K, V> extends Partitioner<K, V> {
  /** Use {@link Object#hashCode()} to partition. */
  public int getPartition(K key, V value,
                          int numReduceTasks) {
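    // Clear the sign bit so the hash is non-negative, then take the
    // remainder to get a partition index in [0, numReduceTasks - 1].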
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

}

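key: the key emitted by the map phase; its type is the one you define
value: the value emitted by the map phase; its type is the one you define
numReduceTasks: the number of reduce tasks (number of partitions == number of reduce tasks)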

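(key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
The & Integer.MAX_VALUE is there to avoid negative results. hashCode() can return a negative int, and in Java the % operator keeps the sign of the dividend, so without the mask the partition index could be negative. Integer.MAX_VALUE is 0x7FFFFFFF, that is, a 0 sign bit followed by 31 one bits, so the AND clears the sign bit and leaves the lower 31 bits unchanged.
Note that 0 counts as a valid, non-negative result: it is simply partition 0.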
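
To make the effect of the mask concrete, here is a minimal sketch that applies the same arithmetic to raw int hash codes. The class HashPartitionDemo and its partition helper are hypothetical names introduced for illustration only; they are not part of Hadoop. The sketch also compares the mask with Math.abs, which might look like an equivalent way to drop the sign but fails for Integer.MIN_VALUE.

```java
public class HashPartitionDemo {

    // Same arithmetic as HashPartitioner.getPartition, applied to a raw hash code.
    // (Hypothetical helper for illustration, not a Hadoop API.)
    public static int partition(int hashCode, int numReduceTasks) {
        return (hashCode & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int numReduceTasks = 3;

        // Java's % keeps the sign of the dividend, so a negative hash code
        // would give a negative (invalid) partition index.
        System.out.println(-7 % numReduceTasks);                          // -1

        // Masking clears the sign bit first: -7 & 0x7FFFFFFF == 2147483641.
        System.out.println(partition(-7, numReduceTasks));                // 1

        // Math.abs is not a safe substitute: Math.abs(Integer.MIN_VALUE)
        // overflows and stays negative.
        System.out.println(Math.abs(Integer.MIN_VALUE) % numReduceTasks); // -2

        // The mask turns Integer.MIN_VALUE into 0, which lands in partition 0.
        System.out.println(partition(Integer.MIN_VALUE, numReduceTasks)); // 0
    }
}
```

Because the mask only changes the sign bit, non-negative hash codes go to the same partition they would get from a plain hashCode() % numReduceTasks.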

Reposted from blog.csdn.net/weixin_42096620/article/details/111228126