The shuffle process in MapReduce

Shuffle is the core of MapReduce: the intermediate process between the map phase and the reduce phase.

Map is responsible for filtering and distributing data, reduce is responsible for merging and aggregating it, and everything between map output and reduce input is the shuffle process.

What shuffle implements

partition

Determines which reducer will process each key

Default: the hash value of the key, modulo the number of reduce tasks
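This default corresponds to Hadoop's HashPartitioner. A minimal plain-Java sketch of that rule (the class and sample keys here are illustrative, not Hadoop's actual API):

```java
// Sketch of the default hash partitioning rule:
//   partition = (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks
// The bit mask keeps the result non-negative even for negative hash codes.
public class HashPartitionSketch {
    static int getPartition(String key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        for (String k : new String[]{"hadoop", "hive", "spark", "hbase"}) {
            System.out.println(k + " -> reduce" + getPartition(k, 2));
        }
    }
}
```

Because the mapping depends only on the key's hash, every occurrence of the same key lands in the same reducer.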

 

grouping

Merges the values that share the same key

sort

Sorts each key-value pair by key, lexicographically

Process

 

Map-side shuffle

spill stage: overflow writes to disk

The output of each map task first enters a circular in-memory buffer (100 MB by default)

partition

Assign each key a partition (marking which reduce task it goes to)

hadoop      1       reduce0
hive        1       reduce0
spark       1       reduce1
hadoop      1       reduce0
hbase       1       reduce1
sort

Within each partition, sort the data by key

hadoop      1       reduce0
hadoop      1       reduce0
hive        1       reduce0
hbase       1       reduce1
spark       1       reduce1
overflow

When the buffer reaches 80% of its capacity (the spill threshold), overflow writing to disk begins


The sorted, partitioned data is written to disk as a file (file1),
and eventually multiple small spill files are generated

The buffer size and spill threshold can be set in mapred-site.xml

Set the buffer size in mapred-site.xml
    <property>
      <name>mapreduce.task.io.sort.mb</name>
      <value>100</value>
    </property>
Set the spill threshold in mapred-site.xml
    <property>
      <name>mapreduce.task.io.sort.spill.percent</name>
      <value>0.8</value>
    </property>
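The spill mechanism above can be sketched in plain Java. This is a toy model, not Hadoop's actual MapOutputBuffer: the capacity of 5 records stands in for the 100 MB buffer, and an in-memory list of sorted runs stands in for spill files on disk.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model of the spill stage: records accumulate in a bounded buffer;
// once the buffer passes the threshold, its contents are sorted and
// emitted as one sorted run (standing in for a spill file on disk).
public class SpillSketch {
    static final int BUFFER_CAPACITY = 5;       // stands in for 100 MB
    static final double SPILL_THRESHOLD = 0.8;  // mapreduce.task.io.sort.spill.percent

    final List<String> buffer = new ArrayList<>();
    final List<List<String>> spillFiles = new ArrayList<>();

    void collect(String record) {
        buffer.add(record);
        if (buffer.size() >= BUFFER_CAPACITY * SPILL_THRESHOLD) {
            spill();
        }
    }

    void spill() {
        List<String> run = new ArrayList<>(buffer);
        Collections.sort(run);   // sort before "writing to disk"
        spillFiles.add(run);     // one sorted spill "file"
        buffer.clear();
    }

    public static void main(String[] args) {
        SpillSketch s = new SpillSketch();
        for (String k : new String[]{"hadoop", "hive", "spark", "hadoop",
                                     "hbase", "pig", "flink", "storm"}) {
            s.collect(k);
        }
        System.out.println("spill files: " + s.spillFiles);
    }
}
```

Each spill produces a run that is sorted on its own, which is exactly why a later merge step is needed to combine the runs into one globally sorted file.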

 

merge

Combines the multiple small files generated by spilling

Sorting: within each partition, the data from all spill files is sorted, using the configured comparator for key comparison. Finally a single file is formed.

 
file1
hadoop      1       reduce0
hadoop      1       reduce0
hive        1       reduce0
hbase       1       reduce1
spark       1       reduce1
file2
hadoop      1       reduce0
hadoop      1       reduce0
hive        1       reduce0
hbase       1       reduce1
spark       1       reduce1
end_file:
hadoop      1       reduce0
hadoop      1       reduce0
hadoop      1       reduce0
hadoop      1       reduce0
hive        1       reduce0
hive        1       reduce0
hbase       1       reduce1
hbase       1       reduce1
spark       1       reduce1
spark       1       reduce1
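Since every spill file is already sorted, the merge step is a k-way merge. A minimal sketch using a priority queue over plain Java lists (illustrative only; Hadoop's actual merge code is not shown here):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// k-way merge of already-sorted runs, as the merge phase does with spill files.
public class MergeSketch {
    static List<String> mergeRuns(List<List<String>> runs) {
        // Heap entries are {runIndex, offset}; ordered by the key they point at.
        PriorityQueue<int[]> heap = new PriorityQueue<>(
                (a, b) -> runs.get(a[0]).get(a[1]).compareTo(runs.get(b[0]).get(b[1])));
        for (int i = 0; i < runs.size(); i++) {
            if (!runs.get(i).isEmpty()) heap.add(new int[]{i, 0});
        }
        List<String> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            int[] top = heap.poll();                 // smallest current key
            List<String> run = runs.get(top[0]);
            out.add(run.get(top[1]));
            if (top[1] + 1 < run.size()) {           // advance cursor in that run
                heap.add(new int[]{top[0], top[1] + 1});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file1 = Arrays.asList("hadoop", "hadoop", "hive", "spark");
        List<String> file2 = Arrays.asList("hadoop", "hbase", "hive");
        System.out.println(mergeRuns(Arrays.asList(file1, file2)));
        // -> [hadoop, hadoop, hadoop, hbase, hive, hive, spark]
    }
}
```

The heap always holds one cursor per run, so each output record costs only a log(k) heap operation instead of re-sorting everything.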

  

 

When a map task finishes, it notifies the ApplicationMaster, and the ApplicationMaster notifies the reduce tasks to pull the data

 

Reduce-side shuffle

map task1
        hadoop      1       reduce0
        hadoop      1       reduce0
        hadoop      1       reduce0
        hadoop      1       reduce0
        hive        1       reduce0
        hive        1       reduce0
        hbase       1       reduce1
        hbase       1       reduce1
        spark       1       reduce1
        spark       1       reduce1
map task2
        hadoop      1       reduce0
        hadoop      1       reduce0
        hadoop      1       reduce0
        hadoop      1       reduce0
        hive        1       reduce0
        hive        1       reduce0
        hbase       1       reduce1
        hbase       1       reduce1
        spark       1       reduce1
        spark       1       reduce1

  

Each reduce task starts multiple threads that pull the data belonging to its own partition from every machine over HTTP

 
reduce0:
    hadoop      1       reduce0
    hadoop      1       reduce0
    hadoop      1       reduce0
    hadoop      1       reduce0
    hadoop      1       reduce0
    hadoop      1       reduce0
    hadoop      1       reduce0
    hadoop      1       reduce0
    hive        1       reduce0
    hive        1       reduce0
    hive        1       reduce0
    hive        1       reduce0

  

merge: merge the data belonging to this reduce task's partition from every map task's output

Sort: sort all data belonging to this partition as a whole

Grouping: merge the values of the same key, using a Comparable implementation to do the comparison

 
hadoop,list<1,1,1,1,1,1,1,1>
hive,list<1,1,1,1>
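Because the input is already sorted, grouping needs only a single pass: equal keys are adjacent, so their values can be collected into one list per key. A plain-Java sketch (the LinkedHashMap here stands in for the grouped iterator that Hadoop hands to reduce; it is not the Hadoop API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One pass over sorted (key, value) pairs: equal keys are adjacent,
// so each key's values collapse into a single list.
public class GroupingSketch {
    static Map<String, List<Integer>> group(List<String> keys, List<Integer> values) {
        Map<String, List<Integer>> grouped = new LinkedHashMap<>();
        for (int i = 0; i < keys.size(); i++) {
            grouped.computeIfAbsent(keys.get(i), k -> new ArrayList<>()).add(values.get(i));
        }
        return grouped;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("hadoop", "hadoop", "hadoop", "hive", "hive");
        List<Integer> ones = Arrays.asList(1, 1, 1, 1, 1);
        System.out.println(group(keys, ones));
        // hadoop -> [1, 1, 1], hive -> [1, 1]
    }
}
```

Each (key, list-of-values) pair is what a reduce call then receives, e.g. hadoop with list<1,1,1>.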
 

  

optimization

combine

A local merge is performed early, during the map phase; it is generally equivalent to executing reduce ahead of time on each map's output


job.setCombinerClass(WCReduce.class);
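The effect of a word-count combiner can be sketched in plain Java: counts are summed locally on the map side, so fewer records cross the network during shuffle (WCReduce above is the job's reduce class reused as the combiner; the sketch below is an illustration, not the Hadoop API):

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// A combiner aggregates each map task's output locally before shuffle:
// (hadoop,1)(hadoop,1)(hive,1) becomes (hadoop,2)(hive,1).
public class CombinerSketch {
    static Map<String, Integer> combine(List<String> mapOutputKeys) {
        Map<String, Integer> combined = new LinkedHashMap<>();
        for (String key : mapOutputKeys) {
            combined.merge(key, 1, Integer::sum);   // sum counts per key
        }
        return combined;
    }

    public static void main(String[] args) {
        List<String> out = Arrays.asList("hadoop", "hadoop", "hive", "hadoop", "hive");
        System.out.println(combine(out));   // {hadoop=3, hive=2}
    }
}
```

A combiner is only safe when the reduce operation is associative and commutative (as summation is), since Hadoop may apply it zero, one, or more times.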

compress

Compresses intermediate result sets to reduce disk I/O and network I/O

Compression configuration


1. default: the default configuration items shipped with Hadoop
2. site: site-specific configuration files; if modified, a restart is required to take effect
3. conf object: custom configuration for each individual program
4. runtime parameters: user-supplied -D options when submitting the job
bin/yarn jar xx.jar -Dmapreduce.map.output.compress=true -Dmapreduce.map.output.compress.codec=org.apache.hadoop.io.compress.Lz4Codec main_class input_path output_path

Check which compression codecs the native library supports


bin/hadoop checknative

Configure compression via the conf configuration object

public static void main(String[] args) {
    Configuration configuration = new Configuration();
    // Configure map intermediate result set compression
    configuration.set("mapreduce.map.output.compress", "true");
    configuration.set("mapreduce.map.output.compress.codec", "org.apache.hadoop.io.compress.Lz4Codec");
    // Configure reduce result set compression
    configuration.set("mapreduce.output.fileoutputformat.compress", "true");
    configuration.set("mapreduce.output.fileoutputformat.compress.codec", "org.apache.hadoop.io.compress.Lz4Codec");
    try {
        int status = ToolRunner.run(configuration, new MRDriver(), args);
        System.exit(status);
    } catch (Exception e) {
        e.printStackTrace();
    }
}

  

 
