MapReduce_input stage

In the input stage, the data stored on the data nodes is deserialized and then divided into slices.

Data slices: (1) The parallelism of a job's map stage is determined by the number of slices, which the client computes when it submits the job.

     (2) Each slice is assigned to one MapTask instance, and these instances process their slices in parallel.

     (3) By default, the size of a slice is equal to BlockSize, i.e. the data block size.
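
The three points above can be illustrated with a minimal sketch in Python. It is not Hadoop's actual code: the names `Slice`, `compute_slices`, and `BLOCK_SIZE`, the file path, and the 128 MiB block size are all assumptions made for illustration, and the adjustments a real input format may apply (such as configured minimum and maximum slice sizes) are left out. It only shows how a file is cut into block-sized slices and how the slice count gives the number of MapTasks.

```python
from dataclasses import dataclass

# Assumed block size of 128 MiB; the real value comes from the cluster configuration.
BLOCK_SIZE = 128 * 1024 * 1024


@dataclass(frozen=True)
class Slice:
    """A logical slice of one file (hypothetical type, for illustration only)."""
    path: str
    start: int   # byte offset where the slice begins
    length: int  # number of bytes covered by the slice


def compute_slices(path, file_size, slice_size=BLOCK_SIZE):
    """Cut a file of file_size bytes into consecutive slices of at most slice_size bytes."""
    slices = []
    offset = 0
    while offset < file_size:
        # The last slice may be shorter than slice_size.
        length = min(slice_size, file_size - offset)
        slices.append(Slice(path, offset, length))
        offset += length
    return slices


# A hypothetical 300 MiB input file.
slices = compute_slices("/data/input.txt", 300 * 1024 * 1024)

# One MapTask instance per slice, so the slice count is the map-stage parallelism.
map_task_count = len(slices)
```

With these assumed values the file is cut into two 128 MiB slices and one 44 MiB slice, so the map stage would run three MapTask instances. Passing a smaller `slice_size` yields more slices and therefore more parallel MapTasks.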

 

 

 

     

 


Origin www.cnblogs.com/lihui001/p/12516712.html