A MapReduce program that sorts the numbers in multiple files and outputs the result (with example)

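A MapReduce example follows:

  Data sorting (sortDemo):

          Merge and sort the records in sortfile1.txt, sortfile2.txt, and sortfile3.txt, then write them to a single output file with line numbers.

 Write a MapReduce program that does the above:

         Analysis: to use MapReduce's built-in sort, the job must go through the shuffle phase, so a reducer is required.
               1. Write the mapper
                         Convert <k1,v1> (line offset, line content)  --> <k2,v2> (line value, 1)
               2. Write the reducer
                         Receive <k2,[v2,v2,v2...]> from the mapper side; by this point the data is already sorted by key.
                         Iterate over values: context.write(linesum, k2); linesum++;
               3. Write the driver
                         Standard job setup (details omitted here).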
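
The three steps above can be sketched in plain Java without Hadoop, to show why the reducer sees the keys already in order. This is only a simulation under stated assumptions: the class name `LocalSortSketch` and the method `sortWithLineNumbers` are hypothetical, a `TreeMap` stands in for the shuffle's sort by key, and a single reducer is assumed. Skipping blank lines is also an assumption that the original program does not make.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, Hadoop-free simulation of the map -> shuffle -> reduce flow.
class LocalSortSketch {

    // Takes the raw lines of all input files and returns "lineNumber<TAB>value" rows.
    static List<String> sortWithLineNumbers(List<String> lines) {
        // "Map" + "shuffle": emit (value, 1) and group by key.
        // TreeMap keeps its keys sorted, standing in for the framework's sort by key.
        TreeMap<Long, List<Long>> grouped = new TreeMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: blank lines are skipped
            }
            long k2 = Long.parseLong(trimmed);
            grouped.computeIfAbsent(k2, k -> new ArrayList<>()).add(1L);
        }

        // "Reduce": write one row per value in each group, so duplicates are kept.
        List<String> out = new ArrayList<>();
        long lineSum = 1;
        for (Map.Entry<Long, List<Long>> entry : grouped.entrySet()) {
            for (Long ignored : entry.getValue()) {
                out.add(lineSum + "\t" + entry.getKey());
                lineSum++;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("5", "3", "45", "3", "2");
        for (String row : sortWithLineNumbers(input)) {
            System.out.println(row);
        }
    }
}
```

Emitting one row per value, rather than one row per key, is what keeps repeated numbers in the final output.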

1.Mapper.class

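import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class SortMapper extends Mapper<LongWritable,Text,LongWritable,LongWritable> {
	private static final LongWritable ONE = new LongWritable(1);
	@Override
	protected void map(LongWritable key, Text value, Context context)
			throws IOException, InterruptedException {
		// Each input line holds one number; emit it as the key so the shuffle sorts it numerically
		String line = value.toString().trim();
		context.write(new LongWritable(Long.parseLong(line)), ONE);
	}
}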
Here <k1,v1> is <line offset, line content (as Text)>;

        <k2,v2> is <line value (as LongWritable), 1>.

2.Reducer.class

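import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Reducer;

public class SortReducer extends Reducer<LongWritable,LongWritable,LongWritable,LongWritable> {
	// Running line number; globally correct only when the job runs a single reduce task
	private final LongWritable linesum = new LongWritable(1);
	@Override
	protected void reduce(LongWritable key, Iterable<LongWritable> values, Context context)
			throws IOException, InterruptedException {
		// Write one row per occurrence so that duplicate numbers are kept
		for (LongWritable v : values) {
			context.write(linesum, key);
			linesum.set(linesum.get() + 1);
		}
	}
}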
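
The line counter in the reducer is only meaningful when one reduce task handles every key. As a rough illustration of what could go wrong otherwise, the sketch below simulates two reducers in plain Java. Everything in it is hypothetical: `PartitionSketch`, `moduloPartition`, `rangePartition`, and `runReducers` are illustrative names, and the modulo function is only a simplified stand-in for hash partitioning, not Hadoop's actual partitioner code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.LongToIntFunction;

// Hypothetical simulation: how the partitioning scheme affects global order.
class PartitionSketch {

    // Simplified stand-in for hash partitioning: key modulo the number of reducers.
    static int moduloPartition(long key, int numReducers) {
        return (int) Math.floorMod(key, (long) numReducers);
    }

    // Range partitioning for two reducers: keys below splitPoint go to reducer 0.
    static int rangePartition(long key, long splitPoint) {
        return key < splitPoint ? 0 : 1;
    }

    // Sorts each partition on its own, then concatenates them in reducer order.
    static List<Long> runReducers(List<Long> keys, int numReducers, LongToIntFunction partitioner) {
        List<List<Long>> parts = new ArrayList<>();
        for (int i = 0; i < numReducers; i++) {
            parts.add(new ArrayList<>());
        }
        for (long k : keys) {
            parts.get(partitioner.applyAsInt(k)).add(k);
        }
        List<Long> out = new ArrayList<>();
        for (List<Long> part : parts) {
            Collections.sort(part); // each reducer sees its own keys in sorted order
            out.addAll(part);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Long> keys = Arrays.asList(5L, 2L, 4L, 3L);
        // Modulo partitioning: each part is sorted, but the concatenation is not.
        System.out.println(runReducers(keys, 2, k -> moduloPartition(k, 2)));
        // Range partitioning: the concatenation is globally sorted.
        System.out.println(runReducers(keys, 2, k -> rangePartition(k, 4L)));
    }
}
```

With a single reducer neither problem arises, which is the setup this example relies on.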

3.Driver.class (main class)

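import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SortDriver {
	public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
		Configuration conf = new Configuration();
		Job job = Job.getInstance(conf);
		job.setJarByClass(SortDriver.class);
		job.setJobName("mysort");
		job.setMapperClass(SortMapper.class);   // parses each input line into a number
		job.setReducerClass(SortReducer.class); // numbers the sorted records
		job.setNumReduceTasks(1);               // a single reducer keeps the output globally sorted

		// Map output and final output share the same key/value types
		job.setOutputKeyClass(LongWritable.class);
		job.setOutputValueClass(LongWritable.class);
		// The mysort directory holds the three sortfile*.txt files; results go to outsort
		FileInputFormat.addInputPath(job, new Path("file:///D:/mysort"));
		FileOutputFormat.setOutputPath(job, new Path("file:///D:/outsort"));
		System.exit(job.waitForCompletion(true) ? 0 : 1);
	}
}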
4.Output:
1	2    (left column: line number; right column: the sorted number)
2	3
3	3
4	3
5	3
6	4
7	4
8	4
9	5
10	5
11	5
12	5
13	6
14	7
15	45
16	56
17	56
18	67
19	67
20	67
21	76
22	78
23	78
24	88
25	98
26	123
27	345
28	690
29	988
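
A result like the one above can be checked mechanically. The sketch below is a hypothetical helper (`SortedOutputCheck` and `isValid` are illustrative names, not part of the original program); it assumes each row has the form line number, tab, value, and verifies that line numbers run 1, 2, 3, ... and that values never decrease.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical checker for rows of the form "lineNumber<TAB>value".
class SortedOutputCheck {

    // Returns true when line numbers are consecutive from 1 and values never decrease.
    static boolean isValid(List<String> rows) {
        long expectedLine = 1;
        long previous = Long.MIN_VALUE;
        for (String row : rows) {
            String[] cols = row.split("\t");
            if (cols.length != 2) {
                return false; // malformed row
            }
            long lineNo;
            long value;
            try {
                lineNo = Long.parseLong(cols[0].trim());
                value = Long.parseLong(cols[1].trim());
            } catch (NumberFormatException e) {
                return false; // non-numeric column
            }
            if (lineNo != expectedLine || value < previous) {
                return false;
            }
            expectedLine++;
            previous = value;
        }
        return true;
    }

    public static void main(String[] args) {
        // The first four rows of the output listed above
        List<String> rows = Arrays.asList("1\t2", "2\t3", "3\t3", "4\t3");
        System.out.println(isValid(rows));
    }
}
```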

Reposted from blog.csdn.net/xiaozelulu/article/details/81020085