hadoop--mapreduce--custom key type

Problem:

A sample of input file A is shown below (note that the file is tab-delimited; check this when pasting):

20170101     x

20170102     y

20170103     x

20170104     y

20170105     z

20170106     x

A sample of input file B is shown below:

20170101      y

20170102      y

20170103      x

20170104      z

20170105      y

A sample of output file C, obtained by merging input files A and B, is shown below:

20170101      x

20170101      y

20170102      y

20170103      x

20170104      y

20170104      z

20170105      y

20170105      z

20170106      x
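
Put another way, C is the sorted union of the lines of A and B with duplicates removed. Below is a minimal plain-Java sketch of that expected result, without Hadoop. The class name `MergeSketch` and the helper `merge` are made up for illustration, and the sample lines are typed in with explicit tab separators.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

public class MergeSketch {
    // Hypothetical helper: sorted, de-duplicated union of the lines of two files.
    static List<String> merge(List<String> a, List<String> b) {
        // TreeSet drops duplicates and keeps the natural String order.
        TreeSet<String> union = new TreeSet<>(a);
        union.addAll(b);
        return new ArrayList<>(union);
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList(
                "20170101\tx", "20170102\ty", "20170103\tx",
                "20170104\ty", "20170105\tz", "20170106\tx");
        List<String> b = Arrays.asList(
                "20170101\ty", "20170102\ty", "20170103\tx",
                "20170104\tz", "20170105\ty");
        // Prints the nine lines of sample file C, one per line.
        for (String line : merge(a, b)) {
            System.out.println(line);
        }
    }
}
```

For these ASCII-only lines, the String ordering used here agrees with the byte-wise ordering Hadoop applies to `Text` keys.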

Code implementation:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Task1 {
    // Emit each input line as the key; the value is only an empty placeholder.
    public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            context.write(value, new Text(""));
        }
    }

    // Identical lines arrive grouped under one key, so writing the key once removes duplicates.
    public static class ReduceClass extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            context.write(key, new Text(""));
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf); // replaces the deprecated new Job(conf)
        job.setJarByClass(Task1.class);
        job.setMapperClass(MapClass.class);
        job.setReducerClass(ReduceClass.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Both input files feed the same job; the output directory must not exist yet.
        FileInputFormat.addInputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\input2.txt"));
        FileInputFormat.addInputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\input1.txt"));
        FileOutputFormat.setOutputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
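
The job above removes duplicates because the shuffle groups all values that share a key, and the reducer writes each key only once. Here is a rough plain-Java simulation of that flow, as an illustration rather than Hadoop's actual machinery. `ShuffleSketch`, `mapAndShuffle`, and `reduce` are hypothetical names, and a `TreeMap` stands in for the sorted, grouped shuffle output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class ShuffleSketch {
    // Map phase plus shuffle: every line becomes a key with an empty value,
    // and values that share a key are collected into one group, sorted by key.
    static TreeMap<String, List<String>> mapAndShuffle(List<String> lines) {
        TreeMap<String, List<String>> groups = new TreeMap<>();
        for (String line : lines) {
            groups.computeIfAbsent(line, k -> new ArrayList<>()).add("");
        }
        return groups;
    }

    // Reduce phase: one output record per key, however many values the group holds.
    static List<String> reduce(Map<String, List<String>> groups) {
        return new ArrayList<>(groups.keySet());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("20170102\ty", "20170101\tx", "20170102\ty");
        TreeMap<String, List<String>> groups = mapAndShuffle(lines);
        // The duplicated line forms a single group holding two empty values.
        System.out.println(groups.get("20170102\ty").size()); // prints 2
        // Two distinct lines remain after the reduce step.
        System.out.println(reduce(groups).size()); // prints 2
    }
}
```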

 Result:

 

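 Pitfalls I ran into:

  Reasons the reduce step may not execute:

    1. The program threw an exception; the logs can be used to debug it;

    2. The parameter types do not match;

    and so on.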
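
One common form of the type-mismatch pitfall: if the parameter types of `reduce` differ from the ones the framework calls (for example `Iterator` where `Iterable` is expected), the method becomes an overload rather than an override, and the base-class default runs instead. The sketch below reproduces this in plain Java without Hadoop. `BaseReducer`, `BrokenReducer`, `FixedReducer`, and `run` are hypothetical stand-ins, and the pass-through default only mimics the behaviour assumed for the framework's base class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class OverrideSketch {
    // Hypothetical stand-in for a framework reducer: the framework only ever calls
    // reduce(String, Iterable<String>, List<String>); the default copies every value through.
    static class BaseReducer {
        void reduce(String key, Iterable<String> values, List<String> out) {
            for (String v : values) {
                out.add(key + "=" + v);
            }
        }
    }

    // Wrong: Iterator instead of Iterable makes this an overload, not an override,
    // so the framework never calls it. Adding @Override here would fail to compile.
    static class BrokenReducer extends BaseReducer {
        void reduce(String key, Iterator<String> values, List<String> out) {
            out.add(key);
        }
    }

    // Right: the parameter types match, so @Override compiles and this method runs.
    static class FixedReducer extends BaseReducer {
        @Override
        void reduce(String key, Iterable<String> values, List<String> out) {
            out.add(key);
        }
    }

    // Plays the role of the framework: always dispatches through the base signature.
    static List<String> run(BaseReducer reducer) {
        List<String> out = new ArrayList<>();
        Iterable<String> values = Arrays.asList("a", "b");
        reducer.reduce("k", values, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(run(new BrokenReducer())); // prints [k=a, k=b]
        System.out.println(run(new FixedReducer()));  // prints [k]
    }
}
```

Annotating `map` and `reduce` with `@Override` turns this kind of silent mismatch into a compile error.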


Reposted from www.cnblogs.com/z-bear/p/9846089.html