Having studied HBase for a while, here is a simple WordCount exercise: read text from HDFS with MapReduce and write the word counts into an HBase table.
- WordCountRunner
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;

public class HBaseWordCountRunner {
    public static void main(String[] args) throws Exception {
        // HBaseConfiguration.create() loads hbase-site.xml on top of the Hadoop config
        Configuration conf = HBaseConfiguration.create();
        conf.set("fs.defaultFS", "hdfs://localhost:8020");
        conf.set("hbase.zookeeper.quorum", "localhost");

        Job job = Job.getInstance(conf, "hbase-wordcount");
        job.setJarByClass(HBaseWordCountRunner.class);

        job.setMapperClass(WCMapper.class);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // Wires up the reducer and TableOutputFormat so results land in the "wc" table
        TableMapReduceUtil.initTableReducerJob("wc", WCReducer.class, job);

        FileInputFormat.addInputPath(job, new Path("/test"));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
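One step the runner does not cover: `TableOutputFormat` checks at submit time that the output table already exists, so `wc` has to be created first. A hypothetical setup in the HBase shell, assuming a local instance (table name `wc` and column family `cf` match what the reducer writes):

```
# inside `hbase shell`
create 'wc', 'cf'
```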
- WCMapper: written exactly like a plain MapReduce mapper
import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class WCMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Split on runs of whitespace so consecutive spaces don't emit empty words
        for (String token : value.toString().split("\\s+")) {
            word.set(token);
            context.write(word, ONE);
        }
    }
}
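One tokenization detail worth checking: `split(" ")` preserves empty tokens between consecutive spaces, while splitting on a whitespace run (`"\\s+"`) does not. A small standalone demo:

```java
public class SplitDemo {
    public static void main(String[] args) {
        String line = "hello  hbase hello";   // note the double space

        // split(" ") yields an empty token for the double space
        String[] naive = line.split(" ");
        System.out.println(naive.length);     // 4: "hello", "", "hbase", "hello"

        // splitting on whitespace runs drops the empty token
        String[] tokens = line.split("\\s+");
        System.out.println(tokens.length);    // 3: "hello", "hbase", "hello"
    }
}
```

With `split(" ")`, the empty strings would become rows keyed by an empty rowkey in HBase, which `Put` rejects.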
- WCReducer: unlike a plain MR reducer, it extends TableReducer;
the reducer's output key type is ImmutableBytesWritable and its output value is an HBase Mutation (here a Put)
import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableReducer;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;

public class WCReducer extends TableReducer<Text, IntWritable, ImmutableBytesWritable> {

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable value : values) {
            sum += value.get();
        }
        // Write into HBase: rowkey = word, column = cf:ct, value = count
        Put put = new Put(Bytes.toBytes(key.toString()));
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("ct"),
                Bytes.toBytes(String.valueOf(sum)));
        // TableOutputFormat ignores the output key, so null is acceptable here
        context.write(null, put);
    }
}
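The counting logic itself (map emits `(word, 1)`, reduce sums the ones) can be sanity-checked in plain Java without a cluster. This sketch only mirrors the logic, not the HBase write path; the class and method names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class WordCountCheck {

    // Mirrors WCMapper (tokenize) and WCReducer (sum) in memory
    static Map<String, Integer> count(String[] lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            for (String word : line.split("\\s+")) {  // "map" phase: one token = one (word, 1)
                counts.merge(word, 1, Integer::sum);  // "reduce" phase: sum the ones per word
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> c = count(new String[]{"hello hbase", "hello hadoop"});
        System.out.println(c.get("hello"));   // 2
    }
}
```

After the real job finishes, the same counts should be visible with `scan 'wc'` in the HBase shell.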