Reading One HBase Table and Writing Another with a Mapper-Only MapReduce Job

Scenario:

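      When working with HBase from MapReduce, a reduce phase is sometimes unnecessary: the Mapper alone reads table A, processes each row, and writes the result to table B. In other words, A is the input table and B is the output table. The requirement here: table A has a column E:E, and every row whose E:E value consists only of digits should be saved to table B.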

Table definitions (HBase shell):

     create 'A','E'

     create 'B','E'

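Option 1: open table B directly in the Mapper and, if the value is numeric, Put it straight into B. This is fairly simple, so its pros and cons are not discussed here.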

Option 2: use TableMapReduceUtil.initTableMapperJob with table A as the input, set outputValueClass to Put, and then set the output table (TableOutputFormat.OUTPUT_TABLE). No reduce phase is needed.

Implementation:

Job:

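private static void runJob() {
	String outputTableName = "B";
	String inputTableName = "A";
	Configuration conf = HBaseConfiguration.create();
	// XXX stands for the cluster-specific master and ZooKeeper addresses.
	conf.set("hbase.master", XXX);
	conf.set("hbase.zookeeper.quorum", XXX);
	conf.set("hbase.cluster.distributed", "true");
	// TableOutputFormat writes every emitted Put to this table.
	conf.set(TableOutputFormat.OUTPUT_TABLE, outputTableName);
	try {
		Scan scan = new Scan();
		Job job = new Job(conf, "DataFormat Task");

		job.setJarByClass(DataFormatTask.class);
		// Table A is the input; the mapper emits (NullWritable, Put) pairs.
		TableMapReduceUtil.initTableMapperJob(inputTableName, scan,
				DataFormatMapper.class, NullWritable.class, Put.class, job);

		job.setOutputFormatClass(TableOutputFormat.class);
		// Map-only job: with zero reducers, mapper output goes straight to the output format.
		job.setNumReduceTasks(0);
		// waitForCompletion returns false when the job fails, so check it.
		if (!job.waitForCompletion(true)) {
			throw new IllegalStateException("DataFormat Task failed");
		}

	} catch (Throwable e) {
		throw new RuntimeException("Run DataFormatTask error! ", e);
	} finally {
		// Release the HBase connection associated with this configuration.
		HConnectionManager.deleteConnection(conf, true);
	}

}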

Main:

public static void main(String[] args) {
	runJob();
}

DataFormatMapper:

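// map() method of DataFormatMapper, which extends TableMapper<NullWritable, Put>.
@Override
protected void map(ImmutableBytesWritable key, Result value,
	Context context) throws IOException,
	InterruptedException {
	LOGGER.info("key:" + Bytes.toString(key.get()));
	LOGGER.info("row:" + Bytes.toString(value.getRow()));
	// getValue returns null when the row has no E:E cell; isDigits(null) is false.
	String val = Bytes.toString(value.getValue(Bytes.toBytes("E"), Bytes.toBytes("E")));
	// Skip rows whose E:E value is not made up solely of digits.
	if (!NumberUtils.isDigits(val)) {
		return;
	}
	// Reuse the row key from table A for the row written to table B.
	Put put = new Put(key.get());
	put.add(Bytes.toBytes("E"), Bytes.toBytes("E"), Bytes.toBytes(val));
	// TableOutputFormat ignores the key, so NullWritable is enough.
	context.write(NullWritable.get(), put);
}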
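
The digit check used in map() can be tried without a cluster. Below is a minimal sketch in plain Java, assuming the same semantics as commons-lang NumberUtils.isDigits (null and empty strings are rejected, and every character must be a digit). DigitFilter, isDigits and keepNumeric are hypothetical names that do not appear in the job above, and the in-memory map only stands in for tables A and B.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not part of the original job.
final class DigitFilter {

    // Assumed to match NumberUtils.isDigits: null and "" are rejected,
    // and every character must be a digit (so "-7" and "1.5" fail).
    static boolean isDigits(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (int i = 0; i < s.length(); i++) {
            if (!Character.isDigit(s.charAt(i))) {
                return false;
            }
        }
        return true;
    }

    // Stand-in for the map-only job: copies only the rows whose value is
    // all digits, keeping the original row keys and insertion order.
    static Map<String, String> keepNumeric(Map<String, String> tableA) {
        Map<String, String> tableB = new LinkedHashMap<>();
        for (Map.Entry<String, String> row : tableA.entrySet()) {
            if (isDigits(row.getValue())) {
                tableB.put(row.getKey(), row.getValue());
            }
        }
        return tableB;
    }

    public static void main(String[] args) {
        Map<String, String> tableA = new LinkedHashMap<>();
        tableA.put("row1", "12345");
        tableA.put("row2", "12a45");
        tableA.put("row3", "-7");
        tableA.put("row4", null); // row without an E:E cell
        System.out.println(keepNumeric(tableA).keySet()); // prints [row1]
    }
}
```

Note that under this check signed or decimal values such as "-7" or "1.5" are not treated as numbers, so those rows would not be copied to B.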


Reposted from snv.iteye.com/blog/1825547