2018-08-06 issue: MapReduce MRUnit installation and unit testing

I. MRUnit test JAR

mrunit-1.1.0-hadoop2.jar

Third-party dependencies

MRUnit\apache-mrunit-1.1.0-hadoop1-bin\lib

1.png

II. Configuring MRUnit unit testing in an existing project

1. Create a new user library

2.png

2. Add MRUnitLib to the MR project, as shown in the figure below:

3.png

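3. Resolve the JAR conflict

MRUnitLib contains mockito-core-1.9.5.jar, which conflicts with E:\depslib\hadoop-2.4.1\share\hadoop\common\lib\mockito-all-1.8.5.jar, so mockito-all-1.8.5.jar has to be removed from the mrlib library.

With that, the MRUnit unit test environment for the project is set up.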

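III. Running MRUnit unit tests

Using the WordCount MR program written earlier as the example, we test whether the Mapper stage, the Reducer stage, and the whole Job behave correctly.

Test case code

1. Mapper stage test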

    // Imports needed by the test class for this method
    // (the new-API driver is assumed, since the Mapper writes through context.write)
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mrunit.mapreduce.MapDriver;
    import org.junit.Test;

    /**
     * MRUnit unit test for WordCountMapper.
     *
     * @throws Exception if the driver fails to run the mapper
     */
    @Test
    public void mapperTest() throws Exception {
        // Create the WordCountMapper instance under test
        WordCountMapper wordCountMapper = new WordCountMapper();
        // Create the map driver. MapDriver<K1, V1, K2, V2> mirrors Mapper<K1, V1, K2, V2>;
        // the constructor takes the Mapper to run.
        MapDriver<LongWritable, Text, Text, IntWritable> mapDriver = new MapDriver<>(wordCountMapper);
        // Map input data
        mapDriver.withInput(new LongWritable(1), new Text("Hello word"))
                 .withInput(new LongWritable(3), new Text("Hello java java is a good language"));
        // Expected map output, in the order the Mapper emits it
        mapDriver.withOutput(new Text("Hello"), new IntWritable(1))
                 .withOutput(new Text("word"), new IntWritable(1))
                 .withOutput(new Text("Hello"), new IntWritable(1))
                 .withOutput(new Text("java"), new IntWritable(1))
                 .withOutput(new Text("java"), new IntWritable(1))
                 .withOutput(new Text("is"), new IntWritable(1))
                 .withOutput(new Text("a"), new IntWritable(1))
                 .withOutput(new Text("good"), new IntWritable(1))
                 .withOutput(new Text("language"), new IntWritable(1));
        // Run the test: the expected output is compared with what the Mapper actually emits.
        // A mismatch fails the test; a match passes it.
        mapDriver.runTest();
    }

Test result:

The test passes.
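To see why the nine expected pairs are listed in exactly that order, here is a minimal plain-Java model of the map step. It is only a sketch: the post does not show WordCountMapper itself, so splitting each line on whitespace and emitting (word, 1) is an assumption, and `MapperModelSketch` is a hypothetical name that uses no Hadoop or MRUnit classes.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of the word-count map step.
// Assumption: the real WordCountMapper splits each line on whitespace
// and emits (word, 1) for every token, in input order.
class MapperModelSketch {

    // Returns the emitted pairs as "word=1" strings, in emission order.
    static List<String> map(String... lines) {
        List<String> out = new ArrayList<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    out.add(word + "=1");
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> pairs = map("Hello word", "Hello java java is a good language");
        System.out.println(pairs.size() + " pairs: " + pairs);
    }
}
```

Under that assumption the two input lines yield 2 + 7 = 9 pairs, with duplicates such as Hello and java emitted separately because nothing is aggregated at the map stage.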

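Now suppose we change the Mapper's output slightly and run the same test case again to see whether it still passes.

In the Mapper code, change

    context.write(new Text(word), new IntWritable(1));

to

    context.write(new Text(word), new IntWritable(2));

Run the unit test again:

The test fails. Part of the output:

Missing expected output (Hello, 1) at position 0, got (Hello, 2).

......

Missing expected output (language, 1) at position 8, got (language, 2).

These messages show that the Mapper output no longer matches the output the test case expects.

output (Hello, 1) at position 0, got (Hello, 2).

This means we expected (Hello, 1), but the Mapper actually emitted (Hello, 2). From here we can work out whether the fault lies in the test case or in the Mapper's business logic. In this case the cause is clearly our own deliberate change to the Mapper.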
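The messages above come from a position-by-position comparison of expected and actual records. The sketch below illustrates that idea in plain Java. It is not MRUnit's implementation: `OrderedCompareSketch` and `diff` are hypothetical names, and the message text merely imitates the log format shown above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified illustration of an ordered expected-vs-actual comparison.
// This is NOT MRUnit code; it only imitates the shape of its messages.
class OrderedCompareSketch {

    // Compares the two lists index by index and reports every mismatch.
    static List<String> diff(List<String> expected, List<String> actual) {
        List<String> errors = new ArrayList<>();
        for (int i = 0; i < expected.size(); i++) {
            String got = i < actual.size() ? actual.get(i) : "nothing";
            if (!expected.get(i).equals(got)) {
                errors.add("Missing expected output " + expected.get(i)
                        + " at position " + i + ", got " + got + ".");
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("(Hello, 1)", "(word, 1)");
        List<String> actual = Arrays.asList("(Hello, 2)", "(word, 2)");
        diff(expected, actual).forEach(System.out::println);
    }
}
```

Because every emitted value changed from 1 to 2, every position mismatches, which is why the real log reports an error for each of positions 0 through 8.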

2. Reducer stage test

    // Additional imports needed by the test class for this method
    import java.util.ArrayList;
    import java.util.List;
    import org.apache.hadoop.mrunit.mapreduce.ReduceDriver;

    /**
     * MRUnit unit test for WordCountReducer.
     *
     * @throws Exception if the driver fails to run the reducer
     */
    @Test
    public void reducerTest() throws Exception {
        // Create the WordCountReducer instance under test
        WordCountReducer wordCountReducer = new WordCountReducer();
        // Create the reduce driver. ReduceDriver<K3, V3, K4, V4> mirrors
        // Reducer<Text, IntWritable, Text, IntWritable>; the constructor takes the Reducer to run.
        ReduceDriver<Text, IntWritable, Text, IntWritable> reduceDriver = new ReduceDriver<>(wordCountReducer);
        // Reduce input data: build one V3 value list per key
        List<IntWritable> v3list1 = new ArrayList<IntWritable>();
        v3list1.add(new IntWritable(1));
        v3list1.add(new IntWritable(1));
        List<IntWritable> v3list2 = new ArrayList<IntWritable>();
        v3list2.add(new IntWritable(1));
        v3list2.add(new IntWritable(1));
        v3list2.add(new IntWritable(1));
        reduceDriver.withInput(new Text("Hello"), v3list1)
                    .withInput(new Text("Java"), v3list2);
        // Expected reduce output
        reduceDriver.withOutput(new Text("Hello"), new IntWritable(2));
        reduceDriver.withOutput(new Text("Java"), new IntWritable(3));
        // Run the Reduce unit test
        reduceDriver.runTest();
    }

Run the unit test:

The test passes.

Next, change

    reduceDriver.withOutput(new Text("Hello"), new IntWritable(2));

to

    reduceDriver.withOutput(new Text("Hello"), new IntWritable(1));

This deliberately simulates a faulty test case.

Running the test again fails, with this log:

2018-08-01 16:11:28,769 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (Hello, 1) at position 0, got (Hello, 2).

The log shows that we expected (Hello, 1) but the Reducer produced (Hello, 2), so the test fails. We can use this behavior to examine both the Reducer and the test case and locate the problem. Here we altered the test case on purpose, so the fault lies in the test case.
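The reduce step being tested can be modeled in a few lines of plain Java. This is a sketch only: the post does not show WordCountReducer, so summing the values for each key is an assumption, and `ReducerModelSketch` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java model of the word-count reduce step.
// Assumption: the real WordCountReducer sums all values for a key.
class ReducerModelSketch {

    // Sums the grouped values for one key.
    static int reduce(List<Integer> values) {
        int sum = 0;
        for (int v : values) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println("Hello -> " + reduce(Arrays.asList(1, 1)));
        System.out.println("Java -> " + reduce(Arrays.asList(1, 1, 1)));
    }
}
```

With the inputs from the test, Hello sums to 2 and Java to 3, so a test case expecting (Hello, 1) cannot match a correct reducer.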

3. Job test, covering the Mapper and the Reducer together

    // Additional import needed by the test class for this method
    import org.apache.hadoop.mrunit.mapreduce.MapReduceDriver;

    /**
     * End-to-end MRUnit unit test for WordCount.
     *
     * @throws Exception if the driver fails to run the job
     */
    @Test
    public void jobTest() throws Exception {
        // Create the WordCountMapper instance under test
        WordCountMapper wordCountMapper = new WordCountMapper();
        // Create the WordCountReducer instance under test
        WordCountReducer wordCountReducer = new WordCountReducer();
        // Create the driver. MapReduceDriver<K1, V1, K2, V2, K4, V4> mirrors Mapper<K1, V1, K2, V2>
        // and Reducer<K3, V3, K4, V4>, where K3/V3 have the same types as K2/V2.
        // The constructor takes the Mapper and the Reducer to run.
        MapReduceDriver<LongWritable, Text, Text, IntWritable, Text, IntWritable> driver =
                new MapReduceDriver<>(wordCountMapper, wordCountReducer);
        // Mapper input
        driver.withInput(new LongWritable(1), new Text("Hello word"))
              .withInput(new LongWritable(1), new Text("Hello java java is a good language"));
        // Expected Reducer output
        driver.withOutput(new Text("Hello"), new IntWritable(2))
              .withOutput(new Text("word"), new IntWritable(1))
              .withOutput(new Text("java"), new IntWritable(2))
              .withOutput(new Text("is"), new IntWritable(1))
              .withOutput(new Text("a"), new IntWritable(1))
              .withOutput(new Text("good"), new IntWritable(1))
              .withOutput(new Text("language"), new IntWritable(1));
        // Run the test: the expected output is compared with the Reducer's actual output.
        // A mismatch fails the test; a match passes it.
        driver.runTest();
    }

Test result:

The test fails, with this log:

2018-08-01 16:24:33,724 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (word, 1) at position 1, got (a, 1).

2018-08-01 16:24:33,725 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (java, 2) at position 2, got (good, 1).

2018-08-01 16:24:33,725 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (a, 1) at position 4, got (java, 2).

2018-08-01 16:24:33,726 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (good, 1) at position 5, got (language, 1).

2018-08-01 16:24:33,726 ERROR [main] mrunit.TestDriver (Errors.java:record(57)) - Missing expected output (language, 1) at position 6, got (word, 1).

Looking at the first message:

expected output (word, 1) at position 1, got (a, 1).

It says that position 1, where we expected (word, 1), actually held (a, 1), which does not match our expectation.

We now have to work out whether the fault lies in the test case or in the Mapper or Reducer. Closer analysis shows that the data MapReduce outputs is sorted by key: keys are ordered by the default sort rule. For Text keys that rule compares raw bytes, so the uppercase "Hello" sorts ahead of every lowercase word. The expected keys in our test case, however, were not sorted. Every count is correct; only the order is wrong.
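The key order that the job produces can be reproduced without Hadoop. The sketch below counts words into a `TreeMap`, whose natural `String` ordering matches byte-wise ordering for ASCII keys like the ones here. `KeyOrderSketch` is a hypothetical name, and splitting on whitespace is an assumption about the Mapper.

```java
import java.util.ArrayList;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of "count words, output sorted by key".
// For ASCII words, String's natural order equals byte-wise order,
// so uppercase letters sort before lowercase ones.
class KeyOrderSketch {

    static Map<String, Integer> wordCount(String... lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            for (String word : line.trim().split("\\s+")) {
                if (!word.isEmpty()) {
                    counts.merge(word, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts =
                wordCount("Hello word", "Hello java java is a good language");
        // Print the keys in sorted order
        System.out.println(new ArrayList<>(counts.keySet()));
    }
}
```

The keys come out as Hello, a, good, is, java, language, word, which is the order in which the expected outputs have to be declared for an order-sensitive comparison to pass.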

Next we sort the expected output according to MapReduce's default sort rule:

    // Expected Reducer output
    driver.withOutput(new Text("Hello"), new IntWritable(2))
          .withOutput(new Text("a"), new IntWritable(1))
          .withOutput(new Text("good"), new IntWritable(1))
          .withOutput(new Text("is"), new IntWritable(1))
          .withOutput(new Text("java"), new IntWritable(2))
          .withOutput(new Text("language"), new IntWritable(1))
          .withOutput(new Text("word"), new IntWritable(1));

The expected keys are now in MapReduce's default sort order.

Running the unit test again passes.


Reposted from blog.51cto.com/2951890/2155096