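Other explanations online attribute the following error to mismatched key/value types between a MapReduce job's input and output, or to mismatched job input and output formats: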
java.lang.NullPointerException
    at org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:970)
    at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:673)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
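However, I checked my code carefully and those types and formats were all correct. Comparing it against an earlier version that worked, I found the cause: even when a job has no map input, the map output key and value classes still have to be set explicitly: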
job.getConfiguration().set("mapred.mapoutput.key.class", "org.apache.hadoop.io.Text");
job.getConfiguration().set("mapred.mapoutput.value.class", "org.apache.hadoop.io.Text");
With these two settings in place, the job runs correctly.
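A likely reason the fix works, sketched as a simplified model rather than Hadoop's actual code: when the map output key class is not set, Hadoop 1.x appears to fall back to the job output key class, and the serializer lookup returns null for a class that no registered serialization accepts (by default, anything that is not a Writable). The class MapOutputClassModel and everything inside it are hypothetical stand-ins; only the two configuration key names come from Hadoop.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model, NOT Hadoop's real implementation. All type and method
// names below are hypothetical; only the configuration keys are Hadoop's.
public class MapOutputClassModel {
    // Stand-in for org.apache.hadoop.io.Writable.
    interface Writable {}

    // Stand-in for org.apache.hadoop.io.Text, which is a Writable.
    static class Text implements Writable {}

    // Stand-in for a job output key type that is not a Writable.
    static class NotWritable {}

    // Use the explicit map output key class if present, otherwise fall back
    // to the job output key class (assumed fallback behaviour).
    static Class<?> mapOutputKeyClass(Map<String, Class<?>> conf) {
        Class<?> c = conf.get("mapred.mapoutput.key.class");
        return c != null ? c : conf.get("mapred.output.key.class");
    }

    // Returns the name of a serialization that accepts the class,
    // or null when none does.
    static String findSerialization(Class<?> c) {
        return Writable.class.isAssignableFrom(c) ? "WritableSerialization" : null;
    }

    // Mimics the failing call: the lookup result is used without a null check.
    static int serializerNameLength(Map<String, Class<?>> conf) {
        return findSerialization(mapOutputKeyClass(conf)).length();
    }

    public static void main(String[] args) {
        Map<String, Class<?>> conf = new HashMap<>();
        conf.put("mapred.output.key.class", NotWritable.class);
        try {
            serializerNameLength(conf);
        } catch (NullPointerException e) {
            System.out.println("NPE without an explicit map output key class");
        }

        // The fix from this post: set the map output key class explicitly.
        conf.put("mapred.mapoutput.key.class", Text.class);
        System.out.println(findSerialization(mapOutputKeyClass(conf)));
    }
}
```

In a real job, the typed setters job.setMapOutputKeyClass(Text.class) and job.setMapOutputValueClass(Text.class) on org.apache.hadoop.mapreduce.Job should have the same effect as setting the two properties by name, and they let the compiler catch a misspelled class name.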