com.mongodb.hadoop.splitter.SplitFailedException

The following error is thrown when connecting to MongoDB from Spark:

Exception in thread "main" java.io.IOException: com.mongodb.hadoop.splitter.SplitFailedException: Failed to aggregate sample documents. Note that this Splitter implementation is incompatible with MongoDB versions prior to 3.2.
    at com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:62)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:113)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:121)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:912)
    at org.apache.spark.rdd.RDD$$anonfun$foreach$1.apply(RDD.scala:910)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
    at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
    at org.apache.spark.rdd.RDD.foreach(RDD.scala:910)
    at xtwy.EstateNews$.main(EstateNews.scala:29)
    at xtwy.EstateNews.main(EstateNews.scala)
Caused by: com.mongodb.hadoop.splitter.SplitFailedException: Failed to aggregate sample documents. Note that this Splitter implementation is incompatible with MongoDB versions prior to 3.2.
    at com.mongodb.hadoop.splitter.SampleSplitter.calculateSplits(SampleSplitter.java:84)
    at com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:60)
    ... 14 more
Caused by: com.mongodb.MongoCommandException: Command failed with error 9: 'The 'cursor' option is required, except for aggregate with the explain argument' on server localhost:27017. The full response is { "ok" : 0.0, "errmsg" : "The 'cursor' option is required, except for aggregate with the explain argument", "code" : 9, "codeName" : "FailedToParse" }
    at com.mongodb.connection.ProtocolHelper.getCommandFailureException(ProtocolHelper.java:86)
    at com.mongodb.connection.CommandProtocol.execute(CommandProtocol.java:120)
    at com.mongodb.connection.DefaultServer$DefaultServerProtocolExecutor.execute(DefaultServer.java:159)
    at com.mongodb.connection.DefaultServerConnection.executeProtocol(DefaultServerConnection.java:286)
    at com.mongodb.connection.DefaultServerConnection.command(DefaultServerConnection.java:173)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:215)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:206)
    at com.mongodb.operation.CommandOperationHelper.executeWrappedCommandProtocol(CommandOperationHelper.java:112)
    at com.mongodb.operation.AggregateOperation$1.call(AggregateOperation.java:227)
    at com.mongodb.operation.AggregateOperation$1.call(AggregateOperation.java:223)
    at com.mongodb.operation.OperationHelper.withConnectionSource(OperationHelper.java:239)
    at com.mongodb.operation.OperationHelper.withConnection(OperationHelper.java:212)
    at com.mongodb.operation.AggregateOperation.execute(AggregateOperation.java:223)
    at com.mongodb.operation.AggregateOperation.execute(AggregateOperation.java:65)
    at com.mongodb.Mongo.execute(Mongo.java:772)
    at com.mongodb.Mongo$2.execute(Mongo.java:759)
    at com.mongodb.DBCollection.aggregate(DBCollection.java:1377)
    at com.mongodb.DBCollection.aggregate(DBCollection.java:1308)
    at com.mongodb.DBCollection.aggregate(DBCollection.java:1294)
    at com.mongodb.hadoop.splitter.SampleSplitter.calculateSplits(SampleSplitter.java:82)
    ... 15 more
    ......

The error message shows that the splitter is incompatible with the MongoDB version in use: the default SampleSplitter issues an aggregate command that the server rejects with error code 9 ("The 'cursor' option is required"). To resolve this, configure a different splitter:

val mongoConfig = new Configuration()
// Replace the default SampleSplitter with StandaloneMongoSplitter
mongoConfig.set("mongo.splitter.class", "com.mongodb.hadoop.splitter.StandaloneMongoSplitter")

Reposted from blog.csdn.net/u010225915/article/details/79711439