Error:
Exception in thread "main" java.lang.RuntimeException: No new data sinks have been defined since the last execution. The last execution refers to the latest call to 'execute()', 'count()', 'collect()', or 'print()'.
at org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:944)
at org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:926)
at org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:85)
at org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:820)
at org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:525)
at com.yxf.wordcount.BatchWordCount$.main(BatchWordCount.scala:15)
at com.yxf.wordcount.BatchWordCount.main(BatchWordCount.scala)
Code:
import org.apache.flink.api.scala._

object BatchWordCount {
  def main(args: Array[String]): Unit = {
    val env: ExecutionEnvironment = ExecutionEnvironment.getExecutionEnvironment
    val line: DataSet[String] = env.readTextFile("input")
    val result: AggregateDataSet[(String, Int)] = line.flatMap(_.split(" ")).map((_, 1)).groupBy(0).sum(1)
    result.print()  // print() already triggers job execution
    env.execute()   // this extra call throws the exception at line 15 of the stack trace
  }
}
Solution:
The exception means exactly what it says: no new data sink has been defined since the last execution. In the batch (DataSet) API, operators such as count(), collect(), and print() are not only sinks, they also trigger job execution themselves. Because we called print() above, the job was already submitted and executed at that point, so the trailing env.execute() finds nothing left to run and throws the exception. The fix is simply to delete the env.execute() call (or, equivalently, replace print() with a lazy sink such as writeAsText and keep env.execute()).

For readers coming to Flink after learning Spark Streaming, note this small difference: in Flink's batch API, an eager sink like print() submits the job on its own, without an explicit execute-style call.
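A minimal corrected sketch of the program (assuming the same `input` file path as above) just drops the redundant env.execute() call, since print() submits the job by itself:

```scala
import org.apache.flink.api.scala._

object BatchWordCount {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    val line = env.readTextFile("input")
    val result = line
      .flatMap(_.split(" "))
      .map((_, 1))
      .groupBy(0)
      .sum(1)
    // print() is an eager sink: it triggers execution and prints the result,
    // so no separate env.execute() may follow it.
    result.print()
  }
}
```

If you do want an explicit env.execute(), use a non-eager sink instead, e.g. `result.writeAsText("output")` followed by `env.execute()`; then the sink is only registered, and execute() is what actually runs the job.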