Errors and solutions encountered while running the example from "Learning Spark" (Spark Fast Big Data Analysis)

1. Description

Chapter 2 of the book contains a standalone mini example. After building it, you run:

${SPARK_HOME}/bin/spark-submit --class com.oreilly.learningsparkexamples.mini.java.WordCount ./target/learning-spark-mini-example-0.0.1.jar ./README.md ./wordcouts

If the Spark version you use differs from the one used in the book, various problems show up. The book uses 1.2.0, while I use the latest 2.3.0.

 

2. Problems and solutions

1. After the first build, running the job produces an error similar to the following:

ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.AbstractMethodError: com.oreilly.learningsparkexamples.mini.java.WordCount$1.call(Ljava/lang/Object;)Ljava/util/Iterator;
....

The first thing to fix is the dependency versions:

(1) Determine the Scala version (x.xx, the suffix of the spark-core artifact) and the Spark version (y.y.y) from the name of the jar at the following path:

${SPARK_HOME}/jars/spark-core_x.xx-y.y.y.jar
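For example, listing that directory on my installation gives something like the following (Spark 2.3.0 is built against Scala 2.11 by default, so the exact file name may differ on your machine):

$ ls ${SPARK_HOME}/jars/ | grep spark-core
spark-core_2.11-2.3.0.jar

so here x.xx is 2.11 and y.y.y is 2.3.0.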

(2) Modify pom.xml in the mini-complete-example directory, replacing the original version numbers with the ones you just found:

<dependency> <!-- Spark dependency -->
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_x.xx</artifactId>
    <version>y.y.y</version>
    <scope>provided</scope>
</dependency>
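Assuming the jar found above is spark-core_2.11-2.3.0.jar, the dependency would become:

<dependency> <!-- Spark dependency -->
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.3.0</version>
    <scope>provided</scope>
</dependency>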

Recompile.

 

2. The second build will most likely hit the following error:

Java FlatMapFunction in Spark: error: is not abstract and does not override abstract method call(String) in FlatMapFunction
......

Locate the offending code:

JavaRDD<String> words = input.flatMap(
      new FlatMapFunction<String, String>() {
        public Iterable<String> call(String x) {
          return Arrays.asList(x.split(" "));
        }});

I re-checked the FlatMapFunction<T, R> interface contract described in the book and found no mistake there. It then occurred to me that the version difference might be the cause. Checking the latest API documentation shows that the return type of the method to be implemented has changed:

java.util.Iterator<R>    call(T t)

that is, call now returns an Iterator<R> instead of an Iterable<R>. So the targeted fix is:

(1) Import the Iterator package:

import java.util.Iterator;

(2) Modify the wrong sentence into:

JavaRDD<String> words = input.flatMap(
      new FlatMapFunction<String, String>() {
        @Override public Iterator<String> call(String x) {
          return Arrays.asList(x.split(" ")).iterator();
        }});
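Since Spark 2.x requires Java 8, the same fix can also be written with a lambda; this is just an equivalent, more compact alternative:

JavaRDD<String> words = input.flatMap(
      x -> Arrays.asList(x.split(" ")).iterator());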

Recompile and package:

mvn compile && mvn package

Then run it again; the problem is solved.
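For reference, here is a minimal sketch of what the updated WordCount.java can look like against Spark 2.x. The package and class names follow the book's mini example; treat this as a sketch rather than the book's exact code:

package com.oreilly.learningsparkexamples.mini.java;

import java.util.Arrays;
import java.util.Iterator;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.api.java.function.Function2;
import org.apache.spark.api.java.function.PairFunction;

import scala.Tuple2;

public class WordCount {
  public static void main(String[] args) {
    String inputFile = args[0];
    String outputFile = args[1];

    // Create a Java Spark context.
    SparkConf conf = new SparkConf().setAppName("wordCount");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // Load the input data.
    JavaRDD<String> input = sc.textFile(inputFile);

    // Split each line into words; Spark 2.x expects an Iterator here.
    JavaRDD<String> words = input.flatMap(
        new FlatMapFunction<String, String>() {
          @Override public Iterator<String> call(String x) {
            return Arrays.asList(x.split(" ")).iterator();
          }});

    // Turn each word into a (word, 1) pair and sum the counts per word.
    JavaPairRDD<String, Integer> counts = words.mapToPair(
        new PairFunction<String, String, Integer>() {
          @Override public Tuple2<String, Integer> call(String x) {
            return new Tuple2<>(x, 1);
          }}).reduceByKey(
        new Function2<Integer, Integer, Integer>() {
          @Override public Integer call(Integer a, Integer b) {
            return a + b;
          }});

    // Save the word counts to the output directory and shut down.
    counts.saveAsTextFile(outputFile);
    sc.stop();
  }
}

It is submitted with the same spark-submit command shown at the top.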

 

3. Reference

1. Apache Spark: ERROR Executor -> Iterator

2. Java FlatMapFunction in Spark: error: is not abstract and does not override abstract method call(String) in FlatMapFunction

3. Spark API

(End)
