Spark execution example: packaging a JAR with Eclipse and Maven

First, create a new Maven project in Eclipse Java EE.

Click Finish to create the project, then change the default JDK 1.5 to JDK 1.8 (Spark 2.2 requires Java 8 or later).

Then edit pom.xml to add the spark-core dependency:

<!-- https://mvnrepository.com/artifact/org.apache.spark/spark-core -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.2.1</version>
</dependency>

Next, copy the sample program from the book's source code. The book targets Spark 1.2 and my environment runs Spark 2.2.1, so the code has to be adapted to the newer Spark API. In Spark 2.x, FlatMapFunction.call returns an Iterator instead of an Iterable, so the flatMap call becomes:

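// Needs java.util.Arrays and java.util.Iterator imported
JavaRDD<String> words = input.flatMap(
    new FlatMapFunction<String, String>() {
        @Override
        public Iterator<String> call(String x) {
            // Split each line on spaces; Spark 2.x expects an Iterator here
            return Arrays.asList(x.split(" ")).iterator();
        }
    });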
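To make the Iterator requirement concrete without needing a Spark installation, here is a minimal plain-Java sketch. The FlatMapFn interface and the flatMap helper are hypothetical stand-ins that I wrote to mirror the shape of Spark 2.x's FlatMapFunction; they are not Spark classes. Since the project now compiles with JDK 1.8, the sketch also shows that the anonymous class can be written as a lambda.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class FlatMapContractSketch {
    // Hypothetical stand-in for Spark 2.x's FlatMapFunction: call() returns an Iterator
    interface FlatMapFn<T, R> {
        Iterator<R> call(T t);
    }

    // Applies fn to each input element and concatenates the results,
    // roughly what flatMap does within a single partition
    static <T, R> List<R> flatMap(List<T> input, FlatMapFn<T, R> fn) {
        List<R> out = new ArrayList<>();
        for (T item : input) {
            Iterator<R> it = fn.call(item);
            while (it.hasNext()) {
                out.add(it.next());
            }
        }
        return out;
    }

    // Same splitting logic as the adapted sample, written as a Java 8 lambda
    static List<String> splitWords(List<String> lines) {
        return flatMap(lines, x -> Arrays.asList(x.split(" ")).iterator());
    }

    public static void main(String[] args) {
        // prints [hello, spark, hello, maven]
        System.out.println(splitWords(Arrays.asList("hello spark", "hello maven")));
    }
}
```

With the Spark 1.x signature, call() would return the list itself (an Iterable); in 2.x the only change needed in the sample is the trailing .iterator().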

Then run Maven install. In the project's target directory (E:\developtools\eclipse-jee-neon-3-win32\workspace\learning-spark-mini-example\target), find learning-spark-mini-example-0.0.1-SNAPSHOT.jar and upload it to the Linux machine running Spark 2.2.1.

Then run the following command on Linux:

[root@hserver1 ~]# spark-submit \
> --class com.oreilly.learningsparkexamples.mini.java.WordCount \
> learning-spark-mini-example-0.0.1-SNAPSHOT.jar \
> /opt/spark-2.2.1-bin-hadoop2.7/README.md wordcounts
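To inspect the result afterwards, a small plain-Java reader like the sketch below may help. It rests on several assumptions that are not confirmed above: that the second argument (wordcounts) is an output directory on the local filesystem, that the job writes it with saveAsTextFile, and that each line therefore looks like (word,count). The class name WordCountOutputReader and its methods are my own illustration, not part of the book's code.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.AbstractMap;
import java.util.Map;
import java.util.TreeMap;

class WordCountOutputReader {
    // Parses one "(word,count)" line; returns null if the line has another shape.
    // The last comma is used as the separator because a word may itself contain commas.
    static Map.Entry<String, Integer> parseLine(String line) {
        int comma = line.lastIndexOf(',');
        if (!line.startsWith("(") || !line.endsWith(")") || comma < 1) {
            return null;
        }
        String word = line.substring(1, comma);
        try {
            int count = Integer.parseInt(line.substring(comma + 1, line.length() - 1));
            return new AbstractMap.SimpleEntry<>(word, count);
        } catch (NumberFormatException e) {
            return null; // count part is not a number
        }
    }

    // Reads every part-* file in outputDir (assumed layout) and merges the counts
    static Map<String, Integer> read(Path outputDir) {
        Map<String, Integer> counts = new TreeMap<>();
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(outputDir, "part-*")) {
            for (Path part : parts) {
                for (String line : Files.readAllLines(part, StandardCharsets.UTF_8)) {
                    Map.Entry<String, Integer> entry = parseLine(line);
                    if (entry != null) {
                        counts.merge(entry.getKey(), entry.getValue(), Integer::sum);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Defaults to the "wordcounts" directory used in the command above
        Path dir = Paths.get(args.length > 0 ? args[0] : "wordcounts");
        read(dir).forEach((word, count) -> System.out.println(word + "\t" + count));
    }
}
```

If the cluster is configured to write to HDFS instead, the directory would first have to be copied to the local filesystem, or simply viewed with the usual HDFS tools.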

 

 
