1. Find the version you need on the official website; here I use maven-3.6.1
wget http://mirrors.tuna.tsinghua.edu.cn/apache/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz
2. Extract it into the /usr/local directory, then delete the downloaded archive
tar -zxvf apache-maven-3.6.1-bin.tar.gz -C /usr/local
rm apache-maven-3.6.1-bin.tar.gz
3. Go to the /usr/local directory and rename the extracted directory to maven-3.6.1
cd /usr/local
mv apache-maven-3.6.1 maven-3.6.1
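Steps 1 to 3 can also be wrapped in a couple of small shell functions, which makes it easier to switch versions later. This is only a sketch: the function names maven_url and install_maven_archive are my own (hypothetical), and the URL pattern is simply the mirror path used above with the version number substituted in, so it assumes other versions follow the same pattern on that mirror.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not part of Maven.

# Print the download URL for a given Maven 3.x version on the mirror used above.
# Assumption: other versions follow the same path pattern as 3.6.1.
maven_url() {
    version="$1"
    echo "http://mirrors.tuna.tsinghua.edu.cn/apache/maven/maven-3/${version}/binaries/apache-maven-${version}-bin.tar.gz"
}

# Extract an already-downloaded archive into a prefix, rename the extracted
# directory to maven-<version>, and delete the archive afterwards.
# Usage: install_maven_archive <archive> <prefix> <version>
install_maven_archive() {
    archive="$1"; prefix="$2"; version="$3"
    tar -zxf "$archive" -C "$prefix" || return 1
    mv "$prefix/apache-maven-$version" "$prefix/maven-$version" || return 1
    rm "$archive"
}
```

For the setup in this post that would be wget "$(maven_url 3.6.1)" followed by install_maven_archive apache-maven-3.6.1-bin.tar.gz /usr/local 3.6.1, run as a user who can write to /usr/local.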
4. Next, configure the environment variables for Maven
vim /etc/profile
export MAVEN_HOME=/usr/local/maven-3.6.1
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH
5. Reload the environment variables
source /etc/profile
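If you would rather not edit the profile by hand, the two export lines can be appended from a script. The sketch below is one way to do it without adding the lines twice when it is run again; add_maven_env is a hypothetical name of my own, and the check for an existing MAVEN_HOME line is a simple assumption about how the file is written.

```shell
#!/bin/sh
# Hypothetical helper: append the Maven variables to a profile file
# unless a MAVEN_HOME export is already present.
# Usage: add_maven_env <profile-file> <maven-home>
add_maven_env() {
    profile="$1"; maven_home="$2"
    if grep -q '^export MAVEN_HOME=' "$profile" 2>/dev/null; then
        return 0   # already configured; leave the file untouched
    fi
    {
        echo "export MAVEN_HOME=$maven_home"
        # Single quotes keep the variables unexpanded so the profile
        # resolves them each time it is sourced.
        echo 'export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH'
    } >> "$profile"
}
```

With the paths from this post the call would be add_maven_env /etc/profile /usr/local/maven-3.6.1, run as root, followed by source /etc/profile.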
6. Verify that Maven was installed successfully
mvn -version
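If mvn -version reports that the command cannot be found, the usual cause is that PATH was not updated in the current shell. A small check like the following shows where a command resolves on PATH; require_cmd is a hypothetical helper name, and the hint text assumes the /etc/profile setup from step 4.

```shell
#!/bin/sh
# Hypothetical helper: report where a command resolves on PATH,
# or fail with a hint pointing back at the profile configuration.
# Usage: require_cmd <name>
require_cmd() {
    if path=$(command -v "$1"); then
        echo "$1 found at $path"
    else
        echo "$1 not found on PATH; check MAVEN_HOME and PATH in /etc/profile" >&2
        return 1
    fi
}
```

Running require_cmd mvn should print a path under /usr/local/maven-3.6.1/bin if the steps above were followed.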
7. Writing a standalone Java application
1) Go to the user's home directory
cd ~
2) Create the sparkapp2 project directory, including the source directory that Maven expects
mkdir -p ./sparkapp2/src/main/java
3) Create the file SimpleApp.java in ./sparkapp2/src/main/java
vim ./sparkapp2/src/main/java/SimpleApp.java
It reads as follows:
/*** SimpleApp.java ***/
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
    public static void main(String[] args) {
        String logFile = "file:///usr/local/spark-2.4.3/README.md";
        JavaSparkContext sc = new JavaSparkContext("local", "Simple App",
            "file:///usr/local/spark-2.4.3/", new String[]{"target/simple-project-1.0.jar"});
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) {
                return s.contains("a");
            }
        }).count();

        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("b"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
    }
}
The program depends on the Spark Java API, so it needs to be compiled and packaged with Maven.
4) Create the file pom.xml in ./sparkapp2
If you do not know which version of a dependency to use, you can look it up on the Maven repository site:
https://mvnrepository.com/
It reads as follows:
<project>
    <groupId>edu.berkeley</groupId>
    <artifactId>simple-project</artifactId>
    <modelVersion>4.0.0</modelVersion>
    <name>Simple Project</name>
    <packaging>jar</packaging>
    <version>1.0</version>
    <repositories>
        <repository>
            <id>Akka repository</id>
            <url>http://repo.akka.io/releases</url>
        </repository>
    </repositories>
    <dependencies>
        <dependency> <!-- Spark dependency -->
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_2.11</artifactId>
            <version>2.4.3</version>
        </dependency>
    </dependencies>
</project>
5) Package the Java program with Maven. First check the project's file structure:
cd ~/sparkapp2
find .
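The output of find . should show pom.xml at the top of the project and SimpleApp.java under src/main/java, since that is where Maven looks for Java sources by default. The same check can be scripted; the sketch below uses a hypothetical function name, check_layout, and only looks for the two files created in this post.

```shell
#!/bin/sh
# Hypothetical helper: check that a project directory contains the two
# files this walkthrough creates, in the places Maven expects them.
# Usage: check_layout <project-dir>
check_layout() {
    dir="$1"; status=0
    for f in pom.xml src/main/java/SimpleApp.java; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $f" >&2
            status=1
        fi
    done
    return $status
}
```

Calling check_layout ~/sparkapp2 before packaging prints nothing when both files are in place and names whichever file is missing otherwise.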
6) Then package the whole application into a JAR. This is time consuming (ten to twenty minutes or more, which is longer than with sbt). A successful build ends with a BUILD SUCCESS message.
/usr/local/maven-3.6.1/bin/mvn package
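By default Maven names the packaged file after the artifactId and version in pom.xml and places it in the target directory, which is why the JAR here ends up as target/simple-project-1.0.jar. The sketch below derives that path from a pom.xml. It is a rough text-based approach and not a real XML parser: first_tag and expected_jar are hypothetical names, and it assumes each element sits on its own line and that the project's own artifactId and version appear before any dependency's.

```shell
#!/bin/sh
# Hypothetical helper: print the first value of an XML element in a file.
# Assumption: one element per line; this is a text match, not XML parsing.
# Usage: first_tag <element-name> <file>
first_tag() {
    sed -n "s:.*<$1>\([^<]*\)</$1>.*:\1:p" "$2" | head -n 1
}

# Print the JAR path Maven produces by default for jar packaging,
# i.e. target/<artifactId>-<version>.jar.
# Usage: expected_jar <pom-file>
expected_jar() {
    pom="$1"
    echo "target/$(first_tag artifactId "$pom")-$(first_tag version "$pom").jar"
}
```

Running expected_jar pom.xml from inside ~/sparkapp2 gives the relative path to pass to spark-submit in the next step.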
7) Run the program with spark-submit
Submit the resulting JAR to Spark with spark-submit:
/usr/local/spark-2.4.3/bin/spark-submit --class "SimpleApp" ~/sparkapp2/target/simple-project-1.0.jar 2>&1 | grep "Lines with a"
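To sanity-check the numbers that Spark prints, the same counts can be computed directly with grep, because the program only counts the lines of README.md that contain "a" and the lines that contain "b". This is a cross-check sketch and not part of the Spark application; count_lines_with and report are hypothetical names.

```shell
#!/bin/sh
# Hypothetical helper: count the lines of a file that contain a fixed,
# case-sensitive string, mirroring filter(s.contains(...)).count().
# Usage: count_lines_with <string> <file>
count_lines_with() {
    grep -c -F -- "$1" "$2"
}

# Print the counts in the same format as SimpleApp.
# Usage: report <file>
report() {
    echo "Lines with a: $(count_lines_with a "$1"), lines with b: $(count_lines_with b "$1")"
}
```

Running report /usr/local/spark-2.4.3/README.md should print the same line as the spark-submit command above.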