Maven installation (Linux)

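1. Find the version you need on the official website. This guide uses Maven 3.6.1. If the mirror below no longer hosts this release, older versions are kept at https://archive.apache.org/dist/maven/.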

wget http://mirrors.tuna.tsinghua.edu.cn/apache/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz

2. Extract it into the /usr/local directory, then delete the downloaded archive

tar -zxvf apache-maven-3.6.1-bin.tar.gz -C /usr/local
rm apache-maven-3.6.1-bin.tar.gz

3. Go to the /usr/local directory and rename the extracted directory to maven-3.6.1

cd /usr/local
mv apache-maven-3.6.1 maven-3.6.1

4. Configure the Maven environment variables

vim /etc/profile
Add the following lines to the end of the file (the PATH line assumes JAVA_HOME is already set):
export MAVEN_HOME=/usr/local/maven-3.6.1
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH

5. Reload the environment variables

source /etc/profile
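
Steps 4 and 5 can also be kept out of /etc/profile itself. The following is a minimal sketch, not the method used above: it writes the Maven settings to a separate file, sources it, and confirms that the Maven bin directory ended up on PATH. PROFILE_SNIPPET is a hypothetical temporary location chosen so the sketch runs without root; on many Linux distributions the equivalent system-wide file would live under /etc/profile.d/, which is an assumption about your system.

```shell
# Hypothetical scratch location; a system-wide setup would need root
# and a file such as /etc/profile.d/maven.sh instead.
PROFILE_SNIPPET="$(mktemp)"

# Write the Maven settings (the quoted 'EOF' keeps $MAVEN_HOME and $PATH unexpanded).
cat > "$PROFILE_SNIPPET" <<'EOF'
export MAVEN_HOME=/usr/local/maven-3.6.1
export PATH="$MAVEN_HOME/bin:$PATH"
EOF

# Load the settings into the current shell, as "source /etc/profile" does.
. "$PROFILE_SNIPPET"

# Confirm that the Maven bin directory is now part of PATH.
case ":$PATH:" in
  *":$MAVEN_HOME/bin:"*) echo "Maven bin directory is on PATH" ;;
  *) echo "Maven bin directory is missing from PATH" ;;
esac
```

Keeping the settings in their own file makes them easy to remove or update when the Maven version changes.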

6. Verify that Maven was installed successfully

mvn -version
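
If mvn -version fails, the usual cause is that mvn or java cannot be found on PATH. The sketch below reports where each tool resolves and what the two environment variables from step 4 contain. report_tool is a hypothetical helper written for this illustration; it is not part of Maven or the JDK.

```shell
# Hypothetical helper: print where a command resolves, or say that it is missing.
report_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH"
  fi
}

report_tool java
report_tool mvn

# Show the variables set in /etc/profile, or a marker when they are unset.
echo "JAVA_HOME=${JAVA_HOME:-<not set>}"
echo "MAVEN_HOME=${MAVEN_HOME:-<not set>}"
```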

7. Write a standalone Java application

1) Go to the user's home directory

cd ~

2) Create the sparkapp2 project directory, including the source directory that Maven expects

mkdir -p ./sparkapp2/src/main/java

3) Create the file SimpleApp.java in ./sparkapp2/src/main/java

vim ./sparkapp2/src/main/java/SimpleApp.java
Its contents are as follows:
/*** SimpleApp.java ***/
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
    public static void main(String[] args) {
        // Text file to analyze; here the README that ships with Spark
        String logFile = "file:///usr/local/spark-2.4.3/README.md";
        // Arguments: master, application name, Spark home, jars to ship
        JavaSparkContext sc = new JavaSparkContext("local", "Simple App",
            "file:///usr/local/spark-2.4.3/", new String[]{"target/simple-project-1.0.jar"});
        JavaRDD<String> logData = sc.textFile(logFile).cache();

        // Count the lines that contain the letter "a"
        long numAs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) {
                return s.contains("a");
            }
        }).count();

        // Count the lines that contain the letter "b"
        long numBs = logData.filter(new Function<String, Boolean>() {
            public Boolean call(String s) { return s.contains("b"); }
        }).count();

        System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
    }
}
The program depends on the Spark Java API, so it has to be compiled and packaged with Maven.

4) Create the file pom.xml in ./sparkapp2

If you do not know which version of a dependency to use, you can look it up on the MVN Repository site: https://mvnrepository.com/
The file's contents are as follows:
<project>
  <groupId>edu.berkeley</groupId>
  <artifactId>simple-project</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>Simple Project</name>
  <packaging>jar</packaging>
  <version>1.0</version>
  <repositories>
    <repository>
      <id>Akka repository</id>
      <url>http://repo.akka.io/releases</url>
    </repository>
  </repositories>
  <dependencies>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.4.3</version>
    </dependency>
  </dependencies>
</project>

5) Check the project's file structure before packaging it with Maven

cd ~/sparkapp2
find .
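
For comparison, this sketch rebuilds the same layout with empty placeholder files in a scratch directory and lists it. SCRATCH and LAYOUT are names introduced only for this illustration. The output is sorted so that it is stable, and the order printed by a plain find . may differ.

```shell
# Throwaway directory, so the real ~/sparkapp2 is not touched.
SCRATCH="$(mktemp -d)"

# Recreate the layout with empty placeholder files.
mkdir -p "$SCRATCH/sparkapp2/src/main/java"
touch "$SCRATCH/sparkapp2/pom.xml" "$SCRATCH/sparkapp2/src/main/java/SimpleApp.java"

# List everything under the project root in a fixed byte-wise order.
LAYOUT="$(cd "$SCRATCH/sparkapp2" && find . | LC_ALL=C sort)"
echo "$LAYOUT"
```

The listing should contain exactly these entries:

```shell
# .
# ./pom.xml
# ./src
# ./src/main
# ./src/main/java
# ./src/main/java/SimpleApp.java
```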

6) Package the whole application into a jar. This is slow: it can take ten to twenty minutes, longer than the same build with sbt. On success, Maven prints BUILD SUCCESS and writes the jar to target/simple-project-1.0.jar.

/usr/local/maven-3.6.1/bin/mvn package

7) Run the program with spark-submit

Submit the generated jar to Spark with spark-submit:
/usr/local/spark-2.4.3/bin/spark-submit --class "SimpleApp" ~/sparkapp2/target/simple-project-1.0.jar 2>&1 | grep "Lines with a"
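
The 2>&1 | grep part of the command filters the program's result line out of Spark's log output, which by default is typically written to stderr. The sketch below shows the mechanism with a stand-in command. fake_job is a hypothetical function, not spark-submit, and N and M are placeholders for the real counts.

```shell
# Hypothetical stand-in for spark-submit: log-like lines go to stderr,
# the single result line goes to stdout.
fake_job() {
  echo "log line written to stderr" >&2
  echo "Lines with a: N, lines with b: M"
  echo "another log line written to stderr" >&2
}

# 2>&1 merges stderr into stdout, so grep receives every line
# and keeps only the one that matches.
RESULT="$(fake_job 2>&1 | grep "Lines with a")"
echo "$RESULT"
```

Without 2>&1, the log lines would bypass grep and still appear on the terminal.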

Origin www.cnblogs.com/xiaolan-Lin/p/11353872.html