Minimalist Spark 3.3.0 installation

The official Spark website provides precompiled packages that bundle Hadoop and Scala, which greatly simplifies installation.

Pitfall to avoid: as far as I can tell, the Hadoop bundled with Spark is not a complete Hadoop distribution. It only includes the Hadoop libraries that Spark itself relies on, such as the file access components for HDFS. If you also need the full Hadoop functionality, you have to install Hadoop and Spark separately, and this tutorial is not for you.

Below I install Spark on a brand-new Linux virtual machine:

Virtual machine software: VMware® Workstation 16 Pro

System: ubuntu-22.04.1-desktop-amd64

Install Java

Note that the Java version must be one that your Spark version supports. Here I use Java 17.

Official website: Overview - Spark 3.3.0 Documentation

Spark runs on Java 8/11/17, Scala 2.12/2.13, Python 3.7+ and R 3.5+. Java 8 prior to version 8u201 support is deprecated as of Spark 3.2.0. For the Scala API, Spark 3.3.0 uses Scala 2.12. You will need to use a compatible Scala version (2.12.x).

Be sure to set JAVA_HOME as an environment variable. I won't go into detail on how to install Java; here is a tutorial I found online: Installing Java in a Linux environment - Conan. Doyle - Blog Garden
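Before moving on, it can help to confirm that the Java you installed is one of the versions listed above. The following is a minimal sketch, not part of the original tutorial: the function names java_major and spark_supports_java are made up for illustration, and the supported list (8/11/17) is taken from the Spark 3.3.0 overview quoted above.

```shell
#!/bin/sh
# Print the major Java version for a version string such as "17.0.4" or "1.8.0_345".
# (java_major is a hypothetical helper name, not a Spark or JDK command.)
java_major() {
  v="$1"
  case "$v" in
    1.*) v="${v#1.}" ;;   # legacy scheme: 1.8.0_345 -> 8.0_345
  esac
  echo "${v%%[._-]*}"     # drop everything from the first ".", "_" or "-"
}

# Succeed if the major version is one that Spark 3.3.0 lists as supported.
spark_supports_java() {
  case "$1" in
    8|11|17) return 0 ;;
    *) return 1 ;;
  esac
}

# Look at JAVA_HOME first, then fall back to whatever java is on PATH.
if [ -n "${JAVA_HOME:-}" ] && [ -x "$JAVA_HOME/bin/java" ]; then
  java_bin="$JAVA_HOME/bin/java"
elif command -v java >/dev/null 2>&1; then
  java_bin="$(command -v java)"
  echo "JAVA_HOME is not set (or has no bin/java); using java from PATH"
else
  java_bin=""
  echo "No java found: install a JDK and set JAVA_HOME"
fi

if [ -n "$java_bin" ]; then
  # java -version prints to stderr, e.g.: openjdk version "17.0.4" 2022-07-19
  ver="$("$java_bin" -version 2>&1 | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
  major="$(java_major "$ver")"
  if spark_supports_java "$major"; then
    echo "Java $major ($ver) is supported by Spark 3.3.0"
  else
    echo "Java $major ($ver) is not in the supported list (8/11/17)"
  fi
fi
```

If the script reports that JAVA_HOME is not set, add an export line for it to your shell profile, pointing at the directory where your JDK actually lives.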

Download Spark

Official website download: Downloads | Apache Spark

Make sure to select a package type that is pre-built with Hadoop and Scala (this tutorial uses the Hadoop 3 / Scala 2.13 package).
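If you would rather download from the command line than from the browser, something like the sketch below should work. Note the assumptions: the archive.apache.org URL layout is not given in this tutorial, so check it against the Downloads page, and spark_url and download_spark are hypothetical helper names. The package file name is the one used later in this tutorial.

```shell
#!/bin/sh
# Build the download URL for a Spark binary package.
# ASSUMPTION: the archive.apache.org layout below; verify it on the Downloads page.
spark_url() {
  version="$1"   # e.g. 3.3.0
  package="$2"   # e.g. spark-3.3.0-bin-hadoop3-scala2.13.tgz
  echo "https://archive.apache.org/dist/spark/spark-${version}/${package}"
}

# Download the package into a directory (created if missing).
# wget -c resumes a partial download, -P sets the target directory.
download_spark() {
  url="$(spark_url "$1" "$2")"
  mkdir -p "$3" && wget -c -P "$3" "$url"
}

# Print the URL only; nothing is downloaded until download_spark is called.
spark_url 3.3.0 spark-3.3.0-bin-hadoop3-scala2.13.tgz
# download_spark 3.3.0 spark-3.3.0-bin-hadoop3-scala2.13.tgz "$HOME/Downloads"
```

Uncomment the last line to actually fetch the file into ~/Downloads.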

Install

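Extract the archive into the target directory

sudo tar -xzvf [path to your downloaded file] -C [your Spark installation path]

Replace the paths in [] with your own. The installation directory must already exist, because tar -C does not create it. After the change, the command looks like this: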

sudo tar -xzvf ~/Downloads/spark-3.3.0-bin-hadoop3-scala2.13.tgz -C ~/Software/Spark
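Since the target directory has to exist before tar can extract into it, the step can be wrapped in a small helper that creates the directory first. This is only a sketch: extract_spark is a hypothetical function name, and it runs without sudo on the assumption that the installation path is inside your own home directory, as in the example above.

```shell
#!/bin/sh
# Extract a Spark .tgz into a target directory, creating the directory first.
# Prints the path of the extracted top-level folder.
# (extract_spark is a hypothetical helper, not something shipped with Spark.)
extract_spark() {
  archive="$1"
  dest="$2"
  [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
  mkdir -p "$dest" || return 1
  tar -xzf "$archive" -C "$dest" || return 1
  # The top-level folder is the first path component of the first archive entry.
  top="$(tar -tzf "$archive" | head -n 1 | cut -d/ -f1)"
  echo "$dest/$top"
}

# Paths from this tutorial; the call is skipped if the download is not there.
spark_tgz="$HOME/Downloads/spark-3.3.0-bin-hadoop3-scala2.13.tgz"
if [ -f "$spark_tgz" ]; then
  extract_spark "$spark_tgz" "$HOME/Software/Spark"
fi
```

The printed path is the folder to cd into for the verification step.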

Verify successful installation

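Go to the directory where Spark was extracted (the archive unpacks into a subfolder named after the package, e.g. spark-3.3.0-bin-hadoop3-scala2.13, inside your installation path)

cd [your Spark installation path]

Run the sample program that estimates pi. The argument 10 is the number of partitions (slices) the job is split into, not a number of decimal places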

./bin/run-example SparkPi 10

It prints a lot of log output, but as long as a result line such as "Pi is roughly 3.14..." appears, the installation is fine.

It's that simple.
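To pick the result out of the log noise, the output can be filtered for the line that SparkPi prints, which starts with "Pi is roughly". A minimal sketch, assuming you run it from the Spark directory; extract_pi is a hypothetical helper name:

```shell
#!/bin/sh
# Read Spark's output on stdin and print only the estimated value of pi.
# (extract_pi is a hypothetical helper, not part of Spark.)
extract_pi() {
  sed -n 's/^Pi is roughly \([0-9.]*\).*$/\1/p'
}

# Run the example only when the launcher script is present in the current directory.
if [ -x ./bin/run-example ]; then
  # Merge stderr into stdout so the filter sees the result line wherever it is printed.
  ./bin/run-example SparkPi 10 2>&1 | extract_pi
else
  echo "run this from the Spark directory (./bin/run-example not found)"
fi
```

If nothing is printed, run the example again without the filter and read the log for errors, for instance a Java version problem.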

Reference:

Official website documentation: Overview - Spark 3.3.0 Documentation


Origin blog.csdn.net/seriseri/article/details/127193023