Spark series: installation and configuration

Introduction

Required: JDK, Scala, and Spark (note that the Spark build must match your Hadoop version).

Optional: Hadoop

Installation

Windows

Other references

Building Spark in a Windows environment (CSDN blog)
Installing Spark on Windows 10, including Hadoop installation (小白白的博客, CSDN blog)

1. Install JDK

Omitted (a standard JDK installation).

2. Install Hadoop

See: Hadoop series: installation and setup (feiying0canglang's blog, CSDN)

3. Install Spark

Download link: Downloads | Apache Spark ("Pre-built" packages are already compiled; just download and use them directly.)

Download here: spark-3.0.2-bin-hadoop3.2.tgz

4. Install Scala

Scala is installed to run spark-shell.

Download link: https://www.scala-lang.org/download/all.html (the version used here: scala-2.12.13.msi)

The installer is a simple click-through wizard. After installation, open cmd and type scala; if the Scala REPL starts, the installation succeeded.
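
As an optional sanity check, you can also print the Scala version from inside the REPL (a minimal sketch; the exact version string depends on the installer you used):

scala> // prints the running Scala version, e.g. "version 2.12.13"
scala> println(util.Properties.versionString)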

Note the versions: Spark 2.x is pre-built with Scala 2.11, except version 2.4.2, which is pre-built with Scala 2.12. Spark 3.0+ is pre-built with Scala 2.12.

5. Run spark

Go to the bin directory under the extracted Spark folder and run: spark-shell
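
Once the scala> prompt appears, you can run a tiny job to confirm that Spark itself works. A minimal sketch (typed into spark-shell; sc is the SparkContext that spark-shell creates automatically):

scala> // distribute the numbers 1..100 across local executor threads and sum them
scala> val sum = sc.parallelize(1 to 100).reduce(_ + _)
sum: Int = 5050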

6. Test

Visit http://localhost:4040/ while spark-shell is still running (the web UI is served by the running Spark application).

Result: the Spark web UI (Jobs page) should be displayed.
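
If the page does not open, you can check which address the web UI is actually listening on from inside the still-running spark-shell. A small sketch (sc.uiWebUrl is available in Spark 2.0 and later; the port may be 4041 or higher if 4040 was already taken):

scala> // prints the URL of this application's web UI, if the UI is enabled
scala> sc.uiWebUrl.foreach(println)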

Docker

Usage

PySpark under Python

For Python, Spark provides PySpark, an interactive shell analogous to spark-shell for Scala; it can be used for simple debugging and testing of Spark. It is launched from the same bin directory (run pyspark instead of spark-shell).


Original article: blog.csdn.net/feiying0canglang/article/details/113964761