Spark installation
Reference installation tutorial for Spark and Scala: http://dblab.xmu.edu.cn/blog/1307-2/
Environment: Linux with Hadoop already installed
Official Spark download page: http://spark.apache.org/downloads.html
Download Spark as shown in the figure. Because Hadoop is already installed, select "Pre-built with user-provided Apache Hadoop" under "Choose a package type", then click the "spark-2.4.4-bin-without-hadoop.tgz" link next to "Download Spark" to download it.
Spark has four deployment modes: Local mode (single machine), Standalone mode (using the simple cluster manager that ships with Spark), YARN mode (using YARN as the cluster manager), and Mesos mode (using Mesos as the cluster manager).
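The deployment mode is selected through the --master option of spark-shell and spark-submit. Below is a minimal sketch of the value each mode expects. The helper name master_url, the default host localhost, and the ports are illustrative assumptions that do not come from the tutorial (7077 and 5050 are the usual defaults for the standalone master and for Mesos):

```shell
# Hypothetical helper: print the --master value for a deployment mode.
# $1 = mode name, $2 = optional master host (defaults to localhost).
master_url() {
  case "$1" in
    local)      echo "local[*]" ;;                       # all cores of this machine
    standalone) echo "spark://${2:-localhost}:7077" ;;   # Spark's own cluster manager
    yarn)       echo "yarn" ;;                           # cluster located via the Hadoop config
    mesos)      echo "mesos://${2:-localhost}:5050" ;;   # Mesos master
    *)          echo "unknown mode: $1" >&2; return 1 ;;
  esac
}
```

It could be used as bin/spark-shell --master "$(master_url local)" from the Spark installation directory.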
Unpack the downloaded archive and change the owner of the installation directory (adjust the archive path and version to match your download):
sudo tar -zxf ~/Downloads/spark-2.1.0-bin-without-hadoop.tgz -C /usr/local/
cd /usr/local
sudo mv ./spark-2.1.0-bin-without-hadoop/ ./spark
sudo chown -R hadoop:hadoop ./spark # hadoop here is your user name
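The archive name depends on the release you downloaded (the download step mentions 2.4.4, while the commands use 2.1.0). The following is only a sketch that derives every name from one variable. The variable names, the ~/Downloads location, and the assumption that a group with the same name as the user exists are all illustrative:

```shell
# Assumption: set this to the release you actually downloaded.
SPARK_VERSION="2.4.4"
SPARK_DIR="spark-${SPARK_VERSION}-bin-without-hadoop"
SPARK_TGZ="${HOME}/Downloads/${SPARK_DIR}.tgz"   # assumed download location
SPARK_USER="${USER:-hadoop}"                      # user who will own the installation

# Defined but not executed here; call install_spark to perform the steps.
install_spark() {
  sudo tar -zxf "$SPARK_TGZ" -C /usr/local/ &&
  sudo mv "/usr/local/${SPARK_DIR}" /usr/local/spark &&
  sudo chown -R "${SPARK_USER}:${SPARK_USER}" /usr/local/spark
}
```

Chaining the steps with && stops the sequence at the first failure, so a bad archive path does not lead to a chown on a directory that does not exist.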
After installation, you also need to modify Spark's configuration file spark-env.sh:
cd /usr/local/spark
cp ./conf/spark-env.sh.template ./conf/spark-env.sh
Edit spark-env.sh (vim ./conf/spark-env.sh) and add the following configuration on the first line:
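For a build without bundled Hadoop, this is typically the SPARK_DIST_CLASSPATH setting described in Spark's "Hadoop Free" build documentation, which tells Spark where to find the Hadoop jars. A minimal sketch, assuming Hadoop is installed under /usr/local/hadoop (that path is an assumption; adjust it to your installation):

```shell
# Assumption: Hadoop lives in /usr/local/hadoop.
# "hadoop classpath" prints the jar and config paths that Spark should load.
export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)
```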
In vim, press i to enter insert mode, press Esc to leave insert mode, then type :wq to save and exit.
Once the configuration is done, Spark can be used directly; unlike Hadoop, there is no startup command to run first.
Run one of the examples that ship with Spark to verify that the installation succeeded.
cd /usr/local/spark
bin/run-example SparkPi
Running it prints a lot of log output, which makes the result hard to find, so filter it with grep (the 2>&1 in the command redirects stderr to stdout so that everything goes through the pipe; otherwise the log messages, which are written to stderr, would still be printed to the screen):
bin/run-example SparkPi 2>&1 | grep "Pi is"
The filtered output shows an approximation of π.
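The effect of 2>&1 can be tried without Spark. In the sketch below, fake_job is a hypothetical stand-in for the example program, and both of its output lines are made up for illustration:

```shell
# Hypothetical stand-in for a Spark job: log line on stderr, result on stdout.
fake_job() {
  echo "INFO SparkContext: starting job" >&2
  echo "Pi is roughly 3.14"
}

# Without 2>&1 only stdout enters the pipe; the log line bypasses grep
# and still reaches the screen.
fake_job | grep "Pi is"

# With 2>&1 both streams enter the pipe, so grep can drop the log line.
fake_job 2>&1 | grep "Pi is"
```

The first pipeline prints both lines on a terminal, while the second prints only the result line.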
Running code in the Spark shell
Run bin/spark-shell to enter the spark-shell environment.
Enter an expression and it is evaluated immediately.
Use the command ":quit" or press Ctrl+D to exit.