First, the purpose of the experiment
(1) Master the method of installing a Linux virtual machine. Spark, Hadoop, and other big data software perform best on a Linux system, so in this tutorial all Spark-related operations are carried out on Linux; likewise, the next chapter installs and runs the Scala language on Linux. Since many readers use the Windows operating system, this experiment gives readers a method for building a Linux virtual machine on top of Windows so that the follow-up experiments in this tutorial can be completed. Of course, a virtual machine is only one way to install Linux; instead of a virtual machine, the reader may install Linux as a dual-boot system. This tutorial recommends the virtual machine approach. (2) Become familiar with the basics of using a Linux system. All experiments in this tutorial are conducted in a Linux environment, so the reader needs to know basic Linux usage in advance, especially some commonly used commands.
Second, the experimental process
Environment: CentOS 6.4, JDK 1.7.0, Spark 1.5.2
Spark 1.5.2 is installed following this blog post: https://www.cnblogs.com/Genesis2018/p/9079787.html
First, enter
wget http://archive.apache.org/dist/spark/spark-1.5.2/spark-1.5.2-bin-hadoop2.6.tgz
to download Spark 1.5.2.
After the download completes, decompress the downloaded file.
Enter
tar -zxvf spark-1.5.2-bin-hadoop2.6.tgz
Once the archive is extracted, enter the following command to move it to the /usr/local/ directory:
mv spark-1.5.2-bin-hadoop2.6 /usr/local/
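The `-zxvf` flags above mean: `z` = decompress gzip, `x` = extract, `v` = verbose, `f` = read from the named archive file. As a sanity check of how the extraction behaves, here is a miniature runnable sketch using a throwaway archive under /tmp (the paths are illustrative only and are not part of the Spark install):

```shell
# Toy demonstration of the same tar flags on a small archive; the real install
# extracts spark-1.5.2-bin-hadoop2.6.tgz instead.
mkdir -p /tmp/tar-demo/pkg /tmp/tar-demo/out
echo "hello" > /tmp/tar-demo/pkg/README
tar -czf /tmp/tar-demo/pkg.tgz -C /tmp/tar-demo pkg   # create a gzipped archive
tar -zxf /tmp/tar-demo/pkg.tgz -C /tmp/tar-demo/out   # extract it, as in the step above
cat /tmp/tar-demo/out/pkg/README
```

The extracted tree keeps the top-level directory name from the archive, which is why the Spark step above produces a spark-1.5.2-bin-hadoop2.6 directory.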
Then enter
gedit /etc/profile.d/spark.sh
Add the following lines to the opened file:
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export HDFS_CONF_DIR=$HADOOP_HOME/etc/hadoop
export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_HOME=/usr/local/spark-1.5.2-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
Save and exit
Enter
source /etc/profile.d/spark.sh
to make the changes take effect.
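A quick way to confirm the variables took effect in the current shell. This sketch re-exports the same values from the steps above so it can run standalone:

```shell
# Check that SPARK_HOME is set and that its bin directory is on PATH.
# The two exports mirror /etc/profile.d/spark.sh from the step above.
export SPARK_HOME=/usr/local/spark-1.5.2-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
echo "$SPARK_HOME"
case ":$PATH:" in
  *":$SPARK_HOME/bin:"*) echo "PATH OK" ;;
  *)                     echo "PATH missing Spark bin" ;;
esac
```

If `PATH OK` is not printed, re-check the contents of /etc/profile.d/spark.sh and source it again.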
Then enter
cp /usr/local/spark-1.5.2-bin-hadoop2.6/conf/spark-env.sh.template /usr/local/spark-1.5.2-bin-hadoop2.6/conf/spark-env.sh
gedit /usr/local/spark-1.5.2-bin-hadoop2.6/conf/spark-env.sh
Add the following to the opened file (set the IP address and JDK path according to your own environment and versions):
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.221.x86_64/jre
export SCALA_HOME=/usr/local/scala-2.10.6
export HADOOP_HOME=/usr/local/hadoop-2.7.2
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export SPARK_MASTER_HOST=192.168.57.128
export SPARK_LOCAL_IP=192.168.57.128
Then enter
cp /usr/local/spark-1.5.2-bin-hadoop2.6/conf/slaves.template /usr/local/spark-1.5.2-bin-hadoop2.6/conf/slaves
gedit /usr/local/spark-1.5.2-bin-hadoop2.6/conf/slaves
Change the localhost entry to the IP address of your virtual machine:
192.168.57.128
Save and exit.
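The same edit can be made non-interactively with `sed` instead of gedit. Here is a runnable sketch on a temporary copy (the real file is /usr/local/spark-1.5.2-bin-hadoop2.6/conf/slaves, and 192.168.57.128 is the example VM address used throughout this post):

```shell
# Simulate the slaves file edit on a temp copy so the sketch runs anywhere.
echo "localhost" > /tmp/slaves-demo
# Replace the default localhost entry with the VM's IP, as done above in gedit.
sed -i 's/^localhost$/192.168.57.128/' /tmp/slaves-demo
cat /tmp/slaves-demo
```

`sed -i` (in-place edit) is available in the GNU sed shipped with CentOS.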
Verify the Spark installation:
sbin/start-master.sh
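After `start-master.sh`, the master writes its connection URL into a log file under the Spark logs directory; workers connect to that `spark://host:port` URL. A grep sketch against a sample log line (the line below is illustrative of the log format, not captured from a real run):

```shell
# Extract the master URL from a (sample) log line; in a real install such a
# line appears in a file like $SPARK_HOME/logs/spark-*-Master-*.out.
LOG_LINE="INFO Master: Starting Spark master at spark://192.168.57.128:7077"
echo "$LOG_LINE" | grep -o 'spark://[^ ]*'
```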
From outside the server, open the corresponding address in a browser:
http://192.168.57.128:8080/
The page shows that the master started normally.
Spark installation is complete.