A Detailed Spark Installation Walkthrough (Including Scala)

Recommended Spark installation reference: http://blog.csdn.net/weixin_36394852/article/details/76030317

I. Downloading, installing, and configuring Scala

    1. Download

        cd /usr/scala    // create this directory first if it does not exist (mkdir -p /usr/scala)

        wget https://downloads.lightbend.com/scala/2.11.7/scala-2.11.7.tgz    // download directly on the node

        Alternatively, download https://downloads.lightbend.com/scala/2.11.7/scala-2.11.7.tgz on your local
        machine and copy it into the /usr/scala directory.

    2. Install

        cd /usr/scala

        tar -zxf scala-2.11.7.tgz    // extract the archive

        vi /etc/profile    // edit the environment configuration

        Append the following lines at the end of the file:

        export SCALA_HOME=/usr/scala/scala-2.11.7

        export PATH=$PATH:$SCALA_HOME/bin

        source /etc/profile    // make the changes take effect

    3. Verify

        scala -version
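
        If the installation succeeded, this prints the installed version, roughly like the line below
        (the exact copyright text varies by release):

        Scala code runner version 2.11.7 -- Copyright 2002-2013, LAMP/EPFL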

    4. Sync to the worker nodes

        rsync -av /usr/scala/scala-2.11.7 Slave1:/usr/scala/    // copy the directory with rsync

        rsync -av /usr/scala/scala-2.11.7 Slave2:/usr/scala/

        On each worker node, configure the environment (vi /etc/profile), apply it, and verify, as sketched below.
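
        A minimal sketch of what to run on each worker, assuming the same /usr/scala path and that the
        scala-2.11.7 directory has already been synced over:

        vi /etc/profile    // append the same two lines as on the master:
        export SCALA_HOME=/usr/scala/scala-2.11.7
        export PATH=$PATH:$SCALA_HOME/bin

        source /etc/profile    // apply the changes
        scala -version         // verify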

II. Downloading, installing, and configuring Spark

    1. Download

        Download a Spark 2.2.x release built for Hadoop 2.7.x from the official website and place it in
        /usr/local (this walkthrough uses spark-2.2.0-bin-hadoop2.7.tgz).
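
        If you prefer to download directly on the node, the same release is also available from the Apache
        archive; a sketch (any official mirror works equally well):

        cd /usr/local
        wget https://archive.apache.org/dist/spark/spark-2.2.0/spark-2.2.0-bin-hadoop2.7.tgz    // Hadoop 2.7 build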

    2. Extract and rename

        cd /usr/local

        tar -zxf spark-2.2.0-bin-hadoop2.7.tgz    // extract the archive

        mv spark-2.2.0-bin-hadoop2.7 spark    // rename the directory to spark for convenience

    3. Configure environment variables

        vi ~/.bashrc

        Append the following lines at the end of the file:

        export SPARK_HOME=/usr/local/spark

        export PATH=$PATH:$SPARK_HOME/bin:$SPARK_HOME/sbin

        source ~/.bashrc    // make the changes take effect immediately
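
        An optional sanity check that the variables are visible in the current shell:

        echo $SPARK_HOME      // should print /usr/local/spark
        which spark-submit    // should resolve to /usr/local/spark/bin/spark-submit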

    4. Configure Spark

        (1) Edit the slaves file

            Rename slaves.template to slaves:

            cd /usr/local/spark/conf    // directory containing the configuration files

            mv slaves.template slaves    // rename it so Spark picks it up

            Then, in slaves, replace localhost with the worker hostnames Slave1 and Slave2, one per line,
            as shown below.
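
            A sketch of the resulting conf/slaves file (assuming the hostnames Slave1 and Slave2 resolve
            from the master, e.g. via /etc/hosts):

            # /usr/local/spark/conf/slaves
            Slave1
            Slave2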

        (2) Edit the spark-env.sh file

            Rename spark-env.sh.template to spark-env.sh:

            mv spark-env.sh.template spark-env.sh    // rename

            Then append the following lines at the end of the file:

            export SPARK_DIST_CLASSPATH=$(/usr/local/hadoop/bin/hadoop classpath)

            export HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop

            export SPARK_MASTER_IP=192.168.0.253

            Note: 192.168.0.253 is the IP address of the Spark master node. (Spark 2.x prefers the newer
            SPARK_MASTER_HOST name, but SPARK_MASTER_IP is still honored.)

        (3) Copy to the worker nodes

            Use rsync to copy the master's spark directory to the workers (a quick check follows below):

            cd /usr/local

            rsync -av spark Slave1:/usr/local

            rsync -av spark Slave2:/usr/local

        Note: if you hit a permission denied error, grant read/write permissions on the spark directory
        with chmod 777 /usr/local/spark.
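
        A quick way to confirm the copy landed on the workers (this assumes passwordless ssh between the
        nodes, which the rsync step above already requires):

        ssh Slave1 ls /usr/local/spark/sbin/start-slave.sh
        ssh Slave2 ls /usr/local/spark/sbin/start-slave.sh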

    5. Start the Spark cluster

        Start the Hadoop cluster first:

        cd /usr/local/hadoop/sbin

        ./start-all.sh    // use ./ so Hadoop's script runs, not Spark's start-all.sh, which is also on the PATH

        Then start the Spark cluster:

        cd /usr/local/spark/sbin

        ./start-master.sh    // start the master

        ./start-slaves.sh    // start a worker on every host listed in conf/slaves
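
        Alternatively, Spark ships a combined script in the same sbin directory that starts the master and
        all workers in one step:

        ./start-all.sh    // Spark's start-all.sh runs start-master.sh followed by start-slaves.sh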

    6. Verify the installation

        Run jps on the master node and on each worker node and check the Java processes; the expected
        daemons are sketched below.
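
        Roughly what jps should report (process IDs omitted), assuming the usual layout with the NameNode
        and ResourceManager on the master and the DataNodes and NodeManagers on the workers; the exact set
        depends on your Hadoop configuration:

        // on the master node
        Master              // Spark standalone master
        NameNode
        SecondaryNameNode
        ResourceManager
        Jps

        // on each worker node
        Worker              // Spark standalone worker
        DataNode
        NodeManager
        Jps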

        Then, on the master node, open a browser and visit http://master:8080; the Spark master web UI
        should come up and list the registered workers.

        Running spark-shell on the master node should start the Scala REPL and report that a Spark context
        is available as sc, with no errors.
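
        A minimal smoke test against the standalone cluster (the master URL assumes the SPARK_MASTER_IP set
        earlier and the default port 7077):

        spark-shell --master spark://192.168.0.253:7077    // attach the shell to the cluster
        // at the scala> prompt, run a tiny job to confirm the executors respond:
        // scala> sc.parallelize(1 to 1000).count()        // should return res0: Long = 1000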

    7. Stop the Spark cluster

        Stop the Spark master:

        cd /usr/local/spark/sbin

        ./stop-master.sh

        Stop the Spark workers:

        ./stop-slaves.sh

        Stop the Hadoop cluster:

        cd /usr/local/hadoop/sbin

        ./stop-all.sh    // again, ./ ensures Hadoop's script runs rather than Spark's stop-all.sh


Reposted from blog.csdn.net/Fortuna_i/article/details/82751818