Spark 2.1.0 Installation Steps

0. Machine Allocation

IP              Host     Role
172.29.41.153   master   Spark master
172.29.41.154   slave1   Spark slave
172.29.41.155   slave2   Spark slave

1. Install Scala

(Scala 2.10.6 supports Java 6/7; Scala 2.12.x requires Java 8 or later.)
sudo tar -zxvf scala-2.10.6.tgz -C /usr/local
cd /usr/local
sudo mv scala-2.10.6 scala
sudo chown -R hadoop:hadoop scala
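
Given the Java compatibility note above, it may help to confirm that the installed JDK matches the chosen Scala release (a quick check, assuming a JDK is already on the PATH):
java -version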

2. Configure the Scala environment variables and verify the installation

sudo vi ~/.bashrc
export SCALA_HOME=/usr/local/scala
export PATH=$PATH:$SCALA_HOME/bin
source ~/.bashrc

scala -version
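
If the PATH was updated correctly, this reports the installed release; the output looks roughly like the following (the trailing copyright line varies by release):
# Scala code runner version 2.10.6 ...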

3. Extract Spark 2.1.0

sudo tar -zxvf spark-2.1.0-bin-hadoop2.7.tgz -C /usr/local
cd /usr/local
sudo mv spark-2.1.0-bin-hadoop2.7 spark
sudo chown -R hadoop:hadoop spark

4. Configure Spark

  • Enter the spark directory and copy the Spark environment template to create the environment file: cp conf/spark-env.sh.template conf/spark-env.sh, then add the following:
export SCALA_HOME=/usr/local/scala
export SPARK_WORKER_MEMORY=1g
export SPARK_MASTER_IP=your_server_ip      # for this cluster: 172.29.41.153 (master)
export MASTER=spark://your_server_ip:7077
# If the SSH port is not the default 22, also add the following line:
export SPARK_SSH_OPTS="-p 22000"

  • Create the slaves file: cp conf/slaves.template conf/slaves. Add the worker node hostnames to this file (see the example below).
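
For the machine allocation in section 0, conf/slaves simply lists the two worker hostnames, one per line:
slave1
slave2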

5. Configure Spark environment variables

sudo vi ~/.bashrc
export SPARK_HOME=/usr/local/spark
export PATH=$PATH:$SPARK_HOME/bin
source ~/.bashrc
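
After re-sourcing ~/.bashrc, a quick way to confirm that the Spark binaries are on the PATH (assuming the extraction in step 3 succeeded):
spark-submit --version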

6. Distribute Spark to slave1 and slave2

scp -r /usr/local/spark slave1:~
scp -r /usr/local/spark slave2:~
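
Note that the scp commands above land the directory in the hadoop user's home on each slave, while SPARK_HOME expects /usr/local/spark, so it still needs to be moved into place there (a sketch, run on slave1 and slave2, assuming the same hadoop user exists and that Scala and the ~/.bashrc entries have been set up on the slaves as on master):
sudo mv ~/spark /usr/local/spark
sudo chown -R hadoop:hadoop /usr/local/spark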

7. Start Spark

  • Enter the $SPARK_HOME directory and start Spark: ./sbin/start-all.sh (a quick verification sketch follows this list)

  • From the $SPARK_HOME directory, run the Pi estimation example: ./bin/run-example org.apache.spark.examples.SparkPi


  • Run spark-shell: ./bin/spark-shell
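
After start-all.sh, the daemons can be checked quickly (a sketch; jps ships with the JDK, and 8080 is the default port of the Spark master web UI):
# on master:
jps    # the list should include a Master process
# on slave1 and slave2:
jps    # the list should include a Worker process
# or check the master web UI at http://172.29.41.153:8080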

8. WordCount in spark-shell

  • Create a text file /sparkTest/aaa in HDFS (any whitespace-separated text will do; see the sketch after this list).

  • After entering spark-shell, run the WordCount:
    scala> val file=sc.textFile("hdfs://master:9000/sparkTest/aaa")
    scala> val count=file.flatMap(line => line.split(" ")).map(word => (word,1)).reduceByKey(_+_)
    scala> count.collect()
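
A minimal way to prepare the test file, assuming HDFS is reachable at master:9000 (as used in the textFile path above) and a local file named words.txt (hypothetical name) contains a few lines of text:
hdfs dfs -mkdir -p /sparkTest
hdfs dfs -put words.txt /sparkTest/aaa
hdfs dfs -cat /sparkTest/aaa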

At this point, the Spark environment has been successfully installed on the cluster.

Reposted from blog.csdn.net/phn_csdn/article/details/78001844