First, install Scala.
On the official site, open the Download page, scroll down, and download the first package listed (scala-2.12.2 is used below).
Configure the environment variables (append to /etc/profile):
#set scala env
export SCALA_HOME=/export/servers/scala-2.12.2
export PATH=$PATH:$SCALA_HOME/bin
Once the variables are configured, apply them with: source /etc/profile
Verify Scala:
scala -version
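If the install succeeded, this prints the version banner, along the lines of the following (the exact copyright wording may differ by build):
Scala code runner version 2.12.2 -- Copyright 2002-2017, LAMP/EPFL and Lightbend, Inc.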
Copy Scala to the other nodes and configure their environment variables.
(Open a terminal in the directory that contains scala-2.12.2.)
scp -r scala-2.12.2 hadoop02:/export/servers/
scp -r scala-2.12.2 hadoop03:/export/servers/
(Open a terminal in /etc, where profile lives.)
scp profile hadoop02:/etc/
scp profile hadoop03:/etc/
Then run source /etc/profile on each node so the new variables take effect there.
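A quick remote check, assuming passwordless ssh between the nodes (the scp steps above already rely on it):
ssh hadoop02 'source /etc/profile && scala -version'
ssh hadoop03 'source /etc/profile && scala -version'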
Install Spark
1. Download from the official site (the "prebuilt" packages are precompiled; download one of those).
2. Extract it into the /export/servers directory (see the command below).
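A minimal extraction sketch, assuming the downloaded archive is spark-2.3.2-bin-hadoop2.6.tgz in the current directory:
tar -zxvf spark-2.3.2-bin-hadoop2.6.tgz -C /export/servers/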
3. Configure the environment variables (append to /etc/profile again):
#set spark env
export SPARK_HOME=/export/servers/spark-2.3.2-bin-hadoop2.6
export PATH=$PATH:$SPARK_HOME/bin
4. Edit the configuration files under SPARK_HOME/conf:
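The prebuilt package ships both files as templates only, so create editable copies first (assuming the default conf layout):
cd /export/servers/spark-2.3.2-bin-hadoop2.6/conf
cp spark-env.sh.template spark-env.sh
cp slaves.template slaves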
spark-env.sh
export JAVA_HOME=/export/servers/jdk
export HADOOP_HOME=/export/servers/hadoop
export SCALA_HOME=/export/servers/scala-2.12.2
export SPARK_HOME=/export/servers/spark-2.3.2-bin-hadoop2.6
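# note: Spark 2.x prefers SPARK_MASTER_HOST; SPARK_MASTER_IP is deprecated but still honored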
export SPARK_MASTER_IP=hadoop01
export SPARK_WORKER_MEMORY=1g
export HADOOP_CONF_DIR=/export/servers/hadoop/etc/hadoop
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib:$HADOOP_HOME/lib/native"
slaves (one worker hostname per line; the master hadoop01 also runs a worker here):
hadoop01
hadoop02
hadoop03
5. Copy Spark to the other nodes (run from /export/servers):
scp -r spark-2.3.2-bin-hadoop2.6 hadoop02:/export/servers/
scp -r spark-2.3.2-bin-hadoop2.6 hadoop03:/export/servers/
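Since /etc/profile now also carries the SPARK_* variables, re-copy it to the other nodes the same way as for Scala, then source it on each node; a sketch:
scp /etc/profile hadoop02:/etc/
scp /etc/profile hadoop03:/etc/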
6. Start and verify
# Start the cluster (careful: the script shares its name with Hadoop's start-all.sh, so invoke Spark's copy explicitly)
cd $SPARK_HOME
./sbin/start-all.sh # check the cluster status at http://192.168.80.139:8080/
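To confirm the daemons came up, jps on each node should show a Master and a Worker on hadoop01, and a Worker on hadoop02/hadoop03 (all three hosts are listed in slaves):
jps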
# Verify interactively from the command line
./bin/spark-shell
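Run this way the shell starts in local mode; to attach it to the standalone cluster just started, pass the master URL (the standalone master listens on port 7077 by default):
./bin/spark-shell --master spark://hadoop01:7077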
(That completes the verified setup; the REPL transcript below was not re-verified. Note that its example path points at a spark-1.6.2 tree, not this install.)
scala> val textFile = sc.textFile("file:///home/zkpk/spark-1.6.2/README.md")
textFile: org.apache.spark.rdd.RDD[String] = file:///home/zkpk/spark-1.6.2/README.md MapPartitionsRDD[1] at textFile at <console>:27
scala> textFile.count()
res0: Long = 95
scala> textFile.first()
res1: String = # Apache Spark