Environment: RedHat 6.2, CDH 4.4
Virtual machines:
Hostname | IP              | Role
master   | 192.168.199.129 | Master
slave1   | 192.168.199.130 | Worker
slave2   | 192.168.199.131 | Worker
Packages:
scala-2.10.1.tgz
spark-1.3.0-bin-cdh4.tgz
Prerequisites:
JDK and Hadoop are already installed.
Install path: /home/hadoop/cdh44/
1. Install Scala
$ tar -zvxf scala-2.10.1.tgz
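Optionally verify that the extracted Scala runs before continuing; the path below assumes the install directory listed above:
$ /home/hadoop/cdh44/scala-2.10.1/bin/scala -version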
2. Install Spark
$ tar -zvxf spark-1.3.0-bin-cdh4.tgz
$ cd spark-1.3.0-bin-cdh4/conf
Rename the template configuration files (drop the .template suffix), as shown in Figure 1:
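The files shipped in conf/ end in .template; for example:
$ cp spark-env.sh.template spark-env.sh
$ cp slaves.template slaves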
$ vi spark-env.sh
Add the following settings:
export SCALA_HOME=/home/hadoop/cdh44/scala-2.10.1
export HADOOP_HOME=/home/hadoop/cdh44/hadoop-2.0.0-cdh4.4.0
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
SPARK_EXECUTOR_INSTANCES=2
SPARK_EXECUTOR_CORES=1
SPARK_EXECUTOR_MEMORY=400M
SPARK_DRIVER_MEMORY=400M
SPARK_YARN_APP_NAME="Spark 1.3.0"
SPARK_MASTER_PORT=7077
SPARK_MASTER_WEBUI_PORT=9090
SPARK_WORKER_DIR=$SPARK_HOME/work
SPARK_WORKER_WEBUI_PORT=9091
See Figure 2.
$ vi slaves
Add the following two lines (the worker hostnames):
slave1
slave2
Copy the scala and spark directories to slave1 and slave2.
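A sketch of the copy step with scp, assuming the same /home/hadoop/cdh44/ directory exists on both slaves (repeat for slave2):
$ scp -r /home/hadoop/cdh44/scala-2.10.1 hadoop@slave1:/home/hadoop/cdh44/
$ scp -r /home/hadoop/cdh44/spark-1.3.0-bin-cdh4 hadoop@slave1:/home/hadoop/cdh44/
Then set the environment variables on master, slave1, and slave2 (for example in /etc/profile):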
export SCALA_HOME=/home/hadoop/cdh44/scala-2.10.1
export PATH=$SCALA_HOME/bin:$PATH
export SPARK_HOME=/home/hadoop/cdh44/spark-1.3.0-bin-cdh4
export PATH=$SPARK_HOME/bin:$SPARK_HOME/sbin:$PATH
source /etc/profile
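Spark's start-all.sh launches the workers over ssh on the hosts listed in conf/slaves, so the master needs passwordless SSH to slave1 and slave2 (usually already in place if Hadoop is started the same way). A minimal sketch, assuming the hadoop user on every node:
$ ssh-keygen -t rsa          # on master, accept the defaults
$ ssh-copy-id hadoop@slave1
$ ssh-copy-id hadoop@slave2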
3. Start Spark
$ start-all.sh
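Note that this is Spark's start-all.sh under $SPARK_HOME/sbin, not Hadoop's script of the same name; calling it with the full path avoids any ambiguity:
$ $SPARK_HOME/sbin/start-all.sh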
$ jps    (run on each of the three machines to check the processes)
See Figures 3, 4, and 5.
Check the cluster status in a browser:
http://192.168.199.129:9090
See Figure 6.
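As a final smoke test, you can submit the bundled SparkPi example to the cluster; this sketch assumes the examples jar shipped with the binary distribution sits under $SPARK_HOME/lib:
$ spark-submit --class org.apache.spark.examples.SparkPi \
    --master spark://master:7077 \
    $SPARK_HOME/lib/spark-examples-*.jar 10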