Spark Series (7): Building a Highly Available Spark Cluster Based on ZooKeeper

1. Cluster Planning

Here we build a three-node Spark cluster in which a Worker service is deployed on all three hosts. To ensure high availability, in addition to the primary Master service deployed on hadoop001, standby Master services are also deployed on hadoop002 and hadoop003. Master election is coordinated by the ZooKeeper cluster: if the primary Master becomes unavailable, a standby Master is promoted to be the new primary Master.
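The planned layout described above can be summarized as:

```
Host        Worker    Master
hadoop001   Worker    primary Master
hadoop002   Worker    standby Master
hadoop003   Worker    standby Master
```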

2. Prerequisites

Before setting up the Spark cluster, make sure the JDK environment is installed and that the ZooKeeper cluster and Hadoop cluster are already up; the steps for those can be found in the earlier posts in this series.

3. Spark Cluster Setup

3.1 Download and Unpack

Download the required version of Spark from the official download page: http://spark.apache.org/downloads.html

After downloading, unpack it:

# tar -zxvf spark-2.4.0-bin-hadoop2.6.tgz

3.2 Configure Environment Variables

# vim /etc/profile

Add environment variables:

export SPARK_HOME=/usr/app/spark-2.4.0-bin-hadoop2.6
export PATH=${SPARK_HOME}/bin:$PATH

Make the environment variables take effect immediately:

# source /etc/profile

3.3 Cluster Configuration

Enter the ${SPARK_HOME}/conf directory and copy the sample configuration files to be modified:

1. spark-env.sh

 cp spark-env.sh.template spark-env.sh
# JDK installation location
JAVA_HOME=/usr/java/jdk1.8.0_201
# Location of the Hadoop configuration files
HADOOP_CONF_DIR=/usr/app/hadoop-2.6.0-cdh5.15.2/etc/hadoop
# ZooKeeper settings: recovery mode, ZooKeeper servers, and the znode for recovery state
SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop001:2181,hadoop002:2181,hadoop003:2181 -Dspark.deploy.zookeeper.dir=/spark"

2. slaves

cp slaves.template slaves

Configure the hostnames of all Worker nodes:

hadoop001
hadoop002
hadoop003

3.4 Distribute the Installation Package

Distribute the Spark installation package to the other servers. It is recommended to configure the Spark environment variables on those two servers as well.

scp -r /usr/app/spark-2.4.0-bin-hadoop2.6/ hadoop002:/usr/app/
scp -r /usr/app/spark-2.4.0-bin-hadoop2.6/ hadoop003:/usr/app/
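For example (assuming the installation path is the same on all nodes, as in this guide), you can append the same entries to /etc/profile on hadoop002 and hadoop003 and re-source it:

```shell
# Run on hadoop002 and hadoop003 (as root); paths match those used on hadoop001
echo 'export SPARK_HOME=/usr/app/spark-2.4.0-bin-hadoop2.6' >> /etc/profile
echo 'export PATH=${SPARK_HOME}/bin:$PATH' >> /etc/profile
source /etc/profile
```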

4. Start the Cluster

4.1 Start the ZooKeeper Cluster

Start the ZooKeeper service on each of the three servers:

 zkServer.sh start

4.2 Start the Hadoop Cluster

# Start the HDFS service
start-dfs.sh
# Start the YARN service
start-yarn.sh

4.3 Start the Spark Cluster

On hadoop001, enter the ${SPARK_HOME}/sbin directory and execute the following command to start the cluster. This starts the Master service on hadoop001 and a Worker service on every node listed in the slaves configuration file.

start-all.sh
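You can verify which daemons came up with jps (a sketch; hostnames follow this guide). At this point hadoop001 should show both a Master and a Worker process, while hadoop002 and hadoop003 show only a Worker:

```shell
# Run on each node to list the Spark daemons among the running JVM processes
jps | grep -E 'Master|Worker'
```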

Run the following command on hadoop002 and hadoop003 to start the standby Master services:

# Execute under ${SPARK_HOME}/sbin
start-master.sh

4.4 View the Services

Open Spark's Web UI on port 8080. You can see that the Master on hadoop001 is in the ALIVE state and that there are three available Worker nodes.

The Masters on hadoop002 and hadoop003 are in the STANDBY state, with no available Worker nodes.
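If you prefer the command line to a browser, the same status can be pulled from the Web UI pages (a sketch; hostnames and port as above). The primary Master's page contains the string ALIVE, while the standbys' pages contain STANDBY:

```shell
# The Master's Web UI embeds its status in the page body
curl -s http://hadoop001:8080 | grep -o 'ALIVE'    # primary Master
curl -s http://hadoop002:8080 | grep -o 'STANDBY'  # standby Master
```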

5. High Availability Verification

Now use the kill command to kill the Master process on hadoop001; one of the standby Masters will then become the new primary Master. In my case it was hadoop002: you can see that the Master on hadoop002 goes through the RECOVERING state, becomes the new primary Master, and acquires all the available Workers.
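A minimal way to simulate the failure (assuming jps is available on hadoop001) is to look the Master's PID up in the jps listing and kill it:

```shell
# On hadoop001: find the Master daemon's PID and kill it to trigger failover
kill -9 "$(jps | awk '$2 == "Master" {print $1}')"
```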


If you then run start-master.sh on hadoop001 again, its Master will come back up as a standby Master.

6. Submit Jobs

The command for submitting to YARN is exactly the same as in a single-node environment. Taking the built-in SparkPi sample program, which computes Pi, as an example, submit it with the following command:

spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode client \
--executor-memory 1G \
--num-executors 10 \
/usr/app/spark-2.4.0-bin-hadoop2.6/examples/jars/spark-examples_2.11-2.4.0.jar \
100
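In client mode the result is printed to the submitting console; SparkPi's output line starts with "Pi is roughly", so you can filter the driver output for it (same command as above, piped through grep):

```shell
# Submit SparkPi and keep only the result line from the driver output
spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn \
  --deploy-mode client \
  /usr/app/spark-2.4.0-bin-hadoop2.6/examples/jars/spark-examples_2.11-2.4.0.jar \
  100 | grep 'Pi is roughly'
```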

More articles in this big data series can be found in the GitHub open source project "Big Data Getting Started Guide".

Origin: www.cnblogs.com/heibaiying/p/11330385.html