Big Data: Spark HA

Spark HA

File-system-based HA

(1) Edit the conf/spark-env.sh configuration file on the Master server
export JAVA_HOME=/usr/local/java/jdk1.8.0_11
export SPARK_MASTER_HOST=hadoop1
export SPARK_MASTER_PORT=7077
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM -Dspark.deploy.recoveryDirectory=/usr/local/spark/recovery"

(2) Create the recovery directory
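The directory has to match the path given in spark.deploy.recoveryDirectory and be writable by the user that runs the master. A minimal sketch of this step; make_recovery_dir is a hypothetical helper name, not part of Spark:

```shell
# Hypothetical helper (not part of Spark): create the recovery directory
# and confirm that the current user can write to it.
make_recovery_dir() {
  dir="$1"
  mkdir -p "$dir" && [ -w "$dir" ]
}
```

With the configuration above it would be called as: make_recovery_dir /usr/local/spark/recovery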

(3) Start the Spark cluster

./sbin/start-all.sh

(4) Start spark-shell

./bin/spark-shell --master spark://hadoop1:7077

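Note: if the master goes down, you can restart it on its own with ./sbin/start-master.sh. Spark then reads the files under /usr/local/spark/recovery and recovers the previously registered workers and applications, including the running spark-shell.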
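The recovery step in the note can be sketched as follows. This is only an illustration: recover_master is a hypothetical helper, and it assumes the Spark install directory and the recovery directory are passed in as arguments.

```shell
# Hypothetical helper (not part of Spark): restart a failed standalone master.
#   $1 = Spark install directory (assumed here to be /usr/local/spark)
#   $2 = recovery directory from spark.deploy.recoveryDirectory
recover_master() {
  spark_home="$1"
  recovery_dir="$2"
  # Show the state files the previous master left behind.
  ls -A "$recovery_dir"
  # Start a new master on this node; it reloads state from the recovery directory.
  "$spark_home/sbin/start-master.sh"
}
```

With the paths used in this post it would be called as: recover_master /usr/local/spark /usr/local/spark/recovery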

ZooKeeper-based HA

(1) Edit the conf/spark-env.sh configuration file
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=hadoop1:2181,hadoop2:2181,hadoop3:2181 -Dspark.deploy.zookeeper.dir=/spark"

(2) Edit the conf/slaves configuration file

hadoop2
hadoop3

(3) Copy the configuration files to every node over ssh
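One way to do this is a small scp loop. The sketch below assumes Spark is installed under the same path on every node and that passwordless ssh is already set up; sync_conf is a hypothetical helper name.

```shell
# Hypothetical helper (not part of Spark): copy the edited config files
# to the same conf/ directory on each listed host.
#   $1 = Spark install directory, assumed identical on all nodes
#   remaining arguments = target host names
sync_conf() {
  spark_home="$1"
  shift
  for host in "$@"; do
    scp "$spark_home/conf/spark-env.sh" "$spark_home/conf/slaves" \
        "$host:$spark_home/conf/"
  done
}
```

For the hosts in this post it would be called as: sync_conf /usr/local/spark hadoop2 hadoop3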

(4) Start the Spark cluster

./sbin/start-all.sh

(5) Start a master on the hadoop2 node alone

./sbin/start-master.sh

Note: while the master on hadoop1 is active, the master on hadoop2 stays in standby. If the hadoop1 master server goes down, the hadoop2 master takes over and replaces it.
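For an application to follow such a failover, Spark's standalone mode accepts a comma-separated list of masters in the --master URL. A minimal sketch, assuming every master listens on port 7077 like hadoop1; master_url is a hypothetical helper name.

```shell
# Hypothetical helper (not part of Spark): build a master URL that lists
# every master host, assuming all of them listen on port 7077.
master_url() {
  url="spark://"
  sep=""
  for host in "$@"; do
    url="$url$sep$host:7077"
    sep=","
  done
  echo "$url"
}

# master_url hadoop1 hadoop2 prints spark://hadoop1:7077,hadoop2:7077
```

The shell could then be started with: ./bin/spark-shell --master "$(master_url hadoop1 hadoop2)"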


Reposted from blog.csdn.net/JavaDestiny/article/details/94413324