Section 7: Flink Standalone HA Cluster Configuration

Previous: Section 6, Flink on YARN internal implementation


1. Flink HA (high availability)

JobManager High Availability (HA)

  1. The JobManager coordinates the deployment of every Flink task. It is responsible for scheduling and resource management.
  2. By default, each Flink cluster has only one JobManager, which is a single point of failure (SPOF): if the JobManager goes down, no new jobs can be submitted and running programs fail.
  3. With JobManager HA, the cluster can recover from a JobManager failure, which removes the SPOF. High availability can be configured in both standalone mode and YARN cluster mode.

2. JobManager HA configuration steps

  1. Standalone cluster high availability
    In standalone mode, the basic idea of JobManager high availability is that at any time there is one master JobManager and several standby JobManagers. If the master JobManager fails, a standby JobManager takes over the cluster and becomes the new master. This removes the single point of failure: once a standby JobManager has taken over, programs can continue to run. There is no explicit distinction between standby and master JobManager instances; each JobManager can act as either master or standby.
  2. YARN cluster high availability
    HA for Flink on YARN relies on YARN's own job recovery mechanism.
  3. For the detailed configuration steps, refer to << Flink HA Configuration Guide .doc >>

3. Configuration and operation steps

(1) Edit the configuration file in flink-1.7.0/conf

[root@Flink105 conf]# vim flink-conf.yaml 
[root@Flink105 conf]# 

# Set this parameter to the local host name
jobmanager.rpc.address: flink105

(2) Configure slaves

[root@Flink105 conf]# vim slaves 

flink106
flink107

(3) Configure masters

// Then modify the parameters required for HA
[root@Flink105 conf]# vim masters 

flink105:8081
flink106:8081
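
Each line of the masters file names a JobManager host and the port of its web UI. As a small sketch (the function name `master_urls` is made up for this example and is not part of Flink; lines without a port are simply skipped), the file can be turned into the URLs to open once the cluster is running:

```shell
# master_urls FILE
# Hypothetical helper: prints one web UI URL per "host:port" line of a
# Flink masters file. Blank lines and lines without a port are skipped.
master_urls() {
    awk -F: '
        { gsub(/[ \t\r]/, "") }      # drop stray whitespace, fields are re-split
        NF == 2 && $1 != "" && $2 != "" { printf "http://%s:%s\n", $1, $2 }
    ' "$1"
}
```

Calling `master_urls conf/masters` on the file above would print `http://flink105:8081` and `http://flink106:8081`, one per line.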

(4) Configure flink-conf.yaml

[root@Flink105 conf]# vim flink-conf.yaml

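high-availability: zookeeper
high-availability.zookeeper.quorum: flink105:2181
# Root ZooKeeper node, under which the namespaces of all cluster nodes are placed
high-availability.zookeeper.path.root: /flink

# Specifying the full HDFS path is recommended. If a Flink node has no HDFS configuration, a path that is not fully qualified cannot be resolved
# storageDir stores all the metadata needed to recover a JobManager
high-availability.storageDir: hdfs://flink105:9000/flink/ha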
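
The HA entries are easy to get subtly wrong, for instance by leaving out the space after a colon. To my understanding, Flink 1.7 reads flink-conf.yaml line by line and only accepts entries written as `key: value`, so an entry without that space is skipped with a warning rather than applied. The sketch below is one way to catch this before starting the cluster. The function name `check_ha_conf` and the choice of which three keys to insist on are assumptions of this example, not something Flink ships.

```shell
# check_ha_conf FILE
# Hypothetical helper: reports lines of a flink-conf.yaml that are not
# "key: value" pairs, and HA keys that are missing.
# Returns non-zero if any problem was found.
check_ha_conf() {
    awk '
        BEGIN {
            bad = 0
            need["high-availability"] = 1
            need["high-availability.zookeeper.quorum"] = 1
            need["high-availability.storageDir"] = 1
        }
        {
            sub(/#.*/, "")                      # strip "#" comments
            gsub(/^[ \t]+|[ \t\r]+$/, "")       # trim surrounding whitespace
            if ($0 == "") next
            i = index($0, ": ")
            if (i == 0) {                       # e.g. "key:value" without a space
                printf "line %d: not a key: value pair: %s\n", NR, $0
                bad = 1
                next
            }
            delete need[substr($0, 1, i - 1)]   # key seen, no longer missing
        }
        END {
            for (k in need) { printf "missing key: %s\n", k; bad = 1 }
            exit bad
        }
    ' "$1"
}
```

Run it as `check_ha_conf conf/flink-conf.yaml`. It prints nothing and returns 0 when all three keys are present and well formed. Annotation lines that start with `//` are reported too if they were copied into the file, since they are not valid entries either.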

(5) Distribute the copy to the other machines

[root@Flink105 module]# scp -r flink-1.7.0/  flink106:/opt/hadoop/module/
[root@Flink105 module]# scp -r flink-1.7.0/  flink107:/opt/hadoop/module/
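
With more than a couple of machines, typing one scp per host does not scale. A minimal sketch of generating the commands from a host list follows. The function name `scp_commands` is invented for this example, and using the slaves file as the host list is an assumption that happens to fit this cluster (it lists exactly the two remote machines).

```shell
# scp_commands SRC_DIR DEST_DIR HOSTS_FILE
# Hypothetical helper: prints one scp command per host listed in HOSTS_FILE.
# Nothing is copied until the printed commands are executed.
scp_commands() {
    while read -r host rest || [ -n "$host" ]; do
        [ -n "$host" ] || continue              # skip blank lines
        printf 'scp -r %s %s:%s\n' "$1" "$host" "$2"
    done < "$3"
}
```

`scp_commands flink-1.7.0/ /opt/hadoop/module/ flink-1.7.0/conf/slaves` prints the commands for review; appending `| sh` runs them.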

(6) Start the Hadoop cluster

[root@Flink105 module]# start-all.sh

(7) Start the ZooKeeper cluster

[root@flink105 zookeeper-3.4.5]# bin/zkServer.sh start
[root@flink106 zookeeper-3.4.5]# bin/zkServer.sh start
[root@flink107 zookeeper-3.4.5]# bin/zkServer.sh start
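
After starting all three servers, it can help to confirm that each one actually joined the ensemble. `zkServer.sh status` reports a line such as `Mode: leader` or `Mode: follower`. The sketch below (the function name `zk_mode` is invented here) extracts just that word so the three hosts can be compared. The exact status output may differ between ZooKeeper versions, so treat the `Mode:` pattern as an assumption.

```shell
# zk_mode
# Hypothetical helper: reads the output of "zkServer.sh status" on stdin and
# prints the value of its "Mode:" line. Returns non-zero if there is none.
zk_mode() {
    awk '$1 == "Mode:" { print $2; found = 1 } END { exit !found }'
}
```

On each machine, `bin/zkServer.sh status 2>&1 | zk_mode` prints the role of that server. In a healthy three-node ensemble, one host reports `leader` and the other two `follower`.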

// Check the processes
[root@Flink105 zkData]# jps -l
16596 org.apache.hadoop.hdfs.server.namenode.NameNode
9959 org.apache.hadoop.yarn.server.resourcemanager.ResourceManager
16888 org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode
17016 org.apache.zookeeper.server.quorum.QuorumPeerMain
17176 sun.tools.jps.Jps
10073 org.apache.hadoop.yarn.server.nodemanager.NodeManager
16698 org.apache.hadoop.hdfs.server.datanode.DataNode
8349 org.apache.flink.runtime.entrypoint.StandaloneSessionClusterEntrypoint


// Start the ZooKeeper client, then close it
[root@Flink105 bin]# ./zkCli.sh 

// Start the Flink cluster
[root@Flink105 flink-1.7.0]# bin/start-cluster.sh

Starting HA cluster with 2 masters.
Starting standalonesession daemon on host Flink105.
Starting standalonesession daemon on host Flink106.
Starting taskexecutor daemon on host Flink106.
Starting taskexecutor daemon on host Flink107.

Check the processes

// Already there
[root@Flink105 flink-1.7.0]# jps
16596 NameNode
9959 ResourceManager
16888 SecondaryNameNode
17016 QuorumPeerMain
17976 Jps
10073 NodeManager
16698 DataNode
8349 StandaloneSessionClusterEntrypoint
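
Reading the `jps` list by eye gets tedious across several machines. A minimal sketch of an automated check follows. `missing_daemons` is a name invented for this example, and which daemons to expect on which host is left to the caller.

```shell
# missing_daemons NAME...
# Hypothetical helper: reads `jps` or `jps -l` output on stdin and prints
# each NAME that does not appear among the running Java processes.
missing_daemons() {
    # Keep only the class name; for `jps -l`, strip the package prefix.
    running=$(awk '{ n = split($2, parts, "[.]"); print parts[n] }')
    for name in "$@"; do
        printf '%s\n' "$running" | grep -qxF -e "$name" || printf '%s\n' "$name"
    done
}
```

For example, `jps | missing_daemons QuorumPeerMain StandaloneSessionClusterEntrypoint` prints nothing on a master where both processes are up, and prints the name of whichever one is absent otherwise.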

Web UI access
http://flink105:8081

When we visit flink106, we are redirected to flink105 (the other machines redirect to flink105 as well).

Origin blog.csdn.net/weixin_39868387/article/details/104699717