Flink standalone high availability cluster: configuring multiple jobmanagers

    The last article briefly described the basics of deploying a Flink standalone cluster in a production environment. If there is only one jobmanager, then once that node goes down all running tasks are interrupted, which has a fairly large impact. In production the jobmanager must be highly available, with at least two nodes; a jobmanager instance and a taskmanager instance may run on the same physical node, so multiple taskmanagers and multiple jobmanagers coexist in the highly available setup. High availability relies on zookeeper for failure recovery, so first prepare a zookeeper cluster. It is recommended to set up an independent zookeeper cluster rather than use the single-node zookeeper built into flink. The original environment is as follows:

    bigdata1 - jobmanager

    bigdata2, bigdata3, bigdata4 - taskmanager

    Current zookeeper cluster: bigdata1, bigdata2, bigdata3, port 2181

    Next, expand the jobmanagers to achieve high availability: run a jobmanager on bigdata4 in addition to the existing jobmanager on bigdata1.

    First, configure one node; here the configuration starts on bigdata1:

    Edit conf/flink-conf.yaml and find the High Availability section. By default it is commented out, meaning high availability is not used; manually remove the comments and add a few configuration items. The specific configuration is as follows:

high-availability: zookeeper
high-availability.storageDir: file:///data/flink/ha
high-availability.zookeeper.quorum: bigdata1:2181,bigdata2:2181,bigdata3:2181
high-availability.zookeeper.path.root: /flink
high-availability.cluster-id: /flink_cluster

    high-availability defaults to NONE, which means high availability is not used; change it to zookeeper here.

    high-availability.storageDir is the highly available storage for some larger objects used during recovery. The documentation recommends storage that all nodes can access, with hdfs recommended. Here the local file system is configured; whether that actually works needs to be verified, and hdfs is recommended for a production environment.
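    For reference, with an hdfs cluster available the same setting would look something like the line below (the path is only an example; hdfs support also requires the hadoop configuration to be visible to flink):

high-availability.storageDir: hdfs:///flink/ha/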

    high-availability.zookeeper.quorum is the zookeeper cluster connection configuration.

    high-availability.zookeeper.path.root is the root path for flink in zookeeper and must be the same across the entire cluster; here it is /flink. If more than one flink cluster uses the same zookeeper cluster, this path should distinguish them.

    high-availability.cluster-id identifies the cluster and must be consistent across the entire cluster. A directory named after this cluster-id is created both under storageDir and under the zookeeper root path to store the coordination data needed for recovery.
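    Putting these values together, the coordination data lands under paths like the following (illustrative, derived from the settings above):

/flink/flink_cluster                  # znodes in zookeeper: path.root + cluster-id
/data/flink/ha/flink_cluster          # recovery data under storageDir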

    After confirming the above configuration, save the file.

    Configure the masters file, conf/masters, adding node bigdata4.
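    Assuming the default web ui port 8081 for both nodes (the port is an assumption, not from the original setup), conf/masters would look something like:

bigdata1:8081
bigdata4:8081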

    

    Meanwhile conf/slaves remains unchanged: bigdata2, bigdata3, bigdata4.
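    That is, conf/slaves simply lists one taskmanager host per line:

bigdata2
bigdata3
bigdata4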

    Then synchronize the flink-conf.yaml and masters configuration to all the other nodes in the cluster, and make sure the zookeeper service is up and running, for example as sketched below.
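    A minimal sketch of both steps, assuming flink is installed under /opt/flink on every node (the install path and the use of scp are assumptions):

# run from the flink directory on bigdata1: push the changed config to the other nodes
for host in bigdata2 bigdata3 bigdata4; do
    scp conf/flink-conf.yaml conf/masters $host:/opt/flink/conf/
done

# quick zookeeper liveness check; every node should answer "imok"
# (newer zookeeper versions require whitelisting four-letter commands via 4lw.commands.whitelist)
for host in bigdata1 bigdata2 bigdata3; do
    echo ruok | nc $host 2181; echo
done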

    Execute bin/start-cluster.sh to start the cluster. You will find that bigdata4 now also runs a StandaloneSessionClusterEntrypoint process. At this point you can run get /flink/flink_cluster/leader/rest_server_lock through the zookeeper client to view the current jobmanager master; normally you will see bigdata1.
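    The same check, spelled out (zkCli.sh is the client shipped with zookeeper; its location depends on your zookeeper installation):

# start the cluster from the flink directory on one of the master nodes
./bin/start-cluster.sh

# jps on bigdata1 and bigdata4 should each show a StandaloneSessionClusterEntrypoint

# ask zookeeper who currently holds the rest server leadership
zkCli.sh -server bigdata1:2181
get /flink/flink_cluster/leader/rest_server_lock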

    

    You can then try killing the StandaloneSessionClusterEntrypoint process on bigdata1 and visit the web ui via bigdata4:8081. While flink is failing over, the log may show errors; wait a little while and the interface will load successfully, and you can see slots, task managers, and detailed task descriptions normally. At this point the jobmanager has failed over successfully, achieving high availability, and looking at the node in zookeeper again shows that the leader has switched to bigdata4.
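    A sketch of that failover test (the jps/kill combination is an assumption; any way of stopping the process works):

# on bigdata1: kill the jobmanager process to simulate a failure
kill $(jps | grep StandaloneSessionClusterEntrypoint | awk '{print $1}')

# then open http://bigdata4:8081 and, in the zookeeper client, re-run:
get /flink/flink_cluster/leader/rest_server_lock
# the stored leader address should now point at bigdata4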

    Also note that after high availability is configured, the jobmanager.rpc.port setting in flink-conf.yaml no longer takes effect; that setting only applied to the earlier standalone cluster with a single jobmanager. Now the port is chosen automatically and differs between the jobmanagers, but we do not need to care about it, and it has no effect on using flink.
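    If deterministic ports matter (for example for firewall rules), flink also provides the high-availability.jobmanager.port option, which constrains the automatically chosen rpc port to a given range; the range below is only an example:

high-availability.jobmanager.port: 50010-50025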

    That is the configuration for flink jobmanager high availability. It is fairly simple to set up and is recommended in production environments for better cluster stability.

    Reference documentation: https://ci.apache.org/projects/flink/flink-docs-release-1.9/zh/ops/jobmanager_high_availability.html

    
