Flink clusters can be deployed in two ways: Standalone mode (an independent cluster) and HA mode (a high-availability cluster).
Here we walk through both cluster setups, using flink-1.6.0-bin-hadoop26-scala_2.11.tgz as the example version.
Environment preparation
Three virtual machines, with hostnames node01, node02, and node03, and passwordless SSH login configured between them
JDK 1.8 or above installed on each virtual machine
A ZooKeeper cluster and a Hadoop cluster installed across the virtual machines
The Hadoop cluster's access port configured as 8020
Create the following directories on each virtual machine
mkdir -p /opt/softwares
mkdir -p /opt/servers
mkdir -p /opt/data
mkdir -p /opt/logs
Flink installation package: flink-1.6.0-bin-hadoop26-scala_2.11.tgz
Download page: https://flink.apache.org/downloads.html
Standalone Cluster Setup
1. Upload the installation package (flink-1.6.0-bin-hadoop26-scala_2.11.tgz) to the /opt/softwares directory on node01
2. Extract the installation package into /opt/servers

cd /opt/softwares
tar -zxvf flink-1.6.0-bin-hadoop26-scala_2.11.tgz -C /opt/servers/
3. Modify the configuration file conf/flink-conf.yaml

# RPC port of the JobManager (master)
jobmanager.rpc.port: 6123
# JVM heap size for the JobManager
jobmanager.heap.size: 1024M
# JVM heap size for each TaskManager
taskmanager.heap.size: 1024M
# Number of task slots per TaskManager
taskmanager.numberOfTaskSlots: 3
# Whether to pre-allocate TaskManager memory at startup
taskmanager.memory.preallocate: false
# Default parallelism for each operator
parallelism.default: 1
# Port for the web UI
rest.port: 8081
# Temporary directory used by each TaskManager
taskmanager.tmp.dirs: /opt/data/flink
4. Modify the configuration file conf/masters

vi masters
# If a default entry is present, remove it, then add:
node01:8081
5. Modify the configuration file conf/slaves

vi slaves
# If default entries are present, remove them, then add:
node01
node02
node03
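The slaves list and taskmanager.numberOfTaskSlots together determine the cluster's total task slot capacity: total slots = (number of workers in conf/slaves) x (slots per TaskManager). A small sketch of this calculation; it builds throwaway demo files in a temp directory, so in practice point it at your real conf directory instead:

```shell
# Demo conf dir standing in for /opt/servers/flink-1.6.0/conf (hypothetical path)
conf=$(mktemp -d)
printf 'node01\nnode02\nnode03\n' > "$conf/slaves"
echo 'taskmanager.numberOfTaskSlots: 3' > "$conf/flink-conf.yaml"

# Workers = non-empty lines in slaves; slots = value read from flink-conf.yaml
workers=$(grep -c . "$conf/slaves")
slots=$(awk -F': *' '/^taskmanager.numberOfTaskSlots/ {print $2}' "$conf/flink-conf.yaml")
total=$((workers * slots))
echo "total task slots: $total"   # 3 workers x 3 slots each = 9
```

With the configuration above, the cluster can therefore run jobs with a combined parallelism of up to 9.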
6. Distribute the installation package to the other nodes

cd /opt/servers
scp -r flink-1.6.0/ node02:$PWD
scp -r flink-1.6.0/ node03:$PWD
7. Start and stop the cluster

# Start the cluster
bin/start-cluster.sh
# Stop the cluster
bin/stop-cluster.sh
8. Check the web UI

http://node01:8081
# Note: on Windows, the hosts file must map the node01 hostname to its address
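For a browser on a Windows workstation to resolve node01, the cluster hostnames can be added to C:\Windows\System32\drivers\etc\hosts. The IP addresses below are placeholders; use your virtual machines' actual addresses:

```
192.168.1.101  node01
192.168.1.102  node02
192.168.1.103  node03
```

On Linux or macOS the same entries go in /etc/hosts.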
HA Cluster Setup
1. Stop the Flink cluster

bin/stop-cluster.sh
2. Building on the Standalone setup, modify conf/flink-conf.yaml on node01

# Append the following at the end of the file
# Enable HA
# State backend for checkpoint state
state.backend: filesystem
# Storage path for checkpoint state
state.backend.fs.checkpointdir: hdfs://node01:8020/flink-checkpoints
# Set the high-availability mode to zookeeper (HA relies on ZooKeeper)
high-availability: zookeeper
# Storage path for HA metadata
high-availability.storageDir: hdfs://node01:8020/flink/ha/
# ZooKeeper cluster addresses
high-availability.zookeeper.quorum: node01:2181,node02:2181,node03:2181
# Whether ZooKeeper applies security checks (guards against data loss)
high-availability.zookeeper.client.acl: open
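Rather than editing by hand, the HA settings can be appended with a heredoc. A sketch, writing to a temp file that stands in for the real conf/flink-conf.yaml (which would live under the install directory, e.g. /opt/servers/flink-1.6.0/conf/):

```shell
# Stand-in for the real flink-conf.yaml; substitute the actual path in practice.
conf=$(mktemp)

# Append the HA block verbatim (the quoted EOF prevents shell expansion).
cat >> "$conf" <<'EOF'
state.backend: filesystem
state.backend.fs.checkpointdir: hdfs://node01:8020/flink-checkpoints
high-availability: zookeeper
high-availability.storageDir: hdfs://node01:8020/flink/ha/
high-availability.zookeeper.quorum: node01:2181,node02:2181,node03:2181
high-availability.zookeeper.client.acl: open
EOF

# Quick sanity check: four high-availability keys should now be present.
grep -c '^high-availability' "$conf"
```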
3. Distribute flink-conf.yaml to the other nodes

scp flink-conf.yaml node02:$PWD
scp flink-conf.yaml node03:$PWD
4. Modify flink-conf.yaml on node02

# Change this setting
jobmanager.rpc.address: node02
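This one-line change can also be scripted with sed, which is convenient when preparing several standby JobManagers. A sketch against a demo file; substitute the real conf/flink-conf.yaml path on node02:

```shell
# Demo file standing in for conf/flink-conf.yaml on node02 (hypothetical path)
conf=$(mktemp)
echo 'jobmanager.rpc.address: node01' > "$conf"

# Rewrite the RPC address so this node's JobManager announces itself as node02
sed -i 's/^jobmanager.rpc.address:.*/jobmanager.rpc.address: node02/' "$conf"
cat "$conf"
```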
5. Modify conf/masters on node01

node01:8081
node02:8081
6. Distribute the masters file to the other nodes

scp masters node02:$PWD
scp masters node03:$PWD
7. Start the cluster

# 1. Start the ZooKeeper cluster
# 2. Start the Hadoop cluster
# 3. Start the Flink cluster
bin/start-cluster.sh
8. Check the web UI

http://node01:8081/
Summary
Learning any big data component usually starts with setting up its cluster environment, but these setup steps are largely formulaic: once you have built a cluster, later setups in other scenarios follow much the same pattern. This article walked through Flink's two cluster deployment modes in detail; hopefully it gives you an easy start on your first Flink lesson.
Source: the WeChat official account of Itheima Guangzhou (itheimagz); follow it for more resources.