Learn Flink the Easy Way: Build a Cluster in Ten Minutes


A Flink cluster can be set up in two ways: Standalone mode (a standalone cluster) and HA mode (a high-availability cluster).
This article walks through both setups, using flink-1.6.0-bin-hadoop26-scala_2.11.tgz as the example version.
Environment preparation
Three virtual machines with the hostnames node01, node02, and node03, with passwordless SSH login configured between them
JDK 1.8 or later installed on each virtual machine
A ZooKeeper cluster and a Hadoop cluster installed across the virtual machines
The Hadoop cluster (HDFS) access port configured as 8020
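
The prerequisites above can be checked before going any further. The following is a minimal sketch rather than part of the original setup: the function names `java_version_ok` and `check_node` are hypothetical, and it assumes that `ssh` is installed and that `java` is on the `PATH` of each node.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the prerequisites; the names are
# illustrative and not part of Flink.

# Return 0 if a Java version string such as "1.8.0_181" or "11.0.2"
# denotes version 1.8 or newer.
java_version_ok() {
  local v="$1" major minor
  major="${v%%.*}"
  if [ "$major" = "1" ]; then
    # Old scheme: 1.<minor>.x, so 1.8 has minor version 8.
    minor="${v#1.}"
    minor="${minor%%.*}"
    [ "$minor" -ge 8 ]
  else
    # New scheme: 9, 10, 11, ... are all newer than 1.8.
    [ "$major" -ge 9 ]
  fi
}

# Log in to a node without a password prompt (BatchMode makes ssh fail
# instead of asking) and report the Java version found there.
check_node() {
  local host="$1" ver
  ver="$(ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" 'java -version 2>&1' \
        | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
  if [ -z "$ver" ]; then
    echo "$host: passwordless SSH or Java not available"
    return 1
  fi
  if java_version_ok "$ver"; then
    echo "$host: Java $ver OK"
  else
    echo "$host: Java $ver is older than 1.8"
    return 1
  fi
}
```

For example, `for n in node01 node02 node03; do check_node "$n"; done` would check all three nodes from whichever machine it is run on.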


Create the following directories on each virtual machine

mkdir -p /opt/softwares
mkdir -p /opt/servers
mkdir -p /opt/data
mkdir -p /opt/logs


Flink installation package: flink-1.6.0-bin-hadoop26-scala_2.11.tgz

Download page: https://flink.apache.org/downloads.html


Standalone Cluster Setup
1. Upload the installation package (flink-1.6.0-bin-hadoop26-scala_2.11.tgz) to the /opt/softwares directory on node01
2. Extract the installation package into the /opt/servers directory

cd /opt/softwares
tar -zxvf flink-1.6.0-bin-hadoop26-scala_2.11.tgz -C /opt/servers/


3. Modify the configuration file conf/flink-conf.yaml

# Hostname of the Master (JobManager) node
jobmanager.rpc.address: node01
# Port number of the Master
jobmanager.rpc.port: 6123
# JVM heap memory size of the JobManager
jobmanager.heap.size: 1024M
# JVM heap memory size of each TaskManager
taskmanager.heap.size: 1024M
# Number of task slots per TaskManager
taskmanager.numberOfTaskSlots: 3
# Whether to preallocate managed memory at startup
taskmanager.memory.preallocate: false
# Default parallelism of operators
parallelism.default: 1
# Port of the web UI
rest.port: 8081
# Temporary directory used by each TaskManager
taskmanager.tmp.dirs: /opt/data/flink
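
After editing the file, it can help to read individual settings back and confirm that they parse as intended. The sketch below assumes that flink-conf.yaml is a flat list of `key: value` lines; `get_conf` is a hypothetical helper, not a Flink tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of one "key: value" line from a
# flat YAML file such as flink-conf.yaml. Commented-out lines are ignored
# because they start with "#", not with the key.
get_conf() {
  local key="$1" file="$2"
  awk -v k="$key" '
    index($0, k ":") == 1 {        # the line starts with "key:"
      sub(/^[^:]*:[ \t]*/, "")     # drop the key, the colon, and blanks
      print
      exit
    }
  ' "$file"
}
```

For example, `get_conf taskmanager.numberOfTaskSlots /opt/servers/flink-1.6.0/conf/flink-conf.yaml` prints the configured number of task slots. Only the text up to the first colon is stripped, so values that contain colons themselves, such as HDFS URLs, are returned whole.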


4. Modify the configuration file conf/masters in the Flink directory

vi masters
# If a default entry is present, delete it
# Add the following entry
node01:8081


5. Modify the configuration file conf/slaves in the Flink directory

vi slaves
# If a default entry is present, delete it
# Add the following entries
node01
node02
node03


6. Distribute the installation directory to the other nodes

cd /opt/servers
scp -r flink-1.6.0/ node02:$PWD
scp -r flink-1.6.0/ node03:$PWD
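
With more than a couple of nodes, the repeated copy commands can be wrapped in a loop. This is only a sketch: `distribute` and the `DRY_RUN` variable are hypothetical names, and like the commands above it assumes that the target path on every node matches the current directory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a file or directory to the same path on each
# listed host. With DRY_RUN set, the scp commands are printed instead of run.
distribute() {
  local src="$1" host
  shift
  for host in "$@"; do
    ${DRY_RUN:+echo} scp -r "$src" "${host}:${PWD}"
  done
}

# Dry run: show what would be copied to the worker nodes.
DRY_RUN=1 distribute flink-1.6.0/ node02 node03
```

Running `distribute flink-1.6.0/ node02 node03` from /opt/servers without `DRY_RUN` performs the actual copy.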


7. Start and stop the cluster

# Start the cluster
bin/start-cluster.sh
# Stop the cluster
bin/stop-cluster.sh


8. Open the web UI

http://node01:8081
# Note: the hosts file on Windows must map the node01 hostname to its IP address
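
Besides opening the page in a browser, the same port can be queried from the command line. The sketch below assumes that the REST endpoint `/overview` returns a JSON object containing a `taskmanagers` field and that `curl` is installed; `taskmanager_count` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read the JSON returned by the /overview endpoint
# on stdin and print the number of registered TaskManagers.
taskmanager_count() {
  sed -n 's/.*"taskmanagers"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

For example, `curl -s http://node01:8081/overview | taskmanager_count` prints the number of TaskManagers currently registered. With three nodes listed in conf/slaves, a value of 3 would suggest that every TaskManager has started.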


HA Cluster Setup
1. Stop the Flink cluster

bin/stop-cluster.sh


2. Starting from the Standalone setup, modify conf/flink-conf.yaml on node01

# Append the following settings to the end of the file
# Enable HA
# Storage mode for checkpoint state
state.backend: filesystem
# Storage path for checkpoint state
state.backend.fs.checkpointdir: hdfs://node01:8020/flink-checkpoints
# Set the high-availability mode to zookeeper; HA relies on ZooKeeper
high-availability: zookeeper
# Storage path for HA metadata
high-availability.storageDir: hdfs://node01:8020/flink/ha/
# Addresses of the ZooKeeper cluster
high-availability.zookeeper.quorum: node01:2181,node02:2181,node03:2181
# ZooKeeper ACL mode for the nodes Flink creates: open (unrestricted) or creator
high-availability.zookeeper.client.acl: open


3. Distribute the file flink-conf.yaml

scp flink-conf.yaml node02:$PWD
scp flink-conf.yaml node03:$PWD


4. Modify the configuration file flink-conf.yaml on node02

# Change this setting
jobmanager.rpc.address: node02
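
Editing a single key by hand on each node is easy to get wrong, so the change can also be scripted. The sketch below assumes a flat `key: value` file; `set_conf` is a hypothetical helper, not a Flink tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: set "key: value" in a flat YAML file such as
# flink-conf.yaml, replacing the existing line or appending a new one.
set_conf() {
  local key="$1" value="$2" file="$3" tmp status
  tmp="$(mktemp)" || return 1
  awk -v k="$key" -v v="$value" '
    index($0, k ":") == 1 { print k ": " v; found = 1; next }
    { print }
    END { if (!found) print k ": " v }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file permissions
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

On node02 this would be called as `set_conf jobmanager.rpc.address node02 /opt/servers/flink-1.6.0/conf/flink-conf.yaml`.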



5. Modify conf/masters on node01

node01:8081
node02:8081


6. Distribute the masters file

scp masters node02:$PWD
scp masters node03:$PWD


7. Start the cluster

# 1. Start the ZooKeeper cluster
# 2. Start the Hadoop cluster
# 3. Start the Flink cluster
bin/start-cluster.sh
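
Because the HA setup depends on ZooKeeper, it can be worth confirming that every quorum member responds before starting Flink. The sketch below is not part of the original steps. It assumes that `nc` is installed and that the ZooKeeper servers accept the `ruok` four-letter command, which newer ZooKeeper releases only allow if it is whitelisted. The function names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the ZooKeeper quorum before starting Flink.

# Split a comma-separated quorum string into one host:port per line.
split_quorum() {
  printf '%s\n' "$1" | tr ',' '\n'
}

# Send the "ruok" four-letter command to one server; a healthy
# ZooKeeper server answers "imok".
zk_ok() {
  local host="${1%%:*}" port="${1##*:}"
  [ "$(printf 'ruok' | nc -w 2 "$host" "$port" 2>/dev/null)" = "imok" ]
}

# Print the status of every server in the quorum string.
check_quorum() {
  local server
  for server in $(split_quorum "$1"); do
    if zk_ok "$server"; then
      echo "$server: ok"
    else
      echo "$server: not responding"
    fi
  done
}
```

For example, `check_quorum node01:2181,node02:2181,node03:2181` checks the same addresses that are configured in `high-availability.zookeeper.quorum`.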


8. Open the web UI

http://node01:8081/


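Summary
For any big data component, the first step in learning it is usually setting up a cluster environment. These setup steps are largely formulaic, though:
once you have done it once, other setups tend to be much the same. This article described both ways of setting up a Flink cluster in detail,
and we hope it helps you get through your first Flink lesson with ease.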


This article comes from the WeChat official account 黑马程序员广州中心 (itheimagz).


Origin blog.51cto.com/14500648/2430134