Flink Standalone mode is the simplest way to deploy a cluster because it does not depend on any other components. Flink also supports deployment on YARN, Mesos, and K8S.
Standalone execution architecture diagram:
1) The client submits tasks to the JobManager
2) The JobManager is responsible for requesting the resources that tasks need to run and for managing tasks and resources.
3) JobManager distributes tasks to TaskManager for execution
4) TaskManager regularly reports status to JobManager
Environment:
10.0.83.71 jobmanager + taskmanager
10.0.83.72 taskmanager
10.0.83.73 taskmanager
Stop and disable the firewall:
systemctl stop firewalld
systemctl disable firewalld
2. Modify the configuration to match the actual cluster:
sed -i 's/jobmanager.rpc.address: localhost/jobmanager.rpc.address: 10.0.83.71/g' /opt/flink/conf/flink-conf.yaml
sed -i 's/taskmanager.numberOfTaskSlots: 1/taskmanager.numberOfTaskSlots: 2/g' /opt/flink/conf/flink-conf.yaml
Allow web submission
sed -i 's/#web.submit.enable: false/web.submit.enable: true/g' /opt/flink/conf/flink-conf.yaml
Specify the master node
sed -i 's/localhost:8081/10.0.83.71:8081/g' /opt/flink/conf/masters
Specify the worker nodes
echo -e '10.0.83.71\n10.0.83.72\n10.0.83.73' > /opt/flink/conf/workers
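The sed commands in this step change nothing if the line in flink-conf.yaml does not match the search text exactly (for example, because of different spacing). As a rough sketch only, a small helper could replace an existing entry or append a missing one. The function name set_conf_key is hypothetical and not part of Flink, and the helper assumes flat "key: value" lines and values that contain no "|" or "&".

```shell
# Hypothetical helper (not part of Flink): set "key: value" in a flat YAML file.
# An existing entry, commented out or not, is replaced; otherwise one is appended.
# Assumes the value contains no "|" or "&", which are special to sed here.
set_conf_key() {
  file="$1"; key="$2"; value="$3"
  # Escape dots so that a key such as "web.submit.enable" is matched literally.
  escaped_key=$(printf '%s' "$key" | sed 's/\./\\./g')
  pattern="^#? *${escaped_key} *:.*\$"
  if grep -Eq "$pattern" "$file"; then
    # Rewrite through a temporary file, then move it into place.
    sed -E "s|${pattern}|${key}: ${value}|" "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
  else
    printf '%s: %s\n' "$key" "$value" >> "$file"
  fi
}
```

With the values used in this article, it could be called as: set_conf_key /opt/flink/conf/flink-conf.yaml jobmanager.rpc.address 10.0.83.71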
3. Configure passwordless SSH login
Run on 71, 72, and 73: ssh-keygen -t rsa
Then, on each machine, copy the public key to every node:
ssh-copy-id 10.0.83.71
ssh-copy-id 10.0.83.72
ssh-copy-id 10.0.83.73
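If the cluster grows, typing one ssh-copy-id line per host becomes tedious. The sketch below derives the commands from a workers file instead. It is a dry run that only prints the commands, and the function name print_copy_id_commands is hypothetical.

```shell
# Hypothetical helper: print one ssh-copy-id command per host in a workers file.
# Blank lines and lines starting with "#" are skipped; nothing is executed.
print_copy_id_commands() {
  workers_file="$1"
  while IFS= read -r host || [ -n "$host" ]; do
    case "$host" in
      ''|'#'*) continue ;;
    esac
    printf 'ssh-copy-id %s\n' "$host"
  done < "$workers_file"
}
```

For example, print_copy_id_commands /opt/flink/conf/workers lists the commands for this cluster, which can then be reviewed and run by hand on each machine.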
4. Synchronize the Flink directory to the other machines
scp -r /opt/flink 10.0.83.72:/opt/
scp -r /opt/flink 10.0.83.73:/opt/
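Once every node has the same /opt/flink directory, the cluster is normally started from the master with the bin/start-cluster.sh script that ships with Flink (the start step is not shown in this article, so treat it as an assumption about the setup). A quick sanity check is to compare the slot count the cluster reports with the count the configuration implies: one TaskManager per line in the workers file, multiplied by taskmanager.numberOfTaskSlots. The helper name expected_total_slots below is hypothetical.

```shell
# Hypothetical helper: compute the slot total implied by the configuration.
# $1 = workers file (one TaskManager host per line), $2 = slots per TaskManager.
expected_total_slots() {
  workers_file="$1"
  slots_per_tm="$2"
  # Count lines that are neither blank nor comments.
  tm_count=$(grep -cEv '^[[:space:]]*(#|$)' "$workers_file")
  echo $((tm_count * slots_per_tm))
}
```

For the configuration in this article, expected_total_slots /opt/flink/conf/workers 2 gives 3 × 2 = 6. With the cluster running, that number should match the slots-total field returned by curl -s http://10.0.83.71:8081/overview, assuming all three TaskManagers registered.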
Deploy the Hadoop cluster
For the deployment steps, refer to: https://blog.51cto.com/mapengfei/2546950
After the Hadoop cluster is deployed, create the input and output directories in HDFS:
hdfs dfs -mkdir -p /wordcount/output
hdfs dfs -mkdir -p /wordcount/input
Upload the sample data to HDFS
hdfs dfs -put /opt/words.txt /wordcount/input
Run the Flink test job:
cd /opt/flink/
bin/flink run examples/batch/WordCount.jar --input hdfs://node1:8020/wordcount/input/words.txt --output hdfs://node1:8020/wordcount/output
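To judge whether the job produced sensible results, it can help to compute a reference count locally from the same input file. The sketch below assumes the bundled WordCount example lowercases the text, splits on non-word characters, and writes one "word count" pair per line. That matches the example as I understand it, but it is worth checking against the Flink version in use. The function name local_wordcount is hypothetical.

```shell
# Hypothetical helper: local reference word count for a text file.
# Lowercases the input, splits on anything that is not a letter, digit,
# or underscore, and prints "word count" lines sorted by word.
local_wordcount() {
  LC_ALL=C tr '[:upper:]' '[:lower:]' < "$1" \
    | LC_ALL=C tr -cs '[:alnum:]_' '\n' \
    | awk 'NF' \
    | LC_ALL=C sort \
    | uniq -c \
    | awk '{ print $2, $1 }'
}
```

Running local_wordcount /opt/words.txt gives the reference list. The order of lines in the job's output may differ, and the output may be split across several files depending on parallelism, so sort the job's output before comparing the two.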