1. yarn-session mode
1.1 Start yarn session first
bin/yarn-session.sh \
-s 8 \
-jm 4g \
-tm 16g \
-nm yarn-session-flink \
-d
Parameter explanation
Parameter | Description |
---|---|
-jm,--jobManagerMemory | Memory for the JobManager container, with an optional unit (default: MB); the command above uses 4g |
-tm,--taskManagerMemory | Memory for each TaskManager container, with an optional unit (default: MB); the command above uses 16g |
-s,--slots | Number of slots per TaskManager (8 in the command above) |
-d,--detached | Run the session in detached mode (in the background) |
-nm,--name | Set the application name on YARN |
-D<property=value> | Dynamic properties, for example -Dparallelism.default=3 |
-q,--query | Show available YARN resources (memory, cores) |
-qu,--queue | Specify the YARN queue |
-t,--ship | Ship (transfer) the files in the specified directory |
-nl,--nodeLabel | Specify the YARN node label for the YARN application |
-z,--zookeeperNamespace | Namespace used to create the ZooKeeper sub-paths in high-availability mode |
-j,--jar | Path to the Flink jar file |
1.2 Submit the job to the running YARN session
flink run -t yarn-session -Dyarn.application.id=application_1650018331890_0001 -c org.apache.flink.examples.java.wordcount.WordCount examples/batch/WordCount.jar
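The application ID passed to -Dyarn.application.id has to be looked up once the session is running; one way is to read it from the output of `yarn application -list`. The following is a minimal sketch, assuming the ID appears in the usual application_<timestamp>_<sequence> form. The helper name extract_app_id and the sample line are made up for illustration and are not part of Flink or YARN.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first YARN application ID found on stdin.
extract_app_id() {
  grep -oE 'application_[0-9]+_[0-9]+' | head -n 1
}

# Illustrative sample line only. On a real cluster, pipe the listing in instead:
#   yarn application -list | grep yarn-session-flink | extract_app_id
sample='application_1650018331890_0001 yarn-session-flink Apache Flink RUNNING'
app_id=$(printf '%s\n' "$sample" | extract_app_id)

# Print the option to append to "flink run -t yarn-session".
echo "-Dyarn.application.id=${app_id}"
```

Filtering by the name given with -nm (yarn-session-flink here) keeps the lookup from picking up an unrelated application when several are running.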
2. yarn-per-job
Submit command
./flink run \
-m yarn-cluster \
-yjm 1024 \
-ytm 1024 \
-ynm wordcount \
-c org.apache.flink.examples.java.wordcount.WordCount \
examples/batch/WordCount.jar
Parameter explanation
Parameter | Description |
---|---|
-m,--jobmanager | Set to yarn-cluster to use YARN execution mode. It can also give the address of the JobManager to connect to, in place of the one specified in the configuration. Note: the address is only considered when the high-availability configuration is NONE. |
-yjm,--yarnjobManagerMemory | Memory of the container that runs the JobManager, in MB |
-ytm,--yarntaskManagerMemory | Memory of each TaskManager container, in MB |
-ys,--yarnslots | Number of slots per TaskManager |
-ynm,--yarnname | Name of the application on YARN |
-c,--class | Class containing the job's main method in the jar |
-yj,--yarnjar <arg> | Path to the Flink jar file |
-yt,--yarnship <arg> | Ship (transfer) the files in the specified directory (t stands for transfer) |
-yqu,--yarnqueue <arg> | Specify the YARN queue |
-yD <property=value> | Dynamic properties |
-yid,--yarnapplicationId <arg> | Attach to the running YARN session with the given application ID |
-yq,--yarnquery | Show available YARN resources (memory, cores) |
-d,--detached | Run in detached mode (in the background) |
3. Command changes in the new version
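./bin/flink run \
-t yarn-per-job \
-Dyarn.application.name=wordcount \
-Dparallelism.default=3 \
-Djobmanager.memory.process.size=2048mb \
-Dtaskmanager.memory.process.size=2048mb \
-Dtaskmanager.numberOfTaskSlots=2 \
-Denv.java.opts="-Dfile.encoding=UTF-8" \
-Drest.flamegraph.enabled=true \
-c xxxx.MainClass \
xxxx.jar
Parameter explanation
Parameter | Description |
---|---|
-t yarn-per-job | Selects YARN per-job mode; -t is equivalent to -Dexecution.target |
-Dyarn.application.name=wordcount | Custom name of the YARN application |
-Dparallelism.default=3 | Default parallelism used when none is specified; the default value is 1 |
-Djobmanager.memory.process.size=2048mb | Memory of the JobManager process |
-Dtaskmanager.memory.process.size=2048mb | Memory of the TaskManager process |
-Dtaskmanager.numberOfTaskSlots=2 | Number of slots per TaskManager; ideally the same as the number of vCores |
-Denv.java.opts="-Dfile.encoding=UTF-8" | Prevents garbled Chinese characters in the logs |
-Drest.flamegraph.enabled=true | Enables flame graphs, a new feature in Flink 1.13 that defaults to false. It can be turned on in development and test environments; turning it off in production is recommended |
-c xxxx.MainClass | Entry class |
xxxx.jar | Jar containing the job to submit |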
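The correspondence between the older -y options from section 2 and the -D properties used in the command above can be sketched as a small lookup. This is an illustrative subset only, covering the four properties that appear in this section. The function name legacy_to_new is hypothetical, and the pairing of each -y flag with its property is inferred from the matching descriptions rather than taken from an official migration table.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate one legacy -y* flag and its value
# into the corresponding -D<key>=<value> option (illustrative subset).
legacy_to_new() {
  case "$1" in
    -ynm) echo "-Dyarn.application.name=$2" ;;
    -yjm) echo "-Djobmanager.memory.process.size=$2" ;;
    -ytm) echo "-Dtaskmanager.memory.process.size=$2" ;;
    -ys)  echo "-Dtaskmanager.numberOfTaskSlots=$2" ;;
    *)    echo "unmapped flag: $1" >&2; return 1 ;;
  esac
}

# The memory flags in section 2 take plain MB values (1024), while the
# -D properties here carry an explicit unit, so the unit is added by hand.
legacy_to_new -ynm wordcount
legacy_to_new -yjm 1024mb
legacy_to_new -ytm 1024mb
legacy_to_new -ys 2
```

The printed options can then be placed after `-t yarn-per-job` in the same way as in the command above.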