Flink1.11.0 flink on yarn 模式部署详解

Flink发布了新版本1.11.0,增加了很多重要新特性,包括增加了对Hadoop3.0.0以及更高版本Hadoop的支持,不再提供“flink-shaded-hadoop-*” jars,而是通过配置YARN_CONF_DIR或者HADOOP_CONF_DIR和HADOOP_CLASSPATH环境变量完成与yarn集群的对接。
具体步骤如下:

  1. 下载Flink1.11.0安装包
    flink安装包下载地址
  2. 确保安装有Hadoop集群,版本至少Hadoop 2.4.1
  3. 配置环境变量
    增加环境变量如下:
HADOOP_HOME=/opt/module/hadoop-3.1.3
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export HADOOP_CLASSPATH=`hadoop classpath`
  1. 启动Hadoop集群(HDFS和Yarn)
  2. 解压Flink安装包,并进入conf目录,修改flink-conf.yaml文件,修改以下配置,若在提交命令中不特定指明,这些配置将作为默认配置
# The total process memory size for the JobManager.
#
# Note this accounts for all memory usage within the JobManager process, including JVM metaspace and other overhead.

jobmanager.memory.process.size: 1600m


# The total process memory size for the TaskManager.
#
# Note this accounts for all memory usage within the TaskManager process, including JVM metaspace and other overhead.

taskmanager.memory.process.size: 1728m

# To exclude JVM metaspace and overhead, please, use total Flink memory size instead of 'taskmanager.memory.process.size'.
# It is not recommended to set both 'taskmanager.memory.process.size' and Flink memory.
#
# taskmanager.memory.flink.size: 1280m

# The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline.

taskmanager.numberOfTaskSlots: 8

# The parallelism used for programs that did not specify and other parallelism.

parallelism.default: 1

  1. 向yarn集群申请资源
    执行以下命令向yarn集群申请资源
bin/yarn-session -nm test

可以使用以下参数:

	 -D <arg>                        Dynamic properties
     -d,--detached                   Start detached
     -jm,--jobManagerMemory <arg>    Memory for JobManager Container with optional unit (default: MB)
     -nm,--name                      Set a custom name for the application on YARN
     -at,--applicationType           Set a custom application type on YARN
     -q,--query                      Display available YARN resources (memory, cores)
     -qu,--queue <arg>               Specify YARN queue.
     -s,--slots <arg>                Number of slots per TaskManager
     -tm,--taskManagerMemory <arg>   Memory per TaskManager Container with optional unit (default: MB)
     -z,--zookeeperNamespace <arg>   Namespace to create the Zookeeper sub-paths for HA mode
  1. 提交任务

yarn session启动之后会给出一个web UI地址以及一个yarn application ID,用户可以通过web UI或者命令行两种方式提交任务。
在这里插入图片描述
web UI提交任务比较简单,通过命令行提交需要执行以下命令

bin/flink run ./examples/batch/WordCount.jar

可选参数如下:

	 -c,--class <classname>           Class with the program entry point ("main"
                                      method or "getPlan()" method. Only needed
                                      if the JAR file does not specify the class
                                      in its manifest.
     -m,--jobmanager <host:port>      Address of the JobManager to
                                      which to connect. Use this flag to connect
                                      to a different JobManager than the one
                                      specified in the configuration.
     -p,--parallelism <parallelism>   The parallelism with which to run the
                                      program. Optional flag to override the
                                      default value specified in the
                                      configuration

客户端可以自行确定JobManager的地址,也可以通过-m或者–jobmanager 参数指定JobManager的地址,JobManager的地址在yarn session的启动页面中可以找到。

猜你喜欢

转载自blog.csdn.net/shkstart/article/details/107843464
今日推荐