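Note: to use Flink on YARN mode, make sure the Hadoop cluster has started successfully, and run the Flink on YARN scripts on one of the YARN nodes.
If the Hadoop cluster is not running, Flink's bin/yarn-session.sh script fails with the error below.
The script hangs at this point and keeps printing retry logs because it cannot connect to the ResourceManager, which means the Hadoop cluster has not been started. Solution: start the Hadoop cluster.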
2018-03-17 12:30:09,231 INFO org.apache.hadoop.ipc.Client - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
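Before launching yarn-session.sh, a quick connectivity check can show whether the ResourceManager is listening at all. The following is a minimal sketch, not part of Flink or Hadoop: the function name rm_reachable and the variables RM_HOST and RM_PORT are assumptions made for illustration, and 8032 is simply the port that appears in the retry log above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: returns 0 if a TCP connection to host:port
# can be opened within 2 seconds, non-zero otherwise.
rm_reachable() {
  local host="$1" port="$2"
  timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null
}

# RM_HOST and RM_PORT are assumed names; adjust them to your cluster.
if rm_reachable "${RM_HOST:-localhost}" "${RM_PORT:-8032}"; then
  echo "ResourceManager is reachable; yarn-session.sh can be started"
else
  echo "ResourceManager is not reachable; start the Hadoop cluster first"
fi
```

If the check fails, starting the Hadoop cluster as described above should make the retry messages go away.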
./bin/yarn-session.sh -n 1 -jm 1024 -tm 1024 fails to start
Log details:
2019-12-05 19:26:08,071 ERROR org.apache.flink.yarn.cli.FlinkYarnSessionCli - Error while running the Flink Yarn session.
org.apache.flink.client.deployment.ClusterDeploymentException: Couldn't deploy Yarn session cluster
at org.apache.flink.yarn.AbstractYarnClusterDescriptor.deploySessionCluster(AbstractYarnClusterDescriptor.java:385)
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.run(FlinkYarnSessionCli.java:616)
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.lambda$main$3(FlinkYarnSessionCli.java:844)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1754)
at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
at org.apache.flink.yarn.cli.FlinkYarnSessionCli.main(FlinkYarnSessionCli.java:844)
Caused by: org.apache.flink.yarn.AbstractYarnClusterDescriptor$YarnDeploymentException: The YARN application unexpectedly switched to state FAILED during deployment.
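Testing showed that the failure was caused by requesting too much memory. Reducing the allocation, for example to 800, lets the session start normally.
The underlying error is that the virtual memory limit was exceeded. This can happen if you are running on a virtual machine or if your servers are themselves virtualized.
YARN enforces a virtual memory limit, and the container exceeded that limit while running, so the application was reported as failed.
Solution:
In etc/hadoop/yarn-site.xml, set the property that enables the virtual memory check to false, as shown below, and then restart the YARN cluster (./start-yarn.sh):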
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
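To confirm that the setting is actually present before restarting YARN, a small text check can help. This is a rough sketch under stated assumptions: vmem_check_disabled is a hypothetical helper, HADOOP_HOME is assumed to point at the Hadoop installation, and the match is a naive grep that expects the <value> line to follow the <name> line directly, as in the snippet above.

```shell
# Hypothetical helper: succeeds if the given file sets
# yarn.nodemanager.vmem-check-enabled to false.
# Naive text match: <value> must be on the same line as <name> or on the next one.
vmem_check_disabled() {
  grep -A1 '<name>yarn.nodemanager.vmem-check-enabled</name>' "$1" 2>/dev/null \
    | grep -q '<value>false</value>'
}

# HADOOP_HOME is an assumed variable; falls back to the current directory.
SITE="${HADOOP_HOME:-.}/etc/hadoop/yarn-site.xml"
if vmem_check_disabled "$SITE"; then
  echo "vmem check is disabled in $SITE"
  echo "now restart YARN: sbin/stop-yarn.sh, then sbin/start-yarn.sh"
else
  echo "vmem check is not disabled in $SITE"
fi
```

The check is performed by the NodeManagers, so the edited file most likely needs to be present on every NodeManager host before the restart.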
2. yarn-session.sh fails to start
Copy flink-shaded-hadoop-2-uber-2.7.5-7.0.jar into Flink's lib directory.
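The copy step can be wrapped in a small guard so that a wrong path is reported instead of failing silently. This is only a sketch: install_hadoop_jar is a hypothetical helper, and FLINK_HOME is an assumed variable pointing at the Flink installation directory.

```shell
# Hypothetical helper: copy the shaded Hadoop jar into <flink_home>/lib.
install_hadoop_jar() {
  local jar="$1" flink_home="$2"
  if [ ! -f "$jar" ]; then
    echo "jar not found: $jar" >&2
    return 1
  fi
  if [ ! -d "$flink_home/lib" ]; then
    echo "no lib directory under: $flink_home" >&2
    return 1
  fi
  cp "$jar" "$flink_home/lib/"
}

# Usage (FLINK_HOME is an assumed variable):
# install_hadoop_jar flink-shaded-hadoop-2-uber-2.7.5-7.0.jar "$FLINK_HOME"
```

After the jar is in place, run yarn-session.sh again so that the new classpath is picked up.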
Download link: https://pan.baidu.com/s/1D2WLgIjBghoxuLiZJUdYjA
Extraction code: eadd
Starting Flink
Submit the job directly to YARN. Start command:
bin/flink run -m yarn-cluster xx.jar   (the -c option, placed before the jar, specifies the entry class)
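- Alternatively, first allocate a session on YARN (yarn-session.sh) and then submit jobs to it; you can try this yourself.
- To check the result, click the AM (ApplicationMaster) link, which jumps to the Flink web UI.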
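The session-based alternative mentioned in the list above might look roughly like the following. This is a sketch under assumptions, not a definitive recipe: run_in_session is a hypothetical wrapper, FLINK_HOME is an assumed variable, and the memory value defaults to the 1024 used earlier in these notes (which also report that lowering it to 800 helped when the session failed to start). The -d flag starts the session in detached mode.

```shell
# Hypothetical wrapper: start a detached YARN session, then submit a jar to it.
run_in_session() {
  local jar="$1"
  local mem="${2:-1024}"               # memory for JobManager and TaskManager
  local flink_home="${FLINK_HOME:-.}"  # assumed variable, defaults to current dir
  if [ ! -x "$flink_home/bin/yarn-session.sh" ]; then
    echo "yarn-session.sh not found under $flink_home/bin" >&2
    return 1
  fi
  # Start the session; -d detaches the client once the session is up.
  "$flink_home/bin/yarn-session.sh" -n 1 -jm "$mem" -tm "$mem" -d || return 1
  # Without -m yarn-cluster, the client is expected to submit to the running session.
  "$flink_home/bin/flink" run "$jar"
}

# Usage: run_in_session xx.jar
#        run_in_session xx.jar 800
```

Compared with bin/flink run -m yarn-cluster, the session keeps its YARN resources allocated between jobs, so it is worth stopping the session once it is no longer needed.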
Common errors
yarn-cluster cannot be resolved: the jar has not been copied into the lib directory.
Reference: https://ci.apache.org/projects/flink/flink-docs-release-1.8/ops/deployment/yarn_setup.html