1. FlinkX deployment
Follow the official installation guide, but watch out for a few pitfalls.
wget https://github.com/DTStack/flinkx/blob/1.10_release/docs/quickstart.md
2. The FlinkX version must match the Flink version, preferably down to the patch version
FlinkX branch | Flink version |
---|---|
1.8_release | Flink1.8.3 |
1.10_release | Flink1.10.1 |
1.11_release | Flink1.11.3 |
If the versions do not match, submitting in standalone or yarn session mode fails with:
Caused by: java.io.InvalidClassException: org.apache.flink.api.common.operators.ResourceSpec; incompatible types for field cpuCores
Note: the branch shown by default on the FlinkX GitHub page is not the latest version; select the latest branch yourself.
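The branch names in the table above follow a simple pattern: the Flink major.minor version plus `_release`. That makes it possible to check the pairing before submitting a job. A minimal sketch; `expected_flinkx_branch` and `check_flinkx_branch` are hypothetical helper names, not part of FlinkX:

```shell
# Hypothetical helpers (not part of FlinkX): map a Flink version to the
# FlinkX branch name used in the table, e.g. 1.11.3 -> 1.11_release.
expected_flinkx_branch() {
  flink_version="$1"
  # Drop the patch component: 1.11.3 -> 1.11
  echo "${flink_version%.*}_release"
}

# Succeeds (exit status 0) only when the branch matches the Flink version.
check_flinkx_branch() {
  [ "$(expected_flinkx_branch "$1")" = "$2" ]
}

# Usage:
#   check_flinkx_branch 1.11.3 1.11_release && echo "versions match"
```

This only checks the major.minor pairing; matching the patch version as well still has to be done by hand against the table.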
Download the latest prebuilt package
wget https://github.com/DTStack/flinkx/releases/download/1.11.0/flinkx.7z
After downloading, lib is missing flinkx-launcher-1.6.jar, so the project has to be built again. Once the build finishes, copy flinkx-launcher-1.6.jar from flinkx/lib to flink/lib.
./bin/install_jars.sh
mvn clean package -DskipTests
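The copy step described above can be scripted so that it fails loudly when the build did not produce the jar. A sketch, assuming the FlinkX and Flink home directories are passed as arguments; `install_launcher_jar` is a hypothetical helper name:

```shell
# Hypothetical helper: copy the launcher jar from FlinkX into Flink's lib dir.
# $1 = FlinkX home directory, $2 = Flink home directory (both assumptions).
install_launcher_jar() {
  flinkx_home="$1"
  flink_home="$2"
  jar="$flinkx_home/lib/flinkx-launcher-1.6.jar"
  if [ ! -f "$jar" ]; then
    # The jar only exists after a successful build.
    echo "missing $jar; run the build first" >&2
    return 1
  fi
  cp "$jar" "$flink_home/lib/"
}

# Usage:
#   install_launcher_jar ./flinkx "$FLINK_HOME"
```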
3. flinkx on local: once the configuration files are in place, there are no pitfalls here
./bin/flinkx \
-mode local \
-job test/binlog_to_kafka.json \
-pluginRoot syncplugins \
-flinkconf flinkconf
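4. flinkx on yarn
Download the flink shaded package that matches your Hadoop version and put it in $FLINK_HOME/lib (starting with Flink 1.11, prebuilt flink shaded packages are no longer officially provided, so you have to download and package it yourself)
Download the matching version of the Flink Prometheus package and put it in $FLINK_HOME/lib
Edit the Flink configuration file to set Flink's class-loading order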
vi ../conf/flink-conf.yaml
## Flink class-loading order: set to parent-first
classloader.resolve-order: parent-first
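The setting above can also be applied without opening an editor. A sketch, assuming the path to flink-conf.yaml is passed as an argument; `set_parent_first` is a hypothetical helper that rewrites an existing `classloader.resolve-order` entry or appends one if none is present:

```shell
# Hypothetical helper: make sure flink-conf.yaml contains
# "classloader.resolve-order: parent-first" exactly once.
set_parent_first() {
  conf_file="$1"
  key="classloader.resolve-order"
  if grep -q "^${key}:" "$conf_file"; then
    # Rewrite the existing entry; going through a temp file keeps this
    # portable across GNU and BSD sed.
    sed "s/^${key}:.*/${key}: parent-first/" "$conf_file" > "${conf_file}.tmp" &&
      mv "${conf_file}.tmp" "$conf_file"
  else
    # No active entry yet (commented-out lines are ignored): append one.
    echo "${key}: parent-first" >> "$conf_file"
  fi
}

# Usage:
#   set_parent_first "$FLINK_HOME/conf/flink-conf.yaml"
```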
Example script
bin/flinkx \
-mode yarnPer \
-jobid binlog_to_kafka \
-job test/binlog_to_kafka.json \
-pluginRoot syncplugins \
-flinkconf ~/service/flink-1.11.3/conf \
-yarnconf $HADOOP_HOME/etc/hadoop \
-flinkLibJar ~/service/flink-1.11.3/lib \
-confProp "{\"flink.checkpoint.interval\":60000}" \
-queue default
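The `-confProp` value in the script above is a JSON object passed as a single shell argument, which is why the inner double quotes are escaped. A small sketch showing that single quotes yield the same string with less escaping; the variable names are for illustration only:

```shell
# Both assignments produce the identical string
# {"flink.checkpoint.interval":60000}
conf_prop_escaped="{\"flink.checkpoint.interval\":60000}"
conf_prop_single='{"flink.checkpoint.interval":60000}'

if [ "$conf_prop_escaped" = "$conf_prop_single" ]; then
  echo "same value: $conf_prop_single"
fi
```

Either form can then be passed as `-confProp "$conf_prop_single"`; keeping the double quotes around the variable ensures the JSON stays a single argument.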
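After the script runs, check the job's status in the YARN UI, click through to the Flink UI to check the job there, and verify that data is being imported
Troubleshooting:
Problem: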
java.lang.NoSuchMethodError: org.apache.flink.runtime.execution.Environment.getUserCodeClassLoader()Lorg/apache/flink/util/UserCodeClassLoader;
at org.apache.flink.streaming.runtime.tasks.StreamTask.createRecordWriters(StreamTask.java:1235)
at org.apache.flink.streaming.runtime.tasks.StreamTask.createRecordWriterDelegate(StreamTask.java:1218)
at org.apache.flink.streaming.runtime.tasks.StreamTask.<init>(StreamTask.java:304)
at org.apache.flink.streaming.runtime.tasks.StreamTask.<init>(StreamTask.java:285)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.<init>(SourceStreamTask.java:74)
at org.apache.flink.streaming.runtime.tasks.SourceStreamTask.<init>(SourceStreamTask.java:70)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.flink.runtime.taskmanager.Task.loadAndInstantiateInvokable(Task.java:1372)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:699)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:546)
at java.lang.Thread.run(Thread.java:748)
Solution:
The Flink version does not match the FlinkX branch; align the two according to the version table above
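One way to see which Flink version is actually in use is to read it from the `flink-dist` jar name in Flink's lib directory. A sketch, assuming the jar follows the `flink-dist_<scala>-<flink>.jar` naming that appears in the stack traces in this document (for example `flink-dist_2.12-1.12.1.jar`); `flink_dist_version` is a hypothetical helper name:

```shell
# Hypothetical helper: print the Flink version encoded in the flink-dist jar
# name found in the given lib directory.
flink_dist_version() {
  lib_dir="$1"
  for jar in "$lib_dir"/flink-dist_*.jar; do
    [ -f "$jar" ] || continue   # the glob matched nothing
    name="${jar##*/}"           # flink-dist_2.12-1.12.1.jar
    version="${name##*-}"       # 1.12.1.jar
    echo "${version%.jar}"      # 1.12.1
    return 0
  done
  echo "no flink-dist jar found in $lib_dir" >&2
  return 1
}

# Usage:
#   flink_dist_version "$FLINK_HOME/lib"
```

The printed version can then be compared by hand against the FlinkX branch in the version table.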
Problem:
15:36:00.031 [main] ERROR org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Could not start cluster entrypoint YarnJobClusterEntrypoint.
org.apache.flink.runtime.entrypoint.ClusterEntrypointException: Failed to initialize the cluster entrypoint YarnJobClusterEntrypoint.
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:200) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runClusterEntrypoint(ClusterEntrypoint.java:569) [flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.yarn.entrypoint.YarnJobClusterEntrypoint.main(YarnJobClusterEntrypoint.java:99) [flink-dist_2.12-1.12.1.jar:1.12.1]
Caused by: org.apache.flink.util.FlinkException: Could not create the DispatcherResourceManagerComponent.
at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:275) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:234) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:178) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_252]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_252]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) ~[flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
... 2 more
Caused by: java.lang.NullPointerException
at org.apache.flink.util.Preconditions.checkNotNull(Preconditions.java:59) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.yarn.configuration.YarnResourceManagerDriverConfiguration.<init>(YarnResourceManagerDriverConfiguration.java:54) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.yarn.entrypoint.YarnResourceManagerFactory.createResourceManagerDriver(YarnResourceManagerFactory.java:52) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManagerFactory.createResourceManager(ActiveResourceManagerFactory.java:102) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.resourcemanager.ResourceManagerFactory.createResourceManager(ResourceManagerFactory.java:71) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.resourcemanager.active.ActiveResourceManagerFactory.createResourceManager(ActiveResourceManagerFactory.java:63) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.component.DefaultDispatcherResourceManagerComponentFactory.create(DefaultDispatcherResourceManagerComponentFactory.java:177) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.runCluster(ClusterEntrypoint.java:234) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.lambda$startCluster$0(ClusterEntrypoint.java:178) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_252]
at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_252]
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1836) ~[flink-shaded-hadoop-2-uber-2.8.3-9.0.jar:2.8.3-9.0]
at org.apache.flink.runtime.security.contexts.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
at org.apache.flink.runtime.entrypoint.ClusterEntrypoint.startCluster(ClusterEntrypoint.java:175) ~[flink-dist_2.12-1.12.1.jar:1.12.1]
... 2 more
End of LogType:jobmanager.out
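Solution:
The ResourceManager could not be found. The fix was to switch the active node with the commands below (note that `hdfs haadmin` manages NameNode HA state; the ResourceManager counterpart is `yarn rmadmin`):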
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
hdfs haadmin -transitionToStandby --forcemanual nn1
hdfs haadmin -transitionToActive --forcemanual nn2
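Before forcing a transition, it helps to confirm which service ID is currently active. A sketch wrapping the `-getServiceState` call shown above; `active_service_id` is a hypothetical helper name, and `nn1`/`nn2` are the service IDs from the commands above:

```shell
# Hypothetical helper: print the first service ID whose reported HA state
# is "active"; return non-zero if none of the given IDs is active.
active_service_id() {
  for id in "$@"; do
    state="$(hdfs haadmin -getServiceState "$id" 2>/dev/null)"
    if [ "$state" = "active" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1
}

# Usage:
#   active_service_id nn1 nn2
```

If the function returns non-zero, no listed node reported itself as active, which is the situation the forced transition above is meant to resolve.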