Pitfalls encountered while setting up and using a Flink cluster

1. Project overview

Use Flink to test checkpointing of intermediate state and recovery from a checkpoint.

2. Problems encountered during setup

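Setting up a Flink cluster requires configuring the path where intermediate state is stored (this project stores it in HDFS).
The cluster needs the following settings (they are mandatory if intermediate state is to be persisted):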

## Address of the web interface used to access the cluster. The default probably works too.
jobmanager.web.address: 192.168.11.100

## Use the filesystem state backend to store checkpoints
state.backend: filesystem
## Filesystem path for checkpoints. I had not set this myself, which caused many errors.
state.checkpoints.dir: hdfs://192.168.xx.xx:9000/flink/persist
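Because a missing `state.checkpoints.dir` was the source of the errors here, a small pre-flight check on the configuration file may be worth running before starting the cluster. The following is only a sketch: the function name `check_flink_conf` is hypothetical, and the location of `flink-conf.yaml` depends on the installation, so it is passed as an argument.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink): report which of the
# checkpoint-related keys discussed above are missing from a config file.
# Usage: check_flink_conf /path/to/flink-conf.yaml
check_flink_conf() {
  local conf_file="$1"
  local missing=0
  local key pattern
  for key in state.backend state.checkpoints.dir; do
    # Escape the dots so they match literally in the regular expression.
    pattern="${key//./\\.}"
    # A key counts as set only if an uncommented "key: value" line exists.
    if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*:[[:space:]]*[^[:space:]#]" "$conf_file"; then
      echo "missing: ${key}"
      missing=1
    fi
  done
  return "$missing"
}
```

The function prints one `missing: <key>` line per absent setting and returns a non-zero status if any is missing, so it can gate a start-up script.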

(1) Jobs can be submitted through the Flink web UI; alternatively, upload the jar to the cluster and submit the job from the command line:

### Submit the job to the cluster
flink run -c com.testMain /home/myhome100/FlinkTest_Tank-1.0-SNAPSHOT-jar-with-dependencies.jar

(2) Restoring from a checkpoint on HDFS

flink run \
  -s hdfs://192.168.xx.xx:9000/flink/current/kafka2flink/5ea0c67a29b4186d2a900bb9e4dbc1ce/chk-4/a91c3a93-3892-458d-9314-ab4d96133200 \
  -c com.testMain /home/myhome100/FlinkTest_Tank-1.0-SNAPSHOT-jar-with-dependencies.jar
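The restore path above contains a `chk-<N>` component, and a job usually leaves several such directories behind. As a rough sketch, the most recent one could be picked from a list of paths as shown below. The function name `latest_checkpoint` is hypothetical, and it assumes the paths arrive on standard input one per line; how that list is produced (for example from an HDFS directory listing) is left to the reader.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Flink): read paths on stdin, one per
# line, and print the one whose chk-<N> component has the largest N.
# Lines without a chk-<N> component are ignored.
latest_checkpoint() {
  awk '
    match($0, /chk-[0-9]+/) {
      # Skip the 4-character "chk-" prefix and convert the rest to a number.
      n = substr($0, RSTART + 4, RLENGTH - 4) + 0
      if (!found || n > best) { best = n; line = $0; found = 1 }
    }
    END { if (found) print line }
  '
}
```

For example, `printf '%s\n' /flink/job/chk-4 /flink/job/chk-12 /flink/job/chk-9 | latest_checkpoint` prints `/flink/job/chk-12`, since the comparison is numeric rather than lexical. The paths in this example are made up for illustration.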

The pom file of the Maven project I used is:

  <properties>
    <scala.version>2.11.12</scala.version>
    <hadoop.version>2.7.2</hadoop.version>
    <flink.version>1.4.2</flink.version>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>

    <maven.compiler.source>1.8</maven.compiler.source>
    <maven.compiler.target>1.8</maven.compiler.target>
  </properties>

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <version>${scala.version}</version>
    </dependency>

    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-shaded-hadoop2 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-shaded-hadoop2</artifactId>
      <version>1.4.2</version>
    </dependency>

    <!-- Flink dependencies -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-clients_2.11</artifactId>
      <version>${flink.version}</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-scala_2.11</artifactId>
      <version>${flink.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-streaming-scala_2.11</artifactId>
      <version>${flink.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-rabbitmq_2.11</artifactId>
      <version>${flink.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-filesystem_2.11</artifactId>
      <version>1.2.1</version>
    </dependency>
    
    <!-- https://mvnrepository.com/artifact/org.apache.flink/flink-connector-kafka-0.10 -->
    <dependency>
      <groupId>org.apache.flink</groupId>
      <artifactId>flink-connector-kafka-0.10_2.11</artifactId>
      <version>1.4.2</version>
    </dependency>
    <!-- We need protobuf for chill-protobuf -->
    <dependency>
      <groupId>com.google.protobuf</groupId>
      <artifactId>protobuf-java</artifactId>
      <version>2.5.0</version>
    </dependency>
  </dependencies>


Reposted from blog.csdn.net/fct2001140269/article/details/84864151