Flink Examples: State, Checkpoint, Savepoint


Version Info

Product Version
Flink 1.7.2
Java 1.8.0_231
Scala 2.11.12

Maven Dependencies

  • pom.xml dependency section
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-java</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-streaming-java_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    <dependency>
        <groupId>org.apache.flink</groupId>
        <artifactId>flink-clients_2.11</artifactId>
        <version>${flink.version}</version>
    </dependency>
    

State Backend

  • Manages how state and checkpoints are stored
  • Types of state backend
    • MemoryStateBackend
      • state data -> TaskManager memory
      • checkpoint data -> JobManager memory
    • FsStateBackend
      • state data -> TaskManager memory
      • checkpoint data -> remote file system (FileSystem)
    • RocksDBStateBackend
      • all data -> RocksDB -> remote file system (FileSystem)
  • Code
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
    // Configure the state backend
    // MemoryStateBackend
    env.setStateBackend(new MemoryStateBackend());
    
    // FsStateBackend
    // env.setStateBackend(new FsStateBackend("hdfs://192.168.0.100:9000/flink/checkpoints"));
    
    // RocksDBStateBackend requires an extra dependency:
    //<dependency>
    //    <groupId>org.apache.flink</groupId>
    //    <artifactId>flink-statebackend-rocksdb_2.11</artifactId>
    //    <version>${flink.version}</version>
    //</dependency>
    // env.setStateBackend(new RocksDBStateBackend("hdfs://192.168.0.100:9000/flink/checkpoints"));
    

State Examples

  • State categories
    • Operator state (OperatorState)
      • ListState
      • UnionListState
      • BroadcastState
    • Keyed state (KeyedState)
      • ValueState
      • ListState
      • MapState
      • ReducingState & AggregatingState
  • A custom SourceFunction to make the later examples easier to test
    public class CustomSourceFunction extends RichSourceFunction<Tuple2<String, Long>> {
    
        private volatile boolean flag = true; // volatile: cancel() may run on another thread
    
        @Override
        public void run(SourceContext<Tuple2<String, Long>> ctx) throws Exception {
            List<String> data = Arrays.asList("a", "b", "c", "d", "e", "f", "g");
            Random random = new Random();
            while (flag) {
                Thread.sleep(100);
                // pick a random key
                String key = data.get(random.nextInt(data.size()));
                long value = System.currentTimeMillis();
                ctx.collect(Tuple2.of(key, value));
            }
        }
    
        @Override
        public void cancel() {
            flag = false;
        }
    
    }
    
  • ValueState example
    public class ValueStateDemo {
    
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            env.setParallelism(3);
    
            // Custom data source
            CustomSourceFunction sourceFunction = new CustomSourceFunction();
            DataStreamSource<Tuple2<String, Long>> customDS = env.addSource(sourceFunction);
    
            // Processing
            customDS.keyBy(value -> value.f0)
                    .flatMap(flatMapWithState) // stateful function
                    .print();
    
            try {
                env.execute();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    
        // A FlatMapFunction with state
        // The state ensures that, within each key, only records with the latest event time are processed
        private static RichFlatMapFunction<Tuple2<String, Long>, String> flatMapWithState = new RichFlatMapFunction<Tuple2<String, Long>, String>() {
    
            // State recording the latest event time seen for the current key
            private ValueState<Long> timeState;
    
            @Override
            public void open(Configuration parameters) throws Exception {
                // Initialize the state
                timeState = getRuntimeContext().getState(new ValueStateDescriptor<>("maxTime", Long.class));
                // Do not assign a value here: no key is active in open()
                // timeState.update(Long.MIN_VALUE);
            }
    
            @Override
            public void flatMap(Tuple2<String, Long> value, Collector<String> out) throws Exception {
                Long maxTime = timeState.value();
                // A larger timestamp means a newer record
                // maxTime == null guards against the initially-null state
                if (maxTime == null || value.f1 > maxTime) {
                    // Update the time state
                    timeState.update(value.f1);
    
                    // Process the record
                    out.collect(value.f0 + "|" + value.f1);
                } else {
                    // Otherwise skip it, or send a warning/alert
                    System.out.println("---- Warning! ----");
                }
            }
        };
    
    }
    
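  • Stripped of the Flink runtime, the logic of flatMapWithState is just a per-key "largest timestamp wins" check. A minimal plain-Java sketch of that logic, where a HashMap stands in for the keyed ValueState (the class and method names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java model of the keyed ValueState logic above.
// The Map plays the role of Flink's per-key state.
class LatestPerKey {
    private final Map<String, Long> maxTime = new HashMap<>();

    // Returns true if this timestamp is the newest seen for the key.
    // The first lookup returns null, just like an uninitialized ValueState.
    public boolean accept(String key, long timestamp) {
        Long max = maxTime.get(key);
        if (max == null || timestamp > max) {
            maxTime.put(key, timestamp); // corresponds to timeState.update(value.f1)
            return true;
        }
        return false; // stale record: skip it, or raise a warning
    }
}
```

In Flink the state is scoped per key automatically; the Map key lookup mimics that scoping.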
  • ListState example
    ListState<String> myListState = getRuntimeContext().getListState(new ListStateDescriptor<String>("my_liststate", String.class));
    myListState.add("state_1");
    Iterable<String> stateIter = myListState.get();
    for (String state : stateIter) {
        System.out.println("state = " + state);
    }
    
  • MapState example
    MapState<String, Long> myMapState = getRuntimeContext().getMapState(new MapStateDescriptor<String, Long>("my_mapstate", String.class, Long.class));
    myMapState.put("state_key_1", 1L);
    Long value = myMapState.get("state_key_1");
    
  • ReducingState example
    // The ReduceFunction (Math::max here) defines how each value added to the state is folded in
    ReducingStateDescriptor<Long> stateDescriptor = new ReducingStateDescriptor<>("my_reducingstate", Math::max, Long.class);
    ReducingState<Long> reducingState = getRuntimeContext().getReducingState(stateDescriptor);
    reducingState.add(100L);
    Long result = reducingState.get();
    
  • AggregatingState is analogous to ReducingState
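  • A hedged AggregatingState sketch, assuming the same keyed RichFunction context as the snippets above. Unlike ReducingState, the input, accumulator, and result types may all differ; the state name "my_aggregatingstate" and the running-average logic are illustrative, not from the original post:

```java
// Sketch only: add Long values, read back their running average as a Double.
// The accumulator is a (sum, count) pair.
AggregatingStateDescriptor<Long, Tuple2<Long, Long>, Double> stateDescriptor =
        new AggregatingStateDescriptor<>(
                "my_aggregatingstate", // illustrative name
                new AggregateFunction<Long, Tuple2<Long, Long>, Double>() {
                    @Override
                    public Tuple2<Long, Long> createAccumulator() {
                        return Tuple2.of(0L, 0L); // (sum, count)
                    }

                    @Override
                    public Tuple2<Long, Long> add(Long value, Tuple2<Long, Long> acc) {
                        return Tuple2.of(acc.f0 + value, acc.f1 + 1);
                    }

                    @Override
                    public Double getResult(Tuple2<Long, Long> acc) {
                        return acc.f1 == 0 ? 0.0 : (double) acc.f0 / acc.f1;
                    }

                    @Override
                    public Tuple2<Long, Long> merge(Tuple2<Long, Long> a, Tuple2<Long, Long> b) {
                        return Tuple2.of(a.f0 + b.f0, a.f1 + b.f1);
                    }
                },
                TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() {}));
AggregatingState<Long, Double> aggState = getRuntimeContext().getAggregatingState(stateDescriptor);
aggState.add(100L);
Double average = aggState.get();
```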
  • BroadcastState example
    public class BroadcastStateDemo {
    
        public static void main(String[] args) {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    
            // TagsSourceFunction is broadcast and used to dynamically filter the event records
            TagsSourceFunction tagsSourceFunction = new TagsSourceFunction();
            DataStreamSource<String> tagDS = env.addSource(tagsSourceFunction);
            // Broadcast stream
            BroadcastStream<String> myBroadcast = tagDS.broadcast(new MapStateDescriptor<>("my_broadcast", String.class, String.class));
    
            // EventSourceFunction produces the actual event records to process
            EventSourceFunction eventSourceFunction = new EventSourceFunction();
            DataStreamSource<Tuple2<String, Long>> eventDS = env.addSource(eventSourceFunction);
    
            // Connect with the broadcast stream and process
            DataStream<Tuple2<String, Long>> resultDS = eventDS.connect(myBroadcast)
                    .process(new BroadcastProcessFunction<Tuple2<String, Long>, String, Tuple2<String, Long>>() {
                        // Descriptor for the broadcast state
                        private MapStateDescriptor<String, String> stateDescriptor = new MapStateDescriptor<>("my_broadcast", String.class, String.class);
    
                        @Override
                        public void processElement(Tuple2<String, Long> value, ReadOnlyContext ctx, Collector<Tuple2<String, Long>> out) throws Exception {
                            ReadOnlyBroadcastState<String, String> broadcastState = ctx.getBroadcastState(stateDescriptor);
                            // Filter on the broadcast tags
                            if (broadcastState.contains(value.f0)) {
                                out.collect(value);
                            }
                        }
    
                        @Override
                        public void processBroadcastElement(String value, Context ctx, Collector<Tuple2<String, Long>> out) throws Exception {
                            // Called once for each record arriving on the broadcast stream
                            // Update the BroadcastState
                            BroadcastState<String, String> broadcastState = ctx.getBroadcastState(stateDescriptor);
                            broadcastState.put(value, "");
                        }
                    });
    
            resultDS.print();
    
            try {
                env.execute();
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    
        public static class TagsSourceFunction extends RichSourceFunction<String> {
    
            private volatile boolean flag = true;
    
            @Override
            public void run(SourceContext<String> ctx) throws Exception {
                List<String> data = Arrays.asList("b", "e", "g");
                Random random = new Random();
                while (flag) {
                    // pick a random value
                    ctx.collect(data.get(random.nextInt(data.size())));
    
                    Thread.sleep(5000);
                }
            }
    
            @Override
            public void cancel() {
                flag = false;
            }
    
        }
    
        public static class EventSourceFunction extends RichSourceFunction<Tuple2<String, Long>> {
    
            private volatile boolean flag = true;
    
            @Override
            public void run(SourceContext<Tuple2<String, Long>> ctx) throws Exception {
                List<String> data = Arrays.asList("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k");
                Random random = new Random();
                while (flag) {
                    Thread.sleep(100);
    
                    String key = data.get(random.nextInt(data.size()));
                    ctx.collect(Tuple2.of(key, System.currentTimeMillis()));
                }
            }
    
            @Override
            public void cancel() {
                flag = false;
            }
    
        }
    
    }
    
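  • The operator-state variants from the classification above (ListState / UnionListState obtained through CheckpointedFunction) have no example in this post. A minimal buffering-sink sketch, with illustrative names and Flink imports omitted as in the other snippets:

```java
// Sketch of operator state: a sink that buffers elements and snapshots the
// buffer into a ListState on every checkpoint.
public class BufferingSink implements SinkFunction<Tuple2<String, Long>>, CheckpointedFunction {

    private transient ListState<Tuple2<String, Long>> checkpointedState;
    private final List<Tuple2<String, Long>> buffer = new ArrayList<>();
    private final int threshold;

    public BufferingSink(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void invoke(Tuple2<String, Long> value) {
        buffer.add(value);
        if (buffer.size() >= threshold) {
            // flush the buffer to the external system here
            buffer.clear();
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // Copy the in-flight buffer into operator state
        checkpointedState.clear();
        for (Tuple2<String, Long> element : buffer) {
            checkpointedState.add(element);
        }
    }

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        ListStateDescriptor<Tuple2<String, Long>> descriptor = new ListStateDescriptor<>(
                "buffered-elements",
                TypeInformation.of(new TypeHint<Tuple2<String, Long>>() {}));
        // getListState = even-split redistribution on rescale;
        // getUnionListState would give every task the full list instead
        checkpointedState = context.getOperatorStateStore().getListState(descriptor);

        if (context.isRestored()) {
            for (Tuple2<String, Long> element : checkpointedState.get()) {
                buffer.add(element);
            }
        }
    }
}
```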

Checkpoint Examples

  • Lets Flink automatically store and restore application state
  • Enable checkpointing
    // Take a checkpoint every 60 seconds
    env.enableCheckpointing(60 * 1000);
    // env.enableCheckpointing(60 * 1000, CheckpointingMode.EXACTLY_ONCE);
    // env.enableCheckpointing(60 * 1000, CheckpointingMode.AT_LEAST_ONCE);
    
  • Timeout
    // Abort a checkpoint if it is not completed within 1000 ms
    env.getCheckpointConfig().setCheckpointTimeout(1000);
    
  • FailOnCheckpointingErrors
    // Defaults to true: a failed checkpoint fails the whole Flink job
    // Set to false to keep the job running even when a checkpoint fails
    env.getCheckpointConfig().setFailOnCheckpointingErrors(false);
    
  • Maximum number of concurrent checkpoints
    // If the previous checkpoint is slow, it may still be running when the next one
    // is triggered, so several checkpoints can exist at the same time
    env.getCheckpointConfig().setMaxConcurrentCheckpoints(3);
    
  • Minimum pause between checkpoints
    // Once set, MaxConcurrentCheckpoints no longer applies: only one checkpoint can run at a time
    env.getCheckpointConfig().setMinPauseBetweenCheckpoints(1000);
    
  • enableExternalizedCheckpoints
    // By default a checkpoint is only used to recover the job after a failure
    // and is deleted when the job is cancelled manually (it is kept only on failure)
    env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.DELETE_ON_CANCELLATION);
    // Keep the checkpoint even after the job is cancelled manually
    // env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
    
  • Retain multiple checkpoints (edit conf/flink-conf.yaml)
    // Default: only the most recent checkpoint is retained
    state.checkpoints.num-retained: 10
    
  • Restart strategies
    // Retry 3 times in total, with a 1-second delay between attempts
    env.setRestartStrategy(RestartStrategies.fixedDelayRestart(3, 1000));
    // At most 3 failures within a 60-minute interval, with a 10-second delay between restarts
    env.setRestartStrategy(RestartStrategies.failureRateRestart(3, Time.minutes(60), Time.seconds(10)));
    /*
    # Configure in conf/flink-conf.yaml
    # No restart strategy
    restart-strategy: none
    # fixedDelayRestart strategy
    restart-strategy: fixed-delay
    restart-strategy.fixed-delay.attempts: 3
    restart-strategy.fixed-delay.delay: 10 s
    # failureRateRestart strategy
    restart-strategy: failure-rate
    restart-strategy.failure-rate.max-failures-per-interval: 3
    restart-strategy.failure-rate.failure-rate-interval: 3 min
    restart-strategy.failure-rate.delay: 10 s
    */
    
  • Restore a job from a checkpoint
    bin/flink run -s hdfs://skey_01:9000/flink-1.7.2/flink-checkpoints/19dd7c456b5507dc6b65cc836f319dd7/chk-30/_metadata flink-job.jar
    

Savepoint Examples

  • Used to manually store and restore application state
  • Default directory (edit conf/flink-conf.yaml)
    state.savepoints.dir: hdfs://skey_01:9000/flink/savepoint
    
  • Cancel the job and save its state (in Flink 1.7 this is cancel -s; stop -p was added later)
    // If no path is given, the default directory is used
    bin/flink cancel -s hdfs://skey_01:9000/flink/savepoint 19dd7c456b5507dc6b65cc836f319dd7
    
  • Trigger a savepoint for a running job (-d would dispose a savepoint, not trigger one)
    // If no path is given, the default directory is used
    bin/flink savepoint 19dd7c456b5507dc6b65cc836f319dd7 hdfs://skey_01:9000/flink/savepoint
    
  • Restore a job from a savepoint
    bin/flink run -s hdfs://skey_01:9000/flink/savepoint/savepoint-26dad6-4597317cb1b6 flink-job.jar
    
Reposted from blog.csdn.net/alionsss/article/details/104270477