Flink checkpoint mechanism and state management

1 Checkpoint mechanism

1.1 Checkpoints

To make its state fault tolerant, Flink provides a checkpoint mechanism (Checkpoints). With checkpointing enabled, Flink periodically injects checkpoint barriers into the data stream. When an operator receives a barrier, it takes a snapshot of its current state and then forwards the barrier to the downstream operator. Each downstream operator does the same when the barrier reaches it, and so on until the barrier arrives at the final Sink operator. When a failure occurs, Flink can restore every operator to its previous state from the most recent snapshot.

[Figure: checkpoint mechanism]
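
The barrier flow described above can be sketched in plain Java. This is a toy model, not Flink code: the class BarrierSnapshotSketch, the SumOperator type and all method names are hypothetical, and the whole pipeline runs in a single thread so that a barrier trivially separates "before" from "after" records.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of barrier-driven snapshots; illustrative only, not the Flink implementation. */
final class BarrierSnapshotSketch {

    /** A hypothetical operator whose state is a running sum of the records it has seen. */
    static final class SumOperator {
        final String name;
        long sum;

        SumOperator(String name) {
            this.name = name;
        }

        void onRecord(long value) {
            sum += value;
        }
    }

    final List<SumOperator> pipeline = new ArrayList<>();
    // checkpointId -> (operator name -> state captured when the barrier passed)
    final Map<Long, Map<String, Long>> snapshots = new HashMap<>();
    long latestCheckpointId = -1;

    BarrierSnapshotSketch(String... operatorNames) {
        for (String n : operatorNames) {
            pipeline.add(new SumOperator(n));
        }
    }

    /** A record flows through every operator in order, from source to sink. */
    void emitRecord(long value) {
        for (SumOperator op : pipeline) {
            op.onRecord(value);
        }
    }

    /** A barrier flows through the pipeline; each operator snapshots its state as it passes. */
    void emitBarrier(long checkpointId) {
        Map<String, Long> snapshot = new HashMap<>();
        for (SumOperator op : pipeline) {
            snapshot.put(op.name, op.sum);
        }
        // The checkpoint counts as complete once the last operator (the sink) has snapshotted.
        snapshots.put(checkpointId, snapshot);
        latestCheckpointId = checkpointId;
    }

    /** On failure, every operator is reset to the latest completed snapshot, if there is one. */
    void restoreLatest() {
        Map<String, Long> snapshot = snapshots.get(latestCheckpointId);
        if (snapshot == null) {
            return; // no checkpoint has completed yet
        }
        for (SumOperator op : pipeline) {
            op.sum = snapshot.get(op.name);
        }
    }

    public static void main(String[] args) {
        BarrierSnapshotSketch job = new BarrierSnapshotSketch("source", "map", "sink");
        job.emitRecord(1);
        job.emitRecord(2);
        job.emitBarrier(1);   // every operator snapshots sum = 3
        job.emitRecord(10);   // arrives after the barrier, so it is not in the snapshot
        job.restoreLatest();  // simulated failure and recovery
        System.out.println(job.pipeline.get(2).sum); // prints 3
    }
}
```

In a real job the records processed after the barrier would be replayed from the source after recovery; the sketch leaves replay out.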

1.2 Enabling checkpoints

By default, the checkpoint mechanism is turned off and needs to be turned on in the program:

// Enable checkpointing and set the interval between state checkpoints (ms)
env.enableCheckpointing(1000);

// Other optional settings:
// Set the processing semantics
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
// Set the minimum pause between two checkpoints (ms)
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(500);
// Set the timeout for a checkpoint operation (ms)
env.getCheckpointConfig().setCheckpointTimeout(60000);
// Set the maximum number of checkpoints that may run concurrently
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
// Persist checkpoints to external storage
env.getCheckpointConfig().enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
// Whether the job should recover from the latest checkpoint even when a more recent savepoint exists
env.getCheckpointConfig().setPreferCheckpointForRecovery(true);
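
How the checkpoint interval and the minimum pause interact is easiest to see with numbers. The sketch below is a simplified model, not Flink's scheduler: the class CheckpointTimingSketch and its methods are hypothetical, and it assumes that the next checkpoint starts at the next periodic tick unless the minimum pause after the previous checkpoint's completion pushes it later.

```java
/** Simplified timing model for periodic checkpoints; illustrative only. */
final class CheckpointTimingSketch {

    /**
     * Earliest start time (ms) of the next checkpoint under this model:
     * the next periodic tick, but never sooner than minPause after the
     * previous checkpoint finished.
     */
    static long nextStart(long lastStart, long lastEnd, long interval, long minPause) {
        long nextTick = lastStart + interval;
        long earliestAllowed = lastEnd + minPause;
        return Math.max(nextTick, earliestAllowed);
    }

    /** Under this model a checkpoint is abandoned once it has run longer than the timeout. */
    static boolean timedOut(long start, long now, long timeout) {
        return now - start > timeout;
    }

    public static void main(String[] args) {
        // Same values as the configuration above: interval 1000 ms, minimum pause 500 ms.
        System.out.println(nextStart(0, 200, 1000, 500)); // fast checkpoint, the tick wins: 1000
        System.out.println(nextStart(0, 800, 1000, 500)); // slow checkpoint, the pause wins: 1300
        System.out.println(timedOut(0, 61_000, 60_000));  // exceeds the 60 s timeout: true
    }
}
```

The second call shows why the minimum pause matters: a checkpoint that takes 800 ms would otherwise leave only 200 ms of normal processing before the next one begins.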

1.3 Savepoint Mechanism

The savepoint mechanism (Savepoints) is a special form of the checkpoint mechanism. It lets you trigger a checkpoint manually and store the result persistently at a specified path. Its main purpose is to avoid losing state when the Flink cluster is restarted or upgraded.

1.4 RichFunction checkpointing in practice

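import java.util.ArrayList;
import java.util.List;

import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;

public class OperatorWarning implements CheckpointedFunction {

    // Abnormal records buffered in memory (initialized so that restore cannot hit a null list)
    private final List<Tuple2<String, Long>> bufferedData = new ArrayList<>();
    // Operator state that is written to and read from checkpoints
    private transient ListState<Tuple2<String, Long>> checkPointedState;

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // Note that the state is obtained from the OperatorStateStore
        checkPointedState = context.getOperatorStateStore()
                .getListState(new ListStateDescriptor<>("abnormalData",
                        TypeInformation.of(new TypeHint<Tuple2<String, Long>>() {
                        })));
        // After a restart, restore the buffered data from the snapshot
        if (context.isRestored()) {
            for (Tuple2<String, Long> element : checkPointedState.get()) {
                bufferedData.add(element);
            }
        }
    }

    @Override
    public void snapshotState(FunctionSnapshotContext context) throws Exception {
        // When a snapshot is taken, copy the buffered data into checkPointedState
        checkPointedState.clear();
        for (Tuple2<String, Long> element : bufferedData) {
            checkPointedState.add(element);
        }
    }
}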

2 State Management

2.1 Operator state

Operator State: as the name implies, this state is bound to an operator, and the state of one operator cannot be accessed by other operators. The official documentation describes it this way: each operator state is bound to one parallel operator instance. It is therefore more accurate to say that operator state is bound to a parallel instance of an operator: if an operator has a parallelism of 2, it has two corresponding operator states:
[Figure: operator state]

2.2 Keyed state

Keyed State: a special kind of operator state in which the state is partitioned by key, so Flink maintains one state instance for each key. As shown in the figure below, each color represents a different key, and the four keys correspond to four different state instances. Note that keyed state can only be used on a KeyedStream, which is obtained through stream.keyBy(…).
[Figure: keyed state]
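
The idea of one state instance per key can be sketched without Flink. The class KeyedStateSketch below is hypothetical and uses a plain HashMap in place of Flink's keyed state backend; it only illustrates that processing a record touches the state of that record's key and no other.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of keyed state: one independent counter per key. Not Flink code. */
final class KeyedStateSketch {

    // One state slot per key; a record can only see the slot of its own key.
    private final Map<String, Long> countPerKey = new HashMap<>();

    /** Processes one record and returns the updated count for the record's key. */
    long process(String key) {
        long updated = countPerKey.getOrDefault(key, 0L) + 1;
        countPerKey.put(key, updated);
        return updated;
    }

    /** Number of distinct state instances, that is, distinct keys seen so far. */
    int numberOfStateInstances() {
        return countPerKey.size();
    }

    public static void main(String[] args) {
        KeyedStateSketch operator = new KeyedStateSketch();
        operator.process("red");
        operator.process("blue");
        System.out.println(operator.process("red"));           // prints 2
        System.out.println(operator.numberOfStateInstances()); // prints 2
    }
}
```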

2.3 Keyed state programming

Flink provides the following state types for managing and storing keyed state (Keyed State):

  • ValueState: stores a single value. It can be updated with update(T) and read with T value().
  • ListState: stores a list. Elements can be added with add(T) or addAll(List), and the entire list can be obtained with get().
  • ReducingState: stores the result computed by a ReduceFunction. Elements are added with add(T).
  • AggregatingState: stores the result computed by an AggregateFunction. Elements are added with add(IN).
  • FoldingState: deprecated and scheduled for removal in a future version. The official recommendation is to use AggregatingState instead.
  • MapState: maintains the state of the Map type.
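
The following methods belong to a RichFlatMapFunction. They keep abnormal values in a ListState with a time-to-live (TTL) and emit a warning once enough of them have accumulated. The fields abnormalData (a ListState<Long>), threshold and numberOfTimes are assumed to be declared on the enclosing class:

    @Override
    public void open(Configuration parameters) {
        StateTtlConfig ttlConfig = StateTtlConfig
                // Set the time-to-live to 10 seconds
                .newBuilder(Time.seconds(10))
                // Set the TTL update rule: here the TTL is reset to the full 10 seconds on creation and on every write
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                /* An expired value is never visible. The other option, ReturnExpiredIfNotCleanedUp,
                   keeps an expired value visible until it has been physically deleted. */
                .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                .build();
        ListStateDescriptor<Long> descriptor = new ListStateDescriptor<>("abnormalData", Long.class);
        descriptor.enableTimeToLive(ttlConfig);
        abnormalData = getRuntimeContext().getListState(descriptor);
    }

    @Override
    public void flatMap(Tuple2<String, Long> value, Collector<Tuple2<String, List<Long>>> out) throws Exception {
        Long inputValue = value.f1;
        // If the input value reaches the threshold, record it as abnormal data
        if (inputValue >= threshold) {
            abnormalData.add(inputValue);
        }

        ArrayList<Long> list = Lists.newArrayList(abnormalData.get().iterator());
        // If abnormal data has occurred the configured number of times, emit a warning
        if (list.size() >= numberOfTimes) {
            out.collect(Tuple2.of(value.f0 + " exceeded the specified threshold ", list));
            // Clear the state once the warning has been emitted
            abnormalData.clear();
        }
    }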
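
The two TTL settings used above, OnCreateAndWrite and NeverReturnExpired, can be modelled in a few lines of plain Java. This is a sketch under stated assumptions rather than Flink's implementation: TtlValueSketch and its injectable clock are hypothetical, and the sketch simply keeps an expired value in memory without returning it, ignoring physical cleanup.

```java
import java.util.Optional;
import java.util.function.LongSupplier;

/** Toy single-value state with a time-to-live; illustrative only, not Flink code. */
final class TtlValueSketch<T> {

    private final long ttlMillis;
    private final LongSupplier clock; // injectable so that the demo can control time
    private T value;
    private long lastWriteMillis;
    private boolean present;

    TtlValueSketch(long ttlMillis, LongSupplier clock) {
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Creating and overwriting both restart the TTL (the OnCreateAndWrite rule). */
    void update(T newValue) {
        value = newValue;
        lastWriteMillis = clock.getAsLong();
        present = true;
    }

    /**
     * Reads never refresh the TTL, and an expired value is never returned even
     * though it is still held in memory (the NeverReturnExpired rule).
     */
    Optional<T> value() {
        if (!present || clock.getAsLong() - lastWriteMillis >= ttlMillis) {
            return Optional.empty();
        }
        return Optional.ofNullable(value);
    }

    public static void main(String[] args) {
        long[] now = {0L}; // hand-driven clock for the demo
        TtlValueSketch<String> state = new TtlValueSketch<>(10_000L, () -> now[0]);
        state.update("a");
        now[0] = 9_000L;
        System.out.println(state.value().isPresent()); // true: within the 10 s TTL
        state.update("b");                             // the write restarts the TTL at t = 9 s
        now[0] = 18_000L;
        System.out.println(state.value().isPresent()); // true: 9 s after the last write
        now[0] = 19_000L;
        System.out.println(state.value().isPresent()); // false: 10 s after the last write
    }
}
```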

2.4 Operator state programming

Compared with the keyed state, the operator state currently supports only the following three types of storage:

  • ListState: Stores the state of the list type.
  • UnionListState: also stores list-type state. The difference from ListState lies in what happens when the parallelism changes: with ListState, the state of all parallel instances of the operator is gathered and then distributed evenly among the new tasks, whereas with UnionListState the state of all parallel instances is only merged, every new task receives the complete merged list, and the partitioning behavior is defined by the user.
  • BroadcastState: operator state used for broadcasting.

Operator state is obtained from the OperatorStateStore in initializeState:

    @Override
    public void initializeState(FunctionInitializationContext context) throws Exception {
        // Note that the state is obtained from the OperatorStateStore
        checkPointedState = context.getOperatorStateStore()
                .getListState(new ListStateDescriptor<>("abnormalData",
                        TypeInformation.of(new TypeHint<Tuple2<String, Long>>() {
                        })));
        // After a restart, restore the buffered data from the snapshot
        if (context.isRestored()) {
            for (Tuple2<String, Long> element : checkPointedState.get()) {
                bufferedData.add(element);
            }
        }
    }

Note: operator state is bound to a parallel instance of an operator. If the operator has a parallelism of 2, there are two corresponding operator states.
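
The difference between ListState and UnionListState when the parallelism changes can be sketched as two small functions. The class RedistributionSketch is hypothetical and models each parallel instance's state as a plain list. Dealing the elements out round-robin is an assumption of this sketch; it only stands in for "distributed evenly" and is not a claim about the order Flink actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of operator state redistribution on rescaling; not Flink code. */
final class RedistributionSketch {

    /** ListState style: pool all elements, then deal them out evenly to the new tasks. */
    static <T> List<List<T>> evenSplit(List<List<T>> oldTasks, int newParallelism) {
        List<T> all = pool(oldTasks);
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        for (int i = 0; i < all.size(); i++) {
            result.get(i % newParallelism).add(all.get(i));
        }
        return result;
    }

    /** UnionListState style: every new task receives the complete pooled list. */
    static <T> List<List<T>> unionRedistribute(List<List<T>> oldTasks, int newParallelism) {
        List<T> all = pool(oldTasks);
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>(all));
        }
        return result;
    }

    /** Concatenates the state lists of all old parallel instances. */
    private static <T> List<T> pool(List<List<T>> tasks) {
        List<T> all = new ArrayList<>();
        for (List<T> task : tasks) {
            all.addAll(task);
        }
        return all;
    }

    public static void main(String[] args) {
        // Two old parallel instances, each holding two state elements.
        List<List<String>> old = Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c", "d"));
        System.out.println(evenSplit(old, 4));         // [[a], [b], [c], [d]]
        System.out.println(unionRedistribute(old, 2)); // [[a, b, c, d], [a, b, c, d]]
    }
}
```

With union redistribution each task then decides for itself which elements to keep, which is what "defined by the user" refers to.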

3 State backend

3.1 Implementation of state management

[Figure: implementation of state management]

  • MemoryStateBackend: the default backend. It stores state in the JVM heap and is mainly suitable for local development and debugging.

  • FsStateBackend: stores state snapshots in a file system, which can be a local file system or a distributed one such as HDFS. Note that even with FsStateBackend, in-flight data is still kept in the TaskManager's memory; the state snapshot is written to the configured file system only at checkpoint time.

  • RocksDBStateBackend: a third-party state manager built into Flink. It keeps in-flight data in the embedded key-value database RocksDB and persists it to the configured file system at checkpoint time, so a file system for persistent storage must also be configured when RocksDBStateBackend is used. The reason is that RocksDB, as an embedded database, is not safe enough to hold the only copy of the state, yet it reads faster than a purely file-system-based approach and offers more storage space than a purely in-memory one, which makes it a relatively balanced solution.

3.2 Configuration method

  • Configuration in code takes effect only for the current job:
// Configure FsStateBackend
env.setStateBackend(new FsStateBackend("hdfs://namenode:40010/flink/checkpoints"));
// Configure RocksDBStateBackend
env.setStateBackend(new RocksDBStateBackend("hdfs://namenode:40010/flink/checkpoints"));

When configuring RocksDBStateBackend, the following dependency must also be added:
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-statebackend-rocksdb_2.12</artifactId>
    <version>1.12.0</version>
</dependency>
  • Configuration based on the flink-conf.yaml configuration file takes effect for all jobs deployed on the cluster:
state.backend: filesystem
state.checkpoints.dir: hdfs://namenode:40010/flink/checkpoints

Origin blog.csdn.net/wolfjson/article/details/118545872