Flink Big Data Computing Engine: State Management and Fault Tolerance

Original address: Flink Big Data Computing Engine: State Management and Fault Tolerance

Stateful computing

In the Flink architecture, stateful computation is one of Flink's most important features. State refers to the intermediate results produced inside a Flink program during computation: the program stores the results it has generated and makes them available to subsequent Functions or operators, as shown in the figure below:
Figure: Flink stateful computation schematic
State data can be kept in local storage, which may be Flink's heap or off-heap memory, or in a third-party storage medium; for example, Flink already ships with a RocksDB implementation, and users can also implement their own caching system to store state and support more complex computation logic. Stateless computation is different: it does not store the intermediate results produced during computation and does not reuse them in later computations; the program simply processes the current input, emits the result, and then moves on to the next record.
Stateless computation is relatively simple and easy to implement, but it cannot support more complex business scenarios, for example:

  • Users want to implement CEP (complex event processing): to detect events that match a rule, the events seen so far must be stored in state so they can be accessed while waiting for a matching event to trigger;
  • Users want to aggregate by minute/hour/day, for example to obtain the current maximum or mean; this requires state to maintain the intermediate results of the computation, such as the total number of events, the sum, the maximum, and the minimum;
  • Users want to train machine learning models on a stream; state can maintain the parameters of the current model version;
  • Users want to use historical data in their computation; state can buffer the data so that the corresponding historical records can be fetched directly from state.

State Types

Depending on whether the dataset is partitioned by key, Flink divides state into two types: Keyed State and Operator State (Non-Keyed State).

Keyed State

Keyed State is state associated with a key and can only be used in Functions and Operators applied to a KeyedStream. Keyed State can be seen as a special case of Operator State, the difference being that Keyed State is partitioned by key in advance, so that each state corresponds to exactly one combination of Key and Operator. Keyed State is organized into Key Groups, which are mainly used to redistribute Keyed State data automatically when the operator's parallelism changes.

Operator State

Unlike Keyed State, Operator State is bound only to a parallel operator instance and is independent of the keys of the data elements; each operator instance holds part of the state for all the data elements it processes. Operator State also supports automatic redistribution of state data when the operator's parallelism changes.

Both Keyed State and Operator State exist in two forms in Flink. One is managed state (Managed State), in which the state data is controlled and managed by the Flink runtime: the state is converted into in-memory hash tables or RocksDB objects and persisted to Checkpoints through internal interfaces, so that it can be restored when a task fails. The other is raw state (Raw State), in which the data structure is managed by the operator itself: when a Checkpoint is triggered, Flink knows nothing about the internal structure of the state and only stores it in the Checkpoint as bytes; when the task is restored from a Checkpoint, the operator deserializes the bytes back into its own state data structure.

Note: Flink recommends using Managed State to manage state data, mainly because Managed State better supports rebalancing of state data and more sophisticated memory management.

Managed Keyed State

Flink provides the following types of Managed Keyed State; each has its own usage scenario, and users can choose according to their actual needs.

  • ValueState[T]: a single value in state associated with a Key. For example, to count the number of transactions per user_id, the count value in state is updated for every transaction of that user. ValueState is updated with update(T) and read with T value();
  • ListState[T]: a list of state elements associated with a Key; the state stores a List of elements. For example, a ListState can store the IP addresses a user frequently visits. Elements are added with the add(T) and addAll(List[T]) methods, read with the Iterable[T] get() method, and the whole list is replaced with the update(List[T]) method;
  • ReducingState[T]: a single aggregated value associated with a Key, which stores the result computed by a user-specified ReduceFunction; ReducingState therefore needs a ReduceFunction to aggregate the state data. Elements are added with the add(T) method and the aggregated value is read with T get();
  • AggregatingState[IN, OUT]: a single aggregated value associated with a Key that maintains the result computed by a specified AggregateFunction. Compared with ReducingState, the input and output types of AggregatingState do not have to be the same, whereas ReducingState requires consistent input/output types. Like ReducingState, AggregatingState uses the specified AggregateFunction to aggregate the state data. Elements are added with the add(IN) method and the result is read with the OUT get() method;
  • MapState[UK, UV]: keeps a map of key-value pairs. You can put key-value pairs into the state and retrieve an Iterable over all mappings currently stored. Mappings are added with put(UK, UV) or putAll(Map[UK, UV]), and the value associated with a key is read with get(UK). Iterable views over the mappings, keys, and values can be retrieved with entries(), keys(), and values() respectively (a brief MapState sketch follows this list).
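
To complement the ValueState example in the next section, here is a minimal MapState sketch. It assumes a hypothetical stream of (userId, pageId) click tuples keyed by userId and keeps, per user, a map from page to click count; the class name, stream, and field meanings are illustrative and not taken from the original article.

    import org.apache.flink.api.common.functions.RichFlatMapFunction
    import org.apache.flink.api.common.state.{MapState, MapStateDescriptor}
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.util.Collector

    // Hypothetical: count clicks per (user, page) with a MapState of pageId -> count
    class PageClickCounter extends RichFlatMapFunction[(Int, String), (Int, String, Long)] {

      private var pageCounts: MapState[String, java.lang.Long] = _

      override def open(parameters: Configuration): Unit = {
        // the descriptor defines the state name and the key/value types of the map
        val desc = new MapStateDescriptor[String, java.lang.Long](
          "pageCounts", classOf[String], classOf[java.lang.Long])
        pageCounts = getRuntimeContext.getMapState(desc)
      }

      override def flatMap(value: (Int, String), out: Collector[(Int, String, Long)]): Unit = {
        // get(UK) returns null if this page has not been seen for the current key yet
        val updated = Option(pageCounts.get(value._2)).map(_.longValue()).getOrElse(0L) + 1L
        // put(UK, UV) writes the mapping back into state
        pageCounts.put(value._2, updated)
        out.collect((value._1, value._2, updated))
      }
    }

    // usage on a hypothetical click stream keyed by userId:
    // clickStream.keyBy(_._1).flatMap(new PageClickCounter()).print()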

Defining a Stateful Function

Example: a ValueState is defined in a RichFlatMapFunction to keep track of the minimum metric value per key:

    inputStream.keyBy(_._1).flatMap(
      // input type:  (Int, Long)
      // output type: (Int, Long, Long)
      new RichFlatMapFunction[(Int,Long) , (Int,Long,Long)] {
        private var leastValueState:ValueState[Long] = _
        // state descriptor: defines the state name and type
        private var leastValueStateDesc:ValueStateDescriptor[Long] = _
        override def open(parameters: Configuration): Unit = {
          // specify the state name and type
          leastValueStateDesc = new ValueStateDescriptor[Long]("leastValueState" , classOf[Long])
          // obtain the state handle via getRuntimeContext.getState
          leastValueState = getRuntimeContext.getState(leastValueStateDesc)
        }
        override def flatMap(value: (Int, Long), out: Collector[(Int, Long, Long)]): Unit = {
          // read the current minimum from state via value()
          val leastValue: Long = leastValueState.value()

          // if the incoming metric is greater than the stored minimum, emit the element together with the minimum
          if ( leastValue != 0L && value._2 > leastValue){
            out.collect((value._1 , value._2 , leastValue))
          }else{
            // otherwise update the minimum in state
            leastValueState.update(value._2)
            // and emit the incoming metric as the new minimum
            out.collect((value._1 , value._2 , value._2))
          }
        }
      }).print()
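
For completeness, a hypothetical driver for the example above could look like the following; the environment setup and the sample elements are assumptions added here for illustration, not part of the original article.

    import org.apache.flink.streaming.api.scala._

    object LeastValueJob {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment

        // sample keyed input: (key, metric); in practice this would be a real source
        val inputStream: DataStream[(Int, Long)] =
          env.fromElements((2, 21L), (4, 1L), (5, 4L), (2, 3L), (4, 8L))

        // the inputStream.keyBy(_._1).flatMap(...) pipeline from the snippet above goes here

        env.execute("least value per key")
      }
    }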

State Lifecycle

Any type of Keyed State can be given a time-to-live (TTL) so that the state data is cleaned up promptly once the specified time has elapsed. The TTL is configured through a StateTtlConfig, which is then enabled by passing it to the StateDescriptor's enableTimeToLive method. A configuration example for Keyed State is shown below:

    val config: StateTtlConfig = StateTtlConfig
      // set the TTL to 5 seconds
      .newBuilder(Time.seconds(5))
      // refresh the TTL only on create and write operations
      .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
      // never return expired state data
      .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
      .build()
    leastValueStateDesc.enableTimeToLive(config)

In StateTtlConfig, only the expiration time set through the newBuilder() method is required; the other parameters are optional and default values are used otherwise. The setUpdateType method accepts two types:

  1. StateTtlConfig.UpdateType.OnCreateAndWrite: the TTL is refreshed only on creation and write operations;
  2. StateTtlConfig.UpdateType.OnReadAndWrite: the TTL is refreshed on read and write operations.
    Note that, under the configured UpdateType, expired state data is only cleaned up when the state is written to or read; in other words, if a state entry is never accessed again, its cleanup is never triggered, which can cause the state data held by the system to keep growing.

In addition, the state visibility can be configured with the setStateVisibility method, which determines whether expired state data that has not yet been cleaned up is returned:

  1. StateTtlConfig.StateVisibility.NeverReturnExpired: expired state data is never returned (default);
  2. StateTtlConfig.StateVisibility.ReturnExpiredIfNotCleanedUp: expired state data is still returned as long as it has not been cleaned up yet.

Using State with the Scala DataStream API

Going straight to a code snippet:

    inputStream.keyBy(_._1)
      // specify the input type and the state type
      .mapWithState((in:(Int,Long) , count : Option[Long]) =>
        // check whether the count state already holds a value
        count match {
          // emit (key, current count) and accumulate the new metric onto the existing count
          case Some(c) => ((in._1 , c) , Some(c + in._2))
          // if the state is empty, emit 0 and initialize the state with the metric
          case None => ((in._1 , 0L) , Some(in._2))
        }
      )

Managed Operator State

Operator State is non-keyed state and is bound to a parallel operator instance rather than to the keys of the data elements. For example, in the Kafka Connector, each parallel consumer instance corresponds to one or more Kafka partitions and maintains the topic partitions and their offsets as its Operator State. A function can implement either the CheckpointedFunction or the ListCheckpointed<T extends Serializable> interface to work with Managed Operator State.

Operating on Operator State via the CheckpointedFunction Interface

The CheckpointedFunction interface is defined as follows:

@PublicEvolving
@SuppressWarnings("deprecation")
public interface CheckpointedFunction {

    /**
     * This method is called when a snapshot for a checkpoint is requested. This acts as a hook to the function to
     * ensure that all state is exposed by means previously offered through {@link FunctionInitializationContext} when
     * the Function was initialized, or offered now by {@link FunctionSnapshotContext} itself.
     *
     * @param context the context for drawing a snapshot of the operator
     * @throws Exception
     */
    void snapshotState(FunctionSnapshotContext context) throws Exception;

    /**
     * This method is called when the parallel function instance is created during distributed
     * execution. Functions typically set up their state storing data structures in this method.
     *
     * @param context the context for initializing the operator
     * @throws Exception
     */
    void initializeState(FunctionInitializationContext context) throws Exception;
}

Within each operator, Managed Operator State is stored as a List, and the state data of different operators is independent of each other, which makes it well suited to redistribution in list form. Flink currently supports two important redistribution strategies for Managed Operator State: Even-split Redistribution and Union Redistribution (a CheckpointedFunction sketch follows the list below).

  • Even-split Redistribution: each operator instance holds a list containing part of the state elements, and the complete state is the union of all these lists. When a restore/redistribution operation is triggered, the state lists are split evenly into as many sublists as the operator's parallelism; each instance receives one sublist, which may be empty or contain multiple elements.
  • Union Redistribution: each operator instance holds a list containing all of the state elements; when a restore/redistribution operation is triggered, every operator instance obtains the complete list of state elements.
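
As a sketch of the CheckpointedFunction interface shown above, the buffering sink below follows the pattern from Flink's documentation: elements are buffered locally, the buffer is written to an even-split redistributable ListState on each checkpoint, and the buffer is rebuilt from state in initializeState. It assumes a recent Flink version (1.10+) where SinkFunction.Context is non-generic; the class name, threshold parameter, and element type are illustrative. Replacing getListState with getUnionListState would select Union Redistribution instead.

    import org.apache.flink.api.common.state.{ListState, ListStateDescriptor}
    import org.apache.flink.runtime.state.{FunctionInitializationContext, FunctionSnapshotContext}
    import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction
    import org.apache.flink.streaming.api.functions.sink.SinkFunction
    import org.apache.flink.streaming.api.scala._
    import scala.collection.JavaConverters._
    import scala.collection.mutable.ListBuffer

    // Hypothetical sink that buffers elements and keeps the buffer as Operator State
    class BufferingSink(threshold: Int) extends SinkFunction[(Int, Long)] with CheckpointedFunction {

      @transient private var checkpointedState: ListState[(Int, Long)] = _
      private val bufferedElements = ListBuffer[(Int, Long)]()

      override def invoke(value: (Int, Long), context: SinkFunction.Context): Unit = {
        bufferedElements += value
        if (bufferedElements.size >= threshold) {
          // flush to the external system here, then clear the local buffer
          bufferedElements.clear()
        }
      }

      override def snapshotState(context: FunctionSnapshotContext): Unit = {
        // copy the local buffer into the operator state on every checkpoint
        checkpointedState.clear()
        bufferedElements.foreach(e => checkpointedState.add(e))
      }

      override def initializeState(context: FunctionInitializationContext): Unit = {
        val descriptor = new ListStateDescriptor[(Int, Long)](
          "buffered-elements", createTypeInformation[(Int, Long)])
        // getListState -> even-split redistribution; getUnionListState -> union redistribution
        checkpointedState = context.getOperatorStateStore.getListState(descriptor)
        if (context.isRestored) {
          // rebuild the local buffer from the restored state
          checkpointedState.get().asScala.foreach(bufferedElements += _)
        }
      }
    }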

Checkpoints and Savepoints

State Manager

Queryable State


Origin www.cnblogs.com/sun-iot/p/12089562.html