Learning Flink from the source code: Flink state

Foreword

Through a walk-through of the Flink source code, let's enter the world of Flink together.

The following is the main content of this article; the examples below are for reference.

  1. Learning Flink from the source code: Flink state


Flink state

Overview

Apache Flink® — Stateful Computations over Data Streams
https://flink.apache.org/

Why is state needed?

● Fault tolerance
Batch processing: no state needed; if a job fails, simply recompute from the input.
Stream processing: relies on a failover mechanism.
○ Most scenarios are incremental computations: data is processed record by record, and each computation depends on the result of the previous one.
○ When the program fails (machine, network, dirty data), the job restores its state from the latest checkpoint on restart.
● Fault-tolerance mechanism: Flink provides exactly-once fault tolerance. It continuously creates snapshots of the distributed data streams. The snapshots are lightweight and have little impact on performance. State is saved in a configurable location, such as the master (JobManager) node or HDFS. When the program fails (machine, network, software, etc.), the system restarts all operators and resets them to the latest successful checkpoint. The input is reset to the corresponding snapshot position, guaranteeing that no record processed in the restarted parallel data stream was already part of the checkpointed state.
For fault tolerance to work, the data source (message queue or broker) needs to be able to replay the data stream, e.g. the flink-kafka-connector.
Paper: Lightweight Asynchronous Snapshots for Distributed Dataflows (https://arxiv.org/abs/1506.08603) describes the mechanism Flink uses to create snapshots; it is based on the Chandy-Lamport distributed snapshot algorithm.
checkpoints & savepoints
Conceptually: a savepoint is like a backup in a traditional database, while a checkpoint is like a log used for recovery.
● Flink is responsible for redistributing state across parallel instances
● watermark
● barriers
Barriers are inserted into the data stream and flow with the data. A barrier splits the stream into two parts, one belonging to the current snapshot and the other to the next snapshot, and it carries the snapshot ID. Barriers for multiple different snapshots can be in flight at the same time, i.e. several snapshots may be created concurrently.

● element store...
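For reference, a minimal sketch of switching this mechanism on via the public DataStream API (the interval value is arbitrary, and exactly-once is in fact the default mode, set explicitly here only for clarity):

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointingSetup {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // inject checkpoint barriers into the data stream every 10 seconds
        env.enableCheckpointing(10_000);

        // exactly-once is the default mode; set explicitly here for clarity
        env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
    }
}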

What is state?

● Simple understanding:
In stream computing, data is fleeting, yet real scenarios often need earlier "data". The "data" that must be kept around is called state.

After raw data enters the user code, it is emitted downstream. If reading and writing of state is involved in between, that state is stored in the local state backend.
● Detailed explanation:
State refers to the intermediate computation results or metadata attributes of a compute node during stream processing.
For example:
○ Intermediate aggregation results during an aggregation.
○ The offsets of records read while consuming data from Kafka.
○ An operator can contain any form of state, and all of it must be included in the snapshot. State comes in many forms:
■ User-defined: state directly created or modified by a transformation function such as map() or filter(). It can be a simple variable of a Java object inside the transformation function, or key/value state associated with the function.
■ System state: data cached as part of an operator's computation. A typical example is window buffers, into which the system collects the data belonging to a window until the window is evaluated and emitted.
Summary: a snapshot of the internal data (computation data and metadata attributes) of a Flink task.

● State generally refers to the state of a specific task/operator, whereas a checkpoint represents a global snapshot of a Flink job at a particular moment, i.e. the states of all tasks/operators.
The saving mechanism is the StateBackend (state backend). By default, state is kept in the TaskManager's memory, and checkpoints are stored in the JobManager's memory.
Where state and checkpoints are actually stored depends on the StateBackend configuration: the memory-based MemoryStateBackend, the file-system-based FsStateBackend, or RocksDBStateBackend backed by RocksDB.
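For illustration, a minimal sketch of choosing a backend programmatically, using the pre-1.13 StateBackend classes discussed in this article (the HDFS path is a placeholder):

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BackendSetup {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // working state stays on the TaskManager heap;
        // checkpoints are persisted to the configured file system
        env.setStateBackend(new FsStateBackend("hdfs://namenode:8020/flink/checkpoints"));
    }
}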

state definition:

State is defined via a state descriptor; passing a StateTtlConfig object to the state descriptor enables state cleanup:
● define a TTL (time to live) for the state
● state whose time-to-live has elapsed expires and becomes eligible for cleanup

state classification:

● By whether state belongs to a specific key
○ keyed state: state on a KeyedStream
○ operator state: state on an ordinary, non-keyed stream
● By whether state is managed by Flink
○ raw state: managed by the application itself
○ managed state: managed by Flink
● KeyedState
Here the key is the field referenced in GroupBy/PartitionBy of a SQL statement; the key value is the Row byte array composed of the groupby/partitionby fields. Each key has its own state, and the state of one key is invisible to other keys.
● OperatorState
In Flink's built-in source connector implementations, OperatorState is used to record the offset of the source data that has been read.

A comparison of the two:

● Current key: KeyedState always has a currently processed key (current key); OperatorState has no current key.
● On-heap storage: KeyedState has multiple implementations, both on-heap and off-heap (RocksDB); OperatorState has only a single on-heap implementation.
● Manual snapshot/restore: for KeyedState this is implemented by the backend itself and is transparent to the user; for OperatorState, snapshot and restore must be declared manually.
● Data size: KeyedState is generally large; OperatorState is generally small.
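To make the distinction concrete, here is a minimal sketch of both kinds of state (the class names and offset logic are illustrative; ValueState via the RuntimeContext and ListState via CheckpointedFunction are the standard APIs):

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.runtime.state.FunctionInitializationContext;
import org.apache.flink.runtime.state.FunctionSnapshotContext;
import org.apache.flink.streaming.api.checkpoint.CheckpointedFunction;
import org.apache.flink.streaming.api.functions.source.SourceFunction;
import org.apache.flink.util.Collector;

// keyed state: lives per key, managed by the backend, no manual snapshot code
class CountPerKey extends RichFlatMapFunction<Tuple2<String, Long>, Tuple2<String, Long>> {
    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void flatMap(Tuple2<String, Long> in, Collector<Tuple2<String, Long>> out) throws Exception {
        Long current = count.value();            // scoped to the current key
        long updated = current == null ? 1L : current + 1;
        count.update(updated);
        out.collect(Tuple2.of(in.f0, updated));
    }
}

// operator state: snapshot/restore declared manually, as source connectors do for offsets
class OffsetTrackingSource implements SourceFunction<String>, CheckpointedFunction {
    private transient ListState<Long> offsetState;
    private long offset;
    private volatile boolean running = true;

    @Override
    public void snapshotState(FunctionSnapshotContext ctx) throws Exception {
        offsetState.clear();
        offsetState.add(offset);                 // record the current read offset
    }

    @Override
    public void initializeState(FunctionInitializationContext ctx) throws Exception {
        offsetState = ctx.getOperatorStateStore()
                .getListState(new ListStateDescriptor<>("offsets", Long.class));
        for (Long o : offsetState.get()) {       // restore the offset after a failover
            offset = o;
        }
    }

    @Override
    public void run(SourceContext<String> sctx) throws Exception {
        while (running) {
            sctx.collect("record-" + offset++);  // emit and advance the offset
        }
    }

    @Override
    public void cancel() {
        running = false;
    }
}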

An operator takes a snapshot of its state for a barrier after it has received that barrier from all of its input streams and before emitting it to its output streams. At that point, all data before the barrier has already updated the state, and nothing afterwards depends on that data. Since snapshots can be very large, the backing storage system is configurable. The default stores them in the JobManager's memory, but a production system should configure a reliable distributed storage system (such as HDFS). After the state has been stored, the operator acknowledges that its checkpoint is complete and emits the barrier to its output streams.
  The snapshot now contains:
  1. For each parallel data source: the offset in the data stream at the time the snapshot was created
  2. For each operator: a pointer to the state stored as part of the snapshot

state creation (writing):

● Flink turns the user code into tasks (placed in TaskManagers) –> each task contains an AbstractInvokable –> the task's main job is to call AbstractInvokable.invoke() –> this abstract method has 5 implementations.
The streaming implementations all inherit from StreamTask –> the StreamTask abstract class contains the invoke() method (about 150 lines of code) –> invoke() calls run() –> processInput(actionContext) –> inputProcessor.processInput(): this method handles the user input (user data, watermarks, checkpoint data) –> streamOperator.processElement(record): the streamOperator processes the record.
StreamOneInputProcessor—>streamOperator.processElement(record);
● StreamTask:
defines a complete life cycle:

protected abstract void init() throws Exception;
private void run() throws Exception { ... }
protected void cleanup() throws Exception { ... }
protected void cancelTask() throws Exception { ... }

Example: OneInputStreamTask (handles the single-input case).
TaskManager –> starts the Task –> Task implements the Runnable interface, run() –> doRun(): ~320 lines of code that create the invokable object (reflection –> load the class –> get the constructor –> instantiate the object)

private static AbstractInvokable loadAndInstantiateInvokable(
    ClassLoader classLoader, String className, Environment environment) throws Throwable {

    final Class<? extends AbstractInvokable> invokableClass;
    try {
        // load the class for className with the given classloader
        // and cast it to AbstractInvokable
        invokableClass =
            Class.forName(className, true, classLoader).asSubclass(AbstractInvokable.class);
    } catch (Throwable t) {
        throw new Exception("Could not load the task's invokable class.", t);
    }

    Constructor<? extends AbstractInvokable> statelessCtor;

    try {
        // obtain the constructor taking an Environment
        statelessCtor = invokableClass.getConstructor(Environment.class);
    } catch (NoSuchMethodException ee) {
        throw new FlinkException("Task misses proper constructor", ee);
    }

    // instantiate the class
    try {
        //noinspection ConstantConditions  --> cannot happen
        // pass in the environment variable and create the new instance
        return statelessCtor.newInstance(environment);
    } catch (InvocationTargetException e) {
        // directly forward exceptions from the eager initialization
        throw e.getTargetException();
    } catch (Exception e) {
        throw new FlinkException("Could not instantiate the task's invokable class.", e);
    }
}

– invoke()
      |
      +----> Create basic utils (config, etc) and load the chain of operators
      +----> operators.setup()
      +----> task specific init()
      +----> initialize-operator-states(): initializeState();
      +----> open-operators()
      +----> run()
      +----> close-operators()
      +----> dispose-operators()
      +----> common cleanup
      +----> task specific cleanup()

● Flink's StreamTask threading model based on a MailBox
Let's first look at the original motivation for this improvement. In Flink's previous threading model, multiple threads could concurrently access the internal state of a StreamTask, e.g. for event processing and checkpoint triggering, and they all relied on a global lock (the checkpoint lock) for thread safety. The problems with this scheme were:
○ The lock object was passed around across many classes, hurting code readability.
○ When using it, failing to acquire the lock could cause many problems that were hard to diagnose.
○ The lock object was also exposed in the user-facing API (see SourceFunction#getCheckpointLock()).
MailBox design document:
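To illustrate the direction of the MailBox model, here is a toy sketch of a single-threaded mailbox loop (illustrative only, not Flink's actual implementation): event processing is the default action, and other threads may only enqueue work, e.g. a checkpoint trigger, as mail, so no checkpoint lock is needed.

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class MailboxLoop {
    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    private volatile boolean running = true;

    // other threads (e.g. a checkpoint coordinator) only enqueue mail
    public void sendMail(Runnable mail) {
        mailbox.add(mail);
    }

    // the single task thread alternates between mail and regular input
    public void run() {
        while (running) {
            Runnable mail;
            while ((mail = mailbox.poll()) != null) {
                mail.run();               // e.g. trigger a checkpoint
            }
            processNextInputRecord();     // default action: process one event
        }
    }

    private void processNextInputRecord() { /* ... */ }
}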

create state

The parameter passed in is the abstract class StateDescriptor –> it has 5 subclass implementations; you can see that state is not limited to a single type, covering both operatorState and keyState.

The method comes from RichFunction, the base interface of rich functions, which gives access to the RuntimeContext:

public interface RichFunction extends Function {
    void open(Configuration parameters) throws Exception;
    void close() throws Exception;
    RuntimeContext getRuntimeContext();
    IterationRuntimeContext getIterationRuntimeContext();
    void setRuntimeContext(RuntimeContext t);
}

RuntimeContext interface –> abstract implementation class AbstractRuntimeUDFContext –> implementation class StreamingRuntimeContext.
Taking StreamingRuntimeContext as the example (the other implementation classes end up consistent), we cut in at the getState(ValueStateDescriptor stateProperties) method:

public <T> ValueState<T> getState(ValueStateDescriptor<T> stateProperties) {
    // sanity checks (null checks); ultimately this operates on the keyedStateStore field,
    // the local variable just takes another reference to it
    KeyedStateStore keyedStateStore = this.checkPreconditionsAndGetKeyedStateStore(stateProperties);
    // initialize the serializer
    stateProperties.initializeSerializerUnlessSet(this.getExecutionConfig());
    return keyedStateStore.getState(stateProperties);
}

private KeyedStateStore checkPreconditionsAndGetKeyedStateStore(StateDescriptor<?, ?> stateDescriptor) {
    // the stateDescriptor must not be null; it has been passed all the way down to here
    Preconditions.checkNotNull(stateDescriptor, "The state properties must not be null");
    // the class's own keyedStateStore field (global to the instance) must not be null either
    Preconditions.checkNotNull(this.keyedStateStore, "Keyed state can only be used on a 'keyed stream', i.e., after a 'keyBy()' operation.");
    return this.keyedStateStore;
}

Keyed state can only be used on a 'keyed stream', i.e., after a 'keyBy()' operation.
The null-check method is used all over Flink's internals. StreamingRuntimeContext imports it statically, so it can be called directly; read it once:

public static <T> T checkNotNull(@Nullable T reference, @Nullable String errorMessage) {
    if (reference == null) {
        throw new NullPointerException(String.valueOf(errorMessage));
    } else {
        return reference;
    }
}

At this point, note that if the keyedStateStore field of StreamingRuntimeContext itself is null, a NullPointerException is thrown. So how does this field get assigned? Its role is well known: it is tied to how state is stored. Let's study that next.

There is a key point here: one constructor of this class accepts an AbstractStreamOperator<?> operator, and the keyedStateStore field is initialized from operator.getKeyedStateStore(). From this we can already conclude that obtaining state is tied to the operator (Operator, e.g. map, flatMap):

@VisibleForTesting
public StreamingRuntimeContext(AbstractStreamOperator<?> operator, Environment env, Map<String, Accumulator<?, ?>> accumulators) {
    this(env, accumulators, operator.getMetricGroup(), operator.getOperatorID(), operator.getProcessingTimeService(), operator.getKeyedStateStore(), env.getExternalResourceInfoProvider());
}

public StreamingRuntimeContext(Environment env, Map<String, Accumulator<?, ?>> accumulators, MetricGroup operatorMetricGroup, OperatorID operatorID, ProcessingTimeService processingTimeService, @Nullable KeyedStateStore keyedStateStore, ExternalResourceInfoProvider externalResourceInfoProvider) {
    super(((Environment)Preconditions.checkNotNull(env)).getTaskInfo(), env.getUserCodeClassLoader(), env.getExecutionConfig(), accumulators, env.getDistributedCacheEntries(), operatorMetricGroup);
    this.taskEnvironment = env;
    this.streamConfig = new StreamConfig(env.getTaskConfiguration());
    this.operatorUniqueID = ((OperatorID)Preconditions.checkNotNull(operatorID)).toString();
    this.processingTimeService = processingTimeService;
    this.keyedStateStore = keyedStateStore;
    this.externalResourceInfoProvider = externalResourceInfoProvider;
}


keyedStateStore is thus operator.getKeyedStateStore(), where the operator is an AbstractStreamOperator<?>; AbstractStreamOperator assigns it in initializeState():

public final void initializeState(StreamTaskStateInitializer streamTaskStateManager) throws Exception {
    TypeSerializer<?> keySerializer = this.config.getStateKeySerializer(this.getUserCodeClassloader());
    StreamTask<?, ?> containingTask = (StreamTask)Preconditions.checkNotNull(this.getContainingTask());
    CloseableRegistry streamTaskCloseableRegistry = (CloseableRegistry)Preconditions.checkNotNull(containingTask.getCancelables());
    StreamOperatorStateContext context = streamTaskStateManager.streamOperatorStateContext(this.getOperatorID(), this.getClass().getSimpleName(), this.getProcessingTimeService(), this, keySerializer, streamTaskCloseableRegistry, this.metrics, this.config.getManagedMemoryFractionOperatorUseCaseOfSlot(ManagedMemoryUseCase.STATE_BACKEND, this.runtimeContext.getTaskManagerRuntimeInfo().getConfiguration(), this.runtimeContext.getUserCodeClassLoader()), this.isUsingCustomRawKeyedState());
    this.stateHandler = new StreamOperatorStateHandler(context, this.getExecutionConfig(), streamTaskCloseableRegistry);
    this.timeServiceManager = context.internalTimerServiceManager();
    this.stateHandler.initializeOperatorState(this);
    // setKeyedStateStore here is what assigns StreamingRuntimeContext.keyedStateStore its value
    this.runtimeContext.setKeyedStateStore((KeyedStateStore)this.stateHandler.getKeyedStateStore().orElse((Object)null));
}

keyedStateStore –> created by StreamOperatorStateHandler:

public StreamOperatorStateHandler(StreamOperatorStateContext context, ExecutionConfig executionConfig, CloseableRegistry closeableRegistry) {
    this.context = context;
    this.operatorStateBackend = context.operatorStateBackend();
    this.keyedStateBackend = context.keyedStateBackend();
    this.closeableRegistry = closeableRegistry;
    if (this.keyedStateBackend != null) {
        // the keyedStateStore is created here
        this.keyedStateStore = new DefaultKeyedStateStore(this.keyedStateBackend, executionConfig);
    } else {
        this.keyedStateStore = null;
    }
}

Summary: if no keyed state backend exists, the default (null) store is produced; it is empty the first time around.
You may be curious when initializeState is called, and by whom.
It comes from the operator chain. Flink combines multiple qualifying operators into an operator chain (OperatorChain); when scheduling, a task actually executes one OperatorChain. With multiple degrees of parallelism, multiple tasks each execute an OperatorChain.
Abstract parent class AbstractInvokable –> abstract subclass StreamTask –> invoke() –> initializeState(); openAllOperators();

initializeState();

private void initializeState() throws Exception {
    StreamOperator<?>[] allOperators = operatorChain.getAllOperators();
    for (StreamOperator<?> operator : allOperators) {
        if (null != operator) {
            operator.initializeState();
        }
    }
}

openAllOperators();

private void openAllOperators() throws Exception {
    for (StreamOperator<?> operator : operatorChain.getAllOperators()) {
        if (operator != null) {
            operator.open();
        }
    }
}

stateProperties.initializeSerializerUnlessSet(getExecutionConfig());
This call actually lands in StateDescriptor; as seen above, all state descriptors are its subclasses.
The comment on the method says: initialize the serializer, unless it has already been initialized before.

// type information describing the value type; only used when the serializer is created lazily
private TypeInformation<T> typeInfo;

// the serializer for the type; may be eagerly initialized in the constructor,
// or lazily initialized once
public boolean isSerializerInitialized() {
    return serializerAtomicReference.get() != null;
}

public void initializeSerializerUnlessSet(ExecutionConfig executionConfig) {
    // first check whether the serializer has already been created; this class is fairly simple.
    // As seen above, with the default constructor the reference's value field is null,
    // so the first time the code reaches here it is necessarily null
    if (serializerAtomicReference.get() == null) {
        // check that typeInfo is not null (the constructor part is dissected below)
        checkState(typeInfo != null, "no serializer and no type info");
        // try to instantiate and set the serializer; it is created from the type info,
        // so each execution of this branch creates a new serializer
        TypeSerializer<T> serializer = typeInfo.createSerializer(executionConfig);
        // use cas to assure the singleton
        // CAS guarantees a singleton; compareAndSet is the core of the creation
        if (!serializerAtomicReference.compareAndSet(null, serializer)) {
            LOG.debug("Someone else beat us at initializing the serializer.");
        }
    }
}

state clear

Why is state clearing needed?

● State timeliness: state is valid only for a certain period; past a certain point in time it has no application value.
● Controlling the size of Flink state: keeping the ever-growing state size in check.

How is state clearing defined?

Flink 1.6 introduced the State TTL feature. Developers configure an expiration time (time to live), and state is cleared after it times out.
A StateTtlConfig object is passed to the state descriptor to enable the cleanup.
It continuously cleans up historical data in both the RocksDB and heap state backends (FsStateBackend and MemoryStateBackend), achieving continuous cleanup of expired state.
Sample code:

public class StateDemo {
    public static void main(String[] args) throws Exception {
        LocalStreamEnvironment env = StreamExecutionEnvironment.createLocalEnvironment();
        // this can be used in a streaming program like this (assuming we have a StreamExecutionEnvironment env)
        env.fromElements(Tuple2.of(1L, 3L), Tuple2.of(1L, 5L), Tuple2.of(1L, 10L), Tuple2.of(1L, 4L), Tuple2.of(1L, 2L))
                .keyBy(0)
                .flatMap(new MyFlatMapFunction())
                .print();

        // the printed output will be (1,4) and (1,5)
        env.execute();
    }
}

class MyFlatMapFunction extends RichFlatMapFunction<Tuple2<Long, Long>, Tuple2<Long, Long>> {

    private static final long serialVersionUID = 1808329479322205953L;
    /**
     * The ValueState handle. The first field is the count, the second field a running sum.
     */
    private transient ValueState<Tuple2<Long, Long>> sum;

    // state expiration and cleanup
    // Flink's state cleanup is lazy: the state we access may already have expired without
    // its data having been deleted yet. Whether expired state is returned is configurable;
    // either way, once accessed, expired state is cleared immediately. As of 1.8.0, state
    // cleanup is based on processing time only; event time is not yet supported and may
    // come in a later version.

    // Internally, the state TTL feature is implemented by storing an additional timestamp
    // of the last relevant state access next to the actual state value. This adds storage
    // overhead, but lets a Flink program see the expiration status of state when querying
    // data and checkpointing.
    StateTtlConfig ttlConfig =
            StateTtlConfig.newBuilder(Time.days(1)) // the time-to-live value
                    .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                    // state visibility configures whether expired values are returned on read access
//            .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                    .cleanupFullSnapshot() // clean up when taking a snapshot
                    .build();

    @Override
    public void flatMap(Tuple2<Long, Long> input, Collector<Tuple2<Long, Long>> out) throws Exception {
        // access the state value
        Tuple2<Long, Long> currentSum = sum.value();

        // update the count
        currentSum.f0 += 1;

        // add the second field of the input value
        currentSum.f1 += input.f1;

        // update the state
        sum.update(currentSum);

        // if the count reaches 2, emit the average and clear the state
        if (currentSum.f0 >= 2) {
            out.collect(new Tuple2<>(input.f0, currentSum.f1 / currentSum.f0));
            sum.clear();
        }
    }

    @Override
    public void open(Configuration config) {
        ValueStateDescriptor<Tuple2<Long, Long>> descriptor =
                new ValueStateDescriptor<>(
                        "average", // the state name
                        TypeInformation.of(new TypeHint<Tuple2<Long, Long>>() {
                        }), // type information
                        Tuple2.of(0L, 0L)); // default value of the state, if nothing was set

        // set the state's time-to-live
        descriptor.enableTimeToLive(ttlConfig);
        sum = getRuntimeContext().getState(descriptor);
    }
}

Core code:

StateTtlConfig ttlConfig =
        StateTtlConfig.newBuilder(Time.days(1)) // the time-to-live value
                .setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite)
                // state visibility configures whether expired values are returned on read access
                .setStateVisibility(StateTtlConfig.StateVisibility.NeverReturnExpired)
                .setTtlTimeCharacteristic(StateTtlConfig.TtlTimeCharacteristic.ProcessingTime)
                .cleanupFullSnapshot() // clean up when taking a snapshot
                .build();

As of 1.9:

private StateTtlConfig(
        UpdateType updateType,
        StateVisibility stateVisibility,
        TtlTimeCharacteristic ttlTimeCharacteristic,
        Time ttl,
        CleanupStrategies cleanupStrategies) {
    this.updateType = checkNotNull(updateType);
    this.stateVisibility = checkNotNull(stateVisibility);
    this.ttlTimeCharacteristic = checkNotNull(ttlTimeCharacteristic);
    this.ttl = checkNotNull(ttl);
    this.cleanupStrategies = cleanupStrategies;
    checkArgument(ttl.toMilliseconds() > 0, "TTL is expected to be positive.");
}

● .newBuilder()
Specifies the state expiration time. Once set, state is marked expired when its last access timestamp + TTL exceeds the current time.
Implementation:
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/state/ttl/class-use/TtlTimeProvider.html
● .setUpdateType() (see the source code)
setUpdateType(StateTtlConfig.UpdateType.OnCreateAndWrite) specifies when the state timestamp is updated; it is an Enum object.
○ Disabled: the timestamp is never updated;
○ OnCreateAndWrite: the timestamp is updated every time the state is created or written;
○ OnReadAndWrite: in addition to updates on create and write, reads also refresh the state's timestamp.
● .setStateVisibility()
Specifies how state that has expired but not yet been cleaned up is handled; also an Enum object.
○ ReturnExpiredIfNotCleanedUp: even if the timestamp indicates the state has expired, it is returned to the caller as long as it has not actually been cleaned up;
○ NeverReturnExpired: once the state expires it is never returned to the caller; only an empty state is returned, avoiding interference from expired state.
● .setTtlTimeCharacteristic(StateTtlConfig.TtlTimeCharacteristic.ProcessingTime)
TimeCharacteristic vs. TtlTimeCharacteristic: these specify the time mode applicable to the State TTL feature, again an Enum object.
The former has been marked Deprecated; new code should adopt the new TtlTimeCharacteristic parameter.
As of Flink 1.8, only ProcessingTime is supported as the time mode; State TTL support for the EventTime mode is still under development (1.9 likewise supports only ProcessingTime).
Flink time concepts:
○ EventTime: the time the event was created
○ ProcessingTime: the local system time when data flows to each time-based operator; the default
○ IngestionTime: the time data enters Flink
● .cleanupFullSnapshot() (see the source code)
Specifies the cleanup strategy for expired state. There are currently three Enum values:
○ FULL_STATE_SCAN_SNAPSHOT: corresponds to the EmptyCleanupStrategy class. Expired state is not proactively cleaned up; executing a full snapshot (Snapshot/Checkpoint) produces a smaller state file, but the local state is not reduced. Only when the job restarts and restores from the last snapshot is the local state actually shrunk, so memory pressure may remain unresolved.
To address memory pressure, Flink also provides enumeration values for incremental cleanup: Flink can be configured to perform a cleanup operation each time a number of records is read, and you can specify how many invalid records to clean up each time (see the sketch after this list):
○ INCREMENTAL_CLEANUP: for the heap state backends
○ ROCKSDB_COMPACTION_FILTER (obsolete in 1.9): for the RocksDB state backend, where state cleanup is performed by a FlinkCompactionFilter written in C++ and invoked through JNI.
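A minimal sketch of configuring these background cleanup strategies (cleanupIncrementally and cleanupInRocksdbCompactFilter are the corresponding StateTtlConfig builder methods, with signatures as in recent Flink versions; the numeric values are arbitrary):

import org.apache.flink.api.common.state.StateTtlConfig;
import org.apache.flink.api.common.time.Time;

public class TtlCleanupConfigs {
    // incremental cleanup for heap state backends: check 10 entries per state access,
    // and additionally run the cleanup for every processed record
    static final StateTtlConfig HEAP_TTL = StateTtlConfig.newBuilder(Time.days(1))
            .cleanupIncrementally(10, true)
            .build();

    // RocksDB: drop expired entries during background compaction; the current
    // timestamp is re-queried after every 1000 processed entries
    static final StateTtlConfig ROCKSDB_TTL = StateTtlConfig.newBuilder(Time.days(1))
            .cleanupInRocksdbCompactFilter(1000)
            .build();
}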
Frequently Asked Questions:
● Is expired state data still accessible?
Flink's state cleanup is lazy: the state we access may already have expired without its data having been deleted yet. Whether expired state is returned is configurable; either way, once accessed, expired state is cleared immediately.
Internally in Flink, the state TTL feature is implemented by storing an additional timestamp of the last relevant state access along with the actual state value. Although this approach adds some storage overhead, it allows Flink programs to access the expiration status of state when querying data, checkpointing, and restoring.
It is worth noting: as of version 1.8.0, state cleanup applies to processing time only; event time is not yet supported. Users can only define a state TTL in terms of processing time; future versions of Apache Flink plan to support event time.
● How to avoid reading outdated data?
When a state object is accessed in a read operation, Flink checks its timestamp and clears the state if it has expired (depending on the configured state visibility, the expired value may or may not be returned). Because of this lazy deletion, expired state that is never accessed again will occupy storage space forever unless it is garbage collected.
So how can expired state be removed without the application logic explicitly handling it? In general, different background deletion strategies can be configured:
○ Full snapshots automatically remove expired state
○ Incremental cleanup in the heap state backends
○ RocksDB background compaction can filter out expired state
○ Deletion via timers (Timers) — see the sketch below
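For the timer-based approach, a minimal sketch (the class name and the 24-hour horizon are illustrative; KeyedProcessFunction, timer registration, and onTimer are the standard APIs):

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class TimerCleanupFunction extends KeyedProcessFunction<String, String, String> {
    private transient ValueState<String> lastValue;

    @Override
    public void open(Configuration parameters) {
        lastValue = getRuntimeContext().getState(
                new ValueStateDescriptor<>("lastValue", String.class));
    }

    @Override
    public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
        lastValue.update(value);
        // schedule a cleanup 24 hours after this write (processing time)
        ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 24 * 60 * 60 * 1000L);
        out.collect(value);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
        // when the timer fires, drop the state of the current key
        lastValue.clear();
    }
}

Note that every write registers a fresh timer; a production version would also delete the previous timer (deleteProcessingTimeTimer) to avoid accumulating timers per key.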

State storage implementation?

How does Flink save state data? There is an interface StateBackend –> abstract class AbstractStateBackend, with 3 implementations:
● MemoryStateBackend, the memory-based HeapStateBackend
Used for debugging; not recommended for production.
● FsStateBackend, based on a file system such as HDFS
Distributed file persistence; reads and writes still go through memory first, so the OOM problem has to be considered.
● RocksDBStateBackend, based on RocksDB
Local files + remote HDFS persistence.
The configured backend is loaded by StateBackendLoader (e.g. RocksDBStateBackend).

State storage process

Two stages (see the sketch below):

  1. First store the state locally in RocksDB
  2. Then synchronize it asynchronously to remote HDFS
Purpose: this removes the limitations of the HeapStateBackend (memory size, loss on machine failure, etc.) and at the same time reduces the network IO overhead of purely distributed storage.
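A minimal sketch of wiring this up (RocksDBStateBackend comes from the flink-statebackend-rocksdb dependency; the boolean flag enables incremental checkpoints, and the HDFS path is a placeholder):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RocksDbBackendSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // working state is kept in local RocksDB files; snapshots are persisted
        // asynchronously (and, with the flag set, incrementally) to remote HDFS
        env.setStateBackend(new RocksDBStateBackend("hdfs://namenode:8020/flink/checkpoints", true));
    }
}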

Summary

That's all for today.


Origin blog.csdn.net/qq_42859864/article/details/120656364