[Big Data] Detailed Explanation of Flink (2): Core Part II

22. You mentioned State just now, so let’s briefly talk about what State is.

In Flink, state (State) is used to store intermediate results or cached data. Depending on whether a computation needs to keep intermediate results, it is classified as either stateless or stateful.

  • In stream computing, events arrive continuously. If each computation is independent and does not depend on other events in the stream, so that the same input always produces the same output, the computation is stateless.
  • If a computation depends on earlier or later events, it is stateful.

Typical stateful computations include sum and other running accumulations over the data.
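To make the distinction concrete, here is a minimal plain-Java sketch (it does not use the Flink API, and the class and method names are hypothetical). The stateless function depends only on the current event, while the running sum has to remember the result of earlier events.

```java
import java.util.ArrayList;
import java.util.List;

class StatelessVsStateful {
    // Stateless: the output depends only on the current event.
    static int doubleIt(int event) {
        return event * 2;
    }

    // Stateful: a running sum must remember what earlier events produced.
    static class RunningSum {
        private long sum = 0; // the "state" carried from event to event

        long add(int event) {
            sum += event;
            return sum;
        }
    }

    public static void main(String[] args) {
        RunningSum runningSum = new RunningSum();
        List<Long> sums = new ArrayList<>();
        for (int event : new int[] {1, 2, 3}) {
            sums.add(runningSum.add(event));
        }
        System.out.println(doubleIt(3)); // prints 6
        System.out.println(sums);        // prints [1, 3, 6]
    }
}
```

If the job restarts, the stateless function loses nothing, but the running sum is only correct if the value of sum is restored. That is the part Flink's state mechanism takes care of.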

23. What are the states of Flink?

(1) Depending on whether it is managed by the user or by Flink, state is divided into raw state and managed state.

  • Raw State: managed by the user.
  • Managed State: managed by Flink itself.

The differences between the two:

  • State management: Managed State is managed by the Flink runtime, which stores and restores it automatically and optimizes its memory usage. Raw State must be managed and serialized by the user. Flink does not know the structure of the data stored in it; only the user does, and the user must serialize it into a storable form.
  • State data structures: Managed State supports well-known data structures such as Value, List, and Map. Raw State only supports byte arrays, so all state must be converted to binary byte arrays.
  • Recommended usage: Managed State covers most cases. Raw State is used when Managed State is not enough, for example when implementing a custom operator. In production, only Managed State is recommended.
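As a rough illustration of what "serialized by the user" means for Raw State, the following plain-Java sketch converts a user-defined structure to and from a byte array. The Counter type and its fields are hypothetical, and the sketch does not touch Flink's raw state streams; it only shows the kind of work Managed State takes off the user's hands.

```java
import java.nio.ByteBuffer;

class RawStateSketch {
    // Hypothetical user-defined state: a counter plus the last-seen timestamp.
    static final class Counter {
        final long count;
        final long lastSeenMillis;

        Counter(long count, long lastSeenMillis) {
            this.count = count;
            this.lastSeenMillis = lastSeenMillis;
        }
    }

    // Only the user knows the layout: two 8-byte longs, in this order.
    static byte[] serialize(Counter c) {
        return ByteBuffer.allocate(2 * Long.BYTES)
                .putLong(c.count)
                .putLong(c.lastSeenMillis)
                .array();
    }

    // Reading must follow exactly the same layout as writing.
    static Counter deserialize(byte[] raw) {
        ByteBuffer buf = ByteBuffer.wrap(raw);
        long count = buf.getLong();
        long lastSeenMillis = buf.getLong();
        return new Counter(count, lastSeenMillis);
    }
}
```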

(2) Depending on whether it is bound to a key, state is divided into KeyedState and OperatorState.

KeyedState Features

  • It can only be used in operators on a KeyedStream, and the state is bound to a specific key.
  • Each key on the KeyedStream corresponds to one state object. An operator instance that processes multiple keys accesses a separate state for each of them.
  • KeyedState is stored in the StateBackend.
  • It is accessed through the RuntimeContext, which requires implementing the RichFunction interface.
  • It supports multiple data structures: ValueState, ListState, ReducingState, AggregatingState, MapState.
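The "one state object per key" behavior can be modeled in plain Java as a map from key to value plus a current key. This is only a conceptual sketch with hypothetical names, not the Flink ValueState API; in Flink the runtime sets the current key before the user function is called, which is done explicitly here.

```java
import java.util.HashMap;
import java.util.Map;

class KeyedValueStateSketch {
    // One value slot per key, all held by the same operator instance.
    private final Map<String, Long> valuePerKey = new HashMap<>();
    private String currentKey;

    // Modeled explicitly here; in Flink the runtime does this per record.
    void setCurrentKey(String key) {
        this.currentKey = key;
    }

    // Returns the value for the current key only, or null if none exists yet.
    Long value() {
        return valuePerKey.get(currentKey);
    }

    void update(long newValue) {
        valuePerKey.put(currentKey, newValue);
    }

    // Counts events per key using the state-style value()/update() calls.
    static Map<String, Long> countPerKey(String[] keys) {
        KeyedValueStateSketch state = new KeyedValueStateSketch();
        for (String key : keys) {
            state.setCurrentKey(key);
            Long old = state.value();
            state.update(old == null ? 1L : old + 1L);
        }
        return state.valuePerKey;
    }
}
```

The user code inside the loop never mentions the key when reading or writing. That is what "the state is bound to a specific key" means in practice.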

OperatorState Features

  • It can be used in all operators, but each operator instance has only one state.
  • When the parallelism changes, the state can be redistributed in several ways: (1) split evenly, or (2) merged, with every instance receiving the full list.
  • It is accessed by implementing the CheckpointedFunction or ListCheckpointed interface.
  • Currently only the ListState data structure is supported.

For example, fromElements uses the FromElementsFunction class, which keeps operator state of type ListState.
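The two redistribution modes listed above can be sketched in plain Java. This is a simplified model with hypothetical names; the even split here deals elements out round-robin, which is only one possible way to split evenly and is not claimed to match Flink's internal assignment.

```java
import java.util.ArrayList;
import java.util.List;

class OperatorStateRedistribution {
    // Collects the list state of all old instances into one list.
    private static <T> List<T> merge(List<List<T>> oldState) {
        List<T> all = new ArrayList<>();
        for (List<T> part : oldState) {
            all.addAll(part);
        }
        return all;
    }

    // Mode 1: every element ends up on exactly one new instance.
    static <T> List<List<T>> evenSplit(List<List<T>> oldState, int newParallelism) {
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>());
        }
        List<T> all = merge(oldState);
        for (int i = 0; i < all.size(); i++) {
            result.get(i % newParallelism).add(all.get(i));
        }
        return result;
    }

    // Mode 2: every new instance receives the full merged list.
    static <T> List<List<T>> union(List<List<T>> oldState, int newParallelism) {
        List<T> all = merge(oldState);
        List<List<T>> result = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            result.add(new ArrayList<>(all));
        }
        return result;
    }
}
```

With the union mode, each instance has to decide for itself which elements of the full list it keeps.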

24. Do you understand Flink's broadcast state?

In Flink, broadcast state is called BroadcastState and is used in the broadcast state pattern. In this pattern, data from one stream is broadcast to all downstream tasks and stored locally in the operator, and the operator relies on that broadcast data when processing another stream. The following example illustrates the pattern.

The example contains two streams. One is a Kafka model stream carrying a model trained by machine learning or deep learning. The model is broadcast to all downstream rule operators, which cache it in Flink's local memory. The other is a Kafka data stream, which receives the test set. It depends on the model from the model stream, and inference on the test set is performed with that model.

The broadcast state must be of type MapState, and the broadcast state pattern is implemented with a broadcast process function (such as BroadcastProcessFunction), which provides separate methods for handling the broadcast stream and the ordinary data stream.
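A minimal plain-Java model of the pattern may help. The two method names mirror those of Flink's BroadcastProcessFunction, but everything else (the Task class, threshold rules, and helper methods) is hypothetical and the sketch does not use the Flink API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BroadcastStateSketch {
    // One parallel task with its own local copy of the broadcast (map) state.
    static class Task {
        private final Map<String, Integer> thresholds = new HashMap<>();

        // Broadcast side: every task stores the same rule locally.
        void processBroadcastElement(String rule, int threshold) {
            thresholds.put(rule, threshold);
        }

        // Data side: only reads the locally stored rules.
        boolean processElement(String rule, int value) {
            Integer threshold = thresholds.get(rule);
            return threshold != null && value > threshold;
        }
    }

    static List<Task> createTasks(int parallelism) {
        List<Task> tasks = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            tasks.add(new Task());
        }
        return tasks;
    }

    // Broadcasting means delivering the same element to all tasks.
    static void broadcast(List<Task> tasks, String rule, int threshold) {
        for (Task task : tasks) {
            task.processBroadcastElement(rule, threshold);
        }
    }
}
```

Because every task sees every broadcast element, any task can evaluate a data record against the full rule set, whichever partition the record arrives on.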

25. What are the Flink state interfaces?

Using state in Flink involves two kinds of state interfaces:

  • State operation interface: use the state object itself to store, write, and update data.
  • State access interface (state provider): obtain the state object itself from the StateBackend.

1. State operation interface

The state operation interface in Flink serves two types of users: application developers and the Flink framework itself. Flink therefore provides two sets of interfaces.

(1) State interface for developers

The developer-facing State interface only provides basic operations for adding, deleting, and modifying data in the State. Users cannot access the other information the state needs at runtime.

(2) Internal State interface

The internal State interface is used by the Flink framework. It provides more State methods and can be extended flexibly as needed. In addition to access to the data in the State, it exposes internal runtime information, such as the serializer for the State data, the namespace, the namespace serializer, and the interface for merging namespaces. Internal State interfaces are named InternalXxxState.

2. State access interface

Once state exists, how does a developer access it when writing a UDF (User-Defined Function)?

State is saved in the StateBackend, but there are different types of StateBackend. Flink therefore abstracts two state access interfaces, OperatorStateStore and KeyedStateStore, so that users writing a UDF do not need to care which StateBackend is in use.

(1) OperatorStateStore interface principle

OperatorState data is stored in memory as a Map and does not use RocksDBStateBackend or HeapKeyedStateBackend.

(2) KeyedStateStore interface principle

KeyedStateStore data is stored using RocksDBStateBackend or HeapKeyedStateBackend. Creating and obtaining state in KeyedStateStore is delegated to the specific StateBackend, so KeyedStateStore itself acts more like a proxy.

26. How is the Flink state stored?

In Flink, state storage is called StateBackend, which has two capabilities:

  • Provides the ability to access State during the calculation process, and developers can use the interface of StateBackend to read and write data when writing business logic.
  • Ability to persist State to external storage to provide fault tolerance.

Flink provides three ways to store state:

  • Memory type: MemoryStateBackend, suitable for verification and testing, not recommended for production use.
  • File type: FsStateBackend, suitable for long-running jobs and large-scale data.
  • RocksDB: RocksDBStateBackend, suitable for long-running jobs and large-scale data.

The StateBackends mentioned above are user-facing. At runtime, the local state of both MemoryStateBackend and FsStateBackend is stored in the TaskManager's memory, so both are built on HeapKeyedStateBackend. HeapKeyedStateBackend is internal to the Flink engine, and users do not need to be aware of it.

1. Memory StateBackend

With MemoryStateBackend, all state data required at runtime is stored on the TaskManager JVM heap. Keyed (KV) state and window operator state use hash tables to store data, triggers, and so on. When a checkpoint is taken, the state snapshot is saved in the memory of the JobManager process.

MemoryStateBackend can take snapshots asynchronously or synchronously. Asynchronous snapshots are recommended, because they avoid blocking the operators while they process data.

The memory-based StateBackend is not recommended for production. It is suitable for local development, debugging, and testing. Note the following points:

  • State snapshots are stored in the JobManager's memory and are limited by its size.
  • Each state is limited to 5 MB by default, adjustable via the MemoryStateBackend constructor.
  • Each state cannot exceed the Akka frame size.

2. File StateBackend

With FsStateBackend, all state data required at runtime is stored in the TaskManager's memory. When a checkpoint is taken, the state snapshot is saved to the configured file system.

The file system can be distributed or local. Example paths:

  • HDFS path: "hdfs://namenode:40010/flink/checkpoints"
  • Local path: "file:///data/flink/checkpoints"

FsStateBackend is suitable for stateful jobs with large state, long windows, or large key-value state. Note the following points:

  • State data is first stored in TaskManager's memory.
  • State size cannot exceed TM memory.
  • TM writes State data to external storage asynchronously.

MemoryStateBackend and FsStateBackend both depend on HeapKeyedStateBackend, which stores its data in state tables (StateTable).

3. RocksDBStateBackend

RocksDBStateBackend differs from the memory and file types.

RocksDBStateBackend uses the embedded local database RocksDB to store the state of the stream computation on local disk, so it is not limited by the TaskManager's memory size. When a checkpoint is taken, the state data stored in RocksDB is persisted to the configured file system, either in full or incrementally, and a small amount of checkpoint metadata is kept in the JobManager's memory. RocksDB removes the memory limit on state and can persist to a remote file system, which makes it the better fit for production use.

Disadvantage: compared with the memory-based StateBackends, accessing state in RocksDBStateBackend is much more expensive. This can sharply reduce the throughput of the data stream, possibly to as little as 1/10 of the original.
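One contributor to this cost is that a byte-oriented store has to convert keys and values on every access, while a heap store hands back the object it already holds. The plain-Java sketch below models only that difference, under the assumption that conversions are the cost of interest. The store classes and the conversions counter are hypothetical, and disk I/O, which the sketch ignores, adds further cost in a real RocksDB.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class BackendAccessCost {
    interface LongStateStore {
        Long get(String key);
        void put(String key, long value);
    }

    // Heap-style store: objects are kept as they are, no conversion on access.
    static class HeapStore implements LongStateStore {
        private final Map<String, Long> objects = new HashMap<>();

        public Long get(String key) {
            return objects.get(key);
        }

        public void put(String key, long value) {
            objects.put(key, value);
        }
    }

    // Byte-array-style store: every access converts the key and the value.
    static class BytesStore implements LongStateStore {
        private final Map<ByteBuffer, byte[]> bytes = new HashMap<>();
        int conversions = 0; // counts serialization and deserialization steps

        private ByteBuffer encodeKey(String key) {
            conversions++;
            return ByteBuffer.wrap(key.getBytes(StandardCharsets.UTF_8));
        }

        public Long get(String key) {
            byte[] raw = bytes.get(encodeKey(key));
            if (raw == null) {
                return null;
            }
            conversions++;
            return ByteBuffer.wrap(raw).getLong();
        }

        public void put(String key, long value) {
            conversions++;
            byte[] raw = ByteBuffer.allocate(Long.BYTES).putLong(value).array();
            bytes.put(encodeKey(key), raw);
        }
    }
}
```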

Applicable scenarios

  • Best suited for stateful jobs with large state, long windows, or large key-value state.
  • RocksDBStateBackend is well suited to high-availability scenarios.
  • RocksDBStateBackend is currently the only backend that supports incremental checkpoints, which are ideal for very large state.

Notes

  • The total state size is limited by disk size, not by memory.
  • RocksDBStateBackend also requires an external file system to be configured, where the state is saved centrally.
  • RocksDB's JNI API is based on byte arrays, so a single key or a single value cannot exceed 2^31 bytes.
  • For applications that use state with merge operations, such as ListState, a value may grow over time to more than 2^31 bytes, which causes subsequent reads to fail.

27. How is the Flink state persisted?

Flink state must eventually be persisted to third-party storage so that it can be recovered after a cluster failure or a job crash. RocksDBStateBackend has two persistence strategies:

  • Full persistence strategy: RocksFullSnapshotStrategy
  • Incremental persistence strategy: RocksIncrementalSnapshotStrategy

1. Full persistence strategy

The full state is written to the state store (HDFS) on every checkpoint. The memory, file, and RocksDB StateBackends all support the full persistence strategy.

The persistence strategy runs asynchronously: each operator starts one independent thread that writes its own state to reliable distributed storage. The state may keep changing while it is being persisted. The memory-based state backends use CopyOnWriteStateTable to ensure thread safety, while RocksDBStateBackend relies on RocksDB's snapshot mechanism.
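The copy-on-write idea can be sketched in plain Java as follows. This is a deliberately coarse model with hypothetical names: it copies the whole table on the first write after a snapshot, which is an assumption made for brevity and is not how CopyOnWriteStateTable is implemented internally.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class CopyOnWriteSnapshot {
    private Map<String, Long> table = new HashMap<>();
    private boolean sharedWithSnapshot = false;

    // Writers copy the table first if a snapshot still refers to it.
    synchronized void put(String key, long value) {
        if (sharedWithSnapshot) {
            table = new HashMap<>(table);
            sharedWithSnapshot = false;
        }
        table.put(key, value);
    }

    synchronized Long get(String key) {
        return table.get(key);
    }

    // Taking a snapshot is cheap: it hands out a read-only view of the
    // current table, which later writes will no longer touch.
    synchronized Map<String, Long> snapshot() {
        sharedWithSnapshot = true;
        return Collections.unmodifiableMap(table);
    }
}
```

A background thread can then write the returned snapshot to storage at its own pace, while the operator keeps updating the live table.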

2. Incremental persistence strategy

Incremental persistence writes only the state changes made since the previous checkpoint. Only RocksDBStateBackend supports it.

Flink's incremental checkpoints are based on RocksDB, a key-value store built on an LSM tree. New data is kept in memory in a memtable; if keys are the same, later data overwrites earlier data. Once the memtable is full, RocksDB compresses the data and writes it to disk. After a memtable is persisted to disk, it becomes an immutable sstable.
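The memtable and sstable behavior described above can be modeled with a small plain-Java sketch. It is a toy under stated assumptions (a hypothetical LsmSketch class, an entry-count limit instead of a byte-size limit, no compression or compaction) and does not use the RocksDB API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LsmSketch {
    private final int memtableLimit;
    private TreeMap<String, String> memtable = new TreeMap<>();
    // Flushed tables, oldest first; each one is never modified again.
    private final List<Map<String, String>> sstables = new ArrayList<>();

    LsmSketch(int memtableLimit) {
        this.memtableLimit = memtableLimit;
    }

    void put(String key, String value) {
        memtable.put(key, value); // same key: the later value overwrites
        if (memtable.size() >= memtableLimit) {
            flush();
        }
    }

    // Turns the current memtable into an immutable sstable.
    void flush() {
        if (memtable.isEmpty()) {
            return;
        }
        sstables.add(Collections.unmodifiableMap(memtable));
        memtable = new TreeMap<>();
    }

    // Reads check the memtable first, then sstables from newest to oldest.
    String get(String key) {
        if (memtable.containsKey(key)) {
            return memtable.get(key);
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            String value = sstables.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    int sstableCount() {
        return sstables.size();
    }
}
```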

Because sstables are immutable, Flink can work out what has changed in the state by comparing the RocksDB sstable files created and deleted since the previous checkpoint.

To make sure all data is in immutable sstables, Flink triggers a flush in RocksDB, forcing the memtable to disk. When Flink performs a checkpoint, it persists the new sstables to HDFS and keeps references to them. Flink does not persist every local sstable, because some of the older local sstables were already persisted by a previous checkpoint; for those it only needs to increase the reference count of the sstable file.

RocksDB merges sstables in the background and removes duplicate data. It then deletes the original sstables and replaces them with the newly merged sstable, which contains the information from the deleted ones. Merging historical sstables into a new one and deleting the old ones reduces the number of historical checkpoint files and avoids producing a large number of small files.
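The bookkeeping behind this, uploading only new sstable files and reference-counting the shared ones, can be sketched in plain Java. The class, method names, and file names are hypothetical; the sketch tracks file names only and is not Flink's actual shared state registry.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class IncrementalCheckpointSketch {
    private final Set<String> uploaded = new HashSet<>();          // files in remote storage
    private final Map<String, Integer> refCounts = new HashMap<>(); // checkpoints using each file

    // Registers a checkpoint and returns only the files that still need uploading.
    Set<String> checkpoint(Set<String> localSstables) {
        Set<String> toUpload = new TreeSet<>();
        for (String file : localSstables) {
            if (uploaded.add(file)) {
                toUpload.add(file); // not seen by any earlier checkpoint
            }
            refCounts.merge(file, 1, Integer::sum);
        }
        return toUpload;
    }

    // Discards an old checkpoint and returns the files that can now be deleted.
    Set<String> release(Set<String> checkpointFiles) {
        Set<String> deleted = new TreeSet<>();
        for (String file : checkpointFiles) {
            int remaining = refCounts.getOrDefault(file, 0) - 1;
            if (remaining > 0) {
                refCounts.put(file, remaining);
            } else {
                refCounts.remove(file);
                if (uploaded.remove(file)) {
                    deleted.add(file);
                }
            }
        }
        return deleted;
    }

    int refCount(String file) {
        return refCounts.getOrDefault(file, 0);
    }
}
```

A file shared by two checkpoints survives the release of the first one and is deleted only when no retained checkpoint refers to it.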

28. How to clean up the Flink state after it expires?

1. State expiration in DataStream

In the DataStream API, a cleanup policy can be set for each state through StateTtlConfig. The configurable items are:

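  • Expiration time: state that has not been accessed for longer than this time is treated as expired, similar to a cache.
  • Expiration time update strategy: refresh on create and write (OnCreateAndWrite), or refresh on read and write (OnReadAndWrite).
  • State visibility: expired state is still returned if it has not been cleaned up yet (ReturnExpiredIfNotCleanedUp), or is never returned once it has timed out (NeverReturnExpired).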
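How the three settings interact can be shown with a small plain-Java model. It is a sketch with hypothetical names that takes the current time as an explicit argument and does not use StateTtlConfig itself. One simplifying assumption is that a read refreshes the timestamp only while the value is still unexpired.

```java
class TtlValueSketch {
    enum UpdateType { ON_CREATE_AND_WRITE, ON_READ_AND_WRITE }
    enum Visibility { NEVER_RETURN_EXPIRED, RETURN_EXPIRED_IF_NOT_CLEANED_UP }

    private final long ttlMillis;
    private final UpdateType updateType;
    private final Visibility visibility;
    private String value;
    private long lastAccessMillis;

    TtlValueSketch(long ttlMillis, UpdateType updateType, Visibility visibility) {
        this.ttlMillis = ttlMillis;
        this.updateType = updateType;
        this.visibility = visibility;
    }

    private boolean isExpired(long nowMillis) {
        return nowMillis - lastAccessMillis >= ttlMillis;
    }

    // Writes always refresh the timestamp.
    void update(String newValue, long nowMillis) {
        value = newValue;
        lastAccessMillis = nowMillis;
    }

    String get(long nowMillis) {
        if (value == null) {
            return null;
        }
        boolean expired = isExpired(nowMillis);
        if (expired && visibility == Visibility.NEVER_RETURN_EXPIRED) {
            return null; // hidden even though it has not been cleaned up yet
        }
        if (!expired && updateType == UpdateType.ON_READ_AND_WRITE) {
            lastAccessMillis = nowMillis; // reads also extend the lifetime
        }
        return value;
    }

    // Physically removes the value once it has expired.
    void cleanUp(long nowMillis) {
        if (value != null && isExpired(nowMillis)) {
            value = null;
        }
    }
}
```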

2. State expiration in Flink SQL

Flink SQL generally uses state in streaming join and aggregation scenarios. If state is not cleaned up regularly, it keeps growing and can cause out-of-memory errors. The cleanup policy is configured as follows:

StreamQueryConfig qConfig = ...
// Set the idle state retention time: min = 12 hours, max = 24 hours
qConfig.withIdleStateRetentionTime(Time.hours(12), Time.hours(24));

Origin blog.csdn.net/be_racle/article/details/132174494