Big Data Theory No. 1: Data Transmission in Flink

Chapter 1 Questions

While a Flink job runs, TaskManagers continuously exchange data. A TaskManager process executes tasks in units of slots, one TaskManager can hold several slots, and the tasks in different slots may be connected to each other and transmit data. In stream computing this exchange has to be very timely. However, if every record were transmitted the moment it was produced, the network cost would be very high, and once the network is saturated the effect reverses: timeliness actually gets worse. So how does Flink solve this problem?

 

Chapter 2 Buffer

When data is transferred between slots, records are not sent one by one. They are first collected in a buffer, and once a certain condition is met, the buffered data is sent as a batch. (Kafka has a similar mechanism.)
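To make the idea concrete, here is a minimal sketch of such a buffer in plain Python. It is not Flink's implementation: the class name `BatchingBuffer`, the two flush conditions (buffer full, or oldest record waited too long), and the numbers are all assumptions chosen for illustration. Time is passed in explicitly so the behaviour is easy to follow.

```python
class BatchingBuffer:
    """Hypothetical buffer that hands records to `send` in batches."""

    def __init__(self, send, capacity=4, timeout_ms=100):
        self.send = send              # callable that receives a list of records
        self.capacity = capacity      # flush once this many records are buffered
        self.timeout_ms = timeout_ms  # flush once the oldest record is this old
        self.records = []
        self.first_arrival_ms = None

    def add(self, record, now_ms):
        if not self.records:
            self.first_arrival_ms = now_ms
        self.records.append(record)
        if len(self.records) >= self.capacity:
            self.flush()

    def tick(self, now_ms):
        """Called periodically: flush a partly filled buffer that waited too long."""
        if self.records and now_ms - self.first_arrival_ms >= self.timeout_ms:
            self.flush()

    def flush(self):
        if self.records:
            self.send(self.records)
            self.records = []
            self.first_arrival_ms = None


sent_batches = []
buf = BatchingBuffer(sent_batches.append, capacity=3, timeout_ms=100)
for t, rec in enumerate(["a", "b", "c", "d"]):
    buf.add(rec, now_ms=t)   # "a", "b", "c" fill the buffer and go out together
buf.tick(now_ms=50)          # "d" has waited 47 ms: still buffered
buf.tick(now_ms=150)         # "d" has waited 147 ms: sent on timeout
```

After this run `sent_batches` holds two batches, one sent because the buffer was full and one sent because of the timeout. The timeout is what keeps a slow trickle of records from sitting in the buffer indefinitely.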

If the slots are in different TaskManagers, the data also has to travel over the network. The tasks on one TaskManager share its network connection to another TaskManager, so each TaskManager must provide enough buffers to serve all of its connections.

Flink's default network buffer configuration copes with small and medium traffic. For high-traffic scenarios, users still need to tune the memory and network buffer settings themselves. The relevant options are described in the official documentation, so I will not go into detail here.

With buffers in place, a new question arises: the more data is buffered, the later it arrives. How does a real-time computing framework like Flink deal with this?

 

Chapter 3 Credit-based flow control

Flink implements a credit-based flow control mechanism to reduce latency. The working principle is roughly as follows:

  • Along with the data it sends, the sender reports its backlog, that is, how much data is still waiting in its own buffers.
  • After receiving data from a sender, the receiver computes a credit value based on the reported backlog and its own buffer space, and returns that credit to the sender.
  • The sender then transmits as much buffered data as the credit allows.

In this way, every sender is granted a credit. A sender with a large backlog is granted enough credit to send as much data as possible and reduce that backlog, so the mechanism allocates network resources well when data skew occurs.
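The exchange of backlog and credit can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the function names `grant_credits` and `ship`, the sender names, and the policy of handing each free buffer to the sender with the most data still waiting are all made up for illustration. Flink's actual credit assignment is more involved.

```python
def grant_credits(backlogs, free_buffers):
    """Receiver side (assumed policy): hand out free buffers one at a time,
    always to the sender with the most data still waiting for a credit."""
    credits = {sender: 0 for sender in backlogs}
    for _ in range(free_buffers):
        neediest = max(backlogs, key=lambda s: backlogs[s] - credits[s])
        if backlogs[neediest] - credits[neediest] == 0:
            break  # nobody has anything left to send
        credits[neediest] += 1
    return credits


def ship(backlogs, credits):
    """Sender side: send at most as many buffers as the granted credit allows.
    Returns the backlog that remains on each sender afterwards."""
    return {sender: backlogs[sender] - min(backlogs[sender], credits[sender])
            for sender in backlogs}


# Skewed load: sender-2 has far more data waiting than the others.
backlogs = {"sender-1": 1, "sender-2": 8, "sender-3": 2}
credits = grant_credits(backlogs, free_buffers=6)
backlogs_after = ship(backlogs, credits)
```

With six free buffers on the receiver, all six credits go to `sender-2`, and its backlog drops from 8 to 2, level with the others. No sender ever ships more than the receiver has room for, which is the point of the scheme.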

 

Chapter 4 Task Chaining

In addition, Flink has a task chaining mechanism that reduces the cost of local communication. Chaining only applies when two conditions are met:

  • Condition 1: the operators have the same parallelism;
  • Condition 2: the operators are connected and exchange data through local forward channels.
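The two conditions can be written down as a small check. This is a sketch only: the `Operator` class, the `can_chain` function, and the connection labels "forward" and "hash" are hypothetical names for illustration, and a real Flink job is subject to further rules that are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    parallelism: int


def can_chain(upstream, downstream, connection):
    """Return True only if both conditions from the list above hold."""
    same_parallelism = upstream.parallelism == downstream.parallelism
    local_forward = connection == "forward"
    return same_parallelism and local_forward


source = Operator("source", parallelism=4)
mapper = Operator("map", parallelism=4)
reducer = Operator("keyed-reduce", parallelism=4)
sink = Operator("sink", parallelism=1)

source_to_map = can_chain(source, mapper, "forward")    # both conditions hold
map_to_reduce = can_chain(mapper, reducer, "hash")      # data is redistributed
reduce_to_sink = can_chain(reducer, sink, "forward")    # parallelism differs
```

Only the first pair qualifies. The second fails because the data is redistributed rather than forwarded locally, and the third fails because the parallelism differs.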

The functions of operators that meet these conditions are "fused" into a single task and executed in the same thread. Each function hands its data to the next through a simple method call, so the transfer involves essentially no serialization or network communication overhead.
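The difference between fused and separate execution can be illustrated with ordinary Python functions. This is an analogy, not Flink code: `parse`, `double`, `run_chained`, and `run_unchained` are hypothetical, and `pickle` merely stands in for the serialization that a hand-over between separate tasks would require.

```python
import pickle


def parse(line):
    return int(line)


def double(n):
    return n * 2


def run_chained(functions, record):
    """Fused operators: each result is passed on by a plain function call."""
    for fn in functions:
        record = fn(record)
    return record


def run_unchained(functions, record):
    """Separate tasks: every operator's output is serialized and deserialized
    before the next step sees it."""
    serializations = 0
    for fn in functions:
        wire_bytes = pickle.dumps(fn(record))
        record = pickle.loads(wire_bytes)
        serializations += 1
    return record, serializations


chained_result = run_chained([parse, double], "21")
unchained_result, serializations = run_unchained([parse, double], "21")
```

Both variants compute the same value, but the unchained one pays for a serialization round trip after every operator, while the chained one pays for none.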

Note: Flink enables task chaining by default. If a job should not use it, you have to disable it yourself, for example by calling disableOperatorChaining() on the StreamExecutionEnvironment.

 

In summary, Flink uses these mechanisms to balance timeliness against resource utilization in stream computing, and thereby achieve high throughput and efficiency.

Origin blog.csdn.net/dzh284616172/article/details/108721316