What is Flink?

Flink is an open source big data framework and distributed processing engine, developed under the Apache Software Foundation. It performs stateful computations over both unbounded data streams (the stream has a starting point but no end point) and bounded data streams (the stream has both a starting point and an end point).
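
To make the idea of a stateful computation over bounded and unbounded streams concrete, here is a minimal plain-Java sketch. It does not use any Flink API; the class and method names (RunningCount, process, consume) are hypothetical and chosen only for illustration. The point is that the same per-key state is updated record by record, whether or not the source ever ends.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.stream.Stream;

// Illustration only: plain Java, not Flink. All names here are hypothetical.
class RunningCount {
    // Per-key state that survives across records.
    private final Map<String, Long> counts = new HashMap<>();

    // Process one record and return the updated count for its key.
    long process(String key) {
        return counts.merge(key, 1L, Long::sum);
    }

    // Current count for a key (0 if the key was never seen).
    long get(String key) {
        return counts.getOrDefault(key, 0L);
    }

    // Consume at most `limit` records; the source itself may never end.
    void consume(Iterator<String> source, long limit) {
        for (long i = 0; i < limit && source.hasNext(); i++) {
            process(source.next());
        }
    }
}

class RunningCountDemo {
    public static void main(String[] args) {
        // Bounded stream: the source has an end, so a final result exists.
        RunningCount bounded = new RunningCount();
        bounded.consume(List.of("a", "b", "a").iterator(), Long.MAX_VALUE);
        System.out.println("a=" + bounded.get("a") + ", b=" + bounded.get("b"));

        // Unbounded stream: the generator never ends, so we can only
        // observe the state after some prefix of the stream (here 5 records).
        Iterator<String> endless = Stream.iterate(0, i -> i + 1)
                .map(i -> i % 2 == 0 ? "even" : "odd")
                .iterator();
        RunningCount unbounded = new RunningCount();
        unbounded.consume(endless, 5);
        System.out.println("even=" + unbounded.get("even") + ", odd=" + unbounded.get("odd"));
    }
}
```

In a real engine the state would additionally be partitioned across nodes and checkpointed; this sketch leaves both out.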

Advantages

  • Suitable for a wide range of streaming scenarios, such as event-driven applications, data pipelines, and ETL processing.
  • Strong correctness guarantees. Flink supports exactly-once semantics, so each record is reflected in the result exactly once, with nothing lost and nothing counted twice. This is generally very difficult to achieve. In addition, out-of-order data caused by delays can be handled using event time together with a mechanism for tolerating late data (watermarks).
  • Large-scale cluster computing. Flink supports horizontal scaling, very large state, and incremental checkpoints. When computing power is insufficient, overall capacity can be increased by adding compute nodes.
  • Low operation and maintenance costs. Flink supports multiple deployment modes and can be deployed flexibly. In addition, its high availability mechanism keeps services as stable as possible: even if one node goes down, the other nodes continue to serve requests.
  • Excellent computing performance. Performing computations in memory yields high throughput and low latency, which is very important for real-time processing programs.
  • Layered APIs. Different developers prefer different levels of API. The Flink SQL API provides unified processing of streaming and batch data using SQL syntax, which makes it the most approachable layer. The DataStream API is dedicated to stream processing, and the DataSet API to batch processing (newer Flink releases deprecate the DataSet API in favor of batch execution on the DataStream and SQL APIs). For functionality the higher layers do not provide, users can implement custom computation logic on the lower-level API.
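
The event-time handling of out-of-order data mentioned in the list above can be sketched as follows. This is a simplified plain-Java illustration, not Flink code: the class EventTimeWindowCounter, its methods, and the parameters windowSize and maxOutOfOrderness are assumptions made for this example. It counts events in fixed-size event-time windows, advances a watermark that trails the largest timestamp seen so far, closes a window once the watermark passes its end, and drops events that arrive for an already closed window.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: plain Java, not Flink. All names here are hypothetical.
class EventTimeWindowCounter {
    private final long windowSize;          // window length in event-time units
    private final long maxOutOfOrderness;   // how far the watermark trails the max timestamp
    private final TreeMap<Long, Long> openWindows = new TreeMap<>(); // window start -> count
    private long watermark = Long.MIN_VALUE;
    private long droppedLate = 0;

    EventTimeWindowCounter(long windowSize, long maxOutOfOrderness) {
        this.windowSize = windowSize;
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Handle one event and return the windows closed by it, as "[start,end)=count".
    List<String> onEvent(long eventTime) {
        long windowStart = Math.floorDiv(eventTime, windowSize) * windowSize;

        // The event's window has already been closed: treat the event as late.
        if (windowStart + windowSize <= watermark) {
            droppedLate++;
            return List.of();
        }

        // Assign the event to its window by event time, not by arrival order.
        openWindows.merge(windowStart, 1L, Long::sum);

        // The watermark only moves forward.
        watermark = Math.max(watermark, eventTime - maxOutOfOrderness);

        // Close every window whose end is at or before the watermark.
        List<String> fired = new ArrayList<>();
        while (!openWindows.isEmpty() && openWindows.firstKey() + windowSize <= watermark) {
            Map.Entry<Long, Long> e = openWindows.pollFirstEntry();
            fired.add("[" + e.getKey() + "," + (e.getKey() + windowSize) + ")=" + e.getValue());
        }
        return fired;
    }

    long droppedLate() {
        return droppedLate;
    }
}

class EventTimeWindowDemo {
    public static void main(String[] args) {
        EventTimeWindowCounter counter = new EventTimeWindowCounter(10, 3);
        // Timestamps arrive out of order: 8 arrives after 12, and 2 arrives too late.
        for (long t : new long[] {1, 4, 12, 8, 13, 2, 25}) {
            System.out.println(t + " -> " + counter.onEvent(t));
        }
        System.out.println("dropped late events: " + counter.droppedLate());
    }
}
```

With a window size of 10 and a tolerance of 3, the event with timestamp 8 still lands in window [0,10) although it arrives after 12, while the event with timestamp 2 arrives after that window has closed and is dropped. A larger tolerance accepts more disorder at the cost of later results.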

Reposted from: blog.csdn.net/qq_39813400/article/details/131176546