A window is a way to cut an unbounded stream into bounded pieces: it distributes the stream's data into buckets of finite size for analysis
Window types
1. Time Window
- tumbling time window
- sliding time window
- session window
2. Count Window
- tumbling count window
- sliding count window
Tumbling Windows
- Segment the data according to a fixed window length
- Time aligned, window length fixed, no overlap
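The assignment rule for tumbling windows can be sketched in plain Python. This is only an illustration of the idea, not Flink code: the function name `tumbling_window`, the integer timestamps, and the `offset` parameter are assumptions made for the example.

```python
from typing import Tuple


def tumbling_window(timestamp: int, size: int, offset: int = 0) -> Tuple[int, int]:
    """Return the half-open window [start, end) that contains `timestamp`.

    Hypothetical helper: windows are aligned to multiples of `size`
    (shifted by `offset`), so every timestamp falls into exactly one window.
    """
    start = timestamp - (timestamp - offset) % size
    return start, start + size


print(tumbling_window(12, 5))   # (10, 15)
print(tumbling_window(9, 5))    # (5, 10)
```

Because the start is computed from the timestamp alone, windows are aligned to the clock rather than to the first element that arrives.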
Sliding Windows
- A sliding window is a more general form of the tumbling window; it is defined by a fixed window length and a sliding interval
- The window length is fixed, and windows can overlap, so one element may belong to several windows
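To see why one element can land in several sliding windows, here is a small plain-Python sketch. It is not Flink code; the name `sliding_windows` and the integer timestamps are assumptions for the example.

```python
from typing import List, Tuple


def sliding_windows(timestamp: int, size: int, slide: int) -> List[Tuple[int, int]]:
    """Return every half-open window [start, end) that contains `timestamp`.

    Hypothetical helper: window starts are multiples of `slide`,
    and each window is `size` long.
    """
    windows = []
    # Latest window start that is not after the timestamp.
    start = timestamp - timestamp % slide
    # Walk backwards while the window still covers the timestamp.
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


print(sliding_windows(12, size=10, slide=5))  # [(5, 15), (10, 20)]
```

When `slide` equals `size`, the function returns exactly one window, which is the tumbling case.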
Session Windows
- A session window groups a series of events and is closed by a timeout gap of a specified length: if no new data arrives within the gap, the current window ends and the next event starts a new window
- Feature: windows are not time-aligned; their start and end depend on the data
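The gap-based grouping can be sketched in plain Python as follows. This is a simplified illustration, not Flink's implementation: the name `session_windows` is hypothetical, the events are sorted up front, and treating events exactly `gap` apart as separate sessions is a choice made for this sketch.

```python
from typing import Iterable, List


def session_windows(timestamps: Iterable[int], gap: int) -> List[List[int]]:
    """Group timestamps into sessions separated by at least `gap` of inactivity."""
    sessions: List[List[int]] = []
    for ts in sorted(timestamps):
        # Extend the current session if the event arrives within the gap.
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)
        else:
            # Otherwise the previous session has timed out; start a new one.
            sessions.append([ts])
    return sessions


print(session_windows([1, 2, 3, 10, 11, 30], gap=5))
# [[1, 2, 3], [10, 11], [30]]
```

The session boundaries come entirely from the data, which is why these windows are not aligned to the clock.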
API
- Window assigner: the .window() method
- We can use .window() to define a window and then apply an aggregation or other processing to it. Note that .window() must be called after keyBy
- Flink also provides the simpler .timeWindow and .countWindow methods for defining time windows and count windows (.timeWindow is deprecated in newer Flink versions in favor of .window() with an explicit window assigner)
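The effect of calling keyBy before the window can be sketched in plain Python: each key gets its own independent set of windows. This is not the Flink API; the function name `keyed_tumbling_sum` and the `(key, timestamp, value)` event layout are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def keyed_tumbling_sum(
    events: Iterable[Tuple[str, int, int]], size: int
) -> Dict[Tuple[str, int], int]:
    """Sum values per (key, window start), mimicking keyBy followed by a tumbling window."""
    sums: Dict[Tuple[str, int], int] = defaultdict(int)
    for key, ts, value in events:
        start = ts - ts % size          # tumbling window the event falls into
        sums[(key, start)] += value     # state is scoped to key AND window
    return dict(sums)


events = [("a", 1, 10), ("b", 2, 5), ("a", 7, 1), ("a", 12, 3)]
print(keyed_tumbling_sum(events, size=10))
# {('a', 0): 11, ('b', 0): 5, ('a', 10): 3}
```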
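Creating windows
- Tumbling time window: .timeWindow(size)
- Sliding time window: .timeWindow(size, slide)
- Session window: e.g. .window(EventTimeSessionWindows.withGap(gap))
- Tumbling count window: .countWindow(size)
- Sliding count window: .countWindow(size, slide)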
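Count windows can be sketched in plain Python as well. The sketch assumes the sliding count window fires every `slide` elements and outputs at most the last `size` elements, and that the tumbling case is simply `slide == size`. The name `count_windows` is hypothetical, and the sketch handles a single key only.

```python
from typing import List, Optional, Sequence


def count_windows(
    elements: Sequence[int], size: int, slide: Optional[int] = None
) -> List[List[int]]:
    """Return the contents of each fired count window, in firing order."""
    slide = size if slide is None else slide   # tumbling when no slide is given
    buffer: List[int] = []
    fired: List[List[int]] = []
    for i, element in enumerate(elements, start=1):
        buffer.append(element)
        buffer = buffer[-size:]                # keep at most `size` elements
        if i % slide == 0:                     # fire every `slide` elements
            fired.append(list(buffer))
    return fired


print(count_windows([1, 2, 3, 4, 5], size=2))              # [[1, 2], [3, 4]]
print(count_windows([1, 2, 3, 4, 5, 6], size=4, slide=2))  # [[1, 2], [1, 2, 3, 4], [3, 4, 5, 6]]
```

Note that in the tumbling case the trailing element 5 never produces output, because its window never fills up.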
Window functions
- Incremental aggregation functions process each element as it arrives and maintain only a simple state: ReduceFunction, AggregateFunction
- Full window functions first collect all the data of the window and then iterate over it when the window is evaluated: ProcessWindowFunction, WindowFunction
- .trigger(): trigger. Defines when the window closes, triggers the computation, and outputs the result
- .evictor(): evictor. Defines logic to remove certain data from the window
- .allowedLateness(): allows late data to be processed
- .sideOutputLateData(): puts late data into the side output stream
- .getSideOutput(): gets the side output stream
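The difference between incremental and full window functions, and the routing of late data, can be sketched in plain Python. None of this is Flink code. All function names are hypothetical, and the watermark model (the largest timestamp seen so far, with no out-of-orderness bound) is a simplifying assumption. An element is treated as late once the watermark has passed the window's last timestamp plus the allowed lateness.

```python
from statistics import median
from typing import Dict, Iterable, List, Optional, Tuple


def incremental_average(values: Iterable[float]) -> Optional[float]:
    """Incremental style: only a (total, count) accumulator is kept as state."""
    total, count = 0.0, 0
    for v in values:            # updated once per arriving element
        total += v
        count += 1
    return total / count if count else None


def full_window_median(values: Iterable[float]) -> Optional[float]:
    """Full-window style: the whole window is buffered before computing."""
    buffered = list(values)
    return median(buffered) if buffered else None


def assign_with_lateness(
    events: Iterable[Tuple[int, int]], size: int, allowed_lateness: int
) -> Tuple[Dict[int, int], List[Tuple[int, int]]]:
    """Sum values per tumbling window; route too-late events to a side list."""
    watermark = float("-inf")   # assumption: watermark = max timestamp seen so far
    sums: Dict[int, int] = {}
    late: List[Tuple[int, int]] = []
    for ts, value in events:
        start = ts - ts % size
        max_ts = start + size - 1             # last timestamp inside the window
        if max_ts + allowed_lateness <= watermark:
            late.append((ts, value))          # stands in for the side output
        else:
            sums[start] = sums.get(start, 0) + value
        watermark = max(watermark, ts)
    return sums, late


events = [(1, 10), (4, 20), (12, 5), (3, 7), (25, 1), (2, 99)]
print(assign_with_lateness(events, size=10, allowed_lateness=5))
# ({0: 37, 10: 5, 20: 1}, [(2, 99)])
```

In the example, the event `(3, 7)` arrives out of order but still within the allowed lateness, so it is added to window 0. The event `(2, 99)` arrives after the watermark has reached 25 and goes to the late list instead.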