1. Introduction
In Flink, windowAll turns a DataStream into an AllWindowedStream, and a Trigger then decides when the window logic is executed. This article gives a basic overview of Trigger.
2. Introduction to Trigger
A Trigger fires a window computation when certain conditions are met. If the window uses a built-in operator, that operator is executed; if it uses a custom ProcessAllWindowFunction, the custom logic is executed. In other words, the trigger determines when a window (created by the window assigner) is ready to be processed by the window function. Every WindowAssigner has a default trigger. If the default trigger does not suit your needs, you can specify a custom one with trigger(...). Triggers are most often used to emit results from a large window early and in real time. For example, with a 100s window, a trigger can fire every 10s so that the window logic runs on the data collected so far.
1. Trigger methods
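To make the "100s window, fire every 10s" example concrete, here is a minimal plain-Java sketch of the firing schedule. It is only a model and does not use Flink classes: the class and method names are hypothetical, and it assumes firing times are aligned to multiples of the interval. How a real trigger handles the final firing at the window boundary depends on the trigger and the Flink version, so the sketch lists only the intermediate firing times.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of periodic firing inside one window; it does not use Flink classes.
class PeriodicFiringSketch {

    // First multiple of intervalMs that lies strictly after nowMs.
    static long nextFireTime(long nowMs, long intervalMs) {
        return nowMs - (nowMs % intervalMs) + intervalMs;
    }

    // Intermediate firing times for a window that ends at windowEndMs (exclusive),
    // given that the first element arrives at firstElementMs.
    static List<Long> fireTimes(long firstElementMs, long windowEndMs, long intervalMs) {
        List<Long> times = new ArrayList<>();
        for (long t = nextFireTime(firstElementMs, intervalMs); t < windowEndMs; t += intervalMs) {
            times.add(t);
        }
        return times;
    }

    public static void main(String[] args) {
        // 100s window [0s, 100s), first element at 2.5s, firing every 10s.
        System.out.println(fireTimes(2_500L, 100_000L, 10_000L));
    }
}
```

Each listed time is a point at which the window function would run on whatever elements the window holds at that moment.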
· onElement
public abstract TriggerResult onElement(T element, long timestamp, W window, Trigger.TriggerContext ctx) throws Exception;
onElement() is called for every element added to a window. Taking the basic CountTrigger as an example, each arriving element increments a counter kept by the trigger; once the counter reaches the configured count, the trigger fires and the window logic runs.
· onEventTime
public abstract TriggerResult onEventTime(long time, W window, Trigger.TriggerContext ctx) throws Exception;
onEventTime() is called when a registered event-time timer fires. A trigger that fires repeatedly typically registers the event-time timer for its next firing at this point.
· onProcessingTime
public abstract TriggerResult onProcessingTime(long time, W window, Trigger.TriggerContext ctx) throws Exception;
onProcessingTime() is called when a registered processing-time timer fires. It is handled in the same way as onEventTime().
· onMerge
public void onMerge(W window, Trigger.OnMergeContext ctx) throws Exception {
    throw new UnsupportedOperationException("This trigger does not support merging.");
}
onMerge() matters for stateful triggers: when two windows are merged, for example with session windows, the state of their triggers can be merged as well. The merge works like a reduce function that combines the state variables of TimeWindow1 and TimeWindow2 into one.
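The reduce-like merge can be pictured with a small plain-Java sketch. It is a model only: WindowCount and merge are hypothetical names rather than Flink classes, and it assumes the trigger state is a single element counter, as in a count-based trigger.

```java
// Plain-Java model of merging trigger state; the names are hypothetical, not Flink classes.
class MergeSketch {

    // A window [start, end) together with the counter state its trigger keeps.
    static final class WindowCount {
        final long start;
        final long end;
        final long count;

        WindowCount(long start, long end, long count) {
            this.start = start;
            this.end = end;
            this.count = count;
        }
    }

    // Merging two windows: the result covers both, and the counters are reduced by summing.
    static WindowCount merge(WindowCount a, WindowCount b) {
        return new WindowCount(
                Math.min(a.start, b.start),
                Math.max(a.end, b.end),
                a.count + b.count);
    }

    public static void main(String[] args) {
        WindowCount merged = merge(new WindowCount(0, 10, 2), new WindowCount(5, 15, 3));
        System.out.println(merged.start + ".." + merged.end + " count=" + merged.count);
    }
}
```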
· clear
public abstract void clear(W window, Trigger.TriggerContext ctx) throws Exception;
Finally, clear() performs whatever cleanup is needed when the corresponding window is removed. Taking the basic CountTrigger as an example, clear() clears the counter state, that is, resets it to 0.
· canMerge
public boolean canMerge() {
    return false;
}
canMerge() reports whether the trigger supports merging trigger state through onMerge(). The default implementation returns false.
2. Trigger result
The three methods onElement, onProcessingTime, and onEventTime each return a TriggerResult, an enum that tells the window operator what to do once the method has run.
· TriggerResult.CONTINUE - do nothing
· TriggerResult.FIRE - trigger the window computation
· TriggerResult.PURGE - clear the elements in the window
· TriggerResult.FIRE_AND_PURGE - trigger the window computation, then clear the elements in the window
Taking CountTrigger as an example, it returns TriggerResult.FIRE each time the configured number of elements has accumulated, which runs the window logic; while fewer elements have arrived, it returns TriggerResult.CONTINUE.
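The counting behaviour described above can be sketched in plain Java. This is a simplified model, not Flink's CountTrigger: the Result enum and the class name are hypothetical stand-ins, and the counter is an ordinary field instead of Flink-managed state.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a count-based trigger; it does not use Flink classes.
class CountTriggerSketch {

    // Stand-in for the four TriggerResult values.
    enum Result { CONTINUE, FIRE, PURGE, FIRE_AND_PURGE }

    private final long maxCount;
    private long count = 0;

    CountTriggerSketch(long maxCount) {
        this.maxCount = maxCount;
    }

    // Called once per element: count it, and fire when the threshold is reached.
    Result onElement() {
        count++;
        if (count >= maxCount) {
            count = 0; // start counting again for the next firing
            return Result.FIRE;
        }
        return Result.CONTINUE;
    }

    // Cleanup when the window is removed: reset the counter to 0.
    void clear() {
        count = 0;
    }

    // Feeds n elements to a fresh trigger and records the result for each one.
    static List<Result> feed(long maxCount, int n) {
        CountTriggerSketch trigger = new CountTriggerSketch(maxCount);
        List<Result> results = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            results.add(trigger.onElement());
        }
        return results;
    }

    public static void main(String[] args) {
        System.out.println(feed(3, 7));
    }
}
```

With a count of 3, every third element yields FIRE and all other elements yield CONTINUE.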
3. Built-in triggers
Flink ships the following window triggers in the org.apache.flink.streaming.api.windowing.triggers package. To write a custom trigger, extend the Trigger class and implement its methods. For example, you can combine the logic of CountTrigger and ProcessingTimeTrigger into a CountAndProcessingTime trigger that fires on either a count or a processing-time condition.
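One way to picture such a combined count-and-processing-time trigger is the plain-Java sketch below. It is a model under stated assumptions, not a Flink Trigger implementation: the class, the Result enum, and the explicit nowMs clock argument are hypothetical, and the policy chosen here (restart the timer after a count-based firing, skip time-based firings when nothing has arrived) is just one reasonable design.

```java
// Plain-Java model of a "count OR processing time" trigger; it does not use Flink classes.
class CountOrTimeTriggerSketch {

    enum Result { CONTINUE, FIRE }

    private final long maxCount;
    private final long intervalMs;
    private long count = 0;
    private long nextFireTime = -1; // -1 means no timer is registered yet

    CountOrTimeTriggerSketch(long maxCount, long intervalMs) {
        this.maxCount = maxCount;
        this.intervalMs = intervalMs;
    }

    // Count condition: fire as soon as maxCount elements have accumulated.
    Result onElement(long nowMs) {
        if (nextFireTime < 0) {
            nextFireTime = nowMs + intervalMs; // register the first timer
        }
        count++;
        if (count >= maxCount) {
            count = 0;
            nextFireTime = nowMs + intervalMs; // restart the timer after a count-based firing
            return Result.FIRE;
        }
        return Result.CONTINUE;
    }

    // Time condition: fire when the timer is due and at least one element is pending.
    Result onProcessingTime(long nowMs) {
        if (nextFireTime < 0 || nowMs < nextFireTime) {
            return Result.CONTINUE;
        }
        nextFireTime = nowMs + intervalMs; // register the next timer
        if (count == 0) {
            return Result.CONTINUE; // nothing new since the last firing
        }
        count = 0;
        return Result.FIRE;
    }

    public static void main(String[] args) {
        CountOrTimeTriggerSketch trigger = new CountOrTimeTriggerSketch(3, 1_000L);
        System.out.println(trigger.onElement(0L));
        System.out.println(trigger.onElement(100L));
        System.out.println(trigger.onElement(200L));        // third element: count condition
        System.out.println(trigger.onElement(300L));
        System.out.println(trigger.onProcessingTime(1_200L)); // timer due with one pending element
    }
}
```

In a real Flink trigger the counter and the next firing time would live in state obtained from the TriggerContext, and the timer would be registered through the context rather than polled.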
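ContinuousEventTimeTrigger | Fires repeatedly at a fixed event-time interval |
ContinuousProcessingTimeTrigger | Fires repeatedly at a fixed processing-time interval |
CountTrigger | Fires once the number of elements reaches a given count |
DeltaTrigger | Fires when a delta computed between elements exceeds a threshold |
EventTimeTrigger | Fires when the watermark passes the end of the window |
ProcessingTimeoutTrigger | Wraps another trigger and fires after a processing-time timeout |
ProcessingTimeTrigger | Fires when processing time passes the end of the window |
PurgingTrigger | Wraps another trigger and turns FIRE into FIRE_AND_PURGE |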
3. API example
1. Scala example
The following example aggregates the original DataStream in a 10s tumbling window, with the trigger set to CountTrigger so that the window fires every 30 elements.
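import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.function.ProcessAllWindowFunction
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time
import org.apache.flink.streaming.api.windowing.triggers.CountTrigger
import org.apache.flink.streaming.api.windowing.windows.TimeWindow
import org.apache.flink.util.Collector

val allwindowedStream = dataStream
  .windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
  .trigger(CountTrigger.of[TimeWindow](30L)) // fire every 30 elements
  .process(new ProcessAllWindowFunction[String, String, TimeWindow] {
    override def process(context: Context, elements: Iterable[String], out: Collector[String]): Unit = {
      // join the elements currently in the window into one comma-separated string
      val info = elements.toArray.mkString(",")
      out.collect(info)
    }
  }).setParallelism(1)
allwindowedStream.print()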
Tips:
The type parameter passed to of must be the window type, which is the [TimeWindow] after of. If the element type of the data (String here) is passed instead, a type mismatch error is reported:
Required: Trigger[_ >: String,_ >: TimeWindow]
Found: ContinuousProcessingTimeTrigger[String]
2. Java Example
The following example creates a 10s tumbling window over the original DataStream and, using ContinuousProcessingTimeTrigger, runs the window's processing logic (ProcessFunction) every 5s of processing time.
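dataStream
    .setParallelism(processParallel)
    .windowAll(TumblingProcessingTimeWindows.of(Time.seconds(10)))
    // fire every 5s of processing time inside the 10s window
    .trigger(ContinuousProcessingTimeTrigger.of(Time.seconds(5)))
    // ProcessFunction is the custom class here; process() on a windowAll stream
    // expects a ProcessAllWindowFunction implementation
    .process(new ProcessFunction())
    .addSink(outputSink)
    .setParallelism(processParallel);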
4. Summary and notes
1. Default trigger
Event-time windows use EventTimeTrigger by default, and processing-time windows use ProcessingTimeTrigger by default.
2. GlobalWindow
The default trigger for GlobalWindow is NeverTrigger, which never fires. When using GlobalWindow you therefore always have to specify a custom trigger.
3. Window trigger logic
Once a trigger determines that a window is ready for processing, it fires, that is, it returns FIRE or FIRE_AND_PURGE. This is the signal for the window operator to emit the result of the current window. For a window with a ProcessWindowFunction, all elements are passed to the ProcessWindowFunction. Windows with a ReduceFunction or AggregateFunction simply emit their aggregated result.
4. FIRE and PURGE
FIRE triggers the window computation without clearing the window's elements; PURGE clears the window's elements without triggering the computation; FIRE_AND_PURGE triggers first and then clears. When writing a custom trigger, take care not to lose a batch of data after the window has fired, for example by purging elements that have not been emitted yet. Also note that PURGE only clears the elements of the window; custom metadata and the basic properties of the window are not cleared.
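The effect of each result on a window's buffered elements can be illustrated with a small plain-Java model. It is a sketch only: the class, the Result enum, and the emitted list are hypothetical and stand in for the window operator's element buffer and its output.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java model of a window's element buffer; it does not use Flink classes.
class WindowBufferSketch {

    enum Result { CONTINUE, FIRE, PURGE, FIRE_AND_PURGE }

    final List<String> buffer = new ArrayList<>();        // elements currently in the window
    final List<List<String>> emitted = new ArrayList<>(); // one snapshot per firing

    void add(String element) {
        buffer.add(element);
    }

    // Applies a trigger result: fire first (if requested), then purge (if requested).
    void apply(Result result) {
        if (result == Result.FIRE || result == Result.FIRE_AND_PURGE) {
            emitted.add(new ArrayList<>(buffer)); // the window function sees the current buffer
        }
        if (result == Result.PURGE || result == Result.FIRE_AND_PURGE) {
            buffer.clear(); // only the elements are dropped
        }
    }

    public static void main(String[] args) {
        WindowBufferSketch window = new WindowBufferSketch();
        window.add("a");
        window.add("b");
        window.apply(Result.FIRE);           // emits a, b and keeps them
        window.add("c");
        window.apply(Result.FIRE_AND_PURGE); // emits a, b, c and then clears the buffer
        window.add("d");
        window.apply(Result.PURGE);          // drops d without emitting it
        System.out.println(window.emitted + " buffer=" + window.buffer);
    }
}
```

The last step shows the data-loss risk: an element that is purged before any firing never reaches the window function.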