Four, flink - window, eventTime and watermark principle and use

A, flink the window mechanism

1.1 window Overview

computing a streaming flow is designed to handle data processing engine unlimited data set, the data set is infinitely an increasingly attractive essentially unlimited data set, and a cutting window limited unlimited data block a means of treatment.
Window core unlimited data stream is processed, Window will be split into an infinite stream of "buckets" bucket limited size, we can do the calculation operations on these buckets.

1.2 window type

window can be divided into two categories:
CountWindow: generating a Window specified according to the number of pieces of data, independent of time. Less used
TimeWindow: generating a Window in time. Very commonly, below the main window of time what type. There are four categories: scrolling window (Tumbling Window), the sliding window (Sliding Window), the session window (Session Window) and Global Window (global window is rarely used).

1.2.1 scrolling window (Tumbling Windows)

Summary: The data slice based on the data length of a fixed window. Only one operating parameter is the window size
characteristics: time alignment, the window length is fixed, there is no overlap.
Each rolling window distributor element is assigned to a specified window size of window, scroll a window has a fixed size, and do not overlap (both before and after the time point immediately). For example: If you specify a scrolling window size of five minutes, to create the window as shown below:
Four, flink - window, eventTime and watermark principle and use
Fig 1.2.1 scrolling window
application scenarios: BI suitable statistics (a polymerisation calculated for each time period).

1.2.2 sliding window (Sliding Windows)

Overview: sliding window is a more generalized form of a fixed window, sliding window operating parameters by a fixed interval sliding window length and composition.
Features: time alignment, the window length is fixed, there is overlap.
Sliding window distributor element is assigned to a fixed length window, similar to a scroll window, the window size to the window size parameter configured by the other sliding window control parameters of frequency starts sliding window. Accordingly, if the slide sliding window parameter is less than the window size, then the window can overlap, in which case the elements will be assigned to the plurality of windows.
For example, you have 5 minutes and 10 minutes sliding window, each window comprising a window for 5 minutes with the last 10 minutes of data generated, as shown below:
Four, flink - window, eventTime and watermark principle and use
Fig 1.2.2 sliding window
application scenarios: For a recent statistical period (recent seeking an interface failure rate 5min to decide whether or not to alarm).

1.2.3 session window (Session Windows)

Overview: a length of time specified by the timeout event a series combination consisting of a gap, like session web application, i.e. a period of time no new data is received will generate a new window.
Features: No aligned. No fixed length window
session window dispenser activities performed by the elements packet session, session window to scroll the window and compared with the sliding window, there will be no overlap and the case of a fixed start time and end time, on the contrary, when it is a no longer receive elements within a fixed period of time, that is generated at intervals of inactivity, the window will close. A session window interval is configured by a session, this session interval defines the length of the inactive period, when the period of inactivity is generated, then the current session will be closed and the subsequent elements will be assigned to the window to a new session.
Four, flink - window, eventTime and watermark principle and use
Figure 1.2.3 Session window

1.3 window windows api

1.3.1 window api classification

window数据源分为两种,一种是典型的KV类型(keyedStream),另一种是非KV类型(Non-keyedStream)。
区别:
keyedStream:
需要在使用窗口操作前,调用 keyBy对KV按照key进行分区,然后才可以调用window操作的api,比如 countWindow,timeWindow等

Non-keyedstream:
如果使用窗口操作前,没有使用keyBy算子,那么就认为是Non-keyedstream,调用的window api就是 xxxWindowAll,比如countWindowAll,timeWindowAll,而且因为是非KV,所以无法分区,也就是只有一个分区,那么这个窗口并行度只能是1。这个是要注意的。

1.3.2 countWindow

CountWindow根据窗口中相同key元素的数量来触发执行,执行时只计算元素数量达到窗口大小的key对应的结果。

有两个用法:
countWindow(window_size):只指定窗口大小,此时窗口是滚动窗口
countWindow(window_size, slide):指定窗口大小以及滑动间隔,此时窗口是滑动窗口

注意:CountWindow的window_size指的是相同Key的元素的个数,不是输入的所有元素的总数。

1、滚动窗口
默认的CountWindow是一个滚动窗口,只需要指定窗口大小即可,当元素数量达到窗口大小时,就会触发窗口的执行。

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class WindowTest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<String> source = env.readTextFile("/test.txt");
        source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) throws Exception {
                for (String s1 : s.split(" ")) {
                    collector.collect(new Tuple2<>(s1, 1));
                }
            }
        }).keyBy(0).countWindow(5).reduce(new ReduceFunction<Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> reduce(Tuple2<String, Integer> t1, Tuple2<String, Integer> t2) throws Exception {
                return new Tuple2<>(t1.f0, t1.f1 + t2.f1);
            }
        }).print();

        env.execute("滚动窗口");
    }

}

2、滑动窗口
动窗口和滚动窗口的函数名是完全一致的,只是在传参数时需要传入两个参数,一个是window_size,一个是sliding_size。
下面代码中的sliding_size设置为了2,也就是说,每收到两个相同key的数据就计算一次,每一次计算的window范围是5个元素。

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.util.Collector;

public class WindowTest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<String> source = env.readTextFile("/test.txt");
        source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) throws Exception {
                for (String s1 : s.split(" ")) {
                    collector.collect(new Tuple2<>(s1, 1));
                }
            }
        }).keyBy(0).countWindow(5,2).reduce(new ReduceFunction<Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> reduce(Tuple2<String, Integer> t1, Tuple2<String, Integer> t2) throws Exception {
                return new Tuple2<>(t1.f0, t1.f1 + t2.f1);
            }
        }).print();

        env.execute("滑动窗口");
    }

}

1.3.3 timeWindow

​ TimeWindow是将指定时间范围内的所有数据组成一个window,一次对一个window里面的所有数据进行计算。同样支持类似上面的滚动窗口和滑动窗口模式。有两个工作参数:window_size和slide。只指定window_size时是滚动窗口。

1、滚动窗口
​ Flink默认的时间窗口根据Processing Time 进行窗口的划分,将Flink获取到的数据根据进入Flink的时间划分到不同的窗口中。

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class WindowTest {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        DataStreamSource<String> source = env.readTextFile("/test.txt");
        source.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) throws Exception {
                for (String s1 : s.split(" ")) {
                    collector.collect(new Tuple2<>(s1, 1));
                }
            }
        }).keyBy(0).timeWindow(Time.seconds(2)).reduce(new ReduceFunction<Tuple2<String, Integer>>() {
            @Override
            public Tuple2<String, Integer> reduce(Tuple2<String, Integer> t1, Tuple2<String, Integer> t2) throws Exception {
                return new Tuple2<>(t1.f0, t1.f1 + t2.f1);
            }
        }).print();

        env.execute("滚动窗口");
    }

}

2、滑动窗口
和上面类似,就是参数里面增加了slide参数,也就是滑动时间间隔。时间间隔可以通过Time.milliseconds(x),Time.seconds(x),Time.minutes(x)等其中的一个来指定。

1.3.4 window reduce

也就是在窗口算子之后执行reduce算子,用法和普通的reduce一样,只不过reduce的单位是一个窗口。即每一个窗口返回一次reduce结果。程序在上面,不重复了。

1.3.5 window fold

也就是在窗口算子之后执行fold算子,用法和普通的fold一样,只不过fold的单位是一个窗口。即每一个窗口返回一次reduce结果。程序在上面,不重复了。

1.3.6 window聚合操作

指的是max、min等这些聚合算子,只不过是在window算子之后使用,以窗口为单位,每一个窗口返回一次聚合结果,而不是像普通那样,每一次聚合结果都返回。

二、time、watermark和window

2.1 flink中 time的分类

在flink中,time有不同分类,如下:
Event Time:
是事件创建的时间。它通常由事件中的时间戳描述,例如采集的日志数据中,每一条日志都会记录自己的生成时间,Flink通过时间戳分配器访问事件时间戳。

Ingestion Time:
是数据进入Flink的时间。

Processing Time:
是每一个执行基于时间操作的算子的本地系统时间,与机器相关,默认的时间属性就是Processing Time。也就是数据被处理时的当前时间。

这些时间有什么不同呢?因网络传输需要时间,所以Ingestion Time不一定和Event Time相等,很多情况下是不等的。同样Processing Time表示数据处理时的时间,如果数据是很久之前采集的,现在才处理,那么很明显,三个时间time都不会相等的。
Four, flink - window, eventTime and watermark principle and use
​ 图 2.1 flink--时间的概念

例子:
一条日志进入Flink的时间为2017-11-12 10:00:00.123,到达Window的系统时间为2017-11-12 10:00:01.234,日志的内容如下:
2017-11-02 18:37:15.624 INFO Fail over to rm2
可以看到,三个time都不相等。而对于业务来说,要统计1min内的故障日志个数,哪个时间是最有意义的?—— eventTime,因为我们要根据日志的生成时间进行统计。但是flink默认的窗口的时间是Processing Time,那么如何引入eventTime呢?

2.2 eventTime的引入

​ 在Flink的流式处理中,绝大部分的业务都会使用eventTime,一般只在eventTime无法使用时,才会被迫使用ProcessingTime或者IngestionTime。默认使用的是ProcessingTime。那么如何指定flink使用指定的time呢?

2.2.1 引入方式1:设置env时间类型

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(时间类型);

//三种类型的time对应如下:
TimeCharacteristic.EventTime;  eventtime
TimeCharacteristic.IngestionTime;  到达flink的时间
TimeCharacteristic.ProcessingTime;  处理数据的时间

这种方式是整个env全局生效的,是直接将env默认的时间设置为eventtime。后面的窗口操作默认就会使用eventtime作为时间依据。如果想不同的窗口设置不同的时间类型,这种方式就行不通了。

2.2.2 引入方式2:单独设置window的实际类型

stream.window(TumblingEventTimeWindows.of(Time.seconds(5)))

.window这个api就是所有窗口总的api,其他窗口api都是通过这个api封装出来的。可以通过这个总api,参数直接窗口的类型,比如上面的就是指定eventtime 的timewindow,这样并不会影响整个env的时间类型。

同样的,其他时间类型窗口,比如:
SlidingEventTimeWindows  滑动eventtime窗口

基本上看名字就知道是什么时间类型(三大时间类型)、以及什么类型(滑动、滚动、会话窗口)的窗口了。注意:eventtime没有session窗口,processingTime和

2.3 watermark的原理

2.3.1 引入背景

​ 我们知道,流处理从事件产生,到流经source,再到operator,中间是有一个过程和时间的,虽然大部分情况下,流到operator的数据都是按照事件产生的时间顺序来的,但是也不排除由于网络、背压等原因,导致乱序的产生,所谓乱序,就是指Flink接收到的事件的先后顺序不是严格按照事件的Event Time顺序排列的。
Four, flink - window, eventTime and watermark principle and use
​ 图 2.3 数据的乱序
​ 那么此时出现一个问题,一旦出现乱序,如果只根据eventTime决定window的运行,我们不能明确数据是否全部到位,但又不能无限期的等下去,此时必须要有个机制来保证一个特定的时间后,必须触发window去进行计算了,这个特别的机制,就是Watermark。
解释:
如果只按照到达的event的eventtime来触发窗口操作,假设有event1~5。如果到达顺序是乱的,比如event5最先达到,然后event1也达到了,那么flink这边怎么知道这中间还有没有数据呢?没办法的,不能确定数据是否完整到达,也不能无限制等待下去。所以需要一种机制来处理这种情况。

2.3.2 watermark机制原理

​ Watermark是一种衡量Event Time进展的机制,它是数据本身的一个隐藏属性,数据本身携带着对应的Watermark。Watermark是用于处理乱序事件的,而正确的处理乱序事件,通常用Watermark机制结合window来实现。
​ 数据流中的Watermark用于表示timestamp小于Watermark的数据,都已经到达了,因此,window的执行也是由Watermark触发的。
​ Watermark可以理解成一个延迟触发机制,我们可以设置Watermark的延时时长t,每次系统会校验已经到达的数据中最大的maxEventTime,然后认定eventTime小于maxEventTime - t的所有数据都已经到达,如果有窗口的停止时间等于maxEventTime – t,那么这个窗口被watermark触发执行。
解释:
​ watermark是一种概率性的机制。假设event1~5,如果event5已经到达了,那么其实按照event产生的先后顺序,正常情况下,前面的event1~4应该也到达了。而为了保证前面的event1~4的到达(其实是更多的到达,但是不一定全部都到达),在event5到达了之后,提供一定的延迟时间t。当event5到达,且经过 t 时间之后,正常情况下,前面的event1~4 大概率会到达了,如果没有到达,属于少数情况,那么就认为event5之前的event都到达了,无论是否真的全部到达了。如果在延迟时间之后到达了,这个旧数据直接会被丢弃。所以其实watermark就是一种保障更多event乱序到达的机制,提供了一定的延时机制,而因为只会延迟一定的时间,所以也不会导致flink无限期地等待下去。

有序数据流的watermark如下:(watermark设置为0)
Four, flink - window, eventTime and watermark principle and use
​ 图 2.4 有序数据流的watermark
乱序数据流的watermark如下:(watermark设置为2)
Four, flink - window, eventTime and watermark principle and use
​ 图 2.5 乱序数据流的watermark
​ 当Flink接收到每一条数据时,都会产生一条Watermark,这条Watermark就等于当前所有到达数据中的maxEventTime - 延迟时长t,也就是说,Watermark是由数据携带的,一旦数据携带的Watermark比当前未触发的窗口的停止时间要晚,那么就会触发相应窗口的执行。由于Watermark是由数据携带的,因此,如果运行过程中无法获取新的数据,那么没有被触发的窗口将永远都不被触发。
​ 上图中,我们设置的允许最大延迟到达时间为2s,所以时间戳为7s的事件对应的Watermark是5s,时间戳为12s的事件的Watermark是10s,如果我们的窗口1是1s~5s,窗口2是6s~10s,那么时间戳为7s的事件到达时的Watermarker恰好触发窗口1,时间戳为12s的事件到达时的Watermark恰好触发窗口2。
​ Window会不断产生,属于这个Window范围的数据会被不断加入到Window中,所有未被触发的Window都会等待触发,只要Window还没触发,属于这个Window范围的数据就会一直被加入到Window中,直到Window被触发才会停止数据的追加,而当Window触发之后才接受到的属于被触发Window的数据会被丢弃。如果产生的窗口中没有新到的数据,也就不会有watermark,那么窗口就不会被触发计算。

2.3.3 watermark的触发计算的条件

watermark时间(max_eventTime-t) >= window_end_time;
在[window_start_time,window_end_time)中有数据存在。

2.3.4 watermark的产生方式

Punctuated:不间断产生
数据流中每一个递增的EventTime都会产生一个Watermark。
在实际的生产中Punctuated方式在TPS很高的场景下会产生大量的Watermark在一定程度上对下游算子造成压力,所以只有在实时性要求非常高的场景才会选择Punctuated的方式进行Watermark的生成。

Periodic:周期性产生
周期性的(一定时间间隔或者达到一定的记录条数)产生一个Watermark。
在实际的生产中Periodic的方式必须结合时间和积累条数两个维度继续周期性产生Watermark,否则在极端情况下会有很大的延时。

这两种有不同的api实现,下面会讲

2.4 watermark的引入以及接口

2.4.1 watermark引入

需要先引入eventime,然后引入watermark

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

DataStreamSource<String> source = env.readTextFile("/test.txt");

//引入的watermark的实现类
source.assignTimestampsAndWatermarks(xx)

watermark的实现有两大类,对应上面的两种watermark的产生方式,有两个接口:

AssignerWithPeriodicWatermarks;   周期性产生watermark,即Period
AssignerWithPunctuatedWatermarks;  Punctuated:不间断产生

2.4.2 AssignerWithPeriodicWatermarks接口

看看AssignerWithPeriodicWatermarks这个接口的源码,主要用于周期性产生watermark

public interface AssignerWithPeriodicWatermarks<T> extends TimestampAssigner<T> {
    //获取当前的watermark
    @Nullable
    Watermark getCurrentWatermark();
}

//父接口===================
public interface TimestampAssigner<T> extends Function {
    //获取当前的时间戳
    long extractTimestamp(T var1, long var2);
}

There are mainly two methods need to cover, getCurrentWatermark () for generating a watermark, extractTimestamp for acquiring a timestamp of each event.
Since this is a periodically generated watermark interface, it is necessary to specify how long the generation period, specify env configuration, such as:

env.getConfig().setAutoWatermarkInterval(n ms);
记住间隔时间单位是毫秒

example:

/*根据eventTime 创建处理watermark
*/
public class BoundedOutOfOrdernessGenerator implements AssignerWithPeriodicWatermarks<MyEvent> {

    //watermark延迟时间 t,单位是毫秒
    private final long maxOutOfOrderness = 3500; // 3.5 seconds

    //保存当前最大的时间戳
    private long currentMaxTimestamp;

    //根据传递进来的event,获取time,然后如果比当前最大的time还大,就替换,否则保持。因为数据乱序到达是无法保证时间是递增的
    @Override
    public long extractTimestamp(MyEvent element, long previousElementTimestamp) {
        long timestamp = element.getCreationTime();
        currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
        return timestamp;
    }

    //返回watermark
    @Override
    public Watermark getCurrentWatermark() {
        // return the watermark as current highest timestamp minus the out-of-orderness bound
        return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
    }
}

setAutoWatermarkInterval (n ms) plus the set, it may be periodically generated watermark.

2.4.3 AssignerWithPunctuatedWatermarks接口

AssignerWithPunctuatedWatermarks look at this source interface is mainly used to produce real-time watermark

public interface AssignerWithPunctuatedWatermarks<T> extends TimestampAssigner<T> {
    //获取最新的watermark
    @Nullable
    Watermark checkAndGetNextWatermark(T var1, long var2);
}

//父接口
public interface TimestampAssigner<T> extends Function {
    //从event中获取timestamp
    long extractTimestamp(T var1, long var2);
}

In fact, the wording and similar to the above, but this does not set a time interval to generate watermark

2.4.4 flink own watermark implementation class

1, BoundedOutOfOrdernessTimestampExtractor
inherited a class AssignerWithPeriodicWatermarks interface and see its source

package org.apache.flink.streaming.api.functions.timestamps;

import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.api.windowing.time.Time;

public abstract class BoundedOutOfOrdernessTimestampExtractor<T> implements AssignerWithPeriodicWatermarks<T> {
    private static final long serialVersionUID = 1L;
    private long currentMaxTimestamp;
    private long lastEmittedWatermark = -9223372036854775808L;
    private final long maxOutOfOrderness;

    //构造方法中接收一个参数,就是延迟时间 t
    public BoundedOutOfOrdernessTimestampExtractor(Time maxOutOfOrderness) {
        if (maxOutOfOrderness.toMilliseconds() < 0L) {
            throw new RuntimeException("Tried to set the maximum allowed lateness to " + maxOutOfOrderness + ". This parameter cannot be negative.");
        } else {
            this.maxOutOfOrderness = maxOutOfOrderness.toMilliseconds();
            this.currentMaxTimestamp = -9223372036854775808L + this.maxOutOfOrderness;
        }
    }

    public long getMaxOutOfOrdernessInMillis() {
        return this.maxOutOfOrderness;
    }

    //需要重写的方法,用于获取timestamp
    public abstract long extractTimestamp(T var1);

    //获取watermark的方法已经写好了,用传递进来的延迟时间t来计算得出watermark
    public final Watermark getCurrentWatermark() {
        long potentialWM = this.currentMaxTimestamp - this.maxOutOfOrderness;
        if (potentialWM >= this.lastEmittedWatermark) {
            this.lastEmittedWatermark = potentialWM;
        }

        return new Watermark(this.lastEmittedWatermark);
    }

    public final long extractTimestamp(T element, long previousElementTimestamp) {
        long timestamp = this.extractTimestamp(element);
        if (timestamp > this.currentMaxTimestamp) {
            this.currentMaxTimestamp = timestamp;
        }

        return timestamp;
    }
}

This class is to achieve a user-definable set the delay time t a watermark.

2, AscendingTimestampExtractor
also inherited a class AssignerWithPeriodicWatermarks interface. A data source having stable timestamp increment, such kafka partition data, each piece of information is incremented by +1 for this class. Only you need to rewrite
extractAscendingTimestamp method.

2.5 eventTime, window, and in conjunction with the example watermark

package flinktest;

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStreamSource;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.util.Collector;

public class EventTimeTest {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        env.getConfig().setAutoWatermarkInterval(1000);

        DataStreamSource<String> source = env.readTextFile("/tmp/test.txt");

        source.assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<String>(Time.milliseconds(3000)) {
            @Override
            public long extractTimestamp(String s) {
                return Integer.valueOf(s.split(" ")[0]);
            }
        }).flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
            @Override
            public void flatMap(String s, Collector<Tuple2<String, Integer>> collector) throws Exception {
                Tuple2<String, Integer> tmpTuple = new Tuple2<>();
                for (String s1 : s.split(" ")) {
                    tmpTuple.setFields(s1, 1);
                    collector.collect(tmpTuple);
                }
            }
        }).keyBy(0)
                .timeWindow(Time.seconds(10))
                .reduce(new ReduceFunction<Tuple2<String, Integer>>() {
                    @Override
                    public Tuple2<String, Integer> reduce(Tuple2<String, Integer> t1, Tuple2<String, Integer> t2) throws Exception {
                        return new Tuple2<>(t1.f0, t1.f1 + t2.f1);
                    }
                })
                .print();
         try {
            env.execute("eventtime test");
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

window api class inheritance structure

Guess you like

Origin blog.51cto.com/kinglab/2457255