Flink Broadcast Stream Best Practices

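Copyright notice: this is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this notice.
Article link: https://blog.csdn.net/qq_22222499/article/details/100044685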

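Broadcast stream JOIN vs. regular stream JOIN

User actions can be treated as the event stream.

Patterns are the broadcast stream: every pattern is sent to all parallel compute nodes, so each node holds the full pattern set.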

Regular two-stream join
Records from both streams are partitioned by the join key, so records with the same key are sent to the same compute node.
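
The difference between the two routing modes can be sketched in plain Java without any Flink dependency. This is only an illustration: the class `RoutingSketch` and its methods are hypothetical names, and the modulo on `hashCode()` is a simplification (Flink itself assigns keys to key groups using a murmur hash).

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java simulation of how records reach parallel subtasks (not Flink code).
class RoutingSketch {

    // Keyed routing: a record goes to exactly one subtask, derived from its key.
    // Simplified; Flink's real assignment goes through key groups.
    static int keyedTarget(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Broadcast routing: a record is delivered to every subtask.
    static List<Integer> broadcastTargets(int parallelism) {
        List<Integer> targets = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            targets.add(i);
        }
        return targets;
    }

    public static void main(String[] args) {
        int parallelism = 4;
        System.out.println("user action with key u1 -> subtask " + keyedTarget("u1", parallelism));
        System.out.println("pattern p1 -> subtasks " + broadcastTargets(parallelism));
    }
}
```

With keyed routing the same key always lands on the same subtask, which is what lets a regular join find matching records locally. With broadcast routing each subtask receives every pattern, so any user action can be checked against the full set no matter where it was routed.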

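Code walkthrough

Goal: add monitored terms dynamically at runtime. Records that match a monitored term continue downstream; records that do not match are dropped.

Main class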

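import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;

// Body of the main method. KafkaUtil.getConsumer and parse are the author's own helpers and are not shown here.
StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
environment.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime);
// Checkpoint every 180 seconds
environment.enableCheckpointing(1000 * 180);
FlinkKafkaConsumer010<String> location = KafkaUtil.getConsumer("event_stream", "test_1", "test");
FlinkKafkaConsumer010<String> object = KafkaUtil.getConsumer("bro_stream", "test_2", "test");
// Key the event stream so that records with the same key are sent to the same node
KeyedStream<People, String> driverDatastream = environment.addSource(location).map(new MapFunction<String, People>() {

    @Override
    public People map(String s) throws Exception {
        return parse(s);
    }
}).keyBy((KeySelector<People, String>) people -> people.id);

// Descriptor for the broadcast map state; key and value are both String
MapStateDescriptor<String, String> mapStateDescriptor = new MapStateDescriptor<>("register", Types.STRING, Types.STRING);
BroadcastStream<String> broadcast = environment.addSource(object).broadcast(mapStateDescriptor);
// Connect the keyed event stream with the broadcast stream and filter in PatternEvaluator
driverDatastream.connect(broadcast).process(new PatternEvaluator()).print();
try {
    environment.execute("register collect");
} catch (Exception e) {
    e.printStackTrace();
}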
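
The listing above relies on a `People` type and a `parse` helper that the post does not show. A minimal sketch of what they might look like follows. Everything in it is an assumption made for illustration: the comma-separated `id,idCard,phone` format, the constructor, and placing `parse` as a static method on `People`. Only the field `id` and the getters `getIdCard()` and `getPhone()` are taken from the code in this article.

```java
// Hypothetical record type; the original post does not show People or parse.
class People {
    String id;               // used as the key in keyBy
    private final String idCard;
    private final String phone;

    People(String id, String idCard, String phone) {
        this.id = id;
        this.idCard = idCard;
        this.phone = phone;
    }

    String getIdCard() { return idCard; }

    String getPhone() { return phone; }

    // Assumed input format: "id,idCard,phone". An empty field becomes null.
    static People parse(String line) {
        String[] parts = line.split(",", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 comma-separated fields: " + line);
        }
        return new People(emptyToNull(parts[0]), emptyToNull(parts[1]), emptyToNull(parts[2]));
    }

    private static String emptyToNull(String s) {
        String trimmed = s.trim();
        return trimmed.isEmpty() ? null : trimmed;
    }
}
```

In a real job the class would normally also be public with a public no-argument constructor so that Flink can treat it as a POJO, and the real message format on the Kafka topic would decide how parsing is done.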

Element processing

import org.apache.flink.api.common.state.BroadcastState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.state.ReadOnlyBroadcastState;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class PatternEvaluator extends KeyedBroadcastProcessFunction<String, People, String, People> {

    private MapStateDescriptor<String, String> mapStateDescriptor;

    @Override
    public void open(Configuration parameters) throws Exception {
        super.open(parameters);
        // Initialize the map state descriptor; name and types must match the one passed to broadcast()
        mapStateDescriptor = new MapStateDescriptor<>("register", Types.STRING, Types.STRING);
    }

    // Check each event against the broadcast state; emit it downstream if its ID card or phone matches
    @Override
    public void processElement(People value, ReadOnlyContext ctx, Collector<People> out) throws Exception {
        ReadOnlyBroadcastState<String, String> broadcastState = ctx.getBroadcastState(mapStateDescriptor);
        if ((value.getIdCard() != null && broadcastState.get(value.getIdCard()) != null)
                || (value.getPhone() != null && broadcastState.get(value.getPhone()) != null)) {
            System.out.println("Matched: " + value);
            out.collect(value);
        }
    }

    // Store each newly broadcast element in the state
    @Override
    public void processBroadcastElement(String value, Context ctx, Collector<People> out) throws Exception {
        System.out.println("New monitored term: " + value);
        BroadcastState<String, String> broadcastState = ctx.getBroadcastState(mapStateDescriptor);
        broadcastState.put(value, value);
    }
}
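
To make the filtering behavior concrete, here is a plain-Java stand-in for a single parallel instance of PatternEvaluator, with a `HashMap` playing the role of the broadcast state. It is a sketch under stated assumptions, not Flink code: `WatchlistSketch`, `onBroadcast`, and `onEvent` are hypothetical names, and the sample values are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for one parallel instance of PatternEvaluator (no Flink dependency).
class WatchlistSketch {
    // Plays the role of the broadcast map state named "register".
    private final Map<String, String> terms = new HashMap<>();

    // Mirrors processBroadcastElement: remember the new monitored term.
    void onBroadcast(String term) {
        terms.put(term, term);
    }

    // Mirrors processElement: returns true if the record would be emitted downstream.
    boolean onEvent(String idCard, String phone) {
        return (idCard != null && terms.containsKey(idCard))
                || (phone != null && terms.containsKey(phone));
    }

    public static void main(String[] args) {
        WatchlistSketch task = new WatchlistSketch();
        System.out.println(task.onEvent("id-1", "555-0100")); // false: term not registered yet
        task.onBroadcast("555-0100");
        System.out.println(task.onEvent("id-1", "555-0100")); // true: phone is now monitored
        System.out.println(task.onEvent("id-2", null));       // false: no matching term
    }
}
```

The first call shows a consequence worth keeping in mind: an event that is processed before its term has reached that instance is dropped, because Flink gives no ordering guarantee between the event stream and the broadcast stream.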
