Flink's Output Operator: Redis Sink

Redis Sink

The documentation for recent versions of Flink does not cover a Redis sink, but its usage can be learned through Apache Bahir.

Redis offers extremely high read and write performance, which makes it one of the most frequently used sinks. A Redis sink can be implemented manually with the Java Redis client Jedis, or by using the implementations provided by Flink and Bahir.

The open source Redis connector is very convenient to use, but it does not expose some of Jedis's advanced features, such as setting an expiration time.
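
For comparison, a hand-rolled Jedis sink can attach a TTL to each write, something the connector's plain SET command does not offer. A minimal sketch, assuming Jedis 5.x; the key, value, and 60-second TTL are illustrative:

import redis.clients.jedis.Jedis;

public class JedisTtlExample {
    public static void main(String[] args) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            // SETEX: store the value and expire it after 60 seconds in one call
            jedis.setex("session:flink", 60L, "Flink");
        }
    }
}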

Jedis implementation

Add dependencies

<dependency>
    <groupId>redis.clients</groupId>
    <artifactId>jedis</artifactId>
    <version>5.0.0</version>
</dependency>

Custom Redis Sink

Define a custom RedisSink class that extends RichSinkFunction and overrides its open, invoke, and close methods:

open: creates the Redis client

invoke: writes each record to Redis; here the data is stored as a string

close: closes the Redis client when the sink is finished

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import redis.clients.jedis.Jedis;

public class RedisSink extends RichSinkFunction<Tuple2<String, String>> {

    private transient Jedis jedis;

    @Override
    public void open(Configuration config) {
        // Create the Redis client when the sink is initialized
        jedis = new Jedis("localhost", 6379);
    }

    @Override
    public void invoke(Tuple2<String, String> value, Context context) throws Exception {
        // Reconnect if the connection has been dropped
        if (!jedis.isConnected()) {
            jedis.connect();
        }
        // Store the record as a plain string key/value pair
        jedis.set(value.f0, value.f1);
    }

    @Override
    public void close() throws Exception {
        // Release the client when the sink shuts down
        jedis.close();
    }
}
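
Note that open runs once per parallel sink subtask, so each subtask creates and holds its own Jedis connection.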

Use the sink

    public static void main(String[] args) throws Exception {
        // Create the execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Create the data source
        DataStreamSource<String> dataStreamSource = env.fromElements("Flink", "Spark", "Storm");
        // Apply a transformation operator
        SingleOutputStreamOperator<Tuple2<String, String>> streamOperator = dataStreamSource.map(new MapFunction<String, Tuple2<String, String>>() {
            @Override
            public Tuple2<String, String> map(String s) throws Exception {
                return new Tuple2<>(s, s);
            }
        });

        // Attach the sink: add the given sink to this data stream
        streamOperator.addSink(new RedisSink());

        // Execute the job
        env.execute("redis sink");
    }

Verify

Query the data in the Redis console:

local:0> keys *
1) "Flink"
2) "Storm"
3) "Spark"

Open Source Redis Connector

You can use the implementations provided by Flink and Bahir, which use the Java Redis client Jedis internally to implement the Redis sink.

Maven artifact: https://mvnrepository.com/artifact/org.apache.bahir/flink-connector-redis

Project home: https://bahir.apache.org/#home

Add dependencies

The Flink-provided and Bahir-provided connectors are used in the same way. The Redis connector now lives in Apache Bahir, so both artifacts below are published under the org.apache.bahir group ID; here we introduce flink-connector-redis.

Newer release (Scala 2.12)

<dependency>
    <groupId>org.apache.bahir</groupId>
    <artifactId>flink-connector-redis_2.12</artifactId>
    <version>1.1.0</version>
</dependency>

Older release (Scala 2.11)

<dependency>
   <groupId>org.apache.bahir</groupId>
   <artifactId>flink-connector-redis_2.11</artifactId>
   <version>1.0</version>
</dependency>

Custom Redis Sink

A Redis sink can be customized by implementing the RedisMapper interface.

The custom mapper implements RedisMapper and provides getCommandDescription, getKeyFromData, and getValueFromData, as the examples below show.

getCommandDescription: defines the format in which data is stored in Redis; here RedisCommand.SET stores the data as plain strings

getKeyFromData: defines the SET key

getValueFromData: defines the SET value

RedisCommand

All Redis commands available to the connector; each command belongs to a RedisDataType.

public enum RedisCommand {

    /**
     * Insert the specified value at the head of the list stored at key.
     * If key does not exist, it is created as an empty list before the push.
     */
    LPUSH(RedisDataType.LIST),

    /**
     * Insert the specified value at the tail of the list stored at key.
     * If key does not exist, it is created as an empty list before the push.
     */
    RPUSH(RedisDataType.LIST),

    /**
     * Add the specified member to the set stored at key.
     * Members that already belong to the set are ignored.
     */
    SADD(RedisDataType.SET),

    /**
     * Set the string value of key. If key already holds a value,
     * it is overwritten, regardless of its type.
     */
    SET(RedisDataType.STRING),

    /**
     * Set the string value of key with a time to live (TTL).
     * If key already holds a value, it is overwritten, regardless of its type.
     */
    SETEX(RedisDataType.STRING),

    /**
     * Add elements to the HyperLogLog data structure stored
     * under the key given as the first argument.
     */
    PFADD(RedisDataType.HYPER_LOG_LOG),

    /**
     * Post a message to the given channel.
     */
    PUBLISH(RedisDataType.PUBSUB),

    /**
     * Add the specified member with the specified score to the
     * sorted set stored at key.
     */
    ZADD(RedisDataType.SORTED_SET),

    ZINCRBY(RedisDataType.SORTED_SET),

    /**
     * Remove the specified member from the sorted set stored at key.
     */
    ZREM(RedisDataType.SORTED_SET),

    /**
     * Set a field in the hash stored at key. If key does not exist,
     * a new hash is created. If the field already exists, it is overwritten.
     */
    HSET(RedisDataType.HASH),

    HINCRBY(RedisDataType.HASH),

    /**
     * Increment the value stored at the specified key.
     */
    INCRBY(RedisDataType.STRING),

    /**
     * Increment the value stored at the specified key and expire it after a fixed time.
     */
    INCRBY_EX(RedisDataType.STRING),

    /**
     * Decrement the value stored at the specified key.
     */
    DECRBY(RedisDataType.STRING),

    /**
     * Decrement the value stored at the specified key and expire it after a fixed time.
     */
    DESCRBY_EX(RedisDataType.STRING);

    /**
     * The {@link RedisDataType} this command belongs to.
     */
    private RedisDataType redisDataType;

    RedisCommand(RedisDataType redisDataType) {
        this.redisDataType = redisDataType;
    }

    /**
     * Get the {@link RedisDataType} this command belongs to.
     *
     * @return the {@link RedisDataType}
     */
    public RedisDataType getRedisDataType() {
        return redisDataType;
    }
}
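
Switching the target data structure is just a matter of returning a different command from getCommandDescription. A sketch of a mapper that appends each record to a Redis list instead of writing individual string keys; the class name RedisListSink and the list key "words" are illustrative:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommand;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommandDescription;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisMapper;

public class RedisListSink implements RedisMapper<Tuple2<String, String>> {

    @Override
    public RedisCommandDescription getCommandDescription() {
        // LPUSH pushes each value onto the head of a list
        return new RedisCommandDescription(RedisCommand.LPUSH);
    }

    @Override
    public String getKeyFromData(Tuple2<String, String> data) {
        // The list key; every record goes into the same list here
        return "words";
    }

    @Override
    public String getValueFromData(Tuple2<String, String> data) {
        return data.f1;
    }
}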

String data type example

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommand;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommandDescription;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisMapper;

public class RedisStringSink implements RedisMapper<Tuple2<String, String>> {

    /**
     * Set the Redis data type.
     */
    @Override
    public RedisCommandDescription getCommandDescription() {
        return new RedisCommandDescription(RedisCommand.SET);
    }

    /**
     * Set the key.
     */
    @Override
    public String getKeyFromData(Tuple2<String, String> data) {
        return data.f0;
    }

    /**
     * Set the value.
     */
    @Override
    public String getValueFromData(Tuple2<String, String> data) {
        return data.f1;
    }
}

Hash data type example

Create an Order object

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;

import java.math.BigDecimal;

@Data
@AllArgsConstructor
@NoArgsConstructor
public class Order {

    private Integer id;
    private String name;
    private BigDecimal price;
}
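
As an aside, the Order class relies on Lombok to generate its getters, setters, and constructors, and the mapper below serializes it with fastjson, so both need to be on the classpath if you follow along (the versions here are illustrative):

<dependency>
    <groupId>org.projectlombok</groupId>
    <artifactId>lombok</artifactId>
    <version>1.18.30</version>
    <scope>provided</scope>
</dependency>
<dependency>
    <groupId>com.alibaba</groupId>
    <artifactId>fastjson</artifactId>
    <version>1.2.83</version>
</dependency>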

Code example:

import com.alibaba.fastjson.JSON;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommand;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisCommandDescription;
import org.apache.flink.streaming.connectors.redis.common.mapper.RedisMapper;

public class RedisHashSink implements RedisMapper<Order> {

    /**
     * Set the Redis data type.
     */
    @Override
    public RedisCommandDescription getCommandDescription() {
        /*
         * The first argument is the hash command, the second is the outer key.
         *
         * redisCommand: the hash command HSET
         * additionalKey: used only for the hash and sorted-set data types
         * (RedisDataType.HASH or RedisDataType.SORTED_SET); ignored otherwise
         */
        return new RedisCommandDescription(RedisCommand.HSET, "redis");
    }

    /**
     * Set the key (the field inside the hash).
     */
    @Override
    public String getKeyFromData(Order order) {
        return order.getId() + "";
    }

    /**
     * Set the value (the hash field's value).
     */
    @Override
    public String getValueFromData(Order order) {
        // Serialize the order to JSON for the hash value
        return JSON.toJSONString(order);
    }
}

Use the sink

RedisStringSink

    public static void main(String[] args) throws Exception {
        // Create the execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Create the data source
        DataStreamSource<String> dataStreamSource = env.fromElements("Flink", "Spark", "Storm");
        // Apply a transformation operator
        SingleOutputStreamOperator<Tuple2<String, String>> streamOperator = dataStreamSource.map(new MapFunction<String, Tuple2<String, String>>() {
            @Override
            public Tuple2<String, String> map(String s) throws Exception {
                return new Tuple2<>(s, s);
            }
        });

        // Jedis pool configuration
        FlinkJedisPoolConfig conf = new FlinkJedisPoolConfig.Builder().setHost("localhost").setPort(6379).build();
        // Attach the sink: add the given sink to this data stream
        streamOperator.addSink(new RedisSink<>(conf, new RedisStringSink()));

        // Execute the job
        env.execute("redis sink");
    }

RedisHashSink

    public static void main(String[] args) throws Exception {
        // Create the execution environment
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Create the data source
        DataStreamSource<Order> dataStreamSource = env.fromElements(
                new Order(1, "Flink", BigDecimal.valueOf(100)),
                new Order(2, "Spark", BigDecimal.valueOf(200)),
                new Order(3, "Storm", BigDecimal.valueOf(300))
        );

        // Jedis pool configuration
        FlinkJedisPoolConfig conf = new FlinkJedisPoolConfig.Builder()
                .setHost("localhost")
                .setPort(6379)
                // Maximum number of objects the pool can allocate (default: 8)
                .setMaxTotal(100)
                // Connection timeout in milliseconds (default: 2000)
                .setTimeout(1000 * 10)
                .build();
        // Attach the sink: add the given sink to this data stream
        dataStreamSource.addSink(new RedisSink<>(conf, new RedisHashSink()));

        // Execute the job
        env.execute("redis sink");
    }
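
Beyond the pool size and timeout, the FlinkJedisPoolConfig builder can also point the sink at a secured or non-default Redis instance. A sketch, assuming a password-protected server and logical database 1 (both values are illustrative):

        FlinkJedisPoolConfig conf = new FlinkJedisPoolConfig.Builder()
                .setHost("localhost")
                .setPort(6379)
                // Authenticate against a password-protected server
                .setPassword("secret")
                // Write to logical database 1 instead of the default 0
                .setDatabase(1)
                .build();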

Verify

Query the data in the Redis console:

View String data type

local:0> get Flink
"Flink"
local:0> get Spark
"Spark"

View Hash data type

Redis:0>keys *
1) "redis"
Redis:0>HGETALL redis
1) "3"
2) "{"id":3,"name":"Storm","price":300}"
3) "2"
4) "{"id":2,"name":"Spark","price":200}"
5) "1"
6) "{"id":1,"name":"Flink","price":100}"

Origin: blog.csdn.net/qq_38628046/article/details/133811880