Flink Stream Processing API (2)

1. Sink

Flink has no foreach method like Spark's that lets users act on each element directly. All output to external systems has to go through a Sink, and the final output step of a job typically looks like this:

stream.addSink(new MySink(xxxx))

The framework ships with sinks for a number of external systems out of the box; for anything else, users implement their own sink.
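All the examples below operate on a stream of SensorReading records. The type itself is not defined in this excerpt; a minimal definition consistent with how it is used here would be (the timestamp field is an assumption; only id and temperature are actually read below):

// Hypothetical record type assumed by the sink examples in this section
case class SensorReading(id: String, timestamp: Long, temperature: Double)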

1.1 Kafka

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.11_2.12</artifactId>
    <version>1.10.1</version>
</dependency>

Add the sink in the main function:

dataStream.addSink(new FlinkKafkaProducer011[String](
  "localhost:9092", "test", new SimpleStringSchema()))

1.2 Redis

pom.xml

<!-- https://mvnrepository.com/artifact/org.apache.bahir/flink-connector-redis -->
<dependency>
    <groupId>org.apache.bahir</groupId>
    <artifactId>flink-connector-redis_2.11</artifactId>
    <version>1.0</version>
</dependency>

Define a Redis mapper class that specifies the command used when saving data to Redis:

import org.apache.flink.streaming.connectors.redis.common.mapper.{RedisCommand, RedisCommandDescription, RedisMapper}

class MyRedisMapper extends RedisMapper[SensorReading] {
  // Save each record with HSET into the Redis hash "sensor_temperature"
  override def getCommandDescription: RedisCommandDescription =
    new RedisCommandDescription(RedisCommand.HSET, "sensor_temperature")

  // Hash field: the sensor id
  override def getKeyFromData(t: SensorReading): String = t.id

  // Hash value: the temperature reading
  override def getValueFromData(t: SensorReading): String = t.temperature.toString
}

Call it in the main function:

import org.apache.flink.streaming.connectors.redis.RedisSink
import org.apache.flink.streaming.connectors.redis.common.config.FlinkJedisPoolConfig

val conf = new FlinkJedisPoolConfig.Builder().setHost("localhost").setPort(6379).build()
dataStream.addSink(new RedisSink[SensorReading](conf, new MyRedisMapper))
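Since the mapper uses HSET with the additional key sensor_temperature, every record becomes a field of one Redis hash: the sensor id is the field name and the latest temperature its value, so repeated readings from the same sensor simply overwrite each other. The result can be inspected with HGETALL sensor_temperature in redis-cli.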

1.3 Elasticsearch

pom.xml

<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-elasticsearch6_2.12</artifactId>
    <version>1.10.1</version>
</dependency>

Call it in the main function:

import java.util

import org.apache.flink.api.common.functions.RuntimeContext
import org.apache.flink.streaming.connectors.elasticsearch.{ElasticsearchSinkFunction, RequestIndexer}
import org.apache.flink.streaming.connectors.elasticsearch6.ElasticsearchSink
import org.apache.http.HttpHost
import org.elasticsearch.client.Requests

val httpHosts = new util.ArrayList[HttpHost]()
httpHosts.add(new HttpHost("localhost", 9200))

val esSinkBuilder = new ElasticsearchSink.Builder[SensorReading](
  httpHosts,
  new ElasticsearchSinkFunction[SensorReading] {
    override def process(t: SensorReading, runtimeContext: RuntimeContext,
                         requestIndexer: RequestIndexer): Unit = {
      println("saving data: " + t)
      // Pack the record into a simple JSON map
      val json = new util.HashMap[String, String]()
      json.put("data", t.toString)
      // Build an index request against index "sensor", type "readingData"
      val indexRequest = Requests.indexRequest()
        .index("sensor")
        .`type`("readingData")
        .source(json)
      requestIndexer.add(indexRequest)
      println("saved successfully")
    }
  })

dataStream.addSink(esSinkBuilder.build())
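One practical note: the builder batches index requests into bulk writes, so during local testing records may appear to be missing simply because they are still buffered. The builder can be configured before build() to flush after every record:

// Flush the bulk buffer after every record; convenient for testing,
// but keep batching in production for throughput.
esSinkBuilder.setBulkFlushMaxActions(1)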

1.4 Custom JDBC sink

<!-- https://mvnrepository.com/artifact/mysql/mysql-connector-java -->
<dependency>
    <groupId>mysql</groupId>
    <artifactId>mysql-connector-java</artifactId>
    <version>5.1.44</version>
</dependency>

Add a MyJdbcSink class:

import java.sql.{Connection, DriverManager, PreparedStatement}

import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.sink.{RichSinkFunction, SinkFunction}

class MyJdbcSink() extends RichSinkFunction[SensorReading] {
  var conn: Connection = _
  var insertStmt: PreparedStatement = _
  var updateStmt: PreparedStatement = _

  // open: create the connection and prepare the statements
  override def open(parameters: Configuration): Unit = {
    super.open(parameters)
    conn = DriverManager.getConnection("jdbc:mysql://localhost:3306/test", "root", "123456")
    insertStmt = conn.prepareStatement("INSERT INTO temperatures (sensor, temp) VALUES (?, ?)")
    updateStmt = conn.prepareStatement("UPDATE temperatures SET temp = ? WHERE sensor = ?")
  }

  // invoke: run the SQL for each record; try an UPDATE first,
  // and INSERT only if no existing row was updated (a simple upsert)
  override def invoke(value: SensorReading, context: SinkFunction.Context[_]): Unit = {
    updateStmt.setDouble(1, value.temperature)
    updateStmt.setString(2, value.id)
    updateStmt.execute()
    if (updateStmt.getUpdateCount == 0) {
      insertStmt.setString(1, value.id)
      insertStmt.setDouble(2, value.temperature)
      insertStmt.execute()
    }
  }

  // close: release the statements and the connection
  override def close(): Unit = {
    insertStmt.close()
    updateStmt.close()
    conn.close()
  }
}
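The update-then-insert pair in invoke is a portable upsert. On MySQL specifically, the same effect can be had with a single statement, assuming the temperatures table has a UNIQUE or PRIMARY KEY on sensor (a sketch, not part of the original example):

// Hypothetical MySQL-specific alternative to the two prepared statements above
val upsertStmt = conn.prepareStatement(
  "INSERT INTO temperatures (sensor, temp) VALUES (?, ?) " +
  "ON DUPLICATE KEY UPDATE temp = VALUES(temp)")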

Add the sink in the main method to save the records to MySQL:

dataStream.addSink(new MyJdbcSink())
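Because MyJdbcSink extends RichSinkFunction, open() and close() run once per parallel sink instance, so each task reuses one connection and its two prepared statements for its entire lifetime rather than reconnecting per record.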


Reposted from blog.csdn.net/weixin_42529756/article/details/114812175