Transform operators in Flink stream processing

0 Preparation

Prepare the test data:
sensor_1,1624006065247,43.92789292115926
sensor_2,1624006065247,97.45845640790921
sensor_3,1624006065247,41.35949935067326
sensor_4,1624006065247,86.68115422056633
sensor_5,1624006065247,52.53673229860578
sensor_6,1624006065247,56.6603508147016
sensor_7,1624006065247,80.31827896634314
sensor_8,1624006065247,85.2968397027334
sensor_9,1624006065247,67.08038287401958
sensor_10,1624006065247,58.008757044788
sensor_1,1624006065353,43.49476762604196
// Define the sample case class: sensor id, timestamp, temperature

case class SensorReading(id: String, timestamp: Long, temperature: Double)
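
Each line of the test data has to be turned into a SensorReading before any operator can work on it. The following is a minimal sketch of that parsing step in plain Scala, assuming a comma-separated line with exactly three fields. `SensorParser` and `parse` are hypothetical names that do not appear in the original post; unlike the inline parsing used in the demo, this version trims whitespace around fields and returns None for malformed lines instead of throwing.

```scala
import scala.util.Try

// Same shape as the SensorReading case class above, repeated so the sketch is self-contained.
case class SensorReading(id: String, timestamp: Long, temperature: Double)

object SensorParser {
  // Hypothetical helper (not part of the original post).
  // Returns None when the line does not have three fields or a number fails to parse.
  def parse(line: String): Option[SensorReading] =
    line.split(",").map(_.trim) match {
      case Array(id, ts, temp) =>
        Try(SensorReading(id, ts.toLong, temp.toDouble)).toOption
      case _ => None
    }
}
```

In a job, a helper like this could be used with flatMap so that malformed lines are dropped rather than failing the whole pipeline.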

1 Split

Split demo

import org.apache.flink.streaming.api.scala._

object transformerSensor {

  def main(args: Array[String]): Unit = {

    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    val dataStream: DataStream[String] = env.readTextFile("E:\\bigdata\\Flink2\\src\\main\\resources\\sensor.txt")
    // Parse each comma-separated line into a SensorReading
    val dStreamSensor: DataStream[SensorReading] = dataStream.map(
      line => {
        val splits = line.split(",")
        SensorReading(splits(0), splits(1).toLong, splits(2).toDouble)
      }
    )

    // Split the stream: readings below 50 are tagged "low", all others "high"
    val splitDStream: SplitStream[SensorReading] = dStreamSensor.split(sensorData => {
      if (sensorData.temperature < 50) Seq("low") else Seq("high")
    })

    // Select the corresponding streams
    val lowDStream: DataStream[SensorReading] = splitDStream.select("low")
    val highDStream: DataStream[SensorReading] = splitDStream.select("high")
    val allDStream: DataStream[SensorReading] = splitDStream.select("low","high")

    lowDStream.print().setParallelism(1)
    env.execute("transformer lowDStream test")
  }
}

As the code shows, the data stream is split into a "high" stream with temperatures of 50 or above, a "low" stream with temperatures below 50, and an "all" stream that contains both. Printing the low stream therefore displays only readings below 50 degrees.
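
The routing rule can be checked without a Flink cluster. The sketch below is a plain-Scala model of the split and select steps, not Flink code: `SplitModel`, `selector`, and `select` are hypothetical names, and the model assumes that select keeps an element when at least one of its tags is requested. It also makes the boundary case explicit: a reading of exactly 50 is tagged "high".

```scala
// Same shape as the SensorReading case class in the post, repeated so the sketch is self-contained.
case class SensorReading(id: String, timestamp: Long, temperature: Double)

object SplitModel {
  // Same rule as the demo: below 50 is tagged "low", everything else "high".
  def selector(r: SensorReading): Seq[String] =
    if (r.temperature < 50) Seq("low") else Seq("high")

  // Model of select(names*): keep the elements that carry at least one requested tag.
  // Assumption: this mirrors how select behaves on a split stream.
  def select(data: Seq[SensorReading], names: String*): Seq[SensorReading] =
    data.filter(r => selector(r).exists(tag => names.contains(tag)))

  // The first three rows of the test data.
  val readings: Seq[SensorReading] = Seq(
    SensorReading("sensor_1", 1624006065247L, 43.92789292115926),
    SensorReading("sensor_2", 1624006065247L, 97.45845640790921),
    SensorReading("sensor_3", 1624006065247L, 41.35949935067326)
  )
}
```

With these three rows, selecting "low" keeps sensor_1 and sensor_3, selecting "high" keeps sensor_2, and selecting both tags returns every reading.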

2 Merge operations

2.1 connect

// Merge the streams
val mapLowDStream: DataStream[(String, Double, String)] = lowDStream.map(x => (x.id, x.temperature, "Normal"))
val maphighLowDStream: DataStream[(String, Double, String)] = highDStream.map(x => (x.id, x.temperature, "Warning"))

val coDStream: ConnectedStreams[(String, Double, String), (String, Double, String)] = mapLowDStream.connect(maphighLowDStream)

val result: DataStream[(String, Double, String)] = coDStream.map(
  lowData => (lowData._1, lowData._2, "healthy"),
  warningData => (warningData._1, warningData._2, "warning")
)

result.print().setParallelism(1)
env.execute("transformer connect test")

2.2 union

val unionDStream: DataStream[(String, Double, String)] = mapLowDStream.union(maphighLowDStream)
unionDStream.print().setParallelism(1)
env.execute("transformer union test")

2.3 Differences between connect and union

1. union requires the streams to have the same element type. connect accepts two streams of different types, which are then mapped to a common type in the subsequent coMap.
2. connect operates on exactly two streams, while union can combine multiple streams.
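
The two differences can be illustrated with ordinary collections. This is a plain-Scala model under stated assumptions, not the Flink API: `MergeModel`, `union`, and `connectAndMap` are hypothetical names, and the model simply concatenates its inputs, whereas a real job interleaves elements in no fixed order.

```scala
object MergeModel {
  // union: every input has the same element type A, and any number of inputs can be combined.
  def union[A](first: Seq[A], rest: Seq[A]*): Seq[A] =
    first ++ rest.flatten

  // connect followed by coMap: exactly two inputs whose types A and B may differ,
  // with one function per input, both producing the common output type R.
  def connectAndMap[A, B, R](left: Seq[A], right: Seq[B])(f1: A => R, f2: B => R): Seq[R] =
    left.map(f1) ++ right.map(f2)
}
```

For example, `MergeModel.union(Seq(1), Seq(2), Seq(3, 4))` combines three inputs of one type, while `MergeModel.connectAndMap(Seq(1, 2), Seq("a"))(i => i.toString, s => s.toUpperCase)` accepts an Int input and a String input and maps both to String.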

Origin blog.csdn.net/Keyuchen_01/article/details/118498362