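Reading Data from Collections and Files
Flink is fast, flexible, and well suited to real-time processing, so this exercise uses temperature-sensor data collection as its scenario.
------Collection------
Step 1: Define the sensor case class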
case class SensorReading(id:String,timeStamp:Long,temperature:Double)
Step 2: Create the execution environment (streaming)
val env = StreamExecutionEnvironment.getExecutionEnvironment
Step 3: Define the source collection
val sensorList = List (
SensorReading("Sensor_1", 1547718199, 36.0),
SensorReading("Sensor_2", 1547718257, 35.2),
SensorReading("Sensor_3", 1547718358, 34.8),
SensorReading("Sensor_4", 1547718455, 32.8),
SensorReading("Sensor_5", 1547718587, 32.5),
SensorReading("Sensor_6", 1547718698, 31.3),
SensorReading("Sensor_7", 1547718785, 37.9)
)
Step 4: Read data from the collection
val sensorData: DataStream[SensorReading] = env.fromCollection(sensorList)
// Alternatively, use the fromElements method
val sensorData: DataStream[SensorReading] = env.fromElements(
SensorReading("Sensor_1", 1547718199, 36.0),
SensorReading("Sensor_2", 1547718257, 35.2),
SensorReading("Sensor_3", 1547718358, 34.8),
SensorReading("Sensor_4", 1547718455, 32.8),
SensorReading("Sensor_5", 1547718587, 32.5),
SensorReading("Sensor_6", 1547718698, 31.3),
SensorReading("Sensor_7", 1547718785, 37.9))
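fromCollection takes a single Seq that contains the data.
fromElements takes the data elements directly as arguments. The elements may be of different types, which can be handy for testing, but this is rarely used.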
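The relationship between the two methods can be sketched as follows. Because fromElements takes its arguments as varargs, an existing Seq can be handed to it with Scala's `: _*` expansion, which should give the same stream as fromCollection. This is only a minimal sketch that assumes the Flink Scala streaming API is on the classpath; the object name FromElementsSketch, the job name, and the print labels are made up for illustration.

```scala
import org.apache.flink.streaming.api.scala._

case class SensorReading(id: String, timeStamp: Long, temperature: Double)

// Hypothetical object name, used only for this illustration
object FromElementsSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val sensorList = List(
      SensorReading("Sensor_1", 1547718199, 36.0),
      SensorReading("Sensor_2", 1547718257, 35.2)
    )

    // fromCollection: pass the Seq itself
    val fromSeq: DataStream[SensorReading] = env.fromCollection(sensorList)

    // fromElements: pass the elements as varargs; `: _*` expands the Seq
    val fromVarargs: DataStream[SensorReading] = env.fromElements(sensorList: _*)

    // The labels only make the two outputs distinguishable
    fromSeq.print("fromCollection")
    fromVarargs.print("fromElements")

    env.execute("from-elements-sketch")
  }
}
```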
Step 5: Print
sensorData.print()
Step 6: Execute
env.execute("apitest")
Result:
2> SensorReading(Sensor_1,1547718199,36.0)
3> SensorReading(Sensor_2,1547718257,35.2)
2> SensorReading(Sensor_5,1547718587,32.5)
3> SensorReading(Sensor_6,1547718698,31.3)
1> SensorReading(Sensor_4,1547718455,32.8)
4> SensorReading(Sensor_3,1547718358,34.8)
4> SensorReading(Sensor_7,1547718785,37.9)
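The `N>` prefix on each line is the index of the parallel subtask that printed the record. In a local run the print sink uses the default parallelism, so lines from different subtasks interleave and the collection order is not preserved. A minimal sketch of one way to get ordered output, assuming it is acceptable to run the whole job with a single subtask (the object name OrderedPrintSketch and the job name are hypothetical):

```scala
import org.apache.flink.streaming.api.scala._

case class SensorReading(id: String, timeStamp: Long, temperature: Double)

// Hypothetical object name, used only for this illustration
object OrderedPrintSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    // Run every operator, including the print sink, as a single subtask
    env.setParallelism(1)

    val sensorData: DataStream[SensorReading] = env.fromCollection(List(
      SensorReading("Sensor_1", 1547718199, 36.0),
      SensorReading("Sensor_2", 1547718257, 35.2),
      SensorReading("Sensor_3", 1547718358, 34.8)
    ))

    sensorData.print()
    env.execute("ordered-print-sketch")
  }
}
```

With a single subtask the records should come out in collection order. If only the sink needs to be restricted, calling sensorData.print().setParallelism(1) is an alternative that leaves the rest of the job parallel.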
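Although the code only reads and prints the data, the implicit conversions still have to be imported (import org.apache.flink.streaming.api.scala._),
because the source methods require an implicit TypeInformation for the element type.
Because the data source is a bounded stream, the job ends as soon as all the data has been processed.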
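To make the role of the implicit TypeInformation more concrete, the sketch below builds it by hand and passes it explicitly to fromCollection. It is meant only as an illustration of what the wildcard import normally supplies behind the scenes, not as something one would usually write; the object name TypeInfoSketch and the value name readingInfo are hypothetical.

```scala
import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.streaming.api.scala._

case class SensorReading(id: String, timeStamp: Long, temperature: Double)

// Hypothetical object name, used only for this illustration
object TypeInfoSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    // Normally resolved implicitly: a description of SensorReading
    // that Flink uses to pick a serializer for the stream elements
    val readingInfo: TypeInformation[SensorReading] =
      createTypeInformation[SensorReading]

    // Pass the TypeInformation explicitly in the implicit parameter list
    val sensorData: DataStream[SensorReading] =
      env.fromCollection(List(SensorReading("Sensor_1", 1547718199, 36.0)))(readingInfo)

    sensorData.print()
    env.execute("type-info-sketch")
  }
}
```

Without the wildcard import, the compiler cannot find this implicit value, which is why even a read-and-print job fails to compile.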
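------File------
Step 1: Create the execution environment (streaming), as in the collection example
Step 2: Create the source data file
Create a txt file under the resources directory with the following content: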
"Sensor_1", 1547718199, 36.0
"Sensor_2", 1547718257, 35.2
"Sensor_3", 1547718358, 34.8
"Sensor_4", 1547718455, 32.8
"Sensor_5", 1547718587, 32.5
"Sensor_6", 1547718698, 31.3
"Sensor_7", 1547718785, 37.9
Step 3: Read the file
val inputpath = "F:\\idea1\\flinktest\\src\\main\\resources\\sensorfile.txt"
val sensorData2: DataStream[String] = env.readTextFile(inputpath)
Step 4: Print
Step 5: Execute
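Note that readTextFile yields a DataStream[String], one element per line, so the file source prints raw text rather than SensorReading objects. A possible follow-up is to parse each line with map, as sketched below. The helper parseLine and the object name FileParseSketch are hypothetical, and the sketch assumes every line has exactly three comma-separated fields; malformed lines would throw an exception.

```scala
import org.apache.flink.streaming.api.scala._

case class SensorReading(id: String, timeStamp: Long, temperature: Double)

// Hypothetical object name, used only for this illustration
object FileParseSketch {
  // Hypothetical helper: split one line on commas, strip whitespace and
  // any straight or curly double quotes, then build a SensorReading
  def parseLine(line: String): SensorReading = {
    val fields = line.split(",").map(_.trim.replaceAll("[\"“”]", ""))
    SensorReading(fields(0), fields(1).toLong, fields(2).toDouble)
  }

  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val inputpath = "F:\\idea1\\flinktest\\src\\main\\resources\\sensorfile.txt"

    // Each element of the stream is one line of the file
    val lines: DataStream[String] = env.readTextFile(inputpath)
    // Convert the raw lines into typed readings
    val readings: DataStream[SensorReading] = lines.map(line => parseLine(line))

    readings.print()
    env.execute("file-parse-sketch")
  }
}
```

Keeping the parsing in a plain function makes it easy to check outside of Flink before wiring it into the job.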
Complete code
package com.erke.apitest

import org.apache.flink.streaming.api.scala._

// Define the case class
case class SensorReading(id: String, timeStamp: Long, temperature: Double)

object SourceTest {
  def main(args: Array[String]): Unit = {
    // Create the execution environment
    val env: StreamExecutionEnvironment = StreamExecutionEnvironment.getExecutionEnvironment
    // Define the source collection
    val sensorList = List(
      SensorReading("Sensor_1", 1547718199, 36.0),
      SensorReading("Sensor_2", 1547718257, 35.2),
      SensorReading("Sensor_3", 1547718358, 34.8),
      SensorReading("Sensor_4", 1547718455, 32.8),
      SensorReading("Sensor_5", 1547718587, 32.5),
      SensorReading("Sensor_6", 1547718698, 31.3),
      SensorReading("Sensor_7", 1547718785, 37.9)
    )
    // Read data from the collection
    val sensorData: DataStream[SensorReading] = env.fromCollection(sensorList)
    // Read data from the file (one String per line)
    val inputpath = "F:\\idea1\\flinktest\\src\\main\\resources\\sensorfile.txt"
    val sensorData2: DataStream[String] = env.readTextFile(inputpath)
    // Print both streams
    sensorData.print()
    sensorData2.print()
    // Execute the job
    env.execute("apitest")
  }
}