Flink 1.7 SQL Batch Processing Examples

This article walks through Flink SQL examples on the DataSet (batch) API, written in Scala.
Scan / Select
Description: query all rows of a table

package flink_sql

import org.apache.flink.api.scala._
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._

/**
  * Flink SQL DataSet demo
  */
object FlinkSqlDataSet {
  def main(args: Array[String]): Unit = {
    // Get the execution environment
    val env = ExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)
    // Build the data set: (name, age, sex)
    val dataSet = env.fromElements(("小明", 18, "男"), ("小红", 19, "女"), ("张三", 9, "男"), ("李四", 26, "男"))
    // Get the table environment
    val tableEnv = TableEnvironment.getTableEnvironment(env)
    // Register the data set as a table named Student
    tableEnv.registerDataSet("Student", dataSet, 'name, 'age, 'sex)

    tableEnv.sqlQuery("select name,age,sex from Student")
      .first(100) // Creates a new DataSet containing the first 100 elements of this DataSet.
      .print()

    /**
      * Output:
      * 小明,18,男
      * 小红,19,女
      * 张三,9,男
      * 李四,26,男
      */
  }
}
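
The query result of sqlQuery is a Table, while first(100) is a DataSet method, so the chained call relies on a conversion from Table to DataSet. The sketch below spells that conversion out with toDataSet[Row]. It assumes the same Flink 1.7 dependencies as the program above; the object name FlinkSqlExplicitConversion and the shortened two-row data set are made up for illustration.

```scala
package flink_sql

import org.apache.flink.api.scala._
import org.apache.flink.table.api.TableEnvironment
import org.apache.flink.table.api.scala._
import org.apache.flink.types.Row

// Hypothetical variant of the demo that converts the Table explicitly.
object FlinkSqlExplicitConversion {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1)
    // Shortened sample data, same (name, age, sex) layout as the article
    val dataSet = env.fromElements(("小明", 18, "男"), ("小红", 19, "女"))
    val tableEnv = TableEnvironment.getTableEnvironment(env)
    tableEnv.registerDataSet("Student", dataSet, 'name, 'age, 'sex)

    // sqlQuery returns a Table
    val result = tableEnv.sqlQuery("select name,age,sex from Student")
    // Convert the Table into a DataSet of Row before using DataSet methods
    val rows: DataSet[Row] = tableEnv.toDataSet[Row](result)
    rows.first(100).print()
  }
}
```

Keeping the Table in a named value also makes it easier to swap in any of the SQL statements listed later without touching the printing code.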

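The SQL statement in the FlinkSqlDataSet program can be replaced with any of the following to get the corresponding functionality: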

as (table)
Description: give the table an alias

tableEnv.sqlQuery(s"select s1.name,s1.age FROM Student as s1")

as (column)
Description: give columns an alias (the as keyword is optional)

tableEnv.sqlQuery(s"select name a,age as b FROM Student")

Where / Filter
Description: filter the rows of a table with a condition on a column

tableEnv.sqlQuery(s"select name,age,sex FROM Student where sex = '女'")

between and (where)
Description: keep rows whose column value lies in an inclusive range, start <= value <= end

tableEnv.sqlQuery(s"select name,age,sex FROM Student where age between 20 and 35")
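
To make the inclusive bounds concrete, here is a minimal sketch that mirrors the between filter with plain Scala collections on the same four rows. It is not Flink code; the object BetweenSketch and the helper ageBetween are hypothetical names used only for this illustration.

```scala
// Plain Scala sketch (no Flink): mirrors `where age between 20 and 35`.
object BetweenSketch {
  // Same rows as the Student table in the article: (name, age, sex)
  val students: Seq[(String, Int, String)] =
    Seq(("小明", 18, "男"), ("小红", 19, "女"), ("张三", 9, "男"), ("李四", 26, "男"))

  // BETWEEN is inclusive on both ends: lo <= age <= hi
  def ageBetween(lo: Int, hi: Int): Seq[(String, Int, String)] =
    students.filter { case (_, age, _) => lo <= age && age <= hi }

  def main(args: Array[String]): Unit =
    ageBetween(20, 35).foreach(println) // prints (李四,26,男)
}
```

On the sample data only one row has an age in the range 20 to 35, so the query should return a single row.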

Sum
Description: sum a column over all rows

tableEnv.sqlQuery(s"select sum(age) FROM Student")

max(min)
Description: find the maximum (minimum) value of a column

tableEnv.sqlQuery(s"select max(age) FROM Student")

sum (group by)
Description: sum ages grouped by sex

tableEnv.sqlQuery(s"select sex,sum(age) from Student group by sex")

/**
  * Output:
  * 女,19
  * 男,53
  */

group by having
Description: filter groups by a condition on the aggregated value

tableEnv.sqlQuery(s"select sex,sum(age) from Student group by sex having sum(age)>20")
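
The difference between where and having is that having is applied after aggregation. The sketch below mirrors the query with plain Scala collections on the article's four rows, as a rough way to see which groups survive. It is not Flink code; HavingSketch and sumAgeBySexHaving are hypothetical names.

```scala
// Plain Scala sketch (no Flink): mirrors
// `select sex,sum(age) from Student group by sex having sum(age)>20`.
object HavingSketch {
  // Same rows as the Student table in the article: (name, age, sex)
  val students: Seq[(String, Int, String)] =
    Seq(("小明", 18, "男"), ("小红", 19, "女"), ("张三", 9, "男"), ("李四", 26, "男"))

  // Group by sex, sum the ages per group, then keep groups whose sum exceeds the threshold
  def sumAgeBySexHaving(minExclusive: Int): Map[String, Int] =
    students
      .groupBy { case (_, _, sex) => sex }
      .map { case (sex, rows) => sex -> rows.map(_._2).sum }
      .filter { case (_, total) => total > minExclusive }

  def main(args: Array[String]): Unit =
    sumAgeBySexHaving(20).foreach { case (sex, total) => println(s"$sex,$total") } // prints 男,53
}
```

The 女 group sums to 19, which does not exceed 20, so only the 男 group (18 + 9 + 26 = 53) remains.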

distinct
Description: remove duplicate values over one or more columns

tableEnv.sqlQuery("select distinct name FROM Student")

Reposted from blog.csdn.net/kzw11/article/details/88576323