Spark SQL: Execution Modes

Copyright notice: do not copy without permission. https://blog.csdn.net/qq_36235275/article/details/82502416

DSL-style syntax

scala> val peopleDF = rdd.map{x => val strs=x.split(",");People(strs(0),strs(1).trim.toInt)}.toDF
peopleDF: org.apache.spark.sql.DataFrame = [name: string, age: int]
scala> peopleDF.select("name").show
+-------+
|   name|
+-------+
|Michael|
|   Andy|
| Justin|
+-------+
scala> peopleDF.filter($"age">20).show
+-------+---+
|   name|age|
+-------+---+
|Michael| 29|
|   Andy| 30|
+-------+---+
scala> peopleDF.groupBy("age").count.show
+---+-----+
|age|count|
+---+-----+
| 19|    1|
| 29|    1|
| 30|    1|
+---+-----+
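
The transcript above uses `rdd` and `People` without showing where they come from. The following is a minimal, self-contained sketch of one way the same DSL calls could be run outside the shell. The `People` case class, the inlined input lines, the `parseLine` helper, and the `local[*]` master are all assumptions made for illustration, not taken from the original post.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical record type matching the [name: string, age: int] schema above.
case class People(name: String, age: Int)

object DslExample {
  // Hypothetical helper: turns a "name, age" line into a People record.
  def parseLine(line: String): People = {
    val strs = line.split(",")
    People(strs(0), strs(1).trim.toInt)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("DslExample")
      .master("local[*]") // assumption: run locally for the demo
      .getOrCreate()
    import spark.implicits._ // enables .toDF and the $"col" syntax

    // Assumed input, inlined here instead of being read from a file.
    val rdd = spark.sparkContext.parallelize(
      Seq("Michael, 29", "Andy, 30", "Justin, 19"))

    val peopleDF = rdd.map(parseLine).toDF()

    peopleDF.select("name").show()          // project a single column
    peopleDF.filter($"age" > 20).show()     // keep rows where age > 20
    peopleDF.groupBy("age").count().show()  // count rows per distinct age

    spark.stop()
  }
}
```

In spark-shell the `spark` session and `spark.implicits._` are already available, which is why the transcript can call `.toDF` and `$"age"` directly.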

SQL-style syntax

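## Create a view first; SQL statements are then run through the SparkSession.
## createOrReplaceTempView: session-scoped. The view is visible only inside the session that created it and is dropped when that session stops.
## createOrReplaceGlobalTempView: application-scoped. The view is shared across sessions through the global_temp database and is dropped when the SparkContext shuts down.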
scala> spark.sql("select * from people").show
+-------+---+
|   name|age|
+-------+---+
|Michael| 29|
|   Andy| 30|
| Justin| 19|
+-------+---+
scala> spark.newSession.sql("select * from people").show
## Fails: the temporary view people belongs to the original session, so a new session cannot see it
scala> spark.newSession.sql("select * from global_temp.people").show
+-------+---+
|   name|age|
+-------+---+
|Michael| 29|
|   Andy| 30|
| Justin| 19|
+-------+---+
scala> spark.sql("select * from global_temp.people").show
+-------+---+
|   name|age|
+-------+---+
|Michael| 29|
|   Andy| 30|
| Justin| 19|
+-------+---+
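
The transcript queries both `people` and `global_temp.people`, but the calls that register those views are not shown. The sketch below is one way the scoping behaviour could be reproduced end to end. The object name, the `registerViews` and `canQuery` helpers, the inlined sample rows, and the `local[*]` master are assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession
import scala.util.Try

object TempViewScopes {
  // Hypothetical helper: registers the same data under both view scopes.
  def registerViews(spark: SparkSession): Unit = {
    import spark.implicits._
    val peopleDF = Seq(("Michael", 29), ("Andy", 30), ("Justin", 19))
      .toDF("name", "age")
    peopleDF.createOrReplaceTempView("people")        // session-scoped
    peopleDF.createOrReplaceGlobalTempView("people")  // application-scoped
  }

  // Hypothetical helper: true if the table name resolves in the given session.
  // spark.sql analyses the query eagerly, so an unknown view fails here.
  def canQuery(session: SparkSession, table: String): Boolean =
    Try(session.sql(s"select * from $table")).isSuccess

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("TempViewScopes")
      .master("local[*]") // assumption: run locally for the demo
      .getOrCreate()

    registerViews(spark)
    val other = spark.newSession() // shares the SparkContext, not the temp views

    println(s"people in original session:         ${canQuery(spark, "people")}")
    println(s"people in new session:              ${canQuery(other, "people")}")
    println(s"global_temp.people in new session:  ${canQuery(other, "global_temp.people")}")

    spark.stop()
  }
}
```

Global temporary views always have to be qualified with the `global_temp` database name, even from the session that created them, which matches the last two queries in the transcript.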
