Ways to map a DataFrame to a table

Temporary view: scoped to the SparkSession

  Register the dataset as a temporary view with DF.createTempView("person")

  Then query it through spark.sql(.....)
  Code example:
    df_rdd.createTempView("person")
    spark.sql("select * from person where name like '%o%'").show()
      +------+---+-------+
      |  name|age|address|
      +------+---+-------+
      |   joe| 39|     CO|
      |alison| 35|     NY|
      |   bob| 71|     CA|
      +------+---+-------+
    spark.newSession().sql("select * from person where name like '%o%'").show()
      Throws: Exception in thread "main" org.apache.spark.sql.AnalysisException: Table or view not found: person; line 1 pos 15

      Reason: a temporary view is not global. It is visible only in the SparkSession that registered it, so a session created with newSession() cannot resolve it.
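
The session scoping described above can be checked with a small standalone program. This is only a sketch under assumptions: the local master, the app name, the object name TempViewScope, and the helper canSee are hypothetical and not part of the original post; the three rows are copied from the output shown above. createOrReplaceTempView is used in place of createTempView so that rerunning the code does not fail on an already registered view.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}

object TempViewScope {
  // Hypothetical helper: true if `view` can be resolved by `session`.
  // An unresolved table or view surfaces as an AnalysisException from sql().
  def canSee(session: SparkSession, view: String): Boolean =
    try {
      session.sql(s"select * from $view").collect()
      true
    } catch {
      case _: AnalysisException => false
    }

  def main(args: Array[String]): Unit = {
    // Assumption: a local master and an arbitrary app name for a standalone run.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("temp-view-scope")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the output in the post.
    val df = Seq(("joe", 39, "CO"), ("alison", 35, "NY"), ("bob", 71, "CA"))
      .toDF("name", "age", "address")

    // Register the view in the current session only.
    df.createOrReplaceTempView("person")

    println(canSee(spark, "person"))              // session that registered the view
    println(canSee(spark.newSession(), "person")) // sibling session, same application

    spark.stop()
  }
}
```

Under these assumptions the first check should report that the view is visible and the second that it is not, which matches the AnalysisException shown above.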

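Global temporary view: scoped to the Spark application

  Register the dataset as a global temporary view with DF.createGlobalTempView("person")
  Global temporary views live in the system database global_temp, so queries must use the qualified name global_temp.person
  Code example: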
    df_rdd.createGlobalTempView("person")
    spark.sql("select * from global_temp.person where name like '%o%'").show()
    spark.newSession().sql("select * from global_temp.person where name like '%o%'").show()
      +------+---+-------+
      |  name|age|address|
      +------+---+-------+
      |   joe| 39|     CO|
      |alison| 35|     NY|
      |   bob| 71|     CA|
      +------+---+-------+
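
For comparison, the cross-session visibility of a global temporary view can be sketched the same way. As before, this is an illustration under assumptions rather than the original author's code: the local master, the app name, the object name GlobalTempViewScope, and the helper countMatches are hypothetical, and the rows are copied from the output above. The sketch also drops the view at the end with spark.catalog.dropGlobalTempView, since a global temporary view otherwise stays registered until the application stops.

```scala
import org.apache.spark.sql.SparkSession

object GlobalTempViewScope {
  // Hypothetical helper: number of rows whose name contains "o",
  // read through the qualified name global_temp.person.
  def countMatches(session: SparkSession): Long =
    session
      .sql("select name from global_temp.person where name like '%o%'")
      .count()

  def main(args: Array[String]): Unit = {
    // Assumption: a local master and an arbitrary app name for a standalone run.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("global-temp-view-scope")
      .getOrCreate()
    import spark.implicits._

    // Sample rows taken from the output in the post.
    val df = Seq(("joe", 39, "CO"), ("alison", 35, "NY"), ("bob", 71, "CA"))
      .toDF("name", "age", "address")

    // Register the view once for the whole application.
    df.createOrReplaceGlobalTempView("person")

    // Both the registering session and a new session resolve the view.
    println(s"registering session: ${countMatches(spark)}")
    println(s"new session: ${countMatches(spark.newSession())}")

    // Remove the view explicitly; returns true if a view was dropped.
    println(s"dropped: ${spark.catalog.dropGlobalTempView("person")}")

    spark.stop()
  }
}
```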


Reposted from www.cnblogs.com/lyr999736/p/10204524.html