SparkSQL API

This post demonstrates querying data through the SparkSQL API.

Implementation steps:

1) Open the Scala IDE development environment and create a Scala project.

2) Add the Spark dependency jar packages to the project (a dependency sketch follows this step list).

3) Create a package and an object class under it.

4) Write the code.
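
If the dependencies are managed with sbt rather than by adding jars manually, a minimal build.sbt might look like the sketch below. The Spark and Scala version numbers are assumptions chosen to match the Spark 1.x API (SQLContext, registerTempTable) used in this post.

// build.sbt -- minimal sketch; version numbers are assumptions
name := "sqlDemo01"

scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.3" % "provided",
  "org.apache.spark" %% "spark-sql"  % "1.6.3" % "provided"
)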

Sample code:

package cn.tedu.sparksql

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.sql.SQLContext

object Demo01 {

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("spark://hadoop01:7077").setAppName("sqlDemo01")
    val sc = new SparkContext(conf)
    val sqlContext = new SQLContext(sc)

    // Build an RDD of (id, name) tuples
    val rdd = sc.makeRDD(List((1, "zhang"), (2, "li"), (3, "wang")))

    // Convert the RDD to a DataFrame and register it as a temporary table
    import sqlContext.implicits._
    val df = rdd.toDF("id", "name")
    df.registerTempTable("tabx")

    // Run a SQL query against the temporary table
    val df2 = sqlContext.sql("select * from tabx order by name")

    // Convert the DataFrame back to an RDD and write the result to a
    // local Linux directory; an HDFS path would work as well
    val rdd2 = df2.rdd
    rdd2.saveAsTextFile("file:///home/software/result")
  }
}
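
The toDF("id", "name") call above names the columns positionally. An alternative sketch, assuming the same SQLContext and implicits import as in the code above, uses a case class so that the column names and types come from the class definition; the case class Person and the table name person are illustrative, not from the original post.

// Define the case class at the top level (outside main), so Spark's
// reflection-based schema inference can see it
case class Person(id: Int, name: String)

// Inside main, after the implicits import:
val personRdd = sc.makeRDD(List(Person(1, "zhang"), Person(2, "li"), Person(3, "wang")))
val personDf = personRdd.toDF()   // column names come from the case class fields
personDf.registerTempTable("person")
sqlContext.sql("select name from person where id > 1").show()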

   

5) Build the jar package and upload it to the Linux virtual machine.
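
A minimal sketch of this step, assuming an sbt build; the jar file name and the destination host and path are assumptions, not from the original post.

# Build the jar (with sbt defaults it lands under target/scala-2.10/)
sbt package

# Upload it next to Spark's bin directory; host and path are illustrative
scp target/scala-2.10/*.jar root@hadoop01:/home/software/spark/bin/sqlDemo01.jar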

6) In Spark's bin directory, execute:

sh spark-submit --class cn.tedu.sparksql.Demo01 ./sqlDemo01.jar
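
spark-submit also accepts master and resource options on the command line; a sketch with a few common flags (the values are illustrative assumptions for a small demo):

sh spark-submit \
  --class cn.tedu.sparksql.Demo01 \
  --master spark://hadoop01:7077 \
  --executor-memory 512m \
  --total-executor-cores 2 \
  ./sqlDemo01.jar

Since the code already calls setMaster, the --master flag here is redundant but harmless; in practice the master is usually left out of the code and passed here instead.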

7) Finally, verify the result.
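
Once the job finishes, the sorted rows can be inspected in the output directory named in the code (a sketch; the part-file names depend on the number of partitions):

ls /home/software/result
cat /home/software/result/part-00000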


Source: www.cnblogs.com/shuzhiwei/p/11323157.html