Big Data Course K21: Basic SparkSQL Syntax in Spark

Email of the author of the article: [email protected] Address: Huizhou, Guangdong

 ▲ Objectives of this chapter

⚪ Master using SparkSQL through DataFrame methods;

⚪ Master using SparkSQL through SQL statements;

1. SparkSQL basic syntax: using DataFrame methods

1. Query

df.select("id","name").show();

2. Query with conditions

df.select($"id",$"name").where($"name" === "bbb").show()
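The two queries above can be tried end to end with a small standalone program. This is a minimal sketch, assuming a local Spark installation; the object name `SelectWhereExample`, the helper `run`, and the sample rows are hypothetical and not part of the original article.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SelectWhereExample {
  // Builds a small DataFrame and keeps only the rows whose name is "bbb".
  def run(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables toDF and the $"column" syntax

    // Hypothetical sample data, used only for illustration.
    val df = Seq((1, "aaa"), (2, "bbb"), (3, "ccc")).toDF("id", "name")

    // === builds a Column equality expression (plain == would compare objects).
    df.select($"id", $"name").where($"name" === "bbb")
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("SelectWhereExample")
      .master("local[*]")
      .getOrCreate()

    run(spark).show()
    spark.stop()
  }
}
```

Note that `select("id", "name")` takes plain strings, while `select($"id", $"name")` takes Column objects; the `$` form is needed as soon as a column takes part in an expression such as `=== "bbb"`.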

3. Sorting query

orderBy/sort($"column name") sorts in ascending order

orderBy/sort($"column name".desc) sorts in descending order

orderBy/sort($"column 1", $"column 2".desc) sorts by column 1 in ascending order, then by column 2 in descending order

df.select($"id",$"name").orderBy($"name".desc).show

df.select($"id",$"name").sort($"name".desc).show

df.select($"id",$"name").sort($"id",$"name".desc).show
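To see how a two-column sort breaks ties, the sketch below sorts a few rows by id ascending and then by name descending. It assumes a local Spark installation; the object name `SortExample`, the helper `sorted`, and the sample rows are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

object SortExample {
  // Sorts by id ascending; rows with the same id are ordered by name descending.
  def sorted(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables toDF and the $"column" syntax

    // Hypothetical sample data with repeated ids so the second sort key matters.
    val df = Seq((2, "a"), (1, "a"), (2, "b"), (1, "c")).toDF("id", "name")

    // orderBy is an alias of sort, so either name can be used here.
    df.select($"id", $"name").sort($"id", $"name".desc)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("SortExample")
      .master("local[*]")
      .getOrCreate()

    sorted(spark).show()
    spark.stop()
  }
}
```

With this data the rows come back as (1, c), (1, a), (2, b), (2, a): the first key decides the overall order and the second key only orders rows that share the same id.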

4. Group query

groupBy("column name", ...).max("column name") returns the maximum value in each group

groupBy("column name", ...).min("column name") returns the minimum value in each group

groupBy("column name", ...).avg("column name") returns the average in each group

groupBy("column name", ...).sum("column name") returns the sum in each group

groupBy("column name", ...).count() returns the number of rows in each group

groupBy("column name", ...).agg(...) applies several aggregate functions in one call

scala> val rdd = sc.makeRDD(List((1,"a","bj",100),(2,"b","sh",80),(3,"c","gz",50),(4,"d","bj",45)))

scala> val df = rdd.toDF("id","name","addr","score")

scala> df.groupBy("addr").count().show()
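The agg method listed above has no example in the session, so here is a minimal sketch that computes several aggregates per addr in one call, using the same four rows. It assumes a local Spark installation; the object name `GroupByAggExample`, the helper `scoreStats`, and the output column aliases are hypothetical, and the data is built with `Seq(...).toDF` instead of `sc.makeRDD` purely for brevity.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{avg, count, max, min, sum}

object GroupByAggExample {
  // Computes max, min, average, sum and row count of score for each addr.
  def scoreStats(spark: SparkSession): DataFrame = {
    import spark.implicits._ // enables toDF

    // Same rows as the spark-shell session above.
    val df = Seq(
      (1, "a", "bj", 100),
      (2, "b", "sh", 80),
      (3, "c", "gz", 50),
      (4, "d", "bj", 45)
    ).toDF("id", "name", "addr", "score")

    df.groupBy("addr")
      .agg(
        max("score").as("max_score"),
        min("score").as("min_score"),
        avg("score").as("avg_score"),
        sum("score").as("sum_score"),
        count("id").as("n")
      )
      .orderBy("addr") // groupBy does not guarantee row order, so sort explicitly
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("GroupByAggExample")
      .master("local[*]")
      .getOrCreate()

    scoreStats(spark).show()
    spark.stop()
  }
}
```

For the bj group, which contains the scores 100 and 45, this yields a maximum of 100, a minimum of 45, an average of 72.5, a sum of 145 and a count of 2. The shortcut methods such as `groupBy("addr").max("score")` give the same values one aggregate at a time; agg is the better fit when several are needed together.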

Origin blog.csdn.net/u013955758/article/details/132567593