Learn Spark SQL

Spark SQL Programming Guide (Python)
http://www.cnblogs.com/yurunmiao/p/4685310.html

This article introduces the Register Function of Spark SQL, which lets functions be created dynamically for use in SQL queries; in practice it works much like a Hive UDF.

Spark SQL provides us with powerful data analysis capabilities, which are mainly reflected in the following three aspects:

(1) A Spark RDD can be converted to a SchemaRDD, either by inferring the schema through reflection or by specifying the schema programmatically. Once the SchemaRDD is set up as a "data table", we can analyze the data with SQL statements, which saves a lot of coding work;
(2) Spark SQL lets us create custom SQL functions dynamically, as needed, while the application is running, which extends the data processing capabilities of SQL;
(3) A SchemaRDD supports all Spark RDD operations, so if SQL cannot express our computing logic, we can implement it with the rich Spark RDD API instead.
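The three points above can be sketched roughly as follows. This is a minimal illustration, not code from the linked article: it assumes a local PySpark installation and uses the current API, in which SchemaRDD has been renamed DataFrame (since Spark 1.3). The table name people, the sample rows, and the helper name_length are hypothetical, made up for this example.

```python
def name_length(name):
    """Plain Python function that is later registered as a SQL function.

    Hypothetical helper for illustration only.
    """
    return len(name) if name is not None else 0


def run_demo():
    # Imported here so that name_length stays usable without Spark installed.
    from pyspark.sql import Row, SparkSession
    from pyspark.sql.types import IntegerType

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("spark-sql-sketch")
        .getOrCreate()
    )
    try:
        # (1) RDD -> DataFrame (formerly SchemaRDD); the schema is inferred
        #     from the Row fields, then the result is exposed as a table.
        rdd = spark.sparkContext.parallelize(
            [Row(name="alice", age=30), Row(name="bob", age=17)]
        )
        people = spark.createDataFrame(rdd)
        people.createOrReplaceTempView("people")

        # (2) Register a custom SQL function at runtime, similar to a Hive UDF.
        spark.udf.register("name_length", name_length, IntegerType())
        adults = spark.sql(
            "SELECT name, name_length(name) AS n FROM people WHERE age >= 18"
        ).collect()

        # (3) Fall back to the RDD API for logic that is awkward in SQL.
        total_age = people.rdd.map(lambda row: row.age).sum()

        return [(r.name, r.n) for r in adults], total_age
    finally:
        spark.stop()
```

Calling run_demo() starts a local Spark session, runs the query, and returns the selected rows together with the sum computed through the RDD API. The function registered in step (2) is visible only within that session.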

Processing JSON data with Spark (Python)
http://www.cnblogs.com/yurunmiao/p/4682315.html

https://spark.apache.org/docs/latest/sql-programming-guide.html
