1. JDBC
Spark SQL can create a DataFrame by reading data from a relational database over JDBC. After a series of transformations on the DataFrame, the results can also be written back to the relational database.
1.1. SparkSql loads data from MySQL
1.1.1 Writing the code in IDEA
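A minimal sketch of the IDEA program is shown below. The connection details (URL, driver, table, user, password) are the ones used later in the spark-shell example; the object name and the local master setting are assumptions for illustration.

```scala
import java.util.Properties
import org.apache.spark.sql.SparkSession

// Hypothetical example class: reads the iplocation table from MySQL over JDBC
object SparkSqlFromMysql {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SparkSqlFromMysql")
      .master("local[2]")   // local mode for testing inside IDEA
      .getOrCreate()

    val props = new Properties()
    props.setProperty("user", "root")
    props.setProperty("password", "123456")
    props.setProperty("driver", "com.mysql.jdbc.Driver")

    // Load the iplocation table as a DataFrame
    val mysqlDF = spark.read.jdbc(
      "jdbc:mysql://192.168.200.150:3306/spark", "iplocation", props)

    mysqlDF.printSchema()
    mysqlDF.show()

    spark.stop()
  }
}
```

Note that the MySQL Connector/J JAR must be on the classpath (e.g. as a Maven dependency) for the driver class to be found.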
View the execution result:
1.1.2 Running in spark-shell
(1) Start spark-shell (the MySQL connector driver JAR must be specified):
spark-shell \
--master spark://hdp-node-01:7077 \
--executor-memory 1g \
--total-executor-cores 2 \
--jars /opt/bigdata/hive/lib/mysql-connector-java-5.1.35.jar \
--driver-class-path /opt/bigdata/hive/lib/mysql-connector-java-5.1.35.jar
(2) Load the data from MySQL:
val mysqlDF = spark.read.format("jdbc")
  .options(Map(
    "url" -> "jdbc:mysql://192.168.200.150:3306/spark",
    "driver" -> "com.mysql.jdbc.Driver",
    "dbtable" -> "iplocation",
    "user" -> "root",
    "password" -> "123456"))
  .load()
(3) Execute a query:
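For example, the loaded DataFrame can be inspected and queried in the shell. The column names of iplocation are not shown in this document, so the queries below stay generic; the temporary view name is an assumption.

```scala
// Inspect the schema and contents of the DataFrame loaded from MySQL
mysqlDF.printSchema()
mysqlDF.show()

// Register a temporary view so the table can be queried with SQL
mysqlDF.createOrReplaceTempView("iplocation")
spark.sql("select * from iplocation limit 10").show()
```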
1.2. SparkSql writes data to MySQL
1.2.1 Writing the code in IDEA
(1) Write the code
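A sketch of the itcast.sql.SparkSqlToMysql program that is submitted below. It reads a text file (the path is passed as the first argument, e.g. /person.txt) and appends it to a MySQL table. The field layout of person.txt (id, name, age separated by spaces) and the target table name "person" are assumptions; adjust them to the actual data.

```scala
package itcast.sql

import java.util.Properties
import org.apache.spark.sql.{SaveMode, SparkSession}

// Assumed record layout of person.txt: "id name age", space-separated
case class Person(id: Int, name: String, age: Int)

object SparkSqlToMysql {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("SparkSqlToMysql")
      .getOrCreate()
    import spark.implicits._

    // args(0) is the input path passed on the spark-submit command line
    val personDF = spark.sparkContext.textFile(args(0))
      .map(_.split(" "))
      .map(p => Person(p(0).toInt, p(1), p(2).toInt))
      .toDF()

    val props = new Properties()
    props.setProperty("user", "root")
    props.setProperty("password", "123456")
    props.setProperty("driver", "com.mysql.jdbc.Driver")

    // Append the DataFrame into the MySQL table (created if it does not exist)
    personDF.write.mode(SaveMode.Append)
      .jdbc("jdbc:mysql://192.168.200.150:3306/spark", "person", props)

    spark.stop()
  }
}
```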
(2) Package with Maven
The project can be packaged using IDEA's Maven tooling.
(3) Submit the JAR to the Spark cluster:
spark-submit \
--class itcast.sql.SparkSqlToMysql \
--master spark://hdp-node-01:7077 \
--executor-memory 1g \
--total-executor-cores 2 \
--jars /opt/bigdata/hive/lib/mysql-connector-java-5.1.35.jar \
--driver-class-path /opt/bigdata/hive/lib/mysql-connector-java-5.1.35.jar \
/root/original-spark-2.0.2.jar /person.txt
(4) View the data table in MySQL
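The written data can be checked with the mysql client. The table name "person" is an assumption; use whatever table name the job actually writes to.

```shell
# Connect to the spark database on the MySQL host and inspect the target table
mysql -h 192.168.200.150 -uroot -p123456 -e "select * from spark.person;"
```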