Spark SQL Hive integration
Hive needs a metastore, and that metadata is stored in MySQL. Here Spark's own standalone cluster takes the place of YARN, so don't start YARN; we only need to start HDFS.
Prerequisites: a working Hive and a working Spark; if your Hadoop is highly available you also need ZooKeeper, and of course DFS (part of Hadoop).
I have three nodes: node01, node02, node03.
PS: DATEDIFF(A, B) computes the difference between two dates, in days.
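As a quick illustration of that note: in Hive/Spark SQL, DATEDIFF(end, start) returns the number of days from start to end. The same arithmetic can be reproduced in plain shell with GNU date (the sample dates here are mine, chosen for illustration):

```shell
# DATEDIFF('2020-01-10', '2020-01-01') evaluates to 9 in Hive/Spark SQL.
# Reproducing the arithmetic with GNU date (UTC avoids DST edge cases):
a=$(date -u -d '2020-01-10' +%s)   # end date as epoch seconds
b=$(date -u -d '2020-01-01' +%s)   # start date as epoch seconds
days=$(( (a - b) / 86400 ))        # whole days between the two dates
echo "$days"                       # prints 9
```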
node01
First, copy Hive's hive-site.xml into Spark's conf directory:
cp /export/servers/hive-1.1.0-cdh5.14.0/conf/hive-site.xml /export/servers/spark-2.0.2/conf/
Then scp it from Spark's conf directory to the other nodes:
scp hive-site.xml node02:$PWD
scp hive-site.xml node03:$PWD
Copy the MySQL JDBC driver into Spark's jars directory:
cp /export/servers/hive-1.1.0-cdh5.14.0/lib/mysql-connector-java-5.1.38.jar /export/servers/spark-2.0.2/jars/
Next, distribute the driver to the other nodes' Spark directories. First enter the spark/jars directory:
cd /export/servers/spark-2.0.2/jars/
Then copy it over (I have passwordless SSH and hostname-to-IP mappings set up):
scp mysql-connector-java-5.1.38.jar node02:$PWD
scp mysql-connector-java-5.1.38.jar node03:$PWD
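With more nodes, the two scp lines above generalize to a loop. The sketch below only prints the commands it would run (a dry run, safe to try anywhere); drop the echo to actually copy:

```shell
# Dry-run fan-out of the MySQL driver to the worker nodes.
# "echo" makes this print each scp command instead of running it.
JAR=mysql-connector-java-5.1.38.jar
DEST=/export/servers/spark-2.0.2/jars/
for node in node02 node03; do
  echo scp "$JAR" "$node:$DEST"
done
```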
Because the test queries read files stored on HDFS, you need to start DFS; YARN is not needed.
Enter Hadoop's sbin directory and start it:
./start-dfs.sh
Start the Spark cluster (I've packaged the start commands into a script).
Start it via the script:
./spark-start.sh 2
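The script itself isn't shown here. As a rough sketch only (the path and contents are my assumption, not the author's actual script, and the `2` argument above follows the author's own convention), a minimal wrapper might look like this; it is written to /tmp for illustration rather than run against a live cluster:

```shell
# Hypothetical sketch of a spark-start.sh wrapper (assumed contents).
cat > /tmp/spark-start-demo.sh <<'EOF'
#!/bin/bash
# start-all.sh launches the Spark master plus every worker in conf/slaves
/export/servers/spark-2.0.2/sbin/start-all.sh
EOF
chmod +x /tmp/spark-start-demo.sh
```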
Test:
spark-sql \
--master spark://node01:7077 \
--executor-memory 1g \
--total-executor-cores 2 \
--conf spark.sql.warehouse.dir=hdfs://node01:8020/user/hive/warehouse/myhive.db
It failed.
The cause of the error was a bad configuration: I had previously integrated Hive with Impala, but Impala was not running.
Solution
Go to node01.
Open hive-site.xml under hive/conf.
Comment out the Impala-related configuration.
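If you prefer doing this from the command line, here is a sketch that strips the stale property with sed, demonstrated on a throwaway file. The property name hive.metastore.uris is my guess at a typical Impala-era leftover; check your own hive-site.xml for what was actually added:

```shell
# Build a throwaway hive-site.xml fragment (hypothetical contents).
cat > /tmp/hive-site-demo.xml <<'EOF'
<configuration>
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://node03:9083</value>
  </property>
</configuration>
EOF
# Delete the whole <property> block that mentions hive.metastore.uris.
# (GNU sed: accumulate lines until </property>, then drop the block.)
sed -i '/<property>/{
:a
N
/<\/property>/!ba
/hive.metastore.uris/d
}' /tmp/hive-site-demo.xml
cat /tmp/hive-site-demo.xml
```

On a real cluster you would run this against Spark's copy of hive-site.xml (and re-distribute it), or simply wrap the block in `<!-- -->` comments by hand as described above.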
Restart spark-sql:
spark-sql \
--master spark://node01:7077 \
--executor-memory 1g \
--total-executor-cores 2 \
--conf spark.sql.warehouse.dir=hdfs://node01:8020/user/hive/warehouse/myhive.db
Success.