Spark SQL Hive integration


Hive needs a metastore, and in this setup the Hive metadata lives in MySQL. Spark takes the place of YARN here, so we do not start YARN; we only need to start HDFS.

Prerequisites: a working Hive installation and a working Spark installation. For a highly available Hadoop cluster you also need ZooKeeper, as well as HDFS (part of Hadoop).

I have three nodes: node01, node02, and node03.

PS: DATEDIFF(A, B) returns the number of days between dates A and B (A minus B).
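As a quick local illustration of that semantics (result = A minus B, in days), the same calculation can be reproduced with GNU `date`; this is just a sketch for intuition, not something run on the cluster:

```shell
# DATEDIFF('2019-07-26', '2019-07-20') semantics, reproduced with GNU date
# (assumes GNU coreutils `date` with the -d flag).
a=$(date -d '2019-07-26' +%s)   # later date as epoch seconds
b=$(date -d '2019-07-20' +%s)   # earlier date as epoch seconds
echo $(( (a - b) / 86400 ))     # days between the two dates: 6
```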

 

node01

First, copy Hive's hive-site.xml into Spark's conf directory:

cp /export/servers/hive-1.1.0-cdh5.14.0/conf/hive-site.xml /export/servers/spark-2.0.2/conf/

Then, from Spark's conf directory, scp hive-site.xml to the other nodes:

scp hive-site.xml node02:$PWD
scp hive-site.xml node03:$PWD
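The two scp commands above can also be written as a loop. Below is a dry-run sketch that only prints the commands (remove the `echo` to actually copy; passwordless SSH is assumed, as noted later in this post):

```shell
# Dry run: print the scp command for each worker node in this cluster.
# Drop `echo` to perform the real copy.
for host in node02 node03; do
  echo scp hive-site.xml "$host:\$PWD"
done
```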

 

Copy the MySQL driver jar into Spark's jars directory (Hive's metadata already lives in MySQL, and I had copied the MySQL driver into hive/lib when setting Hive up):

cp /export/servers/hive-1.1.0-cdh5.14.0/lib/mysql-connector-java-5.1.38.jar /export/servers/spark-2.0.2/jars/

 

Copy the MySQL driver to the Spark jars directory on the other nodes.

First, change into the spark/jars directory:

cd /export/servers/spark-2.0.2/jars/

Then copy it (I have passwordless SSH login and hostname-to-IP mappings configured):

scp mysql-connector-java-5.1.38.jar node02:$PWD
scp mysql-connector-java-5.1.38.jar node03:$PWD

Because the test will query files on HDFS, you need to start HDFS; there is no need to start YARN.

Go to the hadoop/sbin directory and start it:

./start-dfs.sh

 

Start the Spark cluster (I wrapped the startup commands in a script).

Start via the script:

 

./spark-start.sh 2

Test:

spark-sql \
--master spark://node01:7077 \
--executor-memory 1g \
--total-executor-cores 2 \
--conf spark.sql.warehouse.dir=hdfs://node01:8020/user/hive/warehouse/myhive.db

It failed.

 

 

 

The cause was one wrong line of configuration: I had previously integrated Hive with Impala, but Impala was not running.

 

Solution

Log in to node01.

Open hive-site.xml under hive/conf.

Comment out the offending line.
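For reference, commenting out a property in hive-site.xml looks like the fragment below. In my case the leftover line came from the earlier Impala integration; the exact property name and value here (a remote metastore URI on the default port 9083) are an assumption, not taken from the original post — check your own hive-site.xml for the line your error message points at.

```xml
<!-- Commented out: this remote metastore service (used by the Impala
     integration) is not running, so Spark SQL must not try to reach it.
     Property name and value are an assumed example. -->
<!--
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://node01:9083</value>
</property>
-->
```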

 

Restart:

spark-sql \
--master spark://node01:7077 \
--executor-memory 1g \
--total-executor-cores 2 \
--conf spark.sql.warehouse.dir=hdfs://node01:8020/user/hive/warehouse/myhive.db

Success.

 


Origin www.cnblogs.com/BigDataBugKing/p/11243613.html