Integrating spark-sql with Hive

Add the following to hive-site.xml in the Hive configuration directory:

<property>
  <name>hive.metastore.uris</name>
  <value>thrift://hm.clusterdomain.com:9083</value>
  <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
</property>



Copy the MySQL JDBC driver into /apps/spark/jars.
Copy hive-site.xml into /apps/spark/conf. On a cluster, upload the configuration file to the Spark conf directory on every node.
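A minimal sketch of these two copy steps, assuming a hypothetical driver jar name (mysql-connector-java-5.1.46.jar), a Hive install under /apps/hive, and placeholder worker hostnames:

# copy the MySQL JDBC driver into Spark's jars directory (jar name is an example)
cp /apps/hive/lib/mysql-connector-java-5.1.46.jar /apps/spark/jars/

# copy hive-site.xml into Spark's conf directory
cp /apps/hive/conf/hive-site.xml /apps/spark/conf/

# on a cluster, push both files to every Spark node (hostnames are placeholders)
for host in worker1 worker2 worker3; do
  scp /apps/spark/jars/mysql-connector-java-5.1.46.jar ${host}:/apps/spark/jars/
  scp /apps/spark/conf/hive-site.xml ${host}:/apps/spark/conf/
done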

Start the Hive metastore service:
nohup /apps/hive/bin/hive --service metastore &
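Once the service is up, you can confirm it is listening on the Thrift port 9083 with ss or netstat (whichever is installed), and watch its startup log in nohup.out:

# check that the metastore is listening on port 9083
ss -tlnp | grep 9083
# or
netstat -tlnp | grep 9083

# follow the metastore startup log written by nohup
tail -f nohup.out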

Edit the spark-sql script to add resource options; tune the exact values to your cluster size:
--executor-memory 4G --total-executor-cores 6 --executor-cores 2
After the change, the spark-submit line reads:
exec "${SPARK_HOME}"/bin/spark-submit --class org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver --executor-memory 4G --total-executor-cores 6 --executor-cores 2 "$@"


You can now run spark-sql and query Hive tables directly with SQL statements.
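spark-sql also accepts the same -e option as the Hive CLI for non-interactive use; a quick sanity check might look like this (the database and table come from the example session below):

spark-sql -e "show databases;"
spark-sql -e "select count(*) from log.uservisitlogpartorc;"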


At this point the console prints all log output from the Spark jobs that spark-sql generates for each SQL statement. You can raise the log level so that only important messages are shown:
cd /apps/spark/conf
cp log4j.properties.template log4j.properties
vi log4j.properties


Change:
log4j.rootCategory=INFO, console
to:
log4j.rootCategory=WARN, console
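If you prefer a one-liner to editing the file by hand, a sed command along these lines makes the same change (a sketch; verify the resulting file afterwards):

sed -i 's/^log4j.rootCategory=INFO, console/log4j.rootCategory=WARN, console/' /apps/spark/conf/log4j.properties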


Example session:
[hadoop@tomcatapi ~]$ spark-sql
18/03/07 20:23:22 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
18/03/07 20:23:35 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
18/03/07 20:23:35 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
spark-sql> 
         > 
         > select count(*) from log.uservisitlogpartorc;
8984856                                                                         
Time taken: 10.552 seconds, Fetched 1 row(s)
spark-sql> 

