Hive on Spark on YARN: Common Problems and Solutions

1. Inserting records into a table from the Hive CLI fails with the errors below

Error a

Unrecognized Hadoop major version number: 3.2.0

Environment at the time:
Hadoop: 3.2.0
Spark: 2.4.0
Hive: 3.1.1
Solution

This is a version-compatibility problem. Checking pom.xml in the root of the Hive source tree shows that:
1. the Hadoop version this Hive release is built against is 3.1.0, and
2. the Spark version it is built against is 2.3.0.
Downgrading Hadoop from 3.2.0 to 3.1.0 and Spark from 2.4.0 to 2.3.3 resolved the error.
Note: what matters is keeping the major versions in line; an exact match is not necessary. Prefer stable releases where possible.
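The pom.xml check above can be done with a one-line grep. A minimal sketch, using a stand-in snippet so the command is self-contained (against a real Hive source checkout you would grep pom.xml directly, which defines many more properties):

```shell
# The versions a Hive release was built against live in the <properties>
# section of pom.xml in the source root. Stand-in snippet for illustration:
pom='<properties>
  <hadoop.version>3.1.0</hadoop.version>
  <spark.version>2.3.0</spark.version>
</properties>'
# Pull out just the Hadoop and Spark version lines:
printf '%s\n' "$pom" | grep -E '<(hadoop|spark)\.version>'
```

Against the real source tree the equivalent is `grep -E '<(hadoop|spark)\.version>' pom.xml` from the Hive source root.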

Error b

Failed to monitor Job[-1] with exception 'java.lang.IllegalStateException(Connection to remote Spark driver was lost)' Last known state = SENT
Failed to execute spark task, with exception 'java.lang.IllegalStateException(RPC channel is closed.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. RPC channel is closed.

Solution
Inspect the application log with yarn logs -applicationId xxx, which reveals the following error:

2019-03-02 12:24:17,549 ERROR yarn.ApplicationMaster: User class threw exception: java.io.FileNotFoundException: File file:/user/hive/tmp/sparkeventlog does not exist

This is a hive-site.xml misconfiguration: the sparkeventlog file does not exist at the path Hive is using. Whenever a parameter takes an HDFS path, prefix it with the {hostname}:port authority from fs.defaultFS in core-site.xml. For example, the sparkeventlog path used to be configured as:
/user/hive/tmp/sparkeventlog
Change it to:
hdfs://hadoopSvr1:8020/user/hive/tmp/sparkeventlog
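In hive-site.xml form this looks roughly like the following (Hive on Spark passes spark.* properties from hive-site.xml through to Spark; the hostname and port are this cluster's fs.defaultFS authority, and the enabled flag is shown for context):

```xml
<!-- hive-site.xml: Spark event-log location, with the fs.defaultFS
     authority (hdfs://hadoopSvr1:8020 in this cluster) spelled out -->
<property>
  <name>spark.eventLog.enabled</name>
  <value>true</value>
</property>
<property>
  <name>spark.eventLog.dir</name>
  <value>hdfs://hadoopSvr1:8020/user/hive/tmp/sparkeventlog</value>
</property>
```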

Solution
Checking the ResourceManager log revealed that a NodeManager had failed to start, with the following error:

org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.BindException: Problem binding to [hadoopSvr3:8031] java.net.BindException: Cannot assign requested address; For more details see:  http://wiki.apache.org/hadoop/BindException

Restart the YARN cluster. Whenever possible, run the start-yarn.sh script on the node hosting the ResourceManager; starting YARN from that node reduces the chance of this kind of startup failure.
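As an ops sketch (assuming the standard Hadoop sbin scripts, HADOOP_HOME set, and passwordless SSH from the ResourceManager node to the workers):

```shell
# Run on the ResourceManager node itself, so the RM binds its local address.
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh
# Verify the NodeManagers came back up and registered with the RM:
yarn node -list -all
```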

2. Hive UPDATE and DELETE statements fail

The error looks like this:

hive> delete from alarm where eid = 8;
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.

Solution
Hive does not enable row-level INSERT, UPDATE, and DELETE by default; transactional support has to be configured explicitly. Enable UPDATE and DELETE by setting the following properties in hive-site.xml:

hive.support.concurrency = true
hive.enforce.bucketing = true
hive.exec.dynamic.partition.mode = nonstrict
hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on = true
hive.compactor.worker.threads = 1
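As hive-site.xml entries, the settings above become (a sketch; the remaining properties in the list follow the same pattern):

```xml
<!-- hive-site.xml: enable the DbTxnManager and the lock/compaction
     machinery that Hive ACID operations depend on -->
<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>
<property>
  <name>hive.compactor.worker.threads</name>
  <value>1</value>
</property>
```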

Restart the Hive service. Note also that tables used this way must be created with the property transactional=true; quoting the official documentation:
If a table is to be used in ACID writes (insert, update, delete) then the table property “transactional=true” must be set on that table
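Putting it together, a hedged HiveQL sketch of a table definition that accepts the DELETE above (full ACID tables must be stored as ORC; the msg column is illustrative, and the bucketing clause is for Hive releases that still require bucketed tables for ACID):

```sql
-- Transactional (ACID) table: ORC storage plus transactional=true.
CREATE TABLE alarm (eid INT, msg STRING)
CLUSTERED BY (eid) INTO 2 BUCKETS
STORED AS ORC
TBLPROPERTIES ('transactional'='true');

-- With the hive-site.xml properties above in place, this now succeeds:
DELETE FROM alarm WHERE eid = 8;
```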


Reposted from blog.csdn.net/wangkai_123456/article/details/88073990