Running a select count statement in a Hive on Spark environment fails with:
Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/Iterable
The full error output is:
hive> select count(*) from t_hello;
Exception in thread "main" java.lang.NoClassDefFoundError: scala/collection/Iterable
    at org.apache.hadoop.hive.ql.parse.spark.GenSparkProcContext.<init>(GenSparkProcContext.java:163)
    at org.apache.hadoop.hive.ql.parse.spark.SparkCompiler.generateTaskTree(SparkCompiler.java:195)
    at org.apache.hadoop.hive.ql.parse.TaskCompiler.compile(TaskCompiler.java:267)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10947)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:246)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:250)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:477)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1242)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1384)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1171)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1161)
    at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:232)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:183)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:399)
    at org.apache.hadoop.hive.cli.CliDriver.executeDriver(CliDriver.java:776)
    at org.apache.hadoop.hive.cli.CliDriver.run(CliDriver.java:714)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:641)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.util.RunJar.run(RunJar.java:234)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:148)
Caused by: java.lang.ClassNotFoundException: scala.collection.Iterable
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 23 more
[root@master bin]#
Cause
When Hive on Spark runs, it needs Spark's jars on its classpath. Here Hive's lib directory does not contain those jars, and Hive's startup script does not add them to the classpath either, so classes such as scala.collection.Iterable (from the Scala runtime bundled in the Spark assembly) cannot be found.
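You can confirm the diagnosis before changing anything. A minimal check, assuming a typical layout (the /opt/hive default is a placeholder; point HIVE_HOME at your actual Hive root):

```shell
# Hypothetical default path; override HIVE_HOME for your installation.
HIVE_HOME=${HIVE_HOME:-/opt/hive}

# If no spark-assembly jar is present, Hive cannot resolve Scala classes.
ls "$HIVE_HOME"/lib/spark-assembly-*.jar 2>/dev/null \
  || echo "no spark-assembly jar in $HIVE_HOME/lib"
```

If the ls finds nothing, the missing-class error above is expected.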
Solution
There are two main fixes. The simplest is to copy the jar whose name starts with spark-assembly from $SPARK_HOME/lib into $HIVE_HOME/lib. In my installation that jar is spark-assembly-1.6.3-hadoop2.4.0.jar; locate it and copy it into Hive's lib directory. (The alternative is to have Hive's startup script add the Spark jars to the classpath.)
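The copy step can be sketched as follows. The paths are assumptions for illustration; the assembly jar's exact name depends on your Spark and Hadoop versions:

```shell
# Hypothetical installation roots; adjust to your environment.
SPARK_HOME=${SPARK_HOME:-/opt/spark-1.6.3}
HIVE_HOME=${HIVE_HOME:-/opt/hive}

# Copy the Spark assembly jar (e.g. spark-assembly-1.6.3-hadoop2.4.0.jar)
# into Hive's lib directory so its classes land on Hive's classpath.
cp -v "$SPARK_HOME"/lib/spark-assembly-*.jar "$HIVE_HOME/lib/" \
  && echo "copied spark-assembly jar into $HIVE_HOME/lib" \
  || echo "copy failed: check SPARK_HOME and HIVE_HOME"
```

Restart the Hive CLI afterwards and rerun the select count query.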
Note: Hive on Spark also depends on whether your Hive and Spark versions are compatible with each other and on correct configuration. If needed, see this series on setting up a Hive on Spark environment (using matching official releases, no custom build required): http://blog.csdn.net/pucao_cug/article/category/6941532