Sqoop: A Big Data Collaboration Framework


I. Overview

1. Sqoop: SQL-to-Hadoop

2. A bridge between traditional relational databases and Hadoop:

   a. Import data from a relational database into Hadoop and related systems (such as Hive and HBase)

   b. Extract data from Hadoop and export it into a relational database

3. Sqoop uses MapReduce to speed up data transfer; its jobs are map-only, with no reduce phase.

II. Installing Sqoop

1. Extract the tarball:

tar -zxvf sqoop-1.4.6-cdh5.14.2.tar.gz -C /opt/cdh5.14.2/

2. Configure

In Sqoop's conf directory, rename sqoop-env-template.sh to sqoop-env.sh, then edit sqoop-env.sh as follows:

# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# included in all the hadoop scripts with source command
# should not be executable directly
# also should not be passed any arguments, since we need original $*

# Set Hadoop-specific environment variables here.

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/opt/cdh5.14.2/hadoop-2.6.0

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/opt/cdh5.14.2/hadoop-2.6.0

#set the path to where bin/hbase is available
#export HBASE_HOME=

#Set the path to where bin/hive is available
export HIVE_HOME=/opt/cdh5.14.2/hive-1.1.0

#Set the path for where zookeper config dir is
#export ZOOCFGDIR=

3. Since Sqoop will connect to MySQL, copy the MySQL JDBC driver into Sqoop's lib directory:

cp mysql-connector-java-5.1.27-bin.jar /opt/cdh5.14.2/sqoop-1.4.6/lib/

III. Operations

1. list-databases

bin/sqoop list-databases --connect jdbc:mysql://master.cdh.com:3306 --username root --password 123456

Output:

[root@master sqoop-1.4.6]# bin/sqoop list-databases --connect jdbc:mysql://master.cdh.com:3306 --username root --password 123456
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/28 21:53:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.14.2
18/07/28 21:53:40 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/28 21:53:40 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
cdhmetastore
mysql
test

2. Importing data into HDFS

Import data from a MySQL table into HDFS.
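The original post showed the import command as a screenshot. A representative command, as a sketch (the database name test, the target directory, and the mapper count are assumptions; the logs below reference the table tb_user_info):

bin/sqoop import \
--connect jdbc:mysql://master.cdh.com:3306/test \
--username root \
--password 123456 \
--table tb_user_info \
--target-dir /user/sqoop/tb_user_info \
--num-mappers 1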

The first run failed with the following error:

Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
Warning: /opt/cdh5.14.2/sqoop-1.4.6/bin/../../zookeeper does not exist! Accumulo imports will fail.
Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.
18/07/30 05:29:44 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.14.2
18/07/30 05:29:44 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/30 05:29:44 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/07/30 05:29:44 INFO tool.CodeGenTool: Beginning code generation
18/07/30 05:29:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user_info` AS t LIMIT 1
18/07/30 05:29:46 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user_info` AS t LIMIT 1
18/07/30 05:29:46 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/cdh5.14.2/hadoop-2.6.0
Note: /tmp/sqoop-root/compile/fd915b0f18440da413291558f8b3fcd5/tb_user_info.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/07/30 05:29:57 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/fd915b0f18440da413291558f8b3fcd5/tb_user_info.jar
18/07/30 05:29:57 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/07/30 05:29:57 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/07/30 05:29:57 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/07/30 05:29:57 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/07/30 05:29:57 INFO mapreduce.ImportJobBase: Beginning import of tb_user_info
Exception in thread "main" java.lang.NoClassDefFoundError: org/json/JSONObject
	at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)
	at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:784)
	at org.apache.sqoop.mapreduce.JobBase.putSqoopOptionsToConfiguration(JobBase.java:392)
	at org.apache.sqoop.mapreduce.JobBase.createJob(JobBase.java:378)
	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:256)
	at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
	at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:513)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.json.JSONObject
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 15 more

This happens because the java-json.jar package is missing (see https://blog.csdn.net/lichangzai/article/details/51601807).

Download: http://www.java2s.com/Code/Jar/j/Downloadjavajsonjar.htm

Copy the jar into Sqoop's lib directory:

cp java-json.jar /opt/cdh5.14.2/sqoop-1.4.6/lib/

Run the import command again; this time it completes without errors.

The generated files appear in the target HDFS directory, and viewing their contents confirms they hold exactly the imported data (shown as screenshots in the original post).
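A quick command-line check (run from the Hadoop install directory; the path matches the assumed --target-dir above, and with one mapper the output file is part-m-00000):

bin/hdfs dfs -ls /user/sqoop/tb_user_info
bin/hdfs dfs -cat /user/sqoop/tb_user_info/part-m-00000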

3. Importing data with a free-form query (--query)
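The command was shown as a screenshot in the original post. A representative free-form-query import, as a sketch (the column names id and name, the database test, and the target directory are assumptions; note that Sqoop requires the literal $CONDITIONS token in the WHERE clause, and a --split-by column when running more than one mapper):

bin/sqoop import \
--connect jdbc:mysql://master.cdh.com:3306/test \
--username root \
--password 123456 \
--query 'SELECT id, name FROM tb_user_info WHERE $CONDITIONS' \
--split-by id \
--target-dir /user/sqoop/tb_user_info_query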

The results (screenshots in the original post) show exactly the rows selected by the query.

4. Importing data with --direct
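The --direct option bypasses JDBC for the data transfer and uses the MySQL-specific mysqldump fast path instead (as the warning in the earlier log suggested). A representative command, under the same assumptions as above:

bin/sqoop import \
--connect jdbc:mysql://master.cdh.com:3306/test \
--username root \
--password 123456 \
--table tb_user_info \
--direct \
--target-dir /user/sqoop/tb_user_info_direct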

This failed with the following error:

Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:29:36 INFO mapreduce.Job:  map 25% reduce 0%
18/07/30 11:29:39 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000000_0, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:29:40 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000003_0, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:29:46 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000001_1, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:29:57 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000000_1, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:30:04 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000003_1, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:30:07 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000001_2, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:30:09 INFO mapreduce.Job: Task Id : attempt_1532757172540_0005_m_000000_2, Status : FAILED
Error: java.io.IOException: Cannot run program "mysqldump": error=2, No such file or directory
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1047)
	at java.lang.Runtime.exec(Runtime.java:617)
	at java.lang.Runtime.exec(Runtime.java:485)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:405)
	at org.apache.sqoop.mapreduce.MySQLDumpMapper.map(MySQLDumpMapper.java:49)
	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1920)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.io.IOException: error=2, No such file or directory
	at java.lang.UNIXProcess.forkAndExec(Native Method)
	at java.lang.UNIXProcess.<init>(UNIXProcess.java:187)
	at java.lang.ProcessImpl.start(ProcessImpl.java:130)
	at java.lang.ProcessBuilder.start(ProcessBuilder.java:1028)
	... 12 more

18/07/30 11:30:15 INFO mapreduce.Job:  map 75% reduce 0%
18/07/30 11:30:16 INFO mapreduce.Job:  map 100% reduce 0%
18/07/30 11:30:16 INFO mapreduce.Job: Job job_1532757172540_0005 failed with state FAILED due to: Task failed task_1532757172540_0005_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

Solution:

Following the hint in this thread: http://www.dataguru.cn/thread-346511-1-1.html

In --direct mode each map task shells out to the local mysqldump binary, so mysqldump must be installed on every node that can run map tasks. After installing MySQL on the other two machines, the error no longer appeared.
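For example, on each NodeManager host (the package name is an assumption; use whatever package provides mysqldump on your distribution):

yum install -y mysql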

5. Importing data from an RDBMS into Hive
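The command was shown as a screenshot in the original post. A representative Hive import, as a sketch (the logs below reference the table tb_user; the MySQL database test and the Hive database/table names are assumptions):

bin/sqoop import \
--connect jdbc:mysql://master.cdh.com:3306/test \
--username root \
--password 123456 \
--table tb_user \
--hive-import \
--hive-database default \
--hive-table tb_user \
--num-mappers 1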

Execution failed with an error:

18/07/30 16:49:44 INFO mapreduce.ImportJobBase: Transferred 42.3291 KB in 55.0343 seconds (787.5991 bytes/sec)
18/07/30 16:49:44 INFO mapreduce.ImportJobBase: Retrieved 197 records.
18/07/30 16:49:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tb_user` AS t LIMIT 1
18/07/30 16:49:45 INFO hive.HiveImport: Loading uploaded data into Hive
18/07/30 16:49:45 ERROR hive.HiveConfig: Could not load org.apache.hadoop.hive.conf.HiveConf. Make sure HIVE_CONF_DIR is set correctly.
18/07/30 16:49:45 ERROR tool.ImportTool: Import failed: java.io.IOException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:50)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.conf.HiveConf
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
	... 12 more

This is because Sqoop needs a Hive library on its classpath: copy hive-common-1.1.0-cdh5.14.2.jar from Hive's lib directory into Sqoop's lib directory.
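A sketch of the copy command, assuming the install paths configured in section II:

cp /opt/cdh5.14.2/hive-1.1.0/lib/hive-common-1.1.0-cdh5.14.2.jar /opt/cdh5.14.2/sqoop-1.4.6/lib/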

Running the import again produced a different error:

18/07/30 17:06:05 INFO hive.HiveImport: Loading uploaded data into Hive
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/shims/ShimLoader
	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:370)
	at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:108)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.ShimLoader
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	... 17 more

Copying hive-shims-common-1.1.0-cdh5.14.2.jar from Hive's lib directory into Sqoop's lib directory fixed that, but led to yet another error.
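A sketch of that copy command, again assuming the paths from section II:

cp /opt/cdh5.14.2/hive-1.1.0/lib/hive-shims-common-1.1.0-cdh5.14.2.jar /opt/cdh5.14.2/sqoop-1.4.6/lib/

The new error: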

18/07/30 17:14:14 INFO hive.HiveImport: Loading uploaded data into Hive
Exception in thread "main" java.lang.ExceptionInInitializerError
	at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:108)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.sqoop.hive.HiveConfig.getHiveConf(HiveConfig.java:44)
	at org.apache.sqoop.hive.HiveImport.getHiveArgs(HiveImport.java:392)
	at org.apache.sqoop.hive.HiveImport.executeExternalHiveScript(HiveImport.java:379)
	at org.apache.sqoop.hive.HiveImport.executeScript(HiveImport.java:337)
	at org.apache.sqoop.hive.HiveImport.importTable(HiveImport.java:241)
	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:530)
	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
Caused by: java.lang.RuntimeException: Could not load shims in class org.apache.hadoop.hive.shims.Hadoop23Shims
	at org.apache.hadoop.hive.shims.ShimLoader.createShim(ShimLoader.java:144)
	at org.apache.hadoop.hive.shims.ShimLoader.loadShims(ShimLoader.java:136)
	at org.apache.hadoop.hive.shims.ShimLoader.getHadoopShims(ShimLoader.java:95)
	at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:370)
	... 16 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.shims.Hadoop23Shims
	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
	at java.security.AccessController.doPrivileged(Native Method)
	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at java.lang.Class.forName0(Native Method)
	at java.lang.Class.forName(Class.java:195)
	at org.apache.hadoop.hive.shims.ShimLoader.createShim(ShimLoader.java:141)
	... 19 more

All of these errors involve org.apache.hadoop.hive.shims, so the simplest fix is to copy every shims jar from Hive's lib directory into Sqoop's lib directory (the command below is run from inside Hive's lib directory):

cp -r hive-shims* /opt/cdh5.14.2/sqoop-1.4.6/lib/

Run the import once more, and this time it really does complete without errors (*^▽^*)

Querying the tb_user table in Hive confirms the data was imported successfully.
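A quick command-line check (the Hive database name default is an assumption, matching the sketch in step 5):

hive -e "SELECT * FROM default.tb_user LIMIT 10;"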
