Now that Hadoop has been installed and configured, let's continue by installing and configuring Sqoop.
Sqoop is an open-source tool for transferring data between Hadoop (HDFS/Hive) and traditional relational databases (MySQL, PostgreSQL, ...): it can import data from a relational database into HDFS, and export data from HDFS back into a relational database.
- Install & Configure
Download address: http://www.us.apache.org/dist/sqoop/1.99.3/
Download sqoop-1.99.3-bin-hadoop200.tar.gz and extract it:
tar -zxvf sqoop-1.99.3-bin-hadoop200.tar.gz
Configure the Sqoop environment variables:
vi /etc/profile
export SQOOP_HOME=/opt/sqoop-1.99.3-bin-hadoop200
export CATALINA_BASE=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logs/
export PATH=$SQOOP_HOME/bin:$PATH
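After saving /etc/profile you would normally run `source /etc/profile`; as a quick sanity check, the same exports can also be applied and verified directly in the current shell (paths copied from the lines above):

```shell
# Same exports as in /etc/profile, applied to the current shell
export SQOOP_HOME=/opt/sqoop-1.99.3-bin-hadoop200
export CATALINA_BASE=$SQOOP_HOME/server
export LOGDIR=$SQOOP_HOME/logs/
export PATH=$SQOOP_HOME/bin:$PATH

# Verify: CATALINA_BASE expands under SQOOP_HOME, and Sqoop's bin dir leads the PATH
echo "$CATALINA_BASE"
echo "$PATH" | cut -d: -f1
```

The first echo should print /opt/sqoop-1.99.3-bin-hadoop200/server and the second /opt/sqoop-1.99.3-bin-hadoop200/bin.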
Point Sqoop at Hadoop's shared jars:
vi server/conf/catalina.properties
Find the common.loader line and replace /usr/lib/hadoop/lib/*.jar with the jar directories of your Hadoop installation:
/opt/soft-228238/hadoop-2.5.2/share/hadoop/yarn/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/yarn/lib/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/hdfs/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/hdfs/lib/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/mapreduce/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/mapreduce/lib/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/common/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/common/lib/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/httpfs/tomcat/lib/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/kms/tomcat/lib/*.jar,
/opt/soft-228238/hadoop-2.5.2/share/hadoop/tools/lib/*.jar
Note: /opt/soft-228238/hadoop-2.5.2 is the Hadoop installation path. For integration with Hive, likewise replace /usr/lib/hive/lib/*.jar with the jar directory under your Hive installation path.
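For reference, the finished common.loader entry in catalina.properties ends up as a single comma-separated value; a sketch is shown below (Tomcat's own defaults kept at the front, the Hadoop path list abridged with `...` and to be adjusted to your install; backslash line continuations are legal in .properties files):

```properties
common.loader=${catalina.base}/lib,${catalina.base}/lib/*.jar,\
${catalina.home}/lib,${catalina.home}/lib/*.jar,\
/opt/soft-228238/hadoop-2.5.2/share/hadoop/common/*.jar,\
/opt/soft-228238/hadoop-2.5.2/share/hadoop/common/lib/*.jar,\
/opt/soft-228238/hadoop-2.5.2/share/hadoop/hdfs/*.jar,...
```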
vi server/conf/sqoop.properties
Find the org.apache.sqoop.submission.engine.mapreduce.configuration.directory line and set its value to your Hadoop configuration directory, e.g.:
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/opt/soft-228238/hadoop-2.5.2/etc/hadoop
Enter the installation directory /opt/sqoop-1.99.3-bin-hadoop200 and create a lib folder: mkdir lib
Place the JDBC driver jar for your database (e.g. oracle-jdbc-10.1.0.2.0.jar) into lib.
Start Sqoop:
cd /opt/sqoop-1.99.3-bin-hadoop200/bin
./sqoop.sh server start
Stop Sqoop: ./sqoop.sh server stop
Use the Sqoop client:
./sqoop.sh client
Type help to list the available commands.
Point the client at the server: set server --host supervisor-84 --port 12000 --webapp sqoop
Show verbose error information: set option --name verbose --value true
Show Sqoop version information: show version --all
Create a database connection: following the prompts, enter the JDBC driver, connection string, username, and password, pressing Enter after each; the client reports "successfully" when the connection has been created.
Create an import job:
create job --xid 1 --type import
Enter the Schema name and Table name at the prompts, then press Enter.
Run the job: start job --jid 6
Note: 6 is the id of the job created above; job information can be listed with the show job command.
On success, the imported files can be browsed in HDFS, for example through the Eclipse Hadoop plugin.
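Put together, a typical client session looks roughly like this (host name and ids are the ones used above; the exact interactive prompts depend on the connector, so treat this as a sketch):

```text
sqoop:000> set server --host supervisor-84 --port 12000 --webapp sqoop
sqoop:000> show version --all
sqoop:000> create connection --cid 1          (answer the driver/URL/username/password prompts)
sqoop:000> create job --xid 1 --type import   (answer the Schema name / Table name prompts)
sqoop:000> start job --jid 6
sqoop:000> status job --jid 6
```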
- Troubleshooting:
- Checking the job's running status fails. mapred-site.xml contains:
<property>
<name>mapreduce.job.tracker</name>
<value>192.168.68.84:9001</value>
</property>
Comment out or delete this property, then restart Hadoop.
show version --all reports an error:
Exception has occurred during processing command
Exception: com.sun.jersey.api.client.UniformInterfaceException Message: GET http://supervisor-41:12000/sqoopServer/version returned a response status of 404 Not Found
Solution: the 404 above means the --webapp value does not match the deployed webapp name.
Change the command to set server --host <server IP> --port 12000 --webapp <webapp name>, e.g.: set server --host supervisor-41 --port 12000 --webapp sqoop
java.net.ConnectException: connection refused on port 10020 (the JobHistory server):
java.io.IOException: java.net.ConnectException: Call From supervisor-84/192.168.68.84 to supervisor-84:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:331)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:416)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:522)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
Solution:
1. Check whether $HADOOP_HOME/etc/hadoop/mapred-site.xml contains the following configuration:
<property>
<name>mapreduce.jobhistory.address</name>
<value>192.168.68.84:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>192.168.68.84:19888</value>
</property>
2. Start the history server: $HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
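The second step and a quick verification can be sketched as follows (this needs a running Hadoop installation, so it is illustrative only; jps ships with the JDK):

```shell
# Start the MapReduce JobHistory server, then confirm the daemon is up
$HADOOP_HOME/sbin/mr-jobhistory-daemon.sh start historyserver
jps | grep JobHistoryServer
```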
To connect to Sqoop from Java, reference the client jar:
<dependency>
<groupId>org.apache.sqoop</groupId>
<artifactId>sqoop-client</artifactId>
<version>1.99.3</version>
</dependency>
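With that dependency on the classpath, a minimal client sketch might look like the following. The class and method names are taken from the Sqoop 1.99.3 client API as best understood, and the server URL and job id are assumptions for illustration; it also requires a running Sqoop server, so verify against the 1.99.3 javadoc before relying on it:

```java
import org.apache.sqoop.client.SqoopClient;

public class SqoopJobRunner {
    public static void main(String[] args) {
        // URL must match the server configured earlier: host, port, and webapp name
        SqoopClient client = new SqoopClient("http://supervisor-84:12000/sqoop/");
        // Submit the job created through the shell client (job id 6 in the example above)
        client.startSubmission(6);
    }
}
```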
Afterwards, check the logs for errors. Note: when an HDFS import/export with sqoop-1.99.3 and hadoop-2.x fails, the first thing to check is $HADOOP_HOME/etc/hadoop/mapred-site.xml (see the JobHistory configuration above).