Sqoop installation & configuration

Since Hadoop has been installed and the application successfully configured, let's continue by installing and configuring Sqoop.

Sqoop is an open-source tool used mainly to transfer data between Hadoop (Hive) and traditional relational databases (MySQL, PostgreSQL, ...). It can import data from a relational database into HDFS, and export data from HDFS back into a relational database.

 

  •  Install & Configure

Download address: http://www.us.apache.org/dist/sqoop/1.99.3/

Download sqoop-1.99.3-bin-hadoop200.tar.gz

Extract it: tar -zxvf sqoop-1.99.3-bin-hadoop200.tar.gz

Configure the Sqoop environment variables:

vi /etc/profile 

export SQOOP_HOME=/opt/sqoop-1.99.3-bin-hadoop200

export CATALINA_BASE=$SQOOP_HOME/server

export LOGDIR=$SQOOP_HOME/logs/

export PATH=$SQOOP_HOME/bin:$PATH

Modify Sqoop's references to the Hadoop shared jars:

vi server/conf/catalina.properties

 

Find the common.loader line and replace /usr/lib/hadoop/lib/*.jar with your Hadoop jar directories:

/opt/soft-228238/hadoop-2.5.2/share/hadoop/yarn/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/yarn/lib/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/hdfs/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/hdfs/lib/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/mapreduce/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/mapreduce/lib/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/common/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/common/lib/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/httpfs/tomcat/lib/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/kms/tomcat/lib/*.jar,

/opt/soft-228238/hadoop-2.5.2/share/hadoop/tools/lib/*.jar

Note: /opt/soft-228238/hadoop-2.5.2 is the Hadoop installation path. For integration with Hive, likewise change /usr/lib/hive/lib/*.jar to the jar directory of your Hive installation.
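Rather than typing the list by hand, the common.loader value can be generated from a single variable; a minimal shell sketch (the install path matches this article's, adjust to yours):

```shell
# Build the common.loader jar list for catalina.properties from HADOOP_HOME.
# The path below is this article's install location - adjust to your environment.
HADOOP_HOME=/opt/soft-228238/hadoop-2.5.2
LOADER=""
for d in yarn yarn/lib hdfs hdfs/lib mapreduce mapreduce/lib \
         common common/lib httpfs/tomcat/lib kms/tomcat/lib tools/lib; do
  LOADER="${LOADER}${HADOOP_HOME}/share/hadoop/${d}/*.jar,"
done
LOADER="${LOADER%,}"   # drop the trailing comma
echo "$LOADER"
```

Paste the echoed value into the common.loader line, keeping any non-Hadoop entries that are already there.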

vi server/conf/sqoop.properties

 

Find the org.apache.sqoop.submission.engine.mapreduce.configuration.directory line and change its value to your Hadoop configuration directory.

e.g.: org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/opt/soft-228238/hadoop-2.5.2/etc/hadoop
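As a sketch, the same edit can be scripted with sed; the example below runs on a temporary file so it is safe to try — point PROPS at server/conf/sqoop.properties for the real edit:

```shell
# Rewrite the mapreduce configuration directory with sed. A temp file stands
# in for server/conf/sqoop.properties; the Hadoop path is this article's.
HADOOP_CONF=/opt/soft-228238/hadoop-2.5.2/etc/hadoop
PROPS=$(mktemp)
echo 'org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/etc/hadoop/conf/' > "$PROPS"
sed -i "s#^\(org\.apache\.sqoop\.submission\.engine\.mapreduce\.configuration\.directory=\).*#\1${HADOOP_CONF}#" "$PROPS"
cat "$PROPS"
```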

Enter the installation directory: /opt/sqoop-1.99.3-bin-hadoop200

Create a lib folder: mkdir lib

Put the database driver jar (oracle-jdbc-10.1.0.2.0.jar) into lib.

Start Sqoop:

cd  /opt/sqoop-1.99.3-bin-hadoop200/bin

Run ./sqoop.sh server start

The command prints the Tomcat (Catalina) startup output.



Stop Sqoop: ./sqoop.sh server stop

 

 

Using the Sqoop client:

./sqoop.sh client


Type help to see detailed usage:


Set the server: set server --host supervisor-84 --port 12000 --webapp sqoop

Show verbose error information: set option --name verbose --value true


Show Sqoop version information: show version --all

Create a database connection: enter the database connection driver, username, and password as prompted, press Enter after each, then enter the number of connections when prompted; "successfully" indicates the connection was created.


Create an import job:

create job --xid 1 --type import

Enter the Schema name and Table name, then press Enter.



 

Run start job --jid 6

Note: 6 is the id of the created job; you can view job information with the show job command.
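The non-interactive client commands from this walkthrough can be kept in a script file for repeat runs. The Sqoop shell can reportedly execute such a file (./sqoop.sh client /path/to/file), but verify that against your version's documentation; the host and job id below are the ones assumed in this article:

```shell
# Collect this walkthrough's non-interactive client commands into a script
# file. Host supervisor-84 and job id 6 are this article's values.
cat > /tmp/start-import.sqoop <<'EOF'
set server --host supervisor-84 --port 12000 --webapp sqoop
set option --name verbose --value true
show version --all
start job --jid 6
EOF
cat /tmp/start-import.sqoop
```

Interactive commands such as create job still prompt for input, so they are left out of the script.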



 

Success:



 

The result can be browsed in the Eclipse Hadoop plugin:



 

  •  Troubleshooting:
  1.  Checking a job's run status fails:
 

 
Besides checking the log for errors, note: when an export to HDFS with sqoop-1.99.3 and hadoop-2 fails, check the following property in $HADOOP_HOME (installation path)/etc/hadoop/mapred-site.xml:

<property>

<name>mapreduce.job.tracker</name>

<value>192.168.68.84:9001</value>

</property>

Comment out or delete this property, then restart Hadoop.
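A simple grep confirms whether the offending property is still present before you restart Hadoop. The sketch below creates a sample mapred-site.xml so it is self-contained; in practice point CONF at $HADOOP_HOME/etc/hadoop/mapred-site.xml:

```shell
# Detect the obsolete mapreduce.job.tracker property. A sample file stands in
# for the real $HADOOP_HOME/etc/hadoop/mapred-site.xml.
CONF=$(mktemp)
cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>mapreduce.job.tracker</name>
    <value>192.168.68.84:9001</value>
  </property>
</configuration>
EOF
MSG=""
if grep -q '<name>mapreduce.job.tracker</name>' "$CONF"; then
  MSG="mapreduce.job.tracker is still set - comment it out and restart hadoop"
fi
echo "$MSG"
```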

 

show version --all reports an error:

 

Exception has occurred during processing command 

 

Exception: com.sun.jersey.api.client.UniformInterfaceException Message: GET http://supervisor-41:12000/sqoopServer/version returned a response status of 404 Not Found

Solution:

Change the command to set server --host <install IP> --port 12000 --webapp <webapp name>, e.g.: set server --host supervisor-41 --port 12000 --webapp sqoop

 

java.net.ConnectException on port 10020:

java.io.IOException: java.net.ConnectException: Call From supervisor-84/192.168.68.84 to supervisor-84:10020 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused

at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:331)

at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:416)

at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:522)

at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)

at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)

 

Solution:

1. Check whether $HADOOP_HOME (installation path)/etc/hadoop/mapred-site.xml contains the following configuration:

<property>

<name>mapreduce.jobhistory.address</name>

<value>192.168.68.84:10020</value>

</property>

 

<property>

<name>mapreduce.jobhistory.webapp.address</name>

<value>192.168.68.84:19888</value>

</property>

 

2. Start the history server: $HADOOP_HOME (installation path)/sbin/mr-jobhistory-daemon.sh start historyserver
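After starting the history server, you can probe port 10020 directly from bash (using the /dev/tcp bashism); the host below is this article's node and is an assumption — substitute your own:

```shell
# Probe the JobHistory server port with bash's /dev/tcp pseudo-device.
# 192.168.68.84 is this article's node - replace it with your own host.
HOST=192.168.68.84
if timeout 2 bash -c "echo > /dev/tcp/$HOST/10020" 2>/dev/null; then
  STATUS="jobhistory server reachable on $HOST:10020"
else
  STATUS="jobhistory server NOT reachable - start it with mr-jobhistory-daemon.sh"
fi
echo "$STATUS"
```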

 

 

 

To connect to Sqoop from Java, reference the following jar dependency:

<dependency>

    <groupId>org.apache.sqoop</groupId>

    <artifactId>sqoop-client</artifactId>

    <version>1.99.3</version>

   </dependency>

 

