Integrating Sqoop2 with HDP 3.1.0
Original article: https://sitoi.cn/posts/65261.html
Environment
- An HDP 3.1.0 cluster consisting of three hosts
- Time synchronization configured
Steps
- Download the Sqoop2 installation package
- Extract the installation package to the /usr/lib directory
- Modify the sqoop.sh environment variables
- Modify the sqoop.properties configuration
- Import the third-party jar package
- Configure the third-party jar package reference path
- Modify the Ambari component configuration
- Verify that the configuration is correct
- Start the server
Download the Sqoop2 installation package
Download: http://mirror.bit.edu.cn/apache/sqoop/1.99.7/
Download command:
cd ~
wget http://mirror.bit.edu.cn/apache/sqoop/1.99.7/sqoop-1.99.7-bin-hadoop200.tar.gz
Extract the installation package to the /usr/lib directory
Extract the Sqoop2 archive:
tar -xvf sqoop-<version>-bin-hadoop<hadoop-version>.tar.gz
Move the extracted directory to /usr/lib/sqoop:
mv sqoop-<version>-bin-hadoop<hadoop-version> /usr/lib/sqoop
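The extract and move steps above can also be combined into one command. The following is a minimal sketch, assuming a tar that supports --strip-components (GNU tar and bsdtar both do); the function name install_sqoop_tarball is hypothetical and not part of Sqoop.

```shell
# Hypothetical helper: unpack a Sqoop2 tarball straight into the target
# directory, dropping the archive's top-level folder.
install_sqoop_tarball() {
    tarball=$1   # e.g. sqoop-1.99.7-bin-hadoop200.tar.gz
    dest=$2      # e.g. /usr/lib/sqoop

    if [ ! -f "$tarball" ]; then
        echo "tarball not found: $tarball" >&2
        return 1
    fi

    mkdir -p "$dest" || return 1

    # --strip-components=1 removes the leading sqoop-<version>-... directory,
    # so bin/, conf/, server/ and so on end up directly under $dest.
    tar -xzf "$tarball" -C "$dest" --strip-components=1
}
```

For example, install_sqoop_tarball ~/sqoop-1.99.7-bin-hadoop200.tar.gz /usr/lib/sqoop would replace the separate tar and mv commands; writing to /usr/lib normally requires root.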
Modify the sqoop.sh environment variables
Edit the /usr/lib/sqoop/bin/sqoop.sh file:
sudo vim /usr/lib/sqoop/bin/sqoop.sh
Find the sqoop_server_classpath_set function and change the environment variables set in it. The original definitions are as follows:
function sqoop_server_classpath_set {
HADOOP_COMMON_HOME=${HADOOP_COMMON_HOME:-${HADOOP_HOME}/share/hadoop/common}
HADOOP_HDFS_HOME=${HADOOP_HDFS_HOME:-${HADOOP_HOME}/share/hadoop/hdfs}
HADOOP_MAPRED_HOME=${HADOOP_MAPRED_HOME:-${HADOOP_HOME}/share/hadoop/mapreduce}
HADOOP_YARN_HOME=${HADOOP_YARN_HOME:-${HADOOP_HOME}/share/hadoop/yarn}
Comment out these environment variables and replace them with the following:
function sqoop_server_classpath_set {
# Use the version directory that actually exists under /usr/hdp on your cluster
HDP=/usr/hdp/3.1.0.0-78
HADOOP_COMMON_HOME=$HDP/hadoop
HADOOP_HDFS_HOME=$HDP/hadoop-hdfs
HADOOP_MAPRED_HOME=$HDP/hadoop-mapreduce
HADOOP_YARN_HOME=$HDP/hadoop-yarn
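Before saving the change, it can help to confirm that the four directories really exist under the chosen HDP root, since a wrong version string only shows up later as a classpath error. This is a minimal sketch; check_hdp_dirs is a hypothetical helper name.

```shell
# Hypothetical pre-flight check: confirm that the four directories referenced
# in sqoop_server_classpath_set exist under the given HDP root.
check_hdp_dirs() {
    hdp_root=$1   # e.g. /usr/hdp/<version>
    missing=0
    for sub in hadoop hadoop-hdfs hadoop-mapreduce hadoop-yarn; do
        if [ ! -d "$hdp_root/$sub" ]; then
            echo "missing: $hdp_root/$sub" >&2
            missing=1
        fi
    done
    # 0 when all four directories exist, 1 otherwise
    return "$missing"
}
```

Call it with the same path assigned to HDP in sqoop.sh; every missing directory is reported on standard error.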
Modify the sqoop.properties configuration
Edit sqoop.properties:
sudo vim /usr/lib/sqoop/conf/sqoop.properties
Find the org.apache.sqoop.submission.engine.mapreduce.configuration.directory parameter, which is set as follows:
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/etc/hadoop/conf/
Change it to match your cluster, as follows:
org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/hdp/3.1.0.0-78/hadoop/conf/
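The same edit can be scripted instead of made in vim, which is convenient when several hosts need it. The sketch below assumes the key is already present and uncommented at the start of a line, as in the excerpt above; set_mapreduce_conf_dir is a hypothetical helper name.

```shell
# Hypothetical helper: point Sqoop2 at the cluster's Hadoop configuration
# directory by rewriting one key in sqoop.properties.
set_mapreduce_conf_dir() {
    props=$1      # path to sqoop.properties
    conf_dir=$2   # Hadoop configuration directory (must not contain | or &)
    key='org.apache.sqoop.submission.engine.mapreduce.configuration.directory'

    # Escape the dots so the key is matched literally, not as a regex
    key_re=$(printf '%s' "$key" | sed 's/\./\\./g')

    # Stop if the key is missing rather than silently doing nothing
    if ! grep -q "^$key_re=" "$props"; then
        echo "key not found in $props" >&2
        return 1
    fi

    # Write to a temporary file and move it into place
    # (a portable alternative to sed -i)
    sed "s|^$key_re=.*|$key=$conf_dir|" "$props" > "$props.tmp" &&
        mv "$props.tmp" "$props"
}
```

For example, set_mapreduce_conf_dir /usr/lib/sqoop/conf/sqoop.properties /usr/hdp/3.1.0.0-78/hadoop/conf/ makes the change shown above. Note that mv replaces the file, so check its ownership and permissions afterwards.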
Import the third-party jar package
mkdir /usr/lib/sqoop/extra
cp /var/lib/ambari-server/resources/mysql-jdbc-driver.jar /usr/lib/sqoop/extra/
cp -r /usr/lib/sqoop/extra/* /usr/lib/sqoop/server/lib/
cp -r /usr/lib/sqoop/extra/* /usr/lib/sqoop/shell/lib/
cp -r /usr/lib/sqoop/extra/* /usr/lib/sqoop/tools/lib/
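The three copy commands above can be written as a loop that also stops when extra/ contains no jar files. This is only a sketch of the same step; copy_extra_jars is a hypothetical helper name.

```shell
# Hypothetical helper: copy every jar in $SQOOP_HOME/extra into the
# server, shell and tools lib directories.
copy_extra_jars() {
    sqoop_home=$1   # e.g. /usr/lib/sqoop
    found=0
    for jar in "$sqoop_home"/extra/*.jar; do
        # When nothing matches, the glob stays literal; skip that case
        [ -f "$jar" ] || continue
        found=1
        for sub in server shell tools; do
            cp "$jar" "$sqoop_home/$sub/lib/" || return 1
        done
    done
    if [ "$found" -eq 0 ]; then
        echo "no jar files found in $sqoop_home/extra" >&2
        return 1
    fi
}
```

Running copy_extra_jars /usr/lib/sqoop after the mkdir and first cp command gives the same result as the three cp -r lines.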
Configure the third-party jar package reference path
sudo vim ~/.bashrc
Add the following environment variables:
export SQOOP_HOME=/usr/lib/sqoop
export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra
export PATH=$PATH:$SQOOP_HOME/bin
Run the following command for the environment variables to take effect:
source ~/.bashrc
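If this step is scripted or repeated, appending blindly would duplicate the export lines in ~/.bashrc. A minimal sketch of an idempotent append follows; append_once is a hypothetical helper name.

```shell
# Hypothetical helper: append a line to a file only if that exact line
# is not already present.
append_once() {
    line=$1
    file=$2
    # -F fixed string, -x whole-line match, -q quiet
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Usage (single quotes keep $SQOOP_HOME and $PATH unexpanded in the file):
# append_once 'export SQOOP_HOME=/usr/lib/sqoop' ~/.bashrc
# append_once 'export SQOOP_SERVER_EXTRA_LIB=$SQOOP_HOME/extra' ~/.bashrc
# append_once 'export PATH=$PATH:$SQOOP_HOME/bin' ~/.bashrc
```

Running the usage lines twice leaves exactly one copy of each export in the file.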
Modify the Ambari component configuration
Modify the HDFS component configuration
Configuration item | Parameter name | Initial value | Modified value |
---|---|---|---|
Advanced hdfs-site | dfs.permissions.enabled | True | False |
Custom core-site | hadoop.proxyuser.hive.hosts | * | |
Custom core-site | hadoop.proxyuser.root.hosts | * | |
Custom core-site | hadoop.proxyuser.sqoop2.groups | * | |
Custom core-site | hadoop.proxyuser.sqoop2.hosts | * | |
Custom core-site | hadoop.proxyuser.yarn.groups | * | |
Custom core-site | hadoop.proxyuser.yarn.hosts | * | |
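After saving the HDFS settings in Ambari and restarting the affected services, the proxy-user entries should appear in the generated core-site.xml. The sketch below checks for them by name; it assumes each property name sits in a <name>...</name> element on a single line, and check_core_site_props is a hypothetical helper name.

```shell
# Hypothetical helper: report which of the given property names are absent
# from a core-site.xml file.
check_core_site_props() {
    core_site=$1   # e.g. /etc/hadoop/conf/core-site.xml
    shift
    status=0
    for prop in "$@"; do
        # -F treats the dots in the property name literally
        if ! grep -qF "<name>$prop</name>" "$core_site"; then
            echo "not set: $prop" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage:
# check_core_site_props /etc/hadoop/conf/core-site.xml \
#     hadoop.proxyuser.sqoop2.hosts hadoop.proxyuser.sqoop2.groups
```

This only confirms that the properties are present, not what values they hold. On a live cluster, hdfs getconf -confKey followed by a property name prints the effective value.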
Modify the MapReduce2 component configuration
Replace ${hdp.version} with the actual HDP version: 3.1.0.0-78
Configuration item | Parameter name | Initial value | Modified value |
---|---|---|---|
Advanced mapred-site | mapreduce.admin.map.child.java.opts | -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version} | -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=3.1.0.0-78 |
Advanced mapred-site | mapreduce.admin.reduce.child.java.opts | -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version} | -server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=3.1.0.0-78 |
Advanced mapred-site | mapreduce.admin.user.env | LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-{{architecture}}-64 | LD_LIBRARY_PATH=/usr/hdp/3.1.0.0-78/hadoop/lib/native:/usr/hdp/3.1.0.0-78/hadoop/lib/native/Linux-{{architecture}}-64 |
Advanced mapred-site | mapreduce.application.classpath | $PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/${hdp.version}/hadoop/lib/hadoop-lzo-0.6.0.${hdp.version}.jar:/etc/hadoop/conf/secure | $PWD/mr-framework/hadoop/share/hadoop/mapreduce/*:$PWD/mr-framework/hadoop/share/hadoop/mapreduce/lib/*:$PWD/mr-framework/hadoop/share/hadoop/common/*:$PWD/mr-framework/hadoop/share/hadoop/common/lib/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/*:$PWD/mr-framework/hadoop/share/hadoop/yarn/lib/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/*:$PWD/mr-framework/hadoop/share/hadoop/hdfs/lib/*:$PWD/mr-framework/hadoop/share/hadoop/tools/lib/*:/usr/hdp/3.1.0.0-78/hadoop/lib/hadoop-lzo-0.6.0.3.1.0.0-78.jar:/etc/hadoop/conf/secure |
Advanced mapred-site | mapreduce.application.framework.path | /hdp/apps/${hdp.version}/mapreduce/mapreduce.tar.gz#mr-framework | /hdp/apps/3.1.0.0-78/mapreduce/mapreduce.tar.gz#mr-framework |
Advanced mapred-site | yarn.app.mapreduce.am.admin-command-opts | -Dhdp.version=${hdp.version} | -Dhdp.version=3.1.0.0-78 |
Advanced mapred-site | MR AppMaster Java Heap Size | -Xmx819m -Dhdp.version=${hdp.version} | -Xmx819m -Dhdp.version=3.1.0.0-78 |
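Every modified value in the table is the initial value with the literal text ${hdp.version} replaced by the version string. To avoid retyping the long values by hand, the substitution can be sketched with sed; expand_hdp_version is a hypothetical helper name, and the result still has to be pasted into Ambari.

```shell
# Hypothetical helper: read text on stdin and replace every literal
# ${hdp.version} with the given version string.
expand_hdp_version() {
    version=$1   # e.g. 3.1.0.0-78
    # The pattern is single-quoted so the shell does not expand ${...};
    # \$ and \. make sed match the dollar sign and the dot literally.
    sed 's/\${hdp\.version}/'"$version"'/g'
}

# Usage:
# printf '%s\n' '-Xmx819m -Dhdp.version=${hdp.version}' | expand_hdp_version 3.1.0.0-78
# prints: -Xmx819m -Dhdp.version=3.1.0.0-78
```

Other placeholders such as {{architecture}} and $PWD are left untouched, which matches the table.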
Verify that the configuration is correct
$ sqoop2-tool verify
Setting conf dir: /usr/lib/sqoop/bin/../conf
Sqoop home directory: /usr/lib/sqoop
Sqoop tool executor:
Version: 1.99.7
Revision: 435d5e61b922a32d7bce567fe5fb1a9c0d9b1bbb
Compiled on Tue Jul 19 16:08:27 PDT 2016 by abefine
Running tool: class org.apache.sqoop.tools.tool.VerifyTool
0 [main] INFO org.apache.sqoop.core.SqoopServer - Initializing Sqoop server.
8 [main] INFO org.apache.sqoop.core.PropertiesConfigurationProvider - Starting config file poller thread
Verification was successful.
Tool class org.apache.sqoop.tools.tool.VerifyTool has finished correctly.
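In a script, the verification result can gate the server start. The sketch below relies only on the success message shown in the output above; sqoop_verify_ok is a hypothetical helper name, and whether sqoop2-tool also signals failure through its exit status is not assumed here.

```shell
# Hypothetical helper: read sqoop2-tool output on stdin and succeed only
# if it contains the success message.
sqoop_verify_ok() {
    grep -qF 'Verification was successful.'
}

# Usage: start the server only when verification passes
# sqoop2-tool verify 2>&1 | sqoop_verify_ok && sqoop2-server start
```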
Start the server
$ sqoop2-server start
Setting conf dir: /usr/lib/sqoop/bin/../conf
Sqoop home directory: /usr/lib/sqoop
Starting the Sqoop2 server...
0 [main] INFO org.apache.sqoop.core.SqoopServer - Initializing Sqoop server.
11 [main] INFO org.apache.sqoop.core.PropertiesConfigurationProvider - Starting config file poller thread
Sqoop2 server started.
Check whether the server started successfully
$ jps | grep Sqoop
30970 SqoopJettyServer
If a SqoopJettyServer process is listed, the server has started successfully.
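The server can take a moment to appear in the process list, so a scripted check may need to poll rather than run jps once. This is a minimal sketch; wait_for_sqoop_server is a hypothetical helper name, and the second argument exists only so the listing command can be swapped out.

```shell
# Hypothetical helper: poll the process list until SqoopJettyServer shows up
# or the attempts run out.
wait_for_sqoop_server() {
    tries=${1:-10}       # number of attempts, one second apart
    list_cmd=${2:-jps}   # process-listing command (jps by default)
    i=0
    while [ "$i" -lt "$tries" ]; do
        if "$list_cmd" | grep -q 'SqoopJettyServer'; then
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    return 1
}

# Usage:
# sqoop2-server start && wait_for_sqoop_server 30 && echo "Sqoop2 is up"
```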