Installation and use of Sqoop under CentOS 7

Background

Sqoop can be used to migrate data between MySQL and big data components such as HDFS, Hive, and HBase.

Installation

1. Upload sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz to the CentOS 7 machine

2. Extract the archive and rename the directory

[root@localhost szc]# tar -zxvf sqoop-1.4.6.bin__hadoop-2.0.4-alpha.tar.gz

[root@localhost szc]# mv sqoop-1.4.6.bin__hadoop-2.0.4-alpha sqoop-1.4.6

3. Enter the sqoop-1.4.6 directory and modify the configuration file

[root@localhost szc]# cd sqoop-1.4.6/

[root@localhost sqoop-1.4.6]# mv conf/sqoop-env-template.sh conf/sqoop-env.sh

[root@localhost sqoop-1.4.6]# vim conf/sqoop-env.sh

Add the following environment variables; you only need to set the ones for the components you have installed:

#Set path to where bin/hadoop is available

export HADOOP_COMMON_HOME=/home/szc/cdh/hadoop-2.5.0-cdh5.3.6

#Set path to where hadoop-*-core.jar is available

export HADOOP_MAPRED_HOME=/home/szc/cdh/hadoop-2.5.0-cdh5.3.6


#set the path to where bin/hbase is available

#export HBASE_HOME=


#Set the path to where bin/hive is available

export HIVE_HOME=/home/szc/apache-hive-2.3.7


#Set the path for where zookeper config dir is

export ZOOCFGDIR=/home/szc/zookeeper/zookeeper-3.4.9/conf

4. Upload the MySQL JDBC driver JAR to the lib directory
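For example, assuming the MySQL Connector/J JAR has already been downloaded to /home/szc (the file name and version here are assumptions; use whichever driver matches your MySQL server):

cp /home/szc/mysql-connector-java-5.1.49.jar lib/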

5. Verify Sqoop

[root@localhost sqoop-1.4.6]# bin/sqoop help
Warning: /home/szc/sqoop-1.4.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/szc/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/szc/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
20/05/07 07:39:50 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6

usage: sqoop COMMAND [ARGS]

Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.

Query which databases exist in MySQL:

[root@localhost sqoop-1.4.6]# bin/sqoop list-database --connect jdbc:mysql://192.168.0.102 --username root --password root

Warning: /home/szc/sqoop-1.4.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/szc/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/szc/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
No such sqoop tool: list-database. See 'sqoop help'.

The tool name is list-databases (plural), so rerun with the correct name:

[root@localhost sqoop-1.4.6]# bin/sqoop list-databases --connect jdbc:mysql://192.168.0.102 --username root --password root

Warning: /home/szc/sqoop-1.4.6/bin/../../hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
Warning: /home/szc/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /home/szc/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
20/05/07 07:45:11 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
20/05/07 07:45:11 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
20/05/07 07:45:11 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
Loading class `com.mysql.jdbc.Driver'. This is deprecated. The new driver class is `com.mysql.cj.jdbc.Driver'. The driver is automatically registered via the SPI and manual loading of the driver class is generally unnecessary.

information_schema
azkaban
ke
knowlegegraph
mysql
oozie
performance_schema
sys
test

Use Cases

Import all data of a table from MySQL to HDFS

[root@localhost sqoop-1.4.6]# bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users --target-dir /user/root/sqoop/users --delete-target-dir --num-mappers 1 --fields-terminated-by "\t"

import is the command for importing data. The options specify the JDBC URL of the database, the username and password, the table to import, the destination path on HDFS (--target-dir), that the destination path should be deleted if it already exists (--delete-target-dir), the number of mappers to use (--num-mappers), and the field separator (--fields-terminated-by).
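As the earlier WARN line suggested, passing --password on the command line is insecure; the -P flag prompts for the password interactively instead. A sketch of the same import using -P:

bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root -P --table users --target-dir /user/root/sqoop/users --delete-target-dir --num-mappers 1 --fields-terminated-by "\t"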

After completion, you can view the file content as follows

[root@localhost sqoop-1.4.6]# /home/szc/cdh/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -cat /user/root/sqoop/users/part-m-00000

20/05/07 07:58:43 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

1    songzeceng    szc    [email protected]
2    zeceng    szc    [email protected]
3    szc    sda    fd

Import a subset of the data from MySQL to HDFS, using a query

[root@localhost sqoop-1.4.6]# bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --target-dir /user/root/sqoop/users --delete-target-dir --num-mappers 1 --fields-terminated-by "\t" --query 'select * from users where id <=2 and $CONDITIONS;'

--query specifies the query statement. Besides your own conditions in the WHERE clause, $CONDITIONS must be appended so that Sqoop can substitute each mapper's split conditions into the WHERE clause. If the --query value is wrapped in double quotes, a backslash must be added before $CONDITIONS, i.e. --query "select * from users where id <=2 and \$CONDITIONS;". --table and --query cannot be used at the same time.
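One related caveat: when --query is combined with more than one mapper, Sqoop also requires --split-by to choose the column used to partition the rows among mappers. A sketch with two mappers, splitting on id (the column choice is an assumption based on the sample table):

bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --target-dir /user/root/sqoop/users --delete-target-dir --num-mappers 2 --split-by id --fields-terminated-by "\t" --query 'select * from users where id <= 2 and $CONDITIONS'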

View Results

[root@localhost sqoop-1.4.6]# /home/szc/cdh/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -cat /user/root/sqoop/users/part-m-00000

20/05/07 08:05:57 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

1    songzeceng    szc    [email protected]
2    zeceng    szc    [email protected]

Import a subset of the data from MySQL to HDFS, selecting specific columns

You can use --columns to specify the columns to be imported

[root@localhost sqoop-1.4.6]# bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users --target-dir /user/root/sqoop/users --delete-target-dir --num-mappers 1 --fields-terminated-by "\t" --columns username,email

View Results

[root@localhost sqoop-1.4.6]# /home/szc/cdh/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -cat /user/root/sqoop/users/part-m-00000

20/05/07 08:12:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

songzeceng    [email protected]
zeceng    [email protected]
szc    fd

Import a subset of the data from MySQL to HDFS, filtering rows with a condition

Use --where to specify the condition:

[root@localhost sqoop-1.4.6]# bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users --target-dir /user/root/sqoop/users --delete-target-dir --num-mappers 1 --fields-terminated-by "\t" --where "id=1"

View Results

[root@localhost sqoop-1.4.6]# /home/szc/cdh/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -cat /user/root/sqoop/users/part-m-00000

20/05/07 08:14:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

1    songzeceng    szc    [email protected]

--where can also be used with --columns

[root@localhost sqoop-1.4.6]# bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users --target-dir /user/root/sqoop/users --delete-target-dir --num-mappers 1 --fields-terminated-by "\t" --where "id=1" --columns=username,email

View Results

[root@localhost sqoop-1.4.6]# /home/szc/cdh/hadoop-2.5.0-cdh5.3.6/bin/hadoop fs -cat /user/root/sqoop/users/part-m-00000

20/05/07 08:15:04 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

songzeceng    [email protected]

However, --where cannot be combined with --query.

Import data from MySQL to Hive

[root@localhost sqoop-1.4.6]# bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users --hive-import --hive-overwrite --hive-table users --num-mappers 1 --fields-terminated-by "\t"

--hive-import indicates an import into Hive, --hive-overwrite overwrites existing data in the table, and --hive-table names the target Hive table; if the table does not exist, it will be created.
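To place the table in a Hive database other than default, the --hive-table value can be qualified with the database name; this form is commonly used, but verify it against your Sqoop version (the database name test_db here is hypothetical):

bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users --hive-import --hive-overwrite --hive-table test_db.users --num-mappers 1 --fields-terminated-by "\t"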

View Results

hive> select * from users;
OK

1    songzeceng    szc    [email protected]
2    zeceng    szc    [email protected]
3    szc    sda    fd

Time taken: 0.741 seconds, Fetched: 3 row(s)

Export data from HDFS/Hive to MySQL

[root@localhost sqoop-1.4.6]# bin/sqoop export --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users_sqoop --num-mappers 1 --export-dir /user/hive/warehouse/users --input-fields-terminated-by "\t"

export indicates that this is an export command, --export-dir specifies the HDFS directory to export, and --input-fields-terminated-by specifies the field separator used in each line of the files.

Sqoop does not create the MySQL table for you: the target table must exist beforehand, and you must ensure that its primary key values do not collide with the exported rows.
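A hypothetical DDL for the target table, inferred from the sample rows (the column types and lengths are assumptions):

mysql> create table users_sqoop (
    ->     id int primary key,
    ->     name varchar(64),
    ->     password varchar(64),
    ->     email varchar(128)
    -> );

With the table in place, the exported results are as follows: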

mysql> select * from users_sqoop;

+----+------------+----------+------------+
| id | name       | password | email      |
+----+------------+----------+------------+
|  1 | songzeceng | szc      | [email protected] |
|  2 | zeceng     | szc      | [email protected] |
|  3 | szc        | sda      | fd         |
+----+------------+----------+------------+

3 rows in set

Executing Sqoop from a script file

Create a job directory and write the sqoop_test.opt file

[root@localhost sqoop-1.4.6]# mkdir job

[root@localhost sqoop-1.4.6]# vim job/sqoop_test.opt

The content is as follows; each option and each value goes on its own line:

export
--connect
jdbc:mysql://192.168.0.102:3306/test
--username
root
--password
root
--table
users_sqoop
--num-mappers
1
--export-dir
/user/hive/warehouse/users
--input-fields-terminated-by
"\t"

Execute the script; --options-file specifies the script file:

[root@localhost sqoop-1.4.6]# bin/sqoop --options-file job/sqoop_test.opt

View the results in MySQL (the users_sqoop table was emptied beforehand).

Before executing the script

mysql> select * from users_sqoop;
Empty set

After executing the script

mysql> select * from users_sqoop;

+----+------------+----------+------------+
| id | name       | password | email      |
+----+------------+----------+------------+
|  1 | songzeceng | szc      | [email protected] |
|  2 | zeceng     | szc      | [email protected] |
|  3 | szc        | sda      | fd         |
+----+------------+----------+------------+

3 rows in set
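One more note on options files: lines beginning with # are treated as comments, and regular command-line options can still be mixed with --options-file in the same invocation; for example (the --verbose flag here is just an illustration):

bin/sqoop --options-file job/sqoop_test.opt --verbose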

Conclusion

That covers the basics. The HBase import has not been tried here; see the official documentation for details.
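For reference only, an HBase import would look roughly like the following, once HBASE_HOME is set in conf/sqoop-env.sh (untested here; the HBase table name, column family, and row key column are assumptions):

bin/sqoop import --connect jdbc:mysql://192.168.0.102:3306/test --username root --password root --table users --hbase-table users --column-family info --hbase-row-key id --hbase-create-table --num-mappers 1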
