Sqoop data export and import

One. About Sqoop:
Sqoop is an open-source tool used mainly to transfer data between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL, ...). It can import data from a relational database (for example MySQL, Oracle, or Postgres) into Hadoop's HDFS, and it can also export data from HDFS into a relational database.
The Sqoop project began in 2009 as a third-party module for Hadoop. Later, to let users deploy it quickly and to let developers iterate faster, Sqoop became an independent Apache project.
The latest Sqoop2 release is 1.99.7. Note that Sqoop1 and Sqoop2 are not compatible with each other, and Sqoop2 is not feature complete, so it is not intended for production deployment.
Two. Sqoop principle:
Sqoop translates import and export commands into MapReduce programs. The generated MapReduce job mainly customizes the InputFormat and the OutputFormat.
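
To make the principle concrete, here is a small sketch (not from the original post; the --split-by column id is an assumption for illustration). An import with several mappers runs as a map-only MapReduce job in which each mapper reads one slice of the table and writes one part-m-* file:

# map-only job: 4 mappers split the emp table on an assumed primary key column "id"
sqoop import --connect jdbc:mysql://HadoopNode1:3306/test --table emp --username root --password 123456 --target-dir /user/sheng/input/sqoop1/emp4 --split-by id -m 4
# the target directory then contains part-m-00000 ... part-m-00003, one file per mapper
hdfs dfs -ls /user/sheng/input/sqoop1/emp4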
Three. Importing data:
In Sqoop, "import" means transferring data from outside the big data cluster (an RDBMS) into the big data cluster (HDFS, Hive, HBase); this direction uses the import keyword.
1 mysql ----> HDFS:

# Syntax:
sqoop import --connect <JDBC URL> --table <table> --username <user> --password <password> --target-dir <HDFS target directory> --fields-terminated-by '\t' -m <number of mappers>

# Example:
sqoop import --connect jdbc:mysql://HadoopNode1:3306/test --table emp --username root --password 123456 --target-dir /user/sheng/input/sqoop1/stu --fields-terminated-by '\t' -m 1
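
A quick way to verify the import is to list the target directory and read the generated file (part-m-00000 is the file the single mapper from the -m 1 example writes):

hdfs dfs -ls /user/sheng/input/sqoop1/stu
hdfs dfs -cat /user/sheng/input/sqoop1/stu/part-m-00000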

2 mysql ----> HDFS incremental import:

When new rows keep being added to a MySQL table, you often want to import only the newly added records into HDFS.

In real work the data in database tables keeps growing, a consumption table for example, so each run should import only the increment. Re-importing the whole table every time wastes time and effort. If the table has gained rows, import just those into Hadoop; if nothing was added, import nothing. This is the incremental import, controlled by the three options below.

  • --incremental append: perform an incremental (append-mode) import

  • --check-column: the column whose values mark newly added rows (the incremental criterion)

  • --last-value: must be given for an incremental import; the largest value of the check column from the previous import, otherwise rows already in HDFS would be imported again



    # Syntax:
    sqoop import --connect <JDBC URL> --table <table> --username <user> --password <password> --target-dir <HDFS target directory> --fields-terminated-by '\t' -m <number of mappers> --incremental append --check-column <column that marks new rows> --last-value <largest value already imported>

    # Example:
    sqoop import --connect jdbc:mysql://HoodpNode3:3306/test --table student --username root --password 123456 --target-dir /user/sheng/input/sqoop1/stu --fields-terminated-by '\t' -m 1 --incremental append --check-column empno --last-value 7934
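
Keeping track of --last-value by hand is easy to get wrong. One option, sketched here with a made-up job name incr_student, is to save the command as a Sqoop job; the metastore then remembers the last imported value and updates it after every run:

sqoop job --create incr_student -- import --connect jdbc:mysql://HoodpNode3:3306/test --table student --username root --password 123456 --target-dir /user/sheng/input/sqoop1/stu --fields-terminated-by '\t' -m 1 --incremental append --check-column empno --last-value 7934
# each execution imports only rows with empno greater than the stored value
sqoop job --exec incr_student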



3 mysql ----> Hive

# Syntax:
sqoop import --connect <JDBC URL> --table <table> --username <user> --password <password> --hive-import --create-hive-table --hive-table <database>.<table> --fields-terminated-by '\t' -m 1

# Example:
sqoop import --connect jdbc:mysql://HadoopNode1:3306/test --table emp --username root --password 123456 --hive-import --create-hive-table --hive-table default.emp --fields-terminated-by '\t' -m 1
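
Note that with --create-hive-table the job fails if the Hive table already exists; drop that flag when re-importing into an existing table. To spot-check the result (a sketch, assuming the hive CLI is on the path):

hive -e 'select * from default.emp limit 5;'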

4 HDFS ----> mysql:

# Syntax:
sqoop export --connect <JDBC URL> --username <user> --password <password> --table <table> --export-dir <HDFS path> --input-fields-terminated-by '\t'


# Example:
sqoop export --connect jdbc:mysql://HadoopNode1:3306/test --table hdfs_mysql --username root --password 123456 --export-dir /user/sheng/input/sqoop1/stu/part-m-00000 --input-fields-terminated-by '\t' -m 1
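
sqoop export does not create the MySQL table, so hdfs_mysql must already exist before the command runs. A minimal sketch; the column names and types here are assumptions for illustration only:

# run once in MySQL before exporting; match the columns to the fields in the HDFS files
mysql -u root -p123456 -e 'CREATE TABLE test.hdfs_mysql (empno INT, ename VARCHAR(50), sal DOUBLE);'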

5 HIVE ----> mysql:

# Syntax:
sqoop export --connect <JDBC URL> --username <user> --password <password> --table <table> --export-dir <path of the Hive table in the warehouse> --input-fields-terminated-by '\t'

# Example:

sqoop export --connect jdbc:mysql://HadoopNode1:3306/test --username root --password 123456 --table hive_mysql --export-dir /user/hive/warehouse/emp/emp.txt --input-fields-terminated-by '\t'
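
The same rule applies here: the hive_mysql table must already exist in MySQL before the export. To spot-check the exported rows afterwards (a sketch, assuming the mysql client is available on the node):

mysql -u root -p123456 -e 'SELECT * FROM test.hive_mysql LIMIT 5;'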

Optionally, adding --direct to the export makes Sqoop use MySQL's native mysqlimport tool instead of JDBC inserts, which is usually faster.