Sqoop common instructions


Public parameters (see the connectivity check after this list)

--connect Specify the JDBC connection URL, for example: jdbc:mysql://hadoop102:3306
--username Specify the login account
--password Specify the login password
--driver Specify the JDBC driver class (usually inferred from the URL, so it can be omitted)
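
These public parameters are enough for a quick connectivity check with Sqoop's list-databases tool. A minimal sketch, where the host, account, and password are placeholders to replace with your own:

# List the databases visible to this account; a cheap way to verify
# --connect, --username, and --password before running a real import.
# (hadoop102, root, and 123456 are placeholder values)
sqoop list-databases \
--connect jdbc:mysql://hadoop102:3306 \
--username root \
--password 123456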
import control parameters (see the example after this list)

--append Append the imported data to an existing directory
--as-textfile Save the data on HDFS as plain text files
--as-parquetfile Save the data on HDFS in Parquet (columnar storage) format
--compress [-z] Enable compression
--compression-codec Specify the compression codec (default: gzip)
--delete-target-dir Delete the target directory before importing (prevents dirty data left behind by a partially failed import)
--fetch-size Specify how many rows are fetched from MySQL per batch
--num-mappers [-m] Set how many map tasks import the data in parallel
--query [-e] Import the result set of a SQL query instead of a whole table; the query must contain the token $CONDITIONS and cannot be combined with --table, --columns, or --where
--columns Specify which columns of the MySQL table to import
--table Specify which MySQL table to import from
--where Specify a WHERE condition that filters the rows imported from MySQL
--split-by Specify the column used to split the rows among map tasks
--target-dir Specify the directory on HDFS where the data is saved
--null-string The string written to HDFS when a string column is NULL (Hive stores NULL as \N)
--null-non-string The string written to HDFS when a non-string column is NULL (for example: --null-non-string '\\N')
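
Putting the import parameters together, here is a sketch of a full-table import; the database mydb, table user_info, column names, and HDFS path are made-up placeholders:

# Import one MySQL table into HDFS as gzip-compressed text files,
# writing NULL as \N so Hive later reads the value back as NULL.
# (mydb, user_info, and the paths are placeholder values)
sqoop import \
--connect jdbc:mysql://hadoop102:3306/mydb \
--username root \
--password 123456 \
--table user_info \
--columns "id,name,create_time" \
--where "id >= 100" \
--target-dir /origin_data/mydb/user_info \
--delete-target-dir \
--num-mappers 2 \
--split-by id \
--compress \
--compression-codec gzip \
--null-string '\\N' \
--null-non-string '\\N'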
Incremental data import (see the example after this list)

--check-column Specify the column used to identify incremental data
--incremental {append|lastmodified} append: import only newly added rows; lastmodified: import new and modified rows (modification is usually detected through a timestamp column)
--last-value Specify the largest value of the check column from the previous import; rows beyond it are treated as incremental data
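
A sketch of an append-mode incremental import, assuming an auto-increment id column and that 1000 was the largest id imported last time (append mode cannot be combined with --delete-target-dir):

# Import only rows with id > 1000; at the end of the job Sqoop logs
# the new --last-value to pass to the next incremental run.
# (mydb, order_info, and the path are placeholder values)
sqoop import \
--connect jdbc:mysql://hadoop102:3306/mydb \
--username root \
--password 123456 \
--table order_info \
--target-dir /origin_data/mydb/order_info \
--check-column id \
--incremental append \
--last-value 1000 \
--num-mappers 1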
Delimiters (see the snippet after this list)

--fields-terminated-by Specify the field delimiter used when the data is written to HDFS
--lines-terminated-by Specify the line (record) delimiter used when the data is written to HDFS
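
For example, adding these two flags to an import writes tab-separated fields with one record per line, matching the delimiters a Hive text table would typically declare; \t and \n are just the common choice, not a requirement:

# Write tab-separated fields, one record per line.
# (mydb, user_info, and the path are placeholder values)
sqoop import \
--connect jdbc:mysql://hadoop102:3306/mydb \
--username root \
--password 123456 \
--table user_info \
--target-dir /origin_data/mydb/user_info \
--fields-terminated-by '\t' \
--lines-terminated-by '\n' \
--num-mappers 1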
Import directly into a Hive table (see the example after this list)

--hive-import Import the data directly into a Hive table
--hive-overwrite Overwrite the existing data in the Hive table instead of appending
--create-hive-table Create the Hive table automatically if it does not exist; if it already exists, the job fails with an error
--hive-table Specify the name of the target Hive table
--hive-partition-key Specify the partition column name when importing into the Hive table
--hive-partition-value Specify the partition column value when importing into the Hive table
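
A sketch that loads a MySQL table straight into a partitioned Hive table; the Hive database/table names and the dt partition are placeholders:

# Import into Hive, overwriting the table's existing data; with
# --hive-import the table is created automatically if it is missing.
# (mydb, user_info, dt, and 2021-01-01 are placeholder values)
sqoop import \
--connect jdbc:mysql://hadoop102:3306/mydb \
--username root \
--password 123456 \
--table user_info \
--num-mappers 1 \
--fields-terminated-by '\t' \
--hive-import \
--hive-overwrite \
--hive-table mydb.user_info \
--hive-partition-key dt \
--hive-partition-value 2021-01-01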
export control parameters (see the example after the delimiter list below)

--columns Specify which MySQL columns the data is written into
--num-mappers [-m] Specify the number of map tasks
--table Specify which MySQL table the data is exported into
--export-dir Specify the HDFS path of the data to be exported
--update-key Specify the column used to match rows in HDFS against rows already in MySQL
--update-mode {updateonly|allowinsert} updateonly: only update rows whose key already exists in MySQL; allowinsert: update matching rows and insert the rest
--input-null-string Specify the string in HDFS that represents NULL for string columns, so it is stored as NULL in MySQL
--input-null-non-string Specify the string in HDFS that represents NULL for non-string columns, so it is stored as NULL in MySQL
Delimiters

--input-fields-terminated-by Specify the field delimiter of the data in HDFS
--input-lines-terminated-by Specify the line (record) delimiter of the data in HDFS
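
Putting the export parameters together, a sketch that writes warehouse results back into MySQL as an upsert; the table, path, and key column are placeholders, and with MySQL, allowinsert relies on the target table having a primary or unique key on the update-key column:

# Export tab-separated HDFS files into MySQL; rows whose id already
# exists are updated, all other rows are inserted.
# (mydb, result_table, and the path are placeholder values)
sqoop export \
--connect jdbc:mysql://hadoop102:3306/mydb \
--username root \
--password 123456 \
--table result_table \
--export-dir /warehouse/mydb/result_table \
--input-fields-terminated-by '\t' \
--update-key id \
--update-mode allowinsert \
--input-null-string '\\N' \
--input-null-non-string '\\N' \
--num-mappers 1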

Download the table: https://download.csdn.net/download/qq_38705144/14425591


Origin: blog.csdn.net/qq_38705144/article/details/112685667