Sqoop common commands and parameters
Common parameters

| Parameter | Description |
|---|---|
| --connect | JDBC connection URL, e.g. jdbc:mysql://hadoop102:3306 |
| --username | Login account |
| --password | Login password |
| --driver | JDBC driver class (usually inferred from the URL, so it can be omitted) |
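The connection parameters alone are enough to check that Sqoop can reach the database. A minimal sketch, assuming the hadoop102 host from the example URL and placeholder credentials root/000000:

```shell
# List all databases on the MySQL server -- a quick connectivity check.
# root/000000 are placeholder credentials.
sqoop list-databases \
--connect jdbc:mysql://hadoop102:3306 \
--username root \
--password 000000
```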
import control parameters

| Parameter | Description |
|---|---|
| --append | Append the imported data to the target directory |
| --as-textfile | Save the data on HDFS as plain text files |
| --as-parquetfile | Save the data on HDFS in Parquet (columnar) format |
| --compress (-z) | Enable compression |
| --compression-codec | Compression codec to use (default gzip) |
| --delete-target-dir | Delete the target directory before importing (prevents dirty data left over from a partially failed import) |
| --fetch-size | Number of rows fetched from MySQL per batch |
| --num-mappers (-m) | Number of map tasks used to import data in parallel |
| --query (-e) | Import the result of a SQL query instead of a whole table |
| --columns | Columns of the MySQL table to import |
| --table | MySQL table to import from |
| --where | WHERE condition applied to the imported rows |
| --split-by | Column used to split the data among the map tasks |
| --target-dir | HDFS path to save the data to |
| --null-string | String written to HDFS when a string column is NULL (Hive stores NULL as \N) |
| --null-non-string | String written to HDFS when a non-string column is NULL (for example: --null-non-string '\N') |
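A full-table import combining the parameters above might look like this sketch; the database gmall, the table user_info, and the output path are assumptions for illustration:

```shell
# Import table user_info from MySQL into HDFS as tab-separated text,
# using 2 map tasks split on the id column and replacing previous output.
sqoop import \
--connect jdbc:mysql://hadoop102:3306/gmall \
--username root \
--password 000000 \
--table user_info \
--columns id,name \
--where "id >= 1" \
--target-dir /origin_data/gmall/user_info \
--delete-target-dir \
--num-mappers 2 \
--split-by id \
--fields-terminated-by '\t' \
--null-string '\N' \
--null-non-string '\N' \
--compress
```

Note that --table/--columns/--where and --query are mutually exclusive: a --query statement must contain the literal token $CONDITIONS in its WHERE clause, and --split-by is required whenever more than one mapper is used with it.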
Incremental data import

| Parameter | Description |
|---|---|
| --check-column | Column used to identify incremental data |
| --incremental (append / lastmodified) | append: import only new rows; lastmodified: import new and modified rows (modification is usually detected through a timestamp column) |
| --last-value | Maximum value of the check column from the previous import; rows with larger values are treated as incremental data |
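An incremental append run can be sketched as follows (the table order_info and the last-value 1000 are illustrative):

```shell
# Import only rows with id > 1000; at the end of the run Sqoop prints
# the new last-value to use for the next incremental import.
sqoop import \
--connect jdbc:mysql://hadoop102:3306/gmall \
--username root \
--password 000000 \
--table order_info \
--target-dir /origin_data/gmall/order_info \
--check-column id \
--incremental append \
--last-value 1000 \
--num-mappers 1
```

--incremental append cannot be combined with --delete-target-dir, since the point is to keep the previously imported data.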
Delimiters

| Parameter | Description |
|---|---|
| --fields-terminated-by | Field separator used when writing data to HDFS |
| --lines-terminated-by | Line separator used when writing data to HDFS |
Import directly into a Hive table

| Parameter | Description |
|---|---|
| --hive-import | Import the data directly into a Hive table |
| --hive-overwrite | Overwrite the existing data in the Hive table |
| --create-hive-table | Create the Hive table automatically if it does not exist; fail with an error if it already exists |
| --hive-table | Name of the target Hive table |
| --hive-partition-key | Partition column name used when importing into the Hive table |
| --hive-partition-value | Partition value used when importing into the Hive table |
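Putting the Hive parameters together, a sketch of a partitioned Hive import (the Hive database gmall, the partition column dt, and the date value are assumptions):

```shell
# Import user_info straight into one partition of a Hive table,
# replacing that partition's previous contents.
sqoop import \
--connect jdbc:mysql://hadoop102:3306/gmall \
--username root \
--password 000000 \
--table user_info \
--hive-import \
--hive-overwrite \
--hive-table gmall.user_info \
--hive-partition-key dt \
--hive-partition-value 2021-01-14 \
--fields-terminated-by '\t' \
--num-mappers 1
```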
export control parameters

| Parameter | Description |
|---|---|
| --columns | Columns of the MySQL table to export the data into |
| --num-mappers (-m) | Number of map tasks |
| --table | MySQL table to export the data into |
| --export-dir | HDFS path of the data to export |
| --update-key | Column used to match rows in HDFS with existing rows in MySQL |
| --update-mode (updateonly / allowinsert) | updateonly: only update rows that already exist; allowinsert: update existing rows and insert new ones |
| --input-null-string | Value in HDFS string columns that should be stored as NULL in MySQL |
| --input-null-non-string | Value in HDFS non-string columns that should be stored as NULL in MySQL |
Delimiters

| Parameter | Description |
|---|---|
| --input-fields-terminated-by | Field separator of the input data in HDFS |
| --input-lines-terminated-by | Line separator of the input data in HDFS |
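An export in the opposite direction, HDFS to MySQL, can be sketched like this (the result table ads_user_topic and its HDFS path are hypothetical):

```shell
# Export tab-separated HDFS data into MySQL; rows whose id already
# exists are updated, rows with new ids are inserted (allowinsert).
sqoop export \
--connect jdbc:mysql://hadoop102:3306/gmall \
--username root \
--password 000000 \
--table ads_user_topic \
--export-dir /warehouse/gmall/ads/ads_user_topic \
--input-fields-terminated-by '\t' \
--update-key id \
--update-mode allowinsert \
--input-null-string '\N' \
--input-null-non-string '\N' \
--num-mappers 1
```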