The following article covers importing a MySQL table into a Hive partitioned table stored in TEXTFILE format.
For importing a MySQL table into a Hive partitioned table in ORC storage format, see the companion article.
1. Importing data from MySQL directly into the HDFS storage location of a Hive table partition (mysql -->> hdfs)
sqoop import \
--connect jdbc:mysql://IP:3306/DATABASE \
--username USERNAME --password PWD \
--fields-terminated-by ',' \
--m 1 \
--query "select * from TABLE where COLUMN='VALUE' and \$CONDITIONS" \
--target-dir /user/hive/warehouse/DATABASE.db/TABLE/PARTITION_NAME=PARTITION_VALUE/ \
--delete-target-dir
The capitalized placeholders need to be replaced with your own values.
The and \$CONDITIONS at the end of the where clause must not be removed; Sqoop requires it whenever --query is used.
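Note that the command above only copies files into the partition directory on HDFS; Hive will not see the data until the partition is registered in the metastore. A minimal sketch, run in the Hive shell with the same placeholder names as above:

```sql
-- Register the partition whose directory Sqoop just wrote:
ALTER TABLE DATABASE.TABLE ADD IF NOT EXISTS
  PARTITION (PARTITION_NAME='PARTITION_VALUE');
-- Or let Hive discover every partition directory already present on HDFS:
MSCK REPAIR TABLE DATABASE.TABLE;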
2. mysql -->> hive partitioned table (the Hive table will be created automatically if it does not exist)
sqoop import \
--connect jdbc:mysql://IP:3306/DB \
--username root --password PWD \
--query "select * from tab_task where task_createTime='2020-12-30' and \$CONDITIONS" \
--fields-terminated-by ',' \
--delete-target-dir \
--hive-import \
--m 1 \
--hive-partition-key dt \
--hive-partition-value 2020-12-30 \
--hive-database DB \
--hive-table tab_task \
--target-dir /user/hive/warehouse/DB.db/tab_task/dt=2020-12-30/ \
--direct
If the partition source column task_createTime is not already in yyyy-MM-dd format, you can derive it in the query with MySQL's date_format(task_createTime,'%Y-%m-%d').
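For example, if task_createTime were a DATETIME column, the --query from step 2 could derive the partition value like this (a sketch using the same table and column names as above):

```sql
SELECT * FROM tab_task
WHERE date_format(task_createTime, '%Y-%m-%d') = '2020-12-30'
  -- inside a double-quoted --query, write this as \$CONDITIONS
  AND $CONDITIONS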
- To import into a partitioned table, select the data with --query
- The query must end with and \$CONDITIONS
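When the step-2 job runs daily, the date appears in three flags (--query, --hive-partition-value, --target-dir) and they must stay consistent. A small hypothetical helper (not from the original article) that builds the per-day --target-dir from one date value:

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the Sqoop --target-dir for a given
# database, table, and partition date, matching the layout in step 2.
build_target_dir() {
  local db="$1" table="$2" dt="$3"
  printf '/user/hive/warehouse/%s.db/%s/dt=%s/\n' "$db" "$table" "$dt"
}

build_target_dir DB tab_task 2020-12-30
```

The same date variable can then be spliced into --query and --hive-partition-value so a scheduler only has to supply one parameter.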