Background: Sqoop is mainly used to transfer data between Hadoop (Hive) and traditional relational databases (MySQL, PostgreSQL, ...). It can import data from a relational database (e.g. MySQL, Oracle, Postgres) into HDFS, and it can also export HDFS data back into a relational database. This article takes MySQL as the example to introduce import and export.
For details, refer to the official document: http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#_purpose_4
1. First install the corresponding version of sqoop
2. Copy the MySQL JDBC driver jar to /home/sqoop/lib
3. Introduction
Import data into HDFS
1) Method one
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--table table123 \
--target-dir /user/foo/joinresults \ # HDFS target path
--num-mappers 1 \ # number of map tasks
--as-parquetfile \ # optional: store the output as Parquet
--columns id,name # optional: import only these columns
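After the job finishes, each map task writes one output file under the target dir; with --num-mappers 1 there is a single part-m-00000, and sqoop's default text delimiter is a comma. A local simulation of what `hdfs dfs -cat /user/foo/joinresults/part-m-00000` would show (the rows here are made up):

```shell
# Simulated sqoop text output: one file per map task, comma-delimited by default.
# (Real output lives on HDFS: hdfs dfs -ls /user/foo/joinresults)
mkdir -p joinresults
printf '1,alice\n2,bob\n' > joinresults/part-m-00000
ls joinresults
cat joinresults/part-m-00000
```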
2) Method two: free-form SQL
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \ # the WHERE $CONDITIONS clause is required
--target-dir /user/foo/joinresults \ #hdfs path
--delete-target-dir \ #If the target directory already exists, delete
--compress \ #specify compression
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \ # use Snappy compression
--fields-terminated-by '|' # field separator
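With --fields-terminated-by '|', the files written to HDFS are pipe-delimited, and downstream consumers must split on that character. A local sketch (the file name follows sqoop's part-m-00000 convention; the data is made up):

```shell
# Simulated pipe-delimited import file.
cat > part-m-00000 <<'EOF'
1|alice
2|bob
EOF
# Extract the second field the way a downstream job might:
cut -d'|' -f2 part-m-00000   # prints: alice, bob
```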
3) Incremental import: besides filtering new rows with a WHERE condition in the SQL, sqoop provides dedicated command options:
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \ # the WHERE $CONDITIONS clause is required
--target-dir /user/foo/joinresults \ # HDFS path
#--delete-target-dir \ # delete the target dir if it exists (do not use together with --incremental append below)
--check-column id \ # which column to check for new rows, e.g. id
--incremental append \ # append mode; lastmodified is also available
--last-value 7636 # import rows whose id is greater than 7636
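One way to drive repeated incremental runs is to keep the last imported id in a state file and feed it to --last-value each time. The sketch below only echoes the sqoop command instead of executing it (a live cluster is needed for the real run); note that sqoop can also persist this state itself via saved jobs (`sqoop job --create`).

```shell
# State file holding the highest id imported so far (starts at 0 on first run).
STATE_FILE=last_value.txt
[ -f "$STATE_FILE" ] || echo 0 > "$STATE_FILE"
LAST=$(cat "$STATE_FILE")

# Echoed, not executed: the incremental import picks up rows with id > $LAST.
echo sqoop import \
  --connect jdbc:mysql://localhost/db \
  --username root --password 123456 \
  --table table123 \
  --target-dir /user/foo/joinresults \
  --check-column id \
  --incremental append \
  --last-value "$LAST"

# After a successful run, record the new maximum id for the next invocation:
echo 7636 > "$STATE_FILE"
```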
4) Fast data import (direct mode)
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \ # the WHERE $CONDITIONS clause is required
--target-dir /user/foo/joinresults \ # HDFS target path
--delete-target-dir \ # delete the target directory if it already exists
--direct # direct mode: streams data through MySQL's mysqldump tool instead of JDBC, which is much faster
5) Data export
sqoop export \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--table table123 \
--export-dir /user/foo/joinresults # HDFS path of the data to export
6) Import into Hive and export from Hive
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \ # the WHERE $CONDITIONS clause is required
--fields-terminated-by '|' \
--hive-import \
--hive-database default \
--hive-table table123
Export Hive to MySQL:
sqoop export \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--table table123 \
--export-dir /user/foo/joinresults \ # HDFS path where the Hive table's data is stored
--input-fields-terminated-by '|' # field separator used in the Hive data files
Note:
Before version 1.4.6, importing from a database into Hive fails if the Hive table's storage format is Parquet.
Script execution method
bin/sqoop --options-file /opt/script/sqoop_test.txt --table tablename (parameters can also be passed this way)
vi sqoop_test.txt
import
--connect
jdbc:mysql://localhost/db
--username
root
--password
123456
--table
table123
--target-dir
/user/foo/joinresults
--num-mappers
1
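An options file like the one above can also be generated from a script: one option (or the tool name) per line, with blank lines and lines starting with # treated as comments. A sketch that builds a shortened version of the file and echoes the invocation rather than running sqoop (paths are hypothetical):

```shell
# Options files take the tool name first, then one flag or value per line;
# '#' lines are comments.
cat > /tmp/sqoop_test.txt <<'EOF'
# connection settings
import
--connect
jdbc:mysql://localhost/db
--username
root
EOF
# Remaining options can still be passed on the command line:
echo bin/sqoop --options-file /tmp/sqoop_test.txt --table table123
```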
---------------------
Author: hh_666
Source: CSDN
Original: https://blog.csdn.net/qq_34485930/article/details/80868017
Copyright statement: this is the blogger's original article; please include a link to the original post when reposting.