Use Sqoop to export Hive data to MySQL

Background: Sqoop is mainly used to transfer data between Hadoop (Hive) and traditional databases (MySQL, PostgreSQL, ...). It can import data from a relational database (for example MySQL, Oracle, or Postgres) into HDFS, and it can also export HDFS data back into a relational database. This post uses MySQL as the example and shows how to use both import and export.

For details, refer to the official document: http://sqoop.apache.org/docs/1.4.5/SqoopUserGuide.html#_purpose_4

1. First, install a version of Sqoop that matches your Hadoop/Hive installation
2. Copy the MySQL JDBC driver jar into /home/sqoop/lib (a short setup sketch follows)
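A minimal setup sketch, assuming the driver jar is named mysql-connector-java-5.1.47.jar in the current directory and that the MySQL account root/123456 used throughout this post exists (both are placeholders to adjust):

cp mysql-connector-java-5.1.47.jar /home/sqoop/lib/
# confirm Sqoop is on the PATH and which version is installed
sqoop version
# quick connectivity check: list the databases the account can see
sqoop list-databases --connect jdbc:mysql://localhost --username root --password 123456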

3. Usage

Import data into HDFS

1) Method one: import a whole table
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--table table123 \
--columns id,name \
--target-dir /user/foo/joinresults \
--num-mappers 1
Note: --target-dir is the HDFS output path, --num-mappers sets the number of map tasks, --columns limits the import to the listed columns, and --as-parquetfile can optionally be added to change the output file format.
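A quick way to check what landed on HDFS (a sketch; part-m-00000 is the usual single-mapper output file and may differ in your run):

hdfs dfs -ls /user/foo/joinresults
# peek at the first few imported rows
hdfs dfs -cat /user/foo/joinresults/part-m-00000 | head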
2) Method two: free-form SQL query
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \
--target-dir /user/foo/joinresults \
--delete-target-dir \
--compress \
--compression-codec org.apache.hadoop.io.compress.SnappyCodec \
--fields-terminated-by '|'
Note: the WHERE $CONDITIONS placeholder is required whenever --query is used; --delete-target-dir deletes the target directory if it already exists; --compress together with --compression-codec enables Snappy compression; --fields-terminated-by sets the field delimiter.
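When --query is combined with more than one mapper, Sqoop also requires --split-by so it can divide the query into ranges. A variant sketch, assuming id is a numeric column that splits the data reasonably evenly:

sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \
--split-by id \
--num-mappers 4 \
--target-dir /user/foo/joinresults \
--delete-target-dir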
3) Incremental import: besides filtering new rows with a WHERE condition in the SQL, Sqoop has dedicated incremental options:
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \
--target-dir /user/foo/joinresults \
--check-column id \
--incremental append \
--last-value 7636
Note: --check-column names the column used to detect new rows (here id); --incremental append imports only rows whose check column is greater than --last-value (7636); lastmodified is the other incremental mode; --delete-target-dir cannot be used together with --incremental append, so it is omitted here.
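Rather than tracking --last-value by hand, the incremental import can be stored as a Sqoop saved job, which records the last imported value after each run. A sketch, with the job name table123_incr chosen here only for illustration:

sqoop job --create table123_incr -- import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--table table123 \
--target-dir /user/foo/joinresults \
--check-column id \
--incremental append \
--last-value 0
# each execution imports only the new rows and updates the stored last-value
sqoop job --exec table123_incr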
4) Fast import with --direct
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \
--target-dir /user/foo/joinresults \
--delete-target-dir \
--direct
Note: --direct pulls the data with MySQL's native mysqldump tool instead of going through JDBC, which is much faster than the default path.
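Direct mode only works if mysqldump is available on the nodes that run the map tasks; a quick check on a worker node:

which mysqldump || echo "mysqldump not found - install the MySQL client tools before using --direct"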
5) Data export (HDFS to MySQL)
sqoop export \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--table table123 \
--export-dir /user/foo/joinresults
Note: --export-dir is the HDFS path whose files are read and inserted into the MySQL table table123.
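sqoop export does not create the target table, so it must already exist in MySQL. A minimal sketch of creating it through the mysql client, assuming the data has just the two columns id and name (adjust the DDL to the real schema):

mysql -u root -p123456 db -e "CREATE TABLE IF NOT EXISTS table123 (id INT, name VARCHAR(255))"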
6) Import into Hive and export from Hive
Import MySQL data into a Hive table:
sqoop import \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--query 'select * from table123 WHERE $CONDITIONS' \
--target-dir /user/foo/joinresults \
--fields-terminated-by '|' \
--hive-import \
--hive-database default \
--hive-table table123
Note: the WHERE $CONDITIONS placeholder is required with --query, and so is a --target-dir (used here as a staging directory); --hive-import loads the result into the Hive table default.table123 with '|' as the field delimiter.
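To confirm the rows arrived, query the Hive table from the shell (a sketch using the hive CLI; beeline works the same way with a JDBC URL):

hive -e "SELECT * FROM default.table123 LIMIT 5"
hive -e "SELECT COUNT(*) FROM default.table123"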
Export a Hive table to MySQL:
sqoop export \
--connect jdbc:mysql://localhost/db \
--username root \
--password 123456 \
--table table123 \
--export-dir /user/foo/joinresults \
--input-fields-terminated-by '|'
Note: --export-dir must point to the HDFS directory where the Hive table's data files are stored (for a managed table, its location under the Hive warehouse), and --input-fields-terminated-by must match the delimiter the Hive table was written with.
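After the export finishes, a quick row-count comparison between the two sides catches delimiter or type mismatches early (reusing the placeholder credentials from above):

mysql -u root -p123456 db -e "SELECT COUNT(*) FROM table123"
hive -e "SELECT COUNT(*) FROM default.table123"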
Note: with Sqoop versions before 1.4.6, importing from the database into Hive fails if the Hive table's storage format is Parquet.

Running Sqoop from an options file
bin/sqoop --options-file /opt/script/sqoop_test.txt (options that are not in the file, such as --table tablename, can still be appended on the command line)
vi sqoop_test.txt
import
--connect
jdbc:mysql://localhost/db
--username
root
--password
123456
--table
table123
--target-dir
/user/foo/joinresults
--num-mappers
1
---------------------
Author: hh_666
Source: CSDN
Original: https://blog.csdn.net/qq_34485930/article/details/80868017
Copyright: this is the blogger's original article; please include a link to the original post when reprinting.
