Sqoop Import: HDFS | Hive | HBase

Importing data (into the cluster)

In Sqoop, "import" means transferring data from a non-big-data cluster (an RDBMS) into the big-data cluster (HDFS, Hive, HBase). This is done with the import keyword.
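
As a quick orientation before the individual scenarios below, Sqoop's built-in help prints every option the import tool accepts (run from the Sqoop installation directory, as in the commands that follow):

$ bin/sqoop help import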

1 RDBMS to HDFS

1) Make sure the MySQL service is running normally
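
How to check depends on how MySQL was installed; the sketch below assumes a systemd-managed service named mysqld (on some distributions it is called mysql) and reuses the root/000000 credentials from the next step:

$ sudo systemctl status mysqld          # service name is an assumption; may be 'mysql'
$ mysql -uroot -p000000 -e "select 1;"  # quick connectivity check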

2) Create a table in MySQL and insert some data

$ mysql -uroot -p000000
mysql> create database company;
mysql> create table company.staff(id int(4) primary key not null auto_increment, name varchar(255), sex varchar(255));
mysql> insert into company.staff(name, sex) values('Thomas', 'Male');
mysql> insert into company.staff(name, sex) values('Catalina', 'FeMale');
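
Before importing, it can be worth confirming that both rows are there and that auto_increment assigned ids 1 and 2, since the later examples filter on id:

mysql> select * from company.staff;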

3) Data Import

(1) Import the full table

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--table staff \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t"
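
When the job finishes, the data sits under the target directory as part files written by the map tasks. A quick way to inspect it (part-m-00000 is the usual file name for a single-mapper job, not something shown in the tutorial itself):

$ hadoop fs -ls /user/company
$ hadoop fs -cat /user/company/part-m-00000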

(2) Import the result of a query

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query 'select name,sex from staff where id <=1 and $CONDITIONS;'

Tip: the WHERE clause must contain '$CONDITIONS'. Sqoop replaces this placeholder with the condition it passes to each map task, which guarantees the consistency of the data that is finally written.

If the query is wrapped in double quotes, the $ in front of CONDITIONS must be escaped as \$CONDITIONS so the shell does not expand it as a variable.
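
For example, a sketch of the same import with the query written in double quotes; the only difference is the backslash in front of $CONDITIONS:

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--query "select name,sex from staff where id <=1 and \$CONDITIONS;"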

(3) Import specified columns

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--columns id,sex \
--table staff

Tip: when --columns lists multiple columns, separate them with commas and do not add spaces after the commas.

(4) Filter the imported data with a WHERE condition

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t" \
--table staff \
--where "id=1"
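
--columns and --where can also be combined in a single --table import; a sketch reusing the same connection settings (the sex='Male' filter is only an illustration):

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--table staff \
--columns name,sex \
--where "sex='Male'" \
--target-dir /user/company \
--delete-target-dir \
--num-mappers 1 \
--fields-terminated-by "\t"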

2 RDBMS to Hive

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--table staff \
--num-mappers 1 \
--hive-import \
--fields-terminated-by "\t" \
--hive-overwrite \
--hive-table staff_hive

Tip: this import happens in two steps. The first step imports the data into HDFS; the second step migrates the data from HDFS into the Hive warehouse. The default temporary directory used by the first step is /user/<your user name>/<table name>.
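
Once the job completes, the table can be queried from the Hive CLI; this assumes the default database, since the tutorial does not specify one:

hive> select * from staff_hive;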

3 RDBMS to HBase

$ bin/sqoop import \
--connect jdbc:mysql://hadoop102:3306/company \
--username root \
--password 000000 \
--table staff \
--columns "id,name,sex" \
--column-family "info" \
--hbase-create-table \
--hbase-row-key "id" \
--hbase-table "hbase_company" \
--num-mappers 1 \
--split-by id

Tip: Sqoop 1.4.6 only supports automatically creating the HBase table for HBase versions earlier than 1.0.1.

Workaround: create the HBase table manually

hbase> create 'hbase_company','info'

Then scan the table in HBase to check the imported data:

hbase> scan 'hbase_company'
