[The Most Detailed Sqoop Command Guide Ever (Part 1)]

I. What commands does Sqoop offer?

[root@hadoop0 bin]# ./sqoop

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.

Please set $ACCUMULO_HOME to the root of your Accumulo installation.

Try 'sqoop help' for usage.

[root@hadoop0 bin]# ./sqoop help

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.

Please set $ACCUMULO_HOME to the root of your Accumulo installation.

99/06/23 18:00:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6

usage: sqoop COMMAND [ARGS]

Available commands:

  codegen            Generate code to interact with database records

  create-hive-table  Import a table definition into Hive

  eval               Evaluate a SQL statement and display the results

  export             Export an HDFS directory to a database table

  help               List available commands

  import             Import a table from a database to HDFS

  import-all-tables  Import tables from a database to HDFS

  import-mainframe   Import datasets from a mainframe server to HDFS

  job                Work with saved jobs

  list-databases     List available databases on a server

  list-tables        List available tables in a database

  merge              Merge results of incremental imports

  metastore          Run a standalone Sqoop metastore

  version            Display version information

See 'sqoop help COMMAND' for information on a specific command.
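
The `Warning:` / `Please set` pairs appear on every run when HCatalog and Accumulo are not installed, and they crowd out the useful output. Below is a minimal sketch for a POSIX shell. The helper name `strip_sqoop_warnings` is hypothetical and matches only the warning text shown above. The `sqoop help list-databases` call is the per-command form that the usage message points to; it is guarded so the snippet does nothing where `sqoop` is not on the PATH.

```shell
# Hypothetical helper: drop the HCatalog/Accumulo notices that
# precede every Sqoop run, leaving the real output.
strip_sqoop_warnings() {
  grep -v -e '^Warning: .* does not exist!' -e '^Please set \$[A-Z_]*_HOME '
}

# Per-command help, as suggested by "See 'sqoop help COMMAND'".
# Guarded so that the snippet is harmless where Sqoop is not installed.
if command -v sqoop >/dev/null 2>&1; then
  sqoop help list-databases 2>&1 | strip_sqoop_warnings
fi
```

The `2>&1` merges both streams before filtering, since the transcript does not show which stream each line is written to.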

II. Command syntax explained

1) List all databases on a MySQL server

 

Table 36. Common arguments

Argument                              Description
--connect <jdbc-uri>                  Specify JDBC connect string
--connection-manager <class-name>     Specify connection manager class to use
--driver <class-name>                 Manually specify JDBC driver class to use
--hadoop-home <dir>                   Override $HADOOP_HOME
--help                                Print usage instructions
-P                                    Read password from console
--password <password>                 Set authentication password
--username <username>                 Set authentication username
--verbose                             Print more information while working
--connection-param-file <filename>    Optional properties file that provides connection parameters
Example:
 
[root@hadoop0 bin]# ./sqoop list-databases --connect jdbc:mysql://192.168.1.101/  -username root --password root
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
99/06/23 18:03:22 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
99/06/23 18:03:22 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
99/06/23 18:03:22 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
information_schema
hive
mysql
performance_schema
test
[root@hadoop0 bin]# ./sqoop list-databases --connect jdbc:mysql://192.168.1.101/  -username root --password root --verbose
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
99/06/23 18:03:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
99/06/23 18:03:42 DEBUG tool.BaseSqoopTool: Enabled debug logging.
99/06/23 18:03:42 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
99/06/23 18:03:42 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
99/06/23 18:03:42 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
99/06/23 18:03:42 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
99/06/23 18:03:42 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
99/06/23 18:03:42 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
99/06/23 18:03:42 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:
99/06/23 18:03:42 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
99/06/23 18:03:42 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@a1fcba
99/06/23 18:03:42 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
information_schema
hive
mysql
performance_schema
test
[root@hadoop0 bin]# ./sqoop list-databases --connect jdbc:mysql://192.168.1.101/  -username root --P --verbose
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
99/06/23 18:03:57 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
99/06/23 18:03:57 DEBUG tool.BaseSqoopTool: Enabled debug logging.
Enter password: 
99/06/23 18:04:00 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
99/06/23 18:04:00 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory
99/06/23 18:04:00 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory
99/06/23 18:04:00 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!
99/06/23 18:04:00 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory
99/06/23 18:04:00 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:
99/06/23 18:04:00 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
99/06/23 18:04:00 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@1a4e7e0
99/06/23 18:04:00 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.
information_schema
hive
mysql
performance_schema
test
[root@hadoop0 bin]# 
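
Every run above that passes `--password root` triggers the "Setting your password on the command-line is insecure" warning. Besides `-P`, Sqoop 1.4.x also documents a `--password-file` option. The sketch below assumes that option is available in your build; the scratch directory, the file name, and the `run_sqoop` wrapper are made up for illustration. Sqoop reads the whole file as the password, so the file is written without a trailing newline.

```shell
# Assumption: a private scratch directory; in practice pick a fixed
# location that only your user can read.
secret_dir="$(mktemp -d)"
pwd_file="$secret_dir/mysql.pwd"

# Write the password with no trailing newline, then make it read-only
# for the owner.
( umask 077 && printf '%s' 'root' > "$pwd_file" ) && chmod 400 "$pwd_file"

# Hypothetical wrapper: run sqoop if installed, otherwise print the command.
run_sqoop() {
  if command -v sqoop >/dev/null 2>&1; then
    sqoop "$@"
  else
    echo "sqoop $*"
  fi
}

# Assumption: the file:// prefix selects the local filesystem rather
# than the cluster's default filesystem (usually HDFS).
run_sqoop list-databases \
  --connect jdbc:mysql://192.168.1.101/ \
  --username root \
  --password-file "file://$pwd_file"
```

Unlike `-P`, this form needs no terminal input, which makes it easier to use in scheduled jobs.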
 
2) Connect to MySQL and list the tables in a database

Table 37. Common arguments

Argument                              Description
--connect <jdbc-uri>                  Specify JDBC connect string
--connection-manager <class-name>     Specify connection manager class to use
--driver <class-name>                 Manually specify JDBC driver class to use
--hadoop-home <dir>                   Override $HADOOP_HOME
--help                                Print usage instructions
-P                                    Read password from console
--password <password>                 Set authentication password
--username <username>                 Set authentication username
--verbose                             Print more information while working
--connection-param-file <filename>    Optional properties file that provides connection parameters
 
Example:
[root@hadoop0 bin]# ./sqoop list-tables --connect jdbc:mysql://192.168.1.101/hive  -username root --password root
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
99/06/23 18:11:20 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
99/06/23 18:11:20 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
99/06/23 18:11:20 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
aux_table
bucketing_cols
cds
columns_v2
compaction_queue
types
version
[root@hadoop0 bin]# ./sqoop list-tables --connect jdbc:mysql://192.168.1.101/mysql  -username root --password root
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.
Please set $HCAT_HOME to the root of your HCatalog installation.
Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
99/06/23 18:11:47 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6
99/06/23 18:11:47 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
99/06/23 18:11:48 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
columns_priv
db
event
func
general_log
help_category
help_keyword
servers
slave_master_info
slave_relay_log_info
slave_worker_info
slow_log
tables_priv
time_zone
time_zone_leap_second
time_zone_name
time_zone_transition
time_zone_transition_type
user
[root@hadoop0 bin]# 
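
The only difference between the two runs above is the database name at the end of the JDBC URL (`/hive` versus `/mysql`): `list-tables` lists whichever database the connect string names. Here is a small sketch of composing that URL. The `mysql_jdbc_url` helper is hypothetical, and because the transcript omits the port, MySQL's default 3306 is assumed.

```shell
# Hypothetical helper: build a MySQL JDBC URL from host, database, and
# an optional port (3306 is MySQL's default and is assumed here).
mysql_jdbc_url() {
  _host="$1"; _db="$2"; _port="${3:-3306}"
  printf 'jdbc:mysql://%s:%s/%s\n' "$_host" "$_port" "$_db"
}

# Print the list-tables command for each database of interest.
for db in hive mysql; do
  echo "sqoop list-tables --connect $(mysql_jdbc_url 192.168.1.101 "$db") --username root -P"
done
```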

3) Copy a relational table's structure into Hive (that is, create the Hive table definition automatically from a MySQL table)

Table 31. Common arguments

Argument                              Description
--connect <jdbc-uri>                  Specify JDBC connect string
--connection-manager <class-name>     Specify connection manager class to use
--driver <class-name>                 Manually specify JDBC driver class to use
--hadoop-home <dir>                   Override $HADOOP_HOME
--help                                Print usage instructions
-P                                    Read password from console
--password <password>                 Set authentication password
--username <username>                 Set authentication username
--verbose                             Print more information while working
--connection-param-file <filename>    Optional properties file that provides connection parameters

Table 32. Hive arguments:

Argument                     Description
--hive-home <dir>            Override $HIVE_HOME
--hive-overwrite             Overwrite existing data in the Hive table.
--create-hive-table          If set, then the job will fail if the target Hive table exists. By default this property is false.
--hive-table <table-name>    Sets the table name to use when importing to Hive.
--table                      The database table to read the definition from.

Table 33. Output line formatting arguments:

Argument                           Description
--enclosed-by <char>               Sets a required field enclosing character
--escaped-by <char>                Sets the escape character
--fields-terminated-by <char>      Sets the field separator character
--lines-terminated-by <char>       Sets the end-of-line character
--mysql-delimiters                 Uses MySQL’s default delimiter set: fields: , lines: \n escaped-by: \ optionally-enclosed-by: '
--optionally-enclosed-by <char>    Sets a field enclosing character

Example:

[root@hadoop0 bin]# ./sqoop create-hive-table --connect jdbc:mysql://192.168.1.101/test  -username root --password root --table people --hive-table emps --fields-terminated-by ',' --verbose

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.

Please set $ACCUMULO_HOME to the root of your Accumulo installation.

99/06/23 18:32:29 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6

99/06/23 18:32:30 DEBUG manager.SqlManager: Using fetchSize for next query: -2147483648

99/06/23 18:32:30 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `people` AS t LIMIT 1

99/06/23 18:32:30 DEBUG manager.SqlManager: Found column id of type [4, 11, 0]

99/06/23 18:32:30 ERROR manager.SqlManager: Error reading from database: java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@c1d29e is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.

java.sql.SQLException: Streaming result set com.mysql.jdbc.RowDataDynamic@c1d29e is still active. No statements may be issued when any streaming result sets are open and in use on a given connection. Ensure that you have called .close() on any active streaming result sets before attempting more queries.

        at com.mysql.jdbc.MysqlIO.checkForOutstandingStreamingData(MysqlIO.java:2095)

        at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1510)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)

99/06/23 18:32:30 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException

java.lang.NullPointerException (Fix: replace the MySQL JDBC driver JAR; this is reportedly a bug in the MySQL driver)

        at org.apache.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:175)

        at org.apache.sqoop.Sqoop.main(Sqoop.java:236)
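
The note embedded in the stack trace blames the MySQL JDBC driver. Before swapping it, it helps to see which connector JAR Sqoop is actually picking up. This is a rough sketch that assumes the install path seen in the warnings (`/opt/bigdata/sqoop-1.4.6`) and the usual `mysql-connector-*.jar` naming. The helper name is hypothetical, and the source does not say which driver version resolved the error.

```shell
# Assumed install location, taken from the paths in the warnings above.
SQOOP_HOME="${SQOOP_HOME:-/opt/bigdata/sqoop-1.4.6}"

# Hypothetical helper: list MySQL connector JARs in a lib directory.
list_mysql_connectors() {
  for jar in "$1"/mysql-connector-*.jar; do
    # An unmatched glob stays literal, so test that the file exists.
    [ -e "$jar" ] && basename "$jar"
  done
  return 0
}

list_mysql_connectors "$SQOOP_HOME/lib"
```

If more than one connector shows up, it is probably safest to leave only the replacement JAR in place so that the old driver cannot be loaded first.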

[root@hadoop0 bin]# ./sqoop create-hive-table --connect jdbc:mysql://192.168.1.101/test  -username root --password root --table people --hive-table emps --fields-terminated-by ',' --verbose

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../hcatalog does not exist! HCatalog jobs will fail.

Please set $HCAT_HOME to the root of your HCatalog installation.

Warning: /opt/bigdata/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.

Please set $ACCUMULO_HOME to the root of your Accumulo installation.

99/06/23 18:33:44 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6

99/06/23 18:33:44 DEBUG tool.BaseSqoopTool: Enabled debug logging.

99/06/23 18:33:44 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.

99/06/23 18:33:44 DEBUG sqoop.ConnFactory: Loaded manager factory: org.apache.sqoop.manager.oracle.OraOopManagerFactory

99/06/23 18:33:44 DEBUG sqoop.ConnFactory: Loaded manager factory: com.cloudera.sqoop.manager.DefaultManagerFactory

99/06/23 18:33:44 DEBUG sqoop.ConnFactory: Trying ManagerFactory: org.apache.sqoop.manager.oracle.OraOopManagerFactory

99/06/23 18:33:45 DEBUG oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop can be called by Sqoop!

99/06/23 18:33:45 DEBUG sqoop.ConnFactory: Trying ManagerFactory: com.cloudera.sqoop.manager.DefaultManagerFactory

99/06/23 18:33:45 DEBUG manager.DefaultManagerFactory: Trying with scheme: jdbc:mysql:

99/06/23 18:33:45 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.

99/06/23 18:33:45 DEBUG sqoop.ConnFactory: Instantiated ConnManager org.apache.sqoop.manager.MySQLManager@a1fcba

99/06/23 18:33:45 DEBUG hive.HiveImport: Hive.inputTable: people

99/06/23 18:33:45 DEBUG hive.HiveImport: Hive.outputTable: emps

99/06/23 18:33:45 DEBUG manager.SqlManager: Execute getColumnInfoRawQuery : SELECT t.* FROM `people` AS t LIMIT 1

99/06/23 18:33:45 DEBUG manager.SqlManager: No connection paramenters specified. Using regular API for making connection.

99/06/23 18:33:45 DEBUG manager.SqlManager: Using fetchSize for next query: -2147483648

99/06/23 18:33:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `people` AS t LIMIT 1

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column id of type [4, 11, 0]

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column extdata of type [12, 1000, 0]

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column start_name of type [12, 20, 0]

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column nickname of type [12, 20, 0]

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column name of type [12, 20, 0]

99/06/23 18:33:45 DEBUG manager.SqlManager: Using fetchSize for next query: -2147483648

99/06/23 18:33:45 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `people` AS t LIMIT 1

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column id of type INT

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column extdata of type VARCHAR

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column start_name of type VARCHAR

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column nickname of type VARCHAR

99/06/23 18:33:45 DEBUG manager.SqlManager: Found column name of type VARCHAR

99/06/23 18:33:45 DEBUG hive.TableDefWriter: Create statement: CREATE TABLE IF NOT EXISTS `emps` ( `id` INT, `extdata` STRING, `start_name` STRING, `nickname` STRING, `name` STRING) COMMENT 'Imported by sqoop on 1999/06/23 18:33:45' ROW FORMAT DELIMITED FIELDS TERMINATED BY '\054' LINES TERMINATED BY '\012' STORED AS TEXTFILE

SLF4J: Class path contains multiple SLF4J bindings.

SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop272/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: Found binding in [jar:file:/opt/bigdata/hbase-1.1.5/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]

SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]

99/06/23 18:33:46 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable

99/06/23 18:33:48 DEBUG hive.TableDefWriter: Load statement: LOAD DATA INPATH 'hdfs://hadoop0:9000/user/root/people' INTO TABLE `emps`

99/06/23 18:33:49 INFO hive.HiveImport: Loading uploaded data into Hive

99/06/23 18:33:49 DEBUG hive.HiveImport: Using external Hive process.

99/06/23 18:34:29 INFO hive.HiveImport: SLF4J: Class path contains multiple SLF4J bindings.

99/06/23 18:34:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/opt/bigdata/hive2.0/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]

99/06/23 18:34:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/opt/bigdata/hbase-1.1.5/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]

99/06/23 18:34:29 INFO hive.HiveImport: SLF4J: Found binding in [jar:file:/opt/bigdata/hadoop272/share/hadoop/common/lib/slf4j-log4j12-1.7.10.jar!/org/slf4j/impl/StaticLoggerBinder.class]

99/06/23 18:34:29 INFO hive.HiveImport: SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.

99/06/23 18:34:34 INFO hive.HiveImport: 

99/06/23 18:34:34 INFO hive.HiveImport: Logging initialized using configuration in file:/opt/bigdata/hive2.0/conf/hive-log4j2.properties

99/06/23 18:35:18 INFO hive.HiveImport: OK

99/06/23 18:35:18 INFO hive.HiveImport: Time taken: 8.382 seconds

99/06/23 18:35:20 INFO hive.HiveImport: Hive import complete.

Once this command finishes, Hive contains a new table named emps. Compare the output of show tables before and after the run:
hive> show tables;
OK
student
teacher
tt
Time taken: 0.752 seconds, Fetched: 3 row(s)
hive> show tables;
OK
emps
student
teacher
tt
Time taken: 0.127 seconds, Fetched: 4 row(s)
hive> show create table emps;
OK
CREATE TABLE `emps`(
  `id` int, 
  `extdata` string, 
  `start_name` string, 
  `nickname` string, 
  `name` string)
COMMENT 'Imported by sqoop on 1999/06/23 18:33:45'
ROW FORMAT SERDE 
  'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
WITH SERDEPROPERTIES ( 
  'field.delim'=',', 
  'line.delim'='\n', 
  'serialization.format'=',') 
STORED AS INPUTFORMAT 
  'org.apache.hadoop.mapred.TextInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION
  'hdfs://hadoop0:9000/user/hive/warehouse/emps'
TBLPROPERTIES (
  'transient_lastDdlTime'='930134117')
Time taken: 0.854 seconds, Fetched: 21 row(s)
hive> 
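
In the debug log, `--fields-terminated-by ','` shows up as `FIELDS TERMINATED BY '\054'` and the line terminator as `'\012'`, while `show create table` reports `','` and `'\n'`. These are the same characters: the generated statement writes them as octal escapes. A quick check with `printf`, which understands the same octal notation:

```shell
# Octal 054 is decimal 44, the ASCII comma.
field_delim="$(printf '\054')"
echo "field delimiter: [$field_delim]"

# Octal 012 is decimal 10, the ASCII line feed. Command substitution
# strips trailing newlines, so inspect the byte with od instead.
printf '\012' | od -An -c
```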


Reposted from gaojingsong.iteye.com/blog/2314640