Exporting Data from HBase with Sqoop via Phoenix

  1. Test the connection

    sqoop eval quickly runs a simple SQL query against the database and prints the result to the console. This lets you preview the result of an import query and make sure the data you are about to import matches your expectations. You can use it to validate the database connection from Sqoop or to test simple queries. It should not be used in production workflows.

    Parameter                   Description
    --connect                   Specify the JDBC connect string
    --connection-manager        Specify the connection manager class to use
    --driver                    Manually specify the JDBC driver class to use
    --hadoop-mapred-home        Override $HADOOP_MAPRED_HOME
    --help                      Print usage instructions
    --password-file             Set the path to a file containing the authentication password
    -P                          Read the password from the console
    --password                  Set the authentication password
    --username                  Set the authentication username
    --verbose                   Print more information while working
    --connection-param-file     Optional properties file that supplies connection parameters
    --relaxed-isolation         Set the connection transaction isolation to read uncommitted for the mappers
    sqoop eval \
    --connect jdbc:phoenix:192.168.1.101:2181 \
    --driver org.apache.phoenix.jdbc.PhoenixDriver \
    --query "select * from test limit 1"
    

    Connection test log:

    [root@chd01 ~]# sqoop eval --connect jdbc:phoenix:192.168.1.101:2181 --driver org.apache.phoenix.jdbc.PhoenixDriver --query "select * from test limit 1"
    Warning: /opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/bin/../lib/sqoop/../accumulo does not exist! Accumulo imports will fail.
    Please set $ACCUMULO_HOME to the root of your Accumulo installation.
    19/05/28 10:27:40 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-cdh5.12.1
    19/05/28 10:27:40 WARN sqoop.ConnFactory: Parameter --driver is set to an explicit driver however appropriate connection manager is not being set (via --connection-manager). Sqoop is going to fall back to org.apache.sqoop.manager.GenericJdbcManager. Please specify explicitly which connection manager should be used next time.
    19/05/28 10:27:40 INFO manager.SqlManager: Using default fetchSize of 1000
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/jars/phoenix-4.14.0-cdh5.12.2-client.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
    19/05/28 10:27:41 INFO log.QueryLoggerDisruptor: Starting  QueryLoggerDisruptor for with ringbufferSize=8192, waitStrategy=BlockingWaitStrategy, exceptionHandler=org.apache.phoenix.log.QueryLoggerDefaultExceptionHandler@14bdbc74...
    19/05/28 10:27:41 INFO query.ConnectionQueryServicesImpl: An instance of ConnectionQueryServices was created.
    19/05/28 10:27:41 INFO zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x7f8a9499 connecting to ZooKeeper ensemble=192.168.1.101:2181
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:zookeeper.version=3.4.5-cdh5.12.1--1, built on 08/24/2017 16:41 GMT
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:host.name=chd01
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:java.version=1.8.0_201
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:java.vendor=Oracle Corporation
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:java.home=/usr/lib/jvm/java-8-oracle/jre
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:java.class.path=...
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:java.library.path=/opt/cloudera/parcels/CDH-5.12.1-1.cdh5.12.1.p0.3/lib/hadoop/lib/native
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:os.name=Linux
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:os.arch=amd64
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:os.version=4.4.0-62-generic
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:user.name=root
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:user.home=/root
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Client environment:user.dir=/opt/cloudera/csd
    19/05/28 10:27:41 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.1.101:2181 sessionTimeout=60000 watcher=hconnection-0x7f8a94990x0, quorum=192.168.1.101:2181, baseZNode=/hbase
    19/05/28 10:27:41 INFO zookeeper.ClientCnxn: Opening socket connection to server chd01/192.168.1.101:2181. Will not attempt to authenticate using SASL (unknown error)
    19/05/28 10:27:41 INFO zookeeper.ClientCnxn: Socket connection established, initiating session, client: /192.168.1.101:34058, server: chd01/192.168.1.101:2181
    19/05/28 10:27:41 INFO zookeeper.ClientCnxn: Session establishment complete on server chd01/192.168.1.101:2181, sessionid = 0x26ad7e07aa282a3, negotiated timeout = 60000
    19/05/28 10:27:42 INFO query.ConnectionQueryServicesImpl: HConnection established. Stacktrace for informational purposes: hconnection-0x7f8a9499 java.lang.Thread.getStackTrace(Thread.java:1559)
    org.apache.phoenix.util.LogUtil.getCallerStackTrace(LogUtil.java:55)
    org.apache.phoenix.query.ConnectionQueryServicesImpl.openConnection(ConnectionQueryServicesImpl.java:427)
    org.apache.phoenix.query.ConnectionQueryServicesImpl.access$400(ConnectionQueryServicesImpl.java:267)
    org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2515)
    org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2491)
    org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
    org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2491)
    org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
    org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
    org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
    java.sql.DriverManager.getConnection(DriverManager.java:664)
    java.sql.DriverManager.getConnection(DriverManager.java:270)
    org.apache.sqoop.manager.SqlManager.makeConnection(SqlManager.java:902)
    org.apache.sqoop.manager.GenericJdbcManager.getConnection(GenericJdbcManager.java:52)
    org.apache.sqoop.tool.EvalSqlTool.run(EvalSqlTool.java:64)
    org.apache.sqoop.Sqoop.run(Sqoop.java:147)
    org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
    org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
    org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
    org.apache.sqoop.Sqoop.main(Sqoop.java:252)
    
    19/05/28 10:27:42 INFO Configuration.deprecation: hadoop.native.lib is deprecated. Instead, use io.native.lib.available
    --------------------------------------------------------------------------------------------------------------------
    | KEYS                 | F1                   | F2                   | S1                   | S2                   | 
    --------------------------------------------------------------------------------------------------------------------
    | key1                 | f1                   | f2                   | f3                   | f4                   | 
    --------------------------------------------------------------------------------------------------------------------
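    The same row can also be cross-checked directly in Phoenix's sqlline shell. The script path below is an assumption based on the APACHE_PHOENIX parcel layout shown later in this article:

    # open sqlline against the same ZooKeeper quorum, then run: select * from test limit 1;
    /opt/cloudera/parcels/APACHE_PHOENIX/lib/phoenix/bin/phoenix-sqlline.py 192.168.1.101:2181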
    
  2. Export the data

    The Sqoop script for the export (reading data out of Phoenix/HBase into HDFS is an import from Sqoop's point of view, which is why the command is sqoop import):

    sqoop import \
    --connect jdbc:phoenix:10.101.71.41:2181 \
    --driver org.apache.phoenix.jdbc.PhoenixDriver \
    --query 'select * from test where $CONDITIONS limit 1' \
    --target-dir "/tmp/hbase/test/" \
    --fields-terminated-by '\t' \
    --delete-target-dir \
    --split-by rowkey \
    --hive-drop-import-delims \
    --null-string '\\N' \
    --null-non-string '\\N' \
    --m 1
    

    Check the result:

    [root@chd01 ~]# hdfs dfs -cat /tmp/hbase/test/part-m-00000
    key1	f1	f2	f3	f4
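    If a single mapper is too slow for a larger table, the same job can be run with several mappers. The sketch below is a hedged variant: it assumes the rowkey column is numeric (or otherwise usable for Sqoop's min/max boundary query), which has not been verified against this table.

    sqoop import \
    --connect jdbc:phoenix:10.101.71.41:2181 \
    --driver org.apache.phoenix.jdbc.PhoenixDriver \
    --query 'select * from test where $CONDITIONS' \
    --target-dir "/tmp/hbase/test/" \
    --fields-terminated-by '\t' \
    --delete-target-dir \
    --split-by rowkey \
    --hive-drop-import-delims \
    --null-string '\\N' \
    --null-non-string '\\N' \
    --m 4

    With four mappers the output is written as part-m-00000 through part-m-00003 under the same target directory.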
    
  3. Problems encountered

    1. Problem 1

      19/05/28 11:12:43 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
      java.lang.NullPointerException
      	at org.json.JSONObject.<init>(JSONObject.java:144)
      	at org.apache.sqoop.util.SqoopJsonUtil.getJsonStringforMap(SqoopJsonUtil.java:43)
      	at org.apache.sqoop.SqoopOptions.writeProperties(SqoopOptions.java:780)
      	at org.apache.sqoop.mapreduce.JobBase.putSqoopOptionsToConfiguration(JobBase.java:392)
      	at org.apache.sqoop.mapreduce.JobBase.createJob(JobBase.java:378)
      	at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:256)
      	at org.apache.sqoop.manager.SqlManager.importQuery(SqlManager.java:748)
      	at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:515)
      	at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:621)
      	at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
      	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
      	at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
      	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
      	at org.apache.sqoop.Sqoop.runTool(Sqoop.java:243)
      	at org.apache.sqoop.Sqoop.main(Sqoop.java:252)
      

      Solution:

      Download the java-json.jar package and put it into the /opt/cloudera/parcels/CDH/lib/hbase/lib directory.
      Download address: http://www.java2s.com/Code/JarDownload/java-json/java-json.jar.zip
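      A hedged command-line version of this fix (wget and unzip are assumed to be available on the node):

      cd /tmp
      wget http://www.java2s.com/Code/JarDownload/java-json/java-json.jar.zip
      # the zip contains java-json.jar; unpack it and copy it to the directory named above
      unzip java-json.jar.zip
      cp java-json.jar /opt/cloudera/parcels/CDH/lib/hbase/lib/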

    2. Problem 2

      Error: java.lang.RuntimeException: java.lang.RuntimeException: java.sql.SQLException: ERROR 726 (43M10):  Inconsistent namespace mapping properties. Cannot initiate connection as SYSTEM:CATALOG is found but client does not have phoenix.schema.isNamespaceMappingEnabled enabled
      	at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:170)
      	at org.apache.sqoop.mapreduce.db.DBInputFormat.setConf(DBInputFormat.java:161)
      	at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
      	at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
      	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:755)
      	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
      	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at javax.security.auth.Subject.doAs(Subject.java:422)
      	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1917)
      	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
      Caused by: java.lang.RuntimeException: java.sql.SQLException: ERROR 726 (43M10):  Inconsistent namespace mapping properties. Cannot initiate connection as SYSTEM:CATALOG is found but client does not have phoenix.schema.isNamespaceMappingEnabled enabled
      	at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:223)
      	at org.apache.sqoop.mapreduce.db.DBInputFormat.setDbConf(DBInputFormat.java:168)
      	... 10 more
      Caused by: java.sql.SQLException: ERROR 726 (43M10):  Inconsistent namespace mapping properties. Cannot initiate connection as SYSTEM:CATALOG is found but client does not have phoenix.schema.isNamespaceMappingEnabled enabled
      	at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:494)
      	at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:150)
      	at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:1099)
      	at org.apache.phoenix.query.ConnectionQueryServicesImpl.createTable(ConnectionQueryServicesImpl.java:1491)
      	at org.apache.phoenix.schema.MetaDataClient.createTableInternal(MetaDataClient.java:2717)
      	at org.apache.phoenix.schema.MetaDataClient.createTable(MetaDataClient.java:1114)
      	at org.apache.phoenix.compile.CreateTableCompiler$1.execute(CreateTableCompiler.java:192)
      	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:408)
      	at org.apache.phoenix.jdbc.PhoenixStatement$2.call(PhoenixStatement.java:391)
      	at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
      	at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:389)
      	at org.apache.phoenix.jdbc.PhoenixStatement.executeMutation(PhoenixStatement.java:378)
      	at org.apache.phoenix.jdbc.PhoenixStatement.executeUpdate(PhoenixStatement.java:1806)
      	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2528)
      	at org.apache.phoenix.query.ConnectionQueryServicesImpl$12.call(ConnectionQueryServicesImpl.java:2491)
      	at org.apache.phoenix.util.PhoenixContextExecutor.call(PhoenixContextExecutor.java:76)
      	at org.apache.phoenix.query.ConnectionQueryServicesImpl.init(ConnectionQueryServicesImpl.java:2491)
      	at org.apache.phoenix.jdbc.PhoenixDriver.getConnectionQueryServices(PhoenixDriver.java:255)
      	at org.apache.phoenix.jdbc.PhoenixEmbeddedDriver.createConnection(PhoenixEmbeddedDriver.java:150)
      	at org.apache.phoenix.jdbc.PhoenixDriver.connect(PhoenixDriver.java:221)
      	at java.sql.DriverManager.getConnection(DriverManager.java:664)
      	at java.sql.DriverManager.getConnection(DriverManager.java:270)
      	at org.apache.sqoop.mapreduce.db.DBConfiguration.getConnection(DBConfiguration.java:300)
      	at org.apache.sqoop.mapreduce.db.DBInputFormat.getConnection(DBInputFormat.java:216)
      	... 11 more
      

      Solution:

      1. Enable Phoenix schema (namespace mapping) support

        In Cloudera Manager: HBase --> Configuration --> Advanced --> search for hbase-site.xml --> add the configuration --> restart HBase.

        Add the following configuration on both the server side and the client side:

        <property>
            <name>phoenix.schema.isNamespaceMappingEnabled</name>
            <value>true</value>
        </property>

        <property>
            <name>phoenix.schema.mapSystemTablesToNamespace</name>
            <value>true</value>
        </property>
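        After the configuration has been deployed, a quick sanity check on a gateway node is to confirm that both properties reached the client hbase-site.xml. The /etc/hbase/conf path is the usual CDH client-config location and is an assumption here:

        # each matched <name> line should be followed by <value>true</value>
        grep -A1 'phoenix.schema' /etc/hbase/conf/hbase-site.xml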
        


      2. Download the client configuration from Cloudera Manager

      3. Get the Phoenix client jar package

        [root@chd01 ~]# cd /opt/cloudera/parcels/APACHE_PHOENIX
        [root@chd01 APACHE_PHOENIX]# cd lib/phoenix
        [root@chd01 phoenix]# ll
        total 316480
        drwxr-xr-x 3 root root      4096 Apr 18 15:10 bin/
        drwxr-xr-x 3 root root      4096 Apr 18 12:13 dev/
        drwxr-xr-x 3 root root      4096 Apr 18 12:13 examples/
        drwxr-xr-x 2 root root      4096 Apr 18 12:13 lib/
        -rw-r--r-- 1 root root 170794941 Jun  5  2018 phoenix-4.14.0-cdh5.12.2-client.jar
        -rw-r--r-- 1 root root  77541123 Jun  5  2018 phoenix-4.14.0-cdh5.12.2-hive.jar
        -rw-r--r-- 1 root root   7138747 Jun  5  2018 phoenix-4.14.0-cdh5.12.2-queryserver.jar
        -rw-r--r-- 1 root root  39216558 Jun  5  2018 phoenix-4.14.0-cdh5.12.2-server.jar
        -rw-r--r-- 1 root root  29344301 Jun  5  2018 phoenix-4.14.0-cdh5.12.2-thin-client.jar
        [root@chd01 phoenix]# cp phoenix-4.14.0-cdh5.12.2-client.jar /opt/cloudera/parcels/CDH/jars
        

        Extract the downloaded client configuration, open phoenix-4.14.0-cdh5.12.2-client.jar with an archive tool, and put the hbase-site.xml from the client configuration into phoenix-4.14.0-cdh5.12.2-client.jar.
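        The same step can be done from the shell with the JDK's jar tool. This sketch assumes the hbase-site.xml extracted from the downloaded client configuration is in the current directory:

        # add (or update) hbase-site.xml at the root of the Phoenix client jar
        jar uf phoenix-4.14.0-cdh5.12.2-client.jar hbase-site.xml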

      4. Upload the Phoenix client jar package

        Upload the phoenix-4.14.0-cdh5.12.2-client.jar that now contains hbase-site.xml (from step 3) to the /opt/cloudera/parcels/CDH/jars directory, then create a symbolic link to it in HBase's lib directory.

        [root@chd01 ~]# cd /opt/cloudera/parcels/CDH/jars
        [root@chd01 jars]# rz
        [root@chd01 jars]# cd /opt/cloudera/parcels/CDH/lib/hbase/lib
        [root@chd01 lib]# ln -s ../../../jars/phoenix-4.14.0-cdh5.12.2-client.jar phoenix-4.14.0-cdh5.12.2-client.jar
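        To confirm that the linked jar really is the patched one, a quick check (unzip is assumed to be installed) is:

        # the patched jar should list hbase-site.xml at its root
        unzip -l /opt/cloudera/parcels/CDH/lib/hbase/lib/phoenix-4.14.0-cdh5.12.2-client.jar | grep hbase-site.xml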
        
