Problems importing MySQL TINYINT(1) data to HDFS or Hive with Sqoop

Copyright notice: This is an original article by the author; reproduction without permission is prohibited. https://blog.csdn.net/weixin_43215250/article/details/85234162

Solution from the official Sqoop documentation

27.2.5. MySQL: Import of TINYINT(1) from MySQL behaves strangely
Problem: Sqoop is treating TINYINT(1) columns as booleans, which is for example causing issues with HIVE import. This is because by default the MySQL JDBC connector maps the TINYINT(1) to java.sql.Types.BIT, which Sqoop by default maps to Boolean.

Solution: A cleaner solution is to force the MySQL JDBC Connector to stop converting TINYINT(1) to java.sql.Types.BIT by adding tinyInt1isBit=false to your JDBC URL (to create something like jdbc:mysql://localhost/test?tinyInt1isBit=false). Another solution would be to explicitly override the column mapping for the TINYINT(1) column. For example, if the column name is foo, then pass the following option to Sqoop during import: --map-column-hive foo=tinyint. In the case of non-Hive imports to HDFS, use --map-column-java foo=Integer.

Problem:

When a MySQL table contains a TINYINT(1) column, Sqoop converts that field to a boolean by default during import to HDFS, so the original numeric values are lost.

Solutions:

  1. Append tinyInt1isBit=false to the JDBC connection string:
    --connect jdbc:mysql://192.168.9.80:3306/kgc_behivour_log?tinyInt1isBit=false

  2. Alternatively, explicitly override the column mapping for the TINYINT(1) column.
    For example, if the column is named foo, pass the following option to Sqoop during import: --map-column-hive foo=tinyint. For non-Hive imports to HDFS, use --map-column-java foo=Integer instead.
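Putting the first fix into context, below is a sketch of a complete import command. The host and database name are the example values from above; the username, password, table name, and target directory are hypothetical placeholders, not values from this article.

```shell
# Sketch: import a MySQL table that contains TINYINT(1) columns.
# tinyInt1isBit=false makes the MySQL JDBC driver report TINYINT(1) as
# TINYINT rather than BIT, so Sqoop maps it to an integer, not a boolean.
# Username, password, table, and target dir below are placeholders.
sqoop import \
  --connect "jdbc:mysql://192.168.9.80:3306/kgc_behivour_log?tinyInt1isBit=false" \
  --username root \
  --password '******' \
  --table user_log \
  --target-dir /data/user_log \
  --delete-target-dir \
  -m 1

# Alternative (solution 2): keep the default JDBC URL and override the
# mapping for a single TINYINT(1) column named foo instead:
#   --map-column-hive foo=tinyint     # for Hive imports
#   --map-column-java foo=Integer     # for plain HDFS imports
```

Note that --map-column-java expects a Java type name, so the capitalized Integer is required; lowercase integer would be rejected.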

The same mechanism works for other type mappings. If an Oracle table has two DATE columns, DATE_COLUMN_1 and DATE_COLUMN_2, you can add the following argument to your sqoop command:
--map-column-java DATE_COLUMN_1=java.sql.Date,DATE_COLUMN_2=java.sql.Date
As mentioned above, the JDBC format is then used in the Hadoop text files, so the dates are written out in yyyy-mm-dd form.
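The Oracle date mapping dropped into a full command looks like the sketch below. The connection string, credentials, table name, and target directory are all hypothetical; only the --map-column-java argument comes from the text above.

```shell
# Sketch: import an Oracle table with two DATE columns, forcing them to
# java.sql.Date so they land in the HDFS text files as yyyy-mm-dd strings.
# All connection details and the table name are placeholders.
sqoop import \
  --connect jdbc:oracle:thin:@//oracle-host:1521/ORCL \
  --username scott \
  --password '******' \
  --table SOME_TABLE \
  --map-column-java DATE_COLUMN_1=java.sql.Date,DATE_COLUMN_2=java.sql.Date \
  --target-dir /data/some_table \
  -m 1
```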
Noting this issue here for future reference.

