Error in Hadoop jobs due to Hive query error

vks:

Exception:

2017-06-21 22:47:49,993 FATAL ExecMapper (main): org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing writable org.apache.hadoop.dynamodb.DynamoDBItemWritable@2e17578f
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:643)
    at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:149)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:441)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:377)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1132)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.lang.RuntimeException: Exception while processing record: org.apache.hadoop.dynamodb.DynamoDBItemWritable@2e17578f
    at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getColumnData(DynamoDBObjectInspector.java:136)
    at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getStructFieldData(DynamoDBObjectInspector.java:97)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters$StructConverter.convert(ObjectInspectorConverters.java:328)
    at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:626)
    ... 9 more
Caused by: java.lang.NumberFormatException: For input string: "17664956244983174066"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
    at java.lang.Long.parseLong(Long.java:444)
    at java.lang.Long.parseLong(Long.java:483)
    at org.apache.hadoop.hive.dynamodb.DynamoDBDataParser.getNumberObject(DynamoDBDataParser.java:179)
    at org.apache.hadoop.hive.dynamodb.type.HiveDynamoDBNumberType.getHiveData(HiveDynamoDBNumberType.java:28)
    at org.apache.hadoop.hive.dynamodb.DynamoDBObjectInspector.getColumnData(DynamoDBObjectInspector.java:128)
    ... 12 more

The Hive query I am sending is:

INSERT OVERWRITE TABLE temp_1 
         SELECT * FROM temp_2 
         WHERE t_id="17664956244983174066" and t_ts="636214684577250000000";

Is this number too big to be converted to an int? I even tried sending 17664956244983174066 without quotes, but I get the same exception.

t_id is defined as BIGINT in the Hive table and as N (Number) in DynamoDB.
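For reference: Hive's BIGINT is a signed 64-bit value, so the connector parses it with Long.parseLong (visible in the stack trace above). Long.MAX_VALUE is 9223372036854775807 (about 9.2e18), while 17664956244983174066 is about 1.77e19, so the parse fails whether or not the value is quoted. A minimal Java check (hypothetical demo class, not part of the connector):

    import java.math.BigDecimal;

    public class OverflowDemo {
        public static void main(String[] args) {
            String tId = "17664956244983174066";      // the failing key from the query
            System.out.println(Long.MAX_VALUE);       // 9223372036854775807
            System.out.println(new BigDecimal(tId));  // arbitrary precision: parses fine
            Long.parseLong(tId);                      // throws NumberFormatException
        }
    }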

EDIT:

I tried defining t_id as string ==> schema mismatch, as DynamoDB stores this as a number.

t_id as double ==> precision is lost, so nothing matches.
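The double attempt cannot match because a double carries only 53 bits of mantissa; at this magnitude the 20-digit key is rounded to the nearest multiple of 2^11 = 2048, so the equality filter compares against a different number. A small demo (hypothetical class name):

    public class PrecisionDemo {
        public static void main(String[] args) {
            double d = Double.parseDouble("17664956244983174066");
            // Rounded to the nearest representable double:
            System.out.printf("%.0f%n", d);  // 17664956244983173120, not the stored key
        }
    }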

What can be the solution here?

vks:

The solution, as told by the AWS people, is to:

  1. git clone the open-source emr-dynamodb-connector
  2. modify the code
  3. build your own jar
  4. upload it to EMR using a bootstrapper
  5. In run_job_flow, send configurations for the Hadoop env, appending your own jar location to HADOOP_CLASSPATH (see the sketch after this list).
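For step 5, a sketch of the kind of configuration involved, written here with the AWS SDK for Java (the question's run_job_flow is the boto3 equivalent). The jar path /home/hadoop/emr-dynamodb-connector.jar is an assumed bootstrap upload location:

    import com.amazonaws.services.elasticmapreduce.model.Configuration;
    import com.amazonaws.services.elasticmapreduce.model.RunJobFlowRequest;
    import java.util.Collections;

    public class ClusterConfig {
        // Equivalent of the boto3 Configurations argument: export HADOOP_CLASSPATH
        // in the hadoop-env classification so the custom connector jar is found first.
        static Configuration hadoopEnv() {
            Configuration export = new Configuration()
                    .withClassification("export")
                    .withProperties(Collections.singletonMap(
                            "HADOOP_CLASSPATH",
                            "/home/hadoop/emr-dynamodb-connector.jar:$HADOOP_CLASSPATH"));
            return new Configuration()
                    .withClassification("hadoop-env")
                    .withConfigurations(export);
        }

        public static void main(String[] args) {
            RunJobFlowRequest request = new RunJobFlowRequest()
                    .withName("hive-dynamodb-job")
                    .withConfigurations(hadoopEnv());
            // ... add instances, steps, and roles, then submit the request with
            // AmazonElasticMapReduceClientBuilder.defaultClient().runJobFlow(request);
        }
    }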

Since I am not much into Java, modifying emr-dynamodb-connector was not possible for me, but this is the solution. Also, two things can be done: if you don't use strings in DynamoDB, map Hive strings to DynamoDB numbers; otherwise, add a mapping and support for Hive decimal to the DynamoDB number type (sketched below).
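For the decimal-to-number route, a minimal sketch of the change involved, assuming the parser looks the way the stack trace suggests (the method name is taken from the trace; the real signature in the emr-dynamodb-connector repo may differ):

    import java.math.BigDecimal;

    public final class NumberParsing {
        // Original behaviour: Long.parseLong, which overflows on 20-digit keys.
        // Fallback: BigDecimal, which preserves every digit of the DynamoDB number.
        public static Object getNumberObject(String value) {
            try {
                return Long.parseLong(value);
            } catch (NumberFormatException e) {
                return new BigDecimal(value);  // e.g. 17664956244983174066
            }
        }
    }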
