Hive with Tez: Unable to load AWS credentials from any provider in the chain

Environment Information

hadoop 3.1.0

hive 3.1.3

tez 0.9.1

Problem Description

The S3A URI can be accessed correctly from the Hadoop command line.
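For example, a listing like the following (using the same bucket and path as the table below) works fine:

hadoop fs -ls s3a://mybucket/myfolder/

I can create external tables with commands like: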

create external table mytable(a string, b string) location 's3a://mybucket/myfolder/';
select * from mytable limit 20;

Both statements execute correctly, but

select count(*) from mytable;

fails. Failure log:

INFO  : Compiling command(queryId=root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e): select count(*) from lyb0
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Semantic Analysis Completed (retrial = false)
INFO  : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, type:bigint, comment:null)], properties:null)
INFO  : Completed compiling command(queryId=root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e); Time taken: 0.257 seconds
INFO  : Concurrency mode is disabled, not creating a lock manager
INFO  : Executing command(queryId=root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e): select count(*) from lyb0
INFO  : Query ID = root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e
INFO  : Total jobs = 1
INFO  : Launching Job 1 out of 1
INFO  : Starting task [Stage-1:MAPRED] in serial mode
INFO  : Subscribed to counters: [] for queryId: root_20230919030746_7b38e3c8-8429-4d45-8a01-343bd26d8f6e
INFO  : Session is already open
INFO  : Dag name: select count(*) from lyb0 (Stage-1)
INFO  : Status: Running (Executing on YARN cluster with App id application_1695092793092_0001)


----------------------------------------------------------------------------------------------
        VERTICES      MODE        STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED  
----------------------------------------------------------------------------------------------
Map 1            container  INITIALIZING     -1          0        0       -1       0       0  
Reducer 2        container        INITED      1          0        0        1       0       0  
----------------------------------------------------------------------------------------------
VERTICES: 00/02  [>>--------------------------] 0%    ELAPSED TIME: 9.55 s     
----------------------------------------------------------------------------------------------
ERROR : Status: Failed

ERROR : Vertex failed, vertexName=Map 1, vertexId=vertex_1695092793092_0001_3_00, diagnostics=[Vertex vertex_1695092793092_0001_3_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: lyb0 initializer failed, vertex=vertex_1695092793092_0001_3_00 [Map 1], org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on hivesql: com.amazonaws.AmazonClientException: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset

	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
……
	at java.lang.Thread.run(Thread.java:750)

Caused by: com.amazonaws.AmazonClientException: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset

	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:139)
……
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:109)

	... 31 more

Caused by: org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset

	at org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider.getCredentials(SimpleAWSCredentialsProvider.java:75)

	at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.getCredentials(AWSCredentialProviderList.java:117)

	... 45 more

]

ERROR : Vertex killed, vertexName=Reducer 2, vertexId=vertex_1695092793092_0001_3_01, diagnostics=[Vertex received Kill in INITED state., Vertex vertex_1695092793092_0001_3_01 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]

ERROR : DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1

ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1695092793092_0001_3_00, diagnostics=[Vertex vertex_1695092793092_0001_3_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: lyb0 initializer failed, vertex=vertex_1695092793092_0001_3_00 [Map 1], org.apache.hadoop.fs.s3a.AWSClientIOException: doesBucketExist on hivesql: com.amazonaws.AmazonClientException: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset: No AWS Credentials provided by SimpleAWSCredentialsProvider : org.apache.hadoop.fs.s3a.CredentialInitializationException: Access key or secret key is unset

	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:177)
……

I tried adding all the fs.s3a properties from core-site.xml to tez-site.xml, and setting fs.s3a.access.key and fs.s3a.secret.key inside the Hive session, but the same error still occurred.
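For reference, the in-session attempt was along these lines (the key values are placeholders):

set fs.s3a.access.key=<your-access-key>;
set fs.s3a.secret.key=<your-secret-key>;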

Solution

The usual advice is to make sure tez.use.cluster.hadoop-libs is not set in tez-site.xml, or, if it is set, that its value is false. In my environment, however, Tez failed to run at all when the value was false.
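For clarity, this is the tez-site.xml property in question, shown with the value I was forced to keep:

<property>
    <name>tez.use.cluster.hadoop-libs</name>
    <value>true</value>
</property>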

With it set to true, I got the AWS credentials error no matter where the keys were configured: configuration files, environment variables, everywhere.
I finally got it working by adding this property to hive-site.xml:

<property>
    <name>hive.conf.hidden.list</name>
    <value>javax.jdo.option.ConnectionPassword,hive.server2.keystore.password,fs.s3a.proxy.password,dfs.adls.oauth2.credential,fs.adl.oauth2.credential</value>
</property>
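Why this works, as far as I can tell: Hive's default hive.conf.hidden.list includes fs.s3a.access.key and fs.s3a.secret.key, and hidden properties are stripped from the configuration that Hive hands to the Tez job, so the S3A client inside the Tez containers never receives the credentials. The value above is the default list with those two keys removed, which lets them pass through. Check the default value of hive.conf.hidden.list in your Hive version before copying this.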

This is the correct solution. Be aware, though, that the S3 secret key is now exposed in various log files, such as:
Hive-> <HIVE_HOME>/logs/<user>/webhcat/webhcat.log.<date>
Hadoop -> …
If you have access to the Hive source code, you can also modify the relevant logging code so that these properties are not written to the Hive logs at all.
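If you just want to confirm whether the keys are leaking, a simple grep over the log directories (paths as listed above; adjust <HIVE_HOME> to your install) is enough:

grep -r "fs.s3a.secret.key" <HIVE_HOME>/logs/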

References:

https://www.saoniuhuo.com/question/detail-2512416.html

https://www.saoniuhuo.com/question/detail-1939018.html

Origin: blog.csdn.net/iamonlyme/article/details/133020230