Fixing a Spark read error caused by Hive partitions that do not match the partition directories on HDFS

First, the problem: for historical reasons, the partitions recorded in the Hive metastore did not match the folders actually present on HDFS. Some partitions registered in Hive had no corresponding directory on HDFS at all.

When Spark reads such a Hive table through SparkSQL, an exception is thrown.
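The mismatch can be located by comparing the partition list in the Hive metastore with the directories that actually exist on HDFS. A minimal sketch in plain Python, where the two input lists are hypothetical stand-ins for the output of `show partitions tablename` and `hdfs dfs -ls` on the table's location:

```python
# Sketch: find partitions registered in the Hive metastore that have no
# matching directory on HDFS. The sample data below is an assumption
# standing in for real `show partitions` / `hdfs dfs -ls` output.

def stale_partitions(metastore_partitions, hdfs_dirs):
    """Return partition specs known to Hive but missing on HDFS."""
    existing = set(hdfs_dirs)
    return [p for p in metastore_partitions if p not in existing]

# Hypothetical example data:
hive_parts = ["pk_year=2018/pk_month=01",
              "pk_year=2018/pk_month=02",
              "pk_year=2019/pk_month=01"]
hdfs_parts = ["pk_year=2019/pk_month=01"]

# The 2018 partitions are the ones Spark would fail on.
print(stale_partitions(hive_parts, hdfs_parts))
```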

Solution:

1. Check whether the table is an external table; if not, modify its table properties to make it external.

Making the table external ensures that dropping a partition does not delete the underlying data. Back up the data first if necessary.

alter table tablename set tblproperties('EXTERNAL'='TRUE');

2. Drop the abnormal partitions

We first tried to drop the whole table with DROP TABLE, but that failed with an error.

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. MetaException(message:Invalid partition key & values; keys [year, month, day, hour, ], values [2018, ])

So instead we dropped the offending partition directly:

alter table tablename drop partition(pk_year=2018);
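Note that the earlier error complains about a partition spec that supplies values for only some of the keys (keys [year, month, day, hour], values [2018]). When scripting this cleanup across many partitions, it can help to render the DROP PARTITION DDL from an explicit key/value spec. A sketch, with hypothetical table and key names:

```python
# Sketch: build an `ALTER TABLE ... DROP PARTITION` statement from a
# key/value spec. Table name and partition keys are hypothetical.

def drop_partition_ddl(table, spec):
    """Render a DROP PARTITION statement from a dict of partition values."""
    pairs = ", ".join(f"{k}='{v}'" for k, v in spec.items())
    return f"alter table {table} drop partition({pairs});"

print(drop_partition_ddl("tablename", {"pk_year": 2018}))
# alter table tablename drop partition(pk_year='2018');
```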

3. Rebuild the partitions with the repair command

msck repair table tablename;
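MSCK REPAIR scans the table's location on HDFS and registers any partition directories that are missing from the metastore. A sketch of the equivalent logic, again with hypothetical inputs standing in for the metastore and HDFS listings:

```python
# Sketch of what MSCK REPAIR does for missing partitions: any partition
# directory present on HDFS but absent from the metastore gets an
# ADD PARTITION statement. Inputs are hypothetical stand-ins.

def repair_ddl(table, metastore_partitions, hdfs_dirs):
    """Generate ADD PARTITION statements for HDFS-only partition dirs."""
    known = set(metastore_partitions)
    stmts = []
    for d in hdfs_dirs:
        if d not in known:
            # "pk_year=2019/pk_month=02" -> "pk_year='2019', pk_month='02'"
            pairs = ", ".join(kv.replace("=", "='") + "'" for kv in d.split("/"))
            stmts.append(f"alter table {table} add partition({pairs});")
    return stmts

print(repair_ddl("tablename",
                 ["pk_year=2019/pk_month=01"],
                 ["pk_year=2019/pk_month=01", "pk_year=2019/pk_month=02"]))
```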


Origin www.cnblogs.com/30go/p/11414489.html