(Repost) Failed with exception Unable to rename: hive MoveTask

2014-02-14 14:58

I got an error when running the SQL:

SQL: INSERT OVERWRITE DIRECTORY 'result/testConsole' select count(1) from nutable; 

Error message:

Failed with exception Unable to rename: hdfs://indigo:8020/tmp/hive-root/hive_2013-08-22_17-35-05_006_3570546713731431770/-ext-10000 to: result/testConsole
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Here is the error from another SQL statement, as it appears in the log:
2013-08-22 17:08:54,411 INFO  exec.Task (SessionState.java:printInfo(412)) - Moving data to: result/userName831810250/54cbcd2980a64fe78cf54abb3116d2dc from hdfs://indigo:8020/tmp/hive-hive/hive_2013-08-22_17-08-40_062_3976325306495167351/-ext-10000
2013-08-22 17:08:54,414 ERROR exec.Task (SessionState.java:printError(421)) - Failed with exception Unable to rename: hdfs://indigo:8020/tmp/hive-hive/hive_2013-08-22_17-08-40_062_3976325306495167351/-ext-10000 to: result/userName831810250/54cbcd2980a64fe78cf54abb3116d2dc

 

Let's see where the exception occurs.
When a SQL statement is executed, the last task is MoveTask. Its job is to move the MapReduce result files produced by the query into the path specified in the SQL for storing the results, and it does this with a rename.
Here is the piece of code in org.apache.hadoop.hive.ql.exec.MoveTask that renames the result files:

    // sourcePath is the directory holding the MapReduce result files; its value
    // looks like hdfs://indigo:8020/tmp/hive-root/hive_2013-08-22_18-42-03_218_2856924886757165243/-ext-10000
    if (fs.exists(sourcePath)) {
      Path deletePath = null;
      // If multiple levels of folders are missing, fs.rename fails, so first
      // create targetPath.getParent() if it does not exist
      if (HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_INSERT_INTO_MULTILEVEL_DIRS)) {
        deletePath = createTargetPath(targetPath, fs);
      }
      // targetPath is the directory specified in the SQL for the result files;
      // its value looks like result/userName154122639/4e574b5d9f894a70b074ccd3981ca0f1
      if (!fs.rename(sourcePath, targetPath)) { // the exception shown above comes from this failed rename
        try {
          if (deletePath != null) {
            fs.delete(deletePath, true);
          }
        } catch (IOException e) {
          LOG.info("Unable to delete the path created for facilitating rename"
              + deletePath);
        }
        throw new HiveException("Unable to rename: " + sourcePath
            + " to: " + targetPath);
      }
    }
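The failure is easy to reproduce outside Hadoop: a filesystem rename whose destination's parent directory does not exist fails, which is what HDFS's FileSystem.rename reports by returning false. The sketch below is a stand-in, not Hive's code: it uses plain java.nio.file instead of the Hadoop API, and the class and method names are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

public class RenameDemo {
    // Returns true if a move into a nonexistent parent fails until the
    // parent is created first (which is what createTargetPath arranges in Hive).
    static boolean renameWithoutParentFails() {
        try {
            Path tmp = Files.createTempDirectory("renamedemo");
            Path source = Files.createDirectory(tmp.resolve("ext-10000"));
            // Two missing levels, mirroring 'result/testConsole' in the SQL above
            Path target = tmp.resolve("result/testConsole/out");
            try {
                Files.move(source, target);   // parent missing: this throws
                return false;                 // (would mean the move succeeded)
            } catch (NoSuchFileException e) {
                Files.createDirectories(target.getParent()); // create parents first
                Files.move(source, target);                  // now the rename works
                return Files.isDirectory(target);
            }
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("rename needs an existing parent: " + renameWithoutParentFails());
    }
}
```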
For fs.rename to succeed, the parent directory of targetPath must already exist.
In fact, the parent directories of targetPath have been checked and created beforehand:

    private Path createTargetPath(Path targetPath, FileSystem fs) throws IOException {
      Path deletePath = null;
      Path mkDirPath = targetPath.getParent();
      if (mkDirPath != null && !fs.exists(mkDirPath)) {
        Path actualPath = mkDirPath;
        // Walk upward to find the topmost ancestor that does not exist yet
        while (actualPath != null && !fs.exists(actualPath)) {
          deletePath = actualPath;
          actualPath = actualPath.getParent();
        }
        fs.mkdirs(mkDirPath);
      }
      return deletePath; // the topmost newly created directory, kept so it can be deleted if the rename fails
    }
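To make the bookkeeping in createTargetPath concrete: the loop walks from targetPath's parent upward and remembers the highest ancestor that does not exist yet; deleting that single directory recursively undoes everything mkdirs created if the rename later fails. Here is a stand-alone sketch of the same walk, using java.nio.file rather than Hadoop's FileSystem, with names of my own invention:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class TopmostMissing {
    // Same loop shape as createTargetPath: walk upward from the target's
    // parent, remembering the highest ancestor that does not exist yet.
    static Path topmostMissingAncestor(Path target) {
        Path deletePath = null;
        Path actual = target.getParent();
        while (actual != null && !Files.exists(actual)) {
            deletePath = actual;
            actual = actual.getParent();
        }
        return deletePath;
    }

    // With base existing but base/a and base/a/b missing, the answer is base/a:
    // deleting base/a recursively removes everything mkdirs would have created.
    static boolean demo() {
        try {
            Path base = Files.createTempDirectory("demo");
            Path target = base.resolve("a/b/c");
            return base.resolve("a").equals(topmostMissingAncestor(target));
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("topmost missing ancestor found: " + demo());
    }
}
```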

Apache Hive has had this problem, and it has been fixed there.
CDH added a parameter, hive.insert.into.multilevel.dirs, which defaults to false. In other words: "I still ship this bug."
Then, when you have been bitten and sit down to write a patch, you discover you can simply change the configuration instead.
So the bug stays, but once you hit it you can no longer call it a bug, because you could have changed the configuration yourself. $+@*^.!"?...
I haven't found this parameter used anywhere else. Its only effect here is that the result path specified in the SQL cannot contain more than one level of nonexistent directories,
and I haven't found any benefit in that restriction either.

 

After much fiddling, it turns out you can just add one configuration property:

    <property>
      <name>hive.insert.into.multilevel.dirs</name>
      <value>true</value>
    </property>
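The property above goes into hive-site.xml. Since hive.insert.into.multilevel.dirs is an ordinary HiveConf variable, it should also be settable per session from the Hive CLI; whether your particular CDH build honors this is an assumption worth verifying:

```sql
-- Hypothetical per-session alternative to editing hive-site.xml
set hive.insert.into.multilevel.dirs=true;
INSERT OVERWRITE DIRECTORY 'result/testConsole' select count(1) from nutable;
```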

 

After that, the missing parent directories get created and the move succeeds.

 

 

Reposted from: http://blog.csdn.net/johnny_lee/article/details/19200357
