Some problems encountered while using Hadoop

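1 How to find out how a file's blocks are distributed on HDFS
One way is fsck, e.g. hdfs fsck <path> -files -blocks -locations. See also:
http://stackoverflow.com/questions/6372060/how-to-track-which-data-block-is-in-which-data-node-in-hadoop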

2 Submitting a job from a Windows machine to a Linux Hadoop cluster fails
org.apache.hadoop.util.Shell$ExitCodeException: /bin/bash: line 0: fg: no job control

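This is a Hadoop bug: https://issues.apache.org/jira/browse/MAPREDUCE-4052
The JIRA says it was fixed in 2.4.0. We are on 2.5, where the fix still has to be switched on in the client-side configuration file (for example mapred-site.xml). The property defaults to false and must be set to true:
<property>
  <description>If enabled, user can submit an application cross-platform
  i.e. submit an application from a Windows client to a Linux/Unix server or
  vice versa.
  </description>
  <name>mapreduce.app-submission.cross-platform</name>
  <value>true</value>
</property>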


3 How to locate the logs of the various MapReduce jobs so they are easy to inspect
http://www.iteblog.com/archives/896

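4 How to specify the user name used by the hadoop client
According to the UserGroupInformation source, when security (Kerberos) is disabled you can set HADOOP_USER_NAME either as an environment variable or as a Java system property. The environment variable is checked first: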
      //If we don't have a kerberos user and security is disabled, check
      //if user is specified in the environment or properties
      if (!isSecurityEnabled() && (user == null)) {
        String envUser = System.getenv(HADOOP_USER_NAME);
        if (envUser == null) {
          envUser = System.getProperty(HADOOP_USER_NAME);
        }
        user = envUser == null ? null : new User(envUser);
      }
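
The lookup order in the excerpt above can be tried out without a cluster. The following is only a minimal sketch that mirrors that logic in plain Java. The class name HadoopUserNameDemo, the helper resolveUser and the user name "hdfs" are made up for illustration and are not part of Hadoop.

```java
import java.util.Map;
import java.util.Properties;

public class HadoopUserNameDemo {
    // Same key that the UserGroupInformation excerpt looks up.
    static final String HADOOP_USER_NAME = "HADOOP_USER_NAME";

    // Hypothetical helper mirroring the excerpt: the environment variable
    // wins, the system property is only a fallback, null means "not set".
    static String resolveUser(Map<String, String> env, Properties props) {
        String envUser = env.get(HADOOP_USER_NAME);
        if (envUser == null) {
            envUser = props.getProperty(HADOOP_USER_NAME);
        }
        return envUser;
    }

    public static void main(String[] args) {
        // "hdfs" is an assumed example user. In a real client this line
        // would go before the first call into the Hadoop client API.
        System.setProperty(HADOOP_USER_NAME, "hdfs");

        String user = resolveUser(System.getenv(), System.getProperties());
        System.out.println("effective user: " + user);
    }
}
```

In a real client, the same effect should be achievable with java -DHADOOP_USER_NAME=<user> on the command line or by exporting HADOOP_USER_NAME in the shell. Setting the property in code probably has to happen before Hadoop first resolves the login user, since that result appears to be cached.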

Reposted from kabike.iteye.com/blog/2195194