Alibaba Cloud HDFS error: could only be written to 0 of the 1 minReplication nodes

The HDFS Java client reports an error

I. Error description

  • HDFS is deployed on an Alibaba Cloud ECS instance.
  • Running hadoop fs -put xx /xx from a remote shell on the server uploads the file successfully, but the local Java client fails with:
org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /0529/dashen/test01.txt could only be written to 0 of the 1 minReplication nodes. There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
	at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:2312)
	at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:294)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2939)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:908)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:593)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:532)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1070)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1020)
	at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:948)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1845)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2952)


	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1562)
	at org.apache.hadoop.ipc.Client.call(Client.java:1508)
	at org.apache.hadoop.ipc.Client.call(Client.java:1405)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:234)
	at org.apache.hadoop.ipc.ProtobufRpcEngine2$Invoker.invoke(ProtobufRpcEngine2.java:119)
	at com.sun.proxy.$Proxy12.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.addBlock(ClientNamenodeProtocolTranslatorPB.java:530)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy13.addBlock(Unknown Source)
	at org.apache.hadoop.hdfs.DFSOutputStream.addBlock(DFSOutputStream.java:1090)
	at org.apache.hadoop.hdfs.DataStreamer.locateFollowingBlock(DataStreamer.java:1867)
	at org.apache.hadoop.hdfs.DataStreamer.nextBlockOutputStream(DataStreamer.java:1669)
	at org.apache.hadoop.hdfs.DataStreamer.run(DataStreamer.java:715)

II. Problem analysis

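  1. Uploading from the Alibaba Cloud host itself works, so the problem is most likely an IP address or port that is unreachable from the local machine.
    1) Test the IP
ping <Alibaba Cloud public IP>    // the ping succeeds

    2) Test the port
telnet <public IP> 9866   // 9866 is the DataNode data transfer port; it had already been opened in the security group, and the connection succeeds
  2. Hypothesis
    In principle everything should work. Then I remembered Alibaba Cloud's elastic IPs and it clicked: the DataNode addresses that the NameNode returns are Alibaba Cloud private (LAN) IPs, so although the client obtains a DataNode IP, it cannot connect to it.
    Tips:
    Alibaba Cloud public IP: the address assigned by Alibaba Cloud that you use to log in remotely.
    Alibaba Cloud private IP: the IPv4 address shown by ifconfig after you log in remotely.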
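
The ping and telnet checks above only cover the public IP. As a rough sketch of the same test from Java, the class below attempts a TCP connection to port 9866 on each address passed on the command line, so the public and private IPs can be compared from the local machine. The class name PortCheck, the helper isReachable, and the 3-second timeout are illustrative choices, not part of the original setup.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

class PortCheck {

    // Returns true if a TCP connection to host:port succeeds within the timeout.
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Covers refused connections, timeouts and unresolvable host names.
            return false;
        }
    }

    public static void main(String[] args) {
        int dataNodePort = 9866; // DataNode data transfer port tested in this article
        for (String host : args) {
            boolean ok = isReachable(host, dataNodePort, 3000);
            System.out.println(host + ":" + dataNodePort + (ok ? " reachable" : " NOT reachable"));
        }
    }
}
```

On Java 11 or later it can be run as: java PortCheck.java <public IP> <private IP>. If the hypothesis is right, the public address should connect from the local machine while the private one should not.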

III. Solution

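  1. Have the client use each DataNode's hostname, rather than its private IP, from the DataNode list that the NameNode returns.
  2. On Windows, map the DataNode's hostname to the public IP. This resolves the problem.
  • Change the hostname
sudo vim /etc/hostname  // change the host name; a reboot is required
  • Edit hdfs-site.xml. Note that this property is read on the client side, so it must also be visible to the Java client (in the hdfs-site.xml on its classpath, or set on its Configuration).
       <property> <!-- have clients connect to DataNodes by hostname instead of IP -->
                <name>dfs.client.use.datanode.hostname</name>
                <value>true</value>
        </property>
  • Edit the Windows hosts file
    File path: C:\Windows\System32\drivers\etc\hosts
47.111.250.185 node01   # public IP, then the hostname of the Alibaba Cloud host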
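
Because dfs.client.use.datanode.hostname is read by the client, the same switch can also be set directly in the Java client rather than through an hdfs-site.xml on its classpath. The following is a minimal sketch, assuming the hadoop-client library is on the classpath. The NameNode address hdfs://node01:8020 and the local file name test01.txt are placeholders that do not come from the original setup, and the class and helper names are hypothetical.

```java
import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

class HdfsUploadSketch {

    // Builds a client configuration that connects to DataNodes by hostname.
    static Configuration clientConf(String nameNodeUri) {
        Configuration conf = new Configuration();
        conf.set("fs.defaultFS", nameNodeUri);
        // Client-side equivalent of the hdfs-site.xml property shown above.
        conf.setBoolean("dfs.client.use.datanode.hostname", true);
        return conf;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: node01 resolves to the public IP via the hosts file,
        // and 8020 is the NameNode RPC port of this cluster (adjust as needed).
        String nameNodeUri = "hdfs://node01:8020";
        Configuration conf = clientConf(nameNodeUri);

        try (FileSystem fs = FileSystem.get(URI.create(nameNodeUri), conf)) {
            // Placeholder local file; the target path is the one from the error message.
            fs.copyFromLocalFile(new Path("test01.txt"), new Path("/0529/dashen/test01.txt"));
        }
    }
}
```

With the hosts entry in place, the client resolves node01 to the public IP and opens the DataNode connection against that address instead of the private one.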

Reposted from blog.csdn.net/qq_29012499/article/details/108441638