Hadoop: Error When Failing Over the NameNode

Copyright notice: this is an original article by the author and may not be reproduced without the author's permission. Blog: http://www.fanlegefan.com/ https://blog.csdn.net/woloqun/article/details/81416576

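I previously set up HA on the cluster: master is active by default and slave3 is standby. To test high availability, I manually killed the NameNode process on master, but the NameNode on slave3 did not switch to the active state as expected. I then checked the zkfc log on slave3: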

tail -100f hadoop-qun-zkfc-slave3.log 

It contained the following error:

com.jcraft.jsch.JSchException: Auth fail
    at com.jcraft.jsch.Session.connect(Session.java:519)
    at org.apache.hadoop.ha.SshFenceByTcpPort.tryFence(SshFenceByTcpPort.java:100)
    at org.apache.hadoop.ha.NodeFencer.fence(NodeFencer.java:97)
    at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:536)
    at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:509)
    at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
    at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:895)
    at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:985)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:882)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2018-08-04 22:48:35,127 WARN org.apache.hadoop.ha.NodeFencer: Fencing method org.apache.hadoop.ha.SshFenceByTcpPort(null) was unsuccessful.
2018-08-04 22:48:35,127 ERROR org.apache.hadoop.ha.NodeFencer: Unable to fence service by any configured method.
2018-08-04 22:48:35,127 WARN org.apache.hadoop.ha.ActiveStandbyElector: Exception handling the winning of election
java.lang.RuntimeException: Unable to fence NameNode at master/192.168.1.115:8020
    at org.apache.hadoop.ha.ZKFailoverController.doFence(ZKFailoverController.java:537)
    at org.apache.hadoop.ha.ZKFailoverController.fenceOldActive(ZKFailoverController.java:509)
    at org.apache.hadoop.ha.ZKFailoverController.access$1100(ZKFailoverController.java:61)
    at org.apache.hadoop.ha.ZKFailoverController$ElectorCallbacks.fenceOldActive(ZKFailoverController.java:895)
    at org.apache.hadoop.ha.ActiveStandbyElector.fenceOldActive(ActiveStandbyElector.java:985)
    at org.apache.hadoop.ha.ActiveStandbyElector.becomeActive(ActiveStandbyElector.java:882)
    at org.apache.hadoop.ha.ActiveStandbyElector.processResult(ActiveStandbyElector.java:467)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:599)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
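
In a long zkfc log, the lines that matter for this problem are the ones from NodeFencer and the JSch exception. As a minimal sketch, the helper below filters a log file down to those lines. The function name fence_errors is made up for illustration, and the file name is the one used in the tail command earlier in this post.

```shell
# fence_errors is a hypothetical helper name, not part of Hadoop.
# It prints only the fencing-related lines from a zkfc log file.
# Usage: fence_errors <logfile>
fence_errors() {
    grep -E 'NodeFencer|Unable to fence|JSchException' "$1"
}

# Run it only if the log from this post is in the current directory.
if [ -f hadoop-qun-zkfc-slave3.log ]; then
    fence_errors hadoop-qun-zkfc-slave3.log
fi
```

If this prints "Unable to fence service by any configured method", the standby's zkfc won the election but could not fence the old active node, so it did not become active.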

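Solution: the sshfence method (SshFenceByTcpPort in the log) logs in to the old active node over SSH and runs fuser to kill the process listening on the NameNode port, and fuser is provided by the psmisc package. Run the following command on master and slave3 (that is, on every node that runs a NameNode process):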

sudo yum install psmisc
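
After installing, it may help to confirm that fuser is actually on the path before repeating the failover test. The sketch below is only a quick check, and the function name check_fuser is made up for illustration. It adds /sbin and /usr/sbin to the search path because sshfence does the same when it calls fuser.

```shell
# check_fuser is a hypothetical helper name, not part of Hadoop.
# It reports whether the fuser binary (shipped by psmisc) can be found.
check_fuser() {
    # Use a subshell so the extended PATH does not leak into the caller.
    if ( PATH="$PATH:/sbin:/usr/sbin"; command -v fuser >/dev/null 2>&1 ); then
        echo "fuser found"
    else
        echo "fuser missing"
    fi
}

check_fuser
```

Run this on every NameNode host. Once it reports "fuser found" everywhere, kill the active NameNode again and check the state of the other one with hdfs haadmin -getServiceState <serviceId>, using the NameNode IDs from your own hdfs-site.xml. Because the trace above shows "Auth fail" raised inside Session.connect, if the error persists it may also be worth confirming that the user running zkfc can SSH between the two NameNode hosts without a password, and that dfs.ha.fencing.ssh.private-key-files points to that user's private key.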
