Hadoop error: ResourceManager failed to start

Table of contents

I. Introduction

II. Solution

III. Conclusion


I. Introduction

The problem this time: after the Hadoop cluster was started, the ResourceManager process did not appear in the jps output, so http://localhost:8088 could not be accessed. The ResourceManager startup log reports the error "Embedded automatic failover is enabled, but yarn.resourcemanager.zk-address is not set". The author's preliminary judgment is that automatic recovery and automatic failover are enabled in the settings, but yarn-site.xml does not specify the ZooKeeper address (host:port).

The error reported in the ResourceManager log:



2023-04-14 03:56:09,668 INFO org.apache.hadoop.service.AbstractService: Service org.apache.hadoop.yarn.server.resourcemanager.AdminService failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Embedded automatic failover is enabled, but yarn.resourcemanager.zk-address is not set
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Embedded automatic failover is enabled, but yarn.resourcemanager.zk-address is not set
        at org.apache.hadoop.yarn.server.resourcemanager.EmbeddedElectorService.serviceInit(EmbeddedElectorService.java:70)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.resourcemanager.AdminService.serviceInit(AdminService.java:142)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:267)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1185)
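
In a long log, the line that matters is the one of the form "Service ... failed in state ...; cause: ...". As a minimal sketch (not part of the original troubleshooting), the Python below pulls those lines out of a log. The regular expression and the function name find_service_failures are assumptions made for this example, based only on the log line shown above.

```python
import re

# Pattern for "Service X failed in state Y; cause: Z" lines.
# The regex is an assumption derived from the single log line quoted in this post.
FAILURE = re.compile(
    r"Service (?P<service>\S+) failed in state (?P<state>\w+); cause: (?P<cause>.+)"
)


def find_service_failures(log_lines):
    """Return a (service, state, cause) tuple for every service-failure line."""
    failures = []
    for line in log_lines:
        match = FAILURE.search(line)
        if match:
            failures.append(
                (match.group("service"), match.group("state"), match.group("cause").strip())
            )
    return failures


# The failure line from the log above, used as sample input.
sample = (
    "2023-04-14 03:56:09,668 INFO org.apache.hadoop.service.AbstractService: "
    "Service org.apache.hadoop.yarn.server.resourcemanager.AdminService "
    "failed in state INITED; cause: "
    "org.apache.hadoop.yarn.exceptions.YarnRuntimeException: "
    "Embedded automatic failover is enabled, but "
    "yarn.resourcemanager.zk-address is not set"
)

for service, state, cause in find_service_failures([sample]):
    print(f"{service} [{state}]")
    print(f"  cause: {cause}")
```

To run it against a real log, pass an open file object instead of the sample list; the location of the ResourceManager log depends on the installation.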

II. Solution

Edit the yarn-site.xml file and add the following:

      <!-- Host:Port list of the ZooKeeper cluster servers -->
      <property>
           <name>yarn.resourcemanager.zk-address</name>
           <value>spark01:2181,spark02:2181,spark03:2181</value>
      </property>

      <!-- Enable automatic recovery -->
      <property>
           <name>yarn.resourcemanager.recovery.enabled</name>
           <value>true</value>
      </property>

      <!-- Enable automatic failover -->
      <property>
           <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
           <value>true</value>
      </property>
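
The file can be sanity-checked for the combination that triggered the error (automatic failover enabled, no ZooKeeper address) before restarting. The following is a rough sketch in Python, not an official Hadoop tool. The function names load_properties and missing_zk_address are made up for this example, and the check only looks at values written explicitly in the file; it ignores Hadoop's built-in defaults and other HA settings.

```python
import xml.etree.ElementTree as ET


def load_properties(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_zk_address(props):
    """True when automatic failover is explicitly on but no ZooKeeper address is set.

    Simplification (assumption of this sketch): defaults that Hadoop applies
    when a property is absent from the file are not considered.
    """
    failover = props.get("yarn.resourcemanager.ha.automatic-failover.enabled", "false")
    zk_address = props.get("yarn.resourcemanager.zk-address", "")
    return failover.lower() == "true" and not zk_address


# A deliberately incomplete configuration, inlined for illustration.
broken = """<configuration>
  <property>
    <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
</configuration>"""

if missing_zk_address(load_properties(broken)):
    print("automatic failover is enabled, but yarn.resourcemanager.zk-address is missing")
```

For a real cluster, read the text of yarn-site.xml from disk and pass it to load_properties; in Hadoop 2.x the file usually sits under etc/hadoop in the installation directory, but the exact path depends on the setup.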

After restarting the Hadoop cluster, jps shows that the ResourceManager process has started successfully.

III. Conclusion

The author finds this problem rather puzzling. With hadoop-2.10.1, these properties were not configured and the cluster still started normally. The Hadoop version that caused the problem this time is 2.7.4: with yarn.resourcemanager.zk-address left unspecified, the ResourceManager could not start. For now, the author can only attribute this to version differences. This article is for readers' reference only. For a specific problem, check the logs first, then analyze the cause in detail; do not copy the configuration blindly.

Origin blog.csdn.net/weixin_63507910/article/details/130158965