Foreword
HDFS high availability was set up in an earlier post. This post builds YARN high availability on top of that setup.
The helper scripts used below (zhstop, zhstart, xcall) are covered in a separate post.
Modify the configuration and distribute it
yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Declare the addresses of the two ResourceManagers -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster-yarn1</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop103</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>hadoop104</value>
</property>
<!-- Specify the address of the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop102:2181,hadoop103:2181,hadoop104:2181</value>
</property>
<!-- Enable automatic recovery -->
<property>
<name>yarn.resourcemanager.recovery.enabled</name>
<value>true</value>
</property>
<!-- Store the ResourceManager state in the ZooKeeper cluster -->
<property>
<name>yarn.resourcemanager.store.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<!-- Site specific YARN configuration properties -->
<!-- Enable log aggregation -->
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<!-- Retain aggregated logs for 7 days -->
<property>
<name>yarn.log-aggregation.retain-seconds</name>
<value>604800</value>
</property>
</configuration>
Distribute the file to each host, then restart the cluster
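If no distribution script is at hand, the copy step can be sketched as below. This is only a dry-run illustration: the host list is taken from the ZooKeeper addresses in the configuration above and is assumed to be the whole cluster, the configuration directory is assumed to follow the standard `$HADOOP_HOME/etc/hadoop` layout, and `distribute_cmds` is a hypothetical helper name.

```shell
# Assumption: these three hosts make up the whole cluster.
HOSTS="hadoop102 hadoop103 hadoop104"
# Assumption: standard Hadoop layout unless HADOOP_CONF_DIR is set.
CONF_DIR="${HADOOP_CONF_DIR:-$HADOOP_HOME/etc/hadoop}"

# Print one scp command per host instead of running it (dry run).
distribute_cmds() {
  file="$1"
  for host in $HOSTS; do
    echo "scp $file $host:$file"
  done
}

# Review the output first; pipe it to `sh` to actually copy the file.
distribute_cmds "$CONF_DIR/yarn-site.xml"
```

Printing the commands rather than executing them makes it easy to check the target paths before anything is overwritten.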
Stop the cluster
zhstop
Start the cluster
zhstart
Note: the ResourceManagers cannot all be brought up by the group-start script. Any RM that did not come up has to be started manually on its own host.
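A manual start on a single host might look like the following sketch. It assumes a Hadoop 2.x installation with `yarn-daemon.sh` on the `PATH`; on Hadoop 3.x the equivalent is `yarn --daemon start resourcemanager`, which can be supplied through the `RM_START_CMD` variable. Both `start_missing_rm` and `RM_START_CMD` are hypothetical names introduced for this illustration.

```shell
# Start a ResourceManager on this host only if jps does not list one.
start_missing_rm() {
  # Assumption: Hadoop 2.x launcher by default; override RM_START_CMD for 3.x.
  cmd="${RM_START_CMD:-yarn-daemon.sh start resourcemanager}"
  if jps | grep -q ResourceManager; then
    echo "ResourceManager already running"
  else
    $cmd
  fi
}

# Usage, on hadoop103 or hadoop104:
#   start_missing_rm
```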
Check the processes with xcall jps
xcall jps
Check the RM state
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
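The two checks above can be wrapped in a small loop. This is a sketch rather than part of the original setup: `check_rm_states` is a hypothetical helper, and it simply reports `unknown` when the command prints nothing on standard output, for example because that RM is not running.

```shell
# Print the HA state of each configured RM id (rm1 and rm2 from yarn-site.xml).
check_rm_states() {
  for id in rm1 rm2; do
    # A healthy RM answers with "active" or "standby".
    state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null)
    [ -n "$state" ] || state="unknown"
    echo "$id: $state"
  done
}

# Usage, on any host with the Hadoop client configured:
#   check_rm_states
```

In a healthy HA pair, one id should report active and the other standby.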
Summary
2020-03-05 00:34:26,697 INFO org.apache.hadoop.service.AbstractService: Service ResourceManager failed in state INITED; cause: org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Invalid configuration! Can not find valid RM_HA_ID. None of yarn.resourcemanager.address.rm1 yarn.resourcemanager.address.rm2 are matching the local address OR yarn.resourcemanager.ha.id is not specified in HA Configuration
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: Invalid configuration! Can not find valid RM_HA_ID. None of yarn.resourcemanager.address.rm1 yarn.resourcemanager.address.rm2 are matching the local address OR yarn.resourcemanager.ha.id is not specified in HA Configuration
at org.apache.hadoop.yarn.conf.HAUtil.throwBadConfigurationException(HAUtil.java:43)
at org.apache.hadoop.yarn.conf.HAUtil.verifyAndSetCurrentRMHAId(HAUtil.java:125)
at org.apache.hadoop.yarn.conf.HAUtil.verifyAndSetConfiguration(HAUtil.java:81)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:223)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1177)
2020-03-05 00:34:26,699 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Transitioning to standby state
If startup fails, reformat the NameNode and ZKFC by following the initialization steps from the HDFS-HA setup (see the linked post).
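The exception in the log says that neither `rm1` nor `rm2` matches the local address and that `yarn.resourcemanager.ha.id` is not set. One thing worth ruling out is that the RM was started on a host that is not listed in the configuration. The sketch below hard-codes the two hostnames from the yarn-site.xml above; `is_rm_host` is a hypothetical helper name.

```shell
# Succeed only for the hosts configured as rm1 and rm2 in yarn-site.xml.
is_rm_host() {
  case "$1" in
    hadoop103|hadoop104) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage, on the host where the RM failed to start:
#   is_rm_host "$(hostname)" && echo "RM may run here" || echo "not an RM host"
```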
Key configuration properties
Property name | Property value |
---|---|
yarn.resourcemanager.recovery.enabled | true (enables automatic recovery of RM state) |