Hadoop 2.5.0 HA (High Availability) Configuration

1. Modify the Hadoop configuration files

Enter the /usr/local/src/hadoop-2.5.0-cdh5.3.6/etc/hadoop directory and modify hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml, yarn-site.xml, yarn-env.sh, slaves, and the other related files.
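
A minimal shell sketch of this step, assuming the installation path shown above:

$ cd /usr/local/src/hadoop-2.5.0-cdh5.3.6/etc/hadoop
$ ls hadoop-env.sh core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml yarn-env.sh slaves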

1.1 core-site.xml file

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <!-- mycluster: with multiple NameNodes configured, the cluster needs a logical name -->
        <value>hdfs://mycluster</value>
    </property>

    <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/local/src/hadoop-2.5.0-cdh5.3.6/data</value>
    </property>

    <!-- Specify the ZooKeeper quorum addresses -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>master:2181,slave1:2181,slave2:2181</value>
    </property>
</configuration>
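
Before relying on ha.zookeeper.quorum, it can help to confirm that each ZooKeeper server is reachable. A hedged check, assuming ZooKeeper is already running on the three hosts and nc is installed (a reply of "imok" means the server is up):

$ echo ruok | nc master 2181
$ echo ruok | nc slave1 2181
$ echo ruok | nc slave2 2181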

1.2 hadoop-env.sh file

export JAVA_HOME=/usr/local/src/jdk1.8.0_121

1.3 hdfs-site.xml file

<configuration>
    <!-- Number of data replicas -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <!-- Name of the fully distributed cluster (nameservice) -->
    <property>
        <name>dfs.nameservices</name>
        <!-- mycluster: this name must match the one in core-site.xml -->
        <value>mycluster</value>
    </property>

    <!-- NameNodes that make up the cluster -->
    <property>
        <name>dfs.ha.namenodes.mycluster</name>
        <value>nn1,nn2</value>
    </property>

    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn1</name>
        <value>master:8020</value>
    </property>

    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.mycluster.nn2</name>
        <value>slave1:8020</value>
    </property>

    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn1</name>
        <value>master:50070</value>
    </property>

    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.mycluster.nn2</name>
        <value>slave1:50070</value>
    </property>

    <!-- Location of the shared NameNode edit log on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://master:8485;slave1:8485;slave2:8485/mycluster</value>
    </property>

    <!-- Fencing method: only one NameNode may serve requests at any time -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>shell(/bin/true)</value>
    </property>

    <!-- Passwordless SSH login is required when using the fencing mechanism -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>

    <!-- Local storage directory for the JournalNode servers -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/usr/local/src/hadoop-2.5.0-cdh5.3.6/data/jn</value>
    </property>

    <!-- Disable permission checking -->
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>

    <!-- Client failover proxy provider: how clients find the active NameNode of mycluster and switch on failure -->
    <property>
        <name>dfs.client.failover.proxy.provider.mycluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>

    <!-- Enable automatic failover -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
</configuration>

Note:

<property>
    <name>dfs.ha.fencing.methods</name>
    <!--
        The default value is sshfence.
        To allow the automatic NameNode active/standby switchover to proceed, set it to shell(/bin/true).
    -->
    <value>shell(/bin/true)</value>
</property>
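
After editing hdfs-site.xml, one way to sanity-check that the HA settings are being picked up is hdfs getconf; the values echoed back should match the configuration above:

$ bin/hdfs getconf -confKey dfs.nameservices
mycluster
$ bin/hdfs getconf -confKey dfs.ha.namenodes.mycluster
nn1,nn2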

1.4 mapred-site.xml file

<configuration>
    <!-- Tell MapReduce to run on YARN -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

1.5 mapred-env.sh file

export JAVA_HOME=/usr/local/src/jdk1.8.0_121

1.6 yarn-site.xml file

<configuration>

    <!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>

    <!-- Job history server for task logs -->
    <property>
        <name>yarn.log.server.url</name>
        <value>http://master:19888/jobhistory/logs/</value>
    </property>

    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>86400</value>
    </property>

    <!-- Enable ResourceManager HA -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

    <!-- Declare the addresses of the two ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>cluster-yarn1</value>
    </property>

    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>slave1</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>slave2</value>
    </property>

    <!-- Specify the ZooKeeper cluster addresses -->
    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>master:2181,slave1:2181,slave2:2181</value>
    </property>

    <!-- Enable automatic recovery -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>

    <!-- Store ResourceManager state in the ZooKeeper cluster -->
    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>
</configuration>

1.7 yarn-env.sh file

export JAVA_HOME=/usr/local/src/jdk1.8.0_121

1.8 slaves file

master
slave1
slave2
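
All of the files above must be identical on every node. A minimal sketch for distributing them from master, assuming passwordless SSH to slave1 and slave2 is already set up:

$ scp -r /usr/local/src/hadoop-2.5.0-cdh5.3.6/etc/hadoop slave1:/usr/local/src/hadoop-2.5.0-cdh5.3.6/etc/
$ scp -r /usr/local/src/hadoop-2.5.0-cdh5.3.6/etc/hadoop slave2:/usr/local/src/hadoop-2.5.0-cdh5.3.6/etc/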

2. Start the HA cluster (manual, daemon-by-daemon start)

2.1 On each JournalNode host, run the following command to start the journalnode service

$ sbin/hadoop-daemon.sh start journalnode
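
jps can be used to confirm the daemon is running on each host; a hedged example (the process IDs are illustrative):

$ jps
2603 JournalNode
2701 Jps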

2.2 On [nn1], format the NameNode and start it

$ bin/hdfs namenode -format

$ sbin/hadoop-daemon.sh start namenode
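
The standard Hadoop HA procedure then copies the freshly formatted metadata to the second NameNode before starting it; a hedged sketch, run on [nn2] (slave1):

# Sync the NameNode metadata from nn1, then start the second NameNode
$ bin/hdfs namenode -bootstrapStandby
$ sbin/hadoop-daemon.sh start namenode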

2.3 Manually set nn1 to active

$ bin/hdfs haadmin -transitionToActive nn1

==Note: because automatic failover is already enabled in the configuration above, ZooKeeper decides which NameNode becomes active, so this manual command does not take effect.==

2.4 View service status

$ bin/hdfs haadmin -getServiceState nn1
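
The command prints the HA state of the requested NameNode, for example (which node is active depends on the ZooKeeper election):

$ bin/hdfs haadmin -getServiceState nn1
active
$ bin/hdfs haadmin -getServiceState nn2
standby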

2.5 Start the HA cluster (start everything at once)

$ hdfs zkfc -formatZK

After ZooKeeper is running, execute the above command to create the HA znode in ZooKeeper (otherwise, when Hadoop starts, both configured NameNodes remain in the standby state).
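
The znode can be verified with the ZooKeeper client; a hedged sketch, assuming zkCli.sh is available on one of the ZooKeeper hosts:

$ zkCli.sh -server master:2181
[zk: master:2181(CONNECTED) 0] ls /hadoop-ha
[mycluster]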

$ sbin/start-dfs.sh

Run jps to check whether a ==DFSZKFailoverController== process exists. This process indicates that nn1 and nn2 are managed by ZooKeeper: if the active NameNode fails, the standby node is automatically switched to active, providing high availability.
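
A hedged example of what jps might show on master after start-dfs.sh (process IDs are illustrative):

$ jps
2102 QuorumPeerMain
2321 NameNode
2457 DataNode
2603 JournalNode
2755 DFSZKFailoverController
2890 Jps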

2.6 Start YARN

  • Start and stop YARN on slave1

    Start command: $ sbin/start-yarn.sh

    Stop command: $ sbin/stop-yarn.sh

  • Start and stop the ResourceManager separately on slave2

    Start command: $ sbin/yarn-daemon.sh start resourcemanager

    Stop command: $ sbin/yarn-daemon.sh stop resourcemanager

  • View the service status (example output below)

    Command: $ bin/yarn rmadmin -getServiceState rm1
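
As with the NameNodes, the command reports the HA state of the requested ResourceManager, for example:

$ bin/yarn rmadmin -getServiceState rm1
active
$ bin/yarn rmadmin -getServiceState rm2
standby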
