HDFS High Availability (HA) with QJM

 

What is availability:

1. The ability to continue providing service.

2. Avoiding a single point of failure (SPOF).

Related concepts:

Failover

Disaster Recovery

Fault Tolerance

 

Hadoop offers two ways to achieve HA:

1. Using the Quorum Journal Manager (QJM)

2. Using NFS (Network File System)

3. QJM architecture

          Configure two NameNodes; at any moment only one is active, while the other remains in standby state.

          The standby's role is to maintain enough state data to take over quickly when disaster recovery is needed.

          The active and standby nodes synchronize through a group of separate daemons called JournalNodes (JNs).

          Every namespace modification made by the active NN is recorded in a majority of the JNs; the standby continuously reads the edits from the JNs and applies them to its own namespace.

          When a disaster occurs, the standby first ensures it has read all of the edit data from the JNs, and only then switches to the active state. This guarantees that the namespace state of the two NameNodes is synchronized at the moment of failover.

          To achieve fast failover, the standby must also hold up-to-date block location information for all DataNodes. Therefore every DataNode is configured with the addresses of both NameNodes and sends heartbeat and block report information to both in real time.

          Split-brain: the situation where both NameNodes are in the active state at the same time. To prevent it, the JNs allow only one NN to act as a writer at any given moment. When failover occurs, the NN that is becoming active takes over writing to the JNs, preventing the other NN from continuing to write.
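Once HA is running, a quick way to confirm which NameNode currently holds the active role is the haadmin tool (a minimal sketch, assuming the nn1/nn2 NameNode IDs used in the configuration below):

hdfs haadmin -getServiceState nn1    # prints "active" or "standby"
hdfs haadmin -getServiceState nn2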

 

4. QJM failover configuration (core: shared edit log)

        1. Hardware resources:

                           a. multiple NameNode hosts

                           b. multiple JournalNode hosts

                              The JournalNode is a lightweight process and can share a host with other daemons such as DataNodes; configure at least three JN nodes. Because edits must reach a majority of JNs, 3 JNs tolerate the loss of 1, and 5 tolerate the loss of 2.

                           c. Note: the standby NN also performs the checkpoint work, so a Secondary NameNode (2NN) must not be configured; otherwise an error occurs.

         2. Deployment (see the official Apache documentation):

                            A nameservice ID identifies the single logical service formed by the multiple NameNodes.

                            Configuration details [hdfs-site.xml]:

                            

1.dfs.nameservices
<property>
<!-- Logical name of the nameservice -->
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>


2.dfs.ha.namenodes.[nameservice ID]
<property>
<!-- Logical names of the NameNodes; version 2.x allows only two NameNodes -->
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
Note: Currently, only a maximum of two NameNodes may be configured per nameservice.


3.dfs.namenode.rpc-address.[nameservice ID].[name node ID]
<!-- NameNode RPC address -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>s101:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>s102:8020</value>
</property>


4.dfs.namenode.http-address.[nameservice ID].[name node ID]
<!-- NameNode web UI address -->
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>s101:50070</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>s102:50070</value>
</property>


5.dfs.namenode.shared.edits.dir
<!-- Shared edit log directory (URI of the JournalNode quorum) -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://s103:8485;s104:8485;s105:8485/mycluster</value>
</property>

6.dfs.client.failover.proxy.provider.[nameservice ID]
<!-- Client failover proxy provider; the client uses this class to determine which NN is active -->
<!-- Fixed value -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>

7.dfs.ha.fencing.methods
<!-- Method used to fence the previously active NN when failover occurs -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>

<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/exampleuser/.ssh/id_rsa</value>
</property>
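Besides sshfence, Hadoop also supports a shell fencing method that runs an arbitrary command. One documented pattern relies on QJM's single-writer guarantee and makes fencing trivially succeed (a sketch; use with care, since it performs no real fencing):

<property>
  <name>dfs.ha.fencing.methods</name>
  <value>shell(/bin/true)</value>
</property>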

8.fs.defaultFS
[core-site.xml]
<!-- Configure the default file system as the nameservice -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
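With fs.defaultFS pointing at the nameservice, clients address the cluster by its logical name rather than a specific NameNode host, and the failover proxy provider resolves whichever NN is currently active. For example:

hdfs dfs -ls hdfs://mycluster/
hdfs dfs -ls /    # equivalent, since mycluster is now the default file system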

9.dfs.journalnode.edits.dir
[hdfs-site.xml]
<!-- Absolute path where the JournalNode stores its edit log data -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/path/to/journal/node/local/data</value>
</property>
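After distributing the configuration to all hosts, a typical first-time startup sequence looks roughly like the following (a sketch assuming the s101/s102 NameNodes and s103-s105 JournalNodes above, and Hadoop 2.x daemon scripts; consult the Apache documentation for your version):

# on s103, s104, s105: start the JournalNodes first
hadoop-daemon.sh start journalnode

# on s101: format a fresh namespace (for an existing non-HA cluster,
# run "hdfs namenode -initializeSharedEdits" instead of formatting)
hdfs namenode -format
hadoop-daemon.sh start namenode

# on s102: copy the formatted metadata over from the running NN
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode

# both NNs come up in standby; manually promote nn1
hdfs haadmin -transitionToActive nn1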

 

 
