5. Hadoop Multi-Node Cluster (HA + Federation)

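I. Preparation

1. Four Linux machines
2. Check network connectivity
3. Check the hosts file on every node
4. Check SSH access between the nodes
5. Check the JVM configuration on every node
6. Copy the configured Hadoop directory to the other nodes:
scp -r itcast hadoop@skx2:/home/hadoop
7. Check the configuration files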
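
The hosts-file check in item 3 can be scripted. The following is a minimal sketch, not part of the original procedure: the function name missing_hosts is made up for illustration, and of the hostnames in the usage comment only skx2 appears in the text above.

```shell
# Print each given hostname that has no entry in the hosts file.
# Usage: missing_hosts <hosts-file> <hostname>...
missing_hosts() {
    hosts_file=$1
    shift
    for host in "$@"; do
        # Drop comments, then look for the name in any column after the IP address.
        if ! sed 's/#.*//' "$hosts_file" |
            awk -v h="$host" '{ for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
                              END { exit !found }'; then
            echo "$host"
        fi
    done
}

# Example (only skx2 comes from the text; the other names are placeholders):
#   missing_hosts /etc/hosts skx1 skx2 skx3 skx4
#   ssh -o BatchMode=yes hadoop@skx2 'java -version'   # SSH and JVM check in one go
```

An empty result means every name resolves through the file. The ssh line assumes key-based login is already set up and that java is on the PATH of a non-interactive shell, which may not hold on every system.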

Use cases for Federation
See: http://www.infoq.com/cn/articles/hadoop-2-0-namenode-ha-federation-practice-zh/
     http://blog.csdn.net/strongerbit/article/details/7013221/

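Federation HDFS compared with the current (non-federated) HDFS
    The current HDFS has a single namespace, which uses all of the blocks. Federation HDFS has multiple independent namespaces, and each namespace uses one block pool.
    The current HDFS has only one set of blocks. Federation HDFS has multiple independent sets of blocks. A block pool is the set of blocks that belong to one namespace.
    The current HDFS consists of one Namenode and a set of datanodes. Federation HDFS consists of multiple Namenodes and a set of datanodes, and each datanode stores blocks for multiple block pools.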
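
One way to see the block pools on a running datanode is to look inside its data directory. The sketch below assumes the Hadoop 2.x on-disk layout, in which each block pool gets a directory named BP-... under <dfs.datanode.data.dir>/current. The function name list_block_pools is made up for illustration.

```shell
# List the block pool directories under one DataNode data directory.
# Usage: list_block_pools <data-dir>
list_block_pools() {
    data_dir=$1
    for bp in "$data_dir"/current/BP-*; do
        # The glob stays unexpanded when nothing matches, so test for a real directory.
        [ -d "$bp" ] && basename "$bp"
    done
    return 0
}

# Example, using the data directory from the hdfs-site.xml in this section:
#   list_block_pools /home/dongxicheng/hadoop/hdfs/data
```

With two nameservices, as configured in this section, a datanode that has registered with both would be expected to show two BP- entries, one per namespace.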

The other configuration files are the same as in the previous section. The main changes are in hdfs-site.xml; see:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
<property>
  <name>dfs.nameservices</name>
  <value>hadoop-cluster1,hadoop-cluster2</value>
  <description>
    Comma-separated list of nameservices.
  </description>
</property>

<!--  hadoop cluster1-->
<property>
  <name>dfs.ha.namenodes.hadoop-cluster1</name>
  <value>nn1,nn2</value>
  <description>
    The prefix for a given nameservice, contains a comma-separated
    list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster1.nn1</name>
  <value>SY-0217:8020</value>
  <description>
    RPC address for namenode nn1 of hadoop-cluster1
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster1.nn2</name>
  <value>SY-0355:8020</value>
  <description>
    RPC address for namenode nn2 of hadoop-cluster1
  </description>
</property>

<property>
  <name>dfs.namenode.http-address.hadoop-cluster1.nn1</name>
  <value>SY-0217:50070</value>
  <description>
    The address and the base port where the web UI of namenode nn1 will listen on.
  </description>
</property>

<property>
  <name>dfs.namenode.http-address.hadoop-cluster1.nn2</name>
  <value>SY-0355:50070</value>
  <description>
    The address and the base port where the web UI of namenode nn2 will listen on.
  </description>
</property>

<!--  hadoop cluster2 -->
<property>
  <name>dfs.ha.namenodes.hadoop-cluster2</name>
  <value>nn3,nn4</value>
  <description>
    The prefix for a given nameservice, contains a comma-separated
    list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE).
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster2.nn3</name>
  <value>SY-0226:8020</value>
  <description>
    RPC address for namenode nn3 of hadoop-cluster2
  </description>
</property>

<property>
  <name>dfs.namenode.rpc-address.hadoop-cluster2.nn4</name>
  <value>SY-0225:8020</value>
  <description>
    RPC address for namenode nn4 of hadoop-cluster2
  </description>
</property>

<property>
  <name>dfs.namenode.http-address.hadoop-cluster2.nn3</name>
  <value>SY-0226:50070</value>
  <description>
    The address and the base port where the web UI of namenode nn3 will listen on.
  </description>
</property>

<property>
  <name>dfs.namenode.http-address.hadoop-cluster2.nn4</name>
  <value>SY-0225:50070</value>
  <description>
    The address and the base port where the web UI of namenode nn4 will listen on.
  </description>
</property>

<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///home/dongxicheng/hadoop/hdfs/name</value>
  <description>Determines where on the local filesystem the DFS name node
      should store the name table(fsimage).  If this is a comma-delimited list
      of directories then the name table is replicated in all of the
      directories, for redundancy. </description>
</property>

<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://SY-0355:8485;SY-0225:8485;SY-0226:8485/hadoop-cluster</value>
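  <description>A directory on shared storage between the multiple namenodes
  in an HA cluster. This directory will be written by the active and read
  by the standby in order to keep the namespaces synchronized. This directory
  does not need to be listed in dfs.namenode.edits.dir above. It should be
  left empty in a non-HA cluster.
  Note: the last path component of a qjournal URI is the journal ID. When two
  HA nameservices share the same JournalNodes, each nameservice needs its own
  journal ID (for example, hadoop-cluster1 on nn1/nn2 and hadoop-cluster2 on
  nn3/nn4).
  </description>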
</property>

<property>
  <name>dfs.datanode.data.dir</name>
  <value>file:///home/dongxicheng/hadoop/hdfs/data</value>
  <description>Determines where on the local filesystem a DFS data node
  should store its blocks.  If this is a comma-delimited
  list of directories, then data will be stored in all named
  directories, typically on different devices.
  Directories that do not exist are ignored.
  </description>
</property>

<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>false</value>
  <description>
    Whether automatic failover is enabled. See the HDFS High
    Availability documentation for details on automatic HA
    configuration.
  </description>
</property>

<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/home/dongxicheng/hadoop/hdfs/journal/</value>
</property>

</configuration>
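
Because every HA property name carries the nameservice and namenode ID as a suffix, a mistyped suffix is easy to miss. The sketch below is a plain text check that looks for the expected <name> entries in the file. It is not a Hadoop tool, and the function names ha_keys and missing_keys are made up for illustration.

```shell
# Print the per-nameservice property names an HA nameservice needs.
# Usage: ha_keys <nameservice> <namenode-id>...
ha_keys() {
    ns=$1
    shift
    echo "dfs.ha.namenodes.$ns"
    for nn in "$@"; do
        echo "dfs.namenode.rpc-address.$ns.$nn"
        echo "dfs.namenode.http-address.$ns.$nn"
    done
}

# Print every expected key that has no <name>...</name> entry in the file.
# Usage: missing_keys <hdfs-site.xml> <nameservice> <namenode-id>...
missing_keys() {
    conf=$1
    shift
    ha_keys "$@" | while read -r key; do
        grep -qF "<name>$key</name>" "$conf" || echo "$key"
    done
}

# Example for the configuration above (the path is a placeholder):
#   missing_keys etc/hadoop/hdfs-site.xml hadoop-cluster1 nn1 nn2
#   missing_keys etc/hadoop/hdfs-site.xml hadoop-cluster2 nn3 nn4
```

No output means all expected keys are present. On a node with Hadoop installed, bin/hdfs getconf -confKey dfs.nameservices can additionally show the value Hadoop itself resolves.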

Startup:
Starting the Hadoop cluster:
-------------------------------------------------------------------
(1) Start nn1 and nn2
Step1:
On each JournalNode host, run the following command to start the journalnode service:
sbin/hadoop-daemon.sh start journalnode

Step2:
On [nn1], format the namenode and start it:
bin/hdfs namenode -format -clusterId hadoop-cluster
sbin/hadoop-daemon.sh start namenode

Step3:
On [nn2], synchronize the metadata from nn1:
bin/hdfs namenode -bootstrapStandby

Step4:
Start [nn2]:
sbin/hadoop-daemon.sh start namenode

After these four steps, nn1 and nn2 are both in the standby state.
Step5:
Switch [nn1] to Active:
bin/hdfs haadmin -ns hadoop-cluster1 -transitionToActive nn1
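
Steps 2 to 5 can be collected into one helper that is run from a single machine. The sketch below assumes passwordless SSH to the namenode hosts and that the JournalNodes from Step1 are already running. The names run_on, start_ha_pair, DRY_RUN, and HADOOP_DIR are made up for illustration, and the install directory is a placeholder. By default it only prints what it would run.

```shell
DRY_RUN=${DRY_RUN:-1}                      # 1 = only print the commands
HADOOP_DIR=${HADOOP_DIR:-/path/to/hadoop}  # placeholder install directory

# Run one command in the Hadoop directory of a remote host (or print it).
run_on() {
    host=$1
    shift
    if [ "$DRY_RUN" = 1 ]; then
        echo "[$host] $*"
    else
        ssh "$host" "cd '$HADOOP_DIR' && $*"
    fi
}

# Bring up one HA pair: format and start the first namenode, bootstrap and
# start the second, then make the first one active.
# Usage: start_ha_pair <nameservice> <cluster-id> <nn-id> <host1> <host2>
start_ha_pair() {
    ns=$1 cluster_id=$2 nn_id=$3 host1=$4 host2=$5
    run_on "$host1" bin/hdfs namenode -format -clusterId "$cluster_id" &&
    run_on "$host1" sbin/hadoop-daemon.sh start namenode &&
    run_on "$host2" bin/hdfs namenode -bootstrapStandby &&
    run_on "$host2" sbin/hadoop-daemon.sh start namenode &&
    run_on "$host1" bin/hdfs haadmin -ns "$ns" -transitionToActive "$nn_id"
}

# Hosts for nn1 and nn2 are taken from the rpc-address entries in hdfs-site.xml.
start_ha_pair hadoop-cluster1 hadoop-cluster nn1 SY-0217 SY-0355
```

Setting DRY_RUN=0 makes the helper execute the commands over SSH. The format step may ask for confirmation if the name directory already holds data, so a first run by hand is the safer choice.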

-------------------------------------------------------------------
(2) Start nn3 and nn4
Step1:
On [nn3], format the namenode with the same cluster ID that was used for nn1, and start it:
bin/hdfs namenode -format -clusterId hadoop-cluster
sbin/hadoop-daemon.sh start namenode

Step2:
On [nn4], synchronize the metadata from nn3:
bin/hdfs namenode -bootstrapStandby

Step3:
Start [nn4]:
sbin/hadoop-daemon.sh start namenode

After these three steps, nn3 and nn4 are both in the standby state.
Step4:
Switch [nn3] to Active:
bin/hdfs haadmin -ns hadoop-cluster2 -transitionToActive nn3
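
After the transitions, each nameservice should have exactly one active namenode. The sketch below checks this with haadmin -getServiceState, which prints active or standby. The function names service_state and one_active and the HDFS variable are made up for illustration, and the script is assumed to run from the Hadoop install directory.

```shell
HDFS=${HDFS:-bin/hdfs}   # path to the hdfs launcher; override if needed

# Print the HA state ("active" or "standby") of one namenode.
# Usage: service_state <nameservice> <namenode-id>
service_state() {
    "$HDFS" haadmin -ns "$1" -getServiceState "$2"
}

# Print the state of each namenode and succeed only if exactly one is active.
# Usage: one_active <nameservice> <namenode-id>...
one_active() {
    ns=$1
    shift
    active=0
    for nn in "$@"; do
        state=$(service_state "$ns" "$nn")
        echo "$ns/$nn: $state"
        [ "$state" = "active" ] && active=$((active + 1))
    done
    [ "$active" -eq 1 ]
}

# Example for the two nameservices in this section:
#   one_active hadoop-cluster1 nn1 nn2 && one_active hadoop-cluster2 nn3 nn4
```

A non-zero exit status from one_active means either no namenode or more than one namenode in that nameservice reported active.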

-------------------------------------------------------------------
(3) Start all datanodes
Step1:
On [nn1], start all datanodes:
sbin/hadoop-daemons.sh start datanode
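
In Hadoop 2.x, hadoop-daemons.sh takes its target hosts from the slaves file in the configuration directory. The sketch below reads the same file and asks each host whether a DataNode process is running. It assumes passwordless SSH and that jps is on the remote PATH. The function names slave_hosts and check_datanodes are made up for illustration.

```shell
# Print the hostnames listed in a slaves file, skipping comments and blank lines.
# Usage: slave_hosts <slaves-file>
slave_hosts() {
    sed -e 's/#.*$//' -e 's/[[:space:]]*$//' -e '/^$/d' "$1"
}

# Print "<host>: up" or "<host>: DOWN" depending on whether jps shows a DataNode.
# Usage: check_datanodes <slaves-file>
check_datanodes() {
    for host in $(slave_hosts "$1"); do
        if ssh -o BatchMode=yes "$host" jps 2>/dev/null | grep -qw DataNode; then
            echo "$host: up"
        else
            echo "$host: DOWN"
        fi
    done
}

# Example (run from the Hadoop install directory):
#   check_datanodes etc/hadoop/slaves
```

A host reported as DOWN may also just be unreachable over SSH, so the datanode log on that host is the place to look next.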

-------------------------------------------------------------------
(4) Shut down the Hadoop cluster:
On [nn1], run the following command:
sbin/stop-dfs.sh
