Big Data Learning (3): Hadoop HA - Setting Up and Using an HA-HDFS Distributed Cluster

To keep an HDFS cluster highly available, the NameNode must not be a single point of failure. Here a ZooKeeper cluster is used together with a standby NameNode so that failover happens automatically.

Environment and preparation

Same software versions as in the previous articles.

Build a ZooKeeper cluster

  1. Download and unzip ZooKeeper
  2. Create a data folder in the ZooKeeper root directory
  3. Go into the conf folder and modify the configuration
    3.1 Rename zoo_sample.cfg to zoo.cfg
    3.2 Edit zoo.cfg
       dataDir=/opt/install/zookeeper-3.4.5/data

       server.0=hadoop1.msk.com:2888:3888
       server.1=hadoop2.msk.com:2888:3888
       server.2=hadoop3.msk.com:2888:3888
  4. Create a myid file in zookeeper/data. Put 0 in the myid on the first node, 1 on the second, and so on (0, 1, 2 for the three machines); a sketch follows the start/stop commands below
  5. Set up passwordless ssh from the master node to all three machines (including the master node itself), as shown in the same sketch
    ZooKeeper start and stop commands
bin/zkServer.sh start | stop | restart | status
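
For reference, a minimal sketch of steps 4 and 5 plus the first cluster start, assuming ZooKeeper lives under /opt/install/zookeeper-3.4.5 on all three hosts (matching the dataDir above) and that everything runs as root, consistent with the /root/.ssh/id_rsa key used later for fencing; adjust paths and user names to your environment:

    # myid must match the server.N number in zoo.cfg:
    # echo 0 on hadoop1.msk.com, echo 1 on hadoop2, echo 2 on hadoop3
    echo 0 > /opt/install/zookeeper-3.4.5/data/myid

    # passwordless ssh from the master node to all three machines (run once on the master)
    ssh-keygen -t rsa                   # accept the defaults, empty passphrase
    ssh-copy-id root@hadoop1.msk.com
    ssh-copy-id root@hadoop2.msk.com
    ssh-copy-id root@hadoop3.msk.com

    # start ZooKeeper on every node, then check that one node reports "leader"
    # and the other two report "follower"
    cd /opt/install/zookeeper-3.4.5
    bin/zkServer.sh start
    bin/zkServer.sh status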

ZooKeeper client commands

  • Note: run the client on the master node where ZooKeeper is running
 bin/zkCli.sh
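
A few common client commands for poking around; the znode /test below is just a throwaway example:

    ls /                 # list znodes under the root
    create /test hello   # create a znode /test holding the string "hello"
    get /test            # read the data back
    delete /test         # remove the znode
    quit                 # leave the client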

HA-HDFS Distributed Cluster Setup

  • If you are reusing the ordinary cluster from before, it is recommended to empty data/tmp first; if this is a new environment, refer to the previous article to build the basic cluster environment
  1. Modify the configuration files
    core-site.xml
<!-- The nameservice name "ns" is arbitrary; it is just the HDFS entry point -->
       <property>
            <name>fs.defaultFS</name>
            <value>hdfs://ns</value>
       </property>
       <property>
            <name>hadoop.tmp.dir</name>
            <value>/opt/install/hadoop-2.5.2/data/tmp</value>
       </property>
       <property>
            <name>ha.zookeeper.quorum</name>
            <value>hadoop1.msk.com:2181,hadoop2.msk.com:2181,hadoop3.msk.com:2181</value>
       </property>

hdfs-site.xml

      <property>
          <name>dfs.permissions.enabled</name>
          <value>false</value>
      </property>

      <!-- Set the HDFS nameservice to ns; it must match the value in core-site.xml -->
      <property>
          <name>dfs.nameservices</name>
          <value>ns</value>
      </property>
      <!-- The ns nameservice has two NameNodes, nn1 and nn2 -->
      <property>
          <name>dfs.ha.namenodes.ns</name>
          <value>nn1,nn2</value>
      </property>
      <!-- RPC address of nn1 -->
      <property>
          <name>dfs.namenode.rpc-address.ns.nn1</name>
          <value>hadoop1.msk.com:8020</value>
      </property>
      <!-- HTTP address of nn1 -->
      <property>
          <name>dfs.namenode.http-address.ns.nn1</name>
          <value>hadoop1.msk.com:50070</value>
      </property>
      <!-- RPC address of nn2 -->
      <property>
          <name>dfs.namenode.rpc-address.ns.nn2</name>
          <value>hadoop2.msk.com:8020</value>
      </property>
      <!-- HTTP address of nn2 -->
      <property>
          <name>dfs.namenode.http-address.ns.nn2</name>
          <value>hadoop2.msk.com:50070</value>
      </property>

      <!-- Where the NameNode's shared edit log is stored on the JournalNodes -->
      <property>
          <name>dfs.namenode.shared.edits.dir</name>
          <value>qjournal://hadoop1.msk.com:8485;hadoop2.msk.com:8485;hadoop3.msk.com:8485/ns</value>
      </property>
      <!-- Where each JournalNode stores its data on the local disk -->
      <property>
          <name>dfs.journalnode.edits.dir</name>
          <value>/opt/install/hadoop-2.5.2/journal</value>
      </property>
      <!-- Enable automatic failover when a NameNode goes down -->
      <property>
          <name>dfs.ha.automatic-failover.enabled</name>
          <value>true</value>
      </property>
      <!-- Proxy provider used by clients to find the active NameNode -->
      <property>
          <name>dfs.client.failover.proxy.provider.ns</name>
          <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
      </property>
      <!-- Fencing method; if ssh listens on the default port 22, sshfence alone is enough -->
      <property>
          <name>dfs.ha.fencing.methods</name>
          <value>sshfence</value>
      </property>
      <!-- sshfence needs passwordless ssh, so point it at the private key -->
      <property>
          <name>dfs.ha.fencing.ssh.private-key-files</name>
          <value>/root/.ssh/id_rsa</value>
      </property>

yarn-env.sh

export JAVA_HOME=/usr/java/jdk1.7.0_71
  2. First start the ZooKeeper cluster (run the ZooKeeper start command on all three nodes)
  3. Format zkfc on the master NameNode node
    bin/hdfs zkfc -formatZK
  4. Start the JournalNode on every journalnode node with the following command
    sbin/hadoop-daemon.sh start journalnode
  5. On the master namenode node, format the namenode and the journalnode directory
     bin/hdfs namenode -format ns
  6. Start the namenode process on the master namenode node
     sbin/hadoop-daemon.sh start namenode
  7. On the standby namenode node, run the first command below: it formats this node's directory and copies the namenode metadata over from the master namenode (it does not re-format the journalnode directory!). Then start the standby namenode process with the second command
     bin/hdfs namenode -bootstrapStandby
     sbin/hadoop-daemon.sh start namenode
  8. On both namenode nodes, run the following command
     sbin/hadoop-daemon.sh start zkfc
  9. Start the DataNodes: run the following command on every datanode node
    sbin/hadoop-daemon.sh start datanode
  10. Daily start and stop commands
    sbin/start-dfs.sh
    sbin/stop-dfs.sh
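
Once everything is started, it is worth checking that failover really works. A rough sketch, using only the nameservice ns and the NameNode IDs nn1/nn2 configured above plus standard Hadoop commands; run from the Hadoop install directory and adjust hostnames and paths to your environment:

    # jps on each machine should show the expected daemons:
    # NameNode + DFSZKFailoverController on the two namenode nodes,
    # JournalNode / DataNode / QuorumPeerMain on the nodes that run them
    jps

    # ask HDFS which NameNode is active and which is standby
    bin/hdfs haadmin -getServiceState nn1
    bin/hdfs haadmin -getServiceState nn2

    # the failover state is kept in ZooKeeper under /hadoop-ha/ns by default
    # (created by zkfc -formatZK); inside bin/zkCli.sh you can inspect it with: ls /hadoop-ha

    # optional failover test, assuming nn1 is currently active: stop it,
    # confirm nn2 takes over, then bring nn1 back as standby
    sbin/hadoop-daemon.sh stop namenode      # run on the nn1 host
    bin/hdfs haadmin -getServiceState nn2    # should now report "active"
    sbin/hadoop-daemon.sh start namenode     # run on the nn1 host again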