1. Overview
When reprinting, please cite the source: http://eksliang.iteye.com/blog/2226986
1.1 The single point of failure in Hadoop 1.0
The NameNode in Hadoop is like the heart in a human body: it is critically important and must never stop working. In the Hadoop 1 era there was only one NameNode; if its data was lost or it became inoperable, the entire cluster could not be recovered. This is the single point of failure in Hadoop 1, and one way in which Hadoop 1 was unreliable. The figure below shows the Hadoop 1.0 architecture.
1.2 How Hadoop 2.0 solves the Hadoop 1.0 single point of failure
To solve the single point of failure in Hadoop 1, Hadoop 2 no longer has just one NameNode: there can be multiple (currently only two are supported), each with the same function. One is in the active state and the other is in the standby state. While the cluster is running, only the active NameNode serves requests; the standby NameNode waits in reserve and continuously synchronizes the active NameNode's data. Once the active NameNode stops working, the standby NameNode can be switched to active, either manually or automatically, and continue the work. This is high availability.
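As a quick way to see which NameNode currently holds the active role, Hadoop ships the `hdfs haadmin` tool. A sketch, using the NameNode IDs `nn1` and `nn2` that this article configures later in hdfs-site.xml:

```
hdfs haadmin -getServiceState nn1   # prints "active" or "standby"
hdfs haadmin -getServiceState nn2
```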
1.3 Using JournalNodes to share data between the active and standby NameNodes
In Hadoop 2.0 the data of the two NameNodes is in fact shared in real time. The new HDFS offers two sharing mechanisms: a Quorum Journal Manager (a cluster of JournalNodes) or a Network File System (NFS). NFS operates at the operating-system level, while JournalNodes operate at the Hadoop level. Here we use a JournalNode cluster for data sharing, which is also the mainstream practice. The figure below shows the JournalNode architecture.
For data synchronization, the two NameNodes communicate through a group of independent daemons called JournalNodes. When the active NameNode modifies its namespace, the change is logged to a majority of the JournalNode processes. The standby NameNode can read the change information from the JNs; it constantly watches the edit log for changes and applies them to its own namespace. This is how the standby ensures that, in the event of a failover, its namespace state is fully synchronized.
1.4 Failover between NameNodes
For an HA cluster it is critical that only one NameNode is active at a time; otherwise the data states of the two NameNodes would diverge, and data could be lost or erroneous results produced. Guaranteeing this requires ZooKeeper. Both NameNodes in the HDFS cluster register with ZooKeeper; when the active NameNode fails, ZooKeeper detects the failure and automatically switches the standby NameNode to active.
2. Building the Hadoop HA cluster
2.1 Configuration Details
| Hostname | IP | NameNode | DataNode | ResourceManager | ZooKeeper | JournalNode |
|----------|----|----------|----------|-----------------|-----------|-------------|
| mast1 | 192.168.177.131 | yes | yes | no | yes | yes |
| mast2 | 192.168.177.132 | yes | yes | no | yes | yes |
| mast3 | 192.168.177.133 | no | yes | yes | yes | yes |
2.2 Install jdk
(Omitted) Install jdk and configure environment variables
2.3 Passwordless SSH
(omitted), reference: http://eksliang.iteye.com/blog/2187265
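For completeness, passwordless SSH between the three nodes usually comes down to the following (a sketch; the linked post is the detailed reference):

```
# Run as the hadoop user on each of mast1, mast2, mast3
ssh-keygen -t rsa                 # accept the defaults, empty passphrase
ssh-copy-id hadoop@mast1
ssh-copy-id hadoop@mast2
ssh-copy-id hadoop@mast3
```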
2.4 Zookeeper cluster construction
(Omitted); for reference see http://eksliang.iteye.com/blog/2107002. That post covers my Solr cluster deployment, which is also managed by ZooKeeper, and the ZooKeeper steps there are exactly the same. My final zoo.cfg is shown below:
```
[hadoop@Mast1 conf]$ cat zoo.cfg
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
dataDir=/home/hadoop/zookeeper/data
dataLogDir=/home/hadoop/zookeeper/datalog
# the port at which the clients will connect
clientPort=2181
server.1=mast1:2888:3888
server.2=mast2:2888:3888
server.3=mast3:2888:3888
```
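One detail that the zoo.cfg listing alone does not show: each ZooKeeper server also needs a `myid` file under `dataDir` whose content matches its `server.N` line. A sketch, assuming the dataDir configured above:

```
echo 1 > /home/hadoop/zookeeper/data/myid   # on mast1
echo 2 > /home/hadoop/zookeeper/data/myid   # on mast2
echo 3 > /home/hadoop/zookeeper/data/myid   # on mast3
```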
2.5 Configure the Hadoop configuration file
First configure the machine mast1; after that, copy the configuration to mast2 and mast3.
The Hadoop 2.0 configuration files live in the etc/hadoop directory of the installation ($HADOOP_HOME/etc/hadoop).
- core-site.xml
```xml
<configuration>
    <!-- Specify the nameservice of hdfs as ns -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://ns</value>
    </property>
    <!-- Specify the temporary storage directory for hadoop data -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/workspace/hdfs/temp</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>4096</value>
    </property>
    <!-- Specify the zookeeper address -->
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>mast1:2181,mast2:2181,mast3:2181</value>
    </property>
</configuration>
```
- hdfs-site.xml
```xml
<configuration>
    <!-- Specify the nameservice of hdfs as ns; must be consistent with core-site.xml -->
    <property>
        <name>dfs.nameservices</name>
        <value>ns</value>
    </property>
    <!-- There are two NameNodes under ns: nn1 and nn2 -->
    <property>
        <name>dfs.ha.namenodes.ns</name>
        <value>nn1,nn2</value>
    </property>
    <!-- RPC address of nn1 -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn1</name>
        <value>mast1:9000</value>
    </property>
    <!-- HTTP address of nn1 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn1</name>
        <value>mast1:50070</value>
    </property>
    <!-- RPC address of nn2 -->
    <property>
        <name>dfs.namenode.rpc-address.ns.nn2</name>
        <value>mast2:9000</value>
    </property>
    <!-- HTTP address of nn2 -->
    <property>
        <name>dfs.namenode.http-address.ns.nn2</name>
        <value>mast2:50070</value>
    </property>
    <!-- Where the NameNodes' shared edit log is stored on the JournalNodes -->
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://mast1:8485;mast2:8485;mast3:8485/ns</value>
    </property>
    <!-- Where each JournalNode stores its data on the local disk -->
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/home/hadoop/workspace/journal</value>
    </property>
    <!-- Enable automatic failover when the active NameNode fails -->
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    <!-- Failover proxy provider used by HDFS clients -->
    <property>
        <name>dfs.client.failover.proxy.provider.ns</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <!-- Fencing method -->
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence</value>
    </property>
    <!-- sshfence requires passwordless SSH -->
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/home/hadoop/.ssh/id_rsa</value>
    </property>
    <!-- SSH connection timeout used during fencing -->
    <property>
        <name>dfs.ha.fencing.ssh.connect-timeout</name>
        <value>5000</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:///home/hadoop/workspace/hdfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:///home/hadoop/workspace/hdfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <!-- Enable WebHDFS (REST API) on the NN and DNs; optional -->
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
```
- mapred-site.xml
```xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
```
- yarn-site.xml
```xml
<configuration>
    <!-- Run the shuffle service as a NodeManager auxiliary service -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <!-- Specify the resourcemanager address -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>mast3</value>
    </property>
</configuration>
```
- slaves
```
[hadoop@Mast1 hadoop]$ cat slaves
mast1
mast2
mast3
```
- Modify JAVA_HOME
Add the JAVA_HOME configuration to the files hadoop-env.sh and yarn-env.sh respectively
```
#export JAVA_HOME=${JAVA_HOME}    # the original line, commented out
export JAVA_HOME=/usr/local/java/jdk1.7.0_67
```
Although ${JAVA_HOME} is already set as an environment variable, Hadoop complains at startup that it cannot find it, so there is no way around specifying the absolute path here. This step is required.
- Configure the environment variables of hadoop, refer to my configuration
```
[hadoop@Mast1 hadoop]$ vim ~/.bash_profile
export HADOOP_HOME="/home/hadoop/hadoop-2.5.2"
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
```
- Copy the configuration to mast2, mast3
```
scp -r ~/.bash_profile hadoop@mast2:/home/hadoop/
scp -r ~/.bash_profile hadoop@mast3:/home/hadoop/
scp -r $HADOOP_HOME/etc/hadoop hadoop@mast2:/home/hadoop/hadoop-2.5.2/etc/
scp -r $HADOOP_HOME/etc/hadoop hadoop@mast3:/home/hadoop/hadoop-2.5.2/etc/
```
At this point the Hadoop configuration is complete; the next step is to start the cluster.
3. Starting the cluster
3.1 Start the zookeeper cluster
Execute the following command on mast1, mast2, and mast3 to start the ZooKeeper cluster:
[hadoop@Mast1 bin]$ sh zkServer.sh start
To verify that the ZooKeeper cluster has started, run the following command on mast1, mast2, and mast3. If the cluster started successfully, there will be two follower nodes and one leader node:
```
[hadoop@Mast1 bin]$ sh zkServer.sh status
JMX enabled by default
Using config: /home/hadoop/zookeeper/zookeeper-3.3.6/bin/../conf/zoo.cfg
Mode: follower
```
3.2 Start the journalnode cluster
Execute the following command on mast1 to start the JournalNode cluster
[hadoop@Mast1 hadoop-2.5.2]$ sbin/hadoop-daemons.sh start journalnode
Run the jps command; you should see the JournalNode java process on each node.
3.3 Format zkfc to generate ha nodes in zookeeper
Execute the following command on mast1 to complete the format
hdfs zkfc -formatZK
(Note: this command is best typed by hand; executing a directly copied version may fail. This caused me a lot of pain during deployment.)
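In my experience the usual culprit is the dash: web pages often turn the ASCII hyphen in options like `-formatZK` into a Unicode en dash, which the CLI then rejects as an unknown option. A small self-contained check for any pasted command (the `cmd` value here deliberately contains the bad dash):

```shell
# Flag any byte outside printable ASCII in a pasted command.
cmd='hdfs zkfc –formatZK'    # note: contains a Unicode en dash, not "-"
if printf '%s' "$cmd" | LC_ALL=C grep -q '[^ -~]'; then
  echo "WARNING: non-ASCII character found; retype the command by hand"
fi
```

Running this prints the warning, because the en dash is encoded as multi-byte UTF-8 and falls outside the printable-ASCII range.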
After the format is successful, you can see it in zookeeper
```
[zk: localhost:2181(CONNECTED) 1] ls /hadoop-ha
[ns]
```
3.4 Format hdfs
hdfs namenode -format
(Note: this command is also best typed by hand; a directly copied version may fail.)
3.5 Start NameNode
First start the active NameNode on mast1 by executing the following command:
[hadoop@Mast1 hadoop-2.5.2]$ sbin/hadoop-daemon.sh start namenode
Then synchronize the NameNode metadata to mast2 and start the standby NameNode there; the commands are as follows:
```
# Synchronize the NameNode metadata to mast2
[hadoop@Mast2 hadoop-2.5.2]$ hdfs namenode -bootstrapStandby
# Start the NameNode on mast2 as the standby
[hadoop@Mast2 hadoop-2.5.2]$ sbin/hadoop-daemon.sh start namenode
```
3.6 Start the DataNodes
Execute the following command on mast1
[hadoop@Mast1 hadoop-2.5.2]$ sbin/hadoop-daemons.sh start datanode
3.7 Start YARN
Run the following command on the machine that serves as the ResourceManager (mast3 in my setup) to start YARN:
[hadoop@Mast3 hadoop-2.5.2]$ sbin/start-yarn.sh
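To confirm that the NodeManagers registered with the ResourceManager, the `yarn node -list` command can be run on mast3 (a sketch; with the configuration in this article it should list all three nodes):

```
[hadoop@Mast3 hadoop-2.5.2]$ yarn node -list
```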
3.8 Start ZKFC
Execute the following command on mast1 to complete the startup of ZKFC
[hadoop@Mast1 hadoop-2.5.2]$ sbin/hadoop-daemons.sh start zkfc
After everything has started, run jps on mast1, mast2, and mast3; you should see the following processes:
```
# java processes on mast1
[hadoop@Mast1 hadoop-2.5.2]$ jps
2837 NodeManager
3054 DFSZKFailoverController
4309 Jps
2692 DataNode
2173 QuorumPeerMain
2551 NameNode
2288 JournalNode
# java processes on mast2
[hadoop@Mast2 ~]$ jps
2869 DFSZKFailoverController
2353 DataNode
2235 JournalNode
4522 Jps
2713 NodeManager
2591 NameNode
2168 QuorumPeerMain
# java processes on mast3
[hadoop@Mast3 ~]$ jps
2167 QuorumPeerMain
2337 JournalNode
3506 Jps
2457 DataNode
2694 NodeManager
2590 ResourceManager
```
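Before testing failover, a simple HDFS smoke test (my own addition, not part of the original walkthrough) verifies that clients can write through the `ns` nameservice; the `/test` path is arbitrary:

```
[hadoop@Mast1 hadoop-2.5.2]$ hdfs dfs -mkdir -p /test
[hadoop@Mast1 hadoop-2.5.2]$ hdfs dfs -put etc/hadoop/core-site.xml /test/
[hadoop@Mast1 hadoop-2.5.2]$ hdfs dfs -ls /test
```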
4. Test the high availability of HA
After startup, the states of the NameNodes on mast1 and mast2 are as shown below:
Now stop the NameNode on mast1 by executing the following command on mast1:
[hadoop@Mast1 hadoop-2.5.2]$ sbin/hadoop-daemon.sh stop namenode
Checking again, the NameNode on mast2 has automatically switched to active! The evidence is as follows:
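To bring mast1 back, its NameNode can simply be restarted; it will rejoin the cluster as the standby (a sketch, reusing commands already shown above; `nn1` is the NameNode ID from hdfs-site.xml):

```
[hadoop@Mast1 hadoop-2.5.2]$ sbin/hadoop-daemon.sh start namenode
[hadoop@Mast1 hadoop-2.5.2]$ hdfs haadmin -getServiceState nn1   # should report "standby"
```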