HDFS-HA mode construction (upgrading from fully distributed mode)

Description

This HDFS-HA build is based on the earlier fully distributed build. For the fully distributed setup, please refer to the previous posts:

Hadoop installation environment preparation and related knowledge analysis

Preliminary analysis of distributed installation and configuration of hadoop (unstoppable)

Generally speaking, it is roughly divided into the following parts:

JDK installation and JAVA_HOME configuration

HOSTS mapping

Passwordless SSH login setup

Hadoop configuration file modification

HADOOP_HOME configuration

Initialization (formatting)

Startup verification

The commonly mentioned modules of the hadoop project are yarn, hdfs, and mapreduce. Both hdfs and yarn can be built in high-availability (HA) mode; this article covers only the HA build of hdfs.

Machine planning

The fully distributed and HA modes differ not only in configuration but also in the processes that run after startup. The machines and main processes of the earlier fully distributed setup are roughly as follows:

Fully distributed

host: 192.168.139.9 (master), 192.168.139.19 (node01), 192.168.139.29 (node02)

processes: namenode, datanode, resourceManager, nodeManager, secondaryNameNode

HA mode

Compared with the fully distributed mode, HA adds zookeeper, ZKFC, and journalnode processes and drops the secondaryNameNode. The planned node and process distribution of the HA mode is roughly as follows:

host: 192.168.139.9 (master), 192.168.139.19 (node01), 192.168.139.29 (node02), 192.168.139.39 (node03)

processes: namenode, datanode, journalnode, zkfc, zk (zookeeper), nodeManager, resourceManager

Note that the resourceManager here runs on the node whose namenode is active; it does not run on both namenode nodes at the same time.

Understanding of HDFS-HA

In the fully distributed mode there is only one namenode and multiple datanodes. The namenode manages the metadata for the data stored on the datanodes, so once the namenode fails, the entire hdfs becomes unavailable.

Therefore, the HA mode of HDFS discussed here is, in my personal understanding, really HA for the namenode: a master-standby arrangement is added for the namenode, together with failover and automatic master-standby switching. Both the newly added zookeeper middleware and the newly added journalnode process mainly serve to keep the namenode highly available, so that the cluster can still provide normal service after the active node fails.

Basic environment preparation

As can be seen from the table above, compared with the fully distributed setup I have added one more machine here, mainly because I was worried that three machines would not be able to handle the load.

So the basic environment preparation here really refers to the newly added machine. The basic environment is the same as before, namely jdk, ssh, and hosts mapping; for the specific operations, refer to the two blog posts listed at the beginning.
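As a reminder, a minimal sketch of that preparation for the new node03 might look like the following, assuming the root user and the IP from the planning table above (adjust to your own environment):

# append the new node to /etc/hosts on every machine
echo "192.168.139.39 node03" >> /etc/hosts

# push the existing public key so the master can reach node03 without a password
ssh-copy-id root@node03

# confirm the jdk and passwordless login work from the master
ssh root@node03 "java -version"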

Configuration

In the earlier fully distributed mode quite a few files were configured, such as hdfs-site.xml, hadoop-env.sh, core-site.xml, workers, and so on. For the HA build, only three of them need to be modified: hdfs-site.xml, hadoop-env.sh, and core-site.xml.

hdfs-site.xml

Hadoop 3.x is said to support up to five namenodes, but because of limited machine resources only two are used here. The configuration below is divided into four sections, and the overall configuration is as follows:


<!--1-->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>master:8020</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>node03:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>master:9870</value>
</property>
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>node03:9870</value>
</property>

<!--2-->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://master:8485;node01:8485;node02:8485/mycluster</value>
</property>
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/root/soft/bigdata/journal/data</value>
</property>

<!--3-->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>

<!--4-->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>

The first section above configures the namenode cluster, specifying the nameservice name, the namenode names, and the host address of each namenode. The first two can be customized.

The second section specifies the machines on which the journalnode processes run and their local storage directory. When the journalnodes are started, the process starts on the nodes configured here, and the corresponding data is stored in the specified directory.

The third section specifies the client failover proxy class and the fencing method; sshfence is used here, which relies on the passwordless SSH private key configured below it.

The fourth section enables automatic failover, which is what ZKFC implements.
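If you want to double-check that these settings are being picked up, hdfs getconf can print individual keys and the resolved namenode list, for example:

hdfs getconf -confKey dfs.nameservices
hdfs getconf -confKey dfs.ha.namenodes.mycluster
hdfs getconf -namenodes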

core-site.xml

The configuration of core-site.xml is much simpler than that of hdfs-site.xml. The configuration is as follows:


<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<property>
  <name>ha.zookeeper.quorum</name>
  <value>node01:2181,node02:2181,node03:2181</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoop/hadoopdata</value>
</property>

In the fully distributed mode, the first item was configured with the hostname or IP of a specific machine. In HA mode it needs to be changed to the nameservice name defined above; the name itself can be customized, but it must be the same as the dfs.nameservices value configured in hdfs-site.xml.

The second item specifies the zookeeper cluster nodes, and the third item is unchanged from the fully distributed mode.

hadoop-env.sh

This file does not seem to be mentioned in many write-ups online, but in practice I found that if it is not configured, startup fails.

In fully distributed mode, in addition to JAVA_HOME, the following was also configured in this file:


export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root

Now that journalnode and zkfc are newly added, the following configuration needs to be added in the same way:


export HDFS_JOURNALNODE_USER=root
export HDFS_ZKFC_USER=root

At this point the HDFS side of the HA configuration is complete. However, as shown above, zookeeper is required, and the fully distributed setup did not include it, so zookeeper must first be installed on the planned nodes.

Zookeeper build and start

First obtain the installation package; the latest version can be downloaded from the official site at the following address:

https://downloads.apache.org/zookeeper/current/

Then extract it with tar; the specific steps are not repeated here. After extraction, rename the zoo_sample.cfg file in the conf directory of the installation directory to zoo.cfg, and then modify that file.
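The steps are roughly as follows; the version number and installation directory are only examples and should be adjusted to whatever you actually download and wherever you want to install it:

# download the current release (the exact file name depends on the version)
wget https://downloads.apache.org/zookeeper/current/apache-zookeeper-3.6.2-bin.tar.gz

# extract and move it to the chosen installation directory
tar -zxvf apache-zookeeper-3.6.2-bin.tar.gz
mv apache-zookeeper-3.6.2-bin /root/soft/bigdata/zookeeper

# create zoo.cfg from the sample
cd /root/soft/bigdata/zookeeper/conf
mv zoo_sample.cfg zoo.cfg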

zoo.cfg modification

First of all, this file has a default dataDir which, like the initial hadoop configuration, points to a temporary directory of the Linux system. It needs to be changed to a non-temporary directory; for example, I changed it here to /root/soft/bigdata/zookeeper/data.

Then, you need to configure each node of the zookeeper cluster. For example, my configuration here is as follows:


server.1=node01:2888:3888
server.2=node02:2888:3888
server.3=node03:2888:3888

The number after server. does not have to be the same as here; it can be understood as a kind of weight used in zookeeper's own leader election.

The two port numbers that follow are, respectively, the port used for communication with the leader once there is one, and the port used for the election when there is no leader.
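Putting the two changes together, a minimal zoo.cfg along these lines should work; the remaining values are the defaults from zoo_sample.cfg:

tickTime=2000
initLimit=10
syncLimit=5
dataDir=/root/soft/bigdata/zookeeper/data
clientPort=2181
server.1=node01:2888:3888
server.2=node02:2888:3888
server.3=node03:2888:3888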

myid

In addition to the configuration above, you also need to create an id file named myid in the zookeeper data directory and write the value configured above into it, so that zookeeper can read it from there.

For example, on node01 I run the following in /root/soft/bigdata/zookeeper/data:


echo 1 > myid

In /root/soft/bigdata/zookeeper/data on node02:


echo 2 > myid

Similarly, in /root/soft/bigdata/zookeeper/data on node03:


echo 3 > myid

zookeeper environment variables

When installing jdk, redis, or hadoop earlier, environment variables were configured for each of them so that the corresponding commands could be executed from any directory. Zookeeper is no different; it is best to configure its environment variables as well.
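For example, something like the following can be appended to /etc/profile (or ~/.bashrc), assuming zookeeper was unpacked to /root/soft/bigdata/zookeeper as above:

# zookeeper environment variables
export ZOOKEEPER_HOME=/root/soft/bigdata/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin

After a source /etc/profile, zkServer.sh and zkCli.sh can then be run from any directory.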

Startup and verification

Apart from the environment variables, the zookeeper installation package and configuration (just like hadoop's) only need to be prepared on one machine and then distributed, that is, copied, to the other two machines. Once the environment variables have been configured on each machine in turn, the zookeeper cluster can be started and its status verified.
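A rough sketch of the distribution step with scp, assuming the same paths are used on every machine:

# run on the machine where zookeeper was configured
scp -r /root/soft/bigdata/zookeeper root@node02:/root/soft/bigdata/
scp -r /root/soft/bigdata/zookeeper root@node03:/root/soft/bigdata/

# after copying, make sure each machine's myid under the dataDir
# still contains its own value (1 on node01, 2 on node02, 3 on node03)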

The zookeeper service can be started with zkServer.sh start, and the running status can be viewed with zkServer.sh status.

If start is executed on only one machine for now, status will immediately report that it is not running, even though the zookeeper process can be seen with ps. The reason is that with only a single zookeeper an election cannot be held, so the process exists but cannot provide service.

When a second one is started, you will see that the node with the larger id value becomes the leader and the one with the smaller value becomes a follower. After all three are started, there is one leader and two followers.
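The whole sequence, run on node01, node02, and node03 in turn, is roughly:

# on each of node01 / node02 / node03
zkServer.sh start

# the process shows up as QuorumPeerMain
jps

# once at least two nodes are up, status reports Mode: leader or Mode: follower
zkServer.sh status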

HDFS initialization and startup in HA mode

As mentioned before, the namenode manages the datanodes, while zookeeper and journalnode serve to improve the availability of the namenode, so they can loosely be regarded as a layer of management over the namenode.

So in the normal case the namenode should run before the datanodes, and zookeeper and journalnode need to run before the namenode. Therefore, once the zookeeper cluster is running normally, the journalnodes should be started next:


hadoop-daemon.sh start journalnode
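This command has to be run on every machine planned for a journalnode, i.e. master, node01, and node02 according to dfs.namenode.shared.edits.dir above. On hadoop 3.x, hadoop-daemon.sh is deprecated and simply forwards to hdfs --daemon, so the equivalent is:

# run on master, node01 and node02
hdfs --daemon start journalnode

# jps should now show a JournalNode process on each of these machines
jps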

Because this is a fresh build, it needs to be initialized just as in the fully distributed setup. As mentioned before, this initialization clears the existing data and creates some new files, so normally it should be performed only once, when the environment is first set up; after that it should arguably even be disabled. The formatting operation is as follows:


hdfs namenode -format

Communication within the namenode master-standby cluster has to be based on the same cluster id, and the initialization above generates this cluster id, so the other namenode node needs to hold the same id information, which must be synchronized from the first namenode. Before synchronizing, the first namenode has to be running, so start it first:


hadoop-daemon.sh start namenode

Then comes the synchronization. I performed the operations above on the master node, i.e. one of the planned namenode nodes, so the following operation must be run on the other namenode node, node03 here:


hdfs namenode -bootstrapStandby

After the namenode configuration has been synchronized, ZKFC must be initialized to connect the namenode with zookeeper. This operation is again performed on the first namenode node, the master machine here:


hdfs zkfc -formatZK

After the above operations the basic setup is complete. Then execute the HDFS start command on the first namenode node (the master here) to start the cluster:


start-dfs.sh

If nothing goes wrong, once the startup is complete you can use the hdfs commands to create, delete, modify, and query files.
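For example, a quick smoke test could look like this; the file and directory names are arbitrary:

# create a directory and upload a local file
hdfs dfs -mkdir -p /test
hdfs dfs -put /etc/hosts /test/

# list and read it back
hdfs dfs -ls /test
hdfs dfs -cat /test/hosts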

If you want to stop at this point, you can stop the processes one by one, or use the stop-all.sh command to stop the namenode, journalnode, and so on directly. Zookeeper is separate, so it still has to be stopped on its own.

If you need to start the HA-mode HDFS cluster again later, you can also start it directly with start-all.sh.

HA verification

As stated at the beginning, the purpose of HA is to solve the namenode single point of failure, so automatic failover needs to be verified: determine which namenode is currently active, make it unavailable, and then see whether the other one automatically becomes active and continues to work normally.

There are two ways to see which namenode is active. One is to visit the web interface, i.e. the dfs.namenode.http-address values configured in hdfs-site.xml. For example, I configured master:9870 and node03:9870, and opening them in a browser shows which one is active.

The other is through zookeeper: use zkCli.sh to enter zookeeper's own client interface and then use get to see which node currently holds the lock. For example, the namenode cluster I configured is named mycluster, so my command to query the lock status is:


get /hadoop-ha/mycluster/ActiveStandbyElectorLock

The above command also shows which node holds the lock. Once you know the current active node, you can stop that namenode with hadoop-daemon.sh stop namenode and then observe whether the other one automatically becomes active. If it does, and the entire hdfs service remains usable, that basically proves the HA mode has been built successfully.
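The same check can also be done purely from the command line; hdfs haadmin reports the state of each namenode by the nn1/nn2 names configured earlier:

# check which namenode is active (nn1 = master, nn2 = node03 in this setup)
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# on the machine that reported "active", stop its namenode
hdfs --daemon stop namenode

# the other one should become active within a few seconds
hdfs haadmin -getServiceState nn2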

Origin blog.csdn.net/tuzongxun/article/details/108347899