01 Introduction
Hadoop, developed by the Apache Software Foundation, is a distributed computing and data storage platform characterized by high reliability, easy scalability, efficient computation, and low cost.
It has two core projects: HDFS, which is responsible for data storage, and MapReduce, which is responsible for distributed computing. This article explains how to build an HDFS HA architecture.
This article is reposted from my personal WeChat official account, Tianmu Star; feel free to follow it.
First, the software versions and architecture used
linux:CentOS 7.5.1804
hadoop:hadoop-3.2.0.tar.gz
zookeeper:zookeeper-3.4.14.tar.gz
java:jdk-8u201-linux-x64.rpm
In the earlier architecture, the NameNode was a single point of failure in an HDFS cluster, so Hadoop introduced the JournalNode to manage the NameNode edits log and to switch the status of the two NameNode nodes (Standby/Active) automatically.
Official documentation: http://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html
This test environment uses three hosts, with the following roles:
Host | NameNode | DataNode | JournalNode | ZooKeeper |
node1 | √ | × | √ | √ |
node2 | √ | √ | √ | √ |
node3 | × | √ | √ | √ |
NameNode: manages the fsimage (file system namespace image) and the edits log (operation log)
DataNode: stores the actual data blocks and sends block reports to the NameNode
JournalNode: maintains a shared edits log so that the active and standby NameNodes stay consistent
ZooKeeper: monitors the NameNode active/standby state and coordinates failover
Second, installation
Configure the hosts file and passwordless SSH login on all three nodes.
All three hosts must be configured, with each host's key copied to the other two:
$ ssh-keygen
$ ssh-copy-id $hostname
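The two commands above can be combined into a short loop run on each host. This is a sketch, assuming the hostnames node1–node3 from the role table above and that the cluster is managed as root:

```shell
# Generate a key pair once (no passphrase), then push the public key to
# every node, including the local one, so each host can SSH to all three.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
for host in node1 node2 node3; do
  ssh-copy-id "root@${host}"
done
```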
Dependencies: the JDK must be installed; this example uses the RPM package:
$ rpm -ivh jdk-8u201-linux-x64.rpm
$ java -version
java version "1.8.0_201"
Java(TM) SE Runtime Environment (build 1.8.0_201-b09)
Java HotSpot(TM) 64-Bit Server VM (build 25.201-b09, mixed mode)
1. Install zookeeper
Install ZooKeeper on all three nodes. The installation steps were covered earlier; see the ZooKeeper section of my article on the ELK cluster component kafka.
2. Install hadoop
Download hadoop from the official website and extract it to the appropriate path:
$ tar xvf hadoop-3.2.0.tar.gz -C /opt
$ mv /opt/hadoop-3.2.0 /opt/hadoop
Create the following directories for use by the Hadoop roles:
$ mkdir -p /opt/data/{datanode,journal,namenode,tmp}
3. Configure hadoop
Configure environment settings in hadoop-env.sh:
$ cat >> /opt/hadoop/etc/hadoop/hadoop-env.sh <<EOF
export JAVA_HOME=/usr/java/latest
EOF
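Optionally (an addition, not part of the original steps), Hadoop can be put on the shell PATH so the hdfs and start-dfs.sh commands used later can be invoked without full paths:

```shell
# Optional: expose the Hadoop binaries on the PATH for all login shells
cat >> /etc/profile.d/hadoop.sh <<'EOF'
export HADOOP_HOME=/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
```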
Configuration: core-site.xml
$ vim /opt/hadoop/etc/hadoop/core-site.xml
<configuration>
<!-- Default file system name -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://tkscluster</value>
</property>
<!-- Temporary file path; if namenode.dir and datanode.dir are not configured, data is stored here -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/data/tmp</value>
</property>
<!-- Trash retention time (minutes) -->
<property>
<name>fs.trash.interval</name>
<value>1440</value>
</property>
<!-- ZooKeeper quorum used for failover -->
<property>
<name>ha.zookeeper.quorum</name>
<value>node1:2181,node2:2181,node3:2181</value>
</property>
</configuration>
Configuration: hdfs-site.xml
$ vim /opt/hadoop/etc/hadoop/hdfs-site.xml
<configuration>
<!-- Hadoop 3.x HA Configuration -->
<!-- HDFS cluster (nameservice) name -->
<property>
<name>dfs.nameservices</name>
<value>tkscluster</value>
</property>
<!-- Define the two namenode nodes -->
<property>
<name>dfs.ha.namenodes.tkscluster</name>
<value>nn1,nn2</value>
</property>
<!-- RPC addresses of the namenodes -->
<property>
<name>dfs.namenode.rpc-address.tkscluster.nn1</name>
<value>node1:9820</value>
</property>
<property>
<name>dfs.namenode.rpc-address.tkscluster.nn2</name>
<value>node2:9820</value>
</property>
<!-- HTTP addresses of the namenodes -->
<property>
<name>dfs.namenode.http-address.tkscluster.nn1</name>
<value>node1:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.tkscluster.nn2</name>
<value>node2:9870</value>
</property>
<!-- URI of the journalnode quorum -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://node1:8485;node2:8485;node3:8485/tkscluster</value>
</property>
<!-- Directory where journalnodes store edits -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/data/journal</value>
</property>
<!-- Used by DFS clients to locate the active namenode -->
<property>
<name>dfs.client.failover.proxy.provider.tkscluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Use SSH fencing during HA failover -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!-- Private key used for SSH fencing, together with the previous setting -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- Enable automatic HA failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Number of block replicas; set according to your environment -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Storage path for namenode metadata -->
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/data/namenode</value>
</property>
<!-- Storage path for datanode data -->
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/data/datanode</value>
</property>
<!-- Permission checking is disabled in this test environment; enable it in production -->
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
<!-- Whether to replace a datanode when one fails during a write -->
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
<value>false</value>
</property>
<!-- Replacement policy when a datanode fails during a write -->
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
<value>DEFAULT</value>
</property>
<!-- Support file append, mainly for HBase -->
<property>
<name>dfs.support.append</name>
<value>true</value>
</property>
</configuration>
Configuration: workers
Add the list of nodes that run the DataNode role:
$ vim /opt/hadoop/etc/hadoop/workers
node2
node3
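The configuration files must be identical on all three nodes. One way to ensure this (a sketch, assuming the same /opt/hadoop layout on every host and the passwordless SSH set up earlier) is to push the edited directory from node1 to the other nodes:

```shell
# Push the finished configuration from node1 to node2 and node3
for host in node2 node3; do
  scp -r /opt/hadoop/etc/hadoop "root@${host}:/opt/hadoop/etc/"
done
```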
4. Start hadoop
Initialize HDFS
First start the journalnode process on all three nodes:
$ /opt/hadoop/bin/hdfs --daemon start journalnode
Initialize the namenode (run on only one of the NameNode hosts, e.g. node1):
$ /opt/hadoop/bin/hdfs namenode -format
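Per the official QJM HA guide, the second NameNode cannot simply be formatted; it must copy the freshly formatted metadata from the first one. The explicit daemon start on node1 here is an addition to the original steps, needed because -bootstrapStandby reads from a running NameNode:

```shell
# On node1: start the freshly formatted namenode
/opt/hadoop/bin/hdfs --daemon start namenode

# On node2: pull the formatted metadata from the running namenode
/opt/hadoop/bin/hdfs namenode -bootstrapStandby
```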
Initialize the ZKFC state in ZooKeeper:
$ /opt/hadoop/bin/hdfs zkfc -formatZK
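To confirm that the ZKFC initialization succeeded, its znode can be listed with the ZooKeeper client. The zkCli.sh path below is an assumption based on a typical install under /opt/zookeeper; adjust it to your ZooKeeper location:

```shell
# The formatZK step creates /hadoop-ha/tkscluster in ZooKeeper
/opt/zookeeper/bin/zkCli.sh -server node1:2181 ls /hadoop-ha
```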
Configure user permissions in the HDFS startup scripts
PS: both of the following scripts need the additions below:
$ vim /opt/hadoop/sbin/start-dfs.sh
$ vim /opt/hadoop/sbin/stop-dfs.sh
Add the following section to each script, below the existing comment header:
# defined. e.g., HDFS_DATANODE_USER=root HDFS_DATANODE_SECURE_USER=hdfs
#
HDFS_NAMENODE_USER=root
HDFS_DATANODE_USER=root
HDFS_JOURNALNODE_USER=root
HDFS_ZKFC_USER=root
Start the HDFS service
First stop the journalnode processes started earlier on the three nodes, since the ZKFC service has not yet been started:
$ /opt/hadoop/bin/hdfs --daemon stop journalnode
Start HDFS:
$ /opt/hadoop/sbin/start-dfs.sh
Stop HDFS:
$ /opt/hadoop/sbin/stop-dfs.sh
Check the status after startup
Use the Java jps command on node1:
$ jps
43618 QuorumPeerMain # zookeeper service
69988 NameNode # namenode service
70615 JournalNode # journalnode service
71067 DFSZKFailoverController # ZKFC service
71628 Jps
Use the jps command on node2:
$ jps
43324 QuorumPeerMain # zookeeper service
64563 NameNode # namenode service
70654 JournalNode # journalnode service
71342 DFSZKFailoverController # ZKFC service
65726 DataNode # datanode service
71642 Jps
Use jps to check node3:
$ jps
43214 QuorumPeerMain # zookeeper service
70512 JournalNode # journalnode service
64365 DataNode # datanode service
71428 Jps
Visit the web UI to view the status:
http://192.168.159.45:9870/
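Once both NameNodes are up, the HA state can also be checked and failover exercised from the command line. The hdfs haadmin subcommands below are standard Hadoop; stopping the active NameNode to watch the standby take over is a suggested test, not part of the original article:

```shell
# Show which namenode is active and which is standby
/opt/hadoop/bin/hdfs haadmin -getServiceState nn1
/opt/hadoop/bin/hdfs haadmin -getServiceState nn2

# Simulate a failure: on the active node, stop the namenode daemon,
# then re-check the other one; ZKFC should promote the standby to
# active within a few seconds
/opt/hadoop/bin/hdfs --daemon stop namenode
/opt/hadoop/bin/hdfs haadmin -getServiceState nn2
```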