Preface
Over the past couple of days I organized my old notes and plan to gradually publish them all as blog posts so the documents don't get lost. This first post covers using ZooKeeper to implement high availability for Hadoop.
Host plan
There are three hosts, planned as follows:
Hostname | IP address | Processes |
---|---|---|
hadoop | 192.168.201.243 | NameNode、ResourceManager、JournalNode、QuorumPeerMain、DFSZKFailoverController |
slave1 | 192.168.201.242 | NameNode、DataNode、ResourceManager、NodeManager、DFSZKFailoverController、JournalNode、QuorumPeerMain |
slave2 | 192.168.201.241 | DataNode、NodeManager、JournalNode、QuorumPeerMain |
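The hostnames above must resolve to these addresses on every machine. If no DNS is available, one way is to put identical entries in /etc/hosts on all three hosts (a sketch based on the table above):

```
192.168.201.243 hadoop
192.168.201.242 slave1
192.168.201.241 slave2
```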
Installing and configuring ZooKeeper
ZooKeeper 3.4.5 is used here. First, extract it:
tar -xzvf zookeeper-3.4.5.tar.gz -C ~/software/
To configure ZooKeeper, go into the conf directory and copy zoo_sample.cfg to zoo.cfg:
cd ~/software/zookeeper-3.4.5/conf
cp zoo_sample.cfg zoo.cfg
Set dataDir:
dataDir=/home/hadoop/software/zookeeper-3.4.5/data
Configure the cluster member list by appending it to the end of the file:
server.1=hadoop:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
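Putting the pieces together, a minimal zoo.cfg for this cluster might look like the following (tickTime, initLimit, syncLimit, and clientPort are shown with the defaults from zoo_sample.cfg; clientPort 2181 matches the quorum addresses used in the Hadoop configuration below):

```
tickTime=2000
initLimit=10
syncLimit=5
clientPort=2181
dataDir=/home/hadoop/software/zookeeper-3.4.5/data
server.1=hadoop:2888:3888
server.2=slave1:2888:3888
server.3=slave2:2888:3888
```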
On each ZooKeeper node, create a myid file in the configured dataDir whose content is the N from that node's server.N entry:
echo "1" > myid
Copy ZooKeeper to the other hosts:
scp -r ~/software/zookeeper-3.4.5 slave1:/home/hadoop/software/
scp -r ~/software/zookeeper-3.4.5 slave2:/home/hadoop/software/
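Instead of logging in to each machine to fix up myid, the per-host commands can be generated with a small loop (a dry-run sketch: it only prints the ssh commands, using the dataDir configured above; drop the outer echo to actually run them over passwordless ssh):

```shell
# id:host pairs mirror the server.N entries in zoo.cfg
for pair in 1:hadoop 2:slave1 3:slave2; do
  id=${pair%%:*}    # text before the colon -> myid value
  host=${pair##*:}  # text after the colon  -> hostname
  echo "ssh $host \"echo $id > /home/hadoop/software/zookeeper-3.4.5/data/myid\""
done
```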
Change the value in myid on each host (2 on slave1, 3 on slave2), then start ZooKeeper on every node:
./bin/zkServer.sh start
Finally, check ZooKeeper's status on each node:
./bin/zkServer.sh status
Output:
xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Mode: follower/leader
Exactly one node should report Mode: leader; the others report Mode: follower.
Installing Hadoop
Hadoop uses ZooKeeper to implement high availability. A DFSZKFailoverController process runs on each NameNode host and monitors its NameNode's state. If the active node goes down, the controller on the standby node triggers the fencing mechanism: it sends a remote command over ssh to kill the NameNode process on the active node, and if it receives no response it falls back to running a user-defined script.
The two NameNodes keep identical fsimage content, while the edit log is written to the JournalNodes, which form their own quorum (a majority must acknowledge each write). The active/standby pair is addressed through a single logical nameservice; Hadoop Federation extends this idea by running several independent nameservices side by side.
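The fencing order configured in hdfs-site.xml below (sshfence first, then shell(/bin/true)) can be illustrated with a small sketch; try_sshfence is a hypothetical stand-in for the real ssh kill attempt:

```shell
# Try sshfence first; if the dead host is unreachable, fall back to
# shell(/bin/true) so the standby can still take over.
try_sshfence() { return 1; }  # simulate: old active node is unreachable

if try_sshfence; then
  echo "fenced via ssh"
else
  /bin/true && echo "fenced via /bin/true fallback"
fi
```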
Configuration:
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/home/hadoop/software/hadoop-2.6.0/data/tmp/</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop:2181,slave1:2181,slave2:2181</value>
</property>
</configuration>
Configure hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>hadoop:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>slave1:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>hadoop:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>slave1:50070</value>
</property>
<!-- Where the NameNode's shared edit log is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop:8485;slave1:8485;slave2:8485/ns1</value>
</property>
<!-- Local directory where each JournalNode stores its data -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/hadoop/software/hadoop-2.6.0/journaldata</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods, one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- sshfence needs passwordless login to send the remote command -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<!-- ssh connection timeout (ms) -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/home/hadoop/software/hadoop-2.6.0/data/name/</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/software/hadoop-2.6.0/data/data/</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
Configure mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Configure yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>wls</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>hadoop</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>slave1</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>hadoop:2181,slave1:2181,slave2:2181</value>
</property>
</configuration>
Modify the slaves file:
slave1
slave2
Prerequisites such as passwordless ssh and JDK installation are not covered here.
Startup
First, start the JournalNodes (on every JournalNode host):
hadoop-daemon.sh start journalnode
Format the NameNode on the first node, then start it so the standby can pull its metadata:
hdfs namenode -format
hadoop-daemon.sh start namenode
hdfs namenode -bootstrapStandby # run this on the standby NameNode
# Format the HA state znode in ZooKeeper
hdfs zkfc -formatZK
Finally, start HDFS and YARN:
start-dfs.sh
start-yarn.sh
Note that start-yarn.sh only starts the ResourceManager on the local node; the standby ResourceManager on slave1 must be started separately with yarn-daemon.sh start resourcemanager.
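Once everything is up, the HA state can be verified with a few commands (hdfs haadmin and yarn rmadmin are the stock tools; nn1/nn2 and rm1/rm2 are the IDs configured above):

```shell
# Check which daemons are running on this host
jps

# Query the HA state of each NameNode (expect one active, one standby)
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Query the HA state of each ResourceManager
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
```

Killing the active NameNode process and re-running the haadmin queries is a quick way to confirm that automatic failover works.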