Hadoop HA Native Deployment

I. Pre-deployment Preparation
Three Linux machines running CentOS 6.5 x64 (install the OS yourself; set up one machine first, then clone it twice)

Hadoop 2.7.3 64-bit package

JDK 1.8.0_111

ZooKeeper 3.4.9 package


II. Cluster Planning

Host: bigdata-sg-a-01 (master)
IP: 172.21.14.150
Software: JDK, Hadoop, ZooKeeper
Processes: NameNode, ResourceManager, QuorumPeerMain, DataNode, DFSZKFailoverController, NodeManager, JournalNode

Host: bigdata-sg-a-02
IP: 172.21.14.151
Software: JDK, Hadoop, ZooKeeper
Processes: NameNode (standby), ResourceManager (standby), QuorumPeerMain, DataNode, DFSZKFailoverController, NodeManager, JournalNode

Host: bigdata-sg-a-03
IP: 172.21.14.152
Software: JDK, Hadoop, ZooKeeper
Processes: QuorumPeerMain, DataNode, NodeManager, JournalNode


III. Basic Environment Configuration
2. Change the hostnames according to the cluster plan (do this on all three machines).
Temporarily change the hostname:
[root@bigdata-sg-a-01 ~]# hostname bigdata-sg-a-01
[root@bigdata-sg-a-01 ~]# hostname
bigdata-sg-a-01
Permanently change the hostname:
[root@bigdata-sg-a-01 ~]# vim /etc/sysconfig/network
NETWORKING=yes
HOSTNAME=bigdata-sg-a-01
3. Map hostnames to IPs in /etc/hosts (do this on all three machines):
[root@bigdata-sg-a-01 ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
172.21.14.150 bigdata-sg-a-01
172.21.14.151 bigdata-sg-a-02
172.21.14.152 bigdata-sg-a-03
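To verify that name resolution works, a quick sanity check from any node:
[root@bigdata-sg-a-01 ~]# ping -c 1 bigdata-sg-a-02
[root@bigdata-sg-a-01 ~]# ping -c 1 bigdata-sg-a-03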
4. Stop the firewall and disable it at boot (do this on all three machines):
[root@bigdata-sg-a-03 ~]# service iptables stop
[root@bigdata-sg-a-03 ~]# chkconfig iptables off
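You can confirm the firewall is stopped and will stay off across reboots:
[root@bigdata-sg-a-03 ~]# service iptables status
[root@bigdata-sg-a-03 ~]# chkconfig --list iptables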
5. Passwordless SSH (configure on bigdata-sg-a-01):
[root@bigdata-sg-a-01 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
2e:75:72:ff:54:65:51:e7:e6:cb:42:e9:45:3e:ac:7f root@bigdata-sg-a-01.com
The key's randomart image is:
+--[ RSA 2048]----+
|               .+|
|               .o|
|               .=|
|              =+.|
|         S o o =o|
|       o + .o +.o|
|        . . .+.o |
|           . oo E|
|              ...|
+-----------------+
The command above generates two files, id_rsa (private key) and id_rsa.pub (public key). Copy the public key to every machine you want to reach without a password (including the local node, since the start scripts ssh to all hosts in slaves):
[root@bigdata-sg-a-01 ~]# ssh-copy-id bigdata-sg-a-01
[root@bigdata-sg-a-01 ~]# ssh-copy-id bigdata-sg-a-02
[root@bigdata-sg-a-01 ~]# ssh-copy-id bigdata-sg-a-03
Test passwordless SSH:
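Each command below should print the remote hostname without asking for a password:
[root@bigdata-sg-a-01 ~]# ssh bigdata-sg-a-02 hostname
bigdata-sg-a-02
[root@bigdata-sg-a-01 ~]# ssh bigdata-sg-a-03 hostname
bigdata-sg-a-03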

6. Install the JDK
Use WinSCP to upload the JDK package to the /opt directory on bigdata-sg-a-03:
[root@bigdata-sg-a-03 ~]# cd /opt/
[root@bigdata-sg-a-03 opt]# ls
jdk-8u111-linux-x64.tar.gz
[root@bigdata-sg-a-03 opt]# tar -zxf jdk-8u111-linux-x64.tar.gz
[root@bigdata-sg-a-03 opt]# mv jdk1.8.0_111 jdk
Configure the JDK environment variables:
[root@bigdata-sg-a-03 opt]# vim /etc/profile
#java env
export JAVA_HOME=/opt/jdk
export JRE_HOME=/opt/jdk/jre
export CLASSPATH=$JAVA_HOME/lib
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
Reload the environment and check that Java is configured correctly:
[root@bigdata-sg-a-03 ~]# source /etc/profile
[root@bigdata-sg-a-03 ~]# java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
With the JDK configured, copy the JDK directory and /etc/profile to the other two machines:
[root@bigdata-sg-a-03 opt]# scp -r jdk/ bigdata-sg-a-01:/opt/
[root@bigdata-sg-a-03 opt]# scp -r jdk/ bigdata-sg-a-02:/opt/
[root@bigdata-sg-a-03 opt]# scp /etc/profile bigdata-sg-a-01:/etc/
[root@bigdata-sg-a-03 opt]# scp /etc/profile bigdata-sg-a-02:/etc/
On bigdata-sg-a-01 and bigdata-sg-a-02, reload the environment and verify that Java is configured:
[root@bigdata-sg-a-02 ~]# source /etc/profile
[root@bigdata-sg-a-02 ~]# java -version
java version "1.8.0_111"
Java(TM) SE Runtime Environment (build 1.8.0_111-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.111-b14, mixed mode)
IV. ZooKeeper Cluster Installation
With the base environment ready, install the ZooKeeper cluster.
1. Use WinSCP to upload zookeeper-3.4.9.tar.gz to the /opt directory (on bigdata-sg-a-03, as above).
2. Unpack it:
[root@bigdata-sg-a-03 opt]# tar -zxf zookeeper-3.4.9.tar.gz
3. Configure ZooKeeper:
[root@bigdata-sg-a-03 opt]# cd zookeeper-3.4.9
[root@bigdata-sg-a-03 zookeeper-3.4.9]# cd conf/
[root@bigdata-sg-a-03 conf]# ls
configuration.xsl log4j.properties zoo_sample.cfg
[root@bigdata-sg-a-03 conf]# cp zoo_sample.cfg zoo.cfg
Edit the configuration file:
[root@bigdata-sg-a-03 conf]# vim zoo.cfg
Change the data directory:
dataDir=/opt/zookeeper-3.4.9/data/
and append the server list at the end of the file (2888 is the quorum port, 3888 the leader-election port):
server.1=bigdata-sg-a-01:2888:3888
server.2=bigdata-sg-a-02:2888:3888
server.3=bigdata-sg-a-03:2888:3888
Create the data directory and a myid file inside it. The myid file is mandatory; without it ZooKeeper fails to start. Its value must match the number after server. in zoo.cfg (1, 2, or 3).
[root@bigdata-sg-a-03 conf]# mkdir /opt/zookeeper-3.4.9/data
[root@bigdata-sg-a-03 conf]# vim /opt/zookeeper-3.4.9/data/myid
3
Copy the configured ZooKeeper to the other two machines:
[root@bigdata-sg-a-03 opt]# scp -r /opt/zookeeper-3.4.9 bigdata-sg-a-01:/opt/
[root@bigdata-sg-a-03 opt]# scp -r /opt/zookeeper-3.4.9 bigdata-sg-a-02:/opt/
On bigdata-sg-a-01 and bigdata-sg-a-02, edit /opt/zookeeper-3.4.9/data/myid to the matching id: 1 on bigdata-sg-a-01 and 2 on bigdata-sg-a-02 (bigdata-sg-a-03 already holds 3).
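Equivalently, a one-liner on each of those nodes:
[root@bigdata-sg-a-01 ~]# echo 1 > /opt/zookeeper-3.4.9/data/myid
[root@bigdata-sg-a-02 ~]# echo 2 > /opt/zookeeper-3.4.9/data/myid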
4. Add environment variables (also configure this on bigdata-sg-a-01 and bigdata-sg-a-02):
[root@bigdata-sg-a-03 zookeeper-3.4.9]# vim /etc/profile
#zookeeper env
export ZOOKEEPER_HOME=/opt/zookeeper-3.4.9
#hadoop env
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$ZOOKEEPER_HOME/bin
Save, exit, and reload; typing zk and pressing Tab should now complete to the ZooKeeper scripts:
[root@bigdata-sg-a-03 zookeeper-3.4.9]# source /etc/profile
[root@bigdata-sg-a-03 zookeeper-3.4.9]# zk<Tab>
zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh
5. Start ZooKeeper (run on all three machines):
[root@bigdata-sg-a-03 zookeeper-3.4.9]# zkServer.sh start
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Starting zookeeper ... STARTED
Use jps to check the Java processes; if QuorumPeerMain is running, ZooKeeper started successfully:
[root@bigdata-sg-a-03 zookeeper-3.4.9]# jps
2713 Jps
2671 QuorumPeerMain
Check whether the ensemble formed correctly: of the three machines, two should report Mode: follower and one Mode: leader.
[root@bigdata-sg-a-01 zookeeper-3.4.9]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
[root@bigdata-sg-a-03 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: leader
[root@bigdata-sg-a-02 ~]# zkServer.sh status
ZooKeeper JMX enabled by default
Using config: /opt/zookeeper-3.4.9/bin/../conf/zoo.cfg
Mode: follower
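As an extra sanity check, you can connect with the ZooKeeper CLI from any node; a fresh ensemble should show only the built-in /zookeeper znode:
[root@bigdata-sg-a-03 ~]# zkCli.sh -server bigdata-sg-a-01:2181
[zk: bigdata-sg-a-01:2181(CONNECTED) 0] ls /
[zookeeper]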
V. Install Hadoop
Use WinSCP to upload hadoop-2.7.3.tar.gz to the /opt directory on bigdata-sg-a-01.
1. Unpack it:
[root@bigdata-sg-a-01 opt]# tar -zxf hadoop-2.7.3.tar.gz
2. Enter the unpacked directory and edit the configuration files:
[root@bigdata-sg-a-01 opt]# cd /opt/hadoop-2.7.3/etc/hadoop/
First edit hadoop-env.sh and set JAVA_HOME:
[root@bigdata-sg-a-01 hadoop]# vim hadoop-env.sh
export JAVA_HOME=/opt/jdk
Edit core-site.xml and add:
<configuration>
<!-- Set the HDFS nameservice to ns1 -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://ns1</value>
</property>
<!-- Hadoop temporary directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop-2.7.3/tmp</value>
</property>
<!-- ZooKeeper quorum addresses -->
<property>
<name>ha.zookeeper.quorum</name>
<value>bigdata-sg-a-01:2181,bigdata-sg-a-02:2181,bigdata-sg-a-03:2181</value>
</property>
</configuration>
Edit hdfs-site.xml and add:
<configuration>
<!-- Set the nameservice to ns1; must match core-site.xml -->
<property>
<name>dfs.nameservices</name>
<value>ns1</value>
</property>
<!-- ns1 has two NameNodes: nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.ns1</name>
<value>nn1,nn2</value>
</property>
<!-- RPC address of nn1 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn1</name>
<value>bigdata-sg-a-01:9000</value>
</property>
<!-- HTTP address of nn1 -->
<property>
<name>dfs.namenode.http-address.ns1.nn1</name>
<value>bigdata-sg-a-01:50070</value>
</property>
<!-- RPC address of nn2 -->
<property>
<name>dfs.namenode.rpc-address.ns1.nn2</name>
<value>bigdata-sg-a-02:9000</value>
</property>
<!-- HTTP address of nn2 -->
<property>
<name>dfs.namenode.http-address.ns1.nn2</name>
<value>bigdata-sg-a-02:50070</value>
</property>
<!-- Where the NameNode edit log is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://bigdata-sg-a-01:8485;bigdata-sg-a-02:8485;bigdata-sg-a-03:8485/ns1</value>
</property>
<!-- Where each JournalNode keeps its data on local disk -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/opt/hadoop-2.7.3/journal</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Proxy provider that clients use to find the active NameNode -->
<property>
<name>dfs.client.failover.proxy.provider.ns1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; list one method per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>
sshfence
shell(/bin/true)
</value>
</property>
<!-- sshfence requires passwordless SSH; this deployment runs as root, so point at root's key -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/root/.ssh/id_rsa</value>
</property>
<!-- sshfence connection timeout -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
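Note that once HDFS is up (section VI), clients address the cluster by the logical nameservice rather than by a specific NameNode; the failover proxy provider configured above resolves whichever node is currently active. For example:
[root@bigdata-sg-a-01 ~]# hdfs dfs -ls hdfs://ns1/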
Copy the mapred-site.xml template file:
[root@bigdata-sg-a-01 hadoop]# cp mapred-site.xml.template mapred-site.xml
Edit mapred-site.xml and add:
<configuration>
<!-- Run MapReduce on YARN -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Edit yarn-site.xml and add:
<configuration>
<!-- Enable ResourceManager HA -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<!-- Cluster id for the RM pair -->
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yrc</value>
</property>
<!-- Logical ids of the RMs -->
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<!-- Hostname of each RM -->
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>bigdata-sg-a-01</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>bigdata-sg-a-02</value>
</property>
<!-- ZooKeeper ensemble address -->
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>bigdata-sg-a-01:2181,bigdata-sg-a-02:2181,bigdata-sg-a-03:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
Edit the slaves file and add the three hostnames:
[root@bigdata-sg-a-01 hadoop]# vim slaves
bigdata-sg-a-01
bigdata-sg-a-02
bigdata-sg-a-03
Copy the configured Hadoop to the other two nodes:
[root@bigdata-sg-a-01 opt]# scp -r /opt/hadoop-2.7.3 bigdata-sg-a-02:/opt/
[root@bigdata-sg-a-01 opt]# scp -r /opt/hadoop-2.7.3 bigdata-sg-a-03:/opt/
Add Hadoop to the environment variables (on all three machines):
export HADOOP_HOME=/opt/hadoop-2.7.3
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin:$ZOOKEEPER_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
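Reload and confirm the hadoop command is on the PATH:
[root@bigdata-sg-a-01 ~]# source /etc/profile
[root@bigdata-sg-a-01 ~]# hadoop version
Hadoop 2.7.3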

VI. Start the Hadoop Cluster
1. Start the JournalNodes first (run from bigdata-sg-a-02; hadoop-daemons.sh starts the daemon on every host listed in the slaves file):
[root@bigdata-sg-a-02 opt]# hadoop-daemons.sh start journalnode
bigdata-sg-a-01: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-bigdata-sg-a-01.out
bigdata-sg-a-02: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-bigdata-sg-a-02.out
bigdata-sg-a-03: starting journalnode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-journalnode-bigdata-sg-a-03.out
Use jps to confirm the JournalNode came up on all three machines, for example:
[root@bigdata-sg-a-01 opt]# jps
3162 JournalNode
3211 Jps
2671 QuorumPeerMain
2. Format HDFS (run on the master node, bigdata-sg-a-01):
[root@bigdata-sg-a-01 opt]# hdfs namenode -format
17/12/06 14:18:12 INFO common.Storage: Storage directory /opt/hadoop-2.7.3/tmp/dfs/name has been successfully formatted.
Formatting writes the NameNode metadata under the directory set by hadoop.tmp.dir in core-site.xml, here /opt/hadoop-2.7.3/tmp/. Copy that directory to the same path on bigdata-sg-a-02 so the standby NameNode starts from identical metadata:
[root@bigdata-sg-a-01 hadoop-2.7.3]# scp -r /opt/hadoop-2.7.3/tmp/ bigdata-sg-a-02:/opt/hadoop-2.7.3/
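Alternatively, instead of copying tmp/ by hand, the standby's metadata can be initialized on bigdata-sg-a-02 with the built-in bootstrap command (this assumes the JournalNodes are running and nn1 has been formatted):
[root@bigdata-sg-a-02 ~]# hdfs namenode -bootstrapStandby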
3. Format the ZKFC state in ZooKeeper (run on bigdata-sg-a-01):
[root@bigdata-sg-a-01 hadoop-2.7.3]# hdfs zkfc -formatZK
17/12/06 15:05:32 INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/ns1 in ZK.
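You can confirm the znode exists via the ZooKeeper CLI:
[root@bigdata-sg-a-01 ~]# zkCli.sh -server bigdata-sg-a-01:2181
[zk: bigdata-sg-a-01:2181(CONNECTED) 0] ls /hadoop-ha
[ns1]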
4. Start HDFS:
[root@bigdata-sg-a-01 hadoop-2.7.3]# start-dfs.sh
Starting namenodes on [bigdata-sg-a-01 bigdata-sg-a-02]
bigdata-sg-a-02: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-bigdata-sg-a-02.out
bigdata-sg-a-01: starting namenode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-namenode-bigdata-sg-a-01.out
bigdata-sg-a-03: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata-sg-a-03.out
bigdata-sg-a-01: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata-sg-a-01.out
bigdata-sg-a-02: starting datanode, logging to /opt/hadoop-2.7.3/logs/hadoop-root-datanode-bigdata-sg-a-02.out
Starting journal nodes [bigdata-sg-a-01 bigdata-sg-a-02 bigdata-sg-a-03]
Because the JournalNodes were started earlier, they are reported as already running:
bigdata-sg-a-02: journalnode running as process 3259. Stop it first.
bigdata-sg-a-01: journalnode running as process 3162. Stop it first.
bigdata-sg-a-03: journalnode running as process 3273. Stop it first.
Starting ZK Failover Controllers on NN hosts [bigdata-sg-a-01 bigdata-sg-a-02]
bigdata-sg-a-01: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata-sg-a-01.out
bigdata-sg-a-02: starting zkfc, logging to /opt/hadoop-2.7.3/logs/hadoop-root-zkfc-bigdata-sg-a-02.out
Check the processes with jps.
bigdata-sg-a-01:
[root@bigdata-sg-a-01 ~]# jps
3954 Jps
3527 DataNode
3162 JournalNode
3434 NameNode
3838 DFSZKFailoverController
2671 QuorumPeerMain
bigdata-sg-a-02
[root@bigdata-sg-a-02 ~]# jps
3466 DataNode
2939 QuorumPeerMain
3259 JournalNode
3612 DFSZKFailoverController
3727 Jps
3407 NameNode
bigdata-sg-a-03:
[root@bigdata-sg-a-03 ~]# jps
3520 Jps
3381 DataNode
2937 QuorumPeerMain
3273 JournalNode
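With HDFS running, a quick smoke test is to write a file into the cluster and list it back (any local file works; /etc/profile is used here as an arbitrary example):
[root@bigdata-sg-a-01 ~]# hdfs dfs -put /etc/profile /
[root@bigdata-sg-a-01 ~]# hdfs dfs -ls /
The listing should show /profile.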
5. Start YARN:
[root@bigdata-sg-a-01 ~]# start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-bigdata-sg-a-01.out
bigdata-sg-a-02: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata-sg-a-02.out
bigdata-sg-a-03: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata-sg-a-03.out
bigdata-sg-a-01: starting nodemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-nodemanager-bigdata-sg-a-01.out
Start the standby ResourceManager. On bigdata-sg-a-02, run:
[root@bigdata-sg-a-02 ~]# yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/hadoop-2.7.3/logs/yarn-root-resourcemanager-bigdata-sg-a-02.out
6. Check jps on each node.
bigdata-sg-a-01:
[root@bigdata-sg-a-01 ~]# jps
4100 NodeManager
4005 ResourceManager
3527 DataNode
3162 JournalNode
3434 NameNode
4396 Jps
3838 DFSZKFailoverController
2671 QuorumPeerMain
bigdata-sg-a-02:
[root@bigdata-sg-a-02 ~]# jps
3892 ResourceManager
3770 NodeManager
3466 DataNode
2939 QuorumPeerMain
3259 JournalNode
3612 DFSZKFailoverController
3967 Jps
3407 NameNode
bigdata-sg-a-03:
[root@bigdata-sg-a-03 ~]# jps
3554 NodeManager
3381 DataNode
3654 Jps
2937 QuorumPeerMain
3273 JournalNode
7. Check HDFS in the web UI.
Browse to 172.21.14.150:50070 and 172.21.14.151:50070 (the HTTP addresses configured in hdfs-site.xml): one NameNode shows active and the other standby, so HDFS HA is working.
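Optionally, test the failover itself by killing the active NameNode and checking that the standby takes over (the pid below is the NameNode pid from the jps output above; this assumes nn1 is currently active):
[root@bigdata-sg-a-01 ~]# kill -9 3434
[root@bigdata-sg-a-01 ~]# hdfs haadmin -getServiceState nn2
active
Afterwards, restart the killed NameNode; it should rejoin as standby:
[root@bigdata-sg-a-01 ~]# hadoop-daemon.sh start namenode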

8. Check YARN in the web UI.

Browse to 172.21.14.150:8088/cluster/cluster; the ResourceManager state is shown as active.

Browse to 172.21.14.151:8088/cluster/cluster; the ResourceManager state is shown as standby.

ResourceManager HA works!
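The RM states can also be queried from the command line, and a sample MapReduce job (the examples jar ships with the 2.7.3 distribution) exercises the whole stack end to end:
[root@bigdata-sg-a-01 ~]# yarn rmadmin -getServiceState rm1
active
[root@bigdata-sg-a-01 ~]# yarn rmadmin -getServiceState rm2
standby
[root@bigdata-sg-a-01 ~]# hadoop jar /opt/hadoop-2.7.3/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar pi 2 10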



Reposted from blog.csdn.net/qq_33283716/article/details/80867067