I. VMware CentOS 6.5 Virtual Machine Preparation
1. Using the same user, learn, install three CentOS 6.5 virtual machines: vm1, vm2, and vm3.
vm1 and vm2 act as NameNode + DataNode
vm3 acts as DataNode only
2. Configure networking for vm1, vm2, and vm3 via NAT or bridged mode. This article uses NAT; the IP addresses are:
vm1: 192.168.60.128 vm1.learn.com
vm2: 192.168.60.130 vm2.learn.com
vm3: 192.168.60.131 vm3.learn.com
Make sure the machines can ping each other, and add every node's hostname to /etc/hosts on each machine.
Comment out the following two lines, or at least do not map vm1.learn.com to 127.0.0.1/::1:
127.0.0.1 vm1.learn.com localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 vm1.learn.com localhost localhost.localdomain localhost6 localhost6.localdomain6
vm1.learn.com must not resolve to 127.0.0.1 or ::1; otherwise the Hadoop cluster fails to start with "failed on connection exception: java.net.ConnectException: Connection refused".
Reference: https://wiki.apache.org/hadoop/ConnectionRefused
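The loopback pitfall above can be checked with a small script. This is a sketch; check_host_mapping is a hypothetical helper, assuming the hosts-file layout shown above.

```shell
# Hypothetical helper: prints BAD if HOSTNAME is mapped to a loopback
# address in FILE (the misconfiguration described above), OK otherwise.
check_host_mapping() {
  file=$1; host=$2
  if grep -E '^(127\.0\.0\.1|::1)[[:space:]]' "$file" | grep -qw "$host"; then
    echo "BAD"
  else
    echo "OK"
  fi
}

# Example: check_host_mapping /etc/hosts vm1.learn.com
```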
3. Set up passwordless SSH login for each machine; see: https://blog.csdn.net/zhujq_icode/article/details/82629745
II. ZooKeeper Cluster Setup
See: https://blog.csdn.net/zhujq_icode/article/details/82687037
III. Hadoop 3 Cluster Setup
1. Download the Hadoop 3 tarball; this article uses hadoop-3.0.3.tar.gz.
2. On vm1, extract it into the installation directory /home/learn/app/hadoop/:
tar -zxvf hadoop-3.0.3.tar.gz -C /home/learn/app/hadoop/
3. Configure environment variables by appending the following to /etc/profile:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_181
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/home/learn/app/hadoop/hadoop-3.0.3
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${PATH}
If Java is not installed yet, install JDK 8 first. Afterwards run source /etc/profile so the variables take effect.
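After sourcing /etc/profile, a quick sanity check can catch path typos. A minimal sketch; check_env is a hypothetical helper that only verifies the two directories exist.

```shell
# Hypothetical helper: verify JAVA_HOME and HADOOP_HOME point at real
# directories; prints OK or MISSING.
check_env() {
  if [ -d "$JAVA_HOME" ] && [ -d "$HADOOP_HOME" ]; then
    echo "OK"
  else
    echo "MISSING"
  fi
}
```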
4. Configure Hadoop (the files below are under $HADOOP_HOME/etc/hadoop/)
(1) Set JAVA_HOME in hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_181
(2)core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/learn/data/hadoopcluster/tmp</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>vm1.learn.com:2181,vm2.learn.com:2181,vm3.learn.com:2181</value>
</property>
</configuration>
(3)hdfs-site.xml
<configuration>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
</property>
<!-- mycluster has two NameNodes: nn1 and nn2 -->
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>vm1.learn.com:9820</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>vm2.learn.com:9820</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>vm1.learn.com:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>vm2.learn.com:9870</value>
</property>
<!-- Enable automatic NameNode failover -->
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<!-- Where the NameNodes' shared edit log is stored on the JournalNodes -->
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://vm1.learn.com:8485;vm2.learn.com:8485;vm3.learn.com:8485/mycluster</value>
</property>
<!-- Local disk directory where each JournalNode stores its data -->
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/home/learn/data/hadoopcluster/data/journaldata/jn</value>
</property>
<!-- Proxy provider that clients use to locate the active NameNode -->
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- Fencing methods; to list several, put one per line -->
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<!-- The sshfence method requires passwordless SSH login -->
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/learn/.ssh/id_rsa</value>
</property>
<!-- Timeout (in milliseconds) for the sshfence method -->
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
</configuration>
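The configuration above references local directories (hadoop.tmp.dir and dfs.journalnode.edits.dir) that should exist on every node before the first start. A sketch, with make_hdfs_dirs as a hypothetical helper mirroring the paths used above.

```shell
# Hypothetical helper: create the local directories used by the configs
# above under BASE (hadoop.tmp.dir and the JournalNode edits dir).
make_hdfs_dirs() {
  base=$1
  mkdir -p "$base/tmp" "$base/data/journaldata/jn"
}

# On each of vm1, vm2, vm3:
# make_hdfs_dirs /home/learn/data/hadoopcluster
```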
(4)mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
This makes MapReduce jobs run on the YARN framework.
(5) yarn-site.xml
<configuration>
<!-- Enable ResourceManager high availability -->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-rm-cluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>vm1.learn.com</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>vm2.learn.com</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>vm1.learn.com:2181,vm2.learn.com:2181,vm3.learn.com:2181</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
</configuration>
(6) Worker node configuration
vm1.learn.com
vm2.learn.com
vm3.learn.com
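These three lines go into $HADOOP_HOME/etc/hadoop/workers (Hadoop 3 renamed the slaves file to workers). The file can be generated with a one-line helper; write_workers is hypothetical.

```shell
# Hypothetical helper: write the workers file listing all DataNode hosts.
write_workers() {
  printf '%s\n' vm1.learn.com vm2.learn.com vm3.learn.com > "$1"
}

# write_workers "$HADOOP_HOME/etc/hadoop/workers"
```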
5. Copy the installation and configuration files to the other two machines, vm2 and vm3.
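The copy step above can be sketched as a loop over the remote hosts, assuming the same directory layout on every node and the passwordless SSH set up in section I; distribute and its CMD dry-run switch are hypothetical.

```shell
# Hypothetical helper: copy a directory to the same parent path on each
# remote host as user learn. Set CMD=echo for a dry run.
CMD=${CMD:-scp}
distribute() {
  src=$1; shift
  for host in "$@"; do
    $CMD -r "$src" "learn@$host:$(dirname "$src")/"
  done
}

# distribute /home/learn/app/hadoop/hadoop-3.0.3 vm2.learn.com vm3.learn.com
# (remember to also append the /etc/profile exports on vm2 and vm3)
```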
IV. Cluster Initialization
1. On vm2, initialize HDFS (format the file system):
hdfs namenode -format
(The JournalNodes and ZooKeeper must already be running; otherwise formatting the shared edits directory fails.)
Copy the initialized metadata to vm1, the other NameNode (equivalently, run hdfs namenode -bootstrapStandby on vm1):
[learn@vm2 ~]$ scp -r /home/learn/data/hadoopcluster/tmp/ [email protected]:/home/learn/data/hadoopcluster/
fsimage_0000000000000000000.md5 100% 62 0.1KB/s 00:00
seen_txid 100% 2 0.0KB/s 00:00
fsimage_0000000000000000000 100% 350 0.3KB/s 00:00
VERSION 100% 200 0.2KB/s 00:00
2. On vm2, format the HA state in ZooKeeper:
hdfs zkfc -formatZK
V. Starting the Hadoop Cluster
1. Start DFS from vm2:
[learn@vm2 sbin]$ ./start-dfs.sh
Starting namenodes on [vm1.learn.com vm2.learn.com]
Starting datanodes
Starting journal nodes [vm1.learn.com vm2.learn.com vm3.learn.com]
2018-09-13 15:15:54,975 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Starting ZK Failover Controllers on NN hosts [vm1.learn.com vm2.learn.com]
2. Use jps to check that the processes on vm1, vm2, and vm3 all started properly:
//vm1 NameNode DataNode
[learn@vm1 sbin]$ jps
9969 Jps
9618 DataNode
9867 DFSZKFailoverController
9726 JournalNode
3118 QuorumPeerMain
9534 NameNode
//vm2 NameNode DataNode
[learn@vm2 sbin]$ jps
10913 NameNode
2562 QuorumPeerMain
11268 JournalNode
11597 Jps
11037 DataNode
11486 DFSZKFailoverController
//vm3 DataNode
[learn@vm3 sbin]$ jps
6067 JournalNode
2581 QuorumPeerMain
6166 Jps
5963 DataNode
3. Web UI: http://vm1.learn.com:9870/ and http://vm2.learn.com:9870/. One NameNode should report active and the other standby; this can also be checked with hdfs haadmin -getServiceState nn1.
4. Start YARN from vm1:
[learn@vm1 sbin]$ ./start-yarn.sh
jps then shows new processes on vm1 and vm2 (vm3 gains a NodeManager as well):
11845 NodeManager
11767 ResourceManager
The ResourceManager web UI is at http://vm1.learn.com:8088/ by default.