Setting Up an HDFS Fully Distributed Cluster and a High-Availability Cluster
1. Setting Up a Fully Distributed Cluster (1.x-style architecture)
| | hdp-01 | hdp-02 | hdp-03 | hdp-04 | hdp-05 |
|---|---|---|---|---|---|
| namenode | √ | | | | |
| secondarynamenode | | √ | | | |
| datanode | √ | √ | √ | √ | √ |
Step 1: Set up passwordless SSH between all hosts
- Generate a key pair on each host
  - ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
- Copy your public key to every host (including your own, so each host trusts itself)
  - ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
  - ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
  - ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
  - ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
  - ssh-copy-id -i ~/.ssh/id_rsa.pub [email protected]
- Add each host to known_hosts
  - ssh root@hdp-01, then answer yes
  - ssh root@hdp-02, and so on for
  - ssh root@hdp-03
  - ssh root@hdp-04
  - ssh root@hdp-05
  - ssh root@localhost
  - ssh [email protected]
Step 2: Configure Hadoop
- Extract hadoop-2.10.0.tar.gz into /opt/hadoop (/opt is the conventional directory for user-installed software)
  - mkdir -p /opt/hadoop && tar -zxvf hadoop-2.10.0.tar.gz -C /opt/hadoop
- Set JAVA_HOME
  - Edit JAVA_HOME in hadoop-env.sh, mapred-env.sh, and yarn-env.sh
- Edit core-site.xml
  - vim core-site.xml

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hdp-01:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/hadoop/full</value>
    </property>
- Edit hdfs-site.xml
  - vim hdfs-site.xml

    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hdp-02:50090</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.https-address</name>
        <value>hdp-02:50091</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
- Edit slaves (one worker hostname per line)
  - vim slaves

    hdp-01
    hdp-02
    hdp-03
    hdp-04
    hdp-05
- Set environment variables
  - vim /etc/profile

    export JAVA_HOME=/opt/jdk/jdk1.8.0_51
    export HADOOP_HOME=/opt/hadoop/hadoop-2.10.0
    export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin

  - source /etc/profile
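After sourcing /etc/profile, it is worth confirming that the new bin directories actually landed in PATH. A small sketch (the `in_path` helper is hypothetical, and the demo uses a hard-coded sample PATH string so the snippet runs anywhere):

```shell
#!/bin/sh
# Hypothetical helper: check whether a directory appears in a PATH-style
# colon-separated string. On a real node you would call:
#   in_path "$HADOOP_HOME/bin" "$PATH"
in_path() {
    case ":$2:" in
        *:"$1":*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Demo against a sample PATH string
in_path /opt/hadoop/hadoop-2.10.0/bin "/usr/bin:/opt/hadoop/hadoop-2.10.0/bin" && echo "found"
```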
- Copy the environment file to the other nodes
  - [root@hdp-01 ~]# scp /etc/profile root@hdp-02:/etc/profile
  - [root@hdp-02 ~]# source /etc/profile
  - … (repeat for hdp-03 through hdp-05)
- Copy Hadoop to each of the other nodes in turn
  - [root@hdp-02 ~]# scp -r root@hdp-01:/opt/hadoop/hadoop-2.10.0 /opt/hadoop
  - …
- Format the NameNode, then start the cluster
  - [root@hdp-01 ~]# hdfs namenode -format
  - [root@hdp-01 ~]# start-dfs.sh
- Browse the HDFS web UI
  - http://192.168.183.21:50070/
- Shut down the cluster
  - stop-dfs.sh
2. Setting Up a High-Availability Cluster (Hadoop HA, 2.x-style architecture)
| | hdp-01 | hdp-02 | hdp-03 | hdp-04 | hdp-05 |
|---|---|---|---|---|---|
| active namenode | √ | | | | |
| standby namenode | | √ | | | |
| datanode | √ | √ | √ | √ | √ |
| zookeeper | | | √ | √ | √ |
| journalnode | | | √ | √ | √ |
Setting up ZooKeeper
- Upload ZooKeeper, extract it, and copy it
  - [root@hdp-03 ~]# mkdir -p /opt/zookeeper && tar -zxvf zookeeper-3.4.6.tar.gz -C /opt/zookeeper
- Edit the configuration file
  - [root@hdp-03 ~]# cd /opt/zookeeper/zookeeper-3.4.6/conf/
  - [root@hdp-03 conf]# cp zoo_sample.cfg zoo.cfg
  - vim zoo.cfg

    # Directory where ZooKeeper stores its data
    dataDir=/var/zookeeper
    # Ensemble members: server.<id>=<host>:<peer-port>:<election-port>
    server.1=hdp-03:2888:3888
    server.2=hdp-04:2888:3888
    server.3=hdp-05:2888:3888
- Create myid (the directory must match dataDir, and each id must match that host's server.N line)
  - [hdp-03, hdp-04, hdp-05] mkdir -p /var/zookeeper
  - [root@hdp-03 ~]# echo 1 > /var/zookeeper/myid
  - [root@hdp-04 ~]# echo 2 > /var/zookeeper/myid
  - [root@hdp-05 ~]# echo 3 > /var/zookeeper/myid
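Because each myid must agree with the server.N entries in zoo.cfg, deriving the id from the hostname lets the same command run unchanged on every ZooKeeper node. A sketch with a hypothetical `myid_for` helper:

```shell
#!/bin/sh
# Map hostname -> ZooKeeper server id, matching zoo.cfg:
#   server.1=hdp-03, server.2=hdp-04, server.3=hdp-05
myid_for() {
    case "$1" in
        hdp-03) echo 1 ;;
        hdp-04) echo 2 ;;
        hdp-05) echo 3 ;;
        *) echo "not a zookeeper node: $1" >&2; return 1 ;;
    esac
}

# On a real node: mkdir -p /var/zookeeper && myid_for "$(hostname)" > /var/zookeeper/myid
myid_for hdp-04
```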
- Copy ZooKeeper to the other nodes
  - [hdp-04, hdp-05] scp -r root@hdp-03:/opt/zookeeper/zookeeper-3.4.6 /opt/zookeeper/
- Set environment variables
  - Add ZOOKEEPER_HOME to /etc/profile (and append its bin directory to PATH)
  - Copy the file to hdp-04 and hdp-05, then source it
- Start the ensemble
  - [hdp-03, hdp-04, hdp-05] zkServer.sh start
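A ZooKeeper ensemble stays available only while a strict majority of its servers are up: floor(n/2) + 1 of n. The arithmetic, as a quick sketch (the `quorum` helper is illustrative):

```shell
#!/bin/sh
# Minimum number of live servers an n-node ZooKeeper ensemble needs
# to keep serving: floor(n/2) + 1.
quorum() {
    echo $(( $1 / 2 + 1 ))
}
echo "a 3-node ensemble needs $(quorum 3) live servers"
echo "a 5-node ensemble needs $(quorum 5) live servers"
```

With the 3-node ensemble used here, one server can fail without losing the service; a 5-node ensemble would tolerate two failures.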
Setting up Hadoop HA
- On top of the fully distributed cluster built earlier (the 1.x-style architecture), make the following configuration changes
- Edit core-site.xml
  - vim core-site.xml

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://myCluster</value>
    </property>
    <property>
        <name>ha.zookeeper.quorum</name>
        <value>hdp-03:2181,hdp-04:2181,hdp-05:2181</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/var/sxt/hadoop/ha</value>
    </property>
- Edit hdfs-site.xml (note that multiple fencing methods go in a single value element, one per line)
  - vim hdfs-site.xml

    <property>
        <name>dfs.nameservices</name>
        <value>myCluster</value>
    </property>
    <property>
        <name>dfs.ha.namenodes.myCluster</name>
        <value>nn1,nn2</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.myCluster.nn1</name>
        <value>hdp-01:8020</value>
    </property>
    <property>
        <name>dfs.namenode.rpc-address.myCluster.nn2</name>
        <value>hdp-02:8020</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.myCluster.nn1</name>
        <value>hdp-01:50070</value>
    </property>
    <property>
        <name>dfs.namenode.http-address.myCluster.nn2</name>
        <value>hdp-02:50070</value>
    </property>
    <property>
        <name>dfs.namenode.shared.edits.dir</name>
        <value>qjournal://hdp-03:8485;hdp-04:8485;hdp-05:8485/myCluster</value>
    </property>
    <property>
        <name>dfs.journalnode.edits.dir</name>
        <value>/var/sxt/hadoop/ha/jn</value>
    </property>
    <property>
        <name>dfs.client.failover.proxy.provider.myCluster</name>
        <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
    </property>
    <property>
        <name>dfs.ha.fencing.methods</name>
        <value>sshfence
shell(true)</value>
    </property>
    <property>
        <name>dfs.ha.fencing.ssh.private-key-files</name>
        <value>/root/.ssh/id_rsa</value>
    </property>
    <property>
        <name>dfs.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
- Edit slaves (one worker hostname per line)
  - vim slaves

    hdp-01
    hdp-02
    hdp-03
    hdp-04
    hdp-05
- Copy the etc directory under the Hadoop folder to the other nodes
- Update the environment variables in /etc/profile
  - Check that ZOOKEEPER_HOME is present
- Start the JournalNode daemons individually
  - [hdp-03, hdp-04, hdp-05] hadoop-daemon.sh start journalnode
  - jps (verify the JournalNode process is running)
  - cd /var/sxt/hadoop/ha/jn/
- Format the primary NameNode and start it
  - [root@hdp-01 ~]# hdfs namenode -format
  - [root@hdp-01 ~]# hadoop-daemon.sh start namenode
- Bootstrap the standby NameNode
  - [root@hdp-02 ~]# hdfs namenode -bootstrapStandby
- Start ZooKeeper on hdp-03 through hdp-05
  - zkServer.sh start
  - zkServer.sh status
- Format the ZKFC state in ZooKeeper
  - [root@hdp-01 ~]# hdfs zkfc -formatZK (run once, from one of the NameNode hosts)
- Restarting the cluster later
  - Start the ZooKeeper ensemble first, then run start-dfs.sh
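The restart order above can be sketched as a dry run. ZooKeeper comes first because ZKFC leader election depends on it; only then is HDFS started. The commands are printed, not executed, and the ssh invocations assume the passwordless setup from Step 1:

```shell
#!/bin/sh
# Dry-run sketch of the cluster restart order: ZooKeeper on the three
# ensemble nodes, then start-dfs.sh (which brings up NameNodes, DataNodes,
# JournalNodes, and ZKFCs in the HA configuration).
restart_cmds() {
    for host in hdp-03 hdp-04 hdp-05; do
        echo "ssh root@$host zkServer.sh start"
    done
    echo "start-dfs.sh"
}
restart_cmds
```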