Hadoop Cluster Setup (Fully Distributed Mode, Detailed Version)

Previous post: Passwordless SSH Login Between Three Machines (Detailed Version)


1. Hadoop cluster configuration. The following files need to be configured:

hadoop-env.sh
core-site.xml
hdfs-site.xml
mapred-site.xml.template (first copy mapred-site.xml.template and rename the copy to mapred-site.xml)
yarn-site.xml

The steps are as follows:

(1) Go to the hadoop-2.7.2/etc/hadoop directory and edit the hadoop-env.sh file

hadoop-env.sh configuration:

[root@hadoop105 hadoop]# vim hadoop-env.sh 

# The java implementation to use.
export JAVA_HOME=/usr/local/java/module/jdk1.8
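
A wrong JAVA_HOME is a common reason the daemons fail to start, so it can help to check the path before going further. The following is a minimal sketch; the function name check_java_home is made up for this illustration, and the path is the one used in this article.

```shell
# Return 0 if the given directory looks like a Java home,
# i.e. it contains an executable bin/java.
# check_java_home is a hypothetical helper, not part of Hadoop.
check_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Path taken from the hadoop-env.sh setting above.
if check_java_home "/usr/local/java/module/jdk1.8"; then
    echo "JAVA_HOME looks valid"
else
    echo "JAVA_HOME does not contain an executable bin/java"
fi
```

If the check fails, correct the path in hadoop-env.sh before starting any daemon.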


(2) Go to the hadoop-2.7.2/etc/hadoop directory and edit the core-site.xml file

core-site.xml configuration:

[root@hadoop105 hadoop]# vim core-site.xml 

<configuration>
      
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop105:9000</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/module/hadoop-2.7.2/tmp</value>
</property>

</configuration>

For configuration reference, see the official documentation: http://hadoop.apache.org/docs/r2.7.2/
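
To double-check what was written into a config file, the value of a property can be read back. The sketch below uses a made-up helper, get_property, which assumes that <name> and <value> each sit on their own line as in this article; it is not a general XML parser. On a working installation, hdfs getconf -confKey fs.defaultFS reports the value Hadoop actually resolves.

```shell
# Print the <value> of the property named $2 in the Hadoop config file $1.
# Assumption: <name> and <value> each sit on one line, as in this article.
# get_property is a hypothetical helper for illustration only.
get_property() {
    awk -v key="$2" '
        /<name>/ {
            name = $0
            sub(/.*<name>/, "", name)
            sub(/<\/name>.*/, "", name)
        }
        /<value>/ && name == key {
            value = $0
            sub(/.*<value>/, "", value)
            sub(/<\/value>.*/, "", value)
            print value
            exit
        }
    ' "$1"
}

# Demo on a temporary copy of the core-site.xml from this article.
conf=$(mktemp)
cat > "$conf" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop105:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/module/hadoop-2.7.2/tmp</value>
</property>
</configuration>
EOF
get_property "$conf" fs.defaultFS      # prints hdfs://hadoop105:9000
get_property "$conf" hadoop.tmp.dir    # prints /usr/local/hadoop/module/hadoop-2.7.2/tmp
rm -f "$conf"
```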

(3) Go to the hadoop-2.7.2/etc/hadoop directory and edit hdfs-site.xml

hdfs-site.xml configuration:

[root@hadoop105 hadoop]# vim hdfs-site.xml

<configuration>

<property>
<name>dfs.replication</name>
<value>2</value>
</property>

<!-- Host and port of the Hadoop SecondaryNameNode -->
<property>
      <name>dfs.namenode.secondary.http-address</name>
      <value>hadoop105:50090</value>
</property>

</configuration>


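The value 2 is the replication factor: each block is stored as 2 copies, so the data survives if one machine goes down. The default is 3.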
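
The replication factor directly multiplies the raw disk space the cluster consumes. A small arithmetic sketch, where the 100 GB figure is a made-up example and raw_storage_gb is a hypothetical helper:

```shell
# Raw disk space (in GB) consumed across the cluster for a given
# logical data size (GB) and replication factor.
raw_storage_gb() {
    echo $(( $1 * $2 ))
}

raw_storage_gb 100 2   # dfs.replication=2, as configured here: prints 200
raw_storage_gb 100 3   # the default replication of 3: prints 300
```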

(4) Go to the hadoop-2.7.2/etc/hadoop directory, copy mapred-site.xml.template, and rename the copy to mapred-site.xml

mapred-site.xml configuration:

[root@hadoop105 hadoop]# cp mapred-site.xml.template mapred-site.xml

[root@hadoop105 hadoop]# vim mapred-site.xml

<configuration>

<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

</configuration>

(5) Go to the hadoop-2.7.2/etc/hadoop directory and edit the yarn-site.xml file

[root@hadoop105 hadoop]# vim yarn-site.xml 

<configuration>

<!-- Site specific YARN configuration properties -->

<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop105</value>
</property>

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>

</configuration>

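The fully distributed Hadoop configuration is now complete.


The next step is to distribute it to the other machines.

The steps are as follows:

(1) Copy the configured installation from the first machine (hadoop105) to the other machines (the second machine, hadoop106, and the third machine, hadoop107) by running the following commands: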

[root@hadoop105 module]# scp -r hadoop-2.7.2/ hadoop106:/usr/local/hadoop/module/

[root@hadoop105 module]# scp -r hadoop-2.7.2/ hadoop107:/usr/local/hadoop/module/
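
With more than a couple of nodes, the same copy is easier to express as a loop. The sketch below only prints the scp commands (a dry run) so they can be reviewed first; print_scp_commands is a made-up helper name, and the hosts and paths are the ones from this article.

```shell
# Print one scp command per target host. Nothing is copied:
# this is a dry run so the commands can be reviewed first.
# print_scp_commands is a hypothetical helper for illustration.
print_scp_commands() {
    src="$1"
    dest="$2"
    shift 2
    for host in "$@"; do
        echo "scp -r $src $host:$dest"
    done
}

print_scp_commands hadoop-2.7.2/ /usr/local/hadoop/module/ hadoop106 hadoop107
```

Piping the output to sh from the /usr/local/hadoop/module directory would perform the actual copy.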

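Distribution complete!


Next, format the NameNode.

The steps are as follows:

Format (in Hadoop 2.x this command still works but is deprecated in favor of hdfs namenode -format):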

[root@hadoop105 module]# hadoop namenode -format

Check that the format succeeded. The output should contain a line like this:

20/01/14 13:17:30 INFO common.Storage: Storage directory /usr/local/hadoop/module/hadoop-2.7.2/tmp/dfs/name has been successfully formatted.
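
The format command prints a lot of output, so the success line is easy to miss. One option is to save the output and search it. This is a sketch only: format_succeeded is a made-up helper, and /tmp/namenode-format.log is a hypothetical file name.

```shell
# Succeed if the given log file contains the NameNode format success message.
# format_succeeded is a hypothetical helper for illustration.
format_succeeded() {
    grep -q 'has been successfully formatted' "$1"
}

# On the NameNode host the output could be captured first (not run here;
# the log path is a made-up example):
#   hadoop namenode -format 2>&1 | tee /tmp/namenode-format.log

# Demo using the log line quoted in this article.
log=$(mktemp)
echo '20/01/14 13:17:30 INFO common.Storage: Storage directory /usr/local/hadoop/module/hadoop-2.7.2/tmp/dfs/name has been successfully formatted.' > "$log"
if format_succeeded "$log"; then
    echo "format OK"
else
    echo "format message not found; check the log for errors"
fi
rm -f "$log"
```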

Start the NameNode

[root@hadoop105 sbin]# ls
distribute-exclude.sh  hdfs-config.sh           refresh-namenodes.sh  start-balancer.sh    start-yarn.cmd  stop-balancer.sh    stop-yarn.cmd
hadoop-daemon.sh       httpfs.sh                slaves.sh             start-dfs.cmd        start-yarn.sh   stop-dfs.cmd        stop-yarn.sh
hadoop-daemons.sh      kms.sh                   start-all.cmd         start-dfs.sh         stop-all.cmd    stop-dfs.sh         yarn-daemon.sh
hdfs-config.cmd        mr-jobhistory-daemon.sh  start-all.sh          start-secure-dns.sh  stop-all.sh     stop-secure-dns.sh  yarn-daemons.sh

[root@hadoop105 sbin]# hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/hadoop/module/hadoop-2.7.2/logs/hadoop-root-namenode-hadoop105.out

Check that the NameNode process is running

[root@hadoop105 sbin]# jps
11056 NameNode
11123 Jps
[root@hadoop105 sbin]# 
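
The jps check can be scripted. The sketch below reads jps output on standard input and matches the process name exactly, so that NameNode is not confused with SecondaryNameNode. has_daemon is a made-up helper name, and the sample input is the output shown above.

```shell
# Read `jps` output on stdin and succeed if a process with exactly
# the given name is listed. has_daemon is a hypothetical helper.
has_daemon() {
    awk -v name="$1" '
        $2 == name { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Demo with the jps output from this article; on a real node use: jps | has_daemon NameNode
if printf '11056 NameNode\n11123 Jps\n' | has_daemon NameNode; then
    echo "NameNode is running"
else
    echo "NameNode is NOT running"
fi
```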

Open: http://hadoop105:50070

The firewall must be stopped before the page can be reached:

[root@hadoop105 sbin]# systemctl stop firewalld.service
[root@hadoop105 sbin]# 

Start the DataNode on the second machine, hadoop106:

[root@hadoop106 sbin]# source /etc/profile
[root@hadoop106 sbin]# hadoop-daemon.sh start  datanode
starting datanode, logging to /usr/local/hadoop/module/hadoop-2.7.2/logs/hadoop-root-datanode-hadoop106.out
[root@hadoop106 sbin]# 

After refreshing the page, the data shows up.
Next, start the DataNode on the third machine:

[root@hadoop107 sbin]# hadoop-daemon.sh start  datanode
starting datanode, logging to /usr/local/hadoop/module/hadoop-2.7.2/logs/hadoop-root-datanode-hadoop107.out

Refresh the page again to see the updated node list.


Next, the slaves file also needs to be configured, in the hadoop-2.7.2/etc/hadoop directory:

[root@hadoop105 hadoop]# pwd
/usr/local/hadoop/module/hadoop-2.7.2/etc/hadoop
[root@hadoop105 hadoop]#  vim slaves 

hadoop105
hadoop106
hadoop107
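
Once the slaves file exists, it can also drive ad hoc checks across the cluster. The sketch below lists the hosts in a slaves file, skipping blank lines and # comments as a precaution, and prints an ssh command per host as a dry run. list_slaves is a made-up helper name, and the demo works on a temporary copy of the file.

```shell
# Print the hosts listed in a slaves file, dropping # comments and
# blank lines. list_slaves is a hypothetical helper for illustration.
list_slaves() {
    sed -e 's/#.*$//' -e '/^[[:space:]]*$/d' "$1"
}

# Demo on a temporary copy of the slaves file from this article.
slaves=$(mktemp)
printf 'hadoop105\nhadoop106\nhadoop107\n' > "$slaves"

# Dry run: print the command that would show the Java processes on each node.
for host in $(list_slaves "$slaves"); do
    echo "ssh $host jps"
done
rm -f "$slaves"
```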

Command to start HDFS on hadoop105, hadoop106, and hadoop107:

[root@hadoop105 hadoop]# start-dfs.sh

Refresh the page again to check the result.
To access port 8088, yarn-site.xml needs additional configuration.

The steps are as follows:

(1) Go to the hadoop-2.7.2/etc/hadoop directory and edit the yarn-site.xml configuration file

[root@hadoop105 hadoop]# vim yarn-site.xml


<configuration>

<!-- Site specific YARN configuration properties -->

<!-- Hostname of the YARN ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop105</value>
</property>

<!-- How reducers fetch their data -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>


<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop105:8032</value>
</property>

<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop105:8030</value>
</property>

<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop105:8031</value>
</property>

<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop105:8033</value>
</property>

<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop105:8088</value>
</property>

</configuration>

Stop the YARN services

[root@hadoop105 hadoop-2.7.2]# stop-yarn.sh

Restart the YARN services

[root@hadoop105 hadoop-2.7.2]# start-yarn.sh

starting yarn daemons
starting resourcemanager, logging to /usr/local/hadoop/module/hadoop-2.7.2/logs/yarn-root-resourcemanager-hadoop105.out
hadoop107: starting nodemanager, logging to /usr/local/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop107.out
hadoop106: starting nodemanager, logging to /usr/local/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop106.out
hadoop105: starting nodemanager, logging to /usr/local/hadoop/module/hadoop-2.7.2/logs/yarn-root-nodemanager-hadoop105.out

Check which processes are listening on which ports

[root@hadoop105 hadoop-2.7.2]# netstat -tpnl | grep java 

tcp        0      0 192.168.219.105:9000    0.0.0.0:*               LISTEN      11056/java          
tcp        0      0 192.168.219.105:50090   0.0.0.0:*               LISTEN      11537/java          
tcp        0      0 127.0.0.1:41485         0.0.0.0:*               LISTEN      11385/java          
tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      11056/java          
tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      11385/java          
tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      11385/java          
tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      11385/java          
tcp6       0      0 :::8040                 :::*                    LISTEN      12612/java          
tcp6       0      0 :::8042                 :::*                    LISTEN      12612/java          
tcp6       0      0 192.168.219.105:8088    :::*                    LISTEN      12510/java          
tcp6       0      0 :::13562                :::*                    LISTEN      12612/java          
tcp6       0      0 192.168.219.105:8030    :::*                    LISTEN      12510/java          
tcp6       0      0 192.168.219.105:8031    :::*                    LISTEN      12510/java          
tcp6       0      0 192.168.219.105:8032    :::*                    LISTEN      12510/java          
tcp6       0      0 192.168.219.105:8033    :::*                    LISTEN      12510/java          
tcp6       0      0 :::44898                :::*                    LISTEN      12612/java    
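
Rather than scanning the list by eye, the ports of interest can be checked one by one. The sketch below reads netstat -tpnl output on standard input and looks for a LISTEN entry whose local address ends in the given port. port_listening is a made-up helper name, and the sample lines are taken from the output above.

```shell
# Read `netstat -tpnl` output on stdin and succeed if some socket in
# LISTEN state has a local address ending in :PORT.
# port_listening is a hypothetical helper for illustration.
port_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Three lines from the netstat output in this article.
sample='tcp        0      0 192.168.219.105:9000    0.0.0.0:*               LISTEN      11056/java
tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      11056/java
tcp6       0      0 192.168.219.105:8088    :::*                    LISTEN      12510/java'

# On a real node, pipe `netstat -tpnl` into port_listening instead of the sample.
for port in 9000 50070 8088; do
    if printf '%s\n' "$sample" | port_listening "$port"; then
        echo "port $port: listening"
    else
        echo "port $port: NOT listening"
    fi
done
```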

Stop the firewall

systemctl stop firewalld.service 

Open: http://hadoop105:8088
The fully distributed Hadoop cluster is up and running!

Reposted from blog.csdn.net/weixin_39868387/article/details/103965735