1. Introduction
First, prepare three CentOS machines with the following IPs and hostnames:
| IP | hostname |
| --- | --- |
| 192.168.1.31 | master |
| 192.168.1.32 | slave1 |
| 192.168.1.33 | slave2 |
The steps involved:
- Configure /etc/hosts
- Disable the firewall
- Disable SELinux
- Set up passwordless SSH login
- Install the JDK
- Install Hadoop
- Configure Hadoop
- Start Hadoop
- Verify that Hadoop works
2. Following my earlier article on creating multiple CentOS 7 VMs with Vagrant and building a master/slave cluster with Docker Swarm, create three CentOS 7 virtual machines with Vagrant.
The Vagrantfile looks like this:
Vagrant.configure("2") do |config|
  config.vm.define "master" do |vb|
    vb.vm.provider "virtualbox" do |v|
      v.memory = 1024
      v.cpus = 1
    end
    vb.vm.hostname = "master"
    vb.vm.network :public_network, ip: "192.168.1.31"
    vb.vm.box = "my-centos7"
  end
  config.vm.define "slave1" do |vb|
    vb.vm.provider "virtualbox" do |v|
      v.memory = 1024
      v.cpus = 1
    end
    vb.vm.hostname = "slave1"
    vb.vm.network :public_network, ip: "192.168.1.32"
    vb.vm.box = "my-centos7"
  end
  config.vm.define "slave2" do |vb|
    vb.vm.provider "virtualbox" do |v|
      v.memory = 1024
      v.cpus = 1
    end
    vb.vm.hostname = "slave2"
    vb.vm.network :public_network, ip: "192.168.1.33"
    vb.vm.box = "my-centos7"
  end
end
3. Configure /etc/hosts
# edit the file
vi /etc/hosts
# add the following entries
192.168.1.31 master
192.168.1.32 slave1
192.168.1.33 slave2
4. Disable the firewall
# check the firewall status
firewall-cmd --state
# stop firewalld
systemctl stop firewalld.service
# keep it from starting again on reboot
systemctl disable firewalld.service
5. Disable SELinux
# edit the SELinux configuration
vi /etc/selinux/config
# change this line (takes effect after a reboot; run `setenforce 0` to switch to permissive mode for the current session)
SELINUX=disabled
6. Set up passwordless SSH login
# generate an RSA key pair
ssh-keygen -t rsa
# copy the public key to slave1
ssh-copy-id root@slave1
Do the same for slave2 (and for master itself, since the Hadoop start scripts also ssh into the local host).
Once this is done, `ssh slave1` and `ssh slave2` from master no longer prompt for a password.
Repeat the procedure on slave1 (toward master and slave2) and on slave2 (toward master and slave1).
7. Install the JDK; see my other article on installing the JDK on a CentOS server.
8. Install Hadoop
8.1 Upload the hadoop-2.6.5.tar.gz package to the /usr directory and extract it.
8.2 Create three directories
# the absolute paths must match the XML configuration below
mkdir -p /data/hadoop/namenode
mkdir -p /data/hadoop/data
mkdir -p /data/hadoop/tmp
Do the same on slave1 and slave2.
9. Configure Hadoop
9.1 Seven files need to be modified, all under /usr/hadoop-2.6.5/etc/hadoop/:
- core-site.xml
- hdfs-site.xml
- mapred-site.xml
- yarn-site.xml
- slaves
- masters
- hadoop-env.sh
9.2 Modify core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
<description>NameNode communication address</description>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop/tmp</value>
<description>temporary file storage path</description>
</property>
</configuration>
9.3 Modify hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data/hadoop/namenode</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/data/hadoop/data</value>
<final>true</final>
</property>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
9.4 Modify mapred-site.xml
Hadoop 2.6.5 ships only a template for this file, so create it first with `cp mapred-site.xml.template mapred-site.xml`, then edit it:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
9.5 Modify yarn-site.xml
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8040</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>master:8088</value>
<description>ResourceManager web UI address; cluster information can be viewed in a browser at this address</description>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>master:8141</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>auxiliary service run by the NodeManager; must be set to mapreduce_shuffle for MapReduce jobs to run</description>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.acl.enable</name>
<value>false</value>
</property>
<property>
<name>yarn.admin.acl</name>
<value>*</value>
</property>
</configuration>
9.6 Modify slaves
slave1
slave2
9.7 Modify masters (in Hadoop 2.x this file names the host that runs the SecondaryNameNode)
master
9.8 Modify hadoop-env.sh
Set JAVA_HOME explicitly (the daemons do not pick it up from the login environment):
export JAVA_HOME=/usr/java/jdk1.8.0_231
10. Start Hadoop
10.1 Before the first start, format the file system (on master only, and only once; reformatting destroys existing HDFS data)
# go to the bin directory
cd /usr/hadoop-2.6.5/bin/
# format the NameNode
./hdfs namenode -format
10.2 Start HDFS
# go to the sbin directory
cd /usr/hadoop-2.6.5/sbin/
# start HDFS
./start-dfs.sh
10.3 Start YARN
# start YARN
./start-yarn.sh
11. Verify that Hadoop started successfully
# run a few HDFS commands from master
hadoop fs -ls /
hadoop fs -mkdir /user
hadoop fs -put /data/jdk-8u231-linux-x64.tar.gz /user