Setting Up a Hadoop Cluster on a Single Machine

1. Preparation

First, create the folders. The structure looks like this:

weim@weim:~/myopt$ ls
ubuntu1  ubuntu2  ubuntu3

Then extract the downloaded JDK (version 8u172) and Hadoop (version hadoop-2.9.1) into each of the three folders:

weim@weim:~/myopt$ ls ubuntu1
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu2
hadoop  jdk
weim@weim:~/myopt$ ls ubuntu3
hadoop  jdk
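The preparation above (three per-node folders, each holding the extracted hadoop and jdk directories) can be scripted. A minimal sketch that creates the directory skeleton only; extracting the actual jdk-8u172 and hadoop-2.9.1 archives into each jdk/ and hadoop/ folder is assumed to be done separately:

```shell
# Create the per-node folders; hadoop/ and jdk/ are where the
# extracted archives go.
base="$HOME/myopt"
for n in 1 2 3; do
  mkdir -p "$base/ubuntu$n/hadoop" "$base/ubuntu$n/jdk"
done
ls "$base"
```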

2. Creating the Three Machines

We use Docker to create the three machines, based on the ubuntu:16.04 image:

weim@weim:~/myopt$ docker image ls
REPOSITORY                                          TAG                 IMAGE ID            CREATED             SIZE
ubuntu                                              16.04               f975c5035748        2 months ago        112MB

Start three Ubuntu containers, mounting the local directories ~/myopt/ubuntu1, ~/myopt/ubuntu2, and ~/myopt/ubuntu3 to /home/software in the respective containers.

ubuntu1

weim@weim:~/myopt$ docker run --hostname ubuntu1 --name ubuntu1 -v /home/weim/myopt/ubuntu1:/home/software -it --rm  ubuntu:16.04 bash
root@ubuntu1:/# ls /home/software/
hadoop  jdk

ubuntu2

weim@weim:~/myopt$ docker run --hostname ubuntu2 --name ubuntu2 -v /home/weim/myopt/ubuntu2:/home/software -it --rm  ubuntu:16.04 bash
root@ubuntu2:/# ls /home/software/
hadoop  jdk

ubuntu3

weim@weim:~/myopt$ docker run --hostname ubuntu3 --name ubuntu3 -v /home/weim/myopt/ubuntu3:/home/software -it --rm  ubuntu:16.04 bash
root@ubuntu3:/# ls /home/software/
hadoop  jdk
root@ubuntu3:/# 

With that, the three basic machines are ready.
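The three docker run invocations differ only in the node index, so they can be generated with a loop. This is shown as a dry run that only prints the commands; run them manually, each in its own terminal, since each container keeps an interactive bash session open:

```shell
# Build the docker run command for each node (dry run: print only).
cmds=$(for n in 1 2 3; do
  echo "docker run --hostname ubuntu$n --name ubuntu$n -v /home/weim/myopt/ubuntu$n:/home/software -it --rm ubuntu:16.04 bash"
done)
printf '%s\n' "$cmds"
```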

Check the machines' information:

weim@weim:~$ docker ps -a
CONTAINER ID        IMAGE               COMMAND             CREATED              STATUS              PORTS               NAMES
b4c6de2a4326        ubuntu:16.04        "bash"              About a minute ago   Up About a minute                       ubuntu2
53d1f6389710        ubuntu:16.04        "bash"              About a minute ago   Up About a minute                       ubuntu3
0f210a01d47f        ubuntu:16.04        "bash"              About a minute ago   Up About a minute                       ubuntu1
weim@weim:~$ 
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu1
172.17.0.2
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu2
172.17.0.4
weim@weim:~$ docker inspect --format '{{ .NetworkSettings.IPAddress }}' ubuntu3
172.17.0.3
----------------------------------------------------------------------------------
These are the IP addresses of the three machines.
All three machines are on the same local network (Docker's default bridge network).
----------------------------------------------------------------------------------

3. Installing Required Software

Install the required software on all three machines. First run apt-get update to refresh the Ubuntu package index, then install vim and openssh-server:

apt-get update
apt-get install -y vim openssh-server

4. Environment Configuration

a. First configure the Java environment by appending the Java path settings to /etc/profile:

root@ubuntu1:/home/software/jdk# vim /etc/profile
---------------------------------------------------------------
Append the following lines at the end of the profile file:
#set jdk environment  
export JAVA_HOME=/home/software/jdk 
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH  
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
---------------------------------------------------------------

root@ubuntu1:/home/software/jdk# source /etc/profile  
root@ubuntu1:/home/software/jdk# java -version
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
root@ubuntu1:/home/software/jdk# 

b. Set up passwordless SSH access

root@ubuntu1:/home/software/jdk# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:hSMrNTp6/1d7L/QZGKdTCPivDJspbY2tcyjke2qjpBI root@ubuntu1
The key's randomart image is:
+---[RSA 2048]----+
|          .      |
|         o .     |
|      + o o . .  |
|     o + o . o o |
|    + . S   . *  |
| E . o . .  .=.. |
|  o ..o . @..o..o|
| . .o. * @.*. o..|
|  .. .++Xo+  . o.|
+----[SHA256]-----+
root@ubuntu1:/home/software/jdk# cd ~/.ssh
root@ubuntu1:~/.ssh# ls
id_rsa  id_rsa.pub
root@ubuntu1:~/.ssh# cat id_rsa.pub >> authorized_keys
root@ubuntu1:~/.ssh# chmod 600 authorized_keys 

After this is configured, verify passwordless access to the local machine with ssh localhost. Make sure the SSH service is running first; if not, start it with /etc/init.d/ssh start.

root@ubuntu1:/home/software# /etc/init.d/ssh start
 * Starting OpenBSD Secure Shell server sshd                                                                                                               [ OK ] 
root@ubuntu1:/home/software# ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.13.0-41-generic x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

The programs included with the Ubuntu system are free software;
the exact distribution terms for each program are described in the
individual files in /usr/share/doc/*/copyright.

Ubuntu comes with ABSOLUTELY NO WARRANTY, to the extent permitted by
applicable law.

root@ubuntu1:~# exit
logout
Connection to localhost closed.
root@ubuntu1:/home/software# 
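The key generation and authorization above were done interactively; a non-interactive equivalent is sketched below. For illustration it writes into a scratch directory so it can run anywhere; on the containers the directory would be /root/.ssh:

```shell
# Non-interactive key setup: empty passphrase, authorize our own key.
keydir="$(mktemp -d)/.ssh"          # on the node this would be /root/.ssh
mkdir -p "$keydir" && chmod 700 "$keydir"
ssh-keygen -t rsa -N "" -f "$keydir/id_rsa" -q
cat "$keydir/id_rsa.pub" >> "$keydir/authorized_keys"
chmod 600 "$keydir/authorized_keys"
ls "$keydir"
```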

Next, copy the authorized_keys file into the ubuntu2 and ubuntu3 containers. (Since I don't know the root password of ubuntu2, I can't copy it over with scp, so a workaround via the mounted volumes is used instead.)

First, go to ~/.ssh and copy the authorized_keys file to /home/software:

root@ubuntu1:~/.ssh# ls
authorized_keys  id_rsa  id_rsa.pub  known_hosts
root@ubuntu1:~/.ssh# cp authorized_keys /home/software/
root@ubuntu1:~/.ssh# ls /home/software/
authorized_keys  hadoop  jdk
root@ubuntu1:~/.ssh# 

Back on the host system, the copied file is visible under ~/myopt/ubuntu1. Copy it into ubuntu2 and ubuntu3:

weim@weim:~/myopt/ubuntu1$ ls
authorized_keys  hadoop  jdk
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu2/
weim@weim:~/myopt/ubuntu1$ sudo cp authorized_keys ../ubuntu3/

Then, in the ubuntu2 and ubuntu3 containers, copy the file into ~/.ssh (ubuntu2 shown; ubuntu3 is the same):

root@ubuntu2:/home/software# cp authorized_keys ~/.ssh
root@ubuntu2:/home/software# ls ~/.ssh
authorized_keys  id_rsa  id_rsa.pub
root@ubuntu2:/home/software# 

Verify that ubuntu1 can access ubuntu2 and ubuntu3 without a password (using the IP addresses obtained earlier with docker inspect):

root@ubuntu1:~/.ssh# ssh [email protected]
root@ubuntu1:~/.ssh# ssh [email protected]

5. Hadoop Configuration

We take ubuntu1 as the example; ubuntu2 and ubuntu3 are configured the same way.

First create the directories where Hadoop stores its data:

root@ubuntu1:/home/software/hadoop# mkdir data
root@ubuntu1:/home/software/hadoop# cd data/
root@ubuntu1:/home/software/hadoop/data# mkdir tmp
root@ubuntu1:/home/software/hadoop/data# mkdir data
root@ubuntu1:/home/software/hadoop/data# mkdir checkpoint
root@ubuntu1:/home/software/hadoop/data# mkdir name
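The four mkdir calls above can be collapsed into a single command. The sketch below defaults to a scratch directory so it can run anywhere; on the nodes you would set HADOOP_BASE=/home/software/hadoop (HADOOP_BASE is just an illustrative variable here, not a Hadoop setting):

```shell
# Create all four data directories in one go.
base="${HADOOP_BASE:-$(mktemp -d)}"   # on the node: /home/software/hadoop
mkdir -p "$base/data/tmp" "$base/data/data" \
         "$base/data/checkpoint" "$base/data/name"
ls "$base/data"
```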

Go to the /home/software/hadoop/etc/hadoop directory.

Edit hadoop-env.sh and set the Java home:

export JAVA_HOME=/home/software/jdk

Configure core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://172.17.0.2:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/software/hadoop/data/tmp</value>
  </property>
  <property>
    <name>fs.trash.interval</name>
    <value>1440</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>65536</value>
  </property>
</configuration>

Configure hdfs-site.xml:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/home/software/hadoop/data/name</value>
  </property>
  <property>
    <name>dfs.blocksize</name>
    <value>67108864</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/home/software/hadoop/data/data</value>
  </property>
  <property>
    <name>dfs.namenode.checkpoint.dir</name>
    <value>/home/software/hadoop/data/checkpoint</value>
  </property>
  <property>
    <name>dfs.namenode.handler.count</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.datanode.handler.count</name>
    <value>10</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address</name>
    <value>172.17.0.2:9000</value>
  </property>
</configuration>

Configure mapred-site.xml:

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>

Configure yarn-site.xml:

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>172.17.0.2</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

Configure slaves with the IP addresses of all three nodes:

172.17.0.2
172.17.0.3
172.17.0.4

6. Startup

On ubuntu1, go to the /home/software/hadoop/bin directory and run hdfs namenode -format to initialize HDFS:

root@ubuntu1:/home/software/hadoop/bin# ./hdfs namenode -format

Still on ubuntu1, go to the /home/software/hadoop/sbin directory and run start-all.sh:

root@ubuntu1:/home/software/hadoop/sbin# ./start-all.sh
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [ubuntu1]
The authenticity of host 'ubuntu1 (172.17.0.2)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
ubuntu1: Warning: Permanently added 'ubuntu1,172.17.0.2' (ECDSA) to the list of known hosts.
ubuntu1: starting namenode, logging to /home/software/hadoop/logs/hadoop-root-namenode-ubuntu1.out
172.17.0.2: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu1.out
172.17.0.4: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu2.out
172.17.0.3: starting datanode, logging to /home/software/hadoop/logs/hadoop-root-datanode-ubuntu3.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:chW/KhKqnlQZ8qMxDy8wgSzBIEZ08pdVycjfgJFkVSY.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /home/software/hadoop/logs/hadoop-root-secondarynamenode-ubuntu1.out
starting yarn daemons
starting resourcemanager, logging to /home/software/hadoop/logs/yarn--resourcemanager-ubuntu1.out
172.17.0.2: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu1.out
172.17.0.3: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu3.out
172.17.0.4: starting nodemanager, logging to /home/software/hadoop/logs/yarn-root-nodemanager-ubuntu2.out

Check that everything started:

ubuntu1

root@ubuntu1:/home/software/hadoop/sbin# jps
3827 SecondaryNameNode
3686 DataNode
4007 ResourceManager
4108 NodeManager
4158 Jps

ubuntu2

root@ubuntu2:/home/software/hadoop/sbin# jps
3586 Jps
3477 DataNode
3545 NodeManager

ubuntu3

root@ubuntu3:/home/software/hadoop/sbin# jps
3472 DataNode
3540 NodeManager
3582 Jps

Now we can open http://172.17.0.2:50070 (the NameNode web UI) and http://172.17.0.2:8088 (the ResourceManager web UI) in a browser to see the cluster status.
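A quick way to check from the host that both web UIs are reachable (assuming the cluster is up and curl is installed; 50070 and 8088 are the default NameNode and ResourceManager UI ports in Hadoop 2.x):

```shell
# Probe the NameNode (50070) and ResourceManager (8088) web UIs.
status=$(for port in 50070 8088; do
  if curl -sf --max-time 5 "http://172.17.0.2:$port/" >/dev/null 2>&1; then
    echo "port $port: reachable"
  else
    echo "port $port: not reachable"
  fi
done)
printf '%s\n' "$status"
```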

Reposted from my.oschina.net/u/2490316/blog/1819220