Building a Distributed Hadoop Cluster (Apache Edition), Part 2

Deploying Hadoop:

Before configuring anything, we first need to settle how the cluster roles are distributed across the nodes:

Node distribution:

HDFS nodes — master: NameNode;            workers: DataNode
YARN nodes — master: ResourceManager; workers: NodeManager

bigdata-01.superyong.com      NodeManager      DataNode      NameNode(active)
bigdata-02.superyong.com      NodeManager      DataNode      NameNode(standby)
bigdata-03.superyong.com      NodeManager      DataNode      ResourceManager

Note: the high-availability setup (the active/standby NameNodes) will be configured later, when ZooKeeper is installed and configured.

Next I will configure Hadoop module by module. Before that, the basic Hadoop environment must already be in place, as described in Building a Distributed Hadoop Cluster (Apache Edition), Part 1.

common module:

core-site.xml

<configuration>

    <!-- Host name and port on which the HDFS NameNode runs -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://bigdata-01.superyong.com:8020</value>
    </property>
    
    <!-- Local temporary storage directory for Hadoop; defaults to the Linux /tmp directory -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/modules/hadoop-2.7.3/data/tmpData</value>
    </property>
    
</configuration>

HDFS module:

hdfs-site.xml

<configuration>

    <!-- HDFS splits files into blocks, each stored as three replicas by default; the replica count is set here -->
    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>
   
    <!-- Host on which the SecondaryNameNode runs; it usually sits on the same machine as the NameNode, which it assists -->
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>bigdata-01.superyong.com:50090</value>
    </property>

</configuration>

slaves

Note: this file specifies which hosts run a DataNode; each line of the slaves file holds one hostname. List the addresses of all worker nodes here.

bigdata-01.superyong.com
bigdata-02.superyong.com
bigdata-03.superyong.com
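As a sketch, a quick sanity check of the slaves file can catch duplicate or blank entries before starting the cluster. The snippet below recreates the file locally so it runs anywhere; on a real node the file lives at /opt/modules/hadoop-2.7.3/etc/hadoop/slaves.

```shell
# Recreate the slaves file used in this article (stand-in for the real one).
printf '%s\n' \
  bigdata-01.superyong.com \
  bigdata-02.superyong.com \
  bigdata-03.superyong.com > slaves

# One hostname per line: no duplicates, no blank lines.
dups=$(sort slaves | uniq -d)
blanks=$(grep -c '^$' slaves || true)
if [ -z "$dups" ] && [ "$blanks" -eq 0 ]; then
  echo "slaves OK: $(wc -l < slaves) hosts"
else
  echo "problem found: dups=[$dups] blanks=$blanks"
fi
```

A malformed slaves file tends to surface later as daemons silently not starting on some hosts, so a check like this is cheap insurance.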

YARN module:

  A quick aside:
    YARN is a distributed resource-management and job-scheduling framework; several kinds of applications can run on it:
         -》MapReduce
                  a parallel data-processing framework
         -》Spark
                  an in-memory distributed computing framework
         -》Storm/Flink
                  real-time stream-processing frameworks

yarn-site.xml

<configuration>

    <!-- Tell YARN that MapReduce programs will run on it -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

    <!-- Host on which the YARN master, the ResourceManager, runs -->
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>bigdata-03.superyong.com</value>
    </property>

</configuration>

MapReduce module:

A parallel computation framework.

mapred-site.xml

When you go to configure this file you will find that it does not exist; only its template, mapred-site.xml.template, does. Rename (or copy) the template to mapred-site.xml, then configure it:
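The rename step can be sketched as below; the mkdir and placeholder file are only there so the snippet runs anywhere, since on a real node the template already exists under /opt/modules/hadoop-2.7.3/etc/hadoop.

```shell
# Stand-in for the shipped template (already present on a real node).
mkdir -p etc/hadoop
touch etc/hadoop/mapred-site.xml.template

# The actual step: copy (or mv) the template to the expected file name.
cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
ls etc/hadoop/mapred-site.xml
```

Using cp instead of mv keeps the pristine template around in case you need to start over.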

<configuration>

    <!-- Run MapReduce programs on YARN; the default is "local" -->
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

</configuration>

That completes the basic configuration. Now we distribute it.

==================================================================================

Earlier we configured the Hadoop framework on the host bigdata-01.superyong.com; now distribute the configured directory to all the other nodes:

Use scp to push the configured hadoop directory to each of the other two hosts:

scp -r hadoop-2.7.3/ [email protected]:/opt/modules/
scp -r hadoop-2.7.3/ [email protected]:/opt/modules/

That completes the setup. Next, test the newly built cluster:

Testing:

Before starting anything, format the HDFS file system:

bin/hdfs  namenode -format

Marker of success:

util.ExitUtil: Exiting with status 0
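A simple way to check for that marker in a captured log is sketched below. The log line here is a stand-in; on a real node you would capture the actual output with `bin/hdfs namenode -format 2>&1 | tee format.log`.

```shell
# Stand-in log line (on a real node, capture the real format output instead).
echo "INFO util.ExitUtil: Exiting with status 0" > format.log

# The format succeeded only if the NameNode exited with status 0.
if grep -q "Exiting with status 0" format.log; then
  echo "format succeeded"
else
  echo "format FAILED - inspect format.log"
fi
```

This prints "format succeeded" when the marker is present, which is handier than scrolling through the full format output.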

Then start the NameNode, using the daemon start script:

[super-yong@bigdata-01 hadoop-2.7.3]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-namenode-bigdata-01.superyong.com.out
[super-yong@bigdata-01 hadoop-2.7.3]$ jps
3591 Jps
3516 NameNode
[super-yong@bigdata-01 hadoop-2.7.3]$

We can see that the NameNode has started.

Next, start the worker daemons:

bigdata-01.superyong.com:

[super-yong@bigdata-01 hadoop-2.7.3]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-datanode-bigdata-01.superyong.com.out
[super-yong@bigdata-01 hadoop-2.7.3]$ jps
3704 Jps
3625 DataNode
3516 NameNode
[super-yong@bigdata-01 hadoop-2.7.3]$
[super-yong@bigdata-01 hadoop-2.7.3]$ sbin/hadoop-daemon.sh start secondarynamenode
starting secondarynamenode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-secondarynamenode-bigdata-01.superyong.com.out
[super-yong@bigdata-01 hadoop-2.7.3]$ jps
4769 NameNode
4991 Jps
4943 SecondaryNameNode
[super-yong@bigdata-01 hadoop-2.7.3]$

bigdata-02.superyong.com:


[super-yong@bigdata-02 hadoop-2.7.3]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-datanode-bigdata-02.superyong.com.out
[super-yong@bigdata-02 hadoop-2.7.3]$ jps
2821 DataNode
2894 Jps
[super-yong@bigdata-02 hadoop-2.7.3]$

 bigdata-03.superyong.com:

[super-yong@bigdata-03 hadoop-2.7.3]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-datanode-bigdata-03.superyong.com.out
[super-yong@bigdata-03 hadoop-2.7.3]$ jps
2864 Jps
2820 DataNode
[super-yong@bigdata-03 hadoop-2.7.3]$

You can also check via the web UI on port 50070 (open it against whichever host runs your NameNode):

http://bigdata-01.superyong.com:50070

Of course, you can also start everything in one step:

[super-yong@bigdata-01 hadoop-2.7.3]$ sbin/start-dfs.sh
Starting namenodes on [bigdata-01.superyong.com]
bigdata-01.superyong.com: starting namenode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-namenode-bigdata-01.superyong.com.out
bigdata-01.superyong.com: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-datanode-bigdata-01.superyong.com.out
bigdata-02.superyong.com: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-datanode-bigdata-02.superyong.com.out
bigdata-03.superyong.com: starting datanode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-datanode-bigdata-03.superyong.com.out
Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
RSA key fingerprint is a6:8d:2e:4d:dd:ef:20:d3:d7:87:db:7a:33:5a:04:e3.
Are you sure you want to continue connecting (yes/no)? yes
0.0.0.0: Warning: Permanently added '0.0.0.0' (RSA) to the list of known hosts.
0.0.0.0: starting secondarynamenode, logging to /opt/modules/hadoop-2.7.3/logs/hadoop-super-yong-secondarynamenode-bigdata-01.superyong.com.out
[super-yong@bigdata-01 hadoop-2.7.3]$ jps
3926 NameNode
4234 SecondaryNameNode
4364 Jps
4063 DataNode
[super-yong@bigdata-01 hadoop-2.7.3]$

With that, all the HDFS daemons are up and running.

Next, start the YARN daemons (run the start script on whichever host your ResourceManager lives on):

[super-yong@bigdata-03 hadoop-2.7.3]$ sbin/start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-super-yong-resourcemanager-bigdata-03.superyong.com.out
bigdata-03.superyong.com: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-super-yong-nodemanager-bigdata-03.superyong.com.out
bigdata-02.superyong.com: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-super-yong-nodemanager-bigdata-02.superyong.com.out
bigdata-01.superyong.com: starting nodemanager, logging to /opt/modules/hadoop-2.7.3/logs/yarn-super-yong-nodemanager-bigdata-01.superyong.com.out
[super-yong@bigdata-03 hadoop-2.7.3]$

That completes the testing!


Reposted from blog.csdn.net/superme_yong/article/details/86524442