Installing software on Linux
Binary package installation
Hadoop02: 192.168.61.128
Hadoop01: 192.168.61.129
RPM installation (.rpm packages)
rpm -ivh nc-1.84-24.el6.x86_64.rpm
yum online installation (under the hood these are still RPM packages; yum is just another way of installing them)
Source installation
java -version    # check the installed software version
bashdb-4.4-0.93.tar.gz
gcc-4.4.7-23.el6.x86_64.rpm
jdk-7u79-linux-x64.tar.gz
nc-1.84-24.el6.x86_64.rpm
yum -y install gcc ...
Shell environments and ways to run scripts
- Bourne Shell: /bin/sh
- Bourne Again Shell (the Linux default): /bin/bash
- C Shell: /bin/csh (/bin/tcsh is an enhanced C shell)
- Korn Shell (/bin/ksh): a superset of the Bourne shell
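To check which of these shells are actually installed on a given machine (the list varies by distribution):

```shell
# List the valid login shells registered on this system
cat /etc/shells
# Show the current user's login shell, e.g. /bin/bash
echo "$SHELL"
```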
Shell variables: no spaces on either side of the equals sign; a variable name must start with a letter.
Example:
#!/bin/bash
class='hello world'
class2=`expr 3 + 3`      # arithmetic via expr
class1=`date`            # capture the current time
# echo ${varname}
echo $class ${class2}    # output: hello world 6
echo ${#class}           # string length
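The same example using the newer `$( )` and `$(( ))` syntax, which is easier to read and nest than backticks and `expr`:

```shell
#!/bin/bash
class='hello world'       # single quotes: string is taken literally
class2=$((3 + 3))         # arithmetic expansion instead of `expr 3 + 3`
class1=$(date)            # command substitution instead of backticks
echo "$class ${class2}"   # output: hello world 6
echo "${#class}"          # string length: 11
```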
Hadoop environment configuration
Apache Hadoop release: hadoop-2.7.3.tar.gz
- Download the chosen version and unpack it
- Add Hadoop to the environment variables
- Edit etc/hadoop/hadoop-env.sh under the Hadoop install directory
Test:
- which hadoop
- hadoop version
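One way to set the environment variables from the steps above; the install path /usr/local/hadoop-2.7.3 is an assumption, so adjust it to wherever the tarball was unpacked (these lines usually go into /etc/profile or ~/.bashrc, followed by `source /etc/profile`):

```shell
# Assumed install location -- change to match your setup
export HADOOP_HOME=/usr/local/hadoop-2.7.3
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"
```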
Modules: HDFS and YARN (ResourceManager, NodeManager, Container, ApplicationMaster)
Cloning the virtual machines
IPs after cloning: hadoop2: 192.168.61.128
hadoop3: 192.168.61.130
hadoop1: 192.168.61.131
On each clone:
- Fix the NIC binding: vi /etc/udev/rules.d/70-persistent-net.rules
- Change the hostname: vi /etc/sysconfig/network
- Change the IP settings: vi /etc/sysconfig/network-scripts/ifcfg-eth0
- Update the hostname mappings: vi /etc/hosts
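For reference, a sketch of the fields usually edited in ifcfg-eth0 on CentOS 6, using the hadoop2 address above; GATEWAY and DNS1 are assumptions (the typical VMware NAT gateway), and HWADDR, if present, must match the MAC recorded in the udev rules file:

```
# /etc/sysconfig/network-scripts/ifcfg-eth0 (sketch for hadoop2)
DEVICE=eth0
TYPE=Ethernet
ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.61.128
NETMASK=255.255.255.0
GATEWAY=192.168.61.2    # assumption: VMware NAT default gateway
DNS1=192.168.61.2       # assumption
```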
HDFS module: NameNode, SecondaryNameNode (SNN), DataNode
YARN module: ResourceManager, NodeManager
hadoop01: NameNode, SNN, DataNode
hadoop02: DataNode
hadoop03: DataNode
Building the Hadoop cluster
Local (standalone) mode
Pseudo-distributed mode
Fully-distributed mode
Fully-distributed setup:
Plan:
Hostname    IP address        Roles
hadoop01    192.168.61.131    NameNode, SNN, DataNode, ResourceManager, NodeManager
hadoop02    192.168.61.128    DataNode, NodeManager
hadoop03    192.168.61.130    DataNode, NodeManager
Prerequisites on every node: JDK, passwordless SSH, Hadoop
Configure the Hadoop configuration files
vi ./etc/hadoop/hadoop-env.sh
vi ./etc/hadoop/core-site.xml
<configuration>
<!-- Default filesystem (HDFS namespace) -->
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop01:9000</value>
</property>
<!-- Buffer size for HDFS I/O -->
<property>
<name>io.file.buffer.size</name>
<value>4096</value>
</property>
<!-- Temporary data directory -->
<property>
<name>hadoop.tmp.dir</name>
<value>/home/bigdata/tmp</value>
</property>
</configuration>
vi ./etc/hadoop/hdfs-site.xml
<configuration>
<!-- Replication factor -->
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<!-- Block size (128 MB) -->
<property>
<name>dfs.block.size</name>
<value>134217728</value>
</property>
<!-- Where the NameNode stores HDFS metadata -->
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoopdata/dfs/name</value>
</property>
<!-- Where DataNodes store block data -->
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoopdata/dfs/data</value>
</property>
<!-- SecondaryNameNode checkpoint directory -->
<property>
<name>fs.checkpoint.dir</name>
<value>/home/hadoopdata/checkpoint/dfs/cname</value>
</property>
<!-- NameNode web UI address -->
<property>
<name>dfs.http.address</name>
<value>hadoop01:50070</value>
</property>
<!-- SecondaryNameNode web UI address -->
<property>
<name>dfs.secondary.http.address</name>
<value>hadoop01:50090</value>
</property>
<!-- Whether WebHDFS (HTTP access to HDFS) is enabled -->
<property>
<name>dfs.webhdfs.enabled</name>
<value>false</value>
</property>
<!-- Whether HDFS permission (ACL) checking is enabled -->
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
</configuration>
vi ./etc/hadoop/mapred-site.xml
<configuration>
<!-- Framework that runs MapReduce jobs -->
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<final>true</final>
</property>
<!-- Job history server RPC address -->
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop01:10020</value>
</property>
<!-- Job history server web UI address -->
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop01:19888</value>
</property>
</configuration>
vi ./etc/hadoop/yarn-site.xml
<configuration>
<!-- Hostname of the node that runs the ResourceManager -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop01</value>
</property>
<!-- Shuffle service provided by each NodeManager -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<!-- ResourceManager RPC address -->
<property>
<name>yarn.resourcemanager.address</name>
<value>hadoop01:8032</value>
</property>
<!-- ResourceManager scheduler RPC address -->
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>hadoop01:8030</value>
</property>
<!-- ResourceManager resource-tracker RPC address -->
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>hadoop01:8031</value>
</property>
<!-- ResourceManager admin RPC address -->
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>hadoop01:8033</value>
</property>
<!-- ResourceManager web UI address -->
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>hadoop01:8088</value>
</property>
</configuration>
vi ./etc/hadoop/slaves
hadoop01
hadoop02
hadoop03
Distribute Hadoop to the other servers:
scp -r ../hadoop-2.7.3/ hadoop02:/usr/local
scp -r ../hadoop-2.7.3/ hadoop03:/usr/local
vi /etc/hosts    # update the hostname mappings
192.168.61.131 hadoop01 www.hadoop01.com
192.168.61.128 hadoop02 www.hadoop02.com
192.168.61.130 hadoop03 www.hadoop03.com
# Before the first start, format the filesystem on the NameNode host (needed only once)
hadoop namenode -format
# Start the NameNode, DataNode, ResourceManager, and NodeManager daemons
Start everything: start-all.sh
Per module:
start-dfs.sh
start-yarn.sh
Per daemon:
hadoop-daemon.sh start/stop namenode
hadoop-daemons.sh start/stop datanode
yarn-daemon.sh start/stop resourcemanager
yarn-daemons.sh start/stop nodemanager
mr-jobhistory-daemon.sh start/stop historyserver
Test:
1. Check with jps that the daemons running on each host match the plan
2. Check that each module's web UI monitoring page comes up
http://192.168.61.131:8088
3. Upload and download a file (tests HDFS), then run a MapReduce job
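A minimal version of step 3 might look like the following; the file names are arbitrary examples, and the example jar location assumes the standard hadoop-2.7.3 layout. This needs the running cluster, so it is a sketch rather than something runnable elsewhere:

```shell
# Exercise HDFS: upload a file and read it back
hdfs dfs -mkdir -p /test
hdfs dfs -put /etc/hosts /test/hosts
hdfs dfs -cat /test/hosts

# Exercise YARN + MapReduce: run the bundled wordcount example
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar \
    wordcount /test/hosts /test/wc-out
hdfs dfs -cat /test/wc-out/part-r-00000
```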
Passwordless login (run ssh-keygen first if no key pair exists):
ssh-copy-id hadoop02
ssh-copy-id hadoop03