Basic environment and software:
Software version | Package |
---|---|
centos-6.4 | |
JDK-1.8 | jdk-8u191-linux-x64.tar.gz |
hadoop-2.6.0 | hadoop-2.6.0-cdh5.7.0.tar.gz |
Official download site for the packages: http://archive-primary.cloudera.com/cdh5/cdh/5/
Set up passwordless SSH login
1. Generate the local key pair; just press Enter at every prompt. By default, ssh-keygen places the keys in the /root/.ssh directory
# ssh-keygen -t rsa
2. Copy the public key to an authorized_keys file; after this, SSH connections to the local machine no longer require a password (if they still do, see the permissions note at the end of this section)
# cd /root/.ssh
# cp id_rsa.pub authorized_keys
3. Map the IP address to the hostname
# vi /etc/hosts
192.168.0.104 bigdata
4. Verify that the configuration succeeded
# ssh bigdata
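Note: if ssh bigdata still prompts for a password, the usual culprit on CentOS 6 is file permissions, since sshd refuses keys that are group- or world-accessible. A minimal fix, assuming the root account and paths used above:
# chmod 700 /root/.ssh
# chmod 600 /root/.ssh/authorized_keys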
Install JDK 1.8
1. Create an app directory in the virtual machine to hold the installed software
# cd /usr/local/
# mkdir app
2. Upload the downloaded jdk-8u191-linux-x64.tar.gz package to the VM's /usr/local/app/ directory
3. Extract the package
# tar -zxvf jdk-8u191-linux-x64.tar.gz
4. Configure the environment variables
# vi /etc/profile
#set java environment
export JAVA_HOME=/usr/local/app/jdk1.8.0_191
export JRE_HOME=$JAVA_HOME/jre
export PATH=$PATH:$JAVA_HOME/bin
export CLASSPATH=./:$JAVA_HOME/lib:$JAVA_HOME/jre/lib
5. Apply the environment variables
# source /etc/profile
6. Verify that the installation succeeded
# java -version
# java
# javac
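As an extra sanity check, confirm that the shell resolves Java from the new JDK. The expected output below assumes the paths above and no pre-installed system JDK; because $JAVA_HOME/bin was appended to PATH, an existing /usr/bin/java would otherwise take precedence:
# echo $JAVA_HOME
/usr/local/app/jdk1.8.0_191
# which java
/usr/local/app/jdk1.8.0_191/bin/java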
Install Hadoop
1. Upload the downloaded hadoop-2.6.0-cdh5.7.0.tar.gz to the VM's /usr/local/app directory. (http://archive.cloudera.com/cdh5/cdh/5/)
2. Extract the Hadoop package:
# tar -zxvf hadoop-2.6.0-cdh5.7.0.tar.gz
3. Rename the Hadoop directory:
# mv hadoop-2.6.0-cdh5.7.0 hadoop-2.6.0
4. Configure the Hadoop environment variables
# vi ~/.bashrc
#set hadoop environment
export HADOOP_HOME=/usr/local/app/hadoop-2.6.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# source ~/.bashrc
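With the variables loaded, the hadoop command should be on the PATH. A quick check (the first line of the output should match this CDH build):
# hadoop version
Hadoop 2.6.0-cdh5.7.0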
5. Create the /usr/local/app/data directory, then change into the /usr/local/app/hadoop-2.6.0/etc/hadoop directory
# mkdir /usr/local/app/data
# cd /usr/local/app/hadoop-2.6.0/etc/hadoop
6. Edit hadoop-env.sh
export JAVA_HOME=/usr/local/app/jdk1.8.0_191
export HADOOP_PID_DIR=/usr/local/app/data/tmp
7. Edit mapred-env.sh
export HADOOP_MAPRED_PID_DIR=/usr/local/app/data/tmp
8. Edit yarn-env.sh
export YARN_PID_DIR=/usr/local/app/data/tmp
9. Edit core-site.xml
<property>
<name>fs.default.name</name>
<value>hdfs://bigdata:9000</value>
</property>
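Note: fs.default.name still works but is a deprecated key in Hadoop 2.x; the current name is fs.defaultFS. An equivalent form of the same setting, if you prefer the newer key:
<property>
<name>fs.defaultFS</name>
<value>hdfs://bigdata:9000</value>
</property>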
10. Edit hdfs-site.xml
<property>
<name>dfs.name.dir</name>
<value>/usr/local/app/data/namenode</value>
</property>
<property>
<name>dfs.data.dir</name>
<value>/usr/local/app/data/datanode</value>
</property>
<property>
<name>dfs.tmp.dir</name>
<value>/usr/local/app/data/tmp</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
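Note: dfs.name.dir and dfs.data.dir are likewise old 1.x key names; in Hadoop 2.x they map to dfs.namenode.name.dir and dfs.datanode.data.dir, so either spelling works. dfs.tmp.dir does not appear to be a standard HDFS property (hadoop.tmp.dir in core-site.xml is the usual temp-directory setting), so that entry is most likely ignored. The same directories with the newer key names would look like this:
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/app/data/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/app/data/datanode</value>
</property>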
11. Edit mapred-site.xml
# cp mapred-site.xml.template mapred-site.xml
# vim mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
12. Edit yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>bigdata</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
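All configuration files are now in place. One way to confirm they are being picked up is hdfs getconf, which prints the effective value of a key (deprecated key names are resolved automatically):
# hdfs getconf -confKey fs.defaultFS
hdfs://bigdata:9000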
Start the HDFS cluster
1. Format the NameNode:
# hdfs namenode -format
2. Start the HDFS cluster:
# start-dfs.sh
3. Verify that the startup succeeded:
a. jps check: the NameNode, DataNode, and SecondaryNameNode processes should be present
b. Web UI check on port 50070: http://192.168.0.104:50070
c. Read/write test:
# vi hello.txt
hadoop hive spark
# hdfs dfs -mkdir /test
# hdfs dfs -put hello.txt /test/hello.txt
# hdfs dfs -text /test/hello.txt
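If the put succeeded, the -text command prints the file contents back (hadoop hive spark). A directory listing is another quick check:
# hdfs dfs -ls /test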
Start the YARN cluster
1. Start the YARN cluster:
# start-yarn.sh
2. Verify that the startup succeeded:
a. jps check: the ResourceManager and NodeManager processes should be present
b. Web UI check on port 8088: http://192.168.0.104:8088
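c. As a final smoke test, the bundled examples jar can run a small MapReduce job through YARN; the jar path below assumes the CDH tarball layout, so adjust it if the file name differs in your copy:
# hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.0-cdh5.7.0.jar pi 2 10
The job should show up on the port 8088 web UI and finish with an estimated value of pi.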
This completes the Hadoop pseudo-distributed cluster installation. If anything here is wrong, corrections are welcome.