Building and Using a Hadoop Big Data Platform (1): Installing Hadoop

Preface:

Typical Hadoop application scenarios:

data analysis platforms;

recommendation systems;

the underlying storage layer of business systems;

business monitoring systems.

Real-world applications: e-commerce, energy exploration, energy saving, online travel, fraud detection, image processing, IT security, and so on.

In my third year, my school offered a course on building and using Hadoop; the textbook was printed in March 2020. As networks keep developing, the setup process and applications will keep changing too, so I am writing down my learning process here as a record.

Tools and environment for this part:

VMware virtual machine, Ubuntu, the JDK package (), the Hadoop package ()

Helper tools: Xshell 6 and Xftp (see my other blog posts for how to use these two tools)

Preparing the environment:

CPU: enable CPU virtualization for the virtual machine

Network: enable NAT mode

Host network: configure the host's network adapter to match NAT mode

The following walkthrough uses an offline installation.

Enable CPU virtualization:

Enable the network adapter:

Configure the network adapter:

Configure the NAT network inside the virtual machine:

Use Xshell and Xftp to connect to the server (this makes the later steps easier).

Use the tools to create directories on the server and upload the files (the HBase tarball is parked under /home for now; it will be covered later during the HBase configuration).

Building Hadoop requires a configured JDK environment first:

Unpack the JDK tarball uploaded earlier, leaving it in the same directory:

root@user01:/home# cd jdk/

root@user01:/home/jdk# ls

jdk-8u171-linux-x64.tar.gz

root@user01:/home/jdk# tar -xzvf jdk-8u171-linux-x64.tar.gz

Configure the JDK environment variables:

export JAVA_HOME=/home/jdk/jdk1.8.0_171

export CLASSPATH=$JAVA_HOME/lib/

export PATH=$JAVA_HOME/bin:$PATH

export PATH JAVA_HOME CLASSPATH
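If you prefer not to edit with vi, the same lines can be appended non-interactively with a here-document (a sketch; $HOME/.bashrc is /root/.bashrc when logged in as root):

```shell
# Append the JDK exports to the shell's startup file in one step.
# The quoted 'EOF' keeps $JAVA_HOME and $PATH literal in the file.
cat >> "$HOME/.bashrc" <<'EOF'
export JAVA_HOME=/home/jdk/jdk1.8.0_171
export CLASSPATH=$JAVA_HOME/lib/
export PATH=$JAVA_HOME/bin:$PATH
EOF
grep JAVA_HOME "$HOME/.bashrc"   # show the lines that were just added
```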

root@user01:/home/jdk# vi /root/.bashrc

(append the variables above to the end of this file)

Apply the environment variables:

root@user01:/home/jdk# source /root/.bashrc

Check that the JDK is installed correctly:

java -version    # print the Java version

echo $JAVA_HOME    # verify the variable's value
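Beyond eyeballing the version banner, a small sketch can confirm that $JAVA_HOME actually points at a directory containing bin/java (the path below assumes the unpack location used above):

```shell
# Check that the JAVA_HOME directory really contains an executable java binary.
export JAVA_HOME=/home/jdk/jdk1.8.0_171    # path from the steps above
if [ -x "$JAVA_HOME/bin/java" ]; then
    echo "JDK ok at $JAVA_HOME"
else
    echo "no bin/java under $JAVA_HOME - re-check the unpack path"
fi
```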

That completes the preliminaries. Next, install Hadoop itself:


root@user01:/home# cd hadoop/

root@user01:/home/hadoop# ls

hadoop-2.7.3.tar.gz

root@user01:/home/hadoop# tar -xzvf hadoop-2.7.3.tar.gz

Configure the Hadoop environment variables:

root@user01:/home# vi /root/.bashrc

Append the following lines, which put Hadoop on the search path, at the end of the file:

export HADOOP_HOME=/home/hadoop/hadoop-2.7.3

export CLASSPATH=$CLASSPATH:$HADOOP_HOME/lib/

export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
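Taken together, the JDK and Hadoop additions to /root/.bashrc amount to the following (a sketch combining the two fragments above; the last line is just a self-check):

```shell
# Combined .bashrc additions once both packages are unpacked.
export JAVA_HOME=/home/jdk/jdk1.8.0_171
export HADOOP_HOME=/home/hadoop/hadoop-2.7.3
export CLASSPATH=$CLASSPATH:$JAVA_HOME/lib/:$HADOOP_HOME/lib/
export PATH=$JAVA_HOME/bin:$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Count the Hadoop entries now on PATH; bin and sbin make two.
echo "$PATH" | tr ':' '\n' | grep -c hadoop-2.7.3
```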

Apply the variables:

root@user01:/home# source /root/.bashrc

Check that the installation works:

root@user01:/home# hadoop version

It should print roughly the following:

Hadoop 2.7.3

Subversion https://git-wip-us.apache.org/repos/asf/hadoop.git -r baa91f7c6bc9cb92be5982de4719c1c8af91ccff

Compiled by root on 2016-08-18T01:41Z

Compiled with protoc 2.5.0

From source with checksum 2e4ce5f957ea4db193bce3734ff29ff4

This command was run using /home/hadoop/hadoop-2.7.3/share/hadoop/common/hadoop-common-2.7.3.jar

Pseudo-distributed mode configuration:

Edit Hadoop's configuration files.

1. Edit /home/hadoop/hadoop-2.7.3/etc/hadoop/hadoop-env.sh

vi hadoop-env.sh

export JAVA_HOME=/home/jdk/jdk1.8.0_171    # the line to set

In the file, delete the highlighted default value on the JAVA_HOME line and paste in the JDK installation path in its place.

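The same hadoop-env.sh edit can be scripted with sed. Below is a rehearsal on a stand-in file under /tmp so it is safe to run anywhere; on the server you would point f at $HADOOP_HOME/etc/hadoop/hadoop-env.sh instead:

```shell
# Rehearsal on a stand-in file; on the real server set
# f=$HADOOP_HOME/etc/hadoop/hadoop-env.sh instead.
f=/tmp/hadoop-env.sh
echo 'export JAVA_HOME=${JAVA_HOME}' > "$f"   # mimics the line Hadoop ships
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/home/jdk/jdk1.8.0_171|' "$f"
grep '^export JAVA_HOME' "$f"
# prints: export JAVA_HOME=/home/jdk/jdk1.8.0_171
```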

2. Edit /home/hadoop/hadoop-2.7.3/etc/hadoop/core-site.xml and add the following:

vi core-site.xml

<property>

  <name>fs.defaultFS</name>

  <value>hdfs://localhost:9000</value>

</property>

<property>

  <name>hadoop.tmp.dir</name>

  <value>file:/home/hadoop/tmp</value>

  <description>Abase for other temporary directories.</description>

</property>

<property>

  <name>dfs.permissions</name>

  <value>false</value>

</property>

These properties go inside the file's <configuration> element: fs.defaultFS sets the HDFS address, hadoop.tmp.dir sets the working directory, and dfs.permissions=false turns off permission checking for convenience.
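Assembled, a minimal core-site.xml for this setup would look like the following (the same properties as above, wrapped in the file's <configuration> element):

```xml
<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/home/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
</configuration>
```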

3. Edit /home/hadoop/hadoop-2.7.3/etc/hadoop/hdfs-site.xml, again inside <configuration>:

vi hdfs-site.xml

<property>

  <name>dfs.replication</name>

  <value>1</value>

</property>

<property>

  <name>dfs.namenode.name.dir</name>

  <value>file:/home/hadoop/tmp/dfs/name</value>

</property>

<property>

  <name>dfs.datanode.data.dir</name>

  <value>file:/home/hadoop/tmp/dfs/data</value>

</property>

4. Format the NameNode

hadoop namenode -format

Check whether it succeeded:

"successfully formatted" together with "Exiting with status 0" means success; "Exiting with status 1" means the format hit an error.
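In a script, success can also be detected from the command's exit status, which mirrors the "Exiting with status N" line (a sketch; hdfs is the non-deprecated spelling of the same command and is assumed to be on PATH):

```shell
# The format command exits 0 on success, non-zero otherwise, so a
# script can branch on it instead of grepping the log output.
if hdfs namenode -format -nonInteractive >/dev/null 2>&1; then
    result="format ok"
else
    result="format failed (or hdfs not on PATH)"
fi
echo "$result"
```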

5. Manually start the NameNode, DataNode, and SecondaryNameNode

start-dfs.sh

root@user01:/home/hadoop/hadoop-2.7.3/etc/hadoop# start-dfs.sh

Starting namenodes on [localhost]

The authenticity of host 'localhost (127.0.0.1)' can't be established.

ECDSA key fingerprint is SHA256:No+kRFk4mIW/DdRFxPw7Y1ylSLKji1k3lzWBcklqDmA.

Are you sure you want to continue connecting (yes/no)? ys

Please type 'yes' or 'no': yes

localhost: Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.

root@localhost's password:

localhost: starting namenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-root-namenode-user01.out

root@localhost's password:

localhost: starting datanode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-root-datanode-user01.out

Starting secondary namenodes [0.0.0.0]

The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.

ECDSA key fingerprint is SHA256:No+kRFk4mIW/DdRFxPw7Y1ylSLKji1k3lzWBcklqDmA.

Are you sure you want to continue connecting (yes/no)? yes

0.0.0.0: Warning: Permanently added '0.0.0.0' (ECDSA) to the list of known hosts.

[email protected]'s password:

0.0.0.0: starting secondarynamenode, logging to /home/hadoop/hadoop-2.7.3/logs/hadoop-root-secondarynamenode-user01.out
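All those password prompts disappear if root has a passphraseless SSH key authorized for localhost. A sketch of the usual one-time setup on the server:

```shell
# Generate a passphraseless key (if none exists) and authorize it for
# logins to this same machine, so start-dfs.sh can ssh without a password.
mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
[ -f "$HOME/.ssh/id_rsa" ] || ssh-keygen -t rsa -N '' -f "$HOME/.ssh/id_rsa" -q
cat "$HOME/.ssh/id_rsa.pub" >> "$HOME/.ssh/authorized_keys"
chmod 600 "$HOME/.ssh/authorized_keys"
```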

6. Manually start the YARN daemons

start-yarn.sh

7. Run jps to check the started processes; after both scripts you should see NameNode, DataNode, SecondaryNameNode, ResourceManager, and NodeManager (plus Jps itself).

8. Open http://localhost:50070 in a browser to test the HDFS web UI.
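Without a browser, the same check can be done with curl by asking the NameNode web UI for an HTTP status code (a sketch; 200 means the UI is up, 50070 is the default NameNode web port in Hadoop 2.x):

```shell
# Probe the NameNode web UI headlessly; curl writes out only the HTTP
# status code (000 if the daemon is unreachable).
code=$(curl -s -o /dev/null -w '%{http_code}' http://localhost:50070/ || true)
echo "NameNode UI returned HTTP $code"
```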

Summary:

The whole installation may look tedious. I set it up by exploring on my own after the teacher demonstrated it once. When I first started learning Hadoop I did not really understand its structure or principles, so I asked the teacher what Hadoop can do, and he said "you will find out later"; maybe it is just not one of our major courses, haha.

I also searched Baidu and read blogs by more experienced people. One of them compared Hadoop to chopping and stir-frying vegetables, and that kind of analogy makes it much easier to understand.

I will keep learning humbly as a newbie.


Reposted from blog.csdn.net/qq_43575090/article/details/108798742