Part 2: Hadoop 2.x Pseudo-Distributed Cluster Installation

Environment:
CentOS 6.6 / JDK 1.8
Hadoop 2.6.0-CDH5.6.0: http://archive.cloudera.com/cdh5/cdh/5/
1. Basic environment preparation (JDK setup is omitted here)
1.1 Change the hostname
#vim /etc/sysconfig/network


NETWORKING=yes
HOSTNAME=hadoop-vm
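
To apply the new hostname without waiting for the reboot in step 1.5 (this is the standard command on CentOS 6):

#set the hostname for the current session
hostname hadoop-vm
#confirm it took effect
hostname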

1.2 Set a static IP
Method 1: through the Linux GUI
In the Linux desktop -> right-click the network icon in the top-right corner -> click Edit connections -> select the current network System eth0 -> click the Edit button -> select IPv4 -> set Method to Manual -> click Add -> enter IP 192.168.1.230, netmask 255.255.255.0, gateway 192.168.1.1 -> Apply
Method 2: edit the configuration file directly (the command-line way)
vim /etc/sysconfig/network-scripts/ifcfg-eth0
DEVICE="eth0"
BOOTPROTO="static" ###
HWADDR="00:0C:29:3C:BF:E7"
IPV6INIT="yes"
NM_CONTROLLED="yes"
ONBOOT="yes"
TYPE="Ethernet"
UUID="ce22eeca-ecde-4536-8cc2-ef0dc36d4a8c"
IPADDR="192.168.1.230" ###
NETMASK="255.255.255.0" ###
GATEWAY="192.168.1.1" ###
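
After saving the file, restart the network service so the static IP takes effect (standard on CentOS 6):

#restart networking to apply the new settings
service network restart
#verify the new address
ifconfig eth0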
1.3 Map the hostname to the IP address

#vim /etc/hosts
192.168.1.230  hadoop-vm

(Keep the existing 127.0.0.1 localhost entry as it is; pointing localhost at a non-loopback address can break local services.)
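A quick check that the mapping works:

#should resolve to 192.168.1.230 and get replies
ping -c 3 hadoop-vm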

1.4 Turn off the firewall
#check the firewall status
service iptables status
#stop the firewall
service iptables stop
#check whether the firewall starts on boot
chkconfig iptables --list
#disable the firewall on boot
chkconfig iptables off

1.5 Create a hadoop user and reboot Linux
reboot
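
A minimal sketch of creating that user (everything later in this guide assumes it; passwordless SSH to this host is also needed so start-dfs.sh can launch the daemons without prompting for a password):

#as root: create the hadoop user and set its password
useradd hadoop
passwd hadoop
#as the hadoop user: set up passwordless SSH to this host
ssh-keygen -t rsa
ssh-copy-id hadoop@hadoop-vm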

2. Install Hadoop 2.6.0 (CDH release)
First upload the Hadoop tarball to /home/hadoop/ on the server.
Note: in Hadoop 2.x the configuration files live under $HADOOP_HOME/etc/hadoop.
A pseudo-distributed setup requires changing five configuration files.
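
Unpacking the tarball, assuming it is named hadoop-2.6.0-cdh5.6.0.tar.gz (the actual file name may differ depending on the download):

#extract under /home/hadoop/bigdata, the install path used throughout this guide
mkdir -p /home/hadoop/bigdata
tar -zxvf hadoop-2.6.0-cdh5.6.0.tar.gz -C /home/hadoop/bigdata/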
2.1 Configure Hadoop

File 1: hadoop-env.sh, located in the
/home/hadoop/bigdata/hadoop-2.6.0-cdh5.6.0/etc/hadoop directory

vim hadoop-env.sh
#around line 27
# The java implementation to use.
export JAVA_HOME=/usr/java/jdk1.8.0_112
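
Double-check that this path matches the JDK actually installed on the machine (jdk1.8.0_112 is the version used in this setup; adjust to yours):

#the directory should exist and contain bin/java
ls /usr/java/jdk1.8.0_112/bin/java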

File 2: core-site.xml
Specifies the filesystem scheme (URI) that Hadoop uses, i.e. the address of the HDFS master (the NameNode).

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
   <property>
     <name>fs.defaultFS</name>
     <value>hdfs://hadoop-vm:9000</value>
   </property>
   <property>
     <name>hadoop.tmp.dir</name>
     <value>/home/hadoop/bigdata/hadoop-2.6.0-cdh5.6.0/hdfs/tmp</value>
   </property>
</configuration>

File 3: hdfs-site.xml (see hdfs-default.xml for the defaults)
Sets the HDFS replication factor (1, since a pseudo-distributed cluster has only one DataNode) and the directories where the data is stored.

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/home/hadoop/bigdata/hadoop-2.6.0-cdh5.6.0/hdfs/name</value>
    </property>
    <property>
       <name>dfs.datanode.data.dir</name>
       <value>/home/hadoop/bigdata/hadoop-2.6.0-cdh5.6.0/hdfs/data</value>
    </property>
    <property>
      <name>dfs.permissions.enabled</name>
      <value>false</value>
    </property>
</configuration>
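
The tmp, name, and data directories referenced above can be created up front (Hadoop will also create most of them on demand):

#create the HDFS working directories as the hadoop user
mkdir -p /home/hadoop/bigdata/hadoop-2.6.0-cdh5.6.0/hdfs/{tmp,name,data}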

File 4: mapred-site.xml (created by renaming the shipped template)

#mv mapred-site.xml.template mapred-site.xml
#vim mapred-site.xml

Tells MapReduce to run on YARN.

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
</configuration>

File 5: yarn-site.xml
Specifies the address of the YARN master (the ResourceManager) and enables the shuffle auxiliary service.

<?xml version="1.0"?>
<configuration>
   <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>hadoop-vm</value>
   </property>
   <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
   </property>

</configuration>

Once these five files are configured, the basic setup is complete.
3. Add Hadoop to the environment variables

vim /etc/profile
export HADOOP_HOME=/home/hadoop/bigdata/hadoop-2.6.0-cdh5.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Save the file, then reload it so the change takes effect:
source /etc/profile
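
A quick sanity check that the PATH change works:

#should print the Hadoop version banner (2.6.0-cdh5.6.0)
hadoop version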

4. Format the NameNode (i.e. initialize it)

hdfs namenode -format   (the older hadoop namenode -format still works but is deprecated)

Look for a "successfully formatted" message near the end of the output.

5. Start Hadoop

Start HDFS first (from the Hadoop installation directory):
$ sbin/start-dfs.sh
Then start YARN:
$ sbin/start-yarn.sh

Alternatively, start everything at once with start-all.sh (deprecated, but fine for a single-node setup).

Check the running Java processes; seeing the following processes means startup succeeded:

[hadoop@hadoop-vm sbin]$ jps
3376 NameNode
3840 ResourceManager
3954 NodeManager
4293 Jps
3686 SecondaryNameNode
3484 DataNode

Verify that startup succeeded by opening the HDFS web UI in a browser:
http://192.168.1.230:50070
(The YARN ResourceManager web UI is served on port 8088 by default: http://192.168.1.230:8088)

6. Test that HDFS works

$ hadoop fs -put data_sharding.pdf hdfs://hadoop-vm:9000/
(uploads data_sharding.pdf to the HDFS root directory)
$ hadoop fs -get hdfs://hadoop-vm:9000/data_sharding.pdf
(downloads data_sharding.pdf back from HDFS)
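Listing the root directory is another quick check that the upload landed:

$ hadoop fs -ls /
(data_sharding.pdf should appear in the listing)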
Test that MapReduce works:
cd /home/hadoop/bigdata/hadoop-2.6.0-cdh5.6.0/share/hadoop/mapreduce
$ hadoop jar hadoop-mapreduce-examples-2.6.0.jar pi 5 5
(a bundled example program that estimates pi; the jar in the CDH build may carry a -cdh5.6.0 suffix, so match the actual file name in this directory)

Note: all of the above was performed as the hadoop user created earlier.

Reposted from blog.csdn.net/zyshappy/article/details/74135917