Setting Up a Local Hadoop Integration Environment

1. Required software

VMware 6, RedHat 5 (32-bit), JDK 6, Hadoop 1.2

2. Setup steps

2.1 Install the virtual machine

The installer may fail with this error:

setup has detected vmware software running on this machine

Fix:

Open Task Manager and end every VMware-related process, then run the installer again.

Source:

http://zhidao.baidu.com/question/206989601.html?fr=qrl&cid=89&index=1

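Virtual machine configuration:

Network: NAT

Disk: 30 GB

Memory: 512 MB

2.2 Install RedHat

Disable the firewall, create a user named hadoop, assign the static IP 192.168.153.128, and add the following entries to /etc/hosts: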

192.168.153.1 host
192.168.153.128 master
192.168.153.129 node0
192.168.153.130 node1
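
Before moving on, it can help to confirm that every cluster hostname really has an entry in the hosts file. The following is a minimal sketch, not part of the original setup; the helper name missing_hosts is hypothetical, and it only checks that a name appears in some column after the IP address.

```shell
#!/bin/sh
# Print every expected hostname that has no entry in the given hosts file.
# Usage: missing_hosts HOSTS_FILE NAME...
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Skip comment lines, then look for the name in any column after the IP.
        if ! awk -v n="$name" '
            /^[[:space:]]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }' "$hosts_file"; then
            echo "$name"
        fi
    done
}
```

Running missing_hosts /etc/hosts host master node0 node1 should print nothing when all four entries are present.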

2.3 Install JDK 6 and Hadoop 1.2 under /usr/local

Configure the environment variables by adding the following to /etc/profile, then run source /etc/profile:

# set JDK env
JAVA_HOME=/usr/local/jdk1.6.0_25
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
PATH=$JAVA_HOME/bin:$PATH

export JAVA_HOME PATH CLASSPATH

# set Hadoop env
HADOOP_HOME=/usr/local/hadoop-1.2.1
PATH=$PATH:$HADOOP_HOME/bin

export PATH
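
After sourcing the profile, a quick check that both variables point at real installations can catch a mistyped path early. This is only a sketch under the assumption that the JDK launcher lives at $JAVA_HOME/bin/java and the Hadoop launcher at $HADOOP_HOME/bin/hadoop; the function name check_env is hypothetical.

```shell
#!/bin/sh
# Report whether the JDK and Hadoop launchers exist where the variables point.
# Returns 0 only if both launchers are present and executable.
check_env() {
    rc=0
    for bin in "$JAVA_HOME/bin/java" "$HADOOP_HOME/bin/hadoop"; do
        if [ -x "$bin" ]; then
            echo "ok: $bin"
        else
            echo "missing: $bin"
            rc=1
        fi
    done
    return $rc
}
```

Call check_env in the same shell that sourced /etc/profile; if both lines say ok, java -version and hadoop version should work as well.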

2.4 Configure Hadoop ($HADOOP_HOME/conf/)

hadoop-env.sh

export JAVA_HOME=/usr/local/jdk1.6.0_25

masters (the host that runs the SecondaryNameNode)

node0

slaves (the hosts that run a DataNode and a TaskTracker)

node0
node1

core-site.xml

<configuration>
	<property>
	  <name>fs.default.name</name>
	  <value>hdfs://master:8280</value>
	  <final>true</final>
	</property>
</configuration>

hdfs-site.xml

<configuration>
	<property>
	  <name>dfs.name.dir</name>
	  <value>/home/hadoop/hdfs/name</value>
	  <final>true</final>
	</property>
	<property>
	  <name>dfs.data.dir</name>
	  <value>/home/hadoop/hdfs/data</value>
	  <final>true</final>
	</property>
	<property>
	  <name>dfs.checkpoint.dir</name>
	  <value>/home/hadoop/hdfs/checkpoint</value>
	  <final>true</final>
	</property>
	<property>
	  <name>dfs.permissions</name>
	  <value>false</value>
	</property>
	<property>
	  <name>dfs.replication</name>
	  <value>2</value>
	</property>
	<property>
	  <name>dfs.http.address</name>
	  <value>master:50070</value>
	</property>
	<property>
	  <name>dfs.secondary.http.address</name>
	  <value>node0:50090</value>
	</property>
</configuration>
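
Hadoop normally creates these local directories itself, but creating them up front as the hadoop user avoids ownership surprises. The sketch below is an optional addition, not a step from the original post; the function name make_hdfs_dirs is hypothetical, and the chmod reflects the default Hadoop 1.x expectation (dfs.datanode.data.dir.perm) that the data directory has mode 755.

```shell
#!/bin/sh
# Create the local HDFS directories named in hdfs-site.xml under a base directory.
# The base defaults to /home/hadoop/hdfs, matching the values in the configuration.
make_hdfs_dirs() {
    base=${1:-/home/hadoop/hdfs}
    for d in name data checkpoint; do
        mkdir -p "$base/$d" || return 1
    done
    # By default a Hadoop 1.x DataNode expects its data directory to be rwxr-xr-x.
    chmod 755 "$base/data"
}
```

Run make_hdfs_dirs with no argument on each machine as the hadoop user, or pass a different base directory to try it out somewhere harmless.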

mapred-site.xml

<configuration>
	<property>
	  <name>mapred.job.tracker</name>
	  <value>master:8021</value>
	</property>
	
	<property>
	  <name>mapred.local.dir</name>
	  <value>/home/hadoop/mapred/local</value>
	</property>
  	
	<property>
	  <name>mapred.system.dir</name>
	  <value>/home/hadoop/mapred/system</value>
	</property>
	
	<property>
	  <name>mapred.child.java.opts</name>
	  <value>-Xmx200m</value>
	</property>
	
	<property>
	  <name>mapred.tasktracker.map.tasks.maximum</name>
	  <value>2</value>
	</property>
  
	<property>
	  <name>mapred.tasktracker.reduce.tasks.maximum</name>
	  <value>2</value>
	</property>

</configuration>

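2.4 Clone the virtual machine twice as node0 and node1, and update each clone's network configuration

Command to shut down Linux before cloning: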

shutdown -h now
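
The post does not say which files to change on the clones. As a sketch, assuming the usual RHEL 5 layout (IPADDR in /etc/sysconfig/network-scripts/ifcfg-eth0 and HOSTNAME in /etc/sysconfig/network, with the interface named eth0), a small helper can set one KEY=VALUE entry at a time; the name set_kv is hypothetical.

```shell
#!/bin/sh
# Set KEY=VALUE in a sysconfig-style file: replace an existing KEY line,
# or append a new one if the key is not present yet.
# Usage: set_kv FILE KEY VALUE
set_kv() {
    file=$1
    key=$2
    value=$3
    if grep -q "^$key=" "$file"; then
        sed "s|^$key=.*|$key=$value|" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
    else
        echo "$key=$value" >> "$file"
    fi
}
```

On node0, for example, root could run set_kv /etc/sysconfig/network-scripts/ifcfg-eth0 IPADDR 192.168.153.129 and set_kv /etc/sysconfig/network HOSTNAME node0, then restart networking; node1 would use 192.168.153.130 and node1.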

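2.5 Configure passwordless SSH login

-- Log in to master and create the .ssh directory

mkdir -p ~/.ssh
cd ~/.ssh

-- Generate the SSH public/private key pair; press Enter at each prompt to leave the passphrase empty

ssh-keygen -t rsa

-- Append the public key to authorized_keys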

cat id_rsa.pub >> authorized_keys

-- Copy the public key file to the slaves

scp ~/.ssh/authorized_keys hadoop@node0:/home/hadoop/
scp ~/.ssh/authorized_keys hadoop@node1:/home/hadoop/

-- On each slave, move the public key into place

mkdir -p ~/.ssh
cat ~/authorized_keys >> ~/.ssh/authorized_keys
rm ~/authorized_keys

-- Set permissions

chmod 755 ~
chmod 755 ~/.ssh
chmod 644 ~/.ssh/authorized_keys
chmod 644 ~/.ssh/id_rsa.pub
chmod 600 ~/.ssh/id_rsa

-- Test

ssh node0
ssh node1
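
The manual test above opens an interactive session on each node. A non-interactive variant, sketched here as an optional addition, uses ssh's BatchMode option so that a missing key shows up as a failure instead of a password prompt. The function name check_passwordless and the SSH override variable are hypothetical.

```shell
#!/bin/sh
# Return 0 only if every host accepts key-based login without prompting.
# SSH can be overridden (for example for a dry run); it defaults to the real ssh client.
check_passwordless() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of asking for a password.
        if ${SSH:-ssh} -o BatchMode=yes -o ConnectTimeout=5 "$host" true </dev/null; then
            echo "ok: $host"
        else
            echo "FAILED: $host"
            failed=1
        fi
    done
    return $failed
}
```

Run check_passwordless node0 node1 on master as the hadoop user; start-all.sh relies on these logins working without a prompt.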

2.6 Initialize HDFS

hadoop namenode -format

2.7 Start Hadoop

start-all.sh

-- Verify

jps

-- Expected result

master

3439 NameNode
3679 Jps
3591 JobTracker

node0

3475 TaskTracker
3391 SecondaryNameNode
3322 DataNode
3530 Jps

node1

3422 Jps
3369 TaskTracker
3293 DataNode
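
Comparing the jps listings by eye gets tedious across three machines. As a sketch, the helper below reads jps output on standard input and prints any expected daemon that is missing; the name missing_daemons is hypothetical. It compares the second column exactly, so SecondaryNameNode is not mistaken for NameNode.

```shell
#!/bin/sh
# Read `jps` output on stdin and print each expected daemon that is not listed.
# Usage: jps | missing_daemons DAEMON...
missing_daemons() {
    jps_output=$(cat)
    for daemon in "$@"; do
        # jps prints "<pid> <MainClass>"; compare the second column exactly.
        if ! printf '%s\n' "$jps_output" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

On master, jps | missing_daemons NameNode JobTracker should print nothing; on node1 the expected daemons are DataNode and TaskTracker, and node0 additionally runs SecondaryNameNode.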

2.8 Run benchmarks

-- Test HDFS with TestDFSIO

hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -write -nrFiles 2 -fileSize 10
hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -read -nrFiles 2 -fileSize 10
hadoop jar $HADOOP_HOME/hadoop-test-*.jar TestDFSIO -clean

-- Test MapReduce with the Sort program

hadoop jar $HADOOP_HOME/hadoop-examples-*.jar randomwriter -Dtest.randomwriter.maps_per_host=1 -Dtest.randomwrite.bytes_per_map=1048576 random-data
hadoop jar $HADOOP_HOME/hadoop-examples-*.jar sort random-data sorted-data
hadoop jar $HADOOP_HOME/hadoop-test-*.jar testmapredsort -sortInput random-data -sortOutput sorted-data

3. Access Hadoop over HTTP

Map/Reduce http://master:50030/

HDFS http://master:50070/
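
If no browser is handy, the same two addresses can be probed from a shell. The sketch below assumes curl is installed (the original post does not mention it); the function name ui_status and the CURL override variable are hypothetical. It prints the HTTP status code for each URL, where 000 means no connection could be made.

```shell
#!/bin/sh
# Print the HTTP status code returned by each URL (000 means no connection).
# CURL can be overridden for a dry run; it defaults to the real curl binary.
ui_status() {
    for url in "$@"; do
        code=$(${CURL:-curl} -s -o /dev/null -w '%{http_code}' --max-time 5 "$url")
        echo "$url $code"
    done
}
```

For example, ui_status http://master:50030/ http://master:50070/ prints one line per address; any 2xx or 3xx code suggests that the daemon's web server is up.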

4. Install the Eclipse plugin

-- Download eclipse hadoop plugin 1.2.1 (the plugin version must match the Hadoop version)

-- Install the plugin

-- Configure

-- Browse HDFS

Reposted from siyuan-zhu.iteye.com/blog/2021397