Setting up a Hadoop development environment on Ubuntu

Environment:

Hostname           IP             Role                 Spec

Main steps

1. Edit /etc/hosts

cat /etc/hosts

Add each machine's hostname and its IP on both the master and the slaves.
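For illustration, the added entries might look like this (the hostnames and IPs below are placeholders; use your own, keeping them identical on every node):

```
192.168.0.105   master
192.168.0.106   slave1
```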

2. Passwordless SSH login to this machine and to the other machines
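A minimal sketch of the usual key setup (the `guxu@slave1` target is an assumption for illustration; repeat the last step for each slave):

```shell
# Generate a passphrase-less RSA key if none exists yet
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Authorize the key for passwordless login to this machine
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
# Push the key to each slave as well (hostname is a placeholder)
# ssh-copy-id guxu@slave1
```

Afterwards, `ssh localhost` (and `ssh slave1`) should log in without prompting for a password.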

3. JDK environment

http://nemogu.iteye.com/blog/1542361
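Besides installing the JDK, Hadoop needs JAVA_HOME set in conf/hadoop-env.sh; the path below is an assumption for a typical Ubuntu install of that era, so substitute your actual JDK location:

```
# conf/hadoop-env.sh  (path is an example, not a given)
export JAVA_HOME=/usr/lib/jvm/java-6-sun
```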

4. Install Hadoop

5. Configure Hadoop
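For Hadoop 1.x the key settings live in three files under conf/ (plus conf/masters and conf/slaves, which list the master and slave hostnames one per line). A minimal sketch; the host, ports, and paths are examples matching this setup, not mandatory values:

```xml
<!-- conf/core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.0.105:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/guxu/hadoop_tmp_dir</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

<!-- conf/mapred-site.xml -->
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.0.105:9001</value>
  </property>
</configuration>
```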

6. Run Hadoop

1) Start on the master

Run ./start-all.sh under bin.

If everything starts up correctly, jps should show output like the following.

On the master, jps:

7786 NameNode
9969 Jps
8102 SecondaryNameNode
8196 JobTracker
 

On the slave, jps:


6654 TaskTracker
6707 Jps
6463 DataNode

Problem:

On the master, jps shows:

13136 SecondaryNameNode
13227 JobTracker
13530 Jps

On the slave, jps shows:

6654 TaskTracker
6707 Jps
6463 DataNode


jps on the master shows no NameNode, so the NameNode probably failed to start. The NameNode log under logs/ reads:

2012-05-31 22:04:31,872 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = guxu-Lenovo-B460/127.0.1.1
STARTUP_MSG: args = []
STARTUP_MSG: version = 1.0.3
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012
************************************************************/
2012-05-31 22:04:32,034 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-31 22:04:32,046 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-31 22:04:32,048 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-31 22:04:32,048 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: NameNode metrics system started
2012-05-31 22:04:32,237 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-31 22:04:32,242 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-05-31 22:04:32,258 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2012-05-31 22:04:32,259 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source NameNode registered.
2012-05-31 22:04:32,298 INFO org.apache.hadoop.hdfs.util.GSet: VM type = 32-bit
2012-05-31 22:04:32,298 INFO org.apache.hadoop.hdfs.util.GSet: 2% max memory = 17.77875 MB
2012-05-31 22:04:32,298 INFO org.apache.hadoop.hdfs.util.GSet: capacity = 2^22 = 4194304 entries
2012-05-31 22:04:32,298 INFO org.apache.hadoop.hdfs.util.GSet: recommended=4194304, actual=4194304
2012-05-31 22:04:32,335 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: fsOwner=guxu
2012-05-31 22:04:32,335 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: supergroup=supergroup
2012-05-31 22:04:32,335 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isPermissionEnabled=true
2012-05-31 22:04:32,342 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: dfs.block.invalidate.limit=100
2012-05-31 22:04:32,342 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
2012-05-31 22:04:32,674 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Registered FSNamesystemStateMBean and NameNodeMXBean
2012-05-31 22:04:32,695 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: Caching file names occuring more than 10 times
2012-05-31 22:04:32,700 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory /home/guxu/hadoop_tmp_dir/dfs/name does not exist.
2012-05-31 22:04:32,702 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/guxu/hadoop_tmp_dir/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)
2012-05-31 22:04:32,704 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: org.apache.hadoop.hdfs.server.common.InconsistentFSStateException: Directory /home/guxu/hadoop_tmp_dir/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:303)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.loadFSImage(FSDirectory.java:100)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initialize(FSNamesystem.java:388)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:362)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:276)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:496)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1279)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1288)

2012-05-31 22:04:32,705 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at guxu-Lenovo-B460/127.0.1.1
************************************************************/

Note: when I first configured Hadoop, I set hadoop.tmp.dir in conf/core-site.xml to /home/guxu/hadoop_tmp_dir and then formatted the namenode.

I then changed hadoop.tmp.dir in conf/core-site.xml to /tmp/hadoop-guxu/dfs/name

and ran hadoop namenode -format again. (HDFS places the name directory at ${hadoop.tmp.dir}/dfs/name, which is why the formatted directory in the output below ends up at the nested path /tmp/hadoop-guxu/dfs/name/dfs/name.)

Output:

12/06/09 11:07:33 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = guxu-Lenovo-B460/127.0.1.1
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.3
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192; compiled by 'hortonfo' on Tue May 8 20:31:25 UTC 2012
************************************************************/
12/06/09 11:07:33 INFO util.GSet: VM type = 32-bit
12/06/09 11:07:33 INFO util.GSet: 2% max memory = 17.77875 MB
12/06/09 11:07:33 INFO util.GSet: capacity = 2^22 = 4194304 entries
12/06/09 11:07:33 INFO util.GSet: recommended=4194304, actual=4194304
12/06/09 11:07:33 INFO namenode.FSNamesystem: fsOwner=guxu
12/06/09 11:07:33 INFO namenode.FSNamesystem: supergroup=supergroup
12/06/09 11:07:33 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/06/09 11:07:33 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/06/09 11:07:33 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/06/09 11:07:33 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/06/09 11:07:34 INFO common.Storage: Image file of size 110 saved in 0 seconds.
12/06/09 11:07:34 INFO common.Storage: Storage directory /tmp/hadoop-guxu/dfs/name/dfs/name has been successfully formatted.
12/06/09 11:07:34 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at guxu-Lenovo-B460/127.0.1.1
************************************************************/
 

Problem

When running ./start-all.sh,

the NameNode log reports the following error:

2012-06-09 10:27:02,534 ERROR org.apache.hadoop.security.UserGroupInformation: PriviledgedActionException as:guxu cause:java.io.IOException: File /tmp/hadoop-guxu/dfs/name/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
2012-06-09 10:27:02,537 INFO org.apache.hadoop.ipc.Server: IPC Server handler 5 on 9000, call addBlock(/tmp/hadoop-guxu/dfs/name/mapred/system/jobtracker.info, DFSClient_1766397871, null) from 192.168.0.105:45917: error: java.io.IOException: File /tmp/hadoop-guxu/dfs/name/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
java.io.IOException: File /tmp/hadoop-guxu/dfs/name/mapred/system/jobtracker.info could only be replicated to 0 nodes, instead of 1
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1558)
at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:696)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:563)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1388)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1384)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:396)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1382)

Yet running jps on both master and slave afterwards showed every daemon running normally!

So what caused the exception above?

a. Disable the firewall on both master and slaves (on Ubuntu, e.g. sudo ufw disable)

b.

$ ./hadoop dfs -ls            # list the current user's HDFS path /user/guxu

$ ./hadoop dfs -ls xx_dir     # list the files under xx_dir

$ ./hadoop dfs -cat xx_file   # print a file's contents

$ ./hadoop dfs -mkdir newDir  # create a new directory in HDFS

Q: can the output directory be reused across runs? (A MapReduce job fails if its output directory already exists, so remove it first or use a fresh name each run.)

7. Testing hello world

// Create an input directory in HDFS

$ hadoop fs -mkdir hello_input

$ hadoop dfs -ls

Found 1 items
drwxr-xr-x   - guxu supergroup          0 2012-06-09 19:17 /user/guxu/hello_input

// Delete an existing HDFS directory (e.g. hello_input)

$ hadoop dfs -rmr /user/guxu/hello_input

Deleted hdfs://192.168.0.105:9000/user/guxu/hello_input

HDFS paths and local Linux paths are two separate namespaces: hadoop dfs -ls lists /user/guxu inside HDFS, while a plain ls lists the local directory.

// Copy file0* into HDFS

hadoop dfs -copyFromLocal /home/guxu/JavaLib/hadoop/test/hellowrold/file0* hello_input

// Run the wordcount example

$ hadoop jar hadoop-examples-1.0.3.jar wordcount hello_input hello_output

// View the job output

$ ./hadoop dfs -cat hello_output/part-*

Web UIs

NameNode: http://guxu-lenovo-b460:50070/dfshealth.jsp

MapReduce: http://guxu-lenovo-b460:50030/jobtracker.jsp

References

http://www.infoq.com/cn/articles/hadoop-config-tip


Reposted from nemogu.iteye.com/blog/1546076