Setting up Hadoop on Linux

The Linux environment

1. Configure /etc/hosts (is this step strictly required?)

http://hi.baidu.com/2enjoy/blog/item/28e4e721a24d62419922ed75.html

Note: these machines get dynamic IPs, so the hostname-to-IP mappings must be kept up to date.

To append one hosts file onto another:

cat ./a >> ./b
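A minimal sketch of the entries this cluster would need (IPs and hostnames taken from the sessions below; shown against a scratch file — on the real machines this goes into /etc/hosts as root, and with dynamic IPs it must be re-checked after every lease change):

```shell
HOSTS_FILE=/tmp/demo_hosts          # use /etc/hosts (as root) on the real machines
cat >> "$HOSTS_FILE" <<'EOF'
192.168.123.24  DevStation24
192.168.123.61  TestStation61
EOF
# verify the entry landed
grep DevStation24 "$HOSTS_FILE"
```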

3. Set up passwordless SSH login

Passwordless login from the namenode to itself:

[djboss@DevStation24 hdtest]$ pwd
/home/djboss/hdtest

[djboss@DevStation24 hdtest]$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Your identification has been saved in /home/djboss/.ssh/id_dsa.
Your public key has been saved in /home/djboss/.ssh/id_dsa.pub.
The key fingerprint is:
9e:1d:39:87:dc:7f:e4:31:8d:df:82:ff:7a:fb:83:ab djboss@DevStation24

[djboss@DevStation24 .ssh]$ pwd
/home/djboss/.ssh
[djboss@DevStation24 .ssh]$ ls -a
.  ..  id_dsa  id_dsa.pub  known_hosts

[djboss@DevStation24 hdtest]$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
[djboss@DevStation24 .ssh]$ ls -a
.  ..  authorized_keys  id_dsa  id_dsa.pub  known_hosts



[djboss@DevStation24 .ssh]$ ssh localhost
The authenticity of host 'localhost (192.168.123.24)' can't be established.
RSA key fingerprint is f5:ba:aa:82:fd:e2:cb:34:03:9b:4d:69:bf:66:3e:a9.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (RSA) to the list of known hosts.
Last login: Thu May 17 13:43:49 2012 from 172.16.10.24

[djboss@DevStation24 ~]$ ssh localhost
Last login: Thu May 17 15:29:15 2012 from devstation24
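If the login above still prompted for a password, the usual cause is that sshd rejects keys when ~/.ssh or authorized_keys is group/world-writable. A quick permission-fix sketch, demonstrated on a scratch directory rather than the real ~/.ssh:

```shell
# sshd (with StrictModes, the default) ignores authorized_keys unless
# the directory is 700 and the file is 600.
fix_ssh_perms() {
  chmod 700 "$1"
  chmod 600 "$1/authorized_keys"
}

# demo against a scratch directory
mkdir -p /tmp/demo_ssh && touch /tmp/demo_ssh/authorized_keys
fix_ssh_perms /tmp/demo_ssh
stat -c '%a' /tmp/demo_ssh /tmp/demo_ssh/authorized_keys
```

On the real machines, run `fix_ssh_perms ~/.ssh` on every host that receives a key.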


Passwordless login from the namenode to the datanode:

[djboss@DevStation24 ~]$ ssh 192.168.123.61
Last login: Thu May 17 15:43:12 2012 from teststation61
[djboss@TestStation61 ~]$ 

The datanode cannot yet log in to the namenode this way.

Set environment variables

Edit ~/.bash_profile as user djboss, then run source ~/.bash_profile to make the changes take effect:

$vi ~/.bash_profile

export HADOOP_HOME=/home/djboss/hdtest/hadoop-1.0.2
export PATH=$PATH:$ANT_HOME/bin:$HADOOP_HOME/bin

$source ~/.bash_profile
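A quick sanity check that the PATH change took effect (same export lines as above, minus the $ANT_HOME entry, which is defined elsewhere in the profile):

```shell
export HADOOP_HOME=/home/djboss/hdtest/hadoop-1.0.2   # path from this setup
export PATH=$PATH:$HADOOP_HOME/bin

# look for the hadoop bin dir as a complete PATH component
case ":$PATH:" in
  *":$HADOOP_HOME/bin:"*) result="hadoop bin on PATH" ;;
  *)                      result="PATH not updated"   ;;
esac
echo "$result"
```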
 

Configuration file 1: core-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
    <name>hadoop.tmp.dir</name>
    <value>/home/djboss/hdtest/tmp/</value>
</property>
<property>
   <name>fs.default.name</name>
   <value>hdfs://192.168.123.24:54310/</value>
</property>
<property>
  <name>dfs.block.size</name>
  <value>5120000</value>
  <description>The default block size for new files.</description>
</property>
</configuration>

Configuration file 2: hdfs-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
   <name>dfs.replication</name>
   <value>1</value>
</property>
</configuration>

Configuration file 3: mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
   <name>mapred.job.tracker</name>
   <value>hdfs://192.168.123.24:54311/</value>
</property>
</configuration>
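One hedged observation on the file above: in the Hadoop 1.x configuration docs, mapred.job.tracker is given as a bare host:port pair, not an hdfs:// URI, so the following form of the same property may be safer and is worth trying if job submission misbehaves:

```xml
<property>
   <name>mapred.job.tracker</name>
   <value>192.168.123.24:54311</value>
</property>
```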

Format the namenode:

$hadoop namenode -format

Reference: http://dikar.iteye.com/blog/941877

Warning: $HADOOP_HOME is deprecated.

12/05/18 13:09:58 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = DevStation24/192.168.123.24
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
12/05/18 13:09:59 INFO util.GSet: VM type       = 32-bit
12/05/18 13:09:59 INFO util.GSet: 2% max memory = 19.84625 MB
12/05/18 13:09:59 INFO util.GSet: capacity      = 2^22 = 4194304 entries
12/05/18 13:09:59 INFO util.GSet: recommended=4194304, actual=4194304
12/05/18 13:09:59 INFO namenode.FSNamesystem: fsOwner=djboss
12/05/18 13:09:59 INFO namenode.FSNamesystem: supergroup=supergroup
12/05/18 13:09:59 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/05/18 13:09:59 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/05/18 13:09:59 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/05/18 13:09:59 INFO namenode.NameNode: Caching file names occuring more than 10 times 
12/05/18 13:10:00 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/05/18 13:10:00 INFO common.Storage: Storage directory /home/djboss/hdtest/tmp/dfs/name has been successfully formatted.
12/05/18 13:10:00 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at DevStation24/192.168.123.24
************************************************************/

$start-all.sh

[djboss@DevStation24 bin]$ ./start-all.sh
Warning: $HADOOP_HOME is deprecated.

starting namenode, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-namenode-DevStation24.out
192.168.123.61: starting datanode, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-datanode-TestStation61.out
192.168.123.24: starting secondarynamenode, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-secondarynamenode-DevStation24.out
starting jobtracker, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-jobtracker-DevStation24.out
192.168.123.61: starting tasktracker, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-tasktracker-TestStation61.out

View the NameNode: http://192.168.123.24:50070/dfshealth.jsp

Map/Reduce Administration: http://192.168.123.24:50030/jobtracker.jsp

Run jps on the namenode:

$ jps

7296 NameNode
30756 Main
7650 Jps
7473 SecondaryNameNode

Run jps on the datanode:

[djboss@TestStation61 logs]$ jps
6367 Jps 

$./hadoop dfsadmin -report

Warning: $HADOOP_HOME is deprecated.

Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: .?%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

---------------- Problem with the datanode!! ----------------

1. Every command above prints "HADOOP_HOME is deprecated". Could the lack of available datanodes be related to this warning? (In Hadoop 1.0.x the warning itself is cosmetic and can be silenced with HADOOP_HOME_WARN_SUPPRESS=true in hadoop-env.sh, so it is unlikely to be the cause.)

Check the datanode's log directory:

[djboss@TestStation61 logs]$ pwd
/home/djboss/hdtest/hadoop-1.0.2/logs
[djboss@TestStation61 logs]$ ll
total 12
-rw-r--r--  1 djboss dev 3620 May 18 13:31 hadoop-djboss-datanode-TestStation61.log
-rw-r--r--  1 djboss dev    0 May 18 13:31 hadoop-djboss-datanode-TestStation61.out
-rw-r--r--  1 djboss dev  629 May 18 13:22 hadoop-djboss-datanode-TestStation61.out.1
-rw-r--r--  1 djboss dev 3892 May 18 13:31 hadoop-djboss-tasktracker-TestStation61.log
-rw-r--r--  1 djboss dev    0 May 18 13:31 hadoop-djboss-tasktracker-TestStation61.out
-rw-r--r--  1 djboss dev    0 May 18 13:22 hadoop-djboss-tasktracker-TestStation61.out.1

View the log files on the datanode:

[djboss@TestStation61 logs]$ more hadoop-djboss-tasktracker-TestStation61.log
2012-05-18 13:22:21,463 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = TestStation61/192.168.123.61
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
2012-05-18 13:22:21,725 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-18 13:22:21,806 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-18 13:22:21,808 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-18 13:22:21,808 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
2012-05-18 13:22:22,555 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-18 13:22:22,684 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:162)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:128)
	at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2560)
	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1426)
	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)

2012-05-18 13:22:22,685 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at TestStation61/192.168.123.61
************************************************************/
2012-05-18 13:31:26,798 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = TestStation61/192.168.123.61
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
2012-05-18 13:31:27,059 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-18 13:31:27,139 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-18 13:31:27,141 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-18 13:31:27,141 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
2012-05-18 13:31:27,789 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-18 13:31:27,916 ERROR org.apache.hadoop.mapred.TaskTracker: Can not start task tracker because java.lang.IllegalArgumentException: Does not contain a valid host:port authority: local
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:162)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:128)
	at org.apache.hadoop.mapred.JobTracker.getAddress(JobTracker.java:2560)
	at org.apache.hadoop.mapred.TaskTracker.<init>(TaskTracker.java:1426)
	at org.apache.hadoop.mapred.TaskTracker.main(TaskTracker.java:3742)

2012-05-18 13:31:27,917 INFO org.apache.hadoop.mapred.TaskTracker: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down TaskTracker at TestStation61/192.168.123.61
************************************************************/
 
[djboss@TestStation61 logs]$ more hadoop-djboss-datanode-TestStation61.log
2012-05-18 13:22:16,990 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = TestStation61/192.168.123.61
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
2012-05-18 13:22:17,298 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-18 13:22:17,320 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-18 13:22:17,322 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-18 13:22:17,322 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-05-18 13:22:17,460 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-18 13:31:22,268 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = TestStation61/192.168.123.61
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.2
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0.2 -r 1304954; compiled by 'hortonfo' on Sat Mar 24 23:58:21 UTC 2012
************************************************************/
2012-05-18 13:31:22,491 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-18 13:31:22,509 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-18 13:31:22,511 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-18 13:31:22,511 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-05-18 13:31:22,635 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-18 13:31:22,849 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.lang.IllegalArgumentException: Does not contain a valid host:port authority: file:///
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:162)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:198)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:228)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:222)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:337)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
	at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)

2012-05-18 13:31:22,851 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at TestStation61/192.168.123.61
************************************************************/
 

!!!! Still broken !!!!

Both stack traces point at default configuration values: the tasktracker failed on mapred.job.tracker = "local" and the datanode on fs.default.name = "file:///", which are Hadoop's out-of-the-box defaults. In other words, the site configuration written on the master was never picked up on TestStation61; the datanode's conf/ directory needs the same core-site.xml, hdfs-site.xml, and mapred-site.xml.
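Since the slave errors show the default values ("local" and "file:///"), one fix worth trying is to copy the master's site configs to every slave. A sketch using the paths and user from this setup — it prints the command for inspection first; swap echo for eval to actually copy:

```shell
HADOOP_HOME=/home/djboss/hdtest/hadoop-1.0.2
SLAVES="192.168.123.61"                       # hosts from conf/slaves

for slave in $SLAVES; do
  # build the copy command; the glob stays literal until the remote shell runs it
  cmd="scp $HADOOP_HOME/conf/*-site.xml djboss@$slave:$HADOOP_HOME/conf/"
  echo "$cmd"        # inspect first; replace echo with eval "$cmd" to copy
done
```

After syncing, restart the cluster (stop-all.sh, then start-all.sh) so the slave daemons reread their configuration.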

Application: word counting

Copy the input files into HDFS, then run the bundled wordcount example:

./hadoop fs -copyFromLocal /home/djboss/hdtest/filespace/file* test-in

./hadoop jar ../hadoop-examples-1.0.2.jar wordcount test-in output

Problem encountered:

Warning: $HADOOP_HOME is deprecated.

****hdfs://192.168.123.24:54310/user/djboss/test-in
12/05/18 16:31:52 INFO input.FileInputFormat: Total input paths to process : 2
12/05/18 16:31:53 INFO mapred.JobClient: Cleaning up the staging area hdfs://192.168.123.24:54310/home/djboss/hdtest/tmp/mapred/staging/djboss/.staging/job_201205181619_0001
12/05/18 16:31:53 ERROR security.UserGroupInformation: PriviledgedActionException as:djboss cause:java.io.IOException: Call to /192.168.123.24:54311 failed on local exception: java.io.EOFException
java.io.IOException: Call to /192.168.123.24:54311 failed on local exception: java.io.EOFException
	at org.apache.hadoop.ipc.Client.wrapException(Client.java:1103)
	at org.apache.hadoop.ipc.Client.call(Client.java:1071)
	at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
	at org.apache.hadoop.mapred.$Proxy2.submitJob(Unknown Source)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:921)
	at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:850)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1093)
	at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:850)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:500)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:530)
	at org.apache.hadoop.examples.WordCount.main(WordCount.java:67)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
	at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
	at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:64)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.EOFException
	at java.io.DataInputStream.readInt(DataInputStream.java:375)
	at org.apache.hadoop.ipc.Client$Connection.receiveResponse(Client.java:800)
	at org.apache.hadoop.ipc.Client$Connection.run(Client.java:745)
 

Input and output paths:

[djboss@DevStation24 hadoop-1.0.2]$ ./bin/hadoop dfs -ls
Warning: $HADOOP_HOME is deprecated.

Found 1 items
drwxr-xr-x   - djboss supergroup          0 2012-05-18 13:51 /user/djboss/test-in
 

Question: does the master start the slaves?

start-all.sh

stop-all.sh
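To the question above: yes. start-all.sh (and stop-all.sh) run on the master and ssh into every host listed in conf/slaves to start or stop the remote daemons, which is why the passwordless SSH setup was needed. A minimal sketch of that loop, using this cluster's slave IP and a demo file in place of the real $HADOOP_HOME/conf/slaves:

```shell
# stand-in for $HADOOP_HOME/conf/slaves
cat > /tmp/demo_slaves <<'EOF'
192.168.123.61
EOF

# start-all.sh effectively does this for each slave (command shown, not run)
while read host; do
  echo "would run: ssh $host hadoop-daemon.sh start datanode"
done < /tmp/demo_slaves
```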

http://www.hadoopor.com/thread-71-1-1.html

http://www.hadoopor.com/viewthread.php?action=printable&tid=71

Problem description

Step 1: start Hadoop

$./start-all.sh

Step 2:

Experiment 2

Step 1: start HDFS alone (namenode and datanodes):

./start-dfs.sh

starting namenode, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-namenode-DevStation24.out
192.168.123.61: starting datanode, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-datanode-TestStation61.out
192.168.123.24: starting secondarynamenode, logging to /home/djboss/hdtest/hadoop-1.0.2/libexec/../logs/hadoop-djboss-secondarynamenode-DevStation24.out
 

http://www.infoq.com/cn/articles/hadoop-config-tip

Reposted from nemogu.iteye.com/blog/1533985