Chapter 2: Hadoop Quick Start

2.5 A Simple WordCount Application

Hadoop's "Hello World" program.

2.5.1 Creating an HDFS Directory

The hdfs command lives in Hadoop's bin directory; you can create a directory with the hdfs dfs -mkdir command.

[root@node1 hadoop-2.7.3]# bin/hdfs dfs -mkdir -p input

Directories created this way land under /user/{username}/ by default, where {username} is the current user, so the input directory should end up under /user/root/.
You can list HDFS files with the `hdfs dfs -ls` command:

[root@node1 hadoop-2.7.3]# bin/hdfs dfs -ls /

[Screenshot: output of hdfs dfs -ls /]
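The same operation can be done from Java through the HDFS FileSystem API. Below is a minimal sketch, not the method used in this chapter; the NameNode URI hdfs://node1:9000 and the user name root are assumptions and should match fs.defaultFS in your core-site.xml.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class MkdirExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Assumed NameNode URI and user; adjust to your cluster.
    FileSystem fs = FileSystem.get(URI.create("hdfs://node1:9000"), conf, "root");
    // A relative path resolves against /user/<username>, just like the CLI.
    boolean created = fs.mkdirs(new Path("input"));
    System.out.println("created: " + created);
    fs.close();
  }
}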

2.5.2 Uploading a File to HDFS

Create a text file on the local filesystem:

[root@node1 hadoop-2.7.3]# vi /root/words.txt

Type a few words, then save and exit.
[Screenshot: contents of words.txt]

Upload the local file /root/words.txt to HDFS:

bin/hdfs dfs -put /root/words.txt input
bin/hdfs dfs -ls input

[Screenshot: hdfs dfs -ls input listing words.txt]
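Uploads can likewise be scripted against the FileSystem API. A minimal sketch, under the same assumed NameNode URI and user as above:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class UploadExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("hdfs://node1:9000"), conf, "root");
    // Equivalent to: bin/hdfs dfs -put /root/words.txt input
    fs.copyFromLocalFile(new Path("/root/words.txt"), new Path("input"));
    // Equivalent to: bin/hdfs dfs -ls input
    for (FileStatus st : fs.listStatus(new Path("input"))) {
      System.out.println(st.getPath() + "  " + st.getLen() + " bytes");
    }
    fs.close();
  }
}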

2.5.3 Running WordCount

Run the following command:
bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount input output

[root@node1 hadoop-2.7.3]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount input output
17/05/12 09:04:39 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
17/05/12 09:04:41 INFO input.FileInputFormat: Total input paths to process : 1
17/05/12 09:04:41 INFO mapreduce.JobSubmitter: number of splits:1
17/05/12 09:04:42 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1494590593576_0001
17/05/12 09:04:43 INFO impl.YarnClientImpl: Submitted application application_1494590593576_0001
17/05/12 09:04:43 INFO mapreduce.Job: The url to track the job: http://node1:8088/proxy/application_1494590593576_0001/
17/05/12 09:04:43 INFO mapreduce.Job: Running job: job_1494590593576_0001
17/05/12 09:05:08 INFO mapreduce.Job: Job job_1494590593576_0001 running in uber mode : false
17/05/12 09:05:08 INFO mapreduce.Job:  map 0% reduce 0%
17/05/12 09:05:19 INFO mapreduce.Job:  map 100% reduce 0%
17/05/12 09:05:31 INFO mapreduce.Job:  map 100% reduce 100%
17/05/12 09:05:32 INFO mapreduce.Job: Job job_1494590593576_0001 completed successfully
17/05/12 09:05:32 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=54
        FILE: Number of bytes written=237325
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=163
        HDFS: Number of bytes written=32
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=8861
        Total time spent by all reduces in occupied slots (ms)=8430
        Total time spent by all map tasks (ms)=8861
        Total time spent by all reduce tasks (ms)=8430
        Total vcore-milliseconds taken by all map tasks=8861
        Total vcore-milliseconds taken by all reduce tasks=8430
        Total megabyte-milliseconds taken by all map tasks=9073664
        Total megabyte-milliseconds taken by all reduce tasks=8632320
    Map-Reduce Framework
        Map input records=3
        Map output records=9
        Map output bytes=91
        Map output materialized bytes=54
        Input split bytes=108
        Combine input records=9
        Combine output records=4
        Reduce input groups=4
        Reduce shuffle bytes=54
        Reduce input records=4
        Reduce output records=4
        Spilled Records=8
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=249
        CPU time spent (ms)=2950
        Physical memory (bytes) snapshot=303017984
        Virtual memory (bytes) snapshot=4157116416
        Total committed heap usage (bytes)=165810176
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=55
    File Output Format Counters
        Bytes Written=32
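The wordcount program inside the examples jar is the classic WordCount from the Hadoop MapReduce tutorial: the mapper emits (word, 1) for each token, a combiner pre-aggregates on the map side, and the reducer sums the counts. A self-contained sketch, essentially equivalent to the bundled example:

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in each input line.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private final static IntWritable one = new IntWritable(1);
    private final Text word = new Text();

    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, one);
      }
    }
  }

  // Reducer: sums the counts for each word; also reused as the combiner.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // map-side pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));    // e.g. input
    FileOutputFormat.setOutputPath(job, new Path(args[1]));  // e.g. output
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The counters above line up with this structure: Map output records=9 is the nine (word, 1) pairs, and Combine output records=4 shows the combiner collapsing them to one partial sum per distinct word before the shuffle, which is why Reduce input records=4.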

2.5.4 Viewing the Results

bin/hdfs dfs -ls output
bin/hdfs dfs -cat output/part-r-00000

[root@node1 hadoop-2.7.3]# bin/hdfs dfs -ls output/
Found 2 items
-rw-r--r--   1 root supergroup          0 2017-05-12 09:05 output/_SUCCESS
-rw-r--r--   1 root supergroup         32 2017-05-12 09:05 output/part-r-00000
[root@node1 hadoop-2.7.3]# bin/hdfs dfs -cat output/part-r-00000
Hadoop  3
Hello   2
Java    2
World   2
[root@node1 hadoop-2.7.3]#
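If you'd rather read the result programmatically, the output file can be opened through the same FileSystem API. A minimal sketch, under the same assumed NameNode URI and user as before:

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CatOutput {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(URI.create("hdfs://node1:9000"), conf, "root");
    // Equivalent to: bin/hdfs dfs -cat output/part-r-00000
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(fs.open(new Path("output/part-r-00000"))))) {
      String line;
      while ((line = in.readLine()) != null) {
        System.out.println(line); // e.g. "Hadoop<TAB>3"
      }
    }
    fs.close();
  }
}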
Copyright notice: this is an original post by the blogger; reposting is welcome. https://blog.csdn.net/chengyuqiang/article/details/71773515
