Big Data Operations and Maintenance: MapReduce

MapReduce exercises:

  1. In the cluster nodes, the directory /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ contains an example JAR package, hadoop-mapreduce-examples.jar. Run the PI program in this JAR to compute an approximation of π, using 5 map tasks with 5 samples (throws) per map. (The transcript below was captured on an HDP 2.6.1.0-129 environment, hence the slightly different path; a note on the result follows the transcript.)
    [root@master ~]# su hdfs
    [hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
    [hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar pi 5 5
    Number of Maps = 5
    Samples per Map = 5
    Wrote input for Map #0
    Wrote input for Map #1
    Wrote input for Map #2
    Wrote input for Map #3
    Wrote input for Map #4
    Starting Job
    19/05/03 16:08:42 INFO client.RMProxy: Connecting to ResourceManager at slaver1.hadoop/10.0.0.104:8050
    19/05/03 16:08:42 INFO client.AHSProxy: Connecting to Application History server at slaver1.hadoop/10.0.0.104:10200
    19/05/03 16:08:42 INFO input.FileInputFormat: Total input paths to process : 5
    19/05/03 16:08:42 INFO mapreduce.JobSubmitter: number of splits:5
    19/05/03 16:08:43 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1556738815524_0004
    19/05/03 16:08:43 INFO impl.YarnClientImpl: Submitted application application_1556738815524_0004
    19/05/03 16:08:43 INFO mapreduce.Job: The url to track the job: http://slaver1.hadoop:8088/proxy/application_1556738815524_0004/
    19/05/03 16:08:43 INFO mapreduce.Job: Running job: job_1556738815524_0004
    19/05/03 16:08:50 INFO mapreduce.Job: Job job_1556738815524_0004 running in uber mode : false
    19/05/03 16:08:50 INFO mapreduce.Job: map 0% reduce 0%
    19/05/03 16:08:57 INFO mapreduce.Job: map 20% reduce 0%
    19/05/03 16:08:58 INFO mapreduce.Job: map 40% reduce 0%
    19/05/03 16:09:01 INFO mapreduce.Job: map 60% reduce 0%
    19/05/03 16:09:04 INFO mapreduce.Job: map 80% reduce 0%
    19/05/03 16:09:05 INFO mapreduce.Job: map 100% reduce 0%
    19/05/03 16:09:09 INFO mapreduce.Job: map 100% reduce 100%
    19/05/03 16:09:10 INFO mapreduce.Job: Job job_1556738815524_0004 completed successfully
    19/05/03 16:09:10 INFO mapreduce.Job: Counters: 49
    File System Counters
    FILE: Number of bytes read=116
    FILE: Number of bytes written=886989
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=1340
    HDFS: Number of bytes written=215
    HDFS: Number of read operations=23
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=3
    Job Counters
    Launched map tasks=5
    Launched reduce tasks=1
    Data-local map tasks=5
    Total time spent by all maps in occupied slots (ms)=44726
    Total time spent by all reduces in occupied slots (ms)=7838
    Total time spent by all map tasks (ms)=22363
    Total time spent by all reduce tasks (ms)=3919
    Total vcore-milliseconds taken by all map tasks=22363
    Total vcore-milliseconds taken by all reduce tasks=3919
    Total megabyte-milliseconds taken by all map tasks=34349568
    Total megabyte-milliseconds taken by all reduce tasks=8026112
    Map-Reduce Framework
    Map input records=5
    Map output records=10
    Map output bytes=90
    Map output materialized bytes=140
    Input split bytes=750
    Combine input records=0
    Combine output records=0
    Reduce input groups=2
    Reduce shuffle bytes=140
    Reduce input records=10
    Reduce output records=0
    Spilled Records=20
    Shuffled Maps =5
    Failed Shuffles=0
    Merged Map outputs=5
    GC time elapsed (ms)=400
    CPU time spent (ms)=5840
    Physical memory (bytes) snapshot=5756882944
    Virtual memory (bytes) snapshot=19876769792
    Total committed heap usage (bytes)=5479333888
    Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
    File Input Format Counters
    Bytes Read=590
    File Output Format Counters
    Bytes Written=97
    Job Finished in 28.341 seconds
    Estimated value of Pi is 3.68000000000000000000
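    The PI example uses a (quasi-)Monte Carlo method: each of the 5 map tasks generates 5 sample points in the unit square and counts how many fall inside the inscribed circle, and the driver estimates π as 4 × (points inside) / (total points). The result 3.68 corresponds to 23 of the 25 points landing inside (4 × 23/25 = 3.68); with so few samples the estimate is coarse, and re-running with more maps and samples per map tightens it, for example (parameter values here are only illustrative):
    [hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar pi 10 1000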

  2. In the cluster nodes, the directory /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ contains an example JAR package, hadoop-mapreduce-examples.jar. Run the wordcount program in this JAR to count the words in the file /1daoyun/file/BigDataSkills.txt, write the results to the /1daoyun/output directory, and query the word-count results with the appropriate command. (A note on re-running the job follows the output listing.)
    [root@master ~]# su hdfs
    [hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
    [hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar wordcount /1daoyun/file/BigDataSkills.txt /1daoyun/output
    19/05/03 16:13:07 INFO client.RMProxy: Connecting to ResourceManager at slaver1.hadoop/10.0.0.104:8050
    19/05/03 16:13:07 INFO client.AHSProxy: Connecting to Application History server at slaver1.hadoop/10.0.0.104:10200
    19/05/03 16:13:08 INFO input.FileInputFormat: Total input paths to process : 1
    19/05/03 16:13:08 INFO mapreduce.JobSubmitter: number of splits:1
    19/05/03 16:13:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1556738815524_0005
    19/05/03 16:13:09 INFO impl.YarnClientImpl: Submitted application application_1556738815524_0005
    19/05/03 16:13:09 INFO mapreduce.Job: The url to track the job: http://slaver1.hadoop:8088/proxy/application_1556738815524_0005/
    19/05/03 16:13:09 INFO mapreduce.Job: Running job: job_1556738815524_0005
    19/05/03 16:13:17 INFO mapreduce.Job: Job job_1556738815524_0005 running in uber mode : false
    19/05/03 16:13:17 INFO mapreduce.Job: map 0% reduce 0%
    19/05/03 16:13:23 INFO mapreduce.Job: map 100% reduce 0%
    19/05/03 16:13:30 INFO mapreduce.Job: map 100% reduce 100%
    19/05/03 16:13:31 INFO mapreduce.Job: Job job_1556738815524_0005 completed successfully
    19/05/03 16:13:31 INFO mapreduce.Job: Counters: 49
    File System Counters
    FILE: Number of bytes read=158
    FILE: Number of bytes written=295257
    FILE: Number of read operations=0
    FILE: Number of large read operations=0
    FILE: Number of write operations=0
    HDFS: Number of bytes read=265
    HDFS: Number of bytes written=104
    HDFS: Number of read operations=6
    HDFS: Number of large read operations=0
    HDFS: Number of write operations=2
    Job Counters
    Launched map tasks=1
    Launched reduce tasks=1
    Data-local map tasks=1
    Total time spent by all maps in occupied slots (ms)=7322
    Total time spent by all reduces in occupied slots (ms)=10228
    Total time spent by all map tasks (ms)=3661
    Total time spent by all reduce tasks (ms)=5114
    Total vcore-milliseconds taken by all map tasks=3661
    Total vcore-milliseconds taken by all reduce tasks=5114
    Total megabyte-milliseconds taken by all map tasks=5623296
    Total megabyte-milliseconds taken by all reduce tasks=10473472
    Map-Reduce Framework
    Map input records=11
    Map output records=22
    Map output bytes=230
    Map output materialized bytes=158
    Input split bytes=121
    Combine input records=22
    Combine output records=12
    Reduce input groups=12
    Reduce shuffle bytes=158
    Reduce input records=12
    Reduce output records=12
    Spilled Records=24
    Shuffled Maps =1
    Failed Shuffles=0
    Merged Map outputs=1
    GC time elapsed (ms)=116
    CPU time spent (ms)=2220
    Physical memory (bytes) snapshot=1325559808
    Virtual memory (bytes) snapshot=6946390016
    Total committed heap usage (bytes)=1269301248
    Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=0
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
    File Input Format Counters
    Bytes Read=144
    File Output Format Counters
    Bytes Written=104
    [hdfs@master hadoop-mapreduce]$ hadoop fs -cat /1daoyun/output/part-r-00000
    docker 1
    elasticsearch 1
    flume 1
    hadoop 5
    hbase 1
    hive 3
    kafka 1
    redis 1
    solr 1
    spark 5
    sqoop 1
    storm 1
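    Note that FileOutputFormat requires the output directory not to exist beforehand, so re-running the job against the same /1daoyun/output path fails until the old output is removed. A minimal sketch of a re-run, using the same paths as above:
    [hdfs@master hadoop-mapreduce]$ hadoop fs -rm -r /1daoyun/output
    [hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar wordcount /1daoyun/file/BigDataSkills.txt /1daoyun/output
    [hdfs@master hadoop-mapreduce]$ hadoop fs -ls /1daoyun/output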

  3. In the cluster nodes, the directory /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ contains an example JAR package, hadoop-mapreduce-examples.jar. Run the sudoku program in this JAR to solve the Sudoku puzzle given in the exercise table (stored here as the local file /opt/puzzle1.dta); a note on the input format follows the transcript.
    [root@master ~]# su hdfs
    [hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
    [hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar sudoku /opt/puzzle1.dta
    Solving /opt/puzzle1.dta
    8 1 2 7 5 3 6 4 9
    9 4 3 6 8 2 1 7 5
    6 7 5 4 9 1 2 8 3
    1 5 4 2 3 7 8 9 6
    3 6 9 8 4 5 7 2 1
    2 8 7 1 6 9 5 3 4
    5 2 1 9 7 4 3 6 8
    4 3 8 5 2 6 9 1 7
    7 9 6 3 1 8 4 5 2

    Found 1 solutions
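    Unlike the pi and wordcount runs above, the sudoku example reads its puzzle from the local filesystem (here /opt/puzzle1.dta) and solves it in-process, which is why the transcript shows no YARN job-submission logs. As far as the standard Hadoop examples go, the puzzle file holds one line per row of the grid, with space-separated cells and '?' marking each unknown cell (for example, a row such as 8 ? ? 7 ? 3 ? 4 ?); the actual file can be inspected with:
    [hdfs@master hadoop-mapreduce]$ cat /opt/puzzle1.dta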

  4. In the cluster nodes, the directory /usr/hdp/2.4.3.0-227/hadoop-mapreduce/ contains an example JAR package, hadoop-mapreduce-examples.jar. Run the grep program in this JAR to count how many times "Hadoop" occurs in the file /1daoyun/file/BigDataSkills.txt, and query the result once the job has finished (see the sketch of the remaining steps after the command).
    [root@master ~]# su hdfs
    [hdfs@master ~]$ cd /usr/hdp/2.6.1.0-129/hadoop-mapreduce/
    [hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar grep /1daoyun/file/BigDataSkills.txt /1daoyun/output hadoop
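    The transcript stops at the command itself; below is a minimal sketch of the full sequence, assuming the same paths as above. The /1daoyun/output directory left by the wordcount run must be removed first (FileOutputFormat refuses to write into an existing directory), and the grep regex is case-sensitive, so the lowercase pattern "hadoop" matches the form counted by wordcount above:
    [hdfs@master hadoop-mapreduce]$ hadoop fs -rm -r /1daoyun/output
    [hdfs@master hadoop-mapreduce]$ hadoop jar hadoop-mapreduce-examples.jar grep /1daoyun/file/BigDataSkills.txt /1daoyun/output hadoop
    [hdfs@master hadoop-mapreduce]$ hadoop fs -cat /1daoyun/output/*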

Reposted from blog.csdn.net/mn525520/article/details/93775024