A Brief Introduction to the MapReduce Data Processing Flow

Copyright notice: this is an original article by the blogger and may not be reproduced without the blogger's permission. https://blog.csdn.net/c13232906050/article/details/65632578

Preface

This article relies on how keys are sorted; for background, please see the previous article.

The MapReduce data processing flow

1. Mapper

map()

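Each time map() emits a record, the Partitioner's getPartition() method is called once. In the tests below every map() call emits exactly one record, so the two methods alternate until all of the Mapper's input has been processed. With a single Reducer the configured Partitioner is not invoked, which is why the one-Reducer logs below contain no partition lines.

After all input has gone through map() and getPartition(), the data in each partition is sorted once (see the previous article for how the sort is implemented; unless stated otherwise, the same comparator implementation is used for sorting throughout MapReduce). When sorting finishes, the data goes to the Combiner, if there is one.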
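The map-side steps described above can be sketched in plain Java, without Hadoop. The class PartitionSortSketch and its partitioning rule (first character of the key modulo the number of Reducers) are assumptions made for illustration: the rule was chosen only because it reproduces the partition assignments printed in the one-Mapper, two-Reducer log further down, and the real test may have used a different Partitioner.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class PartitionSortSketch {
    // Hypothetical partitioner (an assumption, not taken from the article):
    // first character of the key modulo the number of reduce tasks.
    static int getPartition(String key, int numReduceTasks) {
        return key.charAt(0) % numReduceTasks;
    }

    // One getPartition() call per emitted record, then one sort per
    // partition once all input has been consumed.
    static List<List<String>> mapSide(List<String> input, int numReduceTasks) {
        List<List<String>> partitions = new ArrayList<>();
        for (int i = 0; i < numReduceTasks; i++) {
            partitions.add(new ArrayList<>());
        }
        for (String key : input) {
            partitions.get(getPartition(key, numReduceTasks)).add(key);
        }
        for (List<String> partition : partitions) {
            Collections.sort(partition); // lexicographic order of the text keys
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList(
                "1", "2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "21", "123");
        // Prints [[2, 21, 4, 6, 8], [1, 10, 11, 123, 3, 5, 7, 9]]
        System.out.println(mapSide(input, 2));
    }
}
```

The two lists match the order in which keys reach the Combiner in the two-Reducer log: partition 0 first (2, 21, 4, 6, 8), then partition 1.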

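2. Combiner (optional)

The combiner also works on one partition at a time, i.e. each partition is combined separately.

First, adjacent keys are compared pairwise.

For example, for the input 1, 3, 5, 6, 9 the pairwise comparisons are (1,3), (3,5), (5,6), (6,9).

Each pairwise comparison adds one value to the values passed to reduce(). There are two cases:

  1. The comparison returns non-zero, i.e. the keys are judged unequal: only one value is added to the values of reduce(). That reduce() call then ends completely (values.iterator().hasNext() == false) and the next reduce() call begins.
  2. The comparison returns 0, i.e. the keys are judged equal: one value is added to the values of reduce(), and the framework waits for the result of the next pairwise comparison. If the next comparison returns unequal, the other value of the current equal pair is added to values and reduce() ends completely; if it returns equal, a value is added in the same way and the framework waits for the comparison after that. Only when a comparison returns unequal does reduce() end completely and the next reduce() call begin.

When all pairwise comparisons are done, the data goes to the Reducer.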
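The pairwise procedure above can be illustrated with a small stand-alone Java sketch. AdjacentGroupingSketch and its group() method are hypothetical names, and the sketch collects whole groups in lists rather than streaming values into reduce() as the framework does; it only shows which pairs are compared and where one reduce() call ends and the next begins.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class AdjacentGroupingSketch {
    // Splits an already sorted key list into groups by comparing each key
    // only with its immediate successor; a non-zero result closes the group.
    static <K> List<List<K>> group(List<K> sorted, Comparator<K> cmp,
                                   List<String> comparedPairs) {
        List<List<K>> groups = new ArrayList<>();
        List<K> current = new ArrayList<>();
        for (int i = 0; i < sorted.size(); i++) {
            current.add(sorted.get(i));
            boolean last = (i == sorted.size() - 1);
            if (!last) {
                comparedPairs.add("(" + sorted.get(i) + "," + sorted.get(i + 1) + ")");
            }
            if (last || cmp.compare(sorted.get(i), sorted.get(i + 1)) != 0) {
                groups.add(current); // this reduce() call is complete
                current = new ArrayList<>();
            }
        }
        return groups;
    }

    public static void main(String[] args) {
        Comparator<Integer> cmp = Integer::compare;

        List<String> pairs = new ArrayList<>();
        System.out.println(group(Arrays.asList(1, 3, 5, 6, 9), cmp, pairs));
        // [[1], [3], [5], [6], [9]]
        System.out.println(pairs);
        // [(1,3), (3,5), (5,6), (6,9)]

        // With equal neighbours, values accumulate until a comparison is non-zero.
        System.out.println(group(Arrays.asList(1, 3, 3, 3, 5), cmp, new ArrayList<>()));
        // [[1], [3, 3, 3], [5]]
    }
}
```

Note that n keys need only n - 1 comparisons, which is why the last key in the logs below is combined without a preceding comparator line.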

3. Reducer

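Sorting (when the Reducer has more than one input)

If the current Reducer has only one input source, this sorting step is not triggered; it runs only when there are several input sources, and it uses the same comparison algorithm.

This is again a pairwise comparison. Suppose there are two input sources, a and b, and let a(n) denote the n-th element of a:
Because a and b have each been sorted already, comparing a(1) with b(1) is guaranteed to fix the position of one of them in the overall order;
if the comparison fixes the position of a(1), then b(1) is compared with a(2) next.
This continues until everything has been sorted.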
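The comparison scheme just described is an ordinary two-way merge of sorted inputs. Below is a minimal Java sketch of it; MergeSketch and the sample keys are made up for illustration, and Hadoop's own merger, which handles any number of inputs, is not reproduced here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MergeSketch {
    // Merges two individually sorted inputs. Each comparison of the two
    // current heads fixes the final position of exactly one element.
    static <K> List<K> merge(List<K> a, List<K> b, Comparator<K> cmp) {
        List<K> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            if (cmp.compare(a.get(i), b.get(j)) <= 0) {
                out.add(a.get(i++)); // a's head is placed; compare b's head with the next a
            } else {
                out.add(b.get(j++)); // b's head is placed; compare a's head with the next b
            }
        }
        // One input is exhausted, so the rest needs no further comparisons.
        while (i < a.size()) {
            out.add(a.get(i++));
        }
        while (j < b.size()) {
            out.add(b.get(j++));
        }
        return out;
    }

    public static void main(String[] args) {
        Comparator<String> cmp = Comparator.naturalOrder();
        List<String> a = Arrays.asList("1", "10", "11");
        List<String> b = Arrays.asList("1", "111", "2");
        System.out.println(merge(a, b, cmp)); // [1, 1, 10, 11, 111, 2]
    }
}
```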

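Grouping (optional, similar to the Combiner's pairwise comparison)

Grouping works the same way as the process in the Combiner (though the comparison algorithm is not necessarily the same).

(The description of the Combiner process is repeated below.)

For example, for the input 1, 3, 5, 6, 9 the pairwise comparisons are (1,3), (3,5), (5,6), (6,9).

Each pairwise comparison adds one value to the values passed to reduce(). There are two cases:

  1. The comparison returns non-zero, i.e. the keys are judged unequal: only one value is added to the values of reduce(). That reduce() call then ends completely (values.iterator().hasNext() == false) and the next reduce() call begins.
  2. The comparison returns 0, i.e. the keys are judged equal: one value is added to the values of reduce(), and the framework waits for the result of the next pairwise comparison. If the next comparison returns unequal, the other value of the current equal pair is added to values and reduce() ends completely; if it returns equal, a value is added in the same way and the framework waits for the comparison after that. Only when a comparison returns unequal does reduce() end completely and the next reduce() call begin.

The difference is that the comparator used for grouping has to be specified by calling the following method: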

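import java.io.IOException;
import org.apache.hadoop.mapreduce.Job;

void someMethod() throws IOException {
    Job job = Job.getInstance();

    ...

    // Signature: setGroupingComparatorClass(Class<? extends RawComparator> cls).
    // MyGrouping stands for your own RawComparator implementation
    // (the name is taken from the comparator seen in the logs below).
    job.setGroupingComparatorClass(MyGrouping.class);
}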

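The comparator passed in is of the same kind as the one used in the previous article and is implemented the same way, so it is not explained again here.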
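The article does not show the grouping comparator used in the tests. Judging from the logs below, where [1,10] and [11,123] end up in the same reduce() call but [123,2] does not, it appears to treat keys that share their first character as equal. The following plain-Java sketch works under that assumption. FirstByteGroupingSketch is a hypothetical class; its compare() method only borrows the parameter list of RawComparator.compare(byte[], int, int, byte[], int, int), and it assumes non-empty keys whose bytes are stored without any length prefix.

```java
import java.nio.charset.StandardCharsets;

class FirstByteGroupingSketch {
    // Each key is a slice (start, length) of a byte buffer, as in
    // RawComparator.compare(). Only the first byte of each slice is compared.
    // Assumption: the slice holds the raw key characters, no length prefix.
    static int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        return Byte.compare(b1[s1], b2[s2]);
    }

    // Convenience wrapper for trying the comparator on strings.
    static int compareKeys(String k1, String k2) {
        byte[] b1 = k1.getBytes(StandardCharsets.UTF_8);
        byte[] b2 = k2.getBytes(StandardCharsets.UTF_8);
        return compare(b1, 0, b1.length, b2, 0, b2.length);
    }

    public static void main(String[] args) {
        System.out.println(compareKeys("1", "10") == 0);   // true: same group
        System.out.println(compareKeys("11", "123") == 0); // true: same group
        System.out.println(compareKeys("123", "2") == 0);  // false: new group starts
    }
}
```

A comparator like this only decides where groups begin and end. The order in which keys arrive is still determined by the sort comparator.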

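The actual reduce process

Two cases:

  1. Single input source

Repeated: one grouping comparison -> one value added to values
Last time: one grouping comparison -> two values added to values

  2. Multiple input sources

First time: two sorting comparisons -> one grouping comparison -> one value added to values
Repeated: one sorting comparison -> one grouping comparison -> one value added to values
Last time: one grouping comparison -> two values added to values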

Simple, concrete test results

The same input text file

1
2
3
4
5
6
7
8
9
10
11
21
123

One Mapper, one Reducer

Map output

Thread[main,5,main] map: 1, cnt=1
Thread[main,5,main] map: 2, cnt=2
Thread[main,5,main] map: 3, cnt=3
Thread[main,5,main] map: 4, cnt=4
Thread[main,5,main] map: 5, cnt=5
Thread[main,5,main] map: 6, cnt=6
Thread[main,5,main] map: 7, cnt=7
Thread[main,5,main] map: 8, cnt=8
Thread[main,5,main] map: 9, cnt=9
Thread[main,5,main] map: 10, cnt=10
Thread[main,5,main] map: 11, cnt=11
Thread[main,5,main] map: 21, cnt=12
Thread[main,5,main] map: 123, cnt=13
Thread[main,5,main] MyRawComparator: [7,123], cnt=14
Thread[main,5,main] MyRawComparator: [123,1], cnt=15
Thread[main,5,main] MyRawComparator: [7,123], cnt=16
Thread[main,5,main] MyRawComparator: [21,123], cnt=17
Thread[main,5,main] MyRawComparator: [123,7], cnt=18
Thread[main,5,main] MyRawComparator: [123,2], cnt=19
Thread[main,5,main] MyRawComparator: [123,3], cnt=20
Thread[main,5,main] MyRawComparator: [123,4], cnt=21
Thread[main,5,main] MyRawComparator: [123,5], cnt=22
Thread[main,5,main] MyRawComparator: [123,6], cnt=23
Thread[main,5,main] MyRawComparator: [123,1], cnt=24
Thread[main,5,main] MyRawComparator: [11,123], cnt=25
Thread[main,5,main] MyRawComparator: [10,123], cnt=26
Thread[main,5,main] MyRawComparator: [9,123], cnt=27
Thread[main,5,main] MyRawComparator: [123,8], cnt=28
Thread[main,5,main] MyRawComparator: [10,1], cnt=29
Thread[main,5,main] MyRawComparator: [10,11], cnt=30
Thread[main,5,main] MyRawComparator: [9,8], cnt=31
Thread[main,5,main] MyRawComparator: [9,21], cnt=32
Thread[main,5,main] MyRawComparator: [8,21], cnt=33
Thread[main,5,main] MyRawComparator: [9,6], cnt=34
Thread[main,5,main] MyRawComparator: [8,6], cnt=35
Thread[main,5,main] MyRawComparator: [21,6], cnt=36
Thread[main,5,main] MyRawComparator: [9,5], cnt=37
Thread[main,5,main] MyRawComparator: [8,5], cnt=38
Thread[main,5,main] MyRawComparator: [6,5], cnt=39
Thread[main,5,main] MyRawComparator: [21,5], cnt=40
Thread[main,5,main] MyRawComparator: [9,4], cnt=41
Thread[main,5,main] MyRawComparator: [8,4], cnt=42
Thread[main,5,main] MyRawComparator: [6,4], cnt=43
Thread[main,5,main] MyRawComparator: [5,4], cnt=44
Thread[main,5,main] MyRawComparator: [21,4], cnt=45
Thread[main,5,main] MyRawComparator: [9,3], cnt=46
Thread[main,5,main] MyRawComparator: [8,3], cnt=47
Thread[main,5,main] MyRawComparator: [6,3], cnt=48
Thread[main,5,main] MyRawComparator: [5,3], cnt=49
Thread[main,5,main] MyRawComparator: [4,3], cnt=50
Thread[main,5,main] MyRawComparator: [21,3], cnt=51
Thread[main,5,main] MyRawComparator: [9,2], cnt=52
Thread[main,5,main] MyRawComparator: [8,2], cnt=53
Thread[main,5,main] MyRawComparator: [6,2], cnt=54
Thread[main,5,main] MyRawComparator: [5,2], cnt=55
Thread[main,5,main] MyRawComparator: [4,2], cnt=56
Thread[main,5,main] MyRawComparator: [3,2], cnt=57
Thread[main,5,main] MyRawComparator: [21,2], cnt=58
Thread[main,5,main] MyRawComparator: [9,7], cnt=59
Thread[main,5,main] MyRawComparator: [8,7], cnt=60
Thread[main,5,main] MyRawComparator: [6,7], cnt=61
Thread[main,5,main] MyRawComparator: [1,10], cnt=62

(Note: without a Combiner, the output below does not appear.)
Thread[main,5,main] ------ combine start, key=1, cnt=63 --------
Thread[main,5,main] combine: 1, cnt=64
Thread[main,5,main] ------ combine end, key=1, cnt=65 --------

Thread[main,5,main] MyRawComparator: [10,11], cnt=66
Thread[main,5,main] ------ combine start, key=10, cnt=67 --------
Thread[main,5,main] combine: 10, cnt=68
Thread[main,5,main] ------ combine end, key=10, cnt=69 --------

Thread[main,5,main] MyRawComparator: [11,123], cnt=70
Thread[main,5,main] ------ combine start, key=11, cnt=71 --------
Thread[main,5,main] combine: 11, cnt=72
Thread[main,5,main] ------ combine end, key=11, cnt=73 --------

Thread[main,5,main] MyRawComparator: [123,2], cnt=74
Thread[main,5,main] ------ combine start, key=123, cnt=75 --------
Thread[main,5,main] combine: 123, cnt=76
Thread[main,5,main] ------ combine end, key=123, cnt=77 --------

Thread[main,5,main] MyRawComparator: [2,21], cnt=78
Thread[main,5,main] ------ combine start, key=2, cnt=79 --------
Thread[main,5,main] combine: 2, cnt=80
Thread[main,5,main] ------ combine end, key=2, cnt=81 --------

Thread[main,5,main] MyRawComparator: [21,3], cnt=82
Thread[main,5,main] ------ combine start, key=21, cnt=83 --------
Thread[main,5,main] combine: 21, cnt=84
Thread[main,5,main] ------ combine end, key=21, cnt=85 --------

Thread[main,5,main] MyRawComparator: [3,4], cnt=86
Thread[main,5,main] ------ combine start, key=3, cnt=87 --------
Thread[main,5,main] combine: 3, cnt=88
Thread[main,5,main] ------ combine end, key=3, cnt=89 --------

Thread[main,5,main] MyRawComparator: [4,5], cnt=90
Thread[main,5,main] ------ combine start, key=4, cnt=91 --------
Thread[main,5,main] combine: 4, cnt=92
Thread[main,5,main] ------ combine end, key=4, cnt=93 --------

Thread[main,5,main] MyRawComparator: [5,6], cnt=94
Thread[main,5,main] ------ combine start, key=5, cnt=95 --------
Thread[main,5,main] combine: 5, cnt=96
Thread[main,5,main] ------ combine end, key=5, cnt=97 --------

Thread[main,5,main] MyRawComparator: [6,7], cnt=98
Thread[main,5,main] ------ combine start, key=6, cnt=99 --------
Thread[main,5,main] combine: 6, cnt=100
Thread[main,5,main] ------ combine end, key=6, cnt=101 --------

Thread[main,5,main] MyRawComparator: [7,8], cnt=102
Thread[main,5,main] ------ combine start, key=7, cnt=103 --------
Thread[main,5,main] combine: 7, cnt=104
Thread[main,5,main] ------ combine end, key=7, cnt=105 --------

Thread[main,5,main] MyRawComparator: [8,9], cnt=106
Thread[main,5,main] ------ combine start, key=8, cnt=107 --------
Thread[main,5,main] combine: 8, cnt=108
Thread[main,5,main] ------ combine end, key=8, cnt=109 --------

Thread[main,5,main] ------ combine start, key=9, cnt=110 --------
Thread[main,5,main] combine: 9, cnt=111
Thread[main,5,main] ------ combine end, key=9, cnt=112 --------

Reduce output

Thread[main,5,main] MyGrouping: [1,10], cnt=1
Thread[main,5,main] ------ reduce start, key=1, cnt=2 --------
Thread[main,5,main] reduce: 1, cnt=3
Thread[main,5,main] MyGrouping: [10,11], cnt=4
Thread[main,5,main] reduce: 10, cnt=5
Thread[main,5,main] MyGrouping: [11,123], cnt=6
Thread[main,5,main] reduce: 11, cnt=7
Thread[main,5,main] MyGrouping: [123,2], cnt=8
Thread[main,5,main] reduce: 123, cnt=9
Thread[main,5,main] ------ reduce end, key=123, cnt=10 --------

Thread[main,5,main] MyGrouping: [2,21], cnt=11
Thread[main,5,main] ------ reduce start, key=2, cnt=12 --------
Thread[main,5,main] reduce: 2, cnt=13
Thread[main,5,main] MyGrouping: [21,3], cnt=14
Thread[main,5,main] reduce: 21, cnt=15
Thread[main,5,main] ------ reduce end, key=21, cnt=16 --------

Thread[main,5,main] MyGrouping: [3,4], cnt=17
Thread[main,5,main] ------ reduce start, key=3, cnt=18 --------
Thread[main,5,main] reduce: 3, cnt=19
Thread[main,5,main] ------ reduce end, key=3, cnt=20 --------

Thread[main,5,main] MyGrouping: [4,5], cnt=21
Thread[main,5,main] ------ reduce start, key=4, cnt=22 --------
Thread[main,5,main] reduce: 4, cnt=23
Thread[main,5,main] ------ reduce end, key=4, cnt=24 --------

Thread[main,5,main] MyGrouping: [5,6], cnt=25
Thread[main,5,main] ------ reduce start, key=5, cnt=26 --------
Thread[main,5,main] reduce: 5, cnt=27
Thread[main,5,main] ------ reduce end, key=5, cnt=28 --------

Thread[main,5,main] MyGrouping: [6,7], cnt=29
Thread[main,5,main] ------ reduce start, key=6, cnt=30 --------
Thread[main,5,main] reduce: 6, cnt=31
Thread[main,5,main] ------ reduce end, key=6, cnt=32 --------

Thread[main,5,main] MyGrouping: [7,8], cnt=33
Thread[main,5,main] ------ reduce start, key=7, cnt=34 --------
Thread[main,5,main] reduce: 7, cnt=35
Thread[main,5,main] ------ reduce end, key=7, cnt=36 --------

Thread[main,5,main] MyGrouping: [8,9], cnt=37
Thread[main,5,main] ------ reduce start, key=8, cnt=38 --------
Thread[main,5,main] reduce: 8, cnt=39
Thread[main,5,main] ------ reduce end, key=8, cnt=40 --------

Thread[main,5,main] ------ reduce start, key=9, cnt=41 --------
Thread[main,5,main] reduce: 9, cnt=42
Thread[main,5,main] ------ reduce end, key=9, cnt=43 --------

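part-r-00000

(Note: the two fields are normally identical. Where they differ, it is because the grouping comparator judged different keys to be equal, so the same key was used two or more times.)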
-----------------------------------
1   1
1   10
10  11
11  123
-----------------------------------
2   2
2   21
-----------------------------------
3   3
-----------------------------------
4   4
-----------------------------------
5   5
-----------------------------------
6   6
-----------------------------------
7   7
-----------------------------------
8   8
-----------------------------------
9   9

One Mapper, two Reducers

Map output

Thread[main,5,main] partition: 1, reduce num=1, cnt=1
Thread[main,5,main] map: 1, cnt=2
Thread[main,5,main] partition: 2, reduce num=0, cnt=3
Thread[main,5,main] map: 2, cnt=4
Thread[main,5,main] partition: 3, reduce num=1, cnt=5
Thread[main,5,main] map: 3, cnt=6
Thread[main,5,main] partition: 4, reduce num=0, cnt=7
Thread[main,5,main] map: 4, cnt=8
Thread[main,5,main] partition: 5, reduce num=1, cnt=9
Thread[main,5,main] map: 5, cnt=10
Thread[main,5,main] partition: 6, reduce num=0, cnt=11
Thread[main,5,main] map: 6, cnt=12
Thread[main,5,main] partition: 7, reduce num=1, cnt=13
Thread[main,5,main] map: 7, cnt=14
Thread[main,5,main] partition: 8, reduce num=0, cnt=15
Thread[main,5,main] map: 8, cnt=16
Thread[main,5,main] partition: 9, reduce num=1, cnt=17
Thread[main,5,main] map: 9, cnt=18
Thread[main,5,main] partition: 10, reduce num=1, cnt=19
Thread[main,5,main] map: 10, cnt=20
Thread[main,5,main] partition: 11, reduce num=1, cnt=21
Thread[main,5,main] map: 11, cnt=22
Thread[main,5,main] partition: 21, reduce num=0, cnt=23
Thread[main,5,main] map: 21, cnt=24
Thread[main,5,main] partition: 123, reduce num=1, cnt=25
Thread[main,5,main] map: 123, cnt=26
Thread[main,5,main] MyRawComparator: [7,123], cnt=27
Thread[main,5,main] MyRawComparator: [123,1], cnt=28
Thread[main,5,main] MyRawComparator: [7,123], cnt=29
Thread[main,5,main] MyRawComparator: [11,123], cnt=30
Thread[main,5,main] MyRawComparator: [10,123], cnt=31
Thread[main,5,main] MyRawComparator: [9,123], cnt=32
Thread[main,5,main] MyRawComparator: [123,7], cnt=33
Thread[main,5,main] MyRawComparator: [1,123], cnt=34
Thread[main,5,main] MyRawComparator: [5,123], cnt=35
Thread[main,5,main] MyRawComparator: [123,3], cnt=36
Thread[main,5,main] MyRawComparator: [5,3], cnt=37
Thread[main,5,main] MyRawComparator: [5,9], cnt=38
Thread[main,5,main] MyRawComparator: [9,7], cnt=39
Thread[main,5,main] MyRawComparator: [5,7], cnt=40
Thread[main,5,main] MyRawComparator: [4,21], cnt=41
Thread[main,5,main] MyRawComparator: [11,10], cnt=42
Thread[main,5,main] MyRawComparator: [4,2], cnt=43
Thread[main,5,main] MyRawComparator: [21,2], cnt=44
Thread[main,5,main] MyRawComparator: [4,8], cnt=45
Thread[main,5,main] MyRawComparator: [11,1], cnt=46
Thread[main,5,main] MyRawComparator: [10,1], cnt=47
Thread[main,5,main] MyRawComparator: [8,6], cnt=48
Thread[main,5,main] MyRawComparator: [4,6], cnt=49
Thread[main,5,main] MyRawComparator: [2,21], cnt=50

(Note: without a Combiner, the output below does not appear.)
Thread[main,5,main] ------ combine start, key=2, cnt=51 --------
Thread[main,5,main] combine: 2, cnt=52
Thread[main,5,main] ------ combine end, key=2, cnt=53 --------

Thread[main,5,main] MyRawComparator: [21,4], cnt=54
Thread[main,5,main] ------ combine start, key=21, cnt=55 --------
Thread[main,5,main] combine: 21, cnt=56
Thread[main,5,main] ------ combine end, key=21, cnt=57 --------

Thread[main,5,main] MyRawComparator: [4,6], cnt=58
Thread[main,5,main] ------ combine start, key=4, cnt=59 --------
Thread[main,5,main] combine: 4, cnt=60
Thread[main,5,main] ------ combine end, key=4, cnt=61 --------

Thread[main,5,main] MyRawComparator: [6,8], cnt=62
Thread[main,5,main] ------ combine start, key=6, cnt=63 --------
Thread[main,5,main] combine: 6, cnt=64
Thread[main,5,main] ------ combine end, key=6, cnt=65 --------

Thread[main,5,main] ------ combine start, key=8, cnt=66 --------
Thread[main,5,main] combine: 8, cnt=67
Thread[main,5,main] ------ combine end, key=8, cnt=68 --------

Thread[main,5,main] MyRawComparator: [1,10], cnt=69
Thread[main,5,main] ------ combine start, key=1, cnt=70 --------
Thread[main,5,main] combine: 1, cnt=71
Thread[main,5,main] ------ combine end, key=1, cnt=72 --------

Thread[main,5,main] MyRawComparator: [10,11], cnt=73
Thread[main,5,main] ------ combine start, key=10, cnt=74 --------
Thread[main,5,main] combine: 10, cnt=75
Thread[main,5,main] ------ combine end, key=10, cnt=76 --------

Thread[main,5,main] MyRawComparator: [11,123], cnt=77
Thread[main,5,main] ------ combine start, key=11, cnt=78 --------
Thread[main,5,main] combine: 11, cnt=79
Thread[main,5,main] ------ combine end, key=11, cnt=80 --------

Thread[main,5,main] MyRawComparator: [123,3], cnt=81
Thread[main,5,main] ------ combine start, key=123, cnt=82 --------
Thread[main,5,main] combine: 123, cnt=83
Thread[main,5,main] ------ combine end, key=123, cnt=84 --------

Thread[main,5,main] MyRawComparator: [3,5], cnt=85
Thread[main,5,main] ------ combine start, key=3, cnt=86 --------
Thread[main,5,main] combine: 3, cnt=87
Thread[main,5,main] ------ combine end, key=3, cnt=88 --------

Thread[main,5,main] MyRawComparator: [5,7], cnt=89
Thread[main,5,main] ------ combine start, key=5, cnt=90 --------
Thread[main,5,main] combine: 5, cnt=91
Thread[main,5,main] ------ combine end, key=5, cnt=92 --------

Thread[main,5,main] MyRawComparator: [7,9], cnt=93
Thread[main,5,main] ------ combine start, key=7, cnt=94 --------
Thread[main,5,main] combine: 7, cnt=95
Thread[main,5,main] ------ combine end, key=7, cnt=96 --------

Thread[main,5,main] ------ combine start, key=9, cnt=97 --------
Thread[main,5,main] combine: 9, cnt=98
Thread[main,5,main] ------ combine end, key=9, cnt=99 --------

Reduce output

(Note: Reduce0)
Thread[main,5,main] MyGrouping: [2,21], cnt=1
Thread[main,5,main] ------ reduce start, key=2, cnt=2 --------
Thread[main,5,main] reduce: 2, cnt=3
Thread[main,5,main] MyGrouping: [21,4], cnt=4
Thread[main,5,main] reduce: 21, cnt=5
Thread[main,5,main] ------ reduce end, key=21, cnt=6 --------

Thread[main,5,main] MyGrouping: [4,6], cnt=7
Thread[main,5,main] ------ reduce start, key=4, cnt=8 --------
Thread[main,5,main] reduce: 4, cnt=9
Thread[main,5,main] ------ reduce end, key=4, cnt=10 --------

Thread[main,5,main] MyGrouping: [6,8], cnt=11
Thread[main,5,main] ------ reduce start, key=6, cnt=12 --------
Thread[main,5,main] reduce: 6, cnt=13
Thread[main,5,main] ------ reduce end, key=6, cnt=14 --------

Thread[main,5,main] ------ reduce start, key=8, cnt=15 --------
Thread[main,5,main] reduce: 8, cnt=16
Thread[main,5,main] ------ reduce end, key=8, cnt=17 --------


(Note: Reduce1)
Thread[main,5,main] MyGrouping: [1,10], cnt=1
Thread[main,5,main] ------ reduce start, key=1, cnt=2 --------
Thread[main,5,main] reduce: 1, cnt=3
Thread[main,5,main] MyGrouping: [10,11], cnt=4
Thread[main,5,main] reduce: 10, cnt=5
Thread[main,5,main] MyGrouping: [11,123], cnt=6
Thread[main,5,main] reduce: 11, cnt=7
Thread[main,5,main] MyGrouping: [123,3], cnt=8
Thread[main,5,main] reduce: 123, cnt=9
Thread[main,5,main] ------ reduce end, key=123, cnt=10 --------

Thread[main,5,main] MyGrouping: [3,5], cnt=11
Thread[main,5,main] ------ reduce start, key=3, cnt=12 --------
Thread[main,5,main] reduce: 3, cnt=13
Thread[main,5,main] ------ reduce end, key=3, cnt=14 --------

Thread[main,5,main] MyGrouping: [5,7], cnt=15
Thread[main,5,main] ------ reduce start, key=5, cnt=16 --------
Thread[main,5,main] reduce: 5, cnt=17
Thread[main,5,main] ------ reduce end, key=5, cnt=18 --------

Thread[main,5,main] MyGrouping: [7,9], cnt=19
Thread[main,5,main] ------ reduce start, key=7, cnt=20 --------
Thread[main,5,main] reduce: 7, cnt=21
Thread[main,5,main] ------ reduce end, key=7, cnt=22 --------

Thread[main,5,main] ------ reduce start, key=9, cnt=23 --------
Thread[main,5,main] reduce: 9, cnt=24
Thread[main,5,main] ------ reduce end, key=9, cnt=25 --------

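part-r-0000x

(Note: the two fields are normally identical. Where they differ, it is because the grouping comparator judged different keys to be equal, so the same key was used two or more times.)

(Note: part-r-00000)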
-----------------------------------
2   2
2   21
-----------------------------------
4   4
-----------------------------------
6   6
-----------------------------------
8   8


(Note: part-r-00001)
-----------------------------------
1   1
1   10
10  11
11  123
-----------------------------------
3   3
-----------------------------------
5   5
-----------------------------------
7   7
-----------------------------------
9   9

Two Mappers, one Reducer

Map output

Each Map's output is the same as in the one-Mapper, one-Reducer case.

Reduce output

Thread[main,5,main] MyRawComparator: [1,1], cnt=1
Thread[main,5,main] MyRawComparator: [1,111], cnt=2
Thread[main,5,main] MyGrouping: [1,1], cnt=3
Thread[main,5,main] ------ reduce start, key=1, cnt=4 --------
Thread[main,5,main] reduce: 1, cnt=5
Thread[main,5,main] MyRawComparator: [111,10], cnt=6
Thread[main,5,main] MyGrouping: [1,10], cnt=7
Thread[main,5,main] reduce: 1, cnt=8
Thread[main,5,main] MyRawComparator: [111,11], cnt=9
Thread[main,5,main] MyGrouping: [10,11], cnt=10
Thread[main,5,main] reduce: 10, cnt=11
Thread[main,5,main] MyRawComparator: [111,123], cnt=12
Thread[main,5,main] MyGrouping: [11,111], cnt=13
Thread[main,5,main] reduce: 11, cnt=14
Thread[main,5,main] MyRawComparator: [123,3], cnt=15
Thread[main,5,main] MyGrouping: [111,123], cnt=16
Thread[main,5,main] reduce: 111, cnt=17
Thread[main,5,main] MyRawComparator: [3,2], cnt=18
Thread[main,5,main] MyGrouping: [123,2], cnt=19
Thread[main,5,main] reduce: 123, cnt=20
Thread[main,5,main] ------ reduce end, key=123, cnt=21 --------

Thread[main,5,main] MyRawComparator: [3,21], cnt=22
Thread[main,5,main] MyGrouping: [2,21], cnt=23
Thread[main,5,main] ------ reduce start, key=2, cnt=24 --------
Thread[main,5,main] reduce: 2, cnt=25
Thread[main,5,main] MyRawComparator: [3,3], cnt=26
Thread[main,5,main] MyGrouping: [21,3], cnt=27
Thread[main,5,main] reduce: 21, cnt=28
Thread[main,5,main] ------ reduce end, key=21, cnt=29 --------

Thread[main,5,main] MyRawComparator: [3,4], cnt=30
Thread[main,5,main] MyGrouping: [3,3], cnt=31
Thread[main,5,main] ------ reduce start, key=3, cnt=32 --------
Thread[main,5,main] reduce: 3, cnt=33
Thread[main,5,main] MyRawComparator: [4,77], cnt=34
Thread[main,5,main] MyGrouping: [3,4], cnt=35
Thread[main,5,main] reduce: 3, cnt=36
Thread[main,5,main] ------ reduce end, key=3, cnt=37 --------

Thread[main,5,main] MyRawComparator: [77,5], cnt=38
Thread[main,5,main] MyGrouping: [4,5], cnt=39
Thread[main,5,main] ------ reduce start, key=4, cnt=40 --------
Thread[main,5,main] reduce: 4, cnt=41
Thread[main,5,main] ------ reduce end, key=4, cnt=42 --------

Thread[main,5,main] MyRawComparator: [77,6], cnt=43
Thread[main,5,main] MyGrouping: [5,6], cnt=44
Thread[main,5,main] ------ reduce start, key=5, cnt=45 --------
Thread[main,5,main] reduce: 5, cnt=46
Thread[main,5,main] ------ reduce end, key=5, cnt=47 --------

Thread[main,5,main] MyRawComparator: [77,7], cnt=48
Thread[main,5,main] MyGrouping: [6,7], cnt=49
Thread[main,5,main] ------ reduce start, key=6, cnt=50 --------
Thread[main,5,main] reduce: 6, cnt=51
Thread[main,5,main] ------ reduce end, key=6, cnt=52 --------

Thread[main,5,main] MyRawComparator: [77,8], cnt=53
Thread[main,5,main] MyGrouping: [7,77], cnt=54
Thread[main,5,main] ------ reduce start, key=7, cnt=55 --------
Thread[main,5,main] reduce: 7, cnt=56
Thread[main,5,main] MyRawComparator: [8,90], cnt=57
Thread[main,5,main] MyGrouping: [77,8], cnt=58
Thread[main,5,main] reduce: 77, cnt=59
Thread[main,5,main] ------ reduce end, key=77, cnt=60 --------

Thread[main,5,main] MyRawComparator: [90,9], cnt=61
Thread[main,5,main] MyGrouping: [8,9], cnt=62
Thread[main,5,main] ------ reduce start, key=8, cnt=63 --------
Thread[main,5,main] reduce: 8, cnt=64
Thread[main,5,main] ------ reduce end, key=8, cnt=65 --------

Thread[main,5,main] MyGrouping: [9,90], cnt=66
Thread[main,5,main] ------ reduce start, key=9, cnt=67 --------
Thread[main,5,main] reduce: 9, cnt=68
Thread[main,5,main] reduce: 90, cnt=69
Thread[main,5,main] ------ reduce end, key=90, cnt=70 --------

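part-r-00000

(Note: the two fields are normally identical. Where they differ, it is because the grouping comparator judged different keys to be equal, so the same key was used two or more times.)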
-----------------------------------
1   1
1   1
1   10
10  11
11  111
111 123
-----------------------------------
2   2
2   21
-----------------------------------
3   3
3   3
-----------------------------------
4   4
-----------------------------------
5   5
-----------------------------------
6   6
-----------------------------------
7   7
7   77
-----------------------------------
8   8
-----------------------------------
9   9
9   90

Two Mappers, two Reducers

Map output

Each Map behaves the same as in the one-Mapper, two-Reducer case.

Reduce output

Each Reduce behaves the same as in the two-Mapper, one-Reducer case.
