Finding friend pairs who follow each other (mutual follows)

Friend list (each line is user:friend1,friend2,...):

A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
E:B,C,D,M,L
F:A,B,C,D,E,O,M
G:A,C,D,E,F
H:A,C,D,E,O
I:A,O
J:B,O
K:A,C,D
L:D,E,F
M:E,F,G
O:A,H,I,J

Analysis: A's friend list contains B, and B's friend list contains A, so A and B are mutual friends.
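Before bringing in MapReduce, the definition itself can be checked directly in memory. The following is a minimal sketch in plain Java, not part of the original project; the class name `MutualCheck` and its methods `parse` and `isMutual` are hypothetical names chosen for illustration, and it assumes every line has the `user:friend1,friend2,...` shape shown above.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class MutualCheck {
    // Parse lines such as "A:B,C,D" into a map from user to that user's friend set.
    static Map<String, Set<String>> parse(List<String> lines) {
        Map<String, Set<String>> graph = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(":");
            graph.put(parts[0], new TreeSet<>(Arrays.asList(parts[1].split(","))));
        }
        return graph;
    }

    // Two users are mutual friends only if each appears in the other's list.
    static boolean isMutual(Map<String, Set<String>> graph, String a, String b) {
        Set<String> friendsOfA = graph.getOrDefault(a, Collections.emptySet());
        Set<String> friendsOfB = graph.getOrDefault(b, Collections.emptySet());
        return friendsOfA.contains(b) && friendsOfB.contains(a);
    }
}
```

With the first three lines of the list, `isMutual(graph, "A", "B")` is true, while `isMutual(graph, "B", "C")` is false because C's list does not contain B. This direct lookup needs the whole list in memory, which is the limitation the MapReduce version avoids.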

The Map phase:
Each line of the friend-list file is parsed and split into user-friend pairs. For example, the line

A:B,C,D,F,E,O

becomes, after Map:

<A-B,NULL>
<A-C,NULL>
<A-D,NULL>
<A-F,NULL>
<A-E,NULL>
<A-O,NULL>

B:A,C,E,K

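becomes, after Map (the two names in each key are sorted, so B with friend A is emitted as A-B, the same key the first line produced):

<A-B,NULL>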
<B-C,NULL>
<B-E,NULL>
<B-K,NULL>
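The key construction shown above can be tried without a Hadoop cluster. The sketch below is plain Java written for illustration only; `PairKeys` and `mapLine` are hypothetical names, and the method simply returns the keys that the Map step would emit for one input line, with the two names in each key put in lexicographic order.

```java
import java.util.ArrayList;
import java.util.List;

class PairKeys {
    // Return the normalized "X-Y" keys for one line such as "B:A,C,E,K".
    static List<String> mapLine(String line) {
        String[] parts = line.split(":");
        String user = parts[0];
        List<String> keys = new ArrayList<>();
        for (String friend : parts[1].split(",")) {
            // Put the lexicographically smaller name first so that
            // A->B and B->A both produce the key "A-B".
            if (user.compareTo(friend) < 0) {
                keys.add(user + "-" + friend);
            } else {
                keys.add(friend + "-" + user);
            }
        }
        return keys;
    }
}
```

Calling `PairKeys.mapLine("B:A,C,E,K")` gives `[A-B, B-C, B-E, B-K]`, matching the listing above.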

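// Imports needed at the top of the enclosing source file:
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

static class EachFanMapper extends Mapper<LongWritable, Text, Text, NullWritable> {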
		Text k = new Text();
		@Override
		protected void map(LongWritable key, Text value, Context context)
				throws IOException, InterruptedException {
				String line = value.toString();
				String user = line.split(":")[0];
				String[] friends = line.split(":")[1].split(",");
				for(int i = 0; i < friends.length; i++) {
					String friend = friends[i];
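					/**
					 * This is the key step.
					 * First line of data:  A:B,C,D,F,E,O
					 * Second line of data: B:A,C,E,K
					 * A's list contains B, and B's list contains A;
					 * these are the pairs we want to find.
					 * When the mapper combines user and friend into an output key,
					 * the two names must be sorted.
					 * The first line produces A-B.
					 * Without sorting, the second line would produce B-A.
					 * The reducer only groups identical keys,
					 * so A-B and B-A must be emitted as the same key.
					 * The reducer can then count the occurrences of each key:
					 * a count of 2 means A-B was emitted twice, so the two users follow each other.
					 */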
					if(user.compareTo(friend) < 0) {
						k.set(user + "-" + friend);
						context.write(k, NullWritable.get());
					} else {
						k.set(friend + "-" + user);
						context.write(k, NullWritable.get());
					}
				}
		}
	}

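The Reduce phase:
Each key arrives with one value per time it was emitted, so counting the values tells how many of the two users listed the other.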

static class EachFanReducer extends Reducer<Text, NullWritable, Text, NullWritable> {
		@Override
		protected void reduce(Text key, Iterable<NullWritable> values, Context context) throws IOException, InterruptedException {
			int count = 0;
			for(NullWritable nw : values) {
				count++;
			}
			/**
			 * Assuming no friend list contains duplicates, count is either 1 or 2;
			 * a count of 2 means the two users follow each other.
			 */
			if(count == 2) {
				context.write(key, NullWritable.get());
			}
		}
	}
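To see the mapper and reducer logic work end to end without Hadoop, the whole flow can be imitated locally. This is only a sketch in plain Java under stated assumptions: `LocalEachFan`, `pairKey`, and `mutualPairs` are hypothetical names, a `TreeMap` stands in for the shuffle that groups identical keys, and friend lists are assumed to contain no duplicates.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LocalEachFan {
    // Same key normalization as the mapper: smaller name first.
    static String pairKey(String user, String friend) {
        return user.compareTo(friend) < 0 ? user + "-" + friend : friend + "-" + user;
    }

    // Imitate map + shuffle + reduce in memory and return the mutual pairs.
    static List<String> mutualPairs(List<String> lines) {
        // "Shuffle": count how many times each normalized key is emitted.
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            String[] parts = line.split(":");
            for (String friend : parts[1].split(",")) {
                counts.merge(pairKey(parts[0], friend), 1, Integer::sum);
            }
        }
        // "Reduce": keep only the keys that were emitted by both users.
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (entry.getValue() == 2) {
                result.add(entry.getKey());
            }
        }
        return result;
    }
}
```

For the three lines `A:B,C,D,F,E,O`, `B:A,C,E,K`, and `C:F,A,D,I`, `mutualPairs` returns `[A-B, A-C]`. The sorted map is a convenience for readable output; a real job relies on the framework to bring equal keys to the same reduce call.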

Code:
https://gitee.com/tanghongping/hadoopMapReduce/tree/master/src/com/thp/bigdata/eachFan
