Map Reduce Application (Top 10 IDs based on their value)

Other | 2018-09-25 14:20:51

Top 10 IDs based on their value

First, we need to set the number of reducers to 1 (in Hadoop, job.setNumReduceTasks(1)). It is not a good idea for each map task to output every key/value pair. Instead, each map task can output only its own top 10 IDs by value, so less data is written to disk and transferred to the reducer. This is safe because any ID in the global top 10 is necessarily in the top 10 of its own split. To find the top 10 for a map task, we have to iterate over its whole input split. In the map function, we collect each ID/value pair and add it to a data structure that keeps its entries sorted, such as a red-black tree, keeping only the top 10. In the cleanup function, we output the result.
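
The "keep only the top 10" step can be sketched in plain Java as follows. This is a minimal illustration, not code from the original post: the class name TopNTracker, its Entry type, and the choice of a String ID with a long value are all assumptions. It relies on java.util.TreeSet, which is backed by a red-black tree, and orders entries by value and then by ID so that two IDs with the same value do not overwrite each other.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

/** Keeps only the `limit` largest id/value pairs seen so far (hypothetical helper). */
class TopNTracker {
    /** One id/value record; an illustrative type, not part of Hadoop. */
    static final class Entry {
        public final String id;
        public final long value;

        Entry(String id, long value) {
            this.id = id;
            this.value = value;
        }
    }

    // Order by value, then by ID, so equal values from different IDs both survive.
    private static final Comparator<Entry> BY_VALUE_THEN_ID = (a, b) -> {
        int byValue = Long.compare(a.value, b.value);
        return byValue != 0 ? byValue : a.id.compareTo(b.id);
    };

    private final int limit;
    private final TreeSet<Entry> best = new TreeSet<>(BY_VALUE_THEN_ID);

    TopNTracker(int limit) {
        this.limit = limit;
    }

    /** Adds one pair, then evicts the smallest entry if the set grew past the limit. */
    void offer(String id, long value) {
        best.add(new Entry(id, value));
        if (best.size() > limit) {
            best.pollFirst();
        }
    }

    /** Returns the retained entries, largest value first. */
    List<Entry> descending() {
        return new ArrayList<>(best.descendingSet());
    }
}
```

Because the set never holds more than limit + 1 entries, memory use stays constant no matter how large the split is, and each offer costs O(log limit).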

// Hadoop's Reducer.run(); Mapper.run() has the same shape.
// Note the cleanup() call: it runs once, after all input has been processed.
public void run(Context context) throws IOException, InterruptedException {
    setup(context);
    try {
        while (context.nextKey()) {
            reduce(context.getCurrentKey(), context.getValues(), context);
        }
    } finally {
        cleanup(context);
    }
}

In the map task, the sorted IDs are written out in the cleanup function.
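
The original post's mapper code is not available here, so the following is only a dependency-free sketch of the pattern. The class TopTenMapSketch, the "id<TAB>value" input format, and the Consumer used for output are assumptions made for illustration. In a real Hadoop job this logic would live in a subclass of org.apache.hadoop.mapreduce.Mapper, with map() and cleanup() overriding the framework's methods and the output going through context.write(...).

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.Map;
import java.util.TreeSet;
import java.util.function.Consumer;

/** Plain-Java stand-in for the map task (hypothetical, no Hadoop dependency). */
class TopTenMapSketch {
    private static final int LIMIT = 10;

    // Sorted by value, then by ID; the first element is always the smallest.
    private final TreeSet<Map.Entry<String, Long>> top = new TreeSet<>((a, b) -> {
        int byValue = Long.compare(a.getValue(), b.getValue());
        return byValue != 0 ? byValue : a.getKey().compareTo(b.getKey());
    });

    /** Called once per input line, assumed to look like "id<TAB>value". */
    void map(String line) {
        String[] fields = line.split("\t");
        top.add(new SimpleImmutableEntry<>(fields[0], Long.valueOf(fields[1].trim())));
        if (top.size() > LIMIT) {
            top.pollFirst(); // drop the smallest value; nothing is emitted yet
        }
    }

    /** Called once after the whole split has been read; emits largest first. */
    void cleanup(Consumer<String> out) {
        for (Map.Entry<String, Long> e : top.descendingSet()) {
            out.accept(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

The important point is that map() never emits anything: output is deferred to cleanup(), so a split with millions of records still produces at most 10 output records.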

The reduce task has similar logic. (Note: there is only one reducer.)
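
As a rough illustration of the reduce side, the sketch below merges the per-mapper results into one global top list. The class TopTenReduceSketch and the "id<TAB>value" line format are assumptions. It also swaps the sorted tree for a plain sort, which is reasonable here because the single reducer receives at most 10 records per map task; a Hadoop reducer could just as well reuse the tree-based approach and emit from its cleanup function.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java stand-in for the single reduce task (hypothetical, no Hadoop dependency). */
class TopTenReduceSketch {
    /** Merges the "id<TAB>value" lines emitted by all map tasks into a global top `limit`. */
    static List<String> reduce(List<List<String>> perMapperOutput, int limit) {
        List<String> all = new ArrayList<>();
        for (List<String> partial : perMapperOutput) {
            all.addAll(partial);
        }
        // Largest value first; ties are broken by ID so the result is deterministic.
        all.sort((a, b) -> {
            int byValue = Long.compare(value(b), value(a));
            return byValue != 0 ? byValue : id(a).compareTo(id(b));
        });
        return new ArrayList<>(all.subList(0, Math.min(limit, all.size())));
    }

    private static long value(String line) {
        return Long.parseLong(line.split("\t")[1]);
    }

    private static String id(String line) {
        return line.split("\t")[0];
    }
}
```

With one reducer, every partial result reaches the same place, which is why the merged list is the true global top 10 rather than a top 10 per partition.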

Reference: https://www.youtube.com/watch?v=Bj6-maOjB8M

Reposted from www.cnblogs.com/nativestack/p/9699155.html
