499 Word Count (Map Reduce version)

Original problem: https://www.lintcode.com/problem/word-count-map-reduce/description

Description

Use map reduce to count word frequencies.
https://hadoop.apache.org/docs/r1.2.1/mapred_tutorial.html#Example%3A+WordCount+v1.0

Example

chunk1: "Google Bye GoodBye Hadoop code"
chunk2: "lintcode code Bye"


Get MapReduce result:
    Bye: 2
    GoodBye: 1
    Google: 1
    Hadoop: 1
    code: 2
    lintcode: 1

Tags

Big Data

Map Reduce

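Approach: I did not fully understand what this problem was asking at first, so I worked through a solution found online.

The Map class processes the raw data: it splits each string into words and writes them to output. The Reduce class counts the data emitted by Map. (Adapted from another article.)

That is:

The map function tokenizes the input text and emits results of the form (word, 1). For example, for "You are a young man" it emits (You, 1), (are, 1), and so on.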
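
To make the map step concrete, here is a minimal sketch of the tokenization outside the LintCode framework. The function name mapLine is hypothetical, and it splits with std::istringstream instead of scanning indices by hand, so treat it as an illustration of the idea and not as the judge's API.

```cpp
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the LintCode API): splits one line of
// text on whitespace and returns a (word, 1) pair for every word found.
std::vector<std::pair<std::string, int>> mapLine(const std::string& line) {
    std::vector<std::pair<std::string, int>> pairs;
    std::istringstream iss(line);
    std::string word;
    while (iss >> word) {  // operator>> skips runs of whitespace
        pairs.emplace_back(word, 1);
    }
    return pairs;
}
```

Calling mapLine("You are a young man") gives five pairs, the first being ("You", 1). Because operator>> skips repeated whitespace, no empty words are emitted.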

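In the reduce function we aggregate the results that share the same key. The second parameter of reduce has type Input<int>; it is the collection of values that share that key, and the job of reduce is to combine them into a single result.

For example, ("hello", 1) and ("hello", 1) are aggregated into ("hello", 2), which may in turn be aggregated with ("hello", 3) and ("hello", 1) into ("hello", 6). (Adapted from another article.)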
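
The aggregation can be sketched in a few lines of standalone C++. The function name reducePairs is hypothetical and std::map stands in for the grouping that the framework does between map and reduce; this is only meant to show that partial counts for the same key simply add up.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of the LintCode API): groups (word, count)
// pairs by key and sums the counts for each key.
std::map<std::string, int> reducePairs(
        const std::vector<std::pair<std::string, int>>& pairs) {
    std::map<std::string, int> totals;
    for (const auto& kv : pairs) {
        // operator[] creates a missing key with value 0 before adding.
        totals[kv.first] += kv.second;
    }
    return totals;
}
```

Feeding it ("hello", 2), ("hello", 3), and ("hello", 1) yields a total of 6 for "hello". Since addition is associative, it does not matter whether the counts arrive as raw 1s or as partial sums.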

Accepted (AC) code:

/**
 * Definition of Input:
 * template<class T>
 * class Input {
 * public:
 *     bool done(); 
 *         // Returns true if the iteration has no more elements, otherwise false.
 *     void next();
 *         // Move to the next element in the iteration
 *         // Runtime error if the iteration has no more elements
 *     T value();
 *        // Get the current element, Runtime error if
 *        // the iteration has no more elements
 * }
 */
class WordCountMapper: public Mapper {
public:
    void Map(Input<string>* input) {
        // Write your code here
        // Please directly use func 'output' to 
        // output the results into output buffer.
        // void output(string &key, int value);
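        // Note: the reference solution also declared an unused
        // vector<string> vecStr here; it has been dropped.
        while(!input->done())
        {
            string str=input->value();
            int n=(int)str.size();
            int j=0; // start index of the current word
            for(int i=0;i<=n;i++)
            {
                // A space or the end of the string terminates a word.
                // Test i==n first so str[i] is only read at valid positions.
                if(i==n||str[i]==' ')
                {
                    // Skip empty tokens produced by repeated spaces.
                    if(i>j)
                    {
                        string temp=str.substr(j,i-j);
                        output(temp,1);
                    }
                    j=i+1;
                }
            }
            input->next();
        }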
        
    }
};


class WordCountReducer: public Reducer {
public:
    void Reduce(string &key, Input<int>* input) {
        // Write your code here
        // Please directly use func 'output' to 
        // output the results into output buffer.
        // void output(string &key, int value);
        int sum=0; // running total for this key
        while(!input->done())
        {
            sum+=input->value(); // each value is a partial count
            input->next();
        }
        output(key,sum); // emit (word, total count)
        
    }
};
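
To try the same logic locally, without the judge, one option is a small stand-in for the framework. Everything in the sketch below is an assumption for illustration: the Input class is a hypothetical mock (with done() returning true once the iteration is exhausted, which is how the solution uses it), and mapChunks, reduceValues, and wordCount are made-up names. A std::map keyed by word plays the role of the grouping step between map and reduce.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for LintCode's Input<T>; done() is assumed to return
// true once the iteration is exhausted.
template <class T>
class Input {
public:
    explicit Input(std::vector<T> items) : items_(std::move(items)) {}
    bool done() const { return pos_ >= items_.size(); }
    void next() { ++pos_; }
    T value() const { return items_[pos_]; }
private:
    std::vector<T> items_;
    std::size_t pos_ = 0;
};

// Map phase: split each chunk on spaces and record a 1 for every word.
// Grouping by key here mimics what the framework does before reduce.
void mapChunks(Input<std::string>* input,
               std::map<std::string, std::vector<int>>& grouped) {
    while (!input->done()) {
        const std::string str = input->value();
        std::size_t start = 0;
        for (std::size_t i = 0; i <= str.size(); ++i) {
            if (i == str.size() || str[i] == ' ') {
                if (i > start) {
                    grouped[str.substr(start, i - start)].push_back(1);
                }
                start = i + 1;
            }
        }
        input->next();
    }
}

// Reduce phase: sum all values that share one key.
int reduceValues(Input<int>* input) {
    int sum = 0;
    while (!input->done()) {
        sum += input->value();
        input->next();
    }
    return sum;
}

// Runs both phases over the given chunks and returns word -> frequency.
std::map<std::string, int> wordCount(const std::vector<std::string>& chunks) {
    Input<std::string> input(chunks);
    std::map<std::string, std::vector<int>> grouped;
    mapChunks(&input, grouped);
    std::map<std::string, int> result;
    for (const auto& kv : grouped) {
        Input<int> values(kv.second);
        result[kv.first] = reduceValues(&values);
    }
    return result;
}
```

Running wordCount on the two sample chunks from the problem statement gives six distinct words, with Bye and code counted twice and the others once, matching the expected result above.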

Other references: articles on the usage of the substr function in C++.
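
Since the mapper relies on substr, a short sketch of how it behaves may help. std::string::substr takes a start position and a length (not an end index); omitting the length means "through the end of the string". The helper names slice and tail are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: the characters in the half-open range [start, end).
// substr takes a start position and a LENGTH, so the length is end - start.
std::string slice(const std::string& s, std::size_t start, std::size_t end) {
    return s.substr(start, end - start);
}

// Hypothetical helper: everything from `start` to the end of the string.
// Omitting the length argument defaults it to std::string::npos.
std::string tail(const std::string& s, std::size_t start) {
    return s.substr(start);
}
```

For "lintcode code Bye", slice(s, 9, 13) returns "code" and tail(s, 14) returns "Bye". A length that runs past the end is clamped, and a start position greater than s.size() throws std::out_of_range.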
