LintCode: Top K Frequent Words

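Problem description:
Given a list of words, find the K words that occur most frequently in the list.

Notes:
Output the words sorted by frequency, with more frequent words first. If two words occur the same number of times, the lexicographically smaller one comes first.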
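
The ordering rule can be written as a small comparison function. This is a minimal sketch, assuming each entry is a (word, count) pair; the names ranksBefore and sortByRank are hypothetical and do not come from the problem statement.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns true when entry a should be output before
// entry b. Each entry is a (word, count) pair.
bool ranksBefore(const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) {
    if (a.second != b.second) {
        return a.second > b.second;  // higher count comes first
    }
    return a.first < b.first;        // tie: lexicographically smaller word first
}

// Sorts (word, count) entries into the required output order.
void sortByRank(std::vector<std::pair<std::string, int>>& entries) {
    std::sort(entries.begin(), entries.end(), ranksBefore);
}
```

Sorting every entry is more work than the problem strictly needs when only the first K are wanted, but it makes the rule easy to check by hand.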

Example:
Given the word list:

[ "yes", "lint", "code", "yes", "code", "baby", "you", "baby", "chrome", "safari", "lint", "code", "body", "lint", "code" ]

If k = 3, return ["code", "lint", "baby"].
If k = 4, return ["code", "lint", "baby", "yes"].

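Approach:
Ever since I discovered priority queues, I have found that problems like this one can usually be solved by dropping one in. This problem needs a bit more than that, though, because we first have to count how many times each word occurs.

For the counting I use the STL set to check whether a word has been seen before; a lookup costs O(log n). If the word is already in the set, its count is incremented; if not, it is inserted into the set and also appended to an array. That leaves the question of where a word sits in the array, since when incrementing we need to know which entry to update. This is where the hash table comes in: it maps each word to its index in the array, so the right entry can be found directly.

Next, should the words and their counts live in two parallel arrays or in a struct? I went with a struct, because each element pushed into the priority queue has to carry both the word and its count. With a struct as the element type, the priority queue needs a custom comparison: order by count first, and if the counts are equal, by the lexicographic order of the words.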
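
The counting step and the priority queue step can also be sketched more compactly. The version below is not the code from this post: it assumes a single unordered_map from word to count replaces the set, the index map, and the array, and the names Entry, LowerPriority, and topKSketch are hypothetical.

```cpp
#include <queue>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical names throughout; a sketch under the assumptions stated above.
using Entry = std::pair<std::string, int>;  // (word, count)

struct LowerPriority {
    // std::priority_queue keeps the greatest element under this ordering on
    // top, so this returns true when a should come out after b.
    bool operator()(const Entry& a, const Entry& b) const {
        if (a.second != b.second) return a.second < b.second;  // fewer occurrences
        return a.first > b.first;  // same count: lexicographically larger word
    }
};

std::vector<std::string> topKSketch(const std::vector<std::string>& words, int k) {
    // Count occurrences; operator[] value-initializes a missing count to 0.
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) {
        ++counts[w];
    }

    // Push every distinct word with its count into the priority queue.
    std::priority_queue<Entry, std::vector<Entry>, LowerPriority> heap;
    for (const auto& kv : counts) {
        heap.push(Entry(kv.first, kv.second));
    }

    // Pop up to k entries; stop early if there are fewer distinct words.
    std::vector<std::string> result;
    while (k-- > 0 && !heap.empty()) {
        result.push_back(heap.top().first);
        heap.pop();
    }
    return result;
}
```

Called on the example list with k = 3, topKSketch returns code, lint, baby, matching the expected output of the problem.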

Code walkthrough:


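#include <queue>
#include <set>
#include <string>
#include <unordered_map>
#include <vector>
using namespace std;

// A word together with the number of times it occurs.
struct str_count{
    string str;
    int num;
};

// Comparator for the priority queue. It returns true when a should come out
// after b, so the top of the queue is the word with the highest count and,
// among equal counts, the lexicographically smallest word.
struct cmp{
    bool operator()(const str_count &a,const str_count &b) const{
        if(a.num!=b.num){
            return (a.num<b.num);
        }else{
            return (a.str>b.str);
        }
    }
};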
class Solution {
public:
    /**
     * @param words: an array of string
     * @param k: An integer
     * @return: an array of string
     */
    vector<string> topKFrequentWords(vector<string> &words, int k) {
        // write your code here

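        set<string>hm;// words seen so far; used to check whether a word has already been counted

        unordered_map<string,int>hm2;// maps each word to its index in store_for_str_num, so its count can be found from the word


        vector<str_count>store_for_str_num;// one entry per distinct word; these are later pushed into the priority queue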

        int len=words.size();

        int flag=0;// index that the next new word will get in store_for_str_num
        for(int i=0;i<len;i++){

            if(hm.find(words[i])!=hm.end()){// already seen: increment its count

                store_for_str_num[hm2[words[i]]].num++;

            }else{// first occurrence: add it to hm, hm2 and the array (this also covers i==0, when hm is still empty)

                hm.insert(words[i]);
                hm2[words[i]]=flag;
                str_count temp;
                temp.str=words[i];
                temp.num=1;
                store_for_str_num.push_back(temp);
                flag++;

            }
        }// all input processed: every distinct word is now a str_count holding the word and its count

        priority_queue<str_count,vector<str_count>,cmp>q;// the priority queue does the ordering, using the custom comparator cmp
        for(int i=0;i<store_for_str_num.size();i++){
            q.push(store_for_str_num[i]);
        }

        vector<string>res;
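        for(int i=0;i<k&&!q.empty();i++){// pop the first k entries off the priority queue (fewer if there are fewer than k distinct words); these are the answer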
            str_count temp=q.top();
            q.pop();
            res.push_back(temp.str);
        }
        return res;
    }
};
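
The solution above pushes every distinct word into the queue, which costs O(m log m) for m distinct words. When k is much smaller than m, one possible variant keeps at most k entries in the heap, for O(m log k). The sketch below is an alternative rather than the approach used in this post, and WordCount, RanksBetter, and topKBounded are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical type and function names; a sketch, not the code from the post.
struct WordCount {
    std::string word;
    int count;
};

// Returns true when a ranks better than b. Used as the heap ordering, this
// puts the worst-ranked kept entry on top, where it is cheap to evict.
struct RanksBetter {
    bool operator()(const WordCount& a, const WordCount& b) const {
        if (a.count != b.count) return a.count > b.count;
        return a.word < b.word;
    }
};

std::vector<std::string> topKBounded(const std::vector<std::string>& words, int k) {
    std::vector<std::string> result;
    if (k <= 0) return result;

    // Count occurrences of each word.
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) ++counts[w];

    // Keep only the k best-ranked entries seen so far.
    std::priority_queue<WordCount, std::vector<WordCount>, RanksBetter> heap;
    for (const auto& kv : counts) {
        heap.push(WordCount{kv.first, kv.second});
        if (heap.size() > static_cast<std::size_t>(k)) {
            heap.pop();  // drop the worst-ranked of the k + 1 entries
        }
    }

    // Entries come out worst first, so reverse to get the required order.
    while (!heap.empty()) {
        result.push_back(heap.top().word);
        heap.pop();
    }
    std::reverse(result.begin(), result.end());
    return result;
}
```

Note that the comparator here is the opposite of cmp: the heap has to expose the weakest of the kept entries so that it can be evicted when a better one arrives.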
