LintCode: Inverted Index

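Description

Create an inverted index for the given documents.

You may assume the data contains no punctuation.

Examples

You are given a list of documents, each with an id and a content string (the Document class is provided).
Return an inverted index: a hashmap whose key is a word and whose value is the list of ids of the documents that contain that word.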

Example 1:

Input:
[
{
"id": 1,
"content": "This is the content of document 1 it is very short"
},
{
"id": 2,
"content": "This is the content of document 2 it is very long bilabial bilabial heheh hahaha …"
},
]
Output:
{
"This": [1, 2],
"is": [1, 2],

}
Example 2:

Input:
[
{
"id": 1,
"content": "you are young"
},
{
"id": 2,
"content": "you are handsome"
},
]
Output:
{
"are": [1, 2],

}

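Approach

Split each document's content into whitespace-separated words and, for each word, append the document's id to that word's vector in the map. Finally, remove the duplicate ids from each vector. Because the documents are processed one at a time, the repeats produced by a word that occurs several times in one document always sit next to each other, so removing adjacent duplicates is enough.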
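The final deduplication step relies on the erase/unique idiom from the standard library. Below is a minimal sketch of that step in isolation; the helper name dropAdjacentDuplicates is made up for illustration and does not appear in the original solution.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical helper (not part of the original solution).
// std::unique moves the first element of each run of equal values to the
// front and returns an iterator to the new logical end; erase then trims
// the leftover tail.
std::vector<int> dropAdjacentDuplicates(std::vector<int> ids) {
    ids.erase(std::unique(ids.begin(), ids.end()), ids.end());
    return ids;
}
```

Note that std::unique only collapses consecutive equal elements, so this works here only because equal ids are pushed back to back. A vector such as {1, 2, 1} would be left unchanged.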

Code

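#include <algorithm>
#include <map>
#include <sstream>
#include <string>
#include <vector>
using namespace std;

/**
 * Definition of Document:
 * class Document {
 * public:
 *     int id;
 *     string content;
 * }
 */
class Solution {
public:
    /**
     * @param docs a list of documents
     * @return an inverted index
     */
    map<string, vector<int>> invertedIndex(vector<Document>& docs) {
        map<string, vector<int>> m;
        for (size_t i = 0; i < docs.size(); i++) {
            int m_id = docs[i].id;
            // Use a fresh stream per document so no state carries over.
            stringstream ss(docs[i].content);
            // Split on whitespace and record this document's id under each word.
            for (string str; ss >> str; m[str].push_back(m_id));
        }
        // A word repeated within one document pushes the same id several
        // times in a row; unique + erase collapses those adjacent duplicates.
        for (map<string, vector<int>>::iterator it = m.begin(); it != m.end(); it++) {
            it->second.erase(unique(it->second.begin(), it->second.end()), it->second.end());
        }
        return m;
    }
};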
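
For trying the idea outside LintCode, here is a self-contained sketch. It is a variant, not the submitted solution: the Document struct is a stand-in for the class the judge supplies, the function name buildInvertedIndex is made up for illustration, and instead of erase/unique it skips a repeated id at insertion time by checking the last element of the vector.

```cpp
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the Document class that LintCode provides (assumption).
struct Document {
    int id;
    std::string content;
};

// Hypothetical free function, not the original Solution::invertedIndex.
// Builds a map from each word to the ids of the documents containing it.
std::map<std::string, std::vector<int>> buildInvertedIndex(
        const std::vector<Document>& docs) {
    std::map<std::string, std::vector<int>> index;
    for (const Document& doc : docs) {
        std::istringstream words(doc.content);  // splits on whitespace
        std::string word;
        while (words >> word) {
            std::vector<int>& ids = index[word];
            // Documents are processed one at a time, so a repeat of this
            // document's id can only be the last element.
            if (ids.empty() || ids.back() != doc.id) {
                ids.push_back(doc.id);
            }
        }
    }
    return index;
}
```

With the two documents from Example 2, this sketch maps "are" and "you" to [1, 2], "young" to [1], and "handsome" to [2].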

Reposted from blog.csdn.net/qq_40147449/article/details/88886134