Data structure notes--applications of hash tables (RandomPool structure, Bloom filter, and consistent hashing)

Table of contents

1--RandomPool structure

2--Bloom filter

3--Consistent hashing


1--RandomPool structure

Design a RandomPool structure that supports the following three operations:

        ① insert(key): add a key to the structure; a key that is already present is not added again;

        ② delete(key): remove a key from the structure;

        ③ getRandom(): return one of the keys in the structure, each with equal probability;

Requirement: all three operations must run in O(1) time;

Main idea:

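        Use two hash tables: keyIndexmap stores (key, index) and indexKeymap stores (index, key). The index starts at 0 and grows by one with each insertion; the indices in use are kept contiguous in [0, size - 1], where size is the number of keys stored, so that a random index in that range always maps to a key;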

#include <cstdlib>
#include <iostream>
#include <unordered_map>
#include <string>

class randomPool{
public:
    void insertKey(std::string key){
        if(keyIndexmap.find(key) == keyIndexmap.end()){
            keyIndexmap[key] = this->size;
            indexKeymap[this->size] = key;
            this->size++;
        }
    }

    void deleteKey(std::string key){
        if(keyIndexmap.find(key) != keyIndexmap.end()){
            int deleteIndex = keyIndexmap[key];
            int lastIndex = this->size - 1;
            std::string lastKey = indexKeymap[lastIndex];
            // Overwrite the deleted (key, deleteIndex) slot with (lastKey, lastIndex)
            keyIndexmap[lastKey] = deleteIndex;
            indexKeymap[deleteIndex] = lastKey;
            // Remove (key, deleteIndex) and (lastIndex, lastKey)
            // This keeps the indices in indexKeymap contiguous so that getRandomKey() can pick any of them
            keyIndexmap.erase(key);
            indexKeymap.erase(lastIndex);
            // Shrink the pool; otherwise later inserts and random picks would use stale indices
            this->size--;
        }
    }

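    std::string getRandomKey(){
        // An empty pool has nothing to return (and rand() % 0 is undefined)
        if(size == 0){
            return "";
        }
        // Generate a random index in [0, size - 1]
        int random = rand() % size;
        return indexKeymap[random];
    }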

public:
    std::unordered_map<std::string, int> keyIndexmap;
    std::unordered_map<int, std::string> indexKeymap;
    int size = 0;
};

int main(int argc, char *argv[]){
    randomPool rp1;
    rp1.insertKey("A");
    rp1.insertKey("B");
    rp1.insertKey("C");
    std::cout << "***************keyIndexmap: " << std::endl;
    for(auto &it : rp1.keyIndexmap){
        std::cout << "(" << it.first << ", " << it.second << ")" << std::endl; 
    }

    std::cout << "***************indexKeymap: " << std::endl;
    for(auto &it : rp1.indexKeymap){
        std::cout << "(" << it.first << ", " << it.second << ")" << std::endl; 
    }

    std::cout << "***************After deleteKey:" << std::endl;
    rp1.deleteKey("A");
    std::cout << "***************keyIndexmap: " << std::endl;
    for(auto &it : rp1.keyIndexmap){
        std::cout << "(" << it.first << ", " << it.second << ")" << std::endl; 
    }

    std::cout << "***************getRandomkey: " << rp1.getRandomKey() << std::endl;
    return 0;
}

2--Bloom filter

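A Bloom filter is generally used for membership queries in information retrieval: it answers whether an element is possibly in a set or definitely not in it, using far less memory than storing the elements themselves. For a video explanation, please refer to: Bloom filter explanations (the relevant part starts at 1:08:00)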
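
Since this section only points to a video, here is a minimal sketch of the idea in code. It is an illustration under stated assumptions, not a reference implementation: the class name BloomFilter, the method names add and mightContain, and the double-hashing scheme built on std::hash are all choices made for this example. The filter keeps a bit array and sets k positions per key; a query returns false only when the key was never added, and true when the key was possibly added (false positives can occur, false negatives cannot).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Minimal Bloom filter sketch. All names and parameters are illustrative assumptions.
class BloomFilter {
public:
    // bitCount: size of the bit array; hashCount: number of positions set per key.
    BloomFilter(std::size_t bitCount, std::size_t hashCount)
        : bits(bitCount, false), numHashes(hashCount) {
        assert(bitCount > 0 && hashCount > 0);
    }

    // Set the numHashes bit positions derived from the key.
    void add(const std::string &key) {
        for (std::size_t i = 0; i < numHashes; ++i) {
            bits[position(key, i)] = true;
        }
    }

    // false: the key was definitely never added.
    // true:  the key was possibly added (may be a false positive).
    bool mightContain(const std::string &key) const {
        for (std::size_t i = 0; i < numHashes; ++i) {
            if (!bits[position(key, i)]) {
                return false;
            }
        }
        return true;
    }

private:
    // Double hashing: h1 + i * h2 stands in for numHashes independent hash functions.
    // The "#salt" suffix is an arbitrary way to obtain a second hash value.
    std::size_t position(const std::string &key, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>{}(key);
        std::size_t h2 = std::hash<std::string>{}(key + "#salt") | 1;
        return (h1 + i * h2) % bits.size();
    }

    std::vector<bool> bits;
    std::size_t numHashes;
};
```

For example, after BloomFilter bf(1024, 3); bf.add("apple"); the call bf.mightContain("apple") always returns true, while a key that was never added usually returns false. A larger bit array lowers the false-positive rate at the cost of memory.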

3--Consistent hashing

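The consistent hash algorithm is often used in load balancing: keys and servers are hashed onto the same ring, and each key is served by the first server found clockwise from its position, so adding or removing a server only moves the keys next to it. For a video explanation, please refer to: 7-minute video explaining the consistent hash algorithm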
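
As with the Bloom filter, a small sketch may help make the idea concrete. It is only an illustration: the class name HashRing, its methods, the use of std::hash, and the default of 100 virtual nodes per server are assumptions made for this example. Each server is placed on the ring at several positions (virtual nodes), and a key is assigned to the first position at or after its own hash, wrapping around to the start of the ring when needed.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Minimal consistent-hash ring sketch. All names and defaults are illustrative assumptions.
class HashRing {
public:
    explicit HashRing(int virtualNodeCount = 100) : virtualNodes(virtualNodeCount) {
        assert(virtualNodeCount > 0);
    }

    // Place the node on the ring at virtualNodes positions.
    void addNode(const std::string &node) {
        for (int i = 0; i < virtualNodes; ++i) {
            // emplace keeps an existing entry if two positions ever collide
            ring.emplace(hashOf(node + "#" + std::to_string(i)), node);
        }
    }

    // Remove only the ring positions that belong to this node.
    void removeNode(const std::string &node) {
        for (int i = 0; i < virtualNodes; ++i) {
            auto it = ring.find(hashOf(node + "#" + std::to_string(i)));
            if (it != ring.end() && it->second == node) {
                ring.erase(it);
            }
        }
    }

    // Return the node responsible for the key, or "" if the ring is empty.
    std::string getNode(const std::string &key) const {
        if (ring.empty()) {
            return "";
        }
        // First ring position at or after the key's hash (clockwise search)
        auto it = ring.lower_bound(hashOf(key));
        if (it == ring.end()) {
            it = ring.begin(); // wrap around to the start of the ring
        }
        return it->second;
    }

private:
    static std::size_t hashOf(const std::string &s) {
        return std::hash<std::string>{}(s);
    }

    std::map<std::size_t, std::string> ring; // ordered: position -> node name
    int virtualNodes;
};
```

With three servers added, removing a server that does not own a given key leaves that key on its current server; only the keys owned by the removed server are reassigned. std::map is used here because its ordered keys make the clockwise search a single lower_bound call.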

Origin blog.csdn.net/weixin_43863869/article/details/132367688