Simulated hash table & string hash

Simulated hash table

Introduction

A hash table is a data structure that provides efficient access by key. A hash function maps each key to a storage address, and that address is then used to store and look up the data.

For example, suppose you are given 100 numbers in the range 1 to 1e8 and need to check whether any of them are duplicates. A hash table solves this problem.

For small numbers we would normally mark them in an array, but for large numbers a marking array wastes space: only 100 numbers need to be processed, yet the array would need 1e8 entries. That is not worth it, and as the value range grows it may become impossible to allocate such an array in C++ at all.

Hash function

Instead, the numbers can be transformed first. For example, taking each number modulo some value maps all of them into a fixed, smaller range.
A function that transforms the data in this way is called a hash function; the hash function used here is the modulo operation.
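As a small illustration of the modulo hash described above, the sketch below maps an arbitrary integer into the range [0, mod). The function name `hashOf` and the modulus 100003 are assumptions chosen for this example; the extra `+ mod` step is there only so that negative inputs also land in range.

```cpp
// Hypothetical helper: maps any int into [0, mod) with the modulo hash.
const int mod = 100003; // assumed table size (a prime)

int hashOf(int x) {
    // In C++, x % mod is negative when x is negative,
    // so shift the remainder up by mod and reduce again.
    return (x % mod + mod) % mod;
}
```

With this definition, `hashOf(100000000)` is 97003 and `hashOf(-1)` is 100002, so even values near 1e8 fit in a table of about 1e5 slots.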

Handling collisions

After applying the hash function, different inputs may produce the same value, for example 5 % 3 = 2 and 8 % 3 = 2. Such a collision can make a query return the wrong answer, so collisions must be handled explicitly. (A prime modulus is the usual choice, because it tends to spread inputs that follow a regular pattern more evenly and so keeps collisions low.)
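To make the collision idea concrete, here is a minimal sketch that counts how many values land in a bucket that is already occupied. The helper name `countCollisions` is hypothetical, and it assumes non-negative inputs and a positive modulus.

```cpp
#include <vector>

// Hypothetical helper: counts how many values fall into a bucket that
// an earlier value already occupies, under the hash x mod m.
// Assumes every value is non-negative and m > 0.
int countCollisions(const std::vector<int>& values, int m) {
    std::vector<bool> used(m, false);
    int collisions = 0;
    for (int x : values) {
        int bucket = x % m;
        if (used[bucket]) ++collisions; // bucket already taken
        else used[bucket] = true;       // first value in this bucket
    }
    return collisions;
}
```

For the pair 5 and 8 with modulus 3 this reports one collision. As a rough illustration of why primes are preferred, the multiples of 4 from 0 to 28 give 6 collisions with modulus 8 but only 1 with modulus 7.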

Chaining (zipper method)

Chaining is a very simple way to handle collisions. All values that hash to the same bucket are stored in one linked list, and a query only needs to walk that list to see whether the value is present.

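const int mod = 100003;            // prime number of buckets
int h[100005], cnt = 1;            // h[b]: first node of bucket b, 0 means empty
struct node {
    int nex, number;               // index of the next node, stored value
} edge[100005];                    // node pool for the linked lists
void insert(int x) {
    int hashCode = (x % mod + mod) % mod; // also maps negative numbers into [0, mod)
    edge[cnt].number = x;
    edge[cnt].nex = h[hashCode];   // link the new node in front of the old head
    h[hashCode] = cnt++;
}
bool find(int x) {
    int hashCode = (x % mod + mod) % mod;
    for (int i = h[hashCode]; i != 0; i = edge[i].nex) {
        // walk every value stored in this bucket
        if (edge[i].number == x) return true;
    }
    return false;
}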
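To tie this back to the duplicate-detection example from the introduction, the following sketch wraps the same chaining scheme in a small struct and uses it to check a list of numbers. The names `ChainedSet`, `bucketOf` and `hasDuplicate` are assumptions made for this illustration, and the node pool grows with `std::vector` instead of using fixed-size arrays.

```cpp
#include <vector>

const int MOD = 100003; // assumed prime bucket count

// Illustrative chained hash set; index 0 of the node pool means "null".
struct ChainedSet {
    std::vector<int> head;        // head[b]: first node of bucket b, 0 = empty
    std::vector<int> next, value; // node pool

    ChainedSet() : head(MOD, 0), next(1, 0), value(1, 0) {}

    static int bucketOf(int x) { return (x % MOD + MOD) % MOD; }

    bool contains(int x) const {
        for (int i = head[bucketOf(x)]; i != 0; i = next[i])
            if (value[i] == x) return true;
        return false;
    }

    void insert(int x) {
        int b = bucketOf(x);
        value.push_back(x);
        next.push_back(head[b]);  // new node points at the old head
        head[b] = (int)value.size() - 1;
    }
};

// Returns true if any value appears more than once.
bool hasDuplicate(const std::vector<int>& values) {
    ChainedSet seen;
    for (int x : values) {
        if (seen.contains(x)) return true;
        seen.insert(x);
    }
    return false;
}
```

Note that 5 and 100008 share a bucket here, yet `hasDuplicate` still treats them as distinct, because the chain stores the original values and compares them directly.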

String hash

String hashing (as used in contests) is usually done with prefix hashes. A string can be viewed as a number written in base P. Let sum[i] denote the hash of the first i characters. For the string "acf" we have:

$$sum[1] = \text{'a'}$$
$$sum[2] = sum[1] \cdot P + \text{'c'}$$
$$sum[3] = sum[2] \cdot P + \text{'f'}$$

The values obtained this way grow very large, so they must be reduced modulo some number; that completes the prefix hash.
P is usually taken to be 131 and the modulus $2^{64}$, so that storing the values in unsigned long long applies the modulus automatically through overflow. The probability that two different strings get the same hash is very small, so equal hashes can almost always be treated as equal strings. The hash of the substring from position l to position r is given by the formula

$$sum[r] - sum[l-1] \cdot P^{r-l+1}$$
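As a quick check of the substring formula, take "acf" with l = 2 and r = 3, assuming the prefix hash is built by the recurrence sum[i] = sum[i-1] * P + s[i] (which is what the code for this problem does). All arithmetic is understood modulo $2^{64}$.

```latex
\begin{aligned}
sum[1] &= \text{'a'} \\
sum[3] &= \text{'a'} \cdot P^{2} + \text{'c'} \cdot P + \text{'f'} \\
sum[3] - sum[1] \cdot P^{2} &= \text{'c'} \cdot P + \text{'f'}
\end{aligned}
```

The right-hand side of the last line is exactly the hash that "cf" would get if it were hashed on its own, which is why two equal substrings always produce equal values regardless of where they start.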
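Problem: String hash

Given a string of length n and m queries, each query asks whether the substrings [L1, R1] and [L2, R2] are identical.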

#include <cstdio>
typedef unsigned long long ull;
const int maxn = 100005;
const ull P = 131;
int n, m, L1, R1, L2, R2;
ull p[maxn], sum[maxn];   // p[i] = P^i, sum[i] = hash of the first i characters
char str[maxn];
// hash of the substring str[l..r] (1-based, inclusive)
ull finds(int l, int r) {
    return sum[r] - sum[l - 1] * p[r - l + 1];
}
int main() {
    scanf("%d %d", &n, &m);
    scanf("%s", str + 1);     // store the string starting at index 1
    p[0] = 1;
    for (int i = 1; i <= n; i++) {
        p[i] = p[i - 1] * P;
        sum[i] = sum[i - 1] * P + str[i];
    }
    while (m--) {
        scanf("%d %d %d %d", &L1, &R1, &L2, &R2);
        if (finds(L1, R1) == finds(L2, R2))
            printf("Yes\n");
        else
            printf("No\n");
    }
    return 0;
}
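The same idea can also be packaged without global arrays, which makes it easier to try on a fixed string. The sketch below is one possible arrangement, not the only one; the names `PrefixHash`, `get` and `sameSubstring` are assumptions for this example, and characters are cast to `unsigned char` before being added.

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned long long ull;

// Illustrative wrapper around the prefix hash; arithmetic wraps mod 2^64.
struct PrefixHash {
    std::vector<ull> power, sum; // power[i] = base^i, sum[i] = hash of first i chars

    explicit PrefixHash(const std::string& s, ull base = 131)
        : power(s.size() + 1, 1), sum(s.size() + 1, 0) {
        for (std::size_t i = 1; i <= s.size(); i++) {
            power[i] = power[i - 1] * base;
            sum[i] = sum[i - 1] * base + (unsigned char)s[i - 1];
        }
    }

    // Hash of the substring at 1-based positions l..r (inclusive).
    ull get(int l, int r) const {
        return sum[r] - sum[l - 1] * power[r - l + 1];
    }
};

// True if the two 1-based ranges hash to the same value.
bool sameSubstring(const PrefixHash& h, int l1, int r1, int l2, int r2) {
    return h.get(l1, r1) == h.get(l2, r2);
}
```

For the string "abcabc", `sameSubstring(h, 1, 3, 4, 6)` is true because both ranges spell "abc", while `sameSubstring(h, 1, 2, 2, 3)` is false because "ab" and "bc" hash to different values.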

Origin blog.csdn.net/qq_36102055/article/details/107409183