Study notes: hash table

Integer hash:

Hashing is useful when the values of x span a very large range but only a limited number of them actually occur, so each value can be represented by a much smaller number. This is a bit like discretization, except that discretization is an order-preserving hash: the relative order must not be disturbed, so if a was larger than b before the mapping, its mapped value is still larger than b's.
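
To make the contrast concrete, here is a minimal sketch of discretization, assuming the usual sort, deduplicate, and binary-search approach. The function name discretize is a hypothetical helper and does not appear in these notes.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Discretization: map each value to its rank among the distinct values.
// Order is preserved: a < b implies rank(a) < rank(b).
std::vector<int> discretize(const std::vector<int>& a) {
    std::vector<int> sorted(a);
    std::sort(sorted.begin(), sorted.end());
    sorted.erase(std::unique(sorted.begin(), sorted.end()), sorted.end());

    std::vector<int> rank(a.size());
    for (std::size_t i = 0; i < a.size(); i++) {
        // Position of a[i] in the sorted list of distinct values.
        rank[i] = (int)(std::lower_bound(sorted.begin(), sorted.end(), a[i]) - sorted.begin());
    }
    return rank;
}
```

For the input {1000000000, -5, 7, -5} this gives the ranks {2, 0, 1, 0}. A plain hash gives no such ordering guarantee, but it also needs no sorting.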

An integer can generally be hashed by taking it modulo some number and using the remainder as its hash value. The modulus is usually a prime, which reduces the chance of conflicts: when many keys share a common factor with the modulus they crowd into a fraction of the slots, and a prime has no smaller factors for this to happen with. Conflicts are still quite likely, though (that is, a and b end up with the same hash value). There are two ways to deal with conflicts: 1. Zipper method (separate chaining) 2. Open addressing method
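
The modulo step can be sketched on its own as follows. The name hash_of is a hypothetical helper; the modulus 100003 is the prime that the code later in these notes uses.

```cpp
// Table size; 100003 is a prime just above 1e5.
const int N = 100003;

// In C++, x % N is zero or negative when x is negative,
// so add N and reduce again to land in [0, N-1].
int hash_of(int x) {
    return (x % N + N) % N;
}
```

With this function, 5 and 100008 both hash to 5, which is exactly the kind of conflict the two methods below deal with, and -1 hashes to 100002 rather than to a negative index.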
————————————————————
Zipper method:
Each slot of the hash table holds a chain, and all the values that map to that slot are strung together on it (that is, each slot is the head of a singly linked list).

————————————————————
Open addressing method:
Allocate a hash table 2~3 times the actual number of elements. When mapping a value, if the slot for its hash value is already occupied, move on to the next slot until you reach an empty slot or find the value itself.

————————————————————
Go straight to a problem:
Maintain a set and support the following operations:
"I x", insert a number x;
"Q x", ask whether the number x has appeared in the set.
N operations are performed, and the corresponding result must be output for each query operation.

Input format
The first line contains the integer N, the number of operations.
Each of the next N lines contains one operation instruction, which is either "I x" or "Q x".

Output format
For each query "Q x", output one result. If x has appeared in the set, output "Yes", otherwise output "No". Each result is on its own line.

Data range
1≤N≤1e5, −1e9≤x≤1e9

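Zipper method:

#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;

const int N = 100003; // table size, a prime just above 1e5

// h[k]: index of the first node in slot k's chain, or -1 if the chain is empty
// e[i]: value stored in node i; ne[i]: index of the next node; idx: next unused node
int h[N], e[N], ne[N], idx;

void insert(int x)
{
	int k = (x % N + N) % N; // k is the hash value; adding N keeps it non-negative

	e[idx] = x; ne[idx] = h[k]; h[k] = idx++; // same as inserting at the head of a singly linked list
}

bool find(int x) // query whether x exists
{
	int k = (x % N + N) % N; // hash value of x
	for (int i = h[k]; ~i; i = ne[i]) // walk the chain from its head; ~i is 0 only when i == -1
		if (e[i] == x) return true;
	return false;
}

int main()
{
	memset(h, -1, sizeof(h)); // as with a singly linked list, every slot starts empty; -1 means null

	int n; cin >> n;
	char op; int x;
	while (n--)
	{
		cin >> op >> x;
		if (op == 'I') insert(x);
		else
		{
			if (find(x)) puts("Yes");
			else puts("No");
		}
	}
	return 0;
}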
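
For checking a hand-written table against a known-good answer, the same operations can be run through the standard library's std::unordered_set. This is a different technique from the array-based chains above, shown only as a cross-check; run_ops is a hypothetical helper name.

```cpp
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Runs a list of (op, x) operations and returns the answers to the 'Q' queries.
std::vector<std::string> run_ops(const std::vector<std::pair<char, int>>& ops) {
    std::unordered_set<int> seen;
    std::vector<std::string> answers;
    for (const auto& op : ops) {
        if (op.first == 'I') {
            seen.insert(op.second);  // "I x": record x
        } else {
            // "Q x": count() is 1 if x was inserted before, 0 otherwise
            answers.push_back(seen.count(op.second) ? "Yes" : "No");
        }
    }
    return answers;
}
```

Feeding it I 1, I 2, I 3, Q 2, Q 5 gives "Yes" then "No", which is what the zipper-method program should print for the same input.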

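Open addressing method:

#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;

// With open addressing the table must be 2~3 times the actual number of elements.
// null marks an empty slot; 0x3f3f3f3f is larger than 1e9, so it can never be a real x.
const int N = 200003, null = 0x3f3f3f3f;

int h[N];

// Returns the slot where x is stored, or the empty slot where x should be inserted.
int find(int x)
{
	int t = (x % N + N) % N; // adding N keeps the hash non-negative
	while (h[t] != null && h[t] != x) // probe until an empty slot or x itself
	{
		t++;
		if (t == N) t = 0; // after the last slot, continue from slot 0
	}
	return t;
}

int main()
{
	memset(h, 0x3f, sizeof(h)); // every byte becomes 0x3f, so every int becomes 0x3f3f3f3f (empty)

	int n; cin >> n;
	char op; int x;
	while (n--)
	{
		cin >> op >> x;
		if (op == 'I') h[find(x)] = x;
		else
		{
			if (h[find(x)] == null) puts("No");
			else puts("Yes");
		}
	}
	return 0;
}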
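
The probing and wrap-around are easier to trace by hand on a tiny table. The sketch below repeats the same find() logic with an adjustable size; TinyTable and EMPTY are hypothetical names introduced only for this illustration.

```cpp
#include <vector>

const int EMPTY = 0x3f3f3f3f;  // sentinel for an empty slot, as in the code above

struct TinyTable {
    std::vector<int> slot;

    explicit TinyTable(int size) : slot(size, EMPTY) {}

    // Slot holding x, or the empty slot where x would be inserted.
    // Loops forever if the table is full and x is absent, which is
    // one reason the real table is kept 2~3 times larger than needed.
    int find(int x) const {
        int n = (int)slot.size();
        int t = (x % n + n) % n;
        while (slot[t] != EMPTY && slot[t] != x) {
            t++;
            if (t == n) t = 0;  // wrap around to slot 0
        }
        return t;
    }

    void insert(int x) { slot[find(x)] = x; }
    bool contains(int x) const { return slot[find(x)] == x; }
};
```

With a table of size 5, inserting 3, 8 and 13 (all of which hash to 3) places them in slots 3, 4 and 0 in that order: 8 is pushed one step forward, and 13 walks off the end and wraps to the start. A query for 18 probes slots 3, 4, 0, reaches the empty slot 1, and reports that 18 is absent.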

————————————————————
String Hash:

The hashing method for strings is called string prefix hashing.
The string is preprocessed to obtain the hash values of all its prefixes
(for a string of length n, h[1] is the hash value of the first character, h[2] is the hash value of the first 2 characters, ..., h[i] is the hash value of the first i characters).
The idea is to treat the string as a number in base P: the hash value of the string is the value of that base-P number. Because this number can be very large, it is then taken modulo some number.
Points to note:
1. A character must not be mapped to 0.
2. Just as integers can conflict, strings can too, but here we assume our luck is good enough (conflicts between strings are very unlikely) and ignore conflicts.
3. P is generally taken as 131 or 13331, and the modulus as 2^64. According to "metaphysics" <( ̄ˇ ̄)/, that is, as an empirical rule of thumb, this avoids conflicts in the vast majority of cases (99.99%).
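
Point 1 can be checked with a small experiment. The sketch below is an illustration only: poly_hash, zero_based and char_code are hypothetical names, and zero_based is a deliberately bad mapping that sends 'A' to 0.

```cpp
#include <string>

typedef unsigned long long ull;

// Hash of s read as a base-P number, with each character mapped by `value`.
// Arithmetic on ull wraps around on overflow.
ull poly_hash(const std::string& s, ull P, ull (*value)(char)) {
    ull h = 0;
    for (char c : s) h = h * P + value(c);
    return h;
}

// Bad mapping: 'A' becomes 0, so leading 'A's contribute nothing.
ull zero_based(char c) { return (ull)(c - 'A'); }

// Mapping by character code: never 0 for letters and digits.
ull char_code(char c) { return (ull)(unsigned char)c; }
```

Under zero_based, "A" and "AA" both hash to 0 and "AB" hashes to the same value as "B", just as 007 and 7 are the same decimal number. Under char_code, "A" hashes to 65 and "AA" to 65*131+65, so the two are told apart.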
Tip:
Taking values modulo 2^64 gives numbers in the range [0, 2^64-1], which is exactly the range of a 64-bit unsigned long long. So the array h can be declared with type unsigned long long (ull): whenever a result exceeds this range, the arithmetic wraps around automatically, which takes the modulus for us.
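
The wrap-around can be seen directly. In this minimal sketch, step is a hypothetical name for the multiply-add used when extending a prefix hash; it assumes unsigned long long is 64 bits wide, as it is on common platforms.

```cpp
#include <climits>

typedef unsigned long long ull;

// One step of the prefix hash: h * P + c.
// Unsigned overflow is well defined in C++ and wraps modulo 2^64
// when ull is 64 bits wide, so no explicit % is needed.
ull step(ull h, ull P, ull c) {
    return h * P + c;
}
```

For example, step(ULLONG_MAX, 2, 3) is (2^64-1)*2+3 = 2^65+1, which wraps to 1. Note that this only holds for unsigned types; overflow on a signed long long is undefined behavior.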
Next comes the advantage of combining prefix hashing with base P:
the prefix hashes can be used to compute the hash of any substring.
For example, once the prefix hashes of a string have been preprocessed, the hash value of the substring from l to r can be found directly as: h[r]-h[l-1]*p^(r-l+1)
An analogy with the decimal system makes this easy to understand:
for example, in 123456, to get the segment from the 4th to the 6th digit,
456=123456-123*10^3
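
The decimal analogy can be run as code. In this sketch, PrefixHash is a hypothetical helper that bundles the arrays h and p for one string; the offset parameter is an assumption added so that base 10 with offset '0' reproduces the digits exactly (which maps '0' to 0 and is only acceptable for this demonstration).

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned long long ull;

struct PrefixHash {
    std::vector<ull> h;  // h[i]: hash of the first i characters
    std::vector<ull> p;  // p[i]: base to the power i

    // offset is subtracted from every character before it is used as a digit.
    PrefixHash(const std::string& s, ull base, char offset)
        : h(s.size() + 1, 0), p(s.size() + 1, 1) {
        for (std::size_t i = 1; i <= s.size(); i++) {
            p[i] = p[i - 1] * base;
            h[i] = h[i - 1] * base + (ull)(s[i - 1] - offset);
        }
    }

    // Hash of the substring covering positions l..r (1-based, inclusive).
    ull get(int l, int r) const {
        return h[r] - h[l - 1] * p[r - l + 1];
    }
};
```

With PrefixHash d("123456", 10, '0'), d.get(4, 6) is 123456-123*10^3 = 456. With base 131 and offset 0 on "abcabc", get(1, 3) and get(4, 6) are equal because both cover "abc". Multiplying h[l-1] by p[r-l+1] shifts the prefix left by the length of the substring so that the subtraction cancels it.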

Go straight to a problem:
AcWing 841. String hash
Given a string of length n and then m queries, each consisting of four integers l1, r1, l2, r2, determine whether the substrings covered by the two ranges [l1,r1] and [l2,r2] are exactly the same.
The string contains only uppercase and lowercase English letters and digits.

Input format
The first line contains the integers n and m, the length of the string and the number of queries.
The second line contains a string of length n, consisting only of uppercase and lowercase English letters and digits.
Each of the next m lines contains four integers l1, r1, l2, r2, the two intervals involved in one query.
Note that positions in the string are numbered from 1.

Output format
Output one result for each query. If the two substrings are exactly the same, output "Yes", otherwise output "No". Each result is on its own line.

Data range
1≤n,m≤1e5

#include <cstdio>
#include <iostream>
using namespace std;

typedef unsigned long long ull;

const int N = 1e5 + 10; // room for the 1-based string plus its terminating '\0'

const int P = 131; // the base P
char str[N];
ull h[N], p[N]; // h[i]: hash of the first i characters; p[i]: P to the power i

ull get(int l, int r) // hash value of the substring [l,r]
{
	return h[r] - h[l - 1] * p[r - l + 1];
}

int main()
{
	int n, m; cin >> n >> m;

	p[0] = 1; // P to the power 0 is 1; precompute all powers of P
	for (int i = 1; i <= n; i++) p[i] = p[i - 1] * P;

	scanf("%s", str + 1); // store the string starting at index 1
	for (int i = 1; i <= n; i++) // compute the prefix hashes
		h[i] = h[i - 1] * P + str[i]; // use the character code as the digit

	while (m--)
	{
		int l1, r1, l2, r2;
		cin >> l1 >> r1 >> l2 >> r2;
		if (get(l1, r1) == get(l2, r2)) // if the hashes of [l1,r1] and [l2,r2] are equal, treat the two substrings as equal
			puts("Yes");
		else
			puts("No");
	}
	return 0;
}


 


Origin blog.csdn.net/m0_50815157/article/details/113637933