HDU1686: Oulipo [String Hashing, KMP]

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
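
As a baseline for the definition above, here is a minimal brute-force sketch that tries every start position in T. The helper name `countNaive` is hypothetical and not part of the original post. With |W| up to 10,000 and |T| up to 1,000,000, its worst case is on the order of 10^10 character comparisons, which is why the solutions below use KMP or hashing instead.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not from the original post): counts occurrences of
// w in t by checking every start position. Overlapping matches are counted.
int countNaive(const std::string& w, const std::string& t) {
    if (w.empty() || w.size() > t.size()) return 0;
    int count = 0;
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start) {
        // compare() returns 0 when the window t[start, start + |w|) equals w
        if (t.compare(start, w.size(), w) == 0) ++count;
    }
    return count;
}
```

For the second sample, `countNaive("AZA", "AZAZAZA")` returns 3, because the matches at positions 0, 2 and 4 overlap.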

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

1
3
0

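Problem summary: count how many times string A occurs in string B. This is a textbook KMP or string-hashing exercise; a standard template solves it directly.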

KMP solution and its running time:

Status	Accepted
Time	109ms
Memory	6692kB
Length	709

#include <iostream>
#include <cstdio>
#include <cstring>
#include <algorithm>
using namespace std;
const int MAXN=1e6+7;
int nex[MAXN],f[MAXN];
char s1[MAXN],s2[MAXN];
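// Build the failure table: nex[i] is the length of the longest proper prefix
// of s1[1..i] that is also a suffix of it (strings are 1-indexed).
void getNext(){
	nex[1]=0;
	int len=strlen(s1+1);
	for(int i=2,j=0;i<=len;i++){
		while(j>0&&s1[i]!=s1[j+1]) j=nex[j]; // mismatch: fall back to a shorter border
		if(s1[i]==s1[j+1]) j++;
		nex[i]=j;
	}
}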
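// Count occurrences of the word s1 in the text s2; overlapping matches count.
int kmp(){
	int len1=strlen(s2+1),len2=strlen(s1+1);
	int ans=0;
	for(int i=1,j=0;i<=len1;i++){
		// fall back after a full match (j==len2) or on a mismatch
		while(j>0&&(j==len2||s2[i]!=s1[j+1])) j=nex[j];
		if(s2[i]==s1[j+1]) j++;
		f[i]=j; // f[i]: length of the longest prefix of s1 that ends at s2[i]
		if(f[i]==len2) ans++;
	}
	return ans;
}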
int main(){
	int t;
	scanf("%d",&t);
	while(t--){
		scanf("%s%s",s1+1,s2+1);
		getNext();
		// no memset needed: kmp() writes f[i] before reading it
		printf("%d\n",kmp());
	}
	return 0;
}
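
To make the failure table easier to inspect, here is a 0-indexed sketch of the same idea built on `std::string`. The names `prefixFunction` and `countKmp` are hypothetical and this is a restatement for illustration, not the code that was submitted. After a full match it falls back to the longest border of the word, which is how overlapping occurrences get counted.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical 0-indexed helper: pi[i] is the length of the longest proper
// prefix of w[0..i] that is also a suffix of w[0..i].
std::vector<int> prefixFunction(const std::string& w) {
    std::vector<int> pi(w.size(), 0);
    for (std::size_t i = 1; i < w.size(); ++i) {
        int j = pi[i - 1];
        while (j > 0 && w[i] != w[j]) j = pi[j - 1];  // try a shorter border
        if (w[i] == w[j]) ++j;
        pi[i] = j;
    }
    return pi;
}

// Counts (possibly overlapping) occurrences of w in t in O(|w| + |t|).
int countKmp(const std::string& w, const std::string& t) {
    if (w.empty()) return 0;
    const std::vector<int> pi = prefixFunction(w);
    const int m = static_cast<int>(w.size());
    int count = 0, j = 0;  // j = number of characters of w currently matched
    for (char c : t) {
        while (j > 0 && c != w[j]) j = pi[j - 1];
        if (c == w[j]) ++j;
        if (j == m) {       // full match ending at this character
            ++count;
            j = pi[j - 1];  // keep the longest border so overlaps are found
        }
    }
    return count;
}
```

For the word `AZA` the table is `{0, 0, 1}`: after matching `AZA`, one character (`A`) is kept, so the next `ZA` completes another match.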

Hashing solution and its running time:

Status	Accepted
Time	109ms
Memory	18008kB
Length	1239

#include <string>
#include <cstdio>
#include <algorithm>
#include <cstring>
#include <iostream>
using namespace std;
typedef unsigned long long ull;
const ull B=13331;
const int MAXN=1e6+5;
int a1,b1;
char a[10000+7];
char b[1000000+7];
ull p[MAXN];
ull has[MAXN];
/*int contain(char a[],char b[]){
    int a1=strlen(a),b1=strlen(b);
    ull t=1;
    for(int i=0;i<a1;i++)
        t=t*B;
    ull ah=0,bh=0;
    for(int i=0;i<a1;i++) ah=ah*B+a[i];
    for(int i=0;i<a1;i++) bh=bh*B+b[i];
    int cnt=0;
    for(int i=0;i+a1<=b1;i++){
        if(ah==bh) cnt++;
        else bh=bh*B+b[i+a1]-b[i]*t;// slide the window one position to the right
    }
    return cnt;
}*/
void get(){
    p[0]=1;
    for(int i=1;i<MAXN;i++){ // p has MAXN elements, so the last valid index is MAXN-1
        p[i]=p[i-1]*B;
    }
}
ull calculatea(char s[]){
    ull sum=0;
    for(int i=a1-1;i>=0;i--)
        sum=sum*B+s[i];
    return sum;
}
void calculateb(char s[]){
    has[b1]=0;
    for(int i=b1-1;i>=0;i--)
        has[i]=has[i+1]*B+s[i];
}
int main(){
	int n;
	scanf("%d",&n);
	get();
	while(n--){
        scanf("%s%s",a,b);
		a1=strlen(a);
		b1=strlen(b);
		ull ah=calculatea(a);
		calculateb(b);
		int ans=0;
		for(int i=0;i<=b1-a1;i++){
            if(ah==has[i]-has[i+a1]*p[a1])
                ans++;
		}
		printf("%d\n",ans);
	}
	return 0;
}
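
The comparison `has[i]-has[i+a1]*p[a1]` works because `has[i]` is a suffix hash: it equals b[i] + b[i+1]·B + b[i+2]·B² + …, so subtracting `has[i+a1]` multiplied by B^a1 leaves exactly the a1 terms of the window that starts at i. The following is a compact sketch of that identity using `std::string` and `std::vector`; the name `countByHash` is hypothetical. As with any hash comparison, a collision could in principle produce a false match; the arithmetic relies on unsigned 64-bit wraparound, as in the code above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

using ull = unsigned long long;

// Hypothetical helper: counts windows of t whose hash equals the hash of w.
int countByHash(const std::string& w, const std::string& t) {
    const ull B = 13331;  // same base as the solution above
    const std::size_t m = w.size(), n = t.size();
    if (m == 0 || m > n) return 0;

    // suf[i] = t[i] + t[i+1]*B + t[i+2]*B^2 + ... (mod 2^64), with suf[n] = 0
    std::vector<ull> suf(n + 1, 0);
    for (std::size_t i = n; i-- > 0;)
        suf[i] = suf[i + 1] * B + static_cast<unsigned char>(t[i]);

    // Hash of w in the same form, plus B^m for cutting off the tail.
    ull wordHash = 0, powM = 1;
    for (std::size_t i = m; i-- > 0;) {
        wordHash = wordHash * B + static_cast<unsigned char>(w[i]);
        powM *= B;
    }

    int count = 0;
    for (std::size_t i = 0; i + m <= n; ++i) {
        // Remove the contribution of t[i+m..] to isolate the window t[i..i+m)
        if (suf[i] - suf[i + m] * powM == wordHash) ++count;
    }
    return count;
}
```

Computing B^m on the fly avoids a global power table, at the cost of recomputing it for each test case.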
