## Luogu P3375 [Template] KMP String Matching

The KMP algorithm solves the string matching problem: given two strings A and B, determine whether B is a substring of A, how many times B occurs in A, and at which positions of A it occurs.

What KMP is good for: if you post something on Luogu, kkksc03 can use KMP to find out whether you used any inappropriate words and mask them in your post (for example, "chicken you so beautiful cxk" would be masked as "**** cxk"), and you will be penalized according to the number of inappropriate words in your post.

For example, with A: GCAKIOI and B: GC, we say that B is a substring of A.

We call A, the string being searched, the main string, and B, the string being searched for, the pattern string.

The naive approach is to enumerate every position in A where the first letter of B could appear and check whether B matches there. Its time complexity is O(mn), where n and m are the lengths of A and B, so it will obviously time out on a long text.

We notice that during string matching most attempts fail. Is there an algorithm that takes advantage of these failures?

KMP is exactly that.

KMP uses the information gained from a failed match to minimize the number of comparisons between the pattern string and the main string, and so achieves fast matching.

Let the main string be T.

Let the pattern string be W.

In the brute-force matching algorithm, we first compare T[0] with W[0]. If they are equal, we go on to compare the next characters, until a mismatch occurs. Then we discard all the matching information gathered so far and start over by comparing T[1] with W[0], and so on, until we reach the end of the main string or find a complete match. Because this method throws away the earlier matching information, it is very inefficient.
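The brute-force procedure described above can be sketched as follows. This is only an illustration, not code from the original post: the name `naive_find_all` and the use of `std::string` are assumptions, and positions are 0-based to match the T[0], W[0] notation of the paragraph.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): brute-force matching.
// Returns every 0-based index s such that w occurs in t starting at s.
std::vector<int> naive_find_all(const std::string& t, const std::string& w) {
    assert(!w.empty());  // an empty pattern is not meaningful here
    std::vector<int> starts;
    const int n = static_cast<int>(t.size());
    const int m = static_cast<int>(w.size());
    for (int s = 0; s + m <= n; ++s) {          // try every alignment of w against t
        int k = 0;
        while (k < m && t[s + k] == w[k]) ++k;  // compare until the first mismatch
        if (k == m) starts.push_back(s);        // all m characters matched
        // on a mismatch, everything learned in this attempt is thrown away
    }
    return starts;
}
```

Each alignment may compare up to m characters, which is where the O(mn) bound comes from; a main string made of many `a` characters and a pattern like `aaa...ab` comes close to that worst case.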

Let's look at how KMP works.

In KMP, for each pattern string we precompute information about how the pattern matches against itself (this depends only on the pattern string, so it can be preprocessed; we come back to this later). When a match fails, this information lets us shift the pattern string as far as possible, which reduces the number of comparisons.

For example, after a failed match the naive approach shifts the pattern string right by one position and tries again. KMP computes the shift distance as follows: in the part of the pattern string that has already been matched, find the longest prefix that is also a suffix (shorter than the matched part itself), then shift the pattern so that this prefix overlaps the suffix.
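To make "longest identical prefix and suffix" concrete, here is a small brute-force sketch. The name `longest_border` is hypothetical, and the quadratic check is chosen for clarity only; KMP obtains the same values more efficiently during preprocessing. The prefix and suffix must be shorter than the whole string, otherwise the answer would trivially be the string itself.

```cpp
#include <string>

// Hypothetical helper: length of the longest proper prefix of s
// that is also a suffix of s (0 if there is none).
int longest_border(const std::string& s) {
    const int len = static_cast<int>(s.size());
    for (int k = len - 1; k >= 1; --k) {          // try the longest candidates first
        if (s.compare(0, k, s, len - k, k) == 0)  // prefix of length k == suffix of length k
            return k;
    }
    return 0;
}
```

For a matched part `ababa` the result is 3 (`aba`), so the pattern can be shifted right by 5 - 3 = 2 positions while keeping three characters matched.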

We use two pointers i and j to express that A[i-j+1 ... i] and B[1 ... j] are exactly equal. That is, i keeps increasing and j changes along with it, such that the string of length j ending at A[i] exactly matches the first j characters of B. What we now need to examine is the relationship between A[i+1] and B[j+1].

• When A[i+1] = B[j+1], we increase both i and j by 1.
• Otherwise, we reduce j to a smaller value for which A[i-j+1 ... i] and B[1 ... j] still match, and try again to match A[i+1] with B[j+1].

For example:

T： a b a b a b a a b a b a c b

W：a b a b a c b

When i = j = 5, we find T[6] != W[6], which shows that j cannot stay at 5. We want to change j to some j' < j such that the first j' letters of W[1 ... j] are the same as its last j' letters, because only then does changing j to j' (that is, shifting W right by j - j' positions) preserve the property of i and j. Obviously, the larger j' is, the better. Here W[1 ... 5] = ababa has been matched, and both its first three letters and its last three letters are aba, so the largest j' is 3. The situation then looks like this:

T： a b a b a b a a b a b a c b

W：      a b a b a c b

Now i = 5 and j = 3. We find that T[6] and W[4] are equal, and then that T[7] and W[5] are equal (two steps are taken here).

So now we have i = 7, j = 5:

T： a b a b a b a a b a b a c b

W：      a b a b a c b

Now T[8] != W[6] occurs, so we continue in the same way. We already worked out that j' = 3 when j = 5, so we can reuse that directly (this shows that j' has nothing to do with the main string and depends only on the pattern string).

So it becomes:

T： a b a b a b a a b a b a c b

W：            a b a b a c b

The new j = 3 still does not satisfy T[i+1] = W[j+1], so we need to take j' again.

For j = 3, the first letter and the last letter of aba are both a, so j' = 1.

The new situation:

T： a b a b a b a a b a b a c b

W：                  a b a b a c b

This still does not match, so j has to be reduced again, to j' = 0 (we define j' = 0 when j = 1).

T： a b a b a b a a b a b a c b

W：                     a b a b a c b

Finally, T[8] = W[1], so i becomes 8 and j becomes 1. The following characters all match one after another, until j = 7 is reached. We can then conclude that W is a substring of T, and we also know where it starts in the main string (i + 1 - m + 1 in the code below, because the loop variable i there starts from 0).

This part of the code is very short, because it is a single for loop:
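The sequence of values that j goes through after a mismatch can be reproduced mechanically. The sketch below is an illustration under assumptions: `fallback_chain` is a hypothetical name, and the table of j' values (called `nxt` here, as in the article's code) is built with the standard self-matching loop that the article develops further on.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: the successive values j, j', j'', ... tried for pattern w
// after a mismatch with j characters matched, ending at 0.
std::vector<int> fallback_chain(const std::string& w, int j) {
    const int m = static_cast<int>(w.size());
    assert(j >= 0 && j <= m);         // j is a matched length
    std::vector<int> nxt(m + 1, 0);   // nxt[len] = j' for a matched length len
    for (int i = 2, k = 0; i <= m; ++i) {
        // w is 0-based here, so the text's W[i] is w[i - 1]
        while (k > 0 && w[k] != w[i - 1]) k = nxt[k];
        if (w[k] == w[i - 1]) ++k;
        nxt[i] = k;
    }
    std::vector<int> chain;
    for (int cur = j; cur > 0; cur = nxt[cur]) chain.push_back(cur);
    chain.push_back(0);
    return chain;
}
```

For W = `ababacb` and j = 5 the chain is 5, 3, 1, 0, which is exactly the sequence of reductions performed at i = 7 in the walkthrough.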

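```cpp
inline void kmp()
{
    int j = 0;
    for (int i = 0; i < n; i++)
    {
        // fall back while the next characters differ and j can still shrink
        while (j > 0 && b[j + 1] != a[i + 1]) j = nxt[j];
        if (b[j + 1] == a[i + 1]) j++;
        if (j == m)
        {
            printf("%d\n", i + 1 - m + 1);
            j = nxt[j];
            // to output only the first position: break here instead
            // to output all positions: j = nxt[j];
            // to output only non-overlapping positions: j = 0;
        }
    }
}
```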
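The three comments at the end of the loop describe three reporting modes. The sketch below makes them explicit. It is an illustration with assumed names (`Mode`, `kmp_positions`); it stores the strings 0-based in `std::string` but still reports 1-based positions like the code above.

```cpp
#include <string>
#include <vector>

enum class Mode { First, All, NonOverlapping };  // hypothetical names for the three cases

// Sketch: 1-based start positions of w in t under the chosen reporting mode.
std::vector<int> kmp_positions(const std::string& t, const std::string& w, Mode mode) {
    const int n = static_cast<int>(t.size());
    const int m = static_cast<int>(w.size());
    std::vector<int> pos;
    if (m == 0 || m > n) return pos;
    std::vector<int> nxt(m + 1, 0);               // nxt[len] for matched lengths 1..m
    for (int i = 2, k = 0; i <= m; ++i) {
        while (k > 0 && w[k] != w[i - 1]) k = nxt[k];
        if (w[k] == w[i - 1]) ++k;
        nxt[i] = k;
    }
    for (int i = 0, j = 0; i < n; ++i) {
        while (j > 0 && w[j] != t[i]) j = nxt[j];
        if (w[j] == t[i]) ++j;
        if (j == m) {
            pos.push_back(i - m + 2);              // 1-based start of this occurrence
            if (mode == Mode::First) break;        // stop after the first hit
            j = (mode == Mode::All) ? nxt[j] : 0;  // j = 0 forbids overlapping hits
        }
    }
    return pos;
}
```

With t = `aaaa` and w = `aa`, `Mode::All` reports 1, 2, 3, while `Mode::NonOverlapping` reports only 1 and 3, because resetting j to 0 forces the next occurrence to start after the previous one ends.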

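The values j' are stored in an array nxt, where nxt[j] is the value j' for that j. Since j' depends only on the pattern string, nxt can be filled in from left to right by matching the pattern against itself:

W ：a b a b a c b

nxt：0 0 1 2 ？？

W ：a b a b a c b

nxt：0 0 1 2 3 ？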

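```cpp
inline void pre()
{
    nxt[1] = 0; // by definition, nxt[1] = 0
    int j = 0;
    for (int i = 1; i <= m - 1; i++)
    {
        // cannot extend the match and j has not dropped to 0 yet: fall back one step
        while (j > 0 && b[j + 1] != b[i + 1]) j = nxt[j];
        // if the next characters match, j++
        if (b[j + 1] == b[i + 1]) j++;
        nxt[i + 1] = j; // assign the value for the next position
    }
}
```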
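As a check on the tables above, the following sketch fills in the whole array for the example pattern. `build_nxt` is a hypothetical wrapper around the same loop as `pre()`, rewritten with `std::string` and a returned vector so that it can be called on its own; index 0 of the result is unused padding.

```cpp
#include <string>
#include <vector>

// Hypothetical wrapper: same recurrence as pre(), 1-based like the article.
std::vector<int> build_nxt(const std::string& w) {
    const int m = static_cast<int>(w.size());
    const std::string b = " " + w;   // pad so that b[1..m] is the pattern
    std::vector<int> nxt(m + 1, 0);  // nxt[0] is unused, nxt[1] = 0
    int j = 0;
    for (int i = 1; i <= m - 1; ++i) {
        while (j > 0 && b[j + 1] != b[i + 1]) j = nxt[j];  // fall back on mismatch
        if (b[j + 1] == b[i + 1]) ++j;                     // extend on match
        nxt[i + 1] = j;
    }
    return nxt;
}
```

For `ababacb` this gives nxt[1..7] = 0 0 1 2 3 0 0: the value 3 at position 5 is the j' used in the walkthrough, and the last two entries are 0 because no prefix of the pattern ends in c or in cb.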

```cpp
#include<iostream>
#include<cstdio>
#include<cstdlib>
#include<cstring>
#include<string>
#include<cmath>
#include<queue>
#include<algorithm>
#include<iomanip>
using namespace std;
#define rep(i,a,n) for(int i=a;i<=n;i++)
#define per(i,n,a) for(int i=n;i>=a;i--)
typedef long long ll;

// fast integer reader; unused in this solution (function name assumed)
inline ll read()
{
    ll ans=0;
    char last=' ',ch=getchar();
    while(ch<'0'||ch>'9') last=ch,ch=getchar();
    while(ch>='0'&&ch<='9') ans=ans*10+ch-'0',ch=getchar();
    if(last=='-') ans=-ans;
    return ans;
}

char a[1000005],b[1000005]; // main string and pattern string, both 1-indexed
int nxt[1000005],n,m;

// preprocess the pattern string: fill nxt[1..m]
inline void pre()
{
    nxt[1]=0;
    int j=0;
    rep(i,1,m-1)
    {
        while(j>0&&b[j+1]!=b[i+1]) j=nxt[j];
        if(b[j+1]==b[i+1]) j++;
        nxt[i+1]=j;
    }
}

// print every position where b occurs in a, then the nxt values
inline void kmp()
{
    int j=0;
    for(int i=0;i<n;i++)
    {
        while(j>0&&b[j+1]!=a[i+1]) j=nxt[j];
        if(b[j+1]==a[i+1]) j++;
        if(j==m)
        {
            printf("%d\n",i+1-m+1);
            j=nxt[j];
        }
    }
    rep(i,1,m) printf("%d ",nxt[i]);
}

int main()
{
    scanf("%s%s",a+1,b+1);
    n=strlen(a+1),m=strlen(b+1);
    pre();
    kmp();
    return 0;
}
```

Origin www.cnblogs.com/lcezych/p/11002026.html