KMP algorithm (study notes)

KMP algorithm summary (Nanchang Institute of Technology ACM training)

(These problems have been making my brain hurt lately)

  • What is the KMP algorithm?

(I'll quote someone else's description here, since it is very good)
The Knuth-Morris-Pratt string-searching algorithm (KMP algorithm for short, 0.0) finds the positions at which a word W occurs in a main text string S.
It avoids re-examining previously matched characters by exploiting one observation: when a mismatch occurs, the word itself contains enough information to determine where the next match could begin.

  • Why do we use the KMP algorithm?

Let's first look at an ordinary string matching problem.
Given a text string S (the mother string) and a pattern string P, both containing only uppercase and lowercase English letters and Arabic numerals.
The pattern P appears as a substring of S multiple times.
Find the starting index of every occurrence of P in S.

  • Naive algorithm (brute force)

(Borrowing y's template problem here, haha)

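#include<iostream>
#include<string>

using namespace std;

int main(){
    string a,b;   // a: mother string, b: pattern
    cin>>a>>b;
    // size() is unsigned, so a.size()-b.size()+1 wraps around when b is
    // longer than a; comparing i+b.size() with a.size() avoids that
    for(size_t i=0;i+b.size()<=a.size();i++){
        // take the window of a that starts at i and compare it with b
        string c=a.substr(i,b.size());
        if(c==b) cout<<i<<" ";
    }
}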

But when the strings are long, it will time out: in the worst case the brute force needs about |S| times |P| character comparisons.
So the KMP algorithm was born (haha, problems create algorithms)
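
To make the "it will time out" claim concrete, here is a rough sketch that counts the character comparisons done by a brute-force matcher. The function name naive_comparisons is made up for this illustration, and it compares character by character instead of calling substr, so treat it as an approximation of the program above rather than an exact measurement.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the original notes): counts how many
// character comparisons a brute-force matcher performs.
long long naive_comparisons(const std::string& text, const std::string& pattern) {
    long long count = 0;
    const long long n = static_cast<long long>(text.size());
    const long long m = static_cast<long long>(pattern.size());
    if (m > n) return 0;                       // no window fits
    for (long long i = 0; i + m <= n; i++) {   // every starting position
        for (long long j = 0; j < m; j++) {
            count++;
            if (text[i + j] != pattern[j]) break;   // stop at the first mismatch
        }
    }
    // Worst case: every one of the n-m+1 windows is compared in full.
    assert(count <= (n - m + 1) * m);
    return count;
}
```

For a mother string of ten 'a' characters and the pattern aaab, every one of the 7 windows is compared in full, giving 28 comparisons. With strings of length around 10^5 or 10^6 the same pattern of input quickly becomes too slow.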

  • KMP algorithm solution

Hold on, let me first introduce how the KMP algorithm works

  • Principle
    Mother string: abcdabc
    Pattern string: abc
    With the naive algorithm, we try every starting position in the mother string one by one.
    But this compares the same characters many times, which wastes effort and greatly increases the running time.
    For example (each * marks a shift of one position):

mother string: abcdabc
pattern:       abc
mother string: abcdabc
pattern:       *abc
mother string: abcdabc
pattern:       **abc
mother string: abcdabc
pattern:       ***abc
mother string: abcdabc
pattern:       ****abc

This matching is very repetitive, and the same characters get compared again and again (sad)

  • Implementation
    The KMP algorithm consists of two parts: the next array and KMP matching.

  • What is the next array?
    For every prefix of the pattern p, the next array stores the length of its longest proper prefix that is also a suffix.
    With 1-based indexing, ne[i] is the largest j < i such that
    p[1..j] = p[i-j+1..i]
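
As a small worked example of this definition, the sketch below computes the next array for a pattern and returns it as a vector. The name build_next and the std::string / std::vector wrapper are my own additions for illustration; the loop itself follows the usual 1-based formulation used in these notes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original notes): returns ne[0..m] for a
// pattern, using 1-based positions. ne[0] and ne[1] are always 0.
std::vector<int> build_next(const std::string& pattern) {
    const int m = static_cast<int>(pattern.size());
    const std::string p = " " + pattern;   // shift so that p[1] is the first character
    std::vector<int> ne(m + 1, 0);
    for (int i = 2, j = 0; i <= m; i++) {
        // j is the border length of p[1..i-1]; shrink it until it can be extended
        while (j && p[i] != p[j + 1]) j = ne[j];
        if (p[i] == p[j + 1]) j++;
        ne[i] = j;
        assert(j < i);   // the prefix is always a proper one
    }
    return ne;
}
```

For the pattern abcdabc this gives ne[1..7] = 0 0 0 0 1 2 3: the prefix abc of length 3 is also the suffix of the whole pattern. For ababa it gives 0 0 1 2 3.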

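  • KMP matching
    This matches the pattern against the mother string. On a mismatch, the next array tells us how far the pattern can be shifted, so characters of the mother string that already matched are never compared from scratch again.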

  • Template

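// Strings are 1-indexed: s[1..n] is the mother string, p[1..m] is the pattern
// Build the next array
for (int i = 2, j = 0; i <= m; i ++ )
{
    while (j && p[i] != p[j + 1]) j = ne[j];   // mismatch: fall back to a shorter prefix
    if (p[i] == p[j + 1]) j ++ ;
    ne[i] = j;
}
// KMP matching
for (int i = 1, j = 0; i <= n; i ++ )
{
    while (j && s[i] != p[j + 1]) j = ne[j];
    if (s[i] == p[j + 1]) j ++ ;
    if (j == m)
    {
        // Logic after a successful match (the occurrence is s[i - m + 1 .. i])
        j = ne[j];
    }
}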
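
The template above is only a fragment, so here is one possible way to wrap it into a complete function. The name kmp_find_all and the choice to return 0-based starting positions (the same indices the brute-force program prints) are assumptions made for this sketch, not part of the original template.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical wrapper (not from the original notes): returns the 0-based
// starting index of every occurrence of pattern in text.
std::vector<int> kmp_find_all(const std::string& text, const std::string& pattern) {
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pattern.size());
    std::vector<int> hits;
    if (m == 0 || m > n) return hits;
    const std::string s = " " + text, p = " " + pattern;   // 1-based copies

    // Build the next array of the pattern.
    std::vector<int> ne(m + 1, 0);
    for (int i = 2, j = 0; i <= m; i++) {
        while (j && p[i] != p[j + 1]) j = ne[j];
        if (p[i] == p[j + 1]) j++;
        ne[i] = j;
    }

    // Scan the text once; j is the number of pattern characters matched so far.
    for (int i = 1, j = 0; i <= n; i++) {
        assert(j < m);   // invariant: p[j + 1] is always in range here
        while (j && s[i] != p[j + 1]) j = ne[j];
        if (s[i] == p[j + 1]) j++;
        if (j == m) {
            hits.push_back(i - m);   // match ends at 1-based i, so it starts at 0-based i - m
            j = ne[j];               // keep going to find overlapping matches
        }
    }
    return hits;
}
```

For the mother string abcdabc and the pattern abc, this returns the positions 0 and 4. Overlapping occurrences are reported as well: searching aa in aaaa gives 0, 1 and 2.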

Does it look simple? If you understand it, you can try this problem

[Template] KMP string matching

With this, you have a little knowledge to get started ~~(because the KMP algorithm is far from that simple, haha)~~

Now let's raise the difficulty to deepen your understanding of the KMP algorithm

Once you are familiar with the next array, try to work through this problem

Done? It seems you have a solid framework now

Let's try a KMP problem from CF. This problem fully demonstrates the power of KMP

It exercises the next array and KMP matching more concretely

Here is my AC code (the joy of AC, haha)

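#include<iostream>
#include<cstdio>
#include<cstring>
#include<algorithm>
using namespace std;
const int N=1e6+10;
int q[]={1,2,3};   // order in which the three strings are tried
char p[4][N];      // the three strings, each stored from index 1
int ne[4][N];      // ne[id]: next array of p[id]
int k[4][4];       // k[i][j]: -1 if p[i] occurs inside p[j], otherwise the length of
                   // the longest prefix of p[i] that is a suffix of p[j]
int len[4];
int ans=0x7fffffff;
// build the next array of string number id (length m)
void getnext(int m,char p[],int id){
    for(int i=2,j=0;i<=m;i++){
        while(j&&p[i]!=p[j+1]) j=ne[id][j];
        if(p[i]==p[j+1]) j++;
        ne[id][i]=j;
    }
}
// match pattern p (string number id, length m) against text s (length n)
int kmp(int n,int m,char s[],char p[],int id){
    int j=0;
    for(int i=1;i<=n;i++){
        while(j&&s[i]!=p[j+1]) j=ne[id][j];
        if(s[i]==p[j+1]) j++;
        if(j==m) return -1;   // p occurs inside s
    }
    return j;   // overlap between the end of s and the start of p
}
// try the order p[l], p[j], p[i], merging the overlaps between neighbours
void solve(int i,int j,int l){
    if(k[i][j]>=0&&k[j][l]>=0) ans=min(ans,len[i]+len[j]+len[l]-k[i][j]-k[j][l]);
    else{
        if(k[i][j]<0&&k[j][l]<0) ans=min(ans,len[l]);
        else if(k[i][j]<0) ans=min(ans,len[j]+len[l]-k[j][l]);
        // here p[j] lies inside p[l]; if p[i] does too, p[l] alone is enough
        // (without this check the formula below would subtract -1)
        else if(k[i][l]<0) ans=min(ans,len[l]);
        else ans=min(ans,len[i]+len[l]-k[i][l]);
    }
}
int main(){
    scanf("%s%s%s",p[1]+1,p[2]+1,p[3]+1);
    for(int i=1;i<=3;i++) len[i]=strlen(p[i]+1);
    for(int i=1;i<=3;i++){
        getnext(len[i],p[i],i);
        for(int j=1;j<=3;j++){
            if(i==j) continue;
            k[i][j]=kmp(len[j],len[i],p[j],p[i],i);
        }
    }
    // try all 6 orders of the three strings
    do{
        solve(q[0],q[1],q[2]);
    }while(next_permutation(q,q+3));
    cout<<ans;
    return 0;
}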
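
The key trick in the kmp function above is that the value of j left over after scanning the whole text is itself useful: it is the length of the longest prefix of the pattern that is also a suffix of the text. The sketch below isolates that idea in a standalone function; the name overlap_or_contained and the std::string interface are made up for this illustration.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original notes): returns -1 if b occurs
// inside a, otherwise the length of the longest prefix of b that is also a
// suffix of a. An empty b is treated as contained.
int overlap_or_contained(const std::string& a, const std::string& b) {
    const int n = static_cast<int>(a.size());
    const int m = static_cast<int>(b.size());
    if (m == 0) return -1;
    const std::string s = " " + a, p = " " + b;   // 1-based copies

    std::vector<int> ne(m + 1, 0);
    for (int i = 2, j = 0; i <= m; i++) {
        while (j && p[i] != p[j + 1]) j = ne[j];
        if (p[i] == p[j + 1]) j++;
        ne[i] = j;
    }

    int j = 0;
    for (int i = 1; i <= n; i++) {
        while (j && s[i] != p[j + 1]) j = ne[j];
        if (s[i] == p[j + 1]) j++;
        if (j == m) return -1;   // b found inside a
    }
    assert(j >= 0 && j < m);     // a real overlap is shorter than b
    return j;
}
```

For example, overlap_or_contained("abcd", "cdef") is 2, so the two strings can be glued into abcdef with length 4 + 4 - 2 = 6, while overlap_or_contained("abcde", "bc") is -1 because bc already sits inside abcde.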

Happy learning! The road of algorithms is long, and I am very happy ~~(Orz, come and save me)~~
This is a newbie's little summary; experts are welcome to point out mistakes 0.0
Please leave a like, it is hard work for a kid like me /(ㄒoㄒ)/~~


Origin blog.csdn.net/m0_52361859/article/details/112647621