Luogu [P3808] [Template] AC automaton (simple version)

Problem link

(Template) AC automaton (simple version)

Problem background

This is a simple AC automaton template problem.
It is used to check an implementation's correctness and its constant factor.
To avoid overloading the OJ, there are only two sets of test data, which is enough to guarantee correctness; please do not submit maliciously.
Tip: the data contains repeated words, and a repeated word should be counted multiple times; please take note.

Problem description

Given \(n\) pattern strings and \(1\) text string, find how many of the pattern strings appear in the text string.
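
To make the task concrete, here is a minimal brute-force sketch that can serve as a checker on small inputs. The function name countPatternsBruteForce is made up for illustration and is not part of the solution in this post; it also assumes that patterns are non-empty.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical checker (not the author's method): counts how many pattern
// strings occur in the text. Duplicate patterns are counted separately.
// The worst case is roughly |text| times the total pattern length, so this is
// far too slow for the real limits and only useful for cross-checking.
int countPatternsBruteForce(const std::vector<std::string>& patterns,
                            const std::string& text) {
    int found = 0;
    for (const std::string& p : patterns) {
        assert(!p.empty());  // assumption: patterns are non-empty
        if (text.find(p) != std::string::npos) ++found;
    }
    return found;
}
```

For example, with the patterns "a", "a", "b" and the text "aa", the two copies of "a" each count once and "b" does not appear, so the result is 2.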

Input format

The first line contains \(n\), the number of pattern strings;
each of the next \(n\) lines contains one pattern string;
the following line contains the text string.

Output format

A single number, the answer.

Sample input

2
a
aa
aa

Sample output

2

Notes / Hints

\(subtask1\) [\(50pts\)]: \(\sum length(\text{pattern string}) \le 10^6,\ length(\text{text string}) \le 10^6,\ n = 1\);
\(subtask2\) [\(50pts\)]: \(\sum length(\text{pattern string}) \le 10^6,\ length(\text{text string}) \le 10^6\);

Solution

This is a standard AC automaton template problem, so let's go straight to the AC automaton itself.
First, you need to know the trie;
second, you need to know KMP;
combining the two gives the AC automaton.
We first build a trie (dictionary tree) from the input pattern strings,
then traverse the tree starting from the root and run KMP on it.
In KMP, for each position we need \(next\): the longest prefix of the string that is also a suffix ending at that position (the prefix must be shorter than the part of the string up to the current position).
In an AC automaton, however, this prefix is not necessarily an ancestor of the current node; it may be a node in another branch of the tree.
The original post illustrates this with a figure:

The \(next\) pointer of node \(4\) needs to point to node \(3\).
A neat trick solves this problem:
simply traverse the tree with breadth-first search.
The node that a node's \(next\) points to always has a smaller depth than the node itself, so by the time we process a node, the \(next\) value of the node its \(next\) points to has already been computed (a bit convoluted).
Then we compute the \(next\) values just as in the KMP algorithm.
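
The breadth-first computation described above can be sketched as follows, without the optimization discussed later. This is only an illustration under stated assumptions: the names Node, insertPattern and buildFail are made up here (the full code in this post uses a different layout), patterns are assumed to consist of lowercase letters, and node 0 is the root.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Illustrative layout, not the one used in the full code of this post.
struct Node {
    int child[26] = {};  // 0 means "no child"; node 0 is the root
    int fail = 0;        // the next pointer described in the text
};

// Insert one lowercase pattern into the trie and return its end node.
int insertPattern(std::vector<Node>& trie, const std::string& s) {
    int u = 0;
    for (char ch : s) {
        assert(ch >= 'a' && ch <= 'z');  // assumption: lowercase letters only
        int c = ch - 'a';
        if (!trie[u].child[c]) {
            int id = static_cast<int>(trie.size());
            trie.emplace_back();         // may reallocate, so index afterwards
            trie[u].child[c] = id;
        }
        u = trie[u].child[c];
    }
    return u;
}

// Compute next (fail) pointers level by level, climbing as in KMP.
void buildFail(std::vector<Node>& trie) {
    std::queue<int> q;
    for (int c = 0; c < 26; ++c)
        if (trie[0].child[c]) q.push(trie[0].child[c]);  // depth 1: next is the root
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int c = 0; c < 26; ++c) {
            int v = trie[u].child[c];
            if (!v) continue;
            int f = trie[u].fail;        // already final, since its depth is smaller
            while (f && !trie[f].child[c]) f = trie[f].fail;
            trie[v].fail = trie[f].child[c];  // 0 (the root) if no match exists
            q.push(v);
        }
    }
}
```

With the patterns "ab", "b", "abc" and "bc", the node for "ab" gets the node for "b" as its next pointer, and the node for "abc" gets the node for "bc", which is in a different branch of the tree.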


Next, consider a small optimization.
When computing \(next\) as in KMP, we often end up climbing all the way to the root.
We keep climbing along \(next\) exactly when the \(next\) node of the current node has no child for the letter we are looking for,
which means the corresponding child slot of that node is empty.
So if we store a value in that empty slot, namely the corresponding child of the node's own \(next\) (in effect, an edge from this slot to that child), then any later \(next\) computation that reaches this slot can use the stored value directly.
The original post illustrates this with a figure:

The \(next\) of node \(6\) points to node \(7\), so the \("d"\) child of node \(6\) becomes node \(8\).
Personally, I think this optimization is easiest to understand together with the code.
With it, each \(next\) lookup takes \(O(1)\) time. (This is only a constant-factor optimization; you can leave it out.)
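
As a rough sketch of this optimization in isolation, the version below fills every empty child slot with the corresponding child of the node's next pointer, so no climbing loop is needed. The struct name Automaton and its members are made up for illustration; lowercase input and node 0 as the root are assumptions.

```cpp
#include <array>
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Illustrative sketch; the full code in this post uses plain arrays instead.
struct Automaton {
    std::vector<std::array<int, 26>> go;  // go[u][c]: child, or borrowed edge after build()
    std::vector<int> fail;                // the next pointer of each node

    Automaton() : go(1), fail(1, 0) {}    // start with the root only (all slots 0)

    // Insert one lowercase pattern and return its end node.
    int insert(const std::string& s) {
        int u = 0;
        for (char ch : s) {
            assert(ch >= 'a' && ch <= 'z');  // assumption: lowercase letters only
            int c = ch - 'a';
            if (!go[u][c]) {
                int id = static_cast<int>(go.size());
                go.push_back(std::array<int, 26>{});
                fail.push_back(0);
                go[u][c] = id;
            }
            u = go[u][c];
        }
        return u;
    }

    // Breadth-first pass: real children get their next pointer in O(1),
    // empty slots borrow the transition of the node's next pointer.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c)
            if (go[0][c]) q.push(go[0][c]);
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int v = go[u][c];
                if (v) {
                    fail[v] = go[fail[u]][c];   // already filled in, no climbing
                    q.push(v);
                } else {
                    go[u][c] = go[fail[u]][c];  // borrowed edge
                }
            }
        }
    }
};
```

With the patterns "abc" and "bcd", the node for "abc" has no "d" child of its own; its next pointer is the node for "bc", so after build() its "d" slot leads straight to the node for "bcd". This mirrors the situation of nodes 6, 7 and 8 described above.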

The code:

#include<bits/stdc++.h>
using namespace std;
int n;
char c[1000009];
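struct aa{
    int s;      // number of pattern strings ending at this node (-1 once counted)
    int up;     // the next pointer of this node
    int to[30]; // children by letter; bfs() fills empty slots with borrowed edges
}p[1000009];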
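int len; // index of the last trie node created (node 0 is the root)
int ans;
// Insert the pattern currently stored in c into the trie.
void add(){
    int l=strlen(c);
    int u=0;
    for(int j=0;j<l;j++){
        if(p[u].to[c[j]-'a']) u=p[u].to[c[j]-'a'];
        else {p[u].to[c[j]-'a']=++len;u=len;}
    }
    p[u].s++; // repeated patterns are counted separately
}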
int q[1000009],l=1,r=0;
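// Compute the next pointers in breadth-first order.
void bfs(){
    // Children of the root keep up=0, i.e. their next pointer is the root.
    for(int j=0;j<='z'-'a';j++)
        if(p[0].to[j]) q[++r]=p[0].to[j];
    while(l<=r){
        int u=q[l++];
        for(int j=0;j<='z'-'a';j++){
            if(p[u].to[j]){
                // Real child: its next is the j-child of u's next.
                p[p[u].to[j]].up=p[p[u].up].to[j];
                q[++r]=p[u].to[j];
            }else p[u].to[j]=p[p[u].up].to[j]; // empty slot: borrow the edge from u's next
        }
    }
}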
int main(){
    scanf("%d",&n);
    for(int j=1;j<=n;j++){
        scanf("%s",c);
        add();
    }
    bfs();
    scanf("%s",c);
    int l=strlen(c);
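    int uu=0; // current state in the automaton
    for(int j=0;j<l;j++){
        uu=p[uu].to[c[j]-'a'];
        int k=uu;
        // Walk up the next pointers and collect every pattern ending here.
        // Marking a node with -1 ensures each node is counted only once.
        while(k && p[k].s!=-1){
            ans+=p[k].s;
            p[k].s=-1;
            k=p[k].up;
        }
    }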
    printf("%d",ans);
    return 0;
}

Origin www.cnblogs.com/linjiale/p/12219321.html