AC automaton (template)

 

 

What the AC automaton is used for:

  The AC (Aho-Corasick) automaton solves the multi-pattern matching problem. For example, given the words s1, s2, s3, s4, s5, s6 and a text string ss, the question is: how many of these words appear in ss?
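To make the problem statement concrete, here is a minimal brute-force sketch that searches for each word separately. The function name `countWordsNaive` is made up for this illustration and is not part of the template; it only shows the answer the AC automaton is expected to produce, not how the automaton works.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): counts how many of the
// given words occur at least once in text, searching for each word separately.
// The text is rescanned once per word, which is the repeated work the
// AC automaton avoids.
int countWordsNaive(const std::vector<std::string>& words, const std::string& text) {
    int count = 0;
    for (const std::string& w : words) {
        if (text.find(w) != std::string::npos) ++count;  // w occurs somewhere in text
    }
    return count;
}
```

For the words he, she, his, hers and the text ushers, three of the four words (she, he, hers) occur in the text.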

 

The AC automaton needs three parts to do this:

1. Build a trie (dictionary tree) from all the words

2. Construct the fail (mismatch) pointers

3. Run a search function over the text string

 

This post mainly covers parts 2 and 3.

 

First, building the trie

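#include <queue>
#include <string>
using namespace std;

int tree[400005][26],vis[400005],fail[400005];
int t,n,cnt,id,root,num=0;
string s,ss;

void insert()//build the trie: insert the global word s
{
    root=0;
    for(int i=0;s[i];i++)
    {
        id=s[i]-'a';//words are assumed to contain only 'a'..'z'
        if(tree[root][id]==0)//no child for this letter yet: create a node
            tree[root][id]=++num;
        root=tree[root][id];
    }
    vis[root]++;//mark the end of a word
}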
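The same insertion can also be written without global arrays. The following is a minimal sketch under the same assumption of lowercase input; the names `Trie`, `newNode` and `wordEnd` are chosen for this illustration and do not appear in the template above.

```cpp
#include <array>
#include <string>
#include <vector>

// A minimal trie over 'a'..'z'. Names are illustrative, not from the post.
struct Trie {
    std::vector<std::array<int, 26>> next;  // next[u][c] == 0 means "no child"
    std::vector<int> wordEnd;               // number of words ending at node u

    Trie() { newNode(); }                   // node 0 is the root

    int newNode() {
        next.push_back(std::array<int, 26>{});  // value-initialized: all zeros
        wordEnd.push_back(0);
        return static_cast<int>(next.size()) - 1;
    }

    void insert(const std::string& word) {
        int u = 0;
        for (char ch : word) {
            int c = ch - 'a';               // assumes lowercase letters only
            if (next[u][c] == 0) {
                int v = newNode();          // create the node first: push_back may reallocate
                next[u][c] = v;
            }
            u = next[u][c];
        }
        ++wordEnd[u];                       // same role as vis[root]++ above
    }
};
```

Inserting he, she, his and hers creates nine nodes besides the root, because he and hers share the prefix he, and his shares the h.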

 

 

Second, constructing the fail (mismatch) pointers

 

The role of a fail (mismatch) pointer is this: when the text string mismatches at the current node, it tells us which node to continue matching from.

The fail pointer of a node leads to the node whose string is the longest proper suffix of the current node's string (the path from the root to the current node) that also exists in the trie.

 

How to build the fail pointers:

 

Clearly, what we need is a fast way to find where every fail pointer points. We compute the fail pointers in BFS order, so that by the time we process a node, the fail pointer of its parent has already been computed. Let the current node be A, its parent B, and let B's fail pointer point to C, so the string represented by C is the longest suffix of B's string that exists in the trie. If C has a child D with the same character as A, then the string represented by D (C plus one character) is the longest suffix of the string represented by A (B plus one character), and A's fail points to D. What if C has no child with A's character? Simple: move on to C's fail and try again. Repeat until the longest suffix of A is found, or A's fail ends up pointing to the root. (If no suffix of A exists in the trie, matching simply restarts from the root.)
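The definition behind this paragraph can be checked by brute force on small inputs. The sketch below is only an illustration of what a fail pointer should represent; the helper name `failTargetBruteForce` is invented here, and the BFS construction is what the template actually uses.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative helper (not from the original post): returns the longest proper
// suffix of `path` that is also a prefix of some word, i.e. the string that the
// fail pointer of the node for `path` should represent.
// An empty result means the fail pointer goes to the root.
std::string failTargetBruteForce(const std::vector<std::string>& words,
                                 const std::string& path) {
    for (std::size_t start = 1; start < path.size(); ++start) {
        std::string suffix = path.substr(start);  // longest candidates come first
        for (const std::string& w : words) {
            // true when w starts with suffix, so suffix is a node in the trie
            if (w.compare(0, suffix.size(), suffix) == 0) return suffix;
        }
    }
    return "";
}
```

With the words he, she, his and hers, the node for she fails to the node for he, while the nodes for his and hers both fail to the node for s.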

 

 

Steps (node numbers follow the trie in the original illustration):

  1. To reduce special cases, set every child of the auxiliary root node 0 to point to the real root, node 1, and let the fail of node 1 point to node 0.
  2. Find node 0, the fail node of node 2's parent, and check whether node 0 has a child for node 2's letter. It does, so the fail of node 2 points to node 1.
  3. Find node 0, the fail node of node 3's parent, and check whether node 0 has a child b. It does, so the fail of node 3 points to node 1.
  4. Find node 1, the fail node of node 4's parent, and check whether node 1 has a child b. It does, so the fail of node 4 points to node 3.
  5. Same as above.
  6. Same as above.
  7. Same as above.
  8. Find node 5, the fail node of node 8's parent, and check whether node 5 has a child b. It does not, so move on to node 2, the fail node of node 5, and check whether node 2 has a child b. It does, so the fail of node 8 points to node 4.

 

Code:

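void build()//construct the fail (mismatch) pointers
{
    queue<int>p;
    for(int i=0;i<26;i++)
    {
        if(tree[0][i])//every child of the root (second level of the trie) fails to root node 0
        {
            fail[tree[0][i]]=0;
            p.push(tree[0][i]);
        }
    }

    while(!p.empty())//BFS, so a parent's fail is always known before its children's
    {
        root=p.front();
        p.pop();
        for(int i=0;i<26;i++)
        {
            if(tree[root][i]==0)//no child for this letter
                continue;
            p.push(tree[root][i]);
            int fa=fail[root];//fa starts at the fail node of the parent (root)
            while(fa&&tree[fa][i]==0)//fa is not node 0 and has no child for this letter
                fa=fail[fa];//keep following fail pointers

            fail[tree[root][i]]=tree[fa][i];//child of fa for this letter, or node 0 if there is none

        }
    }
}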
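To see the construction on a small input, here is a self-contained sketch of the same BFS using vectors instead of global arrays. The struct name `Automaton` and the choice to make `insert` return the end node are assumptions made for this example only.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>

// Illustrative struct (not from the original post): trie plus fail pointers.
struct Automaton {
    std::vector<std::array<int, 26>> next;  // next[u][c] == 0 means "no child"
    std::vector<int> fail;                  // fail[u] == 0 means "back to the root"

    Automaton() : next(1, std::array<int, 26>{}), fail(1, 0) {}  // node 0 is the root

    // Inserts a lowercase word and returns the node where it ends.
    int insert(const std::string& word) {
        int u = 0;
        for (char ch : word) {
            int c = ch - 'a';
            if (next[u][c] == 0) {
                int v = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});
                fail.push_back(0);
                next[u][c] = v;
            }
            u = next[u][c];
        }
        return u;
    }

    // Computes fail pointers level by level, as in build() above.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c)
            if (next[0][c]) q.push(next[0][c]);  // depth-1 nodes keep fail == 0
        while (!q.empty()) {
            int u = q.front();
            q.pop();
            for (int c = 0; c < 26; ++c) {
                int v = next[u][c];
                if (v == 0) continue;
                int f = fail[u];                          // start from the parent's fail
                while (f && next[f][c] == 0) f = fail[f]; // climb until a child c exists
                fail[v] = next[f][c];                     // 0 if even the root has no child c
                q.push(v);
            }
        }
    }
};
```

After inserting he, she, his and hers and calling `build()`, the fail pointer of the node for she is the node for he, and the nodes for his and hers both fail to the node for s.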

 

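Third, the search function

Loop over the text string once, add up the end-of-word marks encountered along the way, and record the final answer.

Note that fail pointers are not only used when a mismatch occurs: a word that is a suffix of the string matched so far would otherwise be missed. For example, with the words abc and b, reading the text ab reaches the node for ab without any mismatch, yet b has just appeared in the text and can only be reached through the fail pointer of ab.

To keep this from happening, at every node we reach we must also follow the fail pointers as if a mismatch had occurred, so that no substring is missed, like this: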

 

 

Code:

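int search(const string& ss)//search the text string
{
    root=0,cnt=0;
    for(int i=0;ss[i];i++)
    {
        id=ss[i]-'a';
        while(root&&tree[root][id]==0)//mismatch: follow fail pointers
            root=fail[root];

        root=tree[root][id];
        int temp=root;
        while(temp&&vis[temp]!=-1)//walk the whole fail chain, even through nodes that end no word
        {
            cnt=cnt+vis[temp];
            vis[temp]=-1;//mark as visited so that no word is counted twice
            temp=fail[temp];
        }
    }
    return cnt;
}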
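Putting the three parts together, the following is a self-contained sketch that counts how many of the given words occur in a text. The function name `countWordsInText` and the vector-based storage are assumptions made for this example; it assumes lowercase input and uses -1 as a "visited" mark so that each fail chain is walked at most once.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>

// Illustrative wrapper (not from the original post): trie, fail pointers, search.
int countWordsInText(const std::vector<std::string>& words, const std::string& text) {
    std::vector<std::array<int, 26>> next(1, std::array<int, 26>{});  // node 0 is the root
    std::vector<int> fail(1, 0), mark(1, 0);  // mark[u]: words ending at u, -1 once visited

    // Part 1: build the trie.
    for (const std::string& w : words) {
        int u = 0;
        for (char ch : w) {
            int c = ch - 'a';
            if (next[u][c] == 0) {
                int v = static_cast<int>(next.size());
                next.push_back(std::array<int, 26>{});
                fail.push_back(0);
                mark.push_back(0);
                next[u][c] = v;
            }
            u = next[u][c];
        }
        ++mark[u];
    }

    // Part 2: fail pointers in BFS order.
    std::queue<int> q;
    for (int c = 0; c < 26; ++c)
        if (next[0][c]) q.push(next[0][c]);
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        for (int c = 0; c < 26; ++c) {
            int v = next[u][c];
            if (v == 0) continue;
            int f = fail[u];
            while (f && next[f][c] == 0) f = fail[f];
            fail[v] = next[f][c];
            q.push(v);
        }
    }

    // Part 3: scan the text, following the fail chain at every position.
    int u = 0, count = 0;
    for (char ch : text) {
        int c = ch - 'a';
        while (u && next[u][c] == 0) u = fail[u];
        u = next[u][c];
        for (int t = u; t && mark[t] != -1; t = fail[t]) {
            count += mark[t];
            mark[t] = -1;  // everything further along this chain is now counted
        }
    }
    return count;
}
```

With the words abc and b and the text ab, the scan never mismatches, but the fail chain from the node for ab reaches the node for b, so the result is 1. Stopping the chain at an already visited node is safe because a visited node's whole chain was walked when it was first marked.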

 

Template problem: https://www.cnblogs.com/-citywall123/p/11300251.html

 


Origin www.cnblogs.com/-citywall123/p/11300232.html