Newbie Linyin's review: the AC automaton

AC automaton: the Aho-Corasick automaton. The algorithm originated at Bell Labs in 1975 and is a well-known multi-pattern matching algorithm.

Today newbie Linyin reviews the AC automaton.

Prerequisites

  1. The KMP algorithm
  2. The trie (if you don't know this one, go back to kindergarten)

Okay, I admit the AC automaton is easier to understand than KMP.

How to build an AC automaton

  1. Build a dictionary tree (trie); insertion is exactly the same as for an ordinary trie
  2. Construct the fail array on the trie (this is where the AC automaton connects to KMP)
  3. Query
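
Step 1 can be sketched as follows. This is a minimal sketch, not necessarily the author's code: the array names trie and tdword follow the snippets later in this post, while MAXN, tot and insertWord are names assumed here for illustration, and patterns are assumed to contain only lowercase letters a-z.

```cpp
#include <string>

const int MAXN = 100005;   // assumed upper bound on the number of trie nodes
int trie[MAXN][26];        // trie[u][c]: child of node u on letter c, 0 if absent
int tdword[MAXN];          // tdword[u]: number of patterns ending at node u
int tot = 0;               // index of the last allocated node; node 0 is the root

// Insert one pattern, exactly as in an ordinary trie.
void insertWord(const std::string& w) {
    int now = 0;
    for (char ch : w) {
        int c = ch - 'a';
        if (!trie[now][c]) trie[now][c] = ++tot;  // allocate a node on demand
        now = trie[now][c];
    }
    ++tdword[now];  // one more pattern ends here
}
```

Inserting the same pattern twice does not create new nodes; it only raises tdword at the final node.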

How to construct the fail array? / What is the fail array?

The fail array is a mismatch array, just like the next array in KMP. For a node X, fail[X] is the child of fail[parent of X] that carries the same character as X; if no such child exists, fail[X] points to the root.

In other words, fail[x] is either a node with the same character as x, or the root. The chain ending at fail[x] must also appear in the chain that x lies on: the string that fail[x] represents is a suffix of the string that x represents, and it ends in the same character as x.
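
To make the suffix property concrete, here is a brute-force sketch that works on strings instead of node indices. It uses the standard characterization (fail points to the longest proper suffix that is itself a path from the root), not the BFS used in this post; failString is a hypothetical helper name, and the empty string stands for the root.

```cpp
#include <set>
#include <string>
#include <vector>

// Returns the string spelled by fail[x], where x is the trie node spelled by `path`.
std::string failString(const std::vector<std::string>& patterns,
                       const std::string& path) {
    // Every trie node corresponds to exactly one non-empty prefix of some pattern.
    std::set<std::string> prefixes;
    for (const std::string& p : patterns)
        for (std::size_t len = 1; len <= p.size(); ++len)
            prefixes.insert(p.substr(0, len));

    // Try the proper suffixes of `path` from longest to shortest.
    for (std::size_t start = 1; start < path.size(); ++start) {
        std::string suf = path.substr(start);
        if (prefixes.count(suf)) return suf;
    }
    return "";  // no suffix is a trie node: fail points to the root
}
```

For the patterns he, she, say, shr, her, the node "she" fails to "he" and the node "sh" fails to "h", while "say" and "shr" fail to the root.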

As for how to compute fail, it is done with BFS.

Let's look at the figure:

  1. Red line: the root is dequeued; its two children h and s get fail set to the root, and h and s are enqueued.
  2. Blue line: h is dequeued. The fail of its child e is the child at position e of its parent's fail (the root); that child does not exist, so the fail is the root, and e is enqueued. Then s is dequeued. For its child a, the child at position a of its parent's fail does not exist, so the fail is the root, and a is enqueued. For its child h, the child at position h of its parent's fail does exist, so fail[trie[now][i]] = trie[fail[now]][i], and h is enqueued.
  3. Green line: e is dequeued. The fail of its child r is the child at position r of its parent's fail (the root), which does not exist, so it is the root; r is enqueued. a is dequeued; the fail of its child y is likewise the root, because that position does not exist; y is enqueued. h is dequeued; the fail of its child e is the child at position e of its parent's fail, which exists, so fail[trie[now][i]] = trie[fail[now]][i] and it links to the e in the left subtree; e is enqueued. The fail of its child r is the root; r is enqueued.

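With that, the fail array is complete. For children that do not exist, trie[now][i] = trie[fail[now]][i]: a missing child points to the corresponding position under its parent's fail. This works like a kind of state compression. If fail[x] has no child matching the needed character, we would otherwise move on to fail[fail[x]], because the character at fail[x] equals the character at x, and the character at fail[fail[x]] equals the character at fail[x]. Filling in the missing children stores the result of those jumps in advance.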

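// Builds fail[] by BFS; assumes global trie[][26] and fail[] with root = 0.
void getfail()
{
    queue<int>q;
    for(int i=0;i<26;i++)
    {
        if(trie[0][i])
        {
            // children of the root always fail to the root
            q.push(trie[0][i]);
            fail[trie[0][i]]=0;
        }
    }
    while(!q.empty())
    {
        int now=q.front();
        q.pop();
        for(int i=0;i<26;i++)
        {
            if(trie[now][i])
            {
                // real child: its fail is the i-th child of the parent's fail
                fail[trie[now][i]]=trie[fail[now]][i];
                q.push(trie[now][i]);
            }
            else
            {
                // missing child: borrow the transition from the parent's fail
                trie[now][i]=trie[fail[now]][i];
            }
        }
    }
}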
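
For contrast, here is a sketch of the variant that does not overwrite missing children, so the effect of trie[now][i]=trie[fail[now]][i] is easier to see. It is not the code used in this post: PlainAC, nxt, build and step are names assumed for illustration, and only lowercase letters are handled. Without the fill-in, every mismatch has to climb the fail chain in a while loop, both while building and while matching.

```cpp
#include <queue>
#include <string>
#include <vector>

struct PlainAC {                        // hypothetical name, illustration only
    std::vector<std::vector<int>> nxt;  // nxt[u][c]: real trie child, or 0
    std::vector<int> fail;
    PlainAC() : nxt(1, std::vector<int>(26, 0)), fail(1, 0) {}

    void insert(const std::string& w) {
        int now = 0;
        for (char ch : w) {
            int c = ch - 'a';
            if (!nxt[now][c]) {
                nxt[now][c] = (int)nxt.size();  // index of the node created next
                nxt.emplace_back(26, 0);
                fail.push_back(0);
            }
            now = nxt[now][c];
        }
    }

    void build() {
        std::queue<int> q;
        for (int c = 0; c < 26; ++c)
            if (nxt[0][c]) q.push(nxt[0][c]);   // depth-1 nodes keep fail = 0
        while (!q.empty()) {
            int now = q.front(); q.pop();
            for (int c = 0; c < 26; ++c) {
                int child = nxt[now][c];
                if (!child) continue;           // missing children stay missing
                int f = fail[now];
                while (f && !nxt[f][c]) f = fail[f];  // climb until a c-edge exists
                fail[child] = nxt[f][c];        // 0 if even the root has no c-edge
                q.push(child);
            }
        }
    }

    // One matching step: the climb happens again at query time.
    int step(int now, int c) const {
        while (now && !nxt[now][c]) now = fail[now];
        return nxt[now][c];
    }
};
```

With the fill-in used in getfail above, the body of step collapses to a single array lookup, which is why the query loop later in this post has no while loop for mismatches.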

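The accursed fail array is finally done.

Now let's talk about querying.

The AC automaton is used to find occurrences of the pattern strings in the text, so we introduce a variable tdword[x]: the number of pattern strings that end at node x.

The outer loop of the query walks the trie to find the node that corresponds to each position of the text. Recall the definition of fail: if a node currently matches the text, then the string formed by its fail node and the characters above it must also match the text. So every time we reach a node, we have to visit every node reachable from it through the fail array.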

 

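int query(string x)
{
    int ans=0;
    int now=0;
    for(int i=0;i<(int)x.size();i++)
    {
        // follow the (filled-in) transition for the next character of the text
        now=trie[now][x[i]-'a'];
        // walk the fail chain, stopping at the root or at a node already counted
        for(int j=now;j&&tdword[j]!=-1;j=fail[j])
        {
            ans+=tdword[j];
            tdword[j]=-1; // mark as visited so it is never counted again
        }
    }
    return ans;
}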

 

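My code targets the AC automaton template problem on Luogu, which asks how many different pattern strings occur in the text. The inner loop follows fail from the current node. If tdword[j]==-1, this node was already visited earlier, so everything further along its fail chain has been visited as well. That is why the loop stops as soon as it reaches tdword[j]==-1.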
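
Putting the three steps together, the whole procedure can be sketched as one self-contained function. This is a sketch under assumptions rather than the author's submission: countPatterns is a hypothetical wrapper name, the arrays are local vectors instead of globals, and both patterns and text are assumed to contain only lowercase letters a-z.

```cpp
#include <array>
#include <queue>
#include <string>
#include <vector>

// Counts how many of `patterns` occur in `text` (each pattern at most once).
int countPatterns(const std::vector<std::string>& patterns,
                  const std::string& text) {
    std::vector<std::array<int, 26>> trie(1);  // node 0 is the root, all zero
    std::vector<int> fail(1, 0), tdword(1, 0);

    // Step 1: ordinary trie insertion.
    for (const std::string& p : patterns) {
        int now = 0;
        for (char ch : p) {
            int c = ch - 'a';
            if (!trie[now][c]) {
                trie[now][c] = (int)trie.size();
                trie.push_back(std::array<int, 26>{});
                fail.push_back(0);
                tdword.push_back(0);
            }
            now = trie[now][c];
        }
        ++tdword[now];
    }

    // Step 2: BFS for fail, filling in missing children along the way.
    std::queue<int> q;
    for (int i = 0; i < 26; ++i)
        if (trie[0][i]) q.push(trie[0][i]);
    while (!q.empty()) {
        int now = q.front(); q.pop();
        for (int i = 0; i < 26; ++i) {
            if (trie[now][i]) {
                fail[trie[now][i]] = trie[fail[now]][i];
                q.push(trie[now][i]);
            } else {
                trie[now][i] = trie[fail[now]][i];
            }
        }
    }

    // Step 3: scan the text; -1 marks nodes whose fail chain is already counted.
    int ans = 0, now = 0;
    for (char ch : text) {
        now = trie[now][ch - 'a'];
        for (int j = now; j && tdword[j] != -1; j = fail[j]) {
            ans += tdword[j];
            tdword[j] = -1;
        }
    }
    return ans;
}
```

For example, with the patterns he, she, say, shr, her and the text "yasherhs", the function returns 3 (she, he and her occur). Because tdword stores a count, a pattern inserted twice contributes 2 when it occurs.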

OK, that's a wrap!

 

Origin www.cnblogs.com/XLINYIN/p/11355527.html