Efficient string handling: the AC automaton

AC automaton

I have spent the last two days studying the AC automaton (Aho-Corasick) algorithm and found it both flexible and efficient, so here is a summary of what I learned.

The AC automaton's most important job is, of course, to automatically help you get AC (Accepted) on multi-pattern string matching problems. It is a combination of KMP and a trie. More precisely, it applies KMP's idea of jumping on a mismatch to a trie!

1. Building the AC automaton

Building is basically a template. First build the trie, then run a BFS over the trie to construct the all-important fail pointer, i.e. the pointer to jump to on a mismatch. The fail pointer of a state points to the state for the longest proper suffix of the current state's string that also appears in the trie.

To speed things up, we usually also give every node "virtual" children. For each child that a node does not have, the missing child pointer is set to the corresponding child of the node's longest suffix (i.e. the node its fail pointer points to). This way, when a match fails or a pattern ends, we can jump straight on and continue matching against another pattern. For the details, read some code and work it through in your head (the author is lazy).
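The construction described above can be sketched roughly as follows. This is a minimal sketch, not the author's code: it assumes patterns contain only lowercase ASCII letters, and the names `build_automaton`, `goto`, `fail` and `end` are made up for illustration.

```python
from collections import deque

ALPHABET = 26  # assumption: lowercase ASCII letters only


def build_automaton(patterns):
    """Return (goto, fail, end) for the given patterns.

    goto[u][c] is the transition of node u on letter c (real or virtual),
    fail[u] is the fail pointer of u, end[i] is the node where patterns[i] ends.
    """
    goto = [[0] * ALPHABET]  # node 0 is the root; 0 also means "no child yet"
    fail = [0]
    end = []
    # Step 1: build the plain trie.
    for p in patterns:
        u = 0
        for ch in p:
            c = ord(ch) - ord('a')
            if goto[u][c] == 0:
                goto.append([0] * ALPHABET)
                fail.append(0)
                goto[u][c] = len(goto) - 1
            u = goto[u][c]
        end.append(u)
    # Step 2: BFS over the trie, filling fail pointers and virtual children.
    queue = deque(v for v in goto[0] if v)  # depth-1 nodes fail to the root
    while queue:
        u = queue.popleft()
        for c in range(ALPHABET):
            v = goto[u][c]
            if v:
                # Real child: its fail pointer is the c-child of fail[u].
                fail[v] = goto[fail[u]][c]
                queue.append(v)
            else:
                # Missing child: borrow the c-child of fail[u] (virtual child).
                goto[u][c] = goto[fail[u]][c]
    return goto, fail, end


def run(goto, text):
    """Feed text through the automaton and return the final state."""
    u = 0
    for ch in text:
        u = goto[u][ord(ch) - ord('a')]
    return u
```

Because of the virtual children, `run` never has to follow a fail pointer explicitly: each character of the text costs exactly one table lookup.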

2. Extensions of the AC automaton

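Normally, to keep matching after a mismatch, we follow the fail pointers one jump after another. We call this brute-force jumping!!

Brute-force jumping, as the name suggests, is time-consuming, and problems are usually designed so that it gets stuck on the time limit. When that happens, we usually turn to the fail tree, the tree formed by the fail edges. Why is it guaranteed to be a tree? Because every node other than the root obviously has exactly one parent (the node its fail pointer points to), and that parent is always strictly shallower in the trie, so the edges cannot form a cycle.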
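To make the two ideas concrete, here is a small sketch that works directly on a fail array. The array `FAIL` is written out by hand for the patterns "he", "she", "his", "hers", assuming nodes are numbered in insertion order; the function names are hypothetical.

```python
# Assumption: hand-written fail array for the trie of "he", "she", "his", "hers",
# with nodes 0=root, 1=h, 2=he, 3=s, 4=sh, 5=she, 6=hi, 7=his, 8=her, 9=hers.
FAIL = [0, 0, 0, 0, 1, 2, 0, 3, 0, 3]


def fail_chain(fail, u):
    """Brute-force jumping: list every non-root state reached from u via fail."""
    chain = []
    while u != 0:
        chain.append(u)
        u = fail[u]
    return chain


def fail_tree_children(fail):
    """Reverse the fail edges: children[p] lists the nodes whose fail pointer is p."""
    children = [[] for _ in fail]
    for v in range(1, len(fail)):  # the root has no parent
        children[fail[v]].append(v)
    return children
```

For example, `fail_chain(FAIL, 5)` walks from "she" to "he", and `fail_tree_children(FAIL)[3]` lists "his" and "hers" as children of "s". The chain can be as long as the depth of the node, which is why repeating the walk at every text position is slow.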

Once the fail tree is built, the subtree of each node contains exactly the states whose strings end with (i.e. recognize) the string of that node. So, to count how many times a pattern occurs in a given text, simply add 1 to the weight of every node the text passes through; the answer is then the weighted subtree size of the pattern's end node in the fail tree.
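The counting idea above might look like this in code. Again this is only a sketch under assumptions: lowercase ASCII input, a hypothetical function name `count_occurrences`, and subtree sums accumulated by walking the BFS order backwards (every node is deeper than its fail parent, so children are processed before parents) instead of an explicit tree traversal.

```python
from collections import deque

ALPHABET = 26  # assumption: lowercase ASCII letters only


def count_occurrences(patterns, text):
    """Return how many times each pattern occurs in text (overlaps included)."""
    goto, fail, end = [[0] * ALPHABET], [0], []
    for p in patterns:                        # build the trie
        u = 0
        for ch in p:
            c = ord(ch) - ord('a')
            if not goto[u][c]:
                goto.append([0] * ALPHABET)
                fail.append(0)
                goto[u][c] = len(goto) - 1
            u = goto[u][c]
        end.append(u)
    order = []                                # non-root nodes in BFS order
    queue = deque(v for v in goto[0] if v)
    while queue:                              # fail pointers + virtual children
        u = queue.popleft()
        order.append(u)
        for c in range(ALPHABET):
            v = goto[u][c]
            if v:
                fail[v] = goto[fail[u]][c]
                queue.append(v)
            else:
                goto[u][c] = goto[fail[u]][c]
    weight = [0] * len(goto)
    u = 0
    for ch in text:                           # +1 on every state the text visits
        u = goto[u][ord(ch) - ord('a')]
        weight[u] += 1
    for v in reversed(order):                 # push subtree sums up the fail tree
        weight[fail[v]] += weight[v]
    return [weight[e] for e in end]
```

After the last loop, `weight[v]` holds the sum over the whole fail-tree subtree of `v`, so each pattern's answer is read off at its end node without any fail jumping during the scan.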

How to compute the subtree size is worth studying. Many problems involve more than one pattern and more than one text, which means the weights on the tree are bound to change, and recomputing the subtree size by traversing the subtree each time is too slow. Instead, we use the DFS order to turn the subtree question into a range question, which a Fenwick tree can maintain. Specifically, record a dfn value and a low value for each node: dfn is the node's own position in DFS order and low is the last position inside its subtree, so the range from dfn to low is exactly the node's entire subtree. That is why a Fenwick tree works here.
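A rough sketch of the dfn/low plus Fenwick tree technique follows. It works on a fail array given as a parent array (hand-written here for "he", "she", "his", "hers", an assumption for illustration); the class and function names are hypothetical, and only `dfn` and `low` come from the text.

```python
class Fenwick:
    """Fenwick tree (binary indexed tree) with point add and prefix sum."""

    def __init__(self, n):
        self.n = n
        self.tree = [0] * (n + 1)

    def add(self, i, delta):      # i is 1-based
        while i <= self.n:
            self.tree[i] += delta
            i += i & -i

    def prefix(self, i):          # sum of positions 1..i
        s = 0
        while i > 0:
            s += self.tree[i]
            i -= i & -i
        return s


def dfs_order(fail):
    """Return (dfn, low) for the fail tree rooted at node 0 (iterative DFS)."""
    n = len(fail)
    children = [[] for _ in range(n)]
    for v in range(1, n):
        children[fail[v]].append(v)
    dfn, low = [0] * n, [0] * n
    timer = 0
    stack = [(0, False)]
    while stack:
        u, done = stack.pop()
        if done:
            low[u] = timer        # largest dfn used inside u's subtree
            continue
        timer += 1
        dfn[u] = timer
        stack.append((u, True))   # revisit u after its whole subtree
        for v in children[u]:
            stack.append((v, False))
    return dfn, low


# Assumption: nodes 0=root, 1=h, 2=he, 3=s, 4=sh, 5=she, 6=hi, 7=his, 8=her, 9=hers.
FAIL = [0, 0, 0, 0, 1, 2, 0, 3, 0, 3]
dfn, low = dfs_order(FAIL)
bit = Fenwick(len(FAIL))


def add_weight(u, delta):
    """Change the weight of node u by delta."""
    bit.add(dfn[u], delta)


def subtree_sum(u):
    """Total weight in the fail-tree subtree of u, i.e. positions dfn[u]..low[u]."""
    return bit.prefix(low[u]) - bit.prefix(dfn[u] - 1)
```

Both `add_weight` and `subtree_sum` cost O(log n), so weights can be added or removed as texts come and go while subtree queries stay cheap.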

Origin www.cnblogs.com/yzxx/p/11256934.html