Understand the KMP Algorithm in Seven Minutes

This article is one part of a trilogy introducing the BF algorithm, the KMP algorithm, and the BM algorithm.

The mathematical principles behind the KMP algorithm involve a lot of background knowledge, so this article covers only three things: how the algorithm runs, the partial match table, and the next array. Once you understand these three points, other articles on the KMP algorithm should be much easier to follow.

Please read the text below alongside the video animation.

Video Address: https://www.bilibili.com/video/av60334201/

Definition

The Knuth-Morris-Pratt string-searching algorithm, KMP algorithm for short, is commonly used to find the position at which a pattern string P occurs in a text string S.

The algorithm was jointly published in 1977 by Donald Knuth, Vaughan Pratt, and James H. Morris, so it is named after the three of them.

Does the name Donald Knuth sound familiar? Yes, he also appeared in the earlier article explaining the Knuth shuffle algorithm!

The operation flow of the KMP algorithm is as follows:

  • Suppose the text string S has been matched up to position i and the pattern string P up to position j;
  • If j = -1, or the current character matches (i.e., S[i] == P[j]), increment both i and j (i++, j++) and continue with the next character;
  • If j != -1 and the current character fails to match (i.e., S[i] != P[j]), leave i unchanged and set j = next[j]. This means that on a mismatch, the pattern string P moves j - next[j] positions to the right relative to the text string S;
  • In other words, on a mismatch the pattern string P is shifted so that its character at index next[j] lines up with the mismatch position, where next[j] is the next array value for the mismatched position j.
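The flow above can be sketched in Python roughly as follows. This is a minimal sketch, not a definitive implementation: the function name kmp_search and the parameter name next_table are illustrative, and the next array is assumed to be supplied by the caller. The text string in the usage lines is a made-up example, not one taken from this article.

```python
def kmp_search(text, pattern, next_table):
    """Return the index of the first occurrence of pattern in text, or -1.

    next_table is assumed to be the next array of pattern
    (first entry -1), supplied by the caller.
    """
    i = 0  # current position in the text string S
    j = 0  # current position in the pattern string P
    while i < len(text) and j < len(pattern):
        if j == -1 or text[i] == pattern[j]:
            # j == -1 or characters match: advance both positions
            i += 1
            j += 1
        else:
            # mismatch: keep i, fall back to next[j] in the pattern
            j = next_table[j]
    if j == len(pattern):
        return i - j  # the whole pattern matched, ending just before i
    return -1


# Hypothetical text string; the next array for "abaabcac" is hard-coded here.
print(kmp_search("acabaabaabcacaabc", "abaabcac", [-1, 0, 0, 1, 1, 2, 0, 1]))  # 5
```

Note that i never moves backwards, which is the main difference from a brute-force comparison.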

Don't follow? Just watch the animation!

Working process

The following uses a text string S and a pattern string P as an example.

First, list all the leading substrings of the pattern string P:

a              
a b            
a b a          
a b a a        
a b a a b      
a b a a b c    
a b a a b c a  
a b a a b c a c

Then, for each of these substrings, find all of its prefixes and suffixes.

A prefix is any leading part of a string that does not include its last character; a suffix is any trailing part of a string that does not include its first character.

Take the fifth substring, a b a a b, as an example.

Its prefixes are

a      
a b    
a b a  
a b a a

Its suffixes are

b      
a b    
a a b  
b a a b

The only element common to the prefixes and the suffixes is a b, so the maximum length of the common elements is 2.
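The prefix and suffix comparison above can be checked with a few lines of Python. This is only an illustrative sketch that enumerates everything by brute force; the function names proper_prefixes, proper_suffixes, and max_common_length are made up for this example.

```python
def proper_prefixes(s):
    """All leading parts of s that leave out the last character."""
    return [s[:k] for k in range(1, len(s))]


def proper_suffixes(s):
    """All trailing parts of s that leave out the first character."""
    return [s[k:] for k in range(1, len(s))]


def max_common_length(s):
    """Length of the longest string that is both a prefix and a suffix of s."""
    common = set(proper_prefixes(s)) & set(proper_suffixes(s))
    return max((len(c) for c in common), default=0)


print(proper_prefixes("abaab"))    # ['a', 'ab', 'aba', 'abaa']
print(proper_suffixes("abaab"))    # ['baab', 'aab', 'ab', 'b']
print(max_common_length("abaab"))  # 2
```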

Doing the same for every leading substring of the original pattern string P gives the maximum length of the common prefix and suffix elements for each one. Together these values form the maximum length table.

The next array is derived from the maximum length table: shift the whole row of "maximum length" values one position to the right, then assign -1 as the initial value.
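As a rough sketch of this step, the Python below builds the maximum length table by direct comparison and then shifts it right to obtain the next array. The function names max_length_table and next_array are illustrative. This brute-force version simply mirrors the description in this article; it is not the linear-time construction that KMP implementations normally use.

```python
def max_length_table(pattern):
    """For each leading substring of pattern, the maximum length of a
    string that is both a prefix and a suffix of that substring."""
    table = []
    for end in range(1, len(pattern) + 1):
        sub = pattern[:end]
        best = 0
        # k < len(sub), so the whole substring itself is never counted
        for k in range(1, len(sub)):
            if sub[:k] == sub[-k:]:
                best = k
        table.append(best)
    return table


def next_array(pattern):
    """Shift the maximum length table one position right and put -1 first."""
    table = max_length_table(pattern)
    if not table:
        return []
    return [-1] + table[:-1]


print(max_length_table("abaabcac"))  # [0, 0, 1, 1, 2, 0, 1, 0]
print(next_array("abaabcac"))        # [-1, 0, 0, 1, 1, 2, 0, 1]
```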

Once the next array has been obtained, the operation of the KMP algorithm becomes clear.

The characters of the pattern string P are compared one by one with the characters of the text string S; on a mismatch, the pattern string moves to the right.

How does it move?

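For example, suppose the b in the pattern string mismatches the c in the text string. Look up the next array value for the mismatched position in the pattern string, which is 0 here, then move the pattern string so that its index 0 lines up with the mismatch position.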

Origin www.cnblogs.com/fivestudy/p/11287725.html