The substring locate operation is commonly called string pattern matching, or simply string matching.
String pattern matching takes two strings, S and T. S is the main string, also called the text string; T is the substring, also called the pattern. The goal is to find a substring of the main string S that matches the pattern T. If the match succeeds, the result is the position in S of the first character of the matched substring.
Well-known pattern matching algorithms include the BF algorithm and the KMP algorithm; this post covers the BF algorithm.
Pattern matching does not have to start at the first position of the main string; a starting position pos for the search can be specified.
【Algorithm Steps】
① Use two counting pointers, i and j, to indicate the positions of the characters currently awaiting comparison in the main string S and the pattern T, respectively. The initial value of i is pos, and the initial value of j is 1.
② While neither string has been compared to its end, that is, while i ≤ S.length and j ≤ T.length, repeat the following:
- Compare S.ch[i] with T.ch[j]. If they are equal, advance both i and j to the next position and continue comparing the following characters.
- If they are not equal, back the pointers up and restart the match: compare again from the next starting character of the main string (i = i - j + 2) against the first character of the pattern (j = 1).
③ If j > T.length, every character of the pattern T equals the corresponding character of a contiguous substring of the main string S, so the match succeeds; return the position in S of the character that matches the first character of T (i - T.length). Otherwise the match fails; return 0.
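The two index expressions in these steps follow from tracking where the current pass began. As a short derivation, let k denote the position in S at which the current pass started (k is a symbol introduced here for illustration; it does not appear in the algorithm itself):

```latex
\begin{align*}
i &= k + (j - 1) && \text{after } j - 1 \text{ characters have matched in this pass} \\
k &= i - j + 1 \\
k + 1 &= i - j + 2 && \text{start of the next pass after a mismatch} \\
k &= i - T.\mathrm{length} && \text{on success, where } j = T.\mathrm{length} + 1
\end{align*}
```

So resetting i to i - j + 2 moves the start of the comparison forward by exactly one character, and i - T.length recovers the start of the matched substring.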
【Algorithm Description】
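int Index_BF(SString S, SString T, int pos)
{ // Returns the position at which pattern T first occurs in main string S, searching from the pos-th character; returns 0 if T does not occur
  // Precondition: T is non-empty and 1 ≤ pos ≤ S.length
    int i = pos, j = 1;                       // initialize
    while (i <= S.length && j <= T.length)    // neither string has been compared to its end
    {
        if (S.ch[i] == T.ch[j])
        { ++i; ++j; }                         // continue comparing the following characters
        else
        { i = i - j + 2; j = 1; }             // back the pointers up and restart the comparison
    }
    if (j > T.length) return i - T.length;    // match succeeded
    else return 0;                            // match failed
}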
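The routine above relies on an SString type that the post does not define. The following is a minimal, compilable sketch under the assumption that SString stores its characters in ch[1..length] with ch[0] unused, which is what the 1-based indexing suggests. The names MAXLEN, StrAssign, index_bf and bf_find are illustrative choices, not taken from the post.

```c
#include <assert.h>
#include <string.h>

#define MAXLEN 255  /* assumed capacity, chosen for this sketch */

/* Assumed layout: characters live in ch[1..length]; ch[0] is unused. */
typedef struct {
    char ch[MAXLEN + 1];
    int length;
} SString;

/* Hypothetical helper: copy a C string into the 1-based SString storage. */
void StrAssign(SString *s, const char *src)
{
    size_t n = strlen(src);
    if (n > MAXLEN) n = MAXLEN;      /* this sketch simply truncates */
    memcpy(s->ch + 1, src, n);
    s->length = (int)n;
}

/* Same logic as Index_BF; takes pointers to avoid copying the structs. */
int index_bf(const SString *S, const SString *T, int pos)
{
    assert(pos >= 1);                /* positions are 1-based */
    int i = pos, j = 1;
    while (i <= S->length && j <= T->length) {
        if (S->ch[i] == T->ch[j]) { ++i; ++j; }   /* keep comparing */
        else { i = i - j + 2; j = 1; }            /* back up and restart */
    }
    return (j > T->length) ? i - T->length : 0;
}

/* Hypothetical convenience wrapper for plain C strings. */
int bf_find(const char *s, const char *t, int pos)
{
    SString S, T;
    StrAssign(&S, s);
    StrAssign(&T, t);
    return index_bf(&S, &T, pos);
}
```

With these definitions, bf_find("aaaaaba", "ba", 1) returns 6, and bf_find("abab", "ab", 2) returns 3 because the search skips the occurrence at position 1.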
【Algorithm Analysis】
① Best case: O(n+m), where n and m are the lengths of S and T. In every unsuccessful pass, the mismatch occurs at the first character of the pattern, for example:
S = "aaaaaba"
T = "ba"
② Worst case: O(n×m). In every unsuccessful pass, the mismatch occurs at the last character of the pattern, for example:
S = "aaaaaab"
T = "aab"
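To make the difference between the two cases concrete, the sketch below counts character comparisons. It is an instrumented variant of the same loop, written against plain C strings for brevity; the name bf_count and the comparisons out-parameter are illustrative assumptions, not part of the original algorithm description.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical instrumented variant of Index_BF on plain C strings.
   Positions are 1-based as in the text, so character i is s[i - 1].
   Returns the match position (0 if none) and stores the number of
   character comparisons performed in *comparisons. */
int bf_count(const char *s, const char *t, int pos, long *comparisons)
{
    assert(pos >= 1);
    int n = (int)strlen(s), m = (int)strlen(t);
    int i = pos, j = 1;
    *comparisons = 0;
    while (i <= n && j <= m) {
        ++*comparisons;                           /* one comparison per iteration */
        if (s[i - 1] == t[j - 1]) { ++i; ++j; }   /* keep comparing */
        else { i = i - j + 2; j = 1; }            /* back up and restart */
    }
    return (j > m) ? i - m : 0;
}
```

For S = "aaaaaba" and T = "ba", the match is found at position 6 after 7 comparisons: five failed passes of one comparison each, plus two for the successful pass. For S = "aaaaaab" and T = "aab", the match is found at position 5 after 15 comparisons: four failed passes of three comparisons each, plus three for the successful pass.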
For the KMP algorithm, see the post "KMP algorithm - string pattern matching algorithm".
Reference: "Data Structure", Yan Weimin