3. Strings, Arrays, and Generalized Tables

(Content to be completed)

Key Points

String pattern matching is also known as the substring positioning operation, or string matching. In matching, the main string is called the target (string) and the substring is called the pattern (string).

BF method (Brute Force):

KMP method:


Both are string pattern matching methods. The BF method is a simple (naive) matching method. The KMP method uses the partial match results already obtained to slide the pattern as far as possible.

Simple pattern matching algorithm (BF algorithm)

Illustration (figures not reproduced): the first, second, fifth, and sixth rounds of comparison. The intermediate rounds follow the same principle and are omitted.

  • First round: compare the first character of the substring with the first character of the main string
    • If they are equal, continue by comparing the second character of the main string with the second character of the substring
    • If they are not equal, go on to the second round
  • Second round: compare the first character of the substring with the second character of the main string ......
  • N-th round: keep comparing in turn until the whole substring matches

Code:

(omitted)
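The rounds described above can be sketched in C as follows. This is only a minimal illustration, not the author's implementation; the function name bf_index and its return convention (index of the first match, or -1) are assumptions made for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns the index in the main string s where the
   pattern t first occurs, or -1 if it does not occur. */
int bf_index(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    int n = (int)strlen(s);   /* length of the main string */
    int m = (int)strlen(t);   /* length of the pattern */
    int i = 0;                /* current position in the main string */
    int j = 0;                /* current position in the pattern */

    while (i < n && j < m) {
        if (s[i] == t[j]) {
            ++i;              /* characters match: advance both */
            ++j;
        } else {
            i = i - j + 1;    /* mismatch: next round starts one past this round's start */
            j = 0;            /* the pattern restarts from its first character */
        }
    }
    return (j == m) ? i - m : -1;
}
```

For example, bf_index("abcabcabd", "abcabd") returns 3, because the first three rounds fail and the fourth round matches the whole pattern.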

 

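Advantage of the BF algorithm: the idea is simple and direct. Drawback: every time a character fails to match, the main string must go back to the position just after where the round started and the substring must go back to its first character, so the time overhead is large. The worst-case time complexity is O((n-m+1)*m), where n is the length of the main string and m is the length of the substring.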

 

KMP pattern matching algorithm

Illustration:

From the figure, we can see that when the characters already matched in S and T contain the same elements at the front and at the back, S does not need to go back to position S[1], and T does not need to go back to position T[0]. We can skip the elements already known to be equal and compare S[8] with T[3] directly.

So we build a next array that stores the fallback position for each position of the pattern.

 

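KMP algorithm idea: suppose that during pattern matching we are checking whether T[i] and W[j] match, where T is the main string and W is the pattern. If T[i] = W[j], continue by checking whether T[i+1] and W[j+1] match. If T[i] ≠ W[j], i does not move back; j falls back to the position stored in next[j] and the comparison resumes from there.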
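As a rough sketch of this idea, the search loop might look like the following in C. The function name kmp_index is hypothetical, and the next array is built with the convention next[0] = -1, which is an assumption chosen to match the next-array code in this post. It is an illustration, not a definitive implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: index of the first occurrence of pattern w in
   main string t, or -1 if there is none. */
int kmp_index(const char *t, const char *w) {
    assert(t != NULL && w != NULL);
    int n = (int)strlen(t);
    int m = (int)strlen(w);
    if (m == 0) return 0;                 /* empty pattern matches at 0 */

    int *next = malloc((size_t)m * sizeof *next);
    if (next == NULL) return -1;          /* sketch: allocation failure reported as "not found" */

    /* Build next, with next[0] = -1 (assumed convention). */
    int i = 0, j = -1;
    next[0] = -1;
    while (i < m - 1) {
        if (j == -1 || w[i] == w[j]) {
            ++i;
            ++j;
            next[i] = j;
        } else {
            j = next[j];
        }
    }

    /* Search: i never moves back; only j falls back through next. */
    i = 0;
    j = 0;
    while (i < n && j < m) {
        if (j == -1 || t[i] == w[j]) {
            ++i;                          /* match, or pattern restarted: advance both */
            ++j;
        } else {
            j = next[j];                  /* mismatch: slide the pattern */
        }
    }
    free(next);
    return (j == m) ? i - m : -1;
}
```

Because i only ever increases, the main string is scanned once, which is the saving over the BF method.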

Two ways to compute the next array

(1) First method: derive each value from the next value of the previous character

Initialization:

 

 

 

Code:

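#include <stdio.h>
#include <string.h>

int main(void)
{
    char t[] = "ababaabab";
    int len = (int)strlen(t);
    int next[len];              /* C99 variable-length array */
    int i = 0, j = -1;

    next[0] = -1;               /* no fallback position for the first character */
    while (i < len - 1) {
        if (j == -1 || t[i] == t[j]) {
            ++i;
            ++j;
            next[i] = j;        /* t[0..j-1] equals the j characters just before t[i] */
        } else {
            j = next[j];        /* try the next shorter candidate */
        }
    }

    for (i = 0; i < len; i++)
        printf("next[%d]->%d\n", i, next[i]);
    return 0;
}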

 

 

 

 

(2) Second method: derive it from the length of the longest common element
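A possible sketch of this second method in C follows. It assumes that "longest common element length" means, for each prefix of the pattern, the length of the longest proper prefix that is also a suffix, and that next is this table shifted right by one position with next[0] set to -1. The function names common_lengths and next_from_common are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: common[i] is the length of the longest proper
   prefix of t[0..i] that is also a suffix of t[0..i]. */
void common_lengths(const char *t, int len, int *common) {
    assert(t != NULL && common != NULL);
    if (len <= 0) return;
    int k = 0;                          /* current common length */
    common[0] = 0;
    for (int i = 1; i < len; i++) {
        while (k > 0 && t[i] != t[k])
            k = common[k - 1];          /* fall back to a shorter common part */
        if (t[i] == t[k])
            ++k;
        common[i] = k;
    }
}

/* Hypothetical helper: shift the table right by one, put -1 in front. */
void next_from_common(const int *common, int len, int *next) {
    assert(common != NULL && next != NULL);
    if (len <= 0) return;
    next[0] = -1;
    for (int i = 1; i < len; i++)
        next[i] = common[i - 1];
}
```

For the pattern "ababaabab" the common lengths are 0 0 1 2 3 1 2 3 4, so next becomes -1 0 0 1 2 3 1 2 3, the same values the first method produces.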

 

 

 

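Optimizing the next array (computing nextval)

When the substring contains several consecutive repeated elements, for example main string S = "aaabcde" and substring T = "aaaaax", the main string pointer stays still while the substring pointer moves back step by step to compare these values. Much of this work is wasted: the first 5 elements of the substring are all the same a, so these repeated steps can be skipped.

nextval is in fact an improvement on next.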
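One common way to compute nextval is sketched below in C. It uses the same next[0] = -1 convention as the next-array code in this post; the function name compute_nextval is hypothetical, and this is an illustration of the idea rather than the author's code. When the character at the fallback position equals the current character, falling back there would fail in the same way, so the fallback of the fallback is stored instead.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fills nextval[0..len-1] for pattern t. */
void compute_nextval(const char *t, int len, int *nextval) {
    assert(t != NULL && nextval != NULL);
    if (len <= 0) return;
    int i = 0, j = -1;
    nextval[0] = -1;
    while (i < len - 1) {
        if (j == -1 || t[i] == t[j]) {
            ++i;
            ++j;
            if (t[i] != t[j])
                nextval[i] = j;           /* characters differ: the plain next value is useful */
            else
                nextval[i] = nextval[j];  /* same character: skip the repeated comparison */
        } else {
            j = nextval[j];
        }
    }
}
```

For T = "aaaaax" this gives nextval = -1 -1 -1 -1 -1 4, so a mismatch on any of the five a characters moves the main string pointer forward at once instead of retrying each earlier a.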

Origin www.cnblogs.com/makelin/p/12146285.html