The KMP algorithm: explanation and template

For all templates from the data structures, algorithms, and applications course, see: https://blog.csdn.net/weixin_44077863/article/details/101691360

To explain the KMP algorithm, we first need to cover what it is for, why it has that name, and what the brute-force algorithm looks like.

KMP stands for Knuth, Morris, and Pratt, the three people who invented this algorithm, hence the name KMP. (Don't be surprised.)

It is used to solve the string pattern matching problem (in plain words, finding a substring).

For example, given a main string S and a pattern string T, you are asked how many substrings of S are equal to T, and the starting index of each of them.

For example, with S = aaaaa and T = aa, there are 4 such substrings, starting at indices 0, 1, 2, 3.

Another example:

Result: 1 matching substring, starting at index 5.

Next, the brute-force algorithm, which is what you would almost certainly write if you had never learned KMP.

It is called the BF (Brute Force) algorithm, and it is a common pattern matching algorithm.

It can also be called NaiveStrMatching, the naive pattern matching algorithm.

Brute force is really just a simple double loop: for each starting index in S, beginning at 0, compare T from its index 0 against S from that position.

The time complexity is O(nm).
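As a rough illustration of the double loop just described, here is a minimal sketch. The function name brute_force_match and the use of std::string are assumptions made for this example, not something taken from the post.

```cpp
#include <string>
#include <vector>

// Brute-force matcher: returns every index in s at which t starts.
// The name brute_force_match is illustrative only.
std::vector<int> brute_force_match(const std::string& s, const std::string& t) {
    std::vector<int> pos;
    int n = (int)s.size(), m = (int)t.size();
    if (m == 0 || m > n) return pos;            // assumption: report nothing in these cases
    for (int i = 0; i + m <= n; i++) {          // candidate start position in s
        int j = 0;
        while (j < m && s[i + j] == t[j]) j++;  // compare t from its index 0
        if (j == m) pos.push_back(i);           // all m characters matched
    }
    return pos;
}
```

Calling brute_force_match("aaaaa", "aa") gives the indices 0, 1, 2, 3 from the example above. In the worst case the inner loop runs up to m times for each of about n starting positions, which is where O(nm) comes from.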

Then people realized that shifting T forward by only one position each time is unnecessary, and that restarting the character-by-character comparison from index 0 of T is also completely unnecessary.

For example, take the S and T above again.

The result of the first round of comparison:

Looking at this, a person would feel that the following comparison is the more worthwhile one:

comparing directly at the position marked p in the figure.

The naive brute-force algorithm would additionally make the following comparison:

And when it shifts to position 2, T starts comparing from index 0 again.

At this point K, M, and P began to study what characterizes the matching that the human eye does.

In the first round of comparison, the matched prefix of T is: ABAB

This string has the proper prefixes: A, AB, ABA

and the proper suffixes: B, AB, BAB

They noticed that one entry appears in both lists: AB (shown in bold in the original).
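To make the observation concrete, the following sketch lists every string that is both a proper prefix and a proper suffix of a given string. The helper name common_borders is hypothetical and exists only for this illustration.

```cpp
#include <string>
#include <vector>

// Lists every string that is both a proper prefix and a proper suffix of a.
// common_borders is a hypothetical helper name used only for illustration.
std::vector<std::string> common_borders(const std::string& a) {
    std::vector<std::string> out;
    int len = (int)a.size();
    for (int k = 1; k < len; k++) {              // proper: strictly shorter than a
        std::string prefix = a.substr(0, k);     // first k characters
        std::string suffix = a.substr(len - k);  // last k characters
        if (prefix == suffix) out.push_back(prefix);
    }
    return out;
}
```

For ABAB this yields only AB, matching the lists above. KMP is interested in the longest such string, since that allows the largest safe shift.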

So what is the rule here? Let's look again at the figure for the first round of comparison.

Within T, T[0..1] = T[2..3]. Now look at the suffix at positions 2..3: S[2..3] = T[2..3] = T[0..1].

So T[0..1] can be shifted to line up with S[2..3], and S[4] and T[2] can be compared directly.

So the principle of the KMP algorithm can be summed up in one sentence:

When a comparison fails, look at the prefix A of T that has already been matched successfully, and take a proper suffix Y of A for which an equal proper prefix X of A exists,

with lenX (= lenY) as long as possible. Then shift X to the position of Y and compare the character at the failure position directly with the character that follows X.

Repeat the above until no further matching is possible.

In fact, the algorithm is far from complete at this point.

Finding the matching prefix and suffix on the spot every time is complicated and time-consuming.

So we precompute this value for every position and store it.

For example, in ABAB the number for the B at index 3 is 2, the length of AB.

We call this the characteristic number N.

For ABABA we have N = [0, 0, 1, 2, 3].

(Textbooks call it N. Writing it as next is usually more comfortable, but if you include the catch-all "universal" header, the name next becomes ambiguous and causes a compile error.)
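The definition of N can be written down directly as a slow, definition-based computation. This is only a sketch to pin down what N[i] means; the function name naive_N is an assumption, and this is not the efficient method the template uses.

```cpp
#include <string>
#include <vector>

// Definition-based (slow) computation of the characteristic numbers:
// N[i] = length of the longest proper prefix of t[0..i] that is also a suffix of t[0..i].
// naive_N is an illustrative name, not part of the original template.
std::vector<int> naive_N(const std::string& t) {
    int m = (int)t.size();
    std::vector<int> N(m, 0);
    for (int i = 0; i < m; i++) {
        for (int k = i; k >= 1; k--) {  // try the longest candidate length first
            // compare t[0..k-1] with t[i-k+1..i]
            if (t.compare(0, k, t, i - k + 1, k) == 0) { N[i] = k; break; }
        }
    }
    return N;
}
```

For ABAB this gives 0, 0, 1, 2, so the B at index 3 gets 2 as stated above. The nested loops plus the substring comparison make this cubic in the worst case, which is why a faster way of filling N is needed.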

So let's look at the first round of matching again.

When i = j = 4, the match fails, so set j = N[j-1] and then compare S[i] with T[j].

Of course, if the mismatch happens at j = 0 (the first character), there is no N[-1], so simply advance i by one.
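The fallback rule can be isolated into a single step, sketched below. The function name step and the pattern ABABC used in the usage note are assumptions chosen for illustration; the post's own figure is not reproduced here.

```cpp
#include <string>
#include <vector>

// One matching step: j characters of t are already matched, and c is the next
// character of the main string. Returns the new matched length.
// Assumes 0 <= j < t.size() and that N holds the characteristic numbers of t.
int step(const std::string& t, const std::vector<int>& N, int j, char c) {
    while (j > 0 && t[j] != c) j = N[j - 1];  // mismatch: fall back, i does not move
    if (t[j] == c) j++;                       // match: extend the matched prefix
    return j;                                 // 0 means the caller just advances i
}
```

With the assumed pattern ABABC (N = 0, 0, 1, 2, 0), a mismatch at j = 4 against the character A falls back to j = N[3] = 2, then T[2] = A matches and the step returns 3. At j = 0 the loop is skipped entirely, which corresponds to the "no N[-1]" case.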

Precomputing the N array would also be very slow if done by brute force.

Thinking of it in terms of dynamic programming, the preprocessing is itself essentially a KMP match (of T against itself).

For example, take the A at index 4: N[3] = 2, so compare T[4] (playing the role of the main string) directly with T[2] (playing the role of the pattern).

We find T[4] == T[2], so N[4] = N[3] + 1 = 3, and then we move on to index 5 (of course, the string above has no index 5, so it ends here).

Suppose instead that T[4] != T[2]. Then we look for a proper prefix inside the current proper prefix, that is, we look up N[N[3] - 1], i.e. N[1].

Repeat this until the characters are equal.

Of course, if we get all the way down to 0 and they are still not equal, then N[4] = 0.

The KMP algorithm is as follows (template; complexity O(n + m)):

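#include <cstring>
#include <vector>
using namespace std;
const int maxn=1000005,maxm=1000005; // assumed bounds; adjust to the problem's limits
char s[maxn],t[maxm]; // main string s, pattern string t
int n,m,N[maxm],cnt; // length n of s, length m of t, characteristic numbers N, number of successful matches cnt
vector<int> pos; // starting index (first character) of every successful match
void get_next(){
	m=strlen(t);
	for(int i=1;i<m;i++){ // N[0] stays 0 because the global array is zero-initialized
		int j=N[i-1];
		while(j&&t[i]!=t[j]) j=N[j-1]; // fall back to shorter and shorter prefixes
		if(t[i]==t[j]) N[i]=j+1;
		else N[i]=0;
	}
}
void KMP(){ // call get_next() first so that m and N are set
	n=strlen(s);
	int i=0,j=0;
	for(;i<n;i++){
		while(s[i]!=t[j]&&j) j=N[j-1]; // after a full match t[m]=='\0' mismatches, so j falls back to N[m-1]
		if(s[i]==t[j]) j++;
		if(j==m) cnt++,pos.push_back(i-j+1); // full match ending at index i
	}
}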
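For readers who prefer not to use global arrays, here is a self-contained variant of the same idea using std::string. It is a sketch only: the names build_N and kmp_search are assumptions, and the fallback after a full match is written out explicitly instead of relying on the terminating '\0' as the template above does.

```cpp
#include <string>
#include <vector>

// Characteristic numbers of t, using the same recurrence as get_next().
std::vector<int> build_N(const std::string& t) {
    std::vector<int> N(t.size(), 0);
    for (size_t i = 1; i < t.size(); i++) {
        int j = N[i - 1];
        while (j > 0 && t[i] != t[j]) j = N[j - 1];  // fall back on mismatch
        N[i] = (t[i] == t[j]) ? j + 1 : 0;
    }
    return N;
}

// Returns the starting index of every occurrence of t in s.
std::vector<int> kmp_search(const std::string& s, const std::string& t) {
    std::vector<int> pos;
    if (t.empty()) return pos;                 // assumption: empty pattern reports nothing
    std::vector<int> N = build_N(t);
    int m = (int)t.size(), j = 0;
    for (int i = 0; i < (int)s.size(); i++) {
        while (j > 0 && s[i] != t[j]) j = N[j - 1];
        if (s[i] == t[j]) j++;
        if (j == m) {
            pos.push_back(i - m + 1);          // match starts m-1 characters before i
            j = N[m - 1];                      // explicit fallback so overlaps are found
        }
    }
    return pos;
}
```

kmp_search("aaaaa", "aa") returns 0, 1, 2, 3, the same answer as the first example in this post. Because i never moves backward and j can only fall back as often as it has advanced, the total work stays within O(n + m).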

KMP algorithm template question: https://cn.vjudge.net/problem/HihoCoder-1015

You can submit there to test whether your own KMP is written correctly.


Origin blog.csdn.net/weixin_44077863/article/details/102172106