Summary
KMP is an improved string matching algorithm proposed by D. E. Knuth, J. H. Morris, and V. R. Pratt, hence the name Knuth-Morris-Pratt algorithm (KMP for short). Its core idea is to use the information gained from a failed match to reduce the number of comparisons between the pattern string and the main string, which makes matching fast. In practice this is done with a next() function (or next array) that records the partial-match information contained in the pattern string itself. The time complexity of KMP is O(m + n).
Brief introduction
String matching is one of the basic string operations. The most direct approach is a brute-force search that backtracks fully after every mismatch, but its complexity is high. As people looked for more efficient ways to match, the complexity was gradually reduced.
The most important step in this process was KMP, which reduces the complexity of the original algorithm to O(n + m), where n and m are the lengths of the two strings.
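Details
Brute-force matching with backtracking
Match from the left of the main string s. Here i is the index in s where the current attempt starts, sc is the index in s being compared, and j is the index in the pattern p. While s[sc] and p[j] are the same, advance both sc and j; once all of p has been traversed, return i. If the characters differ, backtrack: move i forward by one, reset sc to i, and reset j to 0.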
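public class Match_kmp {
    public static void main(String[] args) {
        System.out.println(indexOf("aaavbvdd", "vbv")); // prints 3
    }

    // Brute-force search: index of the first occurrence of p in s, or -1.
    private static int indexOf(String s, String p) {
        if (p.isEmpty()) {
            return 0; // avoids p.charAt(0) failing on an empty pattern
        }
        int i = 0;   // start of the current attempt in s
        int sc = i;  // position in s being compared
        int j = 0;   // position in p being compared
        while (sc < s.length()) {
            if (s.charAt(sc) == p.charAt(j)) {
                j++;
                sc++;
                if (j == p.length()) {
                    return i;
                }
            } else {
                // Mismatch: backtrack and restart one position to the right.
                i++;
                sc = i;
                j = 0;
            }
        }
        return -1;
    }
}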
The method returns i, the index of the first match.
- The worst-case complexity of this method is O(n * m). The KMP algorithm is described below.
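To make the O(n * m) bound more concrete, the following minimal sketch repeats the brute-force loop above and counts character comparisons instead of returning the match position. The class name BruteForceCost and the method countComparisons are hypothetical names introduced only for this illustration, and the second input is chosen so that almost every attempt fails on the last pattern character.

```java
class BruteForceCost {
    // Hypothetical helper: same backtracking loop as indexOf, but it returns
    // how many character comparisons were made before the search stopped.
    static int countComparisons(String s, String p) {
        int i = 0, sc = 0, j = 0, comparisons = 0;
        while (sc < s.length() && j < p.length()) {
            comparisons++;
            if (s.charAt(sc) == p.charAt(j)) {
                j++;
                sc++;
            } else {
                // Full backtrack: characters already read are compared again.
                i++;
                sc = i;
                j = 0;
            }
        }
        return comparisons;
    }

    public static void main(String[] args) {
        System.out.println(countComparisons("aaavbvdd", "vbv"));  // 6
        System.out.println(countComparisons("aaaaaaaa", "aaab")); // 23
    }
}
```

In the second call the main string has only 8 characters, yet 23 comparisons are made, because each failed attempt re-reads characters that were already compared. This repeated work is what KMP avoids.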
KMP matching
Diagram
① Match from left to right, starting with i = 0 and j = 0. If the characters are the same, j++ and i++; if they are not identical, j backtracks.
② On a mismatch, brute-force matching would move the start position forward by one (i++) and reset j = 0.
But we can make use of the part of the pattern string T that has already been matched, and backtrack less. This is what reduces the complexity. How far j moves back depends on the next array of the pattern string T. How to compute the next array is described below; for the pattern abcabd, next = {0,0,0,1,2,0}.
So when the mismatch comes after 5 matched characters, next[4] = 2: the pattern moves right by 5 - 2 = 3 positions, and j backtracks to 2.
③ Now j = 2. The characters are compared again; if they are not identical, backtrack: j = 2-next [j-
KMP
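public class MatchKMP {
    public static void main(String[] args) {
        String t = "bababa";
        int[] ne = getNext(t); // the next array must be built from the pattern being searched
        int res = kmp("ssdfgasdbababa", t, ne);
        System.out.println(res); // prints 8
    }

    // Returns the index of the first occurrence of t in s, or -1 if there is none.
    // Assumes t is non-empty and next was built by getNext(t).
    private static int kmp(String s, String t, int[] next) {
        for (int i = 0, j = 0; i < s.length(); i++) {
            // On a mismatch, fall back in the pattern; i never moves backwards.
            while (j > 0 && s.charAt(i) != t.charAt(j)) {
                j = next[j - 1];
            }
            if (s.charAt(i) == t.charAt(j)) {
                j++;
            }
            if (j == t.length()) {
                return i - j + 1; // start index of the match
            }
        }
        return -1; // 0 is a valid index, so it cannot signal "not found"
    }

    // next[i] = length of the longest proper prefix of t[0..i] that is also its suffix.
    private static int[] getNext(String t) {
        int[] next = new int[t.length()];
        next[0] = 0;
        for (int i = 1, j = 0; i < t.length(); i++) {
            while (j > 0 && t.charAt(j) != t.charAt(i)) {
                j = next[j - 1];
            }
            if (t.charAt(i) == t.charAt(j)) {
                j++;
            }
            next[i] = j;
        }
        return next;
    }
}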
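One way to gain confidence in a KMP implementation is to compare its results with the standard library's String.indexOf on a handful of inputs. The sketch below does that. It is only an illustration: the class name KmpCheck and the wrapper find are hypothetical, and it assumes a non-empty pattern and that -1 signals "no match", as String.indexOf does.

```java
class KmpCheck {
    // Prefix table: next[i] is the length of the longest proper prefix of
    // t[0..i] that is also a suffix of it.
    static int[] getNext(String t) {
        int[] next = new int[t.length()];
        for (int i = 1, j = 0; i < t.length(); i++) {
            while (j > 0 && t.charAt(j) != t.charAt(i)) {
                j = next[j - 1];
            }
            if (t.charAt(i) == t.charAt(j)) {
                j++;
            }
            next[i] = j;
        }
        return next;
    }

    // Hypothetical wrapper: builds the table from the same pattern it searches for.
    // Assumes t is non-empty; returns -1 when t does not occur in s.
    static int find(String s, String t) {
        int[] next = getNext(t);
        for (int i = 0, j = 0; i < s.length(); i++) {
            while (j > 0 && s.charAt(i) != t.charAt(j)) {
                j = next[j - 1];
            }
            if (s.charAt(i) == t.charAt(j)) {
                j++;
            }
            if (j == t.length()) {
                return i - j + 1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"ssdfgasdbababa", "bababa"},
            {"abcabcabd", "abcabd"},
            {"aaavbvdd", "vbv"},
            {"aaaa", "b"},
        };
        for (String[] c : cases) {
            int got = find(c[0], c[1]);
            boolean same = got == c[0].indexOf(c[1]);
            System.out.println(c[1] + " in " + c[0] + ": " + got
                    + (same ? " (agrees with indexOf)" : " (differs from indexOf)"));
        }
    }
}
```

Building the table inside find keeps the pattern and its next array together, so a table computed for one pattern cannot be passed along with a different one by mistake.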
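Computing the next array
- The prefixes of a string are all of its leading substrings that do not include the last character.
- The suffixes of a string are all of its trailing substrings that do not include the first character.
For each leading part of the pattern, the next value is the length of the longest string that is both a prefix and a suffix. Calculated for abcabd:
a: a single character has no prefix or suffix, so the value is 0
ab: prefix is [a], suffix is [b], nothing in common, so 0
abc: prefix is [a, ab], suffix is [bc, c], nothing in common, so 0
abca: prefix is [a, ab, abc], suffix is [bca, ca, a], the longest common one is a, so next[3] = 1
abcab: prefix is [a, ab, abc, abca], suffix is [bcab, cab, ab, b], the longest common one is ab, so next[4] = 2
abcabd: prefix is [a, ab, abc, abca, abcab], suffix is [bcabd, cabd, abd, bd, d], nothing in common, so next[5] = 0
This gives next = {0,0,0,1,2,0}.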
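private static int[] getNext(String t) {
    int[] next = new int[t.length()];
    next[0] = 0; // a single character has no proper prefix or suffix
    for (int i = 1, j = 0; i < t.length(); i++) {
        // j is the length of the prefix matched so far; on a mismatch,
        // fall back to the next shorter prefix that is also a suffix.
        while (j > 0 && t.charAt(j) != t.charAt(i)) {
            j = next[j - 1];
        }
        if (t.charAt(i) == t.charAt(j)) {
            j++;
        }
        next[i] = j;
    }
    return next;
}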
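The table can also be derived directly from the definition, by trying every candidate length and checking whether the prefix of that length equals the suffix of the same length. The sketch below does this. It is a slow reference version meant only for checking small examples by hand, and the names NextByDefinition and nextByDefinition are hypothetical.

```java
import java.util.Arrays;

class NextByDefinition {
    // For each i, find the largest k (smaller than the length of t[0..i]) such
    // that the first k characters of t[0..i] equal its last k characters.
    // Roughly cubic time; intended for checking small examples only.
    static int[] nextByDefinition(String t) {
        int[] next = new int[t.length()];
        for (int i = 0; i < t.length(); i++) {
            String sub = t.substring(0, i + 1);
            for (int k = sub.length() - 1; k > 0; k--) {
                // startsWith compares the k-character suffix with the prefix.
                if (sub.startsWith(sub.substring(sub.length() - k))) {
                    next[i] = k;
                    break;
                }
            }
        }
        return next;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(nextByDefinition("abcabd"))); // [0, 0, 0, 1, 2, 0]
    }
}
```

For abcabd this gives the same values that getNext computes, which is a quick way to check a table worked out on paper.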
Conclusion
I spent a lot of time learning the KMP algorithm, and for a long while reading about it did not make it click. Later, once I drew it out myself, I understood it immediately; some things you still have to work through by hand.
References
Baidu Encyclopedia: KMP algorithm
Ruan Yifeng's blog: The KMP string matching algorithm