I. The Sunday matching mechanism
The matching mechanism is easy to understand:
Target string haystack, of length len
Pattern string needle, of length len_pattern
Current query index idx (initially 0)
String to be matched search = haystack.substring(idx, idx + len_pattern), i.e., the substring of length len_pattern taken from the target string starting at index idx
On each pass, extract the string to be matched from the target string and compare it with the pattern string:
If they match, return the current idx
If they do not match, look at the character c immediately after the string to be matched:
If c is present in the pattern string needle, then idx = idx + offset table[c]
Otherwise, idx = idx + len(needle) + 1
Repeat until idx + len(needle) > len(haystack)
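II. The offset table
The offset table stores, for each character that appears in the pattern string, the distance from its rightmost occurrence in the pattern to the end of the pattern, plus 1. Equivalently, offset table[c] = len_pattern - (index of the rightmost occurrence of c in the pattern). A character that does not appear in the pattern shifts the window by len_pattern + 1.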
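As a minimal sketch of the table described above, the following builds it with a single left-to-right scan of the pattern. The class name SundayOffsetTable and the methods build and shift are hypothetical names chosen for illustration, and a HashMap is only one possible representation (a fixed-size int array indexed by character would also work).

```java
import java.util.HashMap;
import java.util.Map;

final class SundayOffsetTable {
    // For each character in the pattern:
    // shift = pattern length - index of its rightmost occurrence.
    static Map<Character, Integer> build(String needle) {
        Map<Character, Integer> table = new HashMap<>();
        int m = needle.length();
        // Scanning left to right means a later (more rightward)
        // occurrence overwrites an earlier one.
        for (int i = 0; i < m; i++) {
            table.put(needle.charAt(i), m - i);
        }
        return table;
    }

    // Shift for character c; a character absent from the pattern
    // moves the window by m + 1 (m is the pattern length).
    static int shift(Map<Character, Integer> table, char c, int m) {
        return table.getOrDefault(c, m + 1);
    }
}
```

For the pattern this, the sketch yields t = 4, h = 3, i = 2, s = 1, and any other character shifts by 5.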
III. Example
String: checkthisout
Pattern: this
Step 1:
idx = 0
string to be matched: chec
because chec != this
look at the character after chec: k
k is not in Pattern
so, per the offset table, idx = idx + 5 (that is, len(Pattern) + 1)
Step 2:
idx = 5
string to be matched: this
because this == this
the match succeeds, so return 5
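The two steps above can be reproduced with a small tracing sketch that records every value idx takes. SundayTrace and visitedIndexes are hypothetical names used only for this illustration. In place of a precomputed offset table, the sketch calls String.lastIndexOf to find the rightmost occurrence of the next character, which gives the same shifts.

```java
import java.util.ArrayList;
import java.util.List;

final class SundayTrace {
    // Returns every value idx takes during the search, in order.
    static List<Integer> visitedIndexes(String haystack, String needle) {
        List<Integer> visited = new ArrayList<>();
        int m = needle.length();
        int idx = 0;
        while (idx + m <= haystack.length()) {
            visited.add(idx);
            // The window matches the pattern: stop here.
            if (haystack.startsWith(needle, idx)) break;
            // Last window: there is no character after it.
            if (idx + m == haystack.length()) break;
            // Rightmost position of the next character in the pattern.
            int pos = needle.lastIndexOf(haystack.charAt(idx + m));
            // pos == -1 (character absent) gives a shift of m + 1.
            idx += m - pos;
        }
        return visited;
    }
}
```

Calling SundayTrace.visitedIndexes("checkthisout", "this") returns [0, 5], matching Step 1 and Step 2.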
IV. Algorithm analysis
Worst case: O(nm), where n = len(haystack) and m = len(needle)
Average case: O(n)
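To make the gap between the two bounds concrete, here is a rough sketch that counts character comparisons on two inputs. It is an illustration under stated assumptions, not a proof: SundayCost and countComparisons are hypothetical names, the window is compared left to right, and String.lastIndexOf stands in for the offset table.

```java
final class SundayCost {
    // Counts the character comparisons made while searching.
    static long countComparisons(String haystack, String needle) {
        int n = haystack.length();
        int m = needle.length();
        long comparisons = 0;
        int idx = 0;
        while (idx + m <= n) {
            // Compare the window with the pattern, left to right.
            int j = 0;
            while (j < m) {
                comparisons++;
                if (haystack.charAt(idx + j) != needle.charAt(j)) break;
                j++;
            }
            if (j == m) break;        // full match
            if (idx + m == n) break;  // last window, nothing follows it
            int pos = needle.lastIndexOf(haystack.charAt(idx + m));
            idx += m - pos;           // pos == -1 gives m + 1
        }
        return comparisons;
    }
}
```

With a haystack of twelve a characters, the pattern aaab costs 20 comparisons (five windows of four comparisons each, because the shift is only 2), while the pattern bbbb costs 2 (each window fails on its first character and the shift is 5).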
Code:
public static int sunday(String haystack, String needle) {
    // A few special cases, handled up front
    if (haystack.equals(needle) || needle.isEmpty()) return 0;
    if (needle.length() > haystack.length()) return -1;
    int idx = 0;
    int len_pattern = needle.length();
    int len = haystack.length();
    char[] pattern = needle.toCharArray();
    while (idx + len_pattern <= len) {
        // Compare the current window with the pattern
        String sub = haystack.substring(idx, idx + len_pattern);
        if (sub.equals(needle)) return idx;
        // This was the last window: no character follows it, so stop
        if (idx + len_pattern == len) return -1;
        // Look at the character right after the window
        int pos = getPos(haystack.charAt(idx + len_pattern), pattern);
        if (pos == -1) {
            // Not in the pattern: jump past that character
            idx += len_pattern + 1;
        } else {
            // Found: align its rightmost occurrence in the pattern with it
            idx += len_pattern - pos;
        }
    }
    return -1;
}
// Returns the index of the rightmost occurrence of ch in tmp, or -1 if absent
public static int getPos(char ch, char[] tmp) {
    for (int i = tmp.length - 1; i >= 0; i--) {
        if (tmp[i] == ch) return i;
    }
    return -1;
}