The Sunday algorithm: solving the string matching problem

I. The Sunday matching mechanism
The matching mechanism is easy to understand:

Target string haystack, of length len

Pattern string needle, of length len_pattern

Current search index idx (initially 0)

String to be matched: search = haystack.substring(idx, idx + len_pattern), i.e., the substring of length len_pattern starting at index idx.

On each iteration, the string to be matched is extracted from the target string and compared with the pattern string:

If they match, return the current idx.

If they do not match, look at the character c immediately after the string to be matched:

If c occurs in the pattern string needle, then idx = idx + offset_table[c]

Otherwise, idx = idx + len(needle) + 1

Repeat until idx + len(needle) > len(haystack)
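
The steps above can be sketched as a single loop. This is a minimal sketch, not the article's own listing: the class name SundaySearchSketch and the method name indexOf are hypothetical, and it assumes an offset table that maps each pattern character to len_pattern minus the index of its rightmost occurrence, with len_pattern + 1 used for characters absent from the pattern.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, used only for illustration.
class SundaySearchSketch {
    // Returns the index of the first occurrence of needle in haystack, or -1.
    static int indexOf(String haystack, String needle) {
        int lenPattern = needle.length();
        int len = haystack.length();
        if (lenPattern == 0) return 0;

        // Offset table: len_pattern minus the rightmost index of each pattern character.
        Map<Character, Integer> offset = new HashMap<>();
        for (int i = 0; i < lenPattern; i++) {
            offset.put(needle.charAt(i), lenPattern - i);
        }

        int idx = 0;
        while (idx + lenPattern <= len) {
            // Compare the current window with the pattern in place.
            if (haystack.regionMatches(idx, needle, 0, lenPattern)) return idx;
            // No character follows the last window, so the search is over.
            if (idx + lenPattern == len) break;
            char c = haystack.charAt(idx + lenPattern);
            // Characters absent from the pattern move the window past c.
            idx += offset.getOrDefault(c, lenPattern + 1);
        }
        return -1;
    }
}
```

Under these assumptions, SundaySearchSketch.indexOf("checkthisout", "this") returns 5. Building the table once, before the loop, keeps each shift lookup cheap instead of rescanning the pattern on every mismatch.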

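II. The offset table

The offset table stores, for each character that appears in the pattern string, the distance from its rightmost occurrence in the pattern to the end of the pattern, plus 1. In other words, offset_table[c] = len_pattern - (index of the rightmost occurrence of c). A character that does not appear in the pattern shifts the window by len_pattern + 1. For the pattern this, the table is t: 4, h: 3, i: 2, s: 1.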
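
As a rough illustration of how such a table could be built, the sketch below scans the pattern from left to right so that later occurrences overwrite earlier ones. The names SundayOffsetTable, build and shift are hypothetical, and a HashMap is only one possible representation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, used only for illustration.
class SundayOffsetTable {
    // Maps each pattern character to len_pattern - (index of its rightmost occurrence).
    static Map<Character, Integer> build(String needle) {
        Map<Character, Integer> table = new HashMap<>();
        int lenPattern = needle.length();
        // Left-to-right scan: a later (more rightward) occurrence overwrites an earlier one.
        for (int i = 0; i < lenPattern; i++) {
            table.put(needle.charAt(i), lenPattern - i);
        }
        return table;
    }

    // Shift for character c; characters absent from the pattern shift by len_pattern + 1.
    static int shift(Map<Character, Integer> table, char c, int lenPattern) {
        return table.getOrDefault(c, lenPattern + 1);
    }
}
```

For the pattern abcab this gives a: 2, b: 1, c: 3, because only the rightmost occurrence of a repeated character counts. If the input were known to be ASCII, an int array indexed by character code would be a reasonable alternative to the map.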

 

III. Example
String: checkthisout
Pattern: this

Step 1:

idx = 0
String to be matched: chec
chec != this
So look at the character after chec: k
k is not in the pattern
So, per the offset table, idx = idx + 5

Step 2:

idx = 5
String to be matched: this
this == this
Match, so return 5

IV. Algorithm analysis
Here n is the length of the target string and m is the length of the pattern.
Worst case: O(nm)
Average case: O(n)
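
To make the gap between the two bounds concrete, the following sketch counts character comparisons for a given input. It is an illustration only: the class SundayComparisonCount and the method countComparisons are hypothetical, the window is compared left to right, and the shift is computed with String.lastIndexOf rather than a precomputed table.

```java
// Hypothetical helper class, used only for illustration.
class SundayComparisonCount {
    // Runs a Sunday-style search and returns how many character comparisons it made.
    static long countComparisons(String haystack, String needle) {
        int m = needle.length();
        int n = haystack.length();
        long comparisons = 0;
        int idx = 0;
        while (idx + m <= n) {
            // Compare the window left to right, stopping at the first mismatch.
            int j = 0;
            while (j < m) {
                comparisons++;
                if (haystack.charAt(idx + j) != needle.charAt(j)) break;
                j++;
            }
            if (j == m) return comparisons;   // full match found
            if (idx + m == n) break;          // no character after the last window
            // Rightmost position of the character just past the window, or -1.
            int pos = needle.lastIndexOf(haystack.charAt(idx + m));
            idx += (pos == -1) ? m + 1 : m - pos;
        }
        return comparisons;
    }
}
```

With a target of sixteen a characters and the pattern aaba, every shift is 1 and each of the 13 windows costs 3 comparisons, for 39 in total, which is the (n - m + 1)(m - 1) pattern behind the O(nm) bound. With the pattern bbbb on the same target, every shift is 5 and only 3 comparisons are made.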

Code:

public static int sunday(String haystack, String needle) {
    // A few special cases, pruned up front
    if (haystack.equals(needle) || needle.isEmpty()) return 0;
    if (needle.length() > haystack.length()) return -1;
    int idx = 0;
    int len_pattern = needle.length();
    int len = haystack.length();
    // Convert the pattern once instead of on every iteration
    char[] pattern = needle.toCharArray();
    while (idx + len_pattern <= len) {
        // Compare the current window with the pattern
        String sub = haystack.substring(idx, idx + len_pattern);
        if (sub.equals(needle)) return idx;
        // The last window failed and no character follows it, so stop
        if (idx + len_pattern == len) return -1;
        // Rightmost position in the pattern of the character just past the window
        int pos = getPos(haystack.charAt(idx + len_pattern), pattern);
        if (pos == -1) {
            // Character not in the pattern: move the window past it
            idx += len_pattern + 1;
        } else {
            // Character found: align it with its rightmost occurrence in the pattern
            idx += len_pattern - pos;
        }
    }

    return -1;
}

// Returns the rightmost index of ch in tmp, or -1 if ch does not occur
public static int getPos(char ch, char[] tmp) {
    for (int i = tmp.length - 1; i >= 0; i--) {
        if (tmp[i] == ch) return i;
    }
    return -1;
}

 


Origin www.cnblogs.com/majw/p/12134390.html