String matching: the BF, RK, BM, and KMP algorithms

The length of the main string is m, and the length of the matching string (pattern string) is n.

One, BF (Brute Force)

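Brute-force comparison: try every starting position in the main string and compare the pattern string against it character by character. This is the easiest approach to think of; its worst-case time complexity is O(m*n).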
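The brute-force idea can be sketched in a few lines of Java. This is only a minimal illustration; the class name BruteForceMatch and the method name indexOf are made up for this example and are not from the original article.

```java
// Hypothetical helper class, for illustration only.
final class BruteForceMatch {
    // Returns the index of the first occurrence of pattern in text, or -1 if there is none.
    static int indexOf(String text, String pattern) {
        int m = text.length();
        int n = pattern.length();
        // Try every alignment of the pattern against the main string.
        for (int i = 0; i + n <= m; i++) {
            int j = 0;
            // Compare character by character until a mismatch or the end of the pattern.
            while (j < n && text.charAt(i + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == n) {
                return i; // all n characters matched at alignment i
            }
        }
        return -1;
    }
}
```

The outer loop runs at most m - n + 1 times and the inner loop at most n times, which is where the O(m*n) bound comes from.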

Two, RK (Rabin-Karp)

Calculate the hash value of the pattern string, then traverse the main string: at each position, compute the hash value of the substring of length n that starts there and compare it with the hash of the pattern string.

If the two hashes differ, the substring definitely does not match, and the comparison moves on to the next position in the main string. If they are the same, the substring still has to be compared character by character, because different strings can have the same hash (a collision).

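Time complexity: O(m) on average, provided each substring hash is derived from the previous one in constant time (a rolling hash) and collisions are rare. If many hashes collide, the character-by-character checks push the worst case to O(m*n).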
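A rough sketch of the RK idea with a polynomial rolling hash follows. The class name RabinKarpMatch, the radix BASE, and the modulus MOD are assumptions chosen for this example; the original article does not specify a hash function.

```java
// Hypothetical helper class, for illustration only.
final class RabinKarpMatch {
    private static final long BASE = 256;            // assumed radix
    private static final long MOD = 1_000_000_007L;  // assumed prime modulus

    // Returns the index of the first occurrence of pattern in text, or -1 if there is none.
    static int indexOf(String text, String pattern) {
        int m = text.length();
        int n = pattern.length();
        if (n == 0) return 0;
        if (n > m) return -1;

        // high = BASE^(n-1) mod MOD, the weight of the leftmost character in a window.
        long high = 1;
        for (int k = 1; k < n; k++) {
            high = (high * BASE) % MOD;
        }

        // Hash the pattern and the first window of the main string.
        long patternHash = 0;
        long windowHash = 0;
        for (int k = 0; k < n; k++) {
            patternHash = (patternHash * BASE + pattern.charAt(k)) % MOD;
            windowHash = (windowHash * BASE + text.charAt(k)) % MOD;
        }

        for (int i = 0; ; i++) {
            // Equal hashes may still be a collision, so verify character by character.
            if (patternHash == windowHash && text.regionMatches(i, pattern, 0, n)) {
                return i;
            }
            if (i + n >= m) {
                return -1; // no further window fits in the main string
            }
            // Roll the hash: drop text[i], then append text[i + n].
            windowHash = (windowHash - text.charAt(i) * high % MOD + MOD) % MOD;
            windowHash = (windowHash * BASE + text.charAt(i + n)) % MOD;
        }
    }
}
```

Updating the window hash takes constant time per position, so the scan is O(m) as long as the verification step is rarely triggered by a collision.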

Three, BM (Boyer-Moore)

①. Align the pattern string with the left end of the main string and compare from back to front. Suppose the comparison finds that the character T in the main string and the character G in the pattern string are not equal. Starting from that position in the pattern string, search toward the front for the character T.

②. Shift the pattern string so that its T lines up with the T in the main string, then compare from back to front again. Suppose the character A in the main string and the character G in the pattern string do not match. Starting from that position in the pattern string, search toward the front for the character A.

③. Shift the pattern string so that its A lines up with the A in the main string, then traverse from back to front again to check for a match. If every character matches, the match has been found.
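The steps above correspond to what is usually called the bad character rule. Below is a simplified sketch of that rule alone; a full Boyer-Moore implementation also uses a good suffix rule, which is omitted here. The class name BadCharacterMatch is hypothetical, and as a simplification the table stores the rightmost position of each character in the whole pattern rather than searching leftward from the mismatch position.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class, for illustration only (bad character rule, no good suffix rule).
final class BadCharacterMatch {
    // Returns the index of the first occurrence of pattern in text, or -1 if there is none.
    static int indexOf(String text, String pattern) {
        int m = text.length();
        int n = pattern.length();
        if (n == 0) return 0;

        // last.get(c) = rightmost index of character c in the pattern.
        Map<Character, Integer> last = new HashMap<>();
        for (int k = 0; k < n; k++) {
            last.put(pattern.charAt(k), k);
        }

        int i = 0; // current alignment of the pattern in the main string
        while (i + n <= m) {
            // Compare from back to front.
            int j = n - 1;
            while (j >= 0 && pattern.charAt(j) == text.charAt(i + j)) {
                j--;
            }
            if (j < 0) {
                return i; // every character matched
            }
            // Line up the bad character in the main string with its rightmost
            // occurrence in the pattern; always move forward by at least one position.
            char bad = text.charAt(i + j);
            int lastPos = last.getOrDefault(bad, -1);
            i += Math.max(1, j - lastPos);
        }
        return -1;
    }
}
```

When the bad character does not occur in the pattern at all, lastPos is -1 and the pattern jumps past it entirely, which is why this approach can skip large parts of the main string.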

Four, KMP (Knuth-Morris-Pratt)

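    public static int kmp(String str, String pattern) {
        // An empty pattern matches at index 0 (same convention as String.indexOf)
        if (pattern.isEmpty()) {
            return 0;
        }
        // Preprocessing: build the next array
        int[] next = getNexts(pattern);
        int j = 0; // number of pattern characters matched so far
        for (int i = 0; i < str.length(); i++) {
            while (j > 0 && pattern.charAt(j) != str.charAt(i)) {
                // On a mismatch, look up the next array to move the pattern's starting point
                j = next[j];
            }
            if (pattern.charAt(j) == str.charAt(i)) {
                j++;
            }
            if (j == pattern.length()) {
                // Full match: return the index in str where the match starts
                return i - pattern.length() + 1;
            }
        }
        return -1;
    }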

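    // Build the next array: for i >= 1, next[i] is the length of the longest proper
    // prefix of pattern[0..i-1] that is also its suffix; next[0] is a -1 sentinel
    private static int[] getNexts(String pattern) {
        int[] next = new int[pattern.length()];
        if (pattern.isEmpty()) {
            return next;
        }
        next[0] = -1;
        int j = 0;
        for (int i = 2; i < pattern.length(); i++) {
            while (j != 0 && pattern.charAt(i - 1) != pattern.charAt(j)) {
                // Backtrack: fall back from the current candidate length j to next[j]
                j = next[j];
            }
            if (pattern.charAt(i - 1) == pattern.charAt(j)) {
                j++;
            }
            next[i] = j;
        }
        return next;
    }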

Time complexity: O(m+n), that is, O(n) to build the next array plus O(m) to scan the main string. Space complexity: O(n) for the next array.

 

Origin blog.csdn.net/weichi7549/article/details/108549755