A Python implementation of the KMP string search algorithm

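First, build the next array. Here next[j] is the length of the longest proper prefix of p[:j] that is also a suffix of p[:j], with next[0] = -1. It is computed iteratively from a recurrence (on a mismatch, k falls back to next[k]) rather than by actual recursion. The time complexity is O(m), where m is the length of the pattern string.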

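def GetNext(p):
    # Note: the local name `next` shadows Python's built-in next() inside this function.
    n = len(p)
    next = [-1] * n
    k = -1  # end of the candidate prefix (-1 means "no prefix left to try")
    j = 0   # end of the suffix being extended

    while j < n - 1:
        if k == -1 or p[k] == p[j]:
            # Prefix and suffix both grow by one character.
            k += 1
            j += 1
            next[j] = k
        else:
            # Mismatch: fall back to the next shorter matching prefix.
            k = next[k]
    return next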
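
As a sanity check on what the next array means, the sketch below recomputes it straight from the definition (longest proper prefix of p[:j] that is also its suffix) and compares the result with the linear-time loop. The names brute_force_next and get_next are hypothetical helpers introduced only for this illustration; get_next is a snake_case copy of the loop above.

```python
def get_next(p):
    """Linear-time next table (copy of the loop above, for self-containment)."""
    table = [-1] * len(p)
    k, j = -1, 0
    while j < len(p) - 1:
        if k == -1 or p[k] == p[j]:
            k += 1
            j += 1
            table[j] = k
        else:
            k = table[k]
    return table


def brute_force_next(p):
    """Cubic-time reference: apply the definition directly for every j."""
    table = [-1] * len(p)
    for j in range(1, len(p)):
        prefix = p[:j]
        # Try the longest candidate first; k < j keeps the prefix proper.
        for k in range(j - 1, -1, -1):
            if prefix[:k] == prefix[j - k:]:
                table[j] = k
                break
    return table


for pattern in ["ABCDABC", "AAAA", "ABAB", "A", ""]:
    assert get_next(pattern) == brute_force_next(pattern), pattern

print(brute_force_next("ABCDABC"))  # [-1, 0, 0, 0, 0, 1, 2]
```

The brute-force version is only useful for testing; it makes the definition explicit at the cost of the O(m) running time.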

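An improvement on the next array: the nextval array. If p[j] == p[next[j]], falling back to next[j] after a mismatch at p[j] compares the same character again and is bound to fail, so nextval skips that step.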

def GetNextval(p):
    n = len(p)
    next = [-1] * n

    k = -1
    j = 0

    while j < n - 1:
        if k == -1 or p[j] == p[k]:
            k += 1
            j += 1
            # p[k] is the prefix character, p[j] is the suffix character
            if p[j] != p[k]:
                next[j] = k
            else:
                # p[j] == p[next[j]] must not occur, so when it would,
                # fall back one more step and store next[k] instead of k.
                next[j] = next[k]
        else:
            k = next[k]
    return next
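
To see what the optimization changes, the sketch below derives nextval from an already computed next table in a second pass. This is a two-pass restatement of the same rule, not the single-pass function above; get_next and nextval_from_next are hypothetical names used only here.

```python
def get_next(p):
    """Plain next table, as in the first function of this post."""
    table = [-1] * len(p)
    k, j = -1, 0
    while j < len(p) - 1:
        if k == -1 or p[k] == p[j]:
            k += 1
            j += 1
            table[j] = k
        else:
            k = table[k]
    return table


def nextval_from_next(p, table):
    """Apply the nextval rule to a finished next table, left to right."""
    nextval = list(table)
    for j in range(1, len(p)):
        k = table[j]  # k >= 0 whenever j >= 1
        if p[j] == p[k]:
            # Retrying p[k] would fail for the same reason p[j] did,
            # so reuse the already optimized fallback of k.
            nextval[j] = nextval[k]
    return nextval


for pattern in ["ABAB", "AAAAB", "ABCDABC"]:
    table = get_next(pattern)
    print(pattern, table, nextval_from_next(pattern, table))
# ABAB [-1, 0, 0, 1] [-1, 0, -1, 0]
# AAAAB [-1, 0, 1, 2, 3] [-1, -1, -1, -1, 3]
# ABCDABC [-1, 0, 0, 0, 0, 1, 2] [-1, 0, 0, 0, -1, 0, 0]
```

For AAAAB, a mismatch at any of the first four characters jumps straight to -1 instead of stepping back one A at a time.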

The KMP search itself. Given the next array, it runs in O(n), where n is the length of the text string.

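def KmpSearch(s, p):
    next = GetNext(p)
    sLength = len(s)
    pLength = len(p)
    i = 0  # position in the text s
    j = 0  # position in the pattern p

    while i < sLength and j < pLength:
        if j == -1 or s[i] == p[j]:
            i += 1
            j += 1
        else:
            # Mismatch: i never moves backwards, only the pattern shifts.
            j = next[j]
    if j == pLength:
        return i - j      # index in the text where the match starts
    else:
        return -1         # no matching substring found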
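
One way to gain confidence in the search is to compare it with Python's built-in str.find, which also returns the index of the first occurrence or -1. The sketch below does this on random strings over a two-letter alphabet, where repeated prefixes are common. get_next and kmp_search are hypothetical snake_case copies of the functions above, and the seed and size limits are arbitrary choices.

```python
import random


def get_next(p):
    table = [-1] * len(p)
    k, j = -1, 0
    while j < len(p) - 1:
        if k == -1 or p[k] == p[j]:
            k += 1
            j += 1
            table[j] = k
        else:
            k = table[k]
    return table


def kmp_search(s, p):
    """Index of the first occurrence of p in s, or -1 if there is none."""
    table = get_next(p)
    i = j = 0
    while i < len(s) and j < len(p):
        if j == -1 or s[i] == p[j]:
            i += 1
            j += 1
        else:
            j = table[j]
    return i - j if j == len(p) else -1


rng = random.Random(0)  # fixed seed so the run is repeatable
for _ in range(2000):
    s = "".join(rng.choice("AB") for _ in range(rng.randint(0, 12)))
    p = "".join(rng.choice("AB") for _ in range(rng.randint(0, 4)))
    assert kmp_search(s, p) == s.find(p), (s, p)

print("all random cases agree with str.find")
```

The random cases include empty texts and empty patterns; for an empty pattern both functions return 0.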

Test

s = 'BBC ABCDAB ABCDABCDABDE'
p = 'ABCDABC'
print("next =", GetNext(p))
print("result =", KmpSearch(s, p))

Output
next = [-1, 0, 0, 0, 0, 1, 2]
result = 11  # index in s where the match starts

The whole method runs in O(m+n): O(m) to build the next array plus O(n) to scan the text.
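
The linear bound can be observed by counting character comparisons. The sketch below instruments the search loop and checks that it never makes more than 2n comparisons, while a naive scan on the same input makes far more. kmp_comparisons and naive_comparisons are hypothetical helpers, and the input (a long run of A with a pattern ending in B) is just one convenient bad case for the naive scan.

```python
def get_next(p):
    table = [-1] * len(p)
    k, j = -1, 0
    while j < len(p) - 1:
        if k == -1 or p[k] == p[j]:
            k += 1
            j += 1
            table[j] = k
        else:
            k = table[k]
    return table


def kmp_comparisons(s, p):
    """Count the s[i] == p[j] comparisons made by the KMP search loop."""
    table = get_next(p)
    i = j = count = 0
    while i < len(s) and j < len(p):
        if j != -1:
            count += 1  # a real character comparison happens below
        if j == -1 or s[i] == p[j]:
            i += 1
            j += 1
        else:
            j = table[j]
    return count


def naive_comparisons(s, p):
    """Count comparisons made by restarting the match at every offset."""
    count = 0
    for start in range(len(s) - len(p) + 1):
        for offset in range(len(p)):
            count += 1
            if s[start + offset] != p[offset]:
                break
        else:
            return count  # inner loop finished without a mismatch: full match
    return count


s = "A" * 1000
p = "A" * 9 + "B"
kmp_count = kmp_comparisons(s, p)
naive_count = naive_comparisons(s, p)
assert kmp_count <= 2 * len(s)
print("KMP:", kmp_count, "naive:", naive_count)
```

The 2n bound follows because every loop step either advances i or lowers j, and j can only be lowered as often as it was raised.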

Reference: https://blog.csdn.net/v_july_v/article/details/7041827

Reposted from blog.csdn.net/lisa_ren_123/article/details/80827754