Algorithms: String Matching with the Rabin-Karp Algorithm

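Source: http://blog.csdn.net/chenhanzhun/article/details/39895077
Preface
The Rabin-Karp string-matching algorithm resembles the naive approach introduced earlier in "Naive String-Matching Algorithm": it also tries every possible shift of the pattern against the text. The difference is that Rabin-Karp preprocesses the strings. It treats each character as a digit in radix d and reduces the resulting number modulo q, which amounts to computing a hash, and it compares the hash of the pattern with the hash of each length-m window of the text before comparing any characters. Preprocessing takes O(m) time, and the worst-case matching time is O((n-m+1)m); the worst case arises when many windows share the pattern's hash and each one has to be verified character by character.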
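
To make the "radix d, modulo q" preprocessing concrete, here is a minimal sketch of the hash computed with Horner's rule. The helper name window_hash is hypothetical (it does not appear in the original article), and reading characters as unsigned char is an assumption made here so that every byte maps to a value in 0..255.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not from the original article): the hash of s, read as
// a base-d number and reduced modulo q, evaluated with Horner's rule.
int window_hash(const std::string& s, int d, int q)
{
    assert(d > 0 && q > 0);
    int value = 0;
    for (unsigned char c : s)          // unsigned char keeps byte values in 0..255
        value = (d * value + c) % q;   // shift one digit to the left, append c
    return value;
}
```

With d = 16 and q = 101, the string "ab" (character codes 97 and 98) hashes to (16*97 + 98) mod 101 = 1650 mod 101 = 34.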

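The idea behind the Rabin-Karp algorithm:

Let the pattern have length M and the text have length N (N >= M);
First compute the hash of the pattern and the hash of the first M characters of the text;
Compare the two hashes; this comparison is repeated N-M+1 times, once per window of the text:
If the hashes differ, the window cannot match, so move on and compute the hash of the next length-M substring of the text;
If the hashes are equal, the window may still differ from the pattern, so fall back to the naive character-by-character comparison to confirm the match;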
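
The last step matters because equal hashes do not guarantee equal strings. The sketch below counts such "spurious hits". Both SearchStats and search_with_stats are hypothetical names introduced for illustration, and the hash is deliberately recomputed from scratch for every window (no rolling update) to keep the focus on the verification step.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
struct SearchStats {
    std::vector<int> matches;  // shifts where the pattern really occurs
    int spurious_hits = 0;     // hash matched but the characters did not
};

// Plain (non-rolling) hash of s[pos, pos+len) as a base-d number mod q.
static int hash_of(const std::string& s, int pos, int len, int d, int q)
{
    int value = 0;
    for (int k = 0; k < len; ++k)
        value = (d * value + static_cast<unsigned char>(s[pos + k])) % q;
    return value;
}

SearchStats search_with_stats(const std::string& T, const std::string& P, int d, int q)
{
    assert(d > 0 && q > 0);
    SearchStats stats;
    const int n = static_cast<int>(T.size());
    const int m = static_cast<int>(P.size());
    if (m == 0 || m > n) return stats;

    const int p = hash_of(P, 0, m, d, q);
    for (int s = 0; s + m <= n; ++s) {
        if (hash_of(T, s, m, d, q) != p) continue;  // hashes differ: cannot match
        if (T.compare(s, m, P) == 0)                // hashes agree: verify characters
            stats.matches.push_back(s);
        else
            ++stats.spurious_hits;
    }
    return stats;
}
```

With d = 16 and q = 101, "ab" and "bR" both hash to 34, because 16*97 + 98 = 16*98 + 82 = 1650. Searching the text "abbR" for "ab" therefore yields one real match at shift 0 and one spurious hit at shift 2.
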
Rabin-Karp implementation
Pseudocode:

Rabin_Karp_search(T, P, d, q)
    n = T.length
    m = P.length
    h = d^(m-1) mod q
    p = 0
    t = 0
    for i = 1 to m
        p = (d*p + P[i]) mod q
        t = (d*t + T[i]) mod q
    for i = 0 to n-m
        if p == t
            if P[1…m] == T[i+1…i+m]
                print "Pattern occurs with shift" i
        if i < n-m
            t = (d*(t - T[i+1]*h) + T[i+m+1]) mod q
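
The update in the last line of the pseudocode can be written out as follows. This is a sketch of the arithmetic, where t_i denotes the hash of the window T[i+1…i+m]. The worked numbers assume the characters are the decimal digits themselves (d = 10, q = 13, m = 5), in the style of the textbook's decimal example.

```latex
\begin{align*}
t_i     &= \Bigl(\sum_{k=1}^{m} d^{\,m-k}\, T[i+k]\Bigr) \bmod q,
           \qquad h = d^{\,m-1} \bmod q \\
t_{i+1} &= \bigl(d\,(t_i - T[i+1]\,h) + T[i+m+1]\bigr) \bmod q \\[4pt]
% Worked example: the window 31415 slides to 14152
t_i     &= 31415 \bmod 13 = 7, \qquad h = 10^{4} \bmod 13 = 3 \\
t_{i+1} &= \bigl(10\,(7 - 3 \cdot 3) + 2\bigr) \bmod 13
         = (-18) \bmod 13 = 8 = 14152 \bmod 13
\end{align*}
```

Here mod denotes the mathematical, non-negative remainder. The % operator in C++ can return a negative result for a negative left operand, which is why an implementation has to add q back when t becomes negative.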

Source code:

// Rabin Karp Algorithm

#include <cstdlib>
#include <iostream>
#include <string>

using namespace std;

void Rabin_Karp_search(const string &T, const string &P, int d, int q)
{
    int m = P.length();
    int n = T.length();
    int i, j;
    int p = 0; // hash value of the pattern
    int t = 0; // hash value of the current text window
    int h = 1;

    // Nothing to search for if the pattern is empty or longer than the text
    // (this also keeps the T[i] reads below in bounds)
    if (m == 0 || m > n)
        return;

    // h = pow(d, m-1) % q, the weight of the leading character of a window
    for (i = 0; i < m-1; i++)
        h = (h*d)%q;

    // Calculate the hash value of the pattern and of the first window of text.
    // Characters are read as unsigned char so bytes above 127 stay non-negative.
    for (i = 0; i < m; i++)
    {
        p = (d*p + (unsigned char)P[i])%q;
        t = (d*t + (unsigned char)T[i])%q;
    }

    // Slide the pattern over the text one position at a time
    for (i = 0; i <= n - m; i++)
    {
        // Compare the hash values of the current window and the pattern.
        // Only if they match, check the characters one by one.
        if ( p == t )
        {
            for (j = 0; j < m; j++)
                if (T[i+j] != P[j])
                    break;

            if (j == m)  // p == t and P[0...m-1] == T[i...i+m-1]
                cout<<"Pattern found at index :"<< i<<endl;
        }

        // Calculate the hash value for the next window of text:
        // remove the leading digit, add the trailing digit
        if ( i < n-m )
        {
            t = (d*(t - (unsigned char)T[i]*h) + (unsigned char)T[i+m])%q;

            // t might be negative here; convert it to a non-negative value
            if (t < 0)
                t = (t + q);
        }
    }
}

int main()
{
    string T = "Rabin–Karp string search algorithm: Rabin-Karp";
    string P = "Rabin";
    int q = 101; // A prime number
    int d = 16;  // Radix used by the hash
    Rabin_Karp_search(T, P, d, q);
    system("pause"); // Windows-only: keeps the console window open
    return 0;
}
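
For testing it can be convenient to have the search return the shifts instead of printing them. The following is a sketch of such a variant, not the article's code: the name rabin_karp_find_all is hypothetical, std::string::compare replaces the hand-written verification loop, and the intermediates are widened to long long on the assumption that this leaves enough headroom for moderate values of d and q.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical variant of Rabin_Karp_search: same rolling hash, but the shifts
// are returned instead of printed, and intermediates use 64-bit arithmetic.
std::vector<int> rabin_karp_find_all(const std::string& T, const std::string& P,
                                     long long d, long long q)
{
    assert(d > 0 && q > 0);
    std::vector<int> shifts;
    const int n = static_cast<int>(T.size());
    const int m = static_cast<int>(P.size());
    if (m == 0 || m > n) return shifts;

    long long h = 1;                        // d^(m-1) mod q
    for (int i = 0; i < m - 1; ++i) h = (h * d) % q;

    long long p = 0, t = 0;                 // pattern hash, current window hash
    for (int i = 0; i < m; ++i) {
        p = (d * p + static_cast<unsigned char>(P[i])) % q;
        t = (d * t + static_cast<unsigned char>(T[i])) % q;
    }

    for (int s = 0; s <= n - m; ++s) {
        // Verify characters only when the hashes agree.
        if (p == t && T.compare(s, m, P) == 0)
            shifts.push_back(s);
        if (s < n - m) {
            const long long lead = static_cast<unsigned char>(T[s]);
            const long long next = static_cast<unsigned char>(T[s + m]);
            t = (d * (t - lead * h) + next) % q;  // drop lead, append next
            if (t < 0) t += q;              // % can yield a negative remainder
        }
    }
    return shifts;
}
```

For example, rabin_karp_find_all("abracadabra", "abra", 16, 101) returns the shifts 0 and 7.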

References:
Introduction to Algorithms (《算法导论》)

http://www.geeksforgeeks.org/searching-for-patterns-set-3-rabin-karp-algorithm/


Reposted from blog.csdn.net/Jamesaonier/article/details/89975452