Detailed explanation of KMP algorithm (personal understanding)

We know that the BF (brute-force) algorithm is slow and wasteful, so what can we do instead? Three masters, D. E. Knuth, J. H. Morris, and V. R. Pratt, created a fast matching algorithm for us: the KMP algorithm. The core of this algorithm is that it eliminates backtracking of the main string pointer, which greatly improves matching speed.
What is this "main string pointer backtracking"? Let me walk you through the implementation steps of KMP.

1. Steps

  • Obtaining the next values of the pattern string
    Because KMP computes the next values of the pattern string (the substring) before matching starts, it avoids the pointer backtracking of BF. Here is an example of what BF does:
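Main string: ababbbab
Substring: abb

Round 1: the first character matches
ababbbab
a
Round 2: the second character matches
ababbbab
ab
Round 3: the match fails
After the failure, the BF algorithm starts matching again from the second character of the main string. This is called main string pointer backtracking.
ababbbab
 abb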
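
To make the backtracking step concrete, here is a minimal sketch of a BF matcher. It is not code from the original article: the function name bfFind and the choice to return an index instead of a bool are assumptions made for illustration only.

```cpp
#include <cstring>

// Brute-force (BF) search: returns the index of the first occurrence of the
// substring T in the main string S, or -1 if there is none.
// bfFind is a hypothetical helper name, not part of the original article.
int bfFind(const char *S, const char *T) {
    int n = static_cast<int>(strlen(S));
    int m = static_cast<int>(strlen(T));
    int i = 0, j = 0;            // i indexes S, j indexes T
    while (i < n && j < m) {
        if (S[i] == T[j]) {
            i++;
            j++;
        } else {
            i = i - j + 1;       // main string pointer backtracks
            j = 0;               // substring restarts from its first character
        }
    }
    return j == m ? i - m : -1;
}
```

With the strings from the example, bfFind("ababbbab", "abb") finds the match at index 2, but only after moving i back from 2 to 1 on the first failure. That backward move is exactly what KMP removes.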

The following code computes next[] for the pattern string T:

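// Assumes T (the pattern) and next[] (at least strlen(T) + 1 entries) are globals.
void getnext() {
    next[0] = -1;
    int i = 0, j = -1;
    int len = strlen(T);
    while (i < len) {
        if (j == -1 || T[i] == T[j])
            next[++i] = ++j;   // extend the matched prefix by one character
        else
            j = next[j];       // fall back to a shorter prefix
    }
}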

The purpose of computing next is to find out whether the part of the pattern in front of the mismatched character has a prefix that equals its suffix. If it does, the pattern can jump ahead by that length instead of starting over. This is a little convoluted, so I will walk through it with pictures in a moment. This is how the KMP algorithm avoids the problem of pointer backtracking.
Put simply, backtracking is also known as the "trial method". When solving a problem, every step is taken tentatively. If you find that the current choice is not the best one, or that continuing this way cannot reach the goal, you immediately step back and choose again. This approach of stepping back and trying again after a failure is the backtracking algorithm.
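
To see what the next values look like for a real pattern, the sketch below rebuilds the same table and returns it as a vector. The function name buildNext and the use of std::string and std::vector are assumptions made for this illustration. The loop itself follows getnext() above.

```cpp
#include <string>
#include <vector>

// Builds the next table for pattern T with the same convention as getnext():
// next[0] = -1, and for k >= 1, next[k] is the length of the longest proper
// prefix of T[0..k-1] that is also a suffix of it.
// buildNext is a hypothetical name used only for this illustration.
std::vector<int> buildNext(const std::string &T) {
    int len = static_cast<int>(T.size());
    std::vector<int> next(len + 1, 0);
    next[0] = -1;
    int i = 0, j = -1;
    while (i < len) {
        if (j == -1 || T[i] == T[j]) {
            next[++i] = ++j;   // prefix of length j also ends at position i
        } else {
            j = next[j];       // try the next shorter prefix
        }
    }
    return next;
}
```

For the pattern "ABBAB" this gives -1, 0, 0, 0, 1, 2. The 1 at position 4 says that "ABBA" starts and ends with "A", so after a mismatch at T[4] the pattern can continue from T[1] instead of T[0].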

  • Performing the match

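// Assumes T (the pattern) and S (the main string) are globals, as in getnext().
bool KMP() {
    getnext();
    int len1 = strlen(T);
    int len2 = strlen(S);
    int i = 0, j = 0;     // i indexes the pattern T, j indexes the main string S
    while (j < len2) {
        if (T[i] == S[j]) {
            i++;
            j++;
            if (i == len1) {   // the whole pattern has matched
                return true;
            }
        } else {
            i = next[i];       // fall back in the pattern; j never moves back
            if (i == -1) {     // no prefix to reuse: advance both pointers
                j++; i++;
            }
        }
    }
    return false;
}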

Enough talk, let's go straight to the pictures, which show how the code above runs.
[Picture]

Specific process:
[Pictures]
In the pictures, the first five characters match. What should we do when the sixth one does not? We simply apply i = next[i], which moves the pattern pointer to the B shown in the fourth picture, and matching resumes from that B.
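
The fallback step i = next[i] can also be sketched in a variant that reports where the match starts. This is not the article's code: the name kmpFind, the std::string parameters, the returned index, and the handling of an empty pattern are all assumptions made for illustration.

```cpp
#include <string>
#include <vector>

// Returns the index in S where the pattern T first occurs, or -1.
// kmpFind is a hypothetical name; the article's KMP() only returns a bool.
int kmpFind(const std::string &T, const std::string &S) {
    int len1 = static_cast<int>(T.size());
    int len2 = static_cast<int>(S.size());
    if (len1 == 0) return 0;   // assumption: an empty pattern matches at 0

    // Build next[] with the same convention as getnext(): next[0] = -1.
    std::vector<int> next(len1 + 1, 0);
    next[0] = -1;
    for (int i = 0, j = -1; i < len1; ) {
        if (j == -1 || T[i] == T[j]) next[++i] = ++j;
        else j = next[j];
    }

    int i = 0, j = 0;          // i indexes T, j indexes S
    while (j < len2) {
        if (T[i] == S[j]) {
            i++;
            j++;
            if (i == len1) return j - len1;   // whole pattern matched
        } else {
            i = next[i];       // fall back in the pattern only
            if (i == -1) {     // no prefix to reuse: advance in S
                i = 0;
                j++;
            }
        }
    }
    return -1;
}
```

Note that j is only ever incremented, so the main string pointer never backtracks. For example, kmpFind("abb", "ababbbab") returns 2: after the mismatch at S[2], the pattern falls back to T[0] and matching continues from S[2] without revisiting S[1].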

  • Code
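#include <cstdio>
#include <cstring>
using namespace std;

// Returns true if the pattern T occurs in the main string S.
// The pattern must be shorter than 100 characters (the size of next[]).
bool KMP(char *T, char *S) {
    // Build next[] for the pattern T.
    int next[100];
    next[0] = -1;
    int i = 0, j = -1;
    int len = strlen(T);
    while (i < len) {
        if (j == -1 || T[i] == T[j]) {
            i++;
            j++;
            next[i] = j;
        } else {
            j = next[j];
        }
    }

    int len1 = strlen(T);
    int len2 = strlen(S);
    int i_ = 0, j_ = 0;     // i_ indexes the pattern T, j_ indexes the main string S
    while (j_ < len2) {
        if (T[i_] == S[j_]) {
            i_++;
            j_++;
            if (i_ == len1) {   // the whole pattern has matched
                return true;
            }
        } else {
            i_ = next[i_];      // fall back in the pattern; j_ never moves back
            if (i_ == -1) {     // no prefix to reuse: advance both pointers
                j_++; i_++;
            }
        }
    }
    return false;
}

int main()
{
    char a[] = "ABBABBABAB";   // main string
    char b[] = "ABBAB";        // pattern
    bool m = KMP(b, a);        // pattern first, then main string
    printf("%d\n", m);         // prints 1
    return 0;
}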

Origin blog.csdn.net/qq_45125250/article/details/109684508