Brief notes on the KMP pattern matching algorithm

The KMP algorithm is a classic pattern matching algorithm from data structures courses. Some parts of it are rather obscure, and many tutorials on the Internet are not very intuitive. So I am going to put it in my own plain words and briefly record what the KMP algorithm is about.
I will introduce it through an example.

An example

Take the string h = "hello" and match it against the string s = "hehehello". Here is the result first, to get a feel for it (the table is computed over s):

Index      0  1  2  3  4  5  6  7  8
Character  h  e  h  e  h  e  l  l  o
next      -1  0  0  1  2  3  4  0  0
nextval   -1  0 -1  0 -1  0  4  0  0

The meaning of the next array

The next array is a record that belongs to the string it is computed over, here the string s from the table. next[i] is the length k of the longest string that ends at the character just before position i and is also a string starting at position 0 (excluding the whole stretch before position i itself). By convention next[0] = -1. From another perspective: suppose matching fails at position i of s. At that moment the k characters just before position i are identical to the first k characters of s, so they are already known to match and do not need to be compared again. Matching then resumes by comparing the character that failed to match against position k of s (its (k+1)-th character).
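As a rough illustration of the definition above, here is a minimal Python sketch that computes the next array for the string in the table. The function name build_next is my own choice for illustration, and the loop is the usual textbook construction, which these notes do not spell out, so treat it as one possible way to do it rather than the only one.

```python
def build_next(p):
    """Return the next array for p, using the convention next[0] = -1.

    next[i] is the length of the longest proper prefix of p[0:i]
    that is also a suffix of p[0:i].
    """
    nxt = [-1] * len(p)
    i, k = 0, -1  # k is the length of the prefix currently being extended
    while i < len(p) - 1:
        if k == -1 or p[i] == p[k]:
            # The prefix of length k can be extended by one character.
            i += 1
            k += 1
            nxt[i] = k
        else:
            # Fall back to the next shorter candidate prefix.
            k = nxt[k]
    return nxt


print(build_next("hehehello"))  # [-1, 0, 0, 1, 2, 3, 4, 0, 0]
```

The printed list agrees with the next row of the table.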

The meaning of the nextval array

nextval builds on the next array: it makes every jump worthwhile and saves pointless steps.
The idea is to make next jump straight to a position that is worth comparing. Here p denotes the string from the table. For example, next[5] = 3 jumps to p[3], which is the character "e". But p[5] is also "e", so it is already known that the comparison at p[3] will fail as well. Without nextval a step is wasted: you jump to p[3] first, then use next[3] = 1 as a springboard to jump to p[1], find that p[1] is still "e", and jump once more, via next[1] = 0, to p[0]. p[0] is "h", which is different and therefore worth comparing.
The intermediate stops from p[5] to p[3] and then to p[1] are redundant and can be skipped by jumping directly to p[0]. That is the meaning of nextval: it leaves out the intermediate steps and points directly at a position worth comparing. So in this case nextval[5] = 0, which points at p[0], whereas next[5] = 3.
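The shortcut described above can be sketched in Python as follows. Both function names are hypothetical, chosen only for this illustration, and build_next uses the usual textbook construction with next[0] = -1. The rule applied is: if p[i] equals p[next[i]], the jump is pointless, so reuse the already computed nextval of that position; otherwise keep next[i].

```python
def build_next(p):
    """next array with next[0] = -1 (usual textbook construction)."""
    nxt = [-1] * len(p)
    i, k = 0, -1
    while i < len(p) - 1:
        if k == -1 or p[i] == p[k]:
            i += 1
            k += 1
            nxt[i] = k
        else:
            k = nxt[k]
    return nxt


def build_nextval(p):
    """Derive nextval from next by skipping jumps that land on the same character."""
    nxt = build_next(p)
    nextval = nxt[:]  # start from next and shorten the jumps where possible
    for i in range(1, len(p)):
        k = nxt[i]  # for i >= 1, k is always a valid index (k >= 0)
        if p[i] == p[k]:
            # Comparing against p[k] would fail for the same reason,
            # so jump to wherever position k would have jumped.
            nextval[i] = nextval[k]
    return nextval


print(build_nextval("hehehello"))  # [-1, 0, -1, 0, -1, 0, 4, 0, 0]
```

For index 5 this gives 0 instead of 3, which is exactly the p[5] to p[0] shortcut from the example.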

To sum up

To understand the KMP algorithm, the most important thing is to understand what it is for: it skips the parts that have already been matched, so that every comparison is worthwhile.
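To see the skipping in action, here is a small Python sketch of a complete search. It is only an illustration under some assumptions of mine: the names build_next and kmp_search are hypothetical, the next array is built over the string being searched for (as in the usual formulation of KMP), and the function returns the index of the first match or -1.

```python
def build_next(p):
    """next array with next[0] = -1 (usual textbook construction)."""
    nxt = [-1] * len(p)
    i, k = 0, -1
    while i < len(p) - 1:
        if k == -1 or p[i] == p[k]:
            i += 1
            k += 1
            nxt[i] = k
        else:
            k = nxt[k]
    return nxt


def kmp_search(text, pattern):
    """Return the index of the first occurrence of pattern in text, or -1."""
    if not pattern:
        return 0  # assumption: an empty pattern matches at position 0
    nxt = build_next(pattern)
    i = j = 0  # i walks over text and never moves backwards; j walks over pattern
    while i < len(text) and j < len(pattern):
        if j == -1 or text[i] == pattern[j]:
            i += 1
            j += 1
        else:
            # Mismatch: keep i in place and reuse what is already matched.
            j = nxt[j]
    return i - j if j == len(pattern) else -1


print(kmp_search("hehehello", "hello"))  # 4
```

Note that i only ever moves forward; on a mismatch only j is moved back through the next array. The nextval array could be used in place of next in the same loop.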

Origin blog.csdn.net/baidu_31788709/article/details/112004185