Understanding the KMP algorithm and writing the code by hand

KMP algorithm

The KMP algorithm checks whether a string str1 (of length n) contains another string str2 (of length s) as a substring.

A quick note: the brute-force solution takes O(sn) time. Consider the following example:
[Figure missing in source: str1 and str2 aligned, with the first four characters acac matching and a mismatch where str1 has a and str2 has d]

The a and the d differ, so the strings do not match at this alignment. A brute-force solution would continue as follows.

[Figure missing in source: the brute-force step after the mismatch]
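The brute-force approach described above can be sketched as follows. This is only an illustration; the class name BruteForceMatch and the method name bruteForceContains are made up for this example and are not part of the original post. Every start position in the text is tried, and after a mismatch the comparison restarts from the first character of the pattern, which is where the O(sn) cost comes from.

```java
class BruteForceMatch {
    // Returns true if text contains pattern, trying every start position in turn.
    static boolean bruteForceContains(String text, String pattern) {
        for (int start = 0; start + pattern.length() <= text.length(); start++) {
            int j = 0;
            // Compare character by character from the beginning of the pattern.
            while (j < pattern.length() && text.charAt(start + j) == pattern.charAt(j)) {
                j++;
            }
            if (j == pattern.length()) {
                return true; // the whole pattern matched at this start position
            }
            // On a mismatch, everything learned so far is thrown away
            // and the next start position is tried from scratch.
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(bruteForceContains("acacacad", "acacad")); // true
        System.out.println(bruteForceContains("acacab", "acad"));     // false
    }
}
```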

It starts matching again from the beginning of str2. Now look at what the KMP algorithm does instead:
[Figure missing in source: the KMP step after the same mismatch]

Why is that?

The essence of the KMP algorithm is to exploit the structure of the pattern string itself so that comparisons already made are never repeated, which speeds up the brute-force algorithm.

In the first figure:
[Figure missing in source: the initial alignment of str1 and str2]

The first four characters, acac, match in both strings, and the part of str2 before d has the following property:

A...A
That is, it begins and ends with the same substring (here A = ac, and the middle part ... is empty).

This makes the picture above easier to understand. Part 2 and part 3 have already been compared successfully, and part 2 equals part 1 (this is the property of str2 we just identified). So part 1 and part 3 do not need to be compared again, and we can jump directly to the following position.
[Figure missing in source: str2 shifted so that part 1 lines up with part 3]
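To make the A...A property concrete, here is a small sketch that finds the longest A for a given string by direct checking. The names BorderDemo and borderLength are hypothetical and chosen only for this illustration; the method is deliberately naive and is not how KMP computes the value in practice.

```java
class BorderDemo {
    // Length of the longest proper prefix of s that is also a suffix of s,
    // i.e. the length of the longest A such that s has the form A...A.
    // A and A are allowed to overlap, as in "acaca" with A = "aca".
    static int borderLength(String s) {
        for (int len = s.length() - 1; len > 0; len--) {
            String suffix = s.substring(s.length() - len);
            if (s.startsWith(suffix)) {
                return len;
            }
        }
        return 0; // no non-empty A exists
    }

    public static void main(String[] args) {
        System.out.println(borderLength("acac"));   // 2  (A = "ac")
        System.out.println(borderLength("acaca"));  // 3  (A = "aca")
        System.out.println(borderLength("acacaf")); // 0
    }
}
```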

Defining the property of str2: the next array

The next array answers one question: when a character of str1 does not match a character of str2, which position should the j pointer jump to?
Here are a few scenarios. Where should j jump in each of them?
[Figure missing in source: a mismatch at the d in position 4 of str2]

Here the string before d in str2 does not have the A...A property, so we jump to the following position:
[Figure missing in source: str2 realigned with j at position 0]

(The j pointer jumps to position 0 and the i pointer does not move, so next[4]=0 in our next array.)

Where does j jump here?
[Figure missing in source: a mismatch at the h in position 6 of str2]

The part of str2 before h has the A...A property, so we jump to the following position:

[Figure missing in source: str2 realigned with j at position 2]

That is, when the character at i differs, the j pointer jumps from position 6 to position 2, so for this str2, next[6]=2.

Where does j jump here?
[Figure missing in source: a mismatch at position 0 of str2]

We jump to the position shown below:
[Figure missing in source: i advanced by one, j still at position 0]

(The i pointer advances by 1 and the j pointer does not move. We mark this case with -1 in the next array, that is, next[0]=-1.)

Now look at the complete next array below:
[Figure missing in source: the next array for str2 = acacafg]

  1. If position 0 does not match, the i pointer advances by 1 and the j pointer does not move; the entry is set to -1.
  2. If position 1 does not match, look at the string before c, which is a. It does not have the A...A property, so the j pointer jumps to 0.
  3. If position 2 does not match, look at the string before a, which is ac. It does not have the A...A property, so the j pointer jumps to 0.
  4. If position 3 does not match, look at the string before c, which is aca. It has the A...A property (A = a), so the j pointer jumps to the position right after the first A, that is, 1.
  5. If position 4 does not match, look at the string before a, which is acac. It has the A...A property (A = ac), so the j pointer jumps to the position right after the first A, that is, 2.
  6. If position 5 does not match, look at the string before f, which is acaca. It has the A...A property (A = aca, and the two copies of A overlap), so the j pointer jumps to the position right after the first A, that is, 3.
  7. If position 6 does not match, look at the string before g, which is acacaf. It does not have the A...A property, so the j pointer jumps to 0.
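The seven steps above can be reproduced mechanically. The sketch below fills the table straight from the definition: for each position j it looks at the string before that position and takes the length of its longest A. The names NextTableDemo, borderLength and nextByDefinition are hypothetical, and this version is written for readability rather than speed.

```java
import java.util.Arrays;

class NextTableDemo {
    // Longest proper prefix of s that is also a suffix of s (the longest A in A...A).
    static int borderLength(String s) {
        for (int len = s.length() - 1; len > 0; len--) {
            if (s.startsWith(s.substring(s.length() - len))) {
                return len;
            }
        }
        return 0;
    }

    // next[0] is -1 by convention; next[j] is the border length of pattern[0..j-1].
    static int[] nextByDefinition(String pattern) {
        int[] next = new int[pattern.length()];
        for (int j = 0; j < pattern.length(); j++) {
            next[j] = (j == 0) ? -1 : borderLength(pattern.substring(0, j));
        }
        return next;
    }

    public static void main(String[] args) {
        // Prints [-1, 0, 0, 1, 2, 3, 0], matching the seven steps in the list.
        System.out.println(Arrays.toString(nextByDefinition("acacafg")));
    }
}
```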

Code

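public class main {

	// Build the next array.
	// The key to writing this function is understanding how next[i] follows from next[i-1].
	// next[i-1]=t means the first t characters (0..t-1) equal the t characters ending at i-2.
	// So if s[t]==s[i-1], then next[i]=t+1.
	// Otherwise fall back to t=next[t] and compare again, because a shorter A...A may still fit.
	// If t falls back to -1, there is no A...A at all and next[i]=0.
	// Assumes s is not empty.
	public static int[] getnext(String s)
	{
		int[] next=new int[s.length()];
		next[0]=-1;
		for(int i=1;i<next.length;i++)
		{
			int t=next[i-1];
			while(t>=0&&s.charAt(t)!=s.charAt(i-1))
			{
				t=next[t];
			}
			next[i]=t+1;
		}
		return next;
	}

	// Returns true if s1 contains s2.
	public static boolean kmpmatch(String s1,String s2)
	{
		if(s2.isEmpty())
		{
			return true;
		}
		int[] next=getnext(s2);
		int i=0,j=0;
		while(i<s1.length()&&j<s2.length())
		{
			// j==-1 means position 0 of s2 failed: advance i and restart j at 0.
			if(j==-1||s1.charAt(i)==s2.charAt(j))
			{
				i+=1;
				j+=1;
			}
			else
			{
				// Mismatch: i stays where it is, only j jumps.
				j=next[j];
			}
		}
		// Checked after the loop so that a match ending at the last character of s1 is found.
		return j==s2.length();
	}

	public static void main(String[] args)
	{
		System.out.println(kmpmatch("adaddefg", "ac"));
	}
}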
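As a usage-oriented variant, the following sketch returns the index of the first match instead of a boolean, which makes the result easy to compare against Java's built-in String.indexOf. The names KmpIndexOf, buildNext and kmpIndexOf are made up for this example, and returning 0 for an empty pattern is an assumption that mirrors String.indexOf. The table is built with a fallback loop: when the candidate length t cannot be extended, t is replaced by next[t] and the comparison is retried.

```java
class KmpIndexOf {
    // next[0] = -1; next[i] = length of the longest A such that p[0..i-1] has the form A...A.
    // Assumes p is not empty.
    static int[] buildNext(String p) {
        int[] next = new int[p.length()];
        next[0] = -1;
        int i = 0;
        int t = -1; // candidate border length for the prefix ending before position i
        while (i < p.length() - 1) {
            if (t == -1 || p.charAt(i) == p.charAt(t)) {
                i++;
                t++;
                next[i] = t;
            } else {
                t = next[t]; // fall back to the next shorter candidate
            }
        }
        return next;
    }

    // Index of the first occurrence of p in text, or -1 if there is none.
    static int kmpIndexOf(String text, String p) {
        if (p.isEmpty()) {
            return 0; // assumption: same convention as String.indexOf
        }
        int[] next = buildNext(p);
        int i = 0, j = 0;
        while (i < text.length() && j < p.length()) {
            if (j == -1 || text.charAt(i) == p.charAt(j)) {
                i++;
                j++;
            } else {
                j = next[j]; // i never moves backwards
            }
        }
        return j == p.length() ? i - j : -1;
    }

    public static void main(String[] args) {
        System.out.println(kmpIndexOf("aabaabaaac", "aabaaac")); // 3
        System.out.println(kmpIndexOf("adaddefg", "ac"));        // -1
    }
}
```

Because i only ever moves forward, the scan over the text is linear in its length once the table has been built.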
   
	

nextval

  1. nextval is a refinement of the next array and is easy to understand once next is clear.

  2. The next array stores the position the j pointer jumps to. If next[a]=b and s2[a] fails to match, the j pointer jumps from a to b. But if s2[b]==s2[a], the comparison at b is bound to fail as well and j has to jump again. So we can store the final landing position of the jump directly in the array; the result is nextval.

  3. For example:
    [Figure missing in source: a next array and the corresponding nextval array]

  4. Code

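	public static int[] getnextval(String s)
	{
		int[] next=getnext(s);
		// Start at 1: next[0] is -1, so there is no character to compare against.
		for(int i=1;i<next.length;i++)
		{
			// If s[i]==s[next[i]], the comparison after the jump would fail too,
			// so store where that second jump leads. Entries below i are already final.
			if(s.charAt(i)==s.charAt(next[i]))
			{
				next[i]=next[next[i]];
			}
		}
		return next;
	}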
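A small worked example may help here. The sketch below computes both tables for the pattern aaaab, where the effect is easy to see: a mismatch at any of the first four a characters can never be repaired by jumping to an earlier a. The names NextvalDemo, buildNext and buildNextval are hypothetical, and the block carries its own table construction so that it runs on its own.

```java
import java.util.Arrays;

class NextvalDemo {
    // Plain next table with next[0] = -1. Assumes p is not empty.
    static int[] buildNext(String p) {
        int[] next = new int[p.length()];
        next[0] = -1;
        int i = 0;
        int t = -1;
        while (i < p.length() - 1) {
            if (t == -1 || p.charAt(i) == p.charAt(t)) {
                i++;
                t++;
                next[i] = t;
            } else {
                t = next[t];
            }
        }
        return next;
    }

    // nextval: skip jump targets that hold the same character as the failed position.
    static int[] buildNextval(String p) {
        int[] nextval = buildNext(p);
        for (int i = 1; i < p.length(); i++) {
            // nextval[i] still holds next[i] here and is >= 0 for i >= 1.
            if (p.charAt(i) == p.charAt(nextval[i])) {
                nextval[i] = nextval[nextval[i]];
            }
        }
        return nextval;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(buildNext("aaaab")));    // [-1, 0, 1, 2, 3]
        System.out.println(Arrays.toString(buildNextval("aaaab"))); // [-1, -1, -1, -1, 3]
    }
}
```

Note that nextval can contain -1 at positions other than 0, so a matcher that uses it has to treat j == -1 as "advance i and restart j at 0".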

KMP applications

Classic problem 1: the string "123456" has several rotations: "123456", "234561",
"345612", ...
Determine whether str1 is a rotation of str2.

If the two lengths differ, the answer is no. Otherwise, extend str1 to str1str1 and check whether str2 is a substring of it; this is where KMP is used.
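The doubling trick can be sketched in a few lines. In this illustration the built-in String.contains stands in for the KMP matcher, purely to keep the example short; the class name RotationCheck and the method name isRotation are made up here. The explicit length check is needed because a shorter string such as "23456" is a substring of the doubled string without being a rotation.

```java
class RotationCheck {
    // True if b can be obtained by rotating a.
    static boolean isRotation(String a, String b) {
        if (a.length() != b.length()) {
            return false;
        }
        // Every rotation of a appears as a substring of a + a.
        // String.contains is a stand-in for a KMP substring search.
        return (a + a).contains(b);
    }

    public static void main(String[] args) {
        System.out.println(isRotation("123456", "345612")); // true
        System.out.println(isRotation("123456", "23456"));  // false (lengths differ)
        System.out.println(isRotation("123456", "123465")); // false
    }
}
```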

Classic problem 2: determine whether tree2 is a subtree of tree1. The match must be complete, that is, the subtree consists of a node of tree1 taken as root together with everything below it:
[Figure missing in source: example trees for the subtree problem]

Brute-force solution:

public static class TreeNode {
	     int val;
	     TreeNode left;
	     TreeNode right;
	     TreeNode(int x) { val = x; }
	 }

	// Returns true if some node of root1, together with everything below it,
	// is identical to the tree rooted at root2.
	public static boolean ischildtree(TreeNode root1,TreeNode root2)
	{
		if(root1==null||root2==null)
		{
			return false;
		}
		if(issametree(root1, root2))
		{
			return true;
		}
		return ischildtree(root1.left, root2)||ischildtree(root1.right, root2);
	}

	// Returns true if the two trees have the same shape and the same values.
	public static boolean issametree(TreeNode root1,TreeNode root2)
	{
		if((root1==null&&root2!=null)||(root1!=null&&root2==null))
		{
			return false;
		}
		if(root1==null&&root2==null)
		{
			return true;
		}
		if(root1.val==root2.val)
		{
			return issametree(root1.left, root2.left)&&issametree(root1.right, root2.right);
		}
		else {
			return false;
		}
	}

In essence this is also a string comparison. Each tree is padded with placeholders for its missing children and written out as a sequence in array order; the KMP algorithm then checks whether the sequence of the second tree occurs as a substring of the sequence of the first.

[Figure missing in source: the two trees written out as sequences]
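One possible way to realize this idea is sketched below. It is an assumption-laden illustration, not the original author's code: instead of padding to a full tree in array order, it swaps in a pre-order serialization in which every missing child is written as a null token, and it runs KMP over lists of tokens rather than over characters so that values such as 12 and 2 cannot be confused. The names SubtreeByKmp, Node, serialize, containsSequence and isSubtree are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class SubtreeByKmp {
    static class Node {
        int val;
        Node left, right;
        Node(int val, Node left, Node right) {
            this.val = val;
            this.left = left;
            this.right = right;
        }
    }

    // Pre-order serialization; a missing child is recorded as a null token.
    static void serialize(Node n, List<Integer> out) {
        if (n == null) {
            out.add(null);
            return;
        }
        out.add(n.val);
        serialize(n.left, out);
        serialize(n.right, out);
    }

    // Null-safe token comparison.
    static boolean same(Integer a, Integer b) {
        return a == null ? b == null : a.equals(b);
    }

    // KMP over token lists: true if p occurs as a contiguous run inside text.
    // Assumes p is not empty.
    static boolean containsSequence(List<Integer> text, List<Integer> p) {
        int[] next = new int[p.size()];
        next[0] = -1;
        int i = 0, t = -1;
        while (i < p.size() - 1) {
            if (t == -1 || same(p.get(i), p.get(t))) {
                i++;
                t++;
                next[i] = t;
            } else {
                t = next[t];
            }
        }
        int k = 0, j = 0;
        while (k < text.size() && j < p.size()) {
            if (j == -1 || same(text.get(k), p.get(j))) {
                k++;
                j++;
            } else {
                j = next[j];
            }
        }
        return j == p.size();
    }

    static boolean isSubtree(Node big, Node small) {
        if (big == null || small == null) {
            return false; // same convention as the brute-force version
        }
        List<Integer> a = new ArrayList<>();
        List<Integer> b = new ArrayList<>();
        serialize(big, a);
        serialize(small, b);
        return containsSequence(a, b);
    }

    public static void main(String[] args) {
        Node big = new Node(1, new Node(2, new Node(4, null, null), new Node(5, null, null)), new Node(3, null, null));
        Node whole = new Node(2, new Node(4, null, null), new Node(5, null, null));
        Node partial = new Node(2, new Node(4, null, null), null);
        System.out.println(isSubtree(big, whole));   // true
        System.out.println(isSubtree(big, partial)); // false: the 5 is missing
    }
}
```

The null tokens are what make the comparison "complete": without them, a tree that only matches the upper part of a subtree would serialize to a sequence that still appears inside the larger one.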

Origin blog.csdn.net/hch977/article/details/112260217