Naive pattern matching algorithm (brute-force string search)

0. Preface

This article uses the naive pattern matching algorithm to check whether a substring (the pattern string) occurs in a main string.

Development environment: Dev-Cpp
Operating system: Windows 10 Professional

1. Algorithm Introduction

The naive pattern matching algorithm, also known as the brute-force pattern matching algorithm or exhaustive search, is a simple and direct string matching algorithm.

Algorithm idea: Align the pattern string with the first character of the main string and compare the two character by character. If the characters are equal, continue with the next pair; on a mismatch, shift the pattern one position to the right in the main string and start comparing again from the first character of the pattern. If every character of the pattern matches, the match succeeds; if fewer characters remain in the main string than the length of the pattern, the match fails.

Algorithm steps:

  • Starting from the first character of the main string, compare it with the first character of the pattern string;
  • If the two characters are equal, compare the next pair of characters, continuing until the last character of the pattern string;
  • If every character of the pattern string equals the corresponding character of the main string, the match is successful;
  • If any character of the pattern string differs from the corresponding character of the main string, go back to the first character of the pattern string and restart the comparison one position further along in the main string.
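The steps above can be sketched as a small C function. This is a minimal illustration rather than the article's own program: the name `naive_search` and the convention of returning -1 when nothing is found are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first occurrence of
 * pattern in text, or -1 if the pattern does not occur. */
long naive_search(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);

    size_t n = strlen(text);     /* length of the main string */
    size_t m = strlen(pattern);  /* length of the pattern string */

    /* Try every starting position i that leaves room for m characters. */
    for (size_t i = 0; i + m <= n; i++) {
        size_t j = 0;
        /* Compare character by character until a mismatch or the end
         * of the pattern. */
        while (j < m && text[i + j] == pattern[j])
            j++;
        if (j == m)
            return (long)i;      /* every pattern character matched */
        /* Otherwise restart at the next position of the main string. */
    }
    return -1;                   /* no starting position matched */
}
```

With the strings used later in this article, `naive_search("abcdefg", "def")` yields 3 and `naive_search("abcdefg", "jfha")` yields -1. An empty pattern is treated as matching at index 0, which is a choice of this sketch.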

Algorithm complexity: The worst-case time complexity of the naive pattern matching algorithm is O((n-m+1)m), where n is the length of the main string and m is the length of the pattern string. There are n-m+1 possible starting positions in the main string, and in the worst case each of them requires m character comparisons, for a total of (n-m+1)m comparisons. In the best case, when the pattern matches at the first position, only m comparisons are needed.
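To make the (n-m+1)m figure concrete, the following sketch counts the character comparisons a naive search performs before it stops. The function name `count_comparisons` is hypothetical and exists only for this illustration; it stops at the first match, like the search described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the character comparisons made by a naive
 * search that stops at the first match (or after the last position). */
size_t count_comparisons(const char *text, const char *pattern)
{
    assert(text != NULL && pattern != NULL);

    size_t n = strlen(text);
    size_t m = strlen(pattern);
    size_t comparisons = 0;

    for (size_t i = 0; i + m <= n; i++) {
        size_t j = 0;
        while (j < m) {
            comparisons++;                    /* one character comparison */
            if (text[i + j] != pattern[j])
                break;                        /* mismatch: try next position */
            j++;
        }
        if (j == m)
            break;                            /* full match: stop searching */
    }
    return comparisons;
}
```

For the main string "aaaaaaaa" (n = 8) and the pattern "aab" (m = 3), each of the 6 starting positions matches two characters and fails on the third, so the count is (8-3+1)*3 = 18. For "abcdefg" and "abc" the pattern matches at the first position and only 3 comparisons are made.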

Advantages and disadvantages of the algorithm: The naive pattern matching algorithm is simple to implement and the code is easy to understand. However, after a mismatch it discards everything learned from the comparisons already made and starts over at the next position, so the same characters of the main string may be compared many times and efficiency is low. It is suitable when the main string and the pattern string are short. For longer strings, more efficient string matching algorithms such as KMP or Boyer-Moore are recommended.

2. Code implementation

main.cpp

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

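int main(){
    // Edit here: s1 is the main string, s2 is the substring (pattern) to find
    char s1[] = "abcdefg";
    char s2[] = "abc";

    int i,j;
    int len_main = (int)strlen(s1);
    int len_son = (int)strlen(s2);

    printf("s1 is '%s'\n",s1);
    printf("the string to find is '%s'\n\n",s2);

    // Take every window s1[i:j] whose length equals len_son
    for(i=0;i<len_main-len_son+1;i++)
    {
        // sizeof(s2) equals len_son+1 and is a compile-time constant, so temp
        // is an ordinary array (a variable-length array cannot be initialized)
        char temp[sizeof(s2)];
        j=i+len_son;

        // If j exceeds the length of the main string, exit with an error
        // (defensive only: the loop bound already guarantees j <= len_main)
        if(j>len_main)
        {
            printf("false\n");
            exit(0);
        }

        // Copy the window s1[i..j-1] into temp
        for(int m=i,n=0;m<j;m++,n++)
            temp[n] = s1[m];

        // Terminate temp with '\0'
        temp[len_son] = '\0';

        // Compare the window with the pattern
        if(!strcmp(temp,s2))
            printf("√ s1[%d:%d] is '%s'\n",i,j,temp);
        else
            printf("× s1[%d:%d] is not '%s', it is '%s'\n",i,j,s2,temp);
    }

    printf("\nend\n");

    return 0;
}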
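The program above copies each window into `temp` so that it can print it. When only the match position is needed, the copy can be skipped by comparing in place with the standard `memcmp` function. The sketch below is one possible variant, not the article's code; the name `find_from` and its `start` parameter are assumptions introduced here so that later occurrences can be found by calling it again.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first occurrence of
 * pattern in text at or after position start, or -1 if there is none. */
long find_from(const char *text, const char *pattern, size_t start)
{
    assert(text != NULL && pattern != NULL);

    size_t n = strlen(text);
    size_t m = strlen(pattern);

    for (size_t i = start; i + m <= n; i++) {
        /* Compare the m-character window text[i..i+m-1] with the pattern
         * directly, without copying it into a temporary buffer. */
        if (memcmp(text + i, pattern, m) == 0)
            return (long)i;
    }
    return -1;
}
```

For example, `find_from("abcabcab", "abc", 0)` gives 0, and calling it again with `start` set to 1 gives the next occurrence at index 3. The amount of work is the same as in the loop above; only the temporary copy is avoided.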

3. Running results

Find abc
[Screenshot of the program output; image not available]

Find def
[Screenshot of the program output; image not available]

Find jfha (does not exist)
[Screenshot of the program output; image not available]

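The above is the result of the brute-force algorithm. As the main string gets longer, the worst-case number of comparisons grows in proportion to (n-m+1)m: linearly in n for a fixed pattern length m, and up to quadratically when the pattern length grows along with the main string.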

Origin blog.csdn.net/qq_53381910/article/details/131463743