Python series: pattern matching and regular expressions

You may be familiar with text search: press Ctrl-F and enter the word you want to find. Regular expressions are an upgraded version of Ctrl-F, because they let you describe a pattern of text to search for rather than one fixed string. Regular expressions are powerful, but few people outside programming know about them, even though most modern text editors and word processors have find and find-and-replace features that can search with regular expressions. Regular expressions can save a lot of time, for software users as well as for programmers. Once you learn them, a problem that would take someone else days of tedious, error-prone work may take you only a few keystrokes.

 

Finding text patterns without regular expressions

Suppose you want to find the phone number in a string, and you know the pattern: 3 digits, 1 dash, 3 digits, 1 dash, and then 4 digits. For example: 498-553-5453.

Suppose we write a function called isPhoneNumber() that checks whether a string matches this pattern and returns True or False. Open a new file in your editor and enter the following:

 

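def isPhoneNumber(text):
    # A number such as 498-553-5453 is exactly 12 characters long.
    if len(text) != 12:
        return False
    # Characters 0-2 must be digits.
    for i in range(0, 3):
        if not text[i].isdecimal():
            return False
    # Character 3 must be the first dash.
    if text[3] != '-':
        return False
    # Characters 4-6 must be digits.
    for i in range(4, 7):
        if not text[i].isdecimal():
            return False
    # Character 7 must be the second dash.
    if text[7] != '-':
        return False
    # Characters 8-11 must be digits.
    for i in range(8, 12):
        if not text[i].isdecimal():
            return False
    return True

print('498-553-5453 is a phone number:')
print(isPhoneNumber('498-553-5453'))
print('Moshi moshi is a phone number:')
print(isPhoneNumber('Moshi moshi'))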

 

Running the program, the output looks like this:

 

498-553-5453 is a phone number:
True
Moshi moshi is a phone number:
False

 

Calling isPhoneNumber() with the argument '498-553-5453' returns True, while calling it with 'Moshi moshi' returns False. 'Moshi moshi' is only 11 characters long, so it fails the very first test, the length check.

More code is needed to find this pattern inside a longer string. Replace the 4 print() calls in the code above with the following:

 

message = 'Call me at 498-553-5453 tomorrow.415-233-2322 is my office.'
# Slide a 12-character window over the message and test each chunk.
for i in range(len(message)):
    chunk = message[i:i+12]
    if isPhoneNumber(chunk):
        print('Phone number found:' + chunk)
print('Done')

 

When the program is run, the output looks like this:

 

Phone number found:498-553-5453
Phone number found:415-233-2322
Done

 

In this example the string in message is short, but it could contain millions of characters and the program would still run in less than a second. A similar program that finds phone numbers with regular expressions would also run in less than a second, but regular expressions make such a program much quicker to write.
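For comparison, here is a minimal sketch of the same search written with Python's standard re module. It is not part of the original program: the names phone_regex and find_phone_numbers are made up for this illustration, and the pattern assumes the same format as before (3 digits, a dash, 3 digits, a dash, 4 digits).

```python
import re

# Assumed pattern: 3 digits, a dash, 3 digits, a dash, 4 digits.
phone_regex = re.compile(r'\d{3}-\d{3}-\d{4}')

message = 'Call me at 498-553-5453 tomorrow.415-233-2322 is my office.'

def find_phone_numbers(text):
    # findall() returns every non-overlapping match as a list of strings.
    return phone_regex.findall(text)

for number in find_phone_numbers(message):
    print('Phone number found:' + number)
print('Done')
```

The roughly 20 lines of isPhoneNumber() and the chunking loop shrink to a single pattern and one findall() call, which is where the saving in writing time comes from.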


Origin blog.csdn.net/weixin_40273144/article/details/80294518