Python: How to delete (or keep) everything except letters and numbers in a string

Introduction

  While working through LeetCode problems, I ran into the following requirement: in a string containing ",", ":" and other characters, only the letters and numbers are considered. So only the letters and numbers need to be kept, and everything else is deleted.

1. Use the re.sub() function

For more background on regular expressions, see this website.
  Python's re module provides the re.sub function for replacing matches in a string:

re.sub(pattern, repl, string, count=0, flags=0)

Parameters:

pattern: The regular expression pattern to search for.
repl: The replacement string. It can also be a function that takes a match object and returns the replacement.
string: The original string to be searched and modified.
count: The maximum number of replacements to make. The default 0 means all matches are replaced.
flags: Flags that modify how the pattern is matched, such as re.IGNORECASE. Several flags can be combined with |.
The first three parameters are required; the last two are optional.
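The optional parameters are easier to see in action. The following is a minimal sketch; the sample strings are made up for illustration and are not part of the original problem.

```python
import re

text = "a-b-c-d"

# count=1 replaces only the first match; the default 0 replaces every match.
first_only = re.sub(r"-", "", text, count=1)
all_matches = re.sub(r"-", "", text)

# flags=re.IGNORECASE makes the pattern match both upper and lower case.
no_a = re.sub(r"a", "", "Panama", flags=re.IGNORECASE)

# repl may be a function: it receives the match object and returns a string.
shouted = re.sub(r"[a-z]+", lambda m: m.group(0).upper(), "a man")

print(first_only)   # ab-c-d
print(all_matches)  # abcd
print(no_a)         # Pnm
print(shouted)      # A MAN
```

Passing count and flags by keyword, as here, avoids mixing up their positions.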

import re

s = "A man, a plan, a canal: Panama"
s = s.lower()
result = re.sub(r'[\W_]+', '', s)  # raw string, so \W reaches the regex engine unchanged
print(result)
# amanaplanacanalpanama

Isn’t it amazing?
It relies on a few special elements of regular expression syntax:

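[…] A character set: matches any one of the listed characters. [amk] matches 'a', 'm' or 'k'.
[^…] A negated set: matches any one character that is not listed. [^abc] matches any character other than a, b or c.
re* Matches zero or more occurrences of the preceding expression.
re+ Matches one or more occurrences of the preceding expression.
re? Matches zero or one occurrence of the preceding expression. (Placed after another quantifier, as in re*? or re+?, the ? makes that quantifier non-greedy.)
\w Matches a letter, digit or underscore.
\W Matches any character that is not a letter, digit or underscore.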
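Each element in the list above can be tried out with re.findall, which returns every match as a list. This is only an illustrative sketch; the sample strings are invented for the demonstration.

```python
import re

sample = "a_1, k-m?"

in_set = re.findall(r"[amk]", sample)       # any one of a, m, k
not_in_set = re.findall(r"[^amk]", sample)  # any one character except a, m, k
word_runs = re.findall(r"\w+", sample)      # runs of letters, digits, underscore
non_word_runs = re.findall(r"\W+", sample)  # runs of everything else

print(in_set)         # ['a', 'k', 'm']
print(not_in_set)     # ['_', '1', ',', ' ', '-', '?']
print(word_runs)      # ['a_1', 'k', 'm']
print(non_word_runs)  # [', ', '-', '?']

# Quantifiers applied to the preceding expression (here the letter b).
star = re.findall(r"ab*", "a ab abb")      # zero or more b
plus = re.findall(r"ab+", "a ab abb")      # one or more b
optional = re.findall(r"ab?", "a ab abb")  # zero or one b

print(star)      # ['a', 'ab', 'abb']
print(plus)      # ['ab', 'abb']
print(optional)  # ['a', 'ab', 'ab']
```

Note that \w+ keeps the underscore in 'a_1', which is why the pattern used in this article adds _ to the set explicitly.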

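So [\W_]+ matches one or more characters that are neither letters nor numbers. The _ is listed explicitly because \W on its own does not match the underscore.
If we change [\W_]+ to [\w_]+, it matches one or more letters, numbers or underscores instead (the _ is redundant here, since \w already includes it), so the letters and numbers are the part that gets deleted: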

import re

s = "A man, a plan, a canal: Panama"
s = s.lower()
result = re.sub(r'[\w_]+', '', s)  # raw string, so \w reaches the regex engine unchanged
print(result)
#  ,  ,  : 
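To see what the underscore in [\W_]+ contributes, it may help to compare it with \W+ alone and with an explicit ASCII-only set. This is a sketch with made-up input; the [^a-zA-Z0-9] pattern is an alternative shown for comparison, not something the original solution uses.

```python
import re

s = "snake_case, kebab-case: 42"

# \W does not match the underscore, so it survives.
without_underscore = re.sub(r"\W+", "", s)

# Adding _ to the set removes it as well.
with_underscore = re.sub(r"[\W_]+", "", s)

print(without_underscore)  # snake_casekebabcase42
print(with_underscore)     # snakecasekebabcase42

# For str patterns in Python 3, \w is Unicode-aware, so accented
# letters count as letters. An explicit ASCII set treats them as "other".
unicode_aware = re.sub(r"[\W_]+", "", "café_1")
ascii_only = re.sub(r"[^a-zA-Z0-9]+", "", "café_1")

print(unicode_aware)  # café1
print(ascii_only)     # caf1
```

Which behaviour is wanted depends on whether non-ASCII letters should count as letters for the problem at hand.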

2. isalpha() + isnumeric() + join()

  This method uses the built-in string methods isalpha() and isnumeric(): iterate over each character in the string, keep the ones that pass the test, and put them back together with join(). (The join() method concatenates the elements of a sequence, using the string it is called on as the separator, to produce a new string.)

s = "A man, a plan, a canal: Panama"
s = s.lower()
# Remove everything except letters and numbers
s = [i for i in s if i.isalpha() or i.isnumeric()]
s = "".join(s)
print(s)
#  amanaplanacanalpanama

You can also do the opposite and delete only the letters and numbers:

s = "A man, a plan, a canal: Panama"
s = s.lower()
# Remove the letters and numbers, keeping everything else
s = [i for i in s if not i.isalpha() and not i.isnumeric()]
s = "".join(s)
print(s)
#   ,  ,  : 
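As a possible shortcut, the isalpha() or isnumeric() test can be replaced by the single built-in str.isalnum(), which is true when any of isalpha(), isdecimal(), isdigit() or isnumeric() is true. The sketch below wraps this in two helper functions; keep_alnum and keep_ascii_alnum are hypothetical names introduced here for illustration, and str.isascii() assumes Python 3.7 or later.

```python
def keep_alnum(text: str) -> str:
    """Keep only the characters for which str.isalnum() is true."""
    return "".join(ch for ch in text if ch.isalnum())


def keep_ascii_alnum(text: str) -> str:
    """Keep only ASCII letters and digits (a-z, A-Z, 0-9)."""
    return "".join(ch for ch in text if ch.isascii() and ch.isalnum())


s = "A man, a plan, a canal: Panama"
cleaned = keep_alnum(s.lower())
print(cleaned)  # amanaplanacanalpanama

# isnumeric() (and therefore isalnum()) also accepts characters such as
# the vulgar fraction one half, which is not an ASCII digit.
print("½".isnumeric())           # True
print(keep_alnum("x½2"))         # x½2
print(keep_ascii_alnum("x½2"))   # x2
```

If the input can contain non-ASCII text and only a-z, A-Z and 0-9 should count, the ASCII-restricted variant is the safer choice.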

Origin blog.csdn.net/weixin_46649052/article/details/114441811