Python string searching, matching, and replacement (detailed)

If you spot an error, please point it out; this article is still being revised...


The basics of regular expressions in Python

Using regular expression modifiers

In Python, the flags parameter carries the regular expression modifiers.
To set several modifiers at once, combine them with bitwise OR, for example re.I | re.M.

Modifier  Description
re.I      Makes matching case-insensitive
re.L      Does locale-aware matching
re.M      Multi-line matching; affects ^ and $
re.S      Makes . match any character, including newline
re.U      Interprets characters according to the Unicode character set; affects \w, \W, \b, \B
re.X      Verbose mode: allows more flexible formatting (whitespace and comments) so the regular expression is easier to read
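
As a small sketch of how these modifiers combine (the sample text and variable names are made up for illustration), the following passes re.I | re.M through the flags keyword and uses re.S on its own:

```python
import re

text = "first line\nSecond LINE\nthird line"

# re.M lets ^ and $ match at every line boundary; re.I ignores case.
pattern = re.compile(r"^second line$", flags=re.I | re.M)
with_flags = pattern.search(text)
print(with_flags.group())  # Second LINE

# Without the flags, the same pattern finds nothing.
without_flags = re.search(r"^second line$", text)
print(without_flags)  # None

# re.S makes "." match newline characters too.
across_lines = re.findall(r"first.*third", text, flags=re.S)
print(across_lines)
```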

Backreferences in Python

  1. Numbered backreference: a backslash followed by the group number (\N)
  2. (?<name>exp) matches exp and captures the text into the group named name; it can also be written as (?'name'exp).
    In Python, however, the syntax is (?P<name>exp).
  • Afterwards, call group(1) or group('name') on the match object to get the text matched by that group.

> A common problem: backreferences that seem not to work

Note: a pattern that contains backreferences needs the r prefix (a raw string). Otherwise Python treats \1 as an octal escape in the string literal: for example, '\1' actually becomes the character '\x01'.
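
A minimal sketch of the points above, using made-up sample strings: a numbered backreference, a named group written in Python's (?P<name>exp) form (with (?P=name) to refer back to it), and what happens when the r prefix is forgotten.

```python
import re

# Numbered backreference: \1 must repeat whatever group 1 matched.
doubled = re.search(r"\b(\w+) \1\b", "this is is a test")
print(doubled.group(1))  # is

# Named group in Python syntax; (?P=word) refers back to it.
named = re.search(r"(?P<word>\w+) (?P=word)", "hello hello world")
print(named.group("word"), named.group(1))  # hello hello

# Without the r prefix, "\1" is the single character "\x01".
print("\1" == "\x01")  # True
broken = re.search("(a)\1", "aa")   # pattern is really "(a)\x01"
fixed = re.search(r"(a)\1", "aa")   # real backreference
print(broken, fixed.group())  # None aa
```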


Requirement: match a single string

re.match() function

match() requires the string to start with the regular expression [checks whether the beginning of the string is right]. If the beginning of the string does not match, the match fails and None is returned. (This is what many articles mean by "check whether the RE matches at the beginning of the string".)

re.search() function

search() looks for a match anywhere in the string [is there anything like this in the string?]. Note that search() returns only the first match.

re.fullmatch() function

fullmatch() requires the whole string to match the regular expression [checks whether this string is exactly what we want].
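
To make the difference between the three functions concrete, here is a small sketch that runs the same pattern through each of them (the test strings are invented for illustration):

```python
import re

pattern = r"\d+"

# match(): the pattern must match at the start of the string.
m1 = re.match(pattern, "123abc")
m2 = re.match(pattern, "abc123")
print(m1.group(), m2)  # 123 None

# search(): the pattern may match anywhere; only the first match is returned.
s1 = re.search(pattern, "abc123def456")
print(s1.group())  # 123

# fullmatch(): the whole string must match the pattern.
f1 = re.fullmatch(pattern, "123abc")
f2 = re.fullmatch(pattern, "123")
print(f1, f2.group())  # None 123
```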


Requirement: search the full text and replace strings

re.sub() function

For simple needs, str.replace() is enough. Since this article is about the re module, though, complex replacements should use re.sub(). By default it replaces all matches in the string.

The name sub is short for "substitute":
n.: a replacement (person or thing); a substitute (athlete);
v.: to replace (with ...); to take the place of;

re.sub(pattern, repl, string, count=0, flags=0)

pattern: the regular expression
repl: the replacement string; it can also be a function
string: the source string
count: the maximum number of replacements; the default 0 replaces all matches
flags: the matching mode of the regular expression (be careful not to pass it in the count position; write it as flags=xxx)

See also: Regex Modifiers - Optional Flags | Rookie Tutorial

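import re

# Replace "\n3." and the like with "\n【3】"
oriStr = '\n3.This is the third point'
resStr = re.sub(r'\n(\d+)\.', r'\n【\1】', oriStr)
print(resStr)

Output (preceded by an empty line, since the leading \n is kept):

【3】This is the third point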
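
The remaining re.sub() parameters can be sketched the same way. The helper double_number and the sample strings below are hypothetical, chosen only to show a function as repl, the count limit, and flags passed by keyword:

```python
import re

def double_number(match):
    # A function used as repl receives the match object
    # and returns the replacement text.
    return str(int(match.group()) * 2)

doubled = re.sub(r"\d+", double_number, "a1 b20 c300")
print(doubled)  # a2 b40 c600

# count caps the number of replacements (0, the default, means all).
limited = re.sub(r"\d+", "#", "a1 b20 c300", count=2)
print(limited)  # a# b# c300

# Pass flags by keyword so they cannot land in the count position.
ignore_case = re.sub(r"hello", "bye", "Hello hello HELLO", flags=re.I)
print(ignore_case)  # bye bye bye

# re.I is the integer 2, so in the count position it would be read
# as "replace at most 2 matches" instead of "ignore case".
print(int(re.I))  # 2
```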


Requirement: search the full text for all matching strings

re.findall() and re.finditer() functions

  1. re.findall() returns a list of all matches
re.findall(pattern, string, flags=0)
pattern.findall(string[, pos[, endpos]])
import re
 
result1 = re.findall(r'\d+','runoob 123 google 456')
 
pattern = re.compile(r'\d+')   # find runs of digits
result2 = pattern.findall('runoob 123 google 456')
result3 = pattern.findall('run88oob123google456', 0, 10)
 
print(result1)
print(result2)
print(result3)

Output result:

['123', '456']
['123', '456']
['88', '12']
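
One detail worth knowing about findall(): when the pattern contains capture groups, the list holds the groups instead of the whole match. A short sketch with an invented sample string:

```python
import re

text = "runoob=123, google=456"

# No groups: each item is the whole match.
whole = re.findall(r"\w+=\d+", text)
print(whole)  # ['runoob=123', 'google=456']

# One group: each item is the text of that group.
one_group = re.findall(r"\w+=(\d+)", text)
print(one_group)  # ['123', '456']

# Several groups: each item is a tuple of the groups.
pairs = re.findall(r"(\w+)=(\d+)", text)
print(pairs)  # [('runoob', '123'), ('google', '456')]
```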

  2. re.finditer() returns an iterator over all matches (as match objects)
re.finditer(pattern, string, flags=0)
import re
 
it = re.finditer(r"\d+", "12a32bc43jf3")
for match in it:
    print(match.group())

12
32
43
3
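
Because finditer() yields match objects instead of plain strings, the position of each match is available too. A minimal sketch on the same input string, collecting the text and its (start, end) span:

```python
import re

found = []
for match in re.finditer(r"\d+", "12a32bc43jf3"):
    # span() gives the (start, end) indexes of this match in the string.
    found.append((match.group(), match.span()))

print(found)
# [('12', (0, 2)), ('32', (3, 5)), ('43', (7, 9)), ('3', (11, 12))]
```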


Requirement: split a string on matching characters

re.split() function

Uses each match of the pattern as a separator and returns the list of the resulting pieces.

re.split(pattern, string[, maxsplit=0, flags=0])
>>> import re

>>> re.split(r'\W+', 'runoob, runoob, runoob.')
['runoob', 'runoob', 'runoob', '']

>>> re.split(r'(\W+)', ' runoob, runoob, runoob.')  # parentheses keep the separators in the result
['', ' ', 'runoob', ', ', 'runoob', ', ', 'runoob', '.', '']
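
The maxsplit parameter from the signature is not shown above. As a rough sketch on the same sample string, it limits how many splits are made; the empty trailing piece seen earlier can be filtered out afterwards if it is not wanted:

```python
import re

text = "runoob, runoob, runoob."

# maxsplit=1: split once, then leave the rest of the string untouched.
first_only = re.split(r"\W+", text, maxsplit=1)
print(first_only)  # ['runoob', 'runoob, runoob.']

# Drop empty pieces such as the one produced by the trailing ".".
non_empty = [piece for piece in re.split(r"\W+", text) if piece]
print(non_empty)  # ['runoob', 'runoob', 'runoob']
```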

This article references: Python3 regular expressions | Rookie Tutorial

Origin blog.csdn.net/zsq8187/article/details/109749945