Common methods of the re module
- A regular expression (often abbreviated as regex, regexp, or RE in code) is a concept from computer science. Regular expressions are usually used to search for and replace text that matches a certain pattern (rule).
- Given a regular expression and a string, we can achieve the following goals:
check whether the given string satisfies the filtering logic of the regular expression (called "matching");
extract from the string the specific part we want.
- The characteristics of regular expressions are:
they are very flexible, logical, and functional;
they give complex control over strings in a quick and very compact way;
they are fairly hard to understand for people who are new to them.
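re module operations
In Python, regular expression operations are performed through the re module.
match method
match(string[, pos[, endpos]]): string is the string to be matched; pos and endpos are optional parameters that specify the start and end positions within the string, with default values 0 and len(string) (the string length). This form with pos and endpos is the method of a compiled pattern object; the module-level re.match takes pattern, string, and flags instead.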
# match method: searches from the start position and matches once
# Signature: re.match(pattern, string, flags=0)
import re
result = re.match("hello", "hellolzt world")
print(result, result.group(), type(result))
re.match tries to match pattern at the beginning of the string. If the match succeeds (the match may be an empty string), it returns the corresponding Match object; otherwise it returns None.
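As a minimal sketch of the behavior described above, the following shows a successful match, a failed match that returns None, and the optional pos argument. The variable names are illustrative only, and the pos example assumes a pattern compiled with re.compile, since the module-level function does not accept pos or endpos.

```python
import re

# re.match only succeeds if the pattern matches at the start of the string.
hit = re.match("hello", "hellolzt world")
miss = re.match("world", "hellolzt world")

# Check for None before calling .group(); calling it on None raises AttributeError.
hit_text = hit.group() if hit else None
miss_text = miss.group() if miss else None
print(hit_text, miss_text)  # hello None

# pos/endpos are only available on a compiled pattern's match method.
pat = re.compile("world")
shifted = pat.match("hellolzt world", 9)  # start matching at index 9
print(shifted.span())  # (9, 14)
```

Because a failed match returns None, testing the result before using it is the usual way to avoid an AttributeError.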
search method
- Searches anywhere in the string and matches only once: as soon as one match is found, it is returned.
search(string[, pos[, endpos]]): string is the string to be matched; pos and endpos are optional parameters that specify the start and end positions within the string. search scans the whole string, finds the first location where pattern matches (the match may be an empty string), and returns the corresponding Match object. If there is no match, it returns None.
# Signature: re.search(pattern, string, flags=0)
import re
result = re.search("hello", "2018hellolzt world")
print(result.group())
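To make the difference from match concrete, here is a small sketch that runs both functions on the same string. The variable names are chosen for illustration only.

```python
import re

text = "2018hellolzt world"

# match fails because "hello" does not start at index 0.
at_start = re.match("hello", text)
print(at_start)  # None

# search scans forward and returns the first occurrence.
found = re.search("hello", text)
print(found.group(), found.span())  # hello (4, 9)
```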
fullmatch method
fullmatch(pattern, string, flags=0) is the exact-match version of the match function: the pattern must match from the beginning to the end of the string.
# Signature: re.fullmatch(pattern, string, flags=0)
import re
result = re.fullmatch("hello", "hello1")
print(result)  # None: the trailing "1" is not covered by the pattern
If the whole string matches pattern, the corresponding Match object is returned; otherwise None is returned.
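The contrast between match and fullmatch can be sketched as follows. The third pattern, with \d added, is an illustrative assumption that is not part of the original example.

```python
import re

# match accepts a matching prefix; fullmatch requires the whole string to match.
partial = re.match("hello", "hello1")
whole = re.fullmatch("hello", "hello1")
print(partial.group())  # hello
print(whole)            # None

# Extending the pattern to cover the trailing digit makes fullmatch succeed.
exact = re.fullmatch(r"hello\d", "hello1")
print(exact.group())    # hello1
```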
findall method
- Returns all substrings that match, in the form of a list. If there is no match, an empty list is returned.
findall(string[, pos[, endpos]]): string is the string to be matched; pos and endpos are optional parameters that specify the start and end positions within the string.
# Signature: re.findall(pattern, string, flags=0)
import re
result = re.findall("hello", "lzt hello china hello world")
print(result, type(result))
# returns a list
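A short sketch of the two cases mentioned above (a list of matches, or an empty list), plus the optional pos argument. The pos example assumes a compiled pattern, since pos and endpos belong to the pattern object's findall method, not to re.findall.

```python
import re

text = "lzt hello china hello world"

all_hits = re.findall("hello", text)
no_hits = re.findall("python", text)
print(all_hits)  # ['hello', 'hello']
print(no_hits)   # []

# With a compiled pattern, pos skips the part of the string before that index.
pat = re.compile("hello")
later_hits = pat.findall(text, 5)  # the first "hello" starts at index 4
print(later_hits)  # ['hello']
```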
split method
- Splits the string at every substring that matches and returns the pieces as a list.
split(string[, maxsplit]): maxsplit specifies the maximum number of splits; if it is not specified, the string is split at every match.
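# Signature: re.split(pattern, string, maxsplit=0, flags=0)
import re
# Pass maxsplit by keyword; passing it positionally is deprecated since Python 3.13.
result = re.split("hello", "hello china hello world", maxsplit=2)
print(result, type(result))
# returns the list of split pieces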
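The effect of maxsplit is easier to see with a separator pattern. The sketch below uses whitespace (\s+) as the separator, which is an illustrative choice and not taken from the original example.

```python
import re

text = "hello china hello world"

# Splitting on the word itself leaves an empty string before the first match.
by_word = re.split("hello", text)
print(by_word)  # ['', ' china ', ' world']

# Splitting on whitespace, at most twice: the remainder stays in one piece.
limited = re.split(r"\s+", text, maxsplit=2)
print(limited)  # ['hello', 'china', 'hello world']
```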
sub method
- Used for replacement.
sub(repl, string[, count]): repl may be a string or a function:
(1) if repl is a string, it is used to replace each matching substring;
(2) if repl is a function, it takes a single argument (a Match object) and returns the string to use as the replacement;
(3) count specifies the maximum number of replacements; if it is not specified, all matches are replaced.
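# Signature: re.sub(pattern, repl, string, count=0, flags=0)
import re
# Pass count by keyword; passing it positionally is deprecated since Python 3.13.
result = re.sub("hello", "hi", "hello china hello world", count=2)
print(result, type(result))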
re.sub replaces the substrings matched by pattern with repl, at most count times.
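Case (2) above, where repl is a function, can be sketched like this. The helper name shout is hypothetical and only serves the illustration.

```python
import re

def shout(match):
    # Receives a Match object and returns the replacement string.
    return match.group().upper()

text = "hello china hello world"

replaced_all = re.sub("hello", shout, text)
replaced_once = re.sub("hello", shout, text, count=1)
print(replaced_all)   # HELLO china HELLO world
print(replaced_once)  # HELLO china hello world
```

A function is useful when the replacement depends on what was matched; for a fixed replacement, a plain string is simpler.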
finditer method
# Signature: re.finditer(pattern, string, flags=0)
import re
result = re.finditer("hello", "hello world hello china")
print(result, type(result))
# returns an iterator
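Printing the iterator itself only shows an object representation. A minimal sketch of consuming it, assuming we want the text and position of each match:

```python
import re

# Each item produced by finditer is a Match object.
spans = [(m.group(), m.span())
         for m in re.finditer("hello", "hello world hello china")]
print(spans)  # [('hello', (0, 5)), ('hello', (12, 17))]
```

Compared with findall, finditer gives access to the positions of the matches, not just the matched text.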
compile method
The compile function compiles a regular expression and produces a Pattern object.
# Signature: re.compile(pattern, flags=0)
import re
pat = re.compile("hello")
print(pat, type(pat))
result = pat.search("helloworld")
print(result, type(result))
# compiling yields a pattern object
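One common reason to compile is to reuse the same pattern across many strings. The sketch below assumes a small list of sample strings invented for illustration.

```python
import re

# Compile once, with the flag baked into the pattern object.
pat = re.compile("hello", flags=re.I)

texts = ["Hello world", "say HELLO", "goodbye"]
hits = [bool(pat.search(t)) for t in texts]
print(hits)         # [True, True, False]
print(pat.pattern)  # hello
```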
flags
- Some functions of the re module accept flags as an optional parameter. The commonly used flags are listed below. They correspond to binary numbers, so they can be used individually or combined with bitwise OR (|). Flags can change the behavior of a regular expression:
re.I
re.IGNORECASE: case-insensitive matching
re.M
re.MULTILINE: "^" matches at the beginning of the string and after each "\n"; "$" matches before each "\n" and at the end of the string. Usually called multi-line mode.
re.S
re.DOTALL: "." matches any character, including newline characters. Usually called single-line mode.
If you want to use single-line mode and multi-line mode at the same time, set the function's optional flags parameter to re.M | re.S.
import re
result = re.match("hello", "HeLlo", flags=re.I)
print(result)
result = re.findall("^abc", "abcde\nabcd", flags=re.M)
print(result)
result = re.findall("e$", "abcde\nabcd", flags=re.M)
print(result)
result = re.findall(".", "hello \n china", flags=re.S)
# with re.S, "." can match the newline character
print(result)
result = re.findall(".", "hello \n china", flags=re.M)
# without re.S, "." cannot match the newline character
print(result)
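Combining flags with bitwise OR can be sketched as follows. The sample text is invented for illustration.

```python
import re

text = "abc\nABC def\nxyz"

# re.M lets "^" match after each newline; re.I ignores case.
multi_ignore = re.findall("^abc", text, flags=re.M | re.I)
print(multi_ignore)  # ['abc', 'ABC']

# re.M | re.S: "^" matches at each line start and "." may match "\n".
multi_dotall = re.findall("^abc.", text, flags=re.M | re.S)
print(multi_dotall)  # ['abc\n']
```

In the second call, only the first line matches because the comparison is case-sensitive, and the "." consumes the newline that follows "abc".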