re: regular expression operations in the Python standard library

The two kinds of objects in the re library

Pattern: regular expression object
Match: match object

Common operations

search

Finds the first position in the string where the pattern matches.

In [1]: import re

In [2]: s = "I am a simple string"

In [3]: r = re.search("am", s)

In [4]: r
Out[4]: <re.Match object; span=(2, 4), match='am'>

In [5]: r.span()
Out[5]: (2, 4)

In [7]: r.group()
Out[7]: 'am'

match

Matches the pattern only at the start of the string.

In [1]: import re

In [2]: s = "I am a simple string"

In [3]: r = re.match("I", s)

In [4]: r
Out[4]: <re.Match object; span=(0, 1), match='I'>

In [5]: r.span()
Out[5]: (0, 1)

In [6]: r.group()
Out[6]: 'I'

In [7]: r = re.match("am", s)

In [8]: print(r)
None
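Because match() returns None when the pattern is not found at position 0, the result should be checked before it is used. The sketch below reuses the string from the session above; the variable names are only illustrative.

```python
import re

s = "I am a simple string"

# match() only tries position 0, so "am" is not found there.
at_start = re.match("am", s)

# search() scans forward and finds "am" at index 2.
anywhere = re.search("am", s)

# Check for None before calling methods on the result.
start_text = at_start.group() if at_start else "no match at start"
found_text = anywhere.group() if anywhere else "no match"
```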

findall

Finds all matching substrings and returns them as a list.

In [1]: import re

In [2]: s = "Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 22:39:24)"

In [3]: re.findall(r"\d+", s)
Out[3]: ['3', '8', '1', '3', '8', '1', '1', '293', '6', '18', '2019', '22', '39', '24']
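The shape of the list that findall returns depends on the capturing groups in the pattern. A small sketch using the same version string (the patterns are chosen only for illustration):

```python
import re

s = "Python 3.8.1 (tags/v3.8.1:1b293b6, Dec 18 2019, 22:39:24)"

# No groups: each item is the whole matched substring.
no_group = re.findall(r"\d+:\d+:\d+", s)

# One group: each item is the text captured by that group.
one_group = re.findall(r"v(\d+)\.", s)

# Several groups: each item is a tuple with one string per group.
many_groups = re.findall(r"(\d+):(\d+):(\d+)", s)
```

Here no_group is ['22:39:24'], one_group is ['3'], and many_groups is [('22', '39', '24')].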

finditer

Returns an iterator that yields a match object for each match.

In [1]: import re

In [2]: s = "Python 3.8.1"

In [3]: r = re.finditer(r"\d+", s)

In [4]: r
Out[4]: <callable_iterator at 0x56022f8>

In [5]: for i in r:
    ...:     print(i)
    ...:     print(i.group())
    ...:
    ...:
<re.Match object; span=(7, 8), match='3'>
3
<re.Match object; span=(9, 10), match='8'>
8
<re.Match object; span=(11, 12), match='1'>
1

sub

Replaces the matched substrings.

In [1]: import re

In [2]: s = "I am a simple string"

In [3]: re.sub(r"s\w*", "python", s)
Out[3]: 'I am a python python'

compile

Compiles a regular expression into a pattern object that can be reused for multiple matches.

In [1]: import re

In [2]: pattern = r"\d+"

In [3]: p = re.compile(pattern)

In [4]: r = p.findall("abc123 k23 77")

In [5]: r
Out[5]: ['123', '23', '77']

In [6]: r = p.findall("1.2@12*g00")
    
In [7]: r
Out[7]: ['1', '2', '12', '00']

Cheat sheet

Function parameters

Parameter  Description
pattern    The regular expression to match
string     The string to search
flags      Flags that control the matching mode
maxsplit   Maximum number of splits; the default, 0, means no limit
pos        Optional index in the string where matching starts; defaults to 0
endpos     Optional index in the string where matching stops; defaults to the length of the string
repl       The replacement; either a string or a function
count      Maximum number of matches to replace; the default, 0, replaces all occurrences
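A short sketch of how maxsplit, count, repl and flags change the result. The sample string and variable names are made up for illustration.

```python
import re

text = "a1b22c333d"

# maxsplit limits how many splits re.split performs (0 means no limit).
parts_all = re.split(r"\d+", text)
parts_one = re.split(r"\d+", text, maxsplit=1)

# count limits how many matches re.sub replaces (0 means replace all).
first_only = re.sub(r"\d+", "#", text, count=1)

# repl may be a function that receives the match object and returns a string.
doubled = re.sub(r"\d+", lambda m: str(int(m.group()) * 2), text)

# flags changes the matching mode, here to ignore case.
any_case = re.findall(r"[a-c]", "AbCd", flags=re.I)
```

With these inputs, parts_all is ['a', 'b', 'c', 'd'], parts_one is ['a', 'b22c333d'], first_only is 'a#b22c333d', doubled is 'a2b44c666d', and any_case is ['A', 'b', 'C'].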

Flag constants

Flag                 Description
re.A, re.ASCII       Make \w, \W, \b, \B, \d, \D, \s and \S match ASCII characters only (?a)
re.DEBUG             Display debug information about the compiled expression; has no inline flag
re.I, re.IGNORECASE  Case-insensitive matching (?i)
re.L, re.LOCALE      Make \w, \W, \b, \B and case-insensitive matching depend on the current locale (?L)
re.M, re.MULTILINE   Multiline matching (?m)
re.S, re.DOTALL      Make the '.' special character match any character, including a newline (?s)
re.X, re.VERBOSE     Allow more readable patterns (?x): whitespace is ignored except in a character class, when preceded by an unescaped backslash, or within tokens such as *?, (?: or (?P<...>
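The most commonly used flags can be compared side by side. This is a minimal sketch on a made-up two-line string; flags are combined with the bitwise OR operator.

```python
import re

text = "first line\nSecond line"

# re.I: case-insensitive matching.
ci = re.findall(r"second", text, flags=re.I)

# re.M: '^' also matches at the start of each line.
starts = re.findall(r"^\w+", text, flags=re.M)

# re.S: '.' also matches the newline character.
no_dotall = re.search(r"line.Second", text)
dotall = re.search(r"line.Second", text, flags=re.S)

# re.X: whitespace and comments inside the pattern are ignored.
version = re.compile(r"""
    (\d+)   # major version
    \.      # literal dot
    (\d+)   # minor version
""", re.X)
major_minor = version.search("Python 3.8").groups()

# Several flags at once.
combined = re.findall(r"^second", text, flags=re.I | re.M)
```

Here ci and combined are both ['Second'], starts is ['first', 'Second'], no_dotall is None while dotall is a match object, and major_minor is ('3', '8').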

Functions

Function                                          Description
re.compile(pattern, flags=0)                      Compiles a regular expression pattern into a regular expression object
re.search(pattern, string, flags=0)               Scans the string for the first position where the pattern matches and returns the corresponding match object; returns None if there is no match
re.match(pattern, string, flags=0)                Matches the pattern at the start of the string and returns the corresponding match object; returns None if there is no match
re.fullmatch(pattern, string, flags=0)            Returns the corresponding match object if the entire string matches the pattern; otherwise returns None
re.split(pattern, string, maxsplit=0, flags=0)    Splits the string at each match of the pattern
re.findall(pattern, string, flags=0)              Finds all substrings that match the pattern and returns them as a list; returns an empty list if nothing matches
re.finditer(pattern, string, flags=0)             Finds all substrings that match the pattern and returns an iterator of match objects
re.sub(pattern, repl, string, count=0, flags=0)   Replaces the matches of the pattern in the string with repl and returns the new string
re.subn(pattern, repl, string, count=0, flags=0)  Same as sub(), but returns a tuple (new string, number of replacements)
re.escape(pattern)                                Escapes the special characters in pattern
re.purge()                                        Clears the regular expression cache
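The functions in this table that have no worked example earlier in the post can be sketched as follows. The input strings are illustrative only.

```python
import re

# fullmatch succeeds only if the whole string matches the pattern.
whole = re.fullmatch(r"\d+", "2019")
partial = re.fullmatch(r"\d+", "2019-12")

# split cuts the string wherever the pattern matches.
fields = re.split(r"[,;]\s*", "a, b;c")

# subn returns (new_string, number_of_replacements).
result = re.subn(r"s\w*", "python", "I am a simple string")

# escape makes a literal string safe to embed in a pattern:
# the '.' is escaped, so it no longer matches any character.
literal = re.escape("3.8")
exact = re.findall(literal, "3.8 3x8")
```

Here whole is a match object and partial is None, fields is ['a', 'b', 'c'], result is ('I am a python python', 2), and exact is ['3.8'].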

Regular expression object (Pattern)

Methods

Method                              Description
search(string[, pos[, endpos]])     Scans the string for the first position where the pattern matches and returns the corresponding match object; returns None if there is no match
match(string[, pos[, endpos]])      Matches the pattern at the start of the string and returns the corresponding match object; returns None if there is no match
fullmatch(string[, pos[, endpos]])  Returns the corresponding match object if the entire string matches the pattern; otherwise returns None
split(string, maxsplit=0)           Equivalent to the split() function
findall(string[, pos[, endpos]])    Finds all substrings that match the pattern and returns them as a list; returns an empty list if nothing matches
finditer(string[, pos[, endpos]])   Finds all substrings that match the pattern and returns an iterator of match objects
sub(repl, string, count=0)          Equivalent to the sub() function
subn(repl, string, count=0)         Equivalent to the subn() function
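The pos and endpos arguments are only available on the methods of a compiled pattern. A minimal sketch of their effect, reusing a string from the compile example (the variable names are illustrative):

```python
import re

p = re.compile(r"\d+")
s = "abc123 k23 77"

# pos: the search starts at index 6, so "123" is skipped.
m = p.search(s, 6)

# endpos: the string is treated as if it ended at index 5 ("abc12").
limited = p.findall(s, 0, 5)

# pos is not the same as slicing: '^' stays anchored to the
# real start of the string, not to the pos index.
anchored = re.compile(r"^k")
with_pos = anchored.search(s, 7)
with_slice = anchored.search(s[7:])
```

Here m.group() is '23' with span (8, 10), limited is ['12'], with_pos is None, and with_slice is a match object.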

Attributes

Attribute   Description
flags       The matching flags of the pattern
groups      The number of capturing groups in the pattern
groupindex  A dictionary that maps the group names defined by (?P<id>) to their group numbers
pattern     The pattern string from which the object was compiled
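These attributes can be inspected on any compiled pattern. The sketch below uses a made-up version pattern with two named groups and one unnamed group.

```python
import re

p = re.compile(r"(?P<major>\d+)\.(?P<minor>\d+)(\.\d+)?", re.I)

source = p.pattern            # the original pattern string
group_count = p.groups        # number of capturing groups
names = dict(p.groupindex)    # group name -> group number
ignores_case = bool(p.flags & re.I)
```

Here group_count is 3, names is {'major': 1, 'minor': 2}, and ignores_case is True.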

Match object (Match)

Methods

Method                        Description
expand(template)              Returns the string obtained by doing backslash substitution on template, as the sub() method does
group([group1, ...])          Returns one or more subgroups of the match
__getitem__(g)                Equivalent to m.group(g)
groups(default=None)          Returns a tuple containing all subgroups of the match, from 1 up to however many groups are in the pattern
groupdict(default=None)       Returns a dictionary containing all the named subgroups
start([group]), end([group])  Return the start and end indices of the substring matched by group
span([group])                 Returns the tuple (m.start(group), m.end(group))
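A compact sketch of the match object methods on one example. The pattern and group names are chosen for illustration, and indexing with m[...] assumes Python 3.6 or later.

```python
import re

m = re.search(r"(?P<major>\d+)\.(?P<minor>\d+)", "Python 3.8.1")

whole = m.group()                  # the entire match
pair = m.group(1, 2)               # several groups at once, as a tuple
minor = m["minor"]                 # same as m.group("minor")
all_groups = m.groups()
named = m.groupdict()
minor_pos = (m.start("minor"), m.end("minor"))
whole_span = m.span()
swapped = m.expand(r"\g<minor>-\1")  # template with group references

# default fills in groups that did not take part in the match.
optional = re.match(r"(\d+)(\.\d+)?", "24").groups(default="")
```

Here whole is '3.8', pair and all_groups are ('3', '8'), minor is '8', named is {'major': '3', 'minor': '8'}, minor_pos is (9, 10), whole_span is (7, 10), swapped is '8-3', and optional is ('24', '').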

Attributes

Attribute  Description
pos        The value of pos that was passed to search() or match()
endpos     The value of endpos that was passed to search() or match()
lastindex  The integer index of the last matched capturing group, or None if no group was matched
lastgroup  The name of the last matched capturing group, or None if the group has no name or no group was matched
re         The regular expression object whose match() or search() method produced this match instance
string     The string passed to match() or search()
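The attributes above can be read back from any match object. A minimal sketch, assuming a made-up pattern with one named and one unnamed group:

```python
import re

p = re.compile(r"(?P<word>[a-z]+)|(\d+)")
s = "abc123 k23 77"

m1 = p.search(s)          # matches "abc" through the named group
m2 = p.search(s, 3, 9)    # matches "123" through the unnamed group
m3 = re.search(r"\d+", s) # pattern without any capturing group

info1 = (m1.pos, m1.endpos, m1.lastindex, m1.lastgroup)
info2 = (m2.pos, m2.endpos, m2.lastindex, m2.lastgroup)
```

Here info1 is (0, 13, 1, 'word') and info2 is (3, 9, 2, None). m3.lastindex is None because the pattern has no groups, m1.re is the compiled pattern p, and m1.string is the original string s.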

Origin blog.csdn.net/Jairoguo/article/details/104627423