re.compile
The compile function compiles a regular expression and produces a regular expression (Pattern) object, whose match() and search() methods can then be used.
The syntax is:
re.compile(pattern[, flags])
Parameters:
- pattern: a regular expression in string form
- flags: optional; selects the matching mode, such as ignoring case or multi-line mode. The specific flags are:
- re.I: ignore case
- re.L: make the special sequences \w, \W, \b, \B, \s, \S depend on the current locale
- re.M: multi-line mode
- re.S: make . match any character, including newline (without this flag, . does not match a newline)
- re.U: make the special sequences \w, \W, \b, \B, \d, \D, \s, \S depend on the Unicode character properties database
- re.X: verbose mode; to improve readability, whitespace in the pattern and comments after # are ignored
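As a rough illustration of the flags listed above, the following sketch compiles a few patterns with re.M, re.S and re.X. The sample text and variable names are made up for this example and are not part of the original tutorial.

```python
import re

text = 'first line\nsecond line'  # hypothetical sample text

# re.I: case-insensitive matching
ignore_case = re.compile(r'python', re.I).search('I like PYTHON')

# re.M: ^ and $ also match at the start and end of each line
line_starts = re.compile(r'^\w+', re.M).findall(text)

# re.S: '.' also matches the newline character
across_lines = re.compile(r'first.*second', re.S).search(text)
no_dotall = re.compile(r'first.*second').search(text)

# re.X: whitespace and # comments inside the pattern are ignored
number = re.compile(r'''
    \d+      # integer part
    \.       # decimal point
    \d+      # fractional part
''', re.X)

print(ignore_case.group())                            # PYTHON
print(line_starts)                                    # ['first', 'second']
print(across_lines is not None, no_dotall is None)    # True True
print(number.search('pi is about 3.14').group())      # 3.14
```

Several flags can be combined with the | operator, for example re.I | re.M.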
>>> import re
>>> pattern = re.compile(r'\d+')  # matches at least one digit
>>> m = pattern.match('one12twothree34four', 3, 10)  # start matching at the position of '1'
>>> m.group(0)  # the 0 can be omitted
'12'
>>> m.start(0)  # the 0 can be omitted
3
>>> m.end(0)  # the 0 can be omitted
5
>>> m.span(0)  # the 0 can be omitted
(3, 5)
In the example above:
- group([group1, …]) returns the string matched by one or more groups; to get the entire matched substring, call group() or group(0) directly;
- start([group]) returns the position in the whole string where the group's matched substring starts (the index of the substring's first character); the parameter defaults to 0;
- end([group]) returns the position in the whole string where the group's matched substring ends (the index of the substring's last character + 1); the parameter defaults to 0;
- span([group]) returns (start(group), end(group)).
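The methods above become more useful once the pattern contains capture groups. The following is a minimal sketch, assuming a made-up pattern with two groups (letters, then digits) and a made-up input string.

```python
import re

# Hypothetical pattern: group 1 is letters, group 2 is digits
pattern = re.compile(r'([a-z]+)(\d+)')
m = pattern.search('id: abc123;')

whole = m.group()        # entire match, same as m.group(0)
letters = m.group(1)     # text of group 1
both = m.group(1, 2)     # several group numbers give a tuple
digits_start, digits_end = m.start(2), m.end(2)
letters_span = m.span(1)

print(whole)                       # abc123
print(letters)                     # abc
print(both)                        # ('abc', '123')
print(digits_start, digits_end)    # 7 10
print(letters_span)                # (4, 7)
```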
findall
Finds all substrings of the string that match the regular expression and returns them as a list; if no match is found, an empty list is returned.
Note: match and search find a single match, whereas findall finds all matches.
The syntax is:
findall(string[, pos[, endpos]])
Parameters:
- string: the string to be matched.
- pos: optional; the position in the string at which to start, 0 by default.
- endpos: optional; the position in the string at which to stop, the length of the string by default.
# coding=utf-8
import re
pattern = re.compile(r'\d+')  # find digits
result1 = pattern.findall('wintrysec 123 google 456')
result2 = pattern.findall('wintrysec123google456', 0, 10)  # only search indexes 0 to 9
print(result1)  # ['123', '456']
print(result2)  # ['1']
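To make the note about single versus multiple matches concrete, here is a small sketch that runs search and findall with the same compiled pattern. The variable names are chosen for this example only.

```python
import re

pattern = re.compile(r'\d+')
text = 'wintrysec 123 google 456'

first = pattern.search(text)           # a single match object (or None)
all_matches = pattern.findall(text)    # a list with every matching substring
none_found = pattern.findall('no digits here')

print(first.group())   # 123
print(all_matches)     # ['123', '456']
print(none_found)      # []
```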
re.finditer
Like findall, finditer finds all substrings of the string that match the regular expression, but returns them as an iterator of match objects.
re.finditer(pattern, string, flags=0)
Example:
# coding=utf-8
import re
it = re.finditer(r"\d+", "12a32bc43jf3")
for match in it:
    print(match.group())
Output:
12
32
43
3
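Because finditer yields match objects rather than plain strings, the position of each match is also available. A minimal sketch, reusing the input string from the example above (the variable name positions is an assumption of this example):

```python
import re

text = '12a32bc43jf3'

# Collect the matched text together with its (start, end) span
positions = [(m.group(), m.span()) for m in re.finditer(r'\d+', text)]

print(positions)
# [('12', (0, 2)), ('32', (3, 5)), ('43', (7, 9)), ('3', (11, 12))]
```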
re.match
re.match tries to match a pattern at the start of the string; if the match does not succeed at the start position, match() returns None.
re.match(pattern, string, flags=0)
Function parameters:
Parameter | Description |
---|---|
pattern | The regular expression to match |
string | The string to match against |
flags | Flags that control how the regular expression is matched, such as case-insensitive or multi-line matching |
re.match returns a match object if the match succeeds, otherwise None.
We can use the group(num) or groups() methods of the match object to retrieve the matched text.
Match object method | Description |
---|---|
group(num=0) | Returns the string matched by the whole expression. group() can also take several group numbers at once, in which case it returns a tuple containing the values of the corresponding groups. |
groups() | Returns a tuple containing the strings of all groups, from group 1 up to the number of groups in the pattern. |
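The following sketch shows group(num) and groups() on a match object returned by re.match. The sample sentence and pattern are invented for illustration; the None check reflects the fact that re.match returns None when the start of the string does not match.

```python
import re

line = 'Cats are smarter than dogs'   # hypothetical sample
m = re.match(r'(\w+) are (\w+)', line)

if m:
    print(m.group())      # Cats are smarter
    print(m.group(1))     # Cats
    print(m.group(1, 2))  # ('Cats', 'smarter')
    print(m.groups())     # ('Cats', 'smarter')
else:
    print('no match')
```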
re.search
re.search scans the entire string and returns the first successful match.
Function syntax:
re.search(pattern, string, flags=0)
The difference between re.match and re.search
re.match only matches at the beginning of the string; if the beginning of the string does not fit the regular expression, the match fails and the function returns None.
re.search scans the whole string until it finds a match.
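The difference can be seen by running both functions on the same input. This is a minimal sketch with a made-up sample string:

```python
import re

text = 'www.example.com'   # hypothetical sample

at_start = re.match(r'www', text)      # pattern occurs at index 0
not_at_start = re.match(r'com', text)  # pattern occurs only later
anywhere = re.search(r'com', text)     # search scans the whole string

print(at_start.span())    # (0, 3)
print(not_at_start)       # None
print(anywhere.span())    # (12, 15)
```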