Day 17: the re module (regular expressions)

import re  # import the re module

 

Searching

  findall: matches everything; each match is an element in the returned list

  search: matches only the first occurrence, scanning left to right. It does not return the result directly but a match object; call its group() method to get the matched text. If nothing matches it returns None, and calling group() on None raises an error

  match: matches from the start of the string, equivalent to search with a ^ added to the front of the regular expression
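The three lookup functions above can be compared side by side. This is a minimal sketch; the sample strings and variable names are only for illustration.

```python
import re

text = 'alex83taibai40egon25'

# findall returns every match as a list of strings
all_numbers = re.findall(r'\d+', text)
print(all_numbers)  # ['83', '40', '25']

# search returns a match object for the first match, or None
first = re.search(r'\d+', text)
if first is not None:
    print(first.group())  # 83

# match only succeeds if the pattern matches at the start of the string
at_start = re.match(r'\d+', text)
print(at_start)  # None, because the string starts with letters

# search returns None when nothing matches, so check before calling group()
missing = re.search(r'\d+', 'no digits here')
print(missing is None)  # True
```

Checking for None before calling group() avoids the AttributeError that otherwise occurs when nothing matches.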

 

Extended string processing: splitting and replacing

  split: splits the string

  sub: replaces; the form is re.sub(pattern, replacement, string, count)

  subn: returns a tuple whose second element is the number of replacements made
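A short sketch of the three functions on one sample string (the string and the replacement 'H' are chosen only for illustration):

```python
import re

text = 'alex83taibai40egon25'

# split cuts the string wherever the pattern matches
parts = re.split(r'\d+', text)
print(parts)  # ['alex', 'taibai', 'egon', '']

# sub replaces every match by default
replaced = re.sub(r'\d+', 'H', text)
print(replaced)  # alexHtaibaiHegonH

# count limits how many matches are replaced
replaced_once = re.sub(r'\d+', 'H', text, count=1)
print(replaced_once)  # alexHtaibai40egon25

# subn returns (new_string, number_of_replacements)
result, n = re.subn(r'\d+', 'H', text)
print(result, n)  # alexHtaibaiHegonH 3
```

Note the trailing empty string from split: the text ends with a match, so what follows the last cut is empty.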

 

Advanced re module: time / space

  compile: saves time when the same regular expression is used repeatedly

ret = re.compile(r'\d+')  # the pattern is now compiled
print(ret)
res = ret.findall('alex83taibai40egon25')
print(res)

  finditer: saves space/memory, because it returns an iterator of match objects instead of building a full list

ret = re.finditer(r'\d+', 'alex83taibai40egon25')
for i in ret:
    print(i.group())
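compile and finditer can also be combined, since a compiled pattern has its own finditer method. The sketch below uses illustrative variable names and shows that the matches are produced one at a time:

```python
import re

number = re.compile(r'\d+')            # compile once, reuse many times
matches = number.finditer('alex83taibai40egon25')

# The iterator yields match objects lazily, one per call to next()
first = next(matches)
print(first.group(), first.span())     # 83 (4, 6)

# The remaining matches are still waiting in the iterator
rest = [m.group() for m in matches]
print(rest)                            # ['40', '25']
```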

 

search().group(): parentheses in the regular expression define groups, and a numeric argument to group() returns the content of the corresponding group

import re
s = '<a>wahaha</a>'  # markup language, as in an HTML page
ret = re.search(r'<(\w+)>(\w+)</(\w+)>', s)
print(ret.group())   # the whole match
print(ret.group(1))  # a numeric argument returns the content of that group

findall() has a special rule: if the regular expression contains () groups, it returns the group contents in preference to the whole match

Cancel group priority: (?:regular expression)

    ret = re.findall(r'\d+(?:\.\d+)?', '1.234*4')

    print(ret)
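The difference is easiest to see by running the same pattern with and without ?: on the same input. A minimal sketch:

```python
import re

text = '1.234*4'

# With a capturing group, findall returns the group content, not the whole match.
# The second match ('4') has no decimal part, so its group is the empty string.
with_group = re.findall(r'\d+(\.\d+)?', text)
print(with_group)        # ['.234', '']

# With (?:...) the group no longer captures, so findall returns whole matches
without_priority = re.findall(r'\d+(?:\.\d+)?', text)
print(without_priority)  # ['1.234', '4']
```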

About groups:

  1. In regular expressions, we sometimes need a group so that a quantifier applies to a whole sequence of characters, for example (\.[\w]+)?

  2. In Python, groups help you pick out exactly the content you really need, for example <(\w+)>(\d+)<

split

ret = re.split(r'\d+', 'alex83taibai40egon25')
print(ret)
ret = re.split(r'(\d+)', 'alex83taibai40egon25aa')
print(ret)
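What the parentheses change in split is that the matched separators are kept in the result. A small sketch on one input string (variable names are illustrative):

```python
import re

text = 'alex83taibai40egon25aa'

# Without a group, the separators (the digits) are discarded
plain = re.split(r'\d+', text)
print(plain)  # ['alex', 'taibai', 'egon', 'aa']

# With a capturing group, the separators are kept in the list
kept = re.split(r'(\d+)', text)
print(kept)   # ['alex', '83', 'taibai', '40', 'egon', '25', 'aa']
```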

 

 

Python-specific conventions for regular expressions

  1. Named groups: (?P<group name>regular expression)

  2. To require that a later part of the string is identical to what an earlier group matched, refer back to that group by name: (?P=group name)

pattern = r'<(?P<tab>\w+)>(\w+)</(?P=tab)>'
ret = re.search(pattern, s)
print(ret)
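To see what the named backreference enforces, the sketch below runs the same pattern against a well-formed tag pair and a mismatched one. The second sample string is an assumption added for illustration.

```python
import re

pattern = r'<(?P<tab>\w+)>(\w+)</(?P=tab)>'

# Opening and closing tag names agree, so the pattern matches
ok = re.search(pattern, '<a>wahaha</a>')
print(ok.group())       # <a>wahaha</a>
print(ok.group('tab'))  # a
print(ok.group(2))      # wahaha

# The closing tag differs from the opening tag, so there is no match
mismatch = re.search(pattern, '<a>wahaha</b>')
print(mismatch)         # None
```

A named group can be read by name with group('tab') and is still available by number.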

 


Origin www.cnblogs.com/xiaobai686/p/11681958.html