Regular expression knowledge points

1. Metacharacters

    \d     matches any digit
    \w     matches any digit, letter, or underscore
    \s     matches any whitespace character
    \t     matches a tab
    \n     matches a newline
    \D     matches any non-digit
    \W     matches any character that is not a digit, letter, or underscore
    \S     matches any non-whitespace character
    .      matches any character except \n
    []     matches any one character in the character set
    [^]    matches any one character not in the character set
    ^      matches the start of the string
    $      matches the end of the string
    ()     matches the expression inside the parentheses and also defines a group
    a|b    matches either a or b
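
To make the table above concrete, here is a minimal sketch that runs several of the metacharacters against a made-up sample string. The string and variable names are illustrative assumptions, not part of the original notes.

```python
import re

# Hypothetical sample text: contains letters, digits, an underscore,
# a dollar sign, spaces, a tab and a trailing newline.
text = "id_42 costs $7\tnow\n"

digits = re.findall(r"\d", text)           # every single digit
words = re.findall(r"\w+", text)           # runs of letters, digits, underscore
blanks = re.findall(r"\s", text)           # spaces, the tab, the newline
others = re.findall(r"[^\w\s]", text)      # neither word character nor whitespace
starts_with_id = re.search(r"^id", text) is not None
a_or_b = re.findall(r"a|b", "abc")         # alternation
no_newline = re.findall(r".", "a\nb")      # . skips the newline

print(digits)          # ['4', '2', '7']
print(words)           # ['id_42', 'costs', '7', 'now']
print(blanks)          # [' ', ' ', '\t', '\n']
print(others)          # ['$']
print(starts_with_id)  # True
print(a_or_b)          # ['a', 'b']
print(no_newline)      # ['a', 'b']
```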

2. Quantifiers

    ?      repeated zero or one time
    +      repeated one or more times
    *      repeated zero or more times
    {n}    repeated exactly n times
    {n,}   repeated n or more times
    {n,m}  repeated at least n and at most m times
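
The quantifiers can be compared side by side with a small sketch. The helper `full_matches` and the sample strings are assumptions made for illustration; `re.fullmatch` is used so that a pattern only counts when it covers the whole string.

```python
import re

# Hypothetical samples: zero to four repetitions of "a".
samples = ["", "a", "aa", "aaa", "aaaa"]

def full_matches(pattern):
    """Return the samples that the pattern matches in their entirety."""
    return [s for s in samples if re.fullmatch(pattern, s)]

print(full_matches(r"a?"))      # ['', 'a']
print(full_matches(r"a+"))      # ['a', 'aa', 'aaa', 'aaaa']
print(full_matches(r"a*"))      # ['', 'a', 'aa', 'aaa', 'aaaa']
print(full_matches(r"a{2}"))    # ['aa']
print(full_matches(r"a{2,}"))   # ['aa', 'aaa', 'aaaa']
print(full_matches(r"a{2,3}"))  # ['aa', 'aaa']
```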

3. Greedy and non-greedy matching

# Greedy match
  - within the range the quantifier allows, match as many characters as possible
  - .*x matches any character any number of times and stops only at the last x
# Non-greedy match
  - .*?x matches any character any number of times, but stops at the first x it meets
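
The difference is easiest to see on one string that contains several x characters. This is a minimal sketch; the sample string is an assumption chosen for illustration.

```python
import re

text = "axbxcx"  # hypothetical sample with three x characters

greedy = re.search(r".*x", text).group()   # runs to the last x
lazy = re.search(r".*?x", text).group()    # stops at the first x

print(greedy)  # axbxcx
print(lazy)    # ax
```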

4. Escaping

　　- A character that has a special meaning must be escaped with \ when it should stand for itself
　　- Some characters with a special meaning lose it inside a character set
　　- [()*+.?] every character in this set loses its special meaning
　　- [a-c] the - indicates a range inside a character set; if it should not mean a range, escape it, or put it at the very front or the very end of the set
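
A short sketch of these escaping rules follows. The sample strings are assumptions made for illustration.

```python
import re

text = "1+1=2 (a-c) *?"  # hypothetical sample

# Outside a character set, + is a quantifier and must be escaped.
plus = re.findall(r"\+", text)

# Inside a character set, ( ) * + . ? are ordinary characters.
specials = re.findall(r"[()*+.?]", text)

# - means a range between two characters...
as_range = re.findall(r"[a-c]", "a-c b")
# ...unless it is escaped, or placed first or last in the set.
escaped = re.findall(r"[a\-c]", "a-c b")
at_front = re.findall(r"[-ac]", "a-c b")
at_end = re.findall(r"[ac-]", "a-c b")

print(plus)      # ['+']
print(specials)  # ['+', '(', ')', '*', '?']
print(as_range)  # ['a', 'c', 'b']
print(escaped)   # ['a', '-', 'c']
print(at_front)  # ['a', '-', 'c']
print(at_end)    # ['a', '-', 'c']
```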

5. Common methods of the re module

　　- findall(pattern, string, flags): returns all matches

import re
ret = re.findall(r"\d+", "19740ash93010uru")
# returns every match that satisfies the pattern, collected in a list
print(ret)
# ['19740', '93010']

　　- search: returns a match object; call group() on it to get the matched text

import re
ret = re.search(r'\d+', '19740ash93010uru')
if ret:
    print(ret.group())
# search scans the string for the pattern, stops at the first match, and returns a match object.
# Call group() on that object to get the matched string; if nothing matches, search returns None.
# 19740

　　- finditer: returns an iterator; each item it yields is a match object, and group() gives the matched text (this saves memory)

(Code example to follow.)
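
A minimal sketch of finditer, reusing the sample string from the findall example. The variable names are assumptions; the point is that matches are produced one at a time instead of being collected into a list up front.

```python
import re

# finditer returns an iterator of match objects.
matches = re.finditer(r"\d+", "19740ash93010uru")

values = []
spans = []
for m in matches:
    values.append(m.group())  # the matched text
    spans.append(m.span())    # (start, end) position in the string

print(values)  # ['19740', '93010']
print(spans)   # [(0, 5), (8, 13)]
```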

　　- match: matches only from the beginning of the string and returns the first match; otherwise the same as search

import re
ret = re.match('A', 'ABC').group()   # same as search, but matches only at the beginning of the string
print(ret)
# A

      
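# When validating user input, for example an 11-digit phone number, anchor the pattern with ^ and $
# so that the whole input must match. match already anchors at the start, so it only needs $.
# "phone-number pattern" below is a placeholder, not a real regular expression.
re.match("phone-number pattern$", "123eva456taibai")
re.search("^phone-number pattern$", "123eva456taibai")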
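
The anchoring idea can be sketched with a stand-in pattern. The pattern `\d{11}` (any 11 digits) and the helper names are assumptions made for illustration; the original notes do not give the actual phone-number regular expression, and a real validator would likely be stricter.

```python
import re

PHONE = r"\d{11}"  # hypothetical stand-in: exactly 11 digits

def is_phone_match(s):
    # match anchors at the start on its own, so only $ is added.
    return re.match(PHONE + "$", s) is not None

def is_phone_search(s):
    # search can start anywhere, so both ^ and $ are needed.
    return re.search("^" + PHONE + "$", s) is not None

print(is_phone_match("13812345678"))       # True
print(is_phone_match("123eva456taibai"))   # False
print(is_phone_match("138123456789"))      # False, 12 digits
print(is_phone_search("13812345678"))      # True
print(is_phone_search("x13812345678"))     # False

# Without anchors, a longer string that merely contains 11 digits passes.
print(re.search(PHONE, "x138123456789y") is not None)  # True
```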

　　- compile(pattern): when the same regular expression is used many times, compile it in advance to save time

(Code example to follow.)
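
A minimal sketch of compile, reusing sample strings from the other examples in this section. The variable names are assumptions. The compiled pattern object offers the same methods as the module-level functions, without the pattern argument.

```python
import re

# Compile once, then reuse the pattern object.
digits = re.compile(r"\d+")

found = digits.findall("19740ash93010uru")
first = digits.search("eva3egon4yuan").group()
replaced = digits.sub("H", "aas123dfghj147")
parts = digits.split("eva3egon4yuan")

print(found)     # ['19740', '93010']
print(first)     # 3
print(replaced)  # aasHdfghjH
print(parts)     # ['eva', 'egon', 'yuan']
```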

　　- split: splits the string wherever the regular expression matches

import re
ret = re.split(r"\d+", "eva3egon4yuan")
print(ret)
# ['eva', 'egon', 'yuan']

ret = re.split(r"(\d+)", "eva3egon4yuan")
print(ret)
# ['eva', '3', 'egon', '4', 'yuan']

　　Wrapping the pattern in () changes the result of split. Without (), the matched separators are discarded; with (), they are kept in the result list. This matters whenever the matched parts themselves need to be retained.

　　- sub: replaces the content matched by the regular expression

import re
ret = re.sub(r"\d+", "H", "aas123dfghj147")
print(ret)
# aasHdfghjH

ret = re.sub(r"\d+", "H", "aas123dfghj147", count=1)    # replace only the first match
print(ret)
# aasHdfghj147

　　- subn: works like sub but returns a tuple; the first item is the string after replacement, the second is the number of replacements made

import re
ret = re.subn(r"\d+", "H", 'alex123wusir456')
print(ret)
# ('alexHwusirH', 2)