Regular expressions (Python)

A regular expression (in code, often abbreviated as regex, regexp, or RE) is a concept from computer science. Regular expressions are typically used to find and replace text that matches a given pattern (rule). A pattern is built from a combination of predefined special characters, and this "rule string" expresses the filtering logic to apply to a string.

First, a summary of my notes:

Predefined:
\A: matches only at the start of the string
\Z: matches only at the very end of the string; unlike $, it does not match before a trailing newline
\b: matches a word boundary, i.e. the position between a word character and a non-word character such as a space. For example, r'\bpy' matches the 'py' in 'python', but not the 'py' in 'openpyxl'.
\B: matches a non-word boundary. r'\Bpy' matches the 'py' in 'openpyxl', but not the 'py' in 'python'.
\d: matches any digit; equivalent to [0-9] for ASCII text. d for digit
\D: matches any non-digit character; equivalent to [^\d]. D for not digit
\s: matches any whitespace character; equivalent to [ \t\n\r\f\v]. s for space
\S: matches any non-whitespace character; equivalent to [^\s].
\w: matches any letter, digit, or underscore; equivalent to [a-zA-Z0-9_] for ASCII text.
\W: matches any character that is not a letter, digit, or underscore; equivalent to [^\w].
\\: matches a literal backslash \.
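
The predefined classes above can be tried out with a short script. This is only a sketch; the sample strings are made up for illustration.

```python
import re

# \b and \B: word boundary vs. non-boundary
at_boundary = re.search(r'\bpy', 'python')       # 'py' starts a word
inside_word = re.search(r'\bpy', 'openpyxl')     # 'py' is inside a word, so None
non_boundary = re.search(r'\Bpy', 'openpyxl')    # \B requires a position inside a word

# \d, \D, \w, \s
digits = re.findall(r'\d', 'a1b22')              # ['1', '2', '2']
non_digits = re.findall(r'\D', 'a1b22')          # ['a', 'b']
words = re.findall(r'\w+', 'snake_case, x9!')    # ['snake_case', 'x9']
spaces = re.findall(r'\s', 'a b\tc\n')           # [' ', '\t', '\n']

# $ also matches just before a trailing newline; \Z does not
dollar_end = re.search(r'abc$', 'abc\n')         # a match
strict_end = re.search(r'abc\Z', 'abc\n')        # None
```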

Special symbols:
[] denotes a character set or range
| denotes alternation (or)
() denotes a group
. matches any single character except a newline (\n)
^ matches at the start of the string (with re.MULTILINE, also at the start of each line)
$ matches at the end of the string, or just before a newline that ends the string (with re.MULTILINE, also at the end of each line)
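
A minimal sketch of the special symbols in action, using made-up sample strings:

```python
import re

any_char = re.findall(r'a.c', 'abc a-c a\nc')      # . skips '\n': ['abc', 'a-c']
char_set = re.findall(r'[aeiou]', 'regex')         # ['e', 'e']
char_range = re.findall(r'[0-3]', '012345')        # ['0', '1', '2', '3']
either = re.findall(r'cat|dog', 'cat, dog, cow')   # ['cat', 'dog']
grouped = re.match(r'(ab)+', 'ababx').group()      # 'abab': the group repeats as a unit
starts = re.search(r'^py', 'python') is not None   # True
ends = re.search(r'on$', 'python\n') is not None   # True: $ matches before the final '\n'
```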

Quantifiers (how many times the preceding pattern must match):
* (asterisk): matches the preceding pattern 0 or more times (greedy, i.e. it matches as much as possible), >= 0
+ (plus sign): matches the preceding pattern 1 or more times (greedy), >= 1
?: matches the preceding pattern 0 or 1 times (greedy), 0 or 1
{m}: matches the preceding pattern exactly m times
{m,}: matches the preceding pattern m or more times
{m,n}: matches the preceding pattern at least m and at most n times

Anything involving a repetition count is greedy by default in Python. Appending ? to "*", "?", "+", or "{m,n}" switches it from greedy to non-greedy.
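
The quantifiers can be checked one by one. The sketch below uses a small helper, matches(), which is a hypothetical name introduced here for illustration; it relies on re.fullmatch so that the whole string has to fit the pattern.

```python
import re

def matches(pattern, text):
    """Hypothetical helper: True if the whole text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

star = [matches(r'ab*', s) for s in ('a', 'ab', 'abbb')]            # [True, True, True]
plus = [matches(r'ab+', s) for s in ('a', 'ab', 'abbb')]            # [False, True, True]
optional = [matches(r'ab?', s) for s in ('a', 'ab', 'abb')]         # [True, True, False]
exact = [matches(r'\d{3}', s) for s in ('12', '123', '1234')]       # [False, True, False]
at_least = [matches(r'\d{2,}', s) for s in ('1', '12', '12345')]    # [False, True, True]
between = [matches(r'\d{2,3}', s) for s in ('1', '12', '123', '1234')]  # [False, True, True, False]
```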

Next come some classic demos; remember to import the re module.
The functions used are re.match() and re.search().
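
The two functions differ in where they look: re.match() only tries to match at the beginning of the string, while re.search() scans the string for the first position where the pattern matches. A small sketch with a made-up input:

```python
import re

text = 'tel: 13912345678'

from_start = re.match(r'\d+', text)    # None: the string starts with 't', not a digit
anywhere = re.search(r'\d+', text)     # finds the first run of digits

found = anywhere.group()               # '13912345678'
position = anywhere.span()             # (5, 16)
```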

 

import re

'''
    Validate a phone number
'''
phone = input("Input the phone number: ")
result = re.match(r'^1[3456789]\d{9}$', phone)  # ^1(3|4|5|6|7|8|9)\d{9}$ also works; there is more than one way to write it
if result is None:
    print("Phone number is invalid!")
else:
    print("Congratulations, verified.")

Now with a small extra condition, for example a phone number that must not end in 4 or 7:

phone = '18476529115'
print("Phone number not ending in 4 or 7:", phone)
# result = re.match(r'^1\d{9}[0-35-689]$', phone)
result = re.match(r'^1\d{9}(0|1|2|3|5|6|8|9)$', phone)  # 0 added so both forms accept the same digits
print("Check result:", result)
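
To try the rule against several numbers without typing them in, the check can be wrapped in a function. The names NOT_ENDING_4_OR_7 and is_valid_phone are hypothetical, and the sketch swaps re.match with ^...$ for fullmatch(), because $ would also accept a number followed by a trailing newline.

```python
import re

# Hypothetical helper, not from the original post.
NOT_ENDING_4_OR_7 = re.compile(r'1\d{9}[0-35-689]')

def is_valid_phone(number):
    """True for 11 digits that start with 1 and do not end in 4 or 7."""
    return NOT_ENDING_4_OR_7.fullmatch(number) is not None

ends_in_5 = is_valid_phone('18476529115')           # True
ends_in_4 = is_valid_phone('18476529114')           # False
ends_in_7 = is_valid_phone('18476529117')           # False
too_short = is_valid_phone('1847652911')            # False: only 10 digits
with_newline = is_valid_phone('18476529115\n')      # False with fullmatch

# For comparison, the ^...$ form lets the trailing newline through
loose = re.match(r'^1\d{9}[0-35-689]$', '18476529115\n') is not None  # True
```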

Next, validate an email address restricted to specific providers (qq, 163, 126):

print("============================ mail registration ==============================")
email = input("Input the email: ")
# result = re.match(r'^\w+@\w+\.com$', email)
result = re.match(r'^\w{5,18}@(163|qq|126)\.(com)$', email)  # qq, 163, and 126 mailboxes
if result is None:
    print("Mailbox format is not valid!")
else:
    print("Congratulations, verified.")
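
The same email rule can be exercised on a few fixed addresses. This is a sketch under assumptions: EMAIL_RE and parse_email are hypothetical names, the sample addresses are invented, and an extra group around the user name is added so both parts can be extracted.

```python
import re

# Hypothetical helper; the provider list mirrors the pattern in the post.
EMAIL_RE = re.compile(r'(\w{5,18})@(163|qq|126)\.com')

def parse_email(address):
    """Return (user, provider) if the address fits the rule, else None."""
    m = EMAIL_RE.fullmatch(address)
    return m.groups() if m else None

good = parse_email('alice_01@163.com')               # ('alice_01', '163')
too_short = parse_email('bob@qq.com')                # None: user name under 5 characters
other_provider = parse_email('alice_01@gmail.com')   # None: provider not in the list
missing_dot = parse_email('alice_01@126xcom')        # None: \. only matches a literal dot
```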

About groups:

phone = '010-123456789'
# in a regular expression, a pair of parentheses denotes a group
result = re.match(r'^(\d{3}|\d{4})-(\d{9})$', phone)
# extract the groups (there are as many groups as pairs of parentheses)
print(result.group(1))
print(result.group(2))
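
Besides group(n), a match object offers a few other ways to read groups. The sketch below assumes the pattern is ^(\d{3}|\d{4})-(\d{9})$, as in the demo above.

```python
import re

result = re.match(r'^(\d{3}|\d{4})-(\d{9})$', '010-123456789')

whole = result.group(0)            # '010-123456789': group 0 is the entire match
area, number = result.groups()     # ('010', '123456789'): all groups as a tuple
number_span = result.span(2)       # (4, 13): where group 2 sits in the string
```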

Group references (backreferences):

msg2 = '<h1>Hello</h1>'
# reference by number (index)
result = re.match(r'<([0-9a-zA-Z]+)>(.+)</\1>$', msg2)
# </\1>: \1 refers to the content matched by the first group (in parentheses)
print(result)
# print(result.group(1))
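
A backreference only matches the exact text the group captured, so a mismatched closing tag fails. A minimal sketch with invented inputs, plus a second common use, finding doubled words:

```python
import re

tag_pattern = r'<([0-9a-zA-Z]+)>(.+)</\1>$'

ok = re.match(tag_pattern, '<h1>Hello</h1>')     # matches: \1 must equal 'h1'
bad = re.match(tag_pattern, '<h1>Hello</h2>')    # None: 'h2' is not what group 1 captured

tag = ok.group(1)                                # 'h1'
body = ok.group(2)                               # 'Hello'

# The same idea finds a word immediately repeated
doubled = re.findall(r'\b(\w+) \1\b', 'this is is a test test')   # ['is', 'test']
```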

Named groups, (?P<name>regex), and references to them:

# named group: (?P<name>regex)
msg3 = '<html><h1>hhh</h1></html>'
result = re.match(r'<(?P<name1>\w+)><(?P<name2>\w+)>(.+)</(?P=name2)></(?P=name1)>', msg3)
# ?P<name1> names the current group (in parentheses), \w+, as name1
# </(?P=name1)> refers to whatever the group named name1 matched
print(result)
# output: <_sre.SRE_Match object; span=(0, 25), match='<html><h1>hhh</h1></html>'>
# (newer Python versions print <re.Match object; ...> instead)
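
Named groups can also be read back by name. A short sketch, assuming the same pattern and input as the demo above:

```python
import re

pattern = r'<(?P<name1>\w+)><(?P<name2>\w+)>(.+)</(?P=name2)></(?P=name1)>'
result = re.match(pattern, '<html><h1>hhh</h1></html>')

outer = result.group('name1')      # 'html'
inner = result.group('name2')      # 'h1'
text = result.group(3)             # 'hhh': named groups still count in the numbering
named = result.groupdict()         # {'name1': 'html', 'name2': 'h1'}
```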

Other functions in the re module:

'''
    sub(pattern, replacement (may be a function), string): replace
    split(pattern, string): split, returning the pieces in a list
'''
newstr = re.sub(r'\d+', '100', 'Java: 98 Python 99')
print(newstr)  # Java: 100 Python 100

# function that adds 10 to a score
def add(temp):
    num = temp.group()
    num1 = int(num) + 10
    return str(num1)


''' pass the function add() as the replacement argument '''
newstr = re.sub(r'\d+', add, 'this test score: 90')
print(newstr)  # this test score: 100

# splitting
result = re.split(r'[,:]', 'Java:99,Python:98')
print(result)  # ['Java', '99', 'Python', '98']
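
Both functions take a few more options that are worth knowing. The sketch below shows a group reference inside the replacement string, the count argument of re.sub(), and the maxsplit argument of re.split(); the sample strings are made up.

```python
import re

# \1 and \2 in the replacement refer to the groups of the pattern
swapped = re.sub(r'(\w+):(\d+)', r'\2:\1', 'Java:99,Python:98')      # '99:Java,98:Python'

# count limits how many replacements are made
first_only = re.sub(r'\d+', '100', 'Java:98 Python:99', count=1)     # 'Java:100 Python:99'

# maxsplit limits how many splits are made
two_parts = re.split(r'[,:]', 'Java:99,Python:98', maxsplit=1)       # ['Java', '99,Python:98']
```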

Greedy vs. non-greedy:
Anything involving a repetition count is greedy by default in Python.
Appending ? to "*", "?", "+", or "{m,n}" switches it from greedy to non-greedy.
Here is a test:

contents = 'abc123'
result = re.match(r'abc(\d+)', contents)
print(result)  # <_sre.SRE_Match object; span=(0, 6), match='abc123'>
result = re.match(r'abc(\d+?)', contents)
print(result)  # <_sre.SRE_Match object; span=(0, 4), match='abc1'>
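
The difference matters most when the pattern has something to match after the quantifier. A sketch with an invented HTML snippet:

```python
import re

html = '<b>one</b><b>two</b>'

greedy = re.findall(r'<b>(.+)</b>', html)     # ['one</b><b>two']: .+ runs to the last </b>
lazy = re.findall(r'<b>(.+?)</b>', html)      # ['one', 'two']: .+? stops at the first </b>

# The same applies to {m,n}
greedy_range = re.match(r'\d{2,4}', '12345').group()    # '1234'
lazy_range = re.match(r'\d{2,4}?', '12345').group()     # '12'
```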


Origin www.cnblogs.com/huma/p/12153546.html