Python Regular Expressions (RegEx)

License: Creative Commons, attribution required. Others may create works based on this article, provided they distribute them under the same license as the original.

Copyright reserved. Reprinting without permission is prohibited.




A regular expression (RegEx) is a sequence of characters that forms a search pattern.

A regular expression can be used to check whether a string contains the specified search pattern.

RegEx Module

Python has a built-in package called re that can be used to work with regular expressions.

Examples

Import the re module:

import re

RegEx in Python

Once you have imported the re module, you can start using regular expressions:

Examples

Search the string to see whether it starts with "The" and ends with "Spain":

import re

txt = "The rain in Spain"
x = re.search("^The.*Spain$", txt)
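As a minimal sketch of how the result of this search might be used: re.search() returns a Match object when the pattern matches and None otherwise, so the result can be tested directly in an if statement. The variable name match and the printed messages are illustrative choices, not part of the original example.

```python
import re

txt = "The rain in Spain"

# ^ anchors the pattern at the start of the string, $ at the end;
# .* matches whatever lies in between.
match = re.search(r"^The.*Spain$", txt)

# A Match object is truthy, None is falsy.
if match:
    print("Match found:", match.group())
else:
    print("No match")
```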

Regular Expression Functions

The re module provides a set of functions for searching a string for a match:

Function  Description
findall   Returns a list containing all matches
search    Returns a Match object if there is a match anywhere in the string
split     Returns a list where the string has been split at each match
sub       Replaces one or more matches with a string
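To make the four functions concrete, the following sketch calls each of them on the same sample string. The variable names and the underscore used as replacement text are assumptions chosen for illustration.

```python
import re

txt = "The rain in Spain"

all_matches = re.findall(r"ai", txt)  # every non-overlapping match, as a list
first_match = re.search(r"ai", txt)   # Match object for the first hit, or None
pieces = re.split(r"\s", txt)         # split at each whitespace character
replaced = re.sub(r"\s", "_", txt)    # replace each whitespace character

print(all_matches)          # ['ai', 'ai']
print(first_match.start())  # 5
print(pieces)               # ['The', 'rain', 'in', 'Spain']
print(replaced)             # The_rain_in_Spain
```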

Metacharacters

Metacharacters are characters that have a special meaning:

Character  Description                                                 Example
[]         A set of characters                                         "[a-m]"
\          Signals a special sequence, or escapes a special character  "\d"
.          Any character (except a newline)                            "he..o"
^          Starts with                                                 "^hello"
$          Ends with                                                   "world$"
*          Zero or more occurrences                                    "aix*"
+          One or more occurrences                                     "aix+"
{}         Exactly the specified number of occurrences                 "al{2}"
|          Either or                                                   "falls|stays"
()         Capture and group
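The sketch below tries several of these metacharacters with re.findall() and re.search(). The sample sentence is an assumption picked so that each pattern has something to match (or, for "aix+", nothing to match).

```python
import re

# Sample text chosen for illustration only.
txt = "The rain in Spain falls mainly in the plain!"

print(re.findall(r"^The", txt))         # ['The']   (string starts with "The")
print(re.findall(r"plain!$", txt))      # ['plain!'] (string ends with "plain!")
print(re.findall(r"aix*", txt))         # ['ai', 'ai', 'ai', 'ai']  ("x" may be absent)
print(re.findall(r"aix+", txt))         # []  (at least one "x" is required)
print(re.findall(r"al{2}", txt))        # ['all']  ("a" followed by exactly two "l")
print(re.findall(r"falls|stays", txt))  # ['falls']

# Parentheses capture the parts of the match they enclose.
groups = re.search(r"(\w+) in (\w+)", txt).groups()
print(groups)                           # ('rain', 'Spain')
```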

Special Sequences

A special sequence is a \ followed by one of the characters in the list below, and it has a special meaning:

Character  Description                                                 Example
\A         Match only at the start of the string                       "\AThe"
\b         Match at the beginning or end of a word                     r"\bain", r"ain\b"
\B         Match anywhere except at the beginning or end of a word     r"\Bain", r"ain\B"
\d         Match any digit (0-9)                                       "\d"
\D         Match any character that is not a digit                     "\D"
\s         Match any whitespace character                              "\s"
\S         Match any character that is not whitespace                  "\S"
\w         Match any word character (a-z, A-Z, 0-9, and underscore _)  "\w"
\W         Match any character that is not a word character            "\W"
\Z         Match only at the end of the string                         "Spain\Z"
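A short sketch of how some of these sequences behave on the sample string used throughout this article. Raw strings (the r prefix) are used so that Python passes the backslashes to the re module unchanged; the comments show the results for this particular input.

```python
import re

txt = "The rain in Spain"

print(re.findall(r"\AThe", txt))    # ['The']  (string starts with "The")
print(re.findall(r"\bain", txt))    # []  (no word begins with "ain")
print(re.findall(r"ain\b", txt))    # ['ain', 'ain']  ("rain" and "Spain" end with it)
print(re.findall(r"\Bain", txt))    # ['ain', 'ain']  ("ain" appears, but not at a word start)
print(re.findall(r"ain\B", txt))    # []  ("ain" is always at the end of a word here)
print(re.findall(r"\d", txt))       # []  (the string contains no digits)
print(re.findall(r"\s", txt))       # [' ', ' ', ' ']
print(len(re.findall(r"\w", txt)))  # 14  (every letter is a word character)
print(re.findall(r"Spain\Z", txt))  # ['Spain']  (string ends with "Spain")
```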

Sets

A set is a group of characters inside a pair of square brackets [] with a special meaning:

Set         Description
[arn]       Matches any one of the specified characters (a, r, or n)
[a-n]       Matches any lowercase character, alphabetically between a and n
[^arn]      Matches any character except a, r, and n
[0123]      Matches any of the specified digits (0, 1, 2, or 3)
[0-9]       Matches any digit between 0 and 9
[0-5][0-9]  Matches any two-digit number from 00 to 59
[a-zA-Z]    Matches any letter between a and z, lowercase or uppercase
[+]         In sets, +, *, ., |, (), $, {} have no special meaning, so [+] matches a literal "+" character
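The following sketch runs a few of these sets through re.findall(). The second and third sample strings are assumptions, chosen only because they contain digits and a "+" character.

```python
import re

txt = "The rain in Spain"
time_txt = "8 times before 11:45 AM"  # illustrative sample containing digits
sum_txt = "2 + 2 = 4"                 # illustrative sample containing "+"

print(re.findall(r"[arn]", txt))            # ['r', 'a', 'n', 'n', 'a', 'n']
print(re.findall(r"[^arn]", "rain"))        # ['i']
print(re.findall(r"[0123]", time_txt))      # ['1', '1']
print(re.findall(r"[0-9]", time_txt))       # ['8', '1', '1', '4', '5']
print(re.findall(r"[0-5][0-9]", time_txt))  # ['11', '45']
print(re.findall(r"[+]", sum_txt))          # ['+']  ("+" is literal inside a set)
```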

findall() Function

The findall() function returns a list containing all matches.

Examples

Print a list of all matches:

import re

txt = "The rain in Spain"  # named txt to avoid shadowing the built-in str
x = re.findall("ai", txt)
print(x)

The list contains the matches in the order in which they are found.

If no matches are found, an empty list is returned:

Examples

Print the list of matches, which is empty when nothing matches:

import re

txt = "The rain in Spain"  # named txt to avoid shadowing the built-in str
x = re.findall("Portugal", txt)
print(x)

search() Function

The search() function searches the string for a match and returns a Match object if there is one.

If there is more than one match, only the first occurrence is returned:

Examples

Search for the first whitespace character in the string:

import re

txt = "The rain in Spain"
x = re.search(r"\s", txt)  # raw string, so \s reaches the re module unchanged

print("The first whitespace character is located at position:", x.start())

If no match is found, the value None is returned:

Examples

Run a search that returns no match:

import re

txt = "The rain in Spain"  # named txt to avoid shadowing the built-in str
x = re.search("Portugal", txt)
print(x)
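Because search() can return None, calling a method such as .start() on the result without checking it first raises an AttributeError when nothing matches. A minimal sketch of a guard, with the loop and messages chosen purely for illustration:

```python
import re

txt = "The rain in Spain"

for pattern in (r"Spain", r"Portugal"):
    result = re.search(pattern, txt)
    if result is None:
        # No Match object was returned, so there is nothing to inspect.
        print(pattern, "-> no match")
    else:
        print(pattern, "-> found at index", result.start())
```

For this input the loop reports index 12 for "Spain" and no match for "Portugal".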

split() Function

The split() function returns a list where the string has been split at each match:

Examples

Split the string at each whitespace character:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt)  # raw string, so \s reaches the re module unchanged
print(x)

You can control the number of splits by specifying the maxsplit parameter:

Examples

Split the string only at the first occurrence:

import re

txt = "The rain in Spain"
x = re.split(r"\s", txt, maxsplit=1)  # split at most once
print(x)
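To see the effect of maxsplit side by side, the sketch below splits the same string with different limits. Passing maxsplit as a keyword argument is a stylistic choice made here for readability; the default value 0 means there is no limit.

```python
import re

txt = "The rain in Spain"

no_limit = re.split(r"\s", txt)                # default: split at every match
one_split = re.split(r"\s", txt, maxsplit=1)   # stop after the first split
two_splits = re.split(r"\s", txt, maxsplit=2)  # stop after the second split

print(no_limit)    # ['The', 'rain', 'in', 'Spain']
print(one_split)   # ['The', 'rain in Spain']
print(two_splits)  # ['The', 'rain', 'in Spain']
```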

sub() Function

The sub() function replaces the matches with the text of your choice:

Examples

Replace every whitespace character with the digit 9:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt)  # raw string, so \s reaches the re module unchanged
print(x)

You can control the number of replacements by specifying the count parameter:

Examples

Replace the first two occurrences:

import re

txt = "The rain in Spain"
x = re.sub(r"\s", "9", txt, count=2)  # replace at most two matches
print(x)
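The sketch below compares the results for different count values on the same string. As with maxsplit, count is passed by keyword here as a readability choice, and the default value 0 means every match is replaced.

```python
import re

txt = "The rain in Spain"

replace_all = re.sub(r"\s", "9", txt)           # default: replace every match
replace_one = re.sub(r"\s", "9", txt, count=1)  # replace only the first match
replace_two = re.sub(r"\s", "9", txt, count=2)  # replace the first two matches

print(replace_all)  # The9rain9in9Spain
print(replace_one)  # The9rain in Spain
print(replace_two)  # The9rain9in Spain
```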

Match Object

A Match object is an object that contains information about the search and its result.

Note: If there is no match, the value None is returned instead of a Match object.

Examples

Run a search that returns a Match object:

import re

txt = "The rain in Spain"
x = re.search("ai", txt)
print(x)  # prints the Match object

The Match object has properties and methods for retrieving information about the search and its result:

  • .span() returns a tuple containing the start and end positions of the match.
  • .string returns the string that was passed into the function.
  • .group() returns the part of the string where there was a match.
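These three members are related: slicing .string with the positions from .span() gives the same text as .group(). A brief sketch of that relationship, with the None check included as a precaution rather than because this particular search can fail:

```python
import re

txt = "The rain in Spain"

# Look for a word that starts with an uppercase "S".
x = re.search(r"\bS\w+", txt)

if x is not None:
    start, end = x.span()
    print(x.span())             # (12, 17)
    print(x.group())            # Spain
    print(x.string[start:end])  # Spain  (same text, recovered by slicing)
```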

Examples

Print the position (start and end) of the first match.

The regular expression looks for any word that starts with an uppercase "S":

import re

txt = "The rain in Spain"  # named txt to avoid shadowing the built-in str
x = re.search(r"\bS\w+", txt)
print(x.span())

Examples

Print the string that was passed into the function:

import re

txt = "The rain in Spain"  # named txt to avoid shadowing the built-in str
x = re.search(r"\bS\w+", txt)
print(x.string)

Examples

Print the part of the string where there was a match.

The regular expression looks for any word that starts with an uppercase "S":

import re

txt = "The rain in Spain"  # named txt to avoid shadowing the built-in str
x = re.search(r"\bS\w+", txt)
print(x.group())

Note: If there is no match, the value None is returned instead of a Match object, so calling .span(), .string, or .group() on the result fails.


Origin blog.csdn.net/weixin_43031412/article/details/93846336