Using regular expressions in Python

First, an introduction to regular expressions

Regular expression tutorial: https://www.runoob.com/regexp/regexp-tutorial.html

  Regular expression (English: Regular Expression, often abbreviated in code as regex, regexp or RE) is a concept in computer science. A regular expression uses a single string to describe and match a series of strings that conform to a certain syntactic rule. In many text editors, regular expressions are used to search for and replace text that matches a pattern.

  In short, a regular expression is an expression that matches text according to certain rules.

Second, what regular expressions are used for

  Regular expressions are a tool for matching strings and for extracting information from strings.

1. Determine whether a given string matches a format (for example, checking whether a user account name is valid)

2. Extract information in a specified format from a string (for example, fetching a phone number)

import re
str1 = 'fijiooe18814726275iufdrrrrdf18814726275fsdssa'
# Define a matching rule
# The phone number we need to find is already known
p = '18814726275'
# search(): scans the string from front to back and returns the first match found; it does not keep searching after that
res = re.search(p, str1).group()
print(res)
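
The first use case listed above, checking whether a string fits a format, can be sketched in a similar way. The account rule used here (a letter followed by 5 to 11 letters, digits or underscores) and the helper name is_valid_account are assumptions made up for illustration; they are not part of the original example.

```python
import re

# Hypothetical rule: one letter, then 5 to 11 word characters
ACCOUNT_PATTERN = r'[A-Za-z]\w{5,11}'

def is_valid_account(name):
    # fullmatch() succeeds only if the whole string fits the pattern
    return re.fullmatch(ACCOUNT_PATTERN, name) is not None

print(is_valid_account('wang_2019'))  # True
print(is_valid_account('9lives'))     # False: starts with a digit
```

re.fullmatch is used rather than re.search because a format check should reject strings that only contain a valid part.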

Third, metacharacters

Representing a single character

Character   Function
.           Matches any single character (except \n)
[]          Matches any one of the characters listed inside the brackets
\d          Matches a digit, that is, 0-9
\D          Matches a non-digit, that is, anything that is not a digit
\s          Matches whitespace, such as a space or a tab
\S          Matches non-whitespace
\w          Matches a word character, that is, a-z, A-Z, 0-9, _
\W          Matches a non-word character
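
As a small illustration of the single-character metacharacters in the table, the sketch below runs a few of them against one sample string. The sample text is invented for this example.

```python
import re

text = 'Tom_1 has 2 cats\tand 10 dogs'

digits = re.findall(r'\d', text)       # every single digit
listed = re.findall(r'[abc]', text)    # only the characters a, b or c
blanks = re.findall(r'\s', text)       # spaces and the tab
words = re.findall(r'\w+', text)       # runs of word characters
any_char = re.search(r'c.t', text).group()  # '.' stands for any one character

print(digits)    # ['1', '2', '1', '0']
print(listed)    # ['a', 'c', 'a', 'a']
print(words)     # ['Tom_1', 'has', '2', 'cats', 'and', '10', 'dogs']
print(any_char)  # cat
```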

Representing quantity

These quantifiers match the preceding character repeated a number of times.

Character   Function
*           Matches the preceding character zero or more times, that is, it is optional
+           Matches the preceding character one or more times, that is, at least once
?           Matches the preceding character zero or one time, that is, it appears once or not at all
{m}         Matches the preceding character exactly m times
{m,}        Matches the preceding character at least m times
{m,n}       Matches the preceding character from m to n times
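
The quantifiers in the table can be compared side by side with a short sketch. The sample strings and the helper name matching are assumptions chosen for illustration.

```python
import re

samples = ['ac', 'abc', 'abbc', 'abbbc']

def matching(pattern):
    # Keep only the samples that the pattern matches in full
    return [s for s in samples if re.fullmatch(pattern, s)]

print(matching(r'ab*c'))      # ['ac', 'abc', 'abbc', 'abbbc']
print(matching(r'ab+c'))      # ['abc', 'abbc', 'abbbc']
print(matching(r'ab?c'))      # ['ac', 'abc']
print(matching(r'ab{2}c'))    # ['abbc']
print(matching(r'ab{2,}c'))   # ['abbc', 'abbbc']
print(matching(r'ab{1,2}c'))  # ['abc', 'abbc']
```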

Representing boundaries

Character   Function
^           Matches the beginning of the string
$           Matches the end of the string
\b          Matches a word boundary
\B          Matches a non-word boundary
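
The boundary metacharacters match positions rather than characters, which is easier to see in a small sketch. The sample text is invented for this example.

```python
import re

text = 'cat concat catalog'

whole_word = re.findall(r'\bcat\b', text)       # 'cat' as a standalone word
inside = re.search(r'\Bcat\b', text).start()    # 'cat' at the end of 'concat'
starts_with = re.search(r'^cat', text) is not None
ends_with = re.search(r'cat$', text) is not None

print(whole_word)   # ['cat']
print(inside)       # 7
print(starts_with)  # True
print(ends_with)    # False: the string ends with 'catalog'
```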

Group matching

Character    Function
|            Matches either the expression on its left or the one on its right
(ab)         Treats the characters inside the parentheses as a group
\num         References the string matched by group number num
(?P<name>)   Gives the group an alias (a named group)
(?P=name)    References the string matched by the group with the alias name
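
A short sketch of alternation, numbered groups and named groups follows. The tag-matching pattern is only an illustration of back-references, not a recommended way to parse HTML, and the sample strings are made up.

```python
import re

# | picks one of the alternatives inside the group
is_image = re.fullmatch(r'\w+\.(jpg|png)', 'photo.png') is not None
print(is_image)  # True

html = '<h1>Title</h1>'

# \1 must repeat exactly what group 1 matched
m = re.fullmatch(r'<(\w+)>(.*)</\1>', html)
print(m.group(1), m.group(2))  # h1 Title

# The same rule written with a named group and a named reference
m2 = re.fullmatch(r'<(?P<tag>\w+)>(?P<body>.*)</(?P=tag)>', html)
print(m2.group('tag'), m2.group('body'))  # h1 Title

# A mismatched closing tag does not satisfy the back-reference
mismatch = re.fullmatch(r'<(\w+)>(.*)</\1>', '<h1>Title</h2>')
print(mismatch)  # None
```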

Fourth, the re module

  • re.match function

  The re.match function tries to match the pattern at the starting position of the string. If the match succeeds, it returns a match object (the object contains the matched information); if the match does not succeed at the starting position, match() returns None.
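
A minimal sketch of this behavior, using made-up sample strings:

```python
import re

# The digits are at the start, so match() succeeds
m = re.match(r'\d+', '2019-10-02 release')
print(m.group())  # 2019
print(m.span())   # (0, 4)

# The digits are not at the start, so match() returns None
no_match = re.match(r'\d+', 'release 2019-10-02')
print(no_match)   # None
```

Because the result may be None, it is safer to check it before calling group().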

  • re.search method

  re.search() scans the entire string and returns the first successful match.

  • The difference between re.match and re.search

  re.match only matches at the beginning of the string: if the beginning of the string does not conform to the regular expression, the match fails and the function returns None. re.search scans the whole string until it finds a match.
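
The difference can be seen by running both functions with the same pattern on the same string. The sample text is invented for this sketch.

```python
import re

text = 'order id: 48213'

at_start = re.match(r'\d+', text)   # the string starts with 'order', not a digit
found = re.search(r'\d+', text)     # scanning reaches the digits later on

print(at_start)       # None
print(found.group())  # 48213
print(found.start())  # 10
```

re.search also returns None when nothing in the string matches, so calling group() on an unchecked result can raise AttributeError.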

  • findall method

  Finds all substrings in the string that match the regular expression and returns them as a list. If no match is found, an empty list is returned.

  • Note: match and search return a single match; findall returns all matches.

  • sub method

  Replaces parts of a string; the substrings to replace are selected with a regular expression.
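
A short sketch of findall, reusing the phone number from the earlier example; the second number and the surrounding text are made up.

```python
import re

text = 'call 18814726275 or 13900001111'

numbers = re.findall(r'\d{11}', text)
capitals = re.findall(r'[A-Z]', text)
# When the pattern contains one group, findall returns the group contents
prefixes = re.findall(r'(\d{3})\d{8}', text)

print(numbers)   # ['18814726275', '13900001111']
print(capitals)  # []
print(prefixes)  # ['188', '139']
```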

  re.sub(pattern,repl,string,count=0)

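    • pattern: the regular expression pattern string;
    • repl: the replacement (can be either a string or a function)
    • string: the string to be processed, in which the replacements are made
    • count: the maximum number of replacements (the default 0 replaces every match)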
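
The parameters above can be tried out in a short sketch. The sample string and the helper name double are assumptions made for illustration.

```python
import re

text = 'a1b22c333'

all_replaced = re.sub(r'\d+', '#', text)            # replace every match
first_only = re.sub(r'\d+', '#', text, count=1)     # replace only the first match

def double(match):
    # repl as a function: receives the match object, returns the new text
    return str(int(match.group()) * 2)

doubled = re.sub(r'\d+', double, text)

print(all_replaced)  # a#b#c#
print(first_only)    # a#b22c333
print(doubled)       # a2b44c666
```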

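Fifth, greedy mode

In Python, quantifiers are greedy by default and always try to match as many characters as possible; non-greedy mode is the opposite and always tries to match as few characters as possible.

Adding ? after *, ?, +, {m,} or {m,n} turns greedy mode into non-greedy mode.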
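
To illustrate greedy and non-greedy quantifiers, the sketch below runs each form against the same input. The sample strings are invented for this example.

```python
import re

html = '<b>one</b><b>two</b>'

greedy = re.findall(r'<b>.*</b>', html)    # .* runs to the last </b>
lazy = re.findall(r'<b>.*?</b>', html)     # .*? stops at the first </b>

print(greedy)  # ['<b>one</b><b>two</b>']
print(lazy)    # ['<b>one</b>', '<b>two</b>']

greedy_digits = re.search(r'\d{2,4}', '123456').group()
lazy_digits = re.search(r'\d{2,4}?', '123456').group()

print(greedy_digits)  # 1234
print(lazy_digits)    # 12
```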

 


Origin www.cnblogs.com/wanglle/p/11617195.html