Regular expressions
Usage:
import re
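re.match(pattern, string, flags=0)
pa = re.match('love', 'love you')  # matches only at the start of the string
pa.group()  # the matched text: 'love'
pa.span()  # (start, end) tuple of the match: (0, 4)
Purpose: tries to match starting at the first character and returns None if the pattern does not match there (re.match('love', 'i love you') is None, so calling .group() on it raises AttributeError)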
re.search(pattern, string, flags=0)
pa = re.search('love', 'i love you')
Purpose: scans the whole string and returns a match object for the first occurrence, or None if there is none
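The difference between match and search is easiest to see on the same input. A minimal sketch; the sample string is only for illustration:

```python
import re

text = 'i love you'

# match only looks at position 0, so it fails here.
at_start = re.match('love', text)
print(at_start)            # None

# search scans forward and stops at the first occurrence.
found = re.search('love', text)
print(found.group())       # love
print(found.span())        # (2, 6)
```

Because both functions can return None, it is usually safer to test the result before calling .group().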
re.findall(pattern, string, flags=0)
Purpose: searches the whole string and returns all non-overlapping matches as a list
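A short sketch of findall; the sample text is made up for illustration:

```python
import re

text = 'tel: 010-1234, ext 56'

# Every run of digits, in the order they appear.
numbers = re.findall(r'\d+', text)
print(numbers)     # ['010', '1234', '56']

# No match gives an empty list, not None.
nothing = re.findall(r'z', text)
print(nothing)     # []
```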
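re.compile(pattern, flags=0)
Purpose: compiles the pattern into a reusable pattern object; an invalid pattern raises re.error, so compiling also checks that the expression is well formed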
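A minimal sketch of compile, showing both reuse of the compiled object and the error raised for a malformed pattern. The variable names and sample strings are illustrative only:

```python
import re

# Compile once, then reuse the pattern object's own methods.
digits = re.compile(r'\d+')
all_runs = digits.findall('a1b22c333')
no_hit = digits.search('no digits here')
print(all_runs)    # ['1', '22', '333']
print(no_hit)      # None

# A malformed pattern is rejected at compile time.
try:
    re.compile('(unclosed')
    is_valid = True
except re.error as exc:
    is_valid = False
    print('invalid pattern:', exc)
```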
Single characters
==.== matches any character except a newline
==[]== matches any one character from the set inside the brackets
==[^a-z]== matches any one character other than a to z
==^== anchor: matches at the start of the string
==$== anchor: matches at the end of the string
==[a-zA-Z0-9]== matches one letter or digit
==\d== matches a digit
==\D== matches a non-digit
==\w== matches a letter, digit, or underscore
==\W== matches any character that is not a letter, digit, or underscore
==\s== matches a whitespace character (space, tab, newline, and so on)
==\S== any character except whitespace
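The character classes above can be tried out on one small string. A sketch; the sample string is an assumption chosen to contain a letter, underscore, digit, space, and newline:

```python
import re

sample = 'a_1 B-2\n'

digits = re.findall(r'\d', sample)
word_chars = re.findall(r'\w', sample)
spaces = re.findall(r'\s', sample)
not_lower = re.findall(r'[^a-z]', sample)
any_char = re.findall(r'.', sample)

print(digits)       # ['1', '2']
print(word_chars)   # ['a', '_', '1', 'B', '2']
print(spaces)       # [' ', '\n']
print(not_lower)    # ['_', '1', ' ', 'B', '-', '2', '\n']
print(any_char)     # every character except the trailing newline
```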
Quantifiers
=={n}== the preceding item repeats exactly n times
=={n,}== the preceding item repeats at least n times
=={n,m}== the preceding item repeats at least n and at most m times
=={,n}== the preceding item repeats 0 to n times
==+== the preceding item repeats 1 or more times
==*== the preceding item repeats 0 or more times
==?== the preceding item appears 0 or 1 time
==\b== word boundary: the position between a \w character and a \W character (or the start or end of the string)
==\B== non-word boundary
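The repetition counts and word boundaries listed above can be checked with findall. A sketch using made-up sample strings:

```python
import re

s = 'a aa aaa'
exactly_two = re.findall(r'a{2}', s)
at_least_two = re.findall(r'a{2,}', s)
one_or_two = re.findall(r'a{1,2}', s)
print(exactly_two)    # ['aa', 'aa']
print(at_least_two)   # ['aa', 'aaa']
print(one_or_two)     # ['a', 'aa', 'aa', 'a']

t = 'a ab abb'
plus = re.findall(r'ab+', t)
star = re.findall(r'ab*', t)
optional = re.findall(r'ab?', t)
print(plus)           # ['ab', 'abb']
print(star)           # ['a', 'ab', 'abb']
print(optional)       # ['a', 'ab', 'ab']

# \b requires a word edge on that side; \B requires the opposite.
whole_words = re.findall(r'\bcat\b', 'cat concat cats cat.')
inside_word = re.findall(r'\Bcat', 'cat concat')
print(whole_words)    # ['cat', 'cat']
print(inside_word)    # ['cat']
```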
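The r prefix, as in r'\w', only changes how Python reads the string literal
In a raw string, backslashes lose their meaning as escape characters, so Python leaves them untouched
The regex engine then receives \w unchanged and interprets it itself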
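The effect of the r prefix shows up most clearly with \b, which means backspace in an ordinary Python string but word boundary in a regex. A minimal sketch:

```python
import re

plain_len = len('\n')    # Python turns \n into one newline character
raw_len = len(r'\n')     # a backslash followed by the letter n
print(plain_len)         # 1
print(raw_len)           # 2

# Without r, the pattern contains backspace characters, not word boundaries.
without_r = re.findall('\bcat\b', 'a cat')
with_r = re.findall(r'\bcat\b', 'a cat')
print(without_r)         # []
print(with_r)            # ['cat']
```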
Grouping and Subpatterns
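==()== subpattern (capturing group)
==\num== backreference: matches the same text that group number num matched
\w and \S can match Chinese characters
Understand: (?P<name>...) defines a named group; (?P=name) matches the same text as that named group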
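A sketch of groups, numbered backreferences, and named groups. The sample strings and the group name word are illustrative assumptions:

```python
import re

# Numbered groups: group(0) is the whole match, group(1) the first ().
m = re.match(r'(\d{4})-(\d{2})', '2024-05-17')
print(m.group(0))    # 2024-05
print(m.group(1))    # 2024
print(m.groups())    # ('2024', '05')

# \1 must repeat exactly what group 1 matched: a doubled letter.
doubled = re.search(r'(\w)\1', 'hello').group()
print(doubled)       # ll

# Named group and a reference back to it: a repeated word.
rep = re.search(r'(?P<word>\w+) (?P=word)', 'it is is fine')
print(rep.group())          # is is
print(rep.group('word'))    # is

# \w also matches Chinese characters in str patterns.
words = re.findall(r'\w+', '你好 world')
print(words)         # ['你好', 'world']
```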
Greedy mode
Quantifiers are greedy by default: they match as much text as possible
Commonly used non-greedy patterns
==+?==
==*?==
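Greedy and non-greedy quantifiers can be compared on the same input. A sketch with a made-up tag string:

```python
import re

html = '<b>one</b><b>two</b>'

# Greedy: .* runs to the last </b> it can reach.
greedy = re.findall(r'<b>.*</b>', html)
# Non-greedy: .*? stops at the first </b>.
lazy = re.findall(r'<b>.*?</b>', html)
print(greedy)    # ['<b>one</b><b>two</b>']
print(lazy)      # ['<b>one</b>', '<b>two</b>']

most = re.search(r'\d+', '12345').group()
least = re.search(r'\d+?', '12345').group()
print(most)      # 12345
print(least)     # 1
```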
Expression modifiers (flags)
re.I ignore case
re.M multi-line mode: ^ and $ also match at the start and end of each line
re.S single-line mode: . also matches a newline
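Each flag can be seen by running the same pattern with and without it. A sketch; the two-line sample text is an assumption:

```python
import re

text = 'First line\nsecond LINE'

# re.I: case is ignored.
any_case = re.findall(r'line', text, re.I)
print(any_case)      # ['line', 'LINE']

# re.M: ^ matches at the start of every line, not only the string.
first_only = re.findall(r'^\w+', text)
each_line = re.findall(r'^\w+', text, re.M)
print(first_only)    # ['First']
print(each_line)     # ['First', 'second']

# re.S: . is allowed to cross the newline.
no_dotall = re.search(r'First.*second', text)
dotall = re.search(r'First.*second', text, re.S)
print(no_dotall)     # None
print(repr(dotall.group()))    # 'First line\nsecond'
```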
Regex functions
sub: replace
sub(replacement, source string)
reg = re.compile(r'\d')
res = reg.sub('aa', 'woyi12')
print(res)  # woyiaaaa
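The same replacement can also be written with the module-level re.sub, which takes the pattern as its first argument and an optional count that limits how many matches are replaced. A brief sketch:

```python
import re

# Each digit is replaced separately, so two digits give 'aa' twice.
every_digit = re.sub(r'\d', 'aa', 'woyi12')
print(every_digit)    # woyiaaaa

# count=1 replaces only the first match.
first_digit = re.sub(r'\d', 'aa', 'woyi12', count=1)
print(first_digit)    # woyiaa2

# With \d+ the whole run of digits is replaced once.
digit_run = re.sub(r'\d+', 'aa', 'woyi12')
print(digit_run)      # woyiaa
```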
Note:
- When a pattern passed to findall contains capturing parentheses (), the result holds only the text captured inside the parentheses, not the whole match
- (?:) is a non-capturing group: it takes part in matching but its content is not captured
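The two notes above can be checked by changing only the parentheses in one pattern. A sketch with a made-up sample string:

```python
import re

text = '010-1234 021-5678'

no_group = re.findall(r'\d+-\d+', text)
one_group = re.findall(r'(\d+)-\d+', text)
two_groups = re.findall(r'(\d+)-(\d+)', text)
non_capturing = re.findall(r'(?:\d+)-\d+', text)

print(no_group)         # ['010-1234', '021-5678']
print(one_group)        # ['010', '021']
print(two_groups)       # [('010', '1234'), ('021', '5678')]
print(non_capturing)    # ['010-1234', '021-5678']
```

With more than one capturing group, findall returns a list of tuples, one tuple per match.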
Regex iterator
res = re.finditer(r'\d', '12345')
print(next(res).group())  # 1
print(next(res).group())  # 2
print(next(res).group())  # 3
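In practice the iterator is usually consumed with a for loop rather than repeated next calls, and each item is a full match object. A sketch; the sample string is an assumption:

```python
import re

hits = []
for m in re.finditer(r'\d+', 'a1bb22ccc333'):
    # Each m is a match object, so both text and position are available.
    hits.append((m.group(), m.span()))
    print(m.group(), m.span())

# hits is [('1', (1, 2)), ('22', (4, 6)), ('333', (9, 12))]
```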