Basic Learning: Regular Expressions (the re Module)

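The compile function returns a regular expression object that has a set of methods for matching. The re module also provides module-level functions that mirror those methods. The advantage of compile is that once a pattern object has been defined, it can be reused as many times as needed.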

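re.match(pattern, string, flags=0) <====> re.compile(pattern, flags).match(string)
(A compiled pattern takes its flags in compile(); its match method accepts optional pos and endpos arguments instead of flags.)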

>>> from re import *
>>> pattern = compile('hello')
>>> print( pattern.match('hello world and hello world').group())
hello
>>> print(match('hello','hello world and hello world').group())
hello
## The above shows two equivalent ways of matching
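
To make the point about reuse concrete, here is a minimal sketch that compiles a pattern once and applies it to several strings. The sample strings are made up for illustration and are not from the session above.

```python
import re

# Compile once, then reuse the same pattern object for many strings.
hello = re.compile('hello')

samples = ['hello world', 'and hello world', 'goodbye']  # made-up sample data
starts_with_hello = [s for s in samples if hello.match(s)]
print(starts_with_hello)

# findall returns every non-overlapping match in the string.
print(len(hello.findall('hello world and hello world')))

# A compiled pattern's match() takes an optional start position, not flags.
print(hello.match('and hello world', 4).group())
```

Only the first sample starts with 'hello', so only it survives the filter. The last call succeeds because matching begins at index 4, where 'hello' starts.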
 
>>> print(match('hello','and hello world').group())
Traceback (most recent call last):
  File "<pyshell#13>", line 1, in <module>
    print(match('hello','and hello world').group())
AttributeError: 'NoneType' object has no attribute 'group'
>>>
>>> print(search('hello','and hello world').group())
hello
>>>
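## The above shows the difference between match and search: match only matches at the beginning of the string, while search scans the whole string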
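
Both functions return None when nothing matches, which is what causes the AttributeError above. A minimal sketch of checking the result before calling group() follows; the helper name first_match is hypothetical, introduced only for this example.

```python
import re

def first_match(pattern, text):
    """Return the first matched substring, or None if nothing matches."""
    m = re.search(pattern, text)
    # re.search returns None on failure, so test the result before calling group()
    return m.group() if m else None

at_start = re.match('hello', 'and hello world')    # None: 'hello' is not at position 0
anywhere = re.search('hello', 'and hello world')   # a match object

print(at_start)
print(anywhere.group())
print(first_match('hello', 'no greeting here'))
```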
 
>>> print(search('hello','and Hello world').group())
Traceback (most recent call last):
  File "<pyshell#17>", line 1, in <module>
    print(search('hello','and Hello world').group())
AttributeError: 'NoneType' object has no attribute 'group'
>>> print(search('hello','and Hello world',I | M).group())
Hello
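## The above shows how the flags parameter is used. To pass several flags, join them with "|". The available flags are:
I    case-insensitive matching
L    locale-aware matching (in Python 3 this is only valid with bytes patterns)
M    multi-line matching; affects ^ and $
S    makes "." match every character, including the newline
U    interprets characters according to the Unicode character set; affects \w, \W, \b, \B (already the default for str patterns in Python 3)
X    verbose mode: the pattern can be written across several lines with comments, which is more flexible (whitespace in the pattern is ignored, except inside a character class or when escaped)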
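
The effect of M, S and X is easier to see in a small experiment. The following is a sketch with a made-up two-line string; the variable names are chosen only for this illustration.

```python
import re

text = "first line\nSecond line"  # made-up sample text

# re.M: ^ matches at the start of every line, not only at the start of the string
line_starts = re.findall(r'^\w+', text, re.M)
print(line_starts)

# re.S: '.' also matches the newline, so the match can span both lines
with_s = re.search(r'first.*Second', text, re.S) is not None
without_s = re.search(r'first.*Second', text) is not None
print(with_s, without_s)

# Flags are combined with |
combined = re.findall(r'^second', text, re.I | re.M)
print(combined)

# re.X: whitespace and comments inside the pattern are ignored
number = re.compile(r"""
    (\d+)   # integer part
    \.      # literal dot
    (\d+)   # fractional part
""", re.X)
parts = number.search("pi is about 3.14").groups()
print(parts)
```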

from  re import *
line = 'Cats are smarter than dogs'
matchObj = match(r'(.*) are (.*?) .*', line, M|I)
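# Parentheses define groups; group(N) returns the text of the Nth group. The ? after .* makes that group non-greedy, i.e. it matches as little as possible.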
print(matchObj.group())
print(matchObj.group(0))
print(matchObj.group(1))
print(matchObj.group(2))
  
Output:
Cats are smarter than dogs
Cats are smarter than dogs
Cats
smarter
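
To see what the non-greedy ? changes, the same line can be matched with a greedy second group for comparison. This is a small sketch that reuses the sentence from the example above; the variable names greedy and lazy are chosen only for illustration.

```python
import re

line = 'Cats are smarter than dogs'

# Greedy: the second group grabs as much as it can while the rest still matches
greedy = re.match(r'(.*) are (.*) .*', line)
# Non-greedy: the second group stops at the first space that lets the rest match
lazy = re.match(r'(.*) are (.*?) .*', line)

print(greedy.group(2))
print(lazy.group(2))
```

The greedy version captures 'smarter than', while the non-greedy version captures only 'smarter'.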

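Lookbehind and lookahead (?<=...)XXX(?=...)

r'(?<=FrontString)YourString(?=BackString)'  matches only YourString

Example: suppose you want to find the string XXX, but only when the text before and after it meets certain conditions, e.g. only in AAXXXBB. In that case use r'(?<=AA)XXX(?=BB)'. Only XXX is matched; AA and BB are checked but are not part of the match.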
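
A quick check of this pattern, as a minimal sketch; the test strings other than AAXXXBB are made up for illustration.

```python
import re

pattern = re.compile(r'(?<=AA)XXX(?=BB)')

m = pattern.search('AAXXXBB')
print(m.group())   # only the XXX part is returned
print(m.span())    # the match covers indices 2 to 5, excluding AA and BB

# The match fails when either side does not meet the condition
wrong_front = pattern.search('ACXXXBB')
wrong_back = pattern.search('AAXXXBC')
print(wrong_front, wrong_back)
```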

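Backreferences \N (N is the group number)

r'<h([1-6])>.*?</h([1-6])>'   also matches mismatched tags such as <h1>...</h3>
r'<h([1-6])>.*?</h\1>'        does not match when the opening and closing tags differ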
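
The difference between the two patterns can be verified with a short sketch. Note that a single heading digit needs the character class [1-6] (parentheses around 1-6 would match the literal text "1-6"), and the closing tag is assumed here to carry a slash as in real HTML. The sample strings and the names loose and strict are made up for illustration.

```python
import re

# Without a backreference, the closing level is free to differ from the opening one
loose = re.compile(r'<h([1-6])>.*?</h([1-6])>')
# \1 must repeat exactly what group 1 captured
strict = re.compile(r'<h([1-6])>.*?</h\1>')

mismatched = '<h1>Title</h3>'
matched = '<h2>Title</h2>'

print(loose.search(mismatched) is not None)
print(strict.search(mismatched) is None)
print(strict.search(matched).group())
```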
