Regular expressions, part six: capture groups (grouping with parentheses and referring back to a group with \1)

Capture groups are mainly used when matching paired HTML tags in text.

They constrain the tag format: the closing tag must be the same as the opening tag.
Use () to capture a group, and \1, \2 to refer back to what each group matched.

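# Wrong approach: even when html_str = "<h1>hahaha</h2>", this pattern still matches and prints the string
# It does not enforce that the opening and closing tags are the same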
import re

html_str = "<h1>hahaha</h1>"
ret = re.match(r"<\w*>.*</\w*>", html_str)
print(ret.group())
# Correct approach: () creates a group, and \1 in the pattern refers to what the first group matched
import re

html_str = "<h1>hahaha</h1>"
ret = re.match(r"<(\w*)>.*</\1>", html_str)
print(ret.group())
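
To see the difference between the two patterns above, here is a minimal sketch that runs both against a mismatched tag pair. The helper name `matches` and the constants `LOOSE` and `STRICT` are made up for this illustration and are not part of the original examples.

```python
import re

LOOSE = r"<\w*>.*</\w*>"      # no backreference: any closing tag is accepted
STRICT = r"<(\w*)>.*</\1>"    # \1 forces the closing tag to repeat group 1


def matches(pattern, text):
    """Hypothetical helper: True if pattern matches at the start of text."""
    return re.match(pattern, text) is not None


bad = "<h1>hahaha</h2>"
good = "<h1>hahaha</h1>"

print(matches(LOOSE, bad))    # True  (the loose pattern lets the mismatch through)
print(matches(STRICT, bad))   # False (the backreference rejects it)
print(matches(STRICT, good))  # True
```

Note that `re.match` returns `None` when nothing matches, so calling `.group()` directly on the result is only safe when a match is guaranteed.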
# Correct approach: with several groups, \1 and \2 are numbered by the order of their opening parentheses
import re

html_str = "<body><h1>hahaha</h1></body>"
ret = re.match(r"<(\w*)><(\w*)>.*</\2></\1>", html_str)
print(ret.group())
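
The group numbering can be checked by printing each group separately. This is a small sketch built on the same pattern and input as above; the `None` check is an addition for safety.

```python
import re

html_str = "<body><h1>hahaha</h1></body>"
ret = re.match(r"<(\w*)><(\w*)>.*</\2></\1>", html_str)

if ret is not None:        # re.match returns None when nothing matches
    print(ret.group(0))    # <body><h1>hahaha</h1></body>  (the whole match)
    print(ret.group(1))    # body  (first opening parenthesis)
    print(ret.group(2))    # h1    (second opening parenthesis)
    print(ret.groups())    # ('body', 'h1')
```

Because the inner tag closes first, the pattern refers to \2 before \1.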

When there are many groups, you can give each group a name and then refer back to it by that name:

(?P<name>...)   # syntax for naming a group (note that P is uppercase)
(?P=name)       # syntax for referring back to the named group
import re

html_str = "<body><h1>hahaha</h1></body>"
ret = re.match(r"<(?P<p1>\w*)><(?P<p2>\w*)>.*</(?P=p2)></(?P=p1)>", html_str)
print(ret.group())
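
Named groups can also be read back from the match object by name. The following sketch reuses the pattern above; storing it in a variable called `pattern` is just for readability.

```python
import re

html_str = "<body><h1>hahaha</h1></body>"
pattern = r"<(?P<p1>\w*)><(?P<p2>\w*)>.*</(?P=p2)></(?P=p1)>"
ret = re.match(pattern, html_str)

if ret is not None:          # re.match returns None when nothing matches
    print(ret.group("p1"))   # body
    print(ret.group("p2"))   # h1
    print(ret.groupdict())   # {'p1': 'body', 'p2': 'h1'}
    print(ret.group(1))      # body  (named groups keep their numbers too)
```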

Origin blog.csdn.net/Jacky_kplin/article/details/104744999