Table of contents
1. Match the mobile phone number in the text
1.1 Pattern description
1.2 Verify whether the mobile phone number is legal in Python
1.3 In HiveSQL, query whether a field contains a mobile phone number and replace it
2. Match lines that do not contain a certain string
2.1 Pattern description
2.2 Use regular expressions in Python to match lines that do not contain a certain string
1. Match the mobile phone number in the text
1.1 Pattern description
pattern = r"^1[3-9]\d{9}$"
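Explanation:
This regular expression matches a string that starts with the digit 1, followed by one digit from 3 to 9, followed by exactly nine more digits, for 11 digits in total. The `^` and `$` anchors mean the whole string must be the number, so the pattern validates a standalone value rather than finding numbers inside longer text. It matches most mainland China mobile phone numbers.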
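Since the anchored pattern only accepts a string that is exactly one number, pulling numbers out of running text needs a slightly different form. The following is a minimal sketch, assuming a number counts as a match only when it is not glued to other digits; the names `EMBEDDED_MOBILE` and `find_mobile_numbers` are illustrative and not part of the original.

```python
import re

# Assumption: an embedded number must not be preceded or followed by a digit,
# so the anchors are replaced with negative lookarounds.
EMBEDDED_MOBILE = re.compile(r"(?<!\d)1[3-9]\d{9}(?!\d)")


def find_mobile_numbers(text: str) -> list[str]:
    """Return every 11-digit mobile-like number found in text."""
    return EMBEDDED_MOBILE.findall(text)


sample = "Call 13812345678 or 15900001111; order id 2138123456789 is not a phone."
print(find_mobile_numbers(sample))  # ['13812345678', '15900001111']
```

The lookarounds keep the 13-digit order id from producing a false match on its inner digits.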
1.2 Verify whether the mobile phone number is legal in Python
import re
phone_number = "13812345678"
pattern = r"^1[3-9]\d{9}$"
if re.match(pattern, phone_number):
    print("The mobile phone number is valid")
else:
    print("The mobile phone number is invalid")
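One detail worth knowing when validating user input: in Python, `$` also matches just before a trailing newline, and `\d` matches non-ASCII digits by default. A stricter variant, sketched here with the hypothetical helper name `is_valid_mobile`, uses `re.fullmatch` together with the `re.ASCII` flag.

```python
import re

# re.ASCII restricts \d to 0-9; fullmatch requires the entire string to match,
# so no ^ or $ anchors are needed.
STRICT_MOBILE = re.compile(r"1[3-9]\d{9}", re.ASCII)


def is_valid_mobile(phone_number: str) -> bool:
    """Return True only if phone_number is exactly an 11-digit mobile number."""
    return STRICT_MOBILE.fullmatch(phone_number) is not None


print(is_valid_mobile("13812345678"))    # True
print(is_valid_mobile("13812345678\n"))  # False: trailing newline is rejected
print(is_valid_mobile("12812345678"))    # False: second digit must be 3-9
```

Whether the looser `re.match` behavior matters depends on where the input comes from; for values read from files or forms, stripping whitespace first is a common alternative.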
1.3 In HiveSQL, query whether a field contains a mobile phone number and replace it
select
    regexp_replace(json, 'desmobile:1[3-9][0-9]{9}', 'desmobile:') -- mask the mobile phone number
from tableName
where ds = '20230411'
    and regexp_like(json, 'desmobile:1[3-9][0-9]{9}') -- keep only records that contain a mobile phone number
limit 2
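Before running the query on a large partition, it can help to try the same pattern locally. The sketch below mirrors the filter and the replacement in Python; it assumes the field is stored as plain `desmobile:<number>` text as in the query, and the sample row and function names are made up for illustration.

```python
import re

# Same pattern as in the query; this simple expression behaves the same way
# in Python's re module as in the SQL regex functions.
MOBILE_FIELD = re.compile(r"desmobile:1[3-9][0-9]{9}")


def contains_mobile(json_text: str) -> bool:
    """Mirror of the filter condition: does the field hold a mobile number?"""
    return MOBILE_FIELD.search(json_text) is not None


def mask_mobile(json_text: str) -> str:
    """Mirror of regexp_replace: drop the number but keep the key."""
    return MOBILE_FIELD.sub("desmobile:", json_text)


row = "{desmobile:13812345678,city:Hangzhou}"  # hypothetical sample record
print(contains_mobile(row))  # True
print(mask_mobile(row))      # {desmobile:,city:Hangzhou}
```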
2. Match lines that do not contain a certain string
2.1 Pattern description
^(?!.*pattern).*
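Here, `pattern` is the string you want to exclude. `(?!.*pattern)` is a negative lookahead: checked at the start of the line, it fails if `pattern` appears anywhere later in that line. The expression as a whole therefore matches any line, provided the line does not contain `pattern`.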
2.2 Use regular expressions in Python to match lines that do not contain a certain string
import re
text = """This is a line
This is another line
This line contains the word apple
This line does not contain the word banana
"""
pattern = r"^(?!.*banana).*$"
# splitlines() avoids the empty trailing element that split("\n") would produce
for line in text.splitlines():
    if re.match(pattern, line):
        print(line)
This code prints the following:
```
This is a line
This is another line
This line contains the word apple
```
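As you can see, only the lines that do not contain the string `banana` are matched. The fourth line is dropped because it contains the word `banana`, regardless of what the sentence says.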
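If the excluded string comes from user input or contains characters such as `.` or `+`, it should be escaped before being placed inside the lookahead, otherwise those characters act as regex operators. A minimal sketch, with the hypothetical helper name `lines_without`:

```python
import re


def lines_without(text: str, excluded: str) -> list[str]:
    """Return the lines of text that do not contain the literal string excluded."""
    # re.escape turns characters such as "." into literals inside the lookahead.
    pattern = re.compile(rf"^(?!.*{re.escape(excluded)}).*$")
    return [line for line in text.splitlines() if pattern.match(line)]


sample = "a.b\naxb\nplain"
print(lines_without(sample, "a.b"))  # ['axb', 'plain']
```

Without `re.escape`, the `.` in `a.b` would match any character and `axb` would be excluded as well.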