Python [re.sub() + replace()]: replacing part of a string

Replace [abc_4.html] or [abc.html] with [abc_all.html]

a = 'www.pcauto.com.cn/8725632_4.html'
b = 'www.pcauto.com.cn/8725632.html'

import re

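# Raw string, so that \d and \. reach the regex engine unchanged
pattern = r'/\d{7}([_\d]*\.html)'
repl = '_all.html'

# group(1) is the tail to be replaced: '_4.html' or '.html'
aa = re.search(pattern, a).group(1)
bb = re.search(pattern, b).group(1)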

print(aa)
print(bb)

aaa = a.replace(aa, repl)
bbb = b.replace(bb, repl)

print(aaa)
print(bbb)

Output

_4.html
.html
www.pcauto.com.cn/8725632_all.html
www.pcauto.com.cn/8725632_all.html
  • Simplified version
a = 'www.arye.com.cn/8725632_4.html'

import re

aa = re.search(r'/\d{7}([_\d]*\.html)', a).group(1)

aaa = a.replace(aa, '_all.html')

print(aaa)
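
The search-then-replace steps above can also be collapsed into a single re.sub() call. The following is only a sketch, not the original author's code: it assumes the ID is always exactly 7 digits and that the optional suffix has the form "_N". The names urls, id_pattern and results are made up for illustration.

```python
import re

urls = [
    'www.pcauto.com.cn/8725632_4.html',
    'www.pcauto.com.cn/8725632.html',
]

# Group 1 keeps the slash and the 7-digit ID.
# The optional "_N" suffix and ".html" are matched but not captured,
# so they are dropped and replaced by "_all.html".
id_pattern = re.compile(r'(/\d{7})(?:_\d+)?\.html')

results = [id_pattern.sub(r'\g<1>_all.html', url) for url in urls]

for result in results:
    print(result)
```

Because the replacement is tied to the regex match itself, this avoids one pitfall of str.replace(), which replaces every occurrence of the captured substring wherever it appears in the URL.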

Converting request headers to key-value pairs

headers = '''
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0
Accept: */*
Accept-Language: zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2
Accept-Encoding: gzip, deflate, br
Referer: https://blog.csdn.net/u011054333/article/details/70151857
Content-Type: text/plain;charset=UTF-8
Origin: https://blog.csdn.net
Connection: keep-alive
'''.strip()
import re
# Non-greedy key: split at the first ': ' on each line, even if the value contains ': '
for kv in re.findall('(.*?): (.*)', headers):
    print("'%s': '%s'" % kv)

Output

'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0'
'Accept': '*/*'
'Accept-Language': 'zh-CN,zh;q=0.8,zh-TW;q=0.7,zh-HK;q=0.5,en-US;q=0.3,en;q=0.2'
'Accept-Encoding': 'gzip, deflate, br'
'Referer': 'https://blog.csdn.net/u011054333/article/details/70151857'
'Content-Type': 'text/plain;charset=UTF-8'
'Origin': 'https://blog.csdn.net'
'Connection': 'keep-alive'
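
If the goal is a dict object rather than printed text to paste into source code, the same header block can be split without a regex. This is a minimal sketch under the assumption that every line has the form "Name: value". The names headers_text and headers_dict are illustrative, and the header list is shortened.

```python
headers_text = '''
Accept: */*
Accept-Encoding: gzip, deflate, br
Origin: https://blog.csdn.net
Connection: keep-alive
'''.strip()

# Split each line at the first ': ' only, so values that contain
# colons (such as URLs) stay intact.
headers_dict = dict(
    line.split(': ', 1)
    for line in headers_text.splitlines()
    if ': ' in line
)

print(headers_dict)
```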

Converting URL parameters to a dict

params = '''
p_id=11000
c_id=11100
'''.strip()
import re
# Each line yields one (key, value) tuple; dict() turns the list of tuples into a dict
params_dict = dict(re.findall('(.*)=(.*)', params))
print(params_dict)

Output

{'p_id': '11000', 'c_id': '11100'}
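
When the parameters are still part of a URL query string, joined with "&" rather than listed one per line, the standard library's urllib.parse can do the splitting. This is a sketch only: the URL is a hypothetical example, and the names url, query and query_dict are made up for illustration.

```python
from urllib.parse import parse_qsl, urlsplit

# Hypothetical URL carrying the same two parameters
url = 'https://example.com/list?p_id=11000&c_id=11100'

# urlsplit() isolates the part after '?'; parse_qsl() returns (key, value) pairs
query = urlsplit(url).query
query_dict = dict(parse_qsl(query))

print(query_dict)
```

Note that parse_qsl() also decodes percent-escapes in keys and values, which the regex approach does not.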

Reposted from blog.csdn.net/Yellow_python/article/details/81984313