Using re.sub in Python

import re
text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on…
print(re.sub(r'is\s+', '-', text))
JGood -a handsome boy, he -cool, clever, and so on…
print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy, he is cool, clever, and so on…
text = "JGood is a handsome boy , he is cool , clever , and so on…"
print(re.sub(r'\s+,\s+', ',', text))
JGood is a handsome boy,he is cool,clever,and so on…
Many references describe it as follows:

re.sub
  re.sub replaces the matches of a pattern in a string. The example below replaces each run of whitespace in the string with '-':

import re

text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))

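The signature of re.sub is: re.sub(pattern, repl, string, count=0, flags=0)

The second argument, repl, is the replacement string; in this example it is '-'.

The fourth argument, count, is the maximum number of replacements. It defaults to 0, which means every match is replaced.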
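To make the effect of count concrete, here is a minimal sketch. The sample string is made up for illustration and is not from the references quoted above.

```python
import re

text = "a  b   c d"

# count=0 (the default) replaces every match.
all_replaced = re.sub(r"\s+", "-", text)
print(all_replaced)  # a-b-c-d

# count=1 replaces only the first (leftmost) match.
first_only = re.sub(r"\s+", "-", text, count=1)
print(first_only)    # a-b   c d
```

Passing count by keyword keeps the call readable and avoids confusing it with flags.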

re.sub also accepts a function as repl, for more complex processing of each match. For example, re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, 0) replaces each space ' ' in the string with '[ ]'.
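A small sketch of the callable form may help. The function name bracket and the shortened sample sentence are assumptions made for illustration; the second call is just one more example of what a replacement function could compute.

```python
import re

text = "JGood is cool"

def bracket(match):
    # match.group(0) is the whole matched substring (here, one whitespace character).
    return "[" + match.group(0) + "]"

bracketed = re.sub(r"\s", bracket, text)
print(bracketed)  # JGood[ ]is[ ]cool

# The callable can return any string derived from the match,
# for example the upper-cased form of each all-lowercase word.
shouted = re.sub(r"\b[a-z]+\b", lambda m: m.group(0).upper(), text)
print(shouted)    # JGood IS COOL
```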

I tried it myself, and it did replace the spaces in the sentence with "-":

text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on…

Out of curiosity, I changed r'\s+' to r'is\s+'. This time each "is", together with the whitespace after it, was replaced with '-':

text = "JGood is a handsome boy, he is cool, clever, and so on…"

print(re.sub(r'is\s+', '-', text))
JGood -a handsome boy, he -cool, clever, and so on…
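One caveat with a pattern like r'is\s+': it also matches an "is" that ends a longer word. The sketch below uses a made-up sentence to show this, and one possible way to restrict the match to the standalone word with \b.

```python
import re

text = "this is a test"

# Without a boundary, the "is" at the end of "this" matches as well.
loose = re.sub(r"is\s+", "-", text)
print(loose)   # th--a test

# \b requires a word boundary before "is", so only the standalone word matches.
strict = re.sub(r"\bis\s+", "-", text)
print(strict)  # this -a test
```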

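My own open-source project uses re.sub(r'\s+,\s+', ', ', text). Was it really replacing a comma with a comma? That would be pointless. Curious, I kept experimenting, this time with a period:

print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy, he is cool, clever, and so on…

Indeed, with this example sentence nothing changes, because it contains no period preceded by whitespace.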

Not ready to give up, I modified the example sentence by adding a few more commas:

text = "JGood is a handsome boy, , , he is cool, clever, and so on…"
print(re.sub(r'\s+,\s+', ', ', text))
JGood is a handsome boy,, , he is cool, clever, and so on…

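The three commas were all still there; only the spaces had changed.
So I kept exploring: I added a space before each comma in the original sentence and ran the experiment again.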

text = "JGood is a handsome boy , he is cool , clever , and so on…"
print(re.sub(r'\s+,\s+', ',', text))
JGood is a handsome boy,he is cool,clever,and so on…

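Aha, so it removes the whitespace before and after each ",". That made the role of pattern in re.sub(pattern, repl, string, count) clear: re.sub finds every substring of text that matches pattern and replaces it with repl; everything outside the matches is left unchanged.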
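One way to check which substrings are actually replaced is to list the matches with re.findall and count the replacements with re.subn. This is a minimal sketch on a shortened, made-up version of the sentence, not part of the original experiments.

```python
import re

text = "boy , he is cool , clever"

# findall lists exactly the substrings that sub would replace.
matches = re.findall(r"\s+,\s+", text)
print(matches)    # [' , ', ' , ']

# subn returns the new string together with the number of replacements made.
result, n = re.subn(r"\s+,\s+", ",", text)
print(result, n)  # boy,he is cool,clever 2
```

Each matched substring includes the comma itself plus the whitespace on both sides, which is why the replacement string has to contain the comma again.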

To verify once more, I changed the comma after "clever" to space + period + space:

text = "JGood is a handsome boy , he is cool , clever . and so on…"
print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy , he is cool , clever. and so on…

Clearly, the space after "clever" and before the period has been removed.
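One detail worth keeping in mind with period patterns: in a regular expression an unescaped . matches any character (except a newline), so a literal period has to be written as \. to get the behaviour described here. A minimal sketch contrasting the two, on a made-up sample string:

```python
import re

text = "cool , clever . and so on"

# Unescaped dot: whitespace followed by ANY character is replaced.
unescaped = re.sub(r"\s+.", ".", text)
print(unescaped)  # cool..lever..nd.o.n

# Escaped dot: only whitespace followed by a literal period is replaced.
escaped = re.sub(r"\s+\.", ".", text)
print(escaped)    # cool , clever. and so on
```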

Reposted from blog.csdn.net/work_you_will_see/article/details/84635756