Using re.sub in Python

import re
text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on…
print(re.sub(r'is\s+', '-', text))
JGood -a handsome boy, he -cool, clever, and so on…
print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy, he is cool, clever, and so on…
text = "JGood is a handsome boy , he is cool , clever , and so on…"
print(re.sub(r'\s+,\s+', ',', text))
JGood is a handsome boy,he is cool,clever,and so on…
Many references introduce it as follows:

re.sub
re.sub replaces the parts of a string that match a pattern. The following example replaces each run of whitespace in the string with '-':

import re

text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))

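The signature of re.sub is: re.sub(pattern, repl, string, count=0, flags=0)

The second parameter, repl, is the replacement string; in this example it is '-'.

The fourth parameter, count, is the maximum number of replacements. It defaults to 0, which means every match is replaced.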
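
As a small illustration of the count parameter, the sketch below runs the same pattern twice on the example sentence, once with the default and once limited to two replacements. The variable names are only for illustration.

```python
import re

text = "JGood is a handsome boy, he is cool, clever, and so on…"

# count=0 (the default) replaces every match.
all_replaced = re.sub(r"\s+", "-", text)

# count=2 replaces only the first two matches, scanning left to right.
first_two = re.sub(r"\s+", "-", text, count=2)

print(all_replaced)  # JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on…
print(first_two)     # JGood-is-a handsome boy, he is cool, clever, and so on…
```

Passing count by keyword keeps the call readable and avoids confusing it with flags.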

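re.sub also accepts a function as repl, which allows more complex processing of each match. For example, re.sub(r'\s', lambda m: '[' + m.group(0) + ']', text, count=0) replaces each space ' ' in the string with '[ ]'.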
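
Here is a minimal sketch of a function used as repl. The helper name bracket and the shortened sentence are made up for this example; the second call is an extra illustration that is not from the references quoted above.

```python
import re

text = "JGood is a handsome boy"

def bracket(match):
    # match.group(0) is the full text matched by the pattern.
    return "[" + match.group(0) + "]"

# The function is called once per match; its return value is the replacement.
bracketed = re.sub(r"\s", bracket, text)
print(bracketed)  # JGood[ ]is[ ]a[ ]handsome[ ]boy

# Extra illustration: upper-case the first character of every word.
capitalised = re.sub(r"\b\w", lambda m: m.group(0).upper(), text)
print(capitalised)  # JGood Is A Handsome Boy
```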

I tried it myself, and it does replace every " " in the sentence with "-":

text = "JGood is a handsome boy, he is cool, clever, and so on…"
print(re.sub(r'\s+', '-', text))
JGood-is-a-handsome-boy,-he-is-cool,-clever,-and-so-on…

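Out of curiosity, I changed r'\s+' to r'is\s+'. The result is that each "is" in the original sentence, together with the whitespace after it, becomes "-":

text = "JGood is a handsome boy, he is cool, clever, and so on…"

print(re.sub(r'is\s+', '-', text))
JGood -a handsome boy, he -cool, clever, and so on…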

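An open-source project I work with calls re.sub(r'\s+,\s+', ', ', text). Does that just replace "," with a comma? That would be pointless, so I got curious and kept experimenting. The result:

print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy, he is cool, clever, and so on…

Indeed, with this example sentence nothing changes.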

Not ready to give up, I modified the example sentence and added a few more commas.

text = "JGood is a handsome boy, , , he is cool, clever, and so on…"
print(re.sub(r'\s+,\s+', ', ', text))
JGood is a handsome boy,, , he is cool, clever, and so on…

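I found that none of the three commas was lost; only the spaces changed.
So I kept exploring: I added a space before each comma in the original sentence and tried again.

text = "JGood is a handsome boy , he is cool , clever , and so on…"
print(re.sub(r'\s+,\s+', ',', text))
JGood is a handsome boy,he is cool,clever,and so on…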

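Aha, so it removes the spaces before and after each ",". That made the role of pattern in re.sub(pattern, repl, string, count) click: re.sub finds every substring of text that matches pattern and replaces the whole matched substring with repl. Here each match is the comma together with the whitespace around it, and repl is a bare comma, so the net effect is that the whitespace disappears.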
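
To make this concrete, the sketch below lists exactly what the pattern matches before substituting. It uses re.finditer from the standard library; the variable names are only for illustration.

```python
import re

text = "JGood is a handsome boy , he is cool , clever , and so on…"
pattern = r"\s+,\s+"

# Each match covers the comma together with the whitespace on both sides.
matches = [m.group(0) for m in re.finditer(pattern, text)]
print(matches)  # [' , ', ' , ', ' , ']

# The whole match is replaced by repl, so the surrounding spaces disappear.
tight = re.sub(pattern, ",", text)
print(tight)  # JGood is a handsome boy,he is cool,clever,and so on…

# With ", " as repl, each match becomes a comma followed by one space.
normalised = re.sub(pattern, ", ", text)
print(normalised)  # JGood is a handsome boy, he is cool, clever, and so on…
```

Note that a comma with no whitespace in front of it is not matched at all, because \s+ requires at least one whitespace character on each side.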

To verify once more, I changed the comma after "clever" to space + period + space.

text = "JGood is a handsome boy , he is cool , clever . and so on…"
print(re.sub(r'\s+\.', '.', text))
JGood is a handsome boy , he is cool , clever. and so on…

Clearly, the space between "clever" and the period has been removed.
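
One detail matters when matching a period: in a regular expression an unescaped "." matches any character (except a newline by default), so a literal period has to be written as "\.". The short comparison below is a sketch on a shortened, made-up sentence.

```python
import re

text = "clever . and so on"

# Escaped dot: matches whitespace followed by a literal period only.
escaped = re.sub(r"\s+\.", ".", text)
print(escaped)  # clever. and so on

# Unescaped dot: matches whitespace followed by ANY character,
# so the first character after every space is overwritten as well.
unescaped = re.sub(r"\s+.", ".", text)
print(unescaped)  # clever..nd.o.n
```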
