Python: separating English words and punctuation marks with spaces

Contents

Problem description:

Solution:


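Problem description:

English sentences normally use spaces to separate words, but there is no space between a punctuation mark and the word next to it. In particular, two words joined by an apostrophe (') are mistaken for a single word.
The goal of this program is to split an English sentence strictly into words and punctuation marks, with a space as the separator between every unit.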
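
To make the problem concrete, here is a minimal sketch of what plain whitespace splitting does. The short sample sentence is made up for illustration and is not the one used later in this post.

```python
# Illustrative sentence (an assumption, not from the original post)
sample = "The camera's speed, is fast (really)."

# str.split() with no arguments splits on runs of whitespace only,
# so punctuation stays glued to the neighbouring word
tokens = sample.split()
print(tokens)
# ['The', "camera's", 'speed,', 'is', 'fast', '(really).']
```

Note that "camera's", "speed," and "(really)." each come back as a single token, which is exactly the behaviour the program below tries to avoid.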

Solution:

Given the following sentence:

sentence = "The Sony A7 III's write speed, is faster than my previous camera (the Canon EOS've Rebel T6) allowing me."

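import re

sentence = "The Sony A7 III's write speed, is !faster than my previous camera (the Canon EOS've Rebel T6) allowing me."

# Characters that should be separated from the adjacent word by a space.
# '-' already covers '--', so a separate two-character entry is not needed.
special_chars = [',', '.', '\'', '’', '“', '”', '(', ')', '[', ']', '{', '}', ':', ';', '?', '!', '-']

for char in special_chars:
    # re.escape keeps characters such as '[', ']' and '(' literal in the pattern
    pattern = f'({re.escape(char)})'
    if char == '(':  # special case: the space goes after a left parenthesis
        sentence = re.sub(pattern, r'\1 ', sentence)
    else:  # every other character gets a space in front of it
        sentence = re.sub(pattern, r' \1', sentence)

print(sentence)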

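Output:

The Sony A7 III 's write speed , is  !faster than my previous camera ( the Canon EOS 've Rebel T6 ) allowing me .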
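
The same idea can also be packaged as a reusable function that uses two precompiled character classes instead of a loop. This is only a sketch under some assumptions: the name `split_punctuation`, the grouping into `OPENERS` and `OTHERS`, and the decision to treat `[`, `{` and `“` like `(` are illustrative choices, not part of the program above. It also collapses repeated spaces, which the loop version does not do.

```python
import re

# Assumption: every opening bracket/quote is handled like '(' (space after it)
OPENERS = "([{“"
# All remaining punctuation marks get a space in front of them
OTHERS = ",.'’”)]}:;?!-"

# re.escape makes characters such as ']' and '-' safe inside a character class
_OPEN_RE = re.compile("([" + re.escape(OPENERS) + "])")
_OTHER_RE = re.compile("([" + re.escape(OTHERS) + "])")


def split_punctuation(text):
    """Hypothetical helper: separate punctuation from words with spaces."""
    text = _OPEN_RE.sub(r"\1 ", text)    # "(the" -> "( the"
    text = _OTHER_RE.sub(r" \1", text)   # "speed," -> "speed ,"
    # Collapse any runs of whitespace introduced by the substitutions
    return " ".join(text.split())


sentence = "The Sony A7 III's write speed, is !faster than my previous camera (the Canon EOS've Rebel T6) allowing me."
result = split_punctuation(sentence)
print(result)
# The Sony A7 III 's write speed , is !faster than my previous camera ( the Canon EOS 've Rebel T6 ) allowing me .
```

As in the loop version, a mark that comes before a word (the "!" in "!faster") only receives a space in front of it, so it stays attached to the following word. Calling `result.split()` afterwards gives the list of tokens.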


Reposted from blog.csdn.net/weixin_41862755/article/details/130636572