Python: stripping HTML tags and common web whitespace from a string

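import re

def clean(string):
    # Strip HTML tags such as <p> or <br/>.
    pattern = re.compile(r'<[^>]+>', re.S)
    string = pattern.sub('', string)
    # Turn newlines, carriage returns, &nbsp; entities, and tabs into spaces, then drop every space.
    string = string.replace('\n', ' ').replace('\r', ' ').replace('&nbsp;', ' ').replace('\t', ' ').replace(" ", '')
    string = string.strip()
    return string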
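
As written, clean() deletes every space, so text such as "Hello world" comes back as "Helloworld". That suits text without word spacing, but it may not be what you want for space-separated languages. The sketch below is one possible variant, not part of the original post: the name clean_keep_spaces and the use of html.unescape are assumptions made for illustration. It replaces tags with a space, decodes entities, and collapses each run of whitespace into a single space.

```python
import re
from html import unescape

TAG_RE = re.compile(r'<[^>]+>')   # same tag pattern as clean()
WS_RE = re.compile(r'\s+')        # any run of whitespace, including \xa0

def clean_keep_spaces(text):
    """Hypothetical variant of clean(): strip tags but keep one space between words."""
    text = TAG_RE.sub(' ', text)  # replace tags with a space so adjacent words do not merge
    text = unescape(text)         # decode entities such as &nbsp; and &amp;
    text = WS_RE.sub(' ', text)   # collapse newlines, tabs, and repeated spaces
    return text.strip()

print(clean_keep_spaces('<p>Hello&nbsp;&nbsp;world</p>\n<p>second\tline</p>'))
# prints: Hello world second line
```

Which behavior is right depends on the input: removing all spaces is simplest when word boundaries carry no meaning, while collapsing whitespace keeps the result readable for English-like text.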

Origin www.cnblogs.com/yp19970/p/12743741.html