Removing data outside the utf8 range

Error: SyntaxError: Non-UTF-8 code starting with '\x..' in file ...

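# The error message reports the bytes it cannot handle, e.g.
# "Incorrect string value: '\\xF0\\xAB\\x96\\xAF\\xE7\\x9A...'".
# List those byte sequences and replace each occurrence with '?'.
# Each sequence is a 4-byte character plus the first two bytes of the next one;
# the byte left behind decodes to U+FFFD because of errors='replace'.
errorbytes = [b'\xF0\xAB\x96\xAF\xE7\x9A', b'\xF0\xA8\xA8\x97\xEF\xBC']
for eb in errorbytes:
    data['intro'] = [x.encode('utf8', errors='replace').replace(eb, b'?').decode('utf8', errors='replace')
                     for x in list(data['intro'])]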
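
The snippet above only handles the byte sequences that have already shown up in an error message. A more general variant, given here only as a minimal sketch, replaces every character that needs 4 bytes in UTF-8 (any code point above U+FFFF). It assumes the target column uses a 3-byte utf8 charset, which the "Incorrect string value" message suggests but the original post does not state. The name replace_four_byte_chars is hypothetical and not part of the original code.

```python
import re

# Any code point above U+FFFF is encoded as 4 bytes in UTF-8.
_NON_BMP = re.compile(r'[\U00010000-\U0010FFFF]')


def replace_four_byte_chars(text, placeholder='?'):
    """Return text with every 4-byte UTF-8 character replaced by placeholder.

    Hypothetical helper for illustration; not taken from the original post.
    """
    return _NON_BMP.sub(placeholder, text)


# b'\xF0\xAB\x96\xAF' is the 4-byte character from the error message above.
sample = 'ab' + b'\xF0\xAB\x96\xAF'.decode('utf8') + 'cd'
cleaned = replace_four_byte_chars(sample)
print(cleaned)
```

If data is a pandas DataFrame, as the original code appears to assume, the helper could be applied with data['intro'] = data['intro'].map(replace_four_byte_chars). Working on whole characters rather than raw bytes avoids cutting the following character in half.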

Reposted from www.cnblogs.com/xl717/p/11547266.html