A perfect fix for Python's "UnicodeEncodeError: 'gbk' codec can't encode character '\x..' in position 0"

Recently I ran into this error in several projects. After reading a lot of documentation and blog posts, the first solution I used was:
 

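        from bs4 import BeautifulSoup

        # Re-read the saved page as UTF-8 instead of the platform default (GBK)
        with open(r"demo.txt", "r", encoding="utf-8") as demo:
            soup = BeautifulSoup(demo.read(), 'html.parser')
        html_data = soup.find('div', id="J_goodsList")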

This is the most basic approach: write the page to a file, then read the file back in.
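
The write step is not shown above, so here is a minimal sketch of the full round trip. It assumes the error comes from `open()` falling back to the locale encoding (GBK on a Chinese-language Windows system) when no `encoding` is given. The function name `save_then_reload` and the sample HTML are made up for illustration.

```python
import os
import tempfile

def save_then_reload(html, path):
    """Write html as UTF-8, then read it back as UTF-8 (hypothetical helper)."""
    # An explicit encoding avoids the platform default, which may be GBK.
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    with open(path, "r", encoding="utf-8") as f:
        return f.read()

# Sample page containing \xa0, a character that GBK cannot encode.
html = '<div id="J_goodsList">price\xa0100</div>'
with tempfile.TemporaryDirectory() as tmp:
    reloaded = save_then_reload(html, os.path.join(tmp, "demo.txt"))
```

The text survives the round trip unchanged because both the write and the read use the same encoding.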

Later I found a simpler method:

text = text.replace('\xaf', '')

This replaces the offending character with an empty string before the text is printed, so the output no longer fails on it.

But yesterday I ran into an HTML source that contained many different characters of this kind, which gave me a headache.

So I started trying to catch the exception:

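for text in texts:  # texts: the strings to print (hypothetical name)
    try:
        print(text.replace('\xaf', ''))
    except UnicodeEncodeError:
        continue  # skip any string that still cannot be encoded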
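
For comparison, here is a sketch of a variant that drops only the unencodable characters instead of skipping the whole string. It is not the approach used in this post. It relies on the standard `errors="ignore"` option of `str.encode`, and the name `drop_unencodable` and the sample string are made up for illustration.

```python
def drop_unencodable(text, encoding="gbk"):
    """Remove every character that the given codec cannot encode."""
    # errors="ignore" silently discards characters with no mapping in the codec.
    return text.encode(encoding, errors="ignore").decode(encoding)

# \xa0 (no-break space) and the emoji have no GBK mapping; the rest does.
sample = "价格\xa0100\U0001f600"
cleaned = drop_unencodable(sample)
```

This still loses information (the dropped characters), just less of it than skipping the entire string.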

Although skipping on error made the problem go away, a lot of output was lost, which was very unsatisfying. This morning I read more documentation and blog posts and found a solution that gives the best of both worlds:

 

You can change the encoding in PyCharm's File Encodings settings (typically to UTF-8). It is so simple that you will laugh at all the roundabout things you did before. That's it.
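
Changing the encoding works because the failure comes from the output stream's codec, not from the text itself. The sketch below illustrates this with in-memory streams; the helper `can_write` is hypothetical. As an assumption about setups outside PyCharm, it also shows `sys.stdout.reconfigure`, which is available on Python 3.7 and later.

```python
import io
import sys

def can_write(text, encoding):
    """Return True if text can be written to a stream using this encoding."""
    stream = io.TextIOWrapper(io.BytesIO(), encoding=encoding)
    try:
        stream.write(text)
        stream.flush()
        return True
    except UnicodeEncodeError:
        return False

sample = "price\xa0100"  # \xa0 (no-break space) has no GBK mapping
gbk_ok = can_write(sample, "gbk")     # the GBK stream rejects the text
utf8_ok = can_write(sample, "utf-8")  # the UTF-8 stream accepts it

# Outside PyCharm, a similar switch can be made from code (Python 3.7+).
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")
```

Whether the characters then display correctly still depends on the terminal or console that receives the UTF-8 bytes.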


Origin blog.csdn.net/m0_62945506/article/details/122473875