UnicodeEncodeError: 'gbk' codec can't encode character '\xa9' in position 3738: illegal......

For today's classwork I had to download the HTML source of a web page to a local file, and I ran into a gbk encoding problem. The code created the new text file locally, but the file was empty. After reading a lot of blog posts found through Baidu, I finally solved the problem.

Before the fix

import requests

res = requests.get('https://localprod.pandateacher.com/python-manuscript/crawler-html/spider-men5.0.html')
book = res.text
print(book)

p = open('E:\\Mypy\\练习作品\\这个书苑不太冷.txt','a+')
p.write(book)
p.close()
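
The failure can be reproduced without any network request. The snippet below is a minimal sketch, assuming the downloaded page contains the copyright sign '\xa9' (the character named in the error message); the sample string is made up for illustration.

```python
# Hypothetical sample text containing the copyright sign U+00A9,
# the character reported in the error message.
text = "Copyright \xa9 example"

try:
    # This is roughly what write() does internally when the file
    # was opened with the gbk encoding.
    text.encode("gbk")
    gbk_ok = True
except UnicodeEncodeError as exc:
    gbk_ok = False
    print(exc)

# UTF-8 can represent every Unicode character, so this succeeds.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)
```

Because the exception is raised inside write(), nothing reaches the file, which would explain why the text file is created but stays empty.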

Solution

Change

p = open('E:\\Mypy\\练习作品\\这个书苑不太冷.txt','a+')

to

p = open('E:\\Mypy\\练习作品\\这个书苑不太冷.txt','a+',encoding='utf-8')
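
The same fix can also be written with a with-statement, so the file is closed even if write() fails. This is only a sketch: the helper name save_text is hypothetical and not part of the original script, and it uses append mode 'a' where the original used 'a+'.

```python
def save_text(path, text, mode="a"):
    """Append text to the file at path, always encoding it as UTF-8.

    save_text is a hypothetical helper introduced for illustration.
    """
    # The with-statement closes the file automatically, even when
    # write() raises an exception.
    with open(path, mode, encoding="utf-8") as f:
        f.write(text)
```

In the script above, the three lines that open, write and close the file could then be replaced by one call: save_text('E:\\Mypy\\练习作品\\这个书苑不太冷.txt', book).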

The result of running the fixed code is as follows:
[Image: screenshot of the result]


Summary

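When open() is called in text mode without an encoding argument, Python uses the system locale's preferred encoding. On Chinese-locale Windows that is gbk (cp936), which cannot represent every character, for example '\xa9' (the copyright sign), so write() raises UnicodeEncodeError. Specifying encoding='utf-8' when opening the file is enough to fix this; otherwise the default encoding is used.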
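
To see which encoding open() would pick on a given machine, the locale module can be queried. This is a small sketch; the helper can_encode is a hypothetical name added for illustration, and the printed values depend on the system.

```python
import locale

# The encoding open() falls back to in text mode when no
# encoding argument is given. The value varies by system.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)


def can_encode(text, encoding):
    """Return True if text can be encoded with the given codec.

    can_encode is a hypothetical helper, not part of the original post.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Check whether the copyright sign fits the system default encoding.
print(can_encode("\xa9", default_encoding))
```

If the second line printed is False, any file that may receive such characters should be opened with an explicit encoding such as 'utf-8'.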


Origin blog.csdn.net/weixin_44641176/article/details/101925099