Garbled Chinese characters when writing an HTML file in Python
When you use the open function to write HTML fetched by a crawler to a file, the text may print correctly in the console while the Chinese characters in the written file come out garbled.
case analysis
Look at the following piece of code:
# Crawler that does not use cookies
from urllib import request

if __name__ == '__main__':
    url = "http://www.renren.com/967487029/profile"
    rsp = request.urlopen(url)
    html = rsp.read().decode()
    with open("rsp.html", "w") as f:
        # Print the crawled page and write it to the file
        print(html)
        f.write(html)
The code looks correct, and the HTML printed to the console shows the Chinese text properly, but the Chinese text in the created HTML file is garbled.
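One likely explanation is an encoding mismatch: the file is written in one encoding (for example GBK, a common locale default on Chinese-language Windows) and later read as UTF-8. The sketch below reproduces that effect offline. The sample string, the temporary path, and the choice of GBK are assumptions for illustration, not taken from the original crawler.

```python
import os
import tempfile

# Stand-in for the crawled page (hypothetical sample, not real crawler output).
text = "<p>中文</p>"
path = os.path.join(tempfile.mkdtemp(), "demo.html")

# Simulate a platform whose default text encoding is GBK.
with open(path, "w", encoding="gbk") as f:
    f.write(text)

# A browser or editor that assumes UTF-8 cannot recover the characters.
with open(path, "rb") as f:
    garbled = f.read().decode("utf-8", errors="replace")
print(garbled == text)  # False

# Writing and reading with the same encoding round-trips cleanly.
with open(path, "w", encoding="utf-8") as f:
    f.write(text)
with open(path, "r", encoding="utf-8") as f:
    restored = f.read()
print(restored == text)  # True
```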
solution
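Pass the encoding parameter to open, for example encoding="utf-8". Without it, open in text mode falls back to the platform's locale encoding (often GBK on Chinese-language Windows), so the bytes on disk may not match the encoding that a browser or editor expects when it reads the file.
# Crawler that does not use cookies
from urllib import request

if __name__ == '__main__':
    url = "http://www.renren.com/967487029/profile"
    rsp = request.urlopen(url)
    html = rsp.read().decode()
    with open("rsp.html", "w", encoding="utf-8") as f:
        # Print the crawled page and write it to the file
        print(html)
        f.write(html)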
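As an alternative that is not part of the original article, the response bytes can be written in binary mode, so Python never re-encodes them and the file keeps whatever encoding the server sent. This is a minimal sketch: save_raw is a hypothetical helper name, and the sample bytes stand in for the result of rsp.read().

```python
import os
import tempfile


def save_raw(raw: bytes, path: str) -> None:
    """Write the response body unchanged (hypothetical helper)."""
    # Binary mode: no text encoding is applied, so no mismatch can occur.
    with open(path, "wb") as f:
        f.write(raw)


# Stand-in for rsp.read(); here the bytes are assumed to be UTF-8.
body = "<p>中文</p>".encode("utf-8")
out_path = os.path.join(tempfile.mkdtemp(), "rsp.html")
save_raw(body, out_path)

with open(out_path, "rb") as f:
    read_back = f.read()
print(read_back == body)  # True
```

With the crawler code above, the call would be save_raw(rsp.read(), "rsp.html"), skipping the decode step when the text is not needed in the program itself.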
Thanks for reading, and I hope you all benefit.
This article is reproduced from: https://blog.csdn.net/qq_40147863/article/details/81746445