Garbled text in Python crawlers: the difference between encoding and apparent_encoding


encoding is the encoding that requests reads from the charset field of the HTTP Content-Type response header. If the header declares a text type but has no charset field, requests falls back to ISO-8859-1, which cannot represent Chinese characters. Decoding a Chinese page with it is what produces the garbled text.
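To see why the fallback garbles Chinese, here is a minimal sketch using only the standard library. It assumes the page body was sent as UTF-8 (a common case, but an assumption here); the sample string and variable names are made up for illustration.

```python
# Assumption: the server sent UTF-8 bytes but declared no charset.
original = "中文"
body = original.encode("utf-8")  # the raw bytes that arrive over the wire

# Decoding with the ISO-8859-1 fallback maps each byte to one Latin-1
# character, so every Chinese character turns into three unrelated symbols.
garbled = body.decode("iso-8859-1")

# No bytes are lost, so decoding the same bytes as UTF-8 recovers the text.
recovered = garbled.encode("iso-8859-1").decode("utf-8")

if __name__ == "__main__":
    print(repr(garbled))
    print(recovered)
```

ISO-8859-1 assigns a character to every byte value, so the wrong decode never raises an error. That is why the problem shows up as garbled output instead of an exception.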

apparent_encoding is detected from the bytes of the page content itself, so it is usually more reliable than encoding when the header carries no charset. It is still a guess, though, and can be wrong on short or mixed-content pages. When a page comes out garbled, assign the value of apparent_encoding to encoding before reading the text.
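The fix described above can be sketched as follows. The helper names fix_encoding and fetch_text, the URL, and the timeout are assumptions for illustration; only the encoding, apparent_encoding, and text attributes come from requests. The helper overrides encoding only when it is missing or is the ISO-8859-1 fallback, so a charset the server declared explicitly is left alone.

```python
def fix_encoding(response):
    """Use the content-based guess when the header gave no usable charset."""
    declared = response.encoding
    # None means no encoding was found; ISO-8859-1 is the requests fallback.
    if declared is None or declared.lower() == "iso-8859-1":
        guessed = response.apparent_encoding
        if guessed:  # detection can fail and return None
            response.encoding = guessed
    return response


def fetch_text(url):
    """Hypothetical helper: download a page and return correctly decoded text."""
    import requests  # third-party; imported here so fix_encoding works without it

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return fix_encoding(response).text


if __name__ == "__main__":
    # Placeholder URL; replace it with the page that comes out garbled.
    print(fetch_text("https://example.com")[:200])
```

The shortest form of the same idea is the single assignment response.encoding = response.apparent_encoding, which always trusts the detected value. The conditional version is a design choice for crawlers that visit many sites, some of which declare their charset correctly.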

Origin blog.csdn.net/weixin_64974855/article/details/132638658