The difference between response.text and response.content

When we use the requests library to crawl data from the web, we often run into encoding problems. After getting a response through requests.get(), there are two common ways to read the body: response.text and response.content:

  1. response.content: This is the body received from the network without any character decoding, so it is of type bytes. In fact, everything stored on disk or transmitted over the network is bytes.

Therefore, when using response.content, we can call response.content.decode() to decode it into a str (a Unicode string).

Some people like to write response.content.decode("utf-8"). Note that response.content.decode() with no argument does the same thing, because bytes.decode() defaults to UTF-8. Some pages use other encodings and need, for example, response.content.decode("gbk"), so don't hard-code "utf-8" for every page; pass the encoding that the page actually uses.
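To make the decoding step concrete, here is a minimal sketch that uses plain bytes objects as stand-ins for response.content, so it runs without any network access. The sample text and variable names are only for illustration.

```python
# Stand-ins for response.content: the same text encoded in two different ways.
text = "你好, requests"
utf8_bytes = text.encode("utf-8")
gbk_bytes = text.encode("gbk")

# bytes.decode() defaults to UTF-8, so this equals utf8_bytes.decode("utf-8").
decoded_default = utf8_bytes.decode()

# A GBK page has to be decoded with the matching encoding.
decoded_gbk = gbk_bytes.decode("gbk")

# Decoding these GBK bytes as UTF-8 fails, because they are not valid UTF-8.
try:
    gbk_bytes.decode()
    wrong_decode_failed = False
except UnicodeDecodeError:
    wrong_decode_failed = True
```

With a real response, the same calls would be made on response.content instead of the sample bytes.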

  2. response.text: This is of type str (Unicode), the string that the requests library produces by decoding response.content. Decoding always needs an encoding, and when requests decodes automatically it has to guess that encoding, so wrong guesses cannot be ruled out, and a wrong guess leads to garbled text.

Therefore, when using response.text, we should set the encoding ourselves before reading it, so that requests does not have to guess.

E.g.:

(2.1) response.encoding = 'utf-8' or response.encoding = 'gbk' (this depends on the encoding of the web page you requested; use the one that matches each page)

(2.2) Or you can directly set response.encoding = response.apparent_encoding
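The two options above can be wrapped in one small helper. This is only a sketch: text_with_encoding is a hypothetical name, not part of requests, and it assumes the caller passes in a response that requests has already fetched.

```python
import requests


def text_with_encoding(response, encoding=None):
    """Return response.text after setting the encoding explicitly.

    `text_with_encoding` is a hypothetical helper, not a requests API.
    If `encoding` is None, fall back to response.apparent_encoding (option 2.2).
    """
    # The encoding must be set before response.text is read.
    response.encoding = encoding or response.apparent_encoding
    return response.text
```

In use, this would look something like text_with_encoding(requests.get(url), "gbk") for a GBK page, or text_with_encoding(requests.get(url)) to rely on detection from the content.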

Supplement:

response.encoding: the encoding of the response content that the requests library guesses from the HTTP response headers

response.apparent_encoding: the encoding that the requests library detects by analyzing the content itself (usually more reliable when the headers do not declare a charset, although it is still a guess)
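The header-based guess can be observed directly with requests.utils.get_encoding_from_headers, the utility that, as far as I know, requests uses to fill in response.encoding. The sketch below assumes the usual behaviour for text content types: when the Content-Type header carries no charset, the guess falls back to ISO-8859-1, which is a common source of garbled Chinese pages. The header values and sample text are made up for illustration.

```python
from requests.utils import get_encoding_from_headers

# The header declares a charset: the guess is simply what the server said.
with_charset = get_encoding_from_headers({"content-type": "text/html; charset=gbk"})

# No charset in the header: for text/* types the guess falls back to ISO-8859-1.
without_charset = get_encoding_from_headers({"content-type": "text/html"})

# Stand-in for response.content of a UTF-8 page served without a charset.
page_bytes = "编码".encode("utf-8")

# Decoding with the header-based guess gives garbled text, not an error.
garbled = page_bytes.decode(without_charset)

# Decoding with the page's real encoding gives the original text.
correct = page_bytes.decode("utf-8")
```

This is why setting response.encoding by hand, or from response.apparent_encoding, fixes the output in such cases.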

Origin blog.csdn.net/qestion_yz_10086/article/details/107950573