Web crawlers: the difference between r.content and r.text

1. In short:

text returns Unicode text (a str), decoded using the encoding that is generally declared in the response headers.

content returns bytes, that is, binary data.

If you want to extract text, use text.

But if you want to download images or files, you need to use content.
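A minimal sketch of that rule of thumb. The helper names `save_binary` and `extract_title` are hypothetical, and `resp` is assumed to be a response object such as the one returned by requests.get (any object with .content and .text attributes would work here).

```python
import re
from pathlib import Path


def save_binary(resp, path):
    """Write the raw bytes of a response body to disk (images, PDFs, archives)."""
    # .content is bytes, so it is written as-is, with no decoding step.
    Path(path).write_bytes(resp.content)
    return len(resp.content)


def extract_title(resp):
    """Pull the <title> out of an HTML page using the decoded string in .text."""
    match = re.search(r"<title>(.*?)</title>", resp.text, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None
```

With requests installed, a call such as `save_binary(requests.get(image_url), "logo.png")` would store an image, where `image_url` is a placeholder for a real address.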

2. In more detail:

Calling requests.get returns a Response object, which stores all the information returned by the server, including the response headers and the response status code.

The body of the returned page is available through two attributes, .content and .text. To get the raw data of a page, read r.text or r.content.

  • .text stores the string obtained by decoding .content
  • .content stores the raw bytes of the response body
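The relationship between the two can be sketched without a network request. The bytes below are illustrative data rather than a real response, and the encoding is assumed to be UTF-8; in requests, the guessed encoding is kept in r.encoding, and r.text is roughly r.content decoded with it.

```python
# Bytes as they might arrive from a server that sends UTF-8 (illustrative data).
content = "价格: 10€".encode("utf-8")

# What .text does conceptually: decode the bytes with the response's encoding.
encoding = "utf-8"  # assumed here; requests stores its own guess in r.encoding
text = content.decode(encoding)

print(type(content).__name__, len(content))  # bytes, counted in bytes
print(type(text).__name__, len(text))        # str, counted in characters
```

The two lengths differ because one non-ASCII character can take several bytes, which is why .content and .text are not interchangeable.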

Generally .text is more convenient to use directly because it returns a string, but sometimes the body is decoded with the wrong encoding and the result is garbled text. In that case use .content.decode('utf-8') so that it displays normally.
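The garbling can be reproduced offline. The sketch below assumes a page that is really UTF-8 but gets decoded as ISO-8859-1, which is a common wrong guess when the server declares no charset; the sample bytes are made up for illustration.

```python
# Bytes of a UTF-8 page (illustrative data, not a real response).
content = "中文页面".encode("utf-8")

# Decoding with the wrong codec produces garbage instead of raising an error.
garbled = content.decode("iso-8859-1")

# Decoding with the codec the page was actually written in fixes it.
fixed = content.decode("utf-8")

print(garbled)
print(fixed)
```

With a real response, setting `r.encoding = 'utf-8'` before reading r.text is another way to get the same result as `r.content.decode('utf-8')`.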

Overall, .text is a ready-to-use string, while .content still has to be decoded. However, .text does not always display correctly, and when it does not, decode manually with .content.decode().

Origin www.cnblogs.com/wyy1480/p/11516693.html