The difference between .content and .text in requests

The get and post methods of requests both return a Response object. This object stores everything the server sent back, including the response headers and the response status code. The page body itself is available through two attributes, .content and .text.

The difference between the two is that .content holds the raw bytes of the body, while .text holds the same body decoded into a string, using an encoding that requests guesses, mainly from the response headers.
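A minimal sketch of that relationship, runnable offline. It assumes the requests package is installed, and it builds a Response by hand by setting the private `_content` attribute, which is a shortcut for illustration only; in real code the object would come from requests.get or requests.post. The sample body and header values are made up.

```python
import requests

# Build a Response by hand so the example needs no network access.
# Assigning the private attribute `_content` is an illustration shortcut,
# not something to do in real code.
resp = requests.Response()
resp.status_code = 200
resp.headers["Content-Type"] = "text/html; charset=utf-8"
resp._content = "你好, requests".encode("utf-8")
resp.encoding = "utf-8"  # normally derived from the Content-Type header

raw = resp.content   # bytes, exactly as received
text = resp.text     # str, decoded from the bytes using resp.encoding

print(type(raw).__name__, type(text).__name__)
print(text == raw.decode(resp.encoding))
```

Here .text is nothing more than .content decoded with resp.encoding, so a wrong value in resp.encoding is what produces garbled output.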

If you print .content directly, the output starts with b', which marks it as a byte string; .text has no such prefix. For pure ASCII the two look identical, but any other text displays properly only when it is decoded with the correct encoding. In most cases .text is recommended, because it usually displays Chinese characters correctly, but sometimes it comes out garbled, and then you need something like .content.decode('utf-8'). The encodings commonly used for Chinese are utf-8, GBK, GB2312 and so on. This way you select the text encoding manually.
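The manual decoding step can be sketched with plain bytes, no network needed. The variable `content` below is a stand-in for resp.content from a page that is assumed to be GBK-encoded; the sample text is made up.

```python
# Stand-in for resp.content: the bytes a server might send for a GBK page.
content = "中文网页".encode("gbk")

# Printing bytes shows the b' prefix that marks a byte string.
print(content)

# Decoding with the wrong codec gives garbled text
# (without errors="replace" it would raise UnicodeDecodeError here).
garbled = content.decode("utf-8", errors="replace")

# Decoding with the codec the page really uses restores the text.
fixed = content.decode("gbk")
print(fixed)

# For pure ASCII data, the bytes and the string carry the same characters.
ascii_bytes = b"hello"
ascii_text = ascii_bytes.decode("ascii")
print(ascii_bytes, ascii_text)
```

With a real response, assigning resp.encoding = 'gbk' before reading resp.text should have the same effect as calling resp.content.decode('gbk').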

In short, .text is a ready-made string and .content still has to be decoded, but .text does not always display correctly, and that is when you need to decode .content manually.

Origin www.cnblogs.com/jontyfan/p/12024765.html