Hexadecimal escape problem in data returned to a crawler

Yesterday, while tracking down an error, I found that the JSON string returned by an API contained hexadecimal escape sequences such as \x3E and \x2F, which broke the crawler's JSON parsing.
The fix I settled on was simply to convert these hexadecimal escapes back into ordinary characters.

Cause

It is an encoding problem. The returned data was originally meant to be parsed by front-end JS.
Hexadecimal escapes beginning with \x are a JS string notation, whereas the 0x prefix is how Python writes hexadecimal numbers.
JSON itself does not allow \x escapes, so parsing the unprocessed data directly in Python raises an exception.
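To make the failure concrete, here is a minimal sketch that reproduces it. The response body is a made-up sample, not the data from the original interface, and the exact wording of the error message may vary between Python versions.

```python
import json

# Hypothetical response body containing JS-style \x escapes.
# The raw-string prefix keeps the backslashes as literal characters.
raw = r'{"html": "\x3Cp\x3Ehello\x3C\x2Fp\x3E"}'

try:
    json.loads(raw)
    error = None
except json.JSONDecodeError as exc:
    # JSON only permits escapes such as \n, \" and \uXXXX, so \x is rejected.
    error = exc

print(type(error).__name__, error)
```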

Solution

Decode the escape sequences in the returned data before parsing it:

res = response.content.decode('unicode_escape').encode('latin1').decode()

If you already have the data as a str rather than as bytes, first convert it to bytes and then apply the same decoding steps.
You can use the following method:

res = bytes(res_str, 'utf-8').decode('unicode_escape').encode('latin1').decode('utf-8')
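As a worked example, the sketch below wraps the same conversion chain in a small helper and runs it on a made-up sample. The name decode_hex_escapes and the sample string are assumptions for illustration only. The latin1 round trip is what preserves non-ASCII text: unicode_escape reads every byte as a Latin-1 character, so encoding back to latin1 restores the original bytes, which are then decoded as UTF-8.

```python
import json

def decode_hex_escapes(data):
    # Hypothetical helper: accepts bytes or str and resolves \xNN escapes.
    if isinstance(data, str):
        data = data.encode('utf-8')
    # unicode_escape resolves the escapes and maps other bytes to Latin-1
    # characters; latin1 + utf-8 then recovers the real text.
    return data.decode('unicode_escape').encode('latin1').decode('utf-8')

# Made-up sample with \x escapes and some UTF-8 text.
raw = r'{"html": "\x3Cp\x3E中文\x3C\x2Fp\x3E"}'

fixed = decode_hex_escapes(raw)
parsed = json.loads(fixed)
print(parsed["html"])
```

Note that unicode_escape resolves every backslash escape, not only \x. If the data also contains \uXXXX escapes for characters outside Latin-1, the encode('latin1') step raises UnicodeEncodeError, so this approach fits best when \x escapes are the only issue.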

Origin blog.csdn.net/weixin_41822224/article/details/107081152