Use the encode method to convert a str to bytes (the response a crawler receives is transmitted as binary data)
In [9]: a = "你好"
In [10]: type(a)
Out[10]: str
In [11]: b = a.encode()
In [12]: b
Out[12]: b'\xe4\xbd\xa0\xe5\xa5\xbd'
In [13]: type(b)
Out[13]: bytes
Use the decode method to convert bytes back to a str
In [12]: b
Out[12]: b'\xe4\xbd\xa0\xe5\xa5\xbd'
In [13]: type(b)
Out[13]: bytes
In [14]: c = b.decode()
In [15]: c
Out[15]: '你好'
In [16]: type(c)
Out[16]: str
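The round trip above can also be written as a plain script. This is a minimal sketch; the variable names text, raw and restored are only illustrative and do not come from the session.

```python
# Round trip: str -> bytes -> str, relying on the default codec (UTF-8).
text = "你好"

raw = text.encode()        # equivalent to text.encode("utf-8")
restored = raw.decode()    # equivalent to raw.decode("utf-8")

print(type(raw).__name__, raw)
print(len(text), len(raw))     # number of characters vs. number of bytes
print(restored == text)
```

Note that len counts characters for a str and bytes for a bytes object, so the two lengths differ for non-ASCII text.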
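Both encode and decode default to UTF-8. Encoding and decoding must use the same codec; otherwise decoding either raises UnicodeDecodeError, as below, or silently produces garbled text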
In [17]: a = "你好"
In [18]: b = a.encode("gbk")
In [19]: b.decode()
---------------------------------------------------------------------------
UnicodeDecodeError Traceback (most recent call last)
<ipython-input-19-4169e64150f6> in <module>
----> 1 b.decode()
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc4 in position 0: invalid continuation byte
In [20]: b.decode("gbk")
Out[20]: '你好'
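When the codec of incoming bytes is not known in advance, as can happen with crawler responses, one possible approach is to try a short list of candidate codecs in order. The sketch below is an illustration under assumptions, not a standard recipe: decode_with_fallback and its candidate list are hypothetical names introduced here.

```python
def decode_with_fallback(raw, encodings=("utf-8", "gbk")):
    """Hypothetical helper: return (text, codec) for the first codec that decodes raw."""
    for name in encodings:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            continue
    # Last resort: keep going with the first codec and replace
    # undecodable bytes with U+FFFD instead of raising.
    return raw.decode(encodings[0], errors="replace"), encodings[0]


gbk_bytes = "你好".encode("gbk")
text, codec = decode_with_fallback(gbk_bytes)
print(text, codec)

# A mismatch does not always raise: UTF-8 bytes read as GBK can decode
# "successfully" into the wrong characters.
garbled = "你好".encode("utf-8").decode("gbk", errors="replace")
print(garbled == "你好")
```

The order of candidates matters. UTF-8 is strict and rejects most bytes produced by other codecs, so it is tried first; GBK accepts many byte sequences it did not produce, so a successful GBK decode is not proof that the guess was right.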