Detailed articles:
http://www.cnblogs.com/yuanchenqi/articles/5956943.html
http://www.diveintopython3.net/strings.html
Notes:
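1. The default encoding in Python 2 is ASCII. In Python 3 the default encoding is UTF-8, and every str is Unicode text.
2. Unicode has several encoding forms: UTF-32 (always 4 bytes per character), UTF-16 (2 bytes for most characters, 4 bytes for characters outside the Basic Multilingual Plane), and UTF-8 (1 to 4 bytes per character). UTF-16 is widely used as an in-memory representation, but files are usually saved as UTF-8 because it takes less space for ASCII-heavy text.
3. In Python 3, encode not only transcodes the text but also converts the str into bytes, and decode converts the bytes back into a str.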
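Notes 2 and 3 can be checked with a short Python 3 sketch. The sample text is an arbitrary choice for illustration, and the "-le" codec variants are used only so that the byte order mark does not add to the byte counts.

```python
# Python 3 sketch; the sample text is a hypothetical example, not from the articles.
text = "中文abc"  # str: a sequence of Unicode code points

utf8_bytes = text.encode("utf-8")       # encode: str -> bytes
utf16_bytes = text.encode("utf-16-le")  # "-le" variant writes no byte order mark
utf32_bytes = text.encode("utf-32-le")

print(type(text).__name__, type(utf8_bytes).__name__)  # str bytes
print(len(text))         # 5 code points
print(len(utf8_bytes))   # 9:  3 bytes per CJK character, 1 per ASCII letter
print(len(utf16_bytes))  # 10: 2 bytes per character for this text
print(len(utf32_bytes))  # 20: always 4 bytes per code point

# decode: bytes -> str, reversing the encode step
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)  # True
```

For this mixed text UTF-8 is the smallest of the three, which is the space saving note 2 refers to. For text made up only of CJK characters, UTF-16 would be smaller than UTF-8.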
The figure only applies to py2
In Python 2:

#-*-coding:utf-8-*-
__author__ = 'Alex Li'

import sys
print(sys.getdefaultencoding())

msg = "我爱北京天安门"
msg_gb2312 = msg.decode("utf-8").encode("gb2312")
gb2312_to_gbk = msg_gb2312.decode("gbk").encode("gbk")

print(msg)
print(msg_gb2312)
print(gb2312_to_gbk)
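For comparison, here is a minimal sketch of the same transcoding in Python 3, which the notes do not show. It assumes the same message and codecs as the Python 2 code. The variable msg_utf8 is added here only to compare byte lengths. In Python 3 a str has no decode method, so the initial decode("utf-8") step is dropped.

```python
import sys

print(sys.getdefaultencoding())  # utf-8 on Python 3

msg = "我爱北京天安门"  # already a Unicode str, so no decode step is needed

msg_gb2312 = msg.encode("gb2312")                       # str -> bytes in GB2312
gb2312_to_gbk = msg_gb2312.decode("gbk").encode("gbk")  # bytes -> str -> bytes in GBK
msg_utf8 = msg.encode("utf-8")                          # added for the length comparison

print(msg)            # the original text
print(msg_gb2312)     # a bytes literal such as b'\xce\xd2...'
print(gb2312_to_gbk)  # same bytes, because GBK is a superset of GB2312
print(len(msg_gb2312), len(msg_utf8))  # 14 21
```

Printing the bytes objects shows escaped byte values rather than Chinese characters, which makes the str/bytes distinction from note 3 visible.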