python2
# -*- coding: utf-8 -*-
# In Python 2 a plain str is a byte string: decode it to unicode first, then encode the unicode object into the target encoding.
# "我就是我" means "I am me".
str_utf8 = "我就是我"
print("str_utf8: " + str_utf8)
# Decode the UTF-8 bytes to unicode.
str_utf8_to_unicode = str_utf8.decode("utf-8")
print(str_utf8_to_unicode)
# Encode the unicode object as GBK.
str_utf8_to_unicode_to_gbk = str_utf8_to_unicode.encode("gbk")
Whether print displays the text correctly also depends on the encoding of the terminal the program runs in.
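To make the terminal dependency concrete, here is a minimal sketch in Python 3 syntax. The helper name `can_display` is hypothetical (it is not from the original post), and falling back to "ascii" when `sys.stdout.encoding` is unset is an assumption of this example.

```python
import sys


def can_display(text, encoding):
    """Return True if every character of text can be encoded with `encoding`.

    Hypothetical helper for illustration only.
    """
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


text = "我就是我"
# sys.stdout.encoding can be None when output is redirected; assume ascii then.
terminal_encoding = sys.stdout.encoding or "ascii"
print(terminal_encoding, can_display(text, terminal_encoding))
```

If the second value printed is False, printing the string itself on that terminal would fail or show garbled characters.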
python3
Converting between strings and bytes
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Converting between str and bytes only changes how the text is stored; the characters themselves are not modified. Every encode/decode call should name the encoding to use.
# "我就是我" means "I am me".
str1 = "我就是我"
# Use encode to convert a string to bytes. The encoding given here should match the encoding you want the data stored in; any other value makes Python 3 transcode the text into that encoding.
str1_byte = str1.encode(encoding="utf-8")
# Use decode to convert bytes back to a string. Naming the wrong encoding here can make the conversion fail and crash the program with an error.
byte_str1 = str1_byte.decode(encoding="utf-8")
print(str1, str1_byte, byte_str1)
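What happens when decode is given the wrong encoding can be sketched as follows. This is an illustration rather than part of the original listing: the choice of "ascii" and "latin-1" as the wrong encodings, and the variable names, are assumptions made for the example.

```python
text = "我就是我"
utf8_bytes = text.encode("utf-8")

# Case 1: the wrong codec cannot interpret the bytes at all, so decode raises.
try:
    utf8_bytes.decode("ascii")
    outcome = "decoded"
except UnicodeDecodeError:
    outcome = "UnicodeDecodeError"
print(outcome)

# Case 2: the wrong codec accepts every byte, so there is no error,
# but the result is garbled text: one character per byte.
garbled = utf8_bytes.decode("latin-1")
print(len(text), len(garbled))
```

The second case is the more dangerous one in practice, because the program keeps running with corrupted text.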
Character encoding conversion in Python 3
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# Strings in Python 3 are Unicode by default, so no decode step is needed. A header such as -*- coding: gbk -*- only declares the encoding of the source file itself; string variables inside the program are still Unicode.
# "我就是我" means "I am me".
str2_utf8 = "我就是我"
# In Python 3, str2_utf8 is a Unicode string (str has no decode method), so encode it directly. In Python 3, encode also converts the result to the bytes type.
str2_utf8_to_gbk = str2_utf8.encode(encoding="gbk")
# Print the GBK-encoded bytes of 我就是我.
print(str2_utf8_to_gbk)
# Print the string for those GBK bytes; with the correct encoding named, the bytes convert back to a string.
print(str2_utf8_to_gbk.decode(encoding="gbk"))
# Print the UTF-8 encoded bytes of 我就是我; to get UTF-8 bytes, encode the string as utf-8.
str2_utf8_to_utf8_byte = str2_utf8.encode(encoding="utf-8")
print(str2_utf8_to_utf8_byte)
# To convert the GBK bytes to UTF-8, Unicode is the required intermediary: decode the GBK bytes to a Unicode string, then encode that string as UTF-8.
# encode returns bytes, so this prints the same value as str2_utf8_to_utf8_byte; to get a string, decode the bytes with their own encoding.
print(str2_utf8_to_gbk.decode(encoding="gbk").encode(encoding="utf-8"))
# Convert the UTF-8 bytes back to a string.
print(str2_utf8_to_gbk.decode(encoding="gbk").encode(encoding="utf-8").decode("utf-8"))
Result:
b'\xce\xd2\xbe\xcd\xca\xc7\xce\xd2'
我就是我
b'\xe6\x88\x91\xe5\xb0\xb1\xe6\x98\xaf\xe6\x88\x91'
b'\xe6\x88\x91\xe5\xb0\xb1\xe6\x98\xaf\xe6\x88\x91'
我就是我
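The GBK-to-UTF-8 step, with Unicode as the intermediary, can be wrapped in a small helper. This is a sketch only: the function name `transcode` and its parameters are hypothetical, and the input bytes are the GBK bytes from the result above.

```python
def transcode(data, source_encoding, target_encoding):
    """Re-encode bytes by decoding to a Unicode str, then encoding again.

    Hypothetical helper for illustration only.
    """
    return data.decode(source_encoding).encode(target_encoding)


# GBK bytes of the sample string, copied from the first line of the result.
gbk_bytes = b"\xce\xd2\xbe\xcd\xca\xc7\xce\xd2"
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")
print(utf8_bytes)
# prints b'\xe6\x88\x91\xe5\xb0\xb1\xe6\x98\xaf\xe6\x88\x91'
```

Passing the encodings in the opposite order on the returned bytes gives back the original GBK bytes, since both encodings can represent these characters.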
Key point:
Compared with Python 2, decode in Python 3 not only converts from the original encoding to Unicode but also turns bytes into a string; encode not only converts Unicode into the target encoding but also turns the string into bytes.
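The type change described here can be checked directly in Python 3. The snippet is a minimal sketch, and its variable names are chosen for the example only.

```python
text = "我就是我"

encoded = text.encode("gbk")      # str -> bytes
decoded = encoded.decode("gbk")   # bytes -> str

print(type(encoded).__name__, type(decoded).__name__)
# prints: bytes str

# In Python 3, str has no decode method and bytes has no encode method.
print(hasattr(text, "decode"), hasattr(encoded, "encode"))
# prints: False False
```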