# Character Encoding:
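Binary  # how Chinese encodings developed
ASCII: can store only English letters and other basic Latin characters. One character occupies one byte (8 bits), of which ASCII uses 7, giving 128 characters.
GB2312: about 6,700 Chinese characters (6,763), published in 1980
GBK 1.0: more than 20,000 characters, 1995
GB18030: more than 27,000 Chinese characters, 2000
Unicode:  # the universal character set, used as the intermediary between encodings
UTF-32: every character occupies 4 bytes
UTF-16: a character occupies 2 or 4 bytes; a single 2-byte unit has 65,536 possible values, enough to store the vast majority of characters in common use
UTF-8: variable length, 1 to 4 bytes per character; an English (ASCII) character still occupies 1 byte, and a Chinese character usually occupies 3 bytes.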
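The size differences between the three UTF encodings can be checked directly in Python. This is only a minimal sketch: the helper name `byte_lengths` and the sample characters are illustrative choices, not part of the original notes.

```python
def byte_lengths(ch: str) -> dict:
    """Return how many bytes one character needs in each UTF encoding."""
    # The "-le" codec variants are used so that no byte order mark (BOM)
    # is prepended, which would otherwise inflate the UTF-16/UTF-32 counts.
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


# An ASCII letter, a Chinese character, and an emoji (U+1F600) that lies
# outside the 65,536 values a single 2-byte UTF-16 unit can represent.
for ch in ("A", "中", "\U0001F600"):
    print(repr(ch), byte_lengths(ch))
```

The emoji is included because it needs 4 bytes in UTF-16, which shows why UTF-16 is not a fixed 2-byte encoding.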
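Encoding: encode
Decoding: decode
Python 3 and above uses Unicode by default: a str is a sequence of Unicode characters.
encode: encodes a string and, in doing so, converts it to the bytes data type
decode: decodes bytes and, in doing so, converts them back to the string (str) type
b = bytes type; each byte is an integer in the range [0, 255]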
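A short round trip makes the str/bytes relationship concrete. This is a sketch using a sample string chosen for illustration; the variable names are not from the original notes.

```python
text = "中文"                    # str: a sequence of Unicode characters
data = text.encode("utf-8")      # encode: str -> bytes

print(type(text).__name__, type(data).__name__)
print(data)                      # shown with the b'...' prefix
print(list(data))                # iterating over bytes yields ints in 0..255

restored = data.decode("utf-8")  # decode: bytes -> str
print(restored == text)
```

Indexing or iterating over a bytes object gives plain integers, which is what the [0, 255] range in the notes refers to.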
    # __Author: "hanhankeji"
    # DATE: 2019/12/19
    import sys

    print(sys.getdefaultencoding())  # view the default encoding: utf-8
    s = "特斯拉"
    print(s)
    s_to_gbk = s.encode("GBK")
    print(s_to_gbk)

Output:

    utf-8
    特斯拉
    b'\xcc\xd8\xcb\xb9\xc0\xad'
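Going the other way, the GBK bytes printed above can be decoded back to a str and then re-encoded as UTF-8, with the Unicode str acting as the intermediary between the two encodings. The sketch below assumes only the byte values shown in that output; the variable names are illustrative.

```python
gbk_bytes = b"\xcc\xd8\xcb\xb9\xc0\xad"  # the GBK-encoded bytes from the output

text = gbk_bytes.decode("gbk")           # bytes (GBK) -> str (Unicode)
utf8_bytes = text.encode("utf-8")        # str (Unicode) -> bytes (UTF-8)

print(text)
print(len(gbk_bytes), len(utf8_bytes))   # GBK uses 2 bytes per character here, UTF-8 uses 3

# Decoding with the wrong codec fails, because these bytes are not valid UTF-8.
try:
    gbk_bytes.decode("utf-8")
    wrong_codec_failed = False
except UnicodeDecodeError as exc:
    wrong_codec_failed = True
    print("not valid UTF-8:", exc.reason)
```

The bytes themselves carry no label saying which encoding produced them, so the codec passed to `decode` has to match the one used for `encode`.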