05-Binary and character encoding

05-Binary and character encoding


1: What is character encoding

Character encoding is a rule or system for mapping characters (including letters, numbers, symbols, and other text elements) to binary data represented internally by a computer. Since computers can only understand and process binary data, character encoding is the bridge that converts human-readable characters into numerical representations that computers can process.

In computers, each character is represented as a number, which corresponds to a specific binary code. Different character encoding systems use different mapping rules to map characters into corresponding numbers.

2: Common character encodings

二进制(0,1)
ASCII
GB2312
GBK
GB18030
Unicode几乎包含了全世界的字符
UTF-8
其他国家字符编码

Notice:

UTF-8 stipulates that English is represented by the ASCII table, occupying one byte, and Chinese is represented by three bytes.

Unicode stipulates that both Chinese and English are represented by two bytes·

Three: Conversion between unicode and characters

  1. chr() function

    The chr() function accepts an integer as a parameter and returns the corresponding Unicode character.

  2. ord() function

    The ord() function accepts a character (a string of length 1) as a parameter and returns the Unicode code point of the character. It converts characters to corresponding integer values.

Demo:

print(chr(23435))
print(ord("宋"))

Output:

23435

Guess you like

Origin blog.csdn.net/qq_51248309/article/details/132860634