05-Binary and character encoding
1: What is character encoding
Character encoding is a rule or system for mapping characters (including letters, numbers, symbols, and other text elements) to binary data represented internally by a computer. Since computers can only understand and process binary data, character encoding is the bridge that converts human-readable characters into numerical representations that computers can process.
In computers, each character is represented as a number, which corresponds to a specific binary code. Different character encoding systems use different mapping rules to map characters into corresponding numbers.
2: Common character encodings
Notice:
UTF-8 stipulates that English is represented by the ASCII table, occupying one byte, and Chinese is represented by three bytes.
Unicode stipulates that both Chinese and English are represented by two bytes·
Three: Conversion between unicode and characters
-
chr() function
The chr() function accepts an integer as a parameter and returns the corresponding Unicode character.
-
ord() function
The ord() function accepts a character (a string of length 1) as a parameter and returns the Unicode code point of the character. It converts characters to corresponding integer values.
Demo:
print(chr(23435))
print(ord("宋"))
Output:
宋
23435