Python study notes, day 1: character encoding and encoding functions

ord() function
  Returns the integer code point of a single character
chr() function
  Converts an integer code point to the corresponding character
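
As a small illustration of the two functions above, the sketch below round-trips a few characters. The sample characters are chosen for illustration and are not from the original notes.

```python
# ord() maps a single character to its Unicode code point (an int).
code_a = ord('A')
code_zh = ord('中')       # the same value as 0x4e2d

# chr() is the inverse: code point -> one-character str.
char_b = chr(66)
char_zh = chr(0x6587)

# The two functions round-trip for any valid code point.
assert chr(ord('A')) == 'A'
print(code_a, code_zh, char_b, char_zh)   # 65 20013 B 文
```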

'\uXXXX\uXXXX'
  If you know a character's integer code point, you can also write it in hexadecimal inside a Python string literal, for example '\u4e2d\u6587' for '中文'.

In Python 3 the string type is str. It is represented as Unicode in memory, and one character may correspond to several bytes once encoded. To save a string to disk or transmit it over a network, you need to convert the str into bytes, a type whose unit is the byte. Python writes bytes literals as single- or double-quoted strings prefixed with b, such as b'ABC'.
Although a = 'ABC' and a1 = b'ABC' look alike, they are different types: a is a str, while each element of the bytes object a1 occupies exactly one byte.
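
The difference between the two types can be checked directly. This is a minimal sketch; the variable names are hypothetical, and the escaped string is an illustrative example rather than one from the notes.

```python
text = 'ABC'     # str: a sequence of Unicode characters
data = b'ABC'    # bytes: a sequence of integers in range 0..255

print(type(text).__name__, type(data).__name__)   # str bytes

# Indexing a str gives a one-character str; indexing bytes gives an int.
first_char = text[0]
first_byte = data[0]

# A str never compares equal to bytes, even when they look alike.
same = (text == data)

# Hex escapes in a str literal name characters by code point.
escaped = '\u4e2d\u6587'
print(first_char, first_byte, same, escaped)
```

To compare the contents, one side has to be converted first, for example with text.encode('ascii') == data.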

encode()
  A str, which is Unicode, can be encoded into bytes in a specified character encoding with the encode() method, for example:
    'ABC'.encode('ascii')
    '篳昂'.encode('utf-8')
    '篳昂'.encode('GBK')
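
The sketch below shows what encode() returns for different codecs, and what happens when the target encoding cannot represent a character. The sample string '中文' is an assumption chosen for illustration.

```python
# 'ABC' is pure ASCII, so the ASCII codec can encode it directly.
ascii_bytes = 'ABC'.encode('ascii')

# Illustrative sample string (not from the original notes).
sample = '中文'
utf8_bytes = sample.encode('utf-8')   # 3 bytes per character here
gbk_bytes = sample.encode('gbk')      # 2 bytes per character here

# ASCII has no code for Chinese characters, so encoding fails.
try:
    sample.encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(ascii_bytes, utf8_bytes, gbk_bytes, ascii_failed)
```

The same text therefore produces different byte sequences, and different lengths, depending on the encoding you pick.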

decode()
  Conversely, to turn bytes into a str, use the decode() method.
    b'ABC'.decode('ascii')
    b'\xe7\xaf\xb3\xe6\x98\x82'.decode('utf-8')
  decode() raises a UnicodeDecodeError if the bytes contain sequences that cannot be decoded. If only a small number of bytes are invalid, you can pass errors='ignore' to skip them, for example:
    b'\xe7\xaf\xb3\xe6\x98'.decode('utf-8', errors='ignore')
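
To make the error behaviour concrete, here is a minimal sketch using the byte strings from the notes. The errors='replace' line is an addition for comparison; it is a standard option of decode(), but it does not appear in the original notes.

```python
complete = b'\xe7\xaf\xb3\xe6\x98\x82'   # two full UTF-8 characters
truncated = complete[:-1]                # last character is cut short

decoded = complete.decode('utf-8')

# Strict decoding (the default) refuses the incomplete sequence.
try:
    truncated.decode('utf-8')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# errors='ignore' silently drops the undecodable bytes.
ignored = truncated.decode('utf-8', errors='ignore')

# errors='replace' marks the damage with U+FFFD instead of hiding it.
replaced = truncated.decode('utf-8', errors='replace')

print(decoded, strict_failed, ignored, replaced)
```

Because 'ignore' discards data without any trace, 'replace' may be the safer choice when you need to notice that the input was corrupted.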

len()
  For a str, counts the number of characters
    len('ABCDERF')
    len('这是一行中文')
  For bytes, counts the number of bytes
    len(b'ABC')
    len(b'\xe7\xaf\xb3\xe6\x98\x82')

Example:
    >>> st = '这是一行中文'
    # len() counts how many characters the str contains
    >>> len(st)
    6
    # After the str is encoded to bytes, len() counts bytes. In UTF-8, each of these Chinese characters occupies 3 bytes.
    >>> len(st.encode('utf-8'))
    18
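
The same comparison can be wrapped in a small helper. This is only a sketch: char_and_byte_counts is a hypothetical name, and the mixed sample string is an assumption chosen to show that ASCII characters take 1 byte in UTF-8 while these Chinese characters take 3.

```python
def char_and_byte_counts(text, encoding='utf-8'):
    """Return (number of characters, number of encoded bytes) for text."""
    return len(text), len(text.encode(encoding))

mixed = 'ABC中文'   # 3 ASCII characters followed by 2 Chinese characters
utf8_counts = char_and_byte_counts(mixed)
gbk_counts = char_and_byte_counts(mixed, 'gbk')

print(utf8_counts)   # (5, 9)
print(gbk_counts)    # (5, 7)
```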
