Python string encoding basics

Overview: Strings in Python fall into two kinds: byte strings and non-byte (Unicode text) strings.

In Python 3, a string literal is a non-byte string by default, represented as a sequence of Unicode characters. The encode method converts it into a byte string in an encoding such as ASCII, UTF-8, or UTF-16. Python 3 therefore treats only the non-byte form as its standard string type, str, and the encoded form as a separate type, bytes.
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> uni_str = 'abc'
>>> type(uni_str)
<class 'str'>
>>> utf8_str = uni_str.encode('utf-8')
>>> type(utf8_str)
<class 'bytes'>
>>> asc_str = uni_str.encode('ascii')
>>> type(asc_str)
<class 'bytes'>
>>> uni_str
'abc'
>>> utf8_str
b'abc'
>>> asc_str
b'abc'

In Python 2, a string literal is a byte string by default, and the default source encoding is ASCII, so Chinese characters in a source file are not accepted unless an encoding is declared. The decode method converts a byte string into a non-byte string represented in the Unicode character set, and the encode method then converts that Unicode string into bytes in another encoding such as UTF-8 or UTF-16. Python 2 therefore treats the encoded form, the byte string, as its str type, and the decoded form as a separate type, unicode.
Python 2.7.12 (default, Dec 4 2017, 14:50:18)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> byte_str = 'abc'
>>> type(byte_str)
<type 'str'>
>>> uni_str = byte_str.decode('ascii')
>>> uni_str
u'abc'
>>> type(uni_str)
<type 'unicode'>
>>> utf8_str = uni_str.encode('utf-8')
>>> utf8_str
'abc'
>>> type(utf8_str)
<type 'str'>
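
The decode-then-encode route described for Python 2 can be sketched as a small helper. This is an illustrative sketch written in Python 3 syntax, where bytes plays the role of Python 2's str and str plays the role of Python 2's unicode; the function name transcode and the sample data are assumptions, not part of the original sessions:

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Hypothetical helper: re-encode a byte string from src to dst encoding."""
    text = data.decode(src)   # bytes -> Unicode text
    return text.encode(dst)   # Unicode text -> bytes in the target encoding


# Sample input: two Chinese characters stored as UTF-16 (little-endian) bytes.
utf16_bytes = '中文'.encode('utf-16-le')

utf8_bytes = transcode(utf16_bytes, 'utf-16-le', 'utf-8')

print(len(utf16_bytes), len(utf8_bytes))
print(utf8_bytes.decode('utf-8') == '中文')
```

Going through Unicode text as the intermediate form means any pair of encodings can be converted without a dedicated routine for each pair.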
