What does the 'b' character do in front of a string literal?

Ah Chen 1998

Python 3.x makes a clear distinction between two types:

str = '...' literals = a sequence of Unicode characters (stored internally as UTF-16 or UTF-32 depending on how Python was compiled; since Python 3.3 the internal width is instead chosen per string)

bytes = b'...' literals = a sequence of octets (integers between 0 and 255)

If you are familiar with Java or C#, think of str as String and bytes as byte[]. If you are familiar with SQL, think of str as NVARCHAR and bytes as BINARY or BLOB. If you are familiar with the Windows registry, think of str as REG_SZ and bytes as REG_BINARY. If you are familiar with C(++), forget everything you have learned about char and strings, because a character is not a byte. That idea is long obsolete.
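As a small illustration of the distinction above (the sample word is an arbitrary choice, not from the original answer), the sketch below compares a str with the bytes produced by encoding it as UTF-8. Indexing a str yields a one-character str, while indexing a bytes object yields an integer.

```python
text = 'héllo'            # str: a sequence of Unicode characters
data = b'h\xc3\xa9llo'    # bytes: the same word as UTF-8 octets

print(type(text).__name__, len(text))   # str 5
print(type(data).__name__, len(data))   # bytes 6 ('é' takes two octets in UTF-8)

print(text[1])        # é   (a one-character str)
print(data[1])        # 195 (an int between 0 and 255)
print(list(b'AB'))    # [65, 66]
```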

You use str when you want to represent text.

>>> print('שלום עולם')
שלום עולם

You use bytes when you want to represent low-level binary data such as structs.

>>> import struct
>>> NaN = struct.unpack('>d', b'\xff\xf8\x00\x00\x00\x00\x00\x00')[0]
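To make the struct case more concrete, here is a minimal sketch of a round trip through struct.pack and struct.unpack. The value 1.5 is an arbitrary example; the format string '>d' (big-endian IEEE 754 double) is the one used above.

```python
import math
import struct

# Pack a float into 8 raw bytes; the result is a bytes object, not text.
packed = struct.pack('>d', 1.5)
print(type(packed).__name__, len(packed))   # bytes 8
print(packed.hex())                         # 3ff8000000000000

# Unpacking the same bytes gives the float back.
value = struct.unpack('>d', packed)[0]
print(value)                                # 1.5

# The byte pattern from the example above decodes to a NaN.
nan = struct.unpack('>d', b'\xff\xf8\x00\x00\x00\x00\x00\x00')[0]
print(math.isnan(nan))                      # True
```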
You can encode a str to a bytes object.

>>> '\uFEFF'.encode('UTF-8')
b'\xef\xbb\xbf'

You can decode a bytes object into a str.

>>> b'\xE2\x82\xAC'.decode('UTF-8')
'€'
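The following sketch expands on encoding and decoding: the same character produces different bytes under different encodings, and decoding with the wrong codec fails. The choice of 'utf-16-le' and 'ascii' as comparison codecs is only for illustration.

```python
euro = '€'   # U+20AC

utf8 = euro.encode('utf-8')
utf16 = euro.encode('utf-16-le')
print(utf8.hex())    # e282ac
print(utf16.hex())   # ac20

# Decoding with the matching codec recovers the original text.
print(utf8.decode('utf-8') == euro)        # True
print(utf16.decode('utf-16-le') == euro)   # True

# Decoding with a codec that cannot represent these bytes raises an error.
try:
    utf8.decode('ascii')
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)   # True
```

The bytes carry no record of which encoding produced them, so the codec has to be known from context.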
But you can't mix the two types arbitrarily.

>>> b'\xEF\xBB\xBF' + 'Text with a UTF-8 BOM'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: can't concat bytes to str

The b'...' notation is somewhat confusing in that it allows the bytes 0x01-0x7F to be specified with ASCII characters instead of hexadecimal numbers.

>>> b'A' == b'\x41'
True

But I must stress that a character is not a byte.

>>> 'A' == b'A'
False
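One way to work with both types together is to convert one side explicitly before combining or comparing them. The sketch below assumes UTF-8 for the BOM example and ASCII for the single letter; the variable names are illustrative.

```python
bom = b'\xEF\xBB\xBF'
text = 'Text with a UTF-8 BOM'

# Mixing the types directly raises TypeError.
try:
    bom + text
    mixing_failed = False
except TypeError:
    mixing_failed = True
print(mixing_failed)   # True

# Convert one side first, depending on which type you want as the result.
as_bytes = bom + text.encode('utf-8')    # bytes + bytes
as_text = bom.decode('utf-8') + text     # str + str, starts with '\ufeff'

# b'A' is shorthand for the single byte 0x41, not for the character 'A'.
print(b'A'[0])                         # 65
print('A' == b'A')                     # False
print('A' == b'A'.decode('ascii'))     # True
```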
In Python 2.x

Pre-3.0 versions of Python lacked this distinction between text and binary data. Instead, there were:

unicode = u'...' literals = a sequence of Unicode characters = 3.x str

str = '...' literals = a sequence of confounded bytes/characters
    Usually text, encoded in some unspecified encoding.
    But also used to represent binary data, such as struct.pack output.
To ease the 2.x-to-3.x transition, the b'...' literal syntax was backported to Python 2.6, so that binary strings (which should be bytes in 3.x) can be distinguished from text strings (which should be str in 3.x). The b prefix does nothing in 2.x, but it tells the 2to3 script not to convert the literal to a Unicode string in 3.x.
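As a rough sketch of how the prefix helps code that targets both lines, the snippet below marks binary data with b'...' and text with u'...'. The variable names are hypothetical, and note that the u prefix is accepted on 2.x and again on 3.3 and later, but not on 3.0 through 3.2.

```python
import sys

payload = b'\x00\x01header'   # binary data: a byte string on 2.6+ and on 3.x
label = u'header'             # text: unicode on 2.x, str on 3.3+

if sys.version_info[0] >= 3:
    # On 3.x the two literals have distinct types.
    payload_is_text = isinstance(payload, str)
else:
    # On 2.6/2.7, bytes is simply an alias for str, so the prefix changes nothing.
    payload_is_text = bytes is str

print(payload_is_text)   # False on 3.x, True on 2.6/2.7
print(len(payload))      # 8
```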

