Python: the difference between bytearray, bytes, and str (47)

Table of Contents

A. The difference between bytes and characters

1. The concept of a byte

2. The concept of a character

3. The concept of a string

4. The concept of a byte string

B. The difference between str, bytes, and bytearray

C. Converting between str and bytes/bytearray

1. Converting str to bytes with encode

2. Converting bytes to str with decode

 

 


A. The difference between bytes and characters

Before explaining the differences among bytearray, bytes, and str, it helps to look at the difference between bytes and characters:

1. The concept of a byte

A byte is a unit for measuring computer storage capacity: a string of binary digits handled as a single unit, and a small building block of information. The most common byte is the 8-bit byte, also called an octet.

A bit is the smallest unit of data stored inside a computer: a single binary digit, 0 or 1. For example, 11001100 is an eight-bit binary number.

A byte is also the basic unit of computer data processing, denoted by a capital B. By convention, 1 B (byte) = 8 bits.

1 KB = 1024 B;   (2^10 B)
1 MB = 1024 KB;  (2^20 B)
1 GB = 1024 MB;  (2^30 B)
1 TB = 1024 GB;  (2^40 B)
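
The unit relationships can be checked with a few lines of arithmetic. This is only a sketch; the constant names are chosen here for illustration and do not come from any library.

```python
# Binary storage units expressed in bytes (names are illustrative).
BITS_PER_BYTE = 8

KB = 1024          # 2**10 bytes
MB = 1024 * KB     # 2**20 bytes
GB = 1024 * MB     # 2**30 bytes
TB = 1024 * GB     # 2**40 bytes

print(KB == 2**10, MB == 2**20, GB == 2**30, TB == 2**40)  # True True True True
print(KB * BITS_PER_BYTE)                                   # 8192 bits in 1 KB
```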

2. The concept of a character

Characters are the letters, digits, and symbols used in computers, including: 1, 2, 3, A, B, C, ~ ! · # ¥ % ...... - * ( ) - +, and so on.

In UTF-8 encoding, a Chinese character generally occupies 3 bytes.

In GBK encoding, a Chinese character generally occupies 2 bytes.
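
The byte counts are easy to confirm by encoding a single character and measuring the result. The sketch below uses one character from the example strings in this article; the variable names are illustrative.

```python
ch = "猿"  # a single Chinese character

utf8_bytes = ch.encode("utf-8")
gbk_bytes = ch.encode("gbk")

print(len(ch))          # 1 character
print(len(utf8_bytes))  # 3 bytes in UTF-8
print(len(gbk_bytes))   # 2 bytes in GBK

# ASCII characters take a single byte in UTF-8.
ascii_len = len("A".encode("utf-8"))
print(ascii_len)        # 1
```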

 

3. The concept of a string

A string is a sequence of characters. It is an abstract concept and cannot be stored directly on disk; what the computer sees, stores, and transmits is a byte string. In Python, text in a program is represented by strings.

 

4. The concept of a byte string

A byte string is a sequence of bytes. It can be stored directly on disk or in memory, and it is what the computer sees. The mapping between strings and byte strings is called encoding/decoding; strings are what people read and work with.

#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""
@Author: 何以解忧
@Blog (personal blog address): shuopython.com
@WeChat Official Account: 猿说python
@Github: www.github.com

@File: python_bytes_string_2.py
@Time: 2020/2/29 21:25

@Motto: Without accumulating small steps, one cannot travel a thousand miles; without gathering small streams, there can be no rivers or seas. A rewarding life in programming takes persistent accumulation!
"""


if __name__ == "__main__":

    # str to bytes
    s = '猿说python'
    b = s.encode()  # encode; the default encoding is UTF-8
    print(b)
    print(type(b))

    # bytes to str
    s2 = b'\xe7\x8c\xbf\xe8\xaf\xb4python'.decode(encoding='UTF-8')  # decode
    print(s2)
    print(type(s2))

Output:

b'\xe7\x8c\xbf\xe8\xaf\xb4python'
<class 'bytes'>
猿说python
<class 'str'>

 

 

B. The difference between str, bytes, and bytearray

1. str is character data (such as text, meant for people to read), while bytes and bytearray are byte data (such as binary data, meant for the computer). All three are sequences and can be iterated over.

2. str and bytes are immutable sequences. Methods that appear to modify them, such as replace(), actually create and return a new object. bytearray is a mutable sequence, so its bytes can be modified in place.

3. bytes and bytearray support most of the general str methods, such as find(), replace(), and islower(). What is not available is the str formatting method format() (bytes and bytearray do support %-style formatting since Python 3.5).

4. In Python 3.x, str holds Unicode text by default, and UTF-8 is the default encoding used by encode() and decode().
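
The points in this list can be seen side by side in a short sketch. The variable names are illustrative only.

```python
s = "abc"
b = s.encode()        # bytes: immutable
ba = bytearray(b)     # bytearray: mutable copy of the same bytes

# All three are iterable; str yields characters, the byte types yield integers.
print(list(s))    # ['a', 'b', 'c']
print(list(b))    # [97, 98, 99]
print(list(ba))   # [97, 98, 99]

# replace() on str/bytes returns a new object and leaves the original alone.
s_new = s.replace("a", "x")
b_new = b.replace(b"a", b"x")
print(s, s_new)   # abc xbc
print(b, b_new)   # b'abc' b'xbc'

# bytes rejects item assignment...
try:
    b[0] = ord("x")
    bytes_is_mutable = True
except TypeError:
    bytes_is_mutable = False
print(bytes_is_mutable)  # False

# ...while bytearray is modified in place.
ba[0] = ord("x")
print(ba)         # bytearray(b'xbc')
```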

 

C. Converting between str and bytes/bytearray

1. Converting str to bytes with encode

if __name__ == "__main__":
    s = "shuopython.com"
    # Convert the string to a bytes object
    b2 = bytes(s, encoding='utf8')  # an encoding must be specified
    # print(b2)

    # Calling encode on a string returns a bytes object
    b3 = str.encode(s)
    b4 = s.encode()
    print(b3)
    print(type(b3))
    print(b4)
    print(type(b4))

Output:

b'shuopython.com'
<class 'bytes'>
b'shuopython.com'
<class 'bytes'>

 

2. Converting bytes to str with decode

if __name__ == "__main__":
    # A bytes object b
    b = bytes("python教程-猿说python","utf-8")

    # Option 1:
    s2 = bytes.decode(b)
    # Option 2:
    s3 = b.decode()

    print(s2)
    print(s3)

Output:

python教程-猿说python
python教程-猿说python
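
The same conversions apply to bytearray. The following is a minimal sketch using only built-in constructors and methods; the variable names are illustrative.

```python
s = "猿说python"

ba = bytearray(s, encoding="utf-8")   # str -> bytearray (an encoding is required)
b = bytes(ba)                          # bytearray -> bytes (immutable copy)
ba2 = bytearray(b)                     # bytes -> bytearray (mutable copy)

print(ba.decode("utf-8"))   # 猿说python
print(b == ba)              # True: equal content compares equal

# Editing the mutable copy does not affect the bytes object it came from.
ba2[-6:] = b"Python"
print(ba2.decode("utf-8"))  # 猿说Python
print(b.decode("utf-8"))    # 猿说python
```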

Note: When bytes is initialized from a string (whether or not it contains Chinese), the encoding must be specified; otherwise an error is raised: TypeError: string argument without an encoding

>>> b = bytes("猿说python")
TypeError: string argument without an encoding
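
In a script, the error can be caught and avoided as in the sketch below. It assumes nothing beyond the built-in bytes constructor and str.encode; the variable names are illustrative.

```python
text = "猿说python"

# Calling bytes() on a str without an encoding raises TypeError.
try:
    bytes(text)
    message = ""
except TypeError as exc:
    message = str(exc)
print("TypeError:", message)

# Either spelling below supplies the encoding and avoids the error.
b1 = bytes(text, encoding="utf-8")
b2 = text.encode("utf-8")
print(b1 == b2)   # True

# The same error is raised for a plain ASCII string.
try:
    bytes("python")
    ascii_ok = True
except TypeError:
    ascii_ok = False
print(ascii_ok)   # False
```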

 


 

Please credit the source when reposting: 猿说python » Python bytearray / bytes / str difference

 


Origin blog.csdn.net/ZhaDeNianQu/article/details/104669252