Binary transfer protocol and text transfer protocol in network transfer

I could find almost nothing written about this topic, so in the end I turned to ChatGPT for help and combined its answers with my own understanding.

Straight to the point:

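Early network protocols transmitted data one byte at a time, and a byte whose highest bit was 1 was usually interpreted as a control value rather than a text character. As a result, only characters from the ASCII character set (in practice the printable ones, whose highest bit is 0) could be transmitted.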

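ASCII is a 7-bit code stored in 8-bit bytes: the first bit is not used (it is always 0), which leaves 128 characters.
I believe this highest-bit issue is the reason behind the claim, repeated in many places, that text transfer protocols only support character (text) transmission.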
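
The highest-bit rule described above can be checked with a few lines of Python. This is only a sketch of the idea, not part of any real protocol; the helper names high_bit_set and is_seven_bit_clean are made up for illustration.

```python
def high_bit_set(byte: int) -> bool:
    """Return True when the most significant bit of an 8-bit value is 1."""
    return (byte & 0x80) != 0


def is_seven_bit_clean(data: bytes) -> bool:
    """Return True when every byte fits in 7 bits (values 0..127)."""
    return not any(high_bit_set(b) for b in data)


text = "Hello, world!".encode("ascii")
# First four bytes of the PNG file signature; 0x89 has its top bit set.
binary = bytes([0x89, 0x50, 0x4E, 0x47])

print(is_seven_bit_clean(text))    # True
print(is_seven_bit_clean(binary))  # False
```

Plain ASCII text passes the check, while even the first byte of a typical binary file fails it.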

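So in the early days, when you wanted to transmit a non-text file (audio, video, and so on), the binary content of the file, once split into individual bytes, was certain to violate the "first bit must not be 1" requirement. The file (the binary data) therefore had to be encoded with Base64 into something that does meet the transmission requirement. This step is what other sources describe as "converting binary data into (printable) characters".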
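
As a rough illustration of that conversion, the sketch below uses Python's standard base64 module. The four sample bytes are an arbitrary choice (every one of them has its highest bit set); they are not taken from the article.

```python
import base64

# Arbitrary sample bytes; each one has its highest bit set.
raw = bytes([0xFF, 0xD8, 0xFF, 0xE0])

encoded = base64.b64encode(raw)      # bytes holding only Base64 characters
decoded = base64.b64decode(encoded)  # back to the original bytes

print(encoded)                         # b'/9j/4A=='
print(all(b < 0x80 for b in raw))      # False
print(all(b < 0x80 for b in encoded))  # True
print(decoded == raw)                  # True
```

After encoding, every byte has its highest bit cleared, and decoding restores the original data exactly.
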
Under what circumstances do we use Base64? Base64 is commonly used to carry binary data over text-oriented protocols such as HTTP. Since HTTP is a hypertext protocol, binary data placed in its text parts has to be converted into character data first. However, the raw bytes cannot simply be sent as they are, because a text-based channel can only be relied on to carry printable characters.
What are printable characters? In ASCII, the 33 codes 0-31 and 127 are control characters, and the 95 codes 32-126 are printable characters (see an ASCII table for the exact mapping). In other words, such a channel can only carry these 95 characters, and anything outside this range cannot be transmitted. So how can other values be transmitted? One way is to use Base64.
Base64 is a method of representing binary data using 64 printable characters. These 64 characters are the uppercase and lowercase letters, the digits, + and /. A further special character, =, is used for padding.
Note: Since base64 encoding uses 8-bit characters to represent 6 bits in the message, the base64 encoded string is approximately 33% larger than the original value.
————————————————
Copyright statement: This article is an original article of CSDN blogger "Little Snail Game", following the CC 4.0 BY-SA copyright agreement, please attach the original source link and this statement.
Original link: https://blog.csdn.net/local_752/article/details/121970823
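
The character counts in the quoted passage can be verified with a short Python sketch. It relies on str.isprintable, which for the ASCII range treats codes 32-126 as printable; the variable names are my own.

```python
import string

printable = [c for c in range(128) if chr(c).isprintable()]
control = [c for c in range(128) if not chr(c).isprintable()]

print(len(printable), printable[0], printable[-1])  # 95 32 126
print(len(control))                                 # 33

# The standard Base64 alphabet: A-Z, a-z, 0-9, + and /
base64_alphabet = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
)
print(len(base64_alphabet))  # 64
# Every alphabet character, and the padding character =, is printable ASCII.
print(all(ord(ch) in printable for ch in base64_alphabet + "="))  # True
```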

Base64 takes the original data 3 bytes at a time and turns each group into 4 printable characters, so that every output byte meets the transmission requirement (its first bit is not 1). The price is size: the encoded data is about 33% larger than the original. After encoding, a binary file has become (the binary form of) printable characters.
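
The 3-bytes-to-4-characters grouping can be sketched by hand and compared with Python's base64 module. The function encode_group is a hypothetical helper for illustration; it deliberately ignores padding and only handles full 3-byte groups.

```python
import base64
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(group: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (no padding handling)."""
    if len(group) != 3:
        raise ValueError("this sketch only handles full 3-byte groups")
    # Join the three bytes into one 24-bit integer.
    value = (group[0] << 16) | (group[1] << 8) | group[2]
    # Cut the 24 bits into four 6-bit indices, most significant first.
    indices = [(value >> shift) & 0x3F for shift in (18, 12, 6, 0)]
    return "".join(ALPHABET[i] for i in indices)


print(encode_group(b"Man"))               # TWFu
print(base64.b64encode(b"Man").decode())  # TWFu

payload = bytes(range(256)) * 3           # 768 bytes, a multiple of 3
encoded = base64.b64encode(payload)
print(len(payload), len(encoded))         # 768 1024
```

Each 6-bit index is at most 63, so it always selects one of the 64 printable characters, and 768 bytes growing to 1024 is the 4/3 ratio behind the "about 33%" figure.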

Below is some related content I got from ChatGPT.


Origin blog.csdn.net/qq_41934338/article/details/129143068