HTTP transfer encoding (Transfer-Encoding: chunked)

Reprinted from "HTTP transfer encoding increases the amount of data transmitted, only to solve one problem | Practical HTTP". I had originally saved this article in my cloud notes, but when I reviewed it today the title read awkwardly, so I am reproducing it here...

What is a transfer encoding?

In an HTTP message, transfer encoding is marked with the Transfer-Encoding header, which indicates the transfer encoding currently in use.

Transfer encoding changes the format of the message and the way it is transmitted. Using it does not reduce the size of the content being delivered, and may even make the transmission larger. That looks wasteful, but it exists to solve a specific problem.

In short, transfer encoding is meant to be used together with persistent connections: it was designed to send data in chunks over a persistent connection and to mark the end of the transmission. This is explained in detail later.

In the early design, just as content encoding uses the Accept-Encoding request header to mark the compression encodings the client accepts, transfer encoding used the TE request header to specify the supported transfer encodings. However, in the current HTTP/1.1 specification chunked is the only transfer encoding that every recipient must support, and in practice the only one in common use, so it does not need to rely on the TE header.

These details come up again later. Because transfer encoding and persistent connections are closely related, let's start by looking at what a persistent connection is.

Persistent connections

Put simply, a persistent connection is a long-lived connection; the name can be taken quite literally.

In early versions of HTTP, transferring data roughly consisted of initiating a request, establishing a connection, transferring the data, and closing the connection. A persistent connection removes the last step, closing the connection, so that the client and server can keep using the same connection to transfer more content.

This is done to improve transmission efficiency. HTTP is built on top of TCP, so it inherits TCP characteristics such as the three-way handshake and slow start, which makes every connection a valuable resource. To get as much performance out of HTTP as possible, using persistent connections becomes very important, and the HTTP protocol introduced mechanisms for this purpose.

Early HTTP/1.0 had no persistent connections. The concept was only introduced later, implemented through the Connection: Keep-Alive header: the server or the client tells the other end not to close the TCP connection after the data has been sent, because it will be used again.

HTTP/1.1 recognized the importance of persistent connections and specifies that all connections are persistent by default, unless a message explicitly carries the Connection: close header, which indicates that the connection will be closed after this transfer.
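The two defaults described above can be sketched as a small decision function. This is only an illustration under simplifying assumptions: the function name `is_persistent` is made up for this example, and real implementations also consider things such as proxies and message framing.

```python
from typing import Optional


def is_persistent(http_version: str, connection_header: Optional[str]) -> bool:
    """Decide whether the connection stays open after the current message.

    Hypothetical helper: only looks at the protocol version and the
    value of the Connection header.
    """
    tokens = set()
    if connection_header:
        # The Connection header is a comma-separated, case-insensitive list.
        tokens = {t.strip().lower() for t in connection_header.split(",")}

    if "close" in tokens:
        return False                    # explicit request to close
    if http_version == "HTTP/1.1":
        return True                     # persistent by default
    if http_version == "HTTP/1.0":
        return "keep-alive" in tokens   # opt-in for HTTP/1.0
    return False


print(is_persistent("HTTP/1.1", None))
print(is_persistent("HTTP/1.0", "Keep-Alive"))
```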

In fact, in HTTP/1.1 the Connection header no longer needs the Keep-Alive value, but for historical reasons many clients and servers still send it.

Persistent connections raise another question: how to determine that the current data transfer is complete.

Determining that the transfer is complete

In the early days, when persistent connections were not supported, the closing of the connection could be relied on to determine that the transfer had ended. Most browsers did exactly that, but it is not the proper approach. The Content-Length header should be used to specify the length of the content entity being transferred.

In the following example, on a persistent connection, Content-Length is relied on to determine that the data has been sent completely.

[Figure: a response on a persistent connection whose end is determined by Content-Length]

Here Content-Length serves as the basis for judging that the response entity has been fully sent. In this situation Content-Length must match the actual length of the content entity; if it does not, problems arise.

[Figure: responses whose Content-Length does not match the actual entity length]

As shown above, if Content-Length is smaller than the actual length, the content entity is truncated; if it is larger, the recipient cannot tell that the current response has ended, and the request stays in the pending state.
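The effect of a mismatched Content-Length can be reproduced with an in-memory stream. This is a minimal sketch, not a real HTTP client: `read_body` is a hypothetical helper, and `io.BytesIO` stands in for the socket, so the "pending" case shows up as an exception instead of a read that blocks.

```python
import io


def read_body(stream, content_length: int) -> bytes:
    """Read exactly content_length bytes of entity data from the stream."""
    body = stream.read(content_length)
    if len(body) < content_length:
        # On a real persistent connection the reader would keep waiting
        # here for bytes that never arrive (the pending state).
        raise EOFError(
            f"expected {content_length} bytes, got only {len(body)}"
        )
    return body


payload = b"Hello CxmyDev"  # 13 bytes

# Correct Content-Length: the whole entity is returned.
print(read_body(io.BytesIO(payload), 13))

# Content-Length too small: the entity is truncated, and the leftover
# bytes stay on the connection, where they would be misread as the
# beginning of the next response.
stream = io.BytesIO(payload)
print(read_body(stream, 5), stream.read())

# Content-Length too large: the reader never gets enough data.
try:
    read_body(io.BytesIO(payload), 20)
except EOFError as exc:
    print("would stay pending:", exc)
```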

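Ideally, when responding to a request we already know the size of its content entity. In practice, however, the length of the content entity is sometimes not easy to obtain, for example when the entity comes from a network file or is generated dynamically. If we still want to know the length up front, the only option is to allocate a large enough buffer, wait until all the content has been buffered, and then compute the length.

This is not a good solution, though. Buffering everything first consumes more memory, and second takes more time, which makes the client wait too long.

What is needed is a new mechanism that does not depend on the value of Content-Length to determine whether the content entity has been fully transferred. This is where the Transfer-Encoding header comes in.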

Transfer-Encoding: chunked

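As mentioned earlier, in current HTTP/1.1 practice Transfer-Encoding is used with essentially one value, chunked, which indicates that the message is sent with chunked transfer encoding.

Since chunked is effectively the only option, we only need to specify Transfer-Encoding: chunked, after which the content entity can be wrapped into chunks and sent one chunk at a time.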

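The rules of chunked transfer:

1. Each chunk consists of a data length value in hexadecimal and the actual data.

2. The length value occupies its own line and is separated from the actual data by CRLF (\r\n).

3. The length value counts only the data of the current chunk; it does not include the CRLF that follows the data.

4. Finally, a chunk with a length value of 0 marks the end of the content entity.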
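The four rules above translate almost directly into an encoder. The following is a minimal sketch; the function name `encode_chunked` is chosen for this example, and it produces only the message body, not the response headers.

```python
def encode_chunked(chunks) -> bytes:
    """Encode an iterable of byte strings as a chunked message body."""
    out = bytearray()
    for data in chunks:
        if not data:
            # A zero-length chunk would be read as the end marker (rule 4),
            # so empty pieces are skipped.
            continue
        # Rules 1-3: hexadecimal length on its own line, then the data,
        # each terminated by CRLF; the length excludes the trailing CRLF.
        out += f"{len(data):x}".encode("ascii") + b"\r\n"
        out += data + b"\r\n"
    # Rule 4: a zero-length chunk ends the entity; the final CRLF closes
    # the (here empty) trailer section.
    out += b"0\r\n\r\n"
    return bytes(out)


print(encode_chunked([b"Hello CxmyDev", b"123"]))
```

Because each chunk carries its own length, the sender can start transmitting before it knows the total size of the entity.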

[Figure: a chunked response carrying three chunks and a terminating zero-length chunk]

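In this example, the response headers first declare Transfer-Encoding: chunked. The first chunk, "0123456780", is then sent with a declared length of b (hexadecimal for 11; as printed here the string has only 10 characters, which would correspond to a). After that, "Hello CxmyDev" and "123" are sent, and finally a chunk of length 0 marks the end of the response.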
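On the receiving side, the same framing can be undone by reading a length line, then that many bytes, until the zero-length chunk arrives. This is a rough sketch: `decode_chunked` is a hypothetical helper, `io.BytesIO` stands in for the connection, and only the "Hello CxmyDev" and "123" chunks from the example are used.

```python
import io


def decode_chunked(stream) -> bytes:
    """Reassemble the content entity from a chunked message body."""
    body = bytearray()
    while True:
        size_line = stream.readline()
        if not size_line.endswith(b"\r\n"):
            raise ValueError("chunk size line is missing its CRLF")
        # Anything after ';' is a chunk extension and is ignored here.
        size_text = size_line.split(b";", 1)[0].strip()
        size = int(size_text.decode("ascii"), 16)
        if size == 0:
            break  # the zero-length chunk marks the end of the entity
        data = stream.read(size)
        if len(data) != size:
            raise ValueError("chunk is shorter than its declared length")
        if stream.read(2) != b"\r\n":
            raise ValueError("chunk data is not followed by CRLF")
        body += data
    # Skip any trailer lines up to the blank line that ends the message.
    while stream.readline() not in (b"\r\n", b""):
        pass
    return bytes(body)


wire = b"d\r\nHello CxmyDev\r\n3\r\n123\r\n0\r\n\r\n"
print(decode_chunked(io.BytesIO(wire)))
```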

Trailers in chunked encoding

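When chunked transfer encoding is used, there is still a chance, after the transfer has finished, to append a piece of data to the end of the chunked message. This data is called the trailer (Trailer).

The trailer can carry data that the server needs to pass along at the end. The client is actually free to ignore and discard the trailer, so both sides need to agree on what is transmitted.

The trailer can contain additional header fields. Headers that are needed to frame the message, such as Transfer-Encoding, Trailer, and Content-Length, cannot be sent in the trailer; most other HTTP headers can.

Trailers are typically used to pass values that cannot be determined when the response begins. For example, Content-MD5 is a header commonly appended in the trailer: as with the length, for a content entity that needs chunked transfer it is hard to compute the MD5 value at the start of the response.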

[Figure: a chunked response with a Trailer header and a Content-MD5 trailer]

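Note that the Trailer header is added to the headers here to announce that a Content-MD5 trailer header will be sent at the end. If there are several trailer fields, they can be separated by commas.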
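A response with a trailer might be assembled as in the sketch below. The function name and the status line are assumptions made for illustration; the Content-MD5 value is the Base64 form of the MD5 digest, computed incrementally while the chunks are written, which is exactly why it can only be sent at the end.

```python
import base64
import hashlib


def build_chunked_response_with_md5(chunks) -> bytes:
    """Build a chunked response that sends Content-MD5 as a trailer."""
    head = (
        b"HTTP/1.1 200 OK\r\n"
        b"Transfer-Encoding: chunked\r\n"
        b"Trailer: Content-MD5\r\n"   # announces the trailer field
        b"\r\n"
    )
    md5 = hashlib.md5()
    body = bytearray()
    for data in chunks:
        if not data:
            continue  # an empty chunk would end the body early
        md5.update(data)  # the digest grows as the data is streamed
        body += f"{len(data):x}".encode("ascii") + b"\r\n" + data + b"\r\n"
    digest = base64.b64encode(md5.digest())
    # Zero-length chunk, then the trailer field, then the closing blank line.
    body += b"0\r\n" + b"Content-MD5: " + digest + b"\r\n" + b"\r\n"
    return head + bytes(body)


print(build_chunked_response_with_md5([b"Hello CxmyDev", b"123"]).decode("ascii"))
```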

Combining content encoding and transfer encoding

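Content encoding and transfer encoding are generally used together. We first apply content encoding to compress the content entity, and then send it out in chunks through transfer encoding. The client receives the chunks, reassembles them, and restores the original data.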
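The order of the two steps can be sketched as follows, assuming gzip as the content encoding and an arbitrary chunk size of 16 bytes. The names `send` and `receive` are made up for this example, and the whole message is held in memory for simplicity.

```python
import gzip


def send(entity: bytes, chunk_size: int = 16) -> bytes:
    """Compress the entity (content encoding), then chunk it (transfer encoding)."""
    compressed = gzip.compress(entity)
    wire = bytearray()
    for i in range(0, len(compressed), chunk_size):
        piece = compressed[i:i + chunk_size]
        wire += f"{len(piece):x}".encode("ascii") + b"\r\n" + piece + b"\r\n"
    wire += b"0\r\n\r\n"
    return bytes(wire)


def receive(wire: bytes) -> bytes:
    """Undo the steps in reverse: reassemble the chunks, then decompress."""
    compressed = bytearray()
    pos = 0
    while True:
        eol = wire.index(b"\r\n", pos)           # end of the length line
        size = int(wire[pos:eol].decode("ascii"), 16)
        if size == 0:
            break
        start = eol + 2
        compressed += wire[start:start + size]   # binary data, read by length
        pos = start + size + 2                   # skip the CRLF after the data
    return gzip.decompress(bytes(compressed))


original = b"Hello CxmyDev " * 20
print(receive(send(original)) == original)
```

The chunk lengths describe the compressed bytes, not the original entity, because the transfer encoding is applied to whatever the content encoding produced.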

[Figure: content encoding combined with transfer encoding]

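Transfer encoding summary

By now we should have a reasonable understanding of transfer encoding. Here is a brief summary:

1. Transfer encoding is marked with the Transfer-Encoding header; in current HTTP/1.1 practice its only commonly used value is chunked, which means chunked encoding.

2. Transfer encoding mainly solves the problem of determining the end of the content entity when data is sent in chunks over a persistent connection.

3. The chunk format: data length (hexadecimal) + chunk data.

4. If there is additional data, it can be sent after the end of the chunks as a trailer, announced with the Trailer header.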

Transfer encoding is often used together with content encoding.

In addition, every HTTP/1.1 implementation should support transfer encoding. If a message arrives with a transfer encoding that the recipient does not understand, it should reply directly with the 501 Not Implemented status code.
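A server-side check for this rule might look like the sketch below. The set of supported codings and the function name are assumptions for illustration; a real server would decide this inside its request parser.

```python
from typing import Optional, Tuple

# Assumption for this sketch: the server only understands chunked.
SUPPORTED_TRANSFER_CODINGS = {"chunked"}


def check_transfer_encoding(header_value: Optional[str]) -> Optional[Tuple[int, str]]:
    """Return (501, reason) for an unknown transfer coding, else None."""
    if not header_value:
        return None  # no Transfer-Encoding header, nothing to check
    # The header is a comma-separated, case-insensitive list of codings.
    codings = [c.strip().lower() for c in header_value.split(",") if c.strip()]
    for coding in codings:
        if coding not in SUPPORTED_TRANSFER_CODINGS:
            return 501, "Not Implemented"
    return None


print(check_transfer_encoding("chunked"))
print(check_transfer_encoding("x-unknown, chunked"))
```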

Origin www.cnblogs.com/jamesvoid/p/11297843.html