Getting Started with the HTTP Protocol




HTTP is a foundational protocol of the Internet and essential knowledge for web development. Its latest version, HTTP/2, has made it a hot topic.



This article introduces the historical evolution and design ideas of the HTTP protocol.



1. HTTP/0.9



HTTP is an application-layer protocol built on TCP/IP. It is not concerned with how data packets are transmitted; it mainly specifies the communication format between the client and the server, and uses port 80 by default.



The earliest version was version 0.9 released in 1991. This version is extremely simple, with only one command GET.



GET /index.html

The above command indicates that, once the TCP connection is established, the client requests the web page index.html from the server.



The protocol stipulates that the server can only respond with a string in HTML format, and cannot respond in any other format.



<html><body>Hello World</body></html>

After the server finishes sending, it closes the TCP connection.
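
The exchange described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a real server: socket.socketpair() stands in for the TCP connection, and the PAGES table and the serve_once and fetch functions are hypothetical names introduced only for this example.

```python
import socket

# Hypothetical page table; the path and markup are illustrative only.
PAGES = {"/index.html": "<html><body>Hello World</body></html>"}


def serve_once(conn: socket.socket) -> None:
    """Answer one HTTP/0.9-style request, then close the connection."""
    with conn:  # leaving the block closes the socket, which ends the response
        data = b""
        while not data.endswith(b"\n"):
            part = conn.recv(1024)
            if not part:
                break
            data += part
        method, _, path = data.decode("ascii").strip().partition(" ")
        if method == "GET":
            # No status line and no headers: the reply is the HTML itself.
            conn.sendall(PAGES.get(path, "").encode("ascii"))


def fetch(path: str) -> str:
    """Send 'GET <path>' and read until the server closes the connection."""
    client, server = socket.socketpair()  # stand-in for a TCP connection
    with client:
        client.sendall(f"GET {path}\r\n".encode("ascii"))
        serve_once(server)  # run the server side inline to stay single-threaded
        chunks = []
        while True:
            data = client.recv(4096)
            if not data:  # empty read: the server has closed the connection
                break
            chunks.append(data)
    return b"".join(chunks).decode("ascii")
```

Because the response carries no length information, the only way the client knows the page is complete is that the connection has been closed.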



2. HTTP/1.0



2.1 Introduction



In May 1996, the HTTP/1.0 version was released, and the content was greatly increased.



First, content in any format can be sent. This allows the Internet to transmit not only text, but also images, videos, and binary files. This laid the foundation for the great development of the Internet.



Secondly, in addition to the GET command, the POST command and the HEAD command are also introduced, which enriches the interaction between the browser and the server.



Third, the format of HTTP requests and responses also changed. In addition to the data itself, each message must include header information (HTTP headers) that describes some metadata.



Other new features include status codes, support for multiple character sets, multi-part types, authorization, caching, content encoding, and more.



2.2 Request Format



The following is an example of a version 1.0 HTTP request.



GET / HTTP/1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)
Accept: */*

As you can see, this format has changed a lot from version 0.9.



The first line is the request command, and the protocol version (HTTP/1.0) must be added at the end. It is followed by multi-line header information, describing the situation of the client.
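
As a rough sketch of this layout, the following Python function assembles such a request as bytes. The function name build_request and its parameters are hypothetical, and it only covers GET requests without a body.

```python
def build_request(path: str, headers: dict, version: str = "HTTP/1.0") -> bytes:
    """Assemble a GET request: request line, header lines, then a blank line."""
    lines = [f"GET {path} {version}"]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    # Every line ends with CRLF; an empty line marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_request(
    "/",
    {
        "User-Agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_5)",
        "Accept": "*/*",
    },
)
```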



2.3 Response format



The server's response is as follows.



HTTP/1.0 200 OK
Content-Type: text/plain
Content-Length: 137582
Expires: Thu, 05 Dec 1997 16:00:00 GMT
Last-Modified: Wed, 5 August 1996 15:55:28 GMT
Server: Apache 0.84

Hello World

The format of the response is "header information + a blank line (\r\n) + data". The first line is "protocol version + status code + status description".
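
To make that layout concrete, here is a minimal Python sketch that splits a raw response into its parts. The name parse_response is hypothetical, and the sketch assumes a well-formed message that is already fully in memory; it does no validation.

```python
def parse_response(raw: bytes):
    """Split a response into version, status code, description, headers, data."""
    # The first blank line (CRLF CRLF) separates the headers from the data.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("ascii").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


raw = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Server: Apache 0.84\r\n"
    b"\r\n"
    b"Hello World"
)
version, code, reason, headers, body = parse_response(raw)
```

Header names are lowercased here only so that lookups do not depend on how the sender capitalized them.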



2.4 Content-Type field



Regarding character encoding, version 1.0 stipulates that the header information must be ASCII, while the data that follows can be in any format. Therefore, when the server responds, it must tell the client what format the data is in, which is the job of the Content-Type field.



Below are some common Content-Type field values.



text/plain



text/html



text/css



image/jpeg



image/png



image/svg+xml



audio/mp4



video/mp4



application/javascript



application/pdf



application/zip



application/atom+xml



These data types are collectively called MIME types. Each value consists of a primary type and a subtype, separated by a slash.



In addition to predefined types, manufacturers can also customize types.



application/vnd.debian.binary-package

The type above indicates that a Debian binary package is being sent.



A MIME type can also carry parameters, appended after a semicolon.



Content-Type: text/html; charset=utf-8

The above type indicates that a web page is being sent, encoded in UTF-8.
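
The structure of a Content-Type value (primary type, subtype, optional parameters) can be illustrated with a small Python sketch. The function parse_content_type is a hypothetical helper, and its naive split on semicolons assumes parameter values never contain a semicolon themselves.

```python
def parse_content_type(value: str):
    """Split a Content-Type value into primary type, subtype, and parameters."""
    media_type, *params = [part.strip() for part in value.split(";")]
    primary, _, subtype = media_type.lower().partition("/")
    parameters = {}
    for param in params:
        if not param:  # tolerate a trailing semicolon
            continue
        key, _, val = param.partition("=")
        parameters[key.strip().lower()] = val.strip().strip('"')
    return primary, subtype, parameters


page = parse_content_type("text/html; charset=utf-8")
package = parse_content_type("application/vnd.debian.binary-package")
```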



When making a request, the client can use the Accept field to declare which data formats it can accept.



Accept: */*

In the above code, the client declares that it can accept data in any format.



MIME types are not only used in the HTTP protocol, but also in other places, such as HTML pages.







2.5 Content-Encoding field



Since the data to be sent can be in any format, the data can be compressed before sending. The Content-Encoding field specifies the compression method of the data.



Content-Encoding: gzip
Content-Encoding: compress
Content-Encoding: deflate

When requesting, the client uses the Accept-Encoding field to indicate which compression methods it can accept.



Accept-Encoding: gzip, deflate
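
The negotiation between Accept-Encoding and Content-Encoding can be sketched with Python's gzip and zlib modules. The helper names below are hypothetical, the server's preference order (gzip first) is an assumption, and the sketch ignores quality values such as q=0 that a full implementation would have to honor.

```python
import gzip
import zlib


def choose_encoding(accept_encoding: str):
    """Pick the first encoding this sketch supports that the client offers."""
    offered = [item.split(";")[0].strip().lower()
               for item in accept_encoding.split(",")]
    for name in ("gzip", "deflate"):  # assumed server preference order
        if name in offered:
            return name
    return None


def encode_body(body: bytes, accept_encoding: str):
    """Return (response headers, payload) for the negotiated encoding."""
    encoding = choose_encoding(accept_encoding)
    if encoding == "gzip":
        return {"Content-Encoding": "gzip"}, gzip.compress(body)
    if encoding == "deflate":
        # In HTTP, "deflate" means the zlib container format.
        return {"Content-Encoding": "deflate"}, zlib.compress(body)
    return {}, body  # no common encoding: send the data uncompressed


def decode_body(headers: dict, payload: bytes) -> bytes:
    """Undo the compression named in the Content-Encoding field, if any."""
    encoding = headers.get("Content-Encoding")
    if encoding == "gzip":
        return gzip.decompress(payload)
    if encoding == "deflate":
        return zlib.decompress(payload)
    return payload
```
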

2.6 Disadvantages



The main disadvantage of HTTP/1.0 is that only one request can be sent per TCP connection. After sending the data, the connection is closed. If you want to request other resources, you must create a new connection.



Establishing a TCP connection is expensive: it requires a three-way handshake between the client and the server, and the sending rate is low at the beginning (slow start). As a result, HTTP/1.0 performs relatively poorly, and the problem becomes more prominent as web pages load more and more external resources.



To solve this problem, some browsers use a non-standard Connection field when requesting.



Connection: keep-alive

This field asks the server not to close the TCP connection, so that later requests can reuse it. The server responds with the same field.



Connection: keep-alive

The TCP connection then stays open for reuse until the client or the server actively closes it. However, this is not a standard field, and different implementations may behave inconsistently, so it is not a fundamental fix.



3. HTTP/1.1



In January 1997, HTTP/1.1 was released, only about half a year after version 1.0. It further refined the protocol and, some 20 years on, is still the most widely used version.



3.1 Persistent connection



The biggest change in version 1.1 is the introduction of persistent connections: the TCP connection is not closed by default and can be reused by multiple requests, without declaring Connection: keep-alive.



When the client and server find that the other party has not been active for a period of time, they can actively close the connection. However, the standard practice is that the client sends Connection: close in the last request, explicitly asking the server to close the TCP connection.



Connection: close

Currently, for the same domain name, most browsers allow 6 persistent connections at the same time.
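
Connection reuse can be observed with Python's standard http.server and http.client modules. The sketch below is an illustration only: the Handler class, the seen_ports list, and fetch_over_one_connection are hypothetical names, and the server is a throwaway one bound to a free local port. Each request records the client's port number, so identical port numbers mean the requests arrived over the same TCP connection.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

seen_ports = []  # client port observed for each request (hypothetical probe)


class Handler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # keep the connection open by default

    def do_GET(self):
        seen_ports.append(self.client_address[1])
        body = f"you asked for {self.path}".encode("ascii")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        # The length lets the client find the end of this response
        # without the connection being closed.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence the default request logging


def fetch_over_one_connection(paths):
    """Request every path through a single HTTPConnection object."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), Handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
    bodies = []
    try:
        for path in paths:
            conn.request("GET", path)
            bodies.append(conn.getresponse().read().decode("ascii"))
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return bodies
```

After calling fetch_over_one_connection(["/a", "/b"]), every entry in seen_ports should be the same port number, since no new connection was opened between requests.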



3.2 Pipeline mechanism



Version 1.1 also introduced a pipelining mechanism, that is, in the same TCP connection, the client can send multiple requests at the same time. This further improves the efficiency of the HTTP protocol.



For example, suppose the client needs two resources. Previously, within one TCP connection, it sent request A, waited for the server's response, and only then sent request B. Pipelining lets the browser send requests A and B together, but the server still responds in order: it answers request A first, and request B only after that is complete.



3.3 Content-Length field



A single TCP connection can now carry multiple responses, so there must be a mechanism to tell which response a given piece of data belongs to. This is the role of the Content-Length field, which declares the data length of the response.



Content-Length: 3495

The above code tells the browser that the length of this response is 3495 bytes, and the following bytes belong to the next response.



In version 1.0, the Content-Length field is not required: when the browser sees that the server has closed the TCP connection, it knows that all the data has been received.
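
The way Content-Length separates back-to-back responses on one connection can be sketched as follows. The function split_responses is a hypothetical helper that assumes the whole byte stream is already in memory and that every response declares a Content-Length.

```python
def split_responses(stream: bytes):
    """Cut a byte stream into (header text, body) pairs using Content-Length."""
    responses = []
    while stream:
        head, sep, rest = stream.partition(b"\r\n\r\n")
        if not sep:
            raise ValueError("incomplete header block")
        length = None
        for line in head.split(b"\r\n")[1:]:  # skip the status line
            name, _, value = line.partition(b":")
            if name.strip().lower() == b"content-length":
                length = int(value.decode("ascii"))
        if length is None:
            raise ValueError("this sketch requires a Content-Length field")
        if len(rest) < length:
            raise ValueError("body is shorter than the declared length")
        responses.append((head.decode("ascii"), rest[:length]))
        stream = rest[length:]  # whatever follows belongs to the next response
    return responses


stream = (
    b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nfirst"
    b"HTTP/1.1 200 OK\r\nContent-Length: 6\r\n\r\nsecond"
)
parts = split_responses(stream)
```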



3.4 Chunked transfer encoding



The prerequisite for using the Content-Length field is that the server must know the length of the response data before it starts sending.



For some time-consuming dynamic operations, this means that the server cannot send data until all operations are completed, which is obviously inefficient. A better way to deal with it is to generate a block of data, send a block, and use "stream mode" (stream) instead of "buffer mode" (buffer).



Therefore, version 1.1 stipulates that "chunked transfer encoding" can be used instead of the Content-Length field. If the request or response header contains a Transfer-Encoding field set to chunked, the body consists of an unspecified number of data blocks.



Transfer-Encoding: chunked

Before each non-empty data block, there is a hexadecimal value indicating the length of the block. The last block has size 0, which signals that all the data for this response has been sent. Below is an example.



HTTP/1.1 200 OK
Content-Type: text/plain
Transfer-Encoding: chunked

25

This is the data in the first chunk

1C

and this is the second one

3

con

8

sequence

0
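
The framing in this example can be reproduced with a short Python sketch. The functions encode_chunked and decode_chunked are hypothetical helpers; the decoder assumes a complete, well-formed body and ignores trailers. Note that in the example above, the first two blocks include their own trailing line break as part of the data, which is why their sizes are 25 and 1C in hexadecimal.

```python
def encode_chunked(blocks) -> bytes:
    """Frame each non-empty block as '<hex size>CRLF<data>CRLF', then end with 0."""
    out = bytearray()
    for block in blocks:
        if block:
            out += f"{len(block):X}\r\n".encode("ascii") + block + b"\r\n"
    out += b"0\r\n\r\n"  # a block of size 0 marks the end of the data
    return bytes(out)


def decode_chunked(body: bytes) -> bytes:
    """Reassemble the data from a chunked body."""
    decoded = bytearray()
    pos = 0
    while True:
        line_end = body.index(b"\r\n", pos)
        size_field = body[pos:line_end].split(b";")[0]  # drop chunk extensions
        size = int(size_field.decode("ascii").strip(), 16)
        pos = line_end + 2
        if size == 0:
            break
        decoded += body[pos:pos + size]
        pos += size + 2  # skip the data and the CRLF that follows it
    return bytes(decoded)


blocks = [
    b"This is the data in the first chunk\r\n",
    b"and this is the second one\r\n",
    b"con",
    b"sequence",
]
wire = encode_chunked(blocks)
```
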

3.5 Other Features



Version 1.1 also adds several new methods, including PUT, PATCH, OPTIONS, and DELETE.



In addition, a new Host field is added to the header information of the client request, which is used to specify the domain name of the server.



Host: www.example.com

With the Host field, requests can be sent to different websites on the same server, laying the foundation for the rise of virtual hosts.
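
How a single server might use the Host field to choose between websites can be sketched as below. The SITES table, its contents, and the pick_site function are hypothetical, and the port handling is deliberately naive (it does not cover IPv6 literals).

```python
# Hypothetical mapping from domain name to the site served for it.
SITES = {
    "www.example.com": "site A",
    "www.example.org": "site B",
}


def pick_site(request_text: str):
    """Return the site matching the request's Host field, or None."""
    for line in request_text.split("\r\n")[1:]:  # skip the request line
        if not line:
            break  # blank line: end of the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip().lower().split(":")[0]  # drop an optional port
            return SITES.get(host)
    return None


request_a = "GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
request_b = "GET / HTTP/1.1\r\nHost: www.example.org:8080\r\n\r\n"
```

Both requests could arrive at the same IP address and port; only the Host field tells the server which website is meant.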



3.6 Disadvantages



Although version 1.1 allows a TCP connection to be reused, all data communication within one connection happens in order. The server moves on to the next response only after finishing the current one. If an earlier response is particularly slow, many requests queue up behind it. This is called "head-of-line blocking".



There are only two ways to avoid this problem: reduce the number of requests, or open more persistent connections at the same time. This has led to many web optimization tricks, such as merging scripts and style sheets, embedding images in CSS code, domain sharding, and more. This extra work could have been avoided if the HTTP protocol had been better designed.
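
The effect of head-of-line blocking, and of opening more connections, can be illustrated with a toy timing model in Python. Everything here is an assumption made for illustration: the function completion_times is hypothetical, each connection handles one response at a time in order, each request goes to whichever connection becomes free first, and the durations are invented numbers in arbitrary time units.

```python
import heapq


def completion_times(durations, connections=1):
    """Return the time at which each response finishes, in request order."""
    free_at = [0] * connections  # min-heap: when each connection becomes idle
    finished = []
    for duration in durations:
        start = heapq.heappop(free_at)  # earliest available connection
        end = start + duration
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished


# One slow response (5 units) followed by two fast ones (1 unit each).
single = completion_times([5, 1, 1], connections=1)
double = completion_times([5, 1, 1], connections=2)
```

With one connection, the two fast responses are stuck behind the slow one and finish at times 6 and 7. With two connections, they finish at times 1 and 2 while the slow response proceeds on its own connection.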
