HTTP simple request-response protocol


Hypertext Transfer Protocol (HTTP) is a simple request-response protocol that usually runs on top of TCP. It specifies what kinds of messages the client may send to the server and what kinds of responses it gets back. The headers of request and response messages are given in ASCII form; [9] the message content has a MIME-like format. This simple model was responsible for the early success of the Web because it made development and deployment very straightforward.

Name: Hypertext Transfer Protocol
Abbreviation: HTTP
Underlying transport: TCP
Applicable browsers: Firefox, Google Chrome, and others
Working layer: Application layer
Function: Specifies the rules for transferring information between WWW servers and browsers; it is an agreement that both parties follow.

Introduction

The World Wide Web (WWW) originated at CERN, the European particle physics laboratory in Geneva. It was the emergence of WWW technology that enabled the Internet to develop at an unimaginable speed. Although the Internet itself had been developing for decades, this TCP/IP-based technology became its largest information system in just ten years. Its success is attributed to its simplicity and practicality. Behind the WWW is a series of protocols and standards that make this work possible. This is the Web protocol suite, which includes the Hypertext Transfer Protocol (HTTP).

In 1990, HTTP became the supporting protocol of the WWW. It was proposed by Tim Berners-Lee, the founder of the WWW. The World Wide Web Consortium (W3C) was established later and worked with the IETF (Internet Engineering Task Force) to further improve and publish HTTP.

HTTP is an application layer protocol. Like other application layer protocols, it exists to implement a specific type of application, and its functions are carried out by programs running in user space. HTTP itself is only a protocol specification recorded in documents; what actually communicates over HTTP are the programs that implement it.

HTTP communicates using a B/S (browser/server) architecture. Server-side implementations of HTTP include httpd, nginx, and others. Client-side implementations are mainly web browsers, such as Firefox, Internet Explorer, Google Chrome, Safari, and Opera; there are also command-line clients such as elinks and curl. Web services are based on TCP, so in order to respond to client requests at any time, the web server listens on a TCP port, by default 80/TCP. In this way, the client browser and the web server can communicate through HTTP.

Development stages

0.9

HTTP/0.9 is a simple and fast protocol, but it falls far short of the needs of increasingly demanding applications. It exchanges information without any negotiation and is limited to text. Because content cannot be negotiated, the two parties have no way to agree on what kind of content is being exchanged, so images, for example, cannot be displayed or processed.

1.0

The 1.0 stage was formalized in 1996, when HTTP/1.0 was published as RFC 1945 with Tim Berners-Lee as one of its authors. As it was enriched and developed, HTTP/1.0 became the most important transaction-oriented application layer protocol. It establishes and tears down a connection for each request/response pair. It is simple and easy to manage, so it met most needs and was widely adopted.

1.1

HTTP/1.0 specified the connection method and connection type for both parties, which greatly expanded the scope of HTTP, but it gave little consideration to speed and efficiency, which matter most on the Internet. After all, the author of the protocol did not expect at the time that HTTP would become so popular. HTTP/1.1 was developed to address these shortcomings.

For the specific content of the HTTP/1.1 protocol, refer to RFC 2616 (since obsoleted by later RFCs).

2.0

HTTP/2 was preceded by HTTP/1.0 and HTTP/1.1. Although only these two versions were in wide use before it, the protocol specifications they contain are large enough to give any experienced engineer a headache. New versions of a network protocol do not immediately replace older ones. In fact, 1.0 and 1.1 coexisted for a long time, because network infrastructure is slow to update.

For the specific content of the HTTP/2 protocol, refer to RFC 7540.

Application scenarios

When HTTP was first created, it was mainly used to fetch content for the Web. At that time, content was not as rich as it is now, layouts were not as polished, and there were almost no user interaction scenarios. For this simple scenario of fetching web content, HTTP performs reasonably well. But with the development of the Internet and the arrival of Web 2.0, pages began to display more content (more image files), layouts became more elaborate (more CSS), and more complex interactions were introduced (more JavaScript). The total amount of data loaded and the number of requests made when a user opens a website's home page have grown accordingly.

Today, the home page of most portal websites exceeds 2 MB, and the number of requests can be as high as 100. Another widespread application is in mobile Internet client apps. How apps use HTTP varies greatly with their nature. For an e-commerce app, loading the home page may take more than 10 requests. For instant messaging apps such as WeChat, HTTP requests may be limited to downloading voice and picture files, and the request frequency is not high.

Working principle

HTTP is based on the client/server model and is connection-oriented. Typical HTTP transaction processing has the following process: 

(1) The client establishes a connection with the server;

(2) The client makes a request to the server;

(3) The server accepts the request and returns the corresponding file as a response according to the request;

(4) The client and server close the connection.
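The four steps above can be sketched with a raw TCP socket. This is only an illustration under stated assumptions: the handler class `HelloHandler` and the functions `run_transaction` and `demo` are hypothetical names, and the server is Python's built-in `http.server` started on a loopback port chosen by the operating system rather than a real web server on port 80.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical handler: answers every GET with a small text body."""

    def do_GET(self):
        body = b"hello from the server\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def run_transaction(host, port, path="/"):
    # (1) The client establishes a connection with the server.
    with socket.create_connection((host, port), timeout=5) as sock:
        # (2) The client makes a request to the server.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # (3) The server returns its response; read until it closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # (4) Leaving the with-block closes the client side of the connection.
    return b"".join(chunks)


def demo():
    # Port 0 asks the operating system for any free port (an assumption for the demo).
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return run_transaction("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the raw bytes of the response: a status line, the headers, a blank line, and the body.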

In this basic model, the HTTP connection between the client and the server is a one-time connection: each connection handles only one request. Once the server has returned the response to that request, it closes the connection, and a new connection is established for the next request. This design reflects the fact that a WWW server faces thousands of users on the Internet and can offer only a limited number of connections. The server therefore does not leave a connection waiting idle. Releasing connections promptly can greatly improve the server's efficiency.

HTTP is a stateless protocol: the server does not retain any state from its transactions with the client. This greatly reduces the memory load on the server and helps it respond quickly. HTTP is also an object-oriented protocol in the sense that it allows data objects of any type to be transferred. It identifies the content and size of the transmitted data through a data type and a length, and it allows data to be transmitted in compressed form. When a user follows a hypertext link defined in an HTML document, the browser establishes a connection with the specified server over TCP/IP.
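As a rough illustration of how a data type and a length describe a transmitted object, the sketch below builds the Content-Type, Content-Length, and optional Content-Encoding headers for a body. The function name `describe_entity` is hypothetical, and gzip is just one possible compression coding chosen for the example.

```python
import gzip


def describe_entity(body: bytes, content_type: str, compress: bool = False):
    """Return (headers, payload) describing a body the way HTTP headers do."""
    headers = {"Content-Type": content_type}
    payload = body
    if compress:
        # Compressed transmission: the payload changes, and a header says how.
        payload = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    # The length always describes the bytes actually sent on the wire.
    headers["Content-Length"] = str(len(payload))
    return headers, payload
```

A receiver reverses the process: it reads Content-Length bytes, undoes the coding named in Content-Encoding, and interprets the result according to Content-Type.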

HTTP also supports persistent connections. In HTTP/0.9 and 1.0, the connection is closed after a single request/response pair. HTTP/1.1 introduced a keep-alive mechanism that allows a connection to be reused for multiple requests. Such a persistent connection can significantly reduce request latency, because the client does not need to repeat the TCP three-way handshake after the first request has been sent. Another positive side effect is that connections typically become faster over time thanks to TCP's slow start mechanism.
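Connection reuse can be observed with Python's standard `http.client`. The sketch below is an assumption-laden demo, not a benchmark: `KeepAliveHandler` and `fetch_over_one_connection` are hypothetical names, and the server is a local `http.server` instance configured to speak HTTP/1.1 so that it keeps the connection open between requests.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class KeepAliveHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # lets http.server keep connections open

    def do_GET(self):
        body = self.path.encode("ascii")  # echo the requested path
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_over_one_connection(paths):
    """Send several GET requests over a single TCP connection."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), KeepAliveHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5
    )
    results = []
    first_sock = None
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            body = response.read()  # must be read fully before the next request
            if first_sock is None:
                first_sock = conn.sock
            # The third field is True while the same socket is still in use.
            results.append(
                (response.status, body.decode("ascii"), conn.sock is first_sock)
            )
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return results
```

Each tuple in the result holds the status code, the echoed path, and a flag showing that the request travelled over the socket opened for the first request, so only one TCP handshake was needed.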

Version 1.1 of the protocol also features bandwidth optimization improvements over HTTP/1.0. For example, HTTP/1.1 introduced chunked transfer encoding to allow content on persistent connections to be streamed rather than buffered. HTTP pipelining further reduces latency by allowing clients to send multiple requests before waiting for each response. Another feature of the protocol is byte serving, where the server transmits only the portion of the resource that the client explicitly requested.
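To make chunked transfer encoding concrete, here is a minimal sketch of the wire format: each chunk is its size in hexadecimal, CRLF, the data, CRLF, and a zero-size chunk ends the body. The function names `encode_chunked` and `decode_chunked` are hypothetical, and the decoder assumes a complete, well-formed body and ignores trailers and chunk extensions.

```python
def encode_chunked(pieces):
    """Encode an iterable of byte strings as an HTTP/1.1 chunked body."""
    out = bytearray()
    for piece in pieces:
        if piece:  # a zero-length chunk would terminate the body early
            out += f"{len(piece):X}".encode("ascii") + b"\r\n" + piece + b"\r\n"
    out += b"0\r\n\r\n"  # last chunk, followed by an empty trailer section
    return bytes(out)


def decode_chunked(data):
    """Decode a complete chunked body back into the original bytes."""
    body = bytearray()
    pos = 0
    while True:
        line_end = data.index(b"\r\n", pos)
        # Anything after ';' on the size line is a chunk extension; drop it.
        size_field = data[pos:line_end].split(b";", 1)[0].strip()
        size = int(size_field.decode("ascii"), 16)
        pos = line_end + 2
        if size == 0:
            break
        body += data[pos:pos + size]
        pos += size + 2  # skip the chunk data and its trailing CRLF
    return bytes(body)
```

Because every chunk carries its own length, a sender can start transmitting before it knows the total size of the content, which is what makes streaming possible on a connection that stays open.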

Technically speaking, the client opens a socket to a specific TCP port on the server (the port number is usually 80). If the server is listening for connections on this well-known port, the connection is established. The client then sends a request block containing the request method over the connection.

Nine request methods are commonly listed for HTTP (GET, HEAD, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE, and PATCH). Each request method specifies a different way of exchanging information between the client and the server. The most commonly used methods are GET and POST. The server carries out the corresponding operation according to the client's request, returns the result to the client in the form of a response block, and finally closes the connection.

Mode of operation 

In the WWW, "client" and "server" are relative concepts that only exist during a specific connection, that is, a client in one connection may act as a server in another connection. The information exchange process based on the HTTP client/server model is divided into four processes: establishing a connection, sending request information, sending response information, and closing the connection.

Figure 3: One of the ways HTTP operates

HTTP is based on the request/response paradigm. After a client establishes a connection with the server, it sends a request to the server. The request consists of a request method, a uniform resource identifier, and a protocol version number, followed by MIME-like information that includes request modifiers, client information, and possibly content. After receiving the request, the server returns a corresponding response. The response consists of a status line, which includes the protocol version number and a success or error code, followed by MIME-like information that includes server information, entity information, and possibly content.

Put simply, besides its HTML files, every web server also runs a resident HTTP program that responds to user requests. Your browser is an HTTP client that sends requests to the server. When you enter a starting file in the browser or click a hyperlink, the browser sends an HTTP request to the server, addressed to the IP address specified by the URL. The resident program receives the request, performs the necessary operations, and sends back the requested file. Along the way, the data sent and received on the network is divided into one or more data packets. Each packet contains the data to be transmitted and control information that tells the network how to handle the packet. TCP/IP determines the format of each packet. If you were not told beforehand, you might never know that the information is broken into many small pieces for transmission and then put back together again.

Many HTTP communications are initiated by a user agent and include a request for resources on the origin server. The simplest case is probably done over a single connection between the user agent (UA) and the origin server (O).

Things get a little more complicated when one or more intermediaries are present in the request/response chain. There are three types of intermediaries: proxy, gateway, and tunnel. A proxy is a forwarding agent: it receives requests for a URI in its absolute form, rewrites all or part of the message, and forwards the reformatted request toward the server identified by the URI. A gateway is a receiving agent that acts as a layer above some other server and, if necessary, translates requests to the underlying server's protocol. A tunnel acts as a relay point between two connections without changing the messages. Tunnels are used when communication needs to pass through an intermediary (such as a firewall), even when the intermediary cannot understand the content of the messages.

Message format

HTTP messages consist of requests from the client to the server and responses from the server to the client. The request message format is as follows: [6] 

Request line - General information header - Request header - Entity header - Message body

The request line starts with the method field, followed by the URL field and the HTTP protocol version field, and ends with CRLF. The fields are separated by SP (space) characters. No CR or LF is allowed except in the final CRLF sequence. For specific information on general headers, request headers, and entity headers, refer to the relevant documents.
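The request layout described here can be sketched as two small helpers, one that assembles a request message and one that splits it back into its parts. The names `build_request` and `parse_request` are hypothetical, the host `example.com` used in the usage note is a placeholder, and the parser assumes a complete, well-formed message with ASCII headers.

```python
def build_request(method, target, headers, body=b"", version="HTTP/1.1"):
    """Assemble request line, header lines, blank line, and message body."""
    request_line = f"{method} {target} {version}\r\n"  # fields separated by SP
    header_lines = "".join(
        f"{name}: {value}\r\n" for name, value in headers.items()
    )
    return (request_line + header_lines + "\r\n").encode("ascii") + body


def parse_request(raw):
    """Split a complete request into (method, target, version, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # blank line ends the headers
    lines = head.decode("ascii").split("\r\n")
    method, target, version = lines[0].split(" ")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()  # names are case-insensitive
    return method, target, version, headers, body
```

For instance, `build_request("POST", "/submit", {"Host": "example.com", "Content-Length": "5"}, b"hello")` produces a message whose first line is `POST /submit HTTP/1.1`, and `parse_request` recovers the same method, target, version, headers, and body from it.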

The response message format is as follows:

Status line - General information header - Response header - Entity header - Message body

The status code is a three-digit number that indicates whether the request was understood and fulfilled. The reason phrase is a short textual description of the status code. The status code is intended for automated processing, while the reason phrase is intended for human users. The client is not required to examine or display the reason phrase. For specific information on general headers, response headers, and entity headers, refer to the relevant documents.
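A status line can be taken apart along the same lines. The sketch below is one possible approach, with hypothetical names `parse_status_line` and `status_class`; it relies only on the three-digit code and treats the reason phrase as optional free text.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def parse_status_line(line):
    """Split 'HTTP/1.1 404 Not Found' into (version, code, reason phrase)."""
    parts = line.rstrip("\r\n").split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"malformed status line: {line!r}")
    version, code = parts[0], parts[1]
    reason = parts[2] if len(parts) == 3 else ""  # the phrase may be absent
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"status code must be three digits: {code!r}")
    return version, int(code), reason


def status_class(code):
    """The first digit of the status code determines the class of response."""
    return STATUS_CLASSES.get(code // 100, "unknown")
```

Automated clients would branch on the numeric code or its class and show the reason phrase, if at all, only to a human reader.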

Status messages

1xx: Information

Message: Description
100 Continue: The server has received only part of the request; as long as the server has not rejected the request, the client should continue sending the rest of it.
101 Switching Protocols: The server will comply with the client's request and switch to another protocol.

2xx: Success

Message: Description
200 OK: The request succeeded (followed by the response documents for GET and POST requests).
201 Created: The request was fulfilled and a new resource was created.
202 Accepted: The request was accepted for processing, but processing has not been completed.
203 Non-authoritative Information: The document was returned normally, but some response headers may be incorrect because a copy of the document was used.
204 No Content: There is no new document. The browser should continue to display the original document. This status code is useful if the user refreshes the page regularly and the servlet can determine that the user's document is current enough.
205 Reset Content: There is no new document, but the browser should reset what it displays. Used to force the browser to clear form input.
206 Partial Content: The client sent a GET request with a Range header and the server fulfilled it.

3xx: Redirect

Message: Description
300 Multiple Choices: Multiple choices in the form of a list of links. The user can select a link to reach the destination. A maximum of five addresses is allowed.
301 Moved Permanently: The requested page has been moved to a new URL.
302 Found: The requested page has been temporarily moved to a new URL.
303 See Other: The requested page can be found under another URL.
304 Not Modified: The document has not been modified. The client has a cached document and makes a conditional request (usually by providing an If-Modified-Since header to indicate that it only wants documents newer than the specified date). The server tells the client that the cached document can still be used.
305 Use Proxy: The document requested by the client should be retrieved through the proxy server specified by the Location header.
306 Unused: This code was used in a previous version. It is no longer in use, but the code is still reserved.
307 Temporary Redirect: The requested page has been temporarily moved to a new URL.

4xx: Client error

Message: Description
400 Bad Request: The server failed to understand the request.
401 Unauthorized: The requested page requires a username and password.
401.1: Login failed.
401.2: Server configuration causes login failure.
401.3: Authorization is not obtained due to ACL restrictions on the resource.
401.4: Filter authorization failed.
401.5: ISAPI/CGI application authorization failed.
401.7: Access is denied by the URL authorization policy on the web server. This error code is specific to IIS 6.0.
402 Payment Required: This code is not yet in use.
403 Forbidden: Access to the requested page is prohibited.
403.1: Execute access is prohibited.
403.2: Read access is prohibited.
403.3: Write access is prohibited.
403.4: SSL required.
403.5: SSL 128 required.
403.6: IP address rejected.
403.7: Client certificate required.
403.8: Site access denied.
403.9: Too many users.
403.10: Invalid configuration.
403.11: Password change.
403.12: Access to the mapping table is denied.
403.13: The client certificate was revoked.
403.14: Directory listing denied.
403.15: Client access licenses exceeded.
403.16: The client certificate is untrusted or invalid.
403.17: The client certificate has expired or is not yet valid.
403.18: The requested URL cannot be executed in the current application pool. This error code is specific to IIS 6.0.
403.19: CGI cannot be executed for clients in this application pool. This error code is specific to IIS 6.0.
403.20: Passport login failed. This error code is specific to IIS 6.0.
404 Not Found: The server cannot find the requested page.
404.0: (None) No file or directory found.
404.1: The web site cannot be accessed on the requested port.
404.2: The web service extension lockdown policy blocks this request.
404.3: The MIME mapping policy blocks this request.
405 Method Not Allowed: The method specified in the request is not allowed.
406 Not Acceptable: The response generated by the server is not acceptable to the client.
407 Proxy Authentication Required: The user must first authenticate with the proxy server before the request can be processed.
408 Request Timeout: The request took longer than the server was prepared to wait.
409 Conflict: The request could not be completed because of a conflict.
410 Gone: The requested page is no longer available.
411 Length Required: "Content-Length" is not defined. The server will not accept the request without it.
412 Precondition Failed: The precondition given in the request was evaluated as false by the server.
413 Request Entity Too Large: The server will not accept the request because the request entity is too large.
414 Request-URI Too Long: The server will not accept the request because the URL is too long. This happens when a POST request is converted into a GET request with long query information.
415 Unsupported Media Type: The server will not accept the request because the media type is not supported.
416 Requested Range Not Satisfiable: The server cannot satisfy the Range header specified by the client in the request.
417 Expectation Failed: The server cannot meet the expectation given in the Expect request header.
423: The resource is locked.

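5xx: Server error

Message: Description
500 Internal Server Error: The request was not completed. The server encountered an unexpected condition.
500.12: The application is busy restarting on the web server.
500.13: The web server is too busy.
500.15: Direct requests for Global.asa are not allowed.
500.16: The UNC authorization credentials are incorrect. This error code is specific to IIS 6.0.
500.18: The URL authorization store cannot be opened. This error code is specific to IIS 6.0.
500.100: Internal ASP error.
501 Not Implemented: The request was not completed. The server does not support the requested functionality.
502 Bad Gateway: The request was not completed. The server received an invalid response from the upstream server.
502.1: CGI application timeout.
502.2: CGI application error.
503 Service Unavailable: The request was not completed. The server is temporarily overloaded or down.
504 Gateway Timeout: Gateway timeout.
505 HTTP Version Not Supported: The server does not support the HTTP version specified in the request.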
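Entries with a decimal part, such as 403.14 or 500.100, are IIS sub-status codes rather than HTTP status codes; on the wire the client still receives the three-digit code. A minimal sketch of separating the two, using the standard `http.HTTPStatus` enum for the reason phrase (the function name `describe_status` is hypothetical):

```python
from http import HTTPStatus


def describe_status(code_text):
    """Split e.g. '403.14' into (status, sub-status, standard reason phrase)."""
    main, _, sub = code_text.partition(".")
    status = int(main)
    substatus = int(sub) if sub else None
    try:
        phrase = HTTPStatus(status).phrase
    except ValueError:
        phrase = ""  # code not known to this Python version's HTTPStatus
    return status, substatus, phrase
```

The phrases returned by `HTTPStatus` follow current registrations, so some may differ slightly from the older names in the list above.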

Version number

HTTP request and response messages include HTTP version numbers. There has been some confusion about the correct use and interpretation of these version numbers and about interoperability between HTTP implementations of different protocol versions. The use and interpretation of HTTP version numbers is described in RFC 2145.
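One point RFC 2145 makes is that the major and minor numbers are separate integers, so HTTP/2.13 is a higher version than HTTP/2.4. A small sketch of comparing versions on that basis (the function name `parse_http_version` is hypothetical, and it expects the `HTTP/<major>.<minor>` form):

```python
def parse_http_version(text):
    """Turn 'HTTP/1.1' into the tuple (1, 1) so versions compare correctly."""
    prefix, _, numbers = text.partition("/")
    if prefix != "HTTP":
        raise ValueError(f"not an HTTP version string: {text!r}")
    major, _, minor = numbers.partition(".")
    # Each part is its own integer; comparing the text or a float would be wrong.
    return int(major), int(minor)
```

Python compares tuples element by element, so `parse_http_version("HTTP/2.13") > parse_http_version("HTTP/2.4")` holds, whereas a plain string comparison of the two would give the opposite answer.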


Origin blog.csdn.net/weixin_64948861/article/details/129271704