HTTP protocol

HTTP (Hypertext Transfer Protocol) is an application-layer protocol built on top of the TCP transport protocol, and it is often described as object-oriented. Because it is simple and fast, it is well suited to distributed hypermedia information systems. It was proposed in 1990 and has been continuously improved and extended through years of use and development.

The main features of the HTTP protocol are as follows.

  1. Supports the client/server model.
  2. Simple: when a client requests a service from the server, it only needs to specify the service URL and carry the necessary request parameters or message body.
  3. Flexible: HTTP allows any type of data object to be transmitted; the type of the content is indicated by the Content-Type field in the HTTP message header.
  4. Stateless: the protocol keeps no memory of earlier transactions. If later processing needs earlier information, that information must be retransmitted, which can increase the amount of data sent per connection. On the other hand, when the server does not need earlier information, it responds faster and carries a lighter load.

1. URL of HTTP protocol

An HTTP URI (a URL is a special type of URI that contains enough information to find a resource) has the following format.

http://host[":"port][abs_path]

Here, http indicates that the network resource is located through the HTTP protocol; host is a valid Internet host domain name or IP address; port specifies a port number, and if it is omitted the default port 80 is used; abs_path specifies the URI of the requested resource. If abs_path is absent from the URL, it must be given as "/" when the URL is used as a request URI; browsers usually do this automatically.
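The defaulting rules for port and abs_path can be illustrated with a minimal Python sketch built on the standard library's urllib.parse. The function name describe_http_url and the example.com URLs are illustrative assumptions, not part of the original text.

```python
from urllib.parse import urlsplit


def describe_http_url(url):
    """Split an http:// URL into (host, port, abs_path), applying the defaults."""
    parts = urlsplit(url)
    host = parts.hostname
    # An omitted port means the default HTTP port, 80.
    port = parts.port if parts.port is not None else 80
    # An omitted abs_path is sent as "/" in the request line.
    abs_path = parts.path or "/"
    return host, port, abs_path


print(describe_http_url("http://www.example.com"))
print(describe_http_url("http://www.example.com:8080/index.html"))
```

The first call yields port 80 and path "/", which is the normalization a browser performs before sending the request.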

2. HTTP request message (HttpRequest)

An HTTP request consists of three parts, as follows.

  • HTTP request line
  • HTTP message headers
  • HTTP request body

The request line begins with a method token, followed by the Request-URI and the protocol version, with the three fields separated by spaces, in the format:

Method Request-URI HTTP-Version CRLF

Here, Method is the request method, Request-URI is a Uniform Resource Identifier, HTTP-Version is the HTTP protocol version of the request, and CRLF is a carriage return followed by a line feed (apart from the terminating CRLF, no separate CR or LF character is allowed).
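As a rough illustration of this format, the sketch below splits a request line into its three fields and rejects a stray CR or LF. The function name parse_request_line is a hypothetical helper, and the checks are deliberately simpler than what a real server performs.

```python
def parse_request_line(line):
    """Return (method, request_uri, http_version) from one request line."""
    if not line.endswith("\r\n"):
        raise ValueError("request line must end with CRLF")
    core = line[:-2]
    # Apart from the terminating CRLF, no bare CR or LF may appear.
    if "\r" in core or "\n" in core:
        raise ValueError("bare CR or LF inside the request line")
    parts = core.split(" ")
    if len(parts) != 3:
        raise ValueError("expected: Method SP Request-URI SP HTTP-Version")
    method, request_uri, http_version = parts
    return method, request_uri, http_version


print(parse_request_line("GET / HTTP/1.1\r\n"))
```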

There are several request methods, and the functions of each method are as follows.

  1. GET: requests the resource identified by the Request-URI
  2. POST: submits data to the resource identified by the Request-URI, for example to append new data to it
  3. HEAD: requests only the response message headers for the resource identified by the Request-URI
  4. PUT: requests that the server store a resource and use the Request-URI as its identifier
  5. DELETE: requests that the server delete the resource identified by the Request-URI
  6. TRACE: requests that the server send back the request it received, mainly for testing or diagnosis
  7. CONNECT: reserved for use with a proxy that can switch to acting as a tunnel
  8. OPTIONS: queries the capabilities of the server, or the options and requirements associated with a resource

The following GET request headers were captured on the server side and printed:

GET / HTTP/1.1
Host: www.baidu.com
Connection: keep-alive
Cache-Control: max-age=0
Upgrade-Insecure-Requests: 1
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
Accept-Encoding: gzip, deflate, br
Accept-Language: zh-CN,zh;q=0.9
Cookie: BAIDUID=18BAFC9B789FD50EB53E7012C7A149A5:FG=1; PSTM=1523451267; BD_UPN=12314553; ispeed_lsm=0; locale=zh; BD_HOME=1; sug=3; sugstore=0; ORIGIN=2; bdime=0

Request headers allow the client to pass additional information about the request, as well as information about the client itself, to the server. Commonly used request headers are listed below. The HTTP request message body is optional; the commonly used HTTP+XML approach carries XML data in the bodies of HTTP request and response messages.
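To make the three-part structure concrete, here is a minimal sketch that splits a raw request into its request line, headers, and body. The helper name parse_request is an assumption, and the sample message is a shortened version of the capture above. Repeated header names simply overwrite one another here, which a real parser would handle more carefully.

```python
def parse_request(raw):
    """Split raw request bytes into (request_line, headers, body)."""
    # A blank line (CRLF CRLF) separates the headers from the optional body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    request_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header field names are case-insensitive, so store them lowercased.
        headers[name.strip().lower()] = value.strip()
    return request_line, headers, body


raw = (
    b"GET / HTTP/1.1\r\n"
    b"Host: www.baidu.com\r\n"
    b"Connection: keep-alive\r\n"
    b"\r\n"
)
request_line, headers, body = parse_request(raw)
print(request_line)
print(headers)
```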

Common HTTP request headers

(1) Accept

Specifies which media types the client accepts. For example:

Accept: image/gif

indicates that the client wishes to receive resources in GIF image format.
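The captured request earlier lists several media types with q parameters (for example application/xml;q=0.9), which express relative preference. The sketch below shows one way a server might read such a value; the function name parse_accept is hypothetical, and treating a missing q as 1.0 is an assumption based on how the q parameter is normally interpreted, not something the text above states.

```python
def parse_accept(value):
    """Return (media_type, q) pairs from an Accept value, most preferred first."""
    entries = []
    for item in value.split(","):
        media_type, *params = [part.strip() for part in item.split(";")]
        q = 1.0  # assumed default weight when no q parameter is given
        for param in params:
            name, _, val = param.partition("=")
            if name.strip() == "q":
                q = float(val)
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their original order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


print(parse_accept("text/html,application/xml;q=0.9,*/*;q=0.8"))
```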

(2) Accept-Charset

Specifies the character sets accepted by the client. For example:

Accept-Charset: iso-8859-1, gb2312

If this field is not set in the request message, the default is that any character set is acceptable.

(3) Accept-Encoding

Specifies the content encodings acceptable to the client. For example:

Accept-Encoding: gzip, deflate

If this field is not set in the request message, the server assumes that all content encodings are acceptable to the client.

(4) Accept-Language

Specifies the natural languages accepted by the client. For example:

Accept-Language: zh-cn

If this header field is not set in the request message, the server assumes that all languages are acceptable to the client.

(5) Authorization

Mainly used to prove that the client has the right to access a resource. When a browser requests a page and receives a 401 (Unauthorized) response from the server, it can resend the request with an Authorization request header field so that the server can authenticate it.
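The text does not say which authentication scheme is used. As an illustration only, the sketch below assumes the Basic scheme, in which the credentials are sent as the Base64 encoding of "user:password". The function name and the sample credentials are hypothetical.

```python
import base64


def basic_authorization(user, password):
    """Build an Authorization header value, assuming the Basic scheme."""
    token = base64.b64encode(f"{user}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


print("Authorization:", basic_authorization("user", "pass"))
```

Base64 is an encoding, not encryption, so Basic credentials are only as private as the connection that carries them.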

(6) Host

Specifies the Internet host and port number of the requested resource, usually extracted from the HTTP URL. This header field is required when sending a request.

(7) User-Agent

Allows the client to tell the server its operating system, browser, and other properties.

(8) Content-Type

Indicates the MIME type of the document that follows. A servlet falls back to a default type, but the type usually needs to be specified explicitly, for example as text/html. Because Content-Type is set so often, HttpServletResponse provides a dedicated method, setContentType.

(9) Content-Length

The length of the request message body, in bytes.

(10) Connection

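The connection type, for example keep-alive to keep the connection open after the current request or close to close it.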
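Putting the request line, headers, and optional body together, a request carrying an XML body might be assembled roughly as follows. This is a sketch under stated assumptions: build_request, the /orders path, and the example.com host are all hypothetical. Note that Content-Length counts bytes, not characters.

```python
def build_request(method, request_uri, host, body=b"", content_type=None):
    """Assemble a raw HTTP/1.1 request as bytes."""
    lines = [f"{method} {request_uri} HTTP/1.1", f"Host: {host}"]
    if body:
        if content_type is not None:
            lines.append(f"Content-Type: {content_type}")
        # Content-Length is the size of the body in bytes.
        lines.append(f"Content-Length: {len(body)}")
    # Each header line ends with CRLF; an empty line ends the header section.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


xml_body = "<a>é</a>".encode("utf-8")  # 8 characters, 9 bytes in UTF-8
message = build_request("POST", "/orders", "www.example.com", xml_body, "application/xml")
print(message.decode("utf-8"))
```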

3. HTTP response message (HttpResponse)

After processing a request from an HTTP client, the HTTP server returns a response message to the client. An HTTP response also consists of three parts:

  • status line
  • message header
  • response body

The format of the status line is

HTTP-Version Status-Code Reason-Phrase CRLF

Here, HTTP-Version is the HTTP protocol version used by the server, Status-Code is the response status code returned by the server, and Reason-Phrase is a short textual description of the status code. The status code consists of three digits; the first digit defines the class of the response and has five possible values:

  1. 1xx: Informational. The request has been received and processing continues
  2. 2xx: Success. The request has been successfully received, understood, and accepted
  3. 3xx: Redirection. Further action is required to complete the request
  4. 4xx: Client error. The request has a syntax error or cannot be fulfilled
  5. 5xx: Server error. The server failed to process a valid request
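The status-line format and the five classes can be sketched as a small parser. The names STATUS_CLASSES and parse_status_line are illustrative assumptions; the class labels follow the list above.

```python
STATUS_CLASSES = {
    "1": "Informational",
    "2": "Success",
    "3": "Redirection",
    "4": "Client error",
    "5": "Server error",
}


def parse_status_line(line):
    """Return (http_version, status_code, reason_phrase, status_class)."""
    # Split on the first two spaces only: the reason phrase may contain spaces.
    http_version, status_code, reason_phrase = line.rstrip("\r\n").split(" ", 2)
    if len(status_code) != 3 or not status_code.isdigit():
        raise ValueError("status code must be exactly three digits")
    status_class = STATUS_CLASSES.get(status_code[0])
    if status_class is None:
        raise ValueError("first digit must be between 1 and 5")
    return http_version, int(status_code), reason_phrase, status_class


print(parse_status_line("HTTP/1.1 404 Not Found\r\n"))
```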

Common status codes and status descriptions are shown in Table 1.

Status code   Status description
200           OK: The client request was successful
400           Bad Request: The client request has a syntax error and cannot be understood by the server
401           Unauthorized: The request is not authorized; this status code must be used with the WWW-Authenticate header field
403           Forbidden: The server received the request but refused to serve it
404           Not Found: The requested resource does not exist
500           Internal Server Error: An unexpected error occurred on the server
503           Service Unavailable: The server is currently unable to process the client's request and may return to normal after a period of time
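For reference, Python's standard library ships its own table of status codes in http.HTTPStatus, which can be used to look up the standard reason phrase for each code. The short sketch below prints the phrases for the codes listed here.

```python
from http import HTTPStatus

# Look up the standard reason phrase for each status code in the table.
for code in (200, 400, 401, 403, 404, 500, 503):
    status = HTTPStatus(code)
    print(status.value, status.phrase)
```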

Response headers allow the server to pass additional information that cannot be placed in the status line, including information about the server and about further access to the resource identified by the Request-URI. Commonly used response headers are listed below.

(1) Location

Redirects the recipient to a new location. The Location response header field is often used when a domain name changes.

(2) Server

Contains information about the software the server uses to process the request. It is the counterpart of the User-Agent request header field.

(3) WWW-Authenticate

Must be included in a 401 (Unauthorized) response message. It tells the client how to authenticate; after receiving the 401 response, the client can resend the request with an Authorization header field so that the server can verify it.
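A 401 response carrying this header might look like the output of the sketch below. It assumes the Basic scheme and a made-up realm name; neither is specified in the text, and build_unauthorized_response is a hypothetical helper.

```python
def build_unauthorized_response(realm):
    """Assemble a raw 401 response, assuming a Basic authentication challenge."""
    body = b"Authentication required."
    lines = [
        "HTTP/1.1 401 Unauthorized",                  # status line
        f'WWW-Authenticate: Basic realm="{realm}"',   # challenge (assumed scheme)
        "Content-Type: text/plain",
        f"Content-Length: {len(body)}",
    ]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii") + body


response = build_unauthorized_response("example")
print(response.decode("ascii"))
```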

