Learning objectives
HTTP protocol overview
HTTP requests and responses
HTTP request methods
HTTP responses
HTTP messages: request messages
HTTP messages: response messages
HTTP messages: general structure
URL
Most of us are familiar with the URL (Uniform Resource Locator). A URL is the address of a web page, entered when accessing the page with a web browser or similar client. For example, http://hackr.jp/ in the figure below is a URL.
A URL begins with a scheme name such as http: or https:, which specifies the protocol used to obtain the resource. Scheme names are case-insensitive and end with a colon (:).
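The point about scheme names can be checked with a small sketch. It assumes Python's standard `urllib.parse` module; the helper name `describe_url` is made up for illustration and is not part of the original notes.

```python
from urllib.parse import urlsplit


def describe_url(url: str) -> dict:
    """Split a URL into the parts discussed above (hypothetical helper)."""
    parts = urlsplit(url)
    return {
        # urlsplit normalizes the scheme to lower case, which reflects
        # the rule that scheme names are case-insensitive.
        "scheme": parts.scheme,
        "host": parts.hostname,
        "path": parts.path,
    }
```

Calling `describe_url("HTTP://hackr.jp/")` and `describe_url("http://hackr.jp/")` should give the same result, because the scheme is compared without regard to case.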
1. Overview of the HTTP protocol
1.1 HTTP is used for communication between client and server
When two computers communicate using HTTP, one end of the communication line must be the client and the other end must be the server.
A machine may act as a client in one exchange and as a server in another, but within a single communication the roles are fixed.
1.2 Communication is achieved through the exchange of requests and responses
1.2.1 Request method
HTTP stipulates that a request is always sent from the client, and the server then returns a response to that request.
Example:
Request:
GET /index.htm HTTP/1.1
Analysis:
- GET indicates the type of request made to the server and is called the method.
- /index.htm indicates the resource being requested and is called the request URI.
- HTTP/1.1 is the HTTP version number, which tells the server which version of the HTTP protocol the client is using.
Summary: the client requested the /index.htm page on the HTTP server.
Composition of the request message |
---|
Request method |
Request URI |
Protocol version |
Optional request header fields |
Entity body |
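The components in the table above can be pulled out of a raw request message with a few lines of string handling. This is a minimal sketch, not a complete HTTP parser: it assumes lines end in CRLF and that each header field appears only once, and the function name `parse_request` is invented for this example.

```python
def parse_request(raw: str) -> dict:
    """Split a raw HTTP request into the parts listed in the table."""
    # A blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Request line: method, request URI, protocol version.
    method, uri, version = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "method": method,
        "uri": uri,
        "version": version,
        "headers": headers,
        "body": body,
    }
```

For the request `GET /index.htm HTTP/1.1` with a `Host: hackr.jp` header, the result has method `GET`, URI `/index.htm`, version `HTTP/1.1`, and an empty body.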
1.2.2 Response
Analysis:
- HTTP/1.1 indicates the HTTP version used by the server.
- 200 OK is the status code and reason phrase, which indicate the result of processing the request.
- Date: Tue, 10 Jul 2012 06:50:15 GMT shows the date and time at which the response was created; it is one of the header fields.
- The header fields are followed by a blank line, and then by the entity body of the resource.
Summary: the request succeeded and the returned page is of type text/html.
Composition of the response message |
---|
Protocol version |
Status code (numeric code indicating whether the request succeeded or failed) |
Reason phrase that explains the status code |
Optional response header fields |
Entity body |
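A response message can be taken apart the same way as a request; the only structural difference is the first line, which carries the version, status code, and reason phrase. The sketch below makes the same simplifying assumptions as any quick parser (CRLF line endings, no repeated header fields), and `parse_response` is a hypothetical name.

```python
def parse_response(raw: str) -> dict:
    """Split a raw HTTP response into the parts listed in the table."""
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    # Status line: protocol version, status code, reason phrase.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        # Split on the first colon only, so values such as dates
        # (which contain colons themselves) stay intact.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return {
        "version": version,
        "status_code": int(code),
        "reason": reason,
        "headers": headers,
        "body": body,
    }
```

The reason phrase may contain spaces (for example `Not Modified`), which is why the status line is split at most twice.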
1.3 HTTP is a protocol that does not save state
HTTP is a stateless protocol: the protocol itself does not save the communication state between requests and responses. In other words, it performs no persistence of any kind.
Whenever a new request is sent, HTTP simply generates a corresponding new response.
The protocol itself retains no information about any previous request or response message. HTTP was deliberately kept this simple so that it can process large numbers of transactions quickly and remain scalable.
1.4 Locating resources with the request URI
HTTP uses URIs to locate resources, so a resource anywhere on the Internet can be accessed.
1.4.1 How to specify the request URI
Use the full URI as the request URI:
GET http://hackr.jp/index.htm HTTP/1.1
Or write the domain name or IP address in the Host header field:
GET /index.htm HTTP/1.1
Host: hackr.jp
If the request is addressed to the server itself rather than to a specific resource, the request URI can be replaced with an asterisk (*).
The following example queries the HTTP methods supported by the HTTP server:
OPTIONS * HTTP/1.1
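The two ways of writing the request URI can be generated from the same URL. The sketch below is illustrative only: `request_lines` is a made-up helper, and it emits a Host header in both forms because HTTP/1.1 expects one on every request.

```python
from urllib.parse import urlsplit


def request_lines(method: str, url: str, absolute_form: bool = False) -> list:
    """Return the request line and Host header for a URL (hypothetical helper)."""
    parts = urlsplit(url)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # Either the full URI or only the path goes on the request line.
    target = url if absolute_form else path
    return [f"{method} {target} HTTP/1.1", f"Host: {parts.netloc}"]
```

With `absolute_form=True` the request line carries `http://hackr.jp/index.htm`; otherwise it carries `/index.htm` and the host name appears only in the Host header.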
1.5 HTTP methods
The following methods are available in HTTP/1.1.
GET: retrieve a resource
Description: The GET method requests access to the resource identified by the request URI. The server parses the specified resource and returns the response content: if the resource is text, it is returned as is; if it is a program, the output of executing it is returned.
Example:
Request:
GET /index.html HTTP/1.1
Host: www.hackr.jp
Response: returns the index.html page resource.
Request:
GET /index.html HTTP/1.1
Host: www.hackr.jp
If-Modified-Since: Thu, 12 Jul 2012 07:30:00 GMT
Response: returns the index.html page resource only if it was updated after 07:30 on July 12, 2012. If the content has not been updated, the server responds with status code 304 Not Modified.
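The server-side decision behind the If-Modified-Since example can be sketched as a single comparison. This is a simplified illustration using the standard `email.utils` date parser; `conditional_status` and its `last_modified` argument are hypothetical names, and real servers handle more cases (malformed dates, other validators).

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def conditional_status(last_modified: datetime,
                       if_modified_since: Optional[str]) -> int:
    """Return 200 if the resource should be sent, 304 if it is unchanged."""
    if if_modified_since is None:
        # Plain GET: always return the resource.
        return 200
    since = parsedate_to_datetime(if_modified_since)
    # Send the resource only if it changed after the given time.
    return 200 if last_modified > since else 304
```

Here `last_modified` must be a timezone-aware `datetime`, because the parsed header value carries the GMT offset.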
POST: transfer an entity body
Description:
The POST method is used to transmit an entity body.
Example:
Request:
POST /submit.cgi HTTP/1.1
Host: www.hackr.jp
Content-Length: 1560
(1560 bytes of data)
Response: returns the result of submit.cgi processing the received data.
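The Content-Length value in the POST example is simply the size of the entity body in bytes. A small sketch of assembling such a message follows; `build_post` is an invented helper, and a real client would normally also send a Content-Type header.

```python
def build_post(path: str, host: str, body: bytes) -> bytes:
    """Assemble a minimal POST request whose Content-Length matches the body."""
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"  # size of the entity body in bytes
        "\r\n"                              # blank line ends the header section
    )
    return head.encode("ascii") + body
```

Passing a 1560-byte body produces the `Content-Length: 1560` header shown in the example above.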
PUT: transfer a file
Description:
The PUT method is used to transfer files. As with a file upload over FTP, the file content is included in the body of the request message and then saved to the location specified by the request URI.
Because the HTTP/1.1 PUT method has no authentication mechanism of its own, it must be combined with a separate authentication mechanism.
Example:
Request:
PUT /example.html HTTP/1.1
Host: www.hackr.jp
Content-Type: text/html
Content-Length: 1560
(1560 bytes of data)
Response: status code 204 No Content (for example, the HTML file now exists on the server).
HEAD: get the message headers
Description:
The HEAD method is the same as the GET method, except that the message body is not returned. It is used to confirm that a URI is valid and to check the date and time at which the resource was last updated.
Example:
Request:
HEAD /index.html HTTP/1.1
Host: www.hackr.jp
Response: returns the response headers for index.html.
DELETE: delete a file
Description:
The DELETE method is used to delete files and is the opposite of PUT. It deletes the resource specified by the request URI.
Like PUT, the HTTP/1.1 DELETE method has no authentication mechanism of its own.
Example:
Request:
DELETE /example.html HTTP/1.1
Host: www.hackr.jp
Response: status code 204 No Content (for example, the HTML file has been deleted from the server).
OPTIONS: ask for supported methods
Description:
The OPTIONS method is used to query the methods supported for the resource specified by the request URI.
Example:
Request:
OPTIONS * HTTP/1.1
Host: www.hackr.jp
Response:
HTTP/1.1 200 OK
Allow: GET, POST, HEAD, OPTIONS
(the methods supported by the server)
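A client that receives an Allow header like the one above typically splits it into individual method names before deciding what to send next. The following is a small sketch; both function names are invented for illustration.

```python
def parse_allow(header_value: str) -> list:
    """Split an Allow header value into a list of method names."""
    return [m.strip() for m in header_value.split(",") if m.strip()]


def supports(header_value: str, method: str) -> bool:
    """Check whether a method appears in the Allow header value."""
    # Method names are case-sensitive, so no case folding is applied.
    return method in parse_allow(header_value)
```

For the header value in the example, `supports(..., "PUT")` is false, which tells the client that PUT is not accepted.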
TRACE: trace the path
Description:
The TRACE method makes the web server loop the request it received back to the client.
When sending the request, the client sets a value in the Max-Forwards header field, and the value is decremented by 1 each time the request passes through a server. When the value reaches 0, forwarding stops, and the server that last received the request returns a response with status code 200 OK. With TRACE, the client can find out how the request it sent was modified or tampered with along the way. This is useful because a request to the origin server may be relayed through proxies, and TRACE lets the client confirm what happened during the connection. Note that TRACE is prone to cross-site tracing (XST) attacks.
Example:
Request:
TRACE / HTTP/1.1
Host: hackr.jp
Max-Forwards: 2
Response:
HTTP/1.1 200 OK
Content-Type: message/http
Content-Length: 1024

TRACE / HTTP/1.1
Host: hackr.jp
Max-Forwards: 2
(the response body contains the request that was received)
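The Max-Forwards countdown described above can be simulated without any networking. In this sketch the chain of server names is a made-up input, with any proxies listed first and the origin server last; `trace_responder` is a hypothetical helper.

```python
def trace_responder(chain: list, max_forwards: int) -> tuple:
    """Return (name of the server that answers, Max-Forwards value it saw)."""
    if not chain:
        raise ValueError("chain must contain at least one server")
    for index, node in enumerate(chain):
        is_origin = index == len(chain) - 1
        # A server answers itself when the counter has reached 0,
        # or when there is nowhere further to forward the request.
        if max_forwards == 0 or is_origin:
            return node, max_forwards
        # Otherwise it decrements the counter and forwards the request.
        max_forwards -= 1
    raise AssertionError("unreachable")
```

With two proxies in front of the origin server, `Max-Forwards: 2` lets the request reach the origin, while `Max-Forwards: 1` makes the second proxy answer instead.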
CONNECT: request a tunnel through a proxy
Description:
The CONNECT method asks a proxy server to establish a tunnel so that TCP communication can be carried through it using a tunneling protocol. It is mainly used with the SSL and TLS protocols, which encrypt the communication content before it is sent through the network tunnel.
The format of the CONNECT method is as follows:
CONNECT proxy-server-name:port-number HTTP-version
Example:
Request:
CONNECT proxy.hackr.jp:8080 HTTP/1.1
Host: proxy.hackr.jp
Response:
HTTP/1.1 200 OK
(the connection then enters the network tunnel)
1.6 Using methods to issue commands
When a request message is sent to the resource specified by the request URI, it carries a command called a method.
Method | Description | Supported HTTP versions |
---|---|---|
GET | Retrieve a resource | 1.0, 1.1 |
POST | Transfer an entity body | 1.0, 1.1 |
PUT | Transfer a file | 1.0, 1.1 |
HEAD | Get the message headers | 1.0, 1.1 |
DELETE | Delete a file | 1.0, 1.1 |
OPTIONS | Ask for supported methods | 1.1 |
TRACE | Trace the path | 1.1 |
CONNECT | Request a tunnel through a proxy | 1.1 |
LINK | Establish a link with a resource | 1.0 |
UNLINK | Remove a link | 1.0 |
LINK and UNLINK have been deprecated in HTTP/1.1.
1.7 Persistent connections save traffic
In the initial version of HTTP, the TCP connection was closed after every single HTTP exchange.
Back then, the transmitted content was mostly small pieces of text, so this caused no problems. As HTTP spread, web pages came to contain large numbers of images and even video. Each of those resources needs its own request, so every page load incurred the overhead of repeatedly establishing and closing TCP connections.
1.7.1 Persistent connection
To solve this TCP connection problem, HTTP/1.1 and some implementations of HTTP/1.0 introduced persistent connections (HTTP Persistent Connections, also known as HTTP keep-alive or HTTP connection reuse).
In HTTP/1.1, all connections are persistent by default, but this is not standardized in HTTP/1.0. Both the client and the server must support persistent connections for them to be used.
Features:
As long as neither end explicitly closes the connection, the TCP connection is kept open. Closing it requires an explicit request from one side.
Advantages:
- Reduces the overhead of repeatedly establishing connections and lowers the load on the server.
- Reduces the time spent on that overhead, which improves the response speed of web pages.
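The default behaviour of each protocol version can be expressed as a small decision function. This sketch assumes the common convention that HTTP/1.0 peers opt in with `Connection: keep-alive` and that any peer can opt out with `Connection: close`; those header values are not spelled out in the notes above, and `is_persistent` is a hypothetical name.

```python
def is_persistent(version: str, headers: dict) -> bool:
    """Decide whether the TCP connection stays open after this message."""
    raw = headers.get("Connection", "")
    tokens = {t.strip().lower() for t in raw.split(",") if t.strip()}
    if "close" in tokens:
        # Either side may explicitly ask to close the connection.
        return False
    if version == "HTTP/1.1":
        # Persistent by default in HTTP/1.1.
        return True
    # Assumed convention for HTTP/1.0: persistent only when requested.
    return "keep-alive" in tokens
```

The function only inspects a single message; in practice both the client and the server apply this kind of check before reusing a connection.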
1.7.2 Pipelining
Persistent connections make pipelining possible.
With pipelining, the next request can be sent without waiting for the response to the previous one, so multiple requests can be sent back to back instead of waiting for each response in turn.
Pipelining is faster than using persistent connections alone, and the more requests there are, the more noticeable the time difference becomes.
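On the wire, pipelining just means that several requests are written to the same connection one after another before any response is read. A minimal sketch of building such a byte stream follows; `pipelined_requests` is an invented helper, and it leaves out the actual socket handling.

```python
def pipelined_requests(host: str, paths: list) -> bytes:
    """Concatenate several GET requests so they can be sent in one go."""
    return b"".join(
        f"GET {path} HTTP/1.1\r\nHost: {host}\r\n\r\n".encode("ascii")
        for path in paths
    )
```

The resulting bytes would be written to one persistent connection, and the responses would then be read back in the same order as the requests.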
1.8 State management using cookies
HTTP is a stateless protocol; because it does not save earlier requests and responses, it keeps resource consumption low. However, a web page that requires authentication would then have to re-authenticate the user on every request, so cookie technology was introduced.
Cookies manage client state by carrying cookie information in request and response messages.
The server sends a header field called Set-Cookie in the response message, which tells the client to save the cookie. The next time the client sends a request to that server, it automatically includes the cookie value.
When the server finds the cookie value, it identifies which client sent the request and compares the value against its own records to recover the earlier state.
Example:
- Request message (no cookie information yet)
- Response message (the server generates cookie information)
- Request message (the saved cookie information is sent automatically)
The above is the flow of a request that uses cookies.
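The three messages in this flow can be simulated in memory. The sketch below uses Python's standard `http.cookies` module; the cookie name `sid`, the `SESSIONS` table, and the `Client` class are all assumptions made for illustration, not part of any real server.

```python
import itertools
from http.cookies import SimpleCookie

SESSIONS = {}                  # hypothetical server-side record: sid -> visit count
_next_id = itertools.count(1)  # source of new session ids


def server_handle(request_headers: dict) -> tuple:
    """Handle one request and return (response_headers, body)."""
    received = SimpleCookie()
    received.load(request_headers.get("Cookie", ""))
    response_headers = {}
    if "sid" in received and received["sid"].value in SESSIONS:
        # Known client: recover its state from the server's records.
        sid = received["sid"].value
    else:
        # New client: generate cookie information and ask the client to keep it.
        sid = f"s{next(_next_id)}"
        SESSIONS[sid] = 0
        issued = SimpleCookie()
        issued["sid"] = sid
        response_headers["Set-Cookie"] = issued["sid"].OutputString()
    SESSIONS[sid] += 1
    return response_headers, f"visit {SESSIONS[sid]}"


class Client:
    """Minimal client that stores cookies and sends them back automatically."""

    def __init__(self):
        self.cookies = SimpleCookie()

    def request(self) -> tuple:
        headers = {}
        if self.cookies:
            headers["Cookie"] = "; ".join(
                f"{name}={morsel.value}" for name, morsel in self.cookies.items()
            )
        response_headers, body = server_handle(headers)
        if "Set-Cookie" in response_headers:
            self.cookies.load(response_headers["Set-Cookie"])
        return headers, response_headers, body
```

Calling `request()` twice on the same `Client` reproduces the flow: the first request has no Cookie header and the response carries Set-Cookie, while the second request sends the saved cookie and the server recognises the client.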