Application layer protocol: HTTP

I. Overview

1. The Hypertext Transfer Protocol (HTTP) is the set of rules that governs every exchange between a browser and a server;

2. It runs on top of TCP and is stateless (the server keeps no memory of earlier transactions);

3. Main process

       To use HTTP, the client must first establish a TCP connection with the server, which requires a three-way handshake. The third segment of the handshake can already carry the client's HTTP request message, and the server replies directly with the response message. Once the transfer is complete, the TCP connection is released.
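The steps above (connect, send the request, read the response, release the connection) can be sketched with a raw socket. This is only a minimal illustration: `HelloHandler`, `fetch`, and `demo` are hypothetical names, and the tiny local server exists just so the example runs without network access.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: answers every GET with a short body."""

    def do_GET(self):
        body = b"hello"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def fetch(host, port, path="/"):
    # 1. Establish the TCP connection (the three-way handshake happens here).
    with socket.create_connection((host, port), timeout=5) as sock:
        # 2. Send the HTTP request message.
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        sock.sendall(request.encode("ascii"))
        # 3. Read the response until the server closes its side.
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    # 4. Leaving the "with" block releases the TCP connection.
    return b"".join(chunks)


def demo():
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: pick a free port
    server.timeout = 5
    thread = threading.Thread(target=server.handle_request)  # serve one request
    thread.start()
    try:
        return fetch("127.0.0.1", server.server_address[1])
    finally:
        thread.join()
        server.server_close()
```

Calling `demo()` returns the raw bytes of the response: a status line, the response headers, a blank line, and the body.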

 

II. HTTP/1.0 and HTTP/1.1

1. The main drawback of HTTP/1.0

       Every requested document costs two RTTs (round-trip times): one to set up the TCP connection and one to request and receive the document. In addition, the client and the server must set up a new TCP connection each time (allocating buffers and variables), which increases the load on the server. (Fortunately, a browser can open 5 to 10 parallel TCP connections, which shortens the response time.)

2. HTTP/1.1

       Uses persistent connections. After sending a response, the web server keeps the connection open for a period of time so that subsequent requests from the same client can reuse it. HTTP/1.1 persistent connections work in two modes: non-pipelined and pipelined.

<1> Non-pipelined mode: the client must receive the previous response before it can send the next request. Each object therefore costs one RTT. This is better than the two RTTs of a non-persistent connection, but after the server has sent an object the TCP connection sits idle, which wastes server resources;

<2> Pipelined mode: the client can send new requests back to back before receiving the responses to earlier ones. Fetching all the objects then costs only about one RTT. This reduces the idle time of the TCP connection and makes document downloads more efficient;
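To compare the three cases, here is a rough round-trip count. It is an idealized model, not a measurement: it assumes one RTT for the TCP handshake, ignores transmission time and parallel connections, and `total_rtts` is a hypothetical helper name.

```python
def total_rtts(num_objects, mode):
    """Idealized round-trip count for fetching num_objects (>= 1) from one server."""
    if mode == "non-persistent":   # HTTP/1.0: handshake + request for every object
        return 2 * num_objects
    if mode == "non-pipelined":    # HTTP/1.1: one handshake, then one RTT per object
        return 1 + num_objects
    if mode == "pipelined":        # HTTP/1.1: one handshake, all requests sent back to back
        return 1 + 1
    raise ValueError(f"unknown mode: {mode}")


for mode in ("non-persistent", "non-pipelined", "pipelined"):
    print(mode, total_rtts(10, mode))
```

For a page with 10 objects, this model gives 20, 11, and 2 RTTs respectively.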

 

III. HTTP messages

       This section describes the HTTP message format. Both request and response messages consist of three parts: a start line (request line or status line), headers (request headers or response headers), and an entity body (request entity or response entity).

1. HTTP request message

<1> Request line: consists of three fields, the request method, the request URL, and the HTTP protocol version, separated by spaces;

<2> Request headers: key/value pairs, one per line, with the key and value separated by a colon;

<3> Blank line: the empty line after the last request header tells the server that no more request headers follow;

<4> Request data: generally used with the POST method, which is typically used when the user needs to fill in a form;
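The four parts can be assembled by hand to see how they fit together. This is a minimal sketch: `build_request` is a hypothetical helper, and the host, path, and form fields are made-up example values.

```python
def build_request(method, url, headers, body=b"", version="HTTP/1.1"):
    """Assemble a raw HTTP request message from its four parts."""
    lines = [f"{method} {url} {version}"]                           # request line
    lines += [f"{key}: {value}" for key, value in headers.items()]  # request headers
    head = "\r\n".join(lines) + "\r\n\r\n"                          # blank line ends the headers
    return head.encode("ascii") + body                              # optional request data


form = b"name=alice&age=30"  # example form data, as a POST would send it
raw = build_request(
    "POST",
    "/submit",
    {
        "Host": "example.com",
        "Content-Type": "application/x-www-form-urlencoded",
        "Content-Length": str(len(form)),
    },
    form,
)
print(raw.decode("ascii"))
```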

2. HTTP response message

<1> Status line: consists of the HTTP protocol version, the status code, and a short reason phrase, separated by spaces;

<2> Response headers: key/value pairs, one per line, with the key and value separated by a colon;

<3> Blank line: the empty line after the last response header tells the client that no more response headers follow;

<4> Response data: the data returned by the server;
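Going the other way, a raw response can be split back into these parts. This is a simplified sketch (`parse_response` is a hypothetical helper); it assumes the whole message is already in memory and ignores details such as repeated headers and chunked bodies.

```python
def parse_response(raw):
    """Split a raw HTTP response into status line fields, headers, and body."""
    head, _, body = raw.partition(b"\r\n\r\n")      # the blank line separates head and body
    lines = head.decode("iso-8859-1").split("\r\n")
    parts = lines[0].split(" ", 2)                  # status line
    version, code = parts[0], int(parts[1])
    reason = parts[2] if len(parts) > 2 else ""
    headers = {}
    for line in lines[1:]:                          # response headers, one per line
        key, _, value = line.partition(":")
        headers[key.strip().lower()] = value.strip()
    return version, code, reason, headers, body


sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
print(parse_response(sample))
```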

 

IV. HTTP status codes

HTTP defines five classes of status codes, from 1XX to 5XX.

1. 1XX: informational

100: Continue. The client should continue sending the request;

2. 2XX: success

200: OK. The request succeeded;

202: Accepted. The request has been accepted but not yet processed;

3. 3XX: Redirects

301: Permanent redirect. Future requests should use the new URL;

302: Temporary redirect. The resource has only moved temporarily, so future requests should keep using the original URL;

4. 4XX: client error

400: Bad request. The request has a syntax error;

404: Not found. The requested resource could not be found;

5. 5XX: server error

500: Internal server error;

502: Bad gateway. A server acting as a gateway or proxy received an invalid response from the upstream server while trying to fulfil the request;
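Because the class is simply the first digit of the code, it can be looked up mechanically. The sketch below uses Python's standard `http.HTTPStatus` enum for the reason phrases; `CLASSES` and `describe` are hypothetical names chosen for illustration.

```python
from http import HTTPStatus

# The five classes, keyed by the first digit of the status code.
CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe(code):
    """Return (class name, standard reason phrase) for a status code."""
    return CLASSES[code // 100], HTTPStatus(code).phrase


for code in (100, 200, 202, 301, 302, 400, 404, 500, 502):
    print(code, *describe(code))
```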

 

V. HTTP caching

HTTP defines several caching mechanisms. How do they work?

1. Pragma

<1> In the HTTP/1.0 era, client-side caching was controlled by two fields, Pragma and Expires. Both have long been superseded, but many sites still send them to stay backward compatible with HTTP/1.0;

<2> When the Pragma field in a response has the value "no-cache", the client is told not to cache the resource, so it must send a request to the server every time;

2. Expires

<1> Pragma can only disable caching. To enable caching, something has to specify how long a resource may be cached, and in HTTP/1.0 that is the job of Expires;

<2> The value of Expires is a GMT (Greenwich Mean Time) timestamp such as "Wed, 22 Mar 2017 11:12:01 GMT", which tells the browser when the cached resource expires. If a request is made before that time, the browser does not contact the server and serves the resource directly from its cache (shown as "200 OK (from cache)");

<3> When both fields are present, Pragma takes priority: if Pragma disables caching, the browser still sends a new request even when the time given by Expires has not yet passed;
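The priority rule between Pragma and Expires can be sketched as a small check. This only illustrates the logic described above and is not how any particular browser implements it; `can_use_cache` is a hypothetical helper, and the current time is passed in explicitly so the result does not depend on the machine's clock.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def can_use_cache(headers, now):
    """HTTP/1.0-style check: may a cached copy be reused without asking the server?"""
    if headers.get("Pragma", "").lower() == "no-cache":
        return False                       # Pragma wins even if Expires is in the future
    expires = headers.get("Expires")
    if expires is None:
        return False                       # no lifetime given: ask the server
    return now < parsedate_to_datetime(expires)


now = datetime(2017, 3, 22, 11, 0, 0, tzinfo=timezone.utc)
cached = {"Expires": "Wed, 22 Mar 2017 11:12:01 GMT"}
print(can_use_cache(cached, now))                              # not yet expired
print(can_use_cache({**cached, "Pragma": "no-cache"}, now))    # Pragma disables caching
```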

Supplementary: shortcomings of Expires

Expires is no longer recommended because it has two serious flaws:

<1> The time in Expires is set according to the server's clock, but the browser decides whether it has passed by using the client's system time. If the two clocks differ, or the user changes the computer's system time, the cache lifetime becomes meaningless;

<2> When a cached resource expires on the client but has not actually changed on the server, the client asks for it again and the server sends the whole resource back, wasting bandwidth and time. That is clearly unreasonable, so a mechanism is needed to check whether the cached copy can still be used as it is;

3. Cache-Control

<1> In HTTP/1.1, the Cache-Control header (usually sent in the response) defines the cache lifetime;

Cache-Control:max-age=3600,must-revalidate

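The resource stays valid for one hour from the time it was obtained from the server. Requests for it within that hour are served from the cache without a new request to the server; once it has expired, it must be revalidated with the server before it is reused.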

Cache-Control:no-cache,no-store

Tells the browser not to use a cached copy: every request must go to the server, and the content must not be saved to the cache or to temporary files.
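The two header values above can be parsed and evaluated with a few lines of code. This is a simplified sketch that covers only the directives mentioned here; `parse_cache_control` and `is_fresh` are hypothetical helpers, and `age_seconds` stands for the time since the response was obtained.

```python
def parse_cache_control(value):
    """Parse a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in value.split(","):
        name, _, arg = part.strip().partition("=")
        if name:
            directives[name.lower()] = arg.strip('"') if arg else True
    return directives


def is_fresh(directives, age_seconds):
    """True if a stored response may be reused without contacting the server."""
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    return isinstance(max_age, str) and max_age.isdigit() and age_seconds < int(max_age)


one_hour = parse_cache_control("max-age=3600,must-revalidate")
print(one_hour)
print(is_fresh(one_hour, 1800))   # half an hour old: still fresh
print(is_fresh(one_hour, 3600))   # one hour old: must go back to the server
```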

4. Last-Modified

<1> Addresses the second shortcoming of Expires described above: when the cached copy has expired on the client but the resource has not changed on the server, the server can answer the request correctly without re-sending the resource. The client and server verify whether the cached file is still up to date, which improves cache reuse;

<2> A header field used for cache validation (already defined in HTTP/1.0 and retained in HTTP/1.1);

<3> When the server delivers a resource to the client, it adds the time the resource was last changed to the response headers in the form "Last-Modified: <GMT time>". The client records this value and, the next time it requests the resource, sends it back in the request headers (in the If-Modified-Since field). If the value matches the last modification time of the resource on the server, the resource has not been modified and the server can reply 304 Not Modified without a body;

Supplementary: shortcomings of Last-Modified

       If a resource on the server is modified (its modification time changes) but its actual content stays the same, the Last-Modified times no longer match, so the server returns the entire entity to the client even though the client's cache already holds an identical copy.
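The validation step, and the shortcoming just described, can be sketched on the server side as follows. This is a hypothetical illustration (`respond_last_modified` is not a real library function); it assumes modification times are timezone-aware UTC values with whole seconds, since HTTP dates have one-second resolution.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_last_modified(resource_mtime, request_headers):
    """Hypothetical server logic: 304 if unchanged since the client's copy, else 200."""
    last_modified = format_datetime(resource_mtime, usegmt=True)
    since = request_headers.get("If-Modified-Since")
    if since is not None and resource_mtime <= parsedate_to_datetime(since):
        return 304, {"Last-Modified": last_modified}, b""      # client copy still valid
    return 200, {"Last-Modified": last_modified}, b"<full entity>"


mtime = datetime(2017, 3, 22, 11, 12, 1, tzinfo=timezone.utc)

# First request: no validator yet, so the full entity is returned.
status, headers, _ = respond_last_modified(mtime, {})

# Second request: the client echoes the stored value back.
revalidate = {"If-Modified-Since": headers["Last-Modified"]}
print(respond_last_modified(mtime, revalidate)[0])

# The shortcoming: the file is saved again with identical content, the
# modification time moves forward, and the full entity is sent anyway.
touched = mtime + timedelta(hours=1)
print(respond_last_modified(touched, revalidate)[0])
```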

5. ETag

<1> Solves the possible inaccuracy of Last-Modified described above;

<2> A header field new in HTTP/1.1;

<3> The server uses some algorithm to compute a unique identifier for the resource (for example an MD5 digest) and, when it sends the resource to the client, adds "ETag: <unique identifier>" to the response headers. The client saves the ETag value and sends it back in a request header (If-None-Match) with its next request. The server only needs to compare the ETag from the client with the ETag of the resource it currently holds to determine whether the resource has changed since the client fetched it;
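A content-based identifier avoids the Last-Modified problem because it changes only when the bytes change. The sketch below uses an MD5 digest as the text suggests; real servers are free to derive ETags differently, and `make_etag` and `respond_etag` are hypothetical names.

```python
import hashlib


def make_etag(content):
    """Derive an identifier from the resource bytes (here: a quoted MD5 hex digest)."""
    return '"' + hashlib.md5(content).hexdigest() + '"'


def respond_etag(content, request_headers):
    """Hypothetical server logic: 304 if the client's ETag matches, else 200."""
    etag = make_etag(content)
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""          # client copy is identical
    return 200, {"ETag": etag}, content


page = b"<html>v1</html>"
status, headers, _ = respond_etag(page, {})          # first request: full response
revalidate = {"If-None-Match": headers["ETag"]}
print(respond_etag(page, revalidate)[0])                  # same bytes: 304
print(respond_etag(b"<html>v2</html>", revalidate)[0])    # changed bytes: 200
```

Re-saving the file without changing its content leaves the digest, and therefore the ETag, unchanged, so the client still gets a 304.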

 

VI. HTTP and HTTPS

       HTTPS (Hypertext Transfer Protocol Secure) is used for secure data transmission. It adds the SSL protocol (today superseded by its successor, TLS) beneath HTTP: SSL relies on certificates to verify the server's identity and encrypts the communication between the browser and the server. The details of HTTPS are covered in the next section.
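From the client's point of view, HTTPS is the same HTTP exchange sent through an encrypted socket. The sketch below uses Python's standard `ssl` module; `make_client_context` and `https_get` are hypothetical helper names, and `https_get` needs network access, so it is defined here but not called.

```python
import socket
import ssl


def make_client_context():
    """TLS settings for a client that verifies the server's certificate."""
    # The default context loads the system's trusted CA certificates and
    # requires a valid certificate that matches the host name.
    return ssl.create_default_context()


def https_get(host, path="/", port=443):
    """Fetch a page over HTTPS (requires network access)."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=10) as tcp:
        # The TLS handshake checks the certificate, then encrypts the stream.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
            tls.sendall(request.encode("ascii"))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)


context = make_client_context()
print(context.verify_mode, context.check_hostname)
```

The request and response messages are unchanged; only the transport underneath them is different.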

 

VII. Summary

1. The Hypertext Transfer Protocol (HTTP) lays down the rules for communication between the browser and the server. It is based on TCP, is stateless, and requires a three-way handshake to establish a connection;

2. The main drawback of HTTP/1.0 is that each requested document costs two RTTs; HTTP/1.1 uses persistent connections;

3. HTTP/1.1 has two modes: pipelined and non-pipelined. Pipelining reduces the idle time of the TCP connection and makes document downloads more efficient;

4. An HTTP message has three parts: a start line (request line or status line), headers (request headers or response headers), and an entity body (request entity or response entity);

5. HTTP defines five classes of status codes, from 1XX to 5XX;

6. HTTP caching is recommended when building HTTP-based applications. Use Expires for compatibility with older browsers, Cache-Control for more precise control of the cache, and enable ETag and Last-Modified to reuse cached copies further and reduce traffic;

7. HTTPS adds the SSL protocol to HTTP to secure data transmission;


Origin blog.csdn.net/qq_15898739/article/details/102960294