HTTP cache

[Figure: HTTP cache flow chart]

HTTP protocol

Before introducing HTTP caching, let's briefly cover HTTP messages as background. An HTTP message is the data exchanged when a client (such as a browser) communicates with a web server: the client sends a request and the server returns a response.

An HTTP request consists of three parts: the request line, the message headers, and the request body.

An HTTP response also consists of three parts: the status line, the message headers, and the response body.
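As a rough illustration of the three-part structure, the sketch below splits a raw response into status line, headers, and body. The function name parse_http_response and the sample response are made up for this example, and the parsing is deliberately simplified (for instance, it does not handle repeated headers).

```python
def parse_http_response(raw: str):
    """Split a raw HTTP/1.1 response into status line, headers, and body."""
    # The headers and the body are separated by one empty line.
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    status_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return status_line, headers, body


# Hypothetical response used only for illustration.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html\r\n"
    "Cache-Control: max-age=3600\r\n"
    "\r\n"
    "<html>hello</html>"
)
status_line, headers, body = parse_http_response(raw_response)
```

Here the cache-related information ends up in headers["cache-control"], which is the part of the message the rest of this article is concerned with.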

Cache-related information is carried in the message headers. If you are not familiar with the HTTP protocol, read up on it first (there is plenty of material online) and come back to this article afterwards.

Cache-related fields in the HTTP response header

Request any web page to see them. Here we open the Baidu homepage with the browser's debugging tool open and look at the response headers.

Expires

The value of Expires is an expiration time (in GMT, Greenwich Mean Time) returned by the web server. If the browser's next request happens before that time, the browser takes the data directly from its cache and does not send the request again.
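The check a client performs on Expires can be sketched as follows. This is a minimal illustration, not how any particular browser implements it; the function name is_fresh_by_expires is hypothetical, and the date parsing uses email.utils.parsedate_to_datetime from the Python standard library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_header: str, now: datetime) -> bool:
    """Return True while the cached copy has not yet reached its Expires time."""
    expires_at = parsedate_to_datetime(expires_header)
    return now < expires_at


expires = "Mon, 07 Nov 2016 07:51:11 GMT"
before = datetime(2016, 11, 7, 7, 0, 0, tzinfo=timezone.utc)
after = datetime(2016, 11, 7, 8, 0, 0, tzinfo=timezone.utc)

fresh_before = is_fresh_by_expires(expires, before)  # cache can be used
fresh_after = is_fresh_by_expires(expires, after)    # a new request is needed
```

Note that now comes from the client's own clock, which matters for the comparison with Cache-Control later in the article.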

Cache-Control

Common values of Cache-Control are private, public, no-cache, max-age, and no-store.

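private: only the client (browser) may cache the response; shared caches such as proxy servers may not

public: both the client and proxy servers may cache the response

max-age=<seconds>: the maximum time, in seconds, for which the cached response is considered fresh; after that it is treated as expired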

no-cache: this one is easily misunderstood as meaning that the response will not be cached. In fact, a response with Cache-Control: no-cache is cached. However, every time the cached data is about to be used, the browser first sends a request to the server, and the server decides whether the cached copy is still valid.

no-store: nothing is cached at all (this is the directive that really disables caching)
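The directives above can be read programmatically along these lines. This is a simplified sketch: the helper names parse_cache_control, may_store, and must_revalidate_before_use are invented for the example, and real caches apply more rules than shown here.

```python
def parse_cache_control(header_value: str) -> dict:
    """Parse a Cache-Control value into {directive: value or True}."""
    directives = {}
    for part in header_value.split(","):
        name, sep, value = part.strip().partition("=")
        if not name:
            continue
        directives[name.lower()] = value.strip('"') if sep else True
    return directives


def may_store(directives: dict, shared_cache: bool) -> bool:
    """Decide whether a cache is allowed to keep a copy of the response."""
    if "no-store" in directives:
        return False  # nothing may be stored anywhere
    if shared_cache and "private" in directives:
        return False  # only the client's own cache may store it
    return True


def must_revalidate_before_use(directives: dict) -> bool:
    """no-cache still allows storing, but every reuse needs a server check."""
    return "no-cache" in directives
```

For example, parse_cache_control("private, max-age=600") can be stored by a browser cache but not by a proxy, while a no-cache response can be stored by both yet must be checked with the server before each reuse.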

Forced caching: the difference between Expires and Cache-Control

Expires and Cache-Control are called forced caching. While the cached copy has not expired, the browser does not send a request to the server and reads the data directly from the cache. You may wonder: if both have the same effect, why do both exist?

Expires dates back to HTTP/1.0. Browsers now use HTTP/1.1 by default, so Expires can largely be ignored. Its drawback is that the returned value is an absolute time taken from the server's clock, while expiry is checked against the client's local clock. That comparison is not rigorous: the two clocks may disagree, and the user can change the local time at will, so the cache may expire at any moment. For this reason it was superseded by Cache-Control: max-age=<seconds>, and in HTTP/1.1 Cache-Control takes priority over Expires. Because software has to remain backward compatible, Expires was never removed, and it still works for clients that use the HTTP/1.0 protocol.
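The priority rule can be sketched as a small function that works out when a cached response stops being fresh. The name expiry_time and the dictionary-based headers are assumptions made for this example; max-age is counted from the time the response was received, so it does not depend on the server's absolute clock.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expiry_time(headers: dict, response_time: datetime):
    """Return when the cached response expires, or None if nothing says so."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"(?:^|[\s,])max-age=(\d+)", cache_control, re.IGNORECASE)
    if match:
        # max-age wins over Expires when both are present.
        return response_time + timedelta(seconds=int(match.group(1)))
    if "Expires" in headers:
        return parsedate_to_datetime(headers["Expires"])
    return None


received = datetime(2016, 11, 7, 7, 0, 0, tzinfo=timezone.utc)
both = {
    "Cache-Control": "public, max-age=60",
    "Expires": "Mon, 07 Nov 2016 09:00:00 GMT",
}
only_expires = {"Expires": "Mon, 07 Nov 2016 09:00:00 GMT"}
```

With both headers present the result is one minute after received; with only Expires it is the absolute time from the header.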

Last-Modified

When responding to a request, the server tells the browser the time at which the resource was last modified.

ETag

When responding to a request, the server tells the client (browser) the identifier that the server has assigned to the requested resource. How the ETag is generated is up to the server: the HTTP protocol does not prescribe a rule, and different web servers may use different algorithms, for example an md5sum or sha1sum of the content when the file is small, or a value derived from file attributes such as modification time, size, and inode (not described in detail here). We can think of it as a unique identifier of the resource: whenever the file changes, the ETag value changes as well.
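Two of the strategies mentioned above might look like this. Both functions are illustrative only and their names are invented here; they are not the algorithm of any specific web server, and the hex "mtime-size" layout is just one plausible format.

```python
import hashlib


def etag_from_content(content: bytes) -> str:
    """ETag based on a hash of the content (practical for small files)."""
    return '"' + hashlib.sha1(content).hexdigest() + '"'


def etag_from_metadata(mtime_seconds: int, size_bytes: int) -> str:
    """ETag based on file attributes, avoiding a read of the whole file."""
    return f'"{mtime_seconds:x}-{size_bytes:x}"'
```

The content-based variant changes only when the bytes change, whereas the metadata-based variant is cheaper to compute but changes whenever the modification time or size changes.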

Comparison caching: Last-Modified/If-Modified-Since and ETag/If-None-Match

Last-Modified and ETag are called comparison caching. As the name suggests, comparison caching requires the server to compare and decide, and then tell the client (browser) whether its local cache can be used. When the comparison cache takes effect, the server returns HTTP status code 304 with header information only and no response body. From the 304 status code the client knows that the resource has not been modified and that it can use its local cache directly, which greatly reduces the response time of the request.

The general process of comparison caching is as follows. When the browser first requests a resource, the server obtains the resource's last modification time (Last-Modified) or generates a resource identifier (ETag) according to some algorithm, and returns the Last-Modified or ETag to the browser in the response headers. The browser stores the Last-Modified or ETag locally together with the resource content. The next time it requests the resource, it sends the stored value back in a request header such as If-Modified-Since: Mon, 07 Nov 2016 07:51:11 GMT or If-None-Match: "xxxxxxx". The server then determines the resource's current Last-Modified or ETag again. If it differs from the value sent by the client, the resource has changed, and the server returns status code 200 together with the resource content. If it is the same, nothing has changed, and the server returns status code 304 without the resource content.
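The server side of this exchange can be sketched as below. The function respond and the dictionary layout of resource are assumptions for this example; a real server also has to deal with lists of ETags, weak validators, and other details that are left out here.

```python
from email.utils import parsedate_to_datetime


def respond(request_headers: dict, resource: dict):
    """Return (status_code, response_headers, body) for a GET of `resource`."""
    validators = {
        "ETag": resource["etag"],
        "Last-Modified": resource["last_modified"],
    }
    if_none_match = request_headers.get("If-None-Match")
    if_modified_since = request_headers.get("If-Modified-Since")

    if if_none_match is not None:
        # Same identifier means the client's copy is still current.
        not_modified = if_none_match == resource["etag"]
    elif if_modified_since is not None:
        # Not modified if the resource is no newer than the client's copy.
        not_modified = parsedate_to_datetime(
            resource["last_modified"]
        ) <= parsedate_to_datetime(if_modified_since)
    else:
        not_modified = False

    if not_modified:
        return 304, validators, b""  # headers only, no body
    return 200, validators, resource["body"]


# Hypothetical resource used only for illustration.
resource = {
    "etag": '"abc123"',
    "last_modified": "Mon, 07 Nov 2016 07:51:11 GMT",
    "body": b"<html>hi</html>",
}
first = respond({}, resource)
second = respond({"If-None-Match": '"abc123"'}, resource)
```

The first call has no validators and gets the full content with status 200; the second repeats the ETag it was given and gets a 304 with an empty body.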

Browsers generally do this when the forced cache has expired, or when the user refreshes with F5 (the key may differ between browsers; in Firefox it is F5): if the resource's original response headers contained Last-Modified and ETag, the browser includes If-Modified-Since and If-None-Match in the request headers.

Difference between Last-Modified and ETag

At this point you may ask: if Last-Modified already tells us whether the content of the resource has changed, why is ETag needed? Isn't it superfluous? The main reason is that ETag solves problems that Last-Modified cannot; ETag is more rigorous than Last-Modified.

1. Some files are rewritten periodically without their content changing (only the modification time changes). In that case we do not want the client to think that the file has been modified.

2. Some files are modified very frequently, for example N times within one second. The granularity that If-Modified-Since can check is one second, so such modifications cannot be detected.

3. If you need to cache dynamically generated content, you can use an ETag to control the cache.
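Point 2 can be demonstrated with two versions of a file written within the same second. The timestamps and contents below are invented for the example, and the content-hash ETag is just one possible scheme; formatdate comes from the standard library's email.utils.

```python
import hashlib
from email.utils import formatdate


def last_modified(mtime: float) -> str:
    """Format a modification time as an HTTP date (one-second resolution)."""
    return formatdate(mtime, usegmt=True)


def content_etag(content: bytes) -> str:
    return '"' + hashlib.sha1(content).hexdigest() + '"'


# Two hypothetical versions saved 0.8 seconds apart.
v1 = {"mtime": 1478505071.10, "content": b"version 1"}
v2 = {"mtime": 1478505071.90, "content": b"version 2"}

same_last_modified = last_modified(v1["mtime"]) == last_modified(v2["mtime"])
same_etag = content_etag(v1["content"]) == content_etag(v2["content"])
```

Both versions produce the same Last-Modified string, so If-Modified-Since cannot tell them apart, while their ETags differ.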

Note that if Last-Modified and ETag are both present, the browser sends both values to the server in the same request and gives neither priority. On the server side, the HTTP specification (RFC 7232) says that If-Modified-Since is to be ignored when If-None-Match is present, but whether a given web server compares both or only one may differ between implementations. We will not go into the details here.

The meaning of Cache-Control: no-cache in the HTTP request header

Typically, when you force a refresh with Ctrl+F5, the request headers contain Cache-Control: no-cache. This makes the browser skip its local forced cache and tells the server to skip the comparison cache as well, so that the resource is requested afresh. Front-end developers who call back-end API endpoints with GET should consistently send Cache-Control: no-cache in the request headers of their Ajax calls.
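For consistency with the other examples in this article, here is what attaching that request header looks like in Python using urllib.request from the standard library; the same header can be set on a browser-side Ajax request. The URL and the function name build_api_request are placeholders, and the request is only constructed, not sent.

```python
from urllib.request import Request


def build_api_request(url: str) -> Request:
    """Build a GET request that asks caches not to answer from a stored copy."""
    return Request(url, headers={"Cache-Control": "no-cache"})


# Placeholder URL, used only for illustration.
request = build_api_request("https://example.com/api/items")
```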

 

Summary

1. With forced caching, the server tells the browser how long the response may be cached. Within that time, subsequent requests use the cache directly; once it has passed, the comparison caching strategy is applied.

2. With comparison caching, the ETag and Last-Modified values stored with the cached response are sent to the server in the request, and the server validates them. If it returns status code 304, the browser uses the cached copy directly.
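The two points can be combined into one client-side decision, sketched below. The cache entry layout (stored_at, max_age, etag, last_modified) and the function names are assumptions made for this example, not a description of any browser's internals.

```python
def next_action(cached, now: float) -> str:
    """Decide what the client does for a resource it wants to load."""
    if cached is None:
        return "fetch"        # nothing stored: ordinary request
    if now < cached["stored_at"] + cached["max_age"]:
        return "use-cache"    # forced cache still valid: no request at all
    if cached.get("etag") or cached.get("last_modified"):
        return "revalidate"   # expired: ask the server with validators
    return "fetch"


def conditional_headers(cached: dict) -> dict:
    """Build the request headers used for the comparison cache."""
    headers = {}
    if cached.get("etag"):
        headers["If-None-Match"] = cached["etag"]
    if cached.get("last_modified"):
        headers["If-Modified-Since"] = cached["last_modified"]
    return headers


# Hypothetical cache entry: stored at t=1000 seconds with max-age=60.
entry = {
    "stored_at": 1000.0,
    "max_age": 60,
    "etag": '"abc123"',
    "last_modified": "Mon, 07 Nov 2016 07:51:11 GMT",
}
```

At t=1030 the entry is used without any request; at t=1100 the client revalidates, and a 304 response means the stored content can still be used.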

 

 

 
