HTTP caching mechanism: strong (forced) caching and negotiation caching

HTTP caching mechanism: the basics

Why resources are cached, how a cache hit happens, and when the cache takes effect are things we rarely think through in day-to-day development. This article uses animations to explain the HTTP caching mechanism and its principles from the ground up.

HTTP caching is critical for front-end performance optimization. Reading data from the cache and requesting it from the server are worlds apart in speed.

The part we know best is the 304 status code in an HTTP server response. 304 (Not Modified) tells the browser that its locally cached copy is still valid, so it can use that copy instead of spending time downloading the resource from the server again.

If the cached copy has expired, the browser goes back to the server to validate it.
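To make the 304 exchange concrete, here is a minimal Python sketch of the server side. The function names `make_etag` and `respond`, and the choice of a SHA-256 based ETag as the validator, are assumptions made for illustration. A real server would also handle lists of ETags and the `*` form in If-None-Match, which this sketch ignores.

```python
import hashlib


def make_etag(body: bytes) -> str:
    """Derive a validator from the response body (hypothetical scheme)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def respond(request_headers: dict, body: bytes):
    """Return (status, headers, payload) for a GET of a resource whose content is `body`."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "no-cache"}
    # If the browser's cached copy carries the same ETag, it is still valid:
    # answer 304 with no payload so the browser reuses its local copy.
    if request_headers.get("If-None-Match") == etag:
        return 304, headers, b""
    # Otherwise send the full resource together with its current validator.
    return 200, headers, body
```

The first request gets a 200 with the body and an ETag. A later request that echoes that ETag in If-None-Match gets a 304 with an empty payload, unless the content has changed in the meantime.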

Why is there a cache?


Described purely in computing terms, caching is rather abstract, so let's look at a practical example. Suppose we keep the books we are in the middle of reading on the bookshelf, and the books we have not started or have already finished in a box.

If we stored all the books in the box, we would have to go to the box every time we wanted to read, which is troublesome and time-consuming (think of the box as the server).

When we start a new book, we take it out of the box the first time, read half of it, and then put it on the bookshelf. The next time we read it, we take it straight from the bookshelf. The bookshelf is the cache (a cache store) discussed below.

Web caches can be roughly divided into: database caches, server-side caches (proxy server caches, CDN caches), and browser caches.

Browser caching itself covers several mechanisms: HTTP cache, IndexedDB, cookies, localStorage, and so on. Here we only discuss HTTP caching.

Before getting to know HTTP caching in detail, let's clarify a few terms:

  • Cache hit rate: The ratio of requests served from the cache to all requests. Ideally, the higher the better.
  • Expired content: Content that has exceeded its configured validity period and is marked as stale. Expired content usually cannot be used to answer a client request. The cache must either request new content from the origin server or verify that the cached content is still valid.
  • Verification: Checking whether expired content in the cache is still valid, and refreshing its expiration time if the check passes.
  • Invalidation: Removing content from the cache. When the content changes, the outdated copy must be removed.

Browser caching is mainly the caching mechanism defined by the HTTP protocol. HTML meta tags can also carry caching hints, for example:

<META HTTP-EQUIV="Pragma" CONTENT="no-cache">

which is meant to mirror an HTTP header such as:

Cache-Control: no-store

The intent is to tell the browser not to cache the current page. However, proxy servers do not parse HTML content, so they never see such a tag. In practice, caching is controlled through HTTP headers.

 

HTTP headers control caching


Header-controlled caching falls roughly into two types: strong cache (also called forced cache) and negotiation cache. When the strong cache hits, the browser does not need to contact the server at all, whereas the negotiation cache always requires a round trip to the server, whether it hits or not. The strong cache takes priority over the negotiation cache. The details are introduced below.

Cache-Control: the directives of the Cache-Control header field control how the cache works.

Figure: The Cache-Control header field controls cache behavior. Directive arguments are optional, and multiple directives are separated by ",".

Cache-Control directives can be used in both requests and responses, for example:

Cache-Control: private, max-age=0, no-cache
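The comma-separated format described above can be split into individual directives in a few lines. The following Python sketch uses a hypothetical helper name, `parse_cache_control`, and deliberately ignores quoted arguments that themselves contain commas, so treat it as an illustration rather than a complete parser.

```python
def parse_cache_control(value: str) -> dict:
    """Split a Cache-Control header value into {directive: argument or None}."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; the argument is optional.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("private, max-age=0, no-cache"))
# {'private': None, 'max-age': '0', 'no-cache': None}
```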

Matching process (if there is already a cache):

The rules of the cache:

The path from the browser sending a request to the data coming back resembles the book-fetching process described above.

  1. When the browser loads a resource, it first checks the Expires and Cache-Control headers of the cached response to decide whether the strong cache is hit. If so, it reads the resource directly from the cache without sending a request to the server.
  2. If the strong cache is not hit, the browser sends a request to the server, which uses Last-Modified and ETag (sent back by the browser as If-Modified-Since and If-None-Match) to check whether the negotiation cache is hit. On a hit, the server responds with 304 but does not send the resource data, and the browser still reads the resource from its cache.
  3. If neither cache is hit, the browser loads the resource directly from the server.
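The three steps above can be sketched as a single lookup function. This is a simplified model under stated assumptions: `CachedEntry`, `load`, and the `fetch` callable (which stands in for the network and returns a status code and a body) are hypothetical names, and only the ETag validator is modelled.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class CachedEntry:
    body: bytes
    etag: Optional[str]   # validator stored from the earlier response
    stored_at: float      # time the response was stored, in seconds
    max_age: float        # freshness lifetime, in seconds


def load(entry: Optional[CachedEntry], now: float,
         fetch: Callable[[dict], Tuple[int, bytes]]) -> Tuple[str, bytes]:
    """Return (source, body), where source names the step that served the resource."""
    # Step 1: strong cache. A fresh entry is used without contacting the server.
    if entry is not None and now - entry.stored_at < entry.max_age:
        return "strong cache", entry.body
    # Step 2: negotiation cache. Send the validator and let the server decide.
    headers = {}
    if entry is not None and entry.etag:
        headers["If-None-Match"] = entry.etag
    status, body = fetch(headers)
    if status == 304 and entry is not None:
        return "negotiation cache", entry.body
    # Step 3: no hit at all, so the body comes from the server.
    return "server", body
```

Note that step 1 returns before `fetch` is ever called, which is exactly what distinguishes the strong cache from the negotiation cache.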


Strong cache


The strong cache can be understood as a caching strategy that does not require verification. Two response header fields, Expires and Cache-Control, express its rules.

Expires

Expires gives the point in time at which the cache expires; once that time has passed, the resource is considered expired. One problem is that it uses an absolute time, so if the clock is wrong or the time is not converted to the correct time zone, the cache lifetime can be miscalculated. Expires is also part of the HTTP/1.0 standard; nowadays the Cache-Control header defined in HTTP/1.1 is preferred. When both are present, Cache-Control (its max-age directive) takes priority.
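The priority rule can be illustrated with a short Python sketch that computes how long a response stays fresh. The function name `freshness_lifetime` is hypothetical, and the sketch assumes `response_time` is a timezone-aware datetime taken when the response was received.

```python
import re
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def freshness_lifetime(headers: dict, response_time: datetime) -> Optional[float]:
    """Seconds the response stays fresh, or None if neither header is present."""
    # Cache-Control: max-age wins whenever it is present.
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""), re.IGNORECASE)
    if match:
        return float(match.group(1))
    # Fall back to the absolute Expires date, measured against the response time.
    expires = headers.get("Expires")
    if expires:
        try:
            delta = parsedate_to_datetime(expires) - response_time
            return max(0.0, delta.total_seconds())
        except (TypeError, ValueError):
            return 0.0  # an unparseable date is treated as already expired
    return None


received = datetime(2015, 10, 21, 7, 0, 0, tzinfo=timezone.utc)
print(freshness_lifetime({"Expires": "Wed, 21 Oct 2015 07:28:00 GMT"}, received))  # 1680.0
```

Because max-age is a relative duration, it does not depend on the client and server clocks agreeing, which is the weakness of Expires described above.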

Cache-Control

Cache-Control can carry multiple directives. The main ones are:

1.  max-age: Specifies how long the cache is valid, in seconds. For example, Cache-Control: max-age=31536000 gives a validity period of 31536000 / (24 * 60 * 60) = 365 days. When this resource is accessed for the first time, the server also returns the Expires field, with an expiration time one year later.

As long as caching is not disabled and the validity period has not passed, accessing the resource again hits the cache: the resource is not requested from the server but is read directly from the browser cache.

2.  s-maxage: Same as max-age and overrides both max-age and Expires, but applies only to shared caches; private caches ignore it.

3.  public: Indicates that the response may be cached by any cache (the client that sent the request, a proxy server, and so on).

4.  private: Indicates that the response may be cached only for a single user (for example an operating system user or a browser user). It is not shared and must not be cached by a proxy server.

5.  no-cache: Forces every cache holding the response to send a request with a validator to the server before using the cached data. It does not literally mean "do not cache".

6.  no-store: Caching is prohibited, and the data must be fetched from the server again on every request.
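The directives listed above interact differently for a browser (private) cache and a proxy (shared) cache. The following Python sketch shows one way to express those rules. The function names `may_store` and `fresh_seconds` are hypothetical, and `directives` is assumed to be a dictionary mapping each directive name to its argument (or None), for example the result of parsing the header value.

```python
from typing import Optional


def may_store(directives: dict, shared: bool) -> bool:
    """Can this cache keep a copy of the response at all?"""
    if "no-store" in directives:
        return False          # caching is prohibited everywhere
    if shared and "private" in directives:
        return False          # only the user's own cache may keep it
    return True


def fresh_seconds(directives: dict, shared: bool) -> Optional[int]:
    """How long the stored copy may be reused without revalidation."""
    if "no-cache" in directives:
        return 0              # stored, but revalidated before every reuse
    if shared and "s-maxage" in directives:
        return int(directives["s-maxage"])   # overrides max-age in shared caches
    if "max-age" in directives:
        return int(directives["max-age"])
    return None               # no explicit lifetime given


one_year = {"public": None, "max-age": "31536000", "s-maxage": "600"}
print(fresh_seconds(one_year, shared=False) // (24 * 60 * 60))  # 365
```

With these headers a browser keeps the resource for 365 days, while a shared proxy cache honours the shorter s-maxage of 600 seconds.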

 


Origin blog.csdn.net/qq_34556414/article/details/131779376