Understanding the HTTP cache mechanism in one article

HTTP message

The browser cache mechanism is what we usually call the HTTP cache mechanism. It is driven by cache identifiers, that is, the cache-related header fields carried in HTTP messages, so we first need to understand the HTTP message format:

Request message

Message format: request line, then request headers (general headers, request headers, entity headers), then an optional request body. A body is normally present on methods such as POST, PUT and PATCH; a GET request usually has none.

Response message

Message format: status line, then response headers (general headers, response headers, entity headers), then the response body.
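
To make the message layout concrete, here is a minimal Python sketch that splits a raw response into its status line, headers and body. The function name `parse_http_message` and the sample response are made up for illustration; a real client would rely on an HTTP library rather than parsing by hand.

```python
def parse_http_message(raw: str):
    """Split a raw HTTP/1.1 message into start line, headers and body."""
    # The header section ends at the first empty line (CRLF CRLF).
    head, _, body = raw.partition("\r\n\r\n")
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


# Hypothetical response used only for this example.
raw_response = (
    "HTTP/1.1 200 OK\r\n"
    "Date: Wed, 18 May 2022 08:00:00 GMT\r\n"
    "Cache-Control: max-age=600\r\n"
    "Content-Type: text/plain\r\n"
    "\r\n"
    "hello"
)
start_line, headers, body = parse_http_message(raw_response)
```

Here `Date` and `Cache-Control` are general headers, while `Content-Type` is an entity header describing the body.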

General headers are the header fields that can appear in both request and response messages: Cache-Control, Connection, Date, Pragma, Transfer-Encoding, Upgrade and Via.

Entity headers describe the entity (the message body) being transferred: Allow, Content-Base, Content-Encoding, Content-Language, Content-Length, Content-Location, Content-MD5, Content-Range, Content-Type, ETag, Expires, Last-Modified, plus any extension-header.

Cache analysis

The browser and the server communicate in a request-response model: the browser initiates an HTTP request and the server responds to it. When the browser requests a resource for the first time and receives the result, it decides whether to cache that result according to the cache identifiers in the HTTP headers of the response message. If the result is cacheable, the browser stores both the result and its cache identifiers in the browser cache.

The key points of the browser caching mechanism are:
1. Every time the browser initiates a request, it first looks up the result of that request and its cache identifier in the browser cache.
2. Every time the browser receives a response, it stores the result together with its cache identifier in the browser cache.
These two points ensure that every request both reads from and writes to the cache.
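
The two points above can be sketched as a lookup-then-store loop. This is a deliberately simplified model, not how any real browser is implemented: `BrowserCache`, `fetch` and `fake_server` are hypothetical names, expiry is ignored here, and the only rule applied when storing is that a `no-store` response is skipped.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class CacheEntry:
    body: str
    headers: Dict[str, str]  # the cache identifiers, e.g. Cache-Control, ETag


class BrowserCache:
    def __init__(self) -> None:
        self._entries: Dict[str, CacheEntry] = {}

    def lookup(self, url: str) -> Optional[CacheEntry]:
        return self._entries.get(url)

    def store(self, url: str, body: str, headers: Dict[str, str]) -> bool:
        directives = [d.strip().lower()
                      for d in headers.get("Cache-Control", "").split(",")]
        if "no-store" in directives:
            return False  # the response forbids caching
        self._entries[url] = CacheEntry(body, headers)
        return True


def fetch(url: str, cache: BrowserCache,
          server: Callable[[str], Tuple[str, Dict[str, str]]]) -> Tuple[str, str]:
    entry = cache.lookup(url)          # point 1: look in the cache first
    if entry is not None:
        return entry.body, "cache"
    body, headers = server(url)        # cache miss: ask the server
    cache.store(url, body, headers)    # point 2: store result + identifiers
    return body, "network"


def fake_server(url: str) -> Tuple[str, Dict[str, str]]:
    # Stand-in for a real server, for illustration only.
    return "<html>hi</html>", {"Cache-Control": "max-age=60"}


cache = BrowserCache()
first = fetch("/index.html", cache, fake_server)
second = fetch("/index.html", cache, fake_server)
```

The first call goes to the network and fills the cache; the second is answered from the cache.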

Caching process

Depending on whether the browser needs to send an HTTP request to the server again, the caching process is divided into two parts: mandatory caching (also called forced or strong caching) and negotiation caching.

1. Mandatory caching

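When a request is made, there are three possible cases:

  • No cached result or cache identifier exists: mandatory caching does not apply, and the browser sends the request to the server directly, just like a first request.
  • A cached result and cache identifier exist, but the result has expired: mandatory caching fails, and the browser falls back to negotiation caching.
  • A cached result and cache identifier exist, and the result has not yet expired: mandatory caching takes effect, and the browser uses the cached result directly without contacting the server.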

The rule for mandatory caching is: when the browser sends a request, the server puts the caching rules into the HTTP headers of the response message and returns them to the browser together with the request result. The fields that control mandatory caching are Expires and Cache-Control, and Cache-Control has a higher priority than Expires.

Expires

Expires is the HTTP/1.0 field for controlling web page caching. Its value is the absolute time, set by the server, at which the cached result expires. When the request is made again, the cached result is used directly as long as the client's current time is earlier than the value of Expires.

Expires comes from HTTP/1.0, but browsers now use HTTP/1.1 or later by default, where Expires has been superseded by Cache-Control. The reason is that Expires works by comparing the client's clock with an absolute time returned by the server. If the client and server clocks differ for any reason (for example, an inaccurate clock on the client), the comparison gives the wrong answer, and the mandatory cache may be invalidated too early or kept for too long.
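
The clock problem can be shown with a short Python sketch. The function `is_fresh_by_expires` and the sample times are assumptions made for this example; the date parsing uses the standard library's `email.utils.parsedate_to_datetime`.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_fresh_by_expires(expires_value: str, now: datetime) -> bool:
    """Return True if `now` is still earlier than the Expires time."""
    try:
        expires_at = parsedate_to_datetime(expires_value)
    except (TypeError, ValueError):
        return False  # an unparsable date is treated as already expired
    if expires_at.tzinfo is None:
        # Assume UTC when the header carries no usable zone information.
        expires_at = expires_at.replace(tzinfo=timezone.utc)
    return now < expires_at


expires = "Wed, 18 May 2022 08:10:00 GMT"

# A client whose clock agrees with the server still sees a fresh entry.
accurate_now = datetime(2022, 5, 18, 8, 0, tzinfo=timezone.utc)
# A client whose clock runs 15 minutes fast sees it as expired already.
skewed_now = datetime(2022, 5, 18, 8, 15, tzinfo=timezone.utc)

fresh_with_accurate_clock = is_fresh_by_expires(expires, accurate_now)
fresh_with_skewed_clock = is_fresh_by_expires(expires, skewed_now)
```

The same response is judged fresh by one client and expired by another purely because of the local clock, which is the weakness that a relative lifetime such as max-age avoids.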

Cache-Control

In HTTP/1.1, Cache-Control is the most important field for controlling web page caching. Its main values are:

  1. public: the response may be cached by any cache (both the client and proxy servers)
  2. private: the response may only be cached by the client (the browser); shared caches such as proxy servers must not store it
  3. no-cache: the client may store the response, but must validate it with the server (negotiation caching) before each reuse
  4. no-store: the response must not be stored at all, so neither mandatory caching nor negotiation caching is used
  5. max-age=xxx: the cached response expires xxx seconds after it was generated
  6. s-maxage=xxx: the expiration time, in seconds, that applies to shared caches such as proxy servers, where it overrides max-age
  7. immutable: the resource will not change while it is fresh, so the browser need not revalidate it even when the user reloads the page
  8. max-stale=xxx: a request directive; the client accepts a response that expired no more than xxx seconds ago
  9. stale-while-revalidate=xxx: a stale response may be returned for up to xxx seconds while the cache revalidates it in the background
  10. stale-if-error=xxx: if validation fails with an error, the stale response may be returned for up to xxx seconds
  11. must-revalidate: once expired, the response must not be used until it has been successfully revalidated with the server
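
As a rough illustration of how these directives are read, the sketch below parses a Cache-Control value and applies three of the rules from the list (no-store, no-cache and max-age). The function names are hypothetical, and the parser is simplified: it does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Union


def parse_cache_control(value: str) -> Dict[str, Union[str, bool]]:
    """Turn 'public, max-age=600' into {'public': True, 'max-age': '600'}."""
    directives: Dict[str, Union[str, bool]] = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive; flags without '=' map to True.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else True
    return directives


def usable_without_request(directives: Dict[str, Union[str, bool]],
                           age_seconds: int) -> bool:
    """True if mandatory caching allows reuse without contacting the server."""
    if "no-store" in directives or "no-cache" in directives:
        return False
    max_age = directives.get("max-age")
    if not isinstance(max_age, str) or not max_age.isdigit():
        return False  # this sketch applies no other freshness rules
    return age_seconds < int(max_age)


parsed = parse_cache_control("public, max-age=600")
```

For example, `usable_without_request(parsed, 100)` allows the cached copy to be used, while the same call with an age of 600 or more does not.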
2. Negotiation cache

Negotiation caching is the process in which, after the mandatory cache has expired, the browser sends a request to the server carrying the cache identifier, and the server decides from that identifier whether the cached result can still be used.

  • Negotiation caching takes effect: the server returns 304 (Not Modified) with no body, and the browser keeps using its cached result.
  • Negotiation caching fails: the server returns 200 together with the new request result, which the browser stores in the cache.

As with mandatory caching, the identifiers for negotiation caching are returned to the browser in the HTTP headers of the response message, together with the request result. The fields that control negotiation caching are Last-Modified / If-Modified-Since and ETag / If-None-Match, and ETag / If-None-Match has a higher priority than Last-Modified / If-Modified-Since.

Last-Modified / If-Modified-Since

Last-Modified is a response header in which the server reports the time at which the resource was last modified on the server.

If-Modified-Since is a request header. When the client requests the resource again, it sends back the Last-Modified value from the previous response in this field, telling the server when the copy it holds was last modified. When the server sees If-Modified-Since in the request headers, it compares that value with the resource's current last-modified time on the server. If the resource's last-modified time is later than the If-Modified-Since value, the resource has changed, and the server returns it again with status code 200. Otherwise the server returns 304, meaning the resource has not been updated and the cached file can continue to be used.
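
The server-side comparison described above might look like the following sketch. The handler name `respond_with_last_modified` is invented for this example, and it assumes the resource's modification time is passed in as a timezone-aware UTC `datetime` (which `email.utils.format_datetime` requires when `usegmt=True`).

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def respond_with_last_modified(request_headers: dict,
                               resource_mtime: datetime,
                               body: bytes):
    """Return (status, headers, body) for a request, honouring If-Modified-Since."""
    # HTTP dates only have one-second resolution.
    resource_mtime = resource_mtime.replace(microsecond=0)
    response_headers = {
        "Last-Modified": format_datetime(resource_mtime, usegmt=True)
    }
    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_time = parsedate_to_datetime(since)
        except (TypeError, ValueError):
            since_time = None  # ignore a malformed date and send the resource
        if (since_time is not None and since_time.tzinfo is not None
                and resource_mtime <= since_time):
            return 304, response_headers, b""  # not modified: no body
    return 200, response_headers, body


mtime = datetime(2022, 5, 18, 8, 0, 0, tzinfo=timezone.utc)
status_first, headers_first, _ = respond_with_last_modified({}, mtime, b"v1")

# The client echoes Last-Modified back as If-Modified-Since.
conditional = {"If-Modified-Since": headers_first["Last-Modified"]}
status_unchanged, _, _ = respond_with_last_modified(conditional, mtime, b"v1")

# The file is modified one hour later on the server.
newer = datetime(2022, 5, 18, 9, 0, 0, tzinfo=timezone.utc)
status_changed, _, body_changed = respond_with_last_modified(conditional, newer, b"v2")
```

The first request gets 200, the repeated request for the unchanged file gets 304, and the request after the modification gets 200 with the new body.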

ETag / If-None-Match

ETag is a response header containing a unique identifier for the current version of the resource, generated by the server.

If-None-Match is a request header. When the client requests the resource again, it sends back the ETag value from the previous response in this field, telling the server which version it holds. When the server sees If-None-Match in the request headers, it compares that value with the current ETag of the resource on the server. If they match, the server returns 304, meaning the resource has not been updated and the cached file can continue to be used. If they differ, the server returns the resource again with status code 200.
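
A comparable sketch for the ETag check follows. Deriving the ETag from a hash of the content is only one possible choice, assumed here for illustration; servers are free to generate ETags in other ways. The names `make_etag` and `respond_with_etag` are hypothetical.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # Assumption for this sketch: the ETag is a quoted, truncated content hash.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _strip_weak(tag: str) -> str:
    # A weak validator is written as W/"..."; ignore the prefix when comparing.
    return tag[2:] if tag.startswith("W/") else tag


def respond_with_etag(request_headers: dict, body: bytes):
    """Return (status, headers, body) for a request, honouring If-None-Match."""
    etag = make_etag(body)
    response_headers = {"ETag": etag}
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        candidates = [t.strip() for t in if_none_match.split(",")]
        if "*" in candidates or etag in [_strip_weak(t) for t in candidates]:
            return 304, response_headers, b""  # the client's copy is current
    return 200, response_headers, body


status_first, headers_first, _ = respond_with_etag({}, b"body v1")

# The client echoes the ETag back as If-None-Match.
conditional = {"If-None-Match": headers_first["ETag"]}
status_unchanged, _, _ = respond_with_etag(conditional, b"body v1")
status_changed, _, body_changed = respond_with_etag(conditional, b"body v2")
```

Because the identifier changes whenever the content changes, this check is not limited by the one-second resolution of Last-Modified.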

Caching method

According to where the cached resource is stored, the browser cache can be further divided into the memory cache (from memory cache) and the hard disk cache (from disk cache).
The browser reads the caches in the order memory -> disk. For example:
visit a website -> 200 -> close the tab -> reopen the site -> 200 (from disk cache) -> refresh -> 200 (from memory cache)

  • Memory cache (from memory cache): the memory cache has two characteristics, fast reading and a short lifetime:
  1. Fast reading: the memory cache keeps the compiled and parsed files directly in the memory of the process. This occupies some of the process's memory, but allows fast reading the next time the resource is needed.
  2. Short lifetime: once the process is closed, its memory is cleared and the cached content is lost.
  • Hard disk cache (from disk cache): the hard disk cache writes the cached content directly to files on the hard disk. Reading it requires I/O operations on those files, after which the content must be parsed again, so reading is more complex and slower than from the memory cache.

In the browser, files such as JavaScript and images are typically stored in the memory cache after they have been parsed and executed, so when the page is refreshed they can be read straight from memory (from memory cache). CSS files, by contrast, are typically written to hard disk files, so each time the page is rendered they are read from the hard disk (from disk cache). The exact behavior depends on the browser and its version.
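
The memory -> disk read order can be modelled with a toy two-tier cache. This is only an illustration of the lookup order and of the memory cache being lost when the process closes; `TwoTierCache` and its methods are hypothetical, and real browsers decide what goes into each tier with far more elaborate policies.

```python
import hashlib
import os
import tempfile


class TwoTierCache:
    def __init__(self, disk_dir: str) -> None:
        self._memory = {}          # fast, but lost when the process ends
        self._disk_dir = disk_dir  # slower, but survives a restart

    def _path(self, url: str) -> str:
        name = hashlib.sha256(url.encode()).hexdigest()
        return os.path.join(self._disk_dir, name)

    def put(self, url: str, body: bytes) -> None:
        self._memory[url] = body
        with open(self._path(url), "wb") as f:
            f.write(body)

    def get(self, url: str):
        if url in self._memory:                 # 1. try memory first
            return self._memory[url], "from memory cache"
        path = self._path(url)
        if os.path.exists(path):                # 2. then fall back to disk
            with open(path, "rb") as f:
                body = f.read()
            self._memory[url] = body            # keep it in memory for next time
            return body, "from disk cache"
        return None, "miss"

    def close_process(self) -> None:
        self._memory.clear()                    # memory does not survive


with tempfile.TemporaryDirectory() as disk_dir:
    two_tier = TwoTierCache(disk_dir)
    two_tier.put("/app.js", b"console.log(1)")  # first visit
    two_tier.close_process()                    # close the tab
    after_reopen = two_tier.get("/app.js")      # reopen the site
    after_refresh = two_tier.get("/app.js")     # refresh the page
```

This reproduces the sequence from the example: the reopened page is served from the disk cache, and the following refresh from the memory cache.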

Summary

Mandatory caching takes precedence over negotiation caching. If mandatory caching (Expires and Cache-Control) takes effect, the cached result is used directly. If it does not, negotiation caching is performed (Last-Modified / If-Modified-Since and ETag / If-None-Match), and the server decides whether the cache can be used. If negotiation caching fails, the cached result is invalid: the server returns 200 with a fresh result, which is then stored in the browser cache. If it takes effect, the server returns 304 and the browser continues to use the cache.
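
The whole decision can be condensed into one client-side function. This is a simplified sketch under stated assumptions: `plan_request` and the entry layout (`headers` plus a `stored_at` time) are invented for the example, only max-age, no-cache and Expires are considered, and the returned strings are just labels for the three outcomes.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def _max_age(cache_control: str):
    for part in cache_control.split(","):
        name, _, value = part.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def plan_request(entry, now: datetime):
    """entry is None or {'headers': {...}, 'stored_at': datetime}."""
    if entry is None:
        return "send normal request", {}
    headers = entry["headers"]
    cache_control = headers.get("Cache-Control", "")
    names = {p.strip().split("=")[0].lower() for p in cache_control.split(",")}

    # Step 1: mandatory caching. Cache-Control takes priority over Expires.
    fresh = False
    if "no-cache" not in names:
        max_age = _max_age(cache_control)
        if max_age is not None:
            fresh = now - entry["stored_at"] < timedelta(seconds=max_age)
        elif "Expires" in headers:
            try:
                fresh = now < parsedate_to_datetime(headers["Expires"])
            except (TypeError, ValueError):
                fresh = False
    if fresh:
        return "use cache", {}

    # Step 2: negotiation caching. Send the stored identifiers back.
    conditional = {}
    if "ETag" in headers:
        conditional["If-None-Match"] = headers["ETag"]
    if "Last-Modified" in headers:
        conditional["If-Modified-Since"] = headers["Last-Modified"]
    if conditional:
        return "revalidate", conditional   # server answers 304 or 200
    return "send normal request", {}


t0 = datetime(2022, 5, 18, 8, 0, tzinfo=timezone.utc)
entry = {
    "headers": {
        "Cache-Control": "max-age=60",
        "Expires": "Wed, 18 May 2022 09:00:00 GMT",
        "ETag": '"abc"',
    },
    "stored_at": t0,
}
no_entry = plan_request(None, t0)
within_max_age = plan_request(entry, t0 + timedelta(seconds=30))
after_max_age = plan_request(entry, t0 + timedelta(seconds=90))
```

After 90 seconds the entry is revalidated even though Expires is still an hour away, because max-age in Cache-Control takes priority over Expires.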

Origin blog.csdn.net/weixin_43867717/article/details/124836276