On the browser's caching mechanism

The browser's cache can be divided into the HTTP cache and the offline cache. Each is described below.

HTTP caching

In practice, browsers cache only GET requests; POST requests are not cached.
Last-Modified / ETag / Expires / Cache-Control are the caching mechanisms defined by the HTTP protocol.

An example to start with

When we visit the Baidu home page a second time and open a static file in Chrome's Network panel, the response status is shown as 200 OK (from disk cache). Shouldn't it be 304 Not Modified? If you already know the answer, you can skip this article.
Figure: cached files on the Baidu home page

Cache-Control

Brief introduction

The w3.org definition is: “The Cache-Control general-header field is used to specify directives which MUST be obeyed by all caching mechanisms along the request/response chain.”

This is a general header field (that is, it can appear in both request and response messages) used to control HTTP caching behavior.
For example:
Cache-Control: max-age=3600, public
public means that the response may be stored by any cache.
max-age gives the number of seconds for which the cached copy is valid. Allowing your site to be cached greatly reduces download time and bandwidth, and also speeds up page loading in the browser.
You can also set no-cache to stop a cached copy from being used without first checking with the server:
Cache-Control: no-cache
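
As a rough illustration of how such a header value breaks down into directives, here is a minimal Python sketch. The function name parse_cache_control is hypothetical and not part of any library, and the parser is deliberately simplified: it splits on commas, so it would mishandle a quoted argument that itself contains a comma.

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument-or-None}.

    Hypothetical helper for illustration; assumes no commas inside quotes.
    """
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, value = part.partition("=")
        # Directive names are case-insensitive, so normalise to lower case.
        name = name.strip().lower()
        directives[name] = value.strip().strip('"') if sep else None
    return directives


print(parse_cache_control("max-age=3600, public"))
print(parse_cache_control("no-cache"))
```

Directives that take no argument, such as public or no-cache, map to None here, while max-age keeps its value as a string so the caller can decide how to interpret it.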

Details

Cache-Control is a general header field, which means it can be used in both request messages and response messages. The RFC specifies its format as:
Cache-Control: cache-directive
As a request header, the common cache-directive values are:

A request with Cache-Control: max-age=0 means the cached copy must be validated (via ETag or Last-Modified) before it is used. If validation shows the copy is still current, it may be used.

A request with Cache-Control: no-cache means the browser wants only the latest file. In practice the stored copy is bypassed, much as if the response had been marked no-store.

As a response header, the common cache-directive values are:

public: shared caching. The response may be stored by shared caches such as caching proxy servers and CDNs.

private: private caching. The response must not be stored by shared caches such as proxy servers, but may be stored by the user's own browser.

max-age=[seconds]: the cached copy is fresh, and needs no update, within this time range. It is similar to Expires, but the time is relative rather than absolute: it says for how many seconds after a successful request the cached copy stays fresh.

s-maxage=[seconds]: similar to max-age, except that it applies only to shared caches (such as a proxy).

no-cache: this does not mean "do not cache". It means that before every use of the cached copy, a validation request must be sent to the origin server to check whether the file has changed (in practice this works the same way as ETag / Last-Modified validation).

no-store: caching is prohibited. Neither the browser nor any other cache may keep a copy.

must-revalidate: once the cached copy has expired, the cache must check it with the origin server again before reusing it. If the resource is unchanged, the status code returned is 304 rather than 200.

proxy-revalidate: similar to must-revalidate, except that it applies only to proxy caches.

no-cache behaves much like max-age=0 (strictly, like max-age=0 combined with must-revalidate).

Cache-Control allows directives to be combined freely, for example:
Cache-Control: max-age=3600, must-revalidate
This means the resource is fetched from the origin server and the cached copy stays fresh for one hour. If the user accesses the resource again within that hour, no request needs to be sent. Once the hour has passed, the copy must be revalidated before it is used again.
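
The way these directives interact can be sketched as a small decision function. This is a simplified Python model, not a full implementation of the HTTP caching rules: the function name cache_decision and its three return values are hypothetical, directives are assumed to be already parsed into a dict, and heuristic freshness (used by real caches when no lifetime is given) is ignored.

```python
from typing import Dict, Optional


def cache_decision(directives: Dict[str, Optional[str]], age_seconds: int,
                   shared_cache: bool = False) -> str:
    """Return 'no-store', 'revalidate' or 'fresh' for a response.

    Hypothetical, simplified model of the response directives described above.
    """
    if "no-store" in directives:
        return "no-store"        # must not be kept at all
    if shared_cache and "private" in directives:
        return "no-store"        # proxies and CDNs may not keep private responses
    if "no-cache" in directives:
        return "revalidate"      # stored, but checked with the origin before each use
    lifetime = directives.get("max-age")
    # s-maxage overrides max-age, but only for shared caches such as proxies.
    if shared_cache and directives.get("s-maxage") is not None:
        lifetime = directives["s-maxage"]
    if lifetime is not None and age_seconds < int(lifetime):
        return "fresh"           # serve from cache without sending a request
    return "revalidate"          # expired, or no explicit lifetime was given


example = {"max-age": "3600", "must-revalidate": None}
print(cache_decision(example, age_seconds=1800))
print(cache_decision(example, age_seconds=4000))
```

With max-age=3600, must-revalidate, a copy that is 1800 seconds old is served straight from the cache, while one that is 4000 seconds old has to be checked with the origin server first.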

Cache validation fields

The Cache-Control field lets the client decide whether to send a request to the server at all. If the configured cache lifetime has not expired, the data is simply taken from the local cache (Chrome shows this as 200 from cache). If the lifetime has expired, or the resource should not be served straight from the cache, a request is sent to the server.

The question now is: if the client does send a request to the server, must the server send the entire body of the resource back?

Consider this case: the copy cached by the client has expired, but the resource on the server has not actually changed. If the resource is large, is it not a waste of bandwidth and time for the server to send the same thing back all over again?

The answer is yes. So is there a way for the server to learn that the file in the client's cache is in fact identical to its own, and then tell the client: "Just use what is in your cache. Nothing has changed on my side, so I will not send it again."

To let the client and the server verify whether a cached file is up to date, and so raise the cache reuse rate, HTTP defines the following two validation methods.

Validation based on Last-Modified

Request header field

  • If-Modified-Since: checks whether the resource's last modification time is unchanged

Response header field

  • Last-Modified: the time at which the resource was last modified

When the server sends a resource to the client, it adds the resource's last modification time to the response headers in the form Last-Modified: <GMT date>.

The client records this value alongside the resource. On the next request for the same resource, it sends the value back to the server in If-Modified-Since for checking. If the resource's modification time on the server matches the value that was sent, the resource has not been modified and the server can simply return status code 304.
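
The server side of this exchange might look roughly like the following Python sketch. The function respond_last_modified is hypothetical and returns only a status code and headers; a real server would also send the body on 200. It assumes the modification time is a timezone-aware datetime, and it uses the standard library's email.utils helpers to format and parse HTTP dates.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond_last_modified(resource_mtime: datetime,
                          if_modified_since: Optional[str]) -> Tuple[int, Dict[str, str]]:
    """Hypothetical handler: decide between 200 and 304 from modification times."""
    # HTTP dates have one-second resolution, so drop sub-second precision.
    mtime = resource_mtime.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(mtime, usegmt=True)}
    if if_modified_since:
        try:
            unchanged = mtime <= parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            unchanged = False  # unparseable date: ignore the header
        if unchanged:
            return 304, headers  # client copy is current, no body needed
    return 200, headers          # send the full resource again


mtime = datetime(2015, 10, 21, 7, 28, 0, tzinfo=timezone.utc)
status, headers = respond_last_modified(mtime, None)
print(status, headers)
print(respond_last_modified(mtime, headers["Last-Modified"])[0])
```

The first call models an initial visit (no If-Modified-Since, so a full 200 response). The second replays the Last-Modified value the client was given, and the server answers 304.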

Validation based on ETag

Request header field

  • If-None-Match: checks whether the ETag still matches

Response header field

  • ETag: an identifier that matches the current content of the resource

The server runs the resource through some algorithm to compute a unique identifier (for example an MD5 digest). When it sends the resource to the client, it adds the header ETag: <unique identifier> to the response.

The client keeps the ETag value and sends it back to the server with the next request. The server only needs to compare the ETag sent by the client with the ETag of the resource it currently holds to determine whether the resource has changed relative to the client's copy.

If the server finds that the ETag does not match, it returns the new resource (including, of course, the new ETag) to the client as a regular 200 response to the GET. If the ETag matches, it returns 304 to tell the client that its local cache can be used as it is.
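
A minimal Python sketch of this comparison follows. The names make_etag and respond_etag are hypothetical, and MD5 is used only because the text mentions it as an example; any stable fingerprint of the content would serve. The weak-validator prefix W/ is stripped before comparing, since If-None-Match uses weak comparison.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    # Quoted MD5 hex digest of the content (an assumption; servers choose their own scheme).
    return '"' + hashlib.md5(body).hexdigest() + '"'


def respond_etag(body: bytes,
                 if_none_match: Optional[str]) -> Tuple[int, Dict[str, str], bytes]:
    """Hypothetical handler: 304 with an empty body if the client's ETag matches."""
    etag = make_etag(body)
    headers = {"ETag": etag}
    if if_none_match is not None:
        candidates = []
        for tag in if_none_match.split(","):
            tag = tag.strip()
            # Ignore the weak-validator marker when comparing.
            candidates.append(tag[2:] if tag.startswith("W/") else tag)
        if "*" in candidates or etag in candidates:
            return 304, headers, b""
    return 200, headers, body


status, headers, body = respond_etag(b"hello", None)
print(status, headers["ETag"])
print(respond_etag(b"hello", headers["ETag"])[0])
print(respond_etag(b"hello, changed", headers["ETag"])[0])
```

The three calls model a first visit, a revalidation of unchanged content, and a revalidation after the content changed.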

Why ETag

  • Some files are rewritten periodically without their content changing (only the modification time changes). In that case we do not want the client to treat the file as modified and GET it again.
  • Some files are modified very frequently, for example several times within one second. If-Modified-Since can only check at one-second granularity, so such changes cannot be detected (UNIX mtime records are likewise accurate only to the second).
  • Some servers cannot obtain the exact last modification time of a file.
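
The first two points can be reproduced with a few lines of Python. This is only an illustration under assumed values: the timestamps and file contents are made up, and the helper names last_modified and etag are hypothetical.

```python
import hashlib
from email.utils import formatdate


def last_modified(mtime: float) -> str:
    # HTTP date string, which carries no sub-second precision.
    return formatdate(mtime, usegmt=True)


def etag(body: bytes) -> str:
    return hashlib.md5(body).hexdigest()


# Case 1: two different versions written within the same second.
v1, v2 = b"version 1", b"version 2"
t1, t2 = 1_600_000_000.10, 1_600_000_000.90
same_date = last_modified(t1) == last_modified(t2)  # the change is invisible to Last-Modified
same_tag = etag(v1) == etag(v2)                     # the change is visible to ETag

# Case 2: the file is rewritten a minute later with identical content.
touched_date_changed = last_modified(t1) != last_modified(t1 + 60)
touched_tag_changed = etag(v1) != etag(v1)

print(same_date, same_tag)
print(touched_date_changed, touched_tag_changed)
```

In the first case the two dates are identical while the tags differ, so only ETag notices the edit. In the second case the date changes while the tag does not, so only ETag avoids a needless re-download.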

Combining caching policies

Caching in practice

Use Expires for compatibility with older browsers and Cache-Control for more precise control of the cache. Manage resources by category (HTML, JS, CSS, images, fonts), and enable both ETag and Last-Modified (newer versions of nginx turn both on by default) to further reduce traffic. The Tencent home page, for example, is cached for 60s, while static resources are given much longer lifetimes.
When a long-cached static resource needs to be updated: add an MD5 string or a time tag to the file name or to the query parameters.
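
One way to derive such names is sketched below in Python. The functions fingerprint and versioned_url are hypothetical helpers, and the choice of an 8-character MD5 prefix is an assumption made for the example; build tools typically do this step automatically.

```python
import hashlib
from pathlib import PurePosixPath


def _digest(content: bytes, length: int = 8) -> str:
    # Short content hash; it changes whenever the file content changes.
    return hashlib.md5(content).hexdigest()[:length]


def fingerprint(path: str, content: bytes) -> str:
    """Put the hash in the file name: static/app.js -> static/app.<hash>.js"""
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{_digest(content)}{p.suffix}"))


def versioned_url(path: str, content: bytes) -> str:
    """Put the hash in a query parameter: static/app.js?v=<hash>"""
    return f"{path}?v={_digest(content)}"


print(fingerprint("static/app.js", b"hello"))
print(versioned_url("static/app.js", b"hello"))
```

Because the URL changes whenever the content does, the old cached copy is simply never requested again, which is what makes very long cache lifetimes safe for static resources.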

Back to the opening example

Looking back at the question from the beginning of the article, the answer should now be easy.

The Baidu home page's resources did not actually trigger any request after the refresh, because the caching period defined by Cache-Control had not yet expired. Even though Chrome sends no request, whenever a resource is taken from the local cache the Network panel shows a placeholder request with status 200 marked from cache, which keeps only the response content from the last real request.

Policies enforced by browsers

In most current browsers, clicking the Refresh button or pressing F5 forces a Cache-Control: max-age=0 field onto the request. So the "refresh" referred to here means selecting the URL in the address bar and pressing Enter (which does not force Cache-Control onto the request).

Controlling the HTTP cache with a Service Worker

Here is a brief look at the Service Worker. Like a Web Worker, it is a thread that runs in the browser's background, but it has greater authority: it can intercept HTTP requests and control the cache programmatically (the cache is exposed in JS as the global object caches, whose type is CacheStorage). Because of this power, only HTTPS sites can use a Service Worker; localhost is a special case.
In addition, Service Workers are fully compatible with the fetch API.

Offline caching

Offline caching targets the case where an app should still operate normally without a network. In JS it is exposed through the global object applicationCache.

Please try not to use this feature. It has been removed from the Web standards; although some browsers still support it, support may be dropped at some point in the future. Use Service Workers instead. Note, however, that Service Workers are still an experimental technology: caniuse shows support in Chrome 49+ and Firefox 58+, while no version of IE supports them. As a new technology with limited compatibility, they may be worth watching from the sidelines for now.


Original: Big Box  caching mechanism On the browser



Origin www.cnblogs.com/dajunjun/p/11640084.html