HTTP header caching

Browser caching mechanism

Browser caching is mostly the caching mechanism defined by the HTTP protocol (for example, the Expires and Cache-Control headers). There are also caching mechanisms that are not defined by the HTTP protocol. For example, web developers can add an HTML <meta> tag to the <head> node of a page. The code is as follows:

html code

<META HTTP-EQUIV="Pragma" CONTENT="no-cache">

This code tells the browser not to cache the current page, so every visit has to fetch it from the server again. It is simple to use, but only some browsers support it, and caching proxy servers generally do not, because a proxy does not parse the HTML content itself.

Below I mainly introduce the caching mechanism defined by the HTTP protocol.

Expires strategy

Expires is a response header field set by the web server. It tells the browser that, until the expiration time, it can take the resource directly from the browser cache without sending another request.

The following is the response header the web server returned when the browser fetched jquery.js in the baby PK project:

 

Note: The Date header field indicates when the message was sent. The time format is defined by RFC 822, for example, Date: Mon, 31 Dec 2001 04:25:57 GMT.

The web server tells the browser that the cached file is valid until 2012-11-28 03:30:01. The request was sent at 2012-11-28 03:25:01, so the file is cached for 5 minutes.
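The 5-minute figure can be checked with a small Python sketch. The two header strings are reconstructed from the times quoted above and are assumed to be GMT; the function names are illustrative only.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def expires_lifetime_seconds(date_header: str, expires_header: str) -> float:
    """Seconds between the response's Date header and its Expires header."""
    sent_at = parsedate_to_datetime(date_header)
    expires_at = parsedate_to_datetime(expires_header)
    return (expires_at - sent_at).total_seconds()


def is_fresh(expires_header: str, now: datetime) -> bool:
    """True while the given time is still before the Expires time."""
    return now < parsedate_to_datetime(expires_header)


# Assumed header values, rebuilt from the times in the text (GMT assumed).
date_header = "Wed, 28 Nov 2012 03:25:01 GMT"
expires_header = "Wed, 28 Nov 2012 03:30:01 GMT"

lifetime = expires_lifetime_seconds(date_header, expires_header)
print(lifetime)  # 300.0 seconds, i.e. 5 minutes
```

Note that is_fresh compares against a clock supplied by the caller, which is exactly where Expires becomes fragile if the client clock is wrong.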

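However, Expires dates from HTTP 1.0, and browsers now use HTTP 1.1 by default. Because Expires is an absolute time, it also depends on the client and server clocks agreeing. When Cache-Control: max-age is present it overrides Expires, so Expires now mostly serves as a fallback.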

Cache-Control strategy (main focus)

Cache-Control serves the same purpose as Expires: both indicate how long the current resource stays valid and control whether the browser takes the data directly from its cache or sends a new request to the server. Cache-Control simply offers more options and finer-grained settings. If both are set, Cache-Control takes priority over Expires.

HTTP header Cache-Control:

Its value can be public, private, no-cache, no-store, no-transform, must-revalidate, proxy-revalidate, or max-age.

The directives have the following meanings:

  1. public indicates that the response can be stored by any cache.
  2. private indicates that all or part of the response is intended for a single user and must not be stored by a shared cache. This lets the server mark a response, or part of it, as valid for one user only and not for other users' requests.
  3. no-cache does not forbid storing the response; it indicates that a cache must revalidate the stored response with the server before reusing it.
  4. no-store prevents sensitive information from being released unintentionally. A cache must not store any part of a request or response that carries it.
  5. max-age indicates that a response is considered fresh for at most the specified time (in seconds).
  6. min-fresh indicates that the client will accept a response that remains fresh for at least the specified time (in seconds) from now.
  7. max-stale indicates that the client will accept a response that has already expired. If a value is given, the response must not have been expired for longer than that value (in seconds).

For the same request as above, the web server returns a Cache-Control header with the value max-age=300, that is, 5 minutes (the same lifetime as the Expires example above, although the two are not required to match).
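As a rough illustration of how a client might read such a header, here is a minimal Python sketch. The function names are hypothetical, and the parser splits on commas only, so it does not handle quoted arguments that themselves contain commas.

```python
from typing import Dict, Optional


def parse_cache_control(header_value: str) -> Dict[str, Optional[str]]:
    """Split a Cache-Control value into {directive: argument or None}."""
    directives: Dict[str, Optional[str]] = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, sep, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives


def freshness_lifetime(directives: Dict[str, Optional[str]]) -> Optional[int]:
    """Seconds the response may be reused without contacting the server.

    Simplification: no-store and no-cache are both treated as a lifetime
    of 0, although no-cache still allows the response to be stored.
    """
    if "no-store" in directives or "no-cache" in directives:
        return 0
    max_age = directives.get("max-age")
    return int(max_age) if max_age is not None else None


parsed = parse_cache_control("public, max-age=300")
print(parsed)                      # {'public': None, 'max-age': '300'}
print(freshness_lifetime(parsed))  # 300
```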

 

Last-Modified/If-Modified-Since

Last-Modified/If-Modified-Since should be used together with Cache-Control.

• Last-Modified: Indicates when the response resource was last modified. The web server includes it in the response to tell the browser the resource's last modification time.

• If-Modified-Since: When the resource expires (according to the max-age set by Cache-Control) and the cached response has a Last-Modified header, the browser adds an If-Modified-Since header to the next request, carrying the Last-Modified value it received earlier. When the web server sees If-Modified-Since, it compares that value with the resource's current last modification time. If the resource's modification time is newer, the resource has changed, and the server responds with HTTP 200 and the full resource content in the response body. If it is not newer, the resource has not changed, and the server responds with HTTP 304 (no response body is needed, which saves bandwidth), telling the browser to keep using its cached copy.
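The server-side comparison described above might look roughly like this in Python. This is a sketch, not how any particular web server implements it; respond_to_conditional_get is a made-up name, and last_modified is assumed to be a timezone-aware UTC datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime
from typing import Dict, Optional, Tuple


def respond_to_conditional_get(
    last_modified: datetime, if_modified_since: Optional[str]
) -> Tuple[int, Dict[str, str]]:
    """Return (status, headers) for a GET that may carry If-Modified-Since."""
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since is not None:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # Unparsable date: treat the header as absent.
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers  # Not modified: no body is sent.
    return 200, headers  # Modified or unconditional: send the full resource.


modified_at = datetime(2012, 11, 28, 3, 25, 1, tzinfo=timezone.utc)
print(respond_to_conditional_get(modified_at, "Wed, 28 Nov 2012 03:25:01 GMT")[0])  # 304
print(respond_to_conditional_get(modified_at, "Wed, 28 Nov 2012 03:00:00 GMT")[0])  # 200
```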

ETag/If-None-Match

ETag/If-None-Match should also be used together with Cache-Control.

• ETag: When the web server responds to a request, it tells the browser the unique identifier of the current resource on the server (the generation rule is up to the server). In Apache, for example, the ETag is derived from file attributes such as the inode number (INode), size (Size), and last modification time (MTime); which of these are used by default depends on the Apache version and the FileETag setting.

• If-None-Match: When the resource expires (according to the max-age set by Cache-Control) and the cached response has an ETag header, the browser includes an If-None-Match header (with the ETag value) in the next request. When the web server sees If-None-Match, it compares the value with the current identifier of the requested resource and decides whether to return 200 or 304.
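A minimal Python sketch of this exchange follows. Hashing the response body is only one possible generation rule, chosen here for illustration; the function names are hypothetical, and the comma split assumes the tags themselves contain no commas.

```python
import hashlib
from typing import Dict, Optional, Tuple


def make_etag(body: bytes) -> str:
    """Build an ETag from a hash of the response body (an assumed rule)."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def _opaque(tag: str) -> str:
    """Strip whitespace and a weak-validator prefix (W/) from an entity tag."""
    tag = tag.strip()
    return tag[2:] if tag.startswith("W/") else tag


def etag_matches(if_none_match: str, current_etag: str) -> bool:
    """Weak comparison of an If-None-Match value against the current ETag."""
    if if_none_match.strip() == "*":
        return True
    candidates = [_opaque(t) for t in if_none_match.split(",")]
    return _opaque(current_etag) in candidates


def respond(body: bytes, if_none_match: Optional[str] = None) -> Tuple[int, Dict[str, str], bytes]:
    """Return (status, headers, payload) for a GET that may carry If-None-Match."""
    etag = make_etag(body)
    if if_none_match is not None and etag_matches(if_none_match, etag):
        return 304, {"ETag": etag}, b""  # Unchanged: empty body.
    return 200, {"ETag": etag}, body


first_status, first_headers, _ = respond(b"hello")
second_status, _, second_payload = respond(b"hello", first_headers["ETag"])
print(first_status, second_status)  # 200 304
```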

Why use ETag when there is already Last-Modified?

You might think Last-Modified is enough for the browser to know whether its cached copy is fresh, so why is ETag (entity tag) needed? ETag was introduced in HTTP 1.1 mainly to solve several problems that Last-Modified handles poorly:

• Last-Modified is only accurate to the second. If a file is modified several times within one second, the timestamp cannot tell the versions apart.

• Some files are regenerated periodically. The content may not have changed, but Last-Modified changes anyway, so the cached copy cannot be reused.

• The server may not obtain the file modification time accurately, or its clock may be inconsistent with the proxy server's.

ETag is an identifier for the resource that is unique on the server side, generated automatically by the server or by the developer, and it allows more precise cache control. Last-Modified and ETag can be used together. The server checks the ETag first; the HTTP/1.1 specification gives If-None-Match precedence over If-Modified-Since, although some servers also compare Last-Modified when the ETag matches before deciding whether to return 304.
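The regenerated-file problem can be reproduced with a short Python sketch. It assumes an ETag computed from the file content (one possible developer-defined rule, not the Apache default) and uses a temporary file with made-up timestamps.

```python
import hashlib
import os
import tempfile


def content_etag(path: str) -> str:
    """ETag derived only from the file content (an assumed generation rule)."""
    with open(path, "rb") as f:
        return '"' + hashlib.sha256(f.read()).hexdigest() + '"'


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "report.txt")

    with open(path, "wb") as f:
        f.write(b"same content")
    os.utime(path, (1_000_000_000, 1_000_000_000))  # Pretend it was written earlier.
    mtime_before, etag_before = os.path.getmtime(path), content_etag(path)

    # Regenerate the file ten minutes later with identical content.
    with open(path, "wb") as f:
        f.write(b"same content")
    os.utime(path, (1_000_000_600, 1_000_000_600))
    mtime_after, etag_after = os.path.getmtime(path), content_etag(path)

mtime_changed = mtime_before != mtime_after  # Last-Modified would change.
etag_changed = etag_before != etag_after     # The content-based ETag would not.
print(mtime_changed, etag_changed)  # True False
```

With Last-Modified alone the browser would download the file again; with the content-based ETag the server can still answer 304.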

User behavior and caching

Browser caching behavior also depends on what the user does.

User action                 Expires/Cache-Control    Last-Modified/ETag
Enter in the address bar    effective                effective
Page link jump              effective                effective
New window                  effective                effective
Forward, backward           effective                effective
F5 refresh                  ineffective              effective
Ctrl+F5 refresh             ineffective              ineffective
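One way to read the table is as a lookup that decides what kind of request the browser ends up making. The Python sketch below encodes the table rows directly; the action keys and the plan_request function are hypothetical names, and real browsers differ in the details.

```python
from typing import Dict, Tuple

# (freshness headers honored, validators honored) per user action,
# copied from the table above.
HONORED: Dict[str, Tuple[bool, bool]] = {
    "address bar": (True, True),
    "link": (True, True),
    "new window": (True, True),
    "back/forward": (True, True),
    "f5": (False, True),
    "ctrl+f5": (False, False),
}


def plan_request(action: str, is_fresh: bool, has_validator: bool) -> str:
    """Decide how the browser obtains a resource it already has in its cache."""
    freshness_ok, validators_ok = HONORED[action]
    if freshness_ok and is_fresh:
        return "use cache"            # No request is sent at all.
    if validators_ok and has_validator:
        return "conditional request"  # If-Modified-Since / If-None-Match.
    return "full request"             # The cached copy is bypassed.


print(plan_request("link", is_fresh=True, has_validator=True))     # use cache
print(plan_request("f5", is_fresh=True, has_validator=True))       # conditional request
print(plan_request("ctrl+f5", is_fresh=True, has_validator=True))  # full request
```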

Summary

Browser first request:

 

 

When the browser requests again:
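The repeat-request flow described in the sections above can be sketched in Python as follows. CacheEntry and repeat_request_headers are illustrative names, and the sketch assumes freshness is governed by max-age only.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class CacheEntry:
    stored_at: float                     # Epoch seconds when the response was cached.
    max_age: int                         # From Cache-Control: max-age.
    etag: Optional[str] = None           # From the ETag response header.
    last_modified: Optional[str] = None  # From the Last-Modified response header.


def repeat_request_headers(entry: CacheEntry, now: float) -> Optional[Dict[str, str]]:
    """Return None if the cached copy can be used as is, else the request headers."""
    if now - entry.stored_at < entry.max_age:
        return None  # Still fresh: serve from cache without contacting the server.
    headers: Dict[str, str] = {}
    if entry.etag is not None:
        headers["If-None-Match"] = entry.etag
    if entry.last_modified is not None:
        headers["If-Modified-Since"] = entry.last_modified
    return headers  # An empty dict means a plain, unconditional request.


entry = CacheEntry(
    stored_at=0.0,
    max_age=300,
    etag='"abc123"',
    last_modified="Wed, 28 Nov 2012 03:25:01 GMT",
)
print(repeat_request_headers(entry, now=100.0))  # None
print(repeat_request_headers(entry, now=400.0))
```

If the server answers the conditional request with 304, the browser keeps its cached copy; if it answers 200, the browser replaces the copy with the new response.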

                                                                                             

From: http://www.cnblogs.com/skynet/archive/2012/11/28/2792503.html

 
