2019/06/06 Enterprise Cache Acceleration: Varnish Service Getting Started, Installation and Configuration

Caching is layered. Generally there is a private cache in the client browser, and public caches on the intermediate proxy servers (their main purpose is to cache shared content, not users' private data). These cache servers may themselves have multiple levels: a CDN level, and another level inside your own site. By the time a request actually reaches the origin-server level, roughly 80-90% of the total access traffic has already been filtered out and answered directly by the front-end proxy servers.
Insert picture description here
Some dynamic content can also be cached; the administrator of the cache server needs to define the relevant policies. If they are defined properly, the pressure on the internal servers becomes very small.
When HTTP applies caching, there are roughly two sets of logic:

Insert picture description here
1. The expiration-time mechanism. In HTTP/1.0, the Expires header tells the client when the cached copy expires. It is an absolute time, so expiration based on Expires only works if everyone shares the same time zone; as soon as the site serves users globally, problems appear.
2. Conditional requests, controlled through message headers in the requests sent to and responses returned by the web server.

Insert picture description here
Before the content expires it can be returned to the user directly from the local browser cache, but the absolute-time format may not handle different time zones correctly, resulting in failures.
To deal with this situation, the HTTP/1.1 protocol introduced a specific header, Cache-Control, to control the maximum lifetime: max-age lets you define a duration (for example 10s), and no matter what the local clock says, the entry lives for that length of time, which is much easier to get right than the absolute-time approach.
s-maxage can be seen as controlling the maximum lifetime in a public (shared) cache.

Insert picture description here
To stay compatible with both HTTP/1.0 and HTTP/1.1, a response can carry both headers; 86400 seconds is one day.
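As a rough sketch of what such a response might carry (the dates and values here are purely illustrative), a response compatible with both protocol versions could look like this:

```
HTTP/1.1 200 OK
Date: Thu, 06 Jun 2019 08:00:00 GMT
Expires: Fri, 07 Jun 2019 08:00:00 GMT
Cache-Control: public, max-age=86400, s-maxage=86400
Content-Type: text/html
```

An HTTP/1.0 cache honours Expires, while an HTTP/1.1 cache prefers max-age (and a shared cache prefers s-maxage), so old and new caches both get a one-day lifetime.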
Insert picture description here
The expiration-time control logic is very rigid. After the user's first request the private cache holds the content, and as long as it has not expired the cache can respond to the user directly. Once it expires, the cache can only ask its upstream; if the upstream cache has also expired, the request can only go all the way to the back-end real server (RS).
The back-end server sends back the full response even if the data has not changed; the response then passes through cache B and cache A and finally reaches the cache server closest to the client. This is a very inefficient method.
Insert picture description here
HTTP/1.0 only has the control logic above.
HTTP/1.1 adds the conditional-request control logic.
(A conditional request means the client browser does not decide on its own whether its cached, expired copy is still usable; it asks the upstream cache server whether the content has been modified since the timestamp recorded when the content was cached. If the server has not updated the content, it responds 304, meaning "unchanged, you may use it", so the content is served to the client from the local cache.)
If the content has changed, the server responds with code 200 and the latest data, and the cache server updates its own copy before responding.

Insert picture description here
The cache records the Last-Modified timestamp of each entry when it downloads it; you can also think of the cache as recording the time at which the entry was cached.
Insert picture description here
The first request: the server responds with 200.
Insert picture description here
304 Not Modified: no response body is sent this time.
Insert picture description here
This is the first kind of conditional request. The problem is that it is based on time. Content may change very quickly, while the modification timestamp only has a resolution of one second. When the request reaches the server, the server compares timestamps down to the second; if they look identical it responds 304 to the client, even though the content may in fact have changed several times within that second. So the accuracy is limited.
Insert picture description here
Judging by timestamp only looks at the modification time of the data and says nothing about the data itself. If a validator (checksum) of the data is recorded instead, the check becomes very accurate: every time the server responds to the client it tells the client what the validator of this file is, the client records it, and on the next conditional request, no matter how the time has changed, if the validator has not changed the server responds 304; if the server finds the validator has changed, it responds 200.
If-None-Match: if the validator you send matches mine, the answer is 304; if it does not match, the answer is 200.
ETag is the entity tag: each file has its own tag. When the client sends a request it puts its recorded ETag into the request, and the server's response indicates whether the ETag has changed. ETag alone can complete this control.
Both are conditional validation; you can choose either one.

Insert picture description here
Validation can be based on the file's modification timestamp (not very accurate) or on a validator of the file's content (which may consume more resources).
You can use the first without the second, or use both together.
In the request, If-None-Match carries the ETag value and If-Modified-Since carries the last modification time; if either one does not match, the server returns 200.
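A hedged illustration of one such revalidation round trip (the path, timestamp, and ETag value are made up):

```
GET /logo.png HTTP/1.1
Host: www.example.com
If-Modified-Since: Wed, 05 Jun 2019 10:00:00 GMT
If-None-Match: "abc123"

HTTP/1.1 304 Not Modified
ETag: "abc123"
Cache-Control: max-age=600
```

The 304 carries no body; the cache keeps its copy and, thanks to the new max-age, may serve it for another ten minutes before asking again.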

Insert picture description here
Conditional requests also have a defect: although each client can cache the response, it still has to ask the destination server for a verdict every time. The verdict does not require sending a response message with a body, but the round trip itself cannot be skipped.
With the expiration time alone, on the other hand, you may get stale content, and before expiry you never check at all. So what do we do?
Use the two together: combine the expiration-time mechanism with conditional requests.
The client requests for the first time and the data is cached.

Insert picture description here
The cached content has an expiration time, so before it expires the response comes straight from the local cache and never goes to the server.
If it has expired, the cache asks the server; if the data has not changed, the server indicates so with a 304, and the cache entry is given a new expiration date.
So the next request, while not yet expired, can again be answered directly. If the server comes back with 200, the cache server only needs to update its copy.

Insert picture description here
In the response, the first mechanism is the expiration time:
Expires: absolute expiration time
Cache-Control: max-age= relative lifetime (how long the copy may live)
Cache-Control: s-maxage= lifetime in a shared (public) cache

Insert picture description here
The second mechanism is the conditional request: judged either by the file's modification timestamp (Last-Modified / If-Modified-Since) or by the file's validator (ETag / If-None-Match).

Insert picture description here
These two methods can be combined and used at the same time.
If you want to force a refresh, you can also add Cache-Control: no-cache to the request, meaning "do not answer me from the cache" (request messages can also control how the cache is used).
Insert picture description here
Cache-Control can also appear in the response message.
Insert picture description here
In the response message, Cache-Control is how the HTTP/1.1 protocol controls whether and how cached results may be used when responding to requests. Common directives:
public: may be stored by public (shared) caches
private: may be stored only by private caches
no-cache: this does not mean "do not cache". The backend tells the cache system, whether it is a browser or a reverse proxy, that this content may be cached, but the cached copy may not be served directly on the next request; every client request must first be revalidated with the server (revalidation) before the copy is used. In other words it forces conditional requests and cannot simply be combined with Expires-style direct serving.
no-store: really must not be stored anywhere
must-revalidate: similar in spirit to no-cache; once the entry becomes stale it must be rechecked with the server
proxy-revalidate: the same requirement, applied to proxy (shared) caches
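Purely as illustration of how these directives combine, here are a few independent example response header lines (each is a separate case, not one response):

```
Cache-Control: public, max-age=86400
Cache-Control: private, no-cache
Cache-Control: no-store
Cache-Control: public, max-age=600, must-revalidate
```

The first may sit in any shared cache for a day; the second is for the browser's private cache only and must be revalidated before every reuse; the third may not be stored at all; the fourth may be cached for ten minutes but must not be served stale without rechecking.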

Insert picture description here
Insert picture description here
Squid is still solid: even under heavy load its performance remains outstanding.
Varnish is very lightweight and its concurrency is very good, but once it is overloaded its performance becomes unstable.

Insert picture description here
Varnish configuration changes a great deal between versions: from 2.0 to 3.0 you essentially have to learn it again, and the same from 3.0 to 4.0.
The mainstream version now should be the fourth release, which the EPEL repository provides.

Insert picture description here
EPEL generally does not carry the very latest version, because EPEL packages are generally well tested.
Varnish is a high-performance HTTP accelerator.
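On a CentOS 7 machine with EPEL available, installation is straightforward; a minimal sketch (package names as provided by EPEL):

```sh
yum install -y epel-release     # enable the EPEL repository
yum install -y varnish          # pulls in jemalloc as a dependency
rpm -ql varnish | less          # see which files the package installed
```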

Insert picture description here
Insert picture description here
The content in the red box is the Varnish core. The Management process is similar to nginx's master; the Child/Cache process corresponds to the worker process. Inside it:
Acceptor / worker threads: accept and service the requests
Log/Stats: record logs and statistics data
Storage/Hashing: manage the storage and the hash keys of cached objects
Command line: the command-line (CLI) interface
Backend communication: communicates with the back-end servers
Object expiry: expires cached objects
To improve overall performance, Varnish does not record its working statistics and logs on disk; all log information is written to shared memory (a shared memory region of about 86 MB). The memory space is used in rotation: new records keep overwriting the old ones. If you want to keep the records, a program has to read them from memory and save them to disk periodically.
If you need this:
varnishlog records in the raw (native) format.
varnishncsa converts the records into NCSA (Apache-style) HTTP log format.
Either can run as a daemon that periodically takes records from memory and writes them out.
varnishtop views the data recorded in memory, sorted by frequency.
varnishhist views a histogram of the recorded history.
varnishstat views the statistics counters.
All of these tools work against the shared memory region.
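A few illustrative invocations (exact options vary between Varnish versions; these are the Varnish 4 forms, so treat them as a sketch):

```sh
varnishstat                                     # live statistics counters (hits, misses, ...)
varnishtop -i ReqURL                            # most frequently requested URLs
varnishlog                                      # follow the raw shared-memory log
varnishncsa -D -w /var/log/varnish/access.log   # daemonize and write an NCSA-format access log
```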

Insert picture description here
There are two types of Varnish configuration.
1. One configures Varnish's own working characteristics, for example the worker threads that define the maximum concurrency and other properties of the process.
2. The other configures the caching behaviour itself: what can be cached and what cannot, and how. This also needs a configuration file.
The configuration file for how the Varnish cache works is really a program file. In plain words, it is written with a programming syntax; this programming language is called VCL (Varnish Configuration Language). A configuration written in this language cannot be used directly: it must first be compiled. The VCL compiler translates it into C code, the C compiler then compiles that into a shared object, and the shared object is loaded as a library and linked into the worker threads for them to use.
So every time this configuration file changes, it is not the process configuration but the cache-policy configuration that changes, and it must be recompiled and loaded before it takes effect.
So we have to learn two kinds of configuration: one is how to configure the process, and the other is how to configure the cache policy.
To be able to manage Varnish from the command line, to compile and activate new configuration files and so on, a command-line tool implementing the CLI interface is needed: varnishadm. Connections made with this tool have to be authenticated. (The telnet interface is not used here, and the graphical web interface is a paid product.)
Insert picture description here
Insert picture description here
Counters (for a given object: how many times it was requested, how many times it hit the cache).
As mentioned, there are two types of Varnish configuration.
Now for the demonstration: find a host to run Varnish, with its internal network card connected to the origin servers. Varnish can also do reverse proxying and scheduling, and the scheduling is very simple, round robin for example. First synchronize the time on all hosts.
Insert picture description here
Insert picture description here
Insert picture description here
Insert picture description here
Restart the service.
Insert picture description here

Insert picture description here
n2 acts as the intranet (back-end) server.
Insert picture description here
Insert picture description here
Configure the intranet address.
Insert picture description here
editing web service
Insert picture description here
Insert picture description here
start service
Insert picture description here
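For completeness, a minimal sketch of what setting up such a back-end web server might involve (assuming CentOS 7 with httpd; the test page text is made up):

```sh
yum install -y httpd
echo "web server n2" > /var/www/html/index.html   # simple test page
systemctl start httpd
curl http://127.0.0.1/                            # verify the page is served locally
```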
Now look at how the Varnish host should be configured.
Insert picture description here
See which files the package installed:
varnish.params configures the process (which address and port to listen on, what kind of storage to use for the cache).
default.vcl is used to configure the caching policy: how different types of content are cached, and for how long.
The main program is /usr/sbin/varnishd.
If you want the logs written to a file, you need to start varnishlog or varnishncsa as a daemon. The cache itself can sit in memory or on disk.
Insert picture description here
Cache
Insert picture description here
Since it is a cache server, you have to tell it how to store the cache; Varnish has three caching mechanisms.
(nginx is very simple: the cached data is split into directories on disk, and the metadata is a hash table in memory.)
For Varnish you do not need to care how the data and metadata are organized, but you should know the three places its data can be put.
This is chosen when passing parameters to varnishd:
-s indicates how to store the cache.
malloc: all data is cached in memory (Varnish must not be restarted casually; after a restart the whole cache is invalid).
file: a file-based cache on disk; you give a path and a size (the file is a black box: all cached objects are kept inside one large binary file that you cannot usefully open and inspect. Although it is a file, it is located through an in-memory index; after a restart the index is gone, so the cache is useless).
persistent: similar to file, also file-based and a black box, but the cache is still valid after a restart. It is still in the experimental stage, so we do not use it.
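The three forms of the -s option would look roughly like this (the sizes and paths are illustrative only, and persistent is experimental as noted above):

```
-s malloc,256m
-s file,/data/varnish/cache.bin,10g
-s persistent,/data/varnish/cache.bin,10g
```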
Memory that keeps being allocated and released becomes fragmented; over time this gets very inefficient and wastes a lot of memory space. Once the cache is too large, keeping it all in memory is not actually efficient, so a disk cache is recommended. And since cache performance is so critical, if you use a disk cache it is recommended to use a solid-state drive, ideally a high-performance SSD with a PCI-E interface; several SSDs combined into a RAID is even better.

Insert picture description here
Some high-performance SSDs have very powerful I/O: their IOPS can reach 100,000, so their performance is close to memory.
An ordinary SATA-interface disk generally does around 70-100 MB per second.

The installation requires jemalloc. malloc is the C-language library call for allocating memory, and free releases it.
Insert picture description here
Insert picture description here
jemalloc handles concurrent memory allocation and release well; for a cache, jemalloc is quite important.
Insert picture description here
There are two sections in the Varnish configuration:
1. The working attributes of the process (maximum concurrency, how many threads, and so on); this configuration is mainly the options passed to varnishd, essentially the listening address and port and similar settings.
2. VCL, which defines the cache policy.
There are also runtime parameters, such as the maximum number of concurrent connections of Varnish. Varnish manages its workers with thread pools: for example, with 500 threads per pool the maximum concurrency is 1,000. These items can be modified at run time, because restarting the varnish service would invalidate the cache. Each such item is set with -p; if you do not want anyone to modify a parameter later, -r puts it into a read-only state.
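In the EPEL packaging these options end up in the DAEMON_OPTS variable of /etc/varnish/varnish.params; a sketch with real parameter names but purely illustrative values:

```sh
# /etc/varnish/varnish.params (excerpt)
DAEMON_OPTS="-p thread_pools=2 -p thread_pool_min=50 -p thread_pool_max=500"
```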

Insert picture description here
Insert picture description here
Insert picture description here
pass the run-time parameters.
Insert picture description here
Now open the file and make some simple modifications.
Insert picture description here
Insert picture description here
Create the cache directory.
Insert picture description here
Try to start the process.
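Putting the previous steps together, a hedged sketch of the sequence (the listening port and the storage path are example choices, not defaults you must use):

```sh
# /etc/varnish/varnish.params (excerpt)
VARNISH_LISTEN_PORT=80
VARNISH_STORAGE="file,/data/varnish/cache/varnish.bin,1g"

# create the cache directory, then start and check the service
mkdir -p /data/varnish/cache
systemctl start varnish
ss -tnl                     # confirm varnish is listening on the chosen port
```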
Insert picture description here
accessing the address
Insert picture description here
The back-end fetch fails, because no back-end server has been defined yet.
Insert picture description here
You need to define the back-end server in the VCL.
Insert picture description here
Insert picture description here
Each back-end server is defined with its own backend block.
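A minimal sketch of such a definition in the VCL file (the address is just an example intranet address, substitute your own):

```vcl
vcl 4.0;

backend default {
    .host = "192.168.1.7";   # intranet address of the web server (example)
    .port = "80";
}
```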
Insert picture description here
Then reload the VCL.
Insert picture description here
Reloaded successfully.
Insert picture description here
For Varnish management, you should use the command-line tool varnishadm.
Insert picture description here
Insert picture description here
Use -S to specify the secret (key) file and -T to specify the address and port of the management interface. There are two working modes: interactive, or typing the command directly on the command line.
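For example (the secret-file path and the admin port shown are the usual defaults from the packaged varnish.params, so adjust if yours differ):

```sh
# enter the interactive CLI
varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082

# non-interactive: run a single command and exit
varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082 ping
```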
Insert picture description here
help
Insert picture description here
ping determines whether the server is alive.
Insert picture description here
banner gets the banner information.
Insert picture description here
Stop, start, view the process status
Insert picture description here
Insert picture description here
Load, discard (delete), compile, and apply VCL configurations.
Insert picture description here
available means the configuration has been loaded and is available; active means it is the one currently in use.

Insert picture description here
View the content of a configuration file.
Insert picture description here
Insert picture description here
This means load and compile (vcl.load).
Insert picture description here
Switch to using the new configuration (vcl.use).
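Inside varnishadm the load/switch cycle might look like this (the configuration name test1 is arbitrary):

```
vcl.list
vcl.load test1 /etc/varnish/default.vcl
vcl.use test1
vcl.show test1
```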
Insert picture description here
There are also some adjustable parameters (param).
Insert picture description here
-l can be used to list them all.
Insert picture description here
Insert picture description here
You can also look at just one of them.
Insert picture description here
default value 2
Insert picture description here
Insert picture description here
List all the storage backends.
Insert picture description here
List the back-end hosts.
Insert picture description here
The backend has not been tested for health; you need to configure that yourself.
Insert picture description here
You can also set by hand whether a back-end server is considered healthy.
Insert picture description here
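A hedged sketch of a health probe in VCL 4.0 (the URL, intervals, and thresholds are illustrative):

```vcl
probe healthcheck {
    .url       = "/index.html";   # page polled on the backend
    .interval  = 5s;
    .timeout   = 1s;
    .window    = 5;               # evaluate the last 5 probes
    .threshold = 3;               # at least 3 must succeed to count as healthy
}

backend default {
    .host  = "192.168.1.7";
    .port  = "80";
    .probe = healthcheck;
}
```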
quit exits the interactive session.
Insert picture description here
can also run in non-interactive mode
Insert picture description here

Insert picture description here
The most difficult part to understand is VCL itself.
Think of iptables and its chains (INPUT, OUTPUT, FORWARD, PREROUTING, POSTROUTING): the cache server follows the same kind of logic after receiving a request.
When a request has just been received it may simply be rejected; otherwise Varnish looks at the request and decides whether it may check the cache. If it may check the cache it uses the check-the-cache logic, and if not it uses the do-not-check logic. When the cache is checked, there is the question of hit versus miss and how each case is handled; and when the request has to go to the back-end server, it is handed to the corresponding control point.
This is similar to iptables control: each step has its own hook, so configuration written for a given hook only takes effect at that hook, and the configuration at each level is dedicated to that level. That is why it is called a domain-specific configuration language, with a C-like syntax.
After each judgment, the next step has to be specified: there should be a return instruction at the end of each state engine, indicating who the next hop is after the return.

Insert picture description here
Insert picture description here
Insert picture description here
Each sub (subroutine) definition, that is, each stage, should have its own appropriate configuration.
Insert picture description here
Even if you write nothing above, Varnish's built-in rules still take effect.
Insert picture description here
-v shows the details.
Insert picture description here
Insert picture description here
No matter what you write yourself, these built-in rules remain in effect; this is closely related to the safety of the whole system.
Insert picture description here
The version 2.x diagram looks simpler.
Insert picture description here
On receipt of the request, first determine whether it should be refused (requests from bad-guy IPs are thrown out).
Then, based on the whole request (its method and so on), check whether the cache may be consulted; if it may, it is handed to hash (the cache is looked up: a hit is hit, a miss is miss, and on a miss we go to the server to fetch).
If it cannot be cached, it is passed and fetched from the back-end server.

Insert picture description here
Insert picture description here
After the request is received and the cache is checked, we can see whether it hits: a hit goes to hit, a miss goes to miss, and then fetch retrieves the data from the backend; once fetched, deliver hands it to the client.
If the cache cannot be checked at all, the request goes straight to the backend.
pipe is used when the request is not ordinary HTTP: Varnish turns into a layer-4 proxy and simply pipes everything through without processing, letting the back-end server decide what to do with it.
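To make the flow concrete, here is a minimal, hedged VCL 4.0 excerpt of vcl_recv showing those return decisions (the conditions are only examples, not a recommended policy):

```vcl
sub vcl_recv {
    # unusual methods: act as a layer-4 pipe and let the backend sort it out
    if (req.method != "GET" && req.method != "HEAD" &&
        req.method != "PUT" && req.method != "POST" &&
        req.method != "DELETE" && req.method != "OPTIONS") {
        return (pipe);
    }
    # clearly per-user content: do not check the cache, go to the backend
    if (req.http.Authorization || req.http.Cookie) {
        return (pass);
    }
    # everything else may check the cache
    return (hash);
}
```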

In 4.0, the front-end (client-side) and back-end (server-side) state engines are separated.
Insert picture description here
Insert picture description here
In the diagram, ovals are state engines and diamonds are decision points.
Insert picture description here



Origin blog.csdn.net/qq_42227818/article/details/91042407