Web system architecture and cache basics: Varnish

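Web cache

Programs exhibit locality of reference:

Temporal locality: data accessed recently is likely to be accessed again soon

Spatial locality: data located near recently accessed data is likely to be accessed soon

Cache entries are key-value pairs:

key: the access path (URL), stored as a hash

value: the web content

Hot data: the frequently requested content that is worth keeping in the cache

Hit ratio: hit / (hit + miss)

Document hit ratio: measured by the number of documents served from the cache

Byte hit ratio: measured by the volume of content (bytes) served from the cache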
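The two hit ratios can differ a lot on the same traffic. The following is a minimal sketch, assuming each request is recorded with its response size and whether it was a hit; the Request class and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Request:
    size_bytes: int  # size of the response body
    hit: bool        # True if the response was served from the cache


def document_hit_ratio(requests):
    """Fraction of requests served from the cache: hit / (hit + miss)."""
    if not requests:
        return 0.0
    hits = sum(1 for r in requests if r.hit)
    return hits / len(requests)


def byte_hit_ratio(requests):
    """Fraction of response bytes served from the cache."""
    total = sum(r.size_bytes for r in requests)
    if total == 0:
        return 0.0
    hit_bytes = sum(r.size_bytes for r in requests if r.hit)
    return hit_bytes / total


# Three small hits and one large miss (hypothetical traffic).
sample = [
    Request(100, True),
    Request(100, True),
    Request(100, True),
    Request(700, False),
]

if __name__ == "__main__":
    print(document_hit_ratio(sample))  # 3 of 4 requests were hits
    print(byte_hit_ratio(sample))      # 300 of 1000 bytes were hits
```

Here the document hit ratio is high while the byte hit ratio is low, because the single miss carried most of the bytes.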

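Note: cached objects have a lifetime, and expired objects must be cleaned up periodically

When the cache space is exhausted, objects are evicted with LRU (least recently used)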
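LRU eviction can be sketched in a few lines. This is only an illustration of the policy, not how varnish implements it internally; the LRUCache class and the URLs are hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """A tiny LRU cache: the least recently used key is evicted first."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None  # miss
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used

    def keys(self):
        return list(self._items)


if __name__ == "__main__":
    cache = LRUCache(2)
    cache.put("/a", "page a")
    cache.put("/b", "page b")
    cache.get("/a")            # /a becomes the most recently used
    cache.put("/c", "page c")  # cache is full, so /b is evicted
    print(cache.keys())
```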

Content is either cacheable or non-cacheable (for example, a user's private data)

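Steps in cache processing:

Receive the request --> parse the request (extract the URL and the headers) --> look up the cache --> check freshness --> build the response message --> send the response --> write the log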

Freshness check mechanisms:

Expiration date:

HTTP/1.0:

Expires: Thu, 01 Jan 1970 00:00:00 GMT

HTTP/1.1:

Cache-Control: max-age=600
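The freshness check for these two headers can be sketched as follows. This is a simplified illustration: it assumes the cache remembers when it stored the response, it gives max-age precedence over Expires, and it ignores other directives; is_fresh and its parameters are hypothetical names.

```python
import re
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def is_fresh(headers, stored_at, now):
    """Return True if a response cached at `stored_at` is still fresh at `now`.

    `headers` is a dict of response headers; both datetimes must be
    timezone-aware.
    """
    # HTTP/1.1: a relative lifetime in seconds.
    match = re.search(r"max-age=(\d+)", headers.get("Cache-Control", ""))
    if match:
        return now - stored_at < timedelta(seconds=int(match.group(1)))
    # HTTP/1.0: an absolute expiration date.
    expires = headers.get("Expires")
    if expires:
        try:
            return now < parsedate_to_datetime(expires)
        except (TypeError, ValueError):
            return False  # an unparsable date is treated as already expired
    return False  # no explicit freshness information


if __name__ == "__main__":
    stored = datetime(2018, 5, 3, 2, 0, 0, tzinfo=timezone.utc)
    later = stored + timedelta(seconds=50)
    print(is_fresh({"Cache-Control": "max-age=600"}, stored, later))
    print(is_fresh({"Expires": "Thu, 01 Jan 1970 00:00:00 GMT"}, stored, later))
```

An Expires date in the past, such as the 1970 value above, is a common way to mark a response as already stale.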

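Revalidation: revalidate

If the original content has not changed, the server sends only the response headers (no body), with status code 304 (Not Modified)

If the original content has changed, the server responds normally, with status code 200

If the original content has disappeared, the server responds with 404, and the cache object held in the cache should also be deleted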

Conditional request headers:

If-Modified-Since: validates against the timestamp of the requested content

If-Unmodified-Since

If-Match

If-None-Match

ETag: faiy89345
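The three revalidation outcomes can be sketched with ETag comparison. This is a minimal model under stated assumptions: the cache is a plain dict keyed by URL, and origin_status and apply_revalidation are hypothetical helpers, not part of any real server.

```python
def origin_status(if_none_match, current_etag):
    """Status code the origin would return for an If-None-Match request.

    `current_etag` is None when the resource no longer exists.
    """
    if current_etag is None:
        return 404  # content has disappeared
    if if_none_match is not None and if_none_match == current_etag:
        return 304  # not modified: headers only, no body
    return 200      # modified: full response


def apply_revalidation(cache, url, status, new_object=None):
    """Update the cache according to the revalidation result."""
    if status == 304:
        return cache[url]        # keep serving the stored object
    if status == 200:
        cache[url] = new_object  # replace the stored object
        return new_object
    cache.pop(url, None)         # 404: drop the stale object
    return None


if __name__ == "__main__":
    cache = {"/test1.html": "old body"}
    status = origin_status('"faiy89345"', '"faiy89345"')
    print(status, apply_revalidation(cache, "/test1.html", status))
```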

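Common open-source caching solutions:

varnish, squid (their relationship resembles that of nginx and apache), nginx, apache

Web cache:

squid, varnish

http://book.varnish-software.com/

DSL: VCL

Management process: compiles VCL and applies new configuration, monitors varnish, initializes varnish, and provides the CLI interface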

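child/cache process:

Acceptor: accepts new connection requests

Worker threads: handle client requests

Expiry: removes expired objects from the cache

Logging: shared memory log. The shared memory log is typically 90MB by default and is split into two parts: the first part holds counters, and the second part holds request-related data

VCL: Varnish Configuration Language

The interface for configuring caching policy

A simple "domain"-specific programming language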

Memory allocation and release:

malloc(), free()

How varnish stores cache objects:

file: a single file; does not support persistence

malloc: memory

persistent: file-based persistent storage

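Three ways to configure varnish:

1. Command-line options of the varnishd program

The listening socket, the storage type to use, and other additional settings

-p param=value

-r param,param,…: sets the list of read-only parameters

/etc/varnish/varnish.params

2. Parameters specified with the -p option:

Runtime parameters

These can also be changed through the CLI while the program is running

3. VCL: configures the caching behavior of the cache system

Configured through a VCL configuration file

Compiled first, then applied

Depends on a C compiler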

Installation:

yum install varnish -y

The EPEL repository provides the package

/etc/logrotate.d/varnish
/etc/varnish
/etc/varnish/default.vcl
/etc/varnish/varnish.params

Configuration:

cat /usr/lib/systemd/system/varnish.service
man varnishd
cd /etc/varnish/
vim varnish.params

The EPEL repository for CentOS 7 ships version 4.0.5

The default storage type defined in varnish.params is in-memory storage

VARNISH_STORAGE="malloc,256M"

default.vcl:
	Modify:
backend default {
    .host = "192.168.137.135";
    .port = "80";
}
host is the IP address of the backend server
port is the port of the backend server

Start the varnish service:

systemctl start varnish

Disable SELinux and the firewall:

setenforce 0
service firewalld stop
service iptables stop

Check that ports 6081 and 6082 are listening:

netstat -tnlp | grep varnish
tcp        0      0 0.0.0.0:6081            0.0.0.0:*               LISTEN      35302/varnishd
tcp        0      0 127.0.0.1:6082          0.0.0.0:*               LISTEN      35293/varnishd
tcp6       0      0 :::6081                 :::*                    LISTEN      35302/varnishd

Start the httpd service on the backend server and add HTML files:

for i in {1..10}; do echo "page ${i} on node35.com" > /var/www/html/test${i}.html; done

Access the cache server:

Inspect it through the management CLI:

varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082

[root@node30 varnish]# varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
200        
-----------------------------
Varnish Cache CLI 1.0
-----------------------------
Linux,3.10.0-693.el7.x86_64,x86_64,-smalloc,-smalloc,-hcritbit
varnish-4.0.5 revision 07eff4c29

Type 'help' for command list.
Type 'quit' to close CLI session.

help
200        
help [<command>]
ping [<timestamp>]
auth <response>
quit
banner
status
start
stop
vcl.load <configname> <filename>
vcl.inline <configname> <quoted_VCLstring>
vcl.use <configname>
vcl.discard <configname>
vcl.list
param.show [-l] [<param>]
param.set <param> <value>
panic.show
panic.clear
storage.list
vcl.show [-v] <configname>
backend.list [<backend_expression>]
backend.set_health <backend_expression> <state>
ban <field> <operator> <arg> [&& <field> <oper> <arg>]...
ban.list

backend.list: lists the backend servers

varnish> backend.list
200        
Backend name                   Refs   Admin      Probe
default(192.168.137.135,,80)   1      probe      Healthy (no probe)

The varnishlog command:

varnishlog reads the shared memory log

Running varnishlog prints nothing at first; output appears when a request arrives

[root@node30 varnish]# varnishlog 
*   << Request  >> 32779     
-   Begin          req 32778 rxreq
-   Timestamp      Start: 1525315271.586835 0.000000 0.000000
-   Timestamp      Req: 1525315271.586835 0.000000 0.000000
-   ReqStart       192.168.137.2 55600
-   ReqMethod      GET
-   ReqURL         /test1.html
-   ReqProtocol    HTTP/1.1
-   ReqHeader      Host: 192.168.137.130:6081
-   ReqHeader      Connection: keep-alive
-   ReqHeader      Cache-Control: max-age=0
-   ReqHeader      Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8
-   ReqHeader      Upgrade-Insecure-Requests: 1
-   ReqHeader      User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) 
                    Chrome/49.0.2623.221 Safari/537.36 SE 2.X MetaSr 1.0
-   ReqHeader      Accept-Encoding: gzip, deflate, sdch
-   ReqHeader      Accept-Language: zh-CN,zh;q=0.8
-   ReqHeader      If-None-Match: "15-56b43b317d678"
-   ReqHeader      If-Modified-Since: Thu, 03 May 2018 02:09:55 GMT
-   ReqHeader      X-Forwarded-For: 192.168.137.2
-   VCL_call       RECV
-   VCL_return     hash
-   ReqUnset       Accept-Encoding: gzip, deflate, sdch
-   ReqHeader      Accept-Encoding: gzip
-   VCL_call       HASH
-   VCL_return     lookup
-   Hit            6
-   VCL_call       HIT
-   VCL_return     deliver
-   RespProtocol   HTTP/1.1
-   RespStatus     200
-   RespReason     OK
-   RespHeader     Date: Thu, 03 May 2018 02:40:27 GMT
-   RespHeader     Server: Apache/2.4.6 (CentOS)
-   RespHeader     Last-Modified: Thu, 03 May 2018 02:09:55 GMT
-   RespHeader     ETag: "15-56b43b317d678"
-   RespHeader     Content-Length: 21
-   RespHeader     Content-Type: text/html; charset=UTF-8
-   RespHeader     X-Varnish: 32779 6
-   RespHeader     Age: 50
-   RespHeader     Via: 1.1 varnish-v4
-   VCL_call       DELIVER
-   VCL_return     deliver
-   Timestamp      Process: 1525315271.586994 0.000159 0.000159
-   RespProtocol   HTTP/1.1
-   RespStatus     304
-   RespReason     Not Modified
-   RespReason     Not Modified
-   RespUnset      Content-Length: 21
-   Debug          "RES_MODE 0"
-   RespHeader     Connection: keep-alive
-   Timestamp      Resp: 1525315271.587269 0.000434 0.000275
-   Debug          "XXX REF 2"
-   ReqAcct        518 0 518 283 0 283
-   End            

*   << Session  >> 32778     
-   Begin          sess 0 HTTP/1
-   SessOpen       192.168.137.2 55600 :6081 192.168.137.130 6081 1525315271.586679 14
-   Link           req 32779 rxreq
-   SessClose      RX_TIMEOUT 5.092
-   End

The varnishncsa command:

[root@node30 varnish]# varnishncsa 
192.168.137.2 - - [03/May/2018:10:45:14 +0800] "GET http://192.168.137.130:6081/test1.html HTTP/1.1" 304 0 "-" 
"Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) 
Chrome/49.0.2623.221 Safari/537.36 SE 2.X MetaSr 1.0"

Running varnishncsa prints output when a request arrives

The varnishtop command

The varnishstat command

Command-line tools:

varnishadm -S /etc/varnish/secret -T IP:PORT

Log:

varnishlog

varnishncsa

Statistics:

varnishstat

Top:

varnishtop

Varnish (2)

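VCL:

state engine: the engines are related to one another to some degree. If an engine can have several downstream engines, the upstream engine must use return to name the downstream engine to transfer to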

vcl_recv

vcl_hash

vcl_hit

vcl_miss

vcl_fetch

vcl_deliver

vcl_pipe

vcl_pass

vcl_error

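Language syntax:

(1) //, # and /*…*/ are comments and are ignored by the compiler

(2) sub $name: defines a subroutine

sub vcl_recv {

}

(3) Loops are not supported

(4) There are many built-in variables; where a variable can be used depends closely on the state engine

(5) Termination statements are supported: return(action); there is no return value

(6) "Domain"-specific

(7) Operators: =, ==, ~, &&, ||, !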

Conditional statement syntax:

if(condition){
		
}else{

}

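Variable assignment: set name = value

unset name

req.http.HEADER: refers to the specified HEADER of the HTTP request message:
	req.http.X-Forwarded-For
	req.http.Authorization
	req.http.Cookie
req.request: the request method (renamed req.method in Varnish 4)
client.ip: the client IP address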

Official documentation: https://varnish-cache.org/docs/5.1/reference/vcl.html#varnish-configuration-language

Reference blog: http://blog.51cto.com/hao360/1530236

state engine workflow (V3):
	vcl_recv-->vcl_hash-->vcl_hit-->vcl_deliver
	vcl_recv-->vcl_hash-->vcl_miss-->vcl_fetch-->vcl_deliver
	vcl_recv-->vcl_pass-->vcl_fetch-->vcl_deliver
	vcl_recv-->vcl_pipe
state engine workflow (V4):
	vcl_recv
	vcl_pass
	vcl_pipe
	vcl_hash
	vcl_hit
	vcl_miss

	vcl_backend_fetch
	vcl_backend_response
	vcl_backend_error

	vcl_purge
	vcl_synth
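The V3 flows listed above can be modeled as a small state graph. The sketch below encodes only those four paths and checks whether a sequence of engines is a valid, complete flow; NEXT_STATES and is_valid_flow are illustrative names, and the real engine has more transitions than these.

```python
# Allowed downstream engines for each engine, taken from the four V3 flows.
NEXT_STATES = {
    "vcl_recv": {"vcl_hash", "vcl_pass", "vcl_pipe"},
    "vcl_hash": {"vcl_hit", "vcl_miss"},
    "vcl_hit": {"vcl_deliver"},
    "vcl_miss": {"vcl_fetch"},
    "vcl_pass": {"vcl_fetch"},
    "vcl_fetch": {"vcl_deliver"},
    "vcl_deliver": set(),  # terminal
    "vcl_pipe": set(),     # terminal
}


def is_valid_flow(path):
    """Return True if `path` starts at vcl_recv, follows only allowed
    transitions, and ends in a terminal engine."""
    if not path or path[0] != "vcl_recv":
        return False
    for current, following in zip(path, path[1:]):
        if following not in NEXT_STATES.get(current, set()):
            return False
    return not NEXT_STATES[path[-1]]


if __name__ == "__main__":
    print(is_valid_flow(["vcl_recv", "vcl_hash", "vcl_hit", "vcl_deliver"]))
    print(is_valid_flow(["vcl_recv", "vcl_hit"]))
```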

Back up default.vcl:

cp default.vcl test.vcl

Edit and use test.vcl:

vim test.vcl

Modify the sub vcl_recv configuration (copy the sub vcl_recv configuration from the official site and adjust it):

sub vcl_recv {
    if (req.method == "PRI") {
        /* We do not support SPDY or HTTP/2.0 */
        return (synth(405));
    }
    if (req.method != "GET" &&
      req.method != "HEAD" &&
      req.method != "PUT" &&
      req.method != "POST" &&
      req.method != "TRACE" &&
      req.method != "OPTIONS" &&
      req.method != "DELETE") {
        /* Non-RFC2616 or CONNECT which is weird. */
        return (pipe);
    }

    if (req.method != "GET" && req.method != "HEAD") {
        /* We only deal with GET and HEAD by default */
        return (pass);
    }
    if (req.http.Authorization || req.http.Cookie) {
        /* Not cacheable by default */
        return (pass);
    }
    return (hash);
}
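To make the order of the checks in this vcl_recv easier to follow, the same decision logic is mirrored below in Python. This is only a reading aid, not something varnish runs; recv_action is a hypothetical name and the returned strings stand for the VCL return actions.

```python
KNOWN_METHODS = {"GET", "HEAD", "PUT", "POST", "TRACE", "OPTIONS", "DELETE"}


def recv_action(method, headers):
    """Return the action the vcl_recv above would take for a request.

    `headers` is a dict of request headers; names are compared
    case-insensitively.
    """
    names = {name.lower() for name in headers}
    if method == "PRI":
        return "synth(405)"  # SPDY / HTTP/2.0 preface is rejected
    if method not in KNOWN_METHODS:
        return "pipe"        # unknown method: hand the connection over
    if method not in ("GET", "HEAD"):
        return "pass"        # only GET and HEAD are looked up in the cache
    if "authorization" in names or "cookie" in names:
        return "pass"        # private requests are not cacheable by default
    return "hash"            # continue to the cache lookup


if __name__ == "__main__":
    print(recv_action("GET", {"Host": "192.168.137.130:6081"}))
    print(recv_action("GET", {"Cookie": "session=abc"}))
    print(recv_action("POST", {}))
```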

varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082

vcl.load test1 test.vcl
    200        
    VCL compiled.
vcl.list
    200        
    active          0 boot
    available       0 test1

A new entry, test1, now appears in the list

Activate test1:

vcl.use test1
    200        
    VCL 'test1' now active

This alone makes the effect hard to observe. To make it visible, modify the sub vcl_deliver configuration in test.vcl

Add:

if (obj.hits > 0) {
    set resp.http.X-Cache = "HIT";
} else {
    set resp.http.X-Cache = "MISS";
}

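This is defined in vcl_deliver and adds a custom X-Cache header to the response sent to the client

Access a resource:

curl -I 192.168.137.130:6081/test6.html

On the first request the X-Cache header is MISS:

X-Cache: MISS

On the second request: X-Cache: HIT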

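Built-in variables in varnish:
	Variable categories:
		req
		client
		server
		bereq
		obj
		beresp
		storage

		bereq:
			bereq.http.HEADER: the specified header of the request that varnish sends to the backend server
			bereq.request: the request method (bereq.method in Varnish 4)
			bereq.url:
			bereq.proto:
			bereq.backend: the backend host to use
		beresp:
			beresp.proto:
			beresp.status: the status code of the backend server's response
			beresp.backend.ip:
			beresp.backend.name:
			beresp.http.HEADER: the specified header of the backend's response
			beresp.ttl: the remaining time to live of the content returned by the backend server
		obj:
			obj.ttl: the TTL of the object
			obj.hits: the number of times this object has been served from the cache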

Virtual host support:

if (req.http.host == "www.magedu.com") {

}

Force requests for a given resource to bypass the cache:

Add the following to vcl_recv in test.vcl:

if (req.url ~ "^/test7.html$") {
        return(pass);
}

varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
vcl.load test4 test.vcl
vcl.use test4
vcl.show test4

Access resources:

curl -I 192.168.137.130:6081/test6.html

For test6.html the second request is a hit, while test7.html always shows a miss

/admin
/login
if (req.url ~ "(?i)^/login" || req.url ~ "(?i)^/admin") {
	return(pass);
}
Here (?i) makes the match case-insensitive
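The effect of (?i) and the ^ anchor can be tried outside varnish. VCL uses PCRE while the sketch below uses Python's re module, which behaves the same way for this particular pattern; must_pass is a hypothetical helper.

```python
import re

# Same idea as the VCL condition: case-insensitive, anchored at the start.
NO_CACHE = re.compile(r"(?i)^/(login|admin)")


def must_pass(url):
    """Return True if the URL should bypass the cache."""
    return NO_CACHE.search(url) is not None


if __name__ == "__main__":
    for url in ["/login", "/Login", "/ADMIN/users", "/test6.html", "/user/login"]:
        print(url, must_pass(url))
```

Because of the ^ anchor, /user/login does not match: only URLs that start with /login or /admin bypass the cache.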

Remove the private cookie marker for specific types of resources:

Add the following to sub vcl_backend_response in test.vcl:

if (beresp.http.cache-control !~ "s-maxage") {
    if (bereq.url ~ "(?i)\.jpg$") {
        set beresp.ttl = 300s;
        unset beresp.http.Set-Cookie;
    }
    if (bereq.url ~ "(?i)\.css$") {
        set beresp.ttl = 600s;
        unset beresp.http.Set-Cookie;
    }
}
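The rule above can be read as a small function from URL and Cache-Control header to a TTL decision. The sketch below mirrors it in Python as a reading aid, assuming the check is meant to detect the standard s-maxage directive; override_ttl and TTL_BY_EXTENSION are hypothetical names.

```python
import re

# TTL in seconds per file extension, matching the VCL rule.
TTL_BY_EXTENSION = {"jpg": 300, "css": 600}


def override_ttl(url, cache_control):
    """Return (ttl_seconds, strip_set_cookie).

    (None, False) means the backend's own caching policy is left untouched.
    """
    if "s-maxage" in (cache_control or ""):
        return None, False  # the backend already set a shared-cache lifetime
    match = re.search(r"(?i)\.(jpg|css)$", url)
    if match is None:
        return None, False
    return TTL_BY_EXTENSION[match.group(1).lower()], True


if __name__ == "__main__":
    print(override_ttl("/kobe0.jpg", None))
    print(override_ttl("/style.css", "max-age=60"))
    print(override_ttl("/kobe0.jpg", "s-maxage=30"))
```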

Recompile and activate:

varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
vcl.load test5 test.vcl
vcl.use test5
vcl.show test5

Reference: https://varnish-cache.org/docs/4.1/reference/vcl.html

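Defining a backend server:

backend name {
	.attribute = "value";
}

.host: the IP address of the backend (BE) host

.port: the port the BE host listens on

.probe: health check for the BE

.max_connections: the maximum number of connections opened to this backend

Health checks for backend hosts:

probe name {
    .attribute = "value";
}

.url: the URL to request when deciding whether the BE is healthy

.expected_response: the expected response status code, 200 by default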
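A health check of this kind is usually judged over a sliding window of recent probes rather than a single request. The sketch below illustrates that idea only: the window and threshold names are borrowed from varnish's probe attributes, which these notes do not cover, the values 8 and 3 are illustrative defaults, and ProbeWindow is a hypothetical class.

```python
from collections import deque


class ProbeWindow:
    """Tracks the most recent probe results for one backend."""

    def __init__(self, window=8, threshold=3, expected_response=200):
        self.threshold = threshold
        self.expected_response = expected_response
        self._results = deque(maxlen=window)  # old results fall off the left

    def record(self, status):
        """Record one probe; `status` is the HTTP status code received,
        or None if the probe timed out."""
        self._results.append(status == self.expected_response)

    @property
    def good(self):
        """Number of successful probes in the current window."""
        return sum(self._results)

    def healthy(self):
        return self.good >= self.threshold


if __name__ == "__main__":
    probe = ProbeWindow()
    for _ in range(8):
        probe.record(200)
    print(probe.good, probe.healthy())
    for _ in range(6):
        probe.record(503)
    print(probe.good, probe.healthy())
```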

Add the backends:

backend websrv1 {
    .host = "192.168.137.135";
    .port = "80";
    .probe = {
        .url = "/test1.html";
        }
}

backend websrv2 {
    .host = "192.168.137.128";
    .port = "80";
    .probe = {
        .url = "/test1.html";
        }
}

Add the following to sub vcl_recv:

if (req.url ~ "(?i)\.(jpg|png|gif)$") {
    set req.backend_hint = websrv1;
} else {
    set req.backend_hint = websrv2;
}

Install and start httpd on the backend hosts:

systemctl start httpd

Add pages on both backend hosts:

for i in {1..10}; do echo "test${i} on web${i}" > /var/www/html/test${i}.html; done

Restart the varnish service on the cache server:

service varnish restart

Recompile, name, and activate the configuration:

varnishadm -S /etc/varnish/secret -T 127.0.0.1:6082
vcl.load test1 test.vcl
vcl.use test1 
vcl.show test1
backend.list

varnish> backend.list 
    200        
    Backend name                   Refs   Admin      Probe
    default(192.168.137.135,,80)   1      probe      Healthy (no probe)
    websrv1(192.168.137.135,,80)   1      probe      Healthy 8/8
    websrv2(192.168.137.128,,80)   1      probe      Healthy 8/8

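At this point only host 192.168.137.135 has the file kobe0.jpg under /var/www/html/

Access the cache server:

The request for this static image is served from the file on host 192.168.137.135

Requests for HTML files are answered with content from host 192.168.137.128

This separates static and dynamic content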

Example 2:

backend websrv2 {
    .host = "192.168.137.128";
    .port = "80";
    .probe = {
        .url = "/test1.html";
        }
}
import directors;
sub vcl_init {
        new mycluster = directors.round_robin();
        mycluster.add_backend(websrv1);
        mycluster.add_backend(websrv2);
}
sub vcl_recv {
    # Happens before we check if we have this in cache already.
    #
    # Typically you clean up the request here, removing cookies you don't need,
    # rewriting the request, etc.
    #if (req.url ~ "(?i)\.(jpg|png|gif)$") {
    #   set req.backend_hint = websrv1;
    #} else {
    #    set req.backend_hint = websrv2;
    #}
    if (req.url ~ "(?i)test1\.html$") {
        return(pass);
   }
    set req.backend_hint = mycluster.backend();
}

Requests for different HTML files are distributed round-robin

Load-balancing algorithms:

fallback, random, round_robin, hash
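How three of these selection strategies differ can be sketched as follows. This is a simplified illustration, not the directors module itself: real directors also take backend health and weights into account, and the class names here are hypothetical.

```python
import hashlib
from itertools import cycle


class RoundRobin:
    """Hands out backends in turn."""

    def __init__(self, backends):
        self._cycle = cycle(list(backends))

    def backend(self):
        return next(self._cycle)


class HashDirector:
    """Maps the same key (for example the URL) to the same backend."""

    def __init__(self, backends):
        self._backends = list(backends)

    def backend(self, key):
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self._backends)
        return self._backends[index]


class Fallback:
    """Returns the first backend in the list that is currently healthy."""

    def __init__(self, backends):
        self._backends = list(backends)

    def backend(self, healthy):
        for name in self._backends:
            if name in healthy:
                return name
        return None  # no healthy backend available


if __name__ == "__main__":
    rr = RoundRobin(["websrv1", "websrv2"])
    print([rr.backend() for _ in range(4)])
    print(Fallback(["websrv1", "websrv2"]).backend({"websrv2"}))
```

Round-robin spreads requests evenly, hash keeps a given URL on one backend, and fallback only moves to the next backend when the earlier ones are down.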

Tools to master: varnishlog, varnishncsa, varnishtop, varnishstat
