Highly available (HA) web load balancing with keepalived + nginx

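Keepalived is a high-performance solution for server high availability and hot standby. It prevents a single server from becoming a single point of failure, and combined with Nginx it makes the web front end highly available.
Keepalived is built on VRRP and uses it to provide high availability (HA). VRRP (Virtual Router Redundancy Protocol) is a protocol for router redundancy: it presents two or more routers as one virtual device that exposes one or more virtual router IPs. Inside the group, the router that actually owns the external IP is the MASTER as long as it is working; otherwise the MASTER is chosen by an election algorithm. The MASTER performs all network functions for the virtual router IP, such as answering ARP requests, handling ICMP, and forwarding data. The other devices do not own the virtual IP and are in the BACKUP state; apart from receiving the MASTER's VRRP advertisements, they perform no external network functions. When the MASTER fails, a BACKUP takes over its network functions.
VRRP sends its data by multicast, using a special virtual source MAC address instead of the MAC address of the physical NIC. While VRRP is running, only the MASTER periodically sends advertisements, which announce that the MASTER is healthy and which virtual router IP (or IPs) it holds. The BACKUPs only receive VRRP data and send nothing. If a BACKUP receives no advertisement from the MASTER within a certain time, it declares itself MASTER and starts sending advertisements, and a new MASTER election takes place.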
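
As a rough illustration of the "certain time" a BACKUP waits before declaring itself MASTER: VRRPv2 (RFC 3768) defines the master-down interval as 3 x Advertisement_Interval + Skew_Time, where Skew_Time = (256 - Priority) / 256 seconds. The sketch below only does that arithmetic. The function name `master_down_interval` is made up for this example, and keepalived's real timers may differ in detail. The inputs are the priority and advert_int values that appear in the configuration later in this post.

```shell
#!/bin/sh
# Hypothetical helper: VRRPv2 master-down interval in seconds.
# Arguments: $1 = priority (1-254), $2 = advert_int in seconds.
master_down_interval() {
    awk -v p="$1" -v a="$2" 'BEGIN { printf "%.3f\n", 3 * a + (256 - p) / 256 }'
}

master_down_interval 100 1   # priority 100, advert_int 1 -> 3.609
master_down_interval 99 1    # priority 99,  advert_int 1 -> 3.613
```

A higher priority gives a shorter skew time, so among several BACKUPs the one with the highest priority times out first and takes over.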

IP plan: the VIP is 172.16.23.132

nginx1:172.16.23.129 keepalived:172.16.23.129

nginx2:172.16.23.130 keepalived:172.16.23.130

httpd1:172.16.23.128

httpd2:172.16.23.131

In this plan nginx only does load balancing and does not serve web pages itself:

[root@master ~]# cat /etc/ansible/hosts|grep "^\[nodes" -A 2
[nodes]
172.16.23.129
172.16.23.130

Check the nginx service status:

[root@master ~]# ansible nodes -m shell -a "systemctl status nginx"|grep running
   Active: active (running) since 二 2018-12-18 16:33:04 CST; 12min ago
   Active: active (running) since 二 2018-12-18 16:35:51 CST; 10min ago
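
The post does not show the nginx configuration behind this service. As a minimal sketch of what a load-balancing-only setup might look like, the helper below prints an `upstream` block with the two httpd backends and a `server` block that proxies to it. The function name `render_upstream`, the upstream name `backend_httpd`, and port 80 on the backends are assumptions, not taken from the original setup.

```shell
#!/bin/sh
# Hypothetical helper: print an nginx http-context snippet that balances
# across the given backend addresses (round-robin is nginx's default).
# "backend_httpd" and port 80 are assumed names/values for illustration.
render_upstream() {
    echo "upstream backend_httpd {"
    for ip in "$@"; do
        echo "    server ${ip}:80;"
    done
    echo "}"
    echo "server {"
    echo "    listen 80;"
    echo "    location / {"
    echo "        proxy_pass http://backend_httpd;"
    echo "    }"
    echo "}"
}

render_upstream 172.16.23.128 172.16.23.131
```

The same snippet would go on both nginx nodes, so that whichever node holds the VIP forwards to the same pair of backends.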

nginx is running on both nodes. Next, look at the backend httpd servers:

[root@master ~]# cat /etc/ansible/hosts|grep "^\[backend_nodes" -A 2
[backend_nodes]
172.16.23.128
172.16.23.131

Check the httpd service status:

[root@master ~]# ansible backend_nodes -m shell -a "systemctl status httpd"|grep running
   Active: active (running) since 二 2018-12-18 16:29:36 CST; 22min ago
   Active: active (running) since 二 2018-12-18 16:30:03 CST; 21min ago

Then test load balancing on each of the two nginx servers:

[root@master ~]# ansible 172.16.23.129 -m get_url -a "url=http://172.16.23.129/index.html dest=/tmp"|grep status_code
    "status_code": 200, 
[root@master ~]# ansible 172.16.23.129 -m shell -a "cat /tmp/index.html"
172.16.23.129 | CHANGED | rc=0 >>
172.16.23.128

[root@master ~]# ansible 172.16.23.129 -m get_url -a "url=http://172.16.23.129/index.html dest=/tmp"|grep status_code
    "status_code": 200, 
[root@master ~]# ansible 172.16.23.129 -m shell -a "cat /tmp/index.html"
172.16.23.129 | CHANGED | rc=0 >>
172.16.23.131

As shown above, the test on nginx1 (172.16.23.129) returns the pages of both backend httpd servers, 172.16.23.128 and 172.16.23.131, so access works and load balancing works.
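
Two requests are enough to see both backends, but a larger sample makes the distribution easier to judge. The sketch below counts how often each backend answered. The function name `tally` is made up for this example, and it assumes each response body is a single line (as the index.html files above appear to be).

```shell
#!/bin/sh
# Hypothetical helper: count how often each backend answered.
# Reads one response body per line on stdin, prints "<count> <body>".
tally() {
    sort | uniq -c | awk '{ print $1, $2 }'
}

# Offline demonstration with canned responses (not real curl output):
printf '%s\n' 172.16.23.128 172.16.23.131 172.16.23.128 172.16.23.131 | tally
```

Against the live service this could be fed with something like `for i in 1 2 3 4 5 6; do curl -s http://172.16.23.129/index.html; done | tally`. With nginx's default round-robin the counts should come out roughly equal.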

[root@master ~]# ansible 172.16.23.130 -m get_url -a "url=http://172.16.23.130/index.html dest=/tmp"|grep status_code
    "status_code": 200, 
[root@master ~]# ansible 172.16.23.130 -m shell -a "cat /tmp/index.html"
172.16.23.130 | CHANGED | rc=0 >>
172.16.23.128

[root@master ~]# ansible 172.16.23.130 -m get_url -a "url=http://172.16.23.130/index.html dest=/tmp"|grep status_code
    "status_code": 200, 
[root@master ~]# ansible 172.16.23.130 -m shell -a "cat /tmp/index.html"
172.16.23.130 | CHANGED | rc=0 >>
172.16.23.131

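As shown above, nginx2 reaches the backend httpd servers just as well, so load balancing works on both nginx servers. Now, with keepalived installed on both nginx servers, check its status: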

[root@master ~]# ansible nodes -m shell -a "systemctl status keepalived"|grep running
   Active: active (running) since 二 2018-12-18 16:06:38 CST; 52min ago
   Active: active (running) since 二 2018-12-18 16:05:04 CST; 54min ago

Check the VIP: it is on node1.

[root@master ~]# ansible nodes -m shell -a "hostname;ip a|grep ens33|grep -Po '(?<=inet ).*(?=\/)'"
172.16.23.129 | CHANGED | rc=0 >>
node1
172.16.23.129
172.16.23.132

172.16.23.130 | CHANGED | rc=0 >>
node2
172.16.23.130

The VIP sits on nginx1, that is, node1. Now access the VIP to check load balancing:

[root@master ~]# curl http://172.16.23.132
172.16.23.131
[root@master ~]# curl http://172.16.23.132
172.16.23.128

The responses look fine. Now take one nginx server out of service and see what happens to keepalived and to access through the VIP:

[root@master ~]# ansible 172.16.23.130 -m shell -a "systemctl stop nginx"
172.16.23.130 | CHANGED | rc=0 >>

Check the keepalived service status and the VIP:

[root@master ~]# ansible nodes -m shell -a "systemctl status keepalived"|grep running
   Active: active (running) since 二 2018-12-18 16:05:04 CST; 1h 4min ago
   Active: active (running) since 二 2018-12-18 16:06:38 CST; 1h 3min ago

[root@master ~]# ansible nodes -m shell -a "hostname;ip a|grep ens33|grep -Po '(?<=inet ).*(?=\/)'"
172.16.23.130 | CHANGED | rc=0 >>
node2
172.16.23.130

172.16.23.129 | CHANGED | rc=0 >>
node1
172.16.23.129
172.16.23.132

The VIP has not moved and keepalived is running normally. Now access the VIP:

[root@master ~]# curl http://172.16.23.132
172.16.23.128
[root@master ~]# curl http://172.16.23.132
172.16.23.131

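Access to the web service through the VIP still works. This is expected: nginx was stopped on node2, which did not hold the VIP, so requests to the VIP are still handled by nginx on node1.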
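
Note that the keepalived.conf shown in this post contains no health check for nginx, so keepalived would not react even if nginx stopped on the node holding the VIP. keepalived supports this through `vrrp_script` and `track_script` blocks, which run a check command and treat a non-zero exit status as a failure. A minimal sketch of such a check follows. The function name `check_service` is made up, and this script is not part of the original setup.

```shell
#!/bin/sh
# Hypothetical health check: run the given command quietly and report the
# result. Returns 0 when the command succeeds (healthy) and 1 otherwise,
# which is the exit-status convention a vrrp_script check relies on.
check_service() {
    if "$@" >/dev/null 2>&1; then
        echo up
        return 0
    fi
    echo down
    return 1
}

# Offline demonstration using the shell built-ins true and false:
check_service true
check_service false || true
```

On a real node the call might be `check_service systemctl is-active --quiet nginx`, with the script referenced from a `vrrp_script` block that the `vrrp_instance` tracks.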

Now start nginx again and stop keepalived on one node:

[root@master ~]# ansible 172.16.23.130 -m shell -a "systemctl start nginx"
172.16.23.130 | CHANGED | rc=0 >>

[root@master ~]# ansible nodes -m shell -a "systemctl status nginx"|grep running
   Active: active (running) since 二 2018-12-18 17:15:48 CST; 18s ago
   Active: active (running) since 二 2018-12-18 16:33:04 CST; 43min ago

[root@master ~]# ansible 172.16.23.130 -m shell -a "systemctl stop keepalived"
172.16.23.130 | CHANGED | rc=0 >>

Then look at the log on that node with tail -f /var/log/messages:

Dec 18 17:16:50 node2 systemd: Stopping LVS and VRRP High Availability Monitor...
Dec 18 17:16:50 node2 Keepalived[12981]: Stopping
Dec 18 17:16:50 node2 Keepalived_healthcheckers[12982]: Stopped
Dec 18 17:16:51 node2 Keepalived_vrrp[12983]: Stopped
Dec 18 17:16:51 node2 Keepalived[12981]: Stopped Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Dec 18 17:16:52 node2 systemd: Stopped LVS and VRRP High Availability Monitor.

[root@master ~]# ansible nodes -m shell -a "systemctl status keepalived"|grep running
   Active: active (running) since 二 2018-12-18 16:06:38 CST; 1h 10min ago


[root@master ~]# ansible nodes -m shell -a "hostname;ip a|grep ens33|grep -Po '(?<=inet ).*(?=\/)'"
172.16.23.130 | CHANGED | rc=0 >>
node2
172.16.23.130

172.16.23.129 | CHANGED | rc=0 >>
node1
172.16.23.129
172.16.23.132

Because keepalived was stopped on nginx2, that is, node2, the VIP stays on node1 and does not move to node2. Now look at the keepalived configuration files on node1 and node2:

[root@master ~]# ansible nodes -m shell -a "cat /etc/keepalived/keepalived.conf"
172.16.23.129 | CHANGED | rc=0 >>
! Configuration File for keepalived

global_defs {
   notification_email {
       [email protected]
   }
   notification_email_from [email protected]
   smtp_server smtp.163.com
   smtp_connect_timeout 30
   router_id test
}

vrrp_instance VI_1 {
    state BACKUP
    interface ens33
    virtual_router_id 51
    priority 100
    nopreempt           # non-preemptive mode
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.23.132/24 dev ens33
    }
}

172.16.23.130 | CHANGED | rc=0 >>
! Configuration File for keepalived

global_defs {
   notification_email {
       [email protected]
   }
   notification_email_from [email protected]
   smtp_server smtp.163.com
   smtp_connect_timeout 30
   router_id test
}

vrrp_instance VI_1 {
    state BACKUP 
    interface ens33
    virtual_router_id 51
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.23.132/24 dev ens33 
    }
}
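
To compare the two files at a glance, it can help to pull out only the fields that matter for the election. The sketch below reads a keepalived.conf on stdin and prints its state, priority, and whether nopreempt is set. The function name `summarise_vrrp` is made up for this example, and it assumes a single `vrrp_instance` per file, as in the configuration above.

```shell
#!/bin/sh
# Hypothetical helper: summarise the election-related fields of a
# keepalived.conf read from stdin.
# Prints: state=<state> priority=<n> nopreempt=yes|no
summarise_vrrp() {
    awk '
        $1 == "state"     { state = $2 }
        $1 == "priority"  { prio = $2 }
        $1 == "nopreempt" { np = "yes" }
        END { printf "state=%s priority=%s nopreempt=%s\n", state, prio, (np ? np : "no") }
    '
}

# Offline demonstration with the relevant lines of node1 file:
summarise_vrrp <<'EOF'
vrrp_instance VI_1 {
    state BACKUP
    priority 100
    nopreempt
}
EOF
```

On each node this could be run as `summarise_vrrp < /etc/keepalived/keepalived.conf`.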

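The two configurations differ only in priority and in that node1 sets nopreempt (non-preemptive mode). Both nodes start in state BACKUP, which keepalived requires for nopreempt to take effect. Now start keepalived on node2 again, then stop keepalived on node1, and check the VIP: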

[root@master ~]# ansible 172.16.23.130 -m shell -a "systemctl start keepalived"
172.16.23.130 | CHANGED | rc=0 >>

Check the log on node2:

Dec 18 17:23:14 node2 systemd: Starting LVS and VRRP High Availability Monitor...
Dec 18 17:23:14 node2 Keepalived[15994]: Starting Keepalived v1.3.5 (03/19,2017), git commit v1.3.5-6-g6fa32f2
Dec 18 17:23:14 node2 Keepalived[15994]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 18 17:23:14 node2 Keepalived[15995]: Starting Healthcheck child process, pid=15996
Dec 18 17:23:14 node2 Keepalived_healthcheckers[15996]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 18 17:23:14 node2 Keepalived[15995]: Starting VRRP child process, pid=15997
Dec 18 17:23:14 node2 systemd: Started LVS and VRRP High Availability Monitor.
Dec 18 17:23:14 node2 Keepalived_vrrp[15997]: Registering Kernel netlink reflector
Dec 18 17:23:14 node2 Keepalived_vrrp[15997]: Registering Kernel netlink command channel
Dec 18 17:23:14 node2 Keepalived_vrrp[15997]: Registering gratuitous ARP shared channel
Dec 18 17:23:14 node2 Keepalived_vrrp[15997]: Opening file '/etc/keepalived/keepalived.conf'.
Dec 18 17:23:24 node2 Keepalived_vrrp[15997]: VRRP_Instance(VI_1) removing protocol VIPs.
Dec 18 17:23:24 node2 Keepalived_vrrp[15997]: Using LinkWatch kernel netlink reflector...
Dec 18 17:23:24 node2 Keepalived_vrrp[15997]: VRRP_Instance(VI_1) Entering BACKUP STATE
Dec 18 17:23:24 node2 Keepalived_vrrp[15997]: VRRP sockpool: [ifindex(2), proto(112), unicast(0), fd(10,11)]

keepalived service status and VIP on both nodes:

[root@master ~]# ansible nodes -m shell -a "systemctl status keepalived"|grep running
   Active: active (running) since 二 2018-12-18 17:23:14 CST; 56s ago
   Active: active (running) since 二 2018-12-18 16:06:38 CST; 1h 17min ago

[root@master ~]# ansible nodes -m shell -a "hostname;ip a|grep ens33|grep -Po '(?<=inet ).*(?=\/)'"
172.16.23.129 | CHANGED | rc=0 >>
node1
172.16.23.129
172.16.23.132

172.16.23.130 | CHANGED | rc=0 >>
node2
172.16.23.130
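
For failover tests like these, the recurring question is which node currently lists the VIP. The sketch below answers it from `ip -o -4 addr show`-style output, where the fourth field is the address with its prefix length. The function name `holds_vip` is made up for this example, and the input lines in the demonstration are hand-written samples, not output captured from these nodes.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given address appears in
# `ip -o -4 addr show`-style output read from stdin.
# Argument: $1 = the VIP to look for (without prefix length).
holds_vip() {
    awk -v vip="$1" '
        { split($4, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Offline demonstration with hand-written sample lines:
printf '%s\n' \
    '2: ens33    inet 172.16.23.129/24 brd 172.16.23.255 scope global ens33' \
    '2: ens33    inet 172.16.23.132/24 scope global secondary ens33' \
    | holds_vip 172.16.23.132 && echo "VIP present"
```

On a node this could be run as `ip -o -4 addr show dev ens33 | holds_vip 172.16.23.132`, and the exit status used in a script.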
