SSDB Cluster + Keepalived in Practice, Part 4: Configuring keepalived
Environment
Operating system: CentOS Linux release 7.6.1810 (Core)
ssdb: 1.9.7
keepalived: keepalived-2.0.16
IP:
master: 10.80.2.121
slave: 10.80.2.85
VIP: 10.80.2.156
Install keepalived
1. Install dependencies
Depending on the operating system, the required packages may vary:
[root@prodssdb-001 data]# yum install openssl-devel -y
2. Download the keepalived installation package
Download address: keepalived official website
Create a directory under /opt:
[root@prodssdb-001 opt]# mkdir -p /opt/keepalive
[root@prodssdb-001 opt]# cd /opt/keepalive/
Download the installation package:
[root@prodssdb-001 keepalive]# wget https://www.keepalived.org/software/keepalived-2.0.16.tar.gz
3. Install keepalived
Unzip:
[root@prodssdb-001 keepalive]# tar zxvf keepalived-2.0.16.tar.gz
Build and install:
[root@prodssdb-001 keepalive]# cd keepalived-2.0.16
[root@prodssdb-001 keepalived-2.0.16]# ./configure --prefix=/
[root@prodssdb-001 keepalived-2.0.16]# make
[root@prodssdb-001 keepalived-2.0.16]# make install
After the build completes, the directory contains the following files:
[root@prodssdb-001 keepalived-2.0.16]# ls
aclocal.m4 bin ChangeLog config.status CONTRIBUTORS doc install-sh keepalived.spec.in Makefile.am README TODO
ar-lib bin_install compile configure COPYING genhash keepalived lib Makefile.in README.md
AUTHOR build_setup config.log configure.ac depcomp INSTALL keepalived.spec Makefile missing snap
Configure keepalived
Modify the keepalived configuration on the master and the slave respectively.
1. Configure keepalived on the master
The master configuration is as follows:
[root@prodssdb-001 keepalived]# cat keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id HA_SSDB                      # router ID; should be unique within the LAN
    script_user root                       # user that runs the check script
}
vrrp_script check_fence {                  # define a periodic check script
    script "/etc/keepalived/check_db.sh"
    interval 6                             # execution interval, in seconds
    weight 20                              # priority offset; added to the priority when the script succeeds
}
vrrp_instance VI_1 {                       # define a virtual router
    state MASTER                           # initial state of this node; only one node may be MASTER, the rest must be BACKUP
    nopreempt                              # non-preemptive mode
    interface eth0                         # physical interface bound to this virtual router
    virtual_router_id 99                   # unique ID of the virtual router; must match on master and slave; range 0-255
    priority 100                           # priority of this host; higher wins; the master must be higher than the backup; range 1-254
    advert_int 1                           # advertisement interval; advertisements carry the priority, heartbeat, etc.
    authentication {                       # authentication settings
        auth_type PASS                     # PASS means simple string authentication
        auth_pass <root password>          # replace with your own password
    }
    virtual_ipaddress {
        10.80.2.156/24                     # virtual IP (VIP) address; multiple allowed, one per line
    }
    track_script {
        check_fence
    }
}
This configuration defines a virtual router VI_1, sets its initial state to MASTER, and registers a periodic check script, check_db.sh, which is an anti-split-brain script I wrote myself.
2. Configure keepalived on the slave
[root@ecloud02-carchat-prod-ssdb02 keepalived]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    router_id HA_SSDB
    script_user root
}
vrrp_script check_fence {
    script "/etc/keepalived/check_db.sh"
    interval 6
    weight 20
}
vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0
    virtual_router_id 99
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass <root password>
    }
    virtual_ipaddress {
        10.80.2.156/24
    }
    track_script {
        check_fence
    }
}
Note that I set the master's priority to 100, the slave's priority to 90, and the check script's weight to 20.
This way, if the master goes down, the VIP floats to the slave:
90 (slave priority) + 20 (weight) > 100 (master priority)
When the master recovers, the VIP automatically floats back to the master:
90 (slave priority) + 20 (weight) < 100 (master priority) + 20 (weight)
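The arithmetic above can be sketched in a few lines of shell. This is only an illustration of how keepalived combines priority and script weight; keepalived itself computes this internally, and the variable names here are my own:

```shell
# Effective VRRP priority = configured priority, plus the script weight
# when the tracked check script succeeds.
MASTER_PRIO=100
SLAVE_PRIO=90
WEIGHT=20

# Master down: its check script fails, so it gets no bonus.
master_eff=$MASTER_PRIO
slave_eff=$((SLAVE_PRIO + WEIGHT))
echo "failover: slave $slave_eff > master $master_eff"   # VIP moves to the slave

# Master recovered: both check scripts succeed again.
master_eff=$((MASTER_PRIO + WEIGHT))
slave_eff=$((SLAVE_PRIO + WEIGHT))
echo "failback: master $master_eff > slave $slave_eff"   # VIP returns to the master
```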
3. Configure the anti-split-brain script
[root@ecloud02-carchat-prod-ssdb02 keepalived]# cat check_db.sh
#!/bin/bash
# Anti-split-brain check script: verify gateway reachability and the ssdb process.
FENCE_LOG=/root/fence_log
GATEWAY=10.80.2.1
VIP=10.80.2.156
INTKEY='eth0'
NOW=`date +'%Y-%m-%d_%H:%M:%S'`

# Release the VIP and stop ssdb and keepalived on this node.
kill_resource() {
    _VIP=$1
    _INTKEY=$2
    # /sbin/ifconfig $_VIP:$_INTKEY down
    /usr/sbin/ip addr del $_VIP/24 dev $_INTKEY   # prefix length must match the configured VIP
    ps aux|grep ssdb-server|grep -v grep|awk '{print $2}'|xargs kill -9
    /usr/bin/systemctl stop keepalived
    ps aux|grep "keepalived -D"|grep -v grep|awk '{print $2}'|xargs kill -9
}

# Keep only the most recent 43200 lines of the fence log.
TMP_LOG=`tail -n43200 $FENCE_LOG`
echo "$TMP_LOG" > $FENCE_LOG

PING_RESULT=`ping -w 5 $GATEWAY 2>&1 | grep -E "100% packet loss|unreachable"`
SSDB_PROC=`ps aux | grep -v grep | grep ssdb-server|wc -l`
echo "$NOW" >> $FENCE_LOG
echo "ping result is: $PING_RESULT" >> $FENCE_LOG
echo "ssdb proc number is: $SSDB_PROC" >> $FENCE_LOG

# Fence if the gateway is unreachable or no ssdb-server process is running.
if [ -n "$PING_RESULT" -o "$SSDB_PROC" -eq 0 ]; then
    echo "I have no network connection or ssdb is down! I have to kill VIP and ssdb" >> $FENCE_LOG
    kill_resource $VIP $INTKEY
    exit 1
else
    exit 0
fi
The script checks the machine's network connectivity and the state of the ssdb service. If the machine cannot ping the gateway within 5 seconds, or the ssdb-server process cannot be found, the script enters kill mode: it removes the VIP from the machine and kills the local ssdb process, which prevents split brain.
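The fencing decision can be exercised in isolation with a small helper (`should_fence` is a hypothetical name of my own, not part of the script above) that mirrors the script's test condition:

```shell
# should_fence: hypothetical helper reproducing the script's fencing rule.
# $1 = result of grepping the ping output ("" means the gateway answered)
# $2 = number of running ssdb-server processes
should_fence() {
    if [ -n "$1" ] || [ "$2" -eq 0 ]; then
        echo fence   # network lost or ssdb down: release the VIP, kill ssdb
    else
        echo ok      # healthy: do nothing
    fi
}

should_fence "" 1                   # prints "ok"
should_fence "100% packet loss" 1   # prints "fence"
should_fence "" 0                   # prints "fence"
```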
Verification
Start keepalived on both the master and the slave:
[root@ecloud02-carchat-prod-ssdb01 keepalived-2.0.16]# systemctl start keepalived
1. Network status
View the IP information on the master:
[root@ecloud02-carchat-prod-ssdb01 keepalived-2.0.16]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:eb:c2:06 brd ff:ff:ff:ff:ff:ff
inet 10.80.2.121/24 brd 10.80.2.255 scope global noprefixroute eth0
valid_lft forever preferred_lft forever
inet 10.80.2.156/24 scope global secondary eth0
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:feeb:c206/64 scope link
valid_lft forever preferred_lft forever
View the IP information on the slave:
[root@ecloud02-carchat-prod-ssdb02 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
link/ether fa:16:3e:4e:78:6b brd ff:ff:ff:ff:ff:ff
inet 10.80.2.85/24 brd 10.80.2.255 scope global noprefixroute eth0
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe4e:786b/64 scope link
valid_lft forever preferred_lft forever
You can see that the virtual IP is attached to the master.
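To confirm which node currently holds the VIP, you can grep the `ip addr` output for it. The sketch below runs against a captured sample of that output rather than a live interface, so the sample text is an assumption for illustration:

```shell
VIP=10.80.2.156
# Sample `ip addr` lines as captured on the master above (assumed for this demo).
SAMPLE="inet 10.80.2.121/24 brd 10.80.2.255 scope global noprefixroute eth0
inet 10.80.2.156/24 scope global secondary eth0"

# On a live node you would pipe `ip addr show eth0` instead of echoing $SAMPLE.
if echo "$SAMPLE" | grep -q "inet $VIP/"; then
    echo "VIP is on this node"
else
    echo "VIP is not on this node"
fi
```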
2. Simulate an ssdb outage on the master
Manually kill all ssdb processes on the master node to simulate an outage:
ps aux|grep ssdb-server|grep -v grep|awk '{print $2}'|xargs kill -9
You will observe that ssdb and keepalived on the master are both shut down, and the VIP floats to the slave.
3. Simulate a network outage on the master
Manually disconnect the master's network so that it cannot ping the gateway:
ssdb and keepalived on the master are both shut down, and the VIP floats to the slave.