SSDB cluster + keepalived in practice, part 4: keepalived configuration

Environment

Operating system: CentOS Linux release 7.6.1810 (Core)
ssdb: 1.9.7
keepalived: keepalived-2.0.16
IP:
master: 10.80.2.121
slave: 10.80.2.85

vip: 10.80.2.156

Install keepalived

1. Install the dependencies

The required packages may vary depending on the operating system.

[root@prodssdb-001 data]# yum install openssl-devel -y

2. Download the keepalived installation package

Download address: keepalived official website

Create a directory under /opt:

[root@prodssdb-001 opt]# mkdir -p /opt/keepalive
[root@prodssdb-001 opt]# cd /opt/keepalive/

Download the installation package:

[root@prodssdb-001 keepalive]# wget https://www.keepalived.org/software/keepalived-2.0.16.tar.gz

3. Install keepalived

Extract the archive:

[root@prodssdb-001 keepalive]# tar zxvf keepalived-2.0.16.tar.gz

Build and install:

[root@prodssdb-001 keepalive]# cd keepalived-2.0.16
[root@prodssdb-001 keepalived-2.0.16]# ./configure --prefix=/
[root@prodssdb-001 keepalived-2.0.16]# make
[root@prodssdb-001 keepalived-2.0.16]# make install

After the build completes, the source directory contains the following files:

[root@prodssdb-001 keepalived-2.0.16]# ls
aclocal.m4  bin          ChangeLog   config.status  CONTRIBUTORS  doc      install-sh       keepalived.spec.in  Makefile.am  README     TODO
ar-lib      bin_install  compile     configure      COPYING       genhash  keepalived       lib                 Makefile.in  README.md
AUTHOR      build_setup  config.log  configure.ac   depcomp       INSTALL  keepalived.spec  Makefile            missing      snap

Configure keepalived

Modify the keepalived configuration on the master and on the slave respectively.

1. Configure keepalived on the master

The master configuration is as follows:

[root@prodssdb-001 keepalived]# cat keepalived.conf
! Configuration File for keepalived

global_defs {
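   router_id HA_SSDB             # router identifier; should be unique within a LAN
   script_user root              # user that runs the scripts
}

vrrp_script check_fence {        # define a periodically executed check script
    script "/etc/keepalived/check_db.sh"
    interval 6                   # interval between script runs, in seconds
    weight 20                    # priority offset; added to the priority while the script succeeds
}

vrrp_instance VI_1 {             # define a VRRP instance (virtual router)
    state MASTER                 # initial state of this node in the virtual router; only one node should be MASTER, the rest BACKUP
    nopreempt                    # non-preemptive mode
    interface eth0               # physical interface this virtual router is bound to
    virtual_router_id 99         # unique ID of this virtual router; must be identical on master and backup; range 1-255
    priority 100                 # priority of this host in the virtual router; the higher the number, the higher the priority; the master's must exceed the backup's; range 1-254
    advert_int 1                 # advertisement interval in seconds; advertisements carry the host priority and serve as the heartbeat
    authentication {             # authentication settings
        auth_type PASS           # authentication type; PASS means simple password authentication
        auth_pass <root-password>   # placeholder for the shared password
    }
    virtual_ipaddress {
        10.80.2.156/24           # virtual IP (VIP) address; several may be listed, one per line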
    }
    track_script {
        check_fence
    }
}

The configuration defines a VRRP instance VI_1, sets its initial state to MASTER, and registers a periodic check script, check_db.sh, which is an anti-split-brain script I wrote myself.

2. Configure keepalived on the slave

[root@ecloud02-carchat-prod-ssdb02 keepalived]# cat /etc/keepalived/keepalived.conf
! Configuration File for keepalived

global_defs {
   router_id HA_SSDB
   script_user root
}

vrrp_script check_fence {
    script "/etc/keepalived/check_db.sh"
    interval 6
    weight 20
}

vrrp_instance VI_1 {
    state BACKUP
    nopreempt
    interface eth0
    virtual_router_id 99
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass <root-password>
    }
    virtual_ipaddress {
        10.80.2.156/24
    }
    track_script {
        check_fence
    }
}

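Note that I set the master's priority to 100, the slave's priority to 90, and the weight of the check script to 20.
This way, if the check fails on the master, the slave's effective priority exceeds the master's and the virtual IP floats to the slave:

90 (slave priority) + 20 (weight) > 100 (master priority)

When the master returns to normal and keepalived is running on it again, its effective priority is once more the higher one, and the VIP automatically floats back to the master:

90 (slave priority) + 20 (weight) < 100 (master priority) + 20 (weight)

The keepalived documentation states that nopreempt only takes effect when the initial state is BACKUP, so with state MASTER on the master the VIP is still taken back as described.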
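
The priority arithmetic can be checked with a small shell sketch. The helper name effective_priority is hypothetical and not part of keepalived; it only assumes that a positive weight is added to the base priority while the tracked script succeeds, as the configuration comments describe.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_OK
# Illustrative helper (not a keepalived command): prints the priority a node
# would advertise, assuming the weight is added only while the check passes.
effective_priority() {
    local base=$1 weight=$2 script_ok=$3
    if [ "$script_ok" -eq 1 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

master_failed=$(effective_priority 100 20 0)   # check failing on the master
master_ok=$(effective_priority 100 20 1)       # check passing on the master
slave_ok=$(effective_priority 90 20 1)         # check passing on the slave

echo "master failed:  master=$master_failed slave=$slave_ok"
echo "master healthy: master=$master_ok slave=$slave_ok"
```

With the values from this article, the first line shows the slave ahead (110 against 100) and the second shows the master ahead (120 against 110).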

3. Configure the anti-split-brain script

[root@ecloud02-carchat-prod-ssdb02 keepalived]# cat check_db.sh 
#!/bin/bash
# Fencing check run by keepalived (vrrp_script): exit 0 means healthy, exit 1 means failed.
FENCE_LOG=/root/fence_log
GATEWAY=10.80.2.1
VIP=10.80.2.156
INTKEY='eth0'
NOW=$(date +'%Y-%m-%d_%H:%M:%S')

# Remove the VIP from the interface, then stop ssdb-server and keepalived on this node.
kill_resource() {
    _VIP=$1
    _INTKEY=$2
#    /sbin/ifconfig $_VIP:$_INTKEY down
    /usr/sbin/ip addr del "$_VIP" dev "$_INTKEY"
    # xargs -r (GNU): skip kill when no matching process is found
    ps aux|grep ssdb-server|grep -v grep|awk '{print $2}'|xargs -r kill -9
    /usr/bin/systemctl stop keepalived
    ps aux|grep "keepalived -D"|grep -v grep|awk '{print $2}'|xargs -r kill -9
}

# Trim the log to its last 43200 lines.
TMP_LOG=$(tail -n43200 "$FENCE_LOG")
echo "$TMP_LOG" > "$FENCE_LOG"

# Non-empty when the gateway is unreachable (ping gives up after 5 seconds).
PING_RESULT=$(ping -w 5 "$GATEWAY" 2>&1 | grep -E "100% packet loss|unreachable")
# Number of running ssdb-server processes.
SSDB_PROC=$(ps aux | grep -v grep | grep ssdb-server | wc -l)

echo "$NOW" >> "$FENCE_LOG"
echo "ping result is: $PING_RESULT" >> "$FENCE_LOG"
echo "ssdb proc number is: $SSDB_PROC" >> "$FENCE_LOG"
if [ -n "$PING_RESULT" ] || [ "$SSDB_PROC" -eq 0 ]; then
    echo "I have no network connection or ssdb is down! I have to kill VIP and ssdb" >> "$FENCE_LOG"
    kill_resource "$VIP" "$INTKEY"
    exit 1
else
    exit 0
fi

The script checks the machine's network connectivity and the status of the ssdb service. If the machine cannot ping the gateway within 5 seconds, or no ssdb-server process is found, the script fences the node: it removes the VIP from the machine, kills the ssdb processes, and stops keepalived, which prevents split brain.
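
The decision rule in check_db.sh can be exercised without touching the network or any process by isolating it in two functions. This is only a sketch: the names ping_failed and should_fence are hypothetical, and the ping output strings are made-up samples rather than captured logs.

```shell
#!/bin/bash
# ping_failed OUTPUT
# Succeeds when the ping output indicates the gateway could not be reached,
# using the same pattern as check_db.sh.
ping_failed() {
    echo "$1" | grep -qE "100% packet loss|unreachable"
}

# should_fence PING_OUTPUT SSDB_PROC_COUNT
# Succeeds (meaning: fence this node) when the gateway is unreachable
# or no ssdb-server process is running.
should_fence() {
    local ping_output=$1 ssdb_procs=$2
    if ping_failed "$ping_output" || [ "$ssdb_procs" -eq 0 ]; then
        return 0
    fi
    return 1
}

# Sample inputs, invented for illustration only.
ok_ping="5 packets transmitted, 5 received, 0% packet loss, time 4005ms"
bad_ping="5 packets transmitted, 0 received, 100% packet loss, time 3999ms"

should_fence "$ok_ping" 1  && echo "fence" || echo "healthy"   # network ok, ssdb running
should_fence "$bad_ping" 1 && echo "fence" || echo "healthy"   # gateway unreachable
should_fence "$ok_ping" 0  && echo "fence" || echo "healthy"   # ssdb not running
```

Keeping the rule separate from kill_resource makes it possible to test the conditions before letting the script kill anything on a production node.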

Verification

Start keepalived on both the master and the slave:

[root@ecloud02-carchat-prod-ssdb01 keepalived-2.0.16]# systemctl start keepalived

1. Network status

View the IP information on the master:

[root@ecloud02-carchat-prod-ssdb01 keepalived-2.0.16]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:eb:c2:06 brd ff:ff:ff:ff:ff:ff
    inet 10.80.2.121/24 brd 10.80.2.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet 10.80.2.156/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:feeb:c206/64 scope link 
       valid_lft forever preferred_lft forever

View the IP information on the slave:

[root@ecloud02-carchat-prod-ssdb02 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether fa:16:3e:4e:78:6b brd ff:ff:ff:ff:ff:ff
    inet 10.80.2.85/24 brd 10.80.2.255 scope global noprefixroute eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::f816:3eff:fe4e:786b/64 scope link 
       valid_lft forever preferred_lft forever

You can see that the virtual IP is bound to the master.
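
Instead of reading the ip addr output by eye, the check can be scripted. The sketch below uses a hypothetical helper, has_vip, that reads ip addr style output on standard input; the sample lines are copied from the output above so the example runs without a real interface.

```shell
#!/bin/bash
# has_vip VIP
# Reads `ip addr` style output on stdin and succeeds if VIP is bound.
# The fixed-string match on "inet <VIP>/" avoids partial matches such as 10.80.2.15.
has_vip() {
    local vip=$1
    grep -qF "inet ${vip}/"
}

# Sample lines taken from the eth0 sections shown in this article.
sample_master='    inet 10.80.2.121/24 brd 10.80.2.255 scope global noprefixroute eth0
    inet 10.80.2.156/24 scope global secondary eth0'
sample_slave='    inet 10.80.2.85/24 brd 10.80.2.255 scope global noprefixroute eth0'

echo "$sample_master" | has_vip 10.80.2.156 && echo "master: VIP bound"
echo "$sample_slave"  | has_vip 10.80.2.156 || echo "slave: VIP not bound"

# On a real node the same helper would be fed live output, for example:
#   ip addr show dev eth0 | has_vip 10.80.2.156 && echo "VIP bound here"
```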

2. Simulate an ssdb service failure on the master

Manually kill all ssdb processes on the master node to simulate downtime:

ps aux|grep ssdb-server|grep -v grep|awk '{print $2}'|xargs kill -9

You will observe that ssdb and keepalived are both stopped on the master, and the virtual IP floats to the slave.

3. Simulate a network interruption on the master

Manually shut down the master's network so that it cannot ping the gateway.
ssdb and keepalived on the master are both stopped, and the virtual IP floats to the slave.
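
To observe either failover from the slave without repeatedly running ip addr by hand, a small polling loop can help. This is a sketch under assumptions: wait_until and vip_is_local are hypothetical helper names, and the 30 second timeout is an arbitrary choice, not a measured failover time.

```shell
#!/bin/bash
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Runs COMMAND repeatedly until it succeeds or TIMEOUT_SECONDS have elapsed.
wait_until() {
    local timeout=$1 interval=$2
    shift 2
    local waited=0
    while ! "$@"; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    return 0
}

# Succeeds when the VIP from this article is bound to eth0 on the local machine.
vip_is_local() {
    ip addr show dev eth0 2>/dev/null | grep -qF "inet 10.80.2.156/"
}

# Run on the slave right after triggering the failure on the master:
#   wait_until 30 1 vip_is_local && echo "VIP arrived" || echo "VIP did not arrive within 30s"
```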

Origin blog.csdn.net/sunbocong/article/details/94401014