MHA High-Availability Cluster Configuration Lab

I. MHA Overview

  • Developed by Yoshinori Matsunobu at the Japanese company DeNA (he later moved to Facebook)
  • A mature high-availability tool for MySQL that handles failover and master promotion
  • Consists of an MHA Manager (management node) and MHA Nodes (data nodes)

II. Lab Procedure

1. Deploy the MHA architecture
1) Install the databases
2) Set up one master and two slaves
3) Build MHA

2. Simulate a failure and verify
1) The master goes down
2) The candidate master takes over, achieving high availability

III. Deploying the MHA Architecture

3.1 Lab Environment

Spin up four servers in a VMware virtual environment, all running CentOS 7.
Set each server's hostname, and add the IP-to-hostname mappings on all four servers.

hostnamectl set-hostname manager         #### set the hostname on each server
su
hostnamectl set-hostname master
su
hostnamectl set-hostname slave1
su
hostnamectl set-hostname slave2
su

vim /etc/hosts              ## add the mappings
192.168.5.201    manager
192.168.5.203	 master
192.168.5.204    slave1
192.168.5.205    slave2

3.2 Install the Databases

  • MySQL needs to be installed on the master node server and on both slave servers.

  • For installation steps, refer to my earlier blog posts.

3.3 Configure One Master and Two Slaves

  • Edit the main MySQL configuration file /etc/my.cnf on all three node servers; note that the three servers' server-id values must all differ.
---Configure the master:
vim /etc/my.cnf
[mysqld]
server-id = 1
# enable the binary log
log_bin = master-bin
# allow the slaves to replicate from this server
log-slave-updates = true

---Configure slave1:
vim /etc/my.cnf
[mysqld]
server-id = 2
# enable the binary log
log_bin = master-bin
# use relay logs for replication
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

---Configure slave2:
vim /etc/my.cnf
[mysqld]
server-id = 3
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
  • Disable the firewall and start the databases
# create these two soft links on all three servers
ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/

# start mysql
/usr/local/mysql/bin/mysqld_safe --user=mysql &

# stop the firewall and disable SELinux
systemctl stop firewalld.service
setenforce 0
  • Configure database user privileges
  • Create the following grants on all three node servers
mysql -uroot  -pabc123     ### log in; the password is the "abc123" set earlier

mysql> grant replication slave on *.* to 'myslave'@'192.168.5.%' identified by '123';
### allow the slaves to replicate the master's data as user myslave

mysql> grant all privileges on *.* to 'mha'@'192.168.5.%' identified by 'manager';
### grant the manager server full privileges as user mha

mysql> flush privileges;   ## reload the grant tables

#### In theory the three grants below are unnecessary, but while testing this lab the MHA replication check reported that the two slaves could not connect to the master by hostname, so add the grants below on all the databases.

mysql> grant all privileges on *.* to 'mha'@'master' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
mysql> grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
  • On the master server, check the current binary log file and position
mysql> show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 |      154 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)

  • Next, on slave1 and slave2, start replication and set both slaves to read-only mode
mysql> change master to master_host='192.168.5.203',master_user='myslave',master_password='123',master_log_file='master-bin.000001',master_log_pos=154; 
mysql> start slave;   ### start replication
mysql> set global read_only=1;   ### set read-only mode
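To confirm replication is healthy, the output of `show slave status\G` can be parsed with a small helper. This is a sketch, not part of the original walkthrough; the `check_slave` name is hypothetical, and the mysql invocation in the comment assumes the abc123 root password set above.

```shell
#!/bin/sh
# check_slave: reads `show slave status\G` output on stdin and reports whether
# both the IO and SQL replication threads are running
check_slave() {
  n=$(grep -cE 'Slave_(IO|SQL)_Running: Yes')
  if [ "$n" -eq 2 ]; then echo "replication OK"; else echo "replication BROKEN"; fi
}
# on a slave you would feed it live output:
#   mysql -uroot -pabc123 -e 'show slave status\G' | check_slave
```

Both `Slave_IO_Running` and `Slave_SQL_Running` must show `Yes` before the MHA checks later in this lab will pass.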

3.4 Build MHA

  • Disable the firewall on all four servers
systemctl stop firewalld.service 
setenforce 0
  • Install the environment MHA depends on, on all four servers; start by installing the epel repository.
yum install epel-release --nogpgcheck
yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN

  • Install the node component on all four servers (this lab uses CentOS 7.4, so node version 0.57).
    The installation below is demonstrated on the master server; in practice install it on all four.
[root@master ~]# mkdir /abc 
[root@master ~]# mount.cifs //192.168.1.109/MHA /abc  ## mount the shared folder from the host machine
[root@master ~]# cd /abc/mha/
[root@master mha]# tar zxvf mha4mysql-node-0.57.tar.gz -C /opt
[root@master mha]# cd /opt/mha4mysql-node-0.57/
[root@master mha4mysql-node-0.57]# perl Makefile.PL 
[root@master mha4mysql-node-0.57]# make && make install
  • Install the manager component on the MHA-manager server (note: the node component must be installed before the manager component)
[root@manager mha]# cd /abc/mha/
[root@manager mha]# ls
cmake-2.8.6.tar.gz             mha4mysql-node-0.57.tar.gz  ruby-2.4.1.tar.gz
mha4mysql-manager-0.57.tar.gz  mysql-5.6.36.tar.gz         ruby安装.png
[root@manager mha]# tar zxvf mha4mysql-manager-0.57.tar.gz -C /opt
[root@manager mha]# cd /opt/mha4mysql-manager-0.57/
[root@manager mha4mysql-manager-0.57]# perl Makefile.PL 
[root@manager mha4mysql-manager-0.57]# make && make install
  • After installing manager and node, several tools are generated under /usr/local/bin, mainly the following
[root@manager mha4mysql-manager-0.57]# cd /usr/local/bin 
[root@manager bin]# ls
masterha_check_ssh        check MHA's SSH configuration
masterha_check_repl       check the MySQL replication status
masterha_manger           script that starts the manager
masterha_check_status     check the current MHA running state
masterha_master_monitor   detect whether the master is down
masterha_master_switch    control failover (automatic or manual)
masterha_conf_host        add or remove configured server entries
masterha_stop             stop the manager
save_binary_logs          save and copy the master's binary logs
apply_diff_relay_logs     identify differential relay log events and apply the differences to the other slaves
filter_mysqlbinlog        strip unneeded ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs          purge relay logs (without blocking the SQL thread)
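Of these tools, purge_relay_logs is typically run periodically on each slave. The sketch below is not part of the original article: the `purge_cmd` helper is hypothetical and just builds the invocation, using the mha user and manager password created earlier.

```shell
#!/bin/sh
# purge_cmd: build a purge_relay_logs command line for a given work directory
purge_cmd() {
  echo "/usr/local/bin/purge_relay_logs --user=mha --password=manager --disable_relay_log_purge --workdir=$1"
}
purge_cmd /tmp
# a crontab entry on each slave might look like:
# 0 4 * * * /usr/local/bin/purge_relay_logs --user=mha --password=manager --disable_relay_log_purge --workdir=/tmp >> /var/log/purge_relay_logs.log 2>&1
```

`--disable_relay_log_purge` lets MHA keep relay logs available for its differential recovery, while the scheduled run reclaims the space safely.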
  • Configure passwordless SSH authentication among the four servers
  • On manager, set up passwordless authentication to all the database nodes
[root@manager ~]# ssh-keygen -t rsa   // press Enter through all the prompts
[root@manager ~]# ssh-copy-id 192.168.5.203   ## requires the remote login password
[root@manager ~]# ssh-copy-id 192.168.5.204
[root@manager ~]# ssh-copy-id 192.168.5.205
  • On master, set up passwordless authentication to database nodes slave1 and slave2
[root@master ~]# ssh-keygen -t rsa
[root@master ~]# ssh-copy-id 192.168.5.204
[root@master ~]# ssh-copy-id 192.168.5.205
  • On slave1, set up passwordless authentication to database nodes master and slave2
[root@slave1  ~]# ssh-keygen -t rsa
[root@slave1  ~]# ssh-copy-id 192.168.5.203
[root@slave1  ~]# ssh-copy-id 192.168.5.205
  • On slave2, set up passwordless authentication to database nodes master and slave1
[root@slave2 ~]# ssh-keygen -t rsa
[root@slave2  ~]# ssh-copy-id 192.168.5.203
[root@slave2  ~]# ssh-copy-id 192.168.5.204
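The repeated ssh-copy-id steps above could be driven by a small loop. This is a sketch, not from the original article; the `copy_ids` helper is hypothetical and only prints the commands for a given node list (the IPs are the lab addresses from section 3.1).

```shell
#!/bin/sh
# copy_ids: print an ssh-copy-id command for each node IP passed in
copy_ids() {
  for ip in "$@"; do
    echo "ssh-copy-id root@$ip"
  done
}
# on manager, the three commands to run would be:
copy_ids 192.168.5.203 192.168.5.204 192.168.5.205
```

Each ssh-copy-id still prompts once for the remote login password, so the loop saves typing but not the password entries.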
  • Edit the MHA configuration files; on the manager server, copy the bundled scripts to /usr/local/bin
[root@manager ~]# cp -ra /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
// the copy brings over four executable files
[root@manager ~]# ll /usr/local/bin/scripts/
total 32
-rwxr-xr-x 1 mysql mysql 3648 May 31 2015 master_ip_failover  # manages the VIP during automatic failover
-rwxr-xr-x 1 mysql mysql 9872 May 25 09:07 master_ip_online_change # manages the VIP during online switchover
-rwxr-xr-x 1 mysql mysql 11867 May 31 2015 power_manager # shuts down the failed host after a failure
-rwxr-xr-x 1 mysql mysql 1360 May 31 2015 send_report # sends an alert after a failover
  • Copy the automatic-failover VIP management script above to /usr/local/bin; this lab manages the VIP with the script
[root@manager ~]# cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

Edit it as follows (delete the original content and enter the content below):

[root@manager ~]# vim /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);

my $vip = '192.168.5.200';
my $brdc = '192.168.5.255';
my $ifdev = 'ens33';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";

GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
# A simple system call that disable the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

  • Because the content was pasted in directly, run the vim command :%s/^#//g to strip the stray leading # characters


  • Don't forget to restore the # on the first line (the #!/usr/bin/env perl shebang)


  • Create the MHA software directory and copy over the configuration file.
[root@manager ~]# mkdir /etc/masterha
[root@manager ~]# cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/
[root@manager ~]# vim /etc/masterha/app1.cnf

[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.5.204 -s 192.168.5.205
shutdown_script=""
ssh_user=root
user=mha

[server1]
hostname=192.168.5.203
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.5.204
port=3306

[server3]
hostname=192.168.5.205
port=3306
  • Test the passwordless SSH authentication; if everything is fine, the output ends with successfully, as shown below
[root@manager ~]# masterha_check_ssh -conf=/etc/masterha/app1.cnf
Sat Feb  8 10:28:09 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Feb  8 10:28:09 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Feb  8 10:28:09 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Feb  8 10:28:09 2020 - [info] Starting SSH connection tests..
Sat Feb  8 10:28:12 2020 - [debug] 
Sat Feb  8 10:28:10 2020 - [debug]  Connecting via SSH from root@192.168.5.204(192.168.5.204:22) to root@192.168.5.203(192.168.5.203:22)..
Sat Feb  8 10:28:11 2020 - [debug]   ok.
Sat Feb  8 10:28:11 2020 - [debug]  Connecting via SSH from root@192.168.5.204(192.168.5.204:22) to root@192.168.5.205(192.168.5.205:22)..
Sat Feb  8 10:28:11 2020 - [debug]   ok.
Sat Feb  8 10:28:12 2020 - [debug] 
Sat Feb  8 10:28:09 2020 - [debug]  Connecting via SSH from root@192.168.5.203(192.168.5.203:22) to root@192.168.5.204(192.168.5.204:22)..
Sat Feb  8 10:28:10 2020 - [debug]   ok.
Sat Feb  8 10:28:10 2020 - [debug]  Connecting via SSH from root@192.168.5.203(192.168.5.203:22) to root@192.168.5.205(192.168.5.205:22)..
Sat Feb  8 10:28:11 2020 - [debug]   ok.
Sat Feb  8 10:28:13 2020 - [debug] 
Sat Feb  8 10:28:10 2020 - [debug]  Connecting via SSH from root@192.168.5.205(192.168.5.205:22) to root@192.168.5.203(192.168.5.203:22)..
Sat Feb  8 10:28:11 2020 - [debug]   ok.
Sat Feb  8 10:28:11 2020 - [debug]  Connecting via SSH from root@192.168.5.205(192.168.5.205:22) to root@192.168.5.204(192.168.5.204:22)..
Sat Feb  8 10:28:12 2020 - [debug]   ok.
Sat Feb  8 10:28:13 2020 - [info] All SSH connection tests passed successfully.
  • Check replication health
[root@manager bin]# masterha_check_repl -conf=/etc/masterha/app1.cnf
Sat Feb  8 10:31:56 2020 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Feb  8 10:31:56 2020 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Sat Feb  8 10:31:56 2020 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Sat Feb  8 10:31:56 2020 - [info] MHA::MasterMonitor version 0.57.
Sat Feb  8 10:31:58 2020 - [info] GTID failover mode = 0
Sat Feb  8 10:31:58 2020 - [info] Dead Servers:
Sat Feb  8 10:31:58 2020 - [info] Alive Servers:
Sat Feb  8 10:31:58 2020 - [info]   192.168.5.203(192.168.5.203:3306)
Sat Feb  8 10:31:58 2020 - [info]   192.168.5.204(192.168.5.204:3306)
Sat Feb  8 10:31:58 2020 - [info]   192.168.5.205(192.168.5.205:3306)
Sat Feb  8 10:31:58 2020 - [info] Alive Slaves:
Sat Feb  8 10:31:58 2020 - [info]   192.168.5.204(192.168.5.204:3306)  Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Sat Feb  8 10:31:58 2020 - [info]     Replicating from 192.168.5.203(192.168.5.203:3306)
Sat Feb  8 10:31:58 2020 - [info]     Primary candidate for the new Master (candidate_master is set)
Sat Feb  8 10:31:58 2020 - [info]   192.168.5.205(192.168.5.205:3306)  Version=5.6.26-log (oldest major version between slaves) log-bin:enabled
Sat Feb  8 10:31:58 2020 - [info]     Replicating from 192.168.5.203(192.168.5.203:3306)
Sat Feb  8 10:31:58 2020 - [info] Current Alive Master: 192.168.5.203(192.168.5.203:3306)
Sat Feb  8 10:31:58 2020 - [info] Checking slave configurations..
Sat Feb  8 10:31:58 2020 - [warning]  relay_log_purge=0 is not set on slave 192.168.5.204(192.168.5.204:3306).
Sat Feb  8 10:31:58 2020 - [warning]  relay_log_purge=0 is not set on slave 192.168.5.205(192.168.5.205:3306).
Sat Feb  8 10:31:58 2020 - [info] Checking replication filtering settings..
Sat Feb  8 10:31:58 2020 - [info]  binlog_do_db= , binlog_ignore_db= 
Sat Feb  8 10:31:58 2020 - [info]  Replication filtering check ok.
Sat Feb  8 10:31:58 2020 - [info] GTID (with auto-pos) is not supported
Sat Feb  8 10:31:58 2020 - [info] Starting SSH connection tests..
Sat Feb  8 10:32:01 2020 - [info] All SSH connection tests passed successfully.
Sat Feb  8 10:32:01 2020 - [info] Checking MHA Node version..
Sat Feb  8 10:32:02 2020 - [info]  Version check ok.
Sat Feb  8 10:32:02 2020 - [info] Checking SSH publickey authentication settings on the current master..
Sat Feb  8 10:32:02 2020 - [info] HealthCheck: SSH to 192.168.5.203 is reachable.
Sat Feb  8 10:32:03 2020 - [info] Master MHA Node version is 0.57.
Sat Feb  8 10:32:03 2020 - [info] Checking recovery script configurations on 192.168.5.203(192.168.5.203:3306)..
Sat Feb  8 10:32:03 2020 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/usr/local/mysql/data --output_file=/tmp/save_binary_logs_test --manager_version=0.57 --start_file=master-bin.000001 
Sat Feb  8 10:32:03 2020 - [info]   Connecting to root@192.168.5.203(192.168.5.203:22).. 
  Creating /tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /usr/local/mysql/data, up to master-bin.000001
Sat Feb  8 10:32:03 2020 - [info] Binlog setting check done.
Sat Feb  8 10:32:03 2020 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Sat Feb  8 10:32:03 2020 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.5.204 --slave_ip=192.168.5.204 --slave_port=3306 --workdir=/tmp --target_version=5.6.26-log --manager_version=0.57 --relay_log_info=/home/mysql/relay-log.info  --relay_dir=/home/mysql/  --slave_pass=xxx
Sat Feb  8 10:32:03 2020 - [info]   Connecting to root@192.168.5.204(192.168.5.204:22).. 
  Checking slave recovery environment settings..
    Opening /home/mysql/relay-log.info ... ok.
    Relay log found at /home/mysql, up to relay-log-bin.000001
    Temporary relay log file is /home/mysql/relay-log-bin.000001
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Feb  8 10:32:18 2020 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.5.205 --slave_ip=192.168.5.205 --slave_port=3306 --workdir=/tmp --target_version=5.6.26-log --manager_version=0.57 --relay_log_info=/home/mysql/relay-log.info  --relay_dir=/home/mysql/  --slave_pass=xxx
Sat Feb  8 10:32:18 2020 - [info]   Connecting to root@192.168.5.205(192.168.5.205:22).. 
  Checking slave recovery environment settings..
    Opening /home/mysql/relay-log.info ... ok.
    Relay log found at /home/mysql, up to relay-log-bin.000001
    Temporary relay log file is /home/mysql/relay-log-bin.000001
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Sat Feb  8 10:32:34 2020 - [info] Slaves settings check done.
Sat Feb  8 10:32:34 2020 - [info] 
192.168.5.203(192.168.5.203:3306) (current master)
 +--192.168.5.204(192.168.5.204:3306)
 +--192.168.5.205(192.168.5.205:3306)

Sat Feb  8 10:32:34 2020 - [info] Checking replication health on 192.168.5.204..
Sat Feb  8 10:32:34 2020 - [info]  ok.
Sat Feb  8 10:32:34 2020 - [info] Checking replication health on 192.168.5.205..
Sat Feb  8 10:32:34 2020 - [info]  ok.
Sat Feb  8 10:32:34 2020 - [info] Checking master_ip_failover_script status:
Sat Feb  8 10:32:34 2020 - [info]   /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.5.203 --orig_master_ip=192.168.5.203 --orig_master_port=3306 


IN SCRIPT TEST====/sbin/ifconfig ens33:1 down==/sbin/ifconfig ens33:1 192.168.5.200===

Checking the Status of the script.. OK 
Sat Feb  8 10:32:34 2020 - [info]  OK.
Sat Feb  8 10:32:34 2020 - [warning] shutdown_script is not defined.
Sat Feb  8 10:32:34 2020 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.
  • // Note: on the first setup, the virtual IP must be brought up manually on the master server
[root@master ~]# /sbin/ifconfig ens33:1 192.168.5.200/24
  • Start MHA and check the current MHA status
[root@manager ~]#  nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 129929     #### started

[root@manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:10832) is running(0:PING_OK), master:192.168.5.203
        ### the current master is the master node server at .203
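The master shown in that status line can also be extracted programmatically, for example from a monitoring script. This sketch is not from the original article; the `current_master` helper is hypothetical and parses masterha_check_status output on stdin.

```shell
#!/bin/sh
# current_master: extract the master IP from masterha_check_status output
current_master() {
  sed -n 's/.*master:\([0-9.]*\).*/\1/p'
}
# live usage on the manager would be:
#   masterha_check_status --conf=/etc/masterha/app1.cnf | current_master
echo 'app1 (pid:10832) is running(0:PING_OK), master:192.168.5.203' | current_master
```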

IV. Verifying the MHA Architecture

  • On the master node server, simulate a server failure
[root@master ~]# pkill -9 mysql   
  • On the manager server, check the MHA status
[root@manager ~]# masterha_check_status -conf=/etc/masterha/app1.cnf
app1 (pid:36150) is running(0:PING_OK), master:192.168.5.204
                         ### the master has switched to the slave1 node server at .204
  • The virtual IP also drifts over to the slave1 server, so the architecture's master-slave replication keeps working.
[root@slave1 ~]# ifconfig
ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.5.200  netmask 255.255.255.0  broadcast 192.168.5.255
        ether 00:0c:29:34:57:c1  txqueuelen 1000  (Ethernet)
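To put the failed node back into service, it is normally re-added as a slave of the new master. This sketch is not part of the original article: the `rejoin_sql` helper is hypothetical, it reuses the myslave replication account from section 3.3, and the file/position arguments must first be read with `show master status` on the new master (the values below are placeholders).

```shell
#!/bin/sh
# rejoin_sql: build the CHANGE MASTER statement for re-adding a recovered node,
# given the new master's IP plus its current binlog file and position
rejoin_sql() {
  printf "change master to master_host='%s',master_user='myslave',master_password='123',master_log_file='%s',master_log_pos=%s; start slave;" "$1" "$2" "$3"
}
# example with placeholder coordinates; pipe the result into mysql on the
# recovered node once its mysqld is running again:
rejoin_sql 192.168.5.204 master-bin.000001 154
```

Because the manager was started with --remove_dead_master_conf, the recovered node's [server] section also has to be added back to /etc/masterha/app1.cnf before restarting the manager.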

Reprinted from blog.csdn.net/weixin_42953006/article/details/104191565