DAY 65 MySQL High-Availability MHA Cluster

MHA overview

What is MHA

MHA (Master High Availability) is a mature software suite for failover and master-slave replication in MySQL high-availability environments.

MHA was created to solve the problem of a MySQL single point of failure.

During a failover, MHA can complete the switchover automatically within 0-30 seconds.

MHA ensures data consistency to the greatest extent possible during failover, achieving high availability in the true sense.

Composition of MHA

1) MHA Node (data node)

  • MHA Node runs on each MySQL server.

2) MHA Manager (management node)

  • MHA Manager can be deployed on an independent machine to manage multiple master-slave clusters; it can also be deployed on a slave node.

  • MHA Manager periodically probes the master node in the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master and then re-points all other slaves to it. The entire failover process is completely transparent to the application.

Features of MHA

  • During automatic failover, MHA tries to save the binary logs from the crashed master server so that data loss is kept to a minimum.

  • Using semi-synchronous replication can greatly reduce the risk of data loss: as long as at least one slave has received the latest binary log, MHA can apply that log to all the other slaves, keeping the data on every node consistent (a minimal example of enabling semi-synchronous replication follows this list).

  • MHA currently supports a one-master, multiple-slave architecture, which requires at least three servers: one master and two slaves.
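Semi-synchronous replication is only mentioned as a feature here and is not configured in this walkthrough. As a rough, optional sketch (the plugin names below are the standard MySQL 5.7 semisync plugins; adjust the credentials to your environment):

#on the master
mysql -u root -p -e "INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';"
mysql -u root -p -e "SET GLOBAL rpl_semi_sync_master_enabled = 1;"
#on each slave
mysql -u root -p -e "INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';"
mysql -u root -p -e "SET GLOBAL rpl_semi_sync_slave_enabled = 1;"
mysql -u root -p -e "STOP SLAVE IO_THREAD; START SLAVE IO_THREAD;"   #restart the IO thread so the setting takes effect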

Preparations for MySQL MHA construction

 Experiment ideas

1) MHA architecture: ① install the databases ② configure one master and two slaves ③ build MHA

2) Fault simulation: ① the master database fails ② the candidate master becomes the new master ③ the original failed master is repaired and rejoins MHA as a slave

Experiment preparation

Node server         System               Hostname   IP address        Installed tools and services
MHA manager server  CentOS 7.4 (64-bit)  manager    192.168.137.40    MHA node and manager components
Master server       CentOS 7.4 (64-bit)  mysql1     192.168.137.70    MHA node component
Slave1 server       CentOS 7.4 (64-bit)  mysql2     192.168.137.60    MHA node component
Slave2 server       CentOS 7.4 (64-bit)  mysql3     192.168.137.50    MHA node component

Build MySQL MHA

1. Configure master-slave replication

1.1 Disable the firewall and SELinux on all servers

systemctl stop firewalld 
systemctl disable firewalld
setenforce 0

1.2 Install MySQL 5.7 on the Master, Slave1, and Slave2 nodes
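No specific installation method is prescribed here; whichever way MySQL 5.7 is installed, the paths in the rest of this article assume it lives under /usr/local/mysql (this matches the soft links and the master_binlog_dir used later). A quick sanity check after installation:

/usr/local/mysql/bin/mysql -V     #confirm the client version is 5.7.x
systemctl status mysqld           #confirm the mysqld service is running
ls /usr/local/mysql/data          #data directory later referenced by master_binlog_dir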

1.3 Modify the host names of the Master, Slave1, and Slave2 nodes

#Master node
hostnamectl set-hostname mysql1
su
#Slave1 node
hostnamectl set-hostname mysql2
su
#Slave2 node
hostnamectl set-hostname mysql3
su

1.4 Add host mapping relationship in Master, Slave1, Slave2

#add the same entries on all three nodes
vim /etc/hosts
192.168.137.70 mysql1
192.168.137.60 mysql2
192.168.137.50 mysql3

 Network Connectivity Test
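To confirm that the host mappings work, each node can ping the others by the hostnames defined in /etc/hosts above, for example:

ping -c 3 mysql1
ping -c 3 mysql2
ping -c 3 mysql3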

 

1.5 Modify the MySQL main configuration file /etc/my.cnf on the Master, Slave1, and Slave2 nodes

On the Master node, enable the binary log.

On the Slave1 and Slave2 nodes, enable the binary log and the relay log.

#Master node
vim /etc/my.cnf
[mysqld]
server-id = 1
log_bin = master-bin         #enable the binary log
log-slave-updates = true     #allow the slave to write data replicated from the master into its own binary log
--------------------------------------------
systemctl restart mysqld     #restart MySQL

#Slave1 node
vim /etc/my.cnf
[mysqld]
server-id = 2               #the server-id must be different on each of the three servers
log_bin = master-bin        #enable the binary log
relay-log = relay-log-bin   #enable the relay log
relay-log-index = slave-relay-bin.index
-----------------------------------------------------
systemctl restart mysqld

#Slave2 node
vim /etc/my.cnf
[mysqld]
server-id = 3               #the server-id must be different on each of the three servers
log_bin = master-bin        #enable the binary log
relay-log = relay-log-bin   #define the relay log location and name, usually in the same directory as the binary logs
relay-log-index = slave-relay-bin.index
------------------------------------------------
systemctl restart mysqld     #restart MySQL

Restart the MySQL service on all three servers.

1.6 Create two soft links on the Master, Slave1, and Slave2 nodes

ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/
ls /usr/sbin/mysql*     #verify the soft links

 1.7 Log in to the database for authorization

Grant replication privileges for master-slave synchronization on all MySQL database nodes

grant replication slave on *.* to 'myslave'@'192.168.137.%' identified by 'abc123';
#used by the slave servers for replication

Grant the manager monitoring user privileges on all database nodes

#grant manager privileges on all database nodes
grant all privileges on *.* to 'my'@'192.168.137.%' identified by 'abc123';
#to avoid connection failures caused by hostname resolution, also grant the hostnames explicitly
grant all privileges on *.* to 'my'@'mysql1' identified by 'abc123';
grant all privileges on *.* to 'my'@'mysql2' identified by 'abc123';
grant all privileges on *.* to 'my'@'mysql3' identified by 'abc123';
flush privileges;   #reload the privilege tables

 1.8 Configure master-slave synchronization

View the binary log file and synchronization point on the Master node

#on the Master node, check the current binary log file and position (offset)
show master status;
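The File and Position values returned here are what the slaves need in the next step (in this walkthrough they are master-bin.000001 and 1743; yours will differ). A small shell helper, assuming the same root account, to read just those two fields:

mysql -u root -p -e 'show master status\G' | grep -E 'File|Position'
#File: master-bin.000001
#Position: 1743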

  Perform the synchronization operation on the Slave1 and Slave2 nodes:

#execute the synchronization operation on Slave1 and Slave2
change master to 
master_host='192.168.137.70',
master_user='myslave',
master_password='abc123',
master_log_file='master-bin.000001',
master_log_pos=1743;
------------------------------------------------
start slave;     #start replication; if an error is reported, run reset slave; and retry
------------------------------------
#check the replication status on Slave1 and Slave2
show slave status\G
#make sure both the IO and SQL threads show Yes, which means replication is working
  Slave_IO_Running: Yes
  Slave_SQL_Running: Yes

Common reasons why Slave_IO_Running shows No:

  • Network problems
  • Errors in the my.cnf configuration
  • An incorrect password, log file name, or position offset
  • The firewall has not been turned off

If the status shows Connecting, all machines need to be restored and the configuration redone; a quick way to see the exact error reported by the IO thread is shown below.
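The exact cause of the failure can be read from the error fields of the replication status, using the same mysql client as above:

mysql -u root -p -e 'show slave status\G' | grep -E 'Slave_IO_State|Last_IO_Errno|Last_IO_Error'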

Set both slave databases to read-only mode

 #both slaves must be set to read-only mode
 set global read_only=1;

 Insert data into the Master database

#insert data on the Master database to test replication
create database my; 
use my;
create table city(id int);
insert into city values (1);

 Verify data synchronization on the slave databases

#verify on the slave databases that replication succeeded
show databases;
use my;
show tables;
select * from city;

 

2. Configure MHA

2.1 Install the MHA software
Install the dependencies that MHA requires on all servers; first install the epel repository.

The MHA version differs by operating system; for the CentOS 7.4 used here, version 0.57 must be chosen.
The node component must be installed on all servers first, and the manager component is installed last on the MHA manager node, because the manager component depends on the node component.
 

#install the epel repository
yum install epel-release --nogpgcheck -y

#install the dependencies required by MHA
yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN

The epel repository and the MHA dependencies above must be installed on every server.

Next, install the node component on all servers.

#upload the installation package to the /opt/ directory, then extract and install the node component
cd /opt/
tar zxvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install 

 Finally, install the manager component on the MHA manager node

#Install the manager component on the MHA manager node (the manager component depends on the node component)
cd /opt/
tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install

 

 After the manager component is installed, several tools will be generated under /usr/local/bin, mainly including the following 

masterha_check_ssh         #check the MHA SSH configuration
masterha_check_repl        #check the MySQL replication status
masterha_manager           #script that starts the manager
masterha_check_status      #check the current MHA running status
masterha_master_monitor    #detect whether the master is down
masterha_master_switch     #control failover (automatic or manual)
masterha_conf_host         #add or remove configured server information
masterha_stop              #stop the manager

After the node component is installed, several scripts are generated under /usr/local/bin/ (these tools are usually triggered by MHA Manager scripts and do not need to be run manually), mainly the following

save_binary_logs        #save and copy the master's binary logs
apply_diff_relay_logs   #identify differential relay log events and apply them to the other slaves
filter_mysqlbinlog      #remove unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs        #purge relay logs (without blocking the SQL thread)

2.2 Configure passwordless authentication on all servers

Configure passwordless authentication to all database nodes on the manager node

ssh-keygen -t rsa             #press Enter at every prompt to generate the key pair; "-t rsa" specifies the key type
ssh-copy-id 192.168.137.70    #copy the public key to all database nodes to enable passwordless login
ssh-copy-id 192.168.137.60
ssh-copy-id 192.168.137.50

 Configure passwordless authentication to the database nodes mysql2 and mysql3 on the mysql1 master node server

ssh-keygen -t rsa
#copy the public key to the two slave nodes to enable passwordless login
ssh-copy-id 192.168.137.60
ssh-copy-id 192.168.137.50

 Configure passwordless authentication to database nodes mysql1 and mysql3 on mysql2, namely slave1

ssh-keygen -t rsa
ssh-copy-id 192.168.137.70 #master server
ssh-copy-id 192.168.137.50 #slave2 server

 Configure passwordless authentication to database nodes mysql1 and mysql2 on mysql3

ssh-keygen -t rsa
ssh-copy-id 192.168.137.70 #master server
ssh-copy-id 192.168.137.60 #slave1 server

 2.3 Configure MHA on the manager node

Copy the relevant scripts to the /usr/local/bin/ directory on the manager node

#On the manager node, copy the sample scripts to the /usr/local/bin directory
cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
#after copying there are four executable scripts
ll /usr/local/bin/scripts/
------------------------------------------------------
 master_ip_failover          #VIP management script used during automatic failover
 master_ip_online_change     #VIP management script used during online (manual) switchover
 power_manager               #script that shuts down the host after a failure
 send_report                 #script that sends an alert report after a failover

Copy master_ip_failover (the VIP management script used during automatic failover) to the /usr/local/bin directory; this script is used to manage the VIP during failover.

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

Modify the /usr/local/bin/master_ip_failover script: delete the original content and replace it with the script below.

#Edit as follows (delete the original content, paste the script, and adjust the VIP-related parameters; run :set paste in vim before pasting to avoid indentation problems)
vim /usr/local/bin/master_ip_failover
----------------------------------------------
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command, $ssh_user, $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip, $new_master_port
);
############################# added content #############################
my $vip = '192.168.137.10';                              #VIP address
my $brdc = '192.168.137.255';                            #VIP broadcast address
my $ifdev = 'ens33';                                     #NIC the VIP is bound to
my $key = '1';                                           #serial number of the virtual interface the VIP is bound to
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";    #this evaluates to: ifconfig ens33:1 192.168.137.10
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";     #this evaluates to: ifconfig ens33:1 down
my $exit_code = 0;                                       #default exit status code
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##########################################################################
GetOptions(
    'command=s' => \$command,
    'ssh_user=s' => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s' => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s' => \$new_master_host,
    'new_master_ip=s' => \$new_master_ip,
    'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ( $command eq "stop" || $command eq "stopssh" ) {
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
## A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
    "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
 


2.4 Edit the configuration file on the manager node to manage the MySQL node servers

Create the MHA software directory, copy in the configuration file, and use the app1.cnf configuration file to manage the MySQL node servers (a sketch of these commands follows).
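A minimal sketch of this step, assuming the sample configuration shipped in the extracted manager source tree (samples/conf/app1.cnf) and the working/log directory referenced in app1.cnf below:

mkdir -p /etc/masterha
cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/
mkdir -p /var/log/masterha/app1     #manager work and log directory used in app1.cnf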

Edit app1.cnf: delete the original content and replace it with the following (adjust the IP addresses to your nodes).

vim /etc/masterha/app1.cnf
 
[server default]
manager_log=/var/log/masterha/app1/manager.log            #manager log file
manager_workdir=/var/log/masterha/app1                    #manager working directory
master_binlog_dir=/usr/local/mysql/data                   #location where the master stores its binary logs; this path must match the binlog path configured on the master so that MHA can find them
master_ip_failover_script=/usr/local/bin/master_ip_failover #switching script used during automatic failover, i.e. the script edited above
master_ip_online_change_script=/usr/local/bin/master_ip_online_change #switching script used during manual online switchover
password=abc123  #password of the MySQL monitoring user created earlier
user=my          #MySQL monitoring user created earlier
ping_interval=1  #interval in seconds between ping probes sent to the master (default is 3); failover is triggered automatically after three attempts with no response
remote_workdir=/tmp #directory on the remote MySQL servers where binlogs are saved during a switchover
repl_password=abc123 #password of the replication user created earlier
repl_user=myslave  #replication user created earlier
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.137.60 -s 192.168.137.50   #slave server IP addresses used for the secondary check
shutdown_script=""  #script to shut down the failed host after a failure (its main purpose is to prevent split-brain; not used here)
ssh_user=root       #SSH login user

[server1]
hostname=192.168.137.70
port=3306

[server2]
candidate_master=1
#set as the candidate master; with this parameter, this slave will be promoted to master after a master-slave switchover, even if it is not the most up-to-date slave in the cluster

check_repl_delay=0
#by default, MHA will not choose a slave as the new master if it lags behind the master by more than 100 MB of relay logs, because recovering that slave would take a long time; setting check_repl_delay=0 makes MHA ignore replication delay when selecting the new master. This is useful for hosts with candidate_master=1, which must become the new master during a switchover
hostname=192.168.137.60
port=3306
 
[server3]
hostname=192.168.137.50
port=3306

2.5 For the first configuration, manually bring up the virtual IP on the Master node

 /sbin/ifconfig ens33:1 192.168.137.10/24

2.6 Test ssh passwordless authentication on the manager node

Test SSH passwordless authentication on the manager node; if everything is normal, the output ends with a success message, as shown below.

masterha_check_ssh -conf=/etc/masterha/app1.cnf
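When the SSH trust relationships are correct, the check ends with a success line along these lines (exact wording may vary between versions):

#[info] All SSH connection tests passed successfully.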

 2.7 Test the mysql master-slave connection on the manager node

Test the MySQL master-slave replication status on the manager node; the words "MySQL Replication Health is OK" at the end of the output indicate that everything is normal, as shown below.

masterha_check_repl -conf=/etc/masterha/app1.cnf
 

2.8 Start MHA on the manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
--remove_dead_master_conf   #when a master-slave switchover occurs, the IP of the old master is removed from the configuration file
--manger_log                #log file location
--ignore_last_failover      #by default, if MHA detects two consecutive outages less than 8 hours apart, it will not perform another failover; this limit exists to avoid a ping-pong effect (repeated switching leading to split-brain). After a failover, MHA records it in the app1.failover.complete file, and another failover will not be triggered while that file exists in the working directory unless it is deleted first. For convenience, --ignore_last_failover is set here to ignore the file produced by the previous switchover.

 ●Running a program with & only: output still goes to the terminal; the program is immune to the SIGINT signal sent by Ctrl+C, but terminates when the session is closed and SIGHUP is sent.
 ●Running a program with nohup only: output goes to nohup.out by default; the program terminates on the SIGINT signal sent by Ctrl+C, but is immune to the SIGHUP signal sent when the session is closed.
 ●Starting a program with both nohup and &, e.g. nohup ./test &: the program is immune to both SIGINT and SIGHUP.
 

2.9 View the MHA status and the MHA log on the manager node; both show the address of the current master

#check the MHA status; the current master is the Mysql1 node
masterha_check_status --conf=/etc/masterha/app1.cnf

#check the MHA log; it also shows that the current master is 192.168.137.70
cat /var/log/masterha/app1/manager.log | grep "current master"
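When the manager is running normally, masterha_check_status reports something along these lines (the PID is illustrative; the exact format may vary by version):

#app1 (pid:12345) is running(0:PING_OK), master:192.168.137.70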

2.10 Check whether the VIP address 192.168.137.10 exists on Mysql1

Check whether the VIP address 192.168.137.10 exists on Mysql1; this VIP address will not disappear just because the manager node stops the MHA service.

ifconfig
#to stop the manager service, use the following command
masterha_stop --conf=/etc/masterha/app1.cnf
#alternatively, kill the manager's process ID directly

 3. Fault simulation

Stop the MySQL service on Mysql1. MHA will automatically modify the app1.cnf file, removing the failed mysql1 node, and mysql2 will automatically take over the VIP and become the new master.

3.1 Stop the mysql service on the Master node Mysql1

systemctl stop mysqld
#or
pkill -9 mysql

3.2 Monitor the log records on the manager node and observe that the manager elects mysql2 as the new master server

tail -f /var/log/masterha/app1/manager.log
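Once the failover finishes, it can be verified on mysql2 (the new master) that it now holds the VIP and that the failed node has been removed from the configuration, for example:

ifconfig ens33:1             #should now show the VIP 192.168.137.10 on mysql2
cat /etc/masterha/app1.cnf   #the entry for 192.168.137.70 is gone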
 ​

4. Troubleshooting

4.1 Repair mysql1 (that is, repair the original primary node)

systemctl restart mysqld

4.2 Repair master-slave data

On the new master server Mysql2, view the binary log file and synchronization point

show master status;

4.3 Execute the synchronization operation on the original master server mysql1 to replicate the data now in the current master

#on the original master server mysql1, execute the synchronization operation to replicate the data from the current master
change master to master_host='192.168.137.60',master_user='myslave',master_password='abc123',master_log_file='master-bin.000001',master_log_pos=1743;

start slave; #start the slave
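It can then be confirmed on mysql1 that replication from the new master is healthy, using the same check as in step 1.8 (assuming the same root account):

mysql -u root -p -e 'show slave status\G' | grep -E 'Slave_IO_Running|Slave_SQL_Running'
#both threads should report Yes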

4.4 Modify the configuration file app1.cnf on the manager node

Re-add the records for the three MySQL node servers, because MHA automatically removes the failed master's entry from the file when it detects the failure.

Add mysql1 as a new candidate master.

vim /etc/masterha/app1.cnf
......
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.137.70 -s 192.168.137.50
......
[server1]
hostname=192.168.137.60
port=3306
 ​
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.137.70
port=3306
 ​
[server3]
hostname=192.168.137.50
port=3306

4.5 Start MHA on the manager node

 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
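After the manager starts, its status can be checked again as in step 2.9 to confirm that it is monitoring and that mysql2 is now reported as the master:

masterha_check_status --conf=/etc/masterha/app1.cnf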


Origin blog.csdn.net/weixin_57560240/article/details/130795794