MySQL-MHA high availability configuration and failover

1. Overview of MHA:

1. Overview:

(1) MHA (Master High Availability) is a mature software suite for failover and master-slave replication in MySQL high-availability environments.
(2) MHA exists to solve the problem of the MySQL single point of failure.
(3) During a MySQL failover, MHA can complete the switchover automatically within 0-30 seconds.
(4) MHA ensures data consistency to the greatest extent possible during failover, achieving high availability in the true sense.

2. Composition of MHA:

(1) MHA Node (data node)
MHA Node runs on each MySQL server.

(2) MHA Manager (management node)
MHA Manager can be deployed on an independent machine to manage multiple master-slave clusters; it can also be deployed on a slave node.
MHA Manager will regularly detect the master node in the cluster. When the master fails, it can automatically promote the slave with the latest data as the new master, and then re-point all other slaves to the new master. The entire failover process is completely transparent to the application.

3. The characteristics of MHA:

(1) During automatic failover, MHA tries to save the binary log from the crashed master server, ensuring that data loss is minimized.
(2) Using semi-synchronous replication can greatly reduce the risk of data loss: as long as one slave has received the latest binary log, MHA can apply it to all the other slaves, so the data consistency of all nodes can be guaranteed.
(3) Currently, MHA supports a one-master-multiple-slave architecture, with at least three servers: one master and two slaves.

4. The working principle of MHA:

(1) Save binary log events (binlog events) from the crashed master

(2) Identify the slave log with the latest update

(3) Apply the differential relay log (relay log) to other slaves

(4) Apply the binary log events saved from the master

(5) Promote a new slave to become the new master

(6) Use other slaves to connect to the new master for replication
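Steps (1) and (4) come down to standard binlog tooling. The dry-run sketch below only prints the commands it would execute; the paths and start position are hypothetical, and in a real failover MHA's own save_binary_logs / apply_diff_relay_logs tools do this work:

```shell
# Dry-run sketch of saving and replaying binlog events from a crashed master.
# BINLOG_DIR and LAST_POS are hypothetical example values.
BINLOG_DIR=/usr/local/mysql/data
LAST_POS=154   # last position already applied on the most up-to-date slave

run() { echo "would run: $*"; }   # dry run: print the command instead of executing it

# step (1)/(4): extract events after LAST_POS, then replay them on the new master
run mysqlbinlog --start-position=$LAST_POS $BINLOG_DIR/master-bin.000002 \> /tmp/saved_binlog.sql
run mysql -uroot -p \< /tmp/saved_binlog.sql
```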

2. Build MySQL MHA:

  • MHA: during failover, save as much data as possible and keep the data of all nodes consistent.

  • Graphic analysis: (architecture diagram omitted)

1. Configure master-slave replication:

MHA manager node server: 192.168.174.12
master node server: 192.168.174.15
slave1 node server: 192.168.174.18
slave2 node server: 192.168.174.19

1. Turn off the firewall, security mechanism (all servers):

systemctl stop firewalld
systemctl disable firewalld
setenforce 0

2. Modify the host name:

hostnamectl set-hostname manager
su
hostnamectl set-hostname master
su
hostnamectl set-hostname slave1
su
hostnamectl set-hostname slave2
su

3. Add host mapping relationship in Master, Slave1, Slave2

#add the same entries on all three servers
vim /etc/hosts
192.168.174.15 master
192.168.174.18 slave1
192.168.174.19 slave2
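A quick way to confirm the mappings took effect is to check the file programmatically. A small sketch (hosts_ok is a hypothetical helper; it reads an /etc/hosts-style file on stdin and looks for exactly the three entries above):

```shell
# Succeeds only if all three IP/hostname mappings are present.
hosts_ok() {
  local content
  content=$(cat)
  for entry in "192.168.174.15 master" "192.168.174.18 slave1" "192.168.174.19 slave2"; do
    printf '%s\n' "$content" | grep -q "^${entry}$" || return 1
  done
}

# usage: hosts_ok < /etc/hosts && echo "host mappings OK"
```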


4. Modify the Mysql master configuration file /etc/my.cnf of the Master, Slave1, and Slave2 nodes

##Master node##
vim /etc/my.cnf
[mysqld]
server-id = 1
log_bin = master-bin
log-slave-updates = true
 
systemctl restart mysqld



##Slave1 node##
 vim /etc/my.cnf
 [mysqld]
 server-id = 2               #the server-id must differ on each of the three servers
 log_bin = master-bin
 relay-log = relay-log-bin
 relay-log-index = slave-relay-bin.index
 relay_log_recovery=1
 
 systemctl restart mysqld


 ##Slave2 node##
 vim /etc/my.cnf
 [mysqld]
 server-id = 3               #the server-id must differ on each of the three servers
 log_bin = master-bin
 relay-log = relay-log-bin
 relay-log-index = slave-relay-bin.index
 relay_log_recovery=1
 
 
 systemctl restart mysqld     #restart mysql


5. Create two soft links on the Master, Slave1, and Slave2 nodes:

ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/

6. Configure mysql one master and two slaves:

(1) Create the authorized accounts on all three MySQL servers:

mysql -uroot -pabc123
grant replication slave on *.* to 'myslave'@'192.168.174.%' identified by '123456';		
#used by the slaves for replication
grant all privileges on *.* to 'mha'@'192.168.174.%' identified by 'mhamanager';		#used by manager
 
grant all privileges on *.* to 'mha'@'master' identified by 'mhamanager';				
#prevents the slaves from failing to connect to the master by hostname
grant all privileges on *.* to 'mha'@'slave1' identified by 'mhamanager';
grant all privileges on *.* to 'mha'@'slave2' identified by 'mhamanager';
flush privileges;

(2) View binary files and synchronization points on the Master node:

show master status;


(3) Perform synchronization operations on Slave01 and Slave02 nodes:

change master to master_host='192.168.174.15',master_user='myslave',master_password='123456',master_log_file='master-bin.000001',master_log_pos=1749;
start slave;

(4) View data synchronization results on Slave01 and Slave02 nodes:

show slave status\G		
//make sure both the IO and SQL threads show Yes, which means replication is working
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
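To check the two thread flags from a script rather than by eye, the `show slave status\G` output can be parsed. A sketch (repl_ok is a hypothetical helper; the account and password are the ones used in this article):

```shell
# Returns success only when both replication threads report Yes.
# Reads the output of `show slave status\G` on stdin.
repl_ok() {
  local out io sql
  out=$(cat)
  io=$(printf '%s\n' "$out"  | awk -F': ' '/Slave_IO_Running:/  {print $2}')
  sql=$(printf '%s\n' "$out" | awk -F': ' '/Slave_SQL_Running:/ {print $2}')
  [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]
}

# usage (on slave1 / slave2):
#   mysql -uroot -pabc123 -e 'show slave status\G' | repl_ok && echo "replication OK"
```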


(5) Two slave libraries must be set to read-only mode:

set global read_only=1;


(6) Insert data on the master to test database synchronization:

create database test;
use test;
create table test(id int);
insert into test(id) values (1);


2. Configure MHA:

(1) Install the dependencies MHA needs on all servers; first install the epel source:

yum install epel-release --nogpgcheck -y

yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN

(2) To install the MHA software, first install the node component on all servers.
The version to use differs by operating system; for CentOS 7.6 we use version 0.57.
The node component must be installed on all servers first, and the manager component is installed last on the MHA manager node, because the manager depends on the node component.

cd /opt
tar zxvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install

(3) Install the manager component on the MHA manager node

cd /opt
tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install

3. Manager and node tool use:

(1) Tool introduction:

#After the manager component is installed, several tools are generated under /usr/local/bin, mainly the following:
masterha_check_ssh          checks MHA's SSH configuration
masterha_check_repl         checks the MySQL replication status
masterha_manager            starts the manager script
masterha_check_status       checks the current MHA running status
masterha_master_monitor     detects whether the master is down
masterha_master_switch      controls failover (automatic or manual)
masterha_conf_host          adds or deletes configured server information
masterha_stop               stops the manager

#After the node component is installed, several scripts are generated under /usr/local/bin (these tools are usually triggered by the MHA Manager scripts and need no manual operation), mainly the following:
save_binary_logs            saves and copies the master's binary log
apply_diff_relay_logs       identifies differential relay log events and applies the differences to other slaves
filter_mysqlbinlog          removes unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs            purges relay logs (without blocking the SQL thread)
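Of these, purge_relay_logs is the one commonly scheduled by hand, because MHA deployments usually disable automatic relay-log purging on the slaves. A possible crontab entry for each slave node (a sketch; the credentials are the mha account created above, and the log path is an assumption):

```shell
# crontab -e on each slave node: purge relay logs daily at 04:00
0 4 * * * /usr/local/bin/purge_relay_logs --user=mha --password=mhamanager --disable_relay_log_purge >> /var/log/masterha/purge_relay_logs.log 2>&1
```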

(2) Configure passwordless authentication on all servers:

① Configure passwordless authentication to all database nodes on the manager node:

ssh-keygen -t rsa 				#press Enter at each prompt
ssh-copy-id 192.168.174.15
ssh-copy-id 192.168.174.18
ssh-copy-id 192.168.174.19

② Configure passwordless authentication to database nodes slave01 and slave02 on the master:

ssh-keygen -t rsa
ssh-copy-id 192.168.174.18
ssh-copy-id 192.168.174.19

③ Configure passwordless authentication to the database nodes master and slave02 on slave01:

ssh-keygen -t rsa
ssh-copy-id 192.168.174.15
ssh-copy-id 192.168.174.19

④ Configure passwordless authentication to the database nodes master and slave01 on slave02:

ssh-keygen -t rsa
ssh-copy-id 192.168.174.15
ssh-copy-id 192.168.174.18

4. Configure MHA on the manager node:

(1) Copy the relevant scripts to the /usr/local/bin directory on the manager node:

cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
//after copying there will be four executable files
ll /usr/local/bin/scripts/
----------------------------------------------------------------------------------------------------------
master_ip_failover  		#script that manages the VIP during automatic failover
master_ip_online_change 	#manages the VIP during an online switch
power_manager 				#script that shuts down the host after a failure occurs
send_report 				#script that sends an alert after a failover
----------------------------------------------------------------------------------------------------------

(2) Copy the master_ip_failover script (the VIP management script used during automatic switching) to the /usr/local/bin directory; it manages the VIP and the failover:

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

(3) Modify the script as follows (delete the original content, paste this in, and adjust the VIP-related parameters):

vim /usr/local/bin/master_ip_failover
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
$command, $ssh_user, $orig_master_host, $orig_master_ip,
$orig_master_port, $new_master_host, $new_master_ip, $new_master_port);
my $vip = '192.168.174.200';								#the VIP address
my $brdc = '192.168.174.255';								#the VIP's broadcast address
my $ifdev = 'ens33';										#the network interface the VIP is bound to
my $key = '1';												#the sequence number of the virtual interface the VIP is bound to
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";		#i.e. ifconfig ens33:1 192.168.174.200
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";		#i.e. ifconfig ens33:1 down
my $exit_code = 0;											#default exit status code 0
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {

my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}
## A simple system call that disables the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

(4) Create the MHA configuration directory and copy the sample configuration file; here the app1.cnf configuration file is used to manage the MySQL node servers:

mkdir /etc/masterha
cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha

vim /etc/masterha/app1.cnf						#delete the original content, paste this, and adjust the node servers' IP addresses
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=mhamanager
ping_interval=1
remote_workdir=/tmp
repl_password=123456
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.174.18 -s 192.168.174.19
shutdown_script=""
ssh_user=root
user=mha

[server1]
hostname=192.168.174.15
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.174.18
port=3306

[server3]
hostname=192.168.174.19
port=3306

----------------------------------------------------------------------------------------------------------
[server default]
manager_log=/var/log/masterha/app1/manager.log      #manager log file
manager_workdir=/var/log/masterha/app1              #manager working directory
master_binlog_dir=/usr/local/mysql/data/            #location where the master stores its binlogs; must match the binlog path configured on the master so MHA can find them
master_ip_failover_script=/usr/local/bin/master_ip_failover  #switch script used during automatic failover, i.e. the script above
master_ip_online_change_script=/usr/local/bin/master_ip_online_change  #switch script used during a manual (online) switch
password=mhamanager			#password of the mha monitoring user created earlier
ping_interval=1				#interval in seconds between ping probes sent to the master (default 3); after three failed attempts MHA triggers failover automatically
remote_workdir=/tmp			#location where the remote MySQL server saves the binlog during a switch
repl_password=123456		#password of the replication user
repl_user=myslave			#name of the replication user
report_script=/usr/local/send_report     #script that sends an alert after a switch
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.174.18 -s 192.168.174.19	#IP addresses of the slave servers used for secondary checks
shutdown_script=""			#script to shut down the failed host after a failure (its main purpose is to prevent split-brain; not used here)
ssh_user=root				#SSH login user
user=mha					#the monitoring user

[server1]
hostname=192.168.174.15
port=3306

[server2]
hostname=192.168.174.18
port=3306
candidate_master=1
#designates this slave as a candidate master: after a master-slave switch, this slave will be promoted to master, even if it is not the most up-to-date slave in the cluster

check_repl_delay=0
#by default, MHA will not pick a slave as the new master if it lags behind the master by more than 100MB of relay logs, because recovering it would take a long time; with check_repl_delay=0, MHA ignores replication delay when choosing the new master. This parameter is very useful for a host with candidate_master=1, because that candidate must become the new master during a switch

[server3]
hostname=192.168.174.19
port=3306
----------------------------------------------------------------------------------------------------------

5. For the first configuration, you need to manually enable the virtual IP on the Master node:

/sbin/ifconfig ens33:1 192.168.174.200/24
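Whether the VIP is actually bound can be checked programmatically rather than by scanning ifconfig output by eye. A sketch (vip_present is a hypothetical helper that reads `ip addr show` output on stdin):

```shell
# Succeeds when the given VIP appears as an inet address in `ip addr` output.
vip_present() {
  # $1 = VIP to look for; interface output is read from stdin
  grep -q "inet ${1}[/ ]"
}

# usage: ip addr show ens33 | vip_present 192.168.174.200 && echo "VIP is up"
```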

6. Test ssh on the manager node:

masterha_check_ssh -conf=/etc/masterha/app1.cnf


7. Test the mysql master-slave connection on the manager node:

masterha_check_repl -conf=/etc/masterha/app1.cnf


8. Start MHA on the manager node:

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

#this (nohup ... &) is also how Java services are commonly started in production


9. View MHA status:

masterha_check_status --conf=/etc/masterha/app1.cnf


10. Check the log:

cat /var/log/masterha/app1/manager.log | grep "current master"


11. View master VIP:

Check whether the master's VIP address 192.168.174.200 exists. This VIP address does not disappear just because the manager node stops the MHA service.
ifconfig


3. Fault simulation:

1. Monitor and observe log records on the manager node:

tail -f /var/log/masterha/app1/manager.log


2. Stop the MySQL service on the master node:

systemctl stop mysqld
or
pkill -9 mysql

3. Check whether the VIP has drifted:

After a normal automatic switch, the MHA process exits. MHA automatically modifies the contents of app1.cnf and removes the crashed master node. Check whether slave1 has taken over the VIP:
ifconfig

Algorithm for electing the standby master:
1. Usually the slaves are compared by replication position/GTID to judge which is most up to date. If the data differs, the slave closest to the master becomes the candidate master.
2. If the data is consistent, the standby master is selected in the order of the configuration file.
3. If a weight is set (candidate_master=1), the candidate master is forcibly designated according to the weight.
(1) By default, if a slave lags behind the master by more than 100MB of relay logs, it will not be chosen even if it has the weight.
(2) With check_repl_delay=0, it is forcibly selected as the standby master even if it lags far behind.
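The rules above can be illustrated with a toy function (not MHA's real code): each input line is `host position candidate_flag`; a flagged host wins outright (as with candidate_master=1 plus check_repl_delay=0), otherwise the host with the largest binlog position wins.

```shell
# Toy illustration of standby-master election; NOT MHA's actual algorithm.
# stdin lines: "<host> <binlog_position> <candidate_master_flag>"
pick_master() {
  awk '
    $3 == 1 { print $1; found = 1; exit }   # forced by weight
    { hosts[NR] = $1; pos[NR] = $2 }
    END {
      if (!found) {                          # otherwise: most up-to-date slave
        best = 1
        for (i = 2; i <= NR; i++) if (pos[i] + 0 > pos[best] + 0) best = i
        print hosts[best]
      }
    }'
}

# example: printf "slave1 1749 0\nslave2 1500 0\n" | pick_master
```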

4. Fault recovery:

(1) Restore MySQL on the failed original master:

systemctl restart mysqld

(2) Repair the master-slave replication:
View the binary log file and synchronization point on the current master server (the promoted slave1):

show master status;
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000002 |      154 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+

(3) Execute the synchronization operation on the original master server (192.168.174.15), pointing it at the new master:

change master to master_host='192.168.174.18',master_user='myslave',master_password='123456',master_log_file='master-bin.000002',master_log_pos=154;

start slave;
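To avoid typos when retyping the replication coordinates after a failover, the CHANGE MASTER statement can be generated from shell variables. A sketch (make_change_master is a hypothetical helper; the file and position values must be read off the new master with `show master status`):

```shell
# Build the CHANGE MASTER statement from the values read off the new master.
make_change_master() {
  # $1 host, $2 user, $3 password, $4 log file, $5 log position
  printf "change master to master_host='%s',master_user='%s',master_password='%s',master_log_file='%s',master_log_pos=%s;" \
    "$1" "$2" "$3" "$4" "$5"
}

# usage on the old master:
#   STMT=$(make_change_master 192.168.174.18 myslave 123456 master-bin.000002 154)
#   mysql -uroot -pabc123 -e "stop slave; $STMT start slave;"
```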

(4) Modify the configuration file app1.cnf on the manager node (re-add the server1 record, because MHA removed it automatically when it detected the failure):

vim /etc/masterha/app1.cnf
......
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.174.18 -s 192.168.174.19
......
[server1]
hostname=192.168.174.15
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.174.18
port=3306

[server3]
hostname=192.168.174.19
port=3306

(5) Start MHA on the manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

4. Summary:

  • MHA (Master High Availability) is a high availability (HA) solution based on MySQL. The function of MHA is to ensure the high availability and continuity of MySQL service by automatically switching to the backup server when the main server fails.

1. An MHA deployment involves three main parts: the MySQL servers (master and replicas), the MHA Node component, and the MHA Manager component. The following are the steps for high-availability configuration and failover with MHA:

2. Install MHA Node on all MySQL servers and MHA Manager (plus its dependencies) on the manager host. MHA is a set of Perl scripts that depend on Perl modules (DBD::mysql, Config::Tiny, Log::Dispatch, Parallel::ForkManager, etc.) and an SSH client.

3. Write the MHA Manager configuration file (app1.cnf in this article). The configuration file needs to include the connection information of the MySQL master and replica servers, the SSH connection information, and so on.

4. Configure the SSH connection to ensure that the Master Server and Replica Server can SSH to each other. This can be achieved by configuring the SSH public key and private key.

5. Start the MHA Manager. It can be started with a command such as:
masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf
Configure the failover behavior of MHA Manager: a custom switch script can be plugged in via the master_ip_failover_script parameter.

6. Test the failover function of MHA. You can simulate the failure of the Master Server to check whether the MHA can automatically switch to the Replica Server.

7. It should be noted that before using MHA and failover, all data should be backed up to prevent accidental data loss and damage.

In general, MHA is a reliable and stable high-availability solution that can effectively improve the availability and continuity of MySQL services. Configuring MHA and performing failover requires certain experience in system management and MySQL technology. It is recommended to carefully understand and study relevant technical documents and best practices in actual use to ensure high availability and data security.

Origin blog.csdn.net/Riky12/article/details/131903695