MySQL MHA high-availability cluster deployment and failover (detailed graphic and text)

MySQL MHA

1. MHA concept

MHA (Master High Availability) is a mature failover and master-promotion tool for MySQL high-availability environments.
MHA was created to solve the single point of failure problem in MySQL.
When a MySQL master fails, MHA can complete the failover automatically within 0-30 seconds.
During failover, MHA preserves data consistency as far as possible, which is what makes the cluster highly available in practice.

1. The composition of MHA

●MHA Node (data node)
MHA Node runs on each MySQL server.

●MHA Manager (management node)
MHA Manager can be deployed on a separate machine to manage multiple master-slave clusters; it can also be deployed on a slave node.
MHA Manager periodically probes the master node of the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves to it. The entire failover process is transparent to the application.

2. Features of MHA

●During automatic failover, MHA tries to save the binary logs from the failed master so that as little data as possible is lost
●Using semi-synchronous replication greatly reduces the risk of data loss. If only one slave has received the latest binary log, MHA can apply it to all other slaves, which keeps the data on all nodes consistent.
●MHA currently supports a one-master, multiple-slave architecture with at least three servers, that is, one master and two slaves

2. Build MySQL+MHA

Experimental ideas

1. MHA architecture
Database installation
One master and two slaves
MHA construction

2. Fault simulation
Simulate a failure of the master.
The candidate master becomes the new master.
The original master is repaired and rejoins MHA as a slave.

Experimental environment, installation package

mha4mysql-manager-0.57.tar.gz, mha4mysql-node-0.57.tar.gz

Shell script one-click deployment-source code compile and install MySQL

Host        Operating system  IP address      Installation package/software/tool
Mhamanager  CentOS7-3         192.168.184.10  MHA node component, MHA manager component
mysql1      CentOS7-3         192.168.184.20  mysql-boost-5.7.20.tar.gz, MHA node component
mysql2      CentOS7-3         192.168.184.30  mysql-boost-5.7.20.tar.gz, MHA node component
mysql3      CentOS7-3         192.168.184.40  mysql-boost-5.7.20.tar.gz, MHA node component

Check that MySQL is listening on port 3306 on the database nodes:

netstat -natp | grep 3306


1. On all servers, stop the system firewall and switch SELinux to permissive mode

systemctl stop firewalld
systemctl disable firewalld
setenforce 0


2. Set the hostnames of the master (192.168.184.20), Slave1 (192.168.184.30) and Slave2 (192.168.184.40) nodes

# On the master (192.168.184.20)
hostnamectl set-hostname mysql1
su -

# On Slave1 (192.168.184.30)
hostnamectl set-hostname mysql2
su -

# On Slave2 (192.168.184.40)
hostnamectl set-hostname mysql3
su -


3. Modify the main configuration file /etc/my.cnf of the three MySQL servers

Master: mysql1 (192.168.184.20)

vim /etc/my.cnf
[mysqld]
server-id = 20
log_bin = master-bin
log-slave-updates = true

systemctl restart mysqld
ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/


Slaves: mysql2 (192.168.184.30) and mysql3 (192.168.184.40)

vim /etc/my.cnf
server-id = 30
#server-id = 40    # use 40 on mysql3; the server-id must be different on all three servers
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

systemctl restart mysqld
ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/


4. Configure MySQL as one master and two slaves

Grant privileges on all MySQL servers (master and slaves)

mysql -e "grant replication slave on *.* to 'myslave'@'192.168.184.%' identified by '123123';"
mysql -e "grant all privileges on *.* to 'mha'@'192.168.184.%' identified by 'manager';"
mysql -e "grant all privileges on *.* to 'mha'@'mysql1' identified by 'manager';"
mysql -e "grant all privileges on *.* to 'mha'@'mysql2' identified by 'manager';"
mysql -e "grant all privileges on *.* to 'mha'@'mysql3' identified by 'manager';"


View the binary log file and position on the master node

mysql -e "show master status;"

Example: the binary log file name and position will probably differ in your environment; note your own values.
+-------------------+----------+--------------+------------------+-------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+-------------------+----------+--------------+------------------+-------------------+
| master-bin.000001 |     4193 |              |                  |                   |
+-------------------+----------+--------------+------------------+-------------------+


Perform synchronization operations on Slave1 and Slave2 nodes

mysql -e "change master to master_host='192.168.184.20',master_user='myslave',master_password='123123',master_log_file='master-bin.000001',master_log_pos=4193;"

mysql -e "start slave;" 
mysql -e "show slave status\G" | awk '/Running:/{print}'
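
On a healthy slave the awk filter above prints Slave_IO_Running: Yes and Slave_SQL_Running: Yes. A minimal sketch of scripting that check follows; slave_threads_ok is a hypothetical helper name, not part of MySQL or MHA, and it only reads text in the format printed by show slave status\G.

```shell
# slave_threads_ok: hypothetical helper; reads "show slave status\G" output on stdin.
# Succeeds only when both replication threads report "Yes".
slave_threads_ok() {
    awk '
        /Slave_IO_Running:/  { io  = $2 }
        /Slave_SQL_Running:/ { sql = $2 }
        END { exit !(io == "Yes" && sql == "Yes") }
    '
}

# Usage on a slave:
#   mysql -e "show slave status\G" | slave_threads_ok && echo "replication OK"
```

If the function fails, inspect the full show slave status\G output; the Last_IO_Error and Last_SQL_Error fields usually explain why a thread is not running.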


Set the Slave1 and Slave2 nodes to read-only mode

mysql -e "set global read_only=1;"


5. Master-slave replication verification

Create a database on the master.

Query the databases on the slaves to verify the result.
If the new database appears on both slaves, master-slave replication is working.

6. Install MHA software

1. On all servers, install the packages that MHA depends on. Install the EPEL repository first; this requires online yum repositories.

Then install the node component on all servers

mv /etc/yum.repos.d/repos.bak/CentOS-* /etc/yum.repos.d/
yum list

yum install epel-release --nogpgcheck -y

yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN

Place the mha4mysql-node-0.57.tar.gz package in the /opt directory

cd /opt
tar zxvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install

The MHA version to use depends on the operating system version; on CentOS 7.4, use version 0.57.
The node component must be installed on all servers first, and finally the manager component on the MHA-manager node, because the manager depends on the node component.

2. Install the manager component on the MHA manager node

Place the mha4mysql-manager-0.57.tar.gz package in the /opt directory

cd /opt
tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install

After the manager component is installed, several tools are generated under /usr/local/bin, mainly the following:

masterha_check_ssh       Check the SSH configuration of MHA
masterha_check_repl      Check the MySQL replication status
masterha_manager         Start the manager
masterha_check_status    Check the current MHA running status
masterha_master_monitor  Check whether the master is down
masterha_master_switch   Control failover (automatic or manual)
masterha_conf_host       Add or delete configured server information
masterha_stop            Stop the manager

After the node component is installed, several scripts are also generated under /usr/local/bin. They are normally invoked by the MHA Manager scripts and need no manual operation. The main ones are:

save_binary_logs Save and copy master's binary log
apply_diff_relay_logs Identify different relay log events and apply their different events to other slaves
filter_mysqlbinlog Remove unnecessary ROLLBACK events



7. Configure passwordless authentication on all servers

1. Configure passwordless authentication to all [database nodes] on the manager (192.168.184.10) node

ssh-keygen -t rsa    # press Enter at every prompt
ssh-copy-id 192.168.184.20
ssh-copy-id 192.168.184.30
ssh-copy-id 192.168.184.40


2. Configure passwordless authentication to the database nodes mysql2 (192.168.184.30) and mysql3 (192.168.184.40) on mysql1 (192.168.184.20)

ssh-keygen -t rsa
ssh-copy-id 192.168.184.30
ssh-copy-id 192.168.184.40

3. Configure passwordless authentication to the database nodes mysql1 (192.168.184.20) and mysql3 (192.168.184.40) on mysql2 (192.168.184.30)

ssh-keygen -t rsa
ssh-copy-id 192.168.184.20
ssh-copy-id 192.168.184.40

4. Configure passwordless authentication to the database nodes mysql1 (192.168.184.20) and mysql2 (192.168.184.30) on mysql3 (192.168.184.40)

ssh-keygen -t rsa
ssh-copy-id 192.168.184.20
ssh-copy-id 192.168.184.30
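
Steps 2 to 4 follow one pattern: each database node copies its key to the other two. As a rough sketch, that pattern can be written once; NODES and peers_of are hypothetical names used only for illustration, and the loop merely prints the commands rather than running them.

```shell
# Assumption: the three database node addresses used in this article.
NODES="192.168.184.20 192.168.184.30 192.168.184.40"

# peers_of: hypothetical helper; prints every node except the one given as $1.
peers_of() {
    for node in $NODES; do
        [ "$node" = "$1" ] || echo "$node"
    done
}

# Print the ssh-copy-id commands to run on mysql1 (192.168.184.20).
for peer in $(peers_of 192.168.184.20); do
    echo "ssh-copy-id $peer"
done
```

Replace the address passed to peers_of with the address of the node you are logged in to, and run ssh-keygen -t rsa on that node before copying the key.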

8. Configure MHA on the manager node

1. Copy related scripts on the manager node to the /usr/local/bin directory

cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin

# After copying, there are four executable files
ll /usr/local/bin/scripts/


master_ip_failover Script for VIP management during automatic switching
master_ip_online_change VIP management when switching online
power_manager Script to shut down the host after a failure
send_report Script to send alarm after failover

2. Copy the automatic-switch VIP management script listed above to the /usr/local/bin directory. Here the master_ip_failover script is used to manage the VIP and failover

cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin


3. Replace the contents of the script with the following (delete the original content, paste this in, and adjust the VIP-related parameters for your environment)

echo '' > /usr/local/bin/master_ip_failover

vim /usr/local/bin/master_ip_failover


#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);
############################# Added section ######################################
my $vip = '192.168.184.200';                             # VIP address
my $brdc = '192.168.184.255';                            # broadcast address of the VIP
my $ifdev = 'ens33';                                     # network interface the VIP is bound to
my $key = '1';                                           # index of the virtual interface the VIP is bound to
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";    # expands to: /sbin/ifconfig ens33:1 192.168.184.200
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";     # expands to: /sbin/ifconfig ens33:1 down
my $exit_code = 0;                                       # default exit status
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {

    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        # Remove the VIP from the old master; exit 1 if that fails
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # Bring the VIP up on the new master; exit 10 if that fails
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

## A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

## A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

4. Create the MHA software directory and copy the configuration file. Use the app1.cnf configuration file to manage the mysql node server. The configuration file is generally placed in the /etc/ directory

mkdir /etc/masterha
cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha

Delete the original content, paste in the following, and adjust the IP addresses of the node servers:

echo '' > /etc/masterha/app1.cnf
vim /etc/masterha/app1.cnf	

[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123123
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.184.30 -s 192.168.184.40
shutdown_script=""
ssh_user=root
user=mha

[server1]
hostname=192.168.184.20
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.184.30
port=3306

[server3]
hostname=192.168.184.40
port=3306

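Parameter notes (for reference only; use the uncommented version above as the actual file contents):

[server default]
manager_log=/var/log/masterha/app1/manager.log       # manager log file
manager_workdir=/var/log/masterha/app1               # manager working directory
master_binlog_dir=/usr/local/mysql/data/             # where the master stores its binlogs; must match the binlog path configured on the master so that MHA can find them
master_ip_failover_script=/usr/local/bin/master_ip_failover            # switch script used during automatic failover (the script above)
master_ip_online_change_script=/usr/local/bin/master_ip_online_change  # switch script used during a manual (online) switch
password=manager			# password of the MySQL monitoring user (mha) created earlier
ping_interval=1				# interval in seconds between pings to the master (default 3); failover starts after three attempts without a response
remote_workdir=/tmp			# where the remote MySQL servers save binlogs during a switch
repl_password=123123		# password of the replication user
repl_user=myslave			# name of the replication user
report_script=/usr/local/send_report     # script that sends an alert after a switch
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.184.30 -s 192.168.184.40	# IP addresses of the slave servers used for the secondary check
shutdown_script=""			# script that shuts down the failed host after a failure (mainly to prevent split-brain; not used here)
ssh_user=root				# SSH login user
user=mha					# MySQL monitoring user

[server1]
hostname=192.168.184.20
port=3306

[server2]
hostname=192.168.184.30
port=3306
candidate_master=1
# Marks this server as the candidate master. With this parameter set, this slave is promoted to master after a failover even if it is not the most up-to-date slave in the cluster.

check_repl_delay=0
# By default, MHA does not choose a slave as the new master if it is more than 100 MB of relay logs behind the master, because recovering that slave would take a long time. With check_repl_delay=0, MHA ignores replication delay when choosing the new master. This is useful for a host with candidate_master=1, because it ensures that the candidate becomes the new master during the switch.

[server3]
hostname=192.168.184.40
port=3306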

Bring up the VIP manually on the master server

ifconfig ens33:1 192.168.184.200/24


5. Test SSH passwordless authentication on the manager node. If everything is normal, the output ends with "successfully".

masterha_check_ssh -conf=/etc/masterha/app1.cnf


6. Test the MySQL master-slave replication on the manager node. If the output ends with "MySQL Replication Health is OK", replication is normal.

masterha_check_repl -conf=/etc/masterha/app1.cnf

7. Start MHA on the manager node

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &



--remove_dead_master_conf: after a master-slave switch, the IP of the old master is removed from the configuration file.
--manager_log: log storage location.

--ignore_last_failover: by default, if MHA detects consecutive outages less than 8 hours apart, it does not fail over again; this restriction avoids a ping-pong effect. After each switch, MHA records an app1.failover.complete file in the log directory set above. If that file is still present when the next failure occurs, the switch is not allowed unless the file was deleted after the first switch. This parameter tells MHA to ignore the file left by the last switch; it is set here for convenience.
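
The check that --ignore_last_failover bypasses can be approximated by hand before restarting the manager. The sketch below only follows the description above and is not MHA's own code; recent_failover_marker is a hypothetical helper, and the marker path assumes the work directory /var/log/masterha/app1 from the configuration file.

```shell
# recent_failover_marker: hypothetical helper.
# $1 = manager work directory, $2 = application name (the cnf file name without .cnf).
# Succeeds when a <app>.failover.complete file younger than 8 hours (480 minutes) exists.
recent_failover_marker() {
    marker="$1/$2.failover.complete"
    [ -f "$marker" ] && [ -n "$(find "$marker" -mmin -480)" ]
}

# Usage on the manager node:
#   recent_failover_marker /var/log/masterha/app1 app1 && echo "a recent failover marker exists"
```

When the helper succeeds and --ignore_last_failover is not used, delete the marker file before expecting another automatic switch.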

  • Check the MHA status; the current master should be the mysql1 node.

    masterha_check_status --conf=/etc/masterha/app1.cnf
    
  • Check the MHA log; it also shows that the current master is 192.168.184.20.

    cat /var/log/masterha/app1/manager.log | grep "current master"
    


  • Check whether the VIP address 192.168.184.200 exists on mysql1. The VIP does not disappear when the MHA service is stopped on the manager node.

    ifconfig
    
  • To shut down the manager service, you can use the following command.

    masterha_stop --conf=/etc/masterha/app1.cnf
    Alternatively, stop it by killing the process by its process ID.
    

9. Fault simulation

Watch the log on the manager node

tail -f /var/log/masterha/app1/manager.log


Stop the MySQL service on the master node mysql1

systemctl stop mysqld


After a normal automatic switchover, the MHA process exits. MHA automatically modifies the app1.cnf file and deletes the failed mysql1 node from it.
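
To confirm which servers remain in the configuration after the switchover, the file can be summarized with awk. This is a small sketch; list_mha_hosts is a hypothetical helper name, and it assumes the key=value layout without spaces shown in the app1.cnf listing earlier.

```shell
# list_mha_hosts: hypothetical helper; prints "section hostname" for every
# [serverN] block of the MHA configuration file given as $1.
list_mha_hosts() {
    awk -F= '
        /^\[server[0-9]+\]/ { section = $0 }
        /^hostname=/ && section != "" { print section, $2 }
    ' "$1"
}

# Usage on the manager node:
#   list_mha_hosts /etc/masterha/app1.cnf
```

After the failover in this walkthrough, the entry for 192.168.184.20 should no longer be listed.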

Check whether mysql2 has taken over the VIP

ifconfig

View the live log on the manager node to follow the switchover.


How the new master is chosen during failover:
1. Normally the slaves are compared by how up to date they are (position/GTID). If their data differs, the slave closest to the master becomes the new master.

2. If the slaves hold the same data, the new master is chosen by the order of the servers in the configuration file.

3. If a weight is set (candidate_master=1), the new master is chosen by that weight.

  • By default, if a slave is more than 100 MB of relay logs behind the master, it is not chosen even if it has the weight.
  • With check_repl_delay=0, it is chosen as the new master even if it is far behind.
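
The rules above can be sketched as a small selection function. This only illustrates the rules as stated here and is not MHA's actual implementation; pick_new_master and its input format (one slave per line, in configuration-file order) are assumptions made for the example.

```shell
# pick_new_master: hypothetical sketch of the selection rules listed above.
# Reads one slave per line on stdin, in configuration-file order:
#   <host> <bytes of relay log behind the master> <candidate_master> <check_repl_delay>
pick_new_master() {
    awk '
        BEGIN { limit = 100 * 1024 * 1024 }    # the 100 MB threshold
        {
            lag = $2 + 0
            # Rule 3: a candidate wins if it is within the limit or ignores delay.
            if (cand == "" && $3 == 1 && (lag <= limit || $4 == 0)) cand = $1
            # Rules 1 and 2: the least lag wins; ties keep the earlier line.
            if (best == "" || lag < min) { best = $1; min = lag }
        }
        END { print (cand != "" ? cand : best) }
    '
}

# Usage with the two slaves from this article (mysql2 is the candidate):
#   printf '192.168.184.30 0 1 0\n192.168.184.40 0 0 1\n' | pick_new_master
#   prints 192.168.184.30
```

A candidate that is more than 100 MB behind with check_repl_delay left at 1 is skipped, and the least-delayed slave is printed instead.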

Fault repair

1. Repair the master

systemctl restart mysqld


2. Repair master-slave

View the binary log file and position on the current master, mysql2 (192.168.184.30)

mysql -e "show master status;"


Run the synchronization on the original master, mysql1 (192.168.184.20)

mysql -e "change master to master_host='192.168.184.30',master_user='myslave',master_password='123123',master_log_file='master-bin.000001',master_log_pos=1595;"

mysql -e "start slave;"
mysql -e "show slave status\G" | awk '/Running:/{print}'
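
The file name and position in the change master statement above must match what show master status reports on mysql2, so copying them by hand is error-prone. A minimal sketch of building the statement from that output follows; change_master_sql is a hypothetical helper, and the replication user and password are the values used in this article.

```shell
# change_master_sql: hypothetical helper.
# $1 = new master host; stdin = one line of "mysql -N -e 'show master status'"
# output (binary log file name, then position, separated by whitespace).
# Prints the CHANGE MASTER TO statement with this article's replication account.
change_master_sql() {
    read -r file pos _rest
    printf "change master to master_host='%s',master_user='myslave',master_password='123123',master_log_file='%s',master_log_pos=%s;\n" "$1" "$file" "$pos"
}

# Usage on the current master mysql2 (-N suppresses the column header):
#   mysql -N -e "show master status" | change_master_sql 192.168.184.30
```

Run the printed statement on mysql1 with mysql -e, then start the slave threads as shown above.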


3. Modify the configuration file app1.cnf on the manager node (add the removed server record back, because MHA deletes it automatically when it detects the failure)

vim /etc/masterha/app1.cnf
……
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.184.20 -s 192.168.184.40
......
[server1]
hostname=192.168.184.30
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.184.20
port=3306

[server3]
hostname=192.168.184.40
port=3306


4. Restart MHA on the manager node

masterha_stop --conf=/etc/masterha/app1.cnf

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

masterha_check_status --conf=/etc/masterha/app1.cnf



Origin blog.csdn.net/weixin_51432770/article/details/113865951