MySQL High-Availability Architecture: MHA

I. Introduction:

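MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability. It was developed by Yoshinori Matsunobu while at DeNA in Japan (he later joined Facebook), and it handles failover and slave promotion in a MySQL replication environment. When the master fails, MHA can complete the failover automatically within roughly 30 seconds, and during the failover it preserves data consistency as far as it can.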

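MHA consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage several master-slave clusters, or on one of the slaves. MHA Node runs on every MySQL server. MHA Manager probes the cluster's master at regular intervals; when the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves to it. The failover is meant to be transparent to the application (in practice this relies on a virtual IP or a similar mechanism handled by master_ip_failover_script).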

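During an automatic failover, MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always possible. For example, if the master has a hardware failure or cannot be reached over SSH, MHA cannot save the binary logs; it then performs the failover anyway and the most recent data is lost. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk, and MHA can be combined with it. As long as at least one slave has received the latest binary log events, MHA can apply them to all the other slaves, which keeps the data on all nodes consistent.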

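MHA currently targets the one-master, multiple-slave topology. An MHA setup needs at least three database servers in the replication cluster: one master, one standby master, and one additional slave. Because of the hardware cost of three servers, Taobao built a modified version, TMHA, which supports one master and one slave.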

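How MHA works, in summary:

(1) Save the binary log events (binlog events) from the crashed master;

(2) Identify the slave that has the latest updates;

(3) Apply the differential relay logs to the other slaves;

(4) Apply the binary log events saved from the master;

(5) Promote one slave to be the new master;

(6) Point the other slaves at the new master and resume replication.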

As noted above, MHA ships as two packages: the Manager toolkit and the Node toolkit.
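Step (2) above can be illustrated with a small shell sketch. It is not how MHA is implemented; it only shows the idea of comparing the master binlog coordinates that each slave has received (the Master_Log_File and Read_Master_Log_Pos fields of SHOW SLAVE STATUS). The function name `pick_latest_slave`, the input format, and the sample positions are assumptions made for illustration.

```shell
# Hypothetical helper: read "host binlog_file read_pos" lines on stdin and
# print the host that has received the most advanced master coordinates.
pick_latest_slave() {
    # Sort by binlog file name, then numerically by position;
    # the last line after sorting is the most advanced slave.
    sort -k2,2 -k3,3n | tail -n 1 | awk '{print $1}'
}

# Sample data (made up): server2 has read further than server4.
printf '%s\n' \
    '172.25.17.2 binlog.000001 892' \
    '172.25.17.4 binlog.000001 447' | pick_latest_slave
```

Sorting file names as text works here because MySQL zero-pads the binlog sequence number (binlog.000001, binlog.000002, and so on).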

II. Deploying MHA

Lab environment:
server1     172.25.17.1    node1
server2     172.25.17.2    node2
server4     172.25.17.4    node3
server5     172.25.17.5    monitoring node (manager)

Step 1: Set up one master and two slaves

(1) Master database: server1:

Here server1 is the master; server2 and server4 are the two slaves.

Required package: mysql-5.7.24-1.el7.x86_64.rpm-bundle.tar

Set up the MySQL environment on server1:
[root@server1 ~]# tar xf mysql-5.7.24-1.el7.x86_64.rpm-bundle.tar # unpack the bundle
[root@server1 ~]# yum install -y mysql-community-client-5.7.24-1.el7.x86_64.rpm mysql-community-common-5.7.24-1.el7.x86_64.rpm  mysql-community-libs-5.7.24-1.el7.x86_64.rpm mysql-community-server-5.7.24-1.el7.x86_64.rpm mysql-community-libs-compat-5.7.24-1.el7.x86_64.rpm   # install the packages

[root@server1 ~]# systemctl start mysqld   # start mysqld
[root@server1 ~]#  cat /var/log/mysqld.log | grep password  # find the initial password
2019-02-26T04:20:30.795660Z 1 [Note] A temporary password is generated for root@localhost: vkgfv-Y2bq5Y
[root@server1 ~]# mysql_secure_installation  # initial secure setup of the database

Securing the MySQL server deployment.

Enter password for user root: 

The existing password for the user account root has expired. Please set a new password.

New password: 

Re-enter new password: 
The 'validate_password' plugin is installed on the server.
The subsequent steps will run with the existing configuration
of the plugin.
Using existing password for root.

Estimated strength of the password: 100 
Change the password for root ? ((Press y|Y for Yes, any other key for No) : 

 ... skipping.
By default, a MySQL installation has an anonymous user,
allowing anyone to log into MySQL without having to have
a user account created for them. This is intended only for
testing, and to make the installation go a bit smoother.
You should remove them before moving into a production
environment.

Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
Success.


Normally, root should only be allowed to connect from
'localhost'. This ensures that someone cannot guess at
the root password from the network.

Disallow root login remotely? (Press y|Y for Yes, any other key for No) : y
Success.

By default, MySQL comes with a database named 'test' that
anyone can access. This is also intended only for testing,
and should be removed before moving into a production
environment.


Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
 - Dropping test database...
Success.

 - Removing privileges on test database...
Success.

Reloading the privilege tables will ensure that all changes
made so far will take effect immediately.

Reload privilege tables now? (Press y|Y for Yes, any other key for No) : y
Success.

All done! 

The database is now set up.

Next, configure the master database:

1. First edit the configuration file:
[root@server1 ~]# vim /etc/my.cnf

server_id=1
gtid_mode=ON
enforce_gtid_consistency=ON
log_slave_updates=ON
log_bin=binlog
[root@server1 ~]# systemctl restart mysqld
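Before restarting mysqld on each node it can help to confirm that all five replication settings are actually present in the file. The following is a minimal sketch; `check_replication_cnf` is a hypothetical helper (not part of MySQL or MHA), and it only recognizes the underscore spelling used in this article, not the equivalent dash spelling such as `server-id`.

```shell
# Hypothetical helper: report which of the settings used in this article
# are missing from the given my.cnf. Returns 0 when all are present.
check_replication_cnf() {
    local cnf="$1" missing=0 key
    for key in server_id gtid_mode enforce_gtid_consistency log_slave_updates log_bin; do
        # Match "key=" at the start of a line, allowing surrounding whitespace.
        if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$cnf"; then
            echo "missing: ${key}"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (on a database node): check_replication_cnf /etc/my.cnf
```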
2. Log in to MySQL and configure the master:
[root@server1 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.7.24 MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
# Install the semi-sync master plugin
mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.12 sec)
# Install the semi-sync slave plugin
mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.04 sec)


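# Note: this session only enables the slave side and sets the master timeout. It never runs SET GLOBAL rpl_semi_sync_master_enabled=1, which is why rpl_semi_sync_master_enabled is still OFF in the output below; run that statement too if the master should wait for slave acknowledgements.
mysql>  SET GLOBAL rpl_semi_sync_slave_enabled=1;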
Query OK, 0 rows affected (0.00 sec)

mysql> SET GLOBAL rpl_semi_sync_master_timeout=10000;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like '%rpl%';
+-------------------------------------------+------------+
| Variable_name                             | Value      |
+-------------------------------------------+------------+
| rpl_semi_sync_master_enabled              | OFF        |
| rpl_semi_sync_master_timeout              | 10000      |
| rpl_semi_sync_master_trace_level          | 32         |
| rpl_semi_sync_master_wait_for_slave_count | 1          |
| rpl_semi_sync_master_wait_no_slave        | ON         |
| rpl_semi_sync_master_wait_point           | AFTER_SYNC |
| rpl_semi_sync_slave_enabled               | ON         |
| rpl_semi_sync_slave_trace_level           | 32         |
| rpl_stop_slave_timeout                    | 31536000   |
+-------------------------------------------+------------+
9 rows in set (0.00 sec)

mysql> show status like '%rpl%';
+--------------------------------------------+-------+
| Variable_name                              | Value |
+--------------------------------------------+-------+
| Rpl_semi_sync_master_clients               | 0     |
| Rpl_semi_sync_master_net_avg_wait_time     | 0     |
| Rpl_semi_sync_master_net_wait_time         | 0     |
| Rpl_semi_sync_master_net_waits             | 0     |
| Rpl_semi_sync_master_no_times              | 0     |
| Rpl_semi_sync_master_no_tx                 | 0     |
| Rpl_semi_sync_master_status                | OFF   |
| Rpl_semi_sync_master_timefunc_failures     | 0     |
| Rpl_semi_sync_master_tx_avg_wait_time      | 0     |
| Rpl_semi_sync_master_tx_wait_time          | 0     |
| Rpl_semi_sync_master_tx_waits              | 0     |
| Rpl_semi_sync_master_wait_pos_backtraverse | 0     |
| Rpl_semi_sync_master_wait_sessions         | 0     |
| Rpl_semi_sync_master_yes_tx                | 0     |
| Rpl_semi_sync_slave_status                 | OFF   |
+--------------------------------------------+-------+
15 rows in set (0.00 sec)

mysql> create database westos;
Query OK, 1 row affected (0.09 sec)

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| westos             |
+--------------------+
5 rows in set (0.00 sec)

mysql> exit
Bye
[root@server1 ~]# ls
mha4mysql-manager-0.58-0.el7.centos.noarch.rpm           mysql-community-minimal-debuginfo-5.7.24-1.el7.x86_64.rpm
mha4mysql-manager-0.58.tar.gz                            mysql-community-server-5.7.24-1.el7.x86_64.rpm
mha4mysql-node-0.58-0.el7.centos.noarch.rpm              mysql-community-server-minimal-5.7.24-1.el7.x86_64.rpm
mysql-5.7.24-1.el7.x86_64.rpm-bundle.tar                 mysql-community-test-5.7.24-1.el7.x86_64.rpm
mysql-community-client-5.7.24-1.el7.x86_64.rpm           perl-Config-Tiny-2.14-7.el7.noarch.rpm
mysql-community-common-5.7.24-1.el7.x86_64.rpm           perl-Email-Date-Format-1.002-15.el7.noarch.rpm
mysql-community-devel-5.7.24-1.el7.x86_64.rpm            perl-Log-Dispatch-2.41-1.el7.1.noarch.rpm
mysql-community-embedded-5.7.24-1.el7.x86_64.rpm         perl-Mail-Sender-0.8.23-1.el7.noarch.rpm
mysql-community-embedded-compat-5.7.24-1.el7.x86_64.rpm  perl-Mail-Sendmail-0.79-21.el7.noarch.rpm
mysql-community-embedded-devel-5.7.24-1.el7.x86_64.rpm   perl-MIME-Lite-3.030-1.el7.noarch.rpm
mysql-community-libs-5.7.24-1.el7.x86_64.rpm             perl-MIME-Types-1.38-2.el7.noarch.rpm
mysql-community-libs-compat-5.7.24-1.el7.x86_64.rpm      perl-Parallel-ForkManager-1.18-2.el7.noarch.rpm



[root@server1 ~]# yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm
                 
  
Complete!
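The slaves configured in the next part connect to the master as a replication user `repl` with the password `Westos-6`, but the transcript does not show that account being created. A minimal sketch of that missing step follows, to be run on the master. The helper name `repl_user_sql` and the host pattern `172.25.17.%` are assumptions; the author may have used a different host pattern.

```shell
# Hypothetical helper: print the SQL that creates a replication account.
# Arguments: user, host pattern, password (must not contain a single quote).
repl_user_sql() {
    local user="$1" host="$2" password="$3"
    cat <<SQL
CREATE USER IF NOT EXISTS '${user}'@'${host}' IDENTIFIED BY '${password}';
GRANT REPLICATION SLAVE ON *.* TO '${user}'@'${host}';
SQL
}

# Usage on server1 (mysql prompts for the root password on the terminal):
#   repl_user_sql repl '172.25.17.%' 'Westos-6' | mysql -uroot -p
```

Generating the statements separately from running them makes it easy to review the SQL before it touches the server.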

(2) Slave databases: server2 and server4:

The steps on server4 are essentially the same as on server2, so only server2 is shown.

Install and initialize MySQL on server2:

server2:

[root@server2 ~]# tar xf mysql-5.7.24-1.el7.x86_64.rpm-bundle.tar 
[root@server2 ~]# yum install -y mysql-community-client-5.7.24-1.el7.x86_64.rpm mysql-community-common-5.7.24-1.el7.x86_64.rpm  mysql-community-libs-5.7.24-1.el7.x86_64.rpm mysql-community-server-5.7.24-1.el7.x86_64.rpm mysql-community-libs-compat-5.7.24-1.el7.x86_64.rpm

Complete!
[root@server2 ~]# systemctl start mysqld
[root@server2 ~]#  cat /var/log/mysqld.log | grep password
2019-02-26T04:20:29.641252Z 1 [Note] A temporary password is generated for root@localhost: ukAa3edg#_&f
[root@server2 ~]# mysql_secure_installation

Securing the MySQL server deployment.

Enter password for user root: 

The existing password for the user account root has expired. Please set a new password.

New password: 

Re-enter new password: 
The 'validate_password' plugin is installed on the server.
The subsequent steps will run with the existing configuration
of the plugin.
Using existing password for root.

Estimated strength of the password: 100 
Change the password for root ? ((Press y|Y for Yes, any other key for No) : 

 ... skipping.
By default, a MySQL installation has an anonymous user,
allowing anyone to log into MySQL without having to have
a user account created for them. This is intended only for
testing, and to make the installation go a bit smoother.
You should remove them before moving into a production
environment.

Remove anonymous users? (Press y|Y for Yes, any other key for No) : y
Success.


Normally, root should only be allowed to connect from
'localhost'. This ensures that someone cannot guess at
the root password from the network.

Disallow root login remotely? (Press y|Y for Yes, any other key for No) : y
Success.

By default, MySQL comes with a database named 'test' that
anyone can access. This is also intended only for testing,
and should be removed before moving into a production
environment.


Remove test database and access to it? (Press y|Y for Yes, any other key for No) : y
 - Dropping test database...
Success.

 - Removing privileges on test database...
Success.

Reloading the privilege tables will ensure that all changes
made so far will take effect immediately.

Reload privilege tables now? (Press y|Y for Yes, any other key for No) : yy
Success.

All done!

Configure slave 1 [server2]:

[root@server2 ~]# vim /etc/my.cnf

server_id=2    # on server4, use server_id=3
gtid_mode=ON
enforce_gtid_consistency=ON
log_slave_updates=ON
log_bin=binlog
[root@server2 ~]# systemctl restart mysqld
[root@server2 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.24-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

# Make server2 a slave of 172.25.17.1:
mysql> change master to master_host='172.25.17.1',master_user='repl',master_password='Westos-6',master_auto_position=1;
Query OK, 0 rows affected, 2 warnings (0.46 sec)

# Start the slave

mysql> start slave;
Query OK, 0 rows affected (0.08 sec)

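Status of server2 after it has been started as slave 1 (the trailing `;` after `\G` is redundant and is what produces the harmless "No query specified" error at the end of the output):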

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.17.1
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000001
          Read_Master_Log_Pos: 447
               Relay_Log_File: server2-relay-bin.000002
                Relay_Log_Pos: 654
        Relay_Master_Log_File: binlog.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 447
              Relay_Log_Space: 863
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: da3b896e-397d-11e9-9492-5254001cd5e6
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: da3b896e-397d-11e9-9492-5254001cd5e6:1
            Executed_Gtid_Set: da3b896e-397d-11e9-9492-5254001cd5e6:1
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

ERROR: 
No query specified

Semi-sync setup on server2 as a slave:

Install both plugins and set the global variable:

mysql> INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
Query OK, 0 rows affected (0.15 sec)

mysql> INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
Query OK, 0 rows affected (0.05 sec)


mysql> SET GLOBAL rpl_semi_sync_slave_enabled=1;
Query OK, 0 rows affected (0.00 sec)

Configuration complete!

The database created on the master earlier is now visible on the slave:

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| mysql              |
| performance_schema |
| sys                |
| westos             |
+--------------------+
5 rows in set (0.01 sec)

mysql> exit
Bye

Install the MHA Node package that server2 needs as a data node:

[root@server2 ~]# yum install -y mha4mysql-node-0.58-0.el7.centos.noarch.rpm

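Step 2: Configure MHA:

Configuration of the monitoring host (server5):

1. Install the manager package and its dependencies:

[root@server5 ~]# yum install mha4mysql-manager-0.58-0.el7.centos.noarch.rpm perl-* mha4mysql-node-0.58-0.el7.centos.noarch.rpm

2. Create the MHA working directory and the configuration file:

mkdir -p /etc/masterha
touch /etc/masterha/app1.cnf

3. Edit the configuration file (note: the `//` comments below are only here for explanation and must be removed from the real file):

vim /etc/masterha/app1.cnf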

[server default]
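manager_workdir=/etc/masterha/app1.log   //manager working directory
manager_log=/var/log/masterha.log         //manager log file
master_binlog_dir=/etc/masterha           //directory where the master stores its binlogs, so that MHA can find them (with log_bin=binlog as set earlier, the files are actually written to the MySQL data directory, /var/lib/mysql on these hosts)
#master_ip_failover_script= /usr/local/bin/master_ip_failover  //script that moves the master IP during an automatic failover
#master_ip_online_change_script= /usr/local/bin/master_ip_online_change  //script used for a manual switchover
password=Westos-6         //password of the MySQL user MHA connects as (the root@'%' account granted in step 4 of this article)
user=root              //MySQL user used for monitoring
remote_workdir=/tmp    //directory on the remote MySQL hosts where binlogs are saved during a switchover
repl_password=Westos-6     //password of the replication user
repl_user=repl         //name of the replication user
#report_script=/usr/local/send_report    //script that sends an alert after a switchover
#secondary_check_script= /usr/local/bin/masterha_secondary_check -s server03 -s server02            
#shutdown_script=""   //script that shuts down the failed host after a failure (its main purpose is to prevent split-brain; not used here)
ssh_user=root           //SSH login user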
 


[server1]
hostname=172.25.17.1
port=3306

[server2]
hostname=172.25.17.2
port=3306
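candidate_master=1   //marks this host as a candidate master: after a failover this slave is promoted to master even if it does not hold the latest events in the cluster
check_repl_delay=0   //by default MHA will not choose a slave that is more than 100 MB of relay logs behind the master, because recovering it would take a long time; with check_repl_delay=0 MHA ignores replication delay when choosing the new master. This is very useful on a host with candidate_master=1, because it guarantees that this candidate becomes the new master during the switchover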


[server4]
hostname=172.25.17.4
port=3306
no_master=1
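Because the `//` annotations above have to be removed before MHA reads the file, one option is to keep the annotated version as a separate file and strip the annotations mechanically. This is a minimal sketch; `strip_cnf_comments` and the file name `app1.annotated.cnf` are hypothetical, and it assumes no real value in the file contains `//` (true for the configuration shown here).

```shell
# Hypothetical helper: remove "//" annotations (and the whitespace before them)
# from an MHA config read on stdin. Lines starting with "#" are left alone.
strip_cnf_comments() {
    sed -E 's#[[:space:]]*//.*$##'
}

# Usage:
#   strip_cnf_comments < app1.annotated.cnf > /etc/masterha/app1.cnf
```

A line that consists only of a `//` annotation becomes an empty line, which the configuration parser ignores.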

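Step 3: Configure passwordless SSH login:

[Passwordless login is required from master to slaves, between slaves, and between the manager and every node]

1. Generate a key pair on the manager host (server5):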

[root@server5 ~]# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Created directory '/root/.ssh'.
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
8c:b8:45:b4:f6:2e:a5:b9:42:5e:4b:cb:13:8b:15:8f root@server5
The key's randomart image is:
+--[ RSA 2048]----+
|      .          |
|     . .         |
|      +          |
|     +.+         |
|    . o+S        |
|    .oE=.        |
|   o.*+=.        |
|    + *o         |
|     ...         |
+-----------------+

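2. Copy the keys from server5 to the other three hosts (copying the whole .ssh directory means all four hosts share the same key pair):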

[root@server5 ~]# scp -r .ssh/ server1:
The authenticity of host 'server1 (172.25.17.1)' can't be established.
ECDSA key fingerprint is 79:84:7e:7b:1c:e7:39:8b:c0:1b:88:71:ec:49:36:85.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server1,172.25.17.1' (ECDSA) to the list of known hosts.
root@server1's password: 
id_rsa                                                                    100% 1679     1.6KB/s   00:00    
id_rsa.pub                                                                100%  394     0.4KB/s   00:00    
known_hosts                                                               100%  181     0.2KB/s   00:00    
[root@server5 ~]# scp -r .ssh/ server2:
The authenticity of host 'server2 (172.25.17.2)' can't be established.
ECDSA key fingerprint is fd:a4:81:a5:c2:57:bf:0a:c4:a4:dd:38:87:cf:56:5f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server2,172.25.17.2' (ECDSA) to the list of known hosts.
root@server2's password: 
id_rsa                                                                    100% 1679     1.6KB/s   00:00    
id_rsa.pub                                                                100%  394     0.4KB/s   00:00    
known_hosts                                                               100%  362     0.4KB/s   00:00    
[root@server5 ~]# scp -r .ssh/ server4:
The authenticity of host 'server4 (172.25.17.4)' can't be established.
ECDSA key fingerprint is ee:4a:6f:f9:da:eb:8a:51:0a:03:d9:55:38:e8:82:bb.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server4,172.25.17.4' (ECDSA) to the list of known hosts.
root@server4's password: 
id_rsa                                                                    100% 1679     1.6KB/s   00:00    
id_rsa.pub                                                                100%  394     0.4KB/s   00:00    
known_hosts                                                               100%  543     0.5KB/s   00:00

3. Authorize the key on each host with ssh-copy-id:

[root@server5 ~]# ssh-copy-id server1
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@server1's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'server1'"
and check to make sure that only the key(s) you wanted were added.

[root@server5 ~]# ssh-copy-id server2
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@server2's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'server2'"
and check to make sure that only the key(s) you wanted were added.

[root@server5 ~]# ssh-copy-id server4
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@server4's password: 

Number of key(s) added: 1

Now try logging into the machine, with:   "ssh 'server4'"
and check to make sure that only the key(s) you wanted were added.

4. Verify that passwordless login works (note: be sure to log out after each login):

[root@server5 ~]# ssh server2
Last login: Tue Feb 26 14:18:47 2019 from server5
[root@server2 ~]# ssh server1
Last login: Tue Feb 26 12:14:01 2019 from foundation17.ilt.example.com
[root@server1 ~]# ssh server4
The authenticity of host 'server4 (172.25.17.4)' can't be established.
ECDSA key fingerprint is ee:4a:6f:f9:da:eb:8a:51:0a:03:d9:55:38:e8:82:bb.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'server4,172.25.17.4' (ECDSA) to the list of known hosts.
Last login: Tue Feb 26 12:14:14 2019 from foundation17.ilt.example.com
[root@server4 ~]# exit
logout
Connection to server4 closed.
[root@server1 ~]# exit
logout
Connection to server1 closed.
[root@server2 ~]# exit
logout
Connection to server2 closed.

5. Check the SSH connectivity from MHA Manager to all MHA Nodes:

The check has passed once the output ends with "successfully":

[root@server5 ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf 
Tue Feb 26 14:23:43 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Feb 26 14:23:43 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Feb 26 14:23:43 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Feb 26 14:23:43 2019 - [info] Starting SSH connection tests..
Tue Feb 26 14:23:44 2019 - [debug] 
Tue Feb 26 14:23:43 2019 - [debug]  Connecting via SSH from root@172.25.17.1(172.25.17.1:22) to root@172.25.17.2(172.25.17.2:22)..
Tue Feb 26 14:23:43 2019 - [debug]   ok.
Tue Feb 26 14:23:43 2019 - [debug]  Connecting via SSH from root@172.25.17.1(172.25.17.1:22) to root@172.25.17.4(172.25.17.4:22)..
Tue Feb 26 14:23:43 2019 - [debug]   ok.
Tue Feb 26 14:23:44 2019 - [debug] 
Tue Feb 26 14:23:44 2019 - [debug]  Connecting via SSH from root@172.25.17.2(172.25.17.2:22) to root@172.25.17.1(172.25.17.1:22)..
Tue Feb 26 14:23:44 2019 - [debug]   ok.
Tue Feb 26 14:23:44 2019 - [debug]  Connecting via SSH from root@172.25.17.2(172.25.17.2:22) to root@172.25.17.4(172.25.17.4:22)..
Warning: Permanently added '172.25.17.4' (ECDSA) to the list of known hosts.
Tue Feb 26 14:23:44 2019 - [debug]   ok.
Tue Feb 26 14:23:45 2019 - [debug] 
Tue Feb 26 14:23:44 2019 - [debug]  Connecting via SSH from root@172.25.17.4(172.25.17.4:22) to root@172.25.17.1(172.25.17.1:22)..
Tue Feb 26 14:23:44 2019 - [debug]   ok.
Tue Feb 26 14:23:44 2019 - [debug]  Connecting via SSH from root@172.25.17.4(172.25.17.4:22) to root@172.25.17.2(172.25.17.2:22)..
Tue Feb 26 14:23:44 2019 - [debug]   ok.
Tue Feb 26 14:23:45 2019 - [info] All SSH connection tests passed successfully.
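The check above tests every ordered pair of database nodes. The same mesh can be probed by hand, which is useful when masterha_check_ssh fails and it is unclear which hop is broken. This is only a sketch: `ssh_pairs` and `check_ssh_mesh` are hypothetical helpers, and root is assumed as the SSH user because that is what ssh_user is set to in this article.

```shell
# Hypothetical helper: print every ordered (source, destination) pair
# of distinct hosts, one pair per line.
ssh_pairs() {
    local src dst
    for src in "$@"; do
        for dst in "$@"; do
            [ "$src" = "$dst" ] || echo "$src $dst"
        done
    done
}

# Hypothetical helper: from the manager, log in to each source host and try
# to reach each destination from there. BatchMode makes ssh fail instead of
# prompting for a password or host key; -n keeps ssh from eating the pair list.
check_ssh_mesh() {
    ssh_pairs "$@" | while read -r src dst; do
        if ssh -n -o BatchMode=yes "root@$src" ssh -o BatchMode=yes "root@$dst" true; then
            echo "ok   $src -> $dst"
        else
            echo "FAIL $src -> $dst"
        fi
    done
}

# Usage on server5: check_ssh_mesh 172.25.17.1 172.25.17.2 172.25.17.4
```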

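Step 4: Check the state of the whole replication environment.

Two preparations are needed before running the check:

Preparation 1: grant privileges on the master: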

[root@server1 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 10
Server version: 5.7.24-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> grant all on *.*  to root@'%' identified by 'Westos-6';
Query OK, 0 rows affected, 1 warning (0.02 sec)

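Preparation 2: put the slave server2 into read-only mode (the check output further down reports that read_only is not set on server4; the same statement can be run there):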

[root@server2 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 7
Server version: 5.7.24-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> set GLOBAL read_only=1;
Query OK, 0 rows affected (0.00 sec)

With both preparations done, the manager can run the check:

[root@server5 ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf 
Tue Feb 26 14:34:14 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Feb 26 14:34:14 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Feb 26 14:34:14 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Feb 26 14:34:14 2019 - [info] MHA::MasterMonitor version 0.58.
Tue Feb 26 14:34:15 2019 - [info] GTID failover mode = 1
Tue Feb 26 14:34:15 2019 - [info] Dead Servers:
Tue Feb 26 14:34:15 2019 - [info] Alive Servers:
Tue Feb 26 14:34:15 2019 - [info]   172.25.17.1(172.25.17.1:3306)
Tue Feb 26 14:34:15 2019 - [info]   172.25.17.2(172.25.17.2:3306)
Tue Feb 26 14:34:15 2019 - [info]   172.25.17.4(172.25.17.4:3306)
Tue Feb 26 14:34:15 2019 - [info] Alive Slaves:
Tue Feb 26 14:34:15 2019 - [info]   172.25.17.2(172.25.17.2:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 14:34:15 2019 - [info]     GTID ON
Tue Feb 26 14:34:15 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 14:34:15 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Feb 26 14:34:15 2019 - [info]   172.25.17.4(172.25.17.4:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 14:34:15 2019 - [info]     GTID ON
Tue Feb 26 14:34:15 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 14:34:15 2019 - [info]     Not candidate for the new Master (no_master is set)
Tue Feb 26 14:34:15 2019 - [info] Current Alive Master: 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 14:34:15 2019 - [info] Checking slave configurations..
Tue Feb 26 14:34:15 2019 - [info]  read_only=1 is not set on slave 172.25.17.4(172.25.17.4:3306).
Tue Feb 26 14:34:15 2019 - [info] Checking replication filtering settings..
Tue Feb 26 14:34:15 2019 - [info]  binlog_do_db= , binlog_ignore_db= 
Tue Feb 26 14:34:15 2019 - [info]  Replication filtering check ok.
Tue Feb 26 14:34:15 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Tue Feb 26 14:34:15 2019 - [info] Checking SSH publickey authentication settings on the current master..
Tue Feb 26 14:34:16 2019 - [info] HealthCheck: SSH to 172.25.17.1 is reachable.
Tue Feb 26 14:34:16 2019 - [info] 
172.25.17.1(172.25.17.1:3306) (current master)
 +--172.25.17.2(172.25.17.2:3306)
 +--172.25.17.4(172.25.17.4:3306)

Tue Feb 26 14:34:16 2019 - [info] Checking replication health on 172.25.17.2..
Tue Feb 26 14:34:16 2019 - [info]  ok.
Tue Feb 26 14:34:16 2019 - [info] Checking replication health on 172.25.17.4..
Tue Feb 26 14:34:16 2019 - [info]  ok.
Tue Feb 26 14:34:16 2019 - [warning] master_ip_failover_script is not defined.
Tue Feb 26 14:34:16 2019 - [warning] shutdown_script is not defined.
Tue Feb 26 14:34:16 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

Seeing "OK" here is all that is needed!
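When the check is run from a script rather than by eye, the final status line can be tested directly. This is a minimal sketch; `repl_health_ok` is a hypothetical helper that simply looks for the success message shown in the output above.

```shell
# Hypothetical helper: succeed only if the text on stdin contains the
# success line printed by masterha_check_repl in this article.
repl_health_ok() {
    grep -q 'MySQL Replication Health is OK\.'
}

# Usage on server5:
#   masterha_check_repl --conf=/etc/masterha/app1.cnf 2>&1 | repl_health_ok \
#       && echo "replication healthy" || echo "replication check failed"
```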

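MHA is now deployed. Next, three kinds of switchover are introduced:

1. Manual switchover:

(1) First stop mysqld on the master:

[root@server1 ~]# systemctl stop mysqld

(2) Run the switchover command on the manager: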

[root@server5 ~]# masterha_master_switch --master_state=dead --conf=/etc/masterha/app1.cnf --dead_master_host=172.25.17.1 --dead_master_port=3306 --new_master_host=172.25.17.2 --new_master_port=3306
--dead_master_ip=<dead_master_ip> is not set. Using 172.25.17.1.
Tue Feb 26 15:07:31 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Feb 26 15:07:31 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Feb 26 15:07:31 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Feb 26 15:07:31 2019 - [info] MHA::MasterFailover version 0.58.
Tue Feb 26 15:07:31 2019 - [info] Starting master failover.
Tue Feb 26 15:07:31 2019 - [info] 
Tue Feb 26 15:07:31 2019 - [info] * Phase 1: Configuration Check Phase..
Tue Feb 26 15:07:31 2019 - [info] 
Tue Feb 26 15:07:32 2019 - [info] GTID failover mode = 1
Tue Feb 26 15:07:32 2019 - [info] Dead Servers:
Tue Feb 26 15:07:32 2019 - [info]   172.25.17.1(172.25.17.1:3306)
Tue Feb 26 15:07:32 2019 - [info] Checking master reachability via MySQL(double check)...
Tue Feb 26 15:07:32 2019 - [info]  ok.
Tue Feb 26 15:07:32 2019 - [info] Alive Servers:
Tue Feb 26 15:07:32 2019 - [info]   172.25.17.2(172.25.17.2:3306)
Tue Feb 26 15:07:32 2019 - [info]   172.25.17.4(172.25.17.4:3306)
Tue Feb 26 15:07:32 2019 - [info] Alive Slaves:
Tue Feb 26 15:07:32 2019 - [info]   172.25.17.2(172.25.17.2:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:07:32 2019 - [info]     GTID ON
Tue Feb 26 15:07:32 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 15:07:32 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Feb 26 15:07:32 2019 - [info]   172.25.17.4(172.25.17.4:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:07:32 2019 - [info]     GTID ON
Tue Feb 26 15:07:32 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 15:07:32 2019 - [info]     Not candidate for the new Master (no_master is set)
Master 172.25.17.1(172.25.17.1:3306) is dead. Proceed? (yes/NO): yes
Tue Feb 26 15:07:35 2019 - [info] Starting GTID based failover.
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] ** Phase 1: Configuration Check Phase completed.
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] * Phase 2: Dead Master Shutdown Phase..
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] HealthCheck: SSH to 172.25.17.1 is reachable.
Tue Feb 26 15:07:35 2019 - [info] Forcing shutdown so that applications never connect to the current master..
Tue Feb 26 15:07:35 2019 - [warning] master_ip_failover_script is not set. Skipping invalidating dead master IP address.
Tue Feb 26 15:07:35 2019 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Tue Feb 26 15:07:35 2019 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] * Phase 3: Master Recovery Phase..
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] * Phase 3.1: Getting Latest Slaves Phase..
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] The latest binary log file/position on all slaves is binlog.000001:892
Tue Feb 26 15:07:35 2019 - [info] Retrieved Gtid Set: da3b896e-397d-11e9-9492-5254001cd5e6:1-3
Tue Feb 26 15:07:35 2019 - [info] Latest slaves (Slaves that received relay log files to the latest):
Tue Feb 26 15:07:35 2019 - [info]   172.25.17.2(172.25.17.2:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:07:35 2019 - [info]     GTID ON
Tue Feb 26 15:07:35 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 15:07:35 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Feb 26 15:07:35 2019 - [info]   172.25.17.4(172.25.17.4:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:07:35 2019 - [info]     GTID ON
Tue Feb 26 15:07:35 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 15:07:35 2019 - [info]     Not candidate for the new Master (no_master is set)
Tue Feb 26 15:07:35 2019 - [info] The oldest binary log file/position on all slaves is binlog.000001:892
Tue Feb 26 15:07:35 2019 - [info] Retrieved Gtid Set: da3b896e-397d-11e9-9492-5254001cd5e6:1-3
Tue Feb 26 15:07:35 2019 - [info] Oldest slaves:
Tue Feb 26 15:07:35 2019 - [info]   172.25.17.2(172.25.17.2:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:07:35 2019 - [info]     GTID ON
Tue Feb 26 15:07:35 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 15:07:35 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Feb 26 15:07:35 2019 - [info]   172.25.17.4(172.25.17.4:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:07:35 2019 - [info]     GTID ON
Tue Feb 26 15:07:35 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 15:07:35 2019 - [info]     Not candidate for the new Master (no_master is set)
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] * Phase 3.3: Determining New Master Phase..
Tue Feb 26 15:07:35 2019 - [info] 
Tue Feb 26 15:07:35 2019 - [info] 172.25.17.2 can be new master.
Tue Feb 26 15:07:35 2019 - [info] New master is 172.25.17.2(172.25.17.2:3306)
Tue Feb 26 15:07:35 2019 - [info] Starting master failover..
Tue Feb 26 15:07:35 2019 - [info] 
From:
172.25.17.1(172.25.17.1:3306) (current master)
 +--172.25.17.2(172.25.17.2:3306)
 +--172.25.17.4(172.25.17.4:3306)

To:
172.25.17.2(172.25.17.2:3306) (new master)
 +--172.25.17.4(172.25.17.4:3306)

Starting master switch from 172.25.17.1(172.25.17.1:3306) to 172.25.17.2(172.25.17.2:3306)? (yes/NO): yes
Tue Feb 26 15:07:37 2019 - [info] New master decided manually is 172.25.17.2(172.25.17.2:3306)
Tue Feb 26 15:07:37 2019 - [info] 
Tue Feb 26 15:07:37 2019 - [info] * Phase 3.3: New Master Recovery Phase..
Tue Feb 26 15:07:37 2019 - [info] 
Tue Feb 26 15:07:37 2019 - [info]  Waiting all logs to be applied.. 
Tue Feb 26 15:07:37 2019 - [info]   done.
Tue Feb 26 15:07:37 2019 - [info] Getting new master's binlog name and position..
Tue Feb 26 15:07:37 2019 - [info]  binlog.000001:892
Tue Feb 26 15:07:37 2019 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.25.17.2', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Tue Feb 26 15:07:37 2019 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: binlog.000001, 892, da3b896e-397d-11e9-9492-5254001cd5e6:1-3
Tue Feb 26 15:07:37 2019 - [warning] master_ip_failover_script is not set. Skipping taking over new master IP address.
Tue Feb 26 15:07:37 2019 - [info] Setting read_only=0 on 172.25.17.2(172.25.17.2:3306)..
Tue Feb 26 15:07:37 2019 - [info]  ok.
Tue Feb 26 15:07:37 2019 - [info] ** Finished master recovery successfully.
Tue Feb 26 15:07:37 2019 - [info] * Phase 3: Master Recovery Phase completed.
Tue Feb 26 15:07:37 2019 - [info] 
Tue Feb 26 15:07:37 2019 - [info] * Phase 4: Slaves Recovery Phase..
Tue Feb 26 15:07:37 2019 - [info] 
Tue Feb 26 15:07:37 2019 - [info] 
Tue Feb 26 15:07:37 2019 - [info] * Phase 4.1: Starting Slaves in parallel..
Tue Feb 26 15:07:37 2019 - [info] 
Tue Feb 26 15:07:37 2019 - [info] -- Slave recovery on host 172.25.17.4(172.25.17.4:3306) started, pid: 12280. Check tmp log /etc/masterha/172.25.17.4_3306_20190226150731.log if it takes time..
Tue Feb 26 15:07:38 2019 - [info] 
Tue Feb 26 15:07:38 2019 - [info] Log messages from 172.25.17.4 ...
Tue Feb 26 15:07:38 2019 - [info] 
Tue Feb 26 15:07:37 2019 - [info]  Resetting slave 172.25.17.4(172.25.17.4:3306) and starting replication from the new master 172.25.17.2(172.25.17.2:3306)..
Tue Feb 26 15:07:38 2019 - [info]  Executed CHANGE MASTER.
Tue Feb 26 15:07:38 2019 - [info]  Slave started.
Tue Feb 26 15:07:38 2019 - [info]  gtid_wait(da3b896e-397d-11e9-9492-5254001cd5e6:1-3) completed on 172.25.17.4(172.25.17.4:3306). Executed 0 events.
Tue Feb 26 15:07:38 2019 - [info] End of log messages from 172.25.17.4.
Tue Feb 26 15:07:38 2019 - [info] -- Slave on host 172.25.17.4(172.25.17.4:3306) started.
Tue Feb 26 15:07:38 2019 - [info] All new slave servers recovered successfully.
Tue Feb 26 15:07:38 2019 - [info] 
Tue Feb 26 15:07:38 2019 - [info] * Phase 5: New master cleanup phase..
Tue Feb 26 15:07:38 2019 - [info] 
Tue Feb 26 15:07:38 2019 - [info] Resetting slave info on the new master..
Tue Feb 26 15:07:39 2019 - [info]  172.25.17.2: Resetting slave info succeeded.
Tue Feb 26 15:07:39 2019 - [info] Master failover to 172.25.17.2(172.25.17.2:3306) completed successfully.
Tue Feb 26 15:07:39 2019 - [info] 

----- Failover Report -----

app1: MySQL Master failover 172.25.17.1(172.25.17.1:3306) to 172.25.17.2(172.25.17.2:3306) succeeded

Master 172.25.17.1(172.25.17.1:3306) is down!

Check MHA Manager logs at server5 for details.

Started manual(interactive) failover.
Selected 172.25.17.2(172.25.17.2:3306) as a new master.
172.25.17.2(172.25.17.2:3306): OK: Applying all logs succeeded.
172.25.17.4(172.25.17.4:3306): OK: Slave started, replicating from 172.25.17.2(172.25.17.2:3306)
172.25.17.2(172.25.17.2:3306): Resetting slave info succeeded.
Master failover to 172.25.17.2(172.25.17.2:3306) completed successfully.
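
If this output is saved to a file, the result can be extracted with a small filter. The following is only a sketch: the function name mha_new_master and the log path in the usage note are assumptions, and it relies solely on the closing "Master failover to ... completed successfully." line shown above.

```shell
#!/usr/bin/env bash
# mha_new_master: read an MHA failover log on stdin and print the host that
# the log reports as the new master. Returns non-zero if no such line exists.
mha_new_master() {
    # Keep only the host name in front of "(ip:port)" on the success line,
    # take the last occurrence, and let grep set the exit status.
    sed -n 's/.*Master failover to \([^ (]*\)(.*) completed successfully\..*/\1/p' \
        | tail -n 1 | grep .
}
```

For the run above, mha_new_master < /tmp/failover.log (a hypothetical path) would print 172.25.17.2.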

The master has now been switched from 172.25.17.1 to 172.25.17.2.

(3) Verification (run on server4):

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.17.2
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000001
          Read_Master_Log_Pos: 892
               Relay_Log_File: server4-relay-bin.000002
                Relay_Log_Pos: 405
        Relay_Master_Log_File: binlog.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 892
              Relay_Log_Space: 614
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
                  Master_UUID: d9947527-397d-11e9-9425-525400dce289
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: da3b896e-397d-11e9-9492-5254001cd5e6:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

ERROR: 
No query specified
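
Only a few fields in this output matter when verifying a switch: Master_Host, Slave_IO_Running and Slave_SQL_Running. As a sketch (the helper name check_slave is an assumption, not part of MHA or MySQL), they can be checked from a script like this:

```shell
#!/usr/bin/env bash
# check_slave EXPECTED_MASTER: read "SHOW SLAVE STATUS\G" output on stdin and
# succeed only if both replication threads run and Master_Host matches.
check_slave() {
    awk -v want="$1" '
        # Lines look like "   Slave_IO_Running: Yes"; trim padding from key and value.
        {
            key = $0; sub(/:.*/, "", key);     gsub(/^[ \t]+|[ \t]+$/, "", key)
            val = $0; sub(/^[^:]*:/, "", val); gsub(/^[ \t]+|[ \t]+$/, "", val)
        }
        key == "Master_Host"       { host = val }
        key == "Slave_IO_Running"  { io = val }
        key == "Slave_SQL_Running" { sql = val }
        END {
            ok = (host == want && io == "Yes" && sql == "Yes")
            printf "master=%s io=%s sql=%s\n", host, io, sql
            exit (ok ? 0 : 1)
        }'
}
```

Usage on a slave might look like: mysql -p -e 'SHOW SLAVE STATUS\G' | check_slave 172.25.17.2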

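Master_Host now shows 172.25.17.2. (The trailing "ERROR: No query specified" is harmless: \G already terminates the statement, so the extra ";" is sent as an empty query.)

server1 does not rejoin the cluster by itself. After mysqld is started on it again, point it at the new master so that it becomes a slave: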

[root@server1 ~]# systemctl start mysqld
[root@server1 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.24-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> CHANGE MASTER TO MASTER_HOST='172.25.17.2', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='Westos-6';
Query OK, 0 rows affected, 2 warnings (0.29 sec)

mysql> start slave;
Query OK, 0 rows affected (0.02 sec)

mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.17.2
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000001
          Read_Master_Log_Pos: 892
               Relay_Log_File: server1-relay-bin.000002
                Relay_Log_Pos: 405
        Relay_Master_Log_File: binlog.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 892
              Relay_Log_Space: 614
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
                  Master_UUID: d9947527-397d-11e9-9425-525400dce289
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: da3b896e-397d-11e9-9492-5254001cd5e6:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

ERROR: 
No query specified
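
The two statements used to re-attach server1 can be generated rather than typed. This is a minimal sketch under stated assumptions: the function name change_master_sql and the REPL_PASSWORD variable are hypothetical, and the statement text is the one MHA printed in its log ("All other slaves should start replication from here").

```shell
#!/usr/bin/env bash
# change_master_sql HOST [PORT] [USER]: print the statements that point a
# restarted server at the new master using GTID auto-positioning.
# The password comes from REPL_PASSWORD (hypothetical variable name) so it is
# not hard-coded; a password containing a single quote is not handled here.
change_master_sql() {
    local host=$1 port=${2:-3306} user=${3:-repl}
    printf "CHANGE MASTER TO MASTER_HOST='%s', MASTER_PORT=%s, MASTER_AUTO_POSITION=1, MASTER_USER='%s', MASTER_PASSWORD='%s';\n" \
        "$host" "$port" "$user" "${REPL_PASSWORD:?set REPL_PASSWORD first}"
    printf 'START SLAVE;\n'
}
```

The output can be reviewed first and then piped into the client, for example: change_master_sql 172.25.17.2 | mysql -p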

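2. Online switch (hot switch):

An online switch moves the master role while mysqld keeps running on every host.

The previous failover left the file app1.failover.complete in the MHA working directory. While that file exists, MHA refuses to switch again within eight hours, so delete it first: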
[root@server5 ~]# cd /etc/masterha/
[root@server5 masterha]# ls
app1.cnf app1.failover.complete
[root@server5 masterha]# rm -fr app1.failover.complete
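
The eight-hour rule can be checked before deciding whether the file needs to go. The helper below is a sketch, not part of MHA: the name failover_blocked is an assumption, and it assumes the limit is judged by the file's modification time with a default of 480 minutes. It uses find -mmin, which GNU find provides. masterha_manager also has an --ignore_last_failover option that is meant to bypass this check.

```shell
#!/usr/bin/env bash
# failover_blocked FILE [MINUTES]: succeed if FILE exists and was modified
# within the last MINUTES (default 480, the eight hours mentioned above).
failover_blocked() {
    local file=$1 minutes=${2:-480}
    # find prints the path only when its mtime is newer than the limit.
    [ -e "$file" ] && [ -n "$(find "$file" -mmin "-$minutes")" ]
}
```

For example: failover_blocked /etc/masterha/app1.failover.complete && echo "remove the file before switching"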

Switch the master from server2 back to server1:

[root@server5 masterha]# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=172.25.17.1 --new_master_port=3306 --orig_master_is_new_slav
Tue Feb 26 15:16:05 2019 - [info] MHA::MasterRotate version 0.58.
Tue Feb 26 15:16:05 2019 - [info] Starting online master switch..
Tue Feb 26 15:16:05 2019 - [info] 
Tue Feb 26 15:16:05 2019 - [info] * Phase 1: Configuration Check Phase..
Tue Feb 26 15:16:05 2019 - [info] 
Tue Feb 26 15:16:05 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Feb 26 15:16:05 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Feb 26 15:16:05 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Feb 26 15:16:06 2019 - [info] GTID failover mode = 1
Tue Feb 26 15:16:06 2019 - [info] Current Alive Master: 172.25.17.2(172.25.17.2:3306)
Tue Feb 26 15:16:06 2019 - [info] Alive Slaves:
Tue Feb 26 15:16:06 2019 - [info]   172.25.17.1(172.25.17.1:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:16:06 2019 - [info]     GTID ON
Tue Feb 26 15:16:06 2019 - [info]     Replicating from 172.25.17.2(172.25.17.2:3306)
Tue Feb 26 15:16:06 2019 - [info]   172.25.17.4(172.25.17.4:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 15:16:06 2019 - [info]     GTID ON
Tue Feb 26 15:16:06 2019 - [info]     Replicating from 172.25.17.2(172.25.17.2:3306)
Tue Feb 26 15:16:06 2019 - [info]     Not candidate for the new Master (no_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 172.25.17.2(172.25.17.2:3306)? (YES/no): yes
Tue Feb 26 15:16:12 2019 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Tue Feb 26 15:16:12 2019 - [info]  ok.
Tue Feb 26 15:16:12 2019 - [info] Checking MHA is not monitoring or doing failover..
Tue Feb 26 15:16:12 2019 - [info] Checking replication health on 172.25.17.1..
Tue Feb 26 15:16:12 2019 - [info]  ok.
Tue Feb 26 15:16:12 2019 - [info] Checking replication health on 172.25.17.4..
Tue Feb 26 15:16:12 2019 - [info]  ok.
Tue Feb 26 15:16:12 2019 - [info] 172.25.17.1 can be new master.
Tue Feb 26 15:16:12 2019 - [info] 
From:
172.25.17.2(172.25.17.2:3306) (current master)
 +--172.25.17.1(172.25.17.1:3306)
 +--172.25.17.4(172.25.17.4:3306)

To:
172.25.17.1(172.25.17.1:3306) (new master)
 +--172.25.17.4(172.25.17.4:3306)
 +--172.25.17.2(172.25.17.2:3306)

Starting master switch from 172.25.17.2(172.25.17.2:3306) to 172.25.17.1(172.25.17.1:3306)? (yes/NO): yes
Tue Feb 26 15:16:13 2019 - [info] Checking whether 172.25.17.1(172.25.17.1:3306) is ok for the new master..
Tue Feb 26 15:16:13 2019 - [info]  ok.
Tue Feb 26 15:16:13 2019 - [info] 172.25.17.2(172.25.17.2:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Tue Feb 26 15:16:14 2019 - [info] 172.25.17.2(172.25.17.2:3306): Resetting slave pointing to the dummy host.
Tue Feb 26 15:16:14 2019 - [info] ** Phase 1: Configuration Check Phase completed.
Tue Feb 26 15:16:14 2019 - [info] 
Tue Feb 26 15:16:14 2019 - [info] * Phase 2: Rejecting updates Phase..
Tue Feb 26 15:16:14 2019 - [info] 
master_ip_online_change_script is not defined. If you do not disable writes on the current master manually, applications keep writing on the current master. Is it ok to proceed? (yes/NO): yes
Tue Feb 26 15:16:19 2019 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Tue Feb 26 15:16:19 2019 - [info] Executing FLUSH TABLES WITH READ LOCK..
Tue Feb 26 15:16:19 2019 - [info]  ok.
Tue Feb 26 15:16:19 2019 - [info] Orig master binlog:pos is binlog.000001:892.
Tue Feb 26 15:16:19 2019 - [info]  Waiting to execute all relay logs on 172.25.17.1(172.25.17.1:3306)..
Tue Feb 26 15:16:19 2019 - [info]  master_pos_wait(binlog.000001:892) completed on 172.25.17.1(172.25.17.1:3306). Executed 0 events.
Tue Feb 26 15:16:19 2019 - [info]   done.
Tue Feb 26 15:16:19 2019 - [info] Getting new master's binlog name and position..
Tue Feb 26 15:16:19 2019 - [info]  binlog.000002:194
Tue Feb 26 15:16:19 2019 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.25.17.1', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Tue Feb 26 15:16:19 2019 - [info] 
Tue Feb 26 15:16:19 2019 - [info] * Switching slaves in parallel..
Tue Feb 26 15:16:19 2019 - [info] 
Tue Feb 26 15:16:19 2019 - [info] -- Slave switch on host 172.25.17.4(172.25.17.4:3306) started, pid: 12296
Tue Feb 26 15:16:19 2019 - [info] 
Tue Feb 26 15:16:20 2019 - [info] Log messages from 172.25.17.4 ...
Tue Feb 26 15:16:20 2019 - [info] 
Tue Feb 26 15:16:19 2019 - [info]  Waiting to execute all relay logs on 172.25.17.4(172.25.17.4:3306)..
Tue Feb 26 15:16:19 2019 - [info]  master_pos_wait(binlog.000001:892) completed on 172.25.17.4(172.25.17.4:3306). Executed 0 events.
Tue Feb 26 15:16:19 2019 - [info]   done.
Tue Feb 26 15:16:19 2019 - [info]  Resetting slave 172.25.17.4(172.25.17.4:3306) and starting replication from the new master 172.25.17.1(172.25.17.1:3306)..
Tue Feb 26 15:16:19 2019 - [info]  Executed CHANGE MASTER.
Tue Feb 26 15:16:19 2019 - [info]  Slave started.
Tue Feb 26 15:16:20 2019 - [info] End of log messages from 172.25.17.4 ...
Tue Feb 26 15:16:20 2019 - [info] 
Tue Feb 26 15:16:20 2019 - [info] -- Slave switch on host 172.25.17.4(172.25.17.4:3306) succeeded.
Tue Feb 26 15:16:20 2019 - [info] Unlocking all tables on the orig master:
Tue Feb 26 15:16:20 2019 - [info] Executing UNLOCK TABLES..
Tue Feb 26 15:16:20 2019 - [info]  ok.
Tue Feb 26 15:16:20 2019 - [info] Starting orig master as a new slave..
Tue Feb 26 15:16:20 2019 - [info]  Resetting slave 172.25.17.2(172.25.17.2:3306) and starting replication from the new master 172.25.17.1(172.25.17.1:3306)..
Tue Feb 26 15:16:20 2019 - [info]  Executed CHANGE MASTER.
Tue Feb 26 15:16:20 2019 - [info]  Slave started.
Tue Feb 26 15:16:20 2019 - [info] All new slave servers switched successfully.
Tue Feb 26 15:16:20 2019 - [info] 
Tue Feb 26 15:16:20 2019 - [info] * Phase 5: New master cleanup phase..
Tue Feb 26 15:16:20 2019 - [info] 
Tue Feb 26 15:16:21 2019 - [info]  172.25.17.1: Resetting slave info succeeded.
Tue Feb 26 15:16:21 2019 - [info] Switching master to 172.25.17.1(172.25.17.1:3306) completed successfully.
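
The run above answers every prompt by hand. For scripted use, masterha_master_switch documents --interactive=0 to skip the prompts. The sketch below only prints the command so it can be reviewed before it is run; the helper name online_switch_cmd is an assumption, and the default configuration path is the one used in this post.

```shell
#!/usr/bin/env bash
# online_switch_cmd NEW_MASTER [CONF]: print (do not run) an online switch
# command that promotes NEW_MASTER and keeps the old master as a slave.
online_switch_cmd() {
    local new_master=$1 conf=${2:-/etc/masterha/app1.cnf}
    local args=(
        masterha_master_switch
        "--conf=$conf"
        --master_state=alive
        "--new_master_host=$new_master"
        --new_master_port=3306
        --orig_master_is_new_slave
        --interactive=0    # no yes/no prompts; review the command first
    )
    printf '%s\n' "${args[*]}"
}
```

Printing instead of executing is a deliberate choice here: without the prompts there is no last chance to abort once the command starts.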

Verification on server2:
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.17.1
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000002
          Read_Master_Log_Pos: 194
               Relay_Log_File: server2-relay-bin.000002
                Relay_Log_Pos: 361
        Relay_Master_Log_File: binlog.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 194
              Relay_Log_Space: 570
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 1
                  Master_UUID: da3b896e-397d-11e9-9492-5254001cd5e6
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: da3b896e-397d-11e9-9492-5254001cd5e6:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

ERROR: 
No query specified

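3. Automatic failover

Preparation: start masterha_manager in the background on server5, then stop mysqld on the current master (server1) to simulate a failure: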

[root@server5 masterha]# nohup masterha_manager --conf=/etc/masterha/app1.cnf &> /dev/null &
[1] 12301

[root@server1 ~]# systemctl stop mysqld


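Output on server5 (masterha_manager exits once the failover has finished, so the background job is reported as Done):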

[root@server5 masterha]# 
[1]+  Done                    nohup masterha_manager --conf=/etc/masterha/app1.cnf &>/dev/null

Start mysqld on server1 again and re-attach it as a slave of the new master (172.25.17.2):

[root@server1 ~]# systemctl start mysqld
[root@server1 ~]# mysql -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.7.24-log MySQL Community Server (GPL)

Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


mysql> CHANGE MASTER TO MASTER_HOST='172.25.17.2', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='Westos-6';
Query OK, 0 rows affected, 2 warnings (0.19 sec)

mysql> start slave;
Query OK, 0 rows affected (0.07 sec)

mysql> show slave  status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 172.25.17.2
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: binlog.000001
          Read_Master_Log_Pos: 892
               Relay_Log_File: server1-relay-bin.000002
                Relay_Log_Pos: 405
        Relay_Master_Log_File: binlog.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 892
              Relay_Log_Space: 614
              Until_Condition: None
               Until_Log_File: 
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File: 
           Master_SSL_CA_Path: 
              Master_SSL_Cert: 
            Master_SSL_Cipher: 
               Master_SSL_Key: 
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error: 
               Last_SQL_Errno: 0
               Last_SQL_Error: 
  Replicate_Ignore_Server_Ids: 
             Master_Server_Id: 2
                  Master_UUID: d9947527-397d-11e9-9425-525400dce289
             Master_Info_File: /var/lib/mysql/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind: 
      Last_IO_Error_Timestamp: 
     Last_SQL_Error_Timestamp: 
               Master_SSL_Crl: 
           Master_SSL_Crlpath: 
           Retrieved_Gtid_Set: 
            Executed_Gtid_Set: da3b896e-397d-11e9-9492-5254001cd5e6:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB: 
                 Channel_Name: 
           Master_TLS_Version: 
1 row in set (0.00 sec)

ERROR: 
No query specified

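Next, deploy the VIP:

Note: the two scripts under /usr/local/bin/ were downloaded from the official site; their full contents are covered in the next post.

1. Configure the manager:

Edit the following three files and make the two scripts executable: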

[root@server5 scripts]# cd /usr/local/bin/
[root@server5 bin]# vim master_ip_online_change 

my $vip = '172.25.17.100/24';                            # VIP with its prefix length
my $ssh_start_vip = "/sbin/ip addr add $vip dev eth0";   # run on the new master
my $ssh_stop_vip  = "/sbin/ip addr del $vip dev eth0";   # run on the old master


[root@server5 bin]# vim master_ip_failover 


my $vip = '172.25.17.100/24';                            # VIP with its prefix length
my $exit_code = 0;
my $ssh_start_vip = "/sbin/ip addr add $vip dev eth0";   # run on the new master
my $ssh_stop_vip  = "/sbin/ip addr del $vip dev eth0";   # run on the old master

[root@server5 bin]# chmod +x *

[root@server5 masterha]# vim app1.cnf 

master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
password=Westos-6
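
The full scripts are not reproduced in this post, so what follows is only a dry-run sketch of what the two variables amount to: the VIP is removed on the old master and added on the new one. The function names vip_command and move_vip are assumptions, and the use of ssh as root is taken from the --orig_master_ssh_user and --new_master_ssh_user values that MHA passes to the script.

```shell
#!/usr/bin/env bash
# vip_command start|stop [VIP] [DEV]: print the ip command that adds (start)
# or removes (stop) the VIP. Defaults match the values edited above.
vip_command() {
    local action=$1 vip=${2:-172.25.17.100/24} dev=${3:-eth0}
    case $action in
        start) printf '/sbin/ip addr add %s dev %s\n' "$vip" "$dev" ;;
        stop)  printf '/sbin/ip addr del %s dev %s\n' "$vip" "$dev" ;;
        *)     printf 'usage: vip_command start|stop [vip] [dev]\n' >&2; return 2 ;;
    esac
}

# move_vip OLD_MASTER NEW_MASTER: dry run that prints the two ssh commands,
# old master first so the address is never configured on both hosts at once.
move_vip() {
    printf 'ssh root@%s "%s"\n' "$1" "$(vip_command stop)"
    printf 'ssh root@%s "%s"\n' "$2" "$(vip_command start)"
}
```

Calling move_vip 172.25.17.1 172.25.17.2 prints the del command for server1 followed by the add command for server2; nothing is executed.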

2. Add the VIP to server1 (the current master):

[root@server1 ~]# ip addr add 172.25.17.100/24 dev eth0
[root@server1 ~]# ip addr 
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:dc:e2:89 brd ff:ff:ff:ff:ff:ff
    inet 172.25.17.1/24 brd 172.25.17.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 172.25.17.100/24 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fedc:e289/64 scope link 
       valid_lft forever preferred_lft forever

3. Move the VIP with an online switch:

[root@server5 masterha]# masterha_master_switch --conf=/etc/masterha/app1.cnf --master_state=alive --new_master_host=172.25.17.2 --new_master_port=3306 --orig_master_is_new_slave --running_updates_limit=10000
Tue Feb 26 17:11:02 2019 - [info] MHA::MasterRotate version 0.58.
Tue Feb 26 17:11:02 2019 - [info] Starting online master switch..
Tue Feb 26 17:11:02 2019 - [info] 
Tue Feb 26 17:11:02 2019 - [info] * Phase 1: Configuration Check Phase..
Tue Feb 26 17:11:02 2019 - [info] 
Tue Feb 26 17:11:02 2019 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Tue Feb 26 17:11:02 2019 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Tue Feb 26 17:11:02 2019 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Tue Feb 26 17:11:03 2019 - [info] GTID failover mode = 1
Tue Feb 26 17:11:03 2019 - [info] Current Alive Master: 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 17:11:03 2019 - [info] Alive Slaves:
Tue Feb 26 17:11:03 2019 - [info]   172.25.17.2(172.25.17.2:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 17:11:03 2019 - [info]     GTID ON
Tue Feb 26 17:11:03 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 17:11:03 2019 - [info]     Primary candidate for the new Master (candidate_master is set)
Tue Feb 26 17:11:03 2019 - [info]   172.25.17.4(172.25.17.4:3306)  Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Feb 26 17:11:03 2019 - [info]     GTID ON
Tue Feb 26 17:11:03 2019 - [info]     Replicating from 172.25.17.1(172.25.17.1:3306)
Tue Feb 26 17:11:03 2019 - [info]     Not candidate for the new Master (no_master is set)

It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 172.25.17.1(172.25.17.1:3306)? (YES/no): yes
Tue Feb 26 17:11:09 2019 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time..
Tue Feb 26 17:11:09 2019 - [info]  ok.
Tue Feb 26 17:11:09 2019 - [info] Checking MHA is not monitoring or doing failover..
Tue Feb 26 17:11:09 2019 - [info] Checking replication health on 172.25.17.2..
Tue Feb 26 17:11:09 2019 - [info]  ok.
Tue Feb 26 17:11:09 2019 - [info] Checking replication health on 172.25.17.4..
Tue Feb 26 17:11:09 2019 - [info]  ok.
Tue Feb 26 17:11:09 2019 - [info] 172.25.17.2 can be new master.
Tue Feb 26 17:11:09 2019 - [info] 
From:
172.25.17.1(172.25.17.1:3306) (current master)
 +--172.25.17.2(172.25.17.2:3306)
 +--172.25.17.4(172.25.17.4:3306)

To:
172.25.17.2(172.25.17.2:3306) (new master)
 +--172.25.17.4(172.25.17.4:3306)
 +--172.25.17.1(172.25.17.1:3306)

Starting master switch from 172.25.17.1(172.25.17.1:3306) to 172.25.17.2(172.25.17.2:3306)? (yes/NO): yes
Tue Feb 26 17:11:13 2019 - [info] Checking whether 172.25.17.2(172.25.17.2:3306) is ok for the new master..
Tue Feb 26 17:11:13 2019 - [info]  ok.
Tue Feb 26 17:11:13 2019 - [info] 172.25.17.1(172.25.17.1:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host.
Tue Feb 26 17:11:13 2019 - [info] 172.25.17.1(172.25.17.1:3306): Resetting slave pointing to the dummy host.
Tue Feb 26 17:11:13 2019 - [info] ** Phase 1: Configuration Check Phase completed.
Tue Feb 26 17:11:13 2019 - [info] 
Tue Feb 26 17:11:13 2019 - [info] * Phase 2: Rejecting updates Phase..
Tue Feb 26 17:11:13 2019 - [info] 
Tue Feb 26 17:11:13 2019 - [info] Executing master ip online change script to disable write on the current master:
Tue Feb 26 17:11:13 2019 - [info]   /usr/local/bin/master_ip_online_change --command=stop --orig_master_host=172.25.17.1 --orig_master_ip=172.25.17.1 --orig_master_port=3306 --orig_master_user='root' --new_master_host=172.25.17.2 --new_master_ip=172.25.17.2 --new_master_port=3306 --new_master_user='root' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx



***************************************************************
Disabling the VIP - 172.25.17.100/24 on old master: 172.25.17.1
***************************************************************



Tue Feb 26 17:11:14 2019 - [info]  ok.
Tue Feb 26 17:11:14 2019 - [info] Locking all tables on the orig master to reject updates from everybody (including root):
Tue Feb 26 17:11:14 2019 - [info] Executing FLUSH TABLES WITH READ LOCK..
Tue Feb 26 17:11:14 2019 - [info]  ok.
Tue Feb 26 17:11:14 2019 - [info] Orig master binlog:pos is binlog.000003:194.
Tue Feb 26 17:11:14 2019 - [info]  Waiting to execute all relay logs on 172.25.17.2(172.25.17.2:3306)..
Tue Feb 26 17:11:14 2019 - [info]  master_pos_wait(binlog.000003:194) completed on 172.25.17.2(172.25.17.2:3306). Executed 0 events.
Tue Feb 26 17:11:14 2019 - [info]   done.
Tue Feb 26 17:11:14 2019 - [info] Getting new master's binlog name and position..
Tue Feb 26 17:11:14 2019 - [info]  binlog.000001:892
Tue Feb 26 17:11:14 2019 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.25.17.2', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Tue Feb 26 17:11:14 2019 - [info] Executing master ip online change script to allow write on the new master:
Tue Feb 26 17:11:14 2019 - [info]   /usr/local/bin/master_ip_online_change --command=start --orig_master_host=172.25.17.1 --orig_master_ip=172.25.17.1 --orig_master_port=3306 --orig_master_user='root' --new_master_host=172.25.17.2 --new_master_ip=172.25.17.2 --new_master_port=3306 --new_master_user='root' --orig_master_ssh_user=root --new_master_ssh_user=root   --orig_master_is_new_slave --orig_master_password=xxx --new_master_password=xxx



***************************************************************
Enabling the VIP - 172.25.17.100/24 on new master: 172.25.17.2 
***************************************************************



Tue Feb 26 17:11:14 2019 - [info]  ok.
Tue Feb 26 17:11:14 2019 - [info] Setting read_only=0 on 172.25.17.2(172.25.17.2:3306)..
Tue Feb 26 17:11:14 2019 - [info]  ok.
Tue Feb 26 17:11:14 2019 - [info] 
Tue Feb 26 17:11:14 2019 - [info] * Switching slaves in parallel..
Tue Feb 26 17:11:14 2019 - [info] 
Tue Feb 26 17:11:14 2019 - [info] -- Slave switch on host 172.25.17.4(172.25.17.4:3306) started, pid: 12449
Tue Feb 26 17:11:14 2019 - [info] 
Tue Feb 26 17:11:15 2019 - [info] Log messages from 172.25.17.4 ...
Tue Feb 26 17:11:15 2019 - [info] 
Tue Feb 26 17:11:14 2019 - [info]  Waiting to execute all relay logs on 172.25.17.4(172.25.17.4:3306)..
Tue Feb 26 17:11:14 2019 - [info]  master_pos_wait(binlog.000003:194) completed on 172.25.17.4(172.25.17.4:3306). Executed 0 events.
Tue Feb 26 17:11:14 2019 - [info]   done.
Tue Feb 26 17:11:14 2019 - [info]  Resetting slave 172.25.17.4(172.25.17.4:3306) and starting replication from the new master 172.25.17.2(172.25.17.2:3306)..
Tue Feb 26 17:11:14 2019 - [info]  Executed CHANGE MASTER.
Tue Feb 26 17:11:14 2019 - [info]  Slave started.
Tue Feb 26 17:11:15 2019 - [info] End of log messages from 172.25.17.4 ...
Tue Feb 26 17:11:15 2019 - [info] 
Tue Feb 26 17:11:15 2019 - [info] -- Slave switch on host 172.25.17.4(172.25.17.4:3306) succeeded.
Tue Feb 26 17:11:15 2019 - [info] Unlocking all tables on the orig master:
Tue Feb 26 17:11:15 2019 - [info] Executing UNLOCK TABLES..
Tue Feb 26 17:11:15 2019 - [info]  ok.
Tue Feb 26 17:11:15 2019 - [info] Starting orig master as a new slave..
Tue Feb 26 17:11:15 2019 - [info]  Resetting slave 172.25.17.1(172.25.17.1:3306) and starting replication from the new master 172.25.17.2(172.25.17.2:3306)..
Tue Feb 26 17:11:15 2019 - [info]  Executed CHANGE MASTER.
Tue Feb 26 17:11:15 2019 - [info]  Slave started.
Tue Feb 26 17:11:15 2019 - [info] All new slave servers switched successfully.
Tue Feb 26 17:11:15 2019 - [info] 
Tue Feb 26 17:11:15 2019 - [info] * Phase 5: New master cleanup phase..
Tue Feb 26 17:11:15 2019 - [info] 
Tue Feb 26 17:11:16 2019 - [info]  172.25.17.2: Resetting slave info succeeded.
Tue Feb 26 17:11:16 2019 - [info] Switching master to 172.25.17.2(172.25.17.2:3306) completed successfully.
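
To confirm from a script which host MHA promoted, the final log line above can be parsed. This is a minimal sketch, assuming the log text is piped in on stdin and ends with the "Switching master to ... completed successfully." line shown above. The function name `new_master_from_log` is hypothetical and not part of MHA.

```shell
# Hypothetical helper (not part of MHA): read MHA switch log text on stdin and
# print the host from the last "Switching master to HOST(HOST:PORT) completed successfully." line.
# Prints nothing if no such line is present.
new_master_from_log() {
    sed -n 's/.*Switching master to \([^(]*\)(.*) completed successfully\.$/\1/p' | tail -n 1
}

# Example usage (the log file path is an assumption; use wherever you saved the output):
#   new_master_from_log < /path/to/switch.log
```

An empty result means the completion line was not found, so the switch should be treated as unconfirmed.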

After the switch, the VIP 172.25.17.100 has moved to server2; `ip addr` on server2 shows it as a secondary address on eth0:


[root@server2 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 52:54:00:dc:e2:89 brd ff:ff:ff:ff:ff:ff
    inet 172.25.17.2/24 brd 172.25.17.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 172.25.17.100/24 scope global secondary eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fedc:e289/64 scope link 
       valid_lft forever preferred_lft forever
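
The same check can be scripted instead of reading the output by eye. The sketch below assumes `ip addr` output is piped in on stdin and looks for an `inet` entry whose address exactly matches the VIP. The function name `has_vip` is hypothetical.

```shell
# Hypothetical helper: succeed (exit 0) if the given IPv4 address appears as an
# "inet" entry in `ip addr` output read from stdin, otherwise fail (exit 1).
# The address is compared exactly, so 172.25.17.1 does not match 172.25.17.100.
has_vip() {
    awk -v vip="$1" '
        $1 == "inet" { split($2, a, "/"); if (a[1] == vip) found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example usage on server2 (interface name eth0 as in the output above):
#   ip addr show eth0 | has_vip 172.25.17.100 && echo "VIP is on this host"
```

Running the same check on server1 should now fail, since the VIP has been removed from the original master.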
