版权声明:本文为博主原创文章,未经博主允许不得转载。 https://blog.csdn.net/u010719917/article/details/80406863
MHA failover5个阶段:
* Phase 1: Configuration Check Phase.. 检查配置文件,选候选slave为新的master
* Phase 2: Dead Master Shutdown Phase.. 强制关闭,并执行master ip漂移
* Phase 3: Master Recovery Phase.. 构建新的master
** Phase 3.1: Getting Latest Slaves Phase.. 获得最新的binary log file/position
The latest binary log file/position on all slaves is mysql_bin.000005:190 最新的binary log file和日志位置
The oldest binary log file/position on all slaves is mysql_bin.000005:190 最旧的binary log file和日志位置
差异日志
** Phase 3.2: Saving Dead Master's Binlog Phase.. 保存Dead master's binlog
Fetching dead master's binary logs..
** Phase 3.3: Determining New Master Phase.. 选取New Master
From:
centos129 (current master)
+--centos128
+--centos130
To:
centos128 (new master)
+--centos130
** Phase 3.3: New Master Diff Log Generation Phase.. New Master差异日志生成阶段
** Phase 3.4: Master Log Apply Phase..
All relay logs were successfully applied. 所有relay log被运用之后,new master的binlog和position信息:
Getting new master's binlog name and position..
mysql_bin.000020:230 其它的slave 开始复制的日志点
vip漂移:
/usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=centos129 --orig_master_ip=192.168.44.129 --orig_master_port=3306 --new_master_host=centos128 --new_master_ip=192.168.44.128 --new_master_port=3306 --new_master_user='root' --new_master_password='123456'
IN SCRIPT TEST====/usr/sbin/ifconfig ens33:1 down==/usr/sbin/ifconfig ens33:1 192.168.44.211===
* Phase 4: Slaves Recovery Phase.. slave的恢复阶段
* Phase 4.1: Starting Parallel Slave Diff Log Generation Phase.. 开始并行slave差异日志的产生阶段
* Phase 4.2: Starting Parallel Slave Log Apply Phase.. 开始并行slave差异日志的应用阶段
* Phase 5: New master cleanup phase.. 新master的清理阶段
通过脚本masterha_secondary_check 多种线路检测master是否存活
masterha_secondary_check -s centos129 -s centos128 -user=root --master_host=centos128 --master_ip=192.168.44.128 --master_port=3306 --user=root --master_host=centos129 --master_ip=192.168.44.129 --master_port=3306
决定是否Failover
Monitoring server centos129 is reachable, Master is not reachable from centos129. OK.
Tue May 22 13:53:54 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.44.129' (111))
Tue May 22 13:53:54 2018 - [warning] Connection failed 1 time(s)..
Monitoring server centos128 is reachable, Master is not reachable from centos128. OK.
Tue May 22 13:53:54 2018 - [info] Master is not reachable from al l other monitoring servers. Failover should start.
Tue May 22 13:53:55 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.44.129' (111))
Tue May 22 13:53:55 2018 - [warning] Connection failed 2 time(s)..
Tue May 22 13:53:56 2018 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '192.168.44.129' (111))
Tue May 22 13:53:56 2018 - [warning] Connection failed 3 time(s)..
Tue May 22 13:53:56 2018 - [warning] Master is not reachable from health checker!
Tue May 22 13:53:56 2018 - [warning] Master centos129(192.168.44.129:3306) is not reachable!
读配置文件/etc/masterha/app1.cnf,选择侯选的slave为新的master
Reading server configurations from /etc/masterha/app1.cnf..
Tue May 22 13:53:57 2018 - [info] Dead Servers:
Tue May 22 13:53:57 2018 - [info] centos129(192.168.44.129:3306)
Tue May 22 13:53:57 2018 - [info] Alive Servers:
Tue May 22 13:53:57 2018 - [info] centos128(192.168.44.128:3306)
Tue May 22 13:53:57 2018 - [info] centos130(192.168.44.130:3306)
Tue May 22 13:53:57 2018 - [info] Alive Slaves:
Tue May 22 13:53:57 2018 - [info] centos128(192.168.44.128:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:57 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:53:57 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Tue May 22 13:53:57 2018 - [info] centos130(192.168.44.130:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:57 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
* Phase 1: Configuration Check Phase.. 检查配置文件,选候选slave为新的master
Tue May 22 13:53:57 2018 - [info]
Tue May 22 13:53:58 2018 - [info] Dead Servers:
Tue May 22 13:53:58 2018 - [info] centos129(192.168.44.129:3306)
Tue May 22 13:53:58 2018 - [info] Checking master reachability via mysql(double check)..
Tue May 22 13:53:58 2018 - [info] ok.
Tue May 22 13:53:58 2018 - [info] Alive Servers:
Tue May 22 13:53:58 2018 - [info] centos128(192.168.44.128:3306)
Tue May 22 13:53:58 2018 - [info] centos130(192.168.44.130:3306)
Tue May 22 13:53:58 2018 - [info] Alive Slaves:
Tue May 22 13:53:58 2018 - [info] centos128(192.168.44.128:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:58 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:53:58 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Tue May 22 13:53:58 2018 - [info] centos130(192.168.44.130:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:58 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:53:58 2018 - [info] ** Phase 1: Configuration Check Phase completed.
* Phase 2: Dead Master Shutdown Phase..强制关闭,并执行master ip漂移
Tue May 22 13:53:58 2018 - [info]
Tue May 22 13:53:58 2018 - [info] Forcing shutdown so that applications never connect to the current master..
Tue May 22 13:53:58 2018 - [info] Executing master IP deactivatation script:
Tue May 22 13:53:58 2018 - [info] /usr/local/bin/master_ip_failover --orig_master_host=centos129 --orig_master_ip=192.168.44.129 --orig_master_port=3306 --command=stopssh --ssh_user=root
IN SCRIPT TEST====/usr/sbin/ifconfig ens33:1 down==/usr/sbin/ifconfig ens33:1 192.168.44.211===
Disabling the VIP on old master: centos129
SIOCSIFFLAGS: 鏃犳硶鎸囧畾琚姹傜殑鍦板潃
Tue May 22 13:53:58 2018 - [info] done.
Tue May 22 13:53:58 2018 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Tue May 22 13:53:59 2018 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Tue May 22 13:53:59 2018 - [info] * Phase 3: Master Recovery Phase.. 构建新的master
Tue May 22 13:53:59 2018 - [info]
Tue May 22 13:53:59 2018 - [info] * Phase 3.1: Getting Latest Slaves Phase.. 获得最新的binary log file/position
Tue May 22 13:53:59 2018 - [info]
Tue May 22 13:53:59 2018 - [info] The latest binary log file/position on all slaves is mysql_bin.000005:190 最新的binary log file和日志位置
Tue May 22 13:53:59 2018 - [info] Latest slaves (Slaves that received relay log files to the latest):
Tue May 22 13:53:59 2018 - [info] centos128(192.168.44.128:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:59 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:53:59 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Tue May 22 13:53:59 2018 - [info] centos130(192.168.44.130:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:59 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:53:59 2018 - [info] The oldest binary log file/position on all slaves is mysql_bin.000005:190 最旧的binary log file和日志位置
Tue May 22 13:53:59 2018 - [info] Oldest slaves:
Tue May 22 13:53:59 2018 - [info] centos128(192.168.44.128:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:59 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:53:59 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Tue May 22 13:53:59 2018 - [info] centos130(192.168.44.130:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:53:59 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:53:59 2018 - [info]
Tue May 22 13:53:59 2018 - [info] * Phase 3.2: Saving Dead Master's Binlog Phase.. 保存Dead master's binlog
Tue May 22 13:53:59 2018 - [info]
Tue May 22 13:53:59 2018 - [info] Fetching dead master's binary logs..
Tue May 22 13:53:59 2018 - [info] Executing command on the dead master centos129(192.168.44.129:3306): save_binary_logs --command=save --start_file=mysql_bin.000005 --start_pos=190 --binlog_dir=/data/3306/logs/log --output_file=/tmp/saved_master_binlog_from_centos129_3306_20180522135357.binlog --handle_raw_binlog=1 --disable_log_bin=0 --manager_version=0.54
Creating /tmp if not exists.. ok.
Concat binary/relay logs from mysql_bin.000005 pos 190 to mysql_bin.000005 EOF into /tmp/saved_master_binlog_from_centos129_3306_20180522135357.binlog ..
Dumping binlog format description event, from position 0 to 123.. ok.
No need to dump effective binlog data from /data/3306/logs/log/mysql_bin.000005 (pos starts 190, filesize 190). Skipping.
/tmp/saved_master_binlog_from_centos129_3306_20180522135357.binlog has no effective data events.
Event not exists.
Tue May 22 13:54:00 2018 - [info] Additional events were not found from the orig master. No need to save.
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] * Phase 3.3: Determining New Master Phase.. 选取New Master
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] Finding the latest slave that has all relay logs for recovering other slaves..
Tue May 22 13:54:00 2018 - [info] All slaves received relay logs to the same position. No need to resync each other.
Tue May 22 13:54:00 2018 - [info] Searching new master from slaves..
Tue May 22 13:54:00 2018 - [info] Candidate masters from the configuration file:
Tue May 22 13:54:00 2018 - [info] centos128(192.168.44.128:3306) Version=5.7.18-enterprise-commercial-advanced-log (oldest major version between slaves) log-bin:enabled
Tue May 22 13:54:00 2018 - [info] Replicating from 192.168.44.129(192.168.44.129:3306)
Tue May 22 13:54:00 2018 - [info] Primary candidate for the new Master (candidate_master is set)
Tue May 22 13:54:00 2018 - [info] Non-candidate masters:
Tue May 22 13:54:00 2018 - [info] Searching from candidate_master slaves which have received the latest relay log events..
Tue May 22 13:54:00 2018 - [info] New master is centos128(192.168.44.128:3306)
Tue May 22 13:54:00 2018 - [info] Starting master failover..
Tue May 22 13:54:00 2018 - [info]
From:
centos129 (current master)
+--centos128
+--centos130
To:
centos128 (new master)
+--centos130
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] * Phase 3.3: New Master Diff Log Generation Phase..
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] * Phase 3.4: Master Log Apply Phase..
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] *NOTICE: If any error happens from this phase, manual recovery is needed.
Tue May 22 13:54:00 2018 - [info] Starting recovery on centos128(192.168.44.128:3306)..
Tue May 22 13:54:00 2018 - [info] This server has all relay logs. Waiting all logs to be applied..
Tue May 22 13:54:00 2018 - [info] done.
Tue May 22 13:54:00 2018 - [info] All relay logs were successfully applied. 所有relay log被运用之后,new master的binlog和position信息:
Tue May 22 13:54:00 2018 - [info] Getting new master's binlog name and position..
Tue May 22 13:54:00 2018 - [info] mysql_bin.000020:230 其它的slave 开始复制的日志点
Tue May 22 13:54:00 2018 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='centos128 or 192.168.44.128', MASTER_PORT=3306, MASTER_LOG_FILE='mysql_bin.000020', MASTER_LOG_POS=230, MASTER_USER='repl', MASTER_PASSWORD='xxx';
Tue May 22 13:54:00 2018 - [info] Executing master IP activate script:
Tue May 22 13:54:00 2018 - [info] /usr/local/bin/master_ip_failover --command=start --ssh_user=root --orig_master_host=centos129 --orig_master_ip=192.168.44.129 --orig_master_port=3306 --new_master_host=centos128 --new_master_ip=192.168.44.128 --new_master_port=3306 --new_master_user='root' --new_master_password='123456'
IN SCRIPT TEST====/usr/sbin/ifconfig ens33:1 down==/usr/sbin/ifconfig ens33:1 192.168.44.211===
Enabling the VIP - 192.168.44.211 on the new master - centos128
Undefined subroutine &main::FIXME_xxx_create_user called at /usr/local/bin/master_ip_failover line 120.
Set read_only=0 on the new master.
Creating app user on the new master..
Tue May 22 13:54:00 2018 - [error][/usr/local/share/perl5/MHA/MasterFailover.pm, ln1261] Failed to activate master IP address for centos128 with return code 10:0
Tue May 22 13:54:00 2018 - [warning] Proceeding.
Tue May 22 13:54:00 2018 - [info] ** Finished master recovery successfully.
Tue May 22 13:54:00 2018 - [info] * Phase 3: Master Recovery Phase completed.
Tue May 22 13:54:00 2018 - [info] * Phase 4: Slaves Recovery Phase.. slave的恢复阶段
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] * Phase 4.1: Starting Parallel Slave Diff Log Generation Phase.. 开始并行slave差异日志的产生阶段
Tue May 22 13:54:00 2018 - [info]
Tue May 22 13:54:00 2018 - [info] -- Slave diff file generation on host centos130(192.168.44.130:3306) started, pid: 100996. Check tmp log /var/log/masterha/app1/centos130_3306_20180522135357.log if it takes time..
Tue May 22 13:54:01 2018 - [info]
Tue May 22 13:54:01 2018 - [info] Log messages from centos130 ...
Tue May 22 13:54:01 2018 - [info]
Tue May 22 13:54:00 2018 - [info] This server has all relay logs. No need to generate diff files from the latest slave.
Tue May 22 13:54:01 2018 - [info] End of log messages from centos130.
Tue May 22 13:54:01 2018 - [info] -- centos130(192.168.44.130:3306) has the latest relay log events.
Tue May 22 13:54:01 2018 - [info] Generating relay diff files from the latest slave succeeded.
Tue May 22 13:54:01 2018 - [info]
Tue May 22 13:54:01 2018 - [info] * Phase 4.2: Starting Parallel Slave Log Apply Phase.. 开始并行slave差异日志的应用阶段
Tue May 22 13:54:01 2018 - [info]
Tue May 22 13:54:01 2018 - [info] -- Slave recovery on host centos130(192.168.44.130:3306) started, pid: 101008. Check tmp log /var/log/masterha/app1/centos130_3306_20180522135357.log if it takes time..
Tue May 22 13:54:02 2018 - [info]
Tue May 22 13:54:02 2018 - [info] Log messages from centos130 ...
Tue May 22 13:54:02 2018 - [info]
Tue May 22 13:54:01 2018 - [info] Starting recovery on centos130(192.168.44.130:3306)..
Tue May 22 13:54:01 2018 - [info] This server has all relay logs. Waiting all logs to be applied..
Tue May 22 13:54:01 2018 - [info] done.
Tue May 22 13:54:01 2018 - [info] All relay logs were successfully applied.
Tue May 22 13:54:01 2018 - [info] Resetting slave centos130(192.168.44.130:3306) and starting replication from the new master centos128(192.168.44.128:3306)..
Tue May 22 13:54:01 2018 - [info] Executed CHANGE MASTER.
Tue May 22 13:54:01 2018 - [info] Slave started.
Tue May 22 13:54:02 2018 - [info] End of log messages from centos130.
Tue May 22 13:54:02 2018 - [info] -- Slave recovery on host centos130(192.168.44.130:3306) succeeded.
Tue May 22 13:54:02 2018 - [info] All new slave servers recovered successfully.
Tue May 22 13:54:02 2018 - [info]
Tue May 22 13:54:02 2018 - [info] * Phase 5: New master cleanup phase.. 新master的清理阶段
Tue May 22 13:54:02 2018 - [info]
Tue May 22 13:54:02 2018 - [info] Resetting slave info on the new master..
Tue May 22 13:54:02 2018 - [info] centos128: Resetting slave info succeeded.
Tue May 22 13:54:02 2018 - [info] Master failover to centos128(192.168.44.128:3306) completed successfully.
Tue May 22 13:54:02 2018 - [info]
----- Failover Report -----
app1: MySQL Master failover centos129 to centos128 succeeded
Master centos129 is down!
Check MHA Manager logs at centos128:/var/log/masterha/app1/manager.log for details.