1. MHA-related knowledge
1.1 What is MHA
MHA (Master High Availability) is a mature software suite for failover and master-slave replication in MySQL high-availability environments.
The emergence of MHA is to solve the problem of MySQL single point of failure.
During a MySQL failover, MHA can automatically complete the switch within 0 to 30 seconds.
MHA can ensure data consistency to the greatest extent during the failover process to achieve high availability in the true sense.
1.2 Composition of MHA
1) MHA Node (data node)
MHA Node runs on each MySQL server.
2) MHA Manager (management node)
MHA Manager can be deployed on an independent machine to manage multiple master-slave clusters; it can also be deployed on a slave node.
MHA Manager will regularly detect the master node in the cluster. When the master fails, it can automatically promote the slave with the latest data as the new master, and then re-point all other slaves to the new master. The entire failover process is completely transparent to the application.
The working principle of MHA is summarized as follows:
1) MHA Manager periodically probes the master node in the cluster;
2) When the master fails, save the binary log events (binlog events) from the crashed master;
3) Identify the slave with the latest updates;
4) Apply the differential relay logs to the other slaves;
5) Apply the binary log events saved from the crashed master;
6) Promote one slave to be the new master, and re-point the other slaves to the new master for replication.
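The last step above, re-pointing the remaining slaves, boils down to issuing a CHANGE MASTER TO statement on each of them. A minimal shell sketch of building that statement follows; the host, user, password, and binlog coordinates are placeholder values from this lab, not output from a real failover:

```shell
# Build the CHANGE MASTER TO statement that is effectively issued on each
# remaining slave; all values here are placeholders from this lab setup.
new_master="192.168.73.106"
repl_user="myslave"
repl_pass="123123"
log_file="master-bin.000002"
log_pos=154
sql="change master to master_host='$new_master', master_user='$repl_user', master_password='$repl_pass', master_log_file='$log_file', master_log_pos=$log_pos;"
echo "$sql"
```

In a real run, MHA computes the file/position pair itself from the saved binlogs before generating this statement.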
1.3 Features of MHA
In the process of automatic failover, MHA tries to save the binary log from the downtime master server, so as to ensure that the data is not lost to the greatest extent.
Using semi-synchronous replication greatly reduces the risk of data loss: as long as one slave has received the latest binary log, MHA can apply it to all the other slaves, thus keeping the data of all nodes consistent.
At present, MHA supports a one-master-multiple-slaves architecture, which requires at least three servers: one master and two slaves.
2. MHA's one-master-two-slave deployment
Experiment design
Requirements: build a highly available MySQL replication cluster with one master and two slaves. When the master server goes down, the slave with the most complete data takes over as the new master and claims the VIP, so that service continues normally.
Experiment components
Experiment steps
Step 1: Configure master-slave replication
(1) Modify the hostname of the mysql node server
##Master node ##
hostnamectl set-hostname mysql1
su
##Slave1 node##
hostnamectl set-hostname mysql2
su
##Slave2 node##
hostnamectl set-hostname mysql3
su
##manager##
hostnamectl set-hostname manager
su
(2) Add the master-slave MySQL host mappings
Add the following on all hosts:
vim /etc/hosts
192.168.73.105 mysql1
192.168.73.106 mysql2
192.168.73.107 mysql3
(3) Modify the MySQL configuration file /etc/my.cnf on each node
Note:
On the Master node, enable the binary log.
On the Slave1 and Slave2 nodes, enable both the binary log and the relay log.
##Master Node##
vim /etc/my.cnf
[mysqld]
server-id = 1
log-bin = master-bin #Open the binary log, specify the storage location
log-slave-updates = true    #Allow updates replicated from the master to be written into the slave's own binary log
systemctl restart mysqld    #Restart mysql
##Slave1 node##
vim /etc/my.cnf
[mysqld]
server-id = 2 #The server-id of the three servers cannot be the same
log-bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
systemctl restart mysqld
##Slave2 node##
vim /etc/my.cnf
[mysqld]
server-id = 3 #The server-id of the three servers cannot be the same
log-bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index
systemctl restart mysqld    #Restart mysql
(4) Create a soft link on the mysql node server
All three mysql node servers are created:
ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/
ls /usr/sbin/mysql*    #View the soft links
(5) Log in to the database and grant privileges
##----(1) Authorize all database nodes for mysql master-slave synchronization ------##
grant replication slave on *.* to 'myslave'@'192.168.73.%' identified by '123123';    #Used by the slave servers for synchronization
##---(2) All database nodes, authorized to the manager server -----##
grant all privileges on *.* to 'mha'@'192.168.73.%' identified by 'manager';
#In order to prevent the connection failure caused by the host name, authorize the login address again
grant all privileges on *.* to 'mha'@'mysql1' identified by 'manager';
grant all privileges on *.* to 'mha'@'mysql2' identified by 'manager';
grant all privileges on *.* to 'mha'@'mysql3' identified by 'manager';
flush privileges;    #Refresh privileges
(6) Configure master-slave synchronization
View binary files and synchronization points (ie offsets) on the Master node, and perform synchronization operations on Slave1 and Slave2 nodes.
##(1) View the binary file and synchronization point (ie offset) on the Master node ##
show master status;
#Common causes of "Slave_IO_Running: No":
1. The network is unreachable
2. There is a problem in my.cnf (for example, a duplicated server-id)
3. The password, binlog file name, or pos offset is incorrect
4. The firewall has not been stopped
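Before working through the checklist above, the two replication thread states can be pulled out of `show slave status\G` with a quick grep. A sketch, using canned output in place of a live `mysql -e` call:

```shell
# Canned sample standing in for: mysql -e 'show slave status\G'
sample="Slave_IO_Running: Yes
Slave_SQL_Running: Yes"
# Both threads must report Yes for replication to be healthy
ok=$(echo "$sample" | grep -c ": Yes")
if [ "$ok" -eq 2 ]; then
    echo "replication healthy"
else
    echo "replication broken"
fi
```

On a live node, replace the canned sample with the real client call and the same test applies.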
##Perform synchronous operation on Slave1 and Slave2 nodes##
change master to master_host='192.168.73.105', master_user='myslave', master_password='123123', master_log_file='master-bin.000006', master_log_pos=154;
start slave;    #Start synchronization; if there is an error, execute reset slave; first
##Check the node status on Slave1 and Slave2##
show slave status\G
##The two slave libraries must be set to read-only mode##
set global read_only=1;
##Insert data in the Master library to test database synchronization##
mysql> use db_test;
mysql> create table info(id int, name char(10));
mysql> insert into info values(1,'mysql1');
#Verify on the slave databases whether the synchronization succeeded
Step 2: Configure MHA
Note: the epel source here cannot be installed from the local repository; the local CentOS 7 repository does not provide it, so switch to an online repository for the installation.
1) Install MHA software on all servers
## (1) Install the MHA dependent environment on all servers, first install the epel source ##
yum install epel-release --nogpgcheck -y
#Install the MHA dependencies
yum install -y perl-DBD-MySQL \
perl-Config-Tiny \
perl-Log-Dispatch \
perl-Parallel-ForkManager \
perl-ExtUtils-CBuilder \
perl-ExtUtils-MakeMaker \
perl-CPAN
##(2) To install MHA, the node component must first be installed on all servers##
#Upload the installation package to the /opt/ directory, then unpack and install the node component
cd /opt
tar zxvf mha4mysql-node-0.57.tar.gz
cd mha4mysql-node-0.57
perl Makefile.PL
make && make install
##(3) Finally, install the manager component on the MHA manager node## (the manager component depends on the node component)
cd /opt/
tar zxvf mha4mysql-manager-0.57.tar.gz
cd mha4mysql-manager-0.57
perl Makefile.PL
make && make install
#----------------------------------------------------------------
#After the manager component is installed, several tools are generated under /usr/local/bin, mainly the following:
masterha_check_ssh        #Check the SSH configuration status of MHA
masterha_check_repl       #Check the MySQL replication status
masterha_manger           #Script that starts the manager
masterha_check_status     #Check the current MHA running status
masterha_master_monitor   #Check whether the master is down
masterha_master_switch    #Control failover (automatic or manual)
masterha_conf_host        #Add or delete configured server information
masterha_stop             #Stop the manager
#After the node component is installed, several scripts are generated under /usr/local/bin (these are normally triggered by the MHA Manager scripts and need no manual operation), mainly the following:
save_binary_logs          #Save and copy the master's binary logs
apply_diff_relay_logs     #Identify differential relay log events and apply the difference events to the other slaves
filter_mysqlbinlog        #Remove unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs          #Clear relay logs (does not block the SQL thread)
(2) Configure passwordless authentication on all servers
## (1) Configure passwordless authentication to all database nodes on the manager node
ssh-keygen -t rsa #Press the Enter key all the way to generate a key. "-t rsa" specifies the type of key.
ssh-copy-id 192.168.73.105 #Pass the public key to all database nodes to form a password-free connection to log in
ssh-copy-id 192.168.73.106
ssh-copy-id 192.168.73.107
##(2) On mysql1, configure passwordless authentication to database nodes mysql2 and mysql3
ssh-keygen -t rsa
ssh-copy-id 192.168.73.106 #Pass the public key to two slave nodes to form a password-free connection to log in
ssh-copy-id 192.168.73.107
##(3) On mysql2, configure passwordless authentication to database nodes mysql1 and mysql3
ssh-keygen -t rsa
ssh-copy-id 192.168.73.105
ssh-copy-id 192.168.73.107
##(4) On mysql3, configure passwordless authentication to database nodes mysql1 and mysql2
ssh-keygen -t rsa
ssh-copy-id 192.168.73.105
ssh-copy-id 192.168.73.106
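The four key-distribution blocks above are repetitive; on each node they could be driven by a small loop. A sketch using this lab's IP list, which only prints the commands (echo instead of running ssh-copy-id, so it stays runnable without live hosts or keys):

```shell
# Print the ssh-copy-id commands a node would run against its peers;
# $self is that node's own address and is skipped.
hosts="192.168.73.105 192.168.73.106 192.168.73.107"
self="192.168.73.105"
for h in $hosts; do
    [ "$h" = "$self" ] && continue
    echo "ssh-copy-id $h"
done
```

Dropping the echo (and setting $self per node) would perform the actual key distribution.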
(3) Configure MHA on the manager node
## (1) Copy the relevant scripts to the /usr/local/bin directory on the manager node
cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
#After copying, there will be four executable files
ll /usr/local/bin/scripts/
#--------------------The dotted lines are comments--------------------
master_ip_failover        #VIP management script for automatic switching
master_ip_online_change   #VIP management script for online (manual) switching
power_manager             #Script to shut down the host after a failure occurs
send_report               #Script to send an alarm after failover
#----------------------------------------------------------------------
##(2) Copy the above VIP management script for automatic switching to the /usr/local/bin directory; here the master_ip_failover script is used to manage the VIP and failover
cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin
##(3) Modify the content as follows (delete the original content and copy the following in directly, adjusting the VIP-related parameters; you can run :set paste in vim before pasting to avoid garbled indentation)
vim /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;
my (
    $command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port,
    $new_master_host, $new_master_ip, $new_master_port
);
############################ Added content section ############################
my $vip = '192.168.73.66';                       #Specify the VIP address
my $brdc = '192.168.73.255'; #Specify the broadcast address of vip
my $ifdev = 'ens33'; #Specify the network card bound to vip
my $key = '1'; #Specify the serial number of the virtual network card bound to vip
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";    #The value of this variable is ifconfig ens33:1 192.168.73.66
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";     #The value of this variable is ifconfig ens33:1 down
my $exit_code = 0; #Specify exit status code as 0
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev: $key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
'command=s' => \$command,
'ssh_user=s' => \$ssh_user,
'orig_master_host=s' => \$orig_master_host,
'orig_master_ip=s' => \$orig_master_ip,
'orig_master_port=i' => \$orig_master_port,
'new_master_host=s' => \$new_master_host,
'new_master_ip=s' => \$new_master_ip,
'new_master_port=i' => \$new_master_port,
);
exit &main();
sub main {
print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
if ( $command eq "stop" || $command eq "stopssh" ) {
my $exit_code = 1;
eval {
print "Disabling the VIP on old master: $orig_master_host \n";
&stop_vip();
$exit_code = 0;
};
if ($@) {
warn "Got Error: $@\n";
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "start" ) {
my $exit_code = 10;
eval {
print "Enabling the VIP - $vip on the new master - $new_master_host \n";
&start_vip();
$exit_code = 0;
};
if ($@) {
warn $@;
exit $exit_code;
}
exit $exit_code;
}
elsif ( $command eq "status" ) {
print "Checking the Status of the script.. OK \n";
exit 0;
}
else {
&usage();
exit 1;
}
}
sub start_vip() {
`ssh $ssh_user@$new_master_host " $ssh_start_vip "`;
}
## A simple system call that disables the VIP on the old_master
sub stop_vip() {
`ssh $ssh_user@$orig_master_host " $ssh_stop_vip "`;
}
sub usage {
print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
(4) The manager node edits the configuration file and manages the mysql node server
mkdir /etc/masterha
cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/ #copy configuration file
vim /etc/masterha/app1.cnf    #Delete the original content and copy in the following, modifying the node server IP addresses
[server default]
manager_log=/var/log/masterha/app1/manager.log
manager_workdir=/var/log/masterha/app1
master_binlog_dir=/usr/local/mysql/data
master_ip_failover_script=/usr/local/bin/master_ip_failover
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
user=mha
password=manager
ping_interval=1
remote_workdir=/tmp
repl_password=123123
repl_user=myslave
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.73.106 -s 192.168.73.107
shutdown_script=""
ssh_user=root
[server1]
hostname=192.168.73.105
port=3306
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.73.106
port=3306
[server3]
hostname=192.168.73.107
port=3306
[server default]
manager_log=/var/log/masterha/app1/manager.log #manager log
manager_workdir=/var/log/masterha/app1 #manager working directory
master_binlog_dir=/usr/local/mysql/data/ #Location where the master saves the binlog; the path here must match the binlog path configured on the master so that MHA can find it
master_ip_failover_script=/usr/local/bin/master_ip_failover
#The switching script used for automatic failover, i.e. the script above
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
#The switching script used for manual (online) switching
user=mha #Set the mysql user, the user created when authorizing the manager
password=manager #Set the password of the mysql user, which is the password used to create the monitoring user in the previous article
ping_interval=1
#Set the time interval for monitoring the main library and sending ping packets, the default is 3 seconds, and failover will be automatically performed when there is no response after three attempts
remote_workdir=/tmp #Set the binlog storage location when the remote mysql switches
repl_password=123123 #Set the password of the replicated user (the user and password created during the master-slave synchronization authorization)
repl_user=myslave #Set the replication user name
report_script=/usr/local/send_report #Set the script of the alarm sent after the switch occurs
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.73.106 -s 192.168.73.107 #Specify the IP addresses of the slave servers to check
shutdown_script="" #Set the script used to power off the failed host after a failure occurs (its main purpose is to shut the host down to prevent split brain; it is not used here; the bundled power_manager script can be used)
ssh_user=root #Set the login user name of ssh
[server1]    #master
hostname=192.168.73.105
port=3306
[server2]    #slave1
hostname=192.168.73.106
port=3306
candidate_master=1
#Set as a candidate master. With this parameter set, this slave will be promoted to master after a master-slave switch, even if it is not the slave with the latest data
check_repl_delay=0
#By default, if a slave lags behind the master by more than 100MB of relay logs, MHA will not select it as the new master, because recovering that slave would take a long time. Setting check_repl_delay=0 makes MHA ignore the replication delay when selecting the new master during a switch. This parameter is very useful for a host with candidate_master=1, because the candidate master must become the new master during the switch
[server3]    #slave2
hostname=192.168.73.107
port=3306
(5) The first configuration needs to manually open the virtual IP on the Master node
/sbin/ifconfig ens33:1 192.168.73.66/24
(6) Test ssh passwordless authentication on the manager node
masterha_check_ssh -conf=/etc/masterha/app1.cnf
(7) Test the mysql master-slave connection on the manager node
masterha_check_repl -conf=/etc/masterha/app1.cnf
Test the MySQL master-slave connection on the manager node; if the words "MySQL Replication Health is OK" appear at the end, the replication is normal.
(8) Start MHA on the manager node
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
#--------------------The following are notes--------------------
#--remove_dead_master_conf    When a master-slave switch occurs, the IP of the old master is removed from the configuration file.
#--manager_log                Log storage location.
#--ignore_last_failover       By default, if MHA detects consecutive outages less than 8 hours apart, it does not fail over again; this restriction avoids the ping-pong effect (switching back and forth, which can lead to split brain). The parameter tells MHA to ignore the file generated by the last switch. By default, after a switch MHA records it in the app1.failover.complete file; if that file is found in the working directory at the next switch, the switch is not allowed unless the file has been deleted first. For convenience, --ignore_last_failover is set here.
#----------------------------------------------------------------
●Run the program with & only: output goes to the terminal; Ctrl+C sends SIGINT and the program is immune; closing the session sends SIGHUP and the program exits.
●Run the program with nohup only: output goes to nohup.out by default; Ctrl+C sends SIGINT and the program exits; closing the session sends SIGHUP and the program is immune.
●Start the program with both nohup and &, e.g. nohup ./test &: the program is immune to both SIGINT and SIGHUP.
(9) Check the MHA status and MHA log on the manager node, and you can see the address of the master
#Check the MHA status, and you can see that the current master is the Mysql1 node.
masterha_check_status --conf=/etc/masterha/app1.cnf
#Check the MHA log; you can also see that the current master is 192.168.73.105
cat /var/log/masterha/app1/manager.log | grep "current master"
(10) Check whether the VIP address exists on Mysql1
ifconfig
#To stop the manager service, use the following command.
masterha_stop --conf=/etc/masterha/app1.cnf
#Or it can be closed directly by killing the process ID.
Step 3: Fault test
#--------------------Algorithm for electing the standby master during failover--------------------
1. Usually the slaves are compared by replication position (position/GTID); the slave whose data is closest to the master's becomes the candidate master.
2. If the data is consistent across the slaves, the standby master is selected in the order of the configuration file.
3. If a weight is set (candidate_master=1), the candidate master is forced according to the weight:
(1) By default, if a slave lags behind the master's relay logs by 100MB, it will not be selected even with the weight set.
(2) With check_repl_delay=0, it is forcibly selected as the standby master even if it lags far behind.
##(1) Stop the mysql service on the Master node Mysql1
systemctl stop mysqld
or
pkill -9 mysql
##(2) Monitor and observe the log records on the manager node
tail -f /var/log/masterha/app1/manager.log
##(3) After a normal automatic switch, the MHA process exits. MHA automatically modifies the contents of the app1.cnf file and removes the failed mysql1 node.
vim /etc/masterha/app1.cnf    #View the configuration file on the manager node
##(4) Check whether mysql2 has taken over the VIP
ifconfig
3. Repair the faulty master
(1) Repair mysql1 (that is, repair the original master node)
systemctl restart mysqld
(2) Repair the master-slave data
#On the current master mysql2, view the binary log file and synchronization point
mysql -u root -p
mysql> show master status;
#On mysql1, point replication at mysql2 and start the slave threads
mysql -u root -p
change master to master_host='192.168.73.106',master_user='myslave',master_password='123123',master_log_file='master-bin.000002',master_log_pos=154;
start slave;
(3) Modify the configuration file app1.cnf on the manager node
vi /etc/masterha/app1.cnf
......
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.73.106 -s 192.168.73.107
......
[server1]
hostname=192.168.73.105
port=3306
[server2]
candidate_master=1
check_repl_delay=0
hostname=192.168.73.106
port=3306
[server3]
hostname=192.168.73.107
port=3306
(4) Start MHA on the manager node
masterha_stop --conf=/etc/masterha/app1.cnf
nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
(5) Restart mysql1 and mysql2
systemctl restart mysqld