MHA High Availability for MySQL

1. MHA-related knowledge
 1.1 What is MHA
MHA (Master High Availability) is a mature software suite for failover and master-slave replication in MySQL high-availability environments.
MHA exists to solve the problem of a MySQL single point of failure.
During a MySQL failover, MHA can complete the switchover automatically within 0-30 seconds.
MHA preserves data consistency as far as possible during failover, achieving high availability in the true sense.


 
1.2 Composition of MHA 
1) MHA Node (data node)

MHA Node runs on every MySQL server.
2) MHA Manager (management node)

MHA Manager can be deployed on an independent machine to manage multiple master-slave clusters, or it can be deployed on a slave node.
MHA Manager periodically probes the master node in the cluster. When the master fails, it automatically promotes the slave with the most recent data to be the new master, then re-points all other slaves at the new master. The entire failover process is completely transparent to the application.

The working principle of MHA can be summarized as follows:

1. MHA Manager periodically probes the master node in the cluster.
2. When the master fails, save the binary log events (binlog events) from the crashed master.
3. Identify the slave with the most recent updates.
4. Apply the differential relay logs to the other slaves.
5. Apply the binary log events saved from the master.
6. Promote a slave to be the new master, and point the other slaves at the new master for replication.
 


 

1.3 Features of MHA 
During automatic failover, MHA tries to save the binary log from the crashed master server, so that data loss is minimized.
Using semi-synchronous replication greatly reduces the risk of data loss: as long as at least one slave has received the latest binary log, MHA can apply it to all other slaves, keeping the data of all nodes consistent.
At present, MHA supports a one-master-multiple-slaves architecture, requiring at least three servers, i.e. one master and two slaves.
 

2. MHA one-master-two-slaves deployment 
Experiment design
Requirement: build a highly available one-master-two-slaves MySQL replication cluster. When the master server goes down, the slave with the most complete data takes over as master and claims the VIP, so that service continues uninterrupted.

Experiment components 

Detailed procedure 
Step 1: Configure master-slave replication 
(1) Modify the hostname of the mysql node server 
 ##Master node##
 hostnamectl set-hostname mysql1
 su
 
 ##Slave1 node##
 hostnamectl set-hostname mysql2
 su
 
 ##Slave2 node##
 hostnamectl set-hostname mysql3
 su
 
 ##Manager node##
 hostnamectl set-hostname manager
 su

(2) Add the master-slave MySQL host mapping 
Add the following entries on all hosts:

 vim /etc/hosts
 192.168.73.105 mysql1
 192.168.73.106 mysql2
 192.168.73.107 mysql3
(3) Modify the MySQL configuration file /etc/my.cnf 
 Note:

On the Master node, enable the binary log.

On the Slave1 and Slave2 nodes, enable both the binary log and the relay log.

 ##Master node##
 vim /etc/my.cnf
 [mysqld]
 server-id = 1
 log-bin = master-bin    #Enable the binary log and specify its storage location
 log-slave-updates = true    #Allow the slave to write updates replicated from the master into its own binary log
 
 systemctl restart mysqld    #Restart mysql
 
 ##Slave1 node##
 vim /etc/my.cnf
 [mysqld]
 server-id = 2    #The server-id of the three servers must all be different
 log-bin = master-bin
 relay-log = relay-log-bin
 relay-log-index = slave-relay-bin.index
 
 systemctl restart mysqld
 
 ##Slave2 node##
 vim /etc/my.cnf
 [mysqld]
 server-id = 3
 log-bin = master-bin
 relay-log = relay-log-bin
 relay-log-index = slave-relay-bin.index
 
 systemctl restart mysqld    #Restart mysql
(4) Create soft links on the MySQL node servers 
Create them on all three MySQL node servers: 

 ln -s /usr/local/mysql/bin/mysql /usr/sbin/
 ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/
 
 ls /usr/sbin/mysql*    #Verify the soft links
 
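The effect of these soft links can be checked in isolation before touching the real system; below is a minimal sketch in a throwaway directory (the paths are illustrative stand-ins, not the real MySQL install):

```shell
# Simulate the layout: a fake mysql binary inside a private "install" dir
workdir=$(mktemp -d)
mkdir -p "$workdir/mysql/bin" "$workdir/sbin"
printf '#!/bin/sh\n' > "$workdir/mysql/bin/mysql"
chmod +x "$workdir/mysql/bin/mysql"

# Same idea as: ln -s /usr/local/mysql/bin/mysql /usr/sbin/
ln -s "$workdir/mysql/bin/mysql" "$workdir/sbin/mysql"

# readlink shows where the link points
target=$(readlink "$workdir/sbin/mysql")
echo "$target"
```

Once the link is in place, invoking `mysql` from the link location runs the binary from the MySQL install directory, which is what MHA's scripts rely on.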
(5) Log in to the database and grant privileges 
 ##----(1) On all database nodes, grant privileges for mysql master-slave synchronization----##
 grant replication slave on *.* to 'myslave'@'192.168.73.%' identified by '123123';    #For use by the slaves during synchronization
 
 ##----(2) On all database nodes, grant privileges to the manager server----##
 grant all privileges on *.* to 'mha'@'192.168.73.%' identified by 'manager';
 
 #To avoid connection failures caused by hostname resolution, also grant the hostname-based logins
 grant all privileges on *.* to 'mha'@'mysql1' identified by 'manager';
 grant all privileges on *.* to 'mha'@'mysql2' identified by 'manager';
 grant all privileges on *.* to 'mha'@'mysql3' identified by 'manager';
 
 flush privileges;    #Reload the privilege tables

(6) Configure master-slave synchronization 
View the binary log file and synchronization point (i.e. the offset) on the Master node, then perform the synchronization on the Slave1 and Slave2 nodes. 

 ##(1) View the binary log file and synchronization point (i.e. the offset) on the Master node##
 show master status;
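The file name and position reported by `show master status` are exactly what the later `change master to` statement needs; a small sketch that extracts them from sample output with awk (the sample text below is illustrative — on a real master, replace it with the output of `mysql -e 'show master status'`):

```shell
# Sample output of: mysql -e 'show master status'
status="File	Position	Binlog_Do_DB	Binlog_Ignore_DB
master-bin.000006	154"

# Pull the binlog file name and offset out of the second line
file=$(echo "$status" | awk 'NR==2{print $1}')
pos=$(echo "$status" | awk 'NR==2{print $2}')

# Assemble the statement to run on the slaves
stmt="change master to master_host='192.168.73.105', master_user='myslave', master_password='123123', master_log_file='$file', master_log_pos=$pos;"
echo "$stmt"
```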
 

 #Common causes of "Slave_IO_Running: No":
  1. The network is unreachable 
  2. A problem in my.cnf (e.g. a duplicated server-id)
  3. An incorrect password, binlog file name, or pos offset 
  4. The firewall has not been stopped 

 ##Perform the synchronization on the Slave1 and Slave2 nodes##
 change master to master_host='192.168.73.105', master_user='myslave', master_password='123123', master_log_file='master-bin.000006', master_log_pos=154;
 
 start slave;    #Start synchronization; if there is an error, execute reset slave; first
 
 ##Check the replication status on Slave1 and Slave2##
 show slave status\G
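Whether both replication threads are healthy can also be checked non-interactively by grepping the `show slave status\G` output; a sketch against sample output (on a real slave, replace the sample variable with `mysql -e 'show slave status\G'`):

```shell
# Sample fragment of: show slave status\G
sample="             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
        Seconds_Behind_Master: 0"

# Extract the value after each field name
io=$(echo "$sample" | awk -F': ' '/Slave_IO_Running/{print $2}')
sql=$(echo "$sample" | awk -F': ' '/Slave_SQL_Running/{print $2}')

if [ "$io" = "Yes" ] && [ "$sql" = "Yes" ]; then
  echo "replication OK"
else
  echo "replication broken: IO=$io SQL=$sql"
fi
```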
 


 ##Both slave libraries must be set to read-only mode##
 set global read_only=1;
 
 ##Insert data on the Master, to test database synchronization##
 mysql> create database db_test;
 mysql> use db_test;
 mysql> create table info(id int,name char(10));
 mysql> insert into info values(1,'mysql1 data');
 
 #On the slaves, verify that the synchronization succeeded
Step 2: Configure MHA 
Note: the epel packages here cannot be installed from local media; the local CentOS 7 repository does not carry them, so switch to an online repository before installing.

1) Install the MHA software on all servers 
 ##(1) Install the MHA dependencies on all servers; first install the epel repository##
 yum install epel-release --nogpgcheck -y
 
 #Install the MHA dependencies
 yum install -y perl-DBD-MySQL \
 perl-Config-Tiny \
 perl-Log-Dispatch \
 perl-Parallel-ForkManager \
 perl-ExtUtils-CBuilder \
 perl-ExtUtils-MakeMaker \
 perl-CPAN
 
 ##(2) To install MHA, the node component must first be installed on all servers##
 #Upload the package to the /opt/ directory, then unpack and install the node component
 cd /opt
 tar zxvf mha4mysql-node-0.57.tar.gz
 cd mha4mysql-node-0.57
 perl Makefile.PL
 make && make install
 
 ##(3) Finally, install the manager component on the MHA manager node## (the manager component depends on the node component)
 cd /opt/
 tar zxvf mha4mysql-manager-0.57.tar.gz
 cd mha4mysql-manager-0.57
 perl Makefile.PL
 make && make install
 #After the manager component is installed, several tools are generated under /usr/local/bin, mainly the following:
 masterha_check_ssh    #Check the SSH configuration of MHA
 masterha_check_repl    #Check the MySQL replication status
 masterha_manager    #Start the manager
 masterha_check_status    #Check the current MHA running status
 masterha_master_monitor    #Detect whether the master is down
 masterha_master_switch    #Control failover (automatic or manual)
 masterha_conf_host    #Add or delete configured server information
 masterha_stop    #Stop the manager
 
 #After the node component is installed, several scripts are generated under /usr/local/bin (they are usually triggered by MHA Manager and need no manual operation), mainly the following:
 save_binary_logs    #Save and copy the master's binary log
 apply_diff_relay_logs    #Identify differential relay log events and apply them to the other slaves
 filter_mysqlbinlog    #Remove unnecessary ROLLBACK events (MHA no longer uses this tool)
 purge_relay_logs    #Purge relay logs (without blocking the SQL thread)
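Because MHA relies on relay logs during recovery, a common practice is to stop the SQL thread from purging them automatically and to clean them periodically with purge_relay_logs instead. A hedged crontab sketch for each slave (the user, password, schedule, and log path here are assumptions for illustration):

```shell
# /etc/cron.d/purge_relay_logs (fragment) - run on each slave node, e.g. nightly at 04:00
# --disable_relay_log_purge keeps relay_log_purge=0 so MySQL itself never deletes relay logs
0 4 * * * root /usr/local/bin/purge_relay_logs --user=mha --password=manager --disable_relay_log_purge --workdir=/tmp >> /var/log/purge_relay_logs.log 2>&1
```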

(2) Configure passwordless authentication on all servers 
 ##(1) On the manager node, configure passwordless authentication to all database nodes
 ssh-keygen -t rsa    #Press Enter all the way to generate the key pair; "-t rsa" specifies the key type
 ssh-copy-id 192.168.73.105    #Copy the public key to all database nodes to allow passwordless login
 ssh-copy-id 192.168.73.106
 ssh-copy-id 192.168.73.107
 
 ##(2) On mysql1, configure passwordless authentication to the database nodes mysql2 and mysql3
 ssh-keygen -t rsa
 ssh-copy-id 192.168.73.106    #Copy the public key to the two slave nodes to allow passwordless login
 ssh-copy-id 192.168.73.107
 
 ##(3) On mysql2, configure passwordless authentication to the database nodes mysql1 and mysql3
 ssh-keygen -t rsa
 ssh-copy-id 192.168.73.105
 ssh-copy-id 192.168.73.107
 
 ##(4) On mysql3, configure passwordless authentication to the database nodes mysql1 and mysql2
 ssh-keygen -t rsa
 ssh-copy-id 192.168.73.105
 ssh-copy-id 192.168.73.106
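The key-generation step can be made fully non-interactive, which is handy when scripting this on several nodes. A sketch that creates a throwaway key pair (in a scratch directory, not the real ~/.ssh):

```shell
keydir=$(mktemp -d)

# Non-interactive equivalent of "ssh-keygen -t rsa" plus pressing Enter:
# -f gives the output file, -N '' sets an empty passphrase, -q keeps it quiet
ssh-keygen -q -t rsa -N '' -f "$keydir/id_rsa"

ls "$keydir"    # id_rsa and id_rsa.pub

# The public key file is what ssh-copy-id appends to each node's authorized_keys
cat "$keydir/id_rsa.pub"
```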
 

(3) Configure MHA on the manager node 
 ##(1) On the manager node, copy the relevant scripts to the /usr/local/bin directory
 cp -rp /opt/mha4mysql-manager-0.57/samples/scripts /usr/local/bin
 #After copying, there will be four executable files
 ll /usr/local/bin/scripts/
 ----------------------------------------------------------------
 master_ip_failover    #VIP management script for automatic switchover
 master_ip_online_change    #VIP management script for online (manual) switchover
 power_manager    #Script to shut down the host after a failure occurs
 send_report    #Script to send an alert after failover
 ----------------------------------------------------------------
 ##(2) Copy the above VIP management script for automatic switchover to the /usr/local/bin directory; here the master_ip_failover script is used to manage the VIP and failover
 cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin
 
 ##(3) Modify its content as follows (delete the original content, then paste in and adapt the VIP-related parameters; running :set paste in vim before pasting avoids indentation mangling)
 vim /usr/local/bin/master_ip_failover
 #!/usr/bin/env perl
 use strict;
 use warnings FATAL => 'all';
 
 use Getopt::Long;
 
 my (
 $command, $ssh_user, $orig_master_host, $orig_master_ip,
 $orig_master_port, $new_master_host, $new_master_ip, $new_master_port
 );
 ############################## Added section ##############################
 my $vip = '192.168.73.66';    #The VIP address
 my $brdc = '192.168.73.255';    #The broadcast address of the VIP
 my $ifdev = 'ens33';    #The network card the VIP is bound to
 my $key = '1';    #The serial number of the virtual interface the VIP is bound to
 my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";    #This variable evaluates to: ifconfig ens33:1 192.168.73.66
 my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";    #This variable evaluates to: ifconfig ens33:1 down
 my $exit_code = 0;    #Default exit status code
 #my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
 #my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
 ##########################################################################
 ​
 GetOptions(
 'command=s' => \$command,
 'ssh_user=s' => \$ssh_user,
 'orig_master_host=s' => \$orig_master_host,
 'orig_master_ip=s' => \$orig_master_ip,
 'orig_master_port=i' => \$orig_master_port,
 'new_master_host=s' => \$new_master_host,
 'new_master_ip=s' => \$new_master_ip,
 'new_master_port=i' => \$new_master_port,
 );
 ​
 exit &main();
 ​
 sub main {
 ​
 print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
 ​
 if ( $command eq "stop" || $command eq "stopssh" ) {
 ​
 my $exit_code = 1;
 eval {
 print "Disabling the VIP on old master: $orig_master_host \n";
 &stop_vip();
 $exit_code = 0;
 };
 if ($@) {
 warn "Got Error: $@\n";
 exit $exit_code;
 }
 exit $exit_code;
 }
 elsif ( $command eq "start" ) {
 ​
 my $exit_code = 10;
 eval {
 print "Enabling the VIP - $vip on the new master - $new_master_host \n";
 &start_vip();
 $exit_code = 0;
 };
 if ($@) {
 warn $@;
 exit $exit_code;
 }
 exit $exit_code;
 }
 elsif ( $command eq "status" ) {
 print "Checking the Status of the script.. OK \n";
 exit 0;
 }
 else {
 &usage();
 exit 1;
 }
 }
 sub start_vip() {
 `ssh $ssh_user@$new_master_host " $ssh_start_vip "`;
 }
 ## A simple system call that disable the VIP on the old_master
 sub stop_vip() {
 `ssh $ssh_user@$orig_master_host " $ssh_stop_vip "`;
 }
 ​
 sub usage {
 print
 "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
 }
 
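What the Perl script ultimately executes over SSH are two plain interface commands; rebuilding the same strings in shell makes this easy to see (the values are copied from the variables above):

```shell
vip='192.168.73.66'
key='1'

# Mirrors $ssh_start_vip / $ssh_stop_vip in master_ip_failover
ssh_start_vip="/sbin/ifconfig ens33:$key $vip"
ssh_stop_vip="/sbin/ifconfig ens33:$key down"

echo "$ssh_start_vip"   # /sbin/ifconfig ens33:1 192.168.73.66
echo "$ssh_stop_vip"    # /sbin/ifconfig ens33:1 down
```

During a switchover, MHA runs the stop string on the old master and the start string on the new master, which is how the VIP follows the master.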

(4) On the manager node, edit the configuration file that manages the MySQL node servers 
 mkdir /etc/masterha
 cp /opt/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/    #Copy the sample configuration file
 
 vim /etc/masterha/app1.cnf    #Delete the original content, then paste in and adapt the node server IP addresses
 [server default]
 manager_log=/var/log/masterha/app1/manager.log
 manager_workdir=/var/log/masterha/app1
 master_binlog_dir=/usr/local/mysql/data
 master_ip_failover_script=/usr/local/bin/master_ip_failover
 master_ip_online_change_script=/usr/local/bin/master_ip_online_change
 password=manager
 ping_interval=1
 remote_workdir=/tmp
 repl_password=123123
 repl_user=myslave
 secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.73.106 -s 192.168.73.107
 shutdown_script=""
 ssh_user=root
 user=mha
 
 [server1]
 hostname=192.168.73.105
 port=3306
 
 [server2]
 candidate_master=1
 check_repl_delay=0
 hostname=192.168.73.106
 port=3306
 
 [server3]
 hostname=192.168.73.107
 port=3306
 

The parameters are explained below:

 [server default]
 manager_log=/var/log/masterha/app1/manager.log    #manager log
 manager_workdir=/var/log/masterha/app1    #manager working directory
 master_binlog_dir=/usr/local/mysql/data/    #Where the master stores its binlog; this path must match the binlog path configured on the master so that MHA can find it
 master_ip_failover_script=/usr/local/bin/master_ip_failover    #The switchover script used on automatic failover, i.e. the script above
 master_ip_online_change_script=/usr/local/bin/master_ip_online_change    #The switchover script used for manual (online) switching
 user=mha    #The MySQL user, created when authorizing the manager
 password=manager    #The password of that MySQL user, i.e. the monitoring user created earlier
 ping_interval=1    #Interval, in seconds, between ping packets sent to monitor the master (default 3); after three failed attempts, failover is performed automatically
 remote_workdir=/tmp    #Where the binlog is stored on the remote MySQL server during switchover
 repl_password=123123    #The password of the replication user (created during the master-slave authorization)
 repl_user=myslave    #The replication user
 report_script=/usr/local/send_report    #Script used to send an alert after a switchover occurs
 secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.73.106 -s 192.168.73.107    #IP addresses of the slave servers used for secondary checks
 shutdown_script=""    #Script used to power off the failed host after a failure (its main purpose is to prevent split brain; not used here, the bundled power_manager can be used)
 ssh_user=root    #The SSH login user
 
 [server1]    #master
 hostname=192.168.73.105
 port=3306
 
 [server2]    #slave1
 hostname=192.168.73.106
 port=3306
 candidate_master=1    #Make this node a candidate master: after a master-slave switchover, this slave is promoted to master even if it is not the most up-to-date slave
 check_repl_delay=0    #By default, MHA will not choose a slave as the new master if it lags the master by more than 100MB of relay logs, because recovery would take too long; check_repl_delay=0 makes MHA ignore replication delay when selecting the new master. This is very useful together with candidate_master=1, because the candidate master must become the new master during switchover
 
 [server3]    #slave2
 hostname=192.168.73.107
 port=3306
 

(5) On first configuration, the virtual IP must be brought up manually on the Master node
/sbin/ifconfig ens33:1 192.168.73.66/24

(6) Test ssh passwordless authentication on the manager node 
masterha_check_ssh -conf=/etc/masterha/app1.cnf

(7) Test the mysql master-slave connection on the manager node 
masterha_check_repl -conf=/etc/masterha/app1.cnf

If the words "MySQL Replication Health is OK" appear at the end of the output, replication is healthy.

(8) Start MHA on the manager node 
 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
 
 --------------------------------The options explained--------------------------------
 --remove_dead_master_conf    #When a master-slave switchover occurs, the IP of the old master is removed from the configuration file.
 --manager_log    #Log storage location.
 --ignore_last_failover    #By default, MHA refuses to fail over if two outages occur less than 8 hours apart; this restriction avoids the ping-pong effect (switching back and forth, which can lead to split brain). After a switchover, MHA records it in the app1.failover.complete file; if that file still exists in the working directory at the next switchover, the switchover is refused unless the file is deleted first. For convenience, --ignore_last_failover is set here so that the file is ignored.
 -------------------------------------------------------------------------------------
 
 ●Running a program with & alone: output still goes to the terminal; Ctrl+C (SIGINT) does not affect the background job, but closing the session sends SIGHUP and the program terminates.
 ●Running a program with nohup alone: output goes to nohup.out by default; Ctrl+C (SIGINT) terminates the program, but closing the session (SIGHUP) does not.
 ●Running with both, nohup ./test &: the program is immune to both SIGINT and SIGHUP.
 
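The behavior of the start line is easy to observe with a harmless stand-in command; a sketch that runs a command under nohup with the same explicit redirections as the masterha_manager line:

```shell
out=$(mktemp)

# Like the manager start line: stdin from /dev/null, stdout+stderr to a log file, backgrounded
nohup sh -c 'echo started; echo done' < /dev/null > "$out" 2>&1 &
pid=$!

wait "$pid" || true   # in a real deployment the manager keeps running instead
cat "$out"
```

With the output redirected explicitly, nohup does not create a nohup.out file; everything lands in the chosen log, which is why the manager.log path appears in the command.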

(9) Check the MHA status and the MHA log on the manager node; both show the master's address 
 #Check the MHA status; the current master is the mysql1 node.
 masterha_check_status --conf=/etc/masterha/app1.cnf
 
 #Check the MHA log; it also shows that the current master is 192.168.73.105
 cat /var/log/masterha/app1/manager.log | grep "current master"
 
(10) Check whether the VIP address exists on mysql1
 ifconfig
 
 #To stop the manager service, use the following command,
 masterha_stop --conf=/etc/masterha/app1.cnf
 #or kill the manager process directly by its PID.
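Killing by process ID can be rehearsed safely with a stand-in process (a sleep) instead of the real manager:

```shell
# Stand-in for the masterha_manager process
sleep 30 &
pid=$!

kill "$pid"                   # the equivalent of killing the manager's PID
wait "$pid" 2>/dev/null || true   # reap the terminated process

if kill -0 "$pid" 2>/dev/null; then
  echo "still running"
else
  echo "stopped"
fi
```

masterha_stop is still the cleaner choice when available, since it lets the manager shut down in an orderly way.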

Step 3: Failure test 
 ------------------------How the new master is chosen during failover------------------------
 1. Normally, each slave is judged by how current it is (position/GTID); when the data differs, the slave closest to the master becomes the candidate master.
 2. If the data is identical across slaves, the candidate master is chosen following the order of the configuration file.
 3. If a weight is set (candidate_master=1), the candidate master is forced according to the weight.
 (1) By default, if a slave lags the master by more than 100MB of relay logs, it will not be chosen even if it has the weight.
 (2) With check_repl_delay=0, it is forcibly selected as the candidate master even if it lags far behind.
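The first rule — pick the slave closest to the master — can be sketched as a sort over (binlog file, position) pairs; the per-slave data below is illustrative, standing in for what MHA reads from each slave's replication status:

```shell
# One line per slave: name, last relayed master binlog file, position
slaves="mysql2 master-bin.000006 154
mysql3 master-bin.000006 398"

# Most up-to-date slave = highest (file, position); sort and take the last line
latest=$(echo "$slaves" | sort -k2,2 -k3,3n | tail -n 1 | awk '{print $1}')
echo "candidate master: $latest"
```

Rules 2 and 3 then act as tie-breakers and overrides on top of this ordering, as described above.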

 ##(1) Stop the mysql service on the Master node mysql1
 systemctl stop mysqld
 #or
 pkill -9 mysql
 
 ##(2) On the manager node, watch the log
 tail -f /var/log/masterha/app1/manager.log
 
 ##(3) After a normal automatic switchover, the MHA process exits. MHA automatically modifies the app1.cnf file and removes the crashed mysql1 node.
 vim /etc/masterha/app1.cnf    #View the manager node's configuration file
 
 ##(4) Check whether mysql2 has taken over the VIP
 ifconfig

Step 4: Repair the failed master 
(1) Repair mysql1 (i.e. the original master node)
systemctl restart mysqld

(2) Repair the master-slave replication 
#On the current master, mysql2, check the binlog file and position
mysql -u root -p
mysql> show master status;

#On mysql1, point replication at mysql2
mysql -u root -p
mysql> change master to master_host='192.168.73.106',master_user='myslave',master_password='123123',master_log_file='master-bin.000002',master_log_pos=154;
mysql> start slave;

(3) Modify the configuration file app1.cnf on the manager node (re-add the record for mysql1, since it was removed automatically when the master went down)
vi /etc/masterha/app1.cnf
 ......
 secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.73.105 -s 192.168.73.107
 ......
 [server1]
 hostname=192.168.73.105
 port=3306
 
 [server2]
 candidate_master=1
 check_repl_delay=0
 hostname=192.168.73.106
 port=3306
 
 [server3]
 hostname=192.168.73.107
 port=3306

(4) Start MHA on the manager node 
masterha_stop --conf=/etc/masterha/app1.cnf

nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &

(5) Restart mysql1 and mysql2 
systemctl restart mysqld

Origin blog.csdn.net/zl965230/article/details/130803423