MySQL - 15 MHA Overview

I. About MHA


Yoshinori Matsunobu:
MySQL / Linux expert
Joined Sony in 2001
Began using Oracle in 2001
Began using MySQL in 2004
September 2006 - August 2010: MySQL consultant
2010 - 2012: DeNA
2012 - present: Facebook


Software Description

MHA can perform automatic fault detection and failover in a short time, usually within 10-30 seconds. Within a replication setup, MHA addresses the data consistency problem during failover. It does not require adding servers to the existing replication setup: only one manager node is needed, and one manager can manage multiple replication sets, which greatly reduces the number of servers. It is also easy to install, incurs no performance penalty, and does not require changes to the existing replication deployment.

MHA also provides online master switching: it can safely switch the currently running master to a new master (by promoting a slave to master), which completes within about 0.5-2 seconds.

MHA consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a separate machine to manage multiple master-slave clusters, or it can be deployed on one of the slaves. When the master fails, MHA automatically promotes the slave with the most recent data to be the new master and then repoints all other slaves to the new master. The entire failover process is transparent to the application.

II. Workflow

III. MHA Architecture

1. MHA uses a client/server (C/S) architecture

2. MHA can be installed on any server

3. One MHA management node can manage multiple MySQL replication clusters

4. Avoid installing the MHA management node on the master (to guard against power failure or network outage)

NOTE: If the manager is installed on slave_02, what happens when slave_02 is promoted to master? (Do not let slave_02 be promoted: set no_master for slave_02.)

5. MHA consists of a manager and nodes: the manager is the server, the nodes are the clients

IV. MHA Tools

The MHA software consists of two parts, the Manager toolkit and the Node toolkit, described below:

The Manager toolkit includes the following tools:

[root@db01 bin]# pwd
/root/mha4mysql-manager-0.56/bin
[root@db01 bin]# ll
total 40
-rwxr-xr-x 1 4984 users 1995 Apr  1  2014 masterha_check_repl       # check master-slave replication status
-rwxr-xr-x 1 4984 users 1779 Apr  1  2014 masterha_check_ssh        # check the SSH key setup for MHA
-rwxr-xr-x 1 4984 users 1865 Apr  1  2014 masterha_check_status     # check the running status of MHA
-rwxr-xr-x 1 4984 users 3201 Apr  1  2014 masterha_conf_host        # configure MHA hosts
-rwxr-xr-x 1 4984 users 2517 Apr  1  2014 masterha_manager          # start MHA
-rwxr-xr-x 1 4984 users 2165 Apr  1  2014 masterha_master_monitor   # detect whether the master is down
-rwxr-xr-x 1 4984 users 2373 Apr  1  2014 masterha_master_switch    # manual failover / master switch
-rwxr-xr-x 1 4984 users 5171 Apr  1  2014 masterha_secondary_check  # establish TCP connections (secondary check)
-rwxr-xr-x 1 4984 users 1739 Apr  1  2014 masterha_stop             # stop MHA

The Node toolkit includes the following tools:

[root@db01 bin]# pwd
/root/mha4mysql-node-0.56/bin
[root@db01 bin]# ll
total 44
-rwxr-xr-x 1 4984 users 16367 Apr  1  2014 apply_diff_relay_logs  # compare relay logs between slaves
-rwxr-xr-x 1 4984 users  4807 Apr  1  2014 filter_mysqlbinlog     # trim the binlog to avoid rollback events
-rwxr-xr-x 1 4984 users  8261 Apr  1  2014 purge_relay_logs       # purge relay logs
-rwxr-xr-x 1 4984 users  7525 Apr  1  2014 save_binary_logs       # save all binlog events

MHA advantage summary

1) Master failover and slave promotion can be done very quickly
Automatic failover is fast and completes within 30 seconds.

2) Master crash does not result in data inconsistency
A crash of the master does not cause data inconsistency.

3) No need to modify current MySQL settings (MHA works with regular MySQL)
No major changes to the current MySQL environment are required.

4) No need to increase lots of servers
No additional servers are needed (a single manager can manage hundreds of replication sets).

5) No performance penalty
MHA works with both semi-synchronous and asynchronous replication. To monitor MySQL, it only needs to send a ping packet to the master every N seconds (3 seconds by default), so there is no effect on performance. You can think of MHA as performing the same as a plain master-slave replication setup.

1. An ordinary ping (for example ping baidu.com) uses the ICMP protocol; telnet is a similar connectivity check (subject to, for example, Aliyun server security group rules).

2. The ping here is a SQL ping (select ping), i.e. a heartbeat check against the database.

6) Works with any storage engine
MHA supports any storage engine that replication supports; it is not limited to InnoDB.
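
To make the heartbeat idea from item 5 concrete, here is a minimal sketch of a check with retries. It is not how MHA is implemented internally; `check_with_retries` is a hypothetical helper name, and the check command is passed in as an argument so that the sketch runs without a database.

```shell
#!/bin/sh
# check_with_retries ATTEMPTS INTERVAL CMD [ARGS...]
# Hypothetical helper (not part of MHA): runs CMD up to ATTEMPTS times,
# sleeping INTERVAL seconds between tries. Returns 0 as soon as CMD
# succeeds, 1 if every attempt fails.
check_with_retries() {
    attempts=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        if "$@" >/dev/null 2>&1; then
            return 0
        fi
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 1
}

# Demo with stand-in commands so the sketch runs anywhere.
# Against a real master this might be (host and credentials are assumptions):
#   check_with_retries 3 3 mysqladmin -h10.0.0.51 -umha -pmha ping
if check_with_retries 3 0 true; then echo "master alive"; fi
if ! check_with_retries 3 0 false; then echo "master unreachable"; fi
```

The point of the design is that a single missed ping does not count as a failure; only several consecutive misses do.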

MySQL environment preparation

1) Check the environment

mysql-db01

# OS version
[root@mysql-db01 ~]# cat /etc/redhat-release
CentOS release 6.7 (Final)
# kernel version
[root@mysql-db01 ~]# uname -r
2.6.32-573.el6.x86_64
# IP address
[root@mysql-db01 ~]# hostname -I
10.0.0.51

mysql-db02

# OS version
[root@mysql-db02 ~]# cat /etc/redhat-release
CentOS release 6.7 (Final)
# kernel version
[root@mysql-db02 ~]# uname -r
2.6.32-573.el6.x86_64
# IP address
[root@mysql-db02 ~]# hostname -I
10.0.0.52

mysql-db03

# OS version
[root@mysql-db03 ~]# cat /etc/redhat-release
CentOS release 6.7 (Final)
# kernel version
[root@mysql-db03 ~]# uname -r
2.6.32-573.el6.x86_64
# IP address
[root@mysql-db03 ~]# hostname -I
10.0.0.53

Installing MySQL

1) Prepare the installation package

# create the directory for installation packages
[root@mysql-db01 ~]# mkdir /home/oldboy/tools -p
# enter the directory
[root@mysql-db01 ~]# cd /home/oldboy/tools/
# upload the MySQL package (mysql-5.6.16-linux-glibc2.5-x86_64.tar.gz)
[root@mysql-db01 tools]# rz -be

2) Installation

# create the installation directory
[root@mysql-db01 tools]# mkdir /application
# unpack the MySQL binary package
[root@mysql-db01 tools]# tar xf mysql-5.6.16-linux-glibc2.5-x86_64.tar.gz
# move the unpacked directory
[root@mysql-db01 tools]# mv mysql-5.6.16-linux-glibc2.5-x86_64 /application/mysql-5.6.16
# create a symlink
[root@mysql-db01 tools]# ln -s /application/mysql-5.6.16/ /application/mysql
# create the mysql user
[root@mysql-db01 tools]# useradd mysql -s /sbin/nologin -M
# enter the MySQL scripts directory
[root@mysql-db01 tools]# cd /application/mysql/scripts/
# initialize MySQL
[root@mysql-db01 scripts]# ./mysql_install_db \
--user=mysql \
--datadir=/application/mysql/data/ \
--basedir=/application/mysql/
# Notes
# --user:    the MySQL user
# --datadir: the MySQL data directory
# --basedir: the MySQL base directory
# copy the MySQL configuration file
[root@mysql-db01 ~]# \cp /application/mysql/support-files/my-default.cnf /etc/my.cnf
# copy the MySQL init script
[root@mysql-db01 ~]# cp /application/mysql/support-files/mysql.server /etc/init.d/mysqld
# change the default MySQL install path (otherwise MySQL will not start)
[root@mysql-db01 ~]# sed -i 's#/usr/local#/application#g' /etc/init.d/mysqld
[root@mysql-db01 ~]# sed -i 's#/usr/local#/application#g' /application/mysql/bin/mysqld_safe
# add MySQL to PATH
[root@mysql-db01 ~]# echo 'export PATH="/application/mysql/bin:$PATH"' >> /etc/profile.d/mysql.sh
# reload the environment
[root@mysql-db01 ~]# source /etc/profile

3) Startup

# enable start at boot
[root@mysql-db01 ~]# chkconfig mysqld on
# start MySQL
[root@mysql-db01 ~]# /etc/init.d/mysqld start
Starting MySQL........... SUCCESS! # started successfully

4) Set the password

# set the MySQL root password to oldboy123
[root@mysql-db01 ~]# mysqladmin -uroot password oldboy123

V. GTID-based master-slave replication

1) GTID: (Global Transaction ID)

A GTID is a globally unique identifier composed of UUID + TID. The UUID uniquely identifies a MySQL instance. The TID is the number of the transaction committed on that instance, and it increases monotonically as transactions are committed.

A GTID has the following form:

342a3b8f-0d8e-11ea-8095-000c29c7dac3:1

342a3b8f-0d8e-11ea-8095-000c29c7dac3:2
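
As a small illustration of the UUID + TID structure, the sketch below splits a GTID string into its two parts using plain shell parameter expansion. `split_gtid` is a hypothetical helper name, not a MySQL or MHA tool, and it only handles the single-transaction form shown above.

```shell
#!/bin/sh
# split_gtid GTID
# Hypothetical helper: splits a GTID of the form UUID:TID into its parts.
split_gtid() {
    gtid=$1
    uuid=${gtid%%:*}   # everything before the first colon: the instance UUID
    tid=${gtid##*:}    # everything after the last colon: the transaction number
    printf 'uuid=%s tid=%s\n' "$uuid" "$tid"
}

split_gtid '342a3b8f-0d8e-11ea-8095-000c29c7dac3:2'
```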

2) GTID new features
(1) High efficiency and speed, with support for multi-threaded replication: in practice a separate SQL thread is opened for each database, that is, each database has its own SQL thread.

(2) Simpler setup once GTID is enabled. With traditional replication, you have to look up the master's binlog file and position (show master status) and then point to them in change master to.
In MySQL 5.6 you no longer need to know the binlog file and position; you only need the master's IP, port, account and password, because MySQL uses GTIDs internally to find the synchronization point automatically (master_auto_position).

(3) Row-based replication can record only the changed columns, saving disk space, network resources and memory.

(4) Master and slave information can be recorded in tables. It was originally recorded in files; recording it in tables improves availability.

(5) Delayed replication is supported.

Prerequisites
1) Both the master and the slaves must enable binlog
2) The master and the slaves must have different server-id values
3) There must be a dedicated user for master-slave replication
Master operations

Modify the configuration file

# edit the MySQL configuration file
[root@mysql-db01 ~]# vim /etc/my.cnf
# configure under the [mysqld] section
[mysqld]
# the master's server-id is 1; the slaves must not use 1
server_id =1
# enable binlog
log_bin=mysql-bin

Create a master-slave replication user

# log in to the database
[root@mysql-db01 ~]# mysql -uroot -poldboy123
# create the rep user
mysql> grant replication slave on *.* to rep@'10.0.0.%' identified by 'oldboy123';

Slave operations

Modify the configuration file

# edit the configuration file on mysql-db02
[root@mysql-db02 ~]# vim /etc/my.cnf
# configure under the [mysqld] section
[mysqld]
# the master's server-id is 1; a slave must not use 1
server_id =5
# enable binlog
log_bin=mysql-bin
# restart MySQL
[root@mysql-db02 ~]# /etc/init.d/mysqld restart

# edit the configuration file on mysql-db03
[root@mysql-db03 ~]# vim /etc/my.cnf
# configure under the [mysqld] section
[mysqld]
# the master's server-id is 1; a slave must not use 1
server_id =10
# enable binlog
log_bin=mysql-bin
# restart MySQL
[root@mysql-db03 ~]# /etc/init.d/mysqld restart

Note: With traditional binlog-position-based replication, you had to record the master's status information (binlog file and position):

mysql> show master status;
+------------------+----------+
| File             | Position |
+------------------+----------+
| mysql-bin.000002 |      120 |
+------------------+----------+

Enable GTID

# check the GTID status before enabling it
mysql> show global variables like '%gtid%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| enforce_gtid_consistency | OFF   |
| gtid_executed            |       |
| gtid_mode                | OFF   |
| gtid_owned               |       |
| gtid_purged              |       |
+--------------------------+-------+ 

# edit the MySQL configuration file (required on the master and on the slaves)
[root@mysql-db01 ~]# vim /etc/my.cnf
# add under the [mysqld] section
[mysqld]
gtid_mode=ON
log_slave_updates
enforce_gtid_consistency
# restart the database
[root@mysql-db01 ~]# /etc/init.d/mysqld restart
# check the GTID status
mysql> show global variables like '%gtid%';
+--------------------------+-------+
| Variable_name            | Value |
+--------------------------+-------+
| enforce_gtid_consistency | ON    | # GTID consistency is enforced
| gtid_executed            |       |
| gtid_mode                | ON    | # GTID mode is enabled
| gtid_owned               |       |
| gtid_purged              |       |
+--------------------------+-------+
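
Because these settings have to be present on every node, a quick file-level check can catch a forgotten line before the restart. The following is only a sketch: `check_gtid_cnf` is a hypothetical helper, it only looks for the option names used in this article (in either underscore or dash spelling), and it neither validates the values nor checks that the lines sit under `[mysqld]`.

```shell
#!/bin/sh
# check_gtid_cnf FILE
# Hypothetical helper: reports which of the options used in this article
# are missing from FILE. Returns 0 if all are present, 1 otherwise.
check_gtid_cnf() {
    file=$1
    missing=0
    for key in server_id log_bin gtid_mode log_slave_updates enforce_gtid_consistency; do
        # accept both spellings, e.g. log_bin and log-bin
        pattern=$(printf '%s' "$key" | sed 's/_/[_-]/g')
        if ! grep -Eq "^[[:space:]]*${pattern}[[:space:]]*(=.*)?\$" "$file"; then
            echo "missing: $key"
            missing=1
        fi
    done
    return "$missing"
}

# Demo against a temporary file that mirrors the settings shown above.
demo_cnf=$(mktemp)
cat > "$demo_cnf" <<'EOF'
[mysqld]
server_id =1
log_bin=mysql-bin
gtid_mode=ON
log_slave_updates
enforce_gtid_consistency
EOF
check_gtid_cnf "$demo_cnf" && echo "all GTID settings present"
rm -f "$demo_cnf"
```

On a real node you would point it at /etc/my.cnf; the running values can still be confirmed with `show global variables like '%gtid%'`.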

Note: If GTID is enabled on the master, it must also be enabled on the slaves; otherwise configuring replication on a slave fails with the following error:

[root@mysql-db02 ~]# mysql -uroot -poldboy123
mysql> change master to
-> master_host='10.0.0.51',
-> master_user='rep',
-> master_password='123',
-> master_auto_position=1;
ERROR 1777 (HY000): CHANGE MASTER TO MASTER_AUTO_POSITION = 1 can only be executed when @@GLOBAL.GTID_MODE = ON.

Configure the master-slave replication

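# log in to the database
[root@mysql-db02 ~]# mysql -uroot -poldboy123
# configure the master connection information
mysql> change master to
# master IP
-> master_host='10.0.0.51',
# replication user on the master
-> master_user='rep',
# password of the replication user
-> master_password='oldboy123',
# use GTID auto-positioning
-> master_auto_position=1;
# start the slave threads
mysql> start slave;
# check the slave status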
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.51
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000003
          Read_Master_Log_Pos: 403
               Relay_Log_File: mysql-db02-relay-bin.000002
                Relay_Log_Pos: 613
        Relay_Master_Log_File: mysql-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB: 
          Replicate_Ignore_DB: 
           Replicate_Do_Table: 
       Replicate_Ignore_Table: 
      Replicate_Wild_Do_Table: 
  Replicate_Wild_Ignore_Table: 
                   Last_Errno: 0
                   Last_Error: 
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 403
              Relay_Log_Space: 822
              Until_Condition: None

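Slave settings

# log in to the slave
[root@mysql-db02 ~]# mysql -uroot -poldboy123
# temporarily disable automatic relay log purging (master and slaves)
mysql> set global relay_log_purge = 0;
# set read-only (slaves only)
mysql> set global read_only=1;
# edit the configuration file
[root@mysql-db02 ~]# vim /etc/my.cnf
# add under the [mysqld] section
[mysqld]
# permanently disable automatic relay log purging (master and slaves)
relay_log_purge = 0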

1. Note: log-slave-updates: when is this parameter needed?

1. Dual-master mode

2. Cascading replication

3. GTID

2. Note: server_id

1. Master: enable binlog and set server_id

2. Slave: for plain replication, binlog does not have to be enabled, but server_id must still differ from the master's and from that of any other slave of the same master

If you use MHA: binlog must be enabled on the slaves as well, and no two servers may share a server_id
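
Since every server in an MHA setup needs its own server_id, it is worth checking a list of IDs for duplicates before going further. The sketch below uses only sort and uniq; `find_duplicate_ids` is a hypothetical helper, and the values 1, 5 and 10 are the ones configured in this article.

```shell
#!/bin/sh
# find_duplicate_ids ID [ID...]
# Hypothetical helper: prints every server_id that appears more than once.
find_duplicate_ids() {
    printf '%s\n' "$@" | sort -n | uniq -d
}

dups=$(find_duplicate_ids 1 5 10)
if [ -z "$dups" ]; then
    echo "all server_id values are unique"
else
    echo "duplicate server_id: $dups"
fi
```

On real hosts the IDs could be collected with `select @@server_id;` on each node and then passed to the function.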

VI. MHA deployment

1) Preparing the Environment (all nodes)

# install the dependency package
[root@mysql-db01 ~]# yum install perl-DBD-MySQL -y
# enter the package directory
[root@mysql-db01 ~]# cd /home/oldboy/tools/
# upload the MHA packages
[root@mysql-db01 tools]# rz -be
mha4mysql-manager-0.56-0.el6.noarch.rpm
mha4mysql-manager-0.56.tar.gz
mha4mysql-node-0.56-0.el6.noarch.rpm
mha4mysql-node-0.56.tar.gz
# install the node package
[root@mysql-db01 tools]# rpm -ivh mha4mysql-node-0.56-0.el6.noarch.rpm
Preparing...                ########################################### [100%]
   1:mha4mysql-node         ########################################### [100%]
# log in to the database
[root@mysql-db01 tools]# mysql -uroot -poldboy123
# add the MHA management account
mysql> grant all privileges on *.* to mha@'10.0.0.%' identified by 'mha';
# verify that the account was added
mysql> select user,host from mysql.user;
# created on the master, the account is replicated to the slaves automatically (verify on a slave)
mysql> select user,host from mysql.user;

Command symlinks (all nodes)

# without these symlinks, the MHA replication check reports an error
[root@mysql-db01 ~]# ln -s /application/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog
[root@mysql-db01 ~]# ln -s /application/mysql/bin/mysql /usr/bin/mysql

Deploy the management node (mha-manager: mysql-db03)

# use the EPEL repository
[root@mysql-db03 ~]# wget -O /etc/yum.repos.d/epel.repo https://mirrors.aliyun.com/repo/epel-6.repo
# install the manager dependencies
[root@mysql-db03 ~]# yum install -y perl-Config-Tiny epel-release perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes
# install the manager package
[root@mysql-db03 tools]# rpm -ivh mha4mysql-manager-0.56-0.el6.noarch.rpm 
Preparing...              ########################################### [100%]
1:mha4mysql-manager       ########################################### [100%]

Edit the configuration file

# create the configuration directory
[root@mysql-db03 ~]# mkdir -p /etc/mha

# edit the MHA configuration file
[root@mysql-db03 ~]# vim /etc/mha/app1.cnf
[server default]
manager_log=/etc/mha/manager.log
manager_workdir=/etc/mha/app1
master_binlog_dir=/application/mysql/data
user=mha
password=mha
ping_interval=2
repl_password=oldboy123
repl_user=rep
ssh_user=root

[server1]
hostname=10.0.0.51
port=3306

[server2]
candidate_master=1
check_repl_delay=0
hostname=10.0.0.52
port=3306

[server3]
hostname=10.0.0.53
port=3306

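Configuration file explained

[server default]
# working directory of the manager
manager_workdir=/var/log/masterha/app1
# log file of the manager
manager_log=/var/log/masterha/app1/manager.log
# where the master stores its binlogs, so that MHA can find them; here it is the MySQL data directory
master_binlog_dir=/data/mysql
# script run on automatic failover
master_ip_failover_script= /usr/local/bin/master_ip_failover
# script run on manual (online) switchover
master_ip_online_change_script= /usr/local/bin/master_ip_online_change
# password of the MySQL monitoring user (root here), i.e. the password of the monitoring user created earlier
password=123456
# monitoring user (root here)
user=root
# interval in seconds between ping packets sent to the master; failover starts automatically after three attempts without a response
ping_interval=1
# where binlogs are saved on the remote MySQL host when a switch occurs
remote_workdir=/tmp
# password of the replication user
repl_password=123456
# name of the replication user
repl_user=rep
# script that sends an alert after a switch
report_script=/usr/local/send_report
# if monitoring from the MHA manager to server02 runs into problems, the manager tries to log in to server02 from server03
secondary_check_script= /usr/local/bin/masterha_secondary_check -s server03 -s server02 --user=root --master_host=server02 --master_ip=192.168.0.50 --master_port=3306
# script that shuts down the failed host after a failure (its main purpose is to power off the host to prevent split-brain; not used here)
shutdown_script=""
# SSH login user
ssh_user=root

[server1]
hostname=10.0.0.51
port=3306

[server2]
hostname=10.0.0.52
port=3306
# mark this host as a candidate master: after a failover this slave is promoted to master even if it is not the slave with the most recent events in the cluster
candidate_master=1
# by default MHA does not choose a slave as the new master if it is more than 100 MB of relay logs behind the master, because recovering that slave would take a long time. With check_repl_delay=0, MHA ignores replication delay when choosing the new master. This is very useful for a host with candidate_master=1, because that candidate must become the new master during the switch
check_repl_delay=0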
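
When scripting around MHA (for example to loop over all nodes), it helps to read the node list from the configuration file rather than hard-coding it. The sketch below prints the `hostname` of every `[serverN]` section and skips `[server default]`. `list_mha_hosts` is a hypothetical helper, not an MHA tool, and the demo uses a temporary file with the same layout as app1.cnf.

```shell
#!/bin/sh
# list_mha_hosts FILE
# Hypothetical helper: prints the hostname= value of each [serverN]
# section in an MHA-style configuration file.
list_mha_hosts() {
    awk -F'=' '
        /^\[/ { in_server = ($0 ~ /^\[server[0-9]+\]/) ? 1 : 0; next }
        in_server && $1 ~ /^[ \t]*hostname[ \t]*$/ {
            value = $2
            gsub(/[ \t\r]/, "", value)   # strip stray whitespace
            print value
        }
    ' "$1"
}

# Demo against a temporary file shaped like /etc/mha/app1.cnf.
demo_conf=$(mktemp)
cat > "$demo_conf" <<'EOF'
[server default]
user=mha
ssh_user=root

[server1]
hostname=10.0.0.51
port=3306

[server2]
candidate_master=1
hostname=10.0.0.52
port=3306

[server3]
hostname=10.0.0.53
port=3306
EOF
list_mha_hosts "$demo_conf"
rm -f "$demo_conf"
```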

Configuring ssh trust (all nodes)

# create the key pair
[root@mysql-db01 ~]# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa >/dev/null 2>&1
# send the public key to every node, including this one
[root@mysql-db01 ~]# ssh-copy-id -i /root/.ssh/id_dsa.pub root@10.0.0.51
[root@mysql-db01 ~]# ssh-copy-id -i /root/.ssh/id_dsa.pub root@10.0.0.52
[root@mysql-db01 ~]# ssh-copy-id -i /root/.ssh/id_dsa.pub root@10.0.0.53

Run the checks

# test SSH
[root@mysql-db03 ~]# masterha_check_ssh --conf=/etc/mha/app1.cnf
# the following line means the test succeeded
Tue Mar  7 01:03:33 2017 - [info] All SSH connection tests passed successfully.
# test replication
[root@mysql-db03 ~]# masterha_check_repl --conf=/etc/mha/app1.cnf
# the following line means the test succeeded
MySQL Replication Health is OK.

Start MHA

# start the manager
[root@mysql-db03 ~]# nohup masterha_manager --conf=/etc/mha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /etc/mha/manager.log 2>&1 &

[root@mysql-db03 ~]# masterha_check_status --conf=/etc/mha/app1.cnf 
app1 (pid:12000) is running(0:PING_OK), master:10.0.0.51
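
For monitoring scripts it can be handy to pull the current master out of this status line. The sketch below does that with sed. `current_master` is a hypothetical helper, and it assumes the line format shown in the sample output above; other manager states may print a different line, in which case nothing is printed.

```shell
#!/bin/sh
# current_master
# Hypothetical helper: reads masterha_check_status output on stdin and
# prints the master address from a line of the form
#   app1 (pid:12000) is running(0:PING_OK), master:10.0.0.51
current_master() {
    sed -n '/is running/s/.*master:\([^ ]*\).*/\1/p'
}

# Demo with the sample line. On the manager host this might be:
#   masterha_check_status --conf=/etc/mha/app1.cnf | current_master
echo 'app1 (pid:12000) is running(0:PING_OK), master:10.0.0.51' | current_master
```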

Failover test

# log in to the database (db02)
[root@mysql-db02 ~]# mysql -uroot -poldboy123
# check the replication status
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.51
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 191
               Relay_Log_File: mysql-db02-relay-bin.000002
                Relay_Log_Pos: 361
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
# log in to the database (db03)
[root@mysql-db03 ~]# mysql -uroot -poldboy123
# check the replication status
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.51
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 191
               Relay_Log_File: mysql-db03-relay-bin.000002
                Relay_Log_Pos: 361
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

# stop the master
[root@mysql-db01 ~]# /etc/init.d/mysqld stop
Shutting down MySQL..... SUCCESS!
# log in to the database (db02)
[root@mysql-db02 ~]# mysql -uroot -poldboy123
# check the slave status
mysql> show slave status\G
# the slave status on db02 is now empty
Empty set (0.00 sec)
# log in to the database (db03)
[root@mysql-db03 ~]# mysql -uroot -poldboy123
# check the slave status
mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.0.0.52
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000006
          Read_Master_Log_Pos: 191
               Relay_Log_File: mysql-db03-relay-bin.000002
                Relay_Log_Pos: 361
        Relay_Master_Log_File: mysql-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Origin www.cnblogs.com/gongjingyun123--/p/11909575.html