MySQL 5.7 MHA (GTID + ROW): Deployment and Hands-On failover and online_change Drills

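1. MHA Overview

MHA (Master High Availability Manager and tools for MySQL) is a set of management scripts written in Perl by a MySQL expert in Japan. It applies only to MySQL replication environments, and its purpose is to keep the master highly available. MHA is a software package that automates master failover and slave promotion on top of standard MySQL replication (asynchronous or semi-synchronous).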

Download links:

https://github.com/yoshinorim/mha4mysql-manager/releases

https://github.com/yoshinorim/mha4mysql-node/releases

Other required packages can be downloaded from:

http://rpm.pbone.net/index.php3?stat=3&limit=5&srodzaj=1&dl=40&search=perl-Module-Install&field=

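2. How MHA Works

The goal of MHA is to keep the master highly available in a MySQL replication setup. Its defining feature is that it can repair the differences in logs between slaves so that all slaves end up consistent, then pick one of them as the new master and point the others at it. When the master fails, MHA compares the master binlog positions that each slave's I/O thread has read and selects the slave closest to the master as the candidate master. The other slaves generate differential relay logs by comparing themselves against the candidate master. The binlog events saved from the original master are applied on the candidate master, which is then promoted to master. Finally, the differential relay logs are applied on the remaining slaves, and they start replicating from the new master.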
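The selection step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not MHA's actual Perl implementation: the binlog file names and positions are made up, the helper names (SlaveState, pick_latest_slave, lagging_slaves) are hypothetical, and it assumes binlog files carry a numeric suffix such as mysql-bin.000012.

```python
from typing import List, NamedTuple, Tuple


class SlaveState(NamedTuple):
    """What a slave's I/O thread has fetched from the failed master."""
    host: str
    master_log_file: str      # hypothetical name, e.g. "mysql-bin.000012"
    read_master_log_pos: int  # byte offset inside that file


def binlog_key(state: SlaveState) -> Tuple[int, int]:
    """Order slaves by (binlog sequence number, position in that file)."""
    sequence = int(state.master_log_file.rsplit(".", 1)[1])
    return (sequence, state.read_master_log_pos)


def pick_latest_slave(slaves: List[SlaveState]) -> SlaveState:
    """Return the slave that has read the furthest into the master's binlog."""
    return max(slaves, key=binlog_key)


def lagging_slaves(slaves: List[SlaveState]) -> List[str]:
    """Hosts that would need differential relay logs from the latest slave."""
    latest = binlog_key(pick_latest_slave(slaves))
    return [s.host for s in slaves if binlog_key(s) < latest]


# Illustrative values only; they do not come from a real cluster.
slaves = [
    SlaveState("racdg1", "mysql-bin.000012", 4521),
    SlaveState("racdg2", "mysql-bin.000012", 3980),
]
latest = pick_latest_slave(slaves)
behind = lagging_slaves(slaves)
print(latest.host, behind)
```

Comparing the file sequence number before the position matters because a slave that has moved on to the next binlog file is always ahead, no matter how small its offset in that file is.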

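3. MHA Pros and Cons

3.1.    Pros

1. During a failover, MHA determines on its own which slave's data is closest to the master and switches to it, which reduces data loss and preserves consistency.

2. It supports a binlog server, which improves binlog transfer efficiency and further reduces the risk of data loss.

3. Combined with the enhanced semi-synchronous replication in MySQL 5.7, it ensures that no data is lost during a failover.

3.2.    Cons

1. The automatic switching scripts are very simple and fairly dated, so plan to improve them over time.

2. Setting up the MHA architecture requires passwordless SSH trust between the Linux hosts, which is a real challenge for system security.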

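4. What the MHA Toolkit Provides

1. Manager tools

masterha_check_ssh: checks the SSH configuration used by MHA.

masterha_check_repl: checks MySQL master/slave replication.

masterha_manager: starts the MHA service.

masterha_check_status: checks the current running status of MHA.

masterha_master_monitor: detects whether the master is down.

masterha_master_switch: controls failover (manual or automatic).

masterha_conf_host: adds or removes configured server entries.

2. Node (data node) tools

save_binary_logs: saves and copies the master's binary logs.

apply_diff_relay_logs: identifies differential relay log events and applies them to the other slaves.

filter_mysqlbinlog: strips unnecessary ROLLBACK events (MHA no longer uses this tool).

purge_relay_logs: purges relay logs (without blocking the SQL thread).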

5. MySQL MHA Installation and Drills

IP              Hostname    Role           Database version    OS version
172.16.10.22    rac2        master node    MySQL 5.7.20        Red Hat 6.7
172.16.10.61    racdg1      slave node1    MySQL 5.7.20        Red Hat 6.7
172.16.10.62    racdg2      slave node2    MySQL 5.7.20        Red Hat 6.7
172.16.10.30    -           VIP            -                   -

5.1.     Configuring SSH Trust

Configure passwordless SSH trust among the three machines. Run on the master:

cd

mkdir ~/.ssh

cd /root/.ssh/

ssh-keygen -t dsa -P '' -f id_dsa

id_dsa.pub is the public key and id_dsa is the private key. Next, append the public key to the authorized_keys file:

cat id_dsa.pub >> authorized_keys

Run on slave 1 (in /root/.ssh/):

ssh-keygen -t dsa -P '' -f id_dsa

cat id_dsa.pub >> authorized_keys

Run on slave 2 (in /root/.ssh/):

ssh-keygen -t dsa -P '' -f id_dsa

cat id_dsa.pub >> authorized_keys

Then copy each slave's public key to the master (the first command runs on slave 1, the second on slave 2):

scp /root/.ssh/id_dsa.pub 172.16.10.22:/root/.ssh/id_dsa.pub.61

scp /root/.ssh/id_dsa.pub 172.16.10.22:/root/.ssh/id_dsa.pub.62

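Once the master has received both keys, merge them into authorized_keys on the master and copy the merged file back to both slaves: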

cat id_dsa.pub.61 >> authorized_keys

cat id_dsa.pub.62 >> authorized_keys

scp /root/.ssh/authorized_keys 172.16.10.61:/root/.ssh/

scp /root/.ssh/authorized_keys 172.16.10.62:/root/.ssh/

Then add hostname resolution on all three servers:

vi /etc/hosts

172.16.10.22  rac2

172.16.10.61   racdg1

172.16.10.62   racdg2

Verify the SSH trust.

Run on the master:

ssh racdg1 date

ssh racdg2 date

Run on slave1:

ssh rac2 date

ssh racdg2 date

Run on slave2:

ssh rac2 date

ssh racdg1 date
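The six manual checks above cover every ordered pair of hosts. As a hedged sketch, the same list can be generated instead of typed by hand. The function name ssh_check_commands is hypothetical, and the snippet only builds the command strings; it does not open any SSH connection.

```python
from itertools import permutations

# Host names as defined in /etc/hosts above.
HOSTS = ["rac2", "racdg1", "racdg2"]


def ssh_check_commands(hosts):
    """Build one 'ssh <target> date' check per ordered (source, target) pair."""
    return [(src, f"ssh {dst} date") for src, dst in permutations(hosts, 2)]


checks = ssh_check_commands(HOSTS)
for source, command in checks:
    print(f"on {source}: {command}")
```

With three hosts this yields 3 x 2 = 6 checks, which matches the commands listed above. Every direction has to pass because MHA may need to copy logs from any node to any other.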

5.2.     Building One Master and Two Slaves

Set up the master/slave environment on MySQL 5.7, using GTID + ROW mode.

Create the replication account on all machines:

create user 'rep'@'172.16.10.%' identified by 'mysql';

grant replication slave on *.* to 'rep'@'172.16.10.%';

flush privileges;

show grants for 'rep'@'172.16.10.%';

 

Create the management account on all hosts:

create user 'zs'@'172.16.10.%' identified by '123456';

grant all privileges on *.* to 'zs'@'172.16.10.%';

flush privileges;

 

Configure replication and start synchronization.

Initialization (take a full dump on the master):

/usr/local/mysql5.7/bin/mysqldump -S /tmp/mysql3307.sock --single-transaction -uroot -pmysql --master-data=2 -A > slave.sql

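Note: the --master-data=2 parameter is required so that the backup file records the binlog file and position at the moment of the backup, in preparation for setting up replication. You can then look up the recorded binlog file and position in the backup file.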

scp slave.sql 172.16.10.61:/root

scp slave.sql 172.16.10.62:/root

Note: if the GTID sets on the master and a slave differ but the data is consistent, you can set:

set global gtid_purged='*******';

The master needs the following configuration:

gtid_mode=on

enforce_gtid_consistency=on

log_bin=on

The slaves need the following configuration:

server-id must be different on the master and on each slave.

gtid_mode=on

enforce_gtid_consistency=on

log_slave_updates=1

Import the dump on each slave:

mysql -S /tmp/mysql3307.sock -uroot -pmysql < slave.sql

Run on the slaves:

change master to master_host='172.16.10.22',master_port=3307,master_user='rep',master_password='mysql',master_auto_position=1;

start slave;

show slave status\G
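When checking the slave status, the fields that matter most for this setup are Slave_IO_Running, Slave_SQL_Running and Auto_Position. The sketch below parses the vertical output of the mysql client and tests those three fields. The sample text is abbreviated and made up for illustration, and the function names are hypothetical.

```python
# Abbreviated, illustrative sample; a real status has many more fields.
SAMPLE = """\
*************************** 1. row ***************************
                  Master_Host: 172.16.10.22
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                Auto_Position: 1
"""


def parse_vertical_output(text):
    """Parse 'Name: value' lines from the mysql client's vertical output."""
    fields = {}
    for line in text.splitlines():
        name, sep, value = line.strip().partition(":")
        if sep and not name.startswith("*"):
            fields[name.strip()] = value.strip()
    return fields


def replication_healthy(fields):
    """True when both threads run and GTID auto-positioning is enabled."""
    return (
        fields.get("Slave_IO_Running") == "Yes"
        and fields.get("Slave_SQL_Running") == "Yes"
        and fields.get("Auto_Position") == "1"
    )


status = parse_vertical_output(SAMPLE)
print(replication_healthy(status))
```

Splitting on the first colon only keeps values that themselves contain colons, such as GTID sets, intact.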

 

5.3.     Installing the MHA Nodes

Install the Perl environment on all MHA servers:

yum install perl-DBD-MySQL

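Note: the latest release is 0.58, but it targets CentOS 7 / Red Hat 7. On older OS releases, use 0.57 instead (I have already hit this pitfall myself).

a)     Install the MHA node package

Run the following on all nodes: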

tar -zxvf mha4mysql-node-0.57.tar.gz

yum install perl-CPAN*

cd mha4mysql-node-0.57

perl Makefile.PL

 

make && make install

 

Check the generated files:

ls -lh /usr/local/bin/

 

b)    Install and configure the MHA manager node

Run the following on slave2:

To install the manager node, first install the prerequisite packages:

yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y

yum -y install perl*

 

Check which dependency packages are still missing and download them.

Check first:

rpm -qa|grep perl-Config-Tiny

rpm -qa|grep perl-Email-Date-Format

rpm -qa|grep perl-Log-Dispatch

rpm -qa|grep perl-Mail-Sender

rpm -qa|grep perl-Mail-Sendmail

rpm -qa|grep perl-MIME-Lite

rpm -qa|grep perl-Time-HiRes

rpm -qa|grep perl-Parallel-ForkManager

Then install:

rpm -ivh perl-Config-Tiny-2.12-1.el6.rfx.noarch.rpm

rpm -ivh perl-Email-Date-Format-1.002-1.el6.rfx.noarch.rpm

rpm -ivh perl-Parallel-ForkManager-0.7.9-1.el6.noarch.rpm

rpm -ivh perl-Mail-Sendmail-0.79_16-4.2.noarch.rpm

rpm -ivh perl-Mail-Sender-0.8.16-3.el6.noarch.rpm

rpm -ivh perl-MIME-Lite-3.029-1.el6.rfx.noarch.rpm

rpm -ivh perl-Log-Dispatch-2.27-1.el6.noarch.rpm

 

Then install the manager node:

tar -zxvf mha4mysql-manager-0.57.tar.gz

cd mha4mysql-manager-0.57

perl Makefile.PL

 

make

 

make install

ls -lh /usr/local/bin

 

Next, configure MHA on the manager node.

Run on slave2:

mkdir -p /usr/local/mha

mkdir -p /etc/mha

cd /etc/mha/

vim mha.conf

[server default]

user=zs

password=123456

manager_workdir=/usr/local/mha    

manager_log=/usr/local/mha/manager.log

remote_workdir=/usr/local/mha

ssh_user=root

repl_user=rep

repl_password=mysql

ping_interval=1        

master_ip_failover_script=/usr/local/scripts/master_ip_failover   

master_ip_online_change_script=/usr/local/scripts/master_ip_online_change 

[server1]

hostname=172.16.10.22

ssh_port=22

master_binlog_dir=/mydata/mysql/mysql3307/logs/

candidate_master=1

port=3307

[server2]

hostname=172.16.10.61

ssh_port=22

master_binlog_dir=/mydata/mysql/mysql3307/logs/

candidate_master=1

port=3307

  

[server3]

hostname=172.16.10.62

ssh_port=22

master_binlog_dir=/mydata/mysql/mysql3307/logs/

no_master=1

port=3307

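Notes:

manager_workdir=/usr/local/mha              //sets the manager's working directory

manager_log=/usr/local/mha/manager.log          //sets the manager's log file

ping_interval=1         //sets the interval, in seconds, between the pings sent to monitor the master; the default is 3 seconds, and failover starts automatically after three attempts get no response

master_ip_failover_script=/usr/local/scripts/master_ip_failover    //sets the switching script used for automatic failover

master_ip_online_change_script=/usr/local/scripts/master_ip_online_change  //sets the switching script used for a manual switch

candidate_master=1   //marks the server as a candidate master; with this set, the slave is promoted to master after a failover even if it is not the slave with the most recent events in the cluster

check_repl_delay=0   //by default, MHA will not choose a slave as the new master if it is more than 100 MB of relay logs behind the master, because recovering that slave would take a long time. With check_repl_delay=0, MHA ignores replication delay when it chooses the new master. This is very useful for hosts with candidate_master=1, because it guarantees that the candidate becomes the new master during the switch. (This parameter is not set in the configuration above.)

no_master=1 //means this server will never become the new master; use it to mark servers that should never be promoted.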
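To double-check which servers the configuration allows to be promoted, the file can be read with Python's standard configparser. This is a minimal sketch that only reads the file; it does not reproduce how MHA itself elects a master. The embedded text is a shortened copy of the mha.conf above, and classify_servers is a hypothetical helper name.

```python
import configparser

# Shortened copy of /etc/mha/mha.conf, keeping only the fields used here.
MHA_CONF = """\
[server default]
user=zs
ping_interval=1

[server1]
hostname=172.16.10.22
candidate_master=1

[server2]
hostname=172.16.10.61
candidate_master=1

[server3]
hostname=172.16.10.62
no_master=1
"""


def classify_servers(conf_text):
    """Return (candidate hosts, hosts that must never become master)."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    candidates, excluded = [], []
    for section in parser.sections():
        if section == "server default":
            continue  # global settings, not a server entry
        host = parser[section]["hostname"]
        if parser[section].get("no_master") == "1":
            excluded.append(host)
        elif parser[section].get("candidate_master") == "1":
            candidates.append(host)
    return candidates, excluded


candidates, excluded = classify_servers(MHA_CONF)
print("candidates:", candidates)
print("never master:", excluded)
```

Here the manager host 172.16.10.62 ends up in the excluded list, which matches the intent of keeping the MHA manager off any server that may become the master.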

Edit the failover switching script:

mkdir -p /usr/local/scripts

cd /usr/local/scripts

vim master_ip_failover

The script content is as follows:

#!/usr/bin/env perl

#use strict;

use warnings FATAL => 'all';

use Getopt::Long;

use Net::Ping;

use Switch;

use DBI;

my ($command, $ssh_user, $orig_master_host, $orig_master_ip, $orig_master_port, $orig_master_ssh_port, $new_master_host, $new_master_ip, $new_master_port, $new_master_user, $new_master_password, $new_master_ssh_port);

$orig_master_ssh_port = 22;

GetOptions(

  'command=s'             => \$command,

  'ssh_user=s'            => \$ssh_user,

  'orig_master_host=s'    => \$orig_master_host,

  'orig_master_ip=s'      => \$orig_master_ip,

  'orig_master_port=i'    => \$orig_master_port,

  'orig_master_ssh_port=i' => \$orig_master_ssh_port,

  'new_master_host=s'     => \$new_master_host,

  'new_master_ip=s'       => \$new_master_ip,

  'new_master_port=i'     => \$new_master_port,

  'new_master_user=s'     => \$new_master_user,

  'new_master_password=s' => \$new_master_password,

  'new_master_ssh_port=i'  => \$new_master_ssh_port,

);

my $vip = '172.16.10.30';  # Virtual IP

my $timeout = 5;

my $key = "2";

my $getway = "172.16.10.1";

my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";

my $ssh_arping_vip = "/sbin/arping -U $vip -I eth0 -c 3";

exit &main();

sub main {

print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

if ( $command eq "stop" || $command eq "stopssh" ) {

    # $orig_master_host, $orig_master_ip, $orig_master_port are passed.

    # If you manage master ip address at global catalog database,

    # invalidate orig_master_ip here.

    my $exit_code = 1;

    eval {

        print "Disabling the VIP on old master if the server is still UP: $orig_master_host \n";

        sleep(30);

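        # $master_srv was never defined in the original; the old master host is the intended target.

        my $p = Net::Ping->new('icmp');

        my $syn = Net::Ping->new("syn");

        $syn->port_number($orig_master_ssh_port);

        $syn->ping($orig_master_host);

        &stop_vip() if $p->ping($orig_master_host, $timeout) && $syn->ack;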

        $p->close();

        $syn->close();

        $exit_code = 0;

    };

    if ($@) {

        warn "Got Error: $@\n";

        exit $exit_code;

    }

    exit $exit_code;

}

elsif ( $command eq "start" ) {

    # all arguments are passed.

    # If you manage master ip address at global catalog database,

    # activate new_master_ip here.

    # You can also grant write access (create user, set read_only=0, etc) here.

my $exit_code = 10;

    eval {

        print "Enabling the VIP - $vip on the new master - $new_master_host \n";

        &start_vip();

        $exit_code = 0;

    };

    if ($@) {

        warn $@;

        exit $exit_code;

    }

    exit $exit_code;

}

elsif ( $command eq "status" ) {

    print "Checking the Status of the script.. OK \n";

    #`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

    exit 0;

}

else {

    &usage();

    exit 1;

}

}

# A simple system call that enable the VIP on the new master

sub start_vip() {

my $new_master_status = `mysql -u$new_master_user -h$new_master_host -P$new_master_port -p$new_master_password --execute=\"show master status;\"`;

`ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;

`ssh $ssh_user\@$new_master_host \" $ssh_arping_vip \"`;

#`/sbin/ifconfig eth0:$key $vip`;

print "\n\nNEW MASTER STATUS \n\n  $new_master_status";

}

# A simple system call that disable the VIP on the old_master

sub stop_vip() {

`ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

sub usage {

print

"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

}

chmod +x master_ip_failover

vi master_ip_online_change

The script content is as follows:

#!/usr/bin/env perl

#  Copyright (C) 2011 DeNA Co.,Ltd.

#

#  This program is free software; you can redistribute it and/or modify

#  it under the terms of the GNU General Public License as published by

#  the Free Software Foundation; either version 2 of the License, or

#  (at your option) any later version.

#

#  This program is distributed in the hope that it will be useful,

#  but WITHOUT ANY WARRANTY; without even the implied warranty of

#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the

#  GNU General Public License for more details.

#

#  You should have received a copy of the GNU General Public License

#   along with this program; if not, write to the Free Software

#  Foundation, Inc.,

#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;

use warnings FATAL => 'all';

use Getopt::Long;

use MHA::DBHelper;

use MHA::NodeUtil;

use Time::HiRes qw( sleep gettimeofday tv_interval );

use Data::Dumper;

my $_tstart;

my $_running_interval = 0.1;

my $vip = '172.16.10.30';

my $key = "2";

my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";

my $ssh_stop_vip = "/sbin/ifconfig eth0:$key down";

my $ssh_send_garp = "/sbin/arping -U $vip -I eth0 -c 1";

my (

  $command,              $orig_master_is_new_slave, $orig_master_host,

  $orig_master_ip,       $orig_master_port,         $orig_master_user,

  $orig_master_password, $orig_master_ssh_user,     $new_master_host,

  $new_master_ip,        $new_master_port,          $new_master_user,

  $new_master_password,  $new_master_ssh_user,

);

GetOptions(

  'command=s'                => \$command,

  'orig_master_is_new_slave' => \$orig_master_is_new_slave,

  'orig_master_host=s'       => \$orig_master_host,

  'orig_master_ip=s'         => \$orig_master_ip,

  'orig_master_port=i'       => \$orig_master_port,

  'orig_master_user=s'       => \$orig_master_user,

  'orig_master_password=s'   => \$orig_master_password,

  'orig_master_ssh_user=s'   => \$orig_master_ssh_user,

  'new_master_host=s'        => \$new_master_host,

  'new_master_ip=s'          => \$new_master_ip,

  'new_master_port=i'        => \$new_master_port,

  'new_master_user=s'        => \$new_master_user,

  'new_master_password=s'    => \$new_master_password,

  'new_master_ssh_user=s'    => \$new_master_ssh_user,

);

exit &main();

sub start_vip(){

    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_start_vip \"`;

    `ssh $new_master_ssh_user\@$new_master_host \" $ssh_send_garp \"`;

}

sub stop_vip(){

    `ssh $orig_master_ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;

}

sub current_time_us {

  my ( $sec, $microsec ) = gettimeofday();

  my $curdate = localtime($sec);

  return $curdate . " " . sprintf( "%06d", $microsec );

}

sub sleep_until {

  my $elapsed = tv_interval($_tstart);

  if ( $_running_interval > $elapsed ) {

    sleep( $_running_interval - $elapsed );

  }

}

sub get_threads_util {

  my $dbh                    = shift;

  my $my_connection_id       = shift;

  my $running_time_threshold = shift;

  my $type                   = shift;

  $running_time_threshold = 0 unless ($running_time_threshold);

  $type                   = 0 unless ($type);

  my @threads;

  my $sth = $dbh->prepare("SHOW PROCESSLIST");

  $sth->execute();

  while ( my $ref = $sth->fetchrow_hashref() ) {

    my $id         = $ref->{Id};

    my $user       = $ref->{User};

    my $host       = $ref->{Host};

    my $command    = $ref->{Command};

    my $state      = $ref->{State};

    my $query_time = $ref->{Time};

    my $info       = $ref->{Info};

    $info =~ s/^\s*(.*?)\s*$/$1/ if defined($info);

    next if ( $my_connection_id == $id );

    next if ( defined($query_time) && $query_time < $running_time_threshold );

    next if ( defined($command)    && $command eq "Binlog Dump" );

    next if ( defined($user)       && $user eq "system user" );

    next

      if ( defined($command)

      && $command eq "Sleep"

      && defined($query_time)

      && $query_time >= 1 );

    if ( $type >= 1 ) {

      next if ( defined($command) && $command eq "Sleep" );

      next if ( defined($command) && $command eq "Connect" );

    }

    if ( $type >= 2 ) {

      next if ( defined($info) && $info =~ m/^select/i );

      next if ( defined($info) && $info =~ m/^show/i );

    }

    push @threads, $ref;

  }

  return @threads;

}

sub main {

  if ( $command eq "stop" ) {

    ## Gracefully killing connections on the current master

    # 1. Set read_only= 1 on the new master

    # 2. DROP USER so that no app user can establish new connections

    # 3. Set read_only= 1 on the current master

    # 4. Kill current queries

    # * Any database access failure will result in script die.

    my $exit_code = 1;

    eval {

      ## Setting read_only=1 on the new master (to avoid accident)

      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error(die_on_error)_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

      print current_time_us() . " Set read_only on the new master.. ";

      $new_master_handler->enable_read_only();

      if ( $new_master_handler->is_read_only() ) {

        print "ok.\n";

      }

      else {

        die "Failed!\n";

      }

      $new_master_handler->disconnect();

      # Connecting to the orig master, die if any database error happens

      my $orig_master_handler = new MHA::DBHelper();

      $orig_master_handler->connect( $orig_master_ip, $orig_master_port,

        $orig_master_user, $orig_master_password, 1 );

      ## Drop application user so that nobody can connect. Disabling per-session binlog beforehand

      $orig_master_handler->disable_log_bin_local();

      # print current_time_us() . " Drpping app user on the orig master..\n";

      #drop_app_user($orig_master_handler);

      ## Waiting for N * 100 milliseconds so that current connections can exit

      my $time_until_read_only = 15;

      $_tstart = [gettimeofday];

      my @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_read_only > 0 && $#threads >= 0 ) {

        if ( $time_until_read_only % 5 == 0 ) {

          printf

"%s Waiting all running %d threads are disconnected.. (max %d milliseconds)\n",

            current_time_us(), $#threads + 1, $time_until_read_only * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_read_only--;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

      ## Setting read_only=1 on the current master so that nobody(except SUPER) can write

      print current_time_us() . " Set read_only=1 on the orig master.. ";

      $orig_master_handler->enable_read_only();

      if ( $orig_master_handler->is_read_only() ) {

        print "ok.\n";

      }

      else {

        die "Failed!\n";

      }

      ## Waiting for M * 100 milliseconds so that current update queries can complete

      my $time_until_kill_threads = 5;

      @threads = get_threads_util( $orig_master_handler->{dbh},

        $orig_master_handler->{connection_id} );

      while ( $time_until_kill_threads > 0 && $#threads >= 0 ) {

        if ( $time_until_kill_threads % 5 == 0 ) {

          printf

"%s Waiting all running %d queries are disconnected.. (max %d milliseconds)\n",

            current_time_us(), $#threads + 1, $time_until_kill_threads * 100;

          if ( $#threads < 5 ) {

            print Data::Dumper->new( [$_] )->Indent(0)->Terse(1)->Dump . "\n"

              foreach (@threads);

          }

        }

        sleep_until();

        $_tstart = [gettimeofday];

        $time_until_kill_threads--;

        @threads = get_threads_util( $orig_master_handler->{dbh},

          $orig_master_handler->{connection_id} );

      }

      ## Terminating all threads

      print current_time_us() . " Killing all application threads..\n";

      $orig_master_handler->kill_threads(@threads) if ( $#threads >= 0 );

      print current_time_us() . " done.\n";

      $orig_master_handler->enable_log_bin_local();

      $orig_master_handler->disconnect();

      ## Dropping the VIP

      print "Disabling the VIP on old master: $orig_master_host \n";

      &stop_vip();

      ## After finishing the script, MHA executes FLUSH TABLES WITH READ LOCK

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "start" ) {

    ## Activating master ip on the new master

    # 1. Create app user with write privileges

    # 2. Moving backup script if needed

    # 3. Register new master's ip to the catalog database

# We don't return error even though activating updatable accounts/ip failed so that we don't interrupt slaves' recovery.

# If exit code is 0 or 10, MHA does not abort

    my $exit_code = 10;

    eval {

      my $new_master_handler = new MHA::DBHelper();

      # args: hostname, port, user, password, raise_error_or_not

      $new_master_handler->connect( $new_master_ip, $new_master_port,

        $new_master_user, $new_master_password, 1 );

      ## Set read_only=0 on the new master

      $new_master_handler->disable_log_bin_local();

      print current_time_us() . " Set read_only=0 on the new master.\n";

      $new_master_handler->disable_read_only();

      ## Creating an app user on the new master

      #print current_time_us() . " Creating app user on the new master..\n";

      # create_app_user($new_master_handler);

      print "Enabling the VIP $vip on the new master: $new_master_host \n";

      &start_vip();

      $new_master_handler->enable_log_bin_local();

      $new_master_handler->disconnect();

      ## Update master ip on the catalog database, etc

      $exit_code = 0;

    };

    if ($@) {

      warn "Got Error: $@\n";

      exit $exit_code;

    }

    exit $exit_code;

  }

  elsif ( $command eq "status" ) {

    # do nothing

    exit 0;

  }

  else {

    &usage();

    exit 1;

  }

}

sub usage {

  print

"Usage: master_ip_online_change --command=start|stop|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";

  die;

}

chmod +x master_ip_online_change

Use the MHA tools to check SSH.

Install the required package:

yum -y install perl-Time-HiRes

Run the check command:

/usr/local/bin/masterha_check_ssh --conf=/etc/mha/mha.conf

 

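If every result is OK, the SSH check has passed.

Check the whole master/slave topology:

/usr/local/bin/masterha_check_repl --conf=/etc/mha/mha.conf

This step is prone to errors. Note that root must be able to log in between the master and the slaves, for example:

update mysql.user set Host='%' where User='root';

flush privileges;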

 

5.4.     Adding the VIP

On the master (172.16.10.22), add the VIP (it is added manually the first time only):

ip addr add 172.16.10.30/24 dev eth0:1

ip addr show

To remove it: ip addr del 172.16.10.30/24 dev eth0

5.5.     Starting the MHA Service

On the manager node slave2 (172.16.10.62), start MHA:

nohup /usr/local/bin/masterha_manager --conf=/etc/mha/mha.conf > /tmp/mha_manager.log < /dev/null 2>&1 &

Note: if a failover test has already been run once, start MHA as follows instead:

nohup /usr/local/bin/masterha_manager --conf=/etc/mha/mha.conf --ignore_last_failover > /tmp/mha_manager.log < /dev/null 2>&1 &

 

Verify that it started successfully and view its status:

/usr/local/bin/masterha_check_status --conf=/etc/mha/mha.conf

 

5.6.     Failover Drill

Simulate a crash of the master (172.16.10.22) by stopping its MySQL service:

mysqladmin -S /tmp/mysql3307.sock -uroot -pmysql shutdown

The original slave1 automatically takes over the VIP 172.16.10.30, which can be confirmed on slave1 with:

ip addr

 

In other words, slave1 has become the new master.

Slave2 now points to the new master.

 

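The MHA process on the manager node (172.16.10.62) also stops automatically after the failover. Because grep is case-sensitive and the process is named masterha_manager, the masterha_check_status command below is the more reliable of the two checks: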

ps -ef |grep MHA

root     11998  5673  0 09:01 pts/1    00:00:00 grep MHA

/usr/local/bin/masterha_check_status --conf=/etc/mha/mha.conf

 

5.7.     Restoring the Original Master

Start the original master:

mysqld_safe --defaults-file=/etc/my3307.cnf &

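Reconfigure replication:

Data consistency is a prerequisite here (in production, if the data is inconsistent, restore from a backup first). The original master becomes a new slave of the new master (172.16.10.61), which gives a new one-master, two-slave topology. Run on the original master:

change master to master_host='172.16.10.61',master_port=3307,master_user='rep',master_password='mysql',master_auto_position=1;

start slave;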

Manual online switch:

In my test environment I ran the following command:

/usr/local/bin/masterha_master_switch --conf=/etc/mha/mha.conf --master_state=alive --new_master_host=172.16.10.22 --new_master_port=3307 --orig_master_is_new_slave --running_updates_limit=10000

Note: the monitor process does not need to be started for the switch back.

Enter YES when prompted along the way.

 

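The original slave1 (currently the master) is the server being switched out.

Check the VIP:

The VIP has floated back, so the switch back succeeded.

After a successful switch back, start MHA again: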

nohup /usr/local/bin/masterha_manager --conf=/etc/mha/mha.conf --ignore_last_failover > /tmp/mha_manager.log < /dev/null 2>&1 &

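5.8.     Summary

MHA works well in a local lab, but a few shortcomings remain:

1. After a failover, switching back requires running the following command manually: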

/usr/local/bin/masterha_master_switch --conf=/etc/mha/mha.conf --master_state=alive --new_master_host=172.16.10.22 --new_master_port=3307 --orig_master_is_new_slave --running_updates_limit=10000

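2. When the master is taken down for a test, the failover happens but the original master server still holds the VIP, which can only be removed from it by hand. The new master holds the VIP as well, and connection tests through the VIP do reach the new master.

3. After an online change, the original master server still holds the VIP, which can only be removed from it by hand. The new master holds the VIP as well, and connection tests through the VIP do reach the new master.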
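Since points 2 and 3 both leave the VIP on two servers at once, it may be worth checking for that after every switch. The sketch below scans captured `ip addr` output from each host for the VIP. The sample output strings are abbreviated and invented for illustration, and hosts_holding_vip is a hypothetical helper; collecting the real output (for example over SSH) is left out.

```python
import re

VIP = "172.16.10.30"


def hosts_holding_vip(ip_addr_outputs, vip=VIP):
    """Return the hosts whose 'ip addr' output lists the VIP as an inet address."""
    pattern = re.compile(
        r"^\s*inet\s+" + re.escape(vip) + r"(/\d+)?\s", re.MULTILINE
    )
    return sorted(
        host for host, text in ip_addr_outputs.items() if pattern.search(text)
    )


# Invented, abbreviated 'ip addr' output per host, for illustration only.
outputs = {
    "rac2": (
        "    inet 172.16.10.22/24 brd 172.16.10.255 scope global eth0\n"
        "    inet 172.16.10.30/24 scope global secondary eth0\n"
    ),
    "racdg1": (
        "    inet 172.16.10.61/24 brd 172.16.10.255 scope global eth0\n"
        "    inet 172.16.10.30/24 scope global secondary eth0:2\n"
    ),
    "racdg2": "    inet 172.16.10.62/24 brd 172.16.10.255 scope global eth0\n",
}

holders = hosts_holding_vip(outputs)
if len(holders) > 1:
    print("VIP is present on more than one host:", holders)
```

Any result with more than one host means the stale VIP still has to be removed from the old master, as described in the points above.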

Reposted from www.cnblogs.com/hmwh/p/9263547.html