RabbitMQ + Keepalived + HAProxy: building a high-availability cluster

1. Introduction to clusters

1.1 RabbitMQ cluster

Normally, each service instance in a cluster is called a node. In a RabbitMQ cluster, nodes fall into two types:

  • Memory node: metadata is kept in memory. So that it can resynchronize after a restart, a memory node stores the addresses of the disk nodes on disk; persistent messages are also written to disk. Because memory nodes are fast at reads and writes, clients generally connect to memory nodes.
  • Disk node: metadata is stored on disk (the default node type). A cluster must keep at least one disk node; otherwise, once it goes down the metadata cannot be recovered and high availability of the cluster is lost.
    PS: Metadata refers to basic information such as queue names and attributes, exchange types/names/attributes, binding information, vhosts, etc. It does not include the message data in the queues.

There are two main cluster modes in RabbitMQ: normal cluster mode and mirror queue mode.

Normal cluster mode
In normal cluster mode, the nodes in the cluster only synchronize metadata with each other; message data is not synchronized. This raises a question: what if we are connected to node A but the message is stored on node B?

Whether for a producer or a consumer, if the queue's data is not stored on the connected node, the request is forwarded internally to the node that holds the queue. But because each message lives on only one node, if that node goes down the message is lost. Normal cluster mode therefore does not achieve high availability.

Mirror queue mode (selected this time)
In normal cluster mode, nodes only synchronize metadata with each other (the definitions of exchanges, queues, bindings, and vhosts), not messages. For example, the messages of queue 1 are stored only on node 1; nodes 2 and 3 synchronize only the exchange and queue metadata, not the messages.

If the producer connects to node 3, a message routed to queue 1 through exchange A will ultimately be forwarded to node 1 for storage. Similarly, if the consumer connects to node 2 and pulls messages from queue 1, the messages are forwarded from node 1 to node 2, with the other nodes merely acting as routers. If node 1 goes down, all the data in queue 1 is lost.

In mirror queue mode, the nodes synchronize not only metadata but also message content between mirror nodes, giving higher availability. The trade-off is some performance cost from the network overhead of synchronizing the data.

1.2 Cluster architecture

If there are multiple memory nodes in a RabbitMQ cluster, which one should we connect to? Implementing this selection on the client has serious drawbacks, the worst being that the client code must be modified every time the cluster is scaled, so that approach is not desirable. When deploying a cluster we therefore need an intermediate proxy component that can monitor services and forward traffic, much like Sentinel mode in Redis, where the sentinel monitors the Redis nodes and performs failover.

In a RabbitMQ cluster, the high availability and load balancing of the cluster are implemented with two components: Keepalived and HAProxy.

HAProxy

HAProxy is open-source, high-performance load-balancing software; alternatives such as Nginx or LVS can also be used. HAProxy supports both Layer 7 and Layer 4 load balancing.

Specific usage reference: Detailed explanation of HAProxy installation and configuration

Keepalived

To make HAProxy itself highly available, we introduce the Keepalived component. Keepalived has the following characteristics:

  • It monitors the status of the nodes in the cluster; if a node goes down, it performs failover.
  • Keepalived instances themselves form a cluster, in which there can be only one master node.
  • The master node exposes a virtual IP, and applications only need to connect to that IP. You can think of the HAProxy nodes as competing for this virtual IP; whichever node holds it provides the service.

VRRP protocol
VRRP is the Virtual Router Redundancy Protocol. The virtual IP mechanism in Keepalived is based on VRRP, a fault-tolerance protocol that avoids single points of failure in routers.

Specific usage reference: Build keepalived+nginx hot standby high availability (active standby + dual active mode)

The load-balanced RabbitMQ cluster architecture should look like the following figure:
[architecture diagram]

2. RabbitMQ cluster construction

RabbitMQ normal cluster mode

2.1 Environment preparation

IP address       Host name   Remark
192.168.92.100   mq01        Disk node, RabbitMQ
192.168.92.101   mq02        Memory node, keepalived + haproxy + RabbitMQ
192.168.92.110   mq03        Memory node, keepalived + haproxy + RabbitMQ

2.2 host configuration

  1. Change the host name to mq01, and change the other two to mq02 and mq03 respectively.

    vim /etc/hostname
    mq01
    # reboot after editing for the change to take effect
    reboot
    
  2. Configure hosts.
    Configure the /etc/hosts files of the three server nodes: add the mapping relationship between the node name and the node IP.

    vim /etc/hosts
    # note: names must not contain dots, and the host names themselves must be changed as well
    192.168.92.100 mq01
    192.168.92.101 mq02
    192.168.92.110 mq03
    

    Restart the network service (service network restart) and test whether the three nodes can ping each other.
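RabbitMQ node names take the form rabbit@shortname, which is why the note above warns against dots in the host names. A small local sketch for checking the planned names (just an illustration; RabbitMQ does not run this):

```shell
# Check that each planned short host name is dot-free,
# so it can be used in a rabbit@shortname node name.
for name in mq01 mq02 mq03; do
    case "$name" in
        *.*) echo "$name: invalid (contains a dot)" ;;
        *)   echo "$name: ok" ;;
    esac
done
```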

2.3 rabbitmq cluster construction

  1. yum install rabbitmq

    yum install -y epel-release
    
    yum install -y rabbitmq-server
    
  2. Copy erlang.cookie
    The RabbitMQ cluster depends on an Erlang cluster to work, so the Erlang cluster environment must be set up first. Nodes in an Erlang cluster authenticate each other through a magic cookie, stored in /var/lib/rabbitmq/.erlang.cookie with 400 permissions. The cookie must be identical on every node, otherwise the nodes cannot communicate.
    We first start the stand-alone version of RabbitMQ on mq01 to generate the Cookie file:

    systemctl start rabbitmq-server
    # the file simply contains a string
    [root@mq01 rabbitmq]#  cat /var/lib/rabbitmq/.erlang.cookie 
    KVUEFJJZLEXGPOEBXQKO
    
    

    Use scp to copy the .erlang.cookie of the mq01 node to the other two nodes.

    scp /var/lib/rabbitmq/.erlang.cookie root@192.168.92.101:/var/lib/rabbitmq/.erlang.cookie
    
    scp /var/lib/rabbitmq/.erlang.cookie root@192.168.92.110:/var/lib/rabbitmq/.erlang.cookie
    

    Since you may operate with different accounts on the three hosts, to avoid permission problems later it is recommended to change the cookie file's original 400 permissions to 777. The command is as follows:

    chmod 777 /var/lib/rabbitmq/.erlang.cookie
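The two requirements in play here, identical cookie content on every node and (as the error later in this section shows) owner-only permissions, can be rehearsed locally with throwaway files. The paths below are temporary stand-ins, not the real /var/lib/rabbitmq/.erlang.cookie:

```shell
# Simulate the cookie on two "nodes" with temp files
cookie_a=$(mktemp); cookie_b=$(mktemp)
printf 'KVUEFJJZLEXGPOEBXQKO' > "$cookie_a"
cp "$cookie_a" "$cookie_b"              # what scp does between nodes

# The contents must be byte-identical, or the nodes cannot talk
cmp -s "$cookie_a" "$cookie_b" && echo "cookies match"

# RabbitMQ requires owner-only access (mode 400) on the real file
chmod 400 "$cookie_a"
stat -c '%a' "$cookie_a"
rm -f "$cookie_a" "$cookie_b"
```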
    
  3. On each of the three nodes, enable the management plugin and then start the rabbitmq service.
    RabbitMQ provides a friendly graphical monitoring plugin (rabbitmq_management) that lets us see the status of a node or of the whole cluster at a glance.

    # check which plugins are installed
    /usr/lib/rabbitmq/bin/rabbitmq-plugins list 
    # enable the rabbitmq_management plugin
    /usr/lib/rabbitmq/bin/rabbitmq-plugins enable rabbitmq_management
    # start the service in the background
    rabbitmq-server -detached
    
    

    Startup error resolution
    When creating a rabbitmq cluster, the .erlang.cookie of the current node must be made identical to the first node's file. To avoid a trailing newline being introduced by editing with vim, I replaced the file directly; on restarting the current mq node's service, the following error appeared:

    [root@net-test-leel ~]# systemctl restart rabbitmq-server 
    Redirecting to /bin/systemctl restart rabbitmq-server.service
    Job for rabbitmq-server.service failed because the control process exited with error code. See "systemctl status rabbitmq-server.service" and "journalctl -xe" for details.
    
    [error] Cookie file /var/lib/rabbitmq/.erlang.cookie must be accessible by owner only
    

    Problem Solving:
    The error is a permissions problem: the file must be accessible by its owner only. Because the file was copied over from another node, ownership and permissions need to be granted again. Execute the following commands in RabbitMQ's default installation path /var/lib/rabbitmq/:

    cd /var/lib/rabbitmq/
    sudo chown rabbitmq:rabbitmq .erlang.cookie        
    sudo chmod 400 .erlang.cookie
    

    Afterwards, the problem was solved and mq could be started normally.

  4. View listening port
    netstat -ntap | grep 5672

  5. To build the RabbitMQ cluster, pick any one node as the base and join the other nodes to it one by one. Here mq01 is the base node, and mq02 and mq03 are added to the cluster.
    Execute the following commands on mq02 and mq03:

    # 1. stop the rabbit application
    rabbitmqctl stop_app
    # 2. reset state (optional)
    rabbitmqctl reset
    # 3. join the cluster as a RAM node
    rabbitmqctl join_cluster --ram rabbit@mq01
    # 4. start the application
    rabbitmqctl start_app
    


    Note:
    a. By default, a rabbitmq node starts as a disk node. With the commands above, mq02 and mq03 become memory nodes and mq01 remains a disk node.
    b. To make mq02 and mq03 disk nodes as well, omit the --ram parameter.
    c. To change a node's type afterwards, use rabbitmqctl change_cluster_node_type disc|ram; the rabbit application must be stopped first.

  6. Check the cluster status.
    rabbitmqctl cluster_status
    [screenshot: rabbitmqctl cluster_status output]
    You can see all nodes listed under nodes: mq01 appears as a disc (disk) node, while mq02 and mq03 are ram (memory) nodes, which means the cluster was built successfully. The default cluster_name is rabbit@mq01; to change it, use the following command:
    rabbitmqctl set_cluster_name my_rabbitmq_cluster

  7. Log in to the rabbitmq web management console to view.
    Open a browser at http://ip:15672 and log in with the default username guest and password guest; after logging in you will see the page shown below.
    [screenshot: RabbitMQ management console]

RabbitMQ mirror cluster configuration

2.4 Mirror cluster configuration

Overview
The steps above complete RabbitMQ's default cluster mode, which does not guarantee the availability of queues: exchanges and bindings are replicated to every node in the cluster, but queue contents are not. Although this mode spreads load across nodes, the loss of a queue's node makes that queue unusable until the node restarts. So for queues to remain usable even when their node goes down or fails, the queue contents must be replicated to the other nodes of the cluster, i.e. a mirror queue must be created.

The mirror queue is layered on top of normal cluster mode with some added policies, so you must configure the normal cluster first and can then set up the mirror queue. We continue from the cluster built above.
Note:
The mirror queue can be configured either from the web management console or with commands; the command method is shown here.
1. Enable the mirror queue
Execute the following command on the master node (a policy added on one node appears on the other nodes as well):

rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'
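The policy definition is plain JSON, and shell quoting mistakes are easy to make. As a local sanity check (my own habit, not something rabbitmqctl requires), the string can be run through a JSON parser first:

```shell
# Validate the policy JSON before handing it to rabbitmqctl
policy='{"ha-mode":"all"}'
echo "$policy" | python3 -m json.tool > /dev/null && echo "policy JSON is valid"
```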

2. Replication coefficient
Above we set ha-mode to all, meaning messages are synchronized to the same queue on every node. We configure it this way because we have only three nodes, so the performance overhead of replication is small. If your cluster has many nodes, replication becomes expensive and you should choose an appropriate replication factor. Generally the majority principle suffices: in a cluster of n nodes, synchronizing to n/2+1 nodes is enough. In that case change the mirroring policy to exactly and set the replication factor with ha-params. Example command:

rabbitmqctl set_policy ha-two "^" '{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}'
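The n/2+1 rule above is simple integer arithmetic; for a few cluster sizes the corresponding ha-params values come out as:

```shell
# Majority replica count for an n-node cluster: floor(n/2) + 1
for n in 3 5 7 9; do
    echo "$n nodes -> ha-params $(( n / 2 + 1 ))"
done
```

For the three-node cluster in this article this gives 2, matching the ha-params value in the command above.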

In addition, RabbitMQ also supports the use of regular expressions to filter queues that require mirroring operations. The example is as follows:

rabbitmqctl set_policy ha-all "^ha\." '{"ha-mode":"all"}'

Now only queues whose names begin with ha. will be mirrored.
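Which names the pattern ^ha\. actually selects can be verified with the same regex outside RabbitMQ; the queue names below are invented for illustration:

```shell
# The policy pattern is an ordinary regex anchored at the start of the name
for queue in ha.orders ha.payments task.queue notify.ha; do
    if echo "$queue" | grep -Eq '^ha\.'; then
        echo "$queue -> mirrored"
    else
        echo "$queue -> not mirrored"
    fi
done
```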

2.5 Cluster destructive testing

  1. We create a new queue and send a message to it.
    [screenshot: the new queue with a test message]

  2. Shut down the service of the mq01 node, and then check whether the message record still exists through mq02 and mq03.
    rabbitmqctl stop_app
    [screenshot: queue state after stopping mq01]
    You can see the ab queue's mirror indicator has dropped from +2 to +1, and the message record still exists.

  3. Then close the service of the mq02 node and check whether the message record still exists through mq03.
    rabbitmqctl stop_app
    [screenshot: queue state after stopping mq02]
    You can see that the ab queue and its message record still exist, but only one node remains.

  4. Restart the services of mq01 and mq02
    rabbitmqctl start_app
    [screenshot: queue state after restarting mq01 and mq02]

You can see that the +2 behind the ab queue has turned pink; hovering over it shows that the mirrors cannot be synchronized. If the mq03 node's service were stopped now, the messages in the queue would be lost.

The solution is to run the synchronization command on the mq02 node:
rabbitmqctl sync_queue 11
where 11 is the queue name.

After the synchronization is completed, the +2 turns blue again.

With this destructive testing of the RabbitMQ cluster complete, the cluster configuration is confirmed to be working.

2.6 Node goes offline

The cluster-building process above is essentially service scale-out. To scale in, that is, to remove a node from the cluster, there are two options:

The first: stop the service on the node with rabbitmqctl stop, then execute the forget_cluster_node command on any other node. Taking removal of the service on mq03 as an example, run the following on mq01 or mq02:

rabbitmqctl forget_cluster_node rabbit@mq03

The second: stop the service on the node with rabbitmqctl stop, then execute rabbitmqctl reset. This clears all historical data on the node and actively notifies the other nodes in the cluster that it is leaving.

2.7 Cluster shutdown and restart

There is no single command to shut down the whole cluster; nodes must be shut down one by one. When restarting, however, make sure the last node shut down is the first one started. If the first node started is not the last one shut down, it will wait for that node to come online: by default it makes 10 connection attempts with a 30-second timeout each, and if the node still has not appeared, startup fails.

One problem this causes: suppose the three nodes were shut down in the order mq01, mq02, mq03, and mq01 cannot be recovered for a while due to a fault; then mq02 and mq03 cannot start. To solve this, first remove the mq01 node. The command is as follows:

rabbitmqctl forget_cluster_node --offline rabbit@mq01

The --offline parameter is needed here; it allows the node to be removed without the local node being started.

3. HAproxy load

I won’t go into details about the installation and use here. Students in need can refer to: Detailed explanation of the installation and configuration of HAProxy

3.1 Modify HAproxy configuration file

Add the following configuration at the end of the /etc/haproxy/haproxy.cfg file:

listen admin_stats
    bind 0.0.0.0:8189
    stats enable
    mode http
    log global
    stats uri /haproxy_stats
    stats realm Haproxy\ Statistics
    stats auth admin:admin
    #stats hide-version   
    #stats admin if TRUE 
    stats refresh 30s


listen rabbitmq_admin
    bind 0.0.0.0:15673
    server mq01 192.168.92.100:15672
    server mq02 192.168.92.101:15672
    server mq03 192.168.92.110:15672


listen rabbitmq_cluster
    bind 0.0.0.0:5673
    mode tcp
    option tcplog
    maxconn 10000
    balance roundrobin
    server mq01 192.168.92.100:5672 check inter 1000 rise 2 fall 2
    server mq02 192.168.92.101:5672 check inter 1000 rise 2 fall 2
    server mq03 192.168.92.110:5672 check inter 1000 rise 2 fall 2

Also comment out the option forwardfor line.

3.2 Start HAproxy load

systemctl restart haproxy

ps -ef|grep haproxy

You can see that the management console can also be reached through port 15673.
We then stopped the services on 110 (mq03) and 101 (mq02) with rabbitmqctl stop_app and found the console still accessible through the proxy. At this point, load balancing is in place.

4. Keepalive configuration Haproxy load

The principle is the same as nginx. Students in need can refer to: Building keepalived+nginx hot standby high availability (active standby + dual active mode)

Next, set up Keepalived to handle HAProxy failover. Here I installed Keepalived on mq02 and mq03; the setup steps on the two hosts are exactly the same, with slight differences in configuration, as follows:

4.1 keepalived installation

Install using yum here

# install ipvs
yum install ipvsadm
# install keepalived
yum install keepalived

Common commands

# start
systemctl start keepalived
# stop
systemctl stop keepalived
# restart
systemctl restart keepalived
# check status
systemctl status keepalived
# enable at boot
systemctl enable keepalived
# disable at boot
systemctl disable keepalived

Relevant configurations can be modified by editing the keepalived.conf file in the /etc/keepalived/ directory.

vim /etc/keepalived/keepalived.conf 
  • CentOS 7 has the firewalld firewall enabled by default; turn the firewall off.
# start the firewall
systemctl start firewalld
# stop the firewall
systemctl stop firewalld

4.2 HAProxy check script configuration

Create a script named haproxy_check.sh under the /etc/keepalived/ directory. The script checks whether the HAProxy service is healthy; if HAProxy is down and cannot be restarted, the local Keepalived must be stopped so that the virtual IP can drift to the backup node.

vim /etc/keepalived/haproxy_check.sh

#!/bin/bash

# check whether haproxy is already running
if [ $(ps -C haproxy --no-header | wc -l) -eq 0 ] ; then
    # if it is not running, start it
    systemctl start haproxy
fi

# sleep 3 seconds so haproxy can start fully
sleep 3

# if haproxy still is not running, stop the local keepalived
# so the VIP drifts automatically to the other haproxy node
if [ $(ps -C haproxy --no-header | wc -l) -eq 0 ] ; then
    systemctl stop keepalived
fi

After creation, give it execution permissions:

chmod +x /etc/keepalived/haproxy_check.sh
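The core of the script is the ps -C … | wc -l test. It can be rehearsed with any process; here a short-lived sleep stands in for haproxy (ps -o pid= suppresses the header line, playing the role of the --no-header flag in the script):

```shell
# Rehearse the "is it running?" check with a short-lived sleep process
sleep 3 &
count=$(ps -C sleep -o pid= | wc -l)   # 0 would mean "not running"
if [ "$count" -eq 0 ]; then
    echo "process is not running"
else
    echo "process is running"
fi
wait
```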

4.3 mq03 (server 110) master keepalived configuration

vim /etc/keepalived/keepalived.conf

! Configuration File for keepalived

global_defs {
    router_id haproxy_01        # machine identifier, must be unique
}

vrrp_script haproxy_check {
    script "/etc/keepalived/haproxy_check.sh"  # script location
    interval 2
    weight -5
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state MASTER                # initial state of the instance: MASTER or BACKUP
    interface ens33             # NIC the instance binds to; the virtual IP must be added on an existing NIC (check your system: mine defaults to ens33, some use eth0)
    virtual_router_id 51        # VRID; very important, must match between master and backup
    priority 150                # priority of this node; the higher-priority node becomes master
    advert_int 1
    authentication {
        # authentication method and password; must be identical on master and backup
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.92.200          # the VIP; must be the same on both machines
    }
    track_script {
        # reference the vrrp_script defined above
        haproxy_check 
    }
}

There are three basic modules: global_defs (global settings), vrrp_instance (VIP configuration), and vrrp_script (the health-check script, here used to detect the HAProxy service).
Note: after defining the script with vrrp_script, you must also reference it with the track_script parameter inside the vrrp_instance module. I fell into this trap, and the script did not take effect.

global_defs module parameters

  • notification_email: the email addresses keepalived notifies when a switchover occurs; the smtp_server below it is the mail server address. You can also alert through other channels, since email is not real-time.
  • router_id: machine identifier, usually set to the hostname; it is used in failure notification emails.

vrrp_instance module parameters

  • state: the initial state of the instance, MASTER or BACKUP. It is not decisive on its own; it interacts with the priority parameter below.
  • interface: the NIC the instance binds to, because the virtual IP must be added on an existing NIC. (Check your own system: mine defaults to ens33, some use eth0.)
  • mcast_src_ip: the source IP for multicast packets, i.e. the address from which VRRP advertisements are sent. This matters: pick a stable NIC port, analogous to heartbeat's heartbeat port. If unset, the default IP of the NIC given by interface is used.
  • virtual_router_id: the VRID. Very important: nodes with the same VRID form one group, and it determines the multicast MAC address.
  • priority: the priority of this node (1-255); the higher-priority node becomes master.
  • advert_int: advertisement interval, default 1 second. This is the VRRP timer: at every interval the MASTER sends an advertisement telling the other routers in the group that it is working normally.
  • authentication: the authentication method and password; master and backup must match.
  • virtual_ipaddress: the VIP (virtual IP address). It is added when the node is in the master state and removed in the backup state, driven mainly by priority rather than the state setting. Multiple IP addresses can be listed.
  • track_script: references a vrrp_script by its name. The scripts run periodically, can change the priority, and can ultimately trigger a master/backup switchover.

The vrrp_script module parameters
tell keepalived under what circumstances to switch over, so they are particularly important. There can be multiple vrrp_script blocks.

  • script: a detection script you write yourself; it can also be a one-line command such as killall -0 haproxy
  • interval 2: run the check every 2 seconds
  • weight -5: if the check fails (the script returns non-zero), lower the priority by 5
  • fall 3: the check must fail 3 consecutive times to count as a real failure; only then is weight applied to lower the priority (which stays within 1-255)
  • rise 2: 2 consecutive successes mark the check healthy again, without changing the priority
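With the concrete numbers used in this article (master priority 150, backup priority 100, weight -5), a failed check alone never drops the master below the backup, which is exactly why the check script stops keepalived outright instead of relying on the weight:

```shell
# Effective VRRP priority when the check script fails (weight applied)
master=150; backup=100; weight=-5
effective=$(( master + weight ))
echo "master effective priority after a failed check: $effective"
if [ "$effective" -gt "$backup" ]; then
    echo "weight alone would not hand the VIP to the backup"
fi
```

To make the weight mechanism itself trigger the failover, the gap between the two priorities would have to be smaller than the weight.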

4.4 mq02 (server 101) backup keepalived configuration

vim /etc/keepalived/keepalived.conf

! Configuration File for keepalived

global_defs {
    router_id haproxy_02        # machine identifier, must be unique
}

vrrp_script haproxy_check {
    script "/etc/keepalived/haproxy_check.sh"  # script location
    interval 2
    weight -5
    fall 3
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP                # configured as the backup
    interface ens33  
    virtual_router_id 51
    priority 100                # must be lower than the master's priority
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        192.168.92.200
    }

    track_script {
        haproxy_check
    }
}

After the configuration is completed, restart keepalived on both machines and check the IPs:

systemctl restart keepalived
ip a


4.5 Verify failover

Now we verify failover. According to the check script above, Keepalived will be stopped once HAProxy is down and cannot be restarted. You can also stop the services directly with the following commands:

[root@mq03 keepalived]# systemctl stop haproxy
[root@mq03 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 127.0.0.1/24 brd 127.255.255.255 scope host lo
       valid_lft forever preferred_lft forever
    inet 192.168.92.150/32 brd 192.168.92.150 scope global lo:1
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:cc:31:90 brd ff:ff:ff:ff:ff:ff
    inet 192.168.92.110/24 brd 192.168.92.255 scope global ens33
       valid_lft forever preferred_lft forever
    inet 192.168.92.200/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fecc:3190/64 scope link 
       valid_lft forever preferred_lft forever

You can see that the VIP does not drift after haproxy is stopped manually. This is because the haproxy_check.sh script first checks haproxy's status and restarts it if it is down; keepalived is stopped only if the restart has not succeeded after 3 seconds, and only then does the VIP actually drift. Since the drift is driven by keepalived, to test it here we stop keepalived on mq03 directly:

# stop keepalived manually
[root@mq03 keepalived]# systemctl stop keepalived
[root@mq03 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 127.0.0.1/24 brd 127.255.255.255 scope host lo
       valid_lft forever preferred_lft forever
    inet 192.168.92.150/32 brd 192.168.92.150 scope global lo:1
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:cc:31:90 brd ff:ff:ff:ff:ff:ff
    inet 192.168.92.110/24 brd 192.168.92.255 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fecc:3190/64 scope link 
       valid_lft forever preferred_lft forever

At this point the IP has drifted. Checking the mq02 server's IP confirms the VIP is there, and the management page is still accessible.
Now restart keepalived on the master, and you will find the VIP automatically drifts back:

[root@mq03 keepalived]# systemctl start keepalived
[root@mq03 keepalived]# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet 127.0.0.1/24 brd 127.255.255.255 scope host lo
       valid_lft forever preferred_lft forever
    inet 192.168.92.150/32 brd 192.168.92.150 scope global lo:1
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host 
       valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:0c:29:cc:31:90 brd ff:ff:ff:ff:ff:ff
    inet 192.168.92.110/24 brd 192.168.92.255 scope global ens33
       valid_lft forever preferred_lft forever
    inet 192.168.92.200/32 scope global ens33
       valid_lft forever preferred_lft forever
    inet6 fe80::20c:29ff:fecc:3190/64 scope link 
       valid_lft forever preferred_lft forever


Summarize

The configuration above is active-standby mode; for the dual-active mode, see: Set up keepalived+nginx hot standby high availability (active-standby + dual-active mode).
Throughout the failover the externally served VIP remained available, which means failover succeeded and the cluster is fully set up. Any client service that needs to send or receive messages only has to connect to the VIP. Example:

ConnectionFactory factory = new ConnectionFactory();
factory.setHost("192.168.92.200");  // the VIP
factory.setPort(5673);              // HAProxy AMQP front-end port


Origin blog.csdn.net/qq_38055805/article/details/129620847