Building a high-availability RabbitMQ mirrored-queue cluster with HAProxy + Keepalived (VIP)

Aims

  • Build a three-node RabbitMQ cluster that meets high-availability requirements.
  • The setup must provide high availability, keep-alive failover, mirrored queues, and load balancing.

Main components

  • RabbitMQ stores and forwards messages and provides the mirrored queues.
  • HAProxy load-balances traffic across the RabbitMQ nodes.
  • Keepalived keeps HAProxy alive (it could keep RabbitMQ alive as well) and provides the VIP for access.

Architecture model

[Figure: architecture model]

Access link

[Figure: access path]
Image from: RabbitMQ load balancing (3) – Keepalived + HAProxy for high-availability load balancing

  • The client establishes its connection through the VIP; the connection is routed to the corresponding HAProxy via the Keepalived master node, and HAProxy distributes the load across the cluster nodes using its load-balancing algorithm.
  • Under normal circumstances, client connections are distributed through the left-hand path in the figure.
  • When the Keepalived master node goes down, or its HAProxy goes down and cannot be recovered, the backup is promoted to master and client connections are distributed through the right-hand path in the figure.

Reasons for selection

  • Why build a multi-node message queue?
    • A single RabbitMQ node is a downtime risk, so we build three nodes that back each other up.
  • How is message mirroring done?
    • Mirroring is really a policy: first join the three RabbitMQ nodes into a cluster, then enable a mirroring policy so that when a message reaches a queue it is automatically copied to the mirrored queues on the other nodes.
  • If one machine comes under too much pressure, how is the load balanced?
    • A RabbitMQ cluster has no built-in load balancing, so it must be paired with other software such as HAProxy. By the same logic, to avoid a single point of failure, the cluster needs more than one HAProxy.
  • If one HAProxy goes down, how does the other one notice and take over the traffic?
    • Keepalived runs a keep-alive script for HAProxy: roughly, when the script finds that HAProxy is not running it starts it; if after three seconds it still is not up, the script stops Keepalived, and stopping Keepalived triggers the VIP to drift to the other machine (a sketch of such a script follows this list).
    • Failover between the HAProxy instances must be automatic, and the usual solution is Keepalived. There are likewise two Keepalived instances, one master and one backup, exposing a VIP; if the master goes down, the backup takes over the VIP (see Easter egg 2 at the end).
  • With two HAProxy instances doing the load balancing, should each one balance all three RabbitMQ nodes, or two each?
    • Each balances all three nodes. That preserves high availability even in extreme failure scenarios.
  • If one RabbitMQ node goes down, will HAProxy keep sending messages to it and lose them?
    • No: HAProxy's .cfg file defines a health-check mechanism. The fall parameter sets how many consecutive failed health checks mark a server as down.
    • In the configuration file: server node3 hadoop003:5672 check inter 5000 rise 2 fall 3 weight 1 — the rise 2 and fall 3 here decide whether the node is considered alive or dead.
    • So HAProxy health-checks RabbitMQ, and Keepalived's ha_check.sh script checks HAProxy.
  • If one Keepalived instance hangs, how does the other one carry on?
    • I have not studied this in depth, but the two Keepalived instances should keep each other alive via heartbeats: when the master goes down, the backup immediately takes over the VIP.
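A minimal sketch of the keep-alive script described above, assuming the HAProxy paths from the build in Step 5 and the script location used in Step 8 (adapt to your own paths):

#!/bin/bash
# /etc/keepalived/haproxy_check.sh — restart HAProxy if it is gone; if it
# still is not up after 3 seconds, stop Keepalived so the VIP drifts away.
HAPROXY_BIN=/usr/app/haproxy-2.0.3/sbin/haproxy
HAPROXY_CFG=/usr/app/haproxy-2.0.3/conf/haproxy.cfg

if [ "$(ps -C haproxy --no-header | wc -l)" -eq 0 ]; then
    $HAPROXY_BIN -f $HAPROXY_CFG
    sleep 3
    if [ "$(ps -C haproxy --no-header | wc -l)" -eq 0 ]; then
        service keepalived stop   # stopping Keepalived hands the VIP to the backup
    fi
fi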

Simple construction process

  • Step 1: Install RabbitMQ on each virtual machine
  • Step 2: Verify the successful installation of RabbitMQ
  • Step 3: Build RabbitMQ mirror queue
  • Step 4: Set RabbitMQ mirroring rules and verify
  • Step 5: Build HAProxy
  • Step 6: Operations related to HAProxy permissions
  • Step 7: Build Keepalived
  • Step 8: Operations related to keepalived permissions
  • Step 9: Test the impact on message delivery under different downtime situations

Detailed construction process

  • Step 1: Install RabbitMQ on each virtual machine
# Install RabbitMQ's prerequisites, Erlang and socat
rpm -ivh erlang-22.0.7-1.el7.x86_64.rpm
yum -y install socat
rpm -ivh rabbitmq-server-3.7.17-1.el7.noarch.rpm

# Start the service and enable it at boot
service rabbitmq-server start
chkconfig rabbitmq-server on

# Add a user and grant permissions; the / in the third command means permissions on the vhost "/"
rabbitmqctl add_user username password
rabbitmqctl set_user_tags username administrator
rabbitmqctl set_permissions -p / username ".*" ".*" ".*"
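A quick sanity check that the user and permissions took effect (username is the placeholder from the commands above):

rabbitmqctl list_users               # should list: username [administrator]
rabbitmqctl list_permissions -p /    # should list: username .* .* .*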
  • Step 2: Verify that RabbitMQ installed successfully
# Enable RabbitMQ's web management plugin
rabbitmq-plugins enable rabbitmq_management
# Browse to IP:15672 and log in with the username and password to verify the installation
# Click the Admin tab in the browser to check that the permissions are correct
  • Step 3: Build the RabbitMQ mirrored-queue cluster
# Keep .erlang.cookie identical on all three machines: find it with
# find / -iname .erlang.cookie (it is usually under /var/lib/rabbitmq/),
# then scp it to the other two machines
scp xx/.erlang.cookie root@IP:/root/

# Put the same entries in /etc/hosts on every node, then reboot the machines
IP1  hostname1
IP2  hostname2
IP3  hostname3

# Start in the background
rabbitmq-server -detached  # Note: this starts both the Erlang VM and the RabbitMQ application. The rabbitmqctl start_app used below only starts the RabbitMQ application, and rabbitmqctl stop_app only stops it.

# Cluster mode: run the four commands below on the nodes joining hadoop001's cluster;
# only add --ram to the third command if the node should run in RAM mode
rabbitmqctl stop_app                              # 1. stop the application
rabbitmqctl reset                                 # 2. reset the node state
# 3. join the cluster; rabbit@hadoop001 is the name of the cluster being joined, so pick a resounding one
rabbitmqctl join_cluster --ram rabbit@hadoop001
rabbitmqctl start_app                             # 4. start the application

# Check the cluster status
rabbitmqctl cluster_status
  • Step 4: Set the RabbitMQ mirroring policy and verify it
# Set the policy
rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'
# Verify: log in to any one RabbitMQ node, create a queue, then check
# the other two nodes to see whether the queue exists there as well
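The same verification can also be scripted against the management HTTP API (a sketch; the hostnames hadoop001/hadoop002 and the credentials are this post's placeholders):

# Create a durable queue on node 1 via the management API
curl -u username:password -H "content-type:application/json" \
     -XPUT -d '{"durable":true}' http://hadoop001:15672/api/queues/%2F/mirror-test
# With the ha-all policy in place, the same queue should be visible from node 2
curl -u username:password http://hadoop002:15672/api/queues/%2F/mirror-test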
  • Step 5: Build HAProxy
# Extract the archive in a directory of your choice
tar -zxvf haproxy-2.0.3.tar.gz
# Compile. Older HAProxy 1.x releases chose TARGET from the kernel version shown by
# uname -a (e.g. TARGET=linux26), but HAProxy 2.0 replaced those targets with
# linux-glibc. Set PREFIX to your own install path.
make TARGET=linux-glibc PREFIX=/usr/app/haproxy-2.0.3
make install PREFIX=/usr/app/haproxy-2.0.3
# haproxy.cfg is HAProxy's configuration file; create it yourself and start from a config
# found online. Options that cause startup errors you cannot resolve can simply be removed.
touch haproxy.cfg # this file is used to start HAProxy later; best placed under haproxy/conf, where haproxy is the PREFIX path chosen above
# Start: run the configuration file with haproxy/sbin/haproxy
/usr/app/haproxy-2.0.3/sbin/haproxy -f /usr/app/haproxy-2.0.3/conf/haproxy.cfg   # for startup errors, see the HAProxy pitfalls under Step 6 below
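For reference, a minimal haproxy.cfg sketch for the AMQP listener; the server lines follow the ones quoted in this post, while the timeouts and section layout are assumptions to adapt:

global
    nbproc 1                 # single process; see the stats pitfall in Step 6

defaults
    mode tcp
    timeout connect 5s
    timeout client  1m
    timeout server  1m

listen rabbitmq_cluster
    bind 0.0.0.0:5672        # the AMQP entry point that clients connect to
    mode tcp                 # AMQP is plain TCP, not HTTP
    balance roundrobin
    server node1 hadoop001:5672 check inter 5000 rise 2 fall 3 weight 1
    server node2 hadoop002:5672 check inter 5000 rise 2 fall 3 weight 1
    server node3 hadoop003:5672 check inter 5000 rise 2 fall 3 weight 1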


  • Step 6: Operations and verification of HAProxy permissions
# Enable HAProxy at boot. A source build has no init script: it can only be started
# and stopped by command, prints nothing, and can only be verified with netstat.
# Place an HAProxy init script in /etc/init.d/ (see the "HAProxy service startup script" below)
cp haproxy /etc/init.d/
chkconfig --add haproxy
chkconfig --list haproxy
service haproxy start|restart|stop|status
# Verify
# Browse to IP:8100/stats to see the HAProxy dashboard; stats is the name you gave the listen section in haproxy.cfg

# Startup error
# Fix: the error message includes a line number; go to that line in the configuration
# file and either delete or correct it. For details see the link: HAProxy实战搭建

# Blank page at ip:8100/stats
# Fix: change the bind address to 0.0.0.0 (e.g. bind 0.0.0.0:8100)

# Error: Proxy 'monitor' (the name you gave the listen section): in multi-process mode, stats will be limited to process assigned to the current request
# Fix: set nbproc to 1
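Putting those fixes together, a working stats section might look like this (the listen name monitor and port 8100 follow the text above; the rest is an assumption):

listen monitor
    bind 0.0.0.0:8100        # 0.0.0.0, not a single interface, or the page stays blank
    mode http
    stats enable
    stats uri /stats         # then browse to IP:8100/stats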

  • Step 7: Build Keepalived
# Extract
tar -zxvf keepalived-2.0.18.tar.gz
cd keepalived-2.0.18

# Install Keepalived's dependencies
yum -y install libnl libnl-devel
# --prefix sets the install path (created if missing); /usr/app/ is the path used here, /usr/local also works
./configure --prefix=/usr/app/keepalived-2.0.18

# Compile
make && make install   # this step may fail and require installing gcc or openssl-devel, possibly even switching yum repos: yum repolist all; yum clean all; yum makecache; yum install -y gcc-c++ tcl; yum install -y openssl-devel

# Set up the environment. Keepalived reads its configuration from /etc/keepalived/keepalived.conf by default, so copy the installed configuration file there
mkdir /etc/keepalived
# Replace /usr/app with your own install path; for the keepalived.conf contents, see the link below
cp /usr/app/keepalived-2.0.18/etc/keepalived/keepalived.conf /etc/keepalived/

# Copy all Keepalived scripts into place:
# the init script from the build directory; /usr/software/ is where the tarball was extracted
cp /usr/software/keepalived-2.0.18/keepalived/etc/init.d/keepalived /etc/init.d/
# the files from the install directory
cp /usr/app/keepalived-2.0.18/etc/sysconfig/keepalived /etc/sysconfig/
cp /usr/app/keepalived-2.0.18/sbin/keepalived /usr/sbin/
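For orientation, a minimal keepalived.conf sketch for the master node; the script path matches Step 8, while the interface name, VIP, virtual_router_id, and priorities are assumptions to replace with your own:

vrrp_script chk_haproxy {
    script "/etc/keepalived/haproxy_check.sh"   # the liveness script from Step 8
    interval 5                                  # run the check every 5 seconds
}

vrrp_instance VI_1 {
    state MASTER              # BACKUP on the standby node (but see Easter egg 3)
    interface ens33           # assumption: use your actual NIC name
    virtual_router_id 51      # must be identical on both nodes
    priority 100              # use a lower value (e.g. 90) on the standby
    advert_int 1
    virtual_ipaddress {
        192.168.0.200         # the VIP; pick a free address in your subnet
    }
    track_script {
        chk_haproxy
    }
}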


  • Step 8: Operations related to keepalived permissions
# Register with the system:
chmod +x /etc/init.d/keepalived
chkconfig --add keepalived   # or: chkconfig keepalived on
# Enable/disable start at boot
systemctl enable/disable keepalived.service
# Start/stop
systemctl start/stop keepalived.service
# Liveness-check script for HAProxy; remember this path, it must be referenced in keepalived.conf (a sketch of the script appears after the Q&A list above)
chmod +x /etc/keepalived/haproxy_check.sh
# Check Keepalived's status
service keepalived status   # or: systemctl status keepalived.service
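Once both nodes are running, a quick way to see which one currently holds the VIP (the interface and VIP are the assumptions from the keepalived.conf sketch above):

ip addr show ens33 | grep 192.168.0.200     # the VIP is bound on the current master only
tail -f /var/log/messages | grep -i vrrp    # watch Keepalived's failover transitions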

Test program

  • Test methods:

    • Method 1: In each of the situations below, visit VIP:15672 and see whether the RabbitMQ management page can still be reached.
    • Method 2: In each of the situations below, send messages to RabbitMQ and see whether any delivery errors occur (a shell sketch follows the scenario list).
  • Test scenarios:

    • Scenario 1: Stop the RabbitMQ service on one machine
    • Scenario 2: Stop the RabbitMQ service on two machines
    • Scenario 3: Stop the HAProxy service on one machine
    • Scenario 4: Stop the Keepalived service on one machine
    • Scenario 5: Stop the HAProxy and Keepalived services on the same machine at the same time
    • Scenario 6: Stop HAProxy on one machine and Keepalived on another machine at the same time
    • Scenario 7: Shut down one machine that runs all three services (RabbitMQ + HAProxy + Keepalived)
    • Scenario 8: Shut down two machines (one running RabbitMQ + HAProxy + Keepalived, the other running only RabbitMQ)
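For Method 2, one way to push a test message through the VIP from the shell is the management API's publish endpoint (a sketch; it assumes the 15671 management forwarding rule from Easter egg 2 below and this post's placeholder credentials):

curl -u username:password -H "content-type:application/json" \
     -XPOST -d '{"properties":{},"routing_key":"mirror-test","payload":"hello","payload_encoding":"string"}' \
     http://VIP:15671/api/exchanges/%2F/amq.default/publish
# A response of {"routed":true} means the message reached a queue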
  • Test Results:

  • Easter egg 1:

  • Easter egg 2:

    • While running test 1, I used browser logins as the check and found that when rabbitmq1 went down, the browser could no longer connect. I assumed HAProxy was not working, but in fact my haproxy.cfg only forwarded port 5672, not the management port 15672. The options were either to change the plan from browser testing to client-code testing, or to add a 15672 forwarding rule to haproxy.cfg by appending the following listen section to the .cfg file:
# Forwarding configuration
listen rabbitmq_browser
    # Note the 15671 here; any free port works, then browse to vip:15671
    bind :15671
    # Note the http mode here
    mode http
    # Load-balance with weighted round robin
    balance roundrobin
    # RabbitMQ cluster node configuration
    server node1 hadoop001:15672 check inter 5000 rise 2 fall 3 weight 1
    server node2 hadoop002:15672 check inter 5000 rise 2 fall 3 weight 1
    server node3 hadoop003:15672 check inter 5000 rise 2 fall 3 weight 1
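After appending this block, reload HAProxy so the new listener takes effect; a hedged example, where -sf hands connections over from the old process:

/usr/app/haproxy-2.0.3/sbin/haproxy -f /usr/app/haproxy-2.0.3/conf/haproxy.cfg -sf $(pidof haproxy)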
  • Easter egg 3:
    • During testing I stopped HAProxy with systemctl stop haproxy.service and service haproxy stop, but service haproxy status showed it still running. I thought the command had no effect, or that I had installed a fake HAProxy.
      • The actual reason: the ha_check.sh script configured in Keepalived keeps watching HAProxy and restarts it whenever it is missing. Only if HAProxy is still down after the three-second sleep does the script stop the service, at which point Keepalived's failover kicks in: the backup node notices the master is gone, takes over the VIP, and its HAProxy takes on the leading role.
    • There is a state parameter in the Keepalived configuration file. Most articles recommend setting the master to MASTER and the slave to BACKUP, but here I recommend setting both to BACKUP: whichever node starts first then holds the VIP, and when the original master comes back up the VIP does not drift back to it, reducing network jitter.
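In keepalived.conf terms, that recommendation looks like this on both nodes (nopreempt only takes effect with state BACKUP; the remaining settings stay as in the Step 7 sketch):

vrrp_instance VI_1 {
    state BACKUP          # BACKUP on both nodes instead of MASTER/BACKUP
    nopreempt             # a recovered node will not grab the VIP back
    priority 100          # keep a higher priority on the preferred node
    # ... remaining settings as in the Step 7 sketch
}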

Origin: blog.csdn.net/ljfirst/article/details/106012709