Pitfalls when setting up a RabbitMQ high-availability cluster

While setting up a RabbitMQ cluster, running rabbitmqctl join_cluster rabbit@rabbit-node1 failed with the following error:

Clustering node rabbit@slave1 with rabbit@rabbit-node1 Error: unable
to perform an operation on node ‘rabbit@rabbit-node1’. Please see
diagnostics information and suggestions below.
Most common reasons for this are:

  • Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)

  • CLI tool fails to authenticate with the server (e.g. due to CLI tool’s Erlang cookie not matching that of the server)

  • Target node is not running
    In addition to the diagnostics info below:

  • See the CLI, clustering and networking guides on http://rabbitmq.com/documentation.html to learn more

  • Consult server logs on node rabbit@rabbit-node1
    DIAGNOSTICS

    • attempted to contact: [‘rabbit@rabbit-node1’]
      rabbit@rabbit-node1: * connected to epmd (port 4369) on rabbit-node1 * epmd reports node ‘rabbit’ uses port 25672 for
      inter-node and CLI tool traffic * TCP connection succeeded but
      Erlang distribution failed

    • Hostname mismatch: node “rabbit@master” believes its host is different. Please ensure that hostnames resolve the same way locally
      and on “rabbit@master”

      Current node details: * node name: ‘rabbitmqcli-14907-rabbit@slave1’ * effective user’s home
      directory: /var/lib/rabbitmq * Erlang cookie hash:
      N9VmcKjlLemcjmGbsPIdkw==

Locating the problem

The output contains this hint:

Most common reasons for this are:

  • Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
  • CLI tool fails to authenticate with the server (e.g. due to CLI tool’s Erlang cookie not matching that of the server)
  • Target node is not running
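
I went through these causes one by one:

  • 1. Firewall and network connectivity: the firewall was off, and all 3 machines could ping each other by hostname.
  • 2. Cookie file: I installed from the rpm package, so the cookie file is at /var/lib/rabbitmq/.erlang.cookie. The .erlang.cookie files on all 3 machines were identical, and each had permissions 400.
  • 3. rabbitmq-server status on rabbit-node1: the target node was running normally.

None of these was the cause, so I read further down the output: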
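
The three checks above can be scripted so they are easy to repeat on every machine. The following is a minimal sketch, assuming a Linux host with getent, md5sum and GNU stat available; the function names are made up for this example. The fingerprint is only meant for comparing cookie files between machines and is not claimed to match the "Erlang cookie hash" printed in the diagnostics.

```shell
#!/bin/sh
# Sketch of helper functions for the pre-flight checks (names are illustrative).

# Print the IP address that a hostname resolves to on this machine
# (prints nothing if the name does not resolve).
resolve_host() {
    getent hosts "$1" | awk '{print $1; exit}'
}

# Print an MD5 fingerprint of a cookie file, so files on different
# machines can be compared without showing the cookie itself.
cookie_fingerprint() {
    md5sum "$1" | awk '{print $1}'
}

# Print the octal permission bits of a file (the post expects 400).
file_mode() {
    stat -c '%a' "$1"
}

# Example usage on one node (paths and hostnames taken from the post):
#   resolve_host rabbit-node1
#   cookie_fingerprint /var/lib/rabbitmq/.erlang.cookie
#   file_mode /var/lib/rabbitmq/.erlang.cookie
```

Running the last two commands on each machine and comparing the printed values is one way to confirm that the cookies really are identical.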

Hostname mismatch: node “rabbit@master” believes its host is
different. Please ensure that hostnames resolve the same way locally
and on “rabbit@master”

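  • So I edited the rabbitmq-env.conf configuration file (RabbitMQ's default path: /etc/rabbitmq/rabbitmq-env.conf).
  • On every machine in the cluster, run vim /etc/rabbitmq/rabbitmq-env.conf (the file does not exist by default and has to be created by hand) and add the following: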
 [root@master rabbitmq]# vim /etc/rabbitmq/rabbitmq-env.conf
RABBITMQ_NODENAME=rabbit@rabbit-node1
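  • The part after rabbit@ is the hostname configured in /etc/hosts for that machine, so each machine uses its own name (rabbit-node1, rabbit-node2 or rabbit-node3). The hosts entries are, for example: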
192.168.72.127 rabbit-node1
192.168.72.128 rabbit-node2
192.168.72.129 rabbit-node3
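
Since every machine needs its own node name, the config line can be derived from the hosts entries rather than typed by hand. This is a minimal sketch; render_env_conf and hostname_for_ip are hypothetical helper names introduced here, and the redirect in the usage comment overwrites any existing file.

```shell
#!/bin/sh
# Build the rabbitmq-env.conf line for one node.
# $1: hostname as it appears in /etc/hosts, e.g. rabbit-node2
render_env_conf() {
    printf 'RABBITMQ_NODENAME=rabbit@%s\n' "$1"
}

# Look up the hostname mapped to an IP address in a hosts-format file.
# $1: IP address, $2: hosts file (defaults to /etc/hosts)
hostname_for_ip() {
    awk -v ip="$1" '$1 == ip {print $2; exit}' "${2:-/etc/hosts}"
}

# Example usage on the machine with IP 192.168.72.128:
#   render_env_conf "$(hostname_for_ip 192.168.72.128)" > /etc/rabbitmq/rabbitmq-env.conf
```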

Once every machine is configured, run ps -aux|grep mq to list all RabbitMQ processes, then kill them all with kill -9.

[root@slave2 home]# ps -aux|grep mq
Warning: bad syntax, perhaps a bogus '-'? See /usr/share/doc/procps-3.2.8/FAQ
rabbitmq   8106  0.0  0.2  10968   560 ?        S    02:33   0:00 /usr/lib64/erlang/erts-10.2.1/bin/epmd -daemon
root      24738  0.0  0.0 108700   136 pts/1    S    05:21   0:00 /bin/sh /etc/init.d/rabbitmq-server start
root      24810  0.0  0.2 108208   472 pts/1    S    05:21   0:00 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/sbin/rabbitmq-server
root      24812  0.0  0.2 130732   524 pts/1    S    05:21   0:00 /sbin/runuser -s /bin/sh -- rabbitmq /usr/lib/rabbitmq/bin/rabbitmq-server
rabbitmq  24842  0.0  0.2 106108   484 pts/1    S    05:21   0:00 sh /usr/lib/rabbitmq/bin/rabbitmq-server
rabbitmq  25010  0.4 18.4 1813924 41988 pts/1   Sl   05:21   0:19 /usr/lib64/erlang/erts-10.2.1/bin/beam.smp -W w -A 64 -MBas ageffcbf -MHas ageffcbf -MBlmbcs 512 -MHlmbcs 512 -MMmcs 30 -P 1048576 -t 5000000 -stbt db -zdbbl 128000 -K true -B i -- -root /usr/lib64/erlang -progname erl -- -home /var/lib/rabbitmq -- -pa /usr/lib/rabbitmq/lib/rabbitmq_server-3.7.9/ebin -noshell -noinput -s rabbit boot -sname rabbit@slave2 -boot start_sasl -kernel inet_default_connect_options [{nodelay,true}] -sasl errlog_type error -sasl sasl_error_logger false -rabbit lager_log_root "/var/log/rabbitmq" -rabbit lager_default_file "/var/log/rabbitmq/rabbit@slave2.log" -rabbit lager_upgrade_file "/var/log/rabbitmq/rabbit@slave2_upgrade.log" -rabbit enabled_plugins_file "/etc/rabbitmq/enabled_plugins" -rabbit plugins_dir "/usr/lib/rabbitmq/plugins:/usr/lib/rabbitmq/lib/rabbitmq_server-3.7.9/plugins" -rabbit plugins_expand_dir "/var/lib/rabbitmq/mnesia/rabbit@slave2-plugins-expand" -os_mon start_cpu_sup false -os_mon start_disksup false -os_mon start_memsup false -mnesia dir "/var/lib/rabbitmq/mnesia/rabbit@slave2" -kernel inet_dist_listen_min 25672 -kernel inet_dist_listen_max 25672
rabbitmq  25108  0.0  0.1   4064   388 ?        Ss   05:21   0:00 erl_child_setup 1024
rabbitmq  25135  0.0  0.1  10800   448 ?        Ss   05:21   0:00 inet_gethost 4
rabbitmq  25136  0.0  0.3  17128   696 ?        S    05:21   0:00 inet_gethost 4
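
In the listing above, the beam.smp process was started with -sname rabbit@slave2, meaning the broker is still running under its old node name. A small filter makes that easier to spot than reading the whole command line. This is a sketch only; extract_sname is a hypothetical helper name.

```shell
#!/bin/sh
# Read process-listing text on stdin and print the first node name
# that was passed to the Erlang VM via the -sname flag.
extract_sname() {
    grep -o -- '-sname [^ ]*' | awk '{print $2}' | head -n 1
}

# Example usage:
#   ps aux | extract_sname
```

If the printed name is not the one set in rabbitmq-env.conf, the running process predates the config change and has to be restarted.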

Restart the RabbitMQ service and join the cluster

[root@slave1 rabbitmq]# service rabbitmq-server start
Starting rabbitmq-server: SUCCESS
rabbitmq-server.
[root@slave1 rabbitmq]# rabbitmqctl stop_app
Stopping rabbit application on node rabbit@rabbit-node2 ...
[root@slave1 rabbitmq]# rabbitmqctl join_cluster rabbit@rabbit-node1
Clustering node rabbit@rabbit-node2 with rabbit@rabbit-node1
[root@slave1 rabbitmq]# rabbitmqctl start_app
Starting node rabbit@rabbit-node2 ...
 completed with 3 plugins.
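
The stop_app, join_cluster and start_app sequence above can be wrapped so that a failed step stops the rest from running. This is a minimal sketch: the RABBITMQCTL variable and the join_cluster_via function are conventions invented for this example, not RabbitMQ settings. Setting RABBITMQCTL=echo gives a dry run that only prints the subcommands.

```shell
#!/bin/sh
# Command used to talk to the local node; override it (e.g. with "echo")
# for a dry run. This variable is specific to this sketch.
RABBITMQCTL="${RABBITMQCTL:-rabbitmqctl}"

# Join the local node to the cluster that the seed node belongs to.
# $1: seed node name, e.g. rabbit@rabbit-node1
join_cluster_via() {
    seed="$1"
    # Stop the rabbit application (the Erlang VM keeps running).
    $RABBITMQCTL stop_app || return 1
    # Join the seed node's cluster; abort if this fails.
    $RABBITMQCTL join_cluster "$seed" || return 1
    # Start the rabbit application again.
    $RABBITMQCTL start_app
}

# Example usage on rabbit-node2 and rabbit-node3:
#   join_cluster_via rabbit@rabbit-node1
```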

Check the cluster status


[root@slave1 ~]#  rabbitmqctl cluster_status
Cluster status of node rabbit@rabbit-node2 ...
[{nodes,[{disc,['rabbit@rabbit-node1','rabbit@rabbit-node2',
                'rabbit@rabbit-node3']}]},
 {running_nodes,['rabbit@rabbit-node3','rabbit@rabbit-node1',
                 'rabbit@rabbit-node2']},
 {cluster_name,<<"rabbit@slave2">>},
 {partitions,[]},
 {alarms,[{'rabbit@rabbit-node3',[]},
          {'rabbit@rabbit-node1',[]},
          {'rabbit@rabbit-node2',[]}]}]
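
To check the result from a script instead of by eye, the running_nodes entry can be pulled out of this output. The sketch below assumes the Erlang-term output format shown above (as printed by the 3.7.9 install in this post); other versions may print cluster_status differently, and running_nodes here is a hypothetical shell function, not a rabbitmqctl subcommand.

```shell
#!/bin/sh
# Read `rabbitmqctl cluster_status` output (Erlang-term format) on stdin
# and print the names listed under running_nodes, one per line, sorted.
running_nodes() {
    tr -d '\n ' \
        | sed -n 's/.*{running_nodes,\[\([^]]*\)\]}.*/\1/p' \
        | tr ',' '\n' \
        | tr -d "'" \
        | sort
}

# Example usage:
#   rabbitmqctl cluster_status | running_nodes
```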

As the output shows, rabbit-node2 and rabbit-node3 have joined the cluster successfully, and the problem is solved.

Reposted from blog.csdn.net/Epoch_Elysian/article/details/94075456