[RocketMQ] RocketMQ Automatic Failover in Practice: DLedger Clusters


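This article demonstrates how to build a multi-master, multi-slave RocketMQ cluster with automatic failover, and how to upgrade an existing plain Master/Slave cluster to one with automatic failover.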

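DLedger Configuration Overview

Deploying a RocketMQ cluster with automatic failover differs little from deploying a plain Master/Slave cluster: you only need to enable the DLedger-related settings in broker.conf. The official documentation calls this a RocketMQ-on-DLedger Group with automatic failover. DLedger is an implementation of the Raft distributed consensus protocol, packaged as a lightweight Java library; RocketMQ uses it for consistency operations such as Leader election.

  • A RocketMQ-on-DLedger Group is a set of Brokers with the same name. It needs at least 3 nodes; Raft automatically elects one Leader, the remaining nodes act as Followers, and data is replicated between the Leader and the Followers to ensure high availability.
  • A RocketMQ-on-DLedger Group fails over automatically and keeps data consistent.
  • RocketMQ-on-DLedger Groups scale horizontally: any number of RocketMQ-on-DLedger Groups can be deployed to serve traffic at the same time.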

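Related Settings

Name | Meaning | Example
enableDLegerCommitLog | Whether to enable DLedger, i.e. whether to enable RocketMQ master-slave failover; defaults to false | true
dLegerGroup | Name of the DLedger Raft Group; recommended to match brokerName | RaftNode00
dLegerPeers | Address and port of every node in the DLedger Group; must be identical on all nodes of the same Group | n0-127.0.0.1:40911;n1-127.0.0.1:40912;n2-127.0.0.1:40913
dLegerSelfId | Node id; must be one of the ids in dLegerPeers and unique within the Group | n0
sendMessageThreadPoolNums | Number of send threads; recommended to equal the number of CPU cores | 16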
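
To make the constraints in the table concrete, here is a minimal Python sketch that parses a dLegerPeers value and checks that a dLegerSelfId belongs to it. The helper names parse_peers and self_id_is_valid are hypothetical and not part of RocketMQ; the entry format id-host:port is taken from the example column above.

```python
def parse_peers(peers: str) -> dict:
    """Parse 'n0-host:port;n1-host:port' into {node_id: (host, port)}."""
    result = {}
    for entry in peers.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        # The node id is everything before the first '-'.
        node_id, _, addr = entry.partition("-")
        # The port is everything after the last ':'.
        host, _, port = addr.rpartition(":")
        if not node_id or not host or not port.isdigit():
            raise ValueError(f"malformed peer entry: {entry!r}")
        if node_id in result:
            raise ValueError(f"duplicate node id: {node_id}")
        result[node_id] = (host, int(port))
    return result


def self_id_is_valid(self_id: str, peers: str) -> bool:
    """dLegerSelfId must be one of the ids listed in dLegerPeers."""
    return self_id in parse_peers(peers)


if __name__ == "__main__":
    example = "n0-127.0.0.1:40911;n1-127.0.0.1:40912;n2-127.0.0.1:40913"
    print(parse_peers(example))
    print(self_id_is_valid("n0", example))
```

Running a check like this against each node's file before startup is one way to catch a mistyped id or a peers list that differs between nodes.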

Creating a New DLedger Cluster

Machine Plan

Broker

Node | ID | Host | Port | dLegerPeer
broker-a | 0 | 172.24.29.213 | 20911 | n0-172.24.29.213:10911
broker-a-s1 | 1 | 172.24.29.214 | 20911 | n1-172.24.29.214:10911
broker-a-s2 | 2 | 172.24.29.215 | 20911 | n2-172.24.29.215:10911
broker-b | 0 | 172.24.29.214 | 20915 | n0-172.24.29.214:10915
broker-b-s1 | 1 | 172.24.29.213 | 20915 | n1-172.24.29.213:10915
broker-b-s2 | 2 | 172.24.29.215 | 20915 | n2-172.24.29.215:10915
broker-c | 0 | 172.24.29.215 | 20919 | n0-172.24.29.215:10919
broker-c-s1 | 1 | 172.24.29.213 | 20919 | n1-172.24.29.213:10919
broker-c-s2 | 2 | 172.24.29.214 | 20919 | n2-172.24.29.214:10919

nameserver

Node | Host | Port
nameserver | 172.24.29.215 | 9976

console

Node | Host | Port
console | 172.24.29.215 | 8585

Broker Configuration

broker-a

listenPort = 20911
rocketmqHome = /neworiental/rocketmq-test/rocketmq-a/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-a
# Address the broker exposes to clients
brokerIP1 = 172.24.29.213
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.213
brokerId = 0
storePathRootDir = /neworiental/rocketmq-test/rocketmq-a/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-a/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-a/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = ASYNC_MASTER
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
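# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election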
enableDLegerCommitLog = true
dLegerGroup = broker-a
dLegerSelfId = n0
dLegerPeers = n0-172.24.29.213:10911;n1-172.24.29.214:10911;n2-172.24.29.215:10911

broker-a-s1

listenPort = 20911
rocketmqHome = /neworiental/rocketmq-test/rocketmq-a-s1/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-a
# Address the broker exposes to clients
brokerIP1 = 172.24.29.214
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.214
brokerId = 1
storePathRootDir = /neworiental/rocketmq-test/rocketmq-a-s1/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-a-s1/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-a-s1/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
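# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election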
enableDLegerCommitLog = true
dLegerGroup = broker-a
dLegerSelfId = n1
dLegerPeers = n0-172.24.29.213:10911;n1-172.24.29.214:10911;n2-172.24.29.215:10911

broker-a-s2

listenPort = 20911
rocketmqHome = /neworiental/rocketmq-test/rocketmq-a-s2/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-a
# Address the broker exposes to clients
brokerIP1 = 172.24.29.215
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.215
brokerId = 2
storePathRootDir = /neworiental/rocketmq-test/rocketmq-a-s2/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-a-s2/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-a-s2/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
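# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election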
enableDLegerCommitLog = true
dLegerGroup = broker-a
dLegerSelfId = n2
dLegerPeers = n0-172.24.29.213:10911;n1-172.24.29.214:10911;n2-172.24.29.215:10911

broker-b

listenPort = 20915
rocketmqHome = /neworiental/rocketmq-test/rocketmq-b/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-b
# Address the broker exposes to clients
brokerIP1 = 172.24.29.214
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.214
brokerId = 0
storePathRootDir = /neworiental/rocketmq-test/rocketmq-b/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-b/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-b/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = ASYNC_MASTER
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
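# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election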
enableDLegerCommitLog = true
dLegerGroup = broker-b
dLegerSelfId = n0
dLegerPeers = n0-172.24.29.214:10915;n1-172.24.29.213:10915;n2-172.24.29.215:10915

broker-b-s1

listenPort = 20915
rocketmqHome = /neworiental/rocketmq-test/rocketmq-b-s1/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-b
# Address the broker exposes to clients
brokerIP1 = 172.24.29.213
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.213
brokerId = 1
storePathRootDir = /neworiental/rocketmq-test/rocketmq-b-s1/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-b-s1/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-b-s1/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
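# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election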
enableDLegerCommitLog = true
dLegerGroup = broker-b
dLegerSelfId = n1
dLegerPeers = n0-172.24.29.214:10915;n1-172.24.29.213:10915;n2-172.24.29.215:10915

broker-b-s2

listenPort = 20915
rocketmqHome = /neworiental/rocketmq-test/rocketmq-b-s2/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-b
# Address the broker exposes to clients
brokerIP1 = 172.24.29.215
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.215
brokerId = 2
storePathRootDir = /neworiental/rocketmq-test/rocketmq-b-s2/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-b-s2/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-b-s2/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
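# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election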
enableDLegerCommitLog = true
dLegerGroup = broker-b
dLegerSelfId = n2
dLegerPeers = n0-172.24.29.214:10915;n1-172.24.29.213:10915;n2-172.24.29.215:10915

broker-c

listenPort = 20919
rocketmqHome = /neworiental/rocketmq-test/rocketmq-c/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-c
# Address the broker exposes to clients
brokerIP1 = 172.24.29.215
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.215
brokerId = 0
storePathRootDir = /neworiental/rocketmq-test/rocketmq-c/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-c/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-c/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = ASYNC_MASTER
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
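# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election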
enableDLegerCommitLog = true
dLegerGroup = broker-c
dLegerSelfId = n0
dLegerPeers = n0-172.24.29.215:10919;n1-172.24.29.213:10919;n2-172.24.29.214:10919

broker-c-s1

listenPort = 20919
rocketmqHome = /neworiental/rocketmq-test/rocketmq-c-s1/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-c
# Address the broker exposes to clients
brokerIP1 = 172.24.29.213
# Broker HA IP address, used by slaves to sync messages
brokerIP2= 172.24.29.213
brokerId = 1
storePathRootDir = /neworiental/rocketmq-test/rocketmq-c-s1/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-c-s1/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-c-s1/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
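# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election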
enableDLegerCommitLog = true
dLegerGroup = broker-c
dLegerSelfId = n1
dLegerPeers = n0-172.24.29.215:10919;n1-172.24.29.213:10919;n2-172.24.29.214:10919

broker-c-s2

listenPort = 20919
rocketmqHome = /neworiental/rocketmq-test/rocketmq-c-s2/rocketmq471
brokerClusterName = rocketmq-test
namesrvAddr= 172.24.29.215:9976
brokerName = broker-c
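# Address the broker exposes to clients
brokerIP1 = 172.24.29.214
# Broker HA IP address, used by slaves to sync messages; if it is not set, master-slave replication does not take place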
brokerIP2= 172.24.29.214
brokerId = 2
storePathRootDir = /neworiental/rocketmq-test/rocketmq-c-s2/store
storePathCommitLog = /neworiental/rocketmq-test/rocketmq-c-s2/store/commitlog
storePathConsumerQueue = /neworiental/rocketmq-test/rocketmq-c-s2/store/consumequeue
deleteWhen = 04
fileReservedTime = 168
brokerRole = SLAVE
flushDiskType = ASYNC_FLUSH
autoCreateTopicEnable = false
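# Whether to auto-create consumer (subscription) groups
autoCreateSubscriptionGroup = false
# Whether the cluster name can be used as a topic
clusterTopicEnable = false
# Whether the broker name can be used as a topic
brokerTopicEnable = false
useEpollNativeSelector = true

# Whether to enable ACL
aclEnable=true

# DLedger settings for multi-replica automatic master-slave election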
enableDLegerCommitLog = true
dLegerGroup = broker-c
dLegerSelfId = n2
dLegerPeers = n0-172.24.29.215:10919;n1-172.24.29.213:10919;n2-172.24.29.214:10919
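
Within one group, the files above share the same DLedger block except for dLegerSelfId. As a rough sketch of how that part could be generated rather than copied by hand, the Python below builds the four DLedger lines for every node of a group. The function name dledger_settings is hypothetical; the hosts and port in the usage example are the broker-a values from the machine plan.

```python
def dledger_settings(group: str, hosts: list, port: int) -> dict:
    """Return {node_id: DLedger config text} for every node of one group."""
    # Every node of the group must carry the same dLegerPeers value.
    peers = ";".join(f"n{i}-{host}:{port}" for i, host in enumerate(hosts))
    settings = {}
    for i in range(len(hosts)):
        node_id = f"n{i}"
        settings[node_id] = "\n".join([
            "enableDLegerCommitLog = true",
            f"dLegerGroup = {group}",
            f"dLegerSelfId = {node_id}",
            f"dLegerPeers = {peers}",
        ])
    return settings


if __name__ == "__main__":
    hosts = ["172.24.29.213", "172.24.29.214", "172.24.29.215"]
    for node_id, text in dledger_settings("broker-a", hosts, 10911).items():
        print(f"# {node_id}\n{text}\n")
```

Generating the block from a single host list keeps dLegerPeers identical across the group, which is the property the configuration requires.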

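Verification

The admin command (for example, sh bin/mqadmin clusterList -n 172.24.29.215:9976) shows the cluster information, as in the figure below. Each group has one Leader and two Followers, so the cluster was built successfully, and the logs look normal as well.

Leader log:

Follower log:

Next, manually kill the Leader of broker-b.

Before the operation:

The Leader is on 215 (172.24.29.215).

After the operation:

214 has been successfully elected as the new Leader.

Bring 215 back up:

215 rejoins the group as a Follower.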

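Upgrading an Existing Cluster

If the old cluster is deployed in Master-only mode, each Master must be converted into a RocketMQ-on-DLedger Group.
If the old cluster is deployed in Master-Slave mode, each Master-Slave group must be converted into a RocketMQ-on-DLedger Group.

Stop the Old Brokers

This can be done with the kill command, or by running bin/mqshutdown broker.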

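Check the Old Commitlog

Every node in a RocketMQ-on-DLedger Group is compatible with the old Commitlog, but Raft replication only covers newly added messages. To avoid anomalies, the old Commitlog must therefore be identical on all nodes.
If the old cluster was deployed in Master-Slave mode, the data may not be consistent at shutdown. It is recommended to use md5sum to check at least the 2 most recent Commitlog files and, if they differ, to align them by copying.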
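
The md5sum comparison described above could be scripted along these lines. This is only a sketch under stated assumptions: both commitlog directories are reachable from one machine (for example after copying or mounting the Slave's directory), and the files use the standard zero-padded offset names, so sorting by name yields chronological order. The function names are hypothetical.

```python
import hashlib
from pathlib import Path


def md5_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the MD5 hex digest of a file without loading it all at once."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def latest_files(commitlog_dir: Path, count: int = 2) -> list:
    """Return the last `count` files by name (assumed zero-padded offsets)."""
    files = sorted(p for p in commitlog_dir.iterdir() if p.is_file())
    return files[-count:]


def mismatched_commitlogs(master_dir: Path, slave_dir: Path, count: int = 2) -> list:
    """Names of the Master's newest files that are missing or differ on the Slave."""
    bad = []
    for master_file in latest_files(master_dir, count):
        slave_file = slave_dir / master_file.name
        if not slave_file.is_file() or md5_of(master_file) != md5_of(slave_file):
            bad.append(master_file.name)
    return bad
```

An empty result means the checked files match; any name returned is a file that would need to be copied from the Master before continuing.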

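Although a RocketMQ-on-DLedger Group can also be deployed with 2 nodes, it then loses its failover capability (the 2n + 1 rule: at least 3 nodes are needed to tolerate the failure of 1).
So after aligning the Commitlog of the Master and the Slave, you also need to prepare a third machine and copy the old Commitlog from the Master to it (remember to copy the config folder as well).

Once the 3 machines are ready and the old Commitlog files are confirmed to be identical, you can move on to the next step and modify the configuration.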
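
The 2n + 1 rule mentioned above follows from Raft's majority requirement, and the arithmetic is easy to check. The small Python sketch below uses hypothetical helper names and only illustrates the counting, not any RocketMQ API.

```python
def majority(nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes: int) -> int:
    """How many nodes may fail while a majority is still available."""
    return nodes - majority(nodes)


if __name__ == "__main__":
    for n in (2, 3, 5):
        print(n, "nodes: majority", majority(n), "tolerates", tolerated_failures(n))
```

With 2 nodes the majority is 2, so no node may fail; with 3 nodes the majority is still 2, so one node may fail, which is why the third machine is needed.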

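Modify the Configuration

See the new cluster deployment section above.

Restart the Brokers

See the new cluster deployment section above.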

Tips

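  • dLegerPeers: the address and port of every node in a DLedger Group. The value must be identical on all nodes of the same Group. Separate nodes with ASCII semicolons; each entry has the form dLegerSelfId-IP:port. The port is used for DLedger-internal communication, so make sure it does not clash with RocketMQ's own ports.
  • dLegerSelfId: the node id. It must be one of the ids in dLegerPeers, must be unique within the Group, and must match the corresponding entry in dLegerPeers.
  • In the Broker configuration keys, DLedger is spelled dLeger, with one "d" missing; it is unclear whether this is a typo or intentional.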
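
The port-conflict warning in the first tip can be turned into a quick check. The Python sketch below rests on an assumption that is not stated in this article: that besides listenPort, a broker by default also opens listenPort - 2 (the VIP channel) and listenPort + 1 (HA replication). Verify this against your RocketMQ version before relying on it. The function names are hypothetical.

```python
def reserved_ports(listen_port: int) -> set:
    """Ports assumed to be used by the broker itself (see assumption above)."""
    # Assumption: VIP channel on listenPort - 2, HA replication on listenPort + 1.
    return {listen_port - 2, listen_port, listen_port + 1}


def dledger_port_conflicts(listen_port: int, dledger_port: int) -> bool:
    """True if the DLedger port collides with a port the broker already uses."""
    return dledger_port in reserved_ports(listen_port)


if __name__ == "__main__":
    # broker-a from this article: listenPort 20911, DLedger port 10911.
    print(dledger_port_conflicts(20911, 10911))
```

When several brokers share one machine, as in the plan above, the same check would need to run against the ports of every broker on that host.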

References:

https://github.com/apache/rocketmq/blob/master/docs/cn/dledger/deploy_guide.md

Origin blog.csdn.net/sinat_14840559/article/details/116302823