Apache Pulsar Technology Series - Synchronization Principle of Subscription State in GEO Replication

Introduction

Apache Pulsar is a multi-tenant, high-performance solution for service-to-service messaging. It offers low latency, separation of reads and writes, cross-region replication (GEO Replication), rapid scaling, and flexible fault tolerance. GEO Replication natively supports replicating both data and subscription state between multiple clusters, and it has been used stably in Apache InLong for a long time. This article describes how subscription state is synchronized in GEO Replication.

Introduction to GEO Replication

GEO Replication provides the ability to replicate data between multiple clusters.

picture

The figure above shows three clusters with a different GEO Replication policy configured between each pair:

  • Cluster-A and Cluster-B use two-way replication. Topic data written in either cluster is copied to the peer, so the data of cluster A is copied to cluster B and the data of cluster B is copied to cluster A. Each of the two clusters ends up holding all of the other's data;

  • Cluster-A and Cluster-C use one-way replication: the data of cluster A is replicated to cluster C, but the data of cluster C is not replicated to cluster A;

  • Cluster-B and Cluster-C have no replication relationship: no data is replicated between clusters B and C.
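
The three policies can be written down as directed links, which makes it easy to check which clusters hold a given cluster's data. The following is a minimal sketch, not Pulsar code: the `REPLICATION_LINKS` set and the `clusters_holding` helper are hypothetical names, and the model assumes that a replicated message reaches only the direct destinations of the cluster where it was produced and is not forwarded further.

```python
# Directed replication links taken from the figure: (source, destination).
REPLICATION_LINKS = {("A", "B"), ("B", "A"), ("A", "C")}


def clusters_holding(origin, links=REPLICATION_LINKS):
    """Return the clusters that end up with messages produced in `origin`.

    Assumption of this sketch: a message is copied only to the direct
    destinations of its origin cluster and is never forwarded again.
    """
    return {origin} | {dst for src, dst in links if src == origin}


for name in "ABC":
    print(name, "->", sorted(clusters_holding(name)))
```

Under this model, data produced in cluster C stays in cluster C, which matches the one-way A to C policy described above.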

The above describes a typical scenario of data synchronization/replication. Another scenario in GEO Replication is subscription status synchronization.

Scenarios for subscription state synchronization

A typical application of subscription synchronization is cluster disaster recovery. Under normal circumstances, only the primary cluster serves producers and consumers. When the primary cluster fails, production and consumption switch to the standby cluster.

Switching production is seamless: producers can simply continue writing after the switch. Consumption is more complicated. If only data is synchronized, then after the switch the subscription on the standby cluster will consume historical data again, because it does not know what was already consumed on the primary cluster.

picture

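As the figure above shows, the Ledgers of a Topic (partition) do not correspond one to one between the primary and standby clusters. For example, suppose subscription sub-00 on the primary cluster has consumed a message that is stored in Ledger-x. After replication, the same message is stored in Ledger-y on the standby cluster. Ledger-x and Ledger-y have no direct relationship, so the subscription state (the MarkDeletePosition, MDP) cannot simply be copied across as-is.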
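
A toy illustration of why the position cannot be copied directly is sketched below. It is not Pulsar code: `make_log` is a hypothetical helper, the ledger ids 101 and 907 merely stand in for Ledger-x and Ledger-y, and the extra local entry on the standby is an assumption added only to make the entry ids diverge.

```python
from itertools import count


def make_log(ledger_id):
    """Return (append, log) for a toy topic stored in a single ledger."""
    log = {}
    entry_ids = count()

    def append(payload):
        position = (ledger_id, next(entry_ids))  # (ledgerId, entryId)
        log[position] = payload
        return position

    return append, log


append_primary, primary_log = make_log(ledger_id=101)  # stands in for Ledger-x
append_standby, standby_log = make_log(ledger_id=907)  # stands in for Ledger-y

append_standby("entry that exists only on the standby")

# The same payloads are stored under different positions in each cluster.
positions = {}
for payload in ["m0", "m1", "m2"]:
    positions[payload] = (append_primary(payload), append_standby(payload))

mdp_primary = positions["m1"][0]  # sub-00 has consumed up to m1 on the primary
print("m1 positions (primary, standby):", positions["m1"])
print("primary MDP exists on standby:", mdp_primary in standby_log)
```

The primary's MDP refers to a ledger that does not exist on the standby at all, so some form of position mapping is needed.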

Principle of GEO subscription state synchronization

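Subscription state synchronization can be broadly divided into two main steps:

  • The first step is to establish a mapping between MessageIds (which can be thought of as offsets) in the two clusters, that is, to determine which MessageId a message from the primary cluster has after it is replicated to the standby cluster;

  • The second step happens when a subscription on the primary cluster acks data: if the MDP changes, the primary cluster's MDP is translated into the MDP of the subscription on the standby cluster, using the MessageId mapping from the first step.

The rest of this section walks through the whole process in detail.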

MessageId mapping

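The most intuitive way to map MessageIds is to maintain a mapping for every single message between the primary and standby clusters. However, this requires maintaining far too many mappings and is too expensive.

Pulsar instead uses a scheduled task that periodically synchronizes the relationship between the LAC (LastAddConfirmed) positions of the primary and standby clusters. In the following, assume that cluster A replicates data and subscription state to cluster B.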

picture

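First, cluster A periodically produces a SnapshotRequest message and writes it to its local Topic (partition). This message is written to the Topic in cluster B along with the replicated data.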

picture

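Cluster B processes the SnapshotRequest, wraps the LAC of its local Topic (partition), referred to here as LAC-B, in a SnapshotResponse, and writes the response to its local Topic. GEO Replication then copies the response to cluster A.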

picture

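When cluster A processes the SnapshotResponse, it records the mapping between the MessageId that the SnapshotResponse received in the local Topic (LocalMessageId) and LAC-B. Because the operations happen in the order A -> SnapshotRequest -> B -> SnapshotResponse -> A, it is guaranteed that once the MDP of a subscription on cluster A is greater than LocalMessageId, the data up to LAC-B has already been consumed. This is how the MessageId mapping between the two clusters is established.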
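
The snapshot round trip can be sketched with two in-memory logs. This is a simplified model rather than Pulsar's implementation: the `Topic` class, `publish_on_a`, and `take_snapshot` are hypothetical names, replication is modelled as an immediate in-order append to the peer, and the ledger ids 17 and 5 are arbitrary.

```python
class Topic:
    """Toy topic: an ordered log whose positions are (ledger_id, entry_id)."""

    def __init__(self, ledger_id):
        self.ledger_id = ledger_id
        self.log = []

    def append(self, message):
        self.log.append(message)
        return (self.ledger_id, len(self.log) - 1)

    def lac(self):
        """Position of the last written entry (stand-in for LastAddConfirmed)."""
        return (self.ledger_id, len(self.log) - 1)


topic_a, topic_b = Topic(ledger_id=17), Topic(ledger_id=5)
snapshots = []  # (LocalMessageId in A, LAC-B) pairs, in increasing order


def publish_on_a(payload):
    """Write a data message to A; replication copies it to B in order."""
    position_a = topic_a.append(("data", payload))
    topic_b.append(("data", payload))
    return position_a


def take_snapshot():
    # 1. A writes a SnapshotRequest, which is replicated to B like any message.
    topic_a.append(("SnapshotRequest",))
    topic_b.append(("SnapshotRequest",))
    # 2. B reads its LAC and writes a SnapshotResponse to its local topic.
    lac_b = topic_b.lac()
    topic_b.append(("SnapshotResponse", lac_b))
    # 3. The response is replicated back; A records where it landed locally.
    local_message_id = topic_a.append(("SnapshotResponse", lac_b))
    snapshots.append((local_message_id, lac_b))


publish_on_a("m0")
publish_on_a("m1")
take_snapshot()
publish_on_a("m2")
take_snapshot()
print(snapshots)
```

Every data message that A wrote before a SnapshotRequest sits at or before the recorded LAC-B in cluster B, and before the recorded LocalMessageId in cluster A, which is the ordering guarantee the mapping relies on.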

Subscription information synchronization

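The subscription on cluster A keeps consuming and acking. When an ack causes the MDP to move, cluster A checks whether LocalMessageId is less than the MDP. If it is, the MDP of the subscription on cluster B needs to be updated. Cluster A then looks up LAC-B through the mapping, constructs a ReplicatedSubscriptionsUpdate message, and writes it to the local Topic. GEO Replication copies this ReplicatedSubscriptionsUpdate message to cluster B.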

picture

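After cluster B receives the ReplicatedSubscriptionsUpdate message, it parses out the LAC and the subscription information, and then updates the subscription's MDP.

This completes one round of subscription state replication.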
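
The lookup on cluster A and the update on cluster B can be sketched as follows. This is an illustrative model only: `build_update` and `apply_update` are hypothetical helpers, the snapshot values are made up, the update is a plain dict rather than Pulsar's actual message format, and the rule that cluster B never moves the MDP backwards is an assumption of this sketch.

```python
from bisect import bisect_left

# (LocalMessageId in A, LAC-B) pairs from earlier snapshots, sorted by LocalMessageId.
snapshots = [((17, 3), (5, 2)), ((17, 6), (5, 5)), ((17, 9), (5, 8))]


def build_update(subscription, new_mdp):
    """Cluster A side: map the new MDP to a position in cluster B.

    Returns a ReplicatedSubscriptionsUpdate-like dict, or None when no
    snapshot has a LocalMessageId strictly less than the new MDP.
    """
    local_ids = [local_id for local_id, _ in snapshots]
    covered = bisect_left(local_ids, new_mdp)  # count of LocalMessageIds < new_mdp
    if covered == 0:
        return None
    _, lac_b = snapshots[covered - 1]  # newest snapshot already passed by the MDP
    return {"subscription": subscription, "position": lac_b}


def apply_update(mdp_in_b, update):
    """Cluster B side: move the subscription's MDP forward (assumed monotonic)."""
    name, position = update["subscription"], update["position"]
    if name not in mdp_in_b or position > mdp_in_b[name]:
        mdp_in_b[name] = position


mdp_in_b = {}
for new_mdp in [(17, 2), (17, 7)]:
    update = build_update("sub-00", new_mdp)
    if update is not None:
        apply_update(mdp_in_b, update)
print(mdp_in_b)
```

Because only snapshot points are mapped, the MDP on cluster B can lag behind the true consumption progress by up to one snapshot interval, so a small amount of re-delivery after a switch is still possible.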

Summary and thoughts

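Pulsar's subscription state replication relies on the native GEO Replication mechanism and requires two-way interaction between the primary and standby clusters. As a result, subscription state cannot be synchronized between GEO clusters that are configured for one-way replication.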

In addition, the current subscription state synchronization only considers the MDP. For a subscription (especially Shared and Key-Shared subscriptions), the IndividuallyDeletedMessages information is also very important, particularly when a large number of consumers use individual acks. If IndividuallyDeletedMessages is not synchronized, messages will be delivered again after a switch.

Since IndividuallyDeletedMessages records the ack of each individual message, solving this problem requires two things:

  • Record the mapping between each MessageId on the primary and standby clusters, for example by storing the MessageId of the original message in the properties of the replicated message.

  • Copy IndividuallyDeletedMessages to the standby cluster.

When the subscription on the standby cluster consumes data, it can determine whether a message has already been acked by combining the MessageId mapping between the primary and standby clusters with the IndividuallyDeletedMessages copied from the primary cluster. If the message has been acked, it is not pushed to the consumer, so data is not repeated when the subscription switches clusters.
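
The proposal above can be sketched in a few lines. This illustrates the idea only and is not an existing Pulsar feature: the `origin-message-id` property name, the `replicate` and `deliverable` helpers, and the use of a plain set for the acked positions are all assumptions made for this example.

```python
ORIGIN_ID_KEY = "origin-message-id"  # hypothetical property name


def replicate(primary_messages):
    """Copy messages to the standby, stamping each with its primary MessageId."""
    return [
        {"payload": payload, "properties": {ORIGIN_ID_KEY: position}}
        for position, payload in primary_messages
    ]


def deliverable(standby_messages, acked_on_primary):
    """Standby side: skip messages that were individually acked on the primary."""
    return [
        message["payload"]
        for message in standby_messages
        if message["properties"][ORIGIN_ID_KEY] not in acked_on_primary
    ]


primary = [((17, 0), "m0"), ((17, 1), "m1"), ((17, 2), "m2"), ((17, 3), "m3")]
acked = {(17, 0), (17, 2)}  # m1 is unacked, so the MDP alone cannot cover m2

print(deliverable(replicate(primary), acked))  # prints ['m1', 'm3']
```

With MDP-only synchronization, m2 would be delivered again after the switch. With the copied ack set it is filtered out.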

Origin juejin.im/post/7259668108649218103