In-depth analysis of Apache Pulsar series (1): client message acknowledgment

Parsing Apache Pulsar - Client Message Acknowledgement

About the author:

Tencent Cloud Middleware Expert Engineer

Apache Pulsar PMC, author of "In-depth Analysis of Apache Pulsar".

Currently focusing on middleware, he has rich experience in message queues and microservices.

He is responsible for the design and development of CKafka and TDMQ, and is currently committed to building stable, efficient, and scalable basic components and services.

Introduction

In Apache Pulsar, message acknowledgment by the consumer is a very important step in avoiding repeated delivery. After a consumer has consumed a message, it must send an Ack request to the broker; only then does the broker consider the message actually consumed. Messages marked as consumed are not redelivered to consumers afterwards. In this article, we introduce the message acknowledgment modes in Pulsar and how normal message acknowledgment is implemented on the Broker side.

1 Message acknowledgment modes

Before looking at Pulsar's message acknowledgment modes, we need some background on subscriptions and cursors (Cursor) in Pulsar. Pulsar has several consumption modes, such as Shared, Key_Shared, and Failover. Whichever mode the user chooses, a subscription is created. Subscriptions are either durable or non-durable. For a durable subscription, the Broker keeps a persistent Cursor, that is, the Cursor's metadata is recorded in ZooKeeper. A Cursor exists per subscription (or consumer group) and stores the position up to which the subscription has consumed. Because different consumers use different subscription modes, the Ack operations available to them also differ. Broadly, there are the following Ack scenarios:

(1) Single message acknowledgment (Acknowledge)

Unlike some other messaging systems, Pulsar allows one Partition to be consumed by multiple consumers. Suppose messages 1, 2, and 3 are sent to Consumer-A and messages 4, 5, and 6 are sent to Consumer-B. If Consumer-B consumes faster and Acks message 4 first, the Cursor records message 4 individually as acknowledged. If the other messages have been consumed but not Acked, and both consumers then go offline or the Ack times out, the Broker pushes only messages 1, 2, 3, 5, and 6 again; message 4, which has already been Acked, is not pushed again.

(2) Cumulative acknowledgment (AcknowledgeCumulative)

Suppose the Consumer receives messages 1, 2, 3, 4, and 5. To improve Ack performance, the Consumer does not have to Ack the 5 messages one by one; it only needs to call AcknowledgeCumulative and pass in message 5. The Broker then marks message 5 and all messages before it as acknowledged.
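The two modes described so far can be illustrated with a toy cursor. This is only a sketch under simplifying assumptions: message ids are plain numbers starting at 1, and the class and method names (CursorSketch, pendingRedelivery, and so on) are hypothetical and are not Pulsar's actual cursor implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Toy model of a subscription cursor; not Pulsar's actual cursor code.
class CursorSketch {
    // Every message up to and including this id counts as acknowledged.
    private long markDeletePosition = 0;
    // Ids acknowledged one by one that lie beyond markDeletePosition.
    private final TreeSet<Long> individuallyAcked = new TreeSet<>();

    // Single acknowledgment: record only this message.
    void acknowledge(long messageId) {
        if (messageId > markDeletePosition) {
            individuallyAcked.add(messageId);
            advance();
        }
    }

    // Cumulative acknowledgment: this message and everything before it.
    void acknowledgeCumulative(long messageId) {
        if (messageId > markDeletePosition) {
            markDeletePosition = messageId;
            individuallyAcked.headSet(messageId, true).clear();
            advance();
        }
    }

    // Move the position forward over a contiguous run of single acks.
    private void advance() {
        while (individuallyAcked.remove(markDeletePosition + 1)) {
            markDeletePosition++;
        }
    }

    // Messages in [1, lastSent] that would still have to be pushed again.
    List<Long> pendingRedelivery(long lastSent) {
        List<Long> pending = new ArrayList<>();
        for (long id = markDeletePosition + 1; id <= lastSent; id++) {
            if (!individuallyAcked.contains(id)) {
                pending.add(id);
            }
        }
        return pending;
    }
}
```

With messages 1 to 6 sent and only message 4 acknowledged individually, pendingRedelivery(6) yields 1, 2, 3, 5, and 6, which matches the Consumer-A and Consumer-B example above. A single acknowledgeCumulative(5) call on a fresh cursor leaves nothing pending up to message 5.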

(3) Single message acknowledgment (Acknowledge) in batch messages

In this acknowledgment mode, the interface called is the same as for acknowledging a single message, but the capability requires the Broker to enable the configuration item AcknowledgmentAtBatchIndexLevelEnabled. Once it is enabled, Pulsar can Ack only certain messages within a Batch. Suppose the Consumer receives a Batch that contains messages 1, 2, and 3. If the option is not enabled, we can only Ack after consuming the entire Batch; otherwise the Broker redelivers the whole Batch. With the option enabled, we can use the Acknowledge method to acknowledge a single message within the Batch.

(4) Negative Acknowledge

The client sends a RedeliverUnacknowledgedMessages command to the Broker to explicitly inform the Broker that the current Consumer cannot consume this message, and the message will be redelivered.

Not all subscription modes support all of the Ack operations above. For example, AcknowledgeCumulative is not supported in Shared or Key_Shared mode, because in those modes the earlier messages were not necessarily consumed by the current Consumer. If AcknowledgeCumulative were used, messages belonging to other consumers would be acknowledged as well. The relationship between subscription modes and message acknowledgment is as follows:

Subscription mode | Single Ack | Cumulative Ack | Single Ack in Batch | Negative Ack
Exclusive         | Supported  | Supported      | Supported           | Not supported
Shared            | Supported  | Not supported  | Supported           | Supported
Failover          | Supported  | Supported      | Supported           | Not supported
Key_Shared        | Supported  | Not supported  | Supported           | Supported

2 Implementation of Acknowledge and AcknowledgeCumulative

The Acknowledge and AcknowledgeCumulative interfaces do not send an acknowledgment request to the Broker directly. Instead, they forward the request to the AcknowledgmentsGroupingTracker. This is the first Tracker in the Consumer that we introduce. It is only an interface with two implementations: one for durable subscriptions and one for non-durable subscriptions. Since the Tracker implementation for non-durable subscriptions is empty, that is, it does nothing, we only introduce the implementation for durable subscriptions - PersistentAcknowledgmentsGroupingTracker.

In Pulsar, to keep acknowledgment performant and to prevent the Broker from receiving Ack requests at very high concurrency, the Tracker acknowledges in batches by default. Even the acknowledgment of a single message enters a queue first and is then sent to the Broker as part of a batch. All single acknowledgment (individualAck) requests are first put into a Set named PendingIndividualAcks. By default, a batch of acknowledgment requests is sent every 100ms, or earlier if more than 1000 acknowledgment requests have accumulated. The interval can be set with the parameter AcknowledgementGroupTimeMicros when creating the Consumer. If it is set to 0, the Consumer sends every acknowledgment request immediately.
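The grouping behavior can be sketched as follows. This is a minimal illustration and not the PersistentAcknowledgmentsGroupingTracker itself: the class name AckGroupingSketch, the sender callback that stands in for the request to the Broker, and the choice to flush as soon as the pending count reaches the threshold are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Toy model of ack grouping; not Pulsar's actual tracker.
class AckGroupingSketch {
    private final int maxPending;               // size threshold (1000 in the text)
    private final Consumer<List<Long>> sender;  // stands in for the request to the Broker
    private final List<Long> pending = new ArrayList<>();

    AckGroupingSketch(int maxPending, Consumer<List<Long>> sender) {
        this.maxPending = maxPending;
        this.sender = sender;
    }

    // Queue one ack; flush early once the size threshold is reached.
    synchronized void addAck(long messageId) {
        pending.add(messageId);
        if (pending.size() >= maxPending) {
            flush();
        }
    }

    // Called by a periodic timer or by the size check above.
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sender.accept(new ArrayList<>(pending)); // one request for the whole group
        pending.clear();
    }
}
```

In a real client the periodic part could be driven by a ScheduledExecutorService, for example scheduleAtFixedRate(tracker::flush, 100, 100, TimeUnit.MILLISECONDS). The sketch leaves the timer out so that its behavior stays deterministic.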

The acknowledgment request is ultimately sent asynchronously. If the Consumer has enabled receipts (Receipt), a CompletableFuture is returned, and success or failure can be observed through the Future. By default no receipt is required, and an already completed CompletableFuture is returned directly.

A single acknowledgment within a Batch message (IndividualBatchAck) is stored in a Map named PendingIndividualBatchIndexAcks rather than in the Set used for ordinary single messages. The Key of this Map is the MessageId of the Batch message, and the Value is a BitSet that records which messages in the Batch need to be Acked. Using a BitSet greatly reduces the memory needed to store message IDs: 1KB can record whether 8192 messages have been acknowledged. Since a BitSet contains only 0s and 1s, it can easily be stored off-heap. BitSet objects are also pooled and reused instead of being created anew each time, which is very memory-friendly.

As shown in the figure below, only 8 bits are needed to represent the Ack state of the 8 messages in a Batch. In the figure, the entries whose EntryId is 0, 2, 5, 6, and 7 have been acknowledged, and each acknowledged position is set to 1.

Cumulative acknowledgment (CumulativeAck) is simpler to implement: the Tracker only stores the latest acknowledged position. For example, if the CumulativeAck position stored in the Tracker is currently 5:10, the subscription has consumed up to the message with LedgerId=5 and EntryId=10. If 5:20 is Acked later, it simply replaces the previous 5:10.

Finally, there is the Flush of the Tracker. All acknowledgments are sent to the Broker by triggering the flush method. Whatever the kind of acknowledgment, Flush creates the same command and sends it to the Broker; only the AckType in the parameters differs.
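The bitmap idea can be shown with java.util.BitSet. This is a minimal sketch: the class name BatchIndexAckSketch, the use of a plain String as the batch id, and the unpooled on-heap BitSet are assumptions for illustration. Following the text, a bit set to 1 means the message at that position has been acknowledged.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Toy model of per-batch ack bitmaps; not Pulsar's actual data structure.
class BatchIndexAckSketch {
    // Key: id of the Batch message; value: one bit per message in the Batch.
    private final Map<String, BitSet> pendingBatchIndexAcks = new HashMap<>();

    // Mark one message inside a batch as acknowledged.
    void ackIndex(String batchId, int batchSize, int index) {
        if (index < 0 || index >= batchSize) {
            throw new IndexOutOfBoundsException("index " + index + " not in batch of " + batchSize);
        }
        pendingBatchIndexAcks
                .computeIfAbsent(batchId, id -> new BitSet(batchSize))
                .set(index);
    }

    boolean isAcked(String batchId, int index) {
        BitSet bits = pendingBatchIndexAcks.get(batchId);
        return bits != null && bits.get(index);
    }

    // True once every message of the batch has been acknowledged.
    boolean isFullyAcked(String batchId, int batchSize) {
        BitSet bits = pendingBatchIndexAcks.get(batchId);
        return bits != null && bits.cardinality() == batchSize;
    }
}
```

Acknowledging positions 0, 2, 5, 6, and 7 of a batch of 8 reproduces the state described above: those five bits are 1 and positions 1, 3, and 4 remain 0. The memory figure also follows directly, since 1KB is 1024 bytes and 1024 * 8 = 8192 bits.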
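The "keep only the latest position" rule can be sketched as below. The class names are hypothetical. Ignoring a position that is older than the stored one is an assumption added for safety; the text only describes replacing an older position with a newer one.

```java
// Toy model of the client-side cumulative ack state; not Pulsar's actual code.
class CumulativeAckSketch {
    // A position is a (ledgerId, entryId) pair, written as ledgerId:entryId.
    static final class Position {
        final long ledgerId;
        final long entryId;

        Position(long ledgerId, long entryId) {
            this.ledgerId = ledgerId;
            this.entryId = entryId;
        }

        // Orders by ledger first, then by entry within the same ledger.
        boolean isAfter(Position other) {
            return ledgerId != other.ledgerId
                    ? ledgerId > other.ledgerId
                    : entryId > other.entryId;
        }

        @Override
        public String toString() {
            return ledgerId + ":" + entryId;
        }
    }

    private Position lastCumulativeAck; // null until the first cumulative ack

    // Replace the stored position only if the new one is further ahead.
    synchronized void ackCumulative(Position position) {
        if (lastCumulativeAck == null || position.isAfter(lastCumulativeAck)) {
            lastCumulativeAck = position;
        }
    }

    synchronized Position current() {
        return lastCumulativeAck;
    }
}
```

After ackCumulative for 5:10 and then 5:20, current() holds 5:20, so a later flush only has to send one position regardless of how many cumulative acks were issued in between.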

3 Implementation of NegativeAcknowledge

Like the other kinds of acknowledgment, a negative acknowledgment is not sent to the Broker immediately; the request is forwarded to the NegativeAcksTracker. The Tracker records each message together with the time by which it must be delayed. It reuses the PulsarClient's time wheel, which by default checks on a tick of about 33ms, and the default delay is 1 minute. On each check, the Tracker extracts the messages whose delay has expired and triggers their redelivery. The Tracker's main purpose is to merge requests. In addition, until the delay expires, a message is held in memory. If the business side has a large number of messages that need delayed consumption, the ReconsumeLater interface is recommended instead. The only advantage of NegativeAck is that you do not need to specify a time for each message; the delay can be set globally.
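The record-then-collect behavior described above can be sketched like this. It is an illustration only, not the NegativeAcksTracker itself: the class name, the numeric message ids, and passing the current time in as a parameter (instead of reading a clock from a timer callback) are assumptions that keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Toy model of a negative-ack tracker; not Pulsar's actual tracker.
class NackTrackerSketch {
    private final long delayMillis;                            // one global delay
    private final Map<Long, Long> deadlines = new HashMap<>(); // messageId -> deadline

    NackTrackerSketch(long delayMillis) {
        this.delayMillis = delayMillis;
    }

    // Record a negatively acknowledged message; it stays in memory until it expires.
    synchronized void add(long messageId, long nowMillis) {
        deadlines.put(messageId, nowMillis + delayMillis);
    }

    // Called on each timer tick: remove and return every message whose delay
    // has expired, so that one redelivery request can cover all of them.
    synchronized List<Long> collectExpired(long nowMillis) {
        List<Long> expired = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = deadlines.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        Collections.sort(expired);
        return expired;
    }
}
```

Because collectExpired returns all expired messages at once, several negative acknowledgments that expire within the same tick are merged into a single redelivery request, which is the merging role the text attributes to the Tracker.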

4 Handling of unacknowledged messages

What happens if the consumer receives a message but never Acks it? There are two cases. In the first, the business side has called the Receive method, or the message has been handed to a consumer that was waiting asynchronously. A reference to the message is then kept in the UnAckedMessageTracker, the third Tracker in the Consumer. UnAckedMessageTracker maintains a time wheel whose ticks are derived from the two parameters AckTimeout and TickDurationInMs: number of ticks = AckTimeout / TickDurationInMs. A newly tracked message is placed in the last tick. Each time the wheel is scheduled, the tick at the head of the queue is removed and a new tick is appended to the tail, so the total number of ticks stays constant. The messages in the removed head tick are cleaned up, and UnAckedMessageTracker automatically redelivers them.

Redelivery means that the client sends a RedeliverUnacknowledgedMessages command to the Broker. Every message that has been pushed to a consumer but not yet Acked is recorded in a collection (Pending Ack) on the Broker side, which is used to avoid repeated delivery. When redelivery is triggered, the Broker removes the corresponding messages from this collection so that they can be consumed again. Note that if the consumer is not in Shared mode, a single message cannot be redelivered; only all of the messages that the consumer has received but not Acked can be redelivered together. The following figure is a simple example of a time wheel:

In the second case, the consumer has prefetched messages but has not called any Receive method, so the messages keep accumulating in the local queue. Prefetching is the default behavior of the client SDK, which pulls messages to the local queue in advance. The number of prefetched messages can be controlled with the ReceiveQueueSize parameter when creating the consumer. The Broker records the messages already pushed to the consumer as PendingAck; they are not delivered to other consumers, and their Ack does not time out. Only when the current Consumer is closed are these messages redelivered. The Broker has a RedeliveryTracker interface, whose current implementation is in-memory tracking (InMemoryRedeliveryTracker). This Tracker records how many times each message has been redelivered. Whenever a message is pushed to a consumer, its redelivery count is first looked up in the Tracker's hash table and then pushed to the consumer together with the message.

It follows from the logic above that the ReceiveQueueSize set when creating a consumer must be chosen carefully, so that a large number of messages do not pile up in one Consumer's local prefetch queue while other Consumers have nothing to consume. ConsumerStatsRecorder can be enabled on the PulsarClient. Once enabled, the consumer periodically prints its metrics, such as the number of messages accumulated locally and the number of messages received, which helps the business side troubleshoot performance problems.
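The rotation of ticks can be sketched as follows. This is a simplified illustration and not the UnAckedMessageTracker itself: the class name and numeric message ids are hypothetical, the number of ticks is assumed to be AckTimeout divided by TickDurationInMs (rounded up, at least 1), and the timer that would call advance() once per tick is left out.

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.Set;

// Toy model of the ack-timeout time wheel; not Pulsar's actual tracker.
class UnackedTimeWheelSketch {
    // Head of the deque is the oldest tick, tail is the newest.
    private final ArrayDeque<Set<Long>> ticks = new ArrayDeque<>();

    UnackedTimeWheelSketch(long ackTimeoutMillis, long tickDurationMillis) {
        int tickCount = Math.max(1,
                (int) Math.ceil((double) ackTimeoutMillis / tickDurationMillis));
        for (int i = 0; i < tickCount; i++) {
            ticks.addLast(new HashSet<>());
        }
    }

    // A newly received message is placed in the last tick.
    void track(long messageId) {
        ticks.peekLast().add(messageId);
    }

    // An acknowledged message no longer needs to be tracked.
    void ack(long messageId) {
        for (Set<Long> tick : ticks) {
            tick.remove(messageId);
        }
    }

    // Runs once per tick: drop the head tick, append an empty one, and
    // return the timed-out messages that should be redelivered.
    Set<Long> advance() {
        Set<Long> expired = ticks.removeFirst();
        ticks.addLast(new HashSet<>());
        return expired;
    }
}
```

With an AckTimeout of 3000 ms and a tick of 1000 ms the wheel has 3 ticks, so a message that is tracked and never acknowledged is returned by the third call to advance() after it was tracked. A message that is acknowledged in time is removed from its tick and never comes back.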
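The redelivery counting on the Broker side can be pictured as a small hash table. This is a sketch with hypothetical names, keyed here by a "ledgerId:entryId" string for readability; it is not the InMemoryRedeliveryTracker code.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of an in-memory redelivery counter; not Pulsar's actual tracker.
class RedeliveryCountSketch {
    // Key: message position as "ledgerId:entryId"; value: times redelivered.
    private final Map<String, Integer> counts = new HashMap<>();

    // Called when a message is redelivered; returns the new count.
    int incrementAndGet(String position) {
        return counts.merge(position, 1, Integer::sum);
    }

    // Looked up before a message is pushed, so the count can travel with it.
    int getRedeliveryCount(String position) {
        return counts.getOrDefault(position, 0);
    }

    // Forget a message once it no longer needs tracking, e.g. after its Ack.
    void remove(String position) {
        counts.remove(position);
    }
}
```

A message that has never been redelivered reports a count of 0. Because the table lives only in memory in this sketch, the counts would be lost on restart.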

Conclusion

There are many design details in Pulsar. Because space is limited, the author will organize a series of articles to share them, so stay tuned. If you want to learn Pulsar systematically, you can buy the author's new book "In-depth Analysis of Apache Pulsar".
