Comparing Message Reliability in RabbitMQ and Kafka

Both RabbitMQ and Kafka are designed to deliver messages durably. Both offer at-least-once and at-most-once guarantees, and Kafka can additionally offer an exactly-once guarantee in certain limited scenarios.

Let's first define these terms:

At-most-once delivery: a message is never delivered more than once, but it may be lost.

At-least-once delivery: a message is never lost, but it may be consumed more than once.

Exactly-once delivery: the holy grail of messaging systems. Every message is delivered exactly once.

"Delivery" seemingly not an accurate description language, "treatment" is. No matter how described, we are concerned that consumers can process messages, and the number of processes. However, using the "process" will make the problem more complicated. For example, the message must be delivered twice in order to be processed once. As another example, if the consumer during processing downtime, the message to be delivered a second time (to another consumer).

Second, talking in terms of "processing" makes partial failure awkward. Processing a message usually involves multiple steps: from start to finish there is application logic plus communication between the application and the messaging system. Partial failures inside the application logic are the application's problem; if the processing logic is transactional, so the result is all-or-nothing, the application can avoid partial failure. In practice, though, processing often spans several different systems, where such a transaction is impossible. Once communication, the application, caches, and databases are all involved, exactly-once processing is out of reach.

Exactly-once is therefore achievable only in one narrow case: the processing consumes messages and produces nothing but messages, and the messaging system itself provides transactions. Within that limited scenario we can read a message, write the resulting messages, and ACK the original, all inside one transaction. This is what Kafka Streams provides.

However, if message processing is idempotent, we can bypass the transaction-based exactly-once guarantee: an idempotent handler can safely process duplicate messages. Of course, not all message processing is idempotent.
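As a minimal sketch (not tied to either broker's client library; the message IDs and handler below are made up for illustration), idempotent consumption can be implemented by remembering which messages have already been processed:

```python
# Illustrative only: an idempotent handler that tolerates duplicate deliveries
# by tracking processed message IDs. In production the "seen" set would live
# in a durable store (a database or Redis), not in process memory.

processed_ids = set()
results = []

def handle_message(msg_id, payload):
    """Apply the business logic at most once per message ID."""
    if msg_id in processed_ids:
        return False  # duplicate delivery: ACK it, but do no work
    results.append(payload.upper())  # stand-in for the real business logic
    processed_ids.add(msg_id)
    return True

handle_message("m1", "hello")
handle_message("m1", "hello")  # a redelivered duplicate: safely ignored
handle_message("m2", "world")
print(results)  # ['HELLO', 'WORLD']
```

With this pattern, duplicates from at-least-once delivery become harmless, which is exactly why idempotence lets you skip exactly-once machinery.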

 

Chain of Responsibility

Essentially, a producer cannot know whether its message was consumed. What it can know is whether the messaging system received the message and stored it safely for delivery. There is a chain of responsibility here, starting at the producer, passing through the messaging system, and ending at the consumer. Each link must behave correctly, and each hand-off between links must happen correctly. As an application developer, this means writing your programs carefully so that messages are neither lost nor misused.

Message Ordering

This article focuses on how RabbitMQ and Kafka provide at-least-once and at-most-once delivery, but ordering deserves a mention too. Simply put, both support FIFO ordering: RabbitMQ at the queue level, Kafka at the partition level within a topic.

RabbitMQ

RabbitMQ's delivery guarantees depend on:

Message durability: once stored, a message is not lost.

Message ACKs: signals between RabbitMQ and its producers and consumers.

Mirrored queues

Queues can be mirrored (replicated) across nodes. Each queue has one master queue living on a single node. Suppose we have three nodes and 10 queues, each with two mirrors; then 10 master queues and 20 mirrors are distributed across the three nodes. How the master queues are distributed is configurable. When a node goes down:

Each master queue hosted on the failed node has one of its mirrors on another node promoted to master.

Replacement mirrors are created on other nodes to stand in for the mirrors lost with the failed node, maintaining the replication factor.

Durable queues

RabbitMQ has two kinds of queues: durable and non-durable. A durable queue is stored on disk and is rebuilt after its node restarts.

Durable messages

A durable queue does not by itself guarantee that messages survive a node failure; only messages marked persistent are recovered after a restart.

For RabbitMQ, the more durable the messages, the worse the queue's throughput. So if you have a live stream where losing a little data is not a big problem, consider mirrored queues with non-persistent messages. If, however, you must not lose data when a node goes down, use mirrored queues, durable queues, and persistent messages together.

Message ACKs

Publishing

Messages can be lost or duplicated at publish time; it depends on how the producer behaves.

Fire and forget: a publisher can choose not to use publisher ACKs and simply fire off messages. Messages are never duplicated, but they can be lost (at-most-once delivery).

Publisher confirms: when a publisher opens a channel to the broker, it can put the channel into confirm mode. The broker then replies to the publisher as follows:

basic.ack: positive; the message was received and is now RabbitMQ's responsibility.

basic.nack: negative; an error occurred and the message was not handled. Responsibility stays with the publisher, which may need to resend.

Besides these two, there is a third reply, basic.return. Sometimes a publisher needs to know not only that the broker received the message, but also that the message was actually persisted to at least one queue. For example, a publisher may post a message to an exchange that has no matching bound queue, in which case the broker simply discards it. In most cases that is fine, but sometimes the publisher needs to know whether the message was discarded or handled. The mandatory flag can be set per message; with it set, if the message would be discarded unhandled, a basic.return is sent back instead.

A publisher could wait for an ACK after every single message, but that would seriously hurt throughput. So publishers generally stream messages continuously, with a limit on the number of unACKed messages in flight. Once the in-flight limit is reached, the publisher pauses and waits for ACKs to arrive.

With many messages now in transit between the publisher and RabbitMQ, RabbitMQ improves throughput by using the multiple flag to ACK a whole group of messages at once. To make this work, every message is assigned a monotonically increasing sequence number, and each ACK carries the corresponding sequence number. When the multiple flag is used, the publisher must keep track of the sequence numbers of sent messages so it knows which messages an ACK covers.

So, using ACKs, we can avoid message loss in the following ways:

On receiving a nack, republish the message.

On receiving a nack or a basic.return, persist the message somewhere.

 

Transactions: transactions are not commonly used in RabbitMQ, because:

The guarantees are unclear: if a message is routed to multiple queues, or uses the mandatory flag, the transaction's atomicity cannot be relied upon.

Performance is relatively poor.

Frankly, I have never used RabbitMQ transactions; they add little extra guarantee while adding uncertainty.

Connection/channel failures: besides message ACKs, publishers also have to consider dropped connections and broker failures, both of which lose the channel. A lost channel means ACKs can no longer be received. On this issue the publisher must pick a compromise: risk losing messages, or risk duplicating them.

If the broker goes down, a message may still be sitting in an OS buffer, or be mid-parse, and is then lost. Or the message may already have been persisted, with the broker crashing just before sending the ACK, in which case the message was actually delivered successfully.

The same goes for dropped connections. We cannot know exactly when the failure happened, so we can only choose:

Don't republish, and risk losing the message.

Republish, and risk duplicating it.

If the publisher had many messages in flight, the problem is amplified. One mitigation is for the publisher to flag retransmitted messages, telling consumers the message is a retransmission so they can attempt deduplication.
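That mitigation can be sketched as follows (purely illustrative; the header name x-retransmitted is invented here, not a standard AMQP header):

```python
def republish_unacked(outstanding, publish):
    """After a lost channel, resend every message that never got an ACK,
    flagging each one so consumers can attempt deduplication."""
    for seq in sorted(outstanding):
        headers, body = outstanding[seq]
        marked = dict(headers, **{"x-retransmitted": True})  # hypothetical header
        publish(marked, body)

sent = []
outstanding = {1: ({}, "order-created"), 2: ({}, "order-paid")}
republish_unacked(outstanding, lambda headers, body: sent.append((headers, body)))
print(sent)
# [({'x-retransmitted': True}, 'order-created'),
#  ({'x-retransmitted': True}, 'order-paid')]
```

Consumers that see the flag know the message may be a duplicate and can check it against their own processing records.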

 

 

Consumers

For ACKs, consumers have two choices:

No-ACK mode.

Manual-ACK mode.

No-ACK mode (also called auto-ACK) is dangerous. First, a message is removed from the queue as soon as it is delivered to the application layer. This can lose messages when:

the message is still in an internal buffer when the application crashes;

message processing fails.

Second, we lose control over the delivery rate. With manual ACKs, we can set a prefetch (QoS) value to limit how many unACKed messages the application can hold. Without this feature, RabbitMQ delivers messages as fast as it can, faster than the consumer can handle them, causing internal buffer overflow or memory problems.
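The effect of the prefetch value can be shown with a toy broker model (illustrative only, not the real basic.qos implementation):

```python
class PrefetchedQueue:
    """Toy model of basic.qos prefetch: stop delivering once `prefetch`
    messages have been delivered but not yet ACKed."""

    def __init__(self, messages, prefetch):
        self.pending = list(messages)
        self.prefetch = prefetch
        self.unacked = []

    def deliver(self):
        delivered = []
        while self.pending and len(self.unacked) < self.prefetch:
            msg = self.pending.pop(0)
            self.unacked.append(msg)
            delivered.append(msg)
        return delivered

    def ack(self, msg):
        self.unacked.remove(msg)  # ACK frees a slot in the window

q = PrefetchedQueue(["m1", "m2", "m3", "m4"], prefetch=2)
print(q.deliver())  # ['m1', 'm2'] -- window is now full
print(q.deliver())  # []           -- nothing more until an ACK arrives
q.ack("m1")
print(q.deliver())  # ['m3']
```

The unACKed window is what keeps a slow consumer from being buried: delivery throttles itself to the consumer's ACK rate.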

Manual-ACK mode: the consumer must ACK each message explicitly. The consumer can set a prefetch value greater than one and process multiple messages in parallel. It can ACK messages one at a time, or set the multiple flag to ACK several messages at once; batching improves performance.

When a consumer opens a channel, each delivered message carries a monotonically increasing integer, the delivery tag, which identifies the message being ACKed.

The ACKs are summarized as follows:

basic.ack: RabbitMQ removes the message from the queue. The multiple flag is supported.

basic.nack: the consumer tells RabbitMQ the message needs to be requeued. Requeueing puts the message back at the head of the queue, to be delivered to a consumer again. The multiple flag is also supported.

basic.reject: similar to basic.nack, but without support for the multiple flag.

Semantically, basic.ack and basic.nack with requeue == false are equivalent in one respect: both cause the message to be removed from the queue.

The next question is: when to send the ACK? If messages are processed quickly, you can ACK after processing completes. But if processing takes minutes, ACKing after processing is problematic: if the channel goes down in the meantime, every unACKed message is requeued, producing duplicates.

Connection/channel failures

If a connection failure, or a broker failure, kills the channel, all unACKed messages are requeued and delivered again. This does not lose messages, but it does duplicate them.

The longer a consumer holds a message unACKed, the higher the risk of redelivery. When a message is redelivered, its redelivered flag is set, so in the worst case the consumer at least knows the message is a retransmission.

Idempotence

If you need idempotence while also guaranteeing no message loss, you need message deduplication or some other idempotent pattern. If deduplication is expensive, you can have the publisher add a retransmission header to resent messages, so that consumers only need to check that header together with the redelivered flag.

Conclusion
RabbitMQ provides strong, reliable, durable messaging guarantees; however, there are many ways to get it wrong.

Here are some recommendations:

To guarantee at-least-once delivery, use mirrored queues, durable queues, persistent messages, publisher ACKs, the mandatory flag, and manual consumer ACKs.

With at-least-once delivery, you will probably also need deduplication logic or idempotent processing.

If you care less about losing messages and more about low latency and scalability, skip mirrored queues, persistent messages, and publisher ACKs. I would still keep manual consumer ACKs, using the prefetch value to control the delivery rate, and of course set the multiple flag and ACK in batches.

 

Kafka

Kafka's delivery guarantees rest on:

Message durability: once written to a topic, a message is not lost.

Message ACKs: signals between Kafka (or ZooKeeper, where involved) and its producers and consumers.

About batching

Kafka and RabbitMQ differ in how they batch messages on the send and consume paths. RabbitMQ offers:

Pausing after every x messages sent until all ACKs are received; RabbitMQ typically ACKs a group at a time using the multiple flag.

Consumers setting a prefetch value and ACKing in groups with the multiple flag.

But messages themselves are not sent in batches; it is really just a window of messages allowed in transit, grouped with the multiple flag. This is a lot like TCP.

Kafka, by contrast, batches messages explicitly. Batching improves performance, but it comes with the same trade-off as RabbitMQ's unACKed in-flight window: the more messages in flight, the worse the duplication when a failure occurs.

Kafka can also batch more efficiently on the consumer side, thanks to its concept of partitions. Each partition is consumed by a single consumer, so even a large batch does not unbalance the load. With RabbitMQ, however, using the since-deprecated pull API to fetch batches of messages led to serious load imbalance and long processing delays. RabbitMQ was not designed for batch processing.

Durability

Log replication

For fault tolerance, Kafka has a master-slave architecture at the partition level: the leader partition is the master, and the replica partitions are the slaves, or followers. Each leader can have many followers. If the server hosting the leader goes down, a follower is promoted to leader, so there is only a brief service interruption, with no data loss.

Kafka has a concept called In-Sync Replicas (ISR). Each replica is either in sync or out of sync. In sync means it holds the same messages as the leader; a replica that falls behind becomes out of sync, for example because of network latency or host failure. Messages are lost only in one case: the leader's server goes down while all replicas are out of sync.

Message ACKs and offset tracking

Because of how Kafka stores messages and how consumers consume them, Kafka relies on message ACKs for producers and offset tracking for consumers.

Producer message ACKs

When a producer sends a message, it specifies what kind of ACK it expects from the broker:

No ACK required: fire and forget; corresponds to acks=0.

The leader partition has persisted the message; corresponds to acks=1.

The leader partition and all in-sync replicas have persisted the message; corresponds to acks=all.

Messages can be duplicated at publish time, just as in RabbitMQ: if the network fails or the broker goes down, the publisher receives no ACK and retransmits, even though in most such cases the message was already persisted by the leader and replicated.

However, Kafka has a nice deduplication feature (the idempotent producer), which requires the following settings:

enable.idempotence set to true

max.in.flight.requests.per.connection at most 5

retries set to 1 or higher

acks set to all
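Collected as a plain configuration map (the property names follow Kafka's producer configuration; the exact retries value shown is just an example, any value of at least 1 works):

```python
# Producer properties required for Kafka's idempotent (deduplicating) producer.
idempotent_producer_config = {
    "enable.idempotence": "true",
    "max.in.flight.requests.per.connection": 5,  # must not exceed 5
    "retries": 3,                                # any value >= 1
    "acks": "all",
}

for key, value in idempotent_producer_config.items():
    print(f"{key}={value}")
```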

With this configuration, if you instead tune for throughput, say by raising max.in.flight.requests.per.connection above 5 or setting acks to 0/1, you lose deduplication.

Consumer offset tracking

Consumers need to store their offset so that if they go down, another consumer can take over. Offsets are stored in ZooKeeper or in a Kafka topic.

Once a consumer has read a batch of messages from a partition, it has several options for updating the offset:

Commit immediately, before processing begins. This corresponds to at-most-once delivery: whether or not the consumer crashes, no message is processed twice. For example, with a batch of 10 messages, if the consumer crashes while processing the fifth, only the first four were processed; the rest are skipped, and the replacement consumer starts at the next batch.

Commit last, after all messages have been processed. This corresponds to at-least-once delivery: whether or not the consumer crashes, no message is lost, though some may be processed twice. For example, with a batch of 10 messages, if the consumer crashes on the fifth, the replacement consumer processes all 10 again.
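The difference between the two commit strategies can be simulated in a few lines (a toy model: the "crash" is just a loop break, and offsets are plain integers):

```python
def run_consumer(batch, commit_first, crash_at):
    """Simulate processing a batch where the consumer crashes before
    handling message index `crash_at`, then a replacement consumer
    resumes from the last committed offset."""
    processed = []
    committed = 0
    if commit_first:
        committed = len(batch)            # at-most-once: commit up front
    for i, msg in enumerate(batch):
        if i == crash_at:
            break                         # consumer crashes here
        processed.append(msg)
    else:
        committed = len(batch)            # at-least-once: commit after the batch
    processed.extend(batch[committed:])   # replacement consumer takes over
    return processed

batch = [f"msg{i}" for i in range(10)]
at_most_once = run_consumer(batch, commit_first=True, crash_at=4)
at_least_once = run_consumer(batch, commit_first=False, crash_at=4)
print(len(at_most_once))   # 4  -- messages 5..10 are lost
print(len(at_least_once))  # 14 -- messages 1..4 are processed twice
```

Commit-first drops everything after the crash point; commit-last replays the whole batch, which is exactly the loss-versus-duplication trade-off described above.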

Exactly-once semantics can only be guaranteed when using the Kafka Streams Java library; if you are on Java, I highly recommend it. The core difficulty with exactly-once is that the message processing and the offset update must complete in a single transaction. For example, if processing a message means sending an email, exactly-once is impossible: if we send the email and the consumer then crashes before the offset is updated, the email will be sent again.

A Kafka Streams Java application that processes messages and produces new messages to another topic can satisfy exactly-once semantics, because Kafka's transaction feature lets it write the output messages and update the offsets in one transaction.
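In configuration terms, a Streams application opts into this with a single property (shown here as a plain dict of the Java properties; the application.id and bootstrap address are made-up examples, and older Kafka releases used the value "exactly_once" instead of "exactly_once_v2"):

```python
# Kafka Streams properties enabling the transactional read-process-write path.
streams_config = {
    "application.id": "orders-enricher",        # hypothetical app id
    "bootstrap.servers": "localhost:9092",      # hypothetical broker address
    "processing.guarantee": "exactly_once_v2",  # default is "at_least_once"
}

print(streams_config["processing.guarantee"])
```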

On transactions and isolation levels

Kafka transactions mainly serve the read-process-write pattern. A transaction can span multiple topics and partitions. A producer opens a transaction, writes a batch of messages, and commits.

When consumers use the default read_uncommitted isolation level, they see all messages: committed, uncommitted, and aborted. With read_committed, they see neither uncommitted nor aborted messages.

You may wonder how the isolation level affects message ordering. The answer is that it doesn't: consumers still read messages in order, up to the Last Stable Offset (LSO).

Summary

RabbitMQ and Kafka are both reliable, durable messaging systems, so if reliability matters to you, you can rest assured that both deliver it. That said, Kafka has the edge: with its idempotent producer, and offsets that can be rolled back when consumer errors occur, messages are not lost.

Obviously no product is perfect, but as long as the application uses ACKs correctly, the administrator configures replication correctly, and your data center doesn't collapse, your messages will not be lost. Fault tolerance and availability are topics for a separate discussion.

Here are some simple conclusions:

Both provide at-least-once and at-most-once semantics.

Both provide replication.

Both face the same trade-off between duplicate messages and throughput. Kafka's idempotent producer helps, but only up to a point.

Both let you control the number of unACKed messages in flight.

Both guarantee ordering.

Kafka provides true transactions, mainly for read-process-write, though you need to watch the throughput cost.

With Kafka, you can roll back the offset when a consumer hits an error; RabbitMQ cannot.

Kafka can use batching to improve performance, thanks to its partition concept, while RabbitMQ is ill-suited to batching because of its push model and competing consumers.


Origin www.cnblogs.com/royfans/p/10960526.html