MQ classification and basic information

I haven't updated my blog for a long time. I have studied MQ before, and now I plan to systematically study message queues.
1. Introduction
1. MQ is a cross-process communication mechanism used to pass messages between upstream and downstream.
MQ is a very common message communication service that gives upstream and downstream both "logical decoupling" and "physical decoupling".
Upstream (message sender) -> MQ -> downstream (message receiver): this case uses MQ.
Upstream (caller) -> downstream (callee): this case does not use MQ.
Advantages and disadvantages:
2. Disadvantages:
1) The system is more complex, because there is an additional MQ component.
2) Communication takes longer: the message transmission path is longer, so latency increases.
3) Reliability and non-duplication of messages are hard to guarantee together: it is difficult to ensure that a message is neither lost nor duplicated.
4) The upstream cannot know the execution result of the downstream. For example, with MQ the upstream cannot know whether a login succeeded, so this case should use a direct call.
1. Use scenarios
A typical example is scheduled tasks that must run in a fixed order: task1, then task2, then task3. MQ can decouple them: task1 is a publisher, task2 is both a subscriber and a publisher, and task3 is a subscriber.
If the tasks are scheduled with cron instead, you have to reserve a time buffer between them, and that buffer is wasted. With MQ, each task starts as soon as the previous one finishes, so the execution order is guaranteed and no time is wasted. If the execution time of task1 changes, task2 and task3 do not need to be changed.
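The task chain above can be sketched in a few lines. This is only an in-process illustration: the two `queue.Queue` objects stand in for MQ topics, and the names `task1_done`, `task2_done`, and `log` are hypothetical, not part of any real MQ product.

```python
import queue
import threading

# Hypothetical in-process stand-ins for two MQ topics.
task1_done = queue.Queue()
task2_done = queue.Queue()
log = []

def task1():
    log.append("task1")
    task1_done.put("task1 finished")   # publish: task1 is only a publisher

def task2():
    task1_done.get()                   # subscribe: blocks until task1 publishes
    log.append("task2")
    task2_done.put("task2 finished")   # publish for task3

def task3():
    task2_done.get()                   # subscribe: blocks until task2 publishes
    log.append("task3")

# Start the tasks in reverse order to show that the execution order
# comes from the messages, not from the start time.
threads = [threading.Thread(target=t) for t in (task3, task2, task1)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(log)
```

No task needs to know how long the previous one takes, which is the point made about cron above.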
1. When not to use MQ?
When the upstream needs the execution result of the downstream in real time.
2. When to use MQ?
1) Data-driven task dependencies
2) The upstream does not care about the execution results of multiple downstreams
3) The downstream takes a long time to execute and returns its result asynchronously
2. MQ classification
RabbitMQ
RabbitMQ has good support for routing, load balancing, and data persistence.
Redis
On enqueue, Redis performs better than RabbitMQ when messages are small, but it becomes unbearably slow once the message size exceeds 10 KB. On dequeue, RabbitMQ performs much worse than Redis.
ZeroMQ
ZeroMQ is known as the fastest message queuing system and is especially suited to high-throughput scenarios. ZMQ can implement advanced and complex queues that RabbitMQ is not good at, but developers have to combine multiple technical frameworks themselves, and this technical complexity is a challenge to applying it successfully. ZeroMQ has a unique non-middleware model: you do not need to install and run a message server or middleware, because your application itself plays that role. All you need is a reference to the ZeroMQ library (which can be installed with NuGet, for example), and you can send messages between applications. However, ZeroMQ only provides non-persistent queues, which means that data is lost if the machine goes down. Twitter's Storm, for example, used ZeroMQ to transport its data streams.
ActiveMQ
ActiveMQ is a sub-project under Apache. Like ZeroMQ, it can implement queues using either broker or peer-to-peer technology. Like RabbitMQ, it can efficiently implement advanced application scenarios with a small amount of code. RabbitMQ, ZeroMQ, and ActiveMQ all support clients in many commonly used languages, such as C++, Java, .NET, Python, PHP, and Ruby.
Kafka
Kafka is a sub-project under Apache. It is a high-performance, cross-language, distributed publish/subscribe message queue system (Jafka was incubated on top of Kafka as an upgraded version of it). It has the following characteristics:
1) Fast persistence: messages can be persisted with O(1) system overhead.
2) High throughput: an ordinary server can reach a throughput of 100,000 messages per second.
3) A fully distributed system: Broker, Producer, and Consumer all natively support distribution, and load balancing is automatic.
4) Support for parallel data loading into Hadoop: for log data and offline analysis systems such as Hadoop that still have real-time processing requirements, this is a feasible solution.
Kafka unifies online and offline message processing through Hadoop's parallel loading mechanism, which is also important to the system studied in this topic. Compared with ActiveMQ, Apache Kafka is a very lightweight messaging system. In addition to its very good performance, it is also a distributed system that works well.
RocketMQ
RocketMQ was developed in-house by Alibaba.
3. Push Type
Scenario 1: Single sender, single receiver
Usage scenario: simple sending and receiving, with no special processing.
Scenario 2: Single sender, multiple receivers
Usage scenario: one sender and multiple receivers, such as distributed task dispatch. To make sure messages are not lost, they are persisted. In addition, to guard against a receiver going down while processing a message, the receiver sends the ACK only after it has finished processing the message.
Scenario 3: Publish/Subscribe
Usage scenario: publish/subscribe mode, in which the sender broadcasts a message and multiple receivers receive it.
Scenario 4: Routing (send and receive by routing key)
Usage scenario: the sender sends messages with a routing key, and different receivers receive messages according to different routing keys.
Scenario 5: Topics (send and receive by topic)
Usage scenario: the sender is not limited to a fixed routing key. Messages are also routed by string pattern matching, and the same applies to the receiver.
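To make the difference between a fixed routing key and topic matching concrete, here is a minimal sketch. The text above only says that a string "match" is used, so the wildcard rules are an assumption borrowed from AMQP-style topic exchanges: words are separated by dots, `*` matches exactly one word, and `#` matches zero or more words. The names `topic_matches`, `bindings`, and `route` are hypothetical.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if routing_key matches pattern.

    Assumed rules: '*' matches exactly one word, '#' matches zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i: int, j: int) -> bool:
        if i == len(p):
            return j == len(k)          # pattern used up: key must be used up too
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j < len(k) and (p[i] == "*" or p[i] == k[j]):
            return match(i + 1, j + 1)  # one word consumed on both sides
        return False

    return match(0, 0)

# Hypothetical bindings: pattern -> receiver name.
bindings = {
    "order.*": "billing",
    "order.#": "audit",
    "user.created": "mailer",   # no wildcard: behaves like Scenario 4 (fixed routing key)
}

def route(routing_key: str) -> list:
    """Return the receivers whose binding pattern matches the routing key."""
    return [rcv for pattern, rcv in bindings.items() if topic_matches(pattern, routing_key)]

print(route("order.created"))
print(route("order.created.eu"))
```

A binding without wildcards only matches its exact key, so Scenario 4 is the special case of Scenario 5 in which no pattern characters are used.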
4. Data Accuracy
1) Reachability:
Message delivery is divided into two halves: in the first half, the sender sends the message to MQ; in the second half, MQ sends the message to the receiver.
Messages may be lost in either half. To deal with this, MQ needs timeouts and retransmission.
Timeout and retransmission in the first half
If a message is lost or times out in the first half, the timer in MQ-client-sender resends it until the expected ACK (step 3) is received. If the ACK is still not received after N retransmissions, SendCallback reports that sending failed. Note that MQ-server may receive the same message several times during this process.
Timeout and retransmission in the second half
If a message is lost or times out in the second half, the timer in MQ-server retransmits it until delivery succeeds. The message may be retransmitted many times in this process. An exponential backoff strategy is generally used: the first retransmission happens after x seconds, and the wait grows exponentially with each further attempt. Note that MQ-client-receiver may also receive the same message several times during this process.
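The retransmission loop with exponential backoff can be sketched as follows. This is an illustration under assumptions, not the code of any particular MQ: `send_with_retry`, its parameters, and `flaky_send` are hypothetical names, the wait is assumed to double on each attempt, and `send` is assumed to return True when an ACK arrives before the timeout.

```python
import time

def send_with_retry(send, message, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send(message) until it is ACKed or the retries are exhausted.

    Returns (acked, delays), where delays lists the waits between attempts.
    """
    delays = []
    for attempt in range(max_retries + 1):
        if send(message):                      # True means the ACK was received
            return True, delays
        if attempt == max_retries:
            break                              # give up: report failure to the caller
        delay = base_delay * (2 ** attempt)    # x, 2x, 4x, ...
        delays.append(delay)
        sleep(delay)
    return False, delays

# Usage: a hypothetical sender whose ACK only arrives on the third attempt.
attempts = []

def flaky_send(msg):
    attempts.append(msg)
    return len(attempts) >= 3

# sleep is replaced by a no-op so the example runs instantly.
acked, delays = send_with_retry(flaky_send, "hello", sleep=lambda seconds: None)
print(acked, delays, len(attempts))
```

The receiving side saw "hello" three times here, which is exactly why the duplicate handling discussed next is needed.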
This raises two questions: how to deduplicate messages between MQ-client and MQ-server, and how to design for idempotency.
2) Idempotency:
First half:
1. The sender MQ-client sends the message to the server MQ-server.
2. The server MQ-server persists the message.
3. The server MQ-server returns an ACK to the sender MQ-client.
If step 3 is lost, the sender MQ-client resends the message after a timeout, which may cause the server MQ-server to receive duplicate messages.
In this case the retransmission is initiated by MQ-client and the message is processed by MQ-server. To avoid persisting duplicate messages in step 2, the MQ system must generate an inner-msg-id for each message as the basis for deduplication and idempotency. The characteristics of this internal message ID are:
(1) Globally unique
(2) Generated by MQ, unrelated to the business, and hidden from the message sender and the message receiver
With this inner-msg-id, you can guarantee that even when the first half is retransmitted, only one copy of the message is persisted in the DB of MQ-server, which achieves idempotency in the first half.
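A minimal sketch of first-half deduplication, under the assumption that the MQ client library generates the inner-msg-id once per message and reuses it on every retransmission. The classes `MQServer` and `MQClientSender`, the `simulate_lost_acks` parameter, and the dictionary that stands in for the server's DB are all hypothetical.

```python
import uuid

class MQServer:
    def __init__(self):
        self.store = {}  # inner-msg-id -> message body; stands in for the server's DB

    def receive(self, inner_msg_id, body):
        # Step 2: persist only the first copy; a retransmission with the same id is ignored.
        if inner_msg_id not in self.store:
            self.store[inner_msg_id] = body
        # Step 3: always ACK, so the sender stops retransmitting.
        return "ACK"

class MQClientSender:
    def __init__(self, server):
        self.server = server

    def send(self, body, simulate_lost_acks=0):
        # Generated inside the MQ client library, not by the business code (assumption).
        inner_msg_id = uuid.uuid4().hex
        for _ in range(simulate_lost_acks):
            # The message reaches the server, but the ACK is lost, so we retransmit.
            self.server.receive(inner_msg_id, body)
        return self.server.receive(inner_msg_id, body)

server = MQServer()
sender = MQClientSender(server)
sender.send("payment succeeded", simulate_lost_acks=2)  # delivered three times in total
print(len(server.store))
```

The business code never sees the inner-msg-id, which matches characteristic (2) above.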
Second half:
4. The server MQ-server sends the message to the receiver MQ-client.
5. The receiver MQ-client returns an ACK to the server MQ-server.
6. The server MQ-server deletes the persisted message.
It should be emphasized that the ACK returned by the receiver MQ-client to the server MQ-server must be an explicit call made by the message consumer's business code. It cannot be sent automatically by MQ-client, because the MQ system does not know when the consumer has actually finished consuming the message successfully.
If step 5 is lost, the server MQ-server resends the message after a timeout, which may cause MQ-client to receive duplicate messages.
In this case the retransmission is initiated by MQ-server and the message is processed by the message consumer's business code. Retransmission inevitably leads to repeated consumption on the business side (for example, one payment resulting in the card being issued more than once). To keep the business idempotent, the business message body must contain a biz-id as the basis for deduplication and idempotency. The characteristics of this business ID are:
(1) For the same business scenario, globally unique
(2) Generated by the business message sender, business-related, and transparent to MQ
(3) The business message consumer is responsible for deduplication to ensure idempotency
The most common business IDs are: payment ID, order ID, post ID, etc.
In the payment card purchase scenario, the sender must put the payment ID in the message body, and the consumer must deduplicate on the payment ID to make the card purchase idempotent.
With this business ID, even if the message consumer's business code receives duplicate messages in the second half, only one of them is actually consumed, which ensures idempotency.
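The consumer side of the card purchase scenario can be sketched like this. It is an in-memory illustration only: `processed_payment_ids`, `issued_cards`, and `handle_payment_message` are hypothetical names, and a real system would need a durable, atomic check (for example a unique constraint in the database) instead of a Python set.

```python
processed_payment_ids = set()   # stands in for a durable deduplication table
issued_cards = []               # stands in for the real side effect (issuing a card)

def handle_payment_message(message):
    """Business-side consumer. Returns True when the message may be ACKed (step 5)."""
    payment_id = message["payment_id"]        # the biz-id, set by the message sender
    if payment_id in processed_payment_ids:
        return True                           # duplicate: skip the side effect, still ACK
    issued_cards.append(f"card-for-{payment_id}")
    processed_payment_ids.add(payment_id)
    return True                               # ACK only after processing has finished

# MQ-server retransmits the same message because the first ACK was lost.
msg = {"payment_id": "pay-1001", "amount": 30}
handle_payment_message(msg)
handle_payment_message(msg)
print(issued_cards)
```

The card is issued once even though the message was delivered twice, and the ACK decision stays in the business code, as emphasized above.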
