The right way to get started with RocketMQ: after reading this, you can no longer say you don't know RocketMQ

Introduction to RocketMQ

Apache RocketMQ is a distributed messaging middleware with low latency, high concurrency, high availability, and high reliability. It provides asynchronous decoupling and peak shaving and valley filling for distributed application systems, along with the features that Internet applications need: massive message accumulation, high throughput, and reliable retry.

RocketMQ concepts

  • Topic: A message topic classifies one type of message. For example, an order topic carries all order-related messages, and producers send messages to that topic.
  • Producer: The role that produces messages and sends them to a Topic.
  • Consumer: The role that receives and consumes messages from a Topic.
  • Message: The content that a producer sends to a Topic and that consumers consume.
  • Message attributes: Business-related attributes that the producer can set on a message when sending it, such as Message Key and Tag.
  • Group: A class of producers or consumers that usually produce or consume the same type of message and share the same publishing or subscription logic.

Why use RocketMQ?

Asynchronous decoupling

As microservice architecture has become popular, sorting out the relationships between services has become very important. Asynchronous decoupling reduces the coupling between services and also increases their throughput.

Many business scenarios use asynchronous decoupling, and they differ from industry to industry, so I will use a common business as an example that everyone can follow.

For example, in the business scenario of placing an order in the e-commerce industry, the simplest ordering process is as follows:

  1. Lock inventory
  2. Create order
  3. User pays
  4. Deduct inventory
  5. Send the user an SMS notification of the purchase
  6. Add points for the user
  7. Notify the merchant to ship

After the order is placed successfully, the user pays. When the payment completes, a piece of logic called the payment callback runs, and some business logic has to be done in that callback. First, consider how long the callback takes when every step runs synchronously.

Steps 3 to 5 of the order process above can be handled asynchronously. Once the payment is complete, the user does not need to care about the subsequent steps; processing them in the background is enough. This takes those three steps out of the callback and shortens its processing time.
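
The idea of moving follow-up work out of the callback can be sketched in a few lines of Python. This is only an illustration under assumptions: an in-process queue.Queue stands in for a RocketMQ topic, and the names payment_callback, follow_up_worker, and the three follow-up steps are hypothetical choices, not taken from a real system.

```python
import queue
import threading

# In-process stand-in for a message topic; a real system would publish to RocketMQ.
payment_events = queue.Queue()
handled = []  # records which follow-up steps ran, for demonstration


def payment_callback(order_id):
    """Do only the work the user must wait for, then hand the rest to the queue."""
    # Hypothetical synchronous step: mark the order as paid.
    order_status = {"order_id": order_id, "status": "PAID"}
    # Publish one event; the slow follow-up steps happen later in the background.
    payment_events.put({"type": "ORDER_PAID", "order_id": order_id})
    return order_status


def follow_up_worker():
    """Background consumer that performs the slow follow-up steps."""
    while True:
        event = payment_events.get()
        if event is None:  # sentinel: stop the worker
            break
        for step in ("send_sms", "add_points", "notify_merchant"):
            handled.append((step, event["order_id"]))


worker = threading.Thread(target=follow_up_worker)
worker.start()
result = payment_callback("order-1001")  # returns without waiting for the follow-up
payment_events.put(None)
worker.join()
```

The callback returns as soon as the event is queued, so its latency no longer depends on how long the follow-up steps take.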

Peak shaving and valley filling

Peak shaving and valley filling means that RocketMQ absorbs sudden bursts of heavy traffic, which protects the stability of the system and improves the user experience.

In the e-commerce industry, the most common traffic burst is the flash sale. Implementing a complete flash sale business on RocketMQ still takes a lot of work and is beyond the scope of this article; I may cover it separately later. The point here is that scenarios like this can use RocketMQ to handle high concurrency, provided that the business scenario supports asynchronous processing.
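
To make the buffering effect concrete, here is a small simulation. It is a sketch, not a model of RocketMQ itself: simulate_peak_shaving is a hypothetical helper, and the burst size and per-tick capacity are made-up numbers.

```python
from collections import deque


def simulate_peak_shaving(arrivals_per_tick, capacity_per_tick):
    """Return (processed per tick, backlog per tick) when a queue buffers a burst."""
    backlog = deque()
    processed_per_tick = []
    backlog_sizes = []
    next_id = 0
    for arrivals in arrivals_per_tick:
        # Producers enqueue as fast as requests arrive.
        for _ in range(arrivals):
            backlog.append(next_id)
            next_id += 1
        # The consumer takes at most `capacity_per_tick` messages per tick.
        processed = min(capacity_per_tick, len(backlog))
        for _ in range(processed):
            backlog.popleft()
        processed_per_tick.append(processed)
        backlog_sizes.append(len(backlog))
    return processed_per_tick, backlog_sizes


# A burst of 1000 requests in the first tick, then no new traffic.
processed, backlog = simulate_peak_shaving([1000, 0, 0, 0, 0], capacity_per_tick=300)
```

The downstream system never handles more than 300 requests per tick; the queue holds the rest of the burst and drains it during the quiet ticks that follow.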

Eventual consistency of distributed transactions

As we all know, distributed transactions can be handled with 2PC, TCC, eventual consistency, and other solutions. Among them, eventual consistency built on a message queue is the more commonly used.

In e-commerce, the core transaction-related business must keep data consistent. RocketMQ's distributed transaction messages decouple the systems while ensuring eventual data consistency.

Data distribution

Data distribution means delivering the original data to the multiple systems that need it, producing heterogeneous copies of the data. The most common case is distributing data to Elasticsearch (ES) and Redis to provide services such as search and caching for the business.

Besides distributing data manually through messages, you can also subscribe to MySQL's binlog and distribute from it. In this scenario, you need RocketMQ's sequential messages to keep the data consistent.

RocketMQ architecture

Image source: Alibaba Cloud official documentation

  • Name Server: An almost stateless node that can be deployed as a cluster. It provides the naming service in RocketMQ, handling Broker registration updates and discovery. In other words, it is a registry.
  • Broker: The message relay role, responsible for storing and forwarding messages. Brokers are divided into Master Brokers and Slave Brokers. One Master Broker can have multiple Slave Brokers, but a Slave Broker can belong to only one Master Broker. After a Broker starts, it registers itself with the Name Server, and then reports Topic routing information to the Name Server every 30s.
  • Producer: Establishes a long-lived (keep-alive) connection with a randomly chosen node in the Name Server cluster and periodically reads Topic routing information from it. It also establishes a long-lived connection to the Master Broker that serves the Topic and periodically sends heartbeats to that Master Broker.
  • Consumer: Establishes a long-lived connection with a randomly chosen node in the Name Server cluster and periodically pulls Topic routing information from it. It also establishes long-lived connections to the Master Broker and Slave Brokers that serve the Topic and periodically sends heartbeats to them. Consumers can subscribe to messages from either a Master Broker or a Slave Broker; the subscription rules are determined by the Broker configuration.
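
The registry role of the Name Server can be illustrated with a toy model. This is a sketch under assumptions: the NameServer class, its method names, and the 120-second expiry are hypothetical; only the 30s reporting interval comes from the description above.

```python
import time


class NameServer:
    """Toy registry: topic -> {broker_name: time of last report}."""

    def __init__(self, broker_timeout_seconds=120.0):
        self.routes = {}
        self.broker_timeout_seconds = broker_timeout_seconds  # assumed value

    def register(self, broker_name, topics, now=None):
        """Called by a broker at startup and again on every periodic report."""
        now = time.time() if now is None else now
        for topic in topics:
            self.routes.setdefault(topic, {})[broker_name] = now

    def lookup(self, topic, now=None):
        """Called by producers and consumers to find brokers serving a topic."""
        now = time.time() if now is None else now
        brokers = self.routes.get(topic, {})
        return sorted(
            name for name, seen in brokers.items()
            if now - seen <= self.broker_timeout_seconds
        )


name_server = NameServer(broker_timeout_seconds=120.0)
name_server.register("broker-a", ["order_topic"], now=0.0)
name_server.register("broker-b", ["order_topic"], now=0.0)
routes_early = name_server.lookup("order_topic", now=10.0)

# broker-a keeps reporting every 30s; broker-b stops after its first report.
for t in (30.0, 60.0, 90.0, 120.0, 150.0):
    name_server.register("broker-a", ["order_topic"], now=t)
routes_late = name_server.lookup("order_topic", now=150.0)
```

Clients that refresh their routes periodically stop seeing broker-b once its reports go stale, which is why the registry itself can stay almost stateless.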

RocketMQ message types

RocketMQ supports rich message types, which can meet the business needs of multiple scenarios. Different messages have different application scenarios. Here are four commonly used message types.

Normal messages

Normal messages are messages with no special characteristics in RocketMQ. When there is no special business scenario, normal messages are sufficient. For special scenarios, use a special message type, such as sequential or transaction messages.

Synchronous send

Synchronous sending: The sender sends a message and blocks until the server returns the result.

Asynchronous send

Asynchronous sending: The sender sends a message and can send the next one without waiting for the server to return the result. The sender receives the server's response through a callback interface and handles the result there.

One-way send

One-way sending: The sender only sends the message and does not wait for any response from the server. Sending is very fast, but there is a risk of losing messages.
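
The difference between the three send modes is easiest to see side by side. The following is a sketch with a fake in-memory broker: FakeBroker, SketchProducer, and their methods are hypothetical names for illustration and are not the RocketMQ client API.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class FakeBroker:
    """Hypothetical stand-in for the server; it just stores messages."""

    def __init__(self):
        self._lock = threading.Lock()
        self.stored = []

    def store(self, message):
        with self._lock:
            self.stored.append(message)
            return {"status": "SEND_OK", "offset": len(self.stored) - 1}


class SketchProducer:
    def __init__(self, broker):
        self.broker = broker
        self.pool = ThreadPoolExecutor(max_workers=1)

    def send_sync(self, message):
        """Block until the broker returns a result."""
        return self.broker.store(message)

    def send_async(self, message, on_result):
        """Return immediately; the result arrives later through the callback."""
        future = self.pool.submit(self.broker.store, message)
        future.add_done_callback(lambda f: on_result(f.result()))
        return future

    def send_oneway(self, message):
        """Fire and forget: no result and no callback, so failures go unnoticed."""
        self.pool.submit(self.broker.store, message)

    def close(self):
        self.pool.shutdown(wait=True)


broker = FakeBroker()
producer = SketchProducer(broker)
async_results = []

sync_result = producer.send_sync("order created")
producer.send_async("order paid", async_results.append)
oneway_result = producer.send_oneway("access log")
producer.close()  # wait for the background sends to finish
```

Only the synchronous and asynchronous modes give the sender a result to act on; the one-way mode trades that feedback for speed.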

Sequential message

Sequential messaging means that producers publish messages in a given order and consumers receive them in that same order: a message published first is received first.

For example, in the data distribution scenario, suppose we subscribe to MySQL's binlog to build heterogeneous copies of the data. If the messages arrive out of order, the copied data becomes wrong.

Say a row with id=1 is inserted and then deleted immediately. This produces two messages. Consumed in the correct order (insert, then delete), the row ends up absent. Consumed out of order (delete, then insert), the row is still present at the end, which makes the copy inconsistent with the source.
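
One common way to keep per-row order is to route every message for the same key to the same queue and consume each queue first-in, first-out. The sketch below shows that idea only; select_queue, the queue count of 4, and the event format are assumptions made for illustration.

```python
import zlib
from collections import defaultdict

QUEUE_COUNT = 4  # assumed number of queues in the topic


def select_queue(sharding_key, queue_count=QUEUE_COUNT):
    """Map a key to a queue deterministically, so one key always hits one queue."""
    return zlib.crc32(sharding_key.encode("utf-8")) % queue_count


queues = defaultdict(list)
binlog_events = [
    {"row_id": "1", "op": "INSERT"},
    {"row_id": "2", "op": "INSERT"},
    {"row_id": "1", "op": "DELETE"},
]
for event in binlog_events:
    queues[select_queue(event["row_id"])].append(event)

# Each queue is consumed FIFO by a single consumer, so per-row order is kept.
replica = {}
for queue_id in sorted(queues):
    for event in queues[queue_id]:
        if event["op"] == "INSERT":
            replica[event["row_id"]] = "present"
        elif event["op"] == "DELETE":
            replica.pop(event["row_id"], None)
```

Order is guaranteed only among messages that share a key, which is usually what the business needs and leaves different keys free to be processed in parallel.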

Timed message

A timed message is delivered at a scheduled time. After the message is sent to the server, it is not delivered to consumers immediately; it is delivered only when the time specified in the message is reached.

Delayed messages are also timed messages. A timed message is scheduled for a specific point in time, such as 2020-11-11 12:00:00.

A delayed message is scheduled relative to the time it is sent. For example, if the message is sent successfully at 2020-09-10 12:00:00 with a delay of 10 minutes, it is delivered to consumers at 2020-09-10 12:10:00.

Timed messages suit scenarios such as automatically cancelling an order that has not been paid before a timeout.
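
The order-timeout scenario can be sketched with a small in-memory scheduler. This is an illustration under assumptions: DelayQueue and its methods are hypothetical, time is passed in explicitly as seconds, and the 10-minute timeout mirrors the example above.

```python
import heapq


class DelayQueue:
    """Toy scheduler: messages become visible only at their delivery time."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # tie-breaker so equal times keep insertion order

    def send_delayed(self, message, now, delay_seconds):
        heapq.heappush(self._heap, (now + delay_seconds, self._seq, message))
        self._seq += 1

    def poll(self, now):
        """Return every message whose delivery time has been reached."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[2])
        return due


orders = {"order-1": "UNPAID", "order-2": "UNPAID"}
delay_queue = DelayQueue()
PAYMENT_TIMEOUT = 600  # 10 minutes, in seconds

delay_queue.send_delayed({"order_id": "order-1"}, now=0, delay_seconds=PAYMENT_TIMEOUT)
delay_queue.send_delayed({"order_id": "order-2"}, now=0, delay_seconds=PAYMENT_TIMEOUT)
orders["order-2"] = "PAID"  # the user pays order-2 in time

delivered_early = delay_queue.poll(now=599)
for message in delay_queue.poll(now=600):
    # Cancel only if the order is still unpaid when the check arrives.
    if orders[message["order_id"]] == "UNPAID":
        orders[message["order_id"]] = "CANCELLED"
```

The consumer re-checks the order status when the message arrives, so an order that was paid in the meantime is left alone.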

Transaction message

RocketMQ provides a distributed transaction feature similar to X/Open XA. RocketMQ transaction messages make it possible to achieve eventual consistency for distributed transactions.

Interaction flow:

Image source: Alibaba Cloud official documentation

  1. The sender first sends a half message (a semi-transactional message) to the RocketMQ server.
  2. After the RocketMQ server receives and persists the message, it returns an Ack to the sender to confirm that the message was sent successfully. At this point the message is a half message and is not delivered to consumers.
  3. After receiving the Ack for the half message, the sender executes its local transaction logic.
  4. The sender submits a second confirmation to the server based on the result of the local transaction: Commit if the local transaction succeeded, Rollback if it failed. On Commit, the server marks the half message as deliverable and consumers eventually receive it. On Rollback, the server deletes the half message and consumers never receive it.
  5. If something unexpected happens and the second confirmation from step 4 never reaches the server, the server initiates a back-check for the message after waiting a fixed time.
  6. On receiving the back-check, the sender checks the final result of the local transaction for that message and submits the second confirmation again according to that result. The server then handles the half message as in step 4.
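
The six steps above can be walked through with a toy in-memory model. This is a sketch of the flow only, not RocketMQ's implementation: TransactionBroker, run_local_transaction, and check_local_transaction are hypothetical names, and the back-check is triggered by hand instead of by a timer.

```python
class TransactionBroker:
    """Toy broker that holds half messages until commit or rollback."""

    def __init__(self):
        self.half = {}         # msg_id -> body, invisible to consumers
        self.deliverable = []  # bodies that consumers may receive

    def send_half(self, msg_id, body):
        self.half[msg_id] = body
        return "ACK"  # step 2: persisted, but not deliverable yet

    def end_transaction(self, msg_id, state):
        # Step 4: second confirmation from the sender.
        if msg_id not in self.half:
            return
        if state == "COMMIT":
            self.deliverable.append(self.half.pop(msg_id))
        elif state == "ROLLBACK":
            del self.half[msg_id]
        # Any other state leaves the half message pending.

    def check_back(self, checker):
        # Step 5: ask the sender about half messages that were never confirmed.
        for msg_id in list(self.half):
            self.end_transaction(msg_id, checker(msg_id))  # step 6


local_db = {}  # stands in for the sender's local database


def run_local_transaction(order_id, succeed):
    if succeed:
        local_db[order_id] = "PAID"
    return succeed


def check_local_transaction(msg_id):
    # The back-check inspects the final result of the local transaction.
    return "COMMIT" if local_db.get(msg_id) == "PAID" else "ROLLBACK"


broker = TransactionBroker()

# Normal path: half message, local transaction, commit.
broker.send_half("order-1", "order-1 paid")
ok = run_local_transaction("order-1", succeed=True)
broker.end_transaction("order-1", "COMMIT" if ok else "ROLLBACK")

# Failure path: the local transaction fails, so the message is rolled back.
broker.send_half("order-2", "order-2 paid")
ok = run_local_transaction("order-2", succeed=False)
broker.end_transaction("order-2", "COMMIT" if ok else "ROLLBACK")

# Lost confirmation: the sender crashes after the local transaction succeeds.
broker.send_half("order-3", "order-3 paid")
run_local_transaction("order-3", succeed=True)
pending_before_check = list(broker.half)
broker.check_back(check_local_transaction)
```

In the third case the message would stay invisible forever without the back-check; with it, the broker learns the outcome from the sender's own data and commits.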

Best Practices

Message retry

When a consumer fails to consume a message, the RocketMQ server redelivers it until the consumer consumes it successfully. There is a limit on the number of retries, 16 by default.

Message retry helps ensure, to a certain extent, that messages are not lost: a message that fails at first is eventually consumed through retries. Note that a consumer must wait for its local business logic to succeed before sending the ACK (consumption confirmation). Otherwise the ACK may already have been sent when the business logic fails, and the message will not be redelivered.

If your consumption logic runs asynchronously, convert it to a synchronous wait and send the ACK only after the asynchronous operation completes. For details, see an article I wrote earlier: https://mp.weixin.qq.com/s/Bbh1GDpmkLhZhw5f0POJ2A.

Finally, set up monitoring. If a message still fails after 4 or 5 retries, the remaining retries will most likely fail too. At that point developers need to be notified so they can step in manually. Alternatively, monitor the dead letter queue directly.
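
The retry, alert, and dead letter behavior described above can be sketched as a single loop. This is a simplification under assumptions: deliver_with_retry and make_flaky_handler are hypothetical helpers, the alert threshold of 5 is an arbitrary choice, and the waiting time between retries is not modeled.

```python
MAX_RETRIES = 16  # default retry limit mentioned above
ALERT_AFTER = 5   # assumed threshold for notifying a developer


def deliver_with_retry(message, handler, max_retries=MAX_RETRIES):
    """Deliver once, then redeliver up to `max_retries` times on failure."""
    alerts = []
    dead_letter_queue = []
    for attempt in range(max_retries + 1):  # attempt 0 is the first delivery
        try:
            handler(message)
            # ACK only after the business logic has succeeded.
            return {"acked": True, "retries": attempt,
                    "alerts": alerts, "dlq": dead_letter_queue}
        except Exception as error:
            if attempt == ALERT_AFTER:
                alerts.append(
                    f"{message}: still failing after {attempt} retries ({error})"
                )
    dead_letter_queue.append(message)
    return {"acked": False, "retries": max_retries,
            "alerts": alerts, "dlq": dead_letter_queue}


def make_flaky_handler(failures_before_success):
    """Build a handler that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def handler(message):
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise RuntimeError("downstream unavailable")

    return handler


recovered = deliver_with_retry("msg-1", make_flaky_handler(2))
poisoned = deliver_with_retry("msg-2", make_flaky_handler(10**6))
```

A transient failure recovers on its own after a couple of retries, while a message that can never succeed raises one alert and ends up in the dead letter queue for manual handling.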

Message filtering

A message topic is generally used to classify one type of message, for example the order topic. But the messages under the order topic fall into many types, such as order creation and order cancellation.

Different message types need different business handling. We could define a uniform message format and use a field to distinguish message types and branch into different business logic. The drawback is that every message is pushed to the consumer, so consumers cannot consume on demand.

In RocketMQ, you can assign tags to messages and distinguish message types by tag. Consumers can have messages filtered on the RocketMQ server by tag, so that they only consume the message types they care about.
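
Tag filtering can be pictured as a simple match between a message's tag and the consumer's subscription expression. The sketch below assumes the familiar expression style where "*" means all tags and "||" separates alternatives; parse_subscription and filter_by_tag are hypothetical helpers that run in memory, whereas the real filtering happens on the server.

```python
def parse_subscription(expression):
    """Parse a tag expression such as "*" or "create || cancel"."""
    expression = expression.strip()
    if expression == "*":
        return None  # None means "accept every tag"
    return {tag.strip() for tag in expression.split("||") if tag.strip()}


def filter_by_tag(messages, expression):
    """Keep only the messages whose tag matches the subscription."""
    wanted = parse_subscription(expression)
    return [m for m in messages if wanted is None or m["tag"] in wanted]


order_topic = [
    {"tag": "create", "body": "order-1 created"},
    {"tag": "cancel", "body": "order-1 cancelled"},
    {"tag": "refund", "body": "order-2 refunded"},
]

create_only = filter_by_tag(order_topic, "create")
create_or_cancel = filter_by_tag(order_topic, "create || cancel")
everything = filter_by_tag(order_topic, "*")
```

Because the match is done before delivery, a consumer that subscribes only to "create" never has to receive and discard the other message types.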

I once came across a case of tags being used incorrectly: there was only one MQ instance, and tags were used to distinguish environments. All messages went into one topic; the test environment consumed messages tagged for the test environment, and the production consumers consumed messages tagged for production.

The first problem with this approach is that messages are not isolated: production and test messages are mixed together. The second is that tags are tied up distinguishing environments and can no longer distinguish message types. As a result, the only option is to create multiple topics to carry the different business message types.

Consumption modes

RocketMQ has two consumption modes: cluster consumption and broadcast consumption.

Cluster consumption:

Consumers are deployed as multiple instances, which we call a cluster. In cluster consumption, each message is consumed by only one of the instances.

This suits most business scenarios, where a message should be consumed only once and by only one consumer. Take the payment callback scenario: if a message were consumed by several instances at the same time, they would all modify the order status and deduct inventory simultaneously.

Broadcast consumption:

In broadcast consumption, every instance in the cluster consumes each message once.

For example, suppose we use a local cache. When the data changes, the local cache on every node must be refreshed, so every node needs to receive the message.
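
The two modes differ only in who receives each message, which a short sketch can show. The dispatch function is hypothetical, and the round-robin assignment used for cluster mode is an assumption for illustration, not how RocketMQ actually allocates messages to instances.

```python
from itertools import cycle


def dispatch(messages, instances, mode):
    """Return {instance: [messages]} for the given consumption mode."""
    received = {instance: [] for instance in instances}
    if mode == "cluster":
        # Each message goes to exactly one instance of the group.
        for message, instance in zip(messages, cycle(instances)):
            received[instance].append(message)
    elif mode == "broadcast":
        # Every instance gets its own copy of every message.
        for instance in instances:
            received[instance].extend(messages)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return received


instances = ["node-1", "node-2", "node-3"]
messages = ["m1", "m2", "m3", "m4"]

clustered = dispatch(messages, instances, "cluster")
broadcast = dispatch(messages, instances, "broadcast")
```

In cluster mode the four messages are split across the three nodes, while in broadcast mode each node sees all four, which is what a local cache refresh needs.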

Consumption idempotence

Idempotence is an issue in both API requests and message consumption. Processing the same message more than once must not cause repeated effects. This has to be guaranteed on the consumer side, because we can guarantee neither that the sender will not send a message more than once nor that a message will not be delivered more than once.

RocketMQ's Exactly-Once delivery semantics are designed to solve the idempotence problem. Exactly-Once means that a message sent to the messaging system is processed by the consumer once and only once. Even if a producer's send retry causes a message to be delivered repeatedly, the message is consumed only once on the consumer side.

Even so, the best approach to idempotence still relies on a unique business identifier. Although every message has a MessageId, using the MessageId for idempotence checks is not recommended. When sending a message, you can set a MessageKey on it and use that MessageKey as the unique business identifier.

I won't go into detail here on how to implement idempotence. You can refer to an article I wrote earlier, a general idempotence implementation scheme: https://mp.weixin.qq.com/s/9fhqnbeXPz7-7x0Eadd8DA.
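
As a rough illustration of deduplicating on a business key, consider the sketch below. It is not the scheme from the linked article: IdempotentConsumer is a hypothetical class, and the in-memory set stands in for what would normally be a durable store with a unique constraint.

```python
class IdempotentConsumer:
    """Skip messages whose business key has already been processed."""

    def __init__(self):
        # In production this would be a durable store with a unique constraint;
        # an in-memory set is used here only to keep the sketch self-contained.
        self.processed_keys = set()
        self.points = {}

    def on_message(self, message):
        key = message["key"]  # business identifier, e.g. an order number
        if key in self.processed_keys:
            return "DUPLICATE_SKIPPED"
        user = message["user"]
        self.points[user] = self.points.get(user, 0) + message["points"]
        self.processed_keys.add(key)
        return "PROCESSED"


consumer = IdempotentConsumer()
message = {"key": "order-1001", "user": "alice", "points": 10}
first = consumer.on_message(message)
second = consumer.on_message(message)  # redelivery of the same message
```

The second delivery is recognized by its key and skipped, so the user's points are added only once no matter how many times the message arrives.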

Local transaction message encapsulation

Transaction messages were introduced above. RocketMQ's transaction messages use a two-phase commit approach, combined with the back-check mechanism, to ensure eventual consistency.

From a usage point of view, every business scenario has to implement its own back-check logic, which is a bit tedious.

Another frequently used approach is the local transaction message. The local message table was originally proposed by eBay. It requires a message table in the database of the service concerned. When "sending" a message, the message is not actually sent to MQ; instead, a row is inserted into the message table.

The insert runs in the same transaction as the local business logic. If the local transaction succeeds, the message row is persisted and will later be sent to MQ. If the local transaction fails, the message row is rolled back with it.

Then a dedicated program pulls the unsent messages from the message table and delivers them to MQ. If delivery fails, it keeps retrying until delivery succeeds or someone intervenes manually.
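
A minimal sketch of the local message table, using an in-memory SQLite database, might look like this. The table layout, the status values, and the functions create_order and relay_unsent are assumptions for illustration, and a plain list stands in for MQ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE message (id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " body TEXT, status TEXT)"
)
conn.commit()


def create_order(order_id, fail=False):
    """Write the business row and the message row in one local transaction."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute("INSERT INTO orders VALUES (?, 'CREATED')", (order_id,))
            conn.execute(
                "INSERT INTO message (body, status) VALUES (?, 'UNSENT')",
                (f"order {order_id} created",),
            )
            if fail:
                raise RuntimeError("business logic failed")
        return True
    except RuntimeError:
        return False


def relay_unsent(send):
    """Poll unsent rows and deliver them; failed rows stay UNSENT for retry."""
    rows = conn.execute(
        "SELECT id, body FROM message WHERE status = 'UNSENT' ORDER BY id"
    ).fetchall()
    for message_id, body in rows:
        if send(body):
            with conn:
                conn.execute(
                    "UPDATE message SET status = 'SENT' WHERE id = ?",
                    (message_id,),
                )


sent_to_mq = []  # stands in for the real MQ


def send(body):
    sent_to_mq.append(body)
    return True


create_order("A1")
create_order("A2", fail=True)  # rolled back: no order row, no message row
relay_unsent(send)
statuses = conn.execute("SELECT body, status FROM message").fetchall()
order_ids = conn.execute("SELECT id FROM orders").fetchall()
```

Because the order row and the message row share one transaction, there is never an order without its message or a message without its order; the relay only has to worry about delivery.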

Writing the message to the message table and then retrying the send to MQ until it succeeds is safe. But suppose that after MQ receives the message, the Broker goes down while the message is still in the PageCache: the message is lost. You can of course use synchronous disk flushing to avoid this loss. But if we flush to disk asynchronously, is there a way to make sure the message is not lost?

As mentioned earlier, RocketMQ transaction messages have a back-check mechanism. The message table approach also needs a mechanism to confirm that a message has been consumed; until that is confirmed, the message has to be resent.

The message table needs a field that records the current status of each message, such as unsent, sent, and consumed. A message in the unsent status is sent to MQ, and when the send succeeds its status becomes sent. If the status is still sent a few minutes later, we need to take action.

In this situation, one possibility is that consumers cannot keep up with the rate of production, so messages have piled up and this one has not been consumed yet. Another possibility is that the message was lost.

You can fetch the message accumulation data to determine whether messages have piled up. If they have not, resend the message to MQ, and repeat until the message is consumed.

The question is: how do I know that a message has been consumed?

The cloud service I use has an Open API for querying the message trace directly. The open source version should have something similar. I have not studied it carefully, but it should be much like the commercial version.

From the message trace you can tell whether the message has been consumed, and the process ends there. If sending to MQ fails, the send is retried. If the message is not consumed for a long time, it is resent. Even if it finally lands in the dead letter queue, monitoring of that queue allows manual intervention. Eventual consistency is therefore achieved.

Compared with the built-in transaction messages, the local message table does not require back-check logic, but it has its own overhead: an extra message table, plus the sending and checking logic around it. In particular, when the message volume is large, sending the rows in the message table quickly takes considerable engineering. Simple polling of the table does not scale to large volumes.

Either approach works, as long as it achieves the goal.


Origin blog.csdn.net/weixin_50205273/article/details/108598812