Design problems in messaging architectures and how to deal with them

Overview

In microservice development, we often introduce message middleware to decouple services and run operations asynchronously. Let us look at the advantages and disadvantages of doing so.

First of all, message middleware brings many benefits. The three core ones are decoupling, asynchronous processing, and peak shaving.

  • Decoupling : The client only needs to send the request to a specific channel and does not need to know which instances receive it.
  • Asynchronous processing : Messages are written to the message queue, and non-essential business logic runs asynchronously, which speeds up the response.
  • Peak shaving : The message middleware buffers messages until they are consumed. The consumer can work through the queue at the rate its own concurrency allows, so a sudden burst of traffic does not overwhelm the service.

Of course, message middleware is not a silver bullet. Introducing messaging also brings some disadvantages:

  • Potential performance bottleneck : The message broker may become a performance bottleneck. Fortunately, the current mainstream messaging middleware all support a high degree of horizontal scaling.
  • Potential single point of failure : The message broker must be highly available, otherwise the reliability of the whole system suffers. Fortunately, most message middleware supports high availability.
  • Additional operational complexity : The messaging system is a separate component that must be installed, configured, and operated independently, which increases the complexity of operation and maintenance.

These shortcomings can be mitigated by the scaling and high-availability capabilities that the message middleware itself provides. To use message middleware well, however, we also need to pay attention to several design problems.

Handling concurrent and sequential messages

To improve message throughput in a production environment, a consumer is generally deployed as multiple instances. The challenge then is to ensure that each message is processed only once, and in the order in which it was sent.

For example, suppose three instances of the same receiver read messages from the same point-to-point channel, and the sender publishes three event messages in sequence: Order Created, Order Updated, and Order Cancelled. A naive messaging implementation might deliver each message to a different receiver concurrently. If network problems cause delays, the messages may not be processed in the order in which they were published, which leads to strange behavior: one service instance might process the Order Cancelled message before another instance has processed the Order Created message.

The solution used by Kafka is sharded (partitioned) channels. The overall solution has three parts:

  1. A topic channel is composed of multiple shards, and each shard behaves like a channel.
  2. The sender specifies a shard key, such as orderId, in the message header, and Kafka uses the shard key to assign the message to a specific shard.
  3. Multiple receiver instances are grouped together and treated as the same logical receiver (a consumer group). Kafka assigns each shard to a single receiver, and it redistributes the shards when receivers start up and shut down.

Kafka solution

As shown in the figure above, each Order event message has orderId as its shard key. Every event of a particular order is published to the same shard, and the messages in a shard are always read by the same receiver instance, so these messages are guaranteed to be processed in order.
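The shard assignment described above can be sketched as follows. This is a minimal illustration, not Kafka's implementation: the shard count, the function names, and the use of crc32 as the hash are all assumptions made for the example, and Kafka's own partitioner uses a different hash function.

```python
import zlib
from collections import defaultdict

NUM_SHARDS = 3  # hypothetical shard count for the topic


def shard_for(key: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a shard key to a shard index with a stable hash (crc32 is a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def publish(shards: dict, order_id: str, event: str) -> None:
    """Append the event to the shard chosen by its key, preserving arrival order."""
    shards[shard_for(order_id)].append((order_id, event))


shards = defaultdict(list)
for order_id in ("order-1", "order-2"):
    for event in ("Order Created", "Order Updated", "Order Cancelled"):
        publish(shards, order_id, event)

# Each shard is read by exactly one receiver instance, so the events of one
# order are seen in the order in which they were published.
for shard_id, messages in sorted(shards.items()):
    print(f"receiver for shard {shard_id}: {messages}")
```

Because the hash depends only on the key, all events for one orderId land in the same shard, while different orders can still be spread across shards and processed in parallel.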

Handling duplicate messages

Another challenge that must be solved when introducing a messaging architecture is dealing with duplicate messages. Ideally, the message broker would deliver each message exactly once, but guaranteeing exactly-once delivery is usually very expensive. Instead, many message brokers promise at-least-once delivery.

Under normal circumstances, the message broker delivers each message only once. But when the client, the network, or the broker fails, a message may be delivered multiple times. Suppose the client crashes after processing a message and updating its database, but before acknowledging the message. The broker will then redeliver the unacknowledged message to the client once it has restarted.

There are two different ways to deal with duplicate messages:

  • Write an idempotent message handler
  • Track messages and discard duplicates

Writing an idempotent message handler

If the application logic that processes messages is idempotent, then duplicate messages are harmless. Idempotence means that calling the application repeatedly with the same input produces no additional effect. For example, cancelling an order that is already cancelled is an idempotent operation. The same is true for creating an order that already exists. An idempotent message handler can safely be executed multiple times, as long as the message broker preserves message order when it redelivers messages.
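As a minimal sketch of the cancellation example, assuming an in-memory dictionary of orders and a hypothetical handler name (neither comes from a real framework):

```python
from dataclasses import dataclass


@dataclass
class Order:
    order_id: str
    status: str = "CREATED"


def handle_order_cancelled(orders: dict, order_id: str) -> None:
    """Set the order to CANCELLED; repeating the call leaves the state unchanged."""
    order = orders.get(order_id)
    if order is None or order.status == "CANCELLED":
        return  # nothing to do: unknown order or already cancelled
    order.status = "CANCELLED"


orders = {"order-1": Order("order-1")}
handle_order_cancelled(orders, "order-1")
handle_order_cancelled(orders, "order-1")  # redelivered duplicate, no extra effect
print(orders["order-1"].status)  # prints CANCELLED
```

The handler sets a target state rather than applying a change relative to the current state, which is what makes a repeated delivery harmless.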

Unfortunately, application logic is often not idempotent, or the message broker you are using may not preserve ordering when it redelivers messages. Duplicate or out-of-order messages may then cause errors. In this case, you need to write a message handler that tracks messages and discards duplicates.

Track messages and discard duplicate messages

Consider a message processing program that authorizes a consumer's credit card. It must perform only one credit card authorization operation for each order. This application will produce different effects every time it is called. If repeated messages cause the message handler to execute the logic multiple times, the behavior of the application will be incorrect. Message handlers that execute such application logic must make it idempotent by detecting and discarding duplicate messages.

A simple solution is for the message receiver to use the message id to track the messages it has processed and discard any duplicates. For example, it can store the message id of each message it consumes in a database table.

Duplicate messages

When the receiver processes a message, it records the message id in the database table as part of the same transaction that creates or updates the business entities. As shown in the figure above, the receiver inserts a row containing the message id into the PROCESSED_MESSAGE table. If the message is a duplicate, the INSERT fails and the receiver can discard the message.
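A minimal sketch of this approach, using an in-memory SQLite database. The PROCESSED_MESSAGE table comes from the text; the ORDERS table, the column names, and the handler signature are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PROCESSED_MESSAGE (message_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE ORDERS (order_id TEXT PRIMARY KEY, status TEXT)")
conn.commit()


def handle_message(conn, message_id: str, order_id: str, status: str) -> bool:
    """Apply the message once; return False if it was a duplicate."""
    try:
        with conn:  # one transaction: commits on success, rolls back on error
            conn.execute(
                "INSERT INTO PROCESSED_MESSAGE (message_id) VALUES (?)", (message_id,)
            )
            conn.execute(
                "INSERT OR REPLACE INTO ORDERS (order_id, status) VALUES (?, ?)",
                (order_id, status),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # primary-key violation: message already processed, discard


print(handle_message(conn, "msg-1", "order-1", "CREATED"))  # prints True
print(handle_message(conn, "msg-1", "order-1", "CREATED"))  # prints False
```

Because the message id insert and the business update share one transaction, a crash between the two cannot leave a message marked as processed without its effect, or the reverse.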

Another solution is for the message handler to record the message id in the application table instead of in a dedicated table. This approach is particularly useful with a NoSQL database that has a restricted transaction model, because such databases usually do not support updating two tables in a single database transaction.

Processing transactional messages

Services usually need to publish messages as part of a transaction that updates the database. The database update and the message send must both happen within the transaction; otherwise the service might update the database and then crash before sending the message.

If the service does not perform the two operations atomically, similar failures may leave the system in an inconsistent state.

Next, let's look at two common ways to guarantee transactional messaging, and then at the transactional message solution of RocketMQ, a modern messaging component.

Use database tables as message queues

If your application uses a relational database, you can apply the Transactional Outbox pattern directly to make the data update and the message send atomic.


This pattern uses a database table as a temporary message queue. As shown in the figure above, the service that sends messages has an OUTBOX table. When the service performs INSERT, UPDATE, or DELETE business operations, it also adds a message record to the OUTBOX table. Atomicity is guaranteed because both writes belong to the same local ACID transaction.

The OUTBOX table acts as a temporary message queue. We then introduce a message relay (MessageRelay) service, which reads data from the OUTBOX table and publishes messages to the message broker.

The message relay can be very simple. A scheduled task periodically pulls the latest unpublished rows from the OUTBOX table, sends them to the message broker, and finally deletes the published messages from the OUTBOX table.
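The outbox write and a single relay pass can be sketched as follows, again with an in-memory SQLite database. The OUTBOX table comes from the text; the ORDERS table, the column names, the JSON payload shape, and the function names are assumptions, and a plain Python list stands in for the message broker.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORDERS (order_id TEXT PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE OUTBOX (id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT NOT NULL)"
)
conn.commit()


def create_order(conn, order_id: str) -> None:
    """Write the business row and the outgoing message in one local transaction."""
    with conn:
        conn.execute("INSERT INTO ORDERS VALUES (?, ?)", (order_id, "CREATED"))
        payload = json.dumps({"type": "Order Created", "orderId": order_id})
        conn.execute("INSERT INTO OUTBOX (payload) VALUES (?)", (payload,))


def relay_once(conn, publish) -> int:
    """One polling pass: publish pending rows in id order, then delete them."""
    rows = conn.execute("SELECT id, payload FROM OUTBOX ORDER BY id").fetchall()
    for row_id, payload in rows:
        publish(payload)  # if this raises, the row stays and is retried next pass
        with conn:
            conn.execute("DELETE FROM OUTBOX WHERE id = ?", (row_id,))
    return len(rows)


published = []  # stand-in for the message broker
create_order(conn, "order-1")
relay_once(conn, published.append)
print(published)
```

Note that the relay publishes before it deletes, so a crash between the two steps causes the same row to be published again on the next pass. The relay therefore gives at-least-once delivery, and receivers still need the duplicate handling described earlier.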

Use the transaction log to publish events

Another way to guarantee transactional messaging is based on the database transaction log. This technique is called Change Data Capture, or CDC for short.

Databases generally record a transaction log when data changes, such as the MySQL binlog. The transaction log can be thought of as a file-based queue local to the database, which records changes to database tables in chronological order.

Here we use canal, Alibaba's open source component, together with MySQL to illustrate how this pattern works.

For more instructions, please refer to the official document: https://github.com/alibaba/canal

How canal works

  • Canal simulates the MySQL slave interaction protocol, pretends to be a MySQL slave node, and sends a dump request to the MySQL master;
  • The MySQL master receives the dump request and starts pushing the binary log to the slave (i.e. canal);
  • Canal parses the binary log objects (originally a byte stream) and can then send the parsed data directly to the message broker.
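The general log-tailing idea behind these steps can be sketched without any real database. The example below does not use canal's API or the MySQL binlog format: an append-only Python list stands in for the transaction log, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ChangeLog:
    """Stand-in for a transaction log: an append-only list of row changes."""
    entries: list = field(default_factory=list)

    def append(self, table: str, op: str, row: dict) -> None:
        self.entries.append({"table": table, "op": op, "row": row})


@dataclass
class LogTailer:
    """Reads entries after the last seen position and publishes them in order."""
    log: ChangeLog
    position: int = 0

    def poll(self, publish) -> int:
        new_entries = self.log.entries[self.position:]
        for entry in new_entries:
            publish(entry)
            self.position += 1  # advance only after a successful publish
        return len(new_entries)


log = ChangeLog()
log.append("ORDERS", "INSERT", {"order_id": "order-1", "status": "CREATED"})
log.append("ORDERS", "UPDATE", {"order_id": "order-1", "status": "CANCELLED"})

published = []  # stand-in for the message broker
tailer = LogTailer(log)
tailer.poll(published.append)
print([e["op"] for e in published])  # prints ['INSERT', 'UPDATE']
```

The application only writes to its tables; the tailer turns committed changes into messages, so no message is published for a change that was never committed.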

RocketMQ transaction message solution

Apache RocketMQ has supported distributed transaction messages since version 4.3.0. RocketMQ borrows the idea of 2PC to commit transaction messages, and adds compensation logic to handle second-phase timeouts or failures, as shown in the following figure.


RocketMQ implements transaction messages in two parts: the normal send-and-commit phase, and the compensation process.

The overall process is:

  • Normal transaction sending and commit phase
    1. The producer sends a half message to MQServer (a half message is a message that consumers cannot consume yet)
    2. The server responds with the write result, and the half message is sent successfully
    3. The producer starts executing the local transaction
    4. The producer performs a Commit or Rollback operation according to the execution status of the local transaction
  • Compensation process of transaction information
    1. If MQServer does not receive the execution status of the local transaction for a long time, it sends a check-back request to the producer
    2. After the producer receives the check-back request, it checks the execution status of the local transaction
    3. The producer performs a Commit or Rollback operation based on the result of the check

The compensation phase mainly handles the case where the producer's Commit or Rollback operation times out or fails.

When the producer sends transaction messages with RocketMQ, we can also borrow from the first solution: the producer maintains its own transaction log table and writes a transaction log record while executing the local transaction. The business update and the log write run in the same method annotated with @Transactional, which makes the two operations atomic.

In this way, if there is information about this local transaction in the transaction log table, it means that the local transaction is executed successfully and Commit is required. On the contrary, if there is no corresponding transaction log, it means that the execution was not successful and Rollback is required.
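The half-message flow and the transaction log check can be simulated as follows. This is a toy model under stated assumptions, not RocketMQ's client API: the Broker class, its methods, and the ORDERS and TRANSACTION_LOG table layouts are all hypothetical, and an in-memory SQLite database stands in for the producer's database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ORDERS (order_id TEXT PRIMARY KEY)")
conn.execute("CREATE TABLE TRANSACTION_LOG (tx_id TEXT PRIMARY KEY)")
conn.commit()


class Broker:
    """Toy broker: half messages stay invisible to consumers until committed."""

    def __init__(self):
        self.half = {}      # tx_id -> message body, not yet consumable
        self.visible = []   # committed messages, consumable

    def send_half(self, tx_id, body):
        self.half[tx_id] = body

    def commit(self, tx_id):
        self.visible.append(self.half.pop(tx_id))

    def rollback(self, tx_id):
        self.half.pop(tx_id, None)

    def check_back(self, check_local_transaction):
        """Compensation: resolve half messages whose outcome never arrived."""
        for tx_id in list(self.half):
            if check_local_transaction(tx_id):
                self.commit(tx_id)
            else:
                self.rollback(tx_id)


def run_local_transaction(conn, tx_id, order_id):
    """Business write and transaction-log write share one local transaction."""
    with conn:
        conn.execute("INSERT INTO ORDERS VALUES (?)", (order_id,))
        conn.execute("INSERT INTO TRANSACTION_LOG VALUES (?)", (tx_id,))


def local_transaction_committed(tx_id):
    """Check-back handler: a log row means the local transaction succeeded."""
    row = conn.execute(
        "SELECT 1 FROM TRANSACTION_LOG WHERE tx_id = ?", (tx_id,)
    ).fetchone()
    return row is not None


broker = Broker()
broker.send_half("tx-1", "Order Created: order-1")
run_local_transaction(conn, "tx-1", "order-1")
# Suppose the Commit request is lost here; the broker later checks back.
broker.send_half("tx-2", "Order Created: order-2")  # local transaction never ran
broker.check_back(local_transaction_committed)
print(broker.visible)  # prints ['Order Created: order-1']
```

In the simulation, the first message becomes visible because its transaction log row exists, while the second is rolled back because no row was ever written for it.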

Interested readers can read this article: RocketMQ Advanced: Transaction Messages


Origin blog.csdn.net/jianzhang11/article/details/109502906