How do RocketMQ and Kafka implement transaction messages?

When it comes to transactions, I believe everyone is familiar with them. ACID is the first thing that comes to mind.

We usually understand a transaction as a group of update operations that either all succeed or all fail, with no intermediate state. ACID is the strict definition of a transaction, but even a single system generally does not follow the ACID constraints strictly when implementing transactions, let alone a distributed system.

Distributed systems usually have to settle for eventual consistency, which guarantees that the data ends up complete and consistent. The main reason is that strong consistency is simply not affordable, because availability is king.

A full implementation of transactions is also very expensive. You have to keep the data of many systems in step, no intermediate state may be readable, and all operations must be indivisible. This means that executing a transaction is blocking and resources stay locked for a long time.

Under high concurrency, holding resources for a long time is fatal. To give a pungent example: think of the queue for the toilet at rush hour, and you will understand.

By the way, if you are still not clear about what ACID is, go and look it up now. I won't go into it here.


Distributed transaction

When it comes to distributed transactions, the common approaches are 2PC, TCC, and transaction messages. This article focuses on transaction messages, but I will briefly touch on 2PC and TCC.

 

2PC

2PC is two-phase commit. It has two roles, the coordinator and the participants, and two phases, the prepare phase and the commit phase.

In the prepare phase, the coordinator sends a prepare command to every participant, and each participant does everything except actually committing the transaction. In the commit phase, the coordinator looks at how the participants fared in the prepare phase. If all of them are OK, it sends a commit command to every participant; if any one of them is not OK, it sends a rollback command.
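To make the two phases concrete, here is a minimal sketch in Java. The names Participant, RecordingParticipant, and TwoPhaseCoordinator are made up for illustration, and the sketch ignores timeouts, crashes, and persistence, which a real coordinator has to handle.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical participant: does all the work in prepare(), then waits for the verdict.
interface Participant {
    boolean prepare();   // true means "OK, ready to commit"
    void commit();
    void rollback();
}

// Toy participant that only records which commands it received.
class RecordingParticipant implements Participant {
    private final boolean canPrepare;
    final List<String> log = new ArrayList<>();

    RecordingParticipant(boolean canPrepare) { this.canPrepare = canPrepare; }

    public boolean prepare() { log.add("prepare"); return canPrepare; }
    public void commit() { log.add("commit"); }
    public void rollback() { log.add("rollback"); }
}

class TwoPhaseCoordinator {
    // Returns true if the transaction was committed, false if it was rolled back.
    static boolean run(List<? extends Participant> participants) {
        boolean allOk = true;
        // Phase 1: ask every participant to prepare.
        for (Participant p : participants) {
            if (!p.prepare()) {
                allOk = false;
            }
        }
        // Phase 2: commit only if everyone was OK, otherwise roll everyone back.
        for (Participant p : participants) {
            if (allOk) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allOk;
    }
}
```

With one participant that fails to prepare, every participant ends up with the log prepare, rollback, including the ones that were ready to commit.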

The point here is that 2PC only applies to database-level transactions. What does that mean? Suppose you want to write a row to the database and upload a picture at the same time. 2PC cannot guarantee that these two operations together satisfy the transaction constraints.

2PC is also a strongly consistent distributed transaction, and it blocks synchronously. Until a commit or rollback command arrives, all participants wait for each other, and once the prepare phase is complete the resources are locked. If one participant is stuck for a long time, the others have to wait for it, so resources stay locked and blocked for a long time.

Overall efficiency is low, and there is a single point of failure: the coordinator. There is also a risk of data inconsistency under extreme conditions. For example, a participant's machine goes down before it receives the commit command and its data is rolled back afterwards, while the other participants have already executed the commit.

 

TCC

TCC can guarantee business-level transactions, which means it is not limited to the database level. It can also cover the picture upload mentioned above.

TCC is divided into three phases: try, confirm, and cancel. Simply put, each business operation has to provide these three methods. The try method is executed first. At this stage no real business operation is performed; it only reserves a slot. What does that mean? For example, if you plan to add 10 points, first add those 10 points to a pre-add field. At this point the points on the user's account have not actually increased.

If every try succeeds, the confirm methods are executed and everyone performs the real business operation. If any try fails, everyone executes cancel to undo the reservation.

You can see that TCC is tightly coupled to the business, because the business has to be modified to provide these three methods. That is the shortcoming of TCC. In addition, confirm and cancel must be idempotent: once you reach these two steps there is no way back and they must be completed, so a retry mechanism is required, and retries are only safe if the methods are idempotent.
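The points example can be sketched in Java as follows. PointsAccount and its method names are hypothetical, and the in-memory map stands in for whatever storage a real service would use. The part worth noticing is that confirm and cancel do nothing when called a second time for the same transaction id, which is what makes retries safe.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical points account following try-confirm-cancel.
class PointsAccount {
    private int points;          // points the user can really see
    private int pendingPoints;   // the "pre-add" field that holds reservations
    private final Map<String, Integer> reservations = new HashMap<>(); // txId -> reserved amount

    // Try: only reserve, do not touch the real balance.
    boolean tryAdd(String txId, int amount) {
        if (amount <= 0 || reservations.containsKey(txId)) {
            return false;
        }
        reservations.put(txId, amount);
        pendingPoints += amount;
        return true;
    }

    // Confirm: make the reservation real. A retry finds no reservation and does nothing.
    void confirm(String txId) {
        Integer amount = reservations.remove(txId);
        if (amount == null) {
            return;
        }
        pendingPoints -= amount;
        points += amount;
    }

    // Cancel: release the reservation. Also safe to retry.
    void cancel(String txId) {
        Integer amount = reservations.remove(txId);
        if (amount == null) {
            return;
        }
        pendingPoints -= amount;
    }

    int points() { return points; }
    int pendingPoints() { return pendingPoints; }
}
```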

 

Transaction message

The transaction message is the protagonist of today's article. It is mainly suited to asynchronous update scenarios where the data does not have to be up to date in real time.

Its purpose is to solve the problem of data consistency between message producers and message consumers.

For example, when ordering takeaway, we first add fried chicken to the shopping cart, then a bottle of Coke, then place the order, and the process ends after payment.

The data in the shopping cart is well suited to asynchronous deletion via a message notification. Generally we do not open this store's menu again after placing the order, and even if we do, it hardly matters that the dishes are still in the cart.

What we want is for the dishes in the shopping cart to be deleted eventually once the order has been placed successfully. So the key point is that the two steps, placing the order and sending the message, must either both succeed or both fail.

 

RocketMQ transaction message

Let's first look at how RocketMQ implements transaction messages.

RocketMQ's transaction message can also be regarded as a two-phase commit. Simply put, at the start of the transaction a half message is sent to the Broker.

A half message is one that is not yet visible to the Consumer. It is not stored in the queue it is destined for, but in a special queue.

After sending the half message, the Producer executes the local transaction and then, depending on its result, sends either a commit or a rollback message to the Broker.

At this point someone will ask: what if sending that commit or rollback message fails?

It does not matter much. The Broker periodically checks back with the Producer to find out whether the transaction succeeded. Specifically, the Producer has to expose an interface through which the Broker can learn whether the transaction was executed successfully. If it has not succeeded yet, the interface returns unknown, because the transaction may still be executing, and the Broker will query several more times.

If the transaction succeeded, the half message is restored to the queue it was originally destined for, so that the Consumer can consume it.
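The flow so far can be sketched with a toy in-memory broker. This is only an illustration of the idea, not RocketMQ's implementation: ToyBroker and ToyTransactionalProducer are invented names, and the check-back path is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Toy in-memory broker, not RocketMQ's real implementation.
class ToyBroker {
    private final Map<String, String> halfMessages = new HashMap<>(); // invisible to consumers
    private final List<String> consumable = new ArrayList<>();        // what consumers can read

    // Phase one: park the message where no consumer can see it.
    void sendHalf(String txId, String body) {
        halfMessages.put(txId, body);
    }

    // Phase two: commit moves the half message to the normal queue, rollback drops it.
    void endTransaction(String txId, boolean commit) {
        String body = halfMessages.remove(txId);
        if (body != null && commit) {
            consumable.add(body);
        }
    }

    List<String> consumable() { return consumable; }
    int pendingHalfMessages() { return halfMessages.size(); }
}

class ToyTransactionalProducer {
    // Sends a half message, runs the local transaction, then reports the outcome.
    static void send(ToyBroker broker, String txId, String body, BooleanSupplier localTx) {
        broker.sendHalf(txId, body);
        boolean ok;
        try {
            ok = localTx.getAsBoolean();
        } catch (RuntimeException e) {
            ok = false; // a failed local transaction means rollback
        }
        broker.endTransaction(txId, ok);
    }
}
```

In the takeaway example, the local transaction is placing the order and the message body is the request to clear the cart. The cart message only becomes consumable when the order was really placed.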

Let's take a quick look at how to use it. I simplified the sample code on the official website.

You can see that it is simple and intuitive to use. You just add a method that checks the result of the transaction, and put the local transaction logic in the TransactionListener.
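The official sample is not reproduced here. As a rough sketch of the same shape, the code below declares its own stand-in types, TxState and TxListener, so that it compiles without the RocketMQ client library. They only imitate the idea of a listener with one method for executing the local transaction and one for answering the check-back; they are not the real RocketMQ classes, and OrderTxListener is a hypothetical order service.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Local stand-ins that only mirror the shape of the listener described above;
// they are NOT the RocketMQ client classes.
enum TxState { COMMIT, ROLLBACK, UNKNOWN }

interface TxListener {
    TxState executeLocalTransaction(String orderId); // runs the local transaction
    TxState checkLocalTransaction(String orderId);   // answers the Broker's check-back
}

// Hypothetical order service: the local transaction is "save the order".
class OrderTxListener implements TxListener {
    private final Map<String, Boolean> orders = new ConcurrentHashMap<>(); // orderId -> saved ok?

    public TxState executeLocalTransaction(String orderId) {
        try {
            saveOrder(orderId);
            orders.put(orderId, true);
            return TxState.COMMIT;
        } catch (RuntimeException e) {
            orders.put(orderId, false);
            return TxState.ROLLBACK;
        }
    }

    public TxState checkLocalTransaction(String orderId) {
        Boolean saved = orders.get(orderId);
        if (saved == null) {
            return TxState.UNKNOWN; // maybe still running, so the Broker should ask again later
        }
        return saved ? TxState.COMMIT : TxState.ROLLBACK;
    }

    private void saveOrder(String orderId) {
        if (orderId.isEmpty()) {
            throw new IllegalArgumentException("empty order id");
        }
        // a real implementation would write to the database here
    }
}
```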

At this point the general flow of RocketMQ transaction messages is clear. Let's draw an overall flow chart and walk through it. In fact, by the fourth step the message has either become a normal message or been discarded, and the transaction message has reached the end of its life cycle.

RocketMQ transaction message source code analysis

Next let's see how it is done from the source code point of view. First, look at the sendMessageInTransaction method. The method is a bit long, but its structure is very clear.

The flow is what we analyzed above. Some properties are stuffed into the message to mark it as a half message, it is sent to the Broker, the local transaction is executed, and then the execution status of the local transaction is sent to the Broker. Now let's see how the Broker handles this message.

This half-message request is processed in the Broker's SendMessageProcessor#sendMessage. Since today's focus is transaction messages, I will not analyze the other processing paths. Let me explain the principle.

Simply put, if sendMessage finds that the MessageConst.PROPERTY_TRANSACTION_PREPARED property of the received message is true, it knows the message is a transaction message. It then checks whether the message has exceeded the maximum number of consumption attempts, whether it is delayed, and whether the Broker accepts transaction messages. After that, the real topic and queue of the message are stored in its properties, and its topic is reset to RMQ_SYS_TRANS_HALF_TOPIC with queue 0, so that consumers cannot read it.

The above is the overall process for handling half messages. Let's take a look at the source code.

It is the old trick of swapping a civet cat for the prince. Delayed messages are in fact implemented the same way. The message, now wearing a different skin, is finally written to disk.
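The skin swap can be sketched roughly like this. SimpleMessage and HalfMessageSwap are simplified stand-ins written for this article, and the two property keys are illustrative placeholders; only the half topic name RMQ_SYS_TRANS_HALF_TOPIC and queue 0 come from the description above.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified message with just the fields the swap needs.
class SimpleMessage {
    String topic;
    int queueId;
    final Map<String, String> properties = new HashMap<>();

    SimpleMessage(String topic, int queueId) {
        this.topic = topic;
        this.queueId = queueId;
    }
}

class HalfMessageSwap {
    static final String HALF_TOPIC = "RMQ_SYS_TRANS_HALF_TOPIC";
    // Illustrative property keys, assumed for this sketch.
    static final String REAL_TOPIC = "REAL_TOPIC";
    static final String REAL_QUEUE_ID = "REAL_QID";

    // Hide the message: remember the real destination, then point it at the half topic.
    static void toHalf(SimpleMessage msg) {
        msg.properties.put(REAL_TOPIC, msg.topic);
        msg.properties.put(REAL_QUEUE_ID, String.valueOf(msg.queueId));
        msg.topic = HALF_TOPIC;
        msg.queueId = 0;
    }

    // Swap the skin back so the message lands in its real queue.
    static void restore(SimpleMessage msg) {
        msg.topic = msg.properties.remove(REAL_TOPIC);
        msg.queueId = Integer.parseInt(msg.properties.remove(REAL_QUEUE_ID));
    }
}
```

Because consumers subscribe to the real topic, a message sitting under the half topic is simply never delivered to them.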

The Broker handles the commit or rollback message in EndTransactionProcessor#processRequest. Let's take a look at what it does.

You can see that if the transaction is committed, the skin is swapped back and the message is written to the queue of its real topic for consumers to consume. If it is rolled back, the half message is recorded under a half_op topic, and when the background service later scans the half messages, it uses that record to tell that the message has already been processed.
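As a toy model of this step: ToyEndTransaction below is an invented class, and its two lists stand in for the real topic's queue and the half_op topic. I assume here that both outcomes are marked as processed, which matches the description of half_op further down.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of ending a transaction on the broker side; not RocketMQ's real code.
class ToyEndTransaction {
    final List<String> realTopicQueue = new ArrayList<>(); // visible to consumers
    final List<Long> halfOp = new ArrayList<>();           // offsets of half messages already decided

    // halfOffset identifies the half message, body is its payload.
    void end(long halfOffset, String body, boolean commit) {
        if (commit) {
            realTopicQueue.add(body); // swap the skin back and write to the real topic
        }
        // Record that this half message needs no further check-back.
        halfOp.add(halfOffset);
    }
}
```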

The background service is TransactionalMessageCheckService. It periodically scans the half-message queue and sends check-back requests to find out whether each transaction succeeded. The concrete implementation is the TransactionalMessageServiceImpl#check method.

Let me briefly describe the process. This step involves a lot of code, so I won't paste it; interested readers can study it themselves. But I believe it can be explained clearly in words.

First, it fetches all the queues under the half-message topic, RMQ_SYS_TRANS_HALF_TOPIC. If you remember from above, half messages are written to the queue with id 0. It then fetches the corresponding queue under the half_op topic, RMQ_SYS_TRANS_OP_HALF_TOPIC.

half_op mainly records which transaction messages have already been processed. That is, messages whose outcome, commit or rollback, is already known are recorded in half_op.

Then it calls the fillOpRemoveMap method, which takes a batch of already-processed messages from half_op for deduplication. Half messages that are not recorded in half_op are written to the commitlog again through putBackHalfMsgQueue, and a transaction check-back request is sent. The check-back request is oneway, so it does not wait for a response. Of course, the consumption offset of the half-message queue also advances at this point.

The Producer's ClientRemotingProcessor#processRequest then handles the request. It hands the task to the TransactionMQProducer thread pool, which eventually calls the checkLocalTransactionState method we defined when sending the message, and then sends the transaction status back to the Broker, again oneway.

Having read this far, I believe you will have some questions, such as why there is a half_op at all, and why half messages are written to the commitlog again after being processed. Let me explain them one by one.

First of all, RocketMQ is designed around sequential append-only writes, so it never modifies a message that is already on disk. But a transaction message needs its check-back count updated, and once the check-backs exceed a certain number without a result, the transaction is judged to be rolled back.

Therefore, every time a check-back is needed, the previous half message is written to disk again and the consumption progress is pushed forward. half_op in turn records the result of each check-back, whether commit or rollback, so the next time the loop reaches this half message it can tell from half_op that the transaction has ended, and the message is filtered out with no further processing.

If the result of the check-back is UNKNOW, nothing is recorded in half_op, so the message can be checked again and its check-back count updated.
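Here is a small sketch of that loop, under simplifying assumptions: HalfMsg and ToyCheckService are invented names, MAX_CHECK_TIMES is a hypothetical limit, a set of transaction ids stands in for half_op, and a list stands in for the one-way check-back requests sent to the Producer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Immutable half message: a new count means a new copy, never an in-place update.
class HalfMsg {
    final String txId;
    final int checkTimes;

    HalfMsg(String txId, int checkTimes) {
        this.txId = txId;
        this.checkTimes = checkTimes;
    }
}

class ToyCheckService {
    static final int MAX_CHECK_TIMES = 3; // hypothetical limit
    final Deque<HalfMsg> halfQueue = new ArrayDeque<>();
    final Set<String> halfOp = new HashSet<>();             // transactions already decided
    final List<String> checkRequests = new ArrayList<>();   // one-way check-backs "sent" to the producer
    final List<String> discarded = new ArrayList<>();       // given up on, treated as rolled back

    // One scan over the half messages currently in the queue.
    void scanOnce() {
        int n = halfQueue.size();
        for (int i = 0; i < n; i++) {
            HalfMsg msg = halfQueue.poll();              // consumption offset moves forward
            if (halfOp.contains(msg.txId)) {
                continue;                                // already committed or rolled back
            }
            if (msg.checkTimes >= MAX_CHECK_TIMES) {
                discarded.add(msg.txId);                 // too many checks without a result
                continue;
            }
            // Stored messages are never modified, so append a fresh copy with the new count.
            halfQueue.add(new HalfMsg(msg.txId, msg.checkTimes + 1));
            checkRequests.add(msg.txId);                 // fire and forget
        }
    }
}
```

A transaction whose result stays unknown is re-appended on every scan until it hits the limit, while one that appears in half_op drops out of the queue on the next scan.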

Now that the entire process is clear, I will draw another diagram to summarize how the Broker processes transactions.

Kafka transaction message

Kafka's transaction messages are different from RocketMQ's. RocketMQ makes two actions, executing the local transaction and sending the message, satisfy the transaction constraint together.

Kafka's transaction message is used when several messages have to be sent within one transaction, guaranteeing the transaction constraint across those messages: either all of them are sent successfully or all of them fail, as demonstrated by the following code.

Kafka's transactions basically work together with its idempotence mechanism to achieve Exactly Once semantics. So Kafka's transaction messages are not the kind of transaction messages we have in mind; RocketMQ's are.

Since we are here, let me digress a bit. Exactly Once is easy to misunderstand for readers who are not clear about it.

We know there are three delivery guarantees: at most once, exactly once, and at least once. In the previous article of this message queue series I mentioned that we basically use at least once combined with idempotency on the consumer side to achieve exactly once.

Having a message consumed exactly once is of course what we all pursue, but as I analyzed from several angles in the previous article, it is basically unattainable.

Yet Kafka claims it can achieve Exactly Once? Is it really that impressive? This is actually a bit of a gimmick on Kafka's part. You cannot say it is wrong, because it really does what it says. But you cannot quite say it is right either, because the Exactly Once it implements is not the Exactly Once you have in mind.

Its exactly once exists in only one scenario: Kafka is the message source, and after some processing the result is written back to Kafka.

How does it achieve exactly once? Through idempotence, just as we do in business code. Each message carries a unique id, which is recorded; if the id has already been recorded, the message is not written again. That guarantees exactly once.
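The record-the-id idea can be sketched like this. DedupLog is an invented class, and using a producer id plus an increasing sequence number as the unique id is an assumption made for this sketch; it is a simplification and not Kafka's actual broker code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy log that drops retried writes.
class DedupLog {
    private final List<String> records = new ArrayList<>();
    private final Map<String, Long> lastSequence = new HashMap<>(); // producerId -> highest sequence written

    // Returns true if the record was appended, false if it repeats an earlier write.
    boolean append(String producerId, long sequence, String value) {
        Long last = lastSequence.get(producerId);
        if (last != null && sequence <= last) {
            return false; // already recorded, so do not write it again
        }
        lastSequence.put(producerId, sequence);
        records.add(value);
        return true;
    }

    List<String> records() { return records; }
}
```

If a producer resends sequence 0 because it never saw the acknowledgement, the second append is rejected and the log still holds the record once.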

So what Kafka achieves is exactly once in a specific scenario, not what we imagined, namely that a message sent with Kafka will be consumed exactly once.

This is much like Redis saying it implements transactions: they are not the transactions we have in mind either.

So when open source software says it has implemented some feature, do not believe it blindly, or you will often get burned, or find that the feature only holds in special scenarios. Don't be misled by the surface description. Read the documentation or the source code in detail.

But from another angle there is nothing wrong with it. As open source software it naturally wants more people to use it. It did not lie, and its documentation is perfectly clear. The title isn't deceiving anyone, right?

Indeed, if you click into an article with a title like "Shocking! xxxx", nobody lied to you. The author really was shocked.

Back to Kafka's transaction messages. Since they are not the transaction messages we want, they are not really today's topic, but let me cover them briefly.

Kafka's transactions have a transaction coordinator role, and the transaction coordinator is actually part of the Broker.

When starting a transaction, the producer sends a request to the transaction coordinator to say that the transaction has begun. The transaction coordinator records this in a special log, the transaction log. The producer then sends the messages it really wants to send. Here Kafka differs from RocketMQ: Kafka handles these uncommitted messages like normal messages, and it is the consumer that filters them.

After sending is complete, the producer sends a commit or rollback request to the transaction coordinator, and the transaction coordinator performs a two-phase commit. For a commit, it first performs the pre-commit, that is, it sets the transaction state to pre-commit and writes it to the transaction log. It then writes a message resembling an end-of-transaction marker to every partition involved in the transaction. When the consumer reads that marker, it knows the transaction is complete and can release the messages.
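The consumer-side filtering can be illustrated with a toy partition log. LogEntry and ReadCommittedConsumer are invented for this sketch, which scans the whole log at once; a real consumer works incrementally, so treat this only as a picture of the idea that data is released once the end marker for its transaction says commit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// One entry in a toy partition log: either a data record or an end-of-transaction marker.
class LogEntry {
    final String txId;
    final String value;      // null for markers
    final Boolean committed; // null for data, true or false for commit or abort markers

    private LogEntry(String txId, String value, Boolean committed) {
        this.txId = txId;
        this.value = value;
        this.committed = committed;
    }

    static LogEntry data(String txId, String value) {
        return new LogEntry(txId, value, null);
    }

    static LogEntry marker(String txId, boolean committed) {
        return new LogEntry(txId, null, committed);
    }
}

class ReadCommittedConsumer {
    // Returns only the values whose transaction has a commit marker in the log.
    static List<String> read(List<LogEntry> log) {
        Set<String> committedTx = new HashSet<>();
        for (LogEntry e : log) {
            if (e.committed != null && e.committed) {
                committedTx.add(e.txId);
            }
        }
        List<String> visible = new ArrayList<>();
        for (LogEntry e : log) {
            if (e.value != null && committedTx.contains(e.txId)) {
                visible.add(e.value);
            }
        }
        return visible;
    }
}
```

Records from an aborted transaction, or from one that has no marker yet, sit in the log like any other record but are never handed to the application.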

Finally, the coordinator records one more transaction-end entry in the transaction log. At this point the Kafka transaction is complete. I will use the diagram from confluent.io to summarize the process.

Finally

We have now covered the whole process of RocketMQ and Kafka transaction messages. You can see that RocketMQ's transaction messages are what we want. Of course, if you are doing stream computing, then Kafka's transaction messages are also what you want.

With the Way but without technique, technique can still be acquired; with technique but without the Way, one stops at technique.


Origin blog.csdn.net/hollis_chuang/article/details/108591173