A look at solutions for distributed transactions

Transactions are familiar to all of us. The transactions we usually talk about are stand-alone transactions, that is, local transactions. So what is a distributed transaction? It is a transaction composed of multiple local transactions, and it generally appears in distributed scenarios.

Take an e-commerce platform as an example. When we shop, placing an order and paying seem to happen in one go, but behind the scenes several systems may be cooperating: the order system, the payment system, the logistics system, and so on. These systems are deployed on different servers and each executes its own local transactions. For the platform as a whole, this is a distributed transaction.

Local transactions are easy to handle because ready-made transaction mechanisms exist for them. Distributed transactions are much more complicated. How can they be implemented? There are roughly three solutions:

  • Two-phase commit based on the XA protocol.
  • Three-phase commit.
  • Eventual consistency based on MQ.

Two-phase commit based on the XA protocol

The XA protocol was first proposed by Tuxedo and handed over to the X/Open organization as the interface standard between resource managers (databases) and transaction managers. Major database vendors such as Oracle, Informix, DB2 and Sybase all support XA. (Source: Baidu Encyclopedia)

Two-phase commit is also known as 2PC (the two-phase commit protocol). It involves two roles:

  • Participants: the local resource managers that actually execute the transaction, that is, the individual business systems.
  • Coordinator: the brain of the distributed transaction, responsible for directing the business systems to commit or roll back.

The two phases are the voting phase and the commit phase. As in an election, the vote comes first and the decision follows.

In the voting phase, the coordinator sends each participant a request to perform the transaction operation (a CanCommit request) and waits for the responses.

After receiving the request, a participant executes the requested operation and writes the log records, but does not commit. If this succeeds, it sends a "Yes" message to the coordinator, meaning it agrees to commit. If it fails, it sends a "No" message, meaning it does not agree. Note that the data involved is locked during this process.

[Figure: two-phase commit, voting phase]

In the commit phase, once the coordinator has the responses from all participants, it sends either a commit or a rollback request based on what they returned.

  • If every response is "Yes", the coordinator sends a "DoCommit" message to the participants. Each participant finishes the rest of its local transaction, releases its resources, and then sends a "HaveCommitted" message to the coordinator.

  • If any response is "No", or a participant does not respond within the specified time, the coordinator sends a "DoAbort" message to all participants. Each participant that voted "Yes" rolls back using the rollback log it recorded earlier, and then every participant sends a "HaveCommitted" message to the coordinator, which in this case simply confirms that the participant has finished handling the decision.

[Figure: two-phase commit, commit phase]
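
The two phases above can be sketched in a few lines of Python. This is only an in-memory illustration, not an XA implementation: the `Participant` class, the `can_prepare` switch and the `two_phase_commit` function are hypothetical names invented for this example, and timeouts are not modelled (a missing response can be simulated as a "No" vote).

```python
from dataclasses import dataclass, field


@dataclass
class Participant:
    """Hypothetical in-memory resource manager (one business system)."""
    name: str
    can_prepare: bool = True      # set to False to simulate a "No" vote
    state: str = "INIT"           # INIT -> PREPARED -> COMMITTED / ABORTED
    log: list = field(default_factory=list)

    def can_commit(self, txn_id: str) -> bool:
        """Voting phase: do the work and write the log, but do not commit."""
        if not self.can_prepare:
            self.state = "ABORTED"
            return False                          # vote "No"
        self.log.append(("undo/redo", txn_id))
        self.state = "PREPARED"                   # data stays locked from here
        return True                               # vote "Yes"

    def do_commit(self, txn_id: str) -> str:
        self.state = "COMMITTED"                  # finish work, release locks
        return "HaveCommitted"

    def do_abort(self, txn_id: str) -> str:
        if self.state == "PREPARED":
            self.log.append(("rollback", txn_id))  # undo using the log
        self.state = "ABORTED"
        return "HaveCommitted"    # the article uses this name for both outcomes


def two_phase_commit(txn_id: str, participants: list) -> str:
    # Phase 1 (voting): ask every participant and collect the votes.
    votes = [p.can_commit(txn_id) for p in participants]
    # Phase 2 (commit): commit only if every vote was "Yes".
    if all(votes):
        for p in participants:
            p.do_commit(txn_id)
        return "COMMITTED"
    for p in participants:
        p.do_abort(txn_id)
    return "ABORTED"
```

Calling `two_phase_commit("t1", [Participant("order"), Participant("payment")])` returns "COMMITTED", while a single participant created with `can_prepare=False` causes every participant to end up in the "ABORTED" state.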

The two-phase commit protocol is easy to understand, and XA-based two-phase commit is designed to satisfy the ACID properties of transactions. It looks perfect, but it still has several shortcomings. The main problems are:

  • Synchronous blocking: while a two-phase commit is running, all participating nodes block on the transaction. Participants lock the data, and any other request that wants to access that data is blocked.
  • Single point of failure: the protocol has only one coordinator. Once the coordinator fails, the entire system stalls. This is worst in the commit phase: if the coordinator goes down, the participants keep waiting for its message and stay blocked.
  • Data inconsistency: in the commit phase, after the coordinator starts sending DoCommit requests, network jitter or a coordinator failure partway through may mean that only some participants receive the request and commit. The participants that did not receive it cannot commit, so the data across the distributed system becomes inconsistent.
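
The data inconsistency case can be made concrete with a tiny simulation. The function below is a hypothetical sketch, not part of any real transaction manager: `crash_after` is an invented knob that makes the coordinator stop after sending that many DoCommit messages.

```python
def broadcast_commit(states: dict, crash_after: int) -> dict:
    """Send DoCommit to prepared participants one by one.

    states: participant name -> state, all "PREPARED" on entry.
    crash_after: hypothetical knob; the coordinator dies after this many sends.
    """
    for sent, name in enumerate(states):
        if sent == crash_after:
            return states          # coordinator crashed; the rest hear nothing
        states[name] = "COMMITTED"
    return states


states = broadcast_commit(
    {"order": "PREPARED", "payment": "PREPARED", "logistics": "PREPARED"},
    crash_after=1,
)
print(states)
# {'order': 'COMMITTED', 'payment': 'PREPARED', 'logistics': 'PREPARED'}
```

The order system has committed while the payment and logistics systems are still prepared and holding their locks, which is exactly the mixed state described above.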

Three-phase commit

The three-phase commit protocol (3PC) is an improvement on the two-phase commit protocol (2PC) that solves some of its problems. The biggest differences between the two are that 3PC introduces a timeout mechanism for participants and a preparation phase.

Let's talk about the timeout mechanism first. In two-phase commit, only the coordinator has a timeout: if it does not receive a response from a participant within the specified time, it commits or aborts the whole transaction according to the current state. The participants have no timeout, so if the coordinator fails they simply keep waiting, which is the single-point-of-failure problem of two-phase commit. Three-phase commit gives both the coordinator and the participants a timeout. If either side does not receive a response from the other nodes within the specified time, it chooses to commit or abort according to its current state.

Three-phase commit effectively splits a phase of two-phase commit in two by inserting a preparation phase. The three phases are CanCommit, PreCommit and DoCommit.

The CanCommit phase is similar to the 2PC voting phase: the coordinator sends a CanCommit request asking whether each participant can perform the transaction commit, then waits for the responses. A participant replies "Yes" if it expects to be able to execute the transaction, and "No" otherwise.

The PreCommit phase works much like the commit phase of two-phase commit: based on the results returned in the CanCommit phase, the coordinator decides whether the PreCommit operation can proceed.
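
A participant-side timeout could look roughly like the sketch below. It assumes the commonly described 3PC defaults, which this article does not spell out: a participant that has already acknowledged PreCommit commits on timeout, and a participant in any earlier state aborts. The function name, the state strings and the use of a `queue.Queue` as the inbox are all hypothetical.

```python
import queue


def wait_for_coordinator(inbox: queue.Queue, state: str,
                         timeout_s: float = 0.05) -> str:
    """Return the coordinator's message, or a default decision on timeout."""
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        # Assumed default: after acknowledging PreCommit everyone has voted
        # "Yes", so the participant goes ahead and commits; otherwise it aborts.
        return "DoCommit" if state == "PRECOMMITTED" else "Abort"


inbox = queue.Queue()                                  # coordinator stays silent
print(wait_for_coordinator(inbox, "PRECOMMITTED"))     # DoCommit
print(wait_for_coordinator(inbox, "READY"))            # Abort
```

Either way the participant stops waiting and releases its locks, instead of blocking indefinitely as it would in two-phase commit.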


There are two cases here. If all participants reply "Yes", the flow is as follows:

  • 1. Send a pre-commit request: the coordinator sends a PreCommit request to the participants and enters the pre-commit phase.
  • 2. Pre-commit the transaction: after receiving the PreCommit request, each participant performs the transaction operations and records Undo and Redo information in its transaction log.
  • 3. Respond: if a participant executes the transaction operations successfully, it returns an Ack response and starts waiting for the final instruction.

If any participant returns "No", or the coordinator does not receive a response from a participant within the specified time, the transaction is aborted. The flow is as follows:

  • 1. Send an abort request: the coordinator sends an "Abort" message to all participants.
  • 2. Abort the transaction: a participant aborts the transaction when it receives the "Abort" message, or when it times out without hearing from the coordinator.

The DoCommit phase is where the transaction is actually committed. Based on the responses the participants returned in the PreCommit phase, the coordinator decides whether to commit or to abort the transaction.


When the transaction is committed, the flow is as follows:

  • 1. Send a commit request: once the coordinator has received Ack responses from all participants, it moves from the pre-commit state to the commit state and sends a DoCommit message to all participants.
  • 2. Commit the transaction: after receiving the DoCommit message, each participant formally commits the transaction and then releases all locked resources.
  • 3. Respond: after committing, each participant sends an Ack response to the coordinator.
  • 4. Complete the transaction: after receiving Ack responses from all participants, the coordinator completes the transaction.

When the transaction is aborted, the flow is as follows:

  • 1. Send an abort request: the coordinator sends Abort requests to all participants.
  • 2. Roll back the transaction: after receiving the Abort message, each participant uses the Undo information recorded in the PreCommit phase to roll back the transaction and releases all locked resources.
  • 3. Report the result: after completing the rollback, each participant sends an Ack message to the coordinator.
  • 4. End the transaction: after receiving the Ack messages from the participants, the coordinator aborts and ends the transaction.
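
Putting the three phases together, the coordinator's decision logic can be sketched as follows. This is a simplified in-memory illustration under stated assumptions: the `Node` class and its `vote` and `ack` switches are hypothetical, and a missing response is simulated by returning False rather than by a real timeout.

```python
class Node:
    """Hypothetical participant; `vote` and `ack` let us simulate failures."""

    def __init__(self, name, vote=True, ack=True):
        self.name, self.vote, self.ack = name, vote, ack
        self.state = "INIT"

    def can_commit(self):
        # CanCommit phase: only answer whether the commit looks feasible.
        self.state = "READY" if self.vote else "ABORTED"
        return self.vote

    def pre_commit(self):
        # PreCommit phase: execute and record Undo/Redo, then acknowledge.
        if not self.ack:
            return False          # simulates a failed or missing response
        self.state = "PRECOMMITTED"
        return True

    def do_commit(self):
        self.state = "COMMITTED"  # formally commit and release locks

    def abort(self):
        self.state = "ABORTED"    # roll back with Undo info and release locks


def three_phase_commit(nodes):
    def abort_all():
        for n in nodes:
            n.abort()
        return "ABORTED"

    # Lists (not generators) are used so that every node is always asked.
    if not all([n.can_commit() for n in nodes]):
        return abort_all()
    if not all([n.pre_commit() for n in nodes]):
        return abort_all()
    for n in nodes:
        n.do_commit()
    return "COMMITTED"
```

With two default nodes the function returns "COMMITTED". Passing `vote=False` or `ack=False` to any node makes it return "ABORTED" and leaves every node in the "ABORTED" state.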

Eventual consistency based on distributed messaging

Both two-phase commit and three-phase commit aim for strong consistency and the ACID properties of transactions. They share two problems:

  • 1. They need to lock data, which reduces system performance.
  • 2. Because of network failures and similar issues, the data consistency problem is not completely solved.

The MQ-based solution is different. It aims not for strong consistency but for eventual consistency, following BASE theory. Because the work is handed off to MQ and processed asynchronously, performance is comparatively good, and the approach largely avoids the two problems above.

The idea behind solving distributed transactions with MQ message middleware is this: it relies mainly on the reliability of MQ message delivery. After a distributed transaction is sent to the MQ middleware as a message, the middleware persists it, which is essential for making sure the message is not lost. The consumer side then consumes it asynchronously. If a failure occurs, the message is still persisted, so we can keep retrying according to business rules and, if necessary, compensate manually to guarantee that the data is eventually consistent.

I plan to write a separate article that covers the eventual consistency scheme based on distributed messaging in detail, using RocketMQ, so I won't go into details here.
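
The retry-then-compensate idea can be sketched with a plain in-memory queue. This is not RocketMQ or any real MQ client: the `deque` stands in for the persistent broker, and `consume`, `max_retries` and `flaky_payment` are hypothetical names chosen for this example.

```python
from collections import deque


def consume(broker: deque, handler, max_retries: int = 3) -> list:
    """Drain the broker; return the messages that need manual compensation."""
    dead_letters = []
    while broker:
        msg = broker.popleft()
        try:
            handler(msg)                        # the consumer's local transaction
        except Exception:
            msg["attempts"] = msg.get("attempts", 0) + 1
            if msg["attempts"] < max_retries:
                broker.append(msg)              # message is still stored: retry
            else:
                dead_letters.append(msg)        # give up: compensate manually
    return dead_letters


calls = {"n": 0}


def flaky_payment(msg):
    """Hypothetical consumer that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("payment service unavailable")


broker = deque([{"id": "order-1", "body": "pay 10"}])
dead = consume(broker, flaky_payment)
print(dead, calls["n"])   # [] 3
```

The order is placed without waiting for the payment system, and the two sides converge once a retry succeeds. A handler that keeps failing ends up in the returned list, which is where manual compensation would take over.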


Internet flat-headed brother


Origin juejin.im/post/5e96d8226fb9a03c4c5bcf90