Distributed Transaction Solutions

What is a distributed transaction?

Simply put, a distributed transaction is a large operation made up of smaller operations that run on different servers and belong to different applications. The transaction must ensure that these smaller operations either all succeed or all fail. In essence, distributed transactions exist to keep data consistent across different databases.

Reasons for Distributed Transactions

With the move to service-oriented architectures and microservices, each service tends to own its own databases and tables. Operations that span several of these databases and tables may still need to be atomic.

CAP theorem

The CAP theorem states that a distributed system cannot simultaneously satisfy all three of consistency (C: Consistency), availability (A: Availability), and partition tolerance (P: Partition tolerance). At most two of them can be satisfied at the same time.

Consistency

In a distributed environment, consistency refers to whether data can remain consistent across multiple copies.

Availability

Availability means that the services provided by the system must always be available, and each operation request of the user can always return the correct result within a limited time.

Partition Tolerance

When a distributed system encounters any network partition failure, it still needs to be able to provide external services that meet consistency and availability, unless the entire network environment fails.

A network partition arises when the nodes of a distributed system are spread across different subnetworks (separate machine rooms, remote networks, and so on) and, for some reason, the links between those subnetworks are broken while the network inside each subnetwork still works. The system's network is thereby split into several isolated regions. Note that a node joining or leaving the distributed system can be regarded as a special kind of network partition.


BASE theory

BASE theory refers to:

  • Basically available: The system may respond more slowly, lose some functionality, or serve degraded pages (for example, "system busy, please retry later").
  • Soft state: Data in the system is allowed to exist in an intermediate state, on the assumption that this intermediate state does not affect the overall availability of the system.
  • Eventually consistent: The data only needs to become consistent in the end; strong consistency at every moment is not required.

Two-phase commit

Phase 1: Submit the transaction request

  1. Transaction query:
    The coordinator sends the transaction content to all participants, asks whether the transaction commit operation can be performed, and starts to wait for the responses of each participant.
  2. Execute transactions:
    Each participant node executes transaction operations and records Undo and Redo information in the transaction log.
  3. Each participant reports its response to the transaction query back to the coordinator:
    If the participant executed the transaction operation successfully, it returns a Yes response, indicating that the transaction can be committed; otherwise it returns a No response, indicating that the transaction cannot be committed.

The process above resembles the coordinator organizing a vote among the participants on a transaction operation, so this phase is also called the "voting phase": each participant votes on whether to go ahead with the transaction commit that follows.

Phase 2: Execute the transaction commit

In Phase 2, the coordinator will decide whether the transaction commit operation can be finally performed according to the feedback of each participant. Under normal circumstances, the following two possibilities are included:

Execute transaction commit:

If the coordinator gets a Yes response from all participants, then a transaction commit is performed.

  1. Send Commit Request: The coordinator sends a Commit request to all participant nodes.
  2. Transaction Commit: After the participant receives the Commit request, it will formally perform the transaction commit operation, and release the transaction resources occupied during the entire transaction execution period after completing the commit.
  3. Report the commit result: After the participant completes the transaction commit, it sends an Ack message to the coordinator.
  4. Complete the transaction: The coordinator completes the transaction after receiving the Ack messages fed back by all participants.

Interrupt the transaction:

If any participant returns a No response to the coordinator, or the coordinator has not received responses from all participants when its wait times out, the transaction is interrupted.

  1. Send rollback request: The coordinator sends a Rollback request to all participant nodes.
  2. Perform transaction rollback: After the participant receives the Rollback request, it uses the Undo information recorded in Phase 1 to roll back the transaction, and releases the resources occupied during the entire transaction once the rollback is complete.
  3. Report the rollback result: After the participant completes the transaction rollback, it sends an Ack message to the coordinator.
  4. Complete the interruption: After the coordinator receives the Ack messages from all participants, the transaction interruption is complete.

Two-phase commit divides the processing of a transaction into two phases: voting and execution. Its core is to use a try-before-commit processing method for each transaction.
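
The two phases described above can be sketched in a few lines of Python. This is only a minimal in-memory illustration under simplifying assumptions: the `Participant` class and the `two_phase_commit` function are hypothetical names, and network calls, timeouts, and durable Undo/Redo logging are all left out.

```python
class Participant:
    """Hypothetical in-memory participant; a real one would log Undo/Redo to disk."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: execute the transaction locally without committing, then vote.
        self.state = "prepared" if self.can_commit else "failed"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(participants):
    """Return True if the transaction committed, False if it was rolled back."""
    # Phase 1 (voting): every participant is asked, and all must vote Yes.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2 (commit): unanimous Yes, so every participant commits.
        for p in participants:
            p.commit()
        return True
    # Phase 2 (interrupt): at least one No, so every participant rolls back.
    for p in participants:
        p.rollback()
    return False
```

Calling `two_phase_commit([Participant("a"), Participant("b", can_commit=False)])` rolls back both participants, because a single No vote is enough to interrupt the transaction.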

Advantages and disadvantages

The advantages of the two-phase commit protocol: the principle is simple and the implementation is convenient.

Disadvantages of the two-phase commit protocol:

  1. Synchronous blocking: While two-phase commit is running, all logic involved in the transaction is blocked. Each participant can do nothing else while it waits for the other participants to respond.
  2. Single point of failure: The coordinator is a single point of failure.
  3. Data inconsistency: In the second phase, suppose a local network failure occurs, or the coordinator crashes, after the Commit request has been sent to only some of the participants. The participants that received the Commit request commit the transaction, while those that did not receive it cannot, so the data across the distributed system becomes inconsistent.
  4. Too conservative: If a participant fails during the transaction query and the coordinator cannot collect every response, the coordinator can only rely on its own timeout to decide whether to interrupt the transaction. This strategy is conservative. In other words, the two-phase commit protocol has no well-designed fault-tolerance mechanism, and the failure of any node causes the whole transaction to fail.

Three-phase commit

Protocol description

Three-phase commit splits the "submit transaction request" phase of the two-phase commit protocol in two, forming a transaction processing protocol with three phases: CanCommit, PreCommit, and DoCommit.

Phase 1: CanCommit

  1. Transaction query: The coordinator sends a canCommit request containing the transaction content to all participants, asking whether the transaction commit operation can be performed, and starts to wait for each participant's response.
  2. Report the response: After a participant receives the canCommit request from the coordinator, it returns a Yes response and enters the ready state if it believes it can execute the transaction successfully; otherwise it returns a No response.

Phase 2: PreCommit

In Phase 2, the coordinator will decide whether the PreCommit operation of the transaction can be performed according to the feedback of each participant. Under normal circumstances, there are two possibilities.

Perform transaction pre-commit:
If the coordinator gets a Yes response from all participants, then a transaction pre-commit is performed.

  1. Send transaction pre-commit request: The coordinator sends a PreCommit request to all participating nodes and enters the Prepared phase.
  2. Perform transaction pre-commit operation: After the participant receives the preCommit request, it will perform the transaction operation and record the Undo and Redo information in the transaction log. (execute but not commit)
  3. Report the execution result: If the participant executes the transaction operation successfully, it returns an Ack response to the coordinator and waits for the final instruction: commit or abort.

Interrupting the transaction:
If any participant returns a No response to the coordinator, or the coordinator cannot receive the feedback response from all participants after waiting for a timeout, the transaction will be interrupted.

  1. Send abort request: The coordinator sends an abort request to all participating nodes.
  2. Abort the transaction: Whether the participant receives an abort request from the coordinator or times out while waiting for the coordinator's request, it aborts the transaction.

Phase 3: DoCommit

This phase will perform a real transaction commit, and there are two possible situations as follows.

Perform commit:

  1. Send commit request: Assuming the coordinator is working normally and has received Ack responses from all participants, it transitions from the "pre-commit" state to the "commit" state and sends a doCommit request to all participants.
  2. Execute transaction operation: After the participant receives the doCommit request, it will formally execute the transaction commit operation, and release the transaction resources occupied during the entire transaction execution period after completing the commit.
  3. Feedback transaction submission result: After the participant completes the transaction submission, it sends an Ack message to the coordinator.
  4. Complete the transaction: The coordinator completes the transaction after receiving the Ack messages fed back by all participants.

Interrupt the transaction:
If any participant sends a No response to the coordinator, or the coordinator cannot receive the feedback response from all participants after waiting for a timeout, the transaction will be interrupted.

  1. Send abort request: The coordinator sends an abort request to all participating nodes.
  2. Transaction rollback: After the participant receives the abort request, it will use the Undo information recorded in Phase 2 to perform the transaction rollback operation, and release the resources occupied during the entire transaction execution after the rollback is completed.
  3. Feedback transaction rollback result: After the participant completes the transaction rollback, it sends an Ack message to the coordinator.
  4. Interrupt the transaction: After the coordinator receives the Ack messages fed back by all participants, the transaction is interrupted.

It should be noted that once entering stage three, the following two faults may exist:

  • There is a problem with the coordinator.
  • Network failure between coordinator and participants.

Either way, the participant does not receive the doCommit or abort request from the coordinator in time. In that case, the participant goes ahead and commits the transaction once its wait times out. This is an optimistic choice: because a PreCommit request was already received, the participant assumes the transaction is very likely to succeed.
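
The participant-side timeout rules can be summarized as a small state machine. The sketch below is an illustration only: `ThreePhaseParticipant` and its method names are hypothetical, and real timers, messaging, and logging are omitted.

```python
class ThreePhaseParticipant:
    """Hypothetical participant state machine for three-phase commit."""

    def __init__(self):
        self.state = "init"

    def on_can_commit(self):
        # Phase 1: the participant answered Yes and is ready.
        self.state = "ready"

    def on_pre_commit(self):
        # Phase 2: execute the transaction and log Undo/Redo, but do not commit.
        self.state = "pre_committed"

    def on_do_commit(self):
        self.state = "committed"

    def on_abort(self):
        self.state = "aborted"

    def on_timeout(self):
        # Waiting for PreCommit: no request arrived, so abort.
        if self.state == "ready":
            self.state = "aborted"
        # Waiting for doCommit/abort: optimistically commit.
        elif self.state == "pre_committed":
            self.state = "committed"
```

The second branch of `on_timeout` is exactly the behavior that removes blocking, and it is also where inconsistency can arise if the coordinator had in fact decided to abort.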

Advantages and disadvantages:

Advantages of the three-phase commit protocol: Compared with the two-phase commit protocol, the biggest advantage of the three-phase commit protocol is that it reduces the blocking range of participants and can continue to reach agreement after a single point of failure.

Disadvantages of the three-phase commit protocol: In removing blocking, the protocol introduces a new problem. If a network partition occurs after the participants have received the preCommit message, so that the coordinator's node and some participants can no longer communicate, those participants will still commit the transaction when their wait times out. If the coordinator had decided to abort, the data becomes inconsistent.

ZooKeeper's ZAB protocol

ZooKeeper does not fully adopt the Paxos algorithm, but uses a protocol called ZooKeeper Atomic Broadcast (ZAB, ZooKeeper Atomic Message Broadcasting Protocol) as its core algorithm for data consistency.

The ZAB protocol is an atomic broadcast protocol specially designed for the distributed coordination service ZooKeeper to support crash recovery.

ZAB is not a solution for distributed transactions, but a solution for data consistency between distributed (master-slave) replicas.

At its core, ZAB defines the handling of transaction requests that change the state of the ZooKeeper server data.

Message broadcast

ZooKeeper allows only one leader server to receive and process all transaction requests from clients. Even if non-leader servers receive transaction requests, they will forward them to the leader server.

All transaction requests must be coordinated and processed by a single, globally unique server, called the Leader server; the remaining servers are called Follower servers. The Leader converts each client transaction request into a transaction proposal and distributes the proposal to all Follower servers in the cluster. It then waits for their feedback. Once more than half of the Follower servers have responded correctly, the Leader sends a Commit message to all Follower servers, asking them to commit the proposal.

This process resembles two-phase commit, with two differences. First, only more than half of the Follower servers need to respond correctly for the transaction to commit. Second, the ZAB commit process has no interrupt logic: every Follower server either acknowledges the proposal put forward by the Leader or abandons the Leader server.

Message sequential processing

During message broadcast, the Leader server generates a proposal for each transaction request and, before broadcasting it, assigns the proposal a globally unique, monotonically increasing ID called the transaction ID (ZXID). Because the ZAB protocol must preserve the strict causal order of messages, proposals must be sorted and processed in ZXID order.

The ZXID is a 64-bit number. The lower 32 bits are a simple monotonically increasing counter: for each client transaction request, the Leader server increments the counter by 1 when it generates a new transaction proposal. The upper 32 bits hold the epoch number of the Leader's term. Whenever a new Leader server is elected, it takes the ZXID of the largest transaction proposal in its local log, extracts the epoch value from that ZXID, and adds 1 to it. This number becomes the new epoch, and the lower 32 bits restart from 0 to generate new ZXIDs.
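
The bit layout can be made concrete with a little integer arithmetic. The helper names below (`make_zxid`, `epoch_of`, `counter_of`, `first_zxid_of_new_leader`) are hypothetical and are not taken from the ZooKeeper code base; the sketch only assumes the 32/32 split described above.

```python
COUNTER_BITS = 32
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch, counter):
    # Upper 32 bits: epoch. Lower 32 bits: per-epoch counter.
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def epoch_of(zxid):
    return zxid >> COUNTER_BITS


def counter_of(zxid):
    return zxid & COUNTER_MASK


def first_zxid_of_new_leader(max_logged_zxid):
    # New epoch = epoch of the largest logged ZXID + 1; counter restarts at 0.
    return make_zxid(epoch_of(max_logged_zxid) + 1, 0)
```

Because the epoch occupies the high bits, any ZXID issued by a newer leader compares greater than every ZXID from an older epoch, whatever the old counter value was.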

To guarantee ordering during message broadcast, the Leader server allocates a separate queue for each Follower server, puts the transaction proposals to be broadcast into these queues in order, and sends them according to a FIFO policy. When a Follower server receives a transaction proposal, it first writes the proposal to local disk as a transaction log entry, and after a successful write it sends an Ack response back to the Leader server. When the Leader server has received Ack responses from more than half of the Followers, it broadcasts a Commit message to all Follower servers to tell them to commit the transaction. The Leader itself also commits the transaction at this point, and each Follower server commits it on receiving the Commit message.
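
The propose, Ack, and Commit flow can be sketched as follows. This is a simplified single-process illustration, not ZooKeeper's implementation: the `Follower` class and `broadcast` function are hypothetical, a single FIFO queue stands in for the per-Follower queues, and the quorum is counted over Followers only, as in the description above.

```python
from collections import deque


class Follower:
    """Hypothetical follower; `alive=False` simulates an unreachable server."""

    def __init__(self, alive=True):
        self.alive = alive
        self.log = []        # proposals written to the simulated transaction log
        self.committed = []  # ZXIDs that have been committed

    def on_proposal(self, zxid, data):
        if not self.alive:
            return False
        self.log.append((zxid, data))
        return True  # Ack after a successful log write

    def on_commit(self, zxid):
        if self.alive:
            self.committed.append(zxid)


def broadcast(followers, proposals):
    """Send proposals in ZXID order; return the ZXIDs that reached a quorum."""
    committed = []
    queue = deque(sorted(proposals))  # FIFO, ordered by ZXID
    while queue:
        zxid, data = queue.popleft()
        acks = sum(1 for f in followers if f.on_proposal(zxid, data))
        if acks > len(followers) // 2:  # more than half acknowledged
            for f in followers:
                f.on_commit(zxid)
            committed.append(zxid)
    return committed
```

With three followers of which one is down, every proposal still commits, since two Acks are more than half of three.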

Crash Recovery (Leader Election)

Once the Leader server crashes, or loses contact with more than half of the Followers because of network problems, the cluster enters crash recovery mode and starts a round of leader election. The election rule is: the server with the largest transaction ID is preferred, and among servers with the same transaction ID, the one with the largest server ID is preferred.

In the first round, each server votes for itself and broadcasts its vote. At the same time, it receives votes from other servers and compares the transaction ID and server ID in each received vote with those in its own vote. If the received vote is better, the server adopts it and broadcasts its vote again. After each vote it receives, the server checks whether some machine now has more than half of the votes; once it detects this, it changes its own state accordingly. The leader elected this way has the largest transaction ID in the cluster, that is, it is the machine with the most complete data.

Because it has the largest transaction ID, the newly elected leader must hold all committed proposals. More importantly, having the machine with the highest transaction ID become the leader spares the Leader server the work of checking which proposals need to be committed and which discarded. "Committed" here means that the ZAB protocol must ensure that transactions already committed on the Leader server are eventually committed by all servers. "Discarded" means that the ZAB protocol must ensure that transactions that were only proposed on the Leader server are dropped.
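
The vote comparison rule maps naturally onto tuple ordering. The sketch below is a simplified illustration that assumes every server can see every other server's vote in a single exchange; `pick_vote` and `elect_leader` are hypothetical names, and the real election involves further state such as election rounds.

```python
def pick_vote(mine, received):
    # A vote is a (zxid, server_id) pair. Tuples compare element by element,
    # so the larger ZXID wins and ties are broken by the larger server ID.
    return max(mine, received)


def elect_leader(last_zxid_by_server):
    """Return the elected server ID, or None if no vote reaches a majority."""
    # Round 1: every server votes for itself.
    initial = {sid: (zxid, sid) for sid, zxid in last_zxid_by_server.items()}
    # Each server compares its vote with every vote it receives.
    final = {}
    for sid, vote in initial.items():
        for received in initial.values():
            vote = pick_vote(vote, received)
        final[sid] = vote
    # A candidate wins once more than half of the servers vote for it.
    quorum = len(initial) // 2 + 1
    for candidate in set(final.values()):
        supporters = sum(1 for v in final.values() if v == candidate)
        if supporters >= quorum:
            return candidate[1]
    return None
```

For example, with servers 1, 2, and 3 holding ZXIDs 100, 105, and 105, server 3 is elected: it ties with server 2 on ZXID and wins on server ID.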

Data synchronization

After the leader election is completed, the Leader server checks whether every committed proposal in its transaction log has been committed by more than half of the machines in the cluster, that is, whether data synchronization is complete. The data synchronization process is as follows:

The Leader server prepares a queue for each Follower server. It sends the transactions that each Follower has not yet synchronized to that Follower one by one as Proposal messages, and sends a Commit message immediately after each Proposal message to indicate that the transaction has been committed. After the Follower server has synchronized all of its missing transaction proposals from the Leader server and applied them to its local database, the Leader server adds the Follower to the list of truly available Followers and then proceeds with the remaining steps.

Distributed transaction solutions

In principle, avoid distributed transactions wherever possible: the cost of ensuring strong consistency is much higher than the cost of quickly detecting inconsistencies and repairing them.

Compensation transaction mechanism to ensure eventual consistency

Transaction compensation means that every forward operation in the transaction chain must have a reverse (compensating) operation that fully complies with the rollback rules. For a complete transaction chain, each business operation in the chain must have a corresponding reverse service.

For example, the withdrawal operation requires multiple operations such as account balance deduction and third-party real payment. At this point, if you use compensation, the operation flow is as follows:

  1. The user initiates a withdrawal; the system deducts the user's account balance and creates a withdrawal record.
  2. The request is submitted to the third party.
  3. The third party attempts the payment and notifies the initiator whether it succeeded or failed.
  4. If the notification says the withdrawal failed, the initiator creates a withdrawal-failure refund record and returns the previously deducted money to the user.
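
The withdrawal flow above can be sketched as a list of forward steps, each paired with its compensating step. This is a minimal illustration under stated assumptions: `run_with_compensation`, the `account` dictionary, and the always-failing `pay_via_third_party` are hypothetical, and a real system would persist each step and handle failures of the compensation itself.

```python
def run_with_compensation(steps):
    """Run (action, compensate) pairs in order.

    If an action raises, run the compensations of the steps that already
    completed, in reverse order, and return False.
    """
    done = []
    try:
        for action, compensate in steps:
            action()
            done.append(compensate)
    except Exception:
        for compensate in reversed(done):
            compensate()
        return False
    return True


account = {"balance": 100}


def deduct():
    account["balance"] -= 30


def refund():
    account["balance"] += 30


def pay_via_third_party():
    # Simulated failure of the third-party payment.
    raise RuntimeError("third-party payment failed")


withdrawal_ok = run_with_compensation(
    [(deduct, refund), (pay_via_third_party, lambda: None)]
)
```

Here the payment step fails, so `refund` runs and the balance returns to its original value.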

TCC (Try-Confirm-Cancel)

TCC is essentially a compensation mechanism, divided into three phases:

  • The Try phase is mainly to detect and reserve resources for the business system.
  • The Confirm stage is mainly to confirm and submit the business operation.
  • The Cancel phase cancels the business operation when an execution error requires a rollback, and releases the reserved resources.

Take the freeze/occupy, use, and write-off flow for a membership card charge, a balance payment, or a coupon redemption as an example:

  • try: Check whether the account balance is sufficient, and freeze the amount if it is (this reserves the resource; several resources can be frozen at once).
  • confirm: Execute the actual business operation of the distributed transaction, using only the resources reserved by try.
  • cancel: Release the resources reserved in the try phase. If any step fails, the resources reserved by try are restored to their state before try, which can be understood as adding the balance back.

Taking online ordering as an example, the Try stage will deduct the inventory, and the Confirm stage will update the order status. If the update order fails, it will enter the Cancel stage and restore the inventory.

In short, TCC implements two-phase commit by hand in application code. The code differs from one business scenario to the next, and so does its complexity, so this pattern is hard to reuse.
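
The balance-freezing example can be sketched as follows. The `Account` class and `tcc_pay` function are hypothetical illustrations of the Try, Confirm, and Cancel steps; a real TCC implementation would also need idempotent operations and a persistent transaction log so that Confirm or Cancel can be retried.

```python
class Account:
    """Hypothetical account with a frozen (reserved) amount."""

    def __init__(self, balance):
        self.balance = balance
        self.frozen = 0

    def try_freeze(self, amount):
        # Try: check that enough unfrozen balance exists, then reserve it.
        if self.balance - self.frozen < amount:
            return False
        self.frozen += amount
        return True

    def confirm(self, amount):
        # Confirm: spend only what was reserved by try.
        self.frozen -= amount
        self.balance -= amount

    def cancel(self, amount):
        # Cancel: release the reservation; the balance is untouched.
        self.frozen -= amount


def tcc_pay(accounts_and_amounts):
    """Try on every account; confirm all if every try succeeds, else cancel."""
    reserved = []
    for account, amount in accounts_and_amounts:
        if account.try_freeze(amount):
            reserved.append((account, amount))
        else:
            for acc, amt in reserved:
                acc.cancel(amt)
            return False
    for acc, amt in reserved:
        acc.confirm(amt)
    return True
```

If one account cannot reserve its amount, the reservations already made on the other accounts are released and no balance changes.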

Message-Based Distributed Transactions

This type of mechanism splits a distributed transaction into multiple local transactions, referred to here as the master transaction and the slave transactions. The master transaction commits locally first and then notifies the slave transaction with a message; the slave transaction reads the key information about the operation from the message and commits its own local operation. This is an asynchronous mechanism that can only guarantee eventual consistency, but its availability is very high and it does not block on failures. Because the master transaction has already committed, it is hard to roll back if the slave transaction cannot commit. This mode is therefore only suitable for business cases where the slave transaction is, in theory, very likely to succeed: a failure of the slave transaction should be caused by a fault, not by a logical error.

There are two main asynchronous-message-based mechanisms: the local message table and the transaction message. They differ in how they make the two operations of committing the master transaction and sending the message atomic.

  1. Local message table

    In the local message table approach, the message is written to the local database, and a local transaction guarantees that the main business operation and the message write are atomic. Taking a bank transfer as an example, the pseudocode is as follows:

    begin transaction;
      -- debit account A by 100
      update User set account = account - 100 where userId = 'A';
      -- record the outgoing message in the same local transaction
      insert into message(userId, amount, status) values('A', 100, 1);
    commit;

    After the master transaction writes the message to the local message table, the slave transaction obtains and executes the message in either push or pull mode. In push mode, the slave transaction typically subscribes to the messages through a message queue with persistence. In pull mode, the slave transaction polls for messages on a schedule and executes them.
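
The local message table approach, including the pull mode, can be tried out with Python's built-in `sqlite3` module. The sketch below is illustrative only: the `account` and `message` table layouts are assumptions, as is the convention that status 1 means "pending" and status 2 means "done" (the pseudocode above only shows status 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (user_id TEXT PRIMARY KEY, balance INTEGER NOT NULL);
    CREATE TABLE message (id INTEGER PRIMARY KEY AUTOINCREMENT,
                          user_id TEXT, amount INTEGER, status INTEGER);
    INSERT INTO account VALUES ('A', 500);
""")


def debit_and_enqueue(user_id, amount):
    # `with conn` commits both statements together, or rolls both back on error.
    with conn:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE user_id = ?",
            (amount, user_id),
        )
        conn.execute(
            "INSERT INTO message (user_id, amount, status) VALUES (?, ?, 1)",
            (user_id, amount),
        )


def pull_pending():
    # Pull mode: the slave side polls for messages that are still pending.
    return conn.execute(
        "SELECT id, user_id, amount FROM message WHERE status = 1 ORDER BY id"
    ).fetchall()


def mark_done(message_id):
    # Assumed convention: status 2 marks a message as processed.
    with conn:
        conn.execute("UPDATE message SET status = 2 WHERE id = ?", (message_id,))
```

A scheduled job on the slave side would call `pull_pending()`, apply each message to its own database, and then call `mark_done()`; because a message may be pulled more than once, the slave operation should be idempotent.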

  2. Transaction message

    A transaction message is a two-phase commit built on message middleware; in essence it is a special use of the middleware. It places the local transaction and the message send inside one distributed transaction, guaranteeing that either the local operation succeeds and the message is sent, or both fail. The open source RocketMQ supports this feature (at the time of writing, the latest version no longer supported it). Let's use RocketMQ to walk through how it works:


  1. In the first stage, RocketMQ sends a Prepared message and obtains the address of the message.
  2. In the second stage, the local transaction is executed.
  3. In the third stage, the sender uses the address obtained in the first stage to access the message and update its state according to the result of the second stage. If this confirmation fails to be sent, RocketMQ periodically scans the transaction messages in the message cluster. When it finds a message still in the Prepared state, it asks the message sender for the outcome and sets the message state accordingly. The sender is required to register a callback interface in the first stage, and RocketMQ calls that callback to obtain the execution result.
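
The three stages and the periodic check-back can be simulated with a small in-memory broker. This sketch does not use the RocketMQ client API: the `Broker` class, its methods, and `send_in_transaction` are hypothetical stand-ins that only mirror the flow described above.

```python
class Broker:
    """Hypothetical in-memory stand-in for a broker with Prepared messages."""

    def __init__(self):
        self.messages = {}  # address -> [body, state]
        self.next_address = 0

    def send_prepared(self, body):
        # Stage 1: store the message as Prepared and return its address.
        address = self.next_address
        self.next_address += 1
        self.messages[address] = [body, "prepared"]
        return address

    def resolve(self, address, committed):
        # Stage 3: update the message state from the local transaction result.
        self.messages[address][1] = "committed" if committed else "rolled_back"

    def check_back(self, check_local_transaction):
        # Periodic scan: ask the sender about messages still in Prepared state.
        for address, (body, state) in self.messages.items():
            if state == "prepared":
                self.resolve(address, check_local_transaction(body))

    def deliverable(self):
        # Only committed messages are visible to consumers.
        return [body for body, state in self.messages.values() if state == "committed"]


def send_in_transaction(broker, body, local_transaction):
    address = broker.send_prepared(body)  # stage 1
    try:
        ok = local_transaction()          # stage 2
    except Exception:
        ok = False
    broker.resolve(address, ok)           # stage 3
    return ok
```

If the sender crashes between stages 2 and 3, the message stays in the Prepared state until `check_back` calls the registered callback, here represented by the `check_local_transaction` argument.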

Two-phase commit based on message middleware is often used in high-concurrency scenarios. It splits a distributed transaction into a message transaction (system A's local operation plus the message send) and system B's local operation, where B's operation is driven by the message. As long as the message transaction succeeds, A's operation has succeeded and the message has been sent. B then receives the message and performs its local operation. If the local operation fails, the message is redelivered until B's operation succeeds, which in effect realizes a distributed transaction between A and B. For a more complete design, if B keeps failing after retries, a rollback mechanism can also be provided for A's operation.

Reconciliation (the most robust technique)

Reconciliation can be real-time, near-real-time, or offline at T+1. Discrepancies are corrected automatically through reversal entries and netting. This is commonly used in financial systems such as transaction settlement.

Best-effort delivery

  1. Allow intermediate states (such as "withdrawal processing"), record failures in a failure table, and retry them with a scheduled job.
  2. Retry using delayed messages in an MQ.
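
The first approach can be sketched as a bounded retry loop that falls back to a failure record. The function `deliver_with_retries` and the in-memory `failure_log` list are hypothetical; a real job would wait between attempts (for example with a growing delay) and store failures in a database table.

```python
def deliver_with_retries(send, payload, max_attempts=3, failure_log=None):
    """Try `send(payload)` up to `max_attempts` times.

    Return True on the first success. If every attempt fails, append the
    payload and the last error to `failure_log` (standing in for the failure
    record table) and return False so a scheduled job can retry later.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            send(payload)
            return True
        except Exception as exc:
            last_error = exc
    if failure_log is not None:
        failure_log.append((payload, str(last_error)))
    return False
```

The caller does not block forever on a failing downstream system: after the attempts are exhausted, the record is parked and the intermediate state stays visible until a later retry or reconciliation resolves it.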

Origin http://43.154.161.224:23101/article/api/json?id=325121995&siteId=291194637