This is the right way to approach distributed transactions!

Background

With the popularity of microservices, distributed transactions have become a problem that system design has to face, and implementing them is complicated. Before reading this article, you should have some understanding of the ACID properties of database transactions, the CAP theorem, the BASE theory, and two-phase commit. If you are unfamiliar with these, please search for them or read reference blogs 1, 2, 3 and 4. In addition, while reading this article, if a particular solution is unclear, it is strongly recommended to read the reference blog for that solution first and then come back to its introduction here.

To simplify the discussion below, let us emphasize the C (consistency) in ACID: strict transactional consistency means that the database moves from one consistent state to another, and the intermediate states of the transaction cannot be observed.

Seven implementation schemes of distributed transactions:

1. Based on reliable message service (based on reliable message middleware);

2. Best effort (based on message middleware);

3. TX-LCN (the LCN implementation);

4. X/Open DTP model (XA specification, based on two-phase commit);

5. Ali DTS (based on TCC);

6. Huawei ServiceComb (an implementation of the SAGA mode);

7. Ali GTS (open-sourced as Fescar; an implementation of an improved XA protocol).

1. Based on reliable message service

Common message middleware such as LinkedIn's Kafka, RabbitMQ from Rabbit Technologies, and Apache's ActiveMQ does not support transactions. RocketMQ, provided by Alibaba, on the other hand, has solved message ordering and the idempotence of repeated messages, and supports transactions (see reference blog 5 for details). Distributed transactions can therefore be built on RocketMQ (see reference blog 6 for details). In fact, reference blog 6 gives two message-based implementation schemes: the first, based on a local message service, targets scenarios where the middleware itself does not support transactions and message reliability must be achieved through application design; the second, based on an independent message service, targets middleware that supports transactions natively. To facilitate the comparison that follows, the design based on an independent message service is reproduced here:

The core principle of the distributed transaction scheme based on a reliable message service is to perform a "two-phase commit" of the message sent to the middleware: first commit (send) the message, and confirm it only after the local transaction has executed. This mechanism makes it possible to roll the transaction back.
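The commit-then-confirm flow just described can be sketched as follows. This is a minimal toy model of a transactional ("half") message, not RocketMQ's real API; all class and method names are illustrative.

```python
class MessageBroker:
    """Holds prepared ("half") messages until they are confirmed or rolled back."""
    def __init__(self):
        self.half = {}        # msg_id -> payload, invisible to consumers
        self.delivered = []   # confirmed messages, visible to consumers

    def send_half(self, msg_id, payload):
        self.half[msg_id] = payload          # phase 1: message reaches the broker

    def confirm(self, msg_id):
        self.delivered.append(self.half.pop(msg_id))   # phase 2: make it visible

    def rollback(self, msg_id):
        self.half.pop(msg_id, None)          # local tx failed: discard the message


def send_transactionally(broker, msg_id, payload, local_tx):
    broker.send_half(msg_id, payload)   # 1. send the half message first
    try:
        local_tx()                      # 2. execute the local transaction
    except Exception:
        broker.rollback(msg_id)         # 3a. local tx failed: roll the message back
        raise
    broker.confirm(msg_id)              # 3b. local tx committed: confirm the message


broker = MessageBroker()
send_transactionally(broker, "m1", "order-created", lambda: None)
print(broker.delivered)   # prints ['order-created']
```

Note that between step 3b and the remote consumer actually processing the message, the global state is inconsistent; that is exactly the consistency gap discussed below.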

As shown below, if we measure this scheme against ACID, it can guarantee the eventual atomicity and durability of the transaction, but not its consistency or isolation. The word "eventual" is attached to atomicity because message-based operations are inherently asynchronous, and therefore obviously not real-time. Durability needs no discussion. So why can the scheme not guarantee consistency and isolation? As the figure below shows, the message is confirmed only after the local transaction has committed; at that moment the remote transaction has not yet committed, so consistency is clearly not satisfied.

We know that database isolation is guaranteed by a lock mechanism. Therefore, if the reliable-message scheme is to satisfy isolation, a distributed lock mechanism usually has to be adopted on the transaction initiator. In general, the distributed solution based on a reliable message service suits internal systems with low requirements on real-time business consistency and transaction isolation.

2. Best effort

The best-effort solution, like the reliable-message solution, relies on message middleware (see reference blog 7). The difference is that the middleware does not need to guarantee reliability; the distributed transaction is instead safeguarded by an additional reconciliation system or alarm system (with manual processing after an alarm). Like the reliable-message solution, best effort can only guarantee the eventual atomicity and durability of the transaction, not its consistency or isolation. As for application scenarios, reference blog 7 notes that best-effort solutions are often used in internal systems, or in cross-enterprise business activities, with low requirements on real-time business consistency and transaction isolation. The figure below shows the best-effort solution:
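The retry-then-alarm idea can be sketched as below. `best_effort_notify` and its parameters are hypothetical names for illustration, not from any of the referenced systems.

```python
def best_effort_notify(send, max_retries, alarm):
    """Try to deliver a message; on repeated failure, raise an alarm for
    manual handling or reconciliation instead of blocking the business."""
    for _ in range(max_retries):
        try:
            send()
            return True        # delivered
        except Exception:
            continue           # transient failure: try again
    alarm()                    # give up: hand off to the alarm/reconciliation system
    return False


alarms = []
attempts = {"n": 0}

def flaky_send():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("peer unavailable")

ok = best_effort_notify(flaky_send, max_retries=5, alarm=lambda: alarms.append("page on-call"))
print(ok, alarms)   # prints True []
```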

Although the best-effort solution can only guarantee the eventual atomicity and durability of the transaction, its simple implementation makes it the first choice for many non-core businesses.

3. TX-LCN

According to the LCN official website (reference blog 8), LCN stands for lock, confirm and notify. Personally I find LCN the easiest to understand of the seven schemes in this article, apart from the two message-middleware schemes just introduced, so it is presented third. As its official documentation states (see reference blog 8), LCN does not create transactions; it is only a coordinator of local transactions. This means a system using LCN depends entirely on local transactions. Unfortunately, the official site describes the core principle only briefly and is not clear on a first read; the implementation is covered in more detail elsewhere (reference blogs 9 and 10).

LCN consists of a TxManager and TxClients. The TxManager maintains global transaction information, while a TxClient sits between the business module and the local transaction layer and acts as that layer's proxy: it implements the javax.sql.DataSource interface by proxying the Connection object and overriding its close method. After the module commits and "closes" the connection, the TxClient connection pool performs a "false close", waiting for the TxManager to coordinate the completion of the transaction before really closing the connection, as shown in the figure below.

The core steps of LCN

1. Create a transaction group

This refers to the transaction initiator calling the TxManager to create a transaction group object before executing the business code, thereby obtaining the transaction identifier GroupId.

2. Join the transaction group

Joining the transaction group refers to a participant notifying the TxManager of its transaction information after it has finished executing its business method.

3. Close the transaction group

This refers to the initiator notifying the TxManager of its execution result status after it has finished executing the business code. Once the close-transaction-group call has executed, the TxManager notifies the corresponding participating modules to commit or roll back their transactions according to the transaction group information.

Take a database as an example. The "false close" mentioned earlier means that the close() method called after the business method in step 2 is an overridden method: it does not actually commit the transaction, but keeps holding the database connection, and with it the resource locks the transaction requires, releasing them only when the TxManager asynchronously notifies the transaction to commit or roll back.
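The "false close" can be sketched as a connection proxy whose overridden close() defers the real close until the coordinator's decision arrives. This is an illustrative model, not TX-LCN's actual code.

```python
class RealConnection:
    """Stand-in for a JDBC connection holding a local transaction and its locks."""
    def __init__(self):
        self.state = "open"
        self.closed = False
    def commit(self): self.state = "committed"
    def rollback(self): self.state = "rolled_back"
    def close(self): self.closed = True


class FalseCloseConnection:
    """Proxy whose close() is overridden to do nothing until the TxManager decides."""
    def __init__(self, real):
        self.real = real
        self.pending = True       # still waiting for the coordinator's decision

    def close(self):
        # Overridden close(): keep the connection (and the transaction's locks)
        # held; the real close happens only when notify() arrives.
        pass

    def notify(self, commit):
        # Asynchronous decision from the TxManager: commit or roll back,
        # then really release the connection and its locks.
        self.real.commit() if commit else self.real.rollback()
        self.real.close()
        self.pending = False


conn = FalseCloseConnection(RealConnection())
conn.close()                # business code "closes": nothing is released yet
conn.notify(commit=True)    # TxManager decision arrives: commit, then real close
print(conn.real.state)      # prints committed
```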

Take the sequence diagram in reference blog 10 as an example (see the figure below). After participants A and B perform their business operations, the transactions are not actually committed; instead, the pre-commit results of the local transactions are returned to the transaction initiator. The initiator notifies the TxManager, which then asynchronously tells each participant to commit or roll back based on that result. Of course, the TxManager's notification to a participant may fail, so a compensation mechanism is needed.

As can be seen, the three words in LCN correspond to the three key steps of an LCN distributed transaction: 1. before the distributed operation, lock all resources until the asynchronous notification (notify) releases them; 2. execute the business operations and, according to their results, confirm whether the transaction should be committed or rolled back; 3. asynchronously notify the commit or rollback decided in step 2 and finally release the resources.

So far, we have learned that the core principle of LCN is to implement distributed transactions by coordinating local transactions, on which it depends entirely. The ACID properties of LCN-based distributed transactions therefore depend on those of the local transactions. Generally speaking, if the local transactions can guarantee ACID, then LCN-based distributed transactions can satisfy AID; as for consistency (C), that is a common problem of distributed transactions.

From the BASE theory we know that for distributed systems we care more about eventual consistency and how quickly it is reached. Compared with the two message-middleware solutions introduced above, the LCN-based solution reaches eventual consistency far sooner; the price, of course, is a large reduction in concurrency. In fact, quasi-real-time eventual consistency already satisfies most application scenarios of distributed systems. For extreme scenarios with extreme consistency requirements, such as banking, correctness can additionally be ensured with distributed locks or distributed queues in the business system.

Summary: Compared with the solutions introduced later, the advantage of LCN is its relatively simple implementation, but its disadvantages are also obvious. First, it depends on local transactions: if the resources being operated on do not support local transactions, LCN mode cannot be used directly (the new version of TX-LCN addresses this limitation by also supporting the TCC and TXC modes introduced later). Second, resources stay locked for the whole LCN commit process, so its performance is lower than TCC and TXC, both of which shorten the time resources are locked during a distributed transaction.

4. X/Open DTP model based on XA specification

X/Open, now The Open Group, is an independent organization mainly responsible for formulating industry technical standards. X/Open DTP is a distributed transaction solution developed by this organization. For a detailed introduction to the model, read reference blog 11; only the core principles are introduced here.

DTP model elements

Application Program (AP): defines transaction boundaries (i.e., the beginning and end of a transaction) and operates on resources within those boundaries.

Resource Manager (RM): for example a database or file system; provides a way to access resources.

Transaction Manager (TM): responsible for assigning unique transaction identifiers, monitoring the execution progress of transactions, and for transaction commit and rollback.

Communication Resource Manager (CRM): controls communication between distributed applications within a TM domain or across TM domains.

Communication Protocol (CP): the underlying communication services between distributed application nodes, provided by the CRM.

Here we focus on the relationship between AP, TM and RM:

As can be seen from the figure above, the most important role of the XA specification is to define the RM-TM interaction interface. In addition, the XA specification also optimizes the two-phase commit protocol; specific optimizations include read-only assertions and one-phase commit. For details, read reference blog 11.

Next, let's look at how the XA specification works. Reference blog 14 gives a schematic diagram of the XA commit process:

The submission steps are:

Before a global transaction starts, each involved RM must register with the TM via ax_reg() to join the cluster; correspondingly, when it has no transactions to process, an RM can ask the TM to deregister it via ax_unreg() and leave the cluster.

Before the TM performs any concrete xa_-prefixed operation on an RM, it must first open the RM via xa_open() (essentially establishing a dialogue); this is in effect where the XID is assigned. Correspondingly, the TM closes the RM by calling xa_close().

The xa_start()/xa_end() pair that the TM calls on an RM is generally used to mark the beginning and end of a local transaction. Three points are worth noting here:

For the same RM, multiple such pairs can be executed in succession as the global transaction requires; for example, first mark the local transaction that INSERTs a journal record, then mark the local transaction that UPDATEs the account.

The TM executes this pair only to mark the transaction; the concrete business commands are issued to the RM by the AP.

Besides this marking duty, the pair can also be used for join/suspend/resume management of multithreading within the RM.

The TM calls the RM's xa_prepare() to carry out the first phase, and xa_commit() or xa_rollback() to carry out the second phase.
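The TM-side loop over these calls amounts to classic two-phase commit, which can be sketched as follows. This is a toy model; the real XA interface is a C API taking xids and flags.

```python
class ToyRM:
    """Toy resource manager exposing XA-style prepare/commit/rollback calls."""
    def __init__(self, name, prepare_ok=True):
        self.name, self.prepare_ok, self.state = name, prepare_ok, "active"

    def xa_prepare(self):
        # Phase 1: vote on whether this RM can commit.
        self.state = "prepared" if self.prepare_ok else "prepare_failed"
        return self.prepare_ok

    def xa_commit(self):
        self.state = "committed"

    def xa_rollback(self):
        self.state = "rolled_back"


def two_phase_commit(rms):
    # Phase 1: every RM must vote yes; any refusal forces a global rollback.
    if all(rm.xa_prepare() for rm in rms):
        for rm in rms:               # Phase 2: global commit
            rm.xa_commit()
        return "committed"
    for rm in rms:                   # Phase 2: global rollback
        rm.xa_rollback()
    return "rolled_back"
```

Note that from the first xa_prepare() until the last Phase 2 call, every RM keeps its resources locked, which is exactly the long-lock problem discussed next.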

What needs to be emphasized is that in the XA specification, resources stay locked throughout the two-phase commit process (see the figure below). Regardless of whether the Phase 2 decision is commit or rollback, the transactional resource locks are held until Phase 2 completes. Both X/Open DTP (based on the XA specification) and LCN therefore suffer from long-held resource locks.

 

Summary: The core principle of X/Open DTP is two-phase commit (the XA specification): it implements distributed transactions by locking resources for the entire commit process, which satisfies the AID properties and quasi-real-time eventual consistency well. This scheme resembles LCN, and its performance is lower than TCC and TXC, so in practice few systems choose it.

5. Alipay DTS based on TCC

For the principle of TCC, it is strongly recommended to read reference blogs 12 and 13 first. Here is the schematic diagram from reference blog 13:

The TCC distributed transaction model includes three parts:

Main business service: the initiator of the entire business activity; as the service orchestrator, it is responsible for initiating and completing the whole business activity.

Secondary business service: a participant in the business activity, responsible for providing the TCC business operations, i.e. implementing the three interfaces Try (tentative operation), Confirm, and Cancel for the main business service to call.

Business activity manager: manages and controls the entire business activity, including recording and maintaining the state of the TCC global transaction and of each secondary service's sub-transaction; it calls the Confirm operations of all secondary services when the business activity commits, and their Cancel operations when it is cancelled.

A complete TCC distributed transaction process is as follows:

1. The main business service first starts a local transaction;

2. The main business service applies to the business activity manager to start the distributed transaction's main business activity;

3. For each secondary business service to be called, the main business activity first registers the secondary business activity with the business activity manager, and then calls the secondary service's Try interface;

4. When the Try interfaces of all secondary business services have been called successfully, the main business service commits its local transaction; if any call fails, the main business service rolls its local transaction back;

5. If the main business service commits its local transaction, the TCC model calls the Confirm interface of every secondary business service; if it rolls back, it calls every Cancel interface;

6. After all the Confirm or Cancel operations of the secondary business services have completed, the global transaction ends.
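The six steps above can be sketched as a small orchestration loop. The participants and names here are illustrative, not Alipay DTS's actual API.

```python
class TccParticipant:
    """Toy secondary business service exposing the Try/Confirm/Cancel interfaces."""
    def __init__(self, name, try_ok=True):
        self.name, self.try_ok, self.log = name, try_ok, []

    def try_(self):
        self.log.append("try")       # reserve resources, e.g. freeze funds
        return self.try_ok

    def confirm(self):
        self.log.append("confirm")   # consume the reserved resources

    def cancel(self):
        self.log.append("cancel")    # release the reserved resources


def run_tcc(participants):
    tried = []
    for p in participants:
        tried.append(p)
        if not p.try_():             # a Try failed,
            for q in tried:
                q.cancel()           # so Cancel everything tried so far
            return "cancelled"
    for p in participants:           # all Trys succeeded,
        p.confirm()                  # so Confirm everywhere
    return "confirmed"
```

The key point relative to XA is that each participant's lock lives only inside its own Try (and later Confirm/Cancel) local transaction, not across the whole global transaction.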

Summary: Measured against ACID, TCC satisfies the AID properties and quasi-real-time eventual consistency. Its core principle is that in a distributed transaction the required resources are first reserved, the locks are then released, and finally, based on the outcome of the reservation, it is decided whether to consume or return the resources. Compared with the XA specification, TCC reserves resources and splits the global lock of two-phase commit into two local transaction locks, shortening the time distributed resources are locked and improving transaction concurrency. Compared with SAGA, introduced next, TCC's reservation approach makes its compensation actions relatively simple to implement. Its disadvantage is heavy intrusion into the business: in particular, an existing business that wants to adopt TCC must modify its original business logic (a point emphasized again when SAGA is introduced below).

6. Huawei ServiceComb based on SAGA

Saga is a concept from a 1987 database paper (reference blogs 16-19, especially 16), and ServiceComb is a Huawei project team's implementation of the saga mode. The paper pointed out that a saga is a Long Lived Transaction (LLT) that can be decomposed into multiple local transactions, i.e. LLT = T1 + T2 + ... + Tn. Each local transaction takes effect as soon as it commits; for example, after T1 and T2 have executed, by the time T3 runs, the transactions of T1 and T2 have already been committed.

If T3 then fails, T1 and T2 cannot be rolled back, since they are already committed. To solve this problem, saga requires the business server to provide, for each local transaction Tx, a corresponding reverse compensation operation Cx (the inverse of Tx), which likewise takes effect immediately when executed. While executing an LLT, if any Tx fails, the Cx of every executed Tx is called to compensate in reverse. For example, after T1 and T2 have executed, if an error occurs while executing T3, then C3, C2 and C1 are executed to compensate.
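The LLT execution with reverse compensation can be sketched as below. This is a minimal model; real saga engines also persist state and retry, which is omitted here.

```python
def run_saga(steps):
    """steps: list of (Ti, Ci) pairs; Ti returns True on success.
    Each Ti commits immediately; on failure, run the Ci of every
    executed step in reverse order (Cn ... C1)."""
    compensations = []
    for tx, comp in steps:
        if tx():
            compensations.append(comp)   # remember how to undo this step
        else:
            for c in reversed(compensations):
                c()                      # reverse compensation
            return "compensated"
    return "completed"


log = []

def t3():
    log.append("T3(failed)")
    return False                         # T3 fails after T1 and T2 committed

result = run_saga([
    (lambda: log.append("T1") or True, lambda: log.append("C1")),
    (lambda: log.append("T2") or True, lambda: log.append("C2")),
    (t3, lambda: log.append("C3")),
])
print(result, log)   # prints compensated ['T1', 'T2', 'T3(failed)', 'C2', 'C1']
```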

The paper places two requirements on the services a saga calls. First, the called services must be idempotent; since distributed services inevitably face network timeouts, this is generally required of them anyway. Second, the services must support commutative compensation, as shown in the figure:

 

A word on commutative compensation. The two requirements (idempotent services and commutative compensation) exist to handle two scenarios that inevitably arise when compensating. First, the transaction operation Tx may never have executed at all: for example, as in the figure, Tx was lost to network packet loss and never reached the server, or it reached the server but failed to execute. In this scenario Cx need not, and must not, be executed, or an error results; in other words, the precondition for saga executing Cx is that the server really did execute Tx (whether it succeeded or failed). Second, as shown on the right of the figure above, after the transaction operation T is issued it may be delayed so long in the network that saga's timeout mechanism fires: a forward retry is performed, the retry fails, and the reverse compensation C is then executed. Only after C has executed does the original, delayed T reach the server.

In that case, saga requires that the original T must not take effect, because C already has. In other words, once a service has accepted and executed the reverse compensation C, it must no longer process the corresponding forward operation. The purpose is to prevent a compensation that arrives first from being overwritten by a forward operation that arrives later.
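The rule "once C has executed, a late-arriving T must not take effect" can be sketched with a per-transaction state table; the service below is purely illustrative.

```python
class SagaService:
    """Toy service that remembers, per transaction id, whether it has executed
    the forward operation T or the reverse compensation C."""
    def __init__(self):
        self.state = {}   # tx_id -> "executed" | "compensated"

    def forward(self, tx_id):
        if self.state.get(tx_id) == "compensated":
            return False              # C already took effect: drop the late T
        self.state[tx_id] = "executed"
        return True

    def compensate(self, tx_id):
        # Record the compensation even if T was never seen, so a delayed T
        # that arrives afterwards can no longer take effect.
        self.state[tx_id] = "compensated"


svc = SagaService()
svc.compensate("tx1")          # timeout fired: C arrives before the delayed T
print(svc.forward("tx1"))      # prints False (the late T is rejected)
```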

In fact, the handling of scenario two relies on the guarantee from scenario one, because the forward retry in the example may itself fail to reach the server, which turns it into scenario one. The TCC solution introduced earlier also encounters this problem and needs to solve it (if interested, see reference blog 20).

In addition, reference blog 16 also explains that the saga mode itself supports only ACD out of ACID and cannot support isolation (and, by the criteria of this article, saga can only satisfy quasi-real-time eventual consistency, not strong consistency). To support isolation, you must consider adding a lock mechanism at the business layer, or take a TCC-like approach and isolate resources by pre-freezing them at the business layer.

On saga's isolation, one more point: because a saga's local transactions commit immediately after execution and cannot be rolled back, without an isolation guarantee a reverse compensation operation may fail to execute successfully. For example, suppose there are three accounts A, B and C, each with a balance of 100 yuan, and the following two concurrent transactions are executed:

Summary: The core principle of the saga mode is to split a global transaction into several independently committable local transactions, each paired with a reverse compensation operation; when a local transaction fails to commit, the reverse compensation operations cancel the effects of the already-committed local transactions.

Compared with TCC, Saga lacks the reservation step, which makes compensation harder to implement for certain services. Suppose the service sends emails: in TCC mode you save a draft (Try) and then send it (Confirm); to cancel, you simply delete the draft (Cancel). Saga sends the email directly (Ti); to cancel, it has to send another email explaining the cancellation (Ci). For other, simpler businesses, however, the absence of a reservation step can be an advantage (see reference blog 18 for details):

Some businesses are very simple; applying TCC would require modifying the original business logic, while Saga only needs an added compensation action.

TCC's minimum number of communications is 2n, while Saga's is n (n = the number of sub-transactions).

Some third-party services have no Try interface, which makes TCC awkward to implement, while Saga is very simple.

No reservation step means there is no need to worry about releasing reserved resources, and exception handling is easier (compare Saga's recovery strategies with TCC's exception handling).

7. Alibaba GTS based on improved XA protocol

GTS, originally named TXC (short for Taobao Transaction Constructor), started in April 2014; TXC 1.0 was released in October 2014 and TXC 2.0 in December 2015. In February 2017 it entered public beta on Alibaba Cloud under the new name GTS (Global Transaction Service). In January 2019, Alibaba open-sourced a free community version of the distributed transaction framework, Fescar.

As mentioned when introducing the XA specification, resources remain locked throughout the two-phase commit process. For ease of comparison, the XA commit diagram is shown again here (reference blog 21):

As the figure shows, whether the Phase 2 decision is commit or rollback, the transactional resource locks are held until Phase 2 completes. Now consider a normal business: in all likelihood more than 90% of transactions commit successfully. Could we commit the local transaction already in Phase 1? That would save the Phase 2 lock-holding time in over 90% of cases and improve overall efficiency.

 

The local lock on data in a branch transaction is managed by the local transaction and is released when the branch's Phase 1 ends.

At the same time, as the local transaction ends, the connection is released.

The global lock on data in a branch transaction (see reference blog 25) is managed on the transaction coordinator side; when the Phase 2 decision is a global commit, the global lock can be released immediately. Only when the decision is a global rollback is the global lock held until the branch's Phase 2 ends.

This design greatly reduces the lock time of resources (data and connections) for branch transactions, and provides a basis for the improvement of overall concurrency and throughput.
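The coordinator-side global lock can be sketched as a simple lock table keyed by the rows a branch changed. This is an illustrative model, not Fescar's implementation.

```python
class GlobalLockTable:
    """Toy coordinator-side lock table keyed by the rows a branch changed."""
    def __init__(self):
        self.locks = {}   # row_key -> xid of the global tx holding the lock

    def acquire(self, xid, row_keys):
        # A branch may commit locally in Phase 1 only after acquiring the
        # global lock on every row it changed.
        if any(self.locks.get(k) not in (None, xid) for k in row_keys):
            return False              # held by another global tx: wait and retry
        for k in row_keys:
            self.locks[k] = xid
        return True

    def release(self, xid):
        self.locks = {k: v for k, v in self.locks.items() if v != xid}


locks = GlobalLockTable()
locks.acquire("x1", ["account:A"])           # branch of global tx x1 commits locally
print(locks.acquire("x2", ["account:A"]))    # prints False: x1 still holds the lock
locks.release("x1")                          # Phase 2 global commit: release at once
print(locks.acquire("x2", ["account:A"]))    # prints True
```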

Of course, you will ask: if Phase 1 already commits, how does Phase 2 roll back? First, the application must use Fescar's JDBC data source proxy, which acts as Fescar's RM.

Phase1 :

Fescar's JDBC data source proxy parses the business SQL, organizes the before and after images of the updated business data into a rollback log, and, using the ACID properties of local transactions, commits the business data update and the rollback log in the same local transaction.

This guarantees that every committed business data update has a corresponding rollback log.

On this basis, a branch's local transaction can commit in Phase 1 of the global transaction, and the resources it locked are released immediately.

Phase2:

If the decision is a global commit, the branch transactions have already been committed and no synchronous coordination is needed (the rollback logs just need to be cleaned up asynchronously), so Phase 2 completes very quickly.

If the decision is a global rollback, the RM receives the rollback request from the coordinator, finds the corresponding rollback log record by XID and Branch ID, generates the reverse update SQL from the log record, and executes it to complete the branch rollback.
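The Phase 1 / Phase 2 interplay can be sketched with a toy table and undo log. The row and log shapes here are illustrative; the real implementation generates reverse SQL from the log.

```python
db = {"account:A": {"balance": 100}}   # toy table
undo_log = {}                          # (xid, branch_id) -> before image

def phase1_update(xid, branch_id, key, new_row):
    # One local transaction: the before image (rollback log) and the business
    # update are committed together; the local lock is then released at once.
    undo_log[(xid, branch_id)] = {"key": key, "before": dict(db[key])}
    db[key] = new_row

def phase2(xid, branch_id, commit):
    if commit:
        undo_log.pop((xid, branch_id), None)   # just clean the log asynchronously
    else:
        rec = undo_log.pop((xid, branch_id))
        db[rec["key"]] = rec["before"]         # replay the before image

phase1_update("x1", "b1", "account:A", {"balance": 60})
phase2("x1", "b1", commit=False)               # global decision: rollback
print(db["account:A"])   # prints {'balance': 100}
```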

 

As can be seen, the core principle of Fescar is to parse the business SQL, organize the before and after images of the updated business data into a rollback log, and use the ACID properties of local transactions to commit the business data update and the rollback log in the same local transaction. In addition, to ensure the isolation of distributed transactions, a global lock is added on the transaction coordinator side to guarantee that the rollback logs can be executed smoothly (recall the rollback-failure example discussed in the saga section).

Summary: Measured against ACID, Fescar can guarantee the AID properties and quasi-real-time eventual consistency. In fact, for a distributed system, a solution that guarantees both AID and quasi-real-time eventual consistency effectively guarantees ACID for insert, update and delete operations; as for queries, reading a transaction's intermediate state is acceptable in most business scenarios. And as noted earlier, for extreme scenarios with extreme consistency requirements, such as banking, correctness can also be ensured with distributed locks or distributed queues in the business system.

Finally, comparing TCC and Fescar: regardless of whether the transaction is ultimately committed or rolled back, TCC essentially performs two operations on each resource (a Try, then a Confirm or Cancel). With Fescar, most distributed transactions never need to roll back, and a transaction that does not roll back performs only one operation per resource. From this perspective, Fescar's average performance will be higher than TCC's.

Summary

The core principle of the solution based on reliable message middleware is to perform a "two-phase commit" of the message sent to the middleware: send the message first, and confirm it only after the local transaction has executed. This mechanism makes rollback possible. The scheme guarantees the eventual atomicity and durability of the transaction, but not its consistency or isolation, and suits systems with low requirements on real-time business consistency and transaction isolation.

The core principle of the best-effort solution is to rely on an additional reconciliation system or alarm system to safeguard the distributed transaction. The scheme guarantees the eventual atomicity and durability of the transaction, but not its consistency or isolation, and likewise suits systems with low requirements on real-time business consistency and transaction isolation.

The core principle of TX-LCN is to implement distributed transactions by coordinating local transactions, on which it depends. Generally speaking, if the local transactions can guarantee ACID, then LCN-based distributed transactions can satisfy AID, but not consistency. TX-LCN is relatively simple to implement, but its transactions lock resources for a long time, so it suits scenarios without high concurrency requirements.

The core principle of X/Open DTP is two-phase commit (the XA specification): distributed transactions are implemented by locking resources for the entire commit process, which satisfies the AID properties and quasi-real-time eventual consistency well. Because resources stay locked for a long time, it suits scenarios without high concurrency requirements.

The core principle of TCC is that in a distributed transaction the required resources are first reserved, the locks are then released, and finally, based on the outcome of the reservation, it is decided whether to consume or return the resources. Compared with the XA specification, TCC reserves resources and splits the global lock of two-phase commit into two local transaction locks, shortening the time distributed resources are locked and improving transaction concurrency.

The core principle of the saga mode is to split a global transaction into several independently committable local transactions, each paired with a reverse compensation operation that cancels its effects when a later local transaction fails to commit. Compared with TCC, Saga lacks a reservation step, which makes compensation harder to implement for some services (such as sending email); for other, simpler services, the absence of a reservation step lowers the cost for legacy systems to adopt Saga.

The core principle of Fescar is to parse the business SQL, organize the before and after images of the updated business data into a rollback log, and use the ACID properties of local transactions to commit the business data update and the rollback log in the same local transaction. In addition, to ensure the isolation of distributed transactions, a global lock on the transaction coordinator side guarantees that the rollback logs can execute smoothly.

If this article helped you, please like and follow to support it, or follow my WeChat public account, where more practical technical articles and related materials are shared. Let's learn and improve together!


Origin blog.csdn.net/weixin_48182198/article/details/108848658