Distributed Software Architecture - Transaction ACID

Business Concept

Transaction processing is a problem involved in almost every information system. Its purpose is to ensure that the data in the system is correct and that different pieces of data do not contradict one another, that is, to guarantee the consistency of the data state (Consistency).

Here we focus on the consistency of the database state; the "consistency" that appears later in distributed consensus algorithms is a different notion. In theory, achieving consistency of the database state requires the joint effort of three properties:

  • Atomicity: within the same business process, a transaction guarantees that multiple data modifications either all take effect or are all revoked together.

  • Isolation: across different business processes, transactions ensure that the data each business reads and writes are independent of one another and do not interfere with each other.

  • Durability: a transaction guarantees that all successfully committed data modifications are correctly persisted and will not be lost.

  • A, I, D are the means, and C is the end.

Business Scenarios

The concept of transactions originated in databases, but in today's information systems every scenario that requires data correctness (consistency) may involve transaction processing, including but not limited to databases, caches, transactional memory, messages, queues, and object/file storage.

When a service operates on only one data source, obtaining consistency through A, I, and D is relatively easy. But when one service involves multiple different data sources, or multiple services simultaneously involve multiple different data sources, the matter becomes very difficult, sometimes demanding a high or even unrealistic price. The industry has therefore explored many alternative solutions that obtain as much consistency as is practically possible while remaining operable.

As a result, transaction processing has risen from a "programming problem" of concrete operations to an "architecture problem" that must be carefully weighed. In exploring these solutions, people produced many new ideas and concepts. Let us examine how different solutions handle the same case, so as to connect and straighten out these concepts.

Scenario

First, let's introduce the specific case.
Fenix's Bookstore is an online bookstore. For a product to be sold successfully, the following three things must be handled correctly:

  1. The user's account is debited the corresponding payment;

  2. The warehouse's inventory is decremented and the product is marked as awaiting shipment;

  3. The merchant's account is credited the corresponding payment.

Next, I will walk through the scenarios of "a single service using a single data source", "a single service using multiple data sources", "multiple services using a single data source", and "multiple services using multiple data sources", and what means we can use to guarantee the correctness of the example above in each. In today's lecture we first look at "a single service using a single data source", that is, the local transaction scenario.

Local Transactions

A local transaction operates on only a single, specific transactional resource and does not require coordination by a "global transaction manager".
Local transactions are the most basic transaction processing scheme, usually applicable only when a single service uses a single data source, and they rely directly on the transaction capability of the data source itself (usually the database system).

At the code level, the most we can do is wrap the transaction interface in a standardized layer (such as the JDBC interface); we cannot participate deeply in how the transaction actually runs.

Let me give a concrete example. Suppose your code calls Connection::rollback() through JDBC. Successful execution of that method does not mean the transaction was actually rolled back: if the table's storage engine is MyISAM, rollback() is a meaningless no-op. Therefore, to discuss local transactions in depth we must go below the level of application code and understand how traditional database management systems implement ACID.
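A small JDBC sketch of that caveat; the connection details and the books table are placeholders, and the behavior described assumes a MySQL database where books was created with ENGINE=MyISAM:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class RollbackDemo {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; adjust to your environment.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/bookstore", "user", "password")) {
            conn.setAutoCommit(false);   // start a local transaction
            try (Statement stmt = conn.createStatement()) {
                stmt.executeUpdate("UPDATE books SET price = 90 WHERE id = 1");
            }
            conn.rollback();             // returns normally either way: on InnoDB the update
                                         // is undone; on a MyISAM table this is a silent no-op
        }
    }
}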

ARIES (Algorithms for Recovery and Isolation Exploiting Semantics) is a theory of recovery and isolation based on semantics. It focuses on how three of the four ACID properties of a transaction, atomicity (A), isolation (I), and durability (D), should be implemented at the algorithmic level.

Achieving Atomicity and Durability

Atomicity and durability are two closely related properties of a transaction. Atomicity ensures that a transaction's multiple operations either all take effect or none do, with no intermediate state; durability guarantees that once a transaction takes effect, its modifications will not be revoked or lost for any reason.

Obviously, data must be successfully written to persistent storage such as disk or tape before it can be durable; data held only in memory is lost the moment the program crashes, the database crashes, the operating system crashes, or the machine suddenly loses power (below we refer to all of these collectively as a crash). The difficulty in achieving atomicity and durability is that "writing to disk" is not atomic: besides "written" and "not written" there is an objective intermediate state, "being written".

According to the example scenario we listed above, buying a book from Fenix's Bookstore requires three data changes:

  1. Subtract the payment from the user's account;

  2. Add the payment to the merchant's account;

  3. Mark a book in the merchandise warehouse as shipped.

Due to the intermediate state of the write, the following scenarios can occur:

Uncommitted transaction: the program has not finished modifying all three pieces of data, but the database has already written one or two of the changes to disk, and a crash occurs at that moment. After restart, the database must have a way to know that an incomplete shopping operation happened before the crash, and restore the modified data on disk to its unmodified state, thereby guaranteeing atomicity.

Committed transaction: the program has modified all three pieces of data, but the database has not yet written all three changes to disk, and a crash occurs at that moment. After restart, the database must have a way to know that a complete shopping operation happened before the crash, and rewrite the data that did not make it to disk, thereby guaranteeing durability. This data recovery operation is called crash recovery (Crash Recovery, also known as Failure Recovery or Transaction Recovery).

To make crash recovery possible, writing data to disk cannot simply change the value of some row and column of a table in place, the way a program modifies a variable in memory. Instead, all information required to perform the modification (which data is modified, which memory page and disk block it physically resides in, what value changes to what value, and so on) must first be written to disk in the form of a log (a file written by sequential append only, which is the most efficient write pattern).

Only after all log records are safely on disk, and the database sees the "Commit Record" in the log that marks the transaction as successfully committed, does it modify the real data according to the information in the log. Once the modifications are complete, an "End Record" is appended to the log to indicate that the transaction has been persisted. This implementation is called "Commit Logging".
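To make the sequence concrete, here is a toy model of Commit Logging; every name and record format is invented for illustration, not taken from any real database. Changes accumulate in memory, the log is forced to disk up to and including the Commit Record, and only then are the real data pages touched:

import java.util.ArrayList;
import java.util.List;

// Toy model of Commit Logging; names and record formats are illustrative only.
class CommitLoggingTxn {
    interface AppendOnlyLog { void append(String record); void flush(); }
    interface DataStore { String read(String key); void apply(String key, String value); }

    record Change(String key, String oldValue, String newValue) {}

    private final List<Change> pending = new ArrayList<>();
    private final AppendOnlyLog log;   // sequential-append log file
    private final DataStore store;     // the "real" data on disk

    CommitLoggingTxn(AppendOnlyLog log, DataStore store) {
        this.log = log;
        this.store = store;
    }

    void write(String key, String newValue) {
        // Before commit, the real data is never touched; changes only accumulate.
        pending.add(new Change(key, store.read(key), newValue));
    }

    void commit() {
        for (Change c : pending) log.append("CHANGE " + c);
        log.append("COMMIT");   // the Commit Record: once durable, the transaction is decided
        log.flush();            // force the log to disk before modifying any real data
        for (Change c : pending) store.apply(c.key(), c.newValue());
        log.append("END");      // marks the transaction as fully persisted
    }
}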

Additional knowledge: Shadow Paging
Implementing the atomicity and durability of transactions through logs is the mainstream solution today, but it is not the only option. Besides logs, there is another transaction implementation mechanism called "Shadow Paging" (rendered in Chinese materials as 影子分页). The transaction mechanism used by the popular lightweight database SQLite Version 3 is Shadow Paging.
The general idea of Shadow Paging is that changes are still written to the data on disk, but not by modifying the original data in place: a copy is made first, the original is retained, and the copy is modified. During the transaction, the modified data therefore exists in two versions at once, the data before modification and the data after, which is the origin of the name "shadow".
When the transaction commits successfully and all data modifications have been persisted, the final step is to change the data's reference pointer from the original to the newly modified copy. This last "modify the pointer" operation is treated as atomic: on modern disks a single pointer write is guaranteed by the hardware not to end up "half changed". Shadow Paging can therefore also guarantee atomicity and durability.
Shadow Paging implements transactions more simply than Commit Logging, but once isolation and concurrent locking come into play, its transaction concurrency is rather limited, so it is not widely used in high-performance databases.
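A minimal sketch of the pointer-swap idea behind Shadow Paging; the AtomicReference stands in for the on-disk root pointer whose single write the hardware makes atomic, and the flushing of the copy is elided:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Toy shadow-paging model; the AtomicReference plays the role of the on-disk root pointer.
class ShadowPagingStore {
    private final AtomicReference<Map<String, String>> currentPage =
            new AtomicReference<>(new HashMap<>());

    void transactionalUpdate(String key, String value) {
        Map<String, String> original = currentPage.get();
        Map<String, String> shadow = new HashMap<>(original); // copy; original stays intact
        shadow.put(key, value);                               // modify only the copy
        // ... here the shadow copy would be flushed to disk ...
        currentPage.set(shadow);  // the single atomic "modify pointer" step = commit
        // Crash before the swap: readers still see the untouched original (atomicity).
        // Crash after the swap: the flushed copy is the data (durability).
    }
}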

How Commit Logging guarantees durability and atomicity:
First, once the Commit Record has been successfully written to the log, the whole transaction is successful. Even if a crash happens while the data is being modified, after restart the database can restore the state from the log information already on disk and continue the modifications, which guarantees durability. Second, if the crash happens before the log is completely written, then after restart the system will see log entries without a Commit Record; that part of the log is marked as rolled back, and the whole transaction looks as if it never happened at all, which guarantees atomicity.

Commit Logging implements transactions simply and clearly, and some databases really do use it (a representative example is Alibaba's OceanBase). However, Commit Logging has a major flaw: no real modification of data may occur until after the transaction commits and the Commit Record is in the log. Even if disk I/O is idle before commit, and even if a transaction modifies a huge volume of data that occupies a large amount of buffer memory, the data on disk may not be touched before the transaction commits. This is very detrimental to database performance.

It was to remove this defect that ARIES appeared. ARIES proposed the "Write-Ahead Logging" improvement; the "write-ahead" in its name means that changed data is allowed to be written out before the transaction commits.

Write-Ahead Logging first classifies when changed data may be written, relative to the transaction commit point, into FORCE and STEAL:

  • FORCE: if the changed data must be flushed at the moment the transaction commits, the policy is called FORCE; if it need not be, it is called NO-FORCE. In practice most databases adopt NO-FORCE: as long as the log exists, the changed data can be persisted at any time, so from the perspective of disk I/O performance there is no need to force the data out immediately.
  • STEAL: if changed data is allowed to be written out before the transaction commits, the policy is called STEAL; if not, NO-STEAL. Again from the perspective of disk I/O performance, allowing early writes helps utilize idle I/O resources and saves memory in the database cache.

Commit Logging allows NO-FORCE but cannot allow STEAL: if part of the changed data were written to disk before commit, then once the transaction rolled back or a crash occurred, that prematurely written data would become erroneous.

Write-Ahead Logging allows both NO-FORCE and STEAL. Its solution is to add another log, the Undo Log. Before any changed data is written to disk, an Undo Log entry must be recorded first, stating which location is modified and from what value to what value, so that during transaction rollback or crash recovery the prematurely written changes can be erased according to the Undo Log.

Undo Log is now generally translated as "rollback log"; the log described earlier, used to replay data changes during crash recovery, is correspondingly named Redo Log, generally translated as "redo log". With the addition of the Undo Log, Write-Ahead Logging goes through three phases during crash recovery:

  • Analysis phase: scan the log from the last checkpoint (Checkpoint, a point before which all changes that should be persisted are known to be safely on disk), find all transactions without an End Record, and build the set of transactions to be recovered (typically comprising a Transaction Table and a Dirty Page Table).

  • Redo phase: based on the set produced by the analysis phase, repeat history (Repeat History): find all logs that contain a Commit Record, write their changes to disk, append an End Record when the write completes, and remove those transactions from the recovery set.

  • Undo (rollback) phase: handle the transactions remaining in the recovery set after the analysis and redo phases. These remaining transactions, called losers, need to be rolled back according to the information in the Undo Log.

Operations in both the redo phase and the undo phase must be designed to be idempotent. In pursuit of high performance, these three phases inevitably involve very intricate concepts and details (such as the concrete data structures of the Redo Log and Undo Log), which we will not expand on here.
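A highly simplified sketch of the three recovery phases over a toy log, using the invented CHANGE/COMMIT/END record types from the earlier sketch; real ARIES structures (LSNs, the Dirty Page Table, checkpointing) are omitted:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy crash recovery in the spirit of ARIES; record formats are invented for illustration.
class ToyRecovery {
    record LogRecord(String txId, String type, String undoInfo, String redoInfo) {}

    interface DataStore { void applyRedo(String info); void applyUndo(String info); }

    static void recover(List<LogRecord> log, DataStore store) {
        Set<String> unfinished = new HashSet<>();  // transactions with no End Record
        Set<String> committed = new HashSet<>();   // committed but not yet ended
        // 1. Analysis: scan forward (in reality, from the last checkpoint).
        for (LogRecord r : log) {
            switch (r.type()) {
                case "CHANGE" -> unfinished.add(r.txId());
                case "COMMIT" -> committed.add(r.txId());
                case "END" -> { unfinished.remove(r.txId()); committed.remove(r.txId()); }
            }
        }
        // 2. Redo: repeat history for committed-but-unended transactions (idempotent).
        for (LogRecord r : log) {
            if (r.type().equals("CHANGE") && committed.contains(r.txId())) {
                store.applyRedo(r.redoInfo());
            }
        }
        // 3. Undo: roll back the losers, scanning backwards (also idempotent).
        for (int i = log.size() - 1; i >= 0; i--) {
            LogRecord r = log.get(i);
            if (r.type().equals("CHANGE") && unfinished.contains(r.txId())
                    && !committed.contains(r.txId())) {
                store.applyUndo(r.undoInfo());
            }
        }
    }
}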

Write-Ahead Logging is part of ARIES. The full ARIES theory has many virtues, rigor and high performance among them, but at the cost of complexity. According to "whether FORCE and STEAL are allowed", a database can choose among four combinations. From the perspective of disk I/O, NO-FORCE plus STEAL is undoubtedly the highest-performing combination; from the perspective of algorithm and log implementation, NO-FORCE plus STEAL is also undoubtedly the most complex. The relationship between the four combinations and the Undo Log and Redo Log is shown below:
  • NO-FORCE + STEAL: both Redo Log and Undo Log required (highest performance, highest complexity)
  • NO-FORCE + NO-STEAL: Redo Log required only
  • FORCE + STEAL: Undo Log required only
  • FORCE + NO-STEAL: neither log required

Achieving Isolation

Isolation ensures that the data read and written by different transactions are independent of one another and do not interfere. By definition, isolation is closely tied to concurrency: if there were no concurrency and all transactions ran serially, no isolation would be needed, or rather such access would have natural isolation. But in reality concurrency is unavoidable. How do we achieve serialized data access under concurrency? Almost every programmer will answer: lock synchronization! Correct. Modern databases provide the following three kinds of locks.

  • Write lock (Write Lock, also called exclusive lock, eXclusive Lock, abbreviated X-Lock): if data holds a write lock, only the transaction holding that lock may write the data; while the write lock is held, no other transaction may write the data or impose a read lock on it.

  • Read lock (Read Lock, also called shared lock, Shared Lock, abbreviated S-Lock): multiple transactions may place read locks on the same data at the same time. Once data holds a read lock it cannot be given a write lock, so other transactions cannot write it, but they may still read it. If only one transaction holds the read lock on a piece of data, that transaction is allowed to upgrade it directly to a write lock and then write the data (a small compatibility sketch follows this list).

  • Range lock (Range Lock): places an exclusive lock directly on a range; data within the range cannot be written. The following statement is a typical example of adding a range lock:

SELECT * FROM books WHERE price < 100 FOR UPDATE;

Please note the difference between "the range cannot be written" and "a batch of data cannot be written"; do not understand a range lock as a set of exclusive locks. After a range lock is added, not only can the existing data inside the range not be modified, but data also cannot be inserted into or deleted from the range, which a set of exclusive locks cannot achieve.
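To make the read/write lock compatibility rules concrete, here is a toy in-memory lock table; it is a sketch for illustration only, not how any real database implements its lock manager:

import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy per-record lock table illustrating S-lock/X-lock compatibility and lock upgrade.
class ToyLockManager {
    private final Map<String, Set<String>> readers = new HashMap<>(); // key -> txIds holding S-locks
    private final Map<String, String> writer = new HashMap<>();       // key -> txId holding the X-lock

    synchronized boolean acquireRead(String key, String txId) {
        String w = writer.get(key);
        if (w != null && !w.equals(txId)) return false;   // an X-lock blocks new S-locks
        readers.computeIfAbsent(key, k -> new HashSet<>()).add(txId);
        return true;
    }

    synchronized boolean acquireWrite(String key, String txId) {
        Set<String> rs = readers.getOrDefault(key, Set.of());
        boolean soleReader = rs.isEmpty() || (rs.size() == 1 && rs.contains(txId));
        if (!soleReader) return false;                    // other readers block the X-lock
        String w = writer.get(key);
        if (w != null && !w.equals(txId)) return false;   // another writer blocks it too
        writer.put(key, txId);                            // grant, or upgrade S -> X
        return true;
    }

    synchronized void releaseAll(String txId) {           // called when a transaction ends
        readers.values().forEach(s -> s.remove(txId));
        writer.values().removeIf(txId::equals);
    }
}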

Four isolation levels for local transactions

Dirty reads, non-repeatable reads, and phantom reads may occur at different isolation levels, as summarized below (each level permits the phenomena listed; the following sections explain why):

  • Serializable: none of the three phenomena
  • Repeatable Read: phantom reads possible
  • Read Committed: phantom reads and non-repeatable reads possible
  • Read Uncommitted: phantom reads, non-repeatable reads, and dirty reads possible

  • Dirty read: one transaction modifies data without yet committing while another transaction reads that data; the second transaction reads uncommitted data.

  • Non-repeatable read: within the same transaction, the same data is read multiple times with different results. For example, the first transaction reads a piece of data twice, but a modification by a second transaction in between causes the two reads to be inconsistent.

  • Phantom read: occurs when a transaction does not execute in isolation from inserts. For example, the first transaction modifies all rows in a table, the second transaction inserts a new row, and the first transaction then discovers that the table still contains an unmodified row, as if a phantom had appeared.

Serializable

Serializable: serialized access provides the highest level of isolation, and Serializable is the highest isolation level defined in ANSI/ISO SQL-92. It fully matches the ordinary programmer's intuition about data races: lock everything. If performance is ignored, "simply" adding read locks, write locks, and range locks to all data the transaction reads and writes serializes it ("simply" is a simplification for understanding; in reality the relationship between read locks, write locks, and data must be handled in two phases, expanding and shrinking, known as Two-Phase Locking, 2PL; a toy sketch follows below). But a database absolutely cannot ignore performance. Concurrency control theory (Concurrency Control) dictates that the degree of isolation and concurrency are in conflict: the higher the isolation, the lower the throughput under concurrent access. Modern databases therefore always provide isolation levels other than Serializable, letting users adjust the locking behavior to strike their own balance between isolation and throughput.
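A minimal sketch of the two-phase locking discipline just mentioned, reusing the toy lock manager from the lock sketch above: the expanding phase only acquires locks, and everything is released together when the transaction ends (the shrinking phase). Real systems block and queue waiters rather than spinning:

import java.util.LinkedHashSet;
import java.util.Set;

// Toy strict two-phase locking: acquire-only while running, release-all at the end.
class TwoPhaseLockingTxn {
    private final ToyLockManager locks;   // the toy lock manager sketched earlier
    private final String txId;
    private final Set<String> held = new LinkedHashSet<>();

    TwoPhaseLockingTxn(ToyLockManager locks, String txId) {
        this.locks = locks;
        this.txId = txId;
    }

    void read(String key) {
        while (!locks.acquireRead(key, txId)) Thread.onSpinWait();  // expanding phase: only acquire
        held.add(key);
        // ... perform the read ...
    }

    void write(String key) {
        while (!locks.acquireWrite(key, txId)) Thread.onSpinWait(); // may upgrade an S-lock
        held.add(key);
        // ... perform the write ...
    }

    void end() {
        locks.releaseAll(txId);  // shrinking phase: release everything, acquire nothing afterwards
        held.clear();
    }
}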

Repeatable Read

Repeatable Read: repeatable read adds read locks and write locks to the data the transaction touches and holds them until the end of the transaction, but it adds no range locks. Where repeatable read is weaker than serializable is the phantom read problem (Phantom Reads): during the execution of a transaction, two identical range queries return different result sets. For example, to count how many books in Fenix's Bookstore sell for less than 100 yuan, the first SQL statement below is executed:

SELECT count(1) FROM books WHERE price < 100					/* time sequence: 1, transaction: T1 */
INSERT INTO books(name,price) VALUES ('深入理解Java虚拟机',90)	/* time sequence: 2, transaction: T2 */
SELECT count(1) FROM books WHERE price < 100					/* time sequence: 3, transaction: T1 */

Given the definitions of range, read, and write locks above, if this query is executed twice inside the same transaction, and another transaction happens to insert a book priced under 100 yuan in between, that insert is allowed, so the two identical queries return different results. The reason is that repeatable read has no range lock to forbid inserting new data into the queried range; this is one transaction being affected by another, a breach of isolation.

A reminder: the introduction here follows ARIES theory, and a concrete database need not implement it exactly. For example, the default isolation level of MySQL/InnoDB is repeatable read, yet it completely avoids phantom reads in read-only transactions (because InnoDB uses MVCC: at the repeatable read level it only reads record versions less than or equal to the current transaction ID; MVCC is introduced below). Transaction T1 in the example above contains only queries and is read-only, so the problem in the example does not appear in MySQL. In read-write transactions, however, MySQL still exhibits phantom reads: if transaction T1, instead of re-running the count, modifies all books priced under 100 yuan after another transaction has inserted new ones, the modification will also affect the newly inserted books.

Read Committed

Read Committed: under read committed, write locks on the data involved in the transaction last until the end of the transaction, but read locks are released as soon as the query finishes. Where read committed is weaker than repeatable read is the non-repeatable read problem (Non-Repeatable Reads): during one transaction, two queries of the same data return different results. For example, I want to fetch the price of the book 《深入理解Java虚拟机》 in Fenix's Bookstore and execute two SQL queries; between them, another transaction changes the book's price from 90 yuan to 110 yuan, as shown below:

SELECT * FROM books WHERE id = 1;   						/* time sequence: 1, transaction: T1 */
UPDATE books SET price = 110 WHERE id = 1; COMMIT;			/* time sequence: 2, transaction: T2 */
SELECT * FROM books WHERE id = 1; COMMIT;   				/* time sequence: 3, transaction: T1 */

At the read committed level, the two executions of the query return different results, because read committed does not hold read locks for the whole transaction and cannot prevent the data it has read from changing; transaction T2's update can commit successfully right away. This, too, is one transaction affected by another, a breach of isolation. At the repeatable read level, by contrast, the data remains locked by T1 after being read, so T2 cannot obtain the write lock; its update blocks until T1 commits or rolls back, and only then commits.

Read Uncommitted

Read Uncommitted: read uncommitted only adds write locks to the data the transaction touches, holding them until the end of the transaction, but adds no read locks at all. Where read uncommitted is weaker than read committed is the dirty read problem (Dirty Reads): during execution, one transaction reads another transaction's uncommitted data. For example, I decide that raising the price of 《深入理解Java虚拟机》 from 90 to 110 yuan hurts readers' interests and execute an update to change it back to 90 yuan; before I commit, a colleague explains that the increase was not arbitrary but reflects higher printing costs, and selling at 90 yuan loses money, so I immediately roll back the transaction. The scene is shown in the SQL below:

SELECT * FROM books WHERE id = 1;   						/* time sequence: 1, transaction: T1 */
/* note that there is no COMMIT here */
UPDATE books SET price = 90 WHERE id = 1;					/* time sequence: 2, transaction: T2 */
/* this SELECT simulates the logic of a purchase */
SELECT * FROM books WHERE id = 1;			  				/* time sequence: 3, transaction: T1 */
ROLLBACK;			  										/* time sequence: 4, transaction: T2 */

But in the window before the rollback, transaction T1 has already sold several copies at 90 yuan. The reason is that read uncommitted adds no read locks at all, which instead allows it to read data on which other transactions hold write locks. If the word "instead" is confusing, re-read the definition of a write lock: a write lock forbids other transactions from adding read locks, not from reading. Since T1 never needs a read lock to read, T2's uncommitted data is immediately visible to T1, and the two queries in T1 return different results. This is again a transaction affected by another, isolation being broken. At the read committed level, T2 holds the write lock on the data, so T1's second query cannot obtain the read lock that read committed requires before reading; the query blocks until T2 commits or rolls back, and only then gets its result.

In fact, the different isolation levels, and phenomena such as phantom reads, non-repeatable reads, and dirty reads, are only surface appearances: they are the combined result of various locks applied over various durations. Using locks as the means of achieving isolation is the root cause of databases exhibiting different isolation levels.

Basic principles of MVCC

The four isolation levels share another trait: phantom reads, non-repeatable reads, and dirty reads all stem from one transaction reading data while another transaction is writing it. For this "one transaction reads + another transaction writes" isolation problem, a lock-free optimization called "Multi-Version Concurrency Control" (MVCC) has been widely adopted by mainstream commercial databases in recent years. MVCC is a read optimization strategy; its "lock-free" means no lock is needed when reading. The basic idea of MVCC is that any modification to the database never overwrites data in place; instead it produces a new version of the data that coexists with the old version, so that reads need no locks at all. "Version" is the key word in that sentence: imagine every row carrying two invisible fields, CREATE_VERSION and DELETE_VERSION, both recording transaction IDs, where transaction IDs are globally and strictly increasing. Data is then written according to the following rules.

  • Inserting data: CREATE_VERSION records the inserting transaction's ID; DELETE_VERSION is empty.
  • Deleting data: DELETE_VERSION records the deleting transaction's ID; CREATE_VERSION is empty.
  • Modifying data: treat the modification as "delete the old, insert the new", i.e., first copy the original row; on the original row, DELETE_VERSION records the modifying transaction's ID and CREATE_VERSION stays empty, while on the new copy, CREATE_VERSION records the modifying transaction's ID and DELETE_VERSION is empty.

At this point, if another transaction wants to read the changed data, it decides which version to read according to its isolation level:

Repeatable read: always read the version whose CREATE_VERSION is less than or equal to the current transaction ID; if multiple versions still qualify, take the newest one (the largest transaction ID).
Read committed: always take the newest version, that is, the most recently committed version of the record.
The other two isolation levels have no need for MVCC: read uncommitted directly modifies the original data, and other transactions see the change immediately, so no version field is needed at all; serializable's original semantics is to block other transactions' reads, while MVCC exists precisely for lock-free reading, so the two are naturally not used together.
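A sketch of the version-selection rules above, with the two invisible fields made explicit; the field names follow the text, everything else is illustrative:

import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Toy MVCC visibility check based on the CREATE_VERSION / DELETE_VERSION fields above.
class MvccRead {
    record RowVersion(long createVersion, Long deleteVersion, String payload) {}

    // Repeatable read: only versions created by transactions up to our own ID are visible,
    // and the version must not have been deleted by such a transaction either.
    static Optional<RowVersion> repeatableRead(List<RowVersion> versions, long currentTxId) {
        return versions.stream()
                .filter(v -> v.createVersion() <= currentTxId)
                .filter(v -> v.deleteVersion() == null || v.deleteVersion() > currentTxId)
                .max(Comparator.comparingLong(RowVersion::createVersion)); // newest qualifying
    }

    // Read committed: simply take the newest committed, undeleted version.
    static Optional<RowVersion> readCommitted(List<RowVersion> committedVersions) {
        return committedVersions.stream()
                .filter(v -> v.deleteVersion() == null)
                .max(Comparator.comparingLong(RowVersion::createVersion));
    }
}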

MVCC only optimizes "read + write" scenarios. If two transactions modify the same data simultaneously, i.e. "write + write", there is little room for optimization; locking is almost the only feasible solution, and the only real question is whether the locking strategy is "optimistic" or "pessimistic". The locking described above is a pessimistic strategy: it assumes that if you do not lock before accessing data, problems will certainly occur. The optimistic strategy, by contrast, assumes that contention between transactions is the exception and its absence the norm, so one should not lock up front but apply remedies when contention actually occurs. This line of thought is called "Optimistic Concurrency Control" (OCC). A reminder: do not be superstitious about the claim that optimistic locking is faster than pessimistic locking; it depends purely on how fierce the contention is. Under fierce contention, optimistic locking is slower.
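One common way to realize optimistic concurrency control over JDBC is a version column checked in the UPDATE's WHERE clause; note that the books.version column here is an assumption for illustration, not part of the earlier schema:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

// Optimistic update sketch: no lock is taken up front; the version check detects conflicts.
class OptimisticPriceUpdate {
    // Returns true if the update won; false means another transaction changed the row first.
    static boolean updatePrice(Connection conn, long bookId, int newPrice, long expectedVersion)
            throws SQLException {
        String sql = "UPDATE books SET price = ?, version = version + 1 " +
                     "WHERE id = ? AND version = ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setInt(1, newPrice);
            ps.setLong(2, bookId);
            ps.setLong(3, expectedVersion);
            return ps.executeUpdate() == 1;  // 0 rows touched = lost the race; retry or report
        }
    }
}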

Global Transactions

The opposite of a local transaction is a global transaction (Global Transaction), which some materials also call an external transaction (External Transaction). In this section, "global transaction" is restricted to the transaction solution for the scenario where a single service uses multiple data sources. Note that, in theory, a true global transaction carries no "single service" constraint; it is a transaction processing scheme that still pursues strong consistency in a distributed environment, which makes it extremely inappropriate where multiple nodes call one another's services (typically today's microservice systems), so today it is used almost exclusively in single-service, multi-data-source settings. To avoid confusion with the weakly consistent, ACID-abandoning transaction approaches introduced later, the scope of "global transaction" here has been deliberately narrowed; transactions involving multiple services and multiple data sources will subsequently be called "distributed transactions".

XA Protocol

In 1991, to solve the consistency problem of distributed transactions, the X/Open organization (later merged into The Open Group) proposed a transaction processing architecture called X/Open XA (XA is short for eXtended Architecture). Its core is the definition of the communication interface between a global transaction manager (Transaction Manager, which coordinates global transactions) and local resource managers (Resource Manager, which drive local transactions). The XA interface is bidirectional: it forms a communication bridge between one transaction manager and multiple resource managers, and by coordinating the consistent actions of multiple data sources it realizes the unified commit or unified rollback of a global transaction. The names XADataSource and XAResource that we occasionally see in Java code come from XA.

However, XA is not a Java technical specification (Java did not yet exist when XA was proposed) but a language-independent general specification. Java therefore defined JSR 907, the Java Transaction API, as the standard for implementing XA-style global transaction processing in the Java language; this is what we now know as JTA. JTA's two main interfaces are:

  • The transaction manager's interface: javax.transaction.TransactionManager. This set of interfaces is for Java EE servers to provide container-managed transactions (the container handles transaction management automatically); a parallel set, javax.transaction.UserTransaction, is provided for manually beginning, committing, and rolling back transactions in program code.
  • The resource definition interface satisfying the XA specification: javax.transaction.xa.XAResource. For any resource (JDBC, JMS, etc.) to support JTA, it only needs to implement the methods of the XAResource interface.

JTA was originally a Java EE technology and would normally be supported by Java EE containers such as JBoss, WebSphere, and WebLogic. But now Bitronix, Atomikos, and JBossTM (formerly called Arjuna), as well as JOTM (Java Open Transaction Manager), provide implementations of the JTA interfaces in the form of plain JAR packages, enabling us to use JTA in Java SE environments such as Tomcat and Jetty.

Now let's change one assumption in this chapter's scenario: the bookstore's users, merchants, and warehouse live in different databases, with everything else unchanged. What happens then? If you usually code with declarative transactions, it may look exactly the same as the local case, just a @Transactional annotation; but if you implement it with programmatic transactions, the difference shows in the code. The pseudocode looks like this:

public void buyBook(PaymentBill bill) {
    userTransaction.begin();
    warehouseTransaction.begin();
    businessTransaction.begin();
    try {
        userAccountService.pay(bill.getMoney());
        warehouseService.deliver(bill.getItems());
        businessAccountService.receipt(bill.getMoney());
        userTransaction.commit();
        warehouseTransaction.commit();
        businessTransaction.commit();
    } catch (Exception e) {
        userTransaction.rollback();
        warehouseTransaction.rollback();
        businessTransaction.rollback();
    }
}

Two-Phase Commit

As the code shows, the intent is to perform three transaction commits, but the code cannot actually be written this way. Imagine an error occurring at businessTransaction.commit(): execution transfers to the catch block, but userTransaction and warehouseTransaction have already been committed by then, and calling their rollback() methods accomplishes nothing. Part of the data would end up committed and the rest rolled back, with no way to guarantee the consistency of the whole transaction. The code should instead open one single global transaction, as sketched below.
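A hedged sketch of the corrected shape under JTA, reusing the placeholder service objects from the pseudocode above and assuming the three databases are accessed through XA-capable data sources, with a javax.transaction.UserTransaction available:

public void buyBook(PaymentBill bill) throws Exception {
    userTransaction.begin();          // one global transaction instead of three local ones
    try {
        // each operation's XA data source enlists in the same global transaction
        userAccountService.pay(bill.getMoney());
        warehouseService.deliver(bill.getItems());
        businessAccountService.receipt(bill.getMoney());
        userTransaction.commit();     // the transaction manager coordinates all three resources
    } catch (Exception e) {
        userTransaction.rollback();   // one rollback undoes every enlisted resource
        throw e;
    }
}

To make that single commit safe across multiple resource managers, XA splits transaction commit into a two-phase process: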

  • Preparation phase: also called the voting phase. The coordinator asks every transaction participant whether it is ready to commit; a participant that is ready replies Prepared, otherwise Non-Prepared. "Preparation" here is not what the word means in everyday speech: for a database, preparing means recording everything the commit will do in the redo log. This differs from a real commit in a local transaction only in that the final Commit Record is temporarily withheld, which means the data is persisted but isolation is not released; locks are still held, keeping the data isolated from other non-participating observers.
  • Commit phase: also called the execution phase. If the coordinator received Prepared from every participant in the previous phase, it first persists its own transaction state as Commit locally and then sends the Commit instruction to all participants, which commit immediately; otherwise, if any participant replied Non-Prepared or any participant timed out without replying, the coordinator persists its own state as Abort and sends the Abort instruction, and participants roll back immediately. For a database, the commit operation in this phase is very light, just persisting a Commit Record, and usually completes quickly; only on receiving Abort must it clean up the already-prepared data according to the rollback log, which can be a relatively heavy operation.

These two steps are called the "Two-Phase Commit" (2PC) protocol, and for it to successfully guarantee consistency it needs some additional prerequisites:

  1. It must be assumed that the network is reliable during the short commit phase, i.e. no message is lost then. It is further assumed that the network suffers no corruption for the whole duration: messages may be lost, but a wrong message will never be delivered. XA's design goal is not to solve problems like the Byzantine generals. In two-phase commit, a failure during the voting phase can be remedied (rolled back), while a failure during the commit phase cannot (the commit or rollback outcome will not be changed; a crashed node can only be waited on to recover), so the commit phase should be as short as possible, which is also an attempt to control network risk as much as possible.
  2. It must be assumed that a node lost to network partition, machine crash, or any other reason will eventually recover and not remain down forever. Since the complete redo log was written during the preparation phase, once a crashed machine recovers it can find from its log the transaction data that was prepared but not committed, then ask the coordinator for the transaction's status to decide whether the next step is a commit or a rollback.

Please note that the coordinator and participants above are usually played by the databases themselves, without application involvement; the coordinator is generally elected among the participants, and the application is merely a client of the database. The interaction sequence of two-phase commit is shown below.
(Figure: interaction sequence of two-phase commit)
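To see the decision logic in one place, here is a toy coordinator; the Participant interface and every name in it are inventions for this sketch, not the actual XA interfaces, and timeouts and retries are omitted:

import java.util.List;

// Toy two-phase commit coordinator; all names are illustrative, not the XA interfaces.
class ToyCoordinator {
    interface Participant {
        boolean prepare();   // write redo log, hold locks; true = Prepared, false = Non-Prepared
        void commit();       // persist the Commit Record
        void rollback();     // undo prepared work from the rollback log
    }

    void runTransaction(List<Participant> participants) {
        // Phase 1: voting. Any refusal (or, in reality, a timeout) aborts the transaction.
        boolean allPrepared = participants.stream().allMatch(Participant::prepare);

        // Persist the decision locally BEFORE telling anyone (the coordinator's own log).
        persistDecision(allPrepared ? "COMMIT" : "ABORT");

        // Phase 2: execution. From this point on, the outcome must never change.
        if (allPrepared) participants.forEach(Participant::commit);
        else             participants.forEach(Participant::rollback);
    }

    private void persistDecision(String decision) {
        // ... force the decision record to durable storage ...
    }
}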
The principle of two-phase commit is simple and not hard to implement, but it has several very significant drawbacks:

  • Single point of failure: the coordinator plays a pivotal role in two-phase commit. The coordinator can apply a timeout while waiting for participants' replies and so tolerate a participant going down, but participants cannot time out while waiting for the coordinator's instruction. If it is the coordinator, not a participant, that goes down, all participants are affected: if the coordinator never recovers and never sends a Commit or Rollback instruction, every participant must wait.
  • Performance: during two-phase commit, all participants are effectively bound into a single scheduled whole. The process involves two rounds of remote calls and three rounds of data persistence (the redo log writes in the preparation phase, the coordinator persisting its decision, and the Commit Record write in the commit phase), and the whole process lasts until the slowest participant in the cluster finishes, which determines that two-phase commit usually performs poorly.
  • Consistency risk: as noted, two-phase commit rests on prerequisites; when the assumptions of network stability and crash recoverability do not hold, consistency problems remain possible. Crash recoverability needs little elaboration: in 1985 Fischer, Lynch, and Paterson published the "FLP impossibility" result, proving that if crashed nodes can never be recovered, no distributed protocol can correctly reach a consensus. FLP is a theorem of the same standing in distributed computing as the CAP theorem. The consistency risk from network instability is this: although the commit window is very short, it is still a clearly dangerous period. If, after confirming that every participant is prepared, the coordinator persists its Commit decision and commits its own transaction, and the network then cuts out so the Commit instruction can no longer reach all participants, then some data (the coordinator's) has been committed while some data (the participants') has been neither committed nor rolled back, producing inconsistency.

Three-Phase Commit

To alleviate some of two-phase commit's defects, specifically the coordinator single point and the performance cost of the preparation phase, the "Three-Phase Commit" (3PC) protocol was later developed. Three-phase commit splits the original preparation phase in two, called CanCommit and PreCommit, and renames the commit phase DoCommit. The new CanCommit is an inquiry phase in which the coordinator asks each participating database to evaluate, based on its own state, whether the transaction is likely to complete successfully. The reason for splitting the preparation phase is that it is a heavy operation: once the coordinator sends the prepare message, each participant immediately starts writing redo logs and locking the data involved; if some participant then declares it cannot commit, everyone's work was wasted. Adding a round of inquiry first means that by the time preparation starts there is more confidence the transaction will commit, so the probability of rollback becomes smaller. Consequently, in scenarios where the transaction must roll back, three-phase commit usually performs much better than two-phase; in scenarios where the transaction commits normally, both still perform poorly, and three-phase is even slightly worse because of the extra round of inquiry.

Also because the probability of rollback becomes smaller, three-phase commit changes the default behavior: if the coordinator goes down after the PreCommit phase, i.e. a participant never receives the DoCommit message, the participant's default action is to commit the transaction rather than roll back or keep waiting, which avoids the risk of the coordinator as a single point of failure. The operation sequence of three-phase commit is shown below.
(Figure: operation sequence of three-phase commit)

As the process shows, three-phase commit improves the single point problem and the performance of the rollback path, but it does not improve the consistency risk; in that respect the risk even grows slightly. For example, suppose after PreCommit the coordinator's decision is Abort rather than Ack: a participant that, due to network trouble, fails to receive the Abort instruction before timing out will mistakenly commit its transaction, producing inconsistency among the participants.

Shared Transactions

In contrast to the single-service, multi-data-source setting of global transactions, shared transactions (Share Transaction) refer to multiple services sharing the same data source. It is worth re-emphasizing the difference between "data source" and "database": a data source is a logical device that provides data and need not correspond one-to-one with a physical device. The most common pattern when deploying application clusters is to deploy the same program on multiple middleware servers as replica instances that share traffic. Although they all connect to the same database, each node is configured with its own exclusive data source, usually exposed to application code by the middleware via JNDI. In that case the data access of all replica instances is completely independent with no intersection, and each node still uses the simplest local transactions. This section instead discusses the scenario where multiple distinct services have overlapping business. As a concrete example, suppose in Fenix's Bookstore the user accounts, merchant accounts, and product warehouse are all stored in the same database, but users, merchants, and warehouse are each deployed as independent microservices; a book purchase then runs through all three microservices, each of which must modify data in that one database.

If we simply treat different data-source connections as different databases, the global transactions of the previous section and the distributed transactions of the next are both workable. But for this special case, where every data source connects to the same physical database, shared transactions have a chance to become another way to improve performance and reduce complexity; of course, they may also well be a false requirement.

One theoretically feasible approach is to let the services directly share database connections. Sharing a connection among different persistence tools (JDBC, ORM, JMS, and so on) within the same application process is not hard; some middleware servers, such as WebSphere, even have a built-in "shareable connection" feature to support exactly this. But such sharing presumes all users of the data source live in the same process: a database connection sits on a network connection bound to an IP address and port, so "different service nodes sharing a database connection" is, taken literally, hard to achieve. To realize shared transactions, an intermediate role, a "transaction server", must therefore be added. User service, merchant service, and warehouse service all talk to the database through this same transaction server. If the transaction server's external interface follows the JDBC specification, it can be viewed as a remote database connection pool independent of the services, or simply as a database proxy. Transaction requests issued by the three services can then land on the same database connection inside the transaction server and complete through a local transaction; for example, the transaction server can use one and the same connection for requests carrying the same transaction ID from different service nodes, thereby handling a transaction that spans several services, as shown below.

(Figure: multiple services sharing one database through a transaction server)

The reason "theoretically feasible" is stressed is that this solution runs opposite to where the pressure lies in real production systems: the database is the hardest-pressed and hardest-to-scale part of a service cluster. That is why, in reality, there are database proxies like ProxySQL and MaxScale that balance load across multiple database instances (indeed, using ProxySQL to front a single database with Connection Multiplexing enabled comes close to the transaction server described above), yet there is almost no "transaction proxy" that fronts one database to provide transaction coordination for multiple applications. This is also why shared transactions are more likely a pseudo-requirement: if you have sufficient reasons for multiple microservices to share one database, you had better find an even more tenable reason to explain to the team what the purpose of splitting into microservices was.

A more common variant of the above appears in daily development: replacing the transaction server with a message queue server. When the user, merchant, and warehouse services perform business operations, all their database changes are conveyed as messages to the message queue server and applied uniformly by a message consumer, achieving persistence guaranteed by a single local transaction. This is called "message-driven update of a single database".
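A minimal sketch of that variant, assuming some message queue delivers change messages to a single consumer that applies a whole business operation in one local transaction; every class and method name here is illustrative:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

// Toy consumer for "message-driven update of a single database": services publish change
// messages; this single consumer applies each business operation in one local transaction.
class ChangeMessageConsumer {
    record ChangeMessage(String sql, List<Object> params) {}

    void applyBusinessOperation(Connection conn, List<ChangeMessage> changes) throws Exception {
        conn.setAutoCommit(false);        // one local transaction for all of the changes
        try {
            for (ChangeMessage m : changes) {
                try (PreparedStatement ps = conn.prepareStatement(m.sql())) {
                    for (int i = 0; i < m.params().size(); i++)
                        ps.setObject(i + 1, m.params().get(i));
                    ps.executeUpdate();
                }
            }
            conn.commit();                // all three services' changes land together
        } catch (Exception e) {
            conn.rollback();              // any failure undoes the whole batch
            throw e;
        }
    }
}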

Neither the idea of "shared transactions" nor the two approaches listed here is worth advocating in practice; successful cases are rare, and almost everything that can be found on the subject traces back to the article "Distributed Transactions in Spring, with and without XA" by Spring core developer Dave Syer. I list shared transactions as one of this chapter's four transaction categories only to complete the narrative logic. Although sharing a database after splitting into microservices is not rare in reality, I personally do not endorse shared transactions as a routine solution to consider.

Distributed Transactions

The distributed transactions (Distributed Transaction) discussed in this chapter refer specifically to the transaction processing mechanism in which multiple services simultaneously access multiple data sources. Note how this differs from the "distributed transaction" of the DTP model: the "distributed" in DTP is relative to data sources and does not involve services, a case already covered in the "Global Transactions" section; the "distributed" here is relative to services, and strictly speaking should be called "the transaction processing mechanism in a distributed service environment".

Before 2000, people still hoped the XA transaction mechanism could serve well in the distributed environment described here, but that wish has since been thoroughly shattered by the CAP theory. The discussion that follows starts from the contradiction between CAP and ACID.

Contradiction between CAP and ACID

The CAP theorem (Consistency, Availability, Partition tolerance theorem), also called Brewer's theorem, originated in July 2000 as a conjecture proposed by Professor Eric Brewer of the University of California, Berkeley at the ACM Symposium on Principles of Distributed Computing (PODC).
Two years later, in 2002, Seth Gilbert and Nancy Lynch of the Massachusetts Institute of Technology proved the CAP conjecture with rigorous mathematical reasoning, and CAP officially went from conjecture to a recognized theorem of distributed computing. The theorem states that in a distributed system involving shared data, at most two of the following three properties can be satisfied at the same time:

  • Consistency: the data seen at any distributed node at any time is what is expected. Consistency has a rigorous definition in distributed-systems research and many subdivided variants; we will meet it again when discussing distributed consensus algorithms, where the consistency oriented toward replica replication is, strictly speaking, not quite the same as the database-state consistency discussed here. We will cover the precise differences in the later discussion of consensus algorithms.
  • Availability: the system's ability to provide uninterrupted service. Availability is intertwined with two related metrics: reliability (Reliability), measured by mean time between failures (MTBF), and serviceability (Serviceability), measured by mean time to repair (MTTR). Availability is the ratio of time the system can be used normally to total time, characterized as A = MTBF / (MTBF + MTTR); that is, availability is a ratio computed from reliability and serviceability. For example, 99.9999% availability corresponds to an average of about 32 seconds of failure-recovery time per year.
  • Partition tolerance (Partition Tolerance): the system's ability to keep providing correct service after some nodes in a distributed environment lose contact with one another due to network failures, i.e. when a "network partition" forms between them and the other nodes.

Merely listing concepts makes CAP rather abstract, so let me use the scenario example from the beginning of this chapter to illustrate what the three properties mean for a distributed system. Assume Fenix's Bookstore has the service topology shown below: a transaction request from an end user is answered by one node each from the account, merchant, and warehouse service clusters.

(Figure: service topology of Fenix's Bookstore, with account, merchant, and warehouse service clusters)

In this system, each individual service node has its own database (an assumption made for convenience of exposition; in a real production system, data such as user balances should generally not be stored in multiple writable databases). Suppose a transaction request is answered jointly by "account node 1", "merchant node 2", and "warehouse node N": when a user buys a product worth 100 yuan, account node 1 must first deduct 100 yuan from the user's account. Deducting it in its own database is easy, but the node must also inform node 2 through node N of its cluster about this change, and ensure that the related data in the merchant and warehouse clusters' account nodes is changed correctly as well. At this point the following situations may arise.

  • If the change is not synchronized to the other account nodes in time, then when the user buys again, the request may be routed to another node, where the incorrect balance allows an erroneous transaction that should never have happened. This is a consistency problem.
  • If the change must be synchronized to the other account nodes, the transaction service for this user must pause until synchronization completes; the user's next purchase may then be declined because the service is temporarily unavailable. This is an availability problem.
  • If, due to network problems, some nodes of the account cluster cannot exchange account-change information with the others, then the service provided by any subset of the cluster's nodes may be incorrect. Whether the whole cluster can still provide correct service while connections between some nodes are broken is partition tolerance.

The above concerns only the CAP problems of the account service cluster itself; the whole Fenix's Bookstore site faces CAP problems spanning the account, merchant, and warehouse clusters. For example, after the user's account is debited, the warehouse service's nodes are not all notified in time, so another transaction sees incorrect inventory data and oversells; or, because a transaction involving some product is in progress, that product's transaction service is temporarily locked to synchronize changes across users, merchants, and warehouses, causing an availability problem; and so on.

Since the CAP theorem has already been rigorously proved, this section will not revisit why C, A, and P cannot all be satisfied at once, but will directly analyze the different consequences of giving up each of them.

If partition tolerance is dropped (CA without P)

This means assuming that communication between nodes is always reliable. But permanently reliable communication simply cannot exist in a distributed system; this is not a question of whether you believe it, but of the fact that as long as the network is used to share data, partitions will always occur. In reality, the most straightforward example of giving up partition tolerance is the traditional relational database cluster. Such a cluster still uses multiple nodes connected by a network to work together, but the data itself is not shared over the network. Taking Oracle's RAC cluster as an example, each node has its own independent SGA, redo logs, rollback logs, and other components, but every node accesses the data through the same data files and control files on shared storage; it avoids network partitions by sharing disks rather than by exchanging data over the network. Therefore, although Oracle RAC is also a database composed of multiple instances, it cannot be called a distributed database.

If you give up availability (CP without A)

This means assuming that once a network partition occurs, the time for synchronizing information between nodes may be extended indefinitely. The problem then degenerates to the "one system using multiple data sources" scenario discussed earlier under "Global Transactions", where partition tolerance and consistency can be obtained at the same time through means such as 2PC/3PC. In reality, CP systems that give up availability are generally used where the demands on data quality are high. Besides the distributed database transactions of the DTP model, the well-known HBase is also a CP system: in an HBase cluster, if a RegionServer goes down, all the key-value ranges held by that RegionServer go offline until the data recovery process completes, and how long that takes cannot be estimated in advance.

If you give up consistency (AP without C)

This means assuming that once a partition occurs, the data provided by different nodes may be inconsistent. AP systems that give up consistency are currently the mainstream choice when designing distributed systems, because P is a natural property of a distributed network that you cannot discard even if you want to, while A is usually the very purpose of building a distributed system in the first place: if availability decreased as the number of nodes grew, many distributed systems would lose their reason to exist. Unless the business involves money, as in banking and securities, where an interrupted service is preferable to a wrong one, most systems cannot tolerate availability getting lower as more nodes are added. At present, most NoSQL stores and distributed cache frameworks are AP systems. Taking a Redis cluster as an example, if a network partition occurs on one Redis node, nothing prevents each node from continuing to serve cached data from its own local storage; however, requests routed to different nodes may then return inconsistent data to the client.

Reading this far, I wonder whether the conclusion that "AP systems that give up consistency are currently the mainstream choice for designing distributed systems" leaves you feeling a little helpless. The topic of this chapter, "transactions", exists precisely to obtain "consistency", and yet in a distributed environment consistency has to become the property that is usually sacrificed and given up. Nevertheless, when we build information systems we must still ensure that the result of an operation is correct at least at the moment of final delivery; that is, data is allowed to be wrong (inconsistent) in intermediate stages, but it should be corrected by the time it is output. For this reason, people redefined consistency: the consistency discussed above in CAP and ACID is called "strong consistency" (Strong Consistency), sometimes also "linearizability" (Linearizability, usually in discussions of consensus algorithms), while the behavior of AP systems that sacrifice C yet still try to obtain results as correct as possible is called pursuing "weak consistency". Strictly speaking, "weak consistency" on its own really just means "no consistency guarantee"... human language is a profound thing indeed. Within weak consistency, people have identified a slightly stronger special case called "eventual consistency" (Eventual Consistency), which means: if a piece of data is not changed by other operations for a period of time, it will eventually reach the same result that a strongly consistent process would have produced. Algorithms aimed at eventual consistency are sometimes also called "optimistic replication algorithms".

For "distributed transactions", the subject of this section, the goal likewise has to be lowered from the strong consistency pursued by the three transaction modes discussed earlier to the pursuit of "eventual consistency". Because the definition of consistency has changed, the meaning of the word "transaction" has been extended as well: transactions that rely on ACID are called "rigid transactions", while the several common approaches to distributed transactions that I will introduce below are collectively called "flexible transactions".

Reliable Event Queue

The concept of eventual consistency was proposed by Dan Pritchett, a system architect at eBay, in the paper "Base: An Acid Alternative" published with the ACM in 2008. The paper summarizes an approach that achieves consistency by means of BASE, independent of the strong consistency obtained through ACID.

BASE stands for:

  • BA: Basically Available
  • S: Soft State
  • E: Eventually Consistent

The name BASE gives full play to database scientists' questionable fondness for coining acronyms, but thanks to the catchy ACID vs. BASE (acid versus base) pun, the paper's influence did spread quickly enough. I will not dwell on the concepts within BASE here. Teasing about the name aside, the paper itself is the origin of the concept of eventual consistency and systematically summarizes a practical technique for distributed transactions, which makes it genuinely valuable.

Let us continue with this chapter's scenario example to explain the concrete approach of the "reliable event queue" proposed by Dan Pritchett. The goal is still to correctly modify the data in the account, warehouse, and merchant services during a transaction. The figure below shows the sequence diagram of the modification process.
(Figure: sequence diagram of the reliable event queue modification process)

  1. The end user sends a transaction request to Fenix's Bookstore: buy a copy of "In-depth Understanding of Java Virtual Machine" worth 100 yuan.
  2. Fenix's Bookstore first makes an a-priori assessment of the error probability of the three operations (user account deduction, warehouse shipment, and merchant account collection) and arranges their execution order according to that probability. This assessment is usually reflected directly in the program code, though some large systems also implement dynamic ordering. For example, statistics may show that the most likely transaction anomaly is that the user buys the product but does not agree to the deduction, or the account balance is insufficient; the second most likely is that the warehouse finds the product out of stock and cannot ship; the lowest risk is the merchant's collection, which rarely goes wrong. The order should therefore put the most error-prone operation first, that is: account deduction → warehouse shipment → merchant collection.
  3. The account service performs the deduction. If the deduction succeeds, it creates a message table in its own database and writes a message into it: "Transaction ID: some UUID; deduction: 100 yuan (status: completed); warehouse ships 'In-depth Understanding of Java Virtual Machine': 1 copy (status: in progress); merchant collection: 100 yuan (status: in progress)". Note that in this step, "performing the deduction" and "writing the message" are done in the same local transaction against the account service's own database (a code sketch of this step follows this list).
  4. A message service is created in the system, which periodically polls the message table and sends messages whose status is "in progress" to the warehouse and merchant service nodes simultaneously (they could also be sent serially, one after the other succeeds, but that is unnecessary in the scenario we are discussing). At this point, the following situations may occur.
  • 1) Both the merchant and warehouse services complete the collection and shipment successfully and return the result to the account server, which updates the message status from "in progress" to "completed". The whole transaction ends successfully, reaching a state of eventual consistency.
  • 2) At least one of the merchant or warehouse services fails to receive the message from the account service due to network problems. Since the message stored on the account server remains "in progress", the message service keeps resending it to the unresponsive services on every poll. This repetition requires all messages sent by the message service to be idempotent; the usual design is to have each message carry a unique transaction ID, ensuring that the shipment and collection actions of a transaction are processed only once.
  • 3) Some or all of the merchant or warehouse services cannot complete the work. For example, the warehouse finds that "In-depth Understanding of Java Virtual Machine" is out of stock. In this case the message keeps being resent automatically until the operation succeeds (for example, new stock is replenished) or until a human intervenes. As you can see, once the first step of the reliable event queue completes, there is no notion of failure rollback afterwards; only success is allowed, not failure.
  • 4) The merchant and warehouse services complete the collection and shipment successfully, but the reply message is lost due to network problems. The account service will still resend the message on the next poll, but since the operations are idempotent this will not cause duplicate shipment or collection; it will only cause the merchant and warehouse servers to resend a reply, and this repeats until network communication between the two sides returns to normal.
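
To make step 3 concrete, here is a minimal sketch of the account service's deduction, assuming plain JDBC and hypothetical tables `account(user_id, balance)` and `message(tx_id, content, status)`. The essential point is that the business change and the message record are committed in one local transaction:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.UUID;
import javax.sql.DataSource;

// A minimal sketch of step 3. Table names and schema are hypothetical.
// The deduction and the message record are committed in ONE local
// transaction, so either both exist or neither does.
public class AccountService {
    private final DataSource dataSource;

    public AccountService(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    public void deductAndRecordMessage(long userId, long amountYuan) throws Exception {
        String txId = UUID.randomUUID().toString();
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false); // begin the local transaction
            try (PreparedStatement deduct = conn.prepareStatement(
                         "UPDATE account SET balance = balance - ? " +
                         "WHERE user_id = ? AND balance >= ?");
                 PreparedStatement record = conn.prepareStatement(
                         "INSERT INTO message (tx_id, content, status) VALUES (?, ?, 'IN_PROGRESS')")) {
                deduct.setLong(1, amountYuan);
                deduct.setLong(2, userId);
                deduct.setLong(3, amountYuan);
                if (deduct.executeUpdate() != 1) {
                    throw new IllegalStateException("insufficient balance");
                }
                record.setString(1, txId);
                record.setString(2, "ship 1 book; merchant collects " + amountYuan + " yuan");
                record.executeUpdate();
                conn.commit(); // deduction and message become visible together
            } catch (Exception e) {
                conn.rollback(); // neither the deduction nor the message is kept
                throw e;
            }
        }
    }
}
```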

There are also message frameworks that support distributed transactions, such as RocketMQ, which supports transactional messages natively; with such a framework, situations 2 and 4 above can be guaranteed by the message framework itself.
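
Whichever way the retries are produced, situations 2 and 4 hinge on the receivers being idempotent. Below is a minimal sketch of the warehouse side, assuming hypothetical tables `processed_tx(tx_id UNIQUE)` and `inventory(product_id, stock)`: recording the transaction ID and applying the stock change in one local transaction ensures that a redelivered message takes effect at most once.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLIntegrityConstraintViolationException;
import javax.sql.DataSource;

// A minimal sketch of an idempotent receiver (warehouse side).
// `processed_tx` has a unique constraint on tx_id, so a redelivered
// message fails the INSERT and is recognized as already processed.
public class WarehouseService {
    private final DataSource dataSource;

    public WarehouseService(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    /** Returns true if shipped (now or previously), so the caller can always reply success. */
    public boolean ship(String txId, long productId, int quantity) throws Exception {
        try (Connection conn = dataSource.getConnection()) {
            conn.setAutoCommit(false);
            try (PreparedStatement mark = conn.prepareStatement(
                         "INSERT INTO processed_tx (tx_id) VALUES (?)");
                 PreparedStatement stock = conn.prepareStatement(
                         "UPDATE inventory SET stock = stock - ? " +
                         "WHERE product_id = ? AND stock >= ?")) {
                mark.setString(1, txId);
                mark.executeUpdate(); // a duplicate delivery fails here on the unique key
                stock.setInt(1, quantity);
                stock.setLong(2, productId);
                stock.setInt(3, quantity);
                if (stock.executeUpdate() != 1) {
                    conn.rollback();  // out of stock: leave the message "in progress"
                    return false;
                }
                conn.commit();
                return true;
            } catch (SQLIntegrityConstraintViolationException duplicate) {
                conn.rollback();      // already processed: just acknowledge again
                return true;
            }
        }
    }
}
```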

This solution, which relies on continuous retrying to ensure reliability, was neither first used nor invented by Dan Pritchett. It has long been applied in other areas of computing and has its own name: "best-effort delivery" (Best-Effort Delivery). In TCP, for example, the reliability guarantee of automatically retransmitting packets when no ACK is received is a form of best-effort delivery. There is also a more common form of the reliable event queue known as "Best-Effort 1PC": complete the business operation most likely to fail as a local transaction first, and then drive the remaining operations of the same distributed transaction to completion through continuous retries (not necessarily via a message system).
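
The "continuous retrying" above is exactly what the polling message service in step 4 does. Here is a minimal sketch of that loop, reusing the hypothetical `message` table from the earlier sketch, with a `Sender` callback standing in for the real calls to the warehouse and merchant services:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import javax.sql.DataSource;

// A minimal sketch of the retry loop behind best-effort delivery.
// Messages stay "IN_PROGRESS" and are re-sent on every poll until
// both receivers report success; resending is safe because the
// receivers are idempotent.
public class MessagePump {
    public interface Sender { boolean send(String txId, String content); }

    private final DataSource dataSource;
    private final Sender warehouse;
    private final Sender merchant;
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    public MessagePump(DataSource dataSource, Sender warehouse, Sender merchant) {
        this.dataSource = dataSource;
        this.warehouse = warehouse;
        this.merchant = merchant;
    }

    public void start() {
        timer.scheduleWithFixedDelay(this::pollOnce, 0, 2, TimeUnit.SECONDS);
    }

    private void pollOnce() {
        try (Connection conn = dataSource.getConnection();
             PreparedStatement query = conn.prepareStatement(
                     "SELECT tx_id, content FROM message WHERE status = 'IN_PROGRESS'");
             ResultSet rs = query.executeQuery()) {
            while (rs.next()) {
                String txId = rs.getString("tx_id");
                String content = rs.getString("content");
                boolean shipped = warehouse.send(txId, content);
                boolean collected = merchant.send(txId, content);
                if (shipped && collected) {
                    try (PreparedStatement done = conn.prepareStatement(
                            "UPDATE message SET status = 'COMPLETED' WHERE tx_id = ?")) {
                        done.setString(1, txId);
                        done.executeUpdate();
                    }
                }
            }
        } catch (Exception e) {
            // Swallow and retry on the next poll: a failure here only delays delivery.
        }
    }
}
```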
