Understanding distributed transactions in depth: solutions for distributed transactions under high concurrency

1. What is a distributed transaction

A distributed transaction is one in which the transaction participants, the servers supporting the transaction, the resource servers, and the transaction manager are located on different nodes of a distributed system. That is the Baidu Encyclopedia definition. In short, one large operation consists of several small operations that run on different servers and belong to different applications, and a distributed transaction must ensure that these small operations either all succeed or all fail. In essence, distributed transactions exist to keep data consistent across different databases.

2. Reasons for distributed transactions

2.1. Database sharding (splitting databases and tables)

When a single table grows by more than 10 million rows in a year, it is time to consider splitting the database and its tables. The details of how to shard are not covered here; I will discuss them separately later. Simply put, the original single database becomes several databases. If one operation then touches both database 01 and database 02 and the data must stay consistent, a distributed transaction is needed.

2.2. Service-oriented architecture (SOA)

SOA means organizing the business as services. For example, a single machine used to support an entire e-commerce website, but now the website is split apart and the order center, user center, and inventory center are separated. The order center has its own database for order information, the user center has its own database for user information, and the inventory center has its own database for inventory information. If you now want to operate on orders and inventory at the same time, both the order database and the inventory database are involved, and you need a distributed transaction to keep the data consistent.

The two situations look different, but the essence is the same: there are now more databases to operate on.

3. ACID characteristics of transactions

3.1. Atomicity (A)

Atomicity means that the operations in a transaction either all complete or none of them do; there is no intermediate state. If an error occurs while the transaction is executing, all of its operations are rolled back, and the result is as if the transaction had never run.

3.2. Consistency (C)

A transaction must preserve the consistency of the system. Take a transfer as an example: A has 500 yuan and B has 300 yuan. If A transfers 50 yuan to B in one transaction, then no matter how high the concurrency and no matter what else happens, as long as the transaction succeeds, account A must end with 450 yuan and account B with 350 yuan.
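The transfer above can be sketched as a single local transaction. This is only an illustration using Python's built-in sqlite3 module; the `account` table, the `transfer` helper, and the overdraft rule are assumptions made for the example and are not part of the original scenario.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside one local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:  # assumed business rule: no overdrafts
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 500), ("B", 300)])
conn.commit()

ok = transfer(conn, "A", "B", 50)         # succeeds: A=450, B=350
after_ok = balances(conn)
failed = transfer(conn, "A", "B", 1000)   # fails halfway and is rolled back
after_fail = balances(conn)
```

The failed transfer has already debited A when the check fires, yet the rollback leaves both balances untouched, which is the atomicity and consistency described above. All of this works only because both accounts live in the same database.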

3.3. Isolation (I)

Isolation means that transactions do not affect each other: the intermediate state of one transaction is not visible to other transactions.

3.4. Durability (D)

Durability means that once a transaction completes, the changes it made are fully saved in the database and survive even a power failure or a system crash.

4. Application scenarios of distributed transactions

4.1. Payment

The most classic scenario is payment. A payment debits the buyer's account and credits the seller's account at the same time. Both operations must be performed in one transaction: either both succeed or both fail. The buyer's account belongs to the buyer center, which has its own buyer database, and the seller's account belongs to the seller center, which has its own seller database. Operating on different databases requires a distributed transaction.

4.2. Placing an order online

When a buyer places an order on an e-commerce platform, two actions are usually involved: deducting inventory and updating the order status. Inventory and orders generally live in different databases, so a distributed transaction is required to keep the data consistent.
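A small sketch of what goes wrong without a distributed transaction: two in-memory SQLite databases stand in for the inventory and order databases, and each is updated in its own local transaction. The table names, the `place_order_naively` helper, and the simulated failure are all assumptions made for illustration.

```python
import sqlite3

inventory_db = sqlite3.connect(":memory:")  # stands in for the inventory database
order_db = sqlite3.connect(":memory:")      # stands in for the order database

inventory_db.execute("CREATE TABLE stock (sku TEXT PRIMARY KEY, qty INTEGER)")
inventory_db.execute("INSERT INTO stock VALUES ('sku-1', 10)")
inventory_db.commit()
order_db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
order_db.execute("INSERT INTO orders VALUES (1, 'CREATED')")
order_db.commit()

def place_order_naively(order_id, sku, fail_on_order_update=False):
    # Local transaction 1: deduct inventory and commit.
    with inventory_db:
        inventory_db.execute(
            "UPDATE stock SET qty = qty - 1 WHERE sku = ?", (sku,)
        )
    # Local transaction 2: update the order status.
    # If this fails, transaction 1 has already been committed.
    with order_db:
        if fail_on_order_update:
            raise RuntimeError("order database unavailable")  # simulated failure
        order_db.execute(
            "UPDATE orders SET status = 'CONFIRMED' WHERE id = ?", (order_id,)
        )

try:
    place_order_naively(1, "sku-1", fail_on_order_update=True)
except RuntimeError:
    pass

qty = inventory_db.execute("SELECT qty FROM stock").fetchone()[0]
status = order_db.execute("SELECT status FROM orders").fetchone()[0]
# The stock was deducted but the order was never updated.
```

Each local transaction is atomic on its own, but nothing ties the two together, so a failure between them leaves the databases disagreeing.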

5. Common distributed transaction solutions

5.1. Two-phase commit based on XA protocol

XA is a distributed transaction protocol that originated with Tuxedo and was standardized by X/Open. It has two main roles: the transaction manager and the local resource manager. The local resource manager is usually implemented by the database itself; commercial databases such as Oracle and DB2 implement the XA interface. The transaction manager acts as the global coordinator and is responsible for committing or rolling back each local resource. XA implements distributed transactions with two-phase commit: the transaction manager first asks every resource manager to prepare, then commits them all if every one of them is ready, and otherwise rolls them all back.
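The two-phase flow that XA relies on can be sketched roughly as below. This is an in-memory illustration, not the XA interface itself: `ResourceManager`, `two_phase_commit`, and the `can_prepare` flag are hypothetical names, and real resource managers also have to persist their prepared state and handle coordinator failure.

```python
class ResourceManager:
    """In-memory stand-in for a database exposing prepare/commit/rollback."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare  # simulates whether phase 1 succeeds
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote yes only if the work could be committed later.
        self.state = "prepared" if self.can_prepare else "aborted"
        return self.can_prepare

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "rolled_back"


def two_phase_commit(resource_managers):
    """Act as the transaction manager: True if all committed, else False."""
    # Phase 1: collect a vote from every participant.
    votes = [rm.prepare() for rm in resource_managers]
    # Phase 2: commit only on a unanimous yes, otherwise roll everyone back.
    if all(votes):
        for rm in resource_managers:
            rm.commit()
        return True
    for rm in resource_managers:
        rm.rollback()
    return False


orders, inventory = ResourceManager("orders"), ResourceManager("inventory")
committed = two_phase_commit([orders, inventory])

orders2 = ResourceManager("orders")
inventory2 = ResourceManager("inventory", can_prepare=False)
aborted = two_phase_commit([orders2, inventory2])
```

Every participant holds its resources from prepare until the coordinator's decision arrives, which is one reason the approach struggles under high concurrency.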

Overall, the XA protocol is fairly simple, and once a commercial database implements it, the cost of using distributed transactions is relatively low. However, XA has a fatal weakness: its performance is poor. The order-placement path in particular often sees high concurrency, and XA cannot cope with such scenarios. XA is currently well supported in commercial databases but not in MySQL: MySQL's XA implementation does not write a log record in the prepare phase, so a switch between the primary and standby databases can leave the two inconsistent. Many NoSQL stores do not support XA either, which makes its range of application very narrow.

5.2. Message transaction + eventual consistency

A message transaction is a two-phase commit built on message middleware; in essence it is a special use of the middleware. It places the local transaction and the message send in one distributed transaction, so that either the local operation succeeds and the message is sent, or both fail. The open source RocketMQ supports this feature. It works as follows:

1. System A sends a prepared message to the message middleware
2. The message middleware saves the prepared message and returns success
3. A executes the local transaction
4. A sends a commit message to the message middleware

These four steps complete one message transaction. Any of them can fail, so each case is analyzed below:

  • If step 1 fails, the whole transaction fails and A's local operation is never executed.
  • If step 2 fails, the whole transaction fails and A's local operation is never executed.
  • If step 3 fails, the prepared message must be rolled back. How? System A implements a callback interface for the message middleware, and the middleware keeps calling it to check whether A's local transaction succeeded. If it failed, the middleware rolls back the prepared message.
  • If step 4 fails, A's local transaction has already succeeded. Does the message middleware need to roll A back? No. Through the callback interface, the middleware can find out that A succeeded, so A does not need to send the commit message again; the middleware commits the message itself, which completes the message transaction.
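The four steps and the callback check can be sketched with a toy broker. This is not the RocketMQ API: `Broker`, `send_in_transaction`, `check_pending`, and the `lose_commit` flag are hypothetical names chosen for the example, and the check is simplified to a yes/no answer, whereas a real check would also need to handle a local transaction that is still in progress.

```python
class Broker:
    """Toy message middleware that holds prepared messages until resolved."""

    def __init__(self):
        self.prepared = {}     # msg_id -> (payload, check callback)
        self.deliverable = []  # committed messages, visible to consumers

    def send_prepared(self, msg_id, payload, check):
        self.prepared[msg_id] = (payload, check)  # steps 1 and 2

    def commit(self, msg_id):
        payload, _ = self.prepared.pop(msg_id)
        self.deliverable.append(payload)

    def rollback(self, msg_id):
        self.prepared.pop(msg_id, None)

    def check_pending(self):
        """Call back the producer for every prepared message left unresolved."""
        for msg_id, (_, check) in list(self.prepared.items()):
            if check(msg_id):
                self.commit(msg_id)    # step 4 was lost, but A succeeded
            else:
                self.rollback(msg_id)  # step 3 failed


def send_in_transaction(broker, msg_id, payload, local_tx, check,
                        lose_commit=False):
    broker.send_prepared(msg_id, payload, check)  # an error here aborts early
    try:
        local_tx()                                # step 3
    except Exception:
        return False  # the prepared message is rolled back by check_pending
    if not lose_commit:                           # simulate a lost step 4
        broker.commit(msg_id)
    return True


done = set()  # records which local transactions were committed by system A

def local_ok():
    done.add("m1")

def local_fail():
    raise RuntimeError("local transaction failed")

def check(msg_id):
    return msg_id in done

broker = Broker()
send_in_transaction(broker, "m1", "order-1 paid", local_ok, check,
                    lose_commit=True)
send_in_transaction(broker, "m2", "order-2 paid", local_fail, check)
broker.check_pending()
```

After the check runs, the message whose commit was lost is delivered anyway, and the message whose local transaction failed is discarded.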

Two-phase commit based on message middleware is often used in high-concurrency scenarios. It splits a distributed transaction into a message transaction (system A's local operation + sending the message) plus system B's local operation, where B's operation is driven by the message. As long as the message transaction succeeds, A's operation has succeeded and the message has been sent. B then receives the message and performs its local operation; if that fails, the message is redelivered until B succeeds. In this indirect way, a distributed transaction between A and B is achieved.

Although this scheme completes the operations of both A and B, A and B are not strictly consistent, only eventually consistent. We sacrifice consistency here in exchange for a substantial gain in performance. Of course, this approach also carries risk: if B never succeeds, consistency is broken. Whether to use it depends on how much risk the business can accept.
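The consumer side on system B can be sketched as a redelivery loop. The names here (`deliver_until_success`, `processed_ids`, `max_attempts`) are hypothetical, and the duplicate check is an added assumption: because messages are redelivered, B's handler usually needs to be idempotent, which the scheme above implies but does not spell out.

```python
def deliver_until_success(message, handler, processed_ids, max_attempts=5):
    """Redeliver `message` until `handler` succeeds.

    Returns the number of attempts used, 0 for a duplicate delivery, or
    None if every attempt failed.
    """
    if message["id"] in processed_ids:
        return 0  # already handled once: skip so the effect is not repeated
    for attempt in range(1, max_attempts + 1):
        try:
            handler(message)
        except Exception:
            continue  # B failed; the middleware delivers the message again
        processed_ids.add(message["id"])
        return attempt
    return None  # B never succeeded: consistency is broken


stock = {"sku-1": 10}
failures_left = [2]  # make the handler fail twice before succeeding

def deduct_stock(message):
    if failures_left[0] > 0:
        failures_left[0] -= 1
        raise RuntimeError("inventory service temporarily unavailable")
    stock[message["sku"]] -= 1

processed = set()
msg = {"id": "m1", "sku": "sku-1"}
attempts = deliver_until_success(msg, deduct_stock, processed)
duplicate = deliver_until_success(msg, deduct_stock, processed)
```

The `None` branch is the risk described above: when B keeps failing, the system needs an alert or manual compensation, because A's change has already been committed.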

5.3. TCC programming model

TCC is also a variant of two-phase commit. It provides a programming framework that divides the business logic into three operations: Try, Confirm, and Cancel. Taking online ordering as an example, the Try phase deducts the inventory and the Confirm phase updates the order status; if updating the order fails, the Cancel phase runs and restores the inventory. In short, TCC implements two-phase commit by hand in application code. The code differs from one business scenario to another, and so does its complexity, so this pattern is hard to reuse.
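The ordering example can be sketched as three hand-written operations. `OrderTcc`, `place_order`, and the `fail` flag are hypothetical names for illustration; plain dictionaries stand in for the inventory and order databases, and a real TCC framework would also record the transaction state so that Confirm or Cancel can be retried after a crash.

```python
class OrderTcc:
    """Try/Confirm/Cancel for placing an order, written by hand."""

    def __init__(self, stock, orders):
        self.stock = stock    # sku -> quantity (inventory side)
        self.orders = orders  # order id -> status (order side)

    def try_(self, sku, qty):
        # Try: deduct the inventory up front.
        if self.stock[sku] < qty:
            raise ValueError("insufficient stock")
        self.stock[sku] -= qty

    def confirm(self, order_id, fail=False):
        # Confirm: update the order status (`fail` simulates an error).
        if fail:
            raise RuntimeError("order update failed")
        self.orders[order_id] = "CONFIRMED"

    def cancel(self, sku, qty):
        # Cancel: restore the inventory deducted by Try.
        self.stock[sku] += qty


def place_order(tcc, order_id, sku, qty, fail_confirm=False):
    tcc.try_(sku, qty)
    try:
        tcc.confirm(order_id, fail=fail_confirm)
        return True
    except RuntimeError:
        tcc.cancel(sku, qty)
        return False


stock = {"sku-1": 10}
orders = {1: "CREATED", 2: "CREATED"}
tcc = OrderTcc(stock, orders)

first = place_order(tcc, 1, "sku-1", 2)                      # confirm succeeds
second = place_order(tcc, 2, "sku-1", 3, fail_confirm=True)  # cancel restores stock
```

Every business flow needs its own Try, Confirm, and Cancel logic like this, which is why the pattern is hard to reuse.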

6. Summary

A distributed transaction is essentially unified control over the transactions of multiple databases. By degree of control, the approaches fall into three groups: no control, partial control, and full control. No control means not introducing distributed transactions at all. Partial control means the various variants of two-phase commit, including the message transaction + eventual consistency and TCC approaches described above. Full control means implementing two-phase commit completely. Partial control offers very good concurrency and performance at the cost of weaker data consistency, while full control sacrifices performance to guarantee consistency. Which one to use ultimately depends on the business scenario. As a technician, never forget that technology serves the business; it is not technology for technology's sake. Choosing the right technology for each business is also a very important skill.

Link to this article: http://www.codeceo.com/article/distributed-transaction.html

The author of this article: Code Agricultural Network  – Wu Jixin
