An illustrated guide to the scenarios and solutions of distributed transactions

Local transactions

Local transaction process

Before introducing distributed transactions, let's take a look at local transactions, starting with a figure.
[Figure: a local transaction managed by a resource manager]
As the figure shows, a local transaction is managed locally by a resource manager (RM), such as a DBMS (database management system).

Pros and cons of local transactions

Local transactions have the following advantages and disadvantages.

Advantages:

  1. Strict ACID properties are supported.
  2. Transactions are reliable and efficient, because every operation is local.
  3. Transactions can be handled entirely within the RM (resource manager).
  4. The programming model is simple.

Disadvantages:

  1. They cannot handle distributed transactions.
  2. The smallest unit of data isolation (for example, a single record in the database) is determined by the RM (resource manager), not by the developer.

ACID properties

Speaking of transactions, we have to mention the ACID properties of transactions.

  • A (Atomicity): All operations that make up a transaction are either all executed or not executed at all. Partial success with partial failure is impossible.
  • C (Consistency): The consistency constraints of the database hold both before and after the transaction. For example, Zhang San transfers 100 yuan to Li Si; consistency means the data is in a correct state before and after the transfer. If Zhang San's account is debited 100 yuan but Li Si's account does not increase by 100 yuan, the data is wrong and consistency has been violated.
  • I (Isolation): Transactions in a database generally run concurrently. Isolation means that two concurrent transactions do not interfere with each other, and one transaction cannot see the intermediate state of another. Configuring the transaction isolation level avoids problems such as dirty reads and non-repeatable reads.
  • D (Durability): Once a transaction has committed, its data changes are persisted to the database and will not be rolled back.
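
The transfer example can be sketched in a few lines. This is only a toy, in-memory illustration of atomicity and consistency, not how a real DBMS implements them; the class name AtomicTransferSketch and its methods are made up for this example, and the snapshot merely plays the role an undo log plays in a database.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy in-memory store; a stand-in for a database, used only to illustrate atomicity. */
class AtomicTransferSketch {
    private final Map<String, Long> balances = new HashMap<>();

    void open(String account, long amount) {
        balances.put(account, amount);
    }

    long balanceOf(String account) {
        return balances.get(account);
    }

    /** Debits one account and credits another; either both changes apply or neither does. */
    void transfer(String from, String to, long amount) {
        // The snapshot holds the state before modification, like an undo log.
        Map<String, Long> undo = new HashMap<>(balances);
        try {
            long remaining = balances.get(from) - amount;
            if (remaining < 0) {
                throw new IllegalStateException("insufficient funds in " + from);
            }
            balances.put(from, remaining); // step 1: debit
            if (!balances.containsKey(to)) {
                throw new IllegalStateException("unknown account " + to);
            }
            balances.put(to, balances.get(to) + amount); // step 2: credit
        } catch (RuntimeException e) {
            // Roll back: restore the state recorded before the transfer began.
            balances.clear();
            balances.putAll(undo);
            throw e;
        }
    }

    public static void main(String[] args) {
        AtomicTransferSketch db = new AtomicTransferSketch();
        db.open("zhangsan", 500);
        db.open("lisi", 200);
        db.transfer("zhangsan", "lisi", 100);
        System.out.println(db.balanceOf("zhangsan") + " " + db.balanceOf("lisi"));
    }
}
```

If the credit step fails after the debit step has already run, the rollback restores the debit as well, so the total amount of money in the system is the same before and after.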

Distributed transactions

As a business grows, a website system often evolves from a monolithic architecture into a distributed, microservice architecture, and the database evolves from a single-machine architecture into a distributed one. At that point a large application is split into multiple services that can be deployed independently, and the services must collaborate remotely to complete a transaction.

The following figure shows the monolithic architecture the system starts with.
[Figure: monolithic architecture]
In this architecture, different modules are organized into different packages, but all the code still lives in one project.

Later, as the business grows, the system is expanded into a distributed, microservice architecture: one large project is split into small microservices that can be deployed independently, and each microservice has its own database, as shown below.
[Figure: microservice architecture with one database per service]
As another example, a program often executes code like the following inside a single transaction to fulfil a requirement.

@Transactional(rollbackFor = Exception.class)
public void submitOrder() {
    orderDao.update(); // update the order information
    accountService.update(); // change the balance of the funds account
    pointService.update(); // update the points
    accountingService.insert(); // insert the transaction record
    merchantNotifyService.notify(); // send the payment result notification
}

The code above only adds a @Transactional annotation to the submitOrder() method. Does that avoid distributed transaction problems in a distributed scenario? Obviously not.

If the order information, account information, points information, and transaction records touched by the code above are stored in different databases, and the system notified after payment completes also stores its data in yet another database, distributed transaction problems arise.
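
To make the problem concrete, here is a minimal simulation, assuming the account service is remote and commits its change in its own database as soon as it is called. All class and method names are hypothetical, and no real Spring or database API is involved; the local rollback simply stands in for what a local transaction rollback can undo.

```java
import java.util.ArrayList;
import java.util.List;

class LocalRollbackSketch {
    /** Stands in for a remote service that owns its own database and transaction. */
    static class RemoteAccountService {
        long balance = 100;

        void deduct(long amount) {
            balance -= amount; // committed on the remote side immediately
        }
    }

    /** Stands in for the local order database covered by the local transaction. */
    static class LocalOrderDb {
        final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        void write(String row) { pending.add(row); }
        void commit() { committed.addAll(pending); pending.clear(); }
        void rollback() { pending.clear(); }
    }

    /** Returns true if the order was committed locally. */
    static boolean submitOrder(LocalOrderDb db, RemoteAccountService account, boolean failLate) {
        try {
            db.write("order-1 PAID");
            account.deduct(30); // remote call: outside the local transaction
            if (failLate) {
                throw new IllegalStateException("a later step failed");
            }
            db.commit();
            return true;
        } catch (RuntimeException e) {
            db.rollback(); // undoes only the local write
            return false;
        }
    }

    public static void main(String[] args) {
        LocalOrderDb db = new LocalOrderDb();
        RemoteAccountService account = new RemoteAccountService();
        boolean ok = submitOrder(db, account, true);
        System.out.println(ok + " " + db.committed.size() + " " + account.balance);
    }
}
```

After the failed call, the order is gone locally but the remote balance has still been reduced. That mismatch is exactly what the solutions later in this article are meant to prevent.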

Scenarios that produce distributed transactions

Across JVM processes

When a monolithic project is split into distributed microservices, the services coordinate their business operations through remote REST or RPC calls. A typical case is the order microservice and the inventory microservice of a shopping system: a user placing an order calls the order microservice, which creates the order record and calls the inventory microservice to deduct stock. Each microservice is deployed in its own JVM process, so the operation spans JVM processes and becomes a distributed transaction.
[Figure: distributed transaction across JVM processes]

Across database instances

A single system accesses multiple database instances; that is, a distributed transaction arises when one operation spans data sources. For example, the order database and the transaction database of a system are placed in different database instances. When a user requests a refund, both databases are involved: the refund is executed in the transaction database, and the order status is changed to refunded in the order database. Because the data lives in different database instances, it has to be manipulated through different database connection sessions, and this produces a distributed transaction.
[Figure: distributed transaction across database instances]

Multiple services, single database

Multiple microservices access the same database. For example, an order microservice and an inventory microservice accessing the same database also produce a distributed transaction, because the services are in effect operating on the database through different database sessions.
[Figure: multiple services accessing a single database]

Note: In both the cross-database-instance scenario and the multi-service single-database scenario, the distributed transaction arises because the data is manipulated through different database sessions. These two scenarios are easy to overlook.

Distributed transaction solutions

Now that we know the scenarios that produce distributed transactions, let's look at specific solutions.

2PC scheme

2PC is the two-phase commit protocol, which divides the whole transaction into two phases: a prepare phase and a commit phase. The 2 stands for the two phases, P for the prepare phase, and C for the commit phase.

Here we use MySQL as an example. MySQL supports the two-phase commit protocol, and there are two cases to consider: success and failure.

Success:
[Figure: 2PC, success case]

Failure:
[Figure: 2PC, failure case]

The specific process is as follows:

  • Prepare phase: The transaction manager sends a Prepare message to each participant. Each database participant executes the transaction locally and writes its local undo/redo logs, but does not commit yet. (The undo log records the data before modification and is used for rollback; the redo log records the modified data and is used to write the data files after the transaction commits.)
  • Commit phase: If the transaction manager receives an execution-failure or timeout message from any participant, it sends a rollback message to every participant; otherwise it sends a commit message. Each participant commits or rolls back as instructed by the transaction manager and then releases the lock resources used during the transaction.

Note that with 2PC, lock resources are held throughout and are only released in this final phase.
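
The two phases can be sketched as a small coordinator loop. This is a simplified in-memory model under stated assumptions, not MySQL's implementation: the Participant interface and InMemoryParticipant class are hypothetical, an exception from prepare() stands in for a failure or timeout message, and real concerns such as logging and coordinator crashes are left out.

```java
import java.util.ArrayList;
import java.util.List;

class TwoPhaseCommitSketch {
    /** Hypothetical participant contract, for illustration only. */
    interface Participant {
        boolean prepare(); // execute locally, hold locks, vote yes (true) or no (false)
        void commit();
        void rollback();
    }

    /** In-memory participant whose vote is fixed up front. */
    static class InMemoryParticipant implements Participant {
        final boolean canPrepare;
        String state = "INIT";
        boolean locked = false;

        InMemoryParticipant(boolean canPrepare) { this.canPrepare = canPrepare; }

        public boolean prepare() {
            if (!canPrepare) {
                state = "FAILED";
                return false;
            }
            locked = true; // locks are taken here and kept until phase two
            state = "PREPARED";
            return true;
        }

        public void commit() { state = "COMMITTED"; locked = false; }

        public void rollback() { state = "ROLLED_BACK"; locked = false; }
    }

    /** Runs both phases; returns true if the transaction committed. */
    static boolean run(List<? extends Participant> participants) {
        boolean allPrepared = true;
        for (Participant p : participants) { // phase 1: prepare
            boolean ok;
            try {
                ok = p.prepare();
            } catch (RuntimeException e) {
                ok = false; // treated like a failure or timeout message
            }
            if (!ok) {
                allPrepared = false;
                break;
            }
        }
        for (Participant p : participants) { // phase 2: commit or roll back everyone
            if (allPrepared) p.commit(); else p.rollback();
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        List<InMemoryParticipant> ps = new ArrayList<>();
        ps.add(new InMemoryParticipant(true));
        ps.add(new InMemoryParticipant(false));
        System.out.println(run(ps) + " " + ps.get(0).state + " " + ps.get(1).state);
    }
}
```

Note how a participant that prepared successfully keeps its lock until the coordinator's decision arrives in phase two; this is the cost the note above refers to.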

Reliable message eventual consistency scheme

In the reliable message eventual consistency scheme, once the transaction initiator has completed its local transaction and sent a message, the transaction participant (the message consumer) must be able to receive the message and process its own transaction successfully. The scheme emphasizes that as long as the message reaches the transaction participant, the transaction as a whole will eventually be consistent.

[Figure: reliable message eventual consistency]
The transaction initiator (message producer) sends the message to the message middleware, and the transaction participant (message consumer) receives it from the message middleware. Both the initiator and the participant communicate with the middleware over the network, and the uncertainty of network communication causes distributed transaction problems. For this reason, a concrete design introduces a message confirmation service and a message recovery service.

There are several issues to be aware of when using the reliable message eventual consistency scheme:

  1. Atomicity of the local transaction and the sending of the message.
  2. Reliability of message reception by the transaction participant.
  3. Repeated consumption of messages (the consumer must be idempotent).
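
One common way to address these three points is a local message table (outbox) combined with resending and an idempotent consumer. The sketch below is an in-memory illustration under that assumption; it is not tied to any particular message middleware, and every name in it (Producer, FlakyChannel, Consumer, and so on) is hypothetical. The scripted faults in FlakyChannel simply stand in for an unreliable network.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReliableMessageSketch {
    static final class Message {
        final String id;
        final String payload;
        Message(String id, String payload) { this.id = id; this.payload = payload; }
    }

    /** Transaction participant: applies each message id at most once. */
    static class Consumer {
        private final Set<String> processedIds = new HashSet<>();
        int pointsGranted = 0;

        boolean handle(Message m) {
            if (processedIds.add(m.id)) { // first time this id is seen
                pointsGranted += 10;
            }
            return true; // acknowledge, including for duplicates
        }
    }

    /** Delivery path with scripted faults. */
    static class FlakyChannel {
        private final Consumer consumer;
        private int attempts = 0;
        FlakyChannel(Consumer consumer) { this.consumer = consumer; }

        /** Returns true only when the producer actually sees the acknowledgement. */
        boolean deliver(Message m) {
            attempts++;
            if (attempts == 1) return false; // message lost on the way
            boolean ack = consumer.handle(m);
            if (attempts == 2) return false; // processed, but the ack was lost
            return ack;
        }
    }

    /** Transaction initiator: keeps unconfirmed messages next to its business data. */
    static class Producer {
        final List<String> orders = new ArrayList<>();
        final Map<String, Message> outbox = new LinkedHashMap<>();

        void createOrder(String orderId) {
            // In a real system both writes would share one local database transaction.
            orders.add(orderId);
            outbox.put(orderId, new Message(orderId, "order-created"));
        }

        /** Recovery pass: resend everything unconfirmed; returns how many remain. */
        int resendUnconfirmed(FlakyChannel channel) {
            Iterator<Message> it = outbox.values().iterator();
            while (it.hasNext()) {
                if (channel.deliver(it.next())) {
                    it.remove(); // confirmed, no need to resend
                }
            }
            return outbox.size();
        }
    }
}
```

Because a lost acknowledgement makes the producer resend a message the consumer has already processed, delivery is at least once, and only the consumer's idempotence check keeps the points from being granted twice.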

TCC scheme

TCC is divided into three phases:

  1. Try: performs business checks (consistency) and reserves resources (isolation). This phase is only a preliminary operation; it forms the complete business logic only together with the subsequent Confirm.
  2. Confirm: confirms the commit. Confirm is executed after the Try phase of every branch transaction has succeeded. TCC normally assumes that the Confirm phase cannot fail: as long as Try succeeds, Confirm must succeed. If Confirm does go wrong, a retry mechanism or manual handling is needed.
  3. Cancel: when a business error requires a rollback, cancels the business operation of the branch transaction and releases the reserved resources. TCC normally assumes that the Cancel phase always succeeds. If Cancel does go wrong, a retry mechanism or manual handling is needed.

[Figure: TCC flow]
When using the TCC scheme, watch out for empty rollbacks, idempotence, and hanging (suspension).
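
The following sketch shows one way a single TCC participant might guard against these three issues by recording a status per transaction id. It is an in-memory illustration with hypothetical names (TccAccountSketch, tryFreeze, and so on), not the API of any TCC framework. Here an empty rollback means Cancel arrives although Try never ran, and hanging means a delayed Try arrives after Cancel has already been handled.

```java
import java.util.HashMap;
import java.util.Map;

class TccAccountSketch {
    enum Status { TRIED, CONFIRMED, CANCELLED }

    private long available;
    private long frozen = 0;
    private final Map<String, Status> branches = new HashMap<>(); // txId -> status
    private final Map<String, Long> amounts = new HashMap<>();    // txId -> reserved amount

    TccAccountSketch(long available) { this.available = available; }

    long available() { return available; }
    long frozen() { return frozen; }

    /** Try: check the business rule and reserve (freeze) the amount. */
    boolean tryFreeze(String txId, long amount) {
        Status s = branches.get(txId);
        if (s == Status.CANCELLED) return false; // anti-hanging: Cancel already handled
        if (s != null) return true;              // idempotent: repeated Try
        if (available < amount) return false;    // business check failed
        available -= amount;
        frozen += amount;
        branches.put(txId, Status.TRIED);
        amounts.put(txId, amount);
        return true;
    }

    /** Confirm: consume the reserved amount. */
    void confirm(String txId) {
        if (branches.get(txId) != Status.TRIED) return; // idempotent: already finished
        frozen -= amounts.get(txId);
        branches.put(txId, Status.CONFIRMED);
    }

    /** Cancel: release the reserved amount. */
    void cancel(String txId) {
        Status s = branches.get(txId);
        if (s == null) {
            // Empty rollback: nothing was reserved, but record the cancel
            // so that a late Try for this txId is rejected.
            branches.put(txId, Status.CANCELLED);
            return;
        }
        if (s != Status.TRIED) return; // idempotent: already finished
        long amount = amounts.get(txId);
        frozen -= amount;
        available += amount;
        branches.put(txId, Status.CANCELLED);
    }

    public static void main(String[] args) {
        TccAccountSketch account = new TccAccountSketch(100);
        account.tryFreeze("t1", 30);
        account.confirm("t1");
        System.out.println(account.available() + " " + account.frozen());
    }
}
```

In a real service the status record would live in the same database as the account and be updated in the same local transaction as the balance change.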

Best-effort notification

This scheme is mainly used to ensure eventual consistency of data between multiple different systems, as shown in the figure below.
[Figure: best-effort notification]
When using the best-effort notification scheme, pay attention to idempotence and to back-check (query) operations on the data.
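
A minimal sketch of the idea, assuming a payment system that notifies a merchant system: the notifier retries a bounded number of times and then gives up, and the receiver can call a query interface to check the result itself. All names are hypothetical, the failures are scripted, and the growing delay that real systems usually put between retries is omitted.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class BestEffortNotificationSketch {
    /** Receiving system, for example a merchant. */
    static class Receiver {
        private final Set<String> handled = new HashSet<>();
        int failuresLeft; // scripted: how many incoming notifications will fail
        int ordersMarkedPaid = 0;

        Receiver(int failuresLeft) { this.failuresLeft = failuresLeft; }

        /** Returns true to acknowledge the notification. */
        boolean onNotify(String orderId, String result) {
            if (failuresLeft > 0) {
                failuresLeft--;
                return false; // simulated network or processing failure
            }
            apply(orderId, result);
            return true;
        }

        /** Idempotent: the same result may arrive by notification and by back-check. */
        void apply(String orderId, String result) {
            if ("PAID".equals(result) && handled.add(orderId)) {
                ordersMarkedPaid++;
            }
        }
    }

    /** Notifying system, for example a payment service. */
    static class Notifier {
        private final Map<String, String> results = new HashMap<>();

        void record(String orderId, String result) { results.put(orderId, result); }

        /** Tries up to maxAttempts times, then gives up; true if acknowledged. */
        boolean notifyBestEffort(Receiver receiver, String orderId, int maxAttempts) {
            for (int i = 0; i < maxAttempts; i++) {
                if (receiver.onNotify(orderId, results.get(orderId))) {
                    return true;
                }
            }
            return false;
        }

        /** Query interface the receiver can call to check the result itself. */
        String query(String orderId) { return results.get(orderId); }
    }

    public static void main(String[] args) {
        Notifier notifier = new Notifier();
        Receiver receiver = new Receiver(5);
        notifier.record("o1", "PAID");
        boolean acked = notifier.notifyBestEffort(receiver, "o1", 3);
        if (!acked) {
            receiver.apply("o1", notifier.query("o1")); // back-check by the receiver
        }
        System.out.println(acked + " " + receiver.ordersMarkedPaid);
    }
}
```

Since a result can reach the receiver both through a late notification and through its own query, the receiver has to apply it idempotently.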

Conclusion:

Finally, I wish you all the best in your work!

Origin blog.csdn.net/m0_45270667/article/details/109162465