Distributed Transaction (Seata)

1. Transactions

Let's start with transactions, which everyone is familiar with. A transaction means that multiple SQL inserts, deletes, and updates against the database either all succeed or all fail together.
[Figure: the order, inventory, and account modules sharing one DB and one database connection]
As shown above, when we place an order, the order module, inventory module, and account module all use the same DB and the same database connection. Saving the order, reducing the inventory, and reducing the account balance on that one connection can therefore run in the same transaction, which guarantees consistency.

Note: The emphasis here is on the same database connection. If the modules above use the same DB but different connections (one for orders, one for inventory, and one for the account), their operations cannot belong to the same transaction.

Summary: a local transaction is bound to a single database connection.
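As a minimal sketch of that point, the plain JDBC code below runs all three writes on one `Connection`, so a single `commit()` or `rollback()` covers them all. The table and column names (`orders`, `inventory`, `account`) and the fixed amount are assumptions made for illustration, not taken from the original project.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

class LocalOrderTransaction {

    /** Runs all three writes on the one Connection passed in, so they commit or roll back together. */
    static void placeOrder(Connection conn, String orderId, String sku, String userId) throws SQLException {
        conn.setAutoCommit(false);               // start a local transaction on this connection
        try {
            // Hypothetical tables, used only to illustrate the three business steps.
            update(conn, "INSERT INTO orders (id) VALUES (?)", orderId);
            update(conn, "UPDATE inventory SET stock = stock - 1 WHERE sku = ?", sku);
            update(conn, "UPDATE account SET balance = balance - 10 WHERE user_id = ?", userId);
            conn.commit();                       // all three writes become durable together
        } catch (SQLException e) {
            conn.rollback();                     // any failure undoes all three writes
            throw e;
        }
    }

    private static void update(Connection conn, String sql, String arg) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, arg);
            ps.executeUpdate();
        }
    }
}
```

If each module opened its own `Connection` instead, each would need its own `commit()`, and a failure in one would not roll back the others.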

2. Distributed transactions

At the beginning, the three modules in the figure above may belong to the same project. But as business volume and the number of users grow, you are likely to restructure the system into something like the following:
[Figure: order, inventory, and account split into separate services, each with its own DB]
Orders, inventory, and account are each split into an independent service, and each has its own DB.

An e-commerce user may not notice any change and places orders as usual. For us on the back end, however, placing an order is now very different from before: each system holds a connection only to its own DB. If the order system wants to change the account balance, it can only send a request to the account system and let it perform the operation.

How do you still guarantee transactional consistency in this situation? Local transactions (LocalTransactional, i.e. database local transactions) can no longer solve this problem, so this is where distributed transactions come in.

3. Case

For example, there are two services, server1 and server2, and the methods in both services are annotated as transactional.

server1 saves a row of its own data to the database and then calls server2, which also saves a row of its own data. Right after the call to server2 returns, server1 throws an exception. How many rows end up in the database, and whose are they?

Server1:

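import java.util.UUID;

public class Server1 {

    @Transactional
    public void test() {
        // 1. Insert a row inside server1's local transaction.
        Server1Dao.insert(new Order(UUID.randomUUID()));
        // 2. Call server2 over HTTP; server2 commits its own local transaction before returning.
        HttpClient.get("http://localhost:8082/server2/test");
        // 3. Fail after server2 has returned (ArithmeticException), so only server1 rolls back.
        int i = 1 / 0;
    }
}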

Server2:

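import java.util.UUID;

public class Server2 {

    @Transactional
    public void test() {
        // Insert a row inside server2's own local transaction, which commits when this method returns.
        Server2Dao.insert(new Order(UUID.randomUUID()));
    }
}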

Answer: one row, and it belongs to server2.

Why, when both methods are transactional? As explained above, the two services do not share the same database connection, so each has its own local transaction. server2's transaction had already committed by the time server1 threw the exception, so only server1's insert was rolled back.
In a real system this means: my order system hit an error and rolled back, but my inventory system still reduced the stock and did not roll back. That is a very serious problem.

What a distributed transaction has to guarantee is that server2's data is not stored in the database either.
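The outcome of this case can be reproduced with a small in-memory model. This is only a sketch: `LocalTx` is a hypothetical stand-in for a database-local transaction (rows are staged and become visible on commit), and a plain method call stands in for the HTTP request to server2.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a local transaction: rows are staged, then committed or discarded. */
class LocalTx {
    private final List<String> table;
    private final List<String> pending = new ArrayList<>();

    LocalTx(List<String> table) { this.table = table; }

    void insert(String row) { pending.add(row); }
    void commit() { table.addAll(pending); pending.clear(); }
    void rollback() { pending.clear(); }
}

class TwoServicesDemo {

    static void server2Test(List<String> db2) {
        LocalTx tx = new LocalTx(db2);
        tx.insert("server2-row");
        tx.commit();                      // server2 finishes normally, so its local transaction commits
    }

    static void server1Test(List<String> db1, List<String> db2, boolean failAfterCall) {
        LocalTx tx = new LocalTx(db1);
        try {
            tx.insert("server1-row");
            server2Test(db2);             // stands in for the HTTP call to server2
            if (failAfterCall) {
                throw new IllegalStateException("server1 fails after calling server2");
            }
            tx.commit();
        } catch (RuntimeException e) {
            tx.rollback();                // discards only server1's own staged row
        }
    }
}
```

Calling `server1Test(db1, db2, true)` leaves `db1` empty while `db2` keeps its row, which is exactly the inconsistency described above.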

4. Distributed transaction framework: Seata

In a microservice architecture, distributed transactions have always been relatively hard to implement. Usually we do not require strong consistency, and eventual consistency of the data is achieved through a message queue.

Seata is a distributed transaction solution open-sourced by Alibaba. With Seata, global transactions can be managed across multiple microservices.

Seata was designed with two goals for solving the distributed transaction problem:

  • No intrusion into the business: reduce how much the distributed transaction problems introduced by a microservice architecture intrude into business code.
  • High performance: reduce the performance cost introduced by the distributed transaction solution.

Two of the distributed transaction modes Seata provides are AT and TCC:

  • AT mode: mainly concerned with data consistency across multiple DBs, including access to multiple DBs across multiple services.
  • TCC mode: mainly concerned with business splitting; it solves the consistency of calls between microservices when resources are scaled out horizontally along business lines.
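To make the TCC side of this comparison more concrete, here is a minimal plain-Java sketch of the Try/Confirm/Cancel pattern for an account debit. The class and method names are hypothetical and this is not Seata's TCC API; the point is only that the business code has to supply all three operations itself, whereas AT mode works from the SQL the business already runs.

```java
/** Hypothetical TCC-style account: Try reserves money, Confirm spends it, Cancel releases it. */
class AccountTccAction {
    private long balance;
    private long frozen;

    AccountTccAction(long balance) { this.balance = balance; }

    /** Try: check that enough money is available and reserve it without spending it yet. */
    boolean tryDebit(long amount) {
        if (balance - frozen < amount) {
            return false;
        }
        frozen += amount;
        return true;
    }

    /** Confirm: turn a previous reservation into a real debit. */
    void confirm(long amount) {
        frozen -= amount;
        balance -= amount;
    }

    /** Cancel: release a previous reservation, leaving the balance untouched. */
    void cancel(long amount) {
        frozen -= amount;
    }

    long available() { return balance - frozen; }
    long balance() { return balance; }
}
```

A coordinator would call `tryDebit` on every participant first, then `confirm` on all of them if every Try succeeded, or `cancel` on those that had reserved resources otherwise.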

Seata is divided into three roles:

TC: Transaction Coordinator, used to control the global transaction and its branch transactions

TM: Transaction Manager, used to begin, commit, or roll back the global transaction

RM: Resource Manager, used to register local resources as branches of the global transaction
[Figure: the TC coordinating the order, inventory, and account transactions]

As the figure shows:

The order, inventory, and account transactions all belong to one distributed (global) transaction. You can think of that global transaction as one big transaction in which order, inventory, and account are sub-transactions; the TC coordinates them and determines whether they commit or roll back.

But note: the TC only coordinates. The actual commit or rollback of each sub-transaction is still performed locally by the service that owns it (through its RM); the TC only tells the sub-transactions what to do.

5. Implementation Ideas of Distributed Transaction Framework

Think about it, if we want to write a distributed transaction framework, how should we implement it?

First, recall our previous process, which is the same for the order system and the inventory system:

  1. Establish a connection
  2. Open a transaction
  3. Execute the method
  4. Commit/roll back

This is obviously not enough. Each system commits as soon as its own method has executed, without knowing whether the other systems succeeded or failed.

So what do we need to do to solve this problem?

First: after the third step and before the fourth, we wait until someone tells us whether to commit or roll back.

Step 1: wait

The process becomes:

  1. Establish a connection
  2. Open a transaction
  3. Execute the method
    // wait...
  4. Commit/roll back
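The "wait" step can be sketched with a `CompletableFuture` that some outside party completes with the final decision. Everything here is an illustrative assumption: the class name, the 5-second timeout, and the choice to roll back when no decision arrives; connecting and opening the transaction are left out.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class WaitingParticipant {
    enum Decision { COMMIT, ROLLBACK }

    /** Completed by whoever knows the overall result of the distributed transaction. */
    final CompletableFuture<Decision> decision = new CompletableFuture<>();
    String outcome = "pending";

    void run(Runnable businessMethod) {
        businessMethod.run();                               // step 3: execute the method
        Decision d;
        try {
            d = decision.get(5, TimeUnit.SECONDS);          // wait for commit or rollback
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            d = Decision.ROLLBACK;
        } catch (ExecutionException | TimeoutException e) {
            d = Decision.ROLLBACK;                          // assumption: no decision means roll back
        }
        // step 4: a real participant would call commit() or rollback() on its connection here
        outcome = (d == Decision.COMMIT) ? "committed" : "rolled back";
    }
}
```

Holding the transaction open while waiting also holds its database locks, which is one reason a timeout is needed in any scheme of this kind.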

Second: in our example there are only two systems on the call chain, so the order system could tell the inventory system directly whether to succeed or fail together. But that would become very complicated with hundreds of services on the call chain.

So we introduce a separate component, which we will call the transaction manager.

The order and inventory systems register their transactions with the transaction manager.

Registration takes two parameters:
type: commit/rollBack. By the time a service is waiting, it already knows whether its own method should commit or roll back;
groupId: the group (that is, the distributed transaction) the sub-transaction belongs to.

Step 2: the transaction manager takes control of the transaction and of commit/rollBack

Third: the transaction manager can now decide, from the status of each sub-transaction, whether the whole distributed transaction should commit or roll back.

So in the end:

Step 3: the transaction manager tells each sub-transaction its final operation: commit or roll back
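The three steps can be put together in a small in-process sketch of such a transaction manager. It is a toy model under stated assumptions, not Seata's implementation: the class and method names are hypothetical, everything runs in one JVM instead of over the network, and the manager is told up front how many sub-transactions a group contains so that it knows when all of them have registered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class SimpleTransactionManager {
    enum Type { COMMIT, ROLLBACK }

    /** State kept for one groupId, i.e. one distributed transaction. */
    private static final class Group {
        final int expected;                                  // assumption: participant count known up front
        final List<CompletableFuture<Type>> waiters = new ArrayList<>();
        boolean anyRollback;

        Group(int expected) { this.expected = expected; }
    }

    private final Map<String, Group> groups = new ConcurrentHashMap<>();

    /** Opens a group and records how many sub-transactions will register in it. */
    void begin(String groupId, int expected) {
        groups.put(groupId, new Group(expected));
    }

    /** A sub-transaction reports its local result; the returned future later holds the final decision. */
    CompletableFuture<Type> register(String groupId, Type type) {
        Group group = groups.get(groupId);
        if (group == null) {
            throw new IllegalArgumentException("unknown groupId: " + groupId);
        }
        CompletableFuture<Type> waiter = new CompletableFuture<>();
        synchronized (group) {
            group.waiters.add(waiter);
            if (type == Type.ROLLBACK) {
                group.anyRollback = true;
            }
            if (group.waiters.size() == group.expected) {
                // Everyone has reported: one rollback vote rolls back the whole group.
                Type decision = group.anyRollback ? Type.ROLLBACK : Type.COMMIT;
                for (CompletableFuture<Type> w : group.waiters) {
                    w.complete(decision);
                }
                groups.remove(groupId);
            }
        }
        return waiter;
    }
}
```

Each service would block on the returned future before committing or rolling back its own local transaction, so a single rollback vote undoes every sub-transaction in the group.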

That is the overall idea, and the basic idea of Seata is the same.

6. Seata architecture: strengths and weaknesses

6.1 Highlights
Compared with other distributed transaction frameworks, the main highlights of the Seata architecture are:

  • The application layer implements automatic compensation based on SQL parsing, which minimizes business intrusion;
  • The TC (transaction coordinator) is deployed independently and is responsible for transaction registration and rollback;
  • Write isolation and read isolation are implemented through global locks.

There are other distributed transaction frameworks, such as TCC and LCN, but TCC frameworks are intrusive, while Seata (in AT mode) is not.

The implementation mechanisms behind these features are described in detail on the official website and on GitHub, so I will not cover them here.

6.2 Performance loss
Let's look at the overhead Seata adds (in-memory operations are ignored).

For one UPDATE statement, Seata needs to:

  • obtain the global transaction xid (communication with the TC);
  • build the before image (parse the SQL, query the database once);
  • build the after image (query the database once);
  • insert the undo log (write the database once);
  • check before commit (communication with the TC to detect lock conflicts).

Each of these steps needs a remote round trip, and they run synchronously. In addition, the undo log is written to a blob field, whose insert performance is not high. With this much overhead added to every write SQL, a rough estimate is that response time increases about five times (the second phase is asynchronous, but it still consumes system resources: network, threads, database).

How are the before and after images generated? Seata parses the SQL with Druid, reuses the where condition from the business SQL, and generates a SELECT statement to execute.
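The idea of reusing the where condition can be illustrated with a deliberately naive sketch. It uses a regular expression rather than a real SQL parser such as Druid, handles only the simple form `UPDATE <table> SET ... WHERE ...`, and the class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageQueryBuilder {
    // Matches only the simple form: UPDATE <table> SET <assignments> WHERE <condition>
    private static final Pattern UPDATE = Pattern.compile(
            "^\\s*UPDATE\\s+(\\w+)\\s+SET\\s+(.+?)\\s+WHERE\\s+(.+?)\\s*;?\\s*$",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Builds the SELECT that reads the rows the given UPDATE would touch. */
    static String imageQuery(String updateSql) {
        Matcher m = UPDATE.matcher(updateSql);
        if (!m.matches()) {
            throw new IllegalArgumentException("unsupported SQL: " + updateSql);
        }
        String table = m.group(1);
        String where = m.group(3);          // the business SQL's own where condition, reused as-is
        return "SELECT * FROM " + table + " WHERE " + where;
    }
}
```

For example, `imageQuery("UPDATE account SET balance = balance - 10 WHERE user_id = 1")` returns `SELECT * FROM account WHERE user_id = 1`. Running such a query before the update gives the rows to store as the before image in the undo log.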

Origin blog.csdn.net/RookiexiaoMu_a/article/details/105342614