Distributed transactions--Theoretical basis of distributed transactions

1. Local transactions

A local transaction is a traditional standalone (single-database) transaction. Traditional database transactions must satisfy four principles, known as ACID: atomicity, consistency, isolation, and durability.
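As a minimal sketch of what atomicity means for a local transaction, the following Python example uses the standard sqlite3 module with an in-memory database. The account table and the transfer function are hypothetical names chosen for illustration; they are not part of the demo project.

```python
import sqlite3


def transfer(conn, from_id, to_id, amount):
    """Move money between two accounts inside one local transaction."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if the block raises an exception.
        with conn:
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, from_id),
            )
            (left,) = conn.execute(
                "SELECT balance FROM account WHERE id = ?", (from_id,)
            ).fetchone()
            if left < 0:
                raise ValueError("insufficient balance")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, to_id),
            )
        return True
    except ValueError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 0)])
conn.commit()

# Account 1 holds only 100, so the whole transfer is undone.
ok = transfer(conn, 1, 2, 150)
balances = conn.execute("SELECT balance FROM account ORDER BY id").fetchall()
```

Because both updates run in the same database transaction, the failed transfer leaves neither account changed. This is the guarantee that is lost once the two updates live in different databases.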

2. Distributed transactions

Distributed transactions are transactions that do not occur within a single service or a single database, for example:

  • Distributed transactions across data sources

  • Distributed transactions across services

  • A combination of both

After databases are split horizontally and services are split vertically, a single business operation usually has to span multiple databases and services. For example, the common order payment case in e-commerce includes the following steps:

  • Create new order

  • Deduct product inventory

  • Deduct the amount from the user's account balance

Completing the above operation requires access to three different microservices and three different databases.

Creating the order, deducting inventory, and deducting the balance are each a local transaction within their own service and database, so each one can guarantee ACID on its own.

But when we treat the three operations as a single "business", we must guarantee the atomicity of that business: either all operations succeed or all fail. Partial success and partial failure are not allowed. This is a transaction in a distributed system.

In this situation ACID is difficult to satisfy, and this is the problem that distributed transactions must solve.

2.1. Demonstrating the distributed transaction problem

We use a case to demonstrate the problem of distributed transactions:

1) Create a database named seata_demo, and then import the SQL file:

Link: https://pan.baidu.com/s/10bd7RX9lXqIDlMODMOyV4Q?pwd=3l4o 
Extraction code: 3l4o 

2) Import the microservices:

Link: https://pan.baidu.com/s/1tB_c5KbskqGfFLP-phC0dQ?pwd=tgvp 
Extraction code: tgvp 

The microservice structure is as follows:

Where:

seata-demo: parent project, responsible for managing project dependencies

  • account-service: the account service, which manages users' fund accounts and provides an interface for deducting balances

  • storage-service: the inventory service, which manages product inventory and provides an interface for deducting inventory

  • order-service: the order service, which manages orders. Creating an order calls account-service and storage-service

3) Start Nacos and all microservices

4) Test the order function by issuing a POST request.

The test shows that when the inventory is insufficient, a balance that has already been deducted is not rolled back: a distributed transaction problem. (The branch transactions are not aware of one another and each commits its own local transaction, so the deduction cannot be rolled back.)
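The failure can be reproduced in miniature without the demo project. The sketch below is only an illustration: three in-memory SQLite databases stand in for the three services' databases, and the table names, column names, and sample values are assumptions, not the schema from the SQL file above.

```python
import sqlite3


def make_db(table, column, key, value):
    """Create an isolated in-memory database standing in for one service's database."""
    db = sqlite3.connect(":memory:")
    db.execute(f"CREATE TABLE {table} (id TEXT PRIMARY KEY, {column} INTEGER)")
    db.execute(f"INSERT INTO {table} VALUES (?, ?)", (key, value))
    db.commit()
    return db


account_db = make_db("account", "balance", "user1", 1000)
storage_db = make_db("storage", "stock", "item1", 2)
order_db = sqlite3.connect(":memory:")
order_db.execute("CREATE TABLE orders (user_id TEXT, item_id TEXT, count INTEGER, money INTEGER)")
order_db.commit()


def deduct(db, table, column, key, amount):
    """One branch: deduct inside its own local transaction and commit immediately."""
    with db:
        db.execute(f"UPDATE {table} SET {column} = {column} - ? WHERE id = ?", (amount, key))
        (left,) = db.execute(f"SELECT {column} FROM {table} WHERE id = ?", (key,)).fetchone()
        if left < 0:
            raise ValueError(f"insufficient {column}")


def create_order(user_id, item_id, count, money):
    try:
        with order_db:
            order_db.execute(
                "INSERT INTO orders VALUES (?, ?, ?, ?)", (user_id, item_id, count, money)
            )
            # Commits in account_db before the stock is checked.
            deduct(account_db, "account", "balance", user_id, money)
            # Fails here: only storage_db and order_db roll back.
            deduct(storage_db, "storage", "stock", item_id, count)
        return True
    except ValueError:
        return False


ok = create_order("user1", "item1", count=10, money=200)
balance = account_db.execute("SELECT balance FROM account").fetchone()[0]
stock = storage_db.execute("SELECT stock FROM storage").fetchone()[0]
orders = order_db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

After the failed call, no order exists and the stock is untouched, yet the balance has dropped from 1000 to 800, because the account branch had already committed by the time the storage branch failed.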

3. Theoretical basis

Solving distributed transaction problems requires some basic knowledge of distributed systems as theoretical guidance.

3.1. CAP theorem

In 1998, Eric Brewer, a computer scientist at the University of California, Berkeley, proposed that distributed systems have three properties:

  • Consistency

  • Availability

  • Partition tolerance

Their initials are C, A, and P.

Brewer stated that a distributed system cannot satisfy all three properties at the same time (at most two). This conclusion is known as the CAP theorem.

3.1.1. Consistency

Consistency: when a user accesses any node in the distributed system, the data obtained must be the same.

For example, suppose the system contains two nodes, node01 and node02, whose initial data is identical.

When we modify the data on node01, the data on the two nodes differs.

To maintain consistency, the change must be synchronized from node01 to node02.

3.1.2. Availability

Availability: a user accessing any healthy node in the cluster must get a response, not a timeout or a rejection.

For example, in a cluster of three nodes, a request to any one of them gets a timely response.

When a node cannot be accessed because of a network failure or some other reason, that node is unavailable.

3.1.3. Partition tolerance

Partition: because of a network failure or some other reason, some nodes in the distributed system lose their connection to the other nodes and form an independent partition.

Tolerance: when a partition occurs in the cluster, the system as a whole must continue to provide service to the outside world.

3.1.4. The contradiction

In a distributed system, the network between nodes can never be guaranteed to be 100% healthy. Failures will happen, and the system must still serve the outside world, so partition tolerance is unavoidable.

The problem arises when a node receives a data change while the cluster is partitioned:

If you want to guarantee consistency, you must wait for the network to recover and for data synchronization to complete before the cluster can serve requests again. Until then the service is blocked and unavailable.

If you want to guarantee availability, you cannot wait for the network to recover, so the data on node01, node02, and node03 will be inconsistent.

In other words, given that P will certainly occur, only one of A and C can be achieved.
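The trade-off can be illustrated with a toy model. The Cluster class below is a hypothetical sketch, not a real replication protocol: in "CP" mode a write is refused unless it can reach every node, while in "AP" mode it is applied to whichever nodes are reachable.

```python
class Cluster:
    """Toy cluster of named nodes; `mode` is "CP" or "AP"."""

    def __init__(self, names, mode):
        self.nodes = {name: {} for name in names}  # node name -> its local data
        self.mode = mode
        self.cut_off = set()  # nodes isolated by a network partition

    def partition(self, name):
        self.cut_off.add(name)

    def _reachable_from(self, via):
        # An isolated node reaches only itself; the rest reach each other.
        if via in self.cut_off:
            return [via]
        return [n for n in self.nodes if n not in self.cut_off]

    def write(self, via, key, value):
        targets = self._reachable_from(via)
        if self.mode == "CP" and len(targets) < len(self.nodes):
            # Consistency first: refuse the write rather than let nodes diverge.
            raise RuntimeError("unavailable: cannot replicate to every node")
        for name in targets:
            self.nodes[name][key] = value

    def read(self, via, key):
        return self.nodes[via].get(key)


names = ["node01", "node02", "node03"]

cp = Cluster(names, "CP")
cp.write("node01", "x", 1)
cp.partition("node03")
try:
    cp.write("node01", "x", 2)
    cp_accepted = True
except RuntimeError:
    cp_accepted = False
cp_values = {n: cp.read(n, "x") for n in names}

ap = Cluster(names, "AP")
ap.write("node01", "x", 1)
ap.partition("node03")
ap.write("node01", "x", 2)
ap_values = {n: ap.read(n, "x") for n in names}
```

In the CP cluster the second write is rejected and every node still returns 1. In the AP cluster the write succeeds, but node03 keeps returning the old value until the partition heals.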

3.2. BASE theory (reconciling the contradiction)

BASE theory is a way to work around the CAP contradiction and contains three ideas:

  • Basically Available: when a failure occurs in a distributed system, some availability may be lost as long as core availability is guaranteed.

  • Soft State: Within a certain period of time, intermediate states, such as temporary inconsistent states, are allowed.

  • Eventually Consistent: Although strong consistency cannot be guaranteed, data consistency will eventually be achieved after the soft state ends.
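Soft state and eventual consistency can be sketched with a primary copy and a lagging replica. The class and method names here are hypothetical, and the explicit sync call stands in for whatever background replication a real system would run.

```python
from collections import deque


class EventuallyConsistentStore:
    """A primary copy plus one replica that is updated asynchronously."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # changes accepted but not yet replicated

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def sync(self):
        # Drain the backlog; afterwards the replica matches the primary.
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value


store = EventuallyConsistentStore()
store.write("stock", 9)
stale = store.replica.get("stock")   # soft state: the replica has not caught up
store.sync()
fresh = store.replica.get("stock")   # consistent once the soft state ends
```

Between write and sync the two copies disagree, which is the temporary inconsistency that BASE accepts in exchange for staying available.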

4. Ideas for solving distributed transactions

The biggest problem in distributed transactions is keeping the sub-transactions consistent with one another. Drawing on the CAP theorem and BASE theory, there are two approaches:

  • AP mode: each sub-transaction executes and commits separately, so the results may be temporarily inconsistent; remedial measures then restore the data to reach eventual consistency.

  • CP mode: the sub-transactions wait for one another after executing, then commit together or roll back together, achieving strong consistency. While the transactions wait, however, the system is only weakly available.
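A minimal sketch of the AP idea follows, assuming each sub-transaction can supply a compensating action that undoes its own committed change. The Branch class and run_ap function are illustrative names, not an API of any framework.

```python
class Branch:
    """One sub-transaction that commits locally and can later undo itself."""

    def __init__(self, name, value, delta):
        self.name = name
        self.value = value  # committed local state, e.g. a balance or a stock count
        self.delta = delta  # amount this branch deducts

    def execute(self):
        if self.value < self.delta:
            raise ValueError(f"{self.name}: insufficient amount")
        self.value -= self.delta  # committed immediately, visible to others

    def compensate(self):
        self.value += self.delta  # remedial action that restores the old value


def run_ap(branches):
    """Run branches one by one; on failure, compensate the ones already committed."""
    done = []
    try:
        for branch in branches:
            branch.execute()
            done.append(branch)
        return True
    except ValueError:
        for branch in reversed(done):
            branch.compensate()
        return False


account = Branch("account", 1000, 200)
storage = Branch("storage", 2, 10)
ok = run_ap([account, storage])
```

Between the account deduction and its compensation, other readers could observe a balance of 800. That window of inconsistency is the price AP mode pays for not holding resources while branches wait for one another.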

In either mode, the subsystem transactions must communicate with one another and coordinate their status, which requires a transaction coordinator (TC).

Each subsystem transaction here is called a branch transaction; the related branch transactions together are called a global transaction.
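The relationship between a coordinator, a global transaction, and its branches can be sketched as a simple two-phase commit, which corresponds to the CP idea above. This is an illustrative assumption only: the class names, the xid counter, and the prepare/commit/rollback methods are hypothetical and do not reproduce Seata's actual protocol or API.

```python
class BranchTransaction:
    """One branch: a local deduction that is held until the coordinator decides."""

    def __init__(self, name, value, delta):
        self.name = name
        self.value = value
        self.delta = delta
        self.prepared = False

    def prepare(self):
        # Phase 1: validate and hold the change without applying it.
        self.prepared = self.value >= self.delta
        return self.prepared

    def commit(self):
        # Phase 2 (all branches voted yes): apply the held change.
        self.value -= self.delta
        self.prepared = False

    def rollback(self):
        # Phase 2 (some branch voted no): release the held change.
        self.prepared = False


class TransactionCoordinator:
    """Tracks which branches belong to which global transaction."""

    def __init__(self):
        self._next_xid = 1
        self._globals = {}  # global transaction id -> registered branches

    def begin(self):
        xid = self._next_xid
        self._next_xid += 1
        self._globals[xid] = []
        return xid

    def register(self, xid, branch):
        self._globals[xid].append(branch)

    def finish(self, xid):
        branches = self._globals.pop(xid)
        # Collect every vote before deciding, so all branches see phase 1.
        votes = [branch.prepare() for branch in branches]
        if all(votes):
            for branch in branches:
                branch.commit()
            return "committed"
        for branch in branches:
            branch.rollback()
        return "rolled back"


tc = TransactionCoordinator()
account = BranchTransaction("account", 1000, 200)
storage = BranchTransaction("storage", 2, 10)
xid = tc.begin()
tc.register(xid, account)
tc.register(xid, storage)
outcome = tc.finish(xid)
```

Because the storage branch cannot prepare, the coordinator rolls back the whole global transaction and the account balance is never deducted, unlike in the uncoordinated demo earlier.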


Origin blog.csdn.net/qq_45672041/article/details/135030006