Comparison summary of Seata's four modes | Spring Cloud 59

1. Introduction

Through the following series of chapters:

docker-compose implements high-availability deployment of Seata Server | Spring Cloud 51

Seata AT mode theory study, transaction isolation and partial source code analysis | Spring Cloud 52

Spring Boot integrates Seata with AT mode distributed transaction example | Spring Cloud 53

Seata XA mode theory learning, use and precautions | Spring Cloud 54

Seata TCC mode theory learning, production-level use example construction and precautions | Spring Cloud 55

Solve the problems of idempotence, suspension and empty rollback in Seata TCC mode | Spring Cloud 56

Seata Saga mode theory learning, production-level use example construction and precautions (1) | Spring Cloud 57

Seata Saga mode theory learning, production-level use example construction and precautions (2) | Spring Cloud 58

We now have an in-depth understanding of the theory and use of Seata's AT, XA, TCC, and Saga transaction modes. Today we review the basic theory of distributed transactions and then compare and summarize Seata's four modes.

2. Transactions

Put simply, a transaction is a program execution unit that updates data items in a database. Strictly speaking, a transaction must have four properties: atomicity, consistency, isolation, and durability, abbreviated as ACID.

  • Atomicity: All operations within a transaction either all take effect or none of them do.

  • Consistency: A transaction must move the database from one consistent state to another; that is, the data must satisfy the business rules both before and after the transaction executes.

  • Isolation: Transactions that execute concurrently do not interfere with each other.

  • Durability: Once a transaction completes, its changes are stored permanently, and subsequent operations or failures do not affect its result.
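
Atomicity is the easiest of the four properties to see in code. The following is a minimal sketch using Python's built-in sqlite3 module; the `account` table, the `transfer` function, and the sample balances are made up for illustration and have nothing to do with Seata itself. A transfer that fails halfway leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside one local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            row = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if row[0] < 0:
                # Business rule violated: abort, undoing the debit above.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds: both updates are committed
failed = transfer(conn, "alice", "bob", 500)  # fails: the debit is rolled back
```

After both calls the balances are 70 and 30: the failed transfer changed nothing, which is exactly the "all or nothing" guarantee.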

3. Distributed transactions

As the name implies, a distributed transaction is a transaction implemented in a distributed system; it is composed of multiple local transactions.

The root cause of distributed transactions is high concurrency: when a single server is too busy, more servers must be added to help respond to requests. A problem then arises, because there is still only one copy of the data: how do we ensure that the data is correct after each transaction executes in a distributed environment? For example, an inventory service, an order service, and a payment service are deployed separately. When a purchase is made, all three services must complete their corresponding operations, and they must either all succeed or all fail. This is the problem that distributed transactions have to solve.

4. CAP Theory

In a distributed system, the three properties of consistency (Consistency), availability (Availability), and partition tolerance (Partition tolerance) cannot all be guaranteed at once; at most two of them can be achieved at the same time.

  • Consistency: Whether all replicas of the data in a distributed system hold the same value at the same moment (equivalent to all nodes accessing the same, latest copy of the data).

  • Availability: Whether the cluster as a whole can still respond to clients' read and write requests after some of its nodes fail (high availability for data updates).

  • Partition tolerance: In practical terms, a partition is equivalent to a time limit on communication. If the system cannot reach data consistency within the time limit, a partition has occurred, and the current operation must choose between C and A.

[Figure: the three CAP properties drawn as overlapping regions]

The overlapping parts of the figure are the trade-offs that must be made in a distributed environment: a system can be CP, AP, or CA, but CAP (all three at once) does not exist.

CP: Sacrifice availability. Replication generally uses a strict quorum protocol (Paxos, Raft, ZAB) or the 2PC protocol. Systems of the CP type include MongoDB, HBase, ZooKeeper, Redis, Elasticsearch, and so on.

AP: Sacrifice consistency. Replication generally uses a non-strict quorum protocol. Systems of the AP type include CouchDB, Cassandra, Amazon Dynamo, and so on.

CA: Setting aside RDBMSs such as Oracle and MySQL, the distributed systems that claim to be CA systems include Google Spanner and Alibaba's OceanBase.

Q: Why can't a distributed system guarantee both consistency and availability?

A: First, there is a premise. For a distributed system, partition tolerance is the most basic requirement, so when designing one we can essentially only choose between consistency (C) and availability (A).

If consistency (C) is guaranteed: take two nodes, N1 and N2. While data is being written to N1, operations on N2 must be suspended; N2 can serve read and write requests again only after N1 has synchronized the data to N2. Requests that clients submit to N2 while it is suspended fail or time out. Obviously, this conflicts with availability.

If availability (A) is guaranteed: read and write operations on N2 cannot be suspended. But if data is being written to N1 at the same time, N2 may serve data that differs from N1, which violates consistency.
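
The N1/N2 argument can be replayed with a toy model. This is only a sketch: the `Cluster` and `Replica` classes and the `partitioned` flag are hypothetical names, and real systems detect partitions through timeouts rather than a boolean. In "CP" mode the cluster refuses requests it cannot keep consistent; in "AP" mode it keeps answering and N2 may return stale data.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.value = None

class Cluster:
    """Two replicas, N1 and N2. `mode` is "CP" or "AP" (toy model)."""

    def __init__(self, mode):
        self.mode = mode
        self.n1 = Replica("N1")
        self.n2 = Replica("N2")
        self.partitioned = False  # True means N1 cannot reach N2

    def write_n1(self, value):
        if self.partitioned and self.mode == "CP":
            # The write cannot be replicated, so refuse it instead of diverging.
            raise TimeoutError("write rejected: N2 is unreachable")
        self.n1.value = value
        if not self.partitioned:
            self.n2.value = value  # synchronous replication to N2

    def read_n2(self):
        if self.partitioned and self.mode == "CP":
            # N2 cannot prove its copy is the latest, so it does not answer.
            raise TimeoutError("read rejected: N2 may be stale")
        return self.n2.value

# CP: consistent, but not available during the partition.
cp = Cluster("CP")
cp.write_n1("v1")
cp.partitioned = True
try:
    cp.read_n2()
    cp_result = "answered"
except TimeoutError:
    cp_result = "timed out"

# AP: available during the partition, but N2 serves the old value.
ap = Cluster("AP")
ap.write_n1("v1")
ap.partitioned = True
ap.write_n1("v2")
ap_result = ap.read_n2()
```

Here `cp_result` ends up as "timed out", while `ap_result` is the stale "v1" even though N1 already holds "v2".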

The C and A in CAP are quite different from the C and A in ACID:

  • The difference in C:

    • Consistency in ACID is about database rules: the database always moves from one consistent state to another consistent state;

    • Consistency in CAP is about replicating data across multiple servers so that they all hold the same data. Because of network speed limits, the time this replication takes on different servers is not fixed. The cluster maintains a logical view by preventing clients from seeing data that has not yet been synchronized across nodes. This is a notion of consistency specific to distributed systems;

  • The difference in A:

    • A in ACID refers to atomicity (Atomicity): a transaction is treated as an indivisible minimum unit of work, and all of its operations are either committed successfully or rolled back after a failure;

    • A in CAP refers to availability (Availability): whether the cluster as a whole can still respond to clients' read and write requests after some of its nodes fail;

5. BASE theory

BASE is an acronym for Basically Available, Soft state, and Eventually consistent.

In a distributed system, CAP theory is the guiding idea, and BASE theory is an extension of the AP choice in CAP, the result of trading off consistency against availability. Its core idea is that even when strong consistency cannot be achieved, each application can adopt an approach suited to its own business characteristics so that the system reaches eventual consistency.

  • Basically available (Basically Available): When a distributed system fails, it is allowed to lose part of its availability so that the core functions stay available.

  • Soft state (Soft state): The system is allowed to have intermediate states, and these intermediate states are considered not to affect its overall availability. For example, allowing a delay in replica synchronization between nodes is a form of soft state.

  • Eventual consistency (Eventually consistent): All replicas in the system reach a consistent state after a certain period of time. The essence of eventual consistency is that the system must guarantee that the data becomes consistent in the end, without guaranteeing strong consistency in real time.

6. Transaction Consistency

  • Strong consistency: After a piece of data in the system is successfully updated, every subsequent access sees the updated value.

  • Weak consistency: After a piece of data in the system is updated, a subsequent access may get either the updated value or the value from before the change.

  • Eventual consistency: After a piece of data in the system is updated, all accesses return the updated value once a certain period of time has passed.

Rigid transactions: follow the ACID principles; strongly consistent.

Flexible transactions: follow BASE theory; eventually consistent.
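
To make the difference between these levels concrete, here is a small sketch of a store whose replica is updated asynchronously. The class `EventuallyConsistentStore` and its `sync` method are invented for this illustration; `sync` stands in for whatever background replication a real system runs. Between `write` and `sync` the replica is in a soft state and a read may return the old value; after `sync` every read returns the new one.

```python
from collections import deque

class EventuallyConsistentStore:
    """A primary copy plus one replica that lags behind (toy model)."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # updates not yet applied to the replica

    def write(self, key, value):
        self.primary[key] = value
        self.pending.append((key, value))  # replicate later, not now

    def read_primary(self, key):
        return self.primary.get(key)

    def read_replica(self, key):
        return self.replica.get(key)

    def sync(self):
        """Apply all queued updates, in order, to the replica."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value

store = EventuallyConsistentStore()
store.write("stock", 10)
stale = store.read_replica("stock")   # replica has not caught up yet
store.sync()
fresh = store.read_replica("stock")   # replica now matches the primary
```

A strongly consistent design would instead make `write` wait until the replica is updated, at the cost of availability while the replica is unreachable.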

7. Seata

Seata is a distributed transaction solution jointly open-sourced by Ant Financial and Alibaba in January 2019. It is committed to providing high-performance, easy-to-use distributed transaction services and to building a one-stop distributed transaction solution for users.

Official website: http://seata.io/ . Its documentation and blog provide many usage instructions and source code analyses.

7.1 Architecture

Seata has three important roles in transaction management:

  • TC (Transaction Coordinator): maintains the state of global and branch transactions, and coordinates the commit or rollback of global transactions.

  • TM (Transaction Manager): defines the scope of a global transaction: it begins a global transaction, and commits or rolls back a global transaction.

  • RM (Resource Manager): manages the resources that branch transactions work on, talks to the TC to register branch transactions and report their status, and drives the commit or rollback of branch transactions.

Based on this architecture, Seata provides four distributed transaction solutions:

  • XA mode: a strongly consistent phased transaction mode that sacrifices some availability; no business intrusion

  • TCC mode: an eventually consistent phased transaction mode, with business intrusion

  • AT mode: an eventually consistent phased transaction mode, no business intrusion; also Seata's default mode

  • Saga mode: a long-running transaction mode, with business intrusion

7.2 XA mode

7.2.1 Overall mechanism

Within the distributed transaction framework defined by Seata, XA mode is a transaction mode that relies on the XA protocol support of transaction resources (databases, message services, etc.) and manages branch transactions through the mechanisms of the XA protocol.


Phase one, work of the RM:

① Register the branch transaction with the TC

② Execute the branch's business SQL but do not commit

③ Report the execution status to the TC

Phase two, work of the TC:

  • The TC checks the execution status of each branch transaction

    • a. If all branches succeeded, it notifies all RMs to commit
    • b. If any branch failed, it notifies all RMs to roll back

Phase two, work of the RM:

  • Receive the TC's instruction and commit or roll back the transaction
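
The two phases above can be sketched as a toy coordinator. This is a simplified model of the flow, not Seata's implementation or API: the class names, the `fail_on_prepare` switch, and the string states are assumptions made for the example, and no real database or XA driver is involved.

```python
class ResourceManager:
    """One branch transaction (toy model, no real database)."""

    def __init__(self, name, fail_on_prepare=False):
        self.name = name
        self.fail_on_prepare = fail_on_prepare  # hypothetical failure switch
        self.state = "idle"

    def prepare(self):
        """Phase one: run the branch SQL without committing, report status."""
        if self.fail_on_prepare:
            self.state = "failed"
            return False
        self.state = "prepared"  # database locks are held from here on
        return True

    def commit(self):
        self.state = "committed"    # phase two: locks released

    def rollback(self):
        self.state = "rolled_back"  # phase two: locks released

class TransactionCoordinator:
    def run_global_transaction(self, branches):
        # Phase one: every RM executes its branch and reports the result.
        votes = [rm.prepare() for rm in branches]
        # Phase two: commit only if every branch succeeded.
        if all(votes):
            for rm in branches:
                rm.commit()
            return "committed"
        for rm in branches:
            rm.rollback()
        return "rolled_back"

tc = TransactionCoordinator()

good = [ResourceManager("order"), ResourceManager("stock")]
good_result = tc.run_global_transaction(good)

bad = [ResourceManager("order"), ResourceManager("stock", fail_on_prepare=True)]
bad_result = tc.run_global_transaction(bad)
```

Note that every branch stays in the "prepared" state, holding its locks, until the coordinator has heard from all branches. That waiting period is the source of the performance cost discussed next.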

7.2.2 Advantages and disadvantages

  • What are the advantages of XA mode?

    • Transactions are strongly consistent and satisfy the ACID principles.

    • Commonly used databases support it, the implementation is simple, and there is no code intrusion.

  • What are the disadvantages of XA mode?

    • Database resources are locked in the first phase and released only at the end of the second phase, so performance is poor.

    • It relies on relational databases to implement transactions.

7.3 AT mode

7.3.1 Overall mechanism

AT mode is an evolution of the two-phase commit protocol (an eventually consistent phased transaction mode). It makes up for XA mode's defect of holding resource locks for too long.

Phase one, work of the RM:

  • Register the branch transaction

  • Record the undo-log (data snapshot)

  • Execute the business SQL and commit

  • Report the transaction status

Phase two, work of the RM on commit:

  • Simply delete the undo-log

Phase two, work of the RM on rollback:

  • Restore the data to its pre-update state according to the undo-log
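
The undo-log idea can be sketched in a few lines. This is a deliberately simplified model, not Seata's code: the `AtBranch` class is hypothetical, a dict stands in for a database table, and only a before-image is kept. As far as I know, real AT mode also stores an after-image and checks it for dirty writes before restoring; that check is omitted here.

```python
import copy

class AtBranch:
    """Toy AT-mode branch: `table` maps primary key -> row dict."""

    def __init__(self, table):
        self.table = table
        self.undo_log = {}  # xid -> list of (primary key, before-image)

    def execute(self, xid, pk, changes):
        """Phase one: snapshot the row, apply the update, commit locally."""
        before = copy.deepcopy(self.table[pk])
        self.undo_log.setdefault(xid, []).append((pk, before))
        self.table[pk].update(changes)

    def commit(self, xid):
        """Phase two (commit): the data is already committed, drop the log."""
        self.undo_log.pop(xid, None)

    def rollback(self, xid):
        """Phase two (rollback): restore before-images, newest first."""
        for pk, before in reversed(self.undo_log.pop(xid, [])):
            self.table[pk] = before

table = {1: {"stock": 10}}
branch = AtBranch(table)

branch.execute("xid-1", 1, {"stock": 8})
branch.rollback("xid-1")
after_rollback = table[1]["stock"]   # back to the value before the update

branch.execute("xid-2", 1, {"stock": 7})
branch.commit("xid-2")
after_commit = table[1]["stock"]     # the update stays
```

Because phase one already commits locally, no database lock is held while waiting for the global decision; the price is that a rollback has to be reconstructed from the snapshot.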

7.3.2 The difference between AT and XA

  • In the first phase of XA mode, the transaction is not committed and resources stay locked; in the first phase of AT mode, the transaction is committed directly and resources are not locked.

  • XA mode relies on the database's own mechanism for rollback; AT mode uses data snapshots to roll data back.

  • XA mode is strongly consistent; AT mode is eventually consistent.

7.4 TCC mode

7.4.1 Overall mechanism

TCC as a whole is a two-phase commit model. A global transaction is composed of several branch transactions, and each branch transaction must meet the requirements of the two-phase commit model; that is, each branch transaction needs its own:

  • first-phase Try behavior
  • second-phase Confirm or Cancel behavior


This process involves three methods: Try, Confirm, and Cancel. All three are user-defined methods that we have to implement ourselves. Compared with AT mode, TCC mode does not depend on the transaction support of the underlying database.

7.4.2 Advantages and disadvantages

What does each phase of TCC mode do?

  • Try: resource checking and reservation

  • Confirm: business execution and commit

  • Cancel: release of the reserved resources
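
A minimal sketch of the three phases, using an account whose money is reserved ("frozen") in Try, spent in Confirm, and given back in Cancel. The names `AccountTcc`, `try_`, `run_tcc`, and the `frozen` field are assumptions made for this example; in Seata the three methods are declared on a business interface instead, which is not shown here.

```python
class AccountTcc:
    """Toy TCC participant: reserve in Try, settle in Confirm or Cancel."""

    def __init__(self, balance):
        self.balance = balance
        self.frozen = {}  # xid -> amount reserved by Try

    def try_(self, xid, amount):
        """Try: check the resource and reserve it."""
        if self.balance < amount:
            return False
        self.balance -= amount
        self.frozen[xid] = amount
        return True

    def confirm(self, xid):
        """Confirm: the reserved amount is actually spent."""
        self.frozen.pop(xid, None)

    def cancel(self, xid):
        """Cancel: release the reservation back to the balance."""
        self.balance += self.frozen.pop(xid, 0)

def run_tcc(xid, steps):
    """steps: list of (participant, amount). Returns True if confirmed."""
    tried = []
    for participant, amount in steps:
        if participant.try_(xid, amount):
            tried.append(participant)
        else:
            # One Try failed: cancel every participant that already reserved.
            for done in tried:
                done.cancel(xid)
            return False
    for participant in tried:
        participant.confirm(xid)
    return True

a = AccountTcc(100)
b = AccountTcc(10)
first = run_tcc("tx1", [(a, 30), (b, 5)])    # both Try calls succeed
second = run_tcc("tx2", [(a, 30), (b, 50)])  # b cannot reserve 50
```

After the second call, `a` has its 30 back: its Try succeeded, but the global transaction was cancelled because `b` could not reserve enough.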

What are the advantages of TCC?

  • The first phase commits its local transaction directly and releases database resources, so performance is good

  • Compared with AT mode, there is no need to generate snapshots or use global locks, so its performance is the best

  • It does not rely on database transactions but on compensation operations, so it can be used with non-transactional databases

What are the disadvantages of TCC?

  • There is code intrusion: the Try, Confirm, and Cancel interfaces have to be written by hand, which is troublesome

  • Soft state: transactions are eventually consistent

  • Failures of Confirm and Cancel must be considered, and both need idempotent processing
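
Idempotent processing usually means recording the state of each branch so that a retried Confirm or Cancel does nothing the second time. The sketch below shows one possible approach under stated assumptions: the `status` dictionary plays the role of a transaction record table, and all names are hypothetical. The same record also covers an empty rollback (Cancel arrives without a prior Try) and suspension (a late Try arrives after Cancel).

```python
class GuardedAccountTcc:
    """Toy TCC participant that tolerates retries and out-of-order calls."""

    def __init__(self, balance):
        self.balance = balance
        self.frozen = {}  # xid -> reserved amount
        self.status = {}  # xid -> "tried" | "confirmed" | "cancelled"

    def try_(self, xid, amount):
        if xid in self.status:
            # Duplicate Try, or a late Try after Cancel (suspension guard).
            return self.status[xid] == "tried"
        if self.balance < amount:
            return False
        self.balance -= amount
        self.frozen[xid] = amount
        self.status[xid] = "tried"
        return True

    def confirm(self, xid):
        if self.status.get(xid) != "tried":
            return  # already confirmed or cancelled: nothing to do
        self.frozen.pop(xid)
        self.status[xid] = "confirmed"

    def cancel(self, xid):
        state = self.status.get(xid)
        if state is None:
            # Empty rollback: Try never ran, so only leave a marker.
            self.status[xid] = "cancelled"
            return
        if state != "tried":
            return  # already confirmed or cancelled: nothing to do
        self.balance += self.frozen.pop(xid)
        self.status[xid] = "cancelled"

acct = GuardedAccountTcc(100)

acct.try_("tx1", 30)
acct.cancel("tx1")
acct.cancel("tx1")                  # retry: must not refund twice

acct.cancel("tx2")                  # empty rollback
late_try = acct.try_("tx2", 50)     # late Try is refused

acct.try_("tx3", 20)
acct.confirm("tx3")
acct.confirm("tx3")                 # retry: no effect
```

Without the status record, the second `cancel("tx1")` could not tell a retry from a first call; here it is simply ignored and the balance ends at 80.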

7.5 Saga mode

7.5.1 Overall mechanism

Saga mode is the long-running transaction solution provided by Seata. In Saga mode, each participant in the business process commits its own local transaction. When a participant fails, the participants that already succeeded are compensated. Both the first-phase forward services and the second-phase compensation services are implemented by business developers.
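
The forward-then-compensate flow can be sketched as follows. This is only an illustration of the idea: `run_saga` and the step functions are hypothetical, and a plain list of steps stands in for the state machine that Seata's Saga mode is configured with. Each step commits immediately; when one fails, the steps that already succeeded are compensated in reverse order.

```python
def run_saga(steps):
    """steps: list of (name, action, compensation). Returns (ok, log)."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()  # forward service: commits its own local change
            log.append(f"{name}: done")
            completed.append((name, compensate))
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            # Compensate the successful steps, most recent first.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return False, log
    return True, log

state = {"stock": 10, "orders": 0}

def deduct_stock():
    state["stock"] -= 1

def restore_stock():
    state["stock"] += 1

def create_order():
    state["orders"] += 1

def cancel_order():
    state["orders"] -= 1

def charge_payment():
    raise RuntimeError("payment declined")

ok, log = run_saga([
    ("stock", deduct_stock, restore_stock),
    ("order", create_order, cancel_order),
    ("payment", charge_payment, lambda: None),
])
```

Between `deduct_stock` and `restore_stock`, other requests can already see the reduced stock. That window is what the lack of isolation in Saga mode refers to.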

Theoretical basis: the paper Sagas, published by Hector Garcia-Molina and Kenneth Salem in 1987.

7.5.2 Advantages and disadvantages

  • Applicable scenarios:
    • The business process is long and has many steps
    • Participants include other companies or legacy system services that cannot provide the three interfaces required by TCC mode
  • Advantages:
    • Local transactions are committed in the first phase; lock-free, high performance
    • Event-driven architecture; participants can execute asynchronously, high throughput
    • Compensation services are easy to implement
  • Disadvantage:
    • Isolation is not guaranteed

8. Comparison of Seata's four modes

| - | XA | AT | TCC | SAGA |
|---|---|---|---|---|
| Consistency | Strong consistency | Weak consistency | Weak consistency | Eventually consistent |
| Isolation | Completely isolated | Isolation based on global locks | Isolation based on resource reservation | No isolation |
| Code intrusion | None | None | Yes, three interfaces to write | Yes, a state machine and compensation business to write |
| Performance | Poor | Good | Very good | Very good |
| Scenarios | Services with high requirements for consistency and isolation | Most distributed transaction scenarios based on relational databases | Transactions with high performance requirements, and transactions involving non-relational databases | Long business processes with many steps; participants include other companies or legacy system services that cannot provide the three interfaces required by TCC mode |


Origin blog.csdn.net/ctwy291314/article/details/131430012