The past and present of distributed transactions


1. What is a transaction

         In everyday usage, a transaction is something to be done or something that has been done. In computing, it refers to a program execution unit that accesses and possibly updates data items in a database. Transactions are usually initiated by user programs written in a high-level data manipulation language or programming language (such as SQL, C++ or Java), and are delimited by statements (or function calls) such as begin transaction and end transaction. A transaction consists of all operations performed between its beginning and its end.

     (1) The four properties of transactions (ACID)

        Atomicity: A transaction is an indivisible unit of work. The operations in a transaction either all happen or none of them happen.

        Consistency: When the transaction completes, all data must be in a consistent state. In a relational database, all rules must be applied to the transaction's modifications so that data integrity is maintained (example: in a transfer, the sum of the two account balances remains unchanged).

        Isolation: When multiple users access the database concurrently, one user's transaction must not be interfered with by other users' transactions. Data seen by concurrent transactions must be isolated from each other.

        Durability: Once a transaction is committed, its effect on the database is permanent.
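The transfer example above can be tried out with Python's built-in sqlite3 module. This is only a minimal sketch: the account table, the names alice and bob, and the transfer helper are all made up for illustration. It shows atomicity (a failed transfer leaves no partial update) and consistency (the total balance stays the same).

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one transaction (hypothetical helper)."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            # Business rule for this sketch: no account may go negative.
            (lowest,) = conn.execute("SELECT MIN(balance) FROM account").fetchone()
            if lowest < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)        # both updates are committed together
failed = transfer(conn, "alice", "bob", 500)   # rule violated, both updates are rolled back
total = sum(balances(conn).values())           # unchanged by either call
```

After the second call the balances are the same as after the first one, because the two UPDATE statements were undone together.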

    (2) Transaction isolation levels

          Isolation levels are defined by which of the following read anomalies they allow:

          Dirty read: A transaction can read data written by another transaction that has not yet committed.

          Non-repeatable read: A transaction reads the same data more than once, and another transaction commits an update between the reads, so the same transaction gets different values from the two reads.

          Phantom read: There are two transactions, A and B. Transaction A first performs a batch update on the rows of a table, then transaction B inserts a new row into the table. When A queries the table again, a row that was not updated appears, as if A were hallucinating.

        Read uncommitted: Dirty reads, non-repeatable reads, and phantom reads can all occur.

        Read committed: Dirty reads are prevented. Non-repeatable reads and phantom reads can still occur.

        Repeatable read: Dirty reads and non-repeatable reads are prevented. Phantom reads can still occur.

        Serializable: Transactions behave as if they were executed one after another. All of the above problems are avoided.
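To make the difference between the first three levels concrete, here is a toy simulation. It is an illustration under heavy assumptions, not how a real database works: the Store and Reader classes are invented for this sketch, there is a single value and a single writer, and real engines use locks or multi-version concurrency control. Phantom reads and the serializable level are not modeled.

```python
_UNSET = object()

class Store:
    """Toy store holding one committed value and one uncommitted write."""
    def __init__(self, value):
        self.committed = value
        self.pending = _UNSET      # uncommitted write from the writer transaction

    def write(self, value):
        self.pending = value

    def commit(self):
        if self.pending is not _UNSET:
            self.committed, self.pending = self.pending, _UNSET

class Reader:
    """A reading transaction running at a given isolation level."""
    def __init__(self, store, level):
        self.store, self.level = store, level
        self.snapshot = _UNSET

    def read(self):
        s = self.store
        if self.level == "read uncommitted":
            # Sees uncommitted data if there is any: a dirty read.
            return s.committed if s.pending is _UNSET else s.pending
        if self.level == "read committed":
            # Always sees the latest committed value.
            return s.committed
        # "repeatable read": the first read fixes the value for this transaction.
        if self.snapshot is _UNSET:
            self.snapshot = s.committed
        return self.snapshot

def scenario(level):
    store = Store(100)
    reader = Reader(store, level)
    first = reader.read()
    store.write(200)               # writer updates but has not committed yet
    during = reader.read()
    store.commit()                 # writer commits
    after = reader.read()
    return first, during, after

results = {lvl: scenario(lvl) for lvl in
           ("read uncommitted", "read committed", "repeatable read")}
```

In this sketch, read uncommitted returns (100, 200, 200), read committed returns (100, 100, 200), and repeatable read returns (100, 100, 100).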

2. What is a distributed transaction?
     Simply put, a distributed transaction ensures data consistency across different nodes in a distributed system.

3. Why use distributed transactions
     Scenario: With the growth of the Internet, most Internet companies split their systems into services, and this service-oriented approach has become more and more popular. Consider a customer placing and paying for an order: the order service successfully creates the order, but the inventory service still has to deduct the stock. If the inventory deduction fails, how can data consistency between the two services be guaranteed? This is where distributed transactions become necessary.
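The problem in this scenario can be reproduced in a few lines. The sketch below assumes two separate in-memory SQLite databases standing in for the order service and the inventory service. The table names and the place_order function are hypothetical. Each service commits its own local transaction, so a failure in the second step cannot undo the first.

```python
import sqlite3

orders = sqlite3.connect(":memory:")      # database owned by the order service
inventory = sqlite3.connect(":memory:")   # database owned by the inventory service

orders.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT, qty INTEGER)")
inventory.execute("CREATE TABLE stock (item TEXT PRIMARY KEY, qty INTEGER CHECK (qty >= 0))")
inventory.execute("INSERT INTO stock VALUES ('book', 1)")
inventory.commit()

def place_order(item, qty):
    # Step 1: the order service commits its local transaction.
    with orders:
        orders.execute("INSERT INTO orders (item, qty) VALUES (?, ?)", (item, qty))
    # Step 2: the inventory service runs a separate local transaction.
    with inventory:
        inventory.execute("UPDATE stock SET qty = qty - ? WHERE item = ?", (qty, item))

try:
    place_order("book", 2)    # only 1 in stock, so the CHECK constraint fails in step 2
    error = None
except sqlite3.IntegrityError as exc:
    error = exc

order_count = orders.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
stock_left = inventory.execute("SELECT qty FROM stock WHERE item = 'book'").fetchone()[0]
```

The order row exists even though the stock was never deducted. The two local transactions are each atomic, but the business operation as a whole is not.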

4. Theoretical knowledge of distributed transactions

   (1) CAP theorem
        C: Consistency. Clients see a set of operations as taking effect at the same time, so every read returns the most recent write.

        A: Availability. Every operation must end with a definite response within a reasonable time.

        P: Partition tolerance. Operations can still complete even if individual components are unavailable or nodes cannot reach each other.

        It is generally not possible to satisfy all three CAP properties at once, so a system usually picks two of the three: CP or AP. Most Internet companies give up strong consistency and accept weak consistency in exchange for high availability (AP).

        XA (global transactions), by contrast, opts for strong consistency.

   (2) BASE theory

       BASE is the result of weighing consistency against availability in CAP. The core idea is that even if strong consistency cannot be achieved, each application can use a method suited to its own business characteristics to make the system eventually consistent.

      BA: Basically Available. Even when failures occur, 'availability' is still guaranteed in a degraded form: a definite result is still returned, although it may take longer than usual (this is the difference between basic availability and high availability).

      S: Soft state. Different replicas of the same data do not need to be consistent in real time.

      E: Eventually consistent. Different replicas of the same data do not need to be consistent in real time, but they must become consistent after a certain period of time.

     Acid-base balance

     ACID is an acid and BASE is an alkali. Choosing between ACID and BASE for a given development scenario is a matter of acid-base balance.

   (3) Two-phase commit (2PC) and three-phase commit (3PC) concepts
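As a rough illustration of the two-phase commit idea: a coordinator first asks every participant to prepare and vote, and only if all of them vote yes does it tell them to commit. Otherwise it tells them all to roll back. The Participant class and two_phase_commit function below are hypothetical names for this sketch. Timeouts, crash recovery, and the extra pre-commit phase of 3PC are not modeled.

```python
class Participant:
    """A resource manager taking part in the transaction (toy model)."""
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "init"

    def prepare(self):
        # Phase 1: do the work, hold the locks, and vote yes or no.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        # Phase 2 (all voted yes): make the changes permanent.
        self.state = "committed"

    def rollback(self):
        # Phase 2 (someone voted no): undo the prepared work.
        self.state = "aborted"

def two_phase_commit(participants):
    """Coordinator logic: collect all votes, then commit or roll back everyone."""
    votes = [p.prepare() for p in participants]       # phase 1: voting
    if all(votes):
        for p in participants:                         # phase 2: commit
            p.commit()
        return "committed"
    for p in participants:                             # phase 2: rollback
        p.rollback()
    return "aborted"

order_db, stock_db = Participant("order"), Participant("stock")
outcome_ok = two_phase_commit([order_db, stock_db])

order_db2, stock_db2 = Participant("order"), Participant("stock", can_commit=False)
outcome_fail = two_phase_commit([order_db2, stock_db2])
```

In the second run the order participant had voted yes, but it is still rolled back because the stock participant voted no.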

5. Implementation of distributed transactions

   (1) XA protocol

      Global transactions with strong consistency

   (2) Reliable delivery of MQ messages (as shown in the figure below)
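One common way to get reliable message delivery is a local message table, sometimes called a transactional outbox. The sketch below assumes that approach, which may differ from the design in the figure. The table names, the in-memory deque standing in for a real message queue, and the relay and consume functions are all made up for illustration. The business row and the message row are written in the same local transaction, a relay resends until the message is marked as sent, and the consumer is idempotent so duplicates are harmless.

```python
import sqlite3
from collections import deque

db = sqlite3.connect(":memory:")   # the order service's local database
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
db.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY, payload TEXT, sent INTEGER DEFAULT 0)")

queue = deque()        # stand-in for a real message broker
processed = set()      # consumer-side record of message ids already handled
stock = {"book": 5}    # state owned by the inventory service

def create_order(item):
    # The order and its message commit (or roll back) together.
    with db:
        db.execute("INSERT INTO orders (item) VALUES (?)", (item,))
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (item,))

def relay(crash_before_mark=False):
    """Publish unsent messages and mark them sent only after publishing."""
    rows = db.execute("SELECT id, payload FROM outbox WHERE sent = 0").fetchall()
    for msg_id, payload in rows:
        queue.append((msg_id, payload))
        if crash_before_mark:
            return     # simulate a crash: message published but not marked
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE id = ?", (msg_id,))

def consume():
    """Inventory side: deduct stock once per message id (idempotent)."""
    while queue:
        msg_id, item = queue.popleft()
        if msg_id in processed:
            continue   # duplicate delivery, already handled
        stock[item] -= 1
        processed.add(msg_id)

create_order("book")
relay(crash_before_mark=True)   # first attempt publishes, then "crashes"
relay()                         # retry publishes the same message again
consume()
```

The message is delivered twice, but the stock is deducted only once. This gives at-least-once delivery on the sending side and idempotent handling on the receiving side, which together lead to eventual consistency.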

      

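   (3) TCC (Try-Confirm-Cancel), a 2PC-style way to implement distributed transactions (specific code examples will be added and released to GitHub later)
    In the TCC approach, once the try phase succeeds, the confirm phase is assumed to succeed.

    Try phase: check and reserve the business resources.

    Confirm phase: carry out the business operation using the reserved resources.

    Cancel phase: release the reserved resources.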
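The three phases can be sketched roughly as follows. This is a simplified illustration, not the API of any of the frameworks listed in this section. The Resource class and the run_tcc function are hypothetical, and retries, timeouts, and persistence of the transaction log are left out.

```python
class Resource:
    """A TCC participant holding some quantity, such as stock or a balance."""
    def __init__(self, available):
        self.available = available
        self.reserved = {}            # transaction id -> amount frozen by try_

    def try_(self, txid, amount):
        # Try: check the resource and reserve (freeze) it.
        if amount > self.available:
            return False
        self.available -= amount
        self.reserved[txid] = amount
        return True

    def confirm(self, txid):
        # Confirm: consume the reserved amount. Safe to call more than once.
        self.reserved.pop(txid, None)

    def cancel(self, txid):
        # Cancel: give the reserved amount back. Safe to call more than once.
        self.available += self.reserved.pop(txid, 0)

def run_tcc(txid, steps):
    """steps is a list of (resource, amount). Returns True if confirmed."""
    tried = []
    for resource, amount in steps:
        if resource.try_(txid, amount):
            tried.append(resource)
        else:
            for r in tried:           # a try failed: cancel what was reserved
                r.cancel(txid)
            return False
    for resource, _ in steps:         # all tries succeeded: confirm everything
        resource.confirm(txid)
    return True

stock = Resource(10)
balance = Resource(100)
first = run_tcc("tx1", [(stock, 2), (balance, 60)])    # both reservations succeed
second = run_tcc("tx2", [(stock, 2), (balance, 60)])   # balance is now too low
```

In the second call the stock reservation succeeds but the balance check fails, so the reserved stock is released again and nothing changes.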

       

      tcc-transaction framework source code: https://github.com/changmingxie/tcc-transaction

      EasyTransaction framework source code: https://github.com/QNJR-GROUP/EasyTransaction

      Hmily distributed transaction framework: https://github.com/Dromara/hmily

   (4) Global consistency

     Implemented at the infrastructure level, for example by using services from Alibaba Cloud or Tencent Cloud to achieve global consistency.

    


Origin blog.csdn.net/aazhzhu/article/details/100344555