Talking about Distributed Transaction Theory

Basic concepts

What is a transaction

What is a transaction? Take an example from everyday life: you buy something at a small shop. "Payment with one hand, delivery with the other" is a transaction. Both the payment and the delivery must succeed for the transaction to count as successful. If either activity fails, the transaction cancels every activity that has already succeeded.

With that example in mind, consider the definition: a transaction can be regarded as one large activity composed of several smaller activities, which either all succeed or all fail.

Local transactions

In computer systems, transactions are mostly controlled through a relational database, using the transactional features of the database itself; these are therefore called database transactions. Because the application relies mainly on the relational database to control transactions, and the database usually sits on the same server as the application, transactions based on a relational database are also called local transactions.

Recall the four ACID properties of database transactions:

A (Atomicity): All operations that make up a transaction are either all executed or not executed at all. Partial success with partial failure is impossible.

C (Consistency): The database's consistency constraints hold both before and after the transaction executes. For example, when Zhang San transfers 100 yuan to Li Si, the data must be correct before and after the transfer. If Zhang San's account is debited 100 yuan but Li Si's account is not credited 100 yuan, the data is wrong and consistency has not been achieved.

I (Isolation): Transactions in a database generally run concurrently. Isolation means that two concurrent transactions do not interfere with each other, and one transaction cannot see the intermediate state of another running transaction. Configuring the transaction isolation level avoids problems such as dirty reads and non-repeatable reads.

D (Durability): Once a transaction has committed, the data changes it made are persisted to the database and will not be rolled back.

When a database implements a transaction, it folds all the operations involved into one indivisible unit of execution. The operations in that unit either all succeed or all fail; if any one of them fails, the entire transaction is rolled back.
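
As a minimal sketch of this all-or-nothing behavior, the following uses Python's built-in sqlite3 module with an in-memory database. The table name `account`, the account names, and the helper functions `transfer` and `balances` are assumptions made for illustration and do not come from the original article.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one atomic local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            credited = conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            if credited.rowcount != 1:
                # The credit did not hit any row: abort the whole transaction.
                raise LookupError(f"no such account: {dst}")
        return True
    except LookupError:
        return False

def balances(conn):
    """Return all balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("zhang_san", 500), ("li_si", 200)])
conn.commit()

ok = transfer(conn, "zhang_san", "li_si", 100)       # both updates commit
failed = transfer(conn, "zhang_san", "nobody", 100)  # the debit is rolled back
print(ok, failed, balances(conn))
```

The second transfer debits Zhang San first and then fails on the credit, yet the debit does not survive, because the database rolls back the whole unit.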

Distributed transaction

With the rapid development of the Internet, software systems have evolved from monolithic applications into distributed applications. The following figure describes the evolution from a monolithic application to microservices:
[Figure: evolution from a monolithic application to microservices]

A distributed system splits an application into multiple services that can be deployed independently, so services must collaborate remotely to complete a transactional operation. A transaction completed by different services collaborating remotely over the network in such an environment is called a distributed transaction. For example, awarding bonus points on user registration, reducing inventory on order creation, and bank transfers are all distributed transactions.

We know that local transactions depend on the transaction characteristics provided by the database itself, so the following logic can control local transactions:

begin transaction;
// 1. Local database operation: decrease Zhang San's balance
// 2. Local database operation: increase Li Si's balance
commit transaction;

In a distributed environment, however, it becomes:

begin transaction;
// 1. Local database operation: decrease Zhang San's balance
// 2. Remote call: ask the remote service to increase Li Si's balance
commit transaction;

Suppose the remote call successfully increases Li Si's balance, but its response never arrives because of a network problem. The local transaction then fails to commit and the decrease of Zhang San's balance is rolled back. At this point the data of Zhang San and Li Si is inconsistent.

Therefore, in a distributed architecture, traditional database transactions alone are no longer sufficient. The accounts of Zhang San and Li Si are not in the same database, or even in the same application system, so the transfer requires a remote call, and network problems then give rise to the distributed transaction problem.
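
The failure described above can be reproduced in a small simulation. This is only a sketch under stated assumptions: two separate in-memory SQLite databases stand in for the two services, and the function `remote_credit` with its `response_lost` flag is a hypothetical stand-in for a remote call whose response is dropped by the network.

```python
import sqlite3

def open_bank(name, balance):
    """Create an independent in-memory database holding one account."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO account VALUES (?, ?)", (name, balance))
    conn.commit()
    return conn

def balance(conn, name):
    row = conn.execute("SELECT balance FROM account WHERE name = ?", (name,))
    return row.fetchone()[0]

def remote_credit(remote, name, amount, response_lost):
    """Hypothetical remote service: commits first, then the reply may be lost."""
    with remote:
        remote.execute(
            "UPDATE account SET balance = balance + ? WHERE name = ?",
            (amount, name),
        )
    if response_lost:
        raise TimeoutError("no response from remote service")

def transfer(local, remote, amount, response_lost=False):
    try:
        with local:  # local transaction: rolled back if the remote call raises
            local.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, "zhang_san"),
            )
            remote_credit(remote, "li_si", amount, response_lost)
        return True
    except TimeoutError:
        return False

local = open_bank("zhang_san", 500)
remote = open_bank("li_si", 200)
succeeded = transfer(local, remote, 100, response_lost=True)
total = balance(local, "zhang_san") + balance(remote, "li_si")
print(succeeded, balance(local, "zhang_san"), balance(remote, "li_si"), total)
```

The local rollback restores Zhang San's balance, but the remote credit has already committed, so the two databases together now hold 100 yuan more than before the transfer.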

Scenarios that give rise to distributed transactions

  1. The typical scenario is a microservice architecture in which microservices complete a transactional operation through remote calls. For example, with an order microservice and an inventory microservice, the order microservice asks the inventory microservice to reduce inventory when an order is placed. In short: the distributed transaction arises across JVM processes.

[Figure: the order microservice remotely calling the inventory microservice]

  2. A single system accesses multiple database instances. For example, user information and order information are stored in two separate MySQL instances. To delete a user in the user management system, both the user information and the user's order information must be deleted. Because the data lives in different database instances, it must be manipulated through different database connections, and a distributed transaction arises. In short: the distributed transaction arises across database instances.

[Figure: a single application accessing two database instances]

  3. Multiple services access the same database instance. For example, an order microservice and an inventory microservice produce a distributed transaction even when they access the same database. The reason is that the two microservices run in different JVM processes and hold different database connections.

[Figure: two microservices accessing the same database instance]

Basic Theory of Distributed Transactions

The previous sections covered the basic concepts of distributed transactions. Unlike a local transaction, a distributed system is called distributed because the nodes that provide its services sit on different machines and interact over the network. The system as a whole must not stop providing service because of a minor network problem, so network factors become one of the criteria that distributed transactions have to consider. Distributed transactions therefore need further theoretical support, starting with the CAP theory.

Before looking at solutions for controlling distributed transactions, we need some basic theory. It helps us determine the goals of distributed transaction control and makes each solution easier to understand.

CAP theory

CAP is an abbreviation of Consistency, Availability, and Partition tolerance.
To make the CAP theory easier to understand, each property is explained below using a business scenario from an e-commerce system.
The following figure shows the execution flow of commodity information management:

[Figure: the commodity service writes to the master database and reads from the slave database]
The overall execution flow is as follows:
1. The commodity service asks the master database to write commodity information (add, modify, or delete a commodity).
2. The master database responds to the commodity service that the write succeeded.
3. The commodity service asks the slave database to read commodity information.

C - Consistency:

Consistency means that a read performed after a write returns the latest data. When the data is distributed across multiple nodes, a read from any node returns the latest state.

In the figure above, read/write consistency of commodity information means achieving the following goals:
1. If the commodity service writes to the master database successfully, a query for the new data on the slave database also succeeds.
2. If the commodity service fails to write to the master database, a query for the new data on the slave database also fails.

How to achieve consistency?
1. After writing to the master database, synchronize the data to the slave database.
2. After writing to the master database, lock the slave database while it is being synchronized and release the lock once synchronization completes, so that a query on the slave database cannot return old data after the new data has been written successfully.

Characteristics of consistency in a distributed system:
1. Because of the data synchronization step, the response to a write operation is delayed to some extent.
2. To guarantee data consistency, resources are locked temporarily and released once data synchronization completes.
3. A request to a node whose data synchronization has failed returns an error; it never returns old data.
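
The three characteristics above can be sketched with a toy in-process model. Everything here is an assumption made for illustration: the class `ConsistentStore`, the `link_up` flag that simulates the network between master and slave, and the exception `StaleReplicaError`. Real databases implement replication very differently.

```python
import threading

class StaleReplicaError(Exception):
    """Raised instead of returning data that may be out of date."""

class ConsistentStore:
    def __init__(self):
        self.master = {}
        self.slave = {}
        self.link_up = True        # simulated network between master and slave
        self.slave_in_sync = True
        self._lock = threading.Lock()

    def write(self, key, value):
        # The lock is held for the whole synchronization, so readers wait
        # (the write is slower, but no reader can observe old data).
        with self._lock:
            self.master[key] = value
            if self.link_up:
                self.slave.update(self.master)  # catch up on everything missed
                self.slave_in_sync = True
            else:
                self.slave_in_sync = False      # synchronization failed

    def read_from_slave(self, key):
        with self._lock:
            if not self.slave_in_sync:
                # Consistency over availability: error, never stale data.
                raise StaleReplicaError(key)
            return self.slave[key]

store = ConsistentStore()
store.write("sku-1", "v1")
first_read = store.read_from_slave("sku-1")

store.link_up = False          # the network between the nodes breaks
store.write("sku-1", "v2")
try:
    store.read_from_slave("sku-1")
    got_error = False
except StaleReplicaError:
    got_error = True
print(first_read, got_error)
```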

A - Availability:

Availability means that every transactional operation gets a response, with no response timeout and no response error.

In the figure above, availability of commodity information reads means achieving the following goals:
1. When the slave database receives a query request, it responds with the query result immediately.
2. The slave database must not respond with a timeout or an error.

How to achieve availability?
1. After writing to the master database, synchronize the data to the slave database.
2. To keep the slave database available, resources in the slave database must not be locked.
3. Even if the data has not been synchronized yet, the slave database must return the queried data, even if it is old. If there is no old data, it may return an agreed default message, but never an error or a response timeout.

Characteristics of availability in a distributed system:
1. Every request receives a response, with no response timeout and no response error.
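
A toy model of the availability-first behavior might look like the following. The class `AvailableStore`, the `link_up` flag, and the `DEFAULT` message are hypothetical names chosen for this sketch; the point is only that a read never locks, never raises, and may return old data.

```python
class AvailableStore:
    DEFAULT = "information not available yet"  # agreed default message

    def __init__(self):
        self.master = {}
        self.slave = {}
        self.link_up = True   # simulated network between master and slave

    def write(self, key, value):
        self.master[key] = value
        if self.link_up:
            self.slave.update(self.master)
        # If the link is down the write still succeeds; the slave lags behind.

    def read_from_slave(self, key):
        # No lock and no error: return old data or the default message.
        return self.slave.get(key, self.DEFAULT)

store = AvailableStore()
store.write("sku-1", "v1")
store.link_up = False
store.write("sku-1", "v2")    # not yet visible on the slave
store.write("sku-2", "new")   # never reached the slave at all

stale = store.read_from_slave("sku-1")
missing = store.read_from_slave("sku-2")
print(stale, "|", missing)
```

The caller always gets an answer, at the price of seeing "v1" after "v2" has been written.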

P - Partition tolerance:

The nodes of a distributed system are usually deployed in different subnets; this is network partitioning. Communication failures between nodes caused by network problems are inevitable, and a system that can still provide service to the outside world when they occur is said to be partition tolerant.

In the figure above, partition tolerance of commodity information reads and writes means achieving the following goals:
1. A failure to synchronize data from the master database to the slave database does not affect read and write operations.
2. The failure of one node does not affect the service that the other node provides to the outside world.

How to achieve partition tolerance?
1. Prefer asynchronous operations over synchronous ones, for example by synchronizing data from the master database to the slave database asynchronously, so that the nodes are loosely coupled.
2. Add slave database nodes, so that when one slave node goes down the other slave nodes continue to provide service.

Characteristics of partition tolerance in a distributed system:
1. Partition tolerance is a basic capability that every distributed system must have.
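
The asynchronous synchronization mentioned above can be sketched with a pending-change queue. This is a simplified, single-process illustration: the class `AsyncReplicatedStore`, its `replicate` method, and the `link_up` flag are assumptions for this example, and a real system would run replication in the background rather than on demand.

```python
from collections import deque

class AsyncReplicatedStore:
    def __init__(self):
        self.master = {}
        self.slave = {}
        self.pending = deque()  # changes not yet applied to the slave
        self.link_up = True     # simulated network between master and slave

    def write(self, key, value):
        # The write completes without waiting for the slave.
        self.master[key] = value
        self.pending.append((key, value))

    def replicate(self):
        """Apply queued changes in order while the link is up."""
        applied = 0
        while self.link_up and self.pending:
            key, value = self.pending.popleft()
            self.slave[key] = value
            applied += 1
        return applied

store = AsyncReplicatedStore()
store.link_up = False
store.write("sku-1", "v1")
store.write("sku-1", "v2")
applied_during_partition = store.replicate()  # nothing can be shipped yet

store.link_up = True
applied_after_recovery = store.replicate()    # the backlog drains in order
print(applied_during_partition, applied_after_recovery, store.slave)
```

Because writes only enqueue a change, a broken link delays synchronization but does not block the master.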

CAP combinations

1. Does the commodity management example above satisfy C, A, and P at the same time?

No distributed transaction scenario can have all three CAP properties at once, because C and A cannot coexist once P is required.

For example:

The following figure satisfies P, that is, it achieves partition tolerance:
[Figure: the master database and the slave database deployed in different network partitions]
Partition tolerance in this figure means:
1) The master database synchronizes data to the slave database over the network. The master and slave databases can be regarded as deployed in different partitions that interact through the network.
2) A network problem between the master database and the slave database does not affect the service that either database provides to the outside world.
3) The failure of one node does not affect the service provided by the other node.

To achieve C, data consistency must be guaranteed. To prevent queries on the slave database from returning inconsistent data during synchronization, the slave database's data must be locked and then unlocked once synchronization completes. If synchronization fails, the slave database returns an error or a timeout.

To achieve A, data availability must be guaranteed: the slave database can be queried at any time and never responds with a timeout or an error. The analysis shows that C and A contradict each other once P must be satisfied.

2. What are the possible CAP combinations?
When handling distributed transactions in production, you therefore need to decide which two of the CAP properties to satisfy, according to your requirements.
1) AP:
Give up consistency in favor of partition tolerance and availability. This is the design choice of many distributed systems.
For example, the commodity management above can be implemented entirely as AP, provided that users can accept query results that are not up to date for a certain period of time.
An AP implementation usually guarantees eventual consistency. The BASE theory described later is an extension of AP. Some business scenarios fit this well, such as order refunds: the refund succeeds today and the money is credited tomorrow, which is fine as long as users can accept the credit arriving within a certain period.
2) CP:
Give up availability in favor of consistency and partition tolerance. ZooKeeper, for example, pursues strong consistency. Another example is an inter-bank transfer: a transfer request must wait until the banking systems of both parties have completed the entire transaction.

3) CA:
Give up partition tolerance: do not partition, and do not consider network failures or node crashes. Consistency and availability can then both be achieved, but the system is no longer a standard distributed system. The relational databases we use most often satisfy CA.
To implement CA for the commodity management above, the structure is as follows:

[Figure: the commodity service reading and writing a single database]
There is no longer any data synchronization between a master database and a slave database. The database can respond to every query request, and the transaction isolation level ensures that each query returns the latest data.

Summary

The sections above covered the CAP theory. CAP is a proven theorem: a distributed system can satisfy at most two of consistency, availability, and partition tolerance at the same time. It can serve as a yardstick for architecture design and technology selection. Most large-scale Internet applications have many nodes, are deployed across many locations, and run on ever larger clusters, so node failures and network failures are the norm. Service availability must nevertheless reach N nines (99.99...%), with response times good enough for a good user experience. The usual choice is therefore to guarantee P and A, give up the strong consistency of C, and guarantee eventual consistency.
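
To make "N nines" concrete, the following sketch converts an availability target into the downtime it permits. The function name `allowed_downtime_minutes` and the 365-day year are assumptions chosen for this illustration.

```python
def allowed_downtime_minutes(nines, period_days=365):
    """Downtime budget for an availability of N nines over the given period.

    N nines means an unavailability of 1 / 10**N,
    e.g. four nines is 99.99% available and 0.01% unavailable.
    """
    total_minutes = period_days * 24 * 60
    return total_minutes / 10 ** nines

for n in (2, 3, 4, 5):
    print(f"{n} nines: {allowed_downtime_minutes(n):.2f} minutes per year")
```

Four nines leaves less than an hour of downtime per year, which is part of the reason large systems tend to avoid designs in which a single network fault makes reads fail.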

BASE theory

1. Understanding strong consistency and eventual consistency
The CAP theory tells us that a distributed system can satisfy at most two of consistency, availability, and partition tolerance at the same time. In practice AP is the most common choice: it gives up consistency to guarantee availability and partition tolerance. Many real scenarios still need consistency, however. In the earlier example where the master database synchronizes data to the slave database, the data may be temporarily inconsistent, but it must eventually be synchronized successfully so that it becomes consistent. This consistency differs from the consistency in CAP. Consistency in CAP requires the data on every node to be consistent at all times, which is strong consistency. Eventual consistency allows the data on different nodes to be inconsistent for a period of time, as long as it becomes consistent afterwards.

2. Introduction to the BASE theory
BASE is an abbreviation of the three phrases Basically Available, Soft state, and Eventually consistent. The BASE theory is an extension of AP in CAP: it obtains availability by sacrificing strong consistency. When a failure occurs, parts of the system may become unavailable, but the core functions must remain available. Data may be inconsistent for a period of time, but must eventually reach a consistent state. Transactions that satisfy the BASE theory are called "flexible transactions".

  • Basically available: When a distributed system fails, it may lose some of its functions as long as the core functions remain available. For example, if payment on an e-commerce website has a problem, products can still be browsed normally.
  • Soft state: Because strong consistency is not required, BASE allows the system to hold an intermediate state (also called soft state) that does not affect the system's availability. Examples are the "paying" and "data synchronizing" statuses of an order; once the data has finally become consistent, the status changes to "success".
  • Eventually consistent: After a period of time, the data on all nodes becomes consistent. For example, the "paying" status of an order eventually becomes "payment succeeded" or "payment failed", so that the order status agrees with the actual transaction result, but only after some delay and waiting.
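
The soft state in the order example can be sketched as a small state machine. The names `OrderStatus`, `Order`, and `apply_payment_result` are hypothetical, and ignoring a duplicate result once the order has reached a final status is an assumption added here, not something the original text specifies.

```python
from enum import Enum

class OrderStatus(Enum):
    PAYING = "paying"                        # soft (intermediate) state
    PAYMENT_SUCCEEDED = "payment succeeded"  # final state
    PAYMENT_FAILED = "payment failed"        # final state

class Order:
    def __init__(self, order_id):
        self.order_id = order_id
        # The order is usable immediately, before the payment result is known.
        self.status = OrderStatus.PAYING

    def apply_payment_result(self, succeeded):
        """Move from the soft state to a final state when the result arrives."""
        if self.status is not OrderStatus.PAYING:
            return self.status  # already final: ignore late or duplicate results
        self.status = (OrderStatus.PAYMENT_SUCCEEDED if succeeded
                       else OrderStatus.PAYMENT_FAILED)
        return self.status

order = Order("order-1")
initial = order.status
final = order.apply_payment_result(succeeded=True)
after_duplicate = order.apply_payment_result(succeeded=False)
print(initial.value, "->", final.value)
```

Between creation and the arrival of the payment result, the order status and the real payment outcome may disagree; the model only promises that they agree eventually.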

Origin blog.csdn.net/qq_37362891/article/details/114438772