1. Distributed transaction problem
1.1. Local transactions
Local transactions are traditional single-machine database transactions. A traditional database transaction must satisfy the four ACID principles (atomicity, consistency, isolation, durability):
1.2. Distributed transactions
Distributed transactions are transactions that span more than a single service or a single database, for example:
- Distributed transactions across data sources
- Distributed transactions across services
- A combination of the two (across both services and data sources)
After the database is split horizontally and services are split vertically, a single business operation usually spans multiple databases and services. For example, paying for an order in e-commerce commonly involves the following steps:
- Create a new order
- Deduct product inventory
- Deduct the amount from the user's account balance
Completing the above requires access to three different microservices and three different databases.
Within each individual service and database, creating the order, deducting inventory, and debiting the account is a local transaction that can guarantee the ACID properties.
But when we treat the three steps as one "business operation", its atomicity requires that either all operations succeed or all fail; partial success with partial failure is not allowed. This is a transaction in a distributed system.
At this point ACID is hard to guarantee, and this is exactly the problem that distributed transactions must solve.
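The atomicity problem can be sketched with a toy example. This is plain Java with hypothetical in-memory "services" (the names and logic are illustrative, not real microservice calls): each step commits its own local transaction immediately, so when the last step fails, the first two cannot be undone.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the order-payment flow: three "services", each committing
// its own local transaction as soon as its step finishes.
class OrderPaymentDemo {

    static List<String> committed = new ArrayList<>(); // local transactions already committed

    static void createOrder()     { committed.add("order-service: order created"); }
    static void deductInventory() { committed.add("storage-service: stock -1"); }

    static void deductBalance(int balance, int price) {
        if (balance < price) throw new IllegalStateException("insufficient balance");
        committed.add("account-service: balance -" + price);
    }

    // Runs the three steps; returns the operations that remain committed.
    static List<String> placeOrder(int balance, int price) {
        committed.clear();
        try {
            createOrder();                 // local tx 1: commits in the order database
            deductInventory();             // local tx 2: commits in the storage database
            deductBalance(balance, price); // local tx 3: may fail here
        } catch (IllegalStateException e) {
            // Too late: steps 1 and 2 are already committed in their own databases.
        }
        return committed;
    }

    public static void main(String[] args) {
        // Payment fails, yet the order and the stock deduction survive -> inconsistent state.
        System.out.println(placeOrder(100, 999));
    }
}
```

With a balance of 100 and a price of 999, the third step throws, but the first two operations stay committed, which is precisely the partial-success state a distributed transaction must prevent.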
2. Theoretical basis
Solving distributed transaction problems requires some basic knowledge of distributed systems as theoretical guidance.
2.1. CAP theorem
In 1998, Eric Brewer, a computer scientist at the University of California, Berkeley, proposed that a distributed system has three key properties:
- Consistency
- Availability
- Partition tolerance
Their first letters are C, A, P.
Eric Brewer argued that these three properties cannot all be achieved at the same time; at most two can hold. This conclusion is called the CAP theorem.
2.1.1. Consistency
Consistency: When a user accesses any node in the distributed system, the data obtained must be consistent.
For example, suppose the system contains two nodes whose data are initially consistent:
When we modify the data on one of the nodes, the two nodes differ. To maintain consistency, the data must be synchronized between node01 and node02:
2.1.2. Availability
Availability: any request to a healthy node in the cluster must receive a response, rather than a timeout or a rejection.
As shown in the figure, in a cluster with three nodes, accessing any of them yields a timely response:
When some nodes become inaccessible because of a network failure or other reasons, those nodes are unavailable:
2.1.3. Partition tolerance
Partition: because of a network failure or other reasons, some nodes in the distributed system lose their connection to the others and form an independent partition.
Tolerance: when the cluster becomes partitioned, the system as a whole must continue to provide service.
2.1.4. Contradiction
In a distributed system, the network between nodes can never be guaranteed to be 100% healthy: failures are inevitable, yet the system must still provide service. Partition tolerance (P) is therefore unavoidable.
Problems arise when nodes receive new data changes:
If we want to guarantee consistency at this moment, we must wait for the network to recover and the data synchronization to complete before the cluster serves requests again; until then the service is blocked and unavailable.
If we want to guarantee availability at this moment, we cannot wait for the network to recover, so the data on node01 and node02 will become inconsistent with node03.
In other words, once P must be tolerated, only one of A and C can be guaranteed.
2.2. BASE theory
BASE theory is a practical way of trading off the CAP properties and includes three ideas:
- Basically Available: when a distributed system fails, it is allowed to lose part of its availability, i.e., to keep the core functions available.
- Soft State: for a limited period of time, an intermediate state, such as a temporarily inconsistent state, is allowed.
- Eventually Consistent: although strong consistency cannot be guaranteed, the data will reach consistency once the soft state ends.
2.3. Ideas for solving distributed transactions
The core difficulty of a distributed transaction is keeping all of its sub-transactions consistent. Borrowing from the CAP theorem and BASE theory, there are two broad approaches:
- AP mode: each sub-transaction executes and commits independently; temporarily inconsistent results are allowed, and remedial measures restore the data afterwards to achieve eventual consistency.
- CP mode: the sub-transactions wait for one another after executing, then commit together or roll back together to achieve strong consistency; while waiting, the transaction is in a weakly available state.
Whichever mode is used, the sub-transactions must communicate with one another and coordinate their transaction state, which requires a transaction coordinator (TC):
A sub-transaction here is called a branch transaction; the branch transactions that belong together form a global transaction.
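As an illustration only, and not Seata's actual protocol, a toy in-memory coordinator might track branch transactions per global transaction id (XID), collect their reported statuses, and decide to commit only when every branch succeeded (CP-style behaviour):

```java
import java.util.HashMap;
import java.util.Map;

// Minimal illustrative transaction coordinator: branches register under a
// global transaction id, report success or failure, and the coordinator
// resolves the whole global transaction to "commit" or "rollback".
class SimpleCoordinator {

    public enum Status { SUCCEEDED, FAILED }

    // xid -> (branchId -> reported status, null while still pending)
    private final Map<String, Map<String, Status>> globals = new HashMap<>();

    public void registerBranch(String xid, String branchId) {
        globals.computeIfAbsent(xid, k -> new HashMap<>()).put(branchId, null);
    }

    public void reportStatus(String xid, String branchId, Status status) {
        globals.get(xid).put(branchId, status);
    }

    // "commit" only when every registered branch reported SUCCEEDED;
    // pending (null) or FAILED branches force a rollback.
    public String resolve(String xid) {
        boolean allOk = globals.get(xid).values().stream()
                .allMatch(s -> s == Status.SUCCEEDED);
        return allOk ? "commit" : "rollback";
    }

    public static void main(String[] args) {
        SimpleCoordinator tc = new SimpleCoordinator();
        tc.registerBranch("xid-1", "order");
        tc.registerBranch("xid-1", "storage");
        tc.reportStatus("xid-1", "order", Status.SUCCEEDED);
        tc.reportStatus("xid-1", "storage", Status.FAILED);
        System.out.println(tc.resolve("xid-1")); // one failed branch -> rollback
    }
}
```

Real coordinators such as Seata's TC also handle timeouts, retries, and persistence of transaction state, which this sketch omits.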
3. Seata
Seata is a distributed transaction solution open-sourced jointly by Ant Financial and Alibaba in January 2019. It is committed to providing high-performance, easy-to-use distributed transaction services and a one-stop distributed solution for users.
Official website: http://seata.io/ , where the documentation and blog provide many usage guides and source-code analyses.
3.1. Seata's architecture
There are three important roles in Seata transaction management:
- TC (Transaction Coordinator): maintains the state of global and branch transactions and coordinates global transaction commit or rollback.
- TM (Transaction Manager): defines the scope of a global transaction; begins, commits, or rolls back a global transaction.
- RM (Resource Manager): manages the resources used by branch transactions, talks to the TC to register branch transactions and report their status, and drives branch transaction commit or rollback.
The overall structure is shown in the figure:
Seata provides four different distributed transaction solutions based on the above architecture:
- XA mode: strongly consistent, phased transaction mode; sacrifices some availability; no intrusion into business code
- TCC mode: eventually consistent, phased transaction mode; intrusive to business code
- AT mode: eventually consistent, phased transaction mode; no intrusion into business code; Seata's default mode
- SAGA mode: long-transaction mode; intrusive to business code
Whichever solution is chosen, a TC, the transaction coordinator, is indispensable.
3.2. Deploy TC service
1. Download
First of all, we need to download the seata-server package, the address is at http://seata.io/zh-cn/blog/download.html
2. Unzip
Unzip the zip package into a directory whose path contains no Chinese characters; its directory structure is as follows:
3. Modify the configuration
Modify the registry.conf file in the conf directory:
The content is as follows:
```conf
registry {
  # Registry center type for the TC service; nacos is chosen here, but eureka, zookeeper, etc. also work
  type = "nacos"

  nacos {
    # Service name under which the Seata TC server registers in nacos; can be customized
    application = "seata-tc-server"
    serverAddr = "localhost:8848"
    group = "DEFAULT_GROUP"
    namespace = ""
    cluster = "SH" # cluster the server belongs to
    username = "nacos"
    password = "nacos"
  }
}

config {
  # How the TC server reads its configuration; here it reads from the nacos config center,
  # so that a TC cluster can share one configuration
  type = "nacos"
  # nacos address and related settings
  nacos {
    serverAddr = "localhost:8848"
    namespace = ""
    group = "SEATA_GROUP"
    username = "nacos"
    password = "nacos"
    dataId = "seataServer.properties"
  }
}
```
4. Add configuration in nacos
Note: to let a cluster of TC services share configuration, we chose nacos as the unified configuration center. The server-side configuration file seataServer.properties therefore needs to be added in nacos.
The format is as follows:
The configuration content is as follows:
```properties
# Storage mode; db means database
store.mode=db
store.db.datasource=druid
store.db.dbType=mysql
store.db.driverClassName=com.mysql.jdbc.Driver
store.db.url=jdbc:mysql://localhost:3306/seata?useUnicode=true&rewriteBatchedStatements=true
store.db.user=root
store.db.password=root
store.db.minConn=5
store.db.maxConn=30
store.db.globalTable=global_table
store.db.branchTable=branch_table
store.db.queryLimit=100
store.db.lockTable=lock_table
store.db.maxWait=5000
# Transaction, log, and recovery settings
server.recovery.committingRetryPeriod=1000
server.recovery.asynCommittingRetryPeriod=1000
server.recovery.rollbackingRetryPeriod=1000
server.recovery.timeoutRetryPeriod=1000
server.maxCommitRetryTimeout=-1
server.maxRollbackRetryTimeout=-1
server.rollbackRetryTimeoutUnlockEnable=false
server.undo.logSaveDays=7
server.undo.logDeletePeriod=86400000
# Transport between client and server
transport.serialization=seata
transport.compressor=none
# Disable metrics to improve performance
metrics.enabled=false
metrics.registryType=compact
metrics.exporterList=prometheus
metrics.exporterPrometheusPort=9898
```
The database address, user name, and password all need to be modified to your own database information
5. Create a database table
Special attention: When the tc service manages distributed transactions, it needs to record transaction-related data into the database, and you need to create these tables in advance.
Create a new database named seata
These tables mainly record global transactions, branch transactions, and global lock information
```sql
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;

-- ----------------------------
-- Branch transaction table
-- ----------------------------
DROP TABLE IF EXISTS `branch_table`;
CREATE TABLE `branch_table` (
  `branch_id` bigint(20) NOT NULL,
  `xid` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `transaction_id` bigint(20) NULL DEFAULT NULL,
  `resource_group_id` varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `resource_id` varchar(256) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `branch_type` varchar(8) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `status` tinyint(4) NULL DEFAULT NULL,
  `client_id` varchar(64) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `application_data` varchar(2000) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `gmt_create` datetime(6) NULL DEFAULT NULL,
  `gmt_modified` datetime(6) NULL DEFAULT NULL,
  PRIMARY KEY (`branch_id`) USING BTREE,
  INDEX `idx_xid`(`xid`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;

-- ----------------------------
-- Global transaction table
-- ----------------------------
DROP TABLE IF EXISTS `global_table`;
CREATE TABLE `global_table` (
  `xid` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
  `transaction_id` bigint(20) NULL DEFAULT NULL,
  `status` tinyint(4) NOT NULL,
  `application_id` varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `transaction_service_group` varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `transaction_name` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `timeout` int(11) NULL DEFAULT NULL,
  `begin_time` bigint(20) NULL DEFAULT NULL,
  `application_data` varchar(2000) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
  `gmt_create` datetime NULL DEFAULT NULL,
  `gmt_modified` datetime NULL DEFAULT NULL,
  PRIMARY KEY (`xid`) USING BTREE,
  INDEX `idx_gmt_modified_status`(`gmt_modified`, `status`) USING BTREE,
  INDEX `idx_transaction_id`(`transaction_id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;

SET FOREIGN_KEY_CHECKS = 1;
```
6. Start the TC service
Enter the bin directory and run seata-server.bat (on Linux or macOS, use seata-server.sh):
After a successful startup, seata-server should be registered in the nacos registry.
Open the browser, visit the nacos address: http://localhost:8848 , and then enter the service list page, you can see the information of seata-tc-server:
3.3. Microservice integration with Seata
Introduce dependencies
First, introduce dependencies in order-service:
```xml
<!--seata-->
<dependency>
    <groupId>com.alibaba.cloud</groupId>
    <artifactId>spring-cloud-starter-alibaba-seata</artifactId>
    <exclusions>
        <!-- Excluded because the bundled version (1.3.0) is too low -->
        <exclusion>
            <artifactId>seata-spring-boot-starter</artifactId>
            <groupId>io.seata</groupId>
        </exclusion>
    </exclusions>
</dependency>
<dependency>
    <groupId>io.seata</groupId>
    <artifactId>seata-spring-boot-starter</artifactId>
    <!-- Use version 1.4.2 of the seata starter -->
    <version>${seata.version}</version>
</dependency>
```
Configure TC address
In application.yml in order-service, configure the TC service information, and obtain the TC address through the registration center nacos combined with the service name:
```yaml
seata:
  registry: # Registry settings for the TC service; the microservice uses them to look up the TC address
    type: nacos # registry type: nacos
    nacos:
      server-addr: localhost:8848 # nacos address
      namespace: "" # namespace, empty by default
      group: DEFAULT_GROUP # group, DEFAULT_GROUP by default
      application: seata-tc-server # Seata TC service name
      username: nacos
      password: nacos
  tx-service-group: seata-demo # transaction group name
  service:
    vgroup-mapping: # mapping from transaction group to TC cluster
      seata-demo: SH
```
How does the microservice find the address of the TC based on these configurations?
We know that four pieces of information are needed to locate a specific microservice instance registered in Nacos:
- namespace: the namespace
- group: the group
- application: the service name
- cluster: the cluster name
All four can be found in the yaml file above:
The namespace is empty, which means the default namespace, public.
Combining them gives the TC service identifier: public@DEFAULT_GROUP@seata-tc-server@SH. This uniquely determines the TC service cluster, so the corresponding instance information can then be pulled from Nacos.
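The way the four fields combine can be sketched in plain Java. The `namespace@group@application@cluster` layout and the fallback of an empty namespace to `public` follow the example in the text; the helper class and method names are illustrative, not part of Seata's API:

```java
// Illustrative helper that composes the TC service identifier from the
// four registry fields found in application.yml.
class TcServiceKey {

    static String build(String namespace, String group,
                        String application, String cluster) {
        // An empty namespace means the default namespace, "public".
        String ns = (namespace == null || namespace.isEmpty()) ? "public" : namespace;
        return ns + "@" + group + "@" + application + "@" + cluster;
    }

    public static void main(String[] args) {
        // Values taken from the application.yml above.
        System.out.println(build("", "DEFAULT_GROUP", "seata-tc-server", "SH"));
        // -> public@DEFAULT_GROUP@seata-tc-server@SH
    }
}
```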