1. Transaction modes
1.1. XA mode
The XA specification is a distributed transaction processing (DTP) standard defined by the X/Open organization. It describes the interface between the global transaction manager (TM) and the local resource manager (RM). Almost all mainstream databases support the XA specification.
1.1.1. Two-phase commit
XA is only a specification. Mainstream databases implement it, and those implementations are based on two-phase commit.
[Figure: two-phase commit, normal case]
[Figure: two-phase commit, abnormal case]
Phase one:
- The transaction coordinator notifies each transaction participant to execute its local transaction.
- After executing its local transaction, each participant reports the execution status to the transaction coordinator. At this point the transaction is not committed and continues to hold its database locks.
Phase two:
- The transaction coordinator decides the next step based on the phase-one reports.
- If every participant succeeded, the coordinator notifies all participants to commit.
- If any participant failed, the coordinator notifies all participants to roll back.
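The two phases described above can be sketched in plain Java. This is a minimal in-memory illustration, not the XA API of any database: the names `Participant`, `FakeParticipant`, and `run` are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.List;

class TwoPhaseCommitSketch {
    /** Hypothetical participant: keeps its local work open until the coordinator decides. */
    interface Participant {
        boolean prepare();   // phase one: do the local work, keep locks, report success or failure
        void commit();       // phase two: make the work permanent
        void rollback();     // phase two: undo the work
    }

    /** In-memory participant used only for illustration. */
    static class FakeParticipant implements Participant {
        final String name;
        final boolean prepareSucceeds;
        String state = "INIT";

        FakeParticipant(String name, boolean prepareSucceeds) {
            this.name = name;
            this.prepareSucceeds = prepareSucceeds;
        }

        @Override public boolean prepare() {
            state = prepareSucceeds ? "PREPARED" : "FAILED";
            return prepareSucceeds;
        }
        @Override public void commit() { state = "COMMITTED"; }
        @Override public void rollback() { state = "ROLLED_BACK"; }
    }

    /** Coordinator logic: returns true if the global transaction committed. */
    static boolean run(List<? extends Participant> participants) {
        // Phase one: collect a report from every participant.
        boolean allPrepared = true;
        for (Participant p : participants) {
            if (!p.prepare()) {
                allPrepared = false;
            }
        }
        // Phase two: commit only if every report was a success, otherwise roll back everyone.
        for (Participant p : participants) {
            if (allPrepared) {
                p.commit();
            } else {
                p.rollback();
            }
        }
        return allPrepared;
    }

    public static void main(String[] args) {
        List<FakeParticipant> branches = Arrays.asList(
                new FakeParticipant("order", true),
                new FakeParticipant("account", false));
        System.out.println("committed=" + run(branches));
        for (FakeParticipant p : branches) {
            System.out.println(p.name + " -> " + p.state);
        }
    }
}
```

Note that a single failed report is enough to roll back every participant, including those whose phase one succeeded.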
1.1.2. Seata's XA model
Seata wraps and adapts the native XA mode to fit its own transaction model. The basic architecture is shown in the figure:
Phase-one work of the RM:
① Register the branch transaction with the TC (transaction coordinator)
② Execute the branch business SQL without committing
③ Report the execution status to the TC
Phase-two work of the TC:
- The TC checks the execution status of each branch transaction
a. If all succeeded, notify all RMs to commit
b. If any failed, notify all RMs to roll back
Phase-two work of the RM:
- Receive the TC's instruction and commit or roll back the transaction
1.1.3. Advantages and disadvantages
What are the advantages of XA mode?
- Transactions are strongly consistent and satisfy the ACID principles.
- Commonly used databases support it, so it is simple to implement and requires no code intrusion.
What are the disadvantages of XA mode?
- Database resources are locked in phase one and released only at the end of phase two, so performance is poor.
- It relies on relational databases to implement transactions.
1.1.4. Implement the XA mode
Seata's starter already auto-configures the XA mode, so enabling it is simple. The steps are as follows:
1) Modify the application.yml file of each microservice participating in the transaction to enable the XA mode:
seata:
  data-source-proxy-mode: XA
2) Add the @GlobalTransactional annotation to the entry method that initiates the global transaction.
1.2. AT mode
The AT mode is also a phased-commit transaction model, but it overcomes the long resource-locking period of the XA model.
1.2.1. Seata's AT model
Basic flowchart:
Phase-one work of the RM:
- Register the branch transaction
- Record the undo log (data snapshot)
- Execute the business SQL and commit
- Report the transaction status
Phase-two work of the RM on commit:
- Simply delete the undo log
Phase-two work of the RM on rollback:
- Restore the data to its pre-update state from the undo log
1.2.2. Process review
Let's walk through the principle of the AT mode with a real business case.
Suppose there is a database table, tb_account, that records user balances:
id | money |
---|---|
1 | 100 |
The SQL to be executed by one of the branch services is:
update tb_account set money = money - 10 where id = 1
In AT mode, the branch transaction executes as follows:
Phase one:
- The TM starts the global transaction and registers it with the TC
- The TM calls the branch transaction
- The branch transaction prepares to execute the business SQL
- The RM intercepts the business SQL, queries the original data using the where condition, and forms a snapshot:
{ "id": 1, "money": 100 }
- The RM executes the business SQL, commits the local transaction, and releases the database locks. At this point money = 90
- The RM reports the local transaction status to the TC
Phase two:
- The TM notifies the TC that the transaction has ended
- The TC checks the status of the branch transactions
a. If all succeeded, the snapshot is deleted immediately
b. If any branch transaction failed, a rollback is needed: read the snapshot data ({"id": 1, "money": 100}) and restore it to the database. The money column is then back to 100
Flow chart:
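The snapshot-then-commit flow above can be imitated with two in-memory maps. This is a simplified sketch under stated assumptions, not Seata's implementation: the class `AtModeSketch` and its methods are hypothetical, one map stands in for tb_account and the other for undo_log, and only a before-image is kept.

```java
import java.util.HashMap;
import java.util.Map;

class AtModeSketch {
    // Stand-in for tb_account: id -> money.
    final Map<Integer, Integer> account = new HashMap<>();
    // Stand-in for undo_log: xid -> before-image (id -> money) captured in phase one.
    final Map<String, Map<Integer, Integer>> undoLog = new HashMap<>();

    /** Phase one: snapshot the row, then apply and "commit" the deduction immediately. */
    void phaseOne(String xid, int id, int amount) {
        Map<Integer, Integer> beforeImage = new HashMap<>();
        beforeImage.put(id, account.get(id));
        undoLog.put(xid, beforeImage);
        account.put(id, account.get(id) - amount);
    }

    /** Phase two, commit: the data is already committed, so only the snapshot is deleted. */
    void phaseTwoCommit(String xid) {
        undoLog.remove(xid);
    }

    /** Phase two, rollback: write the before-image back, then delete the snapshot. */
    void phaseTwoRollback(String xid) {
        Map<Integer, Integer> beforeImage = undoLog.remove(xid);
        if (beforeImage != null) {
            account.putAll(beforeImage);
        }
    }

    public static void main(String[] args) {
        AtModeSketch db = new AtModeSketch();
        db.account.put(1, 100);
        db.phaseOne("xid-1", 1, 10);
        System.out.println("after phase one: " + db.account.get(1));
        db.phaseTwoRollback("xid-1");
        System.out.println("after rollback: " + db.account.get(1));
    }
}
```

Because phase one already commits, other transactions can see money = 90 before phase two runs; the rollback is a compensating write, not a database rollback.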
1.2.3. The difference between AT and XA
What are the biggest differences between AT mode and XA mode?
- XA mode does not commit in phase one and keeps resources locked; AT mode commits directly in phase one and does not hold database locks across phases.
- XA mode relies on the database's own mechanism for rollback; AT mode uses data snapshots for rollback.
- XA mode is strongly consistent; AT mode is eventually consistent.
1.2.4. Dirty write problem
When multiple threads run distributed transactions in AT mode concurrently, dirty writes may occur, as shown in the figure:
The solution is to introduce a global lock. Before releasing the DB lock, a transaction must first acquire the global lock, which prevents another transaction from modifying the same data at the same time.
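The idea of the global lock can be sketched as a table that maps a row key to the global transaction that currently owns it. The class `GlobalLockSketch` and the row-key format `tb_account:1` are assumptions made for this illustration; they do not describe how the TC actually stores locks.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class GlobalLockSketch {
    // rowKey -> xid of the global transaction that holds the lock.
    private final Map<String, String> owners = new ConcurrentHashMap<>();

    /** Called before the local commit in phase one; succeeds if the row is free or already ours. */
    boolean tryAcquire(String rowKey, String xid) {
        String owner = owners.putIfAbsent(rowKey, xid);
        return owner == null || owner.equals(xid);
    }

    /** Called when phase two finishes (commit or rollback): release every lock held by xid. */
    void releaseAll(String xid) {
        owners.values().removeIf(owner -> owner.equals(xid));
    }

    public static void main(String[] args) {
        GlobalLockSketch locks = new GlobalLockSketch();
        System.out.println("tx1 acquires: " + locks.tryAcquire("tb_account:1", "tx1"));
        System.out.println("tx2 acquires: " + locks.tryAcquire("tb_account:1", "tx2"));
        locks.releaseAll("tx1");
        System.out.println("tx2 retries: " + locks.tryAcquire("tb_account:1", "tx2"));
    }
}
```

A transaction that fails to acquire the lock cannot commit its local change, so it cannot overwrite a row whose snapshot may still be needed for another transaction's rollback. In practice the caller would retry for a bounded time and then give up.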
1.2.5. Advantages and disadvantages
Advantages of AT mode:
- Transactions are committed directly in phase one, releasing database resources, so performance is better
- Global locks provide read-write isolation
- No code intrusion; the framework completes rollback and commit automatically
Disadvantages of AT mode:
- The state between the two phases is a soft state, so consistency is only eventual
- The framework's snapshot function affects performance, although it is still much better than XA mode
1.2.6. Implement AT mode
Snapshot generation, rollback, and similar actions in AT mode are completed automatically by the framework without any code intrusion, so the implementation is very simple.
However, AT mode requires one table to record global locks (lock_table) and another to record data snapshots (undo_log).
1. Import the database tables
The lock_table is imported into the database associated with the TC service, and the undo_log table is imported into the database associated with the microservice:
SET NAMES utf8mb4;
SET FOREIGN_KEY_CHECKS = 0;
-- ----------------------------
-- Table structure for undo_log
-- ----------------------------
DROP TABLE IF EXISTS `undo_log`;
CREATE TABLE `undo_log` (
`branch_id` bigint(20) NOT NULL COMMENT 'branch transaction id',
`xid` varchar(100) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT 'global transaction id',
`context` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL COMMENT 'undo_log context,such as serialization',
`rollback_info` longblob NOT NULL COMMENT 'rollback info',
`log_status` int(11) NOT NULL COMMENT '0:normal status,1:defense status',
`log_created` datetime(6) NOT NULL COMMENT 'create datetime',
`log_modified` datetime(6) NOT NULL COMMENT 'modify datetime',
UNIQUE INDEX `ux_undo_log`(`xid`, `branch_id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci COMMENT = 'AT transaction mode undo table' ROW_FORMAT = Compact;
-- ----------------------------
-- Records of undo_log
-- ----------------------------
-- ----------------------------
-- Table structure for lock_table
-- ----------------------------
DROP TABLE IF EXISTS `lock_table`;
CREATE TABLE `lock_table` (
`row_key` varchar(128) CHARACTER SET utf8 COLLATE utf8_general_ci NOT NULL,
`xid` varchar(96) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`transaction_id` bigint(20) NULL DEFAULT NULL,
`branch_id` bigint(20) NOT NULL,
`resource_id` varchar(256) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`table_name` varchar(32) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`pk` varchar(36) CHARACTER SET utf8 COLLATE utf8_general_ci NULL DEFAULT NULL,
`gmt_create` datetime NULL DEFAULT NULL,
`gmt_modified` datetime NULL DEFAULT NULL,
PRIMARY KEY (`row_key`) USING BTREE,
INDEX `idx_branch_id`(`branch_id`) USING BTREE
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Compact;
SET FOREIGN_KEY_CHECKS = 1;
2. Modify the application.yml file and set the transaction mode to AT:
seata:
  data-source-proxy-mode: AT # AT is the default
1.3. TCC mode
The TCC mode is similar to the AT mode in that each phase is an independent transaction. The difference is that TCC implements data recovery through hand-written code. Three methods need to be implemented:
- Try: check and reserve resources.
- Confirm: complete the business operation on the resources. If Try succeeded, Confirm is required to succeed.
- Cancel: release the reserved resources; it can be understood as the reverse operation of Try.
1.3.1. Process analysis
Take a business that deducts a user's balance as an example. Assume that the original balance of account A is 100 yuan and 30 yuan needs to be deducted.
- Phase 1 (Try): check whether the balance is sufficient. If it is, increase the frozen amount by 30 yuan and reduce the available balance by 30 yuan.
Initial balance:
The balance is sufficient and can be frozen:
At this point, total amount = frozen amount + available amount = 30 + 70, which is still 100 yuan. The local transaction commits directly without waiting for other transactions.
- Phase 2 (Confirm): to commit, reduce the frozen amount by 30 yuan.
The available amount was already deducted in Try, so Confirm only needs to clear the frozen amount:
At this point, total amount = frozen amount + available amount = 0 + 70 = 70 yuan.
- Phase 2 (Cancel): to roll back, reduce the frozen amount by 30 yuan and increase the available balance by 30 yuan.
If a rollback is required, the frozen amount must be released and the available amount restored:
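The arithmetic of the three operations can be checked with a small in-memory account. This is only a sketch of the bookkeeping; the class `FrozenAccountSketch` and its method names are hypothetical.

```java
class FrozenAccountSketch {
    int available;
    int frozen;

    FrozenAccountSketch(int available) {
        this.available = available;
    }

    /** Try: check the balance and reserve the amount by moving it from available to frozen. */
    boolean tryDeduct(int amount) {
        if (available < amount) {
            return false;
        }
        available -= amount;
        frozen += amount;
        return true;
    }

    /** Confirm: the reserved amount is really spent, so it is simply cleared from frozen. */
    void confirm(int amount) {
        frozen -= amount;
    }

    /** Cancel: the reverse of Try, moving the reserved amount back to available. */
    void cancel(int amount) {
        frozen -= amount;
        available += amount;
    }

    int total() {
        return available + frozen;
    }

    public static void main(String[] args) {
        FrozenAccountSketch a = new FrozenAccountSketch(100);
        a.tryDeduct(30);
        System.out.println("after try: available=" + a.available + ", frozen=" + a.frozen);
        a.confirm(30);
        System.out.println("after confirm: total=" + a.total());
    }
}
```

With an initial balance of 100 and a deduction of 30, Try leaves the total at 100, Confirm brings it to 70, and Cancel (instead of Confirm) would bring the available amount back to 100.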
1.3.2. Seata's TCC model
The TCC model in Seata reuses the same transaction architecture as before, as shown in the figure:
1.3.3. Advantages and disadvantages
What does each phase of TCC mode do?
- Try: check and reserve resources
- Confirm: execute and commit the business operation
- Cancel: release the reserved resources
What are the advantages of TCC?
- Transactions are committed directly in phase one, releasing database resources, so performance is good
- Unlike the AT mode, it needs neither snapshots nor global locks, so it has the best performance
- It relies on compensation operations rather than database transactions, so it can be used with non-transactional databases
What are the disadvantages of TCC?
- It is code-intrusive: the try, confirm, and cancel interfaces must be written by hand, which is tedious
- Soft state; transactions are only eventually consistent
- Failures of Confirm and Cancel must be considered, and both must be idempotent
1.3.4. Transaction suspension and empty rollback
Empty rollback
When the try phase of a branch transaction is blocked, the global transaction may time out and trigger the phase-two cancel operation. The cancel then runs before the try has executed, so there is nothing for it to roll back. This is an empty rollback.
As shown in the picture:
When executing the cancel operation, check whether the try has already been executed; if it has not, perform an empty rollback.
Transaction suspension
After an empty rollback, the previously blocked try operation may resume. If that try is allowed to run, it can never be followed by a confirm or cancel, and the transaction stays in an intermediate state forever. This is transaction suspension.
When executing the try operation, check whether the cancel has already been executed. If it has, refuse to run the try after the empty rollback to avoid suspension.
1.3.5. Implement TCC mode
To solve empty rollback and transaction suspension, the current transaction status must be recorded: is it in try or in cancel?
1. Approach
Here we define a table:
CREATE TABLE `account_freeze_tbl` (
  `xid` varchar(128) NOT NULL,
  `user_id` varchar(255) DEFAULT NULL COMMENT 'user id',
  `freeze_money` int(11) unsigned DEFAULT '0' COMMENT 'frozen amount',
  `state` int(1) DEFAULT NULL COMMENT 'transaction state, 0:try, 1:confirm, 2:cancel',
  PRIMARY KEY (`xid`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8 ROW_FORMAT=COMPACT;
Where:
- xid: the global transaction id
- freeze_money: the user's frozen amount
- state: the transaction state
With this table, how should each piece of business logic work?
Try business:
- Record the frozen amount and transaction state in the account_freeze table
- Deduct the available amount in the account table
Confirm business:
- Delete the freeze record from the account_freeze table by xid
Cancel business:
- Update the account_freeze table: set the frozen amount to 0 and the state to 2
- Update the account table to restore the available amount
How to tell whether a rollback is empty?
- In the cancel business, query account_freeze by xid. If the result is null, the try has not run yet and an empty rollback is needed
How to avoid transaction suspension?
- In the try business, query account_freeze by xid. If a record already exists, cancel has already run, so refuse to execute the try
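The two checks above can be tried out without a database. The following is a minimal in-memory sketch, assuming a map keyed by xid in place of the account_freeze table; the class `TccGuardSketch`, its nested `Freeze` record, and the method signatures are hypothetical and are not Seata APIs.

```java
import java.util.HashMap;
import java.util.Map;

class TccGuardSketch {
    enum State { TRY, CONFIRM, CANCEL }

    /** One row of the stand-in freeze table. */
    static class Freeze {
        final String userId;
        int money;
        State state;

        Freeze(String userId, int money, State state) {
            this.userId = userId;
            this.money = money;
            this.state = state;
        }
    }

    final Map<String, Integer> balances = new HashMap<>();    // userId -> available amount
    final Map<String, Freeze> freezeTable = new HashMap<>();  // xid -> freeze record

    /** Try: refuse to run if a record for this xid already exists (suspension guard). */
    boolean tryDeduct(String xid, String userId, int money) {
        if (freezeTable.containsKey(xid)) {
            return false;
        }
        int balance = balances.getOrDefault(userId, 0);
        if (balance < money) {
            throw new IllegalStateException("insufficient balance");
        }
        balances.put(userId, balance - money);
        freezeTable.put(xid, new Freeze(userId, money, State.TRY));
        return true;
    }

    /** Confirm: delete the freeze record; deleting twice is harmless. */
    boolean confirm(String xid) {
        freezeTable.remove(xid);
        return true;
    }

    /** Cancel: handle the empty rollback first, then idempotency, then the real rollback. */
    boolean cancel(String xid, String userId) {
        Freeze freeze = freezeTable.get(xid);
        if (freeze == null) {
            // Empty rollback: try never ran, so only leave a CANCEL marker behind.
            freezeTable.put(xid, new Freeze(userId, 0, State.CANCEL));
            return true;
        }
        if (freeze.state == State.CANCEL) {
            return true; // already cancelled, nothing more to do
        }
        balances.merge(freeze.userId, freeze.money, Integer::sum);
        freeze.money = 0;
        freeze.state = State.CANCEL;
        return true;
    }

    public static void main(String[] args) {
        TccGuardSketch svc = new TccGuardSketch();
        svc.balances.put("u1", 100);
        svc.cancel("x1", "u1");                              // cancel arrives before try
        System.out.println("late try accepted: " + svc.tryDeduct("x1", "u1", 30));
        System.out.println("balance: " + svc.balances.get("u1"));
    }
}
```

The CANCEL marker written by the empty rollback is what lets a late try detect that it must not run, so one lookup by xid serves both checks.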
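Next, we modify account-service to implement the balance deduction with TCC.
2. Declare the TCC interface
TCC's Try, Confirm, and Cancel methods must all be declared in an interface using annotations.
We create a new interface in the cn.itcast.account.service package of the account-service project and declare the three TCC methods:
package cn.itcast.account.service;

import io.seata.rm.tcc.api.BusinessActionContext;
import io.seata.rm.tcc.api.BusinessActionContextParameter;
import io.seata.rm.tcc.api.LocalTCC;
import io.seata.rm.tcc.api.TwoPhaseBusinessAction;

@LocalTCC
public interface AccountTCCService {

    // Try method; commitMethod and rollbackMethod name the phase-two methods below
    @TwoPhaseBusinessAction(name = "deduct", commitMethod = "confirm", rollbackMethod = "cancel")
    void deduct(@BusinessActionContextParameter(paramName = "userId") String userId,
                @BusinessActionContextParameter(paramName = "money") int money);

    // Confirm method, called in phase two when the global transaction commits
    boolean confirm(BusinessActionContext ctx);

    // Cancel method, called in phase two when the global transaction rolls back
    boolean cancel(BusinessActionContext ctx);
}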
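3. Write the implementation class
Create a new class in the cn.itcast.account.service.impl package of account-service to implement the TCC business. The null check and state checks below apply the empty-rollback, idempotency, and suspension rules from the analysis in step 1; they assume that AccountFreeze exposes a getState() getter:
package cn.itcast.account.service.impl;

import io.seata.core.context.RootContext;
import io.seata.rm.tcc.api.BusinessActionContext;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

import java.util.Objects;
// Project-local imports (AccountTCCService, AccountMapper, AccountFreezeMapper, AccountFreeze) are omitted here.

@Service
@Slf4j
public class AccountTCCServiceImpl implements AccountTCCService {

    @Autowired
    private AccountMapper accountMapper;
    @Autowired
    private AccountFreezeMapper freezeMapper;

    @Override
    @Transactional
    public void deduct(String userId, int money) {
        // 0. Get the global transaction id
        String xid = RootContext.getXID();
        // 1. Suspension check: a freeze record already exists, so cancel has run; refuse the try
        if (freezeMapper.selectById(xid) != null) {
            return;
        }
        // 2. Deduct the available balance
        accountMapper.deduct(userId, money);
        // 3. Record the frozen amount and the transaction state
        AccountFreeze freeze = new AccountFreeze();
        freeze.setUserId(userId);
        freeze.setFreezeMoney(money);
        freeze.setState(AccountFreeze.State.TRY);
        freeze.setXid(xid);
        freezeMapper.insert(freeze);
    }

    @Override
    public boolean confirm(BusinessActionContext ctx) {
        // 1. Get the global transaction id
        String xid = ctx.getXid();
        // 2. Delete the freeze record by id
        int count = freezeMapper.deleteById(xid);
        return count == 1;
    }

    @Override
    public boolean cancel(BusinessActionContext ctx) {
        // 0. Query the freeze record
        String xid = ctx.getXid();
        AccountFreeze freeze = freezeMapper.selectById(xid);
        // 1. Empty rollback: try never ran, so only record a CANCEL marker
        if (freeze == null) {
            freeze = new AccountFreeze();
            freeze.setUserId(ctx.getActionContext("userId").toString());
            freeze.setFreezeMoney(0);
            freeze.setState(AccountFreeze.State.CANCEL);
            freeze.setXid(xid);
            freezeMapper.insert(freeze);
            return true;
        }
        // 2. Idempotency: cancel has already been processed
        if (Objects.equals(freeze.getState(), AccountFreeze.State.CANCEL)) {
            return true;
        }
        // 3. Restore the available balance
        accountMapper.refund(freeze.getUserId(), freeze.getFreezeMoney());
        // 4. Clear the frozen amount and set the state to CANCEL
        freeze.setFreezeMoney(0);
        freeze.setState(AccountFreeze.State.CANCEL);
        int count = freezeMapper.updateById(freeze);
        return count == 1;
    }
}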
1.4. SAGA mode
Saga mode is the long-running transaction solution provided by Seata, contributed mainly by Ant Financial.
Its theoretical basis is the paper "Sagas", published by Hector Garcia-Molina and Kenneth Salem in 1987.
Seata official website guide for Saga: Seata Saga mode
1.4.1. Principle
In Saga mode, a distributed transaction has multiple participants, and each participant is a service that supports compensation: the user must implement both its forward operation and its reverse (compensating) operation according to the business scenario.
During execution, the forward operations of the participants run in sequence. If all of them succeed, the distributed transaction commits. If any forward operation fails, the distributed transaction runs the compensating operations of the participants that already committed, returning the system to its initial state.
Saga is also divided into two phases:
- Phase 1: commit local transactions directly
- Phase 2: on success, do nothing; on failure, roll back through hand-written compensation logic
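The forward-then-compensate behaviour described above can be sketched with a list of steps and a stack of completed ones. This is a minimal synchronous illustration, not Seata's state-machine engine; `SagaSketch`, `Step`, and `run` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

class SagaSketch {
    /** One participant: a forward operation plus the compensation that undoes it. */
    static class Step {
        final String name;
        final Runnable forward;
        final Runnable compensate;

        Step(String name, Runnable forward, Runnable compensate) {
            this.name = name;
            this.forward = forward;
            this.compensate = compensate;
        }
    }

    /** Runs the steps in order; on failure, compensates the completed steps in reverse order. */
    static boolean run(List<Step> steps) {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.forward.run();
                completed.push(step);
            } catch (RuntimeException e) {
                while (!completed.isEmpty()) {
                    completed.pop().compensate.run();
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        List<Step> steps = Arrays.asList(
                new Step("order", () -> log.add("create order"), () -> log.add("cancel order")),
                new Step("stock", () -> log.add("deduct stock"), () -> log.add("restore stock")),
                new Step("pay", () -> { throw new IllegalStateException("payment failed"); },
                        () -> log.add("refund")));
        System.out.println("success=" + run(steps));
        System.out.println(log);
    }
}
```

Only the steps whose forward operation completed are compensated, and the failed step itself is not. Between a forward operation and its compensation, other transactions can observe the intermediate data, which is the isolation gap of this mode.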
1.4.2. Advantages and disadvantages
Advantages:
- Participants can call each other asynchronously in an event-driven way, giving high throughput
- Transactions are committed directly in phase one without locks, so performance is good
- It is simpler to implement than TCC because the three TCC phases do not need to be written
Disadvantages:
- The soft state lasts for an indeterminate time, so timeliness is poor
- There are no locks and no transaction isolation, so dirty writes can occur
1.5. Comparison of the four modes
We compare the four implementations in the following aspects:
- Consistency: can transaction consistency be guaranteed? Is it strong or eventual?
- Isolation: how well are transactions isolated from each other?
- Code intrusion: does the business code need to be modified?
- Performance: is there any performance loss?
- Scenarios: common business scenarios
As shown in the picture:
2. High availability
As the core of distributed transactions, Seata's TC service must be highly available, which means running it as a cluster.
2.1. High availability architecture model
Building a TC service cluster is simple: start multiple TC services and register them with Nacos.
However, a single cluster cannot guarantee 100% availability. What if the data center hosting the cluster fails? When requirements are high, disaster recovery across data centers in different locations is generally used.
For example, one TC cluster is in Shanghai (SH) and another is in Hangzhou (HZ):
Each microservice finds the TC cluster to use from the mapping between its transaction group (tx-service-group) and a TC cluster. When the SH cluster fails, you only need to change the mapping in vgroup-mapping to HZ, and all microservices switch to the HZ TC cluster.
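The two-step lookup (transaction group to cluster name, then cluster name to TC nodes) can be sketched with two maps. This is only an illustration of the indirection: `TcClusterResolverSketch` is a hypothetical class, the group name seata-demo and the node addresses are sample values, and a real client resolves both steps through the configuration and registry centers.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TcClusterResolverSketch {
    // Stand-in for the vgroup mapping: tx-service-group -> cluster name.
    final Map<String, String> vgroupMapping = new HashMap<>();
    // Stand-in for the service registry: cluster name -> TC node addresses.
    final Map<String, List<String>> registry = new HashMap<>();

    /** Resolves the TC nodes a microservice in the given transaction group should use. */
    List<String> resolve(String txServiceGroup) {
        String cluster = vgroupMapping.get(txServiceGroup);
        if (cluster == null) {
            throw new IllegalArgumentException("no cluster mapped for group " + txServiceGroup);
        }
        return registry.getOrDefault(cluster, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        TcClusterResolverSketch resolver = new TcClusterResolverSketch();
        resolver.registry.put("SH", Arrays.asList("127.0.0.1:8091"));
        resolver.registry.put("HZ", Arrays.asList("127.0.0.1:8092"));
        resolver.vgroupMapping.put("seata-demo", "SH");
        System.out.println("before failover: " + resolver.resolve("seata-demo"));
        resolver.vgroupMapping.put("seata-demo", "HZ");   // failover: change only the mapping
        System.out.println("after failover: " + resolver.resolve("seata-demo"));
    }
}
```

Because microservices reference only the group name, a failover is a single mapping change and needs no change on the microservice side.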
2.2. Implement high availability
1. Simulate a TC cluster for remote disaster recovery
We plan to start two Seata TC service nodes:
node name | ip address | port | cluster name |
---|---|---|---|
seata | 127.0.0.1 | 8091 | SH |
seata2 | 127.0.0.1 | 8092 | HZ |
We have already started one seata service on port 8091 with cluster name SH.
Now, copy the seata directory and name the copy seata2.
Modify seata2/conf/registry.conf as follows:
registry {
  # Registry type for the TC service; nacos is used here, eureka, zookeeper, etc. are also possible
  type = "nacos"

  nacos {
    # Service name under which the seata TC registers in nacos; can be customized
    application = "seata-tc-server"
    serverAddr = "127.0.0.1:8848"
    group = "DEFAULT_GROUP"
    namespace = ""
    cluster = "HZ"
    username = "nacos"
    password = "nacos"
  }
}

config {
  # How the TC server reads its configuration; here from the nacos config center, so that a TC cluster can share one configuration
  type = "nacos"
  # nacos address and related settings
  nacos {
    serverAddr = "127.0.0.1:8848"
    namespace = ""
    group = "SEATA_GROUP"
    username = "nacos"
    password = "nacos"
    dataId = "seataServer.properties"
  }
}
Enter the seata2/bin directory, and then run the command:
seata-server.bat -p 8092
Open the nacos console to view the service list:
Click to view details:
2. Configure the transaction group mapping in Nacos
Next, we need to store the mapping between tx-service-group and the cluster in the Nacos configuration center.
Create a new configuration:
The content of the configuration is as follows:
# Transaction group mapping
service.vgroupMapping.seata-demo=SH
service.enableDegrade=false
service.disableGlobalTransaction=false
# Communication with the TC service
transport.type=TCP
transport.server=NIO
transport.heartbeat=true
transport.enableClientBatchSendRequest=false
transport.threadFactory.bossThreadPrefix=NettyBoss
transport.threadFactory.workerThreadPrefix=NettyServerNIOWorker
transport.threadFactory.serverExecutorThreadPrefix=NettyServerBizHandler
transport.threadFactory.shareBossWorker=false
transport.threadFactory.clientSelectorThreadPrefix=NettyClientSelector
transport.threadFactory.clientSelectorThreadSize=1
transport.threadFactory.clientWorkerThreadPrefix=NettyClientWorkerThread
transport.threadFactory.bossThreadSize=1
transport.threadFactory.workerThreadSize=default
transport.shutdown.wait=3
# RM configuration
client.rm.asyncCommitBufferLimit=10000
client.rm.lock.retryInterval=10
client.rm.lock.retryTimes=30
client.rm.lock.retryPolicyBranchRollbackOnConflict=true
client.rm.reportRetryCount=5
client.rm.tableMetaCheckEnable=false
client.rm.tableMetaCheckerInterval=60000
client.rm.sqlParserType=druid
client.rm.reportSuccessEnable=false
client.rm.sagaBranchRegisterEnable=false
# TM configuration
client.tm.commitRetryCount=5
client.tm.rollbackRetryCount=5
client.tm.defaultGlobalTransactionTimeout=60000
client.tm.degradeCheck=false
client.tm.degradeCheckAllowTimes=10
client.tm.degradeCheckPeriod=2000
# Undo log configuration
client.undo.dataValidation=true
client.undo.logSerialization=jackson
client.undo.onlyCareUpdateColumns=true
client.undo.logTable=undo_log
client.undo.compress.enable=true
client.undo.compress.type=zip
client.undo.compress.threshold=64k
client.log.exceptionRate=100
3. The microservice reads the Nacos configuration
Next, modify the application.yml file of each microservice so that it reads the client.properties file in Nacos:
seata:
  config:
    type: nacos
    nacos:
      server-addr: 127.0.0.1:8848
      username: nacos
      password: nacos
      group: SEATA_GROUP
      data-id: client.properties
Restart the microservice. Whether it connects to the SH or the HZ TC cluster is now determined by client.properties in Nacos.