This article will take you through distributed transactions

1. What is a distributed transaction?

Before answering that, let's first work through these questions:

  1. What is a transaction?

  2. What is a local transaction?

  3. What is distributed?

  4. What is a distributed transaction?

1.1. What is a transaction?

A transaction is a piece of work that may involve several participants and several steps, where the steps either all succeed or all fail.

For example: A transfers 100 yuan to B on WeChat. A's account is reduced by 100 and B's account is increased by 100. The two updates together form one transaction: either both take effect or neither does.

There are many such business scenarios, and the participants vary:

  1. Sending an email after a successful user registration. Two operations: insert the user's information into the db, and send an email to the user. Two main participants: the db and the mail server.

  2. Topping up a phone balance with Alipay. Two operations: the Alipay account balance decreases, and the phone balance increases. Two main participants: the Alipay account and the mobile carrier's account.

There are various participants in the transaction, but in this article we mainly use the transaction in db to illustrate.

1.2. What is a local transaction?

A local transaction, put simply, is one in which all operations occur in the same database.

For example, if A transfers money to B, the accounts of A and B are located in the same database.

We usually use relational databases such as MySQL, Oracle, and SQL Server. These databases implement transactions out of the box: when a transaction runs inside a single db, the db itself guarantees its correctness, and we do not need to work out how to guarantee it ourselves.
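As a minimal sketch of a local transaction, the example below uses Python's built-in sqlite3 module in place of MySQL or Oracle. The table name `account`, the helper names `transfer` and `balances`, and the "insufficient funds" rule are assumptions made for illustration. Both updates run inside one transaction on one database, so the database either commits both or rolls both back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst inside one local transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM account WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False

def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()

ok_first = transfer(conn, "A", "B", 100)   # both updates commit
ok_second = transfer(conn, "A", "B", 50)   # the debit is rolled back
final = balances(conn)                     # {"A": 0, "B": 100}
```

The second transfer fails after the debit has already been executed, yet A's balance is unchanged afterwards, because the database undoes the whole transaction.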

1.3. Four characteristics of database transactions

1.3.1. Consistency

The result after the transaction operation is consistent with the expected result. A transfers 100 to B. After the transaction is over, it is seen that A’s account should be reduced by 100, and B’s account should be increased by 100. Nothing else will happen

1.3.2. Atomicity

The entire process of a transaction behaves like a single atomic operation: in the end, either everything succeeds or everything fails. Atomicity is judged by the final result, in which the steps of the process cannot be separated from one another.

1.3.3. Isolation

The execution of a transaction cannot be interfered with by other transactions. The operations and data used inside a transaction are isolated from other concurrent transactions, so concurrently executing transactions cannot affect each other.

1.3.4. Durability

Once a transaction is committed, its changes to the data in the database should be permanent. After the transaction is committed, the data will be persisted to the hard disk, and the modification is permanent.

1.4. What is distributed?

Distributed means that several participants cooperate to complete a task, that these participants run on different machines, and that the machines communicate over a network or by other means.

For example, if you use an ICBC card to recharge Alipay, the account of the ICBC card is located in the db of ICBC, and the account of Alipay is located in the db of Alipay, and the two db are located in different places.

1.5. What is a distributed transaction?

Once the two concepts of distributed and transaction are clear, a distributed transaction is easy to understand: the participants of the transaction are spread across different places.

It is easy for us to ensure the correctness of the transaction in a single db, but how to ensure the correctness of the transaction when the participants of the transaction are located in multiple dbs?

For example: A transfers money to B, A is located in DB1, and B is located in DB2

step1. Over the network, send a command to DB1: reduce A's account by 100
step2. Over the network, send a command to DB2: increase B's account by 100

Suppose step1 succeeds and the network then fails while step2 is being executed, so step2 fails. In the end A has been reduced by 100 but B has not been increased by 100. The final result does not match the expected result, so the transaction has failed.
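The failure described above can be sketched in a few lines. `RemoteDB`, `NetworkError`, and `naive_transfer` are hypothetical names used only for this illustration; each `RemoteDB` stands in for an independent database that commits its own change as soon as the command arrives.

```python
class NetworkError(Exception):
    pass

class RemoteDB:
    """Hypothetical stand-in for a database reached over the network."""
    def __init__(self, balances, reachable=True):
        self.balances = dict(balances)
        self.reachable = reachable

    def add(self, account, delta):
        if not self.reachable:
            raise NetworkError("cannot reach database")
        self.balances[account] += delta  # each DB commits locally, on its own

def naive_transfer(db1, db2, amount):
    db1.add("A", -amount)   # step1: already committed in DB1
    db2.add("B", +amount)   # step2: may fail after step1 is durable

db1 = RemoteDB({"A": 100})
db2 = RemoteDB({"B": 0}, reachable=False)  # simulate the network failure

try:
    naive_transfer(db1, db2, 100)
    outcome = "ok"
except NetworkError:
    outcome = "step2 failed"

total = db1.balances["A"] + db2.balances["B"]  # 0: the 100 has vanished
```

Neither database did anything wrong on its own; the inconsistency comes from the lack of coordination between them.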

Before introducing solutions for distributed transactions, we need to understand two other concepts: the CAP theory and the BASE theory. These two theories provide the basis for distributed transaction solutions.

2. CAP Theory

2.1. Understanding the CAP concept

CAP stands for Consistency, Availability, and Partition tolerance.

Each is explained below.

To make the CAP theory easier to follow, we use a business scenario from an e-commerce system.

The following figure shows the execution process of commodity information management:

 

 

The overall execution process is as follows:

1. The commodity service asks the master database to write commodity information (add, modify, or delete a commodity).

2. The master database responds to the commodity service that the write succeeded.

3. The commodity service asks the slave database to read commodity information.

2.1.1. C - Consistency

Consistency means that the read operation after the write operation can read the latest data state. When the data is distributed on multiple nodes, the data read from any node is the latest state.

In the figure above, the read and write of product information must meet the following goals:

1. If the commodity service writes to the master database successfully, a query for the new data on the slave database also succeeds.

2. If the commodity service fails to write to the master database, a query for the new data on the slave database also fails.

How to achieve consistency?

1. After writing to the master database, the data must be synchronized to the slave database.

2. After writing to the master database, lock the slave database while the data is being synchronized to it, and release the lock once synchronization completes, so that clients cannot read old data from the slave database while the new data is being written to it.

The characteristics of distributed system consistency:

1. Due to the data synchronization process, there will be a certain delay in the response of the write operation.

2. In order to ensure data consistency, resources will be temporarily locked, and the locked resources will be released after data synchronization is completed.

3. If synchronizing data to a node fails, a request to that node returns an error message rather than the old data.

2.1.2. A - Availability

Availability means that any transaction operation can get a response result, and there will be no response timeout or response error.

In the figure above, reading product information to meet availability is to achieve the following goals:

1. When the slave database receives a query request, it responds with the query result immediately.

2. The slave database must not respond with a timeout or an error.

How to achieve availability?

1. After writing to the master database, the data must be synchronized to the slave database.

2. To ensure the availability of the slave database, the resources in the slave database cannot be locked.

3. Even if the data has not yet been synchronized, the slave database must return the requested data, even if it is old data. If there is no old data, a default message agreed on in advance can be returned, but never an error or a response timeout.

Characteristics of distributed system availability:

1. All requests are responded, and there will be no response timeout or response error.

2.1.3. P - Partition tolerance

The nodes of a distributed system are usually deployed in different subnets, which creates network partitions. Communication between nodes will inevitably fail at times because of network problems. If the system can still provide service externally when that happens, it is partition tolerant.

In the figure above, the reading and writing of product information to meet the partition tolerance is to achieve the following goals:

1. Failure to synchronize data from the master database to the slave database does not affect read and write operations.

2. The failure of one node does not affect the service provided by the other node.

How to achieve partition tolerance?

1. Prefer asynchronous over synchronous operations, for example synchronize data from the master database to the slave database asynchronously, so that the nodes are loosely coupled.

2. Add more slave database nodes, so that if one goes down the others still provide service.

Features of distributed partition tolerance:

1. Partition tolerance is the basic capability of a distributed system

2.2. CAP combination

1. Does the product management example above have CAP at the same time?

No distributed transaction scenario can have all three CAP properties at the same time, because C and A cannot coexist once P is required.

For example:

The following figure satisfies P, which means that partition tolerance is achieved:

picture

The meaning of partition tolerance in this figure is:

1) The master database synchronizes data to the slave database over the network. The master and slave databases can be considered to be deployed in different partitions that interact over the network.

2) When there is a problem with the network between the master database and the slave database, it will not affect the external services provided by the master database and the slave database.

3) The failure of one node does not affect the service provided by the other node.

To implement C, data consistency must be guaranteed. To prevent inconsistent data from being read from the slave database during synchronization, the slave database's data must be locked and then unlocked once synchronization completes. If synchronization fails, the slave database returns an error or a timeout.

To implement A, data availability must be guaranteed: data can be queried from the slave database at any time, and it never responds with a timeout or an error.

Through analysis, it is found that there is a contradiction between C and A under the premise of satisfying P, as follows:

In the case of a network partition, the master database's data cannot be synchronized to the slave database. To keep the data seen from outside consistent, the slave database must not be accessible from outside, so only the master database serves requests and the slave database loses its availability.

In the case of a network partition, the master database's data cannot be synchronized to the slave database, so the two databases hold inconsistent data. If both are allowed to serve requests (availability), the data returned will be inconsistent.

So CAP cannot be satisfied at the same time.
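The conflict between C and A during a partition can be illustrated with a toy master and slave. All names here (`Master`, `Replica`, `Unavailable`, the `mode` flag) are assumptions for the sketch, and the replica is simply told that it is out of sync, whereas a real system would have to detect that itself.

```python
class Unavailable(Exception):
    pass

class Replica:
    """Toy slave database; `mode` is "CP" or "AP" (hypothetical flag)."""
    def __init__(self, mode):
        self.mode = mode
        self.data = {}
        self.in_sync = True

    def read(self, key):
        if not self.in_sync and self.mode == "CP":
            # Consistency first: refuse to serve possibly stale data.
            raise Unavailable("replica is behind the master")
        # Availability first: always answer, even with old data.
        return self.data.get(key)

class Master:
    def __init__(self, replica):
        self.data = {}
        self.replica = replica
        self.partitioned = False

    def write(self, key, value):
        self.data[key] = value
        if self.partitioned:
            self.replica.in_sync = False   # the replication message is lost
        else:
            self.replica.data[key] = value

def read_during_partition(mode):
    replica = Replica(mode)
    master = Master(replica)
    master.write("price", 10)   # replicated normally
    master.partitioned = True
    master.write("price", 12)   # cannot reach the replica
    try:
        return replica.read("price")
    except Unavailable:
        return "error"

cp_result = read_during_partition("CP")   # "error": consistent, not available
ap_result = read_during_partition("AP")   # 10: available, but stale
```

With the partition in place, the slave can either answer with old data or refuse to answer; there is no third option that gives both.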

2. What are the combinations of CAP?

Therefore, when processing distributed transactions in production, it is necessary to determine which two aspects of CAP are satisfied according to the requirements.

1)AP:

Abandon consistency in pursuit of partition tolerance and availability. This is the choice of many distributed system designs.

For example:

The product management above can completely realize AP, as long as the user can accept that the queried data is not up-to-date within a certain period of time.

An AP implementation usually still guarantees eventual consistency; the BASE theory discussed later extends AP. Some business scenarios work this way, for example an order refund: the refund succeeds today and the money arrives in the account tomorrow, which is fine as long as the user can accept the delay.

2)CP:

Give up availability in pursuit of consistency and partition tolerance. ZooKeeper, for example, pursues strong consistency.

3)CA:

Give up partition tolerance: without partitions, and ignoring network failures and node crashes, both consistency and availability can be achieved.

Such a system is no longer a standard distributed system. The relational databases we use most often satisfy CA.

For the commodity management above, if CA is to be implemented, the structure is as follows:

picture

Data synchronization is no longer performed between the master database and the slave database, the database can respond to each query request, and each query request can return the latest data through the transaction isolation level.

2.3. Summary

The sections above covered the CAP theory. CAP is a proven theorem: a distributed system can satisfy at most two of the three properties of consistency, availability, and partition tolerance at the same time. It can serve as a yardstick for architecture design and technology selection. Most large-scale Internet applications have many nodes and scattered deployments, and clusters keep growing, so node failures and network failures are the norm. Service availability must reach N nines (99.99..%), and response times must be good for the user experience. The usual choice is therefore: guarantee P and A, give up the strong consistency of C, and ensure eventual consistency.

3. Base theory

3.1. Understanding strong consistency and eventual consistency

The CAP theory tells us that a distributed system can satisfy at most two of consistency, availability, and partition tolerance. AP is the most common choice in practice: it gives up consistency to guarantee availability and partition tolerance. In actual production, however, many scenarios still need consistency. In the earlier example where the master database synchronizes data to the slave database, even if consistency is not required at every moment, the data must eventually be synchronized successfully. This consistency differs from the consistency in CAP. Consistency in CAP requires the data on every node to be consistent at all times, which is strong consistency. Eventual consistency allows the data on the nodes to be inconsistent for a period of time, as long as it becomes consistent afterwards.

3.2.1. Introduction to Base Theory

BASE is an acronym for Basically Available, Soft state, and Eventually consistent. The BASE theory extends AP in CAP: it obtains availability by sacrificing strong consistency. When a failure occurs, some functions may be unavailable, but the core functions remain available. Data may be inconsistent for a period of time, but it eventually reaches a consistent state. Transactions that satisfy the BASE theory are called "flexible transactions".

3.2.2. Basically available

When a distributed system fails, it is allowed to lose some of the available functions to ensure that the core functions are available. For example, there is a problem with the transaction payment on the e-commerce website, but the products can still be browsed normally.

3.2.3. Soft state

Since strong consistency is not required, BASE allows intermediate states (also called soft states) in the system, and these states do not affect the system's availability. Examples are the "payment in progress" and "data synchronizing" states of an order. Once the data is eventually consistent, the state changes to "successful".

3.2.4. Eventual consistency

Eventual consistency means that after a period of time the data on all nodes becomes consistent. For example, an order's "payment in progress" status eventually changes to "payment successful" or "payment failed", so that the order status matches the actual transaction result, but only after some delay.
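A rough sketch of soft state and eventual consistency, assuming a single master, a single replica, and a queue of changes that is applied later. The class `EventuallyConsistentStore` and its method names are invented for this example.

```python
from collections import deque

class EventuallyConsistentStore:
    def __init__(self):
        self.master = {}
        self.replica = {}
        self.pending = deque()   # changes not yet applied to the replica

    def write(self, key, value):
        self.master[key] = value            # acknowledged immediately
        self.pending.append((key, value))   # replicated later

    def read_replica(self, key):
        return self.replica.get(key)        # may be stale (soft state)

    def replicate(self):
        """Apply queued changes in order; a real system would run this periodically."""
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value

store = EventuallyConsistentStore()
store.write("order-1", "PAYING")
store.replicate()
store.write("order-1", "PAID")
stale = store.read_replica("order-1")   # still "PAYING"
store.replicate()
fresh = store.read_replica("order-1")   # now "PAID"
```

Between the second write and the second `replicate()` call the replica serves an older status, which is exactly the window of inconsistency that BASE accepts.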

4. Five common solutions for distributed transactions

  1. Scheme 1: 2PC (two-phase commit)

  2. Scheme 2: 3PC (three-phase commit)

  3. Scheme 3: TCC

  4. Scheme 4: Reliable messages

  5. Scheme 5: Best-effort notification

The five schemes are introduced in turn below.

5. Scheme 1: 2PC (two-phase commit)

5.1. What is 2PC?

2PC is two-phase commit, which divides the entire transaction process into two phases, the prepare phase and the commit phase. 2 refers to the two phases, P refers to the prepare phase, and C refers to the commit phase.

The main 2 roles in 2PC:

  1. transaction coordinator

  2. transaction participants

5.1.1. Preparation phase

The transaction coordinator sends a prepare message to each transaction participant, and each participant executes the local transaction locally but does not commit the transaction (the resources of the transaction operation may be locked at this time), and then returns a yes or no message to the coordinator.

5.1.2. Commit phase

If all participants returned yes in the preparation phase, the transaction coordinator sends a commit message to each transaction participant. On receiving the commit message, each participant commits its local transaction.

If any participant returned no in the preparation phase, or a participant's response timed out (for example because a network problem broke communication between the transaction coordinator and the participant), the transaction coordinator sends a rollback message to each transaction participant. On receiving the rollback message, each participant rolls back its local transaction.

picture

5.1.3. Some rules in 2pc

  1. Phase 2 commit condition: all participants in phase 1 return yes

  2. Phase 2 rollback condition, 2 cases: any participant in phase 1 returns no, or any participant's phase 1 response times out

  3. If a participant's prepare succeeds, then a commit sent to that participant must also succeed, and a rollback sent to it must be able to roll back

  4. In 2PC only the transaction coordinator has a timeout mechanism: if in phase 1 the coordinator sends a message to a participant and gets no response in time, it goes straight to the phase 2 rollback. Participants have no timeout mechanism: for example, if all participants have finished phase 1 and the coordinator then goes down, the participants can only wait.
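The rules above can be sketched as a small in-memory coordinator. This is an illustration under simplifying assumptions, not a production design: `Participant`, its `vote` and `reachable` knobs, and `two_phase_commit` are hypothetical names, a timeout is modelled by raising `TimeoutError`, and commit and rollback are assumed to always succeed (rule 3).

```python
class Participant:
    """Toy participant; `vote` and `reachable` are hypothetical test knobs."""
    def __init__(self, name, vote=True, reachable=True):
        self.name = name
        self.vote = vote
        self.reachable = reachable
        self.state = "INIT"

    def prepare(self):
        if not self.reachable:
            raise TimeoutError(f"{self.name} did not answer")
        if self.vote:
            self.state = "PREPARED"  # work done, locks held, not committed
        return self.vote

    def commit(self):
        self.state = "COMMITTED"

    def rollback(self):
        self.state = "ROLLED_BACK"

def two_phase_commit(participants):
    # Phase 1: collect votes; a "no" and a timeout both lead to rollback.
    try:
        all_yes = all(p.prepare() for p in participants)
    except TimeoutError:
        all_yes = False
    # Phase 2: the same decision is sent to every participant.
    for p in participants:
        if all_yes:
            p.commit()
        else:
            p.rollback()
    return all_yes

ok = [Participant("db1"), Participant("db2")]
mixed = [Participant("db1"), Participant("db2", vote=False)]
timed_out = [Participant("db1"), Participant("db2", reachable=False)]
results = [two_phase_commit(ps) for ps in (ok, mixed, timed_out)]
# results == [True, False, False]
```

Note that the sketch has no answer for a coordinator that stops between the two phases: the participants would stay in `PREPARED` with nothing to release them.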

5.1.4. Problems with 2PC

  1. After phase 1 completes, each participant's local transaction has been executed but not committed, so the resources it touches are locked. If the coordinator goes down at this point, those resources cannot be released, which directly affects other business operations.

    For example, if participant 1 reduces the inventory of commodity 1, the inventory record of commodity 1 is locked. If another business operation needs to modify this record at the same time, it is blocked and cannot proceed.

  2. Performance: suppose a transaction has 10 participants. Participant 1 locks its local resources in phase 1 and then waits for the other 9 participants to finish phase 1. It releases the resources only after it receives the commit or rollback from the transaction coordinator. Waiting for 9 other participants keeps the resources locked for too long, which hurts the system's concurrency.

  3. The coordinator is a single point of failure: if the coordinator goes down after phase 1 completes, the participants do not know what to do and can only wait. Making the coordinator highly available mitigates this, and 3PC, described later, addresses this problem.

  4. Transaction inconsistency: in phase 2, some participants have received the commit message when the coordinator goes down, or a network problem prevents the other participants from receiving the commit request. The data held by the participants is then inconsistent. Solution: the coordinator and participants must be highly available, the coordinator must support retrying 2PC, and both phases of 2PC must be idempotent.

5.2. XA transactions

XA (eXtended Architecture) is the distributed transaction processing specification proposed by the X/Open organization. The XA protocol was first proposed by Tuxedo, and distributed transactions are therefore also called XA transactions.

The XA protocol mainly defines the interface between the transaction manager TM (Transaction Manager, coordinator) and the resource manager RM (Resource Manager, participant).

Resource managers are usually implemented by databases such as Oracle, DB2, and MySQL, all of which implement the XA interface. The transaction manager acts as the global scheduler and is responsible for committing and rolling back the local resources.

XA transactions are based on the two-phase commit (2PC) protocol, which can ensure data consistency, and many distributed relational data management systems use this protocol for distributed transactions. Phase 1 is the preparation phase: every participant prepares to execute the transaction and locks the resources it needs. When a participant is ready, it reports this to the TM. Phase 2 is the commit phase: once the TM has confirmed that all participants are ready, it sends the COMMIT command to all of them.

To put it simply: XA is an implementation of 2PC in databases.

A typical transaction in mysql looks like this:

start transaction;  -- begin the transaction
-- ... perform the transaction's operations ...
commit;  -- or: rollback; to undo the changes

In the transaction above, if the connection is broken or mysql is restarted before the connection has sent commit or rollback, the transaction is rolled back automatically.

The syntax of xa in mysql:

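XA {START|BEGIN} xid [JOIN|RESUME]   -- start an XA transaction; if XA START is used rather than XA BEGIN, [JOIN|RESUME] is not supported; xid is a unique value that identifies the transaction branch
XA END xid [SUSPEND [FOR MIGRATE]]   -- end an XA transaction; [SUSPEND [FOR MIGRATE]] is not supported
XA PREPARE xid   -- prepare to commit
XA COMMIT xid [ONE PHASE]   -- commit; ONE PHASE means one-phase commit. In the two-phase commit protocol, if only one RM takes part, it can be optimized to a one-phase commit
XA ROLLBACK xid   -- roll back
XA RECOVER [CONVERT XID]   -- list all XA transactions that are in the PREPARED state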

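For example:

xa start 'xa-1';
-- ... perform the transaction's operations ...
xa end 'xa-1';      -- mysql requires XA END before XA PREPARE
xa prepare 'xa-1';  -- phase 1: the resources touched by the transaction are locked, nothing is committed yet
xa commit 'xa-1';   -- phase 2; or: xa rollback 'xa-1';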

An XA transaction differs a little from a normal transaction. The XA transaction above has an identifier, xa-1. If the connection is broken or mysql is restarted after xa-1 has been prepared, the transaction remains in the prepared state. After mysql restarts, or after the caller reconnects, the caller can use the transaction identifier xa-1 to send xa commit or xa rollback and end the transaction.

You can create several dbs in mysql, and then try the two-phase commit through the xa script above to get a feel for the process.

5.3. Key points of transaction coordinator design in XA

In XA, transaction participants such as the common dbs already implement 2PC, but the coordinator has to be developed by yourself. Some design points for the coordinator:

  1. Generate a globally unique XA transaction id and record it

  2. The transaction coordinator needs to be able to retry: when an operation fails partway through, the transaction can eventually be completed by retrying repeatedly

  3. Because the coordinator retries operations, every stage in 2pc must be idempotent
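The three design points can be sketched together as follows. Everything here is hypothetical scaffolding: `Participant`, its `lost_acks` knob (which simulates a commit that is applied but whose acknowledgement never arrives), and `commit_with_retry` are invented for the example, and persistence of the transaction id is left out.

```python
import uuid

class Participant:
    """Toy participant whose acknowledgements get lost `lost_acks` times."""
    def __init__(self, lost_acks=0):
        self.lost_acks = lost_acks
        self.committed = set()   # xids that have already been committed
        self.applied = 0         # how many times a change took effect

    def commit(self, xid):
        if xid not in self.committed:   # idempotent: apply once per xid
            self.committed.add(xid)
            self.applied += 1
        if self.lost_acks > 0:
            self.lost_acks -= 1
            raise ConnectionError("ack lost on the way back")

def commit_with_retry(participants, max_attempts=5):
    xid = str(uuid.uuid4())   # globally unique id; a real coordinator would persist it
    pending = list(participants)
    for attempt in range(1, max_attempts + 1):
        still_pending = []
        for p in pending:
            try:
                p.commit(xid)
            except ConnectionError:
                still_pending.append(p)   # outcome unknown: retry later
        pending = still_pending
        if not pending:
            return xid, attempt
    raise RuntimeError("retries exhausted; manual intervention needed")

a = Participant()
b = Participant(lost_acks=2)
xid, attempts = commit_with_retry([a, b])
# attempts == 3, and each participant applied the change exactly once
```

Without the `xid` check inside `commit`, participant `b` would apply the change three times, which is why retrying and idempotence have to be designed together.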

5.4. 2PC solutions

  1. Seata: an open source distributed transaction framework that supports 2PC. It was started by the Alibaba middleware team as Fescar and later renamed Seata.

  2. atomikos+jta: jta is Java's interface specification for distributed transactions, and atomikos is an implementation of jta that uses XA internally. If the transaction participants themselves support XA transactions (for example mysql, oracle, sqlserver), this approach can be used, but performance deserves careful consideration.

  3. Build it yourself: once you understand the 2PC process, you can take on the challenge of developing a coordinator yourself.

6. Scheme 2: 3PC (three-phase commit)

6.1. Review 2PC

For example, if A invites B and C to play Honor of Kings together, the 2PC process is as follows:

A is the coordinator, B and C are participants.

6.1.1. Phase 1 (prepare phase)

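(1) step1-1: A messages B on WeChat

step1-1-1: A->B: Are you free? Let's get C and play Honor of Kings together
step1-1-2: B->A: I'm free
step1-1-3: A->B: Then turn on your computer now and log in to Honor of Kings. Wait there; I'll notify C and then open a room
step1-1-4: B->A: Logged in

(2) step1-2: A messages C on WeChat

step1-2-1: A->C: Are you free? I've asked B to play Honor of Kings together
step1-2-2: C->A: I'm free
step1-2-3: A->C: Then turn on your computer now and log in to Honor of Kings. Wait there; I'll go open a room
step1-2-4: C->A: Logged in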

6.1.2. Phase 2 (commit phase)

At this point both B and C have logged in to Honor of Kings. A then logs in and opens a room.

(1) step2-1: A messages B on WeChat

step2-1-1: A->B: The room number is xxx, you can come in
step2-1-2: B->A: OK, I'm in

(2) step2-2: A messages C on WeChat

step2-2-1: A->C: The room number is xxx, you can come in
step2-2-2: C->A: OK, I'm in

Then the three of them started having fun.

6.1.3. Some exceptions of 2PC

(1) Situation 1: step1-2-4 times out, so A never receives the message that C has logged in

A does not know what has happened to C, but the coordinator in 2PC has a timeout mechanism: if the coordinator sends a message to a participant and gets no response for a long time, it treats this as a failure. A then sends a rollback message to B and C so that both roll back, that is, the game is cancelled.

(2) Situation 2: after step1-1, coordinator A goes down

B has already turned on the computer and is waiting, but A and C are nowhere to be seen. B does not know how long the wait will be.

(3) Situation 3: after phase 1, coordinator A goes down

B and C have both logged in and have been waiting for more than ten minutes. With A nowhere to be seen, they can do nothing but wait.

(4) Situation 4: step2-2-1 fails because C's network is down

C cannot receive the message sent by A. A and B have both entered the room but C is missing, so the game cannot start normally and the final result does not match the expected one (3 people were expected to play together, but only 2 are in the room).

6.1.4. In general, 2PC has two main problems

The participant waiting problem

Participants can only act on the coordinator's instructions. When no instruction arrives, they can only sit and wait. In a db this means the data being operated on stays locked, which blocks other operations.

The data inconsistency problem

If the coordinator or a participant goes down during the commit phase, the final data may be inconsistent.

6.2. 3PC

3PC mainly solves the problem of participants waiting in the commit phase of 2PC. In the commit phase of 2PC, if the coordinator goes down, the participants do not know what to do. In 2PC only the coordinator has a timeout mechanism, whereas 3PC gives both the coordinator and the participants one. In the commit phase, if a participant does not receive the commit command within a certain period, it commits automatically, which solves the 2PC problem of resources being locked for a long time.

Compared with 2PC, 3PC has one more phase, which amounts to splitting the preparation phase of 2PC in two. Three-phase commit therefore has three phases: CanCommit, PreCommit, and DoCommit.

6.2.1. Phase 1: CanCommit phase

In phase 1 of 2PC, a participant executes its local transaction without committing, waits until the other services have also finished and returned Yes, and only really commits once the coordinator issues the commit. CanCommit, by contrast, only tries to acquire the database lock; if that is possible, the participant returns Yes.

This phase is divided into 2 steps:

Transaction query: the coordinator sends a CanCommit request to the participants, asking whether the transaction commit can be performed, and then waits for the participants' responses.

Response feedback: after a participant receives the CanCommit request, it normally returns a Yes response and enters the ready state if it believes the transaction can be executed smoothly. Otherwise it returns No and the transaction ends; the participant has performed no operations for the transaction at this point.

6.2.2. Phase 2: PreCommit phase

In phase 1, if all participants return Yes, the protocol enters the PreCommit phase for transaction pre-commit. The PreCommit phase is similar to phase 1 of 2PC, except that both the coordinator and the participants now have a timeout mechanism (in 2PC only the coordinator can time out; participants have no timeout mechanism).

6.2.3. Phase 3: DoCommit phase

This is similar to the second stage of 2pc.

6.3. Honor of Kings 3PC process

6.3.1. Normal process

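(1) Phase 1 (CanCommit phase)

step1-1: A messages B on WeChat
step1-1-1: A->B: Are you free? Let's get C and play Honor of Kings together
step1-1-2: B->A: I'm free
step1-2: A messages C on WeChat
step1-2-1: A->C: Are you free? Let's get B and play Honor of Kings together
step1-2-2: C->A: I'm free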

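(2) Phase 2 (PreCommit phase)

step2-1: A messages B on WeChat
step2-1-1: A->B: Turn on your computer now, log in to Honor of Kings, and wait for my message. If you hear nothing for 10 minutes, open a room and play by yourself (participant timeout mechanism).
step2-1-2: B->A: Logged in
step2-2: A messages C on WeChat
step2-2-1: A->C: Turn on your computer now, log in to Honor of Kings, and wait for my message. If you hear nothing for 10 minutes, open a room and play by yourself (participant timeout mechanism).
step2-2-2: C->A: Logged in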

(3) Phase 3 (DoCommit phase)

At this point both B and C have logged in to Honor of Kings. A then logs in and opens a room.

step3-1: A messages B on WeChat
step3-1-1: A->B: The room number is xxx, you can come in
step3-1-2: B->A: OK, I'm in
step3-2: A messages C on WeChat
step3-2-1: A->C: The room number is xxx, you can come in
step3-2-2: C->A: OK, I'm in

Then the three of them started having fun.

6.3.2. Several situations of exception

(1) Phase 1 exception

No transactional work has been done at this point, so if something goes wrong in this phase the transaction can simply be ended.

(2) Phase 2, a participant goes down

If a participant goes down, the coordinator simply notifies the other participants to roll back.

(3) Phase 2, the coordinator goes down

Because participants now have a timeout mechanism, they do not wait indefinitely when the coordinator goes down. After waiting for a certain period, each participant commits its local transaction automatically.

Although this timeout mechanism solves the problem of infinite waiting, it does not solve the problem of consistency. For example, suppose the coordinator goes down right after step2-1 (A messages B on WeChat) in the 3PC flow above. B has logged in, but C never received A's request to log in. After the 10-minute timeout, B opens a game to play alone, and the result does not match the expected one.
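The scenario just described can be replayed with a toy participant state machine. This is a sketch under stated assumptions: the class and method names are invented, timeouts are triggered by calling `on_timeout()` explicitly instead of using real timers, and the rule that a participant which never received PreCommit gives up on timeout is taken from the usual description of 3PC rather than from the text above.

```python
class ThreePhaseParticipant:
    """Toy 3PC participant driven by explicit events instead of real timers."""
    def __init__(self):
        self.state = "INIT"

    def on_can_commit(self, able=True):
        # Phase 1: only a feasibility check, no transactional work yet.
        self.state = "READY" if able else "ABORTED"
        return able

    def on_pre_commit(self):
        # Phase 2: execute the local transaction and hold it uncommitted.
        if self.state == "READY":
            self.state = "PRECOMMITTED"

    def on_do_commit(self):
        # Phase 3: make the change permanent.
        if self.state == "PRECOMMITTED":
            self.state = "COMMITTED"

    def on_abort(self):
        if self.state in ("READY", "PRECOMMITTED"):
            self.state = "ABORTED"

    def on_timeout(self):
        # Participant-side timeout, the addition over 2PC.
        if self.state == "PRECOMMITTED":
            self.state = "COMMITTED"   # coordinator silent after PreCommit
        elif self.state == "READY":
            self.state = "ABORTED"     # assumption: never saw PreCommit, give up

b, c = ThreePhaseParticipant(), ThreePhaseParticipant()
b.on_can_commit()
c.on_can_commit()
b.on_pre_commit()    # the coordinator goes down before reaching c
b.on_timeout()
c.on_timeout()
states = (b.state, c.state)   # ("COMMITTED", "ABORTED")
```

Nobody waits forever, but the two participants end in different states, which is the consistency gap that remains in 3PC.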

6.4. Problems with 3PC

3PC solves the long-term blocking of participants in 2PC (resources that cannot be released for a long time), but it does not solve the consistency problem.

Is there a way around these issues?

Yes: TCC. Let's look at it next.

7. Scheme 3: TCC

7.1. What is TCC?

Several roles in distributed transactions

  • TM: transaction manager, which can be understood as the initiator of distributed transactions

  • Branch transaction: each participant in the distributed transaction; a branch can be understood as an independent transaction.

TCC is the abbreviation of the three words Try, Confirm, and Cancel. TCC requires each branch transaction to implement three operations: preprocessing Try, confirming Confirm, and undoing Cancel.

The Try operation performs business checks and reserves resources, Confirm performs the actual business confirmation, and Cancel performs the opposite of Try, that is, a rollback.

TM first initiates the Try operation of all branch transactions. If the Try operation of any branch transaction fails, TM initiates the Cancel operation of the branch transactions. If all Try operations succeed, TM initiates the Confirm operation of all branch transactions. If a Confirm or Cancel operation fails, TM retries it.

7.1.1. Normal flow

Try phase: call the try method of each participant in turn; all return success

Confirm phase: call the confirm method of each participant in turn; all return success

The transaction is complete.


7.1.2. Exception flow

Try phase: call the try method of each participant in turn; the first two participants return yes and participant 3 returns no

Cancel phase: execute the cancel operation on the participants whose try succeeded. Note: the order of participants in the cancel phase is the reverse of the order in the try phase, that is, the cancel of participant 2 is called first, and then the cancel of participant 1.
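The normal and exception flows can be sketched as a small coordinator loop. The `Branch` class, `retry` helper, and `run_tcc` function are hypothetical names for illustration; the sketch assumes that only branches whose try succeeded are cancelled, and it caps retries at three attempts, which a real framework would replace with its own compensation policy.

```python
class Branch:
    """Hypothetical branch transaction exposing the three TCC operations."""

    def __init__(self, name, try_ok=True):
        self.name = name
        self.try_ok = try_ok
        self.log = []

    def try_(self):
        self.log.append("try")
        return self.try_ok

    def confirm(self):
        self.log.append("confirm")
        return True

    def cancel(self):
        self.log.append("cancel")
        return True


def retry(op, max_attempts=3):
    """Call op until it reports success, giving up after max_attempts."""
    for _ in range(max_attempts):
        if op():
            return True
    return False


def run_tcc(branches):
    tried = []
    for branch in branches:
        if branch.try_():
            tried.append(branch)
        else:
            # Cancel the branches whose try succeeded, in reverse order.
            for done in reversed(tried):
                retry(done.cancel)
            return "cancelled"
    for branch in branches:
        retry(branch.confirm)
    return "confirmed"


p1, p2, p3 = Branch("p1"), Branch("p2"), Branch("p3", try_ok=False)
result = run_tcc([p1, p2, p3])
print(result, p1.log, p2.log, p3.log)
# cancelled ['try', 'cancel'] ['try', 'cancel'] ['try']
```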


7.2. TCC scenario case

7.2.1. Case 1: Cross-database transfer

For example, the scenario is that A transfers 100 yuan to B, and the accounts of A and B are in different services.

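Account A
try:
 try idempotency check
 check that the balance is at least 100 yuan
 deduct 100 yuan from account A
confirm:
 empty
cancel:
 cancel idempotency check
 add 100 yuan back to account A's available balance

Account B
try:
 empty
confirm:
 confirm idempotency check
 add 100 yuan to account B
cancel:
 empty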

7.2.2. Case 2: Withdraw to Alipay

For example, many people use Douyin, and some users earn income on Douyin that they can withdraw to Alipay. Suppose a user withdraws 100 yuan to Alipay.

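Douyin (account table: balance, frozen amount)
try:
 try idempotency check
 check that the balance is at least 100 yuan
 Douyin account table: balance -100, frozen amount +100
confirm:
 confirm idempotency check
 Douyin account: frozen amount -100
cancel:
 cancel idempotency check
 Douyin account table: balance +100, frozen amount -100

Account B (the Alipay payout side)
try:
 empty
confirm:
 confirm idempotency check
 call the Alipay payout interface to pay out 100 yuan (the Alipay interface is idempotent for the same merchant order)
cancel:
 empty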
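The Douyin side of this case can be sketched as follows. It is a minimal in-memory illustration, not a real account service: the `Account` class, its `results` dictionary (standing in for an idempotency record table), and the rule that cancel does nothing when try never froze anything are all assumptions of the sketch.

```python
class Account:
    """In-memory stand-in for the account table (balance, frozen amount)."""

    def __init__(self, balance):
        self.balance = balance
        self.frozen = 0
        self.results = {}  # (tid, method) -> result of the first execution

    def _once(self, tid, method, action):
        # Idempotency check: run the action once per (tid, method), then replay.
        key = (tid, method)
        if key not in self.results:
            self.results[key] = action()
        return self.results[key]

    def try_(self, tid, amount):
        def action():
            if self.balance < amount:
                return False
            self.balance -= amount
            self.frozen += amount
            return True
        return self._once(tid, "try", action)

    def confirm(self, tid, amount):
        def action():
            self.frozen -= amount
            return True
        return self._once(tid, "confirm", action)

    def cancel(self, tid, amount):
        def action():
            # Assumption: if try never froze anything, there is nothing to undo.
            if self.results.get((tid, "try")) is True:
                self.balance += amount
                self.frozen -= amount
            return True
        return self._once(tid, "cancel", action)


acct = Account(balance=300)
acct.try_("t1", 100)
acct.try_("t1", 100)  # duplicate call, no second deduction
after_try = (acct.balance, acct.frozen)
acct.confirm("t1", 100)
after_confirm = (acct.balance, acct.frozen)
print(after_try, after_confirm)  # (200, 100) (200, 0)
```

Freezing the amount in try, rather than deducting it outright, is what lets cancel restore the balance without the money ever having been visible as spent.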

7.3. TCC common framework

Framework | GitHub address | Stars
tcc-transaction | https://github.com/changmingxie/tcc-transaction | 4750
hmily | https://github.com/Dromara/hmily | 2900
ByteTCC | https://github.com/liuyangming/ByteTCC | 2450
EasyTransaction | https://github.com/QNJR-GROUP/EasyTransaction | 2100

7.4. Self-developed TCC framework design ideas

7.4.1. Roles involved (transaction initiator, transaction participant, TCC service)

(1), transaction initiator (TM)

  • Initiate the distributed transaction: call the TCC service to register a distributed transaction order

  • Call the branches: call each branch in turn

  • Report results: finally report the execution results of all branches of the transaction to the TCC service

  • Provide a compensation interface: the TCC service calls this interface to perform compensation

(2), transaction participants

  • Provide 3 methods: try, confirm, cancel

  • Ensure idempotency of 3 methods

  • The 3 methods return only 3 result status codes (success, failure, and processing). Processing is equivalent to an unknown status; calls whose status is unknown are retried during compensation.

(3), TCC service

  • It is an independent service

  • Provide a distributed transaction order registration interface, for the transaction initiator to use: the initiator calls the TCC service to create a distributed transaction order (order status: 0: processing, 1: succeeded, 2: failed) and obtains the distributed transaction order id (TID)

  • Provide a distributed transaction result reporting interface, for the transaction initiator to use: the initiator reports the execution results to the TCC service while the transaction executes

  • Provide transaction compensation: start a job that polls the TCC orders whose status is 0 (processing) and keeps calling the transaction initiator's compensation interface. After repeated compensation, the order should reach a final status of 1 (success) or 2 (failure); otherwise, manual intervention is required

7.4.2. Timing diagram

[Figure: timing diagram, img/5.jpg]

7.4.3. Technical points of self-developed TCC framework

(1), What the framework should take care of

Developers should only have to write the code of the three methods in each branch; everything else should be handled by the framework.

(2), transaction order table design in tcc service

  • id: order id

  • bus_order_id: business order id

  • bus_order_type: business type (bus_order_id & bus_order_type need to be unique)

  • request_data: business request data, stored in json format, including the request data of the business side

  • status: status, 0: processing, 100: succeeded, 200: failed; the initial status is 0, and the final status must be 100 or 200

(3), about the idempotent design of the three methods in the branch

Taking Spring in Java as an example, this can be done with an interceptor that intercepts the three methods of each branch and implements the idempotency logic in one place.

It can be backed by one table [branch method execution record table: tid, branch, method (try, confirm, cancel), status (0: processing; 100: success; 200: failure), request_json (request parameters), response_json (response parameters)]

Request parameters: the complete parameters of the method request, including the business parameters, stored in json format.

Response parameters: the execution result of the branch method, stored in json format.

In the interceptor, query the branch method execution record table by tid, branch, and method. If the record's status is 100 or 200, return its response_json directly.
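The interceptor idea can be sketched with a decorator. The article's example is a Spring interceptor; this version uses Python only to keep the sketch short. The `records` dictionary stands in for the branch method execution record table, and the `idempotent` decorator and `try_deduct` function are hypothetical. Records are keyed by tid, branch, and method; including tid is an assumption of the sketch so that different transactions do not share a record.

```python
import functools
import json

# Stand-in for the branch method execution record table,
# keyed by (tid, branch, method).
records = {}


def idempotent(branch, method):
    def decorator(func):
        @functools.wraps(func)
        def wrapper(tid, request):
            key = (tid, branch, method)
            record = records.get(key)
            if record and record["status"] in (100, 200):
                # Final state already recorded: replay the stored response.
                return json.loads(record["response_json"])
            records[key] = {"status": 0,
                            "request_json": json.dumps(request),
                            "response_json": None}
            try:
                response = func(tid, request)
                status = 100
            except Exception as exc:
                response = {"error": str(exc)}
                status = 200
            records[key].update(status=status,
                                response_json=json.dumps(response))
            return response
        return wrapper
    return decorator


calls = []


@idempotent(branch="account_a", method="try")
def try_deduct(tid, request):
    calls.append(tid)  # counts how often the business logic really runs
    return {"deducted": request["amount"]}


first = try_deduct("tid-1", {"amount": 100})
second = try_deduct("tid-1", {"amount": 100})
print(first == second, len(calls))  # True 1
```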

(4), the try phase is synchronous, and the other phases are asynchronous

If every try succeeds, confirm must eventually succeed. If any try fails, cancel must be executed, and every cancel must eventually succeed as well. So once the try phase completes, the final outcome is already known, and the subsequent confirm or cancel can be executed asynchronously, which improves the overall performance of the system.

(5), asynchronously report transaction execution results

The initiator reports the execution result of each step of every branch, and the final result of the transaction, to the TCC service. The TCC service persists them to its database, which makes it easy for operators to review transaction results and troubleshoot.

(6) About compensation

Add a compensation job to the TCC service that periodically polls the TCC distributed order table and pulls the records whose status is processing. The request_data column of the order table contains the request parameters; the job uses request_data to call the compensation interface provided by the transaction initiator, and repeats this until the order reaches a final status (success or failure).

Compensation uses a backoff schedule: for the same order, the interval between compensation attempts keeps growing: 10s, 20s, 40s, 80s, 160s, 320s, and so on.
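The interval sequence is a plain doubling schedule. A minimal sketch, where the function name and its parameters are illustrative only:

```python
import itertools


def compensation_intervals(first=10, factor=2):
    """Yield compensation intervals in seconds: 10, 20, 40, 80, ..."""
    interval = first
    while True:
        yield interval
        interval *= factor


schedule = list(itertools.islice(compensation_intervals(), 6))
print(schedule)  # [10, 20, 40, 80, 160, 320]
```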

(7), manual intervention

If a TCC distributed order has been in processing for a long time and still has not reached a final state after many compensation attempts, there is probably a business problem and manual compensation is required. A monitoring system should raise an alarm for such orders so that developers are reminded to intervene.

7.5. Summary

If you compare the processing flow of TCC with 2PC two-phase commit, 2PC usually works at the DB level across databases, while TCC works at the application level. TCC can be seen as an application-level implementation of 2PC that has to be written as business logic. The advantage of this approach is that the application defines the granularity of its data operations, which makes it possible to reduce lock conflicts and improve throughput. The disadvantage is that it is very intrusive to the application: every branch of the business logic has to implement the three operations try, confirm, and cancel, so the amount of code is relatively large.

8. Scheme 4: Reliable messages

8.1. What is reliable message eventual consistency?

The reliable message eventual consistency scheme means that once the transaction initiator has completed its local transaction and sent a message, the transaction participant (the message consumer) must be able to receive the message and process its own transaction successfully. The scheme emphasizes that as long as the message is delivered to the participant, the transactions of both parties eventually become consistent.

There are 2 key points here:

  1. After the local transaction of the message sender executes successfully, the message must be delivered successfully

  2. The message consumer must eventually consume the message, so that the distributed transaction eventually becomes consistent

8.2. Business scenario: award points after an order is placed

E-commerce has the following scenario: after a product order is placed, points must be awarded to the user. The order table and the points table are in different dbs, so a distributed transaction is involved.

We address this with reliable messages:

  1. We use mq to award the points after the order is placed successfully

  2. After the product order is placed successfully, a message is delivered to mq, and the points system consumes the message and adds points for the user

The main question is how to combine placing the order with delivering the message to mq, and what the advantages and disadvantages of each method are.

8.3. Message Delivery Process: Method 1

8.3.1. Process

  • step1: Begin the local transaction

  • step2: Create the shopping order

  • step3: Deliver the message to mq

  • step4: Commit the local transaction

In this method the message is sent before the transaction commits.

8.3.2. Possible problems

  • An exception occurs in step3: step4 fails as a result and the order is not placed, so a messaging problem directly affects the ordering business

  • An exception occurs in step4 while the other steps succeed: the order fails, but the message has already been delivered, so points are added for the user anyway

8.4. Message Delivery Process: Method 2

Next, let's change the order and send the message after the transaction commits.

8.4.1. Process

  • step1: Begin the local transaction

  • step2: Create the shopping order

  • step3: Commit the local transaction

  • step4: Deliver the message to mq

8.4.2. Possible problems

An exception occurs in step4 while the other steps succeed: the order is placed successfully, but the message is not delivered and the user gets no points

These two methods are the most common in practice, and also the most error-prone.

8.5. Message Delivery Process: Method 3

  • step1: Begin the local transaction

  • step2: Create the shopping order

  • step3: Insert a record into the table t_msg_record in the local database, describing the message that needs to be sent

  • step4: Commit the local transaction

  • step5: Add a timer that polls t_msg_record and delivers the pending records to mq

This method relies on the database transaction to make the business change and the message record one atomic operation: if the business succeeds, the message record is guaranteed to exist. This solves the problems of the first two methods. If our business system is relatively simple, we can use this method.

For microservices this method is less suitable: every service has to implement the steps above, and it does not scale well.
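A minimal sketch of method 3 using an in-memory SQLite database. Only the table name t_msg_record comes from the text; the t_order table, the column names, and the `publish` callback are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_order (id INTEGER PRIMARY KEY, user_id INTEGER, amount INTEGER)")
# status: 0 = pending, 1 = sent
conn.execute("CREATE TABLE t_msg_record (id INTEGER PRIMARY KEY, body TEXT, status INTEGER DEFAULT 0)")


def place_order(user_id, amount):
    # Steps 1-4: the order and the message record commit or roll back together.
    with conn:
        cur = conn.execute(
            "INSERT INTO t_order (user_id, amount) VALUES (?, ?)", (user_id, amount))
        conn.execute(
            "INSERT INTO t_msg_record (body) VALUES (?)", (f"order:{cur.lastrowid}",))


def poll_and_publish(publish):
    # Step 5: a timer would call this periodically.
    rows = conn.execute(
        "SELECT id, body FROM t_msg_record WHERE status = 0").fetchall()
    for msg_id, body in rows:
        publish(body)  # if this raises, the record stays pending for the next poll
        with conn:
            conn.execute("UPDATE t_msg_record SET status = 1 WHERE id = ?", (msg_id,))
    return len(rows)


sent = []
place_order(user_id=1, amount=100)
first_poll = poll_and_publish(sent.append)
second_poll = poll_and_publish(sent.append)
print(first_poll, second_poll, sent)  # 1 0 ['order:1']
```

If the process dies after `publish` but before the status update, the same message is published again on the next poll, so the consumer still has to be idempotent.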

8.6. Message Delivery Process: Method 4

Add a message service and a message database; the service is responsible for storing messages and delivering them to mq.

  • step1: Begin the local transaction

  • step2: Create the shopping order

  • step3: Insert a log record into the current transaction's database: generate a unique business id (bus_id), associate the bus_id with the order, and save it in the database where the current transaction runs

  • step4: Call the message service with the bus_id to persist the message first. At this point the message status is pending, and the message id (msg_id) is returned

  • step5: Commit the local transaction

  • step6: If all of the above succeeded, call the message service to deliver the message to mq; if anything above failed, call the message service to cancel the message

This method is a big step forward. Let's continue to analyze the possible problems:

  1. A message service is added to the system, and placing an order now depends heavily on it. When the message service is unavailable, the whole business is unavailable.

  2. If step6 fails, the message stays in the pending state. The business side then needs to provide a check-back interface (query by bus_id) that tells whether the business executed successfully. The message service needs a scheduled task that compensates for pending messages: it checks whether the business succeeded and then decides whether to deliver or cancel the message.

  3. Step4 relies on the message service. If the message service performs poorly, the current business transaction takes longer to commit, lock contention and deadlocks become more likely, and concurrency drops. Remote calls inside a transaction are generally to be avoided: their performance and duration are often uncontrollable, which can turn the current transaction into a large transaction and cause other failures.

8.7. Message Delivery Process: Method 5

By continuing to improve on the methods above, we arrive at a better one:

  • step1: Generate a globally unique business message id (bus_msg_id), then call the message service with the bus_msg_id to persist the message first. At this point the message status is pending, and the message id (msg_id) is returned

  • step2: Begin the local transaction

  • step3: Create the shopping order

  • step4: Insert a log record into the current transaction's database (associating the business in step3 with the bus_msg_id)

  • step5: Commit the local transaction

  • step6: There are two cases: if the above succeeded, call the message service to deliver the message to mq; if anything above failed, call the message service to cancel the message

If step6 fails, the message stays in the pending state. The business side then needs to provide a check-back interface (query by bus_msg_id) that tells whether the business executed successfully.

The message service needs a scheduled task that compensates for messages whose status is pending: it checks whether the business succeeded and then decides whether to deliver or cancel the message.

Compared with method 4, the improvement in method 5 is that the call to the message service and the persisting of the message happen outside the local transaction. This small change is a very good optimization: it shortens the local transaction and so allows higher concurrency. Alibaba's message middleware RocketMQ supports method 5, and you can use it.
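The pending, deliver, cancel, and check-back steps of method 5 can be sketched with an in-memory message service. All class, method, and variable names here are hypothetical, and a list stands in for mq; this is not RocketMQ's API.

```python
PENDING, SENT, CANCELLED = "pending", "sent", "cancelled"


class MessageService:
    """In-memory stand-in for the message service and its message database."""

    def __init__(self, mq):
        self.mq = mq        # list standing in for the message queue
        self.messages = {}  # bus_msg_id -> {"body": ..., "status": ...}

    def save(self, bus_msg_id, body):
        self.messages[bus_msg_id] = {"body": body, "status": PENDING}

    def deliver(self, bus_msg_id):
        msg = self.messages[bus_msg_id]
        if msg["status"] == PENDING:
            self.mq.append(msg["body"])
            msg["status"] = SENT

    def cancel(self, bus_msg_id):
        msg = self.messages[bus_msg_id]
        if msg["status"] == PENDING:
            msg["status"] = CANCELLED

    def compensate(self, business_succeeded):
        # Scheduled task: resolve pending messages via the check-back interface.
        for bus_msg_id, msg in self.messages.items():
            if msg["status"] == PENDING:
                if business_succeeded(bus_msg_id):
                    self.deliver(bus_msg_id)
                else:
                    self.cancel(bus_msg_id)


mq = []
service = MessageService(mq)
business_log = set()  # step4: bus_msg_id values committed together with the order

service.save("m1", "add points for order 1")  # step1, outside the transaction
business_log.add("m1")                        # steps 2-5 succeed, then step6 is lost
service.save("m2", "add points for order 2")  # this order's local transaction fails

service.compensate(lambda bus_msg_id: bus_msg_id in business_log)
print(mq, service.messages["m2"]["status"])
# ['add points for order 1'] cancelled
```

A real compensation task would presumably only check back on messages that have been pending for some time, since a very recent pending message may belong to a local transaction that is still running.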


8.8. Some questions about message consumption

How do we solve the problem of repeated consumption?

The consumer polls the mq server, pulls messages, and then consumes them.

The process by which a consumer consumes a message:

  • step1: Pull the message from mq

  • step2: Execute the local business, such as adding points

  • step3: After consumption, delete the message from mq

When step2 succeeds and step3 fails, the message is pulled from mq again and consumed a second time, so consumption must be idempotent: consuming the same message several times must give the same result as consuming it once. Idempotency is a separate topic and will be discussed in detail another time.
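One common way to make consumption idempotent is to remember the ids of messages that have already been processed. A minimal sketch, where the message fields (`msg_id`, `user_id`, `points`) and the in-memory stores are assumptions for illustration:

```python
processed = set()  # stand-in for a table of consumed message ids
points = {}        # user_id -> points


def consume(message):
    """Add points once per message id, however many times it is delivered."""
    if message["msg_id"] in processed:
        return False  # duplicate delivery, nothing to do
    user_id = message["user_id"]
    points[user_id] = points.get(user_id, 0) + message["points"]
    processed.add(message["msg_id"])
    return True


msg = {"msg_id": "m-1", "user_id": 42, "points": 10}
first = consume(msg)
second = consume(msg)  # redelivered because step3 failed
print(first, second, points[42])  # True False 10
```

In a real system the processed-id record and the points update would need to be written in the same local transaction, otherwise a crash between the two reintroduces the problem.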

9. Scheme 5: Best Effort Notification

9.1. Alipay recharge case

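Suppose we run our own e-commerce system that lets users recharge through Alipay. The process consists of a synchronous user payment flow followed by an asynchronous notification of the payment result, described below.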

9.2. User payment process (a synchronous process)

  1. The user initiates a recharge request in the browser -> e-commerce service

  2. The e-commerce service creates a recharge order with status 0: pending payment (0: pending payment, 100: payment successful, 200: payment failed)

  3. The e-commerce service sends the order information to Alipay, which creates an Alipay order. The service then assembles the Alipay payment request address (order information, the page return_url shown to the user after a successful payment, and the asynchronous payment notification address notify_url) and returns the assembled information to the user

  4. The user's browser jumps to the Alipay payment page, where the user confirms the payment

  5. Alipay calls back return_url synchronously with the payment result, and return_url displays the payment result to the user

9.3. Alipay notifies the merchant of the payment result asynchronously

Once the user's payment flow completes, the payment order in Alipay is paid, but the recharge order in the e-commerce system is still in status 0 (pending payment). Alipay then notifies notify_url of the payment result asynchronously. A notification may fail because of network problems, in which case Alipay retries several times at increasing intervals, doing its best to deliver the result to the merchant. This process is best-effort notification.

After receiving the notification from Alipay, the merchant processes the local order idempotently and then tells Alipay that processing succeeded, after which Alipay stops notifying.

9.4. What are decaying notifications?

For example, Alipay tries to notify up to 100 times, and the interval between notifications keeps growing. For instance, after the first failure the second notification is sent after 10s, after the second failure the third notification is sent after 30s, and the intervals keep increasing in the same way.
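The retry loop behind a decaying notification can be sketched as follows. The function name, the `send` callback, and the interval values after 30s are assumptions for illustration; this is not Alipay's actual schedule. The `sleep` function is passed in so that the example runs instantly.

```python
def notify_with_retries(send, intervals, sleep):
    """Call send() once, then once more after each interval; stop on success."""
    if send():
        return True
    for delay in intervals:
        sleep(delay)
        if send():
            return True
    return False


attempts = []
waits = []


def flaky_send():
    attempts.append(1)
    return len(attempts) >= 3  # the merchant acknowledges on the third attempt


ok = notify_with_retries(flaky_send, intervals=[10, 30, 60], sleep=waits.append)
print(ok, len(attempts), waits)  # True 3 [10, 30]
```

When the function returns False, every attempt has failed, and the receiving side has to learn the result by querying instead.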

9.5. What should I do if the Alipay notification fails?

Merchants can take the initiative to call Alipay's query interface to query the payment status of the order.

9.6. Why is asynchronous notification required?

The user payment process already includes a return_url, which Alipay calls synchronously with the payment result after a successful payment. Can't the merchant simply update the local order status in the return_url handler? It can, but the user's network may be bad and the call to return_url may fail. In that case the payment result must still reach the merchant through the asynchronous notification to notify_url.

9.7. What is the best-effort notification type used for?

In a distributed transaction, the caller sometimes cannot learn the result of a call immediately, because the called party takes a long time to process the business. Once the called party has finished, it can deliver the result to the caller through best-effort notification.

9.8. Best Effort Notification Must Have Compensation Mechanism

The called party does its best to notify the caller of the result, but in extreme cases the notification can still fail. For this reason the called party needs to provide a query interface.

For business whose result has been unknown for a long time, the caller can actively query the called party and then process the result.

9.9. Can I take the initiative to check without notification?

Yes. The called party provides a query interface, and the caller can learn the result by querying proactively, but notification is closer to real time.

With notification, the caller is informed as soon as the called party succeeds. With querying alone, the caller has to decide when to query, and the right timing is hard to judge, so combining the two works best.

10. Comparative Analysis of Distributed Transactions

Having studied the various distributed transaction solutions, we can summarize their advantages and disadvantages:

The biggest criticism of 2PC is that it is a blocking protocol. After executing its branch transaction, RM has to wait for TM's decision, and during that time the service blocks and holds locks on resources. Because of this blocking mechanism and its high worst-case time complexity, the design does not scale as the number of services involved in a transaction grows, and it is hard to use for distributed services with high concurrency or long sub-transaction life cycles (long-running transactions).

If you compare the processing flow of TCC with 2PC two-phase commit, 2PC usually works at the DB level across databases, while TCC works at the application level and has to be implemented in business logic. The advantage of this approach is that the application defines the granularity of its data operations, which makes it possible to reduce lock conflicts and improve throughput. The disadvantage is that it is very intrusive to the application: every branch of the business logic has to implement the three operations try, confirm, and cancel. It is also relatively hard to implement, because different rollback strategies are needed for different failure causes such as network problems and system failures. Typical usage scenarios: spend-threshold ("full reduction") discounts, coupons for logging in, etc.

Reliable message eventual consistency is suitable for scenarios with long execution cycles and low real-time requirements. With the message mechanism, a synchronous transactional operation becomes an asynchronous, message-driven one, which avoids the impact of synchronous blocking in distributed transactions and decouples the two services. Typical usage scenarios: points for signing up, coupons for logging in, etc.

Best-effort notification has the lowest requirements of all the distributed transaction schemes and suits businesses that are not time-sensitive about eventual consistency. The notifying party's business processing is allowed to fail, and the receiving party actively handles the failure after receiving the notification; however the notifying party handles the result, the receiving party's subsequent processing is not affected. The notifying party must provide an interface for querying the execution status so that the receiving party can verify the result. Typical usage scenarios: bank notifications, payment result notifications, etc.

Item | 2PC | TCC | Reliable messages | Best-effort notification
Consistency | Strong | Eventual | Eventual | Eventual
Throughput | Low | Medium | High | High
Implementation complexity | Easy | Hard | Medium | Easy

11. Summary

When conditions permit, prefer local transactions on a single data source: they avoid the performance cost of network interaction and the many problems caused by weak data consistency. If a system uses distributed transactions frequently and without good reason, first look at the overall design and ask whether the service split is reasonable: is it high cohesion and low coupling, or is the granularity too small? Distributed transactions have always been a hard problem in the industry, because the network is unreliable and because we habitually measure distributed transactions against the ACID guarantees of single-node transactions.

Whether it is XA at the database layer, TCC at the application layer, reliable messages, or best-effort notification, none of them solves the problem of distributed transactions perfectly. Each one makes trade-offs among performance, consistency, and availability according to what it values most.


Origin blog.csdn.net/Javatutouhouduan/article/details/131895386