Distributed transactions, explained clearly

  1. What is a transaction?
  2. Looking at transactions from another angle
  3. Transactions in Java
  4. What is a distributed transaction?
  5. Several ways to implement distributed transactions
  6. Summary

Preface

Distributed systems and microservices are everywhere these days, and most developers know the terms well. Ask about the benefits of going distributed or splitting a system into microservices, and a long list comes to mind.

For example, each person maintains only their own service, so the code conflicts of the past disappear. When you test, release, or upgrade, you only need to care about the code you wrote yourself. Very convenient!

But everything has two sides, and this architecture brings problems of its own. This article covers one of the thornier ones: distributed transactions.

What is a transaction?

Let's start with a question: what is a transaction?

Some people will say that a transaction is a series of operations that either all succeed or all fail, and then go on to describe the ACID properties (atomicity, consistency, isolation, and durability). That is correct: a transaction ensures that a series of operations executes as a unit, and it must satisfy the ACID properties.

But today let's look at it from a different perspective. We need to know not only the what (what is a transaction?) but also the why (why does the concept exist, and what problem does it solve?).

Sometimes a different angle brings different insights.

Looking at transactions from another angle

Just as classic literature comes from life but rises above it, the concept of a transaction also comes from everyday life. Transactions were introduced to solve a specific problem; otherwise, who would bother?

The simplest and most classic example is a bank transfer: we want to transfer 1,000 yuan from account A to account B. In the normal case, the balance of account A is reduced by 1,000 (call this action1) and the balance of account B is increased by 1,000 (call this action2).

First, note that action1 and action2 are two separate operations, so they execute in some order. A problem can therefore occur after action1 has executed and just before action2 runs (for example, the database is overloaded and temporarily refuses access).

In everyday terms: I transfer 1,000 yuan to a friend, my card balance drops by 1,000, but my friend never receives the money.

The concept of a transaction was introduced to solve this "where did the money go" problem. No system can guarantee that a transfer succeeds 100% of the time. Suppose the banking system offers 99.99% availability. If the problem above occurs in the remaining 0.01% of cases, the banking system simply rolls back action1 (that is, it adds the 1,000 yuan back to the balance of account A).

From the bank's point of view: I may not be able to guarantee that action1 and action2 both succeed in that 0.01% of cases, but when something goes wrong I can guarantee that they both fail. This is the atomicity of a transaction. This example answers the two questions raised earlier: why do transactions exist, and what problem are they meant to solve?

To summarize: through its ACID properties, a transaction ensures that a series of operations executes safely and correctly, even when something fails partway through.
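The transfer example can be sketched in plain Java, with no database involved. The Account class, the transfer method, and the failBeforeCredit flag are hypothetical names made up for illustration; a real system would rely on the database's own transaction support rather than a manual undo like this.

```java
// Hypothetical in-memory illustration of atomicity; not how a real bank or database works.
class AtomicTransferSketch {

    static class Account {
        long balance;
        Account(long balance) { this.balance = balance; }
    }

    /**
     * Moves amount from 'from' to 'to'. If anything fails between action1 and
     * action2, action1 is undone so that both operations fail together.
     * failBeforeCredit simulates the database refusing access before action2.
     */
    static boolean transfer(Account from, Account to, long amount, boolean failBeforeCredit) {
        long fromBefore = from.balance; // snapshots used for the rollback
        long toBefore = to.balance;
        try {
            from.balance -= amount;     // action1: debit account A
            if (failBeforeCredit) {
                throw new IllegalStateException("database temporarily unavailable");
            }
            to.balance += amount;       // action2: credit account B
            return true;                // "commit": both changes stay
        } catch (RuntimeException e) {
            from.balance = fromBefore;  // "rollback": restore both balances
            to.balance = toBefore;
            return false;
        }
    }

    public static void main(String[] args) {
        Account a = new Account(5000);
        Account b = new Account(0);
        System.out.println(transfer(a, b, 1000, false) + " A=" + a.balance + " B=" + b.balance);
        System.out.println(transfer(a, b, 1000, true) + " A=" + a.balance + " B=" + b.balance);
    }
}
```

The second call fails between the two actions, and the balances are the same afterwards as before it: the money does not disappear.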

Transactions in Java

Now that we understand transactions, let's look at something familiar: how are transactions used in Java? The most common approach is to add the @Transactional annotation to the create, update, and delete methods of the service layer and let Spring manage the transactions for us.

Under the hood, Spring generates a dynamic proxy for the service component, so that calls to the component's methods go through the proxy. When a business method such as add() is called, the proxy, following the AOP idea, executes setAutoCommit(false) to open a transaction before invoking the real business method.

After the business method finishes, the proxy executes commit to commit the transaction; if the business method throws an exception (by default, a RuntimeException or Error), it executes rollback to roll the transaction back.
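The proxy idea can be sketched with the JDK's own java.lang.reflect.Proxy. This is a simplified illustration and not Spring's actual implementation: FakeConnection, UserService, TxHandler, and wrap are hypothetical names, and FakeConnection only records calls instead of talking to a database.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of the proxy idea; this is NOT Spring's actual implementation.
class TxProxySketch {

    /** Hypothetical stand-in for java.sql.Connection that only records calls. */
    static class FakeConnection {
        final List<String> log = new ArrayList<>();
        void setAutoCommit(boolean on) { log.add("setAutoCommit(" + on + ")"); }
        void commit() { log.add("commit"); }
        void rollback() { log.add("rollback"); }
    }

    /** Hypothetical service interface. */
    interface UserService {
        void add(String name);
    }

    /** Wraps every call in open / commit / rollback, as described above. */
    static class TxHandler implements InvocationHandler {
        private final Object target;
        private final FakeConnection conn;

        TxHandler(Object target, FakeConnection conn) {
            this.target = target;
            this.conn = conn;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            conn.setAutoCommit(false);               // open the transaction
            try {
                Object result = method.invoke(target, args);
                conn.commit();                       // business method finished normally
                return result;
            } catch (InvocationTargetException e) {
                conn.rollback();                     // business method threw
                throw e.getCause();                  // rethrow the original exception
            }
        }
    }

    static UserService wrap(UserService target, FakeConnection conn) {
        return (UserService) Proxy.newProxyInstance(
                UserService.class.getClassLoader(),
                new Class<?>[] {UserService.class},
                new TxHandler(target, conn));
    }

    public static void main(String[] args) {
        FakeConnection conn = new FakeConnection();
        UserService service = wrap(name -> {
            if (name.isEmpty()) throw new IllegalArgumentException("empty name");
        }, conn);

        service.add("alice");
        try {
            service.add("");
        } catch (IllegalArgumentException expected) {
            // the proxy rolled back and rethrew the original exception
        }
        System.out.println(conn.log);
    }
}
```

The recorded log shows one open/commit pair for the successful call and one open/rollback pair for the failing one. Unlike Spring's default rule, this sketch rolls back on any exception thrown by the business method.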

The implementation details of @Transactional are not the focus of this article, whose topic is distributed transactions, so we won't expand on them here. If you are interested, set a breakpoint and step through the source code in a debugger. The source code tells the truth.

What is a distributed transaction?

After all that, we finally reach the main topic of this article!

First, have you ever wondered: since we already have transactions, and Spring's @Transactional annotation makes them so convenient to control, why do we need a separate concept of distributed transactions? What is the relationship between distributed transactions and ordinary transactions? What is the difference? What problem do distributed transactions solve?

The questions pile up, but don't worry. With them in mind, let's look at distributed transactions in detail.

As the name suggests, a distributed transaction has to do with distributed systems. Simply put, a distributed transaction is a transaction in a distributed system.

First take a look at the following picture:

 

As shown in the figure above, a monolithic system has three modules: the employee module, the financial module, and the leave module. Suppose one operation needs to call interfaces in all three modules in order.

This operation is a single unit wrapped in one transaction: either everything succeeds, or everything fails and is rolled back. It is all or nothing, and in a monolith this poses no problem.

But once we split the monolithic system into a distributed system or a microservice architecture, transactions are no longer that simple.

First, let's take a look at the architecture diagram after splitting into a distributed system, as shown below:

 

 

The figure above shows the same operation in a distributed system. The employee module, financial module, and leave module have been split into the employee system, financial system, and leave system.

For example, a user performs an operation that first calls the employee system for preprocessing, and then calls the interfaces of the financial system and the leave system over HTTP or RPC for further processing. Each of these steps writes to a database.

The operations of these three systems need to be wrapped in the same distributed transaction, so that they either all succeed or all fail.

Completing one operation in a distributed system usually requires coordinated calls and communication between multiple systems, as in the example above.

The three subsystems (employee, financial, and leave) communicate over HTTP or RPC, rather than through in-process calls between modules as in a monolithic system. This is the biggest difference between a distributed system and a monolithic system.

Developers who don't usually pay much attention to distributed architecture may say: I'll just use Spring's @Transactional annotation and be done with it. Why all the fuss?

But there is an extremely important point here: a monolithic system runs in a single JVM process, whereas each system in a distributed system runs in its own JVM process.

So simply adding @Transactional is not enough: it can only control a transaction within a single JVM process, and it is powerless for transactions that span multiple JVM processes.

Several ways to implement distributed transactions

Now that we know what a distributed transaction is, how do we implement one? Below are several implementation schemes.

Reliable Message Eventual Consistency Scheme

The entire flow chart is as follows:

 

The general process of this scheme is as follows:

  1. System A first sends a prepared message to the MQ. If sending the prepared message fails, the operation is cancelled immediately and nothing further is executed.
  2. If the prepared message is sent successfully, system A executes its local transaction. If the local transaction fails, system A tells the MQ to roll back the message, and nothing further is executed.
  3. If system A's local transaction succeeds, system A sends a confirmation message to the MQ.
  4. What if system A's confirmation is delayed or never arrives? The MQ periodically polls all prepared messages and calls an interface that system A provides in advance to check whether the corresponding local transaction succeeded. If it did, system A sends the confirmation message to the MQ; if it did not, system A tells the MQ to roll back the message (and nothing further is executed).
  5. Once the message is confirmed, system B receives it and executes its own local transaction. If that succeeds, the whole transaction completes normally.
  6. What if system B's local transaction fails? The MQ redelivers the message and system B retries until it succeeds. If it still cannot succeed, raise an alarm so that someone can roll back or compensate manually.

The key to this scheme is that the MQ allows retrying again and again until the operation eventually succeeds. Failures are usually caused by network jitter or a momentary spike in database load, both of which are temporary. With this scheme, eventual consistency of the data is achieved in the vast majority of cases (say, 99.9%); when the remaining 0.1% goes wrong, the data is repaired manually.
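The six steps above can be sketched in memory, without a real message queue. Broker and its methods (prepare, confirm, rollback, checkPrepared, deliver) are hypothetical names, not the API of any MQ product, and the sketch caps retries at maxAttempts so that it terminates, reporting what is left for manual repair.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Predicate;

// In-memory sketch of the flow above; Broker and its methods are hypothetical, not a real MQ API.
class ReliableMessageSketch {

    enum State { PREPARED, CONFIRMED, ROLLED_BACK, DELIVERED }

    static class Broker {
        final Map<String, State> messages = new LinkedHashMap<>();

        void prepare(String id)  { messages.put(id, State.PREPARED); }     // step 1
        void rollback(String id) { messages.put(id, State.ROLLED_BACK); }  // step 2, failure path
        void confirm(String id)  { messages.put(id, State.CONFIRMED); }    // step 3

        /** Step 4: ask system A about every message still stuck in PREPARED. */
        void checkPrepared(Predicate<String> localTxSucceeded) {
            for (Map.Entry<String, State> e : messages.entrySet()) {
                if (e.getValue() == State.PREPARED) {
                    e.setValue(localTxSucceeded.test(e.getKey())
                            ? State.CONFIRMED : State.ROLLED_BACK);
                }
            }
        }

        /** Steps 5 and 6: deliver confirmed messages to system B, retrying on failure. */
        List<String> deliver(Consumer<String> systemB, int maxAttempts) {
            List<String> needManualRepair = new ArrayList<>();
            for (Map.Entry<String, State> e : messages.entrySet()) {
                if (e.getValue() != State.CONFIRMED) continue;
                boolean done = false;
                for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
                    try {
                        systemB.accept(e.getKey()); // system B's local transaction
                        done = true;
                    } catch (RuntimeException ex) {
                        // temporary failure: loop around and retry
                    }
                }
                if (done) {
                    e.setValue(State.DELIVERED);
                } else {
                    needManualRepair.add(e.getKey()); // raise an alarm, compensate manually
                }
            }
            return needManualRepair;
        }
    }

    public static void main(String[] args) {
        Broker mq = new Broker();
        mq.prepare("order-1"); mq.confirm("order-1");   // normal path
        mq.prepare("order-2");                          // A's confirmation never arrives
        mq.prepare("order-3"); mq.rollback("order-3");  // A's local transaction failed

        mq.checkPrepared(id -> id.equals("order-2"));   // A reports that order-2 did commit

        int[] calls = {0};
        List<String> failed = mq.deliver(id -> {
            if (calls[0]++ == 0) throw new IllegalStateException("network jitter");
        }, 3);
        System.out.println(mq.messages + " calls=" + calls[0] + " failed=" + failed);
    }
}
```

Because a message may be delivered more than once, system B's handler should be idempotent: processing the same message twice must not apply the change twice.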

Applicable scenarios: This scheme is widely used. Most Internet companies in China currently build on this idea.

Best Effort Notification Scheme

The entire flow chart is as follows:

 

The general process of this scheme:

  1. After system A executes its local transaction, it sends a message to the MQ.
  2. A dedicated best-effort notification service consumes the message from the MQ, records it in a database or puts it in an in-memory queue, and then calls system B's interface.
  3. If system B executes successfully, all is well. But what if system B fails? The best-effort notification service periodically calls system B again, up to N times, and gives up if it still fails.
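The retry-then-give-up behavior of step 3 can be sketched as a small loop. The class and method names are hypothetical, system B is modeled as a Predicate that returns true on success, and the waiting between attempts that a real service would do is left out.

```java
import java.util.function.Predicate;

// Sketch of a best-effort notifier; class and method names are hypothetical.
class BestEffortNotifierSketch {

    /**
     * Calls systemB with the message up to maxAttempts times and then gives up.
     * Returns the number of attempts used on success, or -1 if every attempt failed.
     */
    static int notifyWithRetries(String message, Predicate<String> systemB, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            boolean ok;
            try {
                ok = systemB.test(message);
            } catch (RuntimeException e) {
                ok = false; // treat an exception like a failed call
            }
            if (ok) {
                return attempt;
            }
            // A real service would wait here (fixed delay or backoff) before the next attempt.
        }
        return -1; // best effort exhausted; the failure is simply accepted
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical system B that recovers on the third call.
        int used = notifyWithRetries("order placed", msg -> ++calls[0] >= 3, 3);
        // Hypothetical system B that stays down.
        int gaveUp = notifyWithRetries("order placed", msg -> false, 3);
        System.out.println(used + " " + gaveUp);
    }
}
```

The only tuning knob here is maxAttempts, which is the N in the description above.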

The difference between this scheme and the reliable message eventual consistency scheme above:

The reliable message eventual consistency scheme guarantees that once system A's transaction completes, system B's transaction will eventually complete too, through continual (unbounded) retries.

The best-effort notification scheme is different. If system B's local transaction fails, the service retries N times and then stops, so system B's local transaction may never complete.

How hard "best effort" should try is up to you, and should be configured according to your business.

For example, in an e-commerce system, an SMS is sent after an order is placed to tell the user that the order succeeded. Suppose the order itself completes normally, but the SMS service has a temporary problem and sending still fails after 3 retries.

At that point we stop trying to send the SMS, because in this scenario 3 attempts count as "best effort".

In short: if the call succeeds within the configured number of retries, everyone is happy; once the maximum number of retries is exceeded, the service gives up and retries no more.

Applicable scenarios: This scheme is generally used for less important business operations, where success is a nice-to-have and failure does no real harm.

The e-commerce notification messages mentioned above are a good fit for the best-effort notification scheme.

TCC Strong Consistency Scheme

TCC stands for:

  • Try
  • Confirm (commit)
  • Cancel (roll back)

It is based on the idea of compensation and is divided into three phases:

  1. Try phase: check the resources of each service, and lock or reserve them;
  2. Confirm phase: perform the actual operation in each service;
  3. Cancel phase: if the business method of any service fails, compensate by rolling back the business logic that has already executed successfully.

Let me give you an example:

 

Take a transfer between two banks, which involves a distributed transaction across both of them. Implemented with TCC, the idea is:

  1. Try phase: first freeze the funds involved in both bank accounts so that no other operation can touch them;
  2. Confirm phase: perform the actual transfer, deducting the funds from account A and adding them to account B;
  3. Cancel phase: if the operation at either bank fails, roll back to compensate. For example, if account A has already been debited but crediting account B fails, the funds must be added back to account A.
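The three phases can be sketched in a single process. Everything here is hypothetical (Participant, Debit, Credit, execute), there is no network or crash recovery, and the sketch makes one simplifying assumption: Cancel only releases what Try reserved, because Confirm is not allowed to fail in this toy model.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified single-process TCC sketch; all names are hypothetical.
class TccSketch {

    interface Participant {
        boolean tryReserve(); // Try: check and reserve resources
        void confirm();       // Confirm: apply the real change
        void cancel();        // Cancel: release what Try reserved
    }

    static class Account {
        long balance;
        long frozen;
        boolean open = true;
        Account(long balance) { this.balance = balance; }
    }

    /** Debit side: freeze in Try, deduct in Confirm, unfreeze in Cancel. */
    static class Debit implements Participant {
        final Account acct;
        final long amount;
        Debit(Account acct, long amount) { this.acct = acct; this.amount = amount; }
        public boolean tryReserve() {
            if (acct.balance - acct.frozen < amount) return false; // not enough free funds
            acct.frozen += amount;
            return true;
        }
        public void confirm() { acct.frozen -= amount; acct.balance -= amount; }
        public void cancel()  { acct.frozen -= amount; }
    }

    /** Credit side: only checks the account in Try, adds the money in Confirm. */
    static class Credit implements Participant {
        final Account acct;
        final long amount;
        Credit(Account acct, long amount) { this.acct = acct; this.amount = amount; }
        public boolean tryReserve() { return acct.open; }
        public void confirm() { acct.balance += amount; }
        public void cancel()  { /* nothing was reserved */ }
    }

    /** Runs Try on everyone; confirms all if every Try succeeded, otherwise cancels those that did. */
    static boolean execute(List<Participant> participants) {
        List<Participant> reserved = new ArrayList<>();
        for (Participant p : participants) {
            if (p.tryReserve()) {
                reserved.add(p);
            } else {
                for (Participant r : reserved) r.cancel();
                return false;
            }
        }
        for (Participant p : participants) p.confirm();
        return true;
    }

    public static void main(String[] args) {
        Account a = new Account(5000);
        Account b = new Account(0);
        boolean ok = execute(Arrays.asList(new Debit(a, 1000), new Credit(b, 1000)));
        System.out.println(ok + " A=" + a.balance + " B=" + b.balance);

        b.open = false; // the second transfer fails in Try on B's side
        boolean ok2 = execute(Arrays.asList(new Debit(a, 1000), new Credit(b, 1000)));
        System.out.println(ok2 + " A=" + a.balance + " frozen=" + a.frozen + " B=" + b.balance);
    }
}
```

Even in this toy version, every participant needs three hand-written methods, which hints at how much compensation code a real TCC integration requires.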

Applicable scenarios: To be honest, few teams use this scheme, and we rarely use it ourselves, but it does have its uses.

Because rollback depends heavily on compensation code that you write yourself, the amount of compensation code becomes huge, which is very painful.

We generally use TCC for scenarios involving money, payments, and trades, where a distributed transaction must either succeed completely or be rolled back automatically, and the correctness of funds must be strictly guaranteed. No errors in funds are allowed.

Better-suited scenarios: use TCC only when you have very strict consistency requirements at the core of your system, with funds being the typical case. You have to write a lot of business logic yourself, checking whether each step of the transaction succeeded and executing the compensation or rollback code if it did not.

It is also best if each business operation executes quickly.

But honestly, avoid it where you can. Hand-written rollback and compensation logic is painful, and the business code becomes hard to maintain.

Summary

This article explained what a distributed transaction is and introduced three commonly used distributed transaction schemes. Beyond these, there are other schemes, such as two-phase commit (the XA scheme) and the local message table. But to be honest, few companies use those, and for reasons of space we won't cover them here.

 

Origin blog.csdn.net/qq_35860138/article/details/102621351