Distributed Transactions and Consistency Solutions for Distributed Systems (reprint)

Reprinted from: http://www.infoq.com/cn/articles/solution-of-distributed-system-transaction-consistency

 

In OLTP systems, many business scenarios require transactional consistency; the classic case is Bob transferring money to Smith. In traditional enterprise development, the system is often a single application that does not span multiple databases. We usually only need the data access technologies and frameworks specific to the development platform (such as Spring, JDBC, ADO.NET), combined with the transaction management of the relational database, to meet transactional requirements. Relational databases generally provide the ACID properties: Atomicity, Consistency, Isolation, Durability.

Large-scale Internet platforms are often composed of a series of distributed systems, and their development languages, platforms, and technology stacks are relatively complex. Especially in today's SOA and microservice architectures, a seemingly simple function may internally need to call multiple "services" and operate on multiple databases or shards, so the situation is often much more complicated. A single technique or solution can no longer cope with these complex scenarios.

Characteristics of Distributed Systems

Readers who have studied distributed systems may have heard of the "CAP theorem" and "BASE theory". Coincidentally, in chemistry an acid is ACID and a base is BASE. The author will not explain these concepts in detail here; interested readers can consult the relevant references. The CAP theorem is as follows:

 

In a distributed system, it is impossible to satisfy "consistency", "availability", and "partition tolerance" from the "CAP theorem" at the same time, which is even harder than finding a partner in real life who is "tall, rich, and handsome" or "fair-skinned, rich, and beautiful". In the vast majority of Internet scenarios, strong consistency has to be sacrificed in exchange for high availability. The system often only needs to guarantee "eventual consistency", as long as the time to reach it is within the range acceptable to users.

Distributed transaction

When it comes to distributed systems, it is inevitable to mention distributed transactions. To understand distributed transactions, we have to first introduce the two-phase commit protocol. Let's take a simple but imprecise example to illustrate:

In the first phase, Teacher Zhang, acting as the "coordinator", sends a WeChat message to Xiaoqiang and Xiaoming (the participants, or nodes), asking them to gather at the school gate at 8 o'clock tomorrow to climb the mountain together, and then waits for their replies.

In the second phase, if both Xiaoqiang and Xiaoming reply "no problem", everyone shows up as agreed. If either of them replies "I'm not free tomorrow", Teacher Zhang immediately notifies both Xiaoqiang and Xiaoming that the mountain climbing activity is canceled.

Attentive readers will notice that many things can go wrong in this process. If Xiaoqiang never looks at his phone, Teacher Zhang keeps waiting for an answer. Xiaoming may have prepared all his climbing equipment at home but is still waiting for Teacher Zhang's confirmation. Worse, if Xiaoqiang has still not replied by 8 o'clock tomorrow, even if that counts as a "timeout", should Xiaoming go to the mountain or not?

This is the main drawback of the two-phase commit protocol: participants can block indefinitely while waiting for the coordinator or for each other. The industry later introduced the three-phase commit protocol to address this type of problem.
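The two phases described above can be sketched in a few lines of Python. This is only a minimal in-memory illustration: the `Participant` class and the `two_phase_commit` function are hypothetical names, not part of any real transaction manager, and the sketch deliberately has no timeouts, which is exactly the weakness the story points out.

```python
from typing import List


class Participant:
    """Hypothetical participant: votes in phase one, obeys the decision in phase two."""

    def __init__(self, name: str, can_commit: bool) -> None:
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self) -> bool:
        # Phase one: vote yes or no, then wait for the coordinator's decision.
        self.state = "PREPARED" if self.can_commit else "REFUSED"
        return self.can_commit

    def commit(self) -> None:
        self.state = "COMMITTED"

    def rollback(self) -> None:
        self.state = "ROLLED_BACK"


def two_phase_commit(participants: List[Participant]) -> bool:
    """Coordinator: commit only if every participant votes yes."""
    # Phase one: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(votes)
    # Phase two: broadcast the global decision to everyone.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.rollback()
    return decision
```

If a call to `prepare()` never returned, the coordinator in this sketch would wait forever, just as Teacher Zhang does when Xiaoqiang does not look at his phone.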

The two-phase commit protocol is widely used and implemented in mainstream development platforms and database products. Let's introduce the DTP model diagram provided by the X/Open organization:

The XA protocol refers to the interface between the TM (Transaction Manager) and the RM (Resource Manager). Today's mainstream relational database products all implement the XA interface. JTA (Java Transaction API) conforms to the X/Open DTP model, and the XA protocol is also used between its transaction manager and resource managers. In essence, distributed transactions are realized by means of the two-phase commit protocol. Let's take a look at the model diagrams of XA transaction success and failure:

On the Java EE platform, mainstream commercial application servers such as WebLogic and WebSphere provide JTA implementations and support. Tomcat does not implement it (in fact, the author does not consider Tomcat a Java EE application server), so it needs third-party frameworks such as JOTM and Atomikos, both of which support Spring transaction integration.

On the Windows .NET platform, you can program against the TransactionScope API in ADO.NET, and the MSDTC service in the Windows operating system must also be configured and running. If your database is MySQL and it is deployed on Linux, distributed transactions are not supported in this setup. Due to space constraints this is not expanded on here; interested readers can consult the relevant materials and experiment on their own.

Summary: This approach is not too difficult to implement and suits traditional monolithic applications in which a single method operates on multiple databases. However, distributed transactions have a relatively large performance cost, so it is not suitable for scenarios with high concurrency and high performance requirements.

Provide a rollback interface

In a service-oriented architecture, function X needs to coordinate back-end atomic services A, B, or even more. So the question is: what if one of the calls to A and B fails?

The author often encounters this problem at work and usually provides a BFF layer to coordinate the calls to services A and B. If results must be returned synchronously, the author tries to make the calls serially. If the call to A fails, B is not called blindly. If the call to A succeeds and the call to B fails, the BFF layer tries to roll back the call to A.

Of course, we do not always have to provide a separate, dedicated rollback interface; it can sometimes be implemented neatly by passing parameters.

In this case, we try to call the service that can provide a rollback interface first. For example:

One of our forum websites rewards users with 5 points after a successful login each day, but points and users are two independent subsystem services backed by different DBs, which makes this harder to control. Solution:

  1. Put the service calls for logging in and adding points in one local method in the BFF layer.
  2. When the user calls the login interface, first add the points, and perform the login only after the points have been added successfully.
  3. If the login succeeds, all is well and the points have also been added. If the login fails, call the rollback interface for the point addition (deduct the points).
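The three steps above might look like the following sketch. `PointsService`, `UserService`, and `login_with_reward` are hypothetical in-memory stand-ins for the two subsystems and the BFF method; in a real system each call would be a remote request that can itself fail or time out.

```python
class PointsService:
    """Hypothetical in-memory stand-in for the points subsystem."""

    def __init__(self) -> None:
        self.points = {}

    def add(self, user: str, amount: int) -> None:
        self.points[user] = self.points.get(user, 0) + amount

    def deduct(self, user: str, amount: int) -> None:
        # Serves as the rollback interface for add().
        self.points[user] = self.points.get(user, 0) - amount


class UserService:
    """Hypothetical in-memory stand-in for the user subsystem."""

    def __init__(self, valid_passwords: dict) -> None:
        self.valid_passwords = valid_passwords

    def login(self, user: str, password: str) -> bool:
        return self.valid_passwords.get(user) == password


def login_with_reward(users, points, user: str, password: str, reward: int = 5) -> bool:
    """BFF method: add points first, then log in; compensate if login fails."""
    points.add(user, reward)
    try:
        ok = users.login(user, password)
    except Exception:
        points.deduct(user, reward)  # Login call blew up: undo the reward.
        raise
    if not ok:
        points.deduct(user, reward)  # Login rejected: undo the reward.
    return ok
```

Note that the compensation call itself can fail, which is one reason this style becomes costly as the number of serial services grows.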

Summary: This approach has many disadvantages and is usually not recommended for complex scenarios. It is only reasonable when the scenario is very simple, rollback is easy to provide, and there are very few dependent services.

This implementation leads to a large amount of code and tight coupling. It is also very limited, because many business operations cannot be rolled back easily. If many services are called serially, the cost of rollback becomes too high.

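Local message table

This approach originated at eBay and later became widely used in the industry after being evangelized by companies such as Alipay. The basic design idea is to split a remote distributed transaction into a series of local transactions. If performance and design elegance are not a concern, it can be implemented with nothing more than a table in a relational database.

Let's describe it with the classic example of an interbank transfer.

In the first step, 10,000 is debited and, within the same local transaction, a voucher message is inserted into the message table.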
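A minimal sketch of this first step, using Python's built-in `sqlite3` module as a stand-in for the payer bank's relational database. The table names (`account`, `message`), the column layout, and the function names are assumptions made for illustration only.

```python
import sqlite3
import uuid


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: one table for balances, one for outgoing messages.
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute(
        "CREATE TABLE message (msg_id TEXT PRIMARY KEY, payee TEXT NOT NULL, "
        "amount INTEGER NOT NULL, status TEXT NOT NULL)"
    )


def debit_with_message(conn: sqlite3.Connection, payer: str, payee: str, amount: int) -> str:
    """Debit the payer and record the voucher message in one local transaction."""
    msg_id = str(uuid.uuid4())
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ? AND balance >= ?",
            (amount, payer, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient funds or unknown account")
        conn.execute(
            "INSERT INTO message (msg_id, payee, amount, status) VALUES (?, ?, ?, 'NEW')",
            (msg_id, payee, amount),
        )
    return msg_id
```

Because both statements share one transaction, there is never a debit without a message or a message without a debit.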

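In the second step, the other bank is notified to add 10,000 to the payee's account. So the question is: how do we notify the other party?

Two approaches are commonly used:

  1. Use a low-latency MQ: the other party subscribes to and listens for messages, and an event is triggered automatically when a message arrives.
  2. Use scheduled polling to scan the message table for data.

Each approach has pros and cons. Relying on the MQ alone can lead to failed notifications, while polling too frequently is inefficient (90% of the scans find nothing to do). So we generally combine the two.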
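One way to combine the two is a scheduled job that scans the message table and pushes anything still undelivered to the MQ. The sketch below assumes a hypothetical `message` table with `msg_id`, `payee`, `amount`, and `status` columns, and takes the MQ client as a plain `publish` callable so that no particular MQ product's API is implied.

```python
import sqlite3
from typing import Callable


def relay_pending(conn: sqlite3.Connection, publish: Callable[[str, str, int], None]) -> int:
    """Scan for undelivered messages, publish each one, and mark it SENT.

    Returns the number of messages delivered in this scan.
    """
    rows = conn.execute(
        "SELECT msg_id, payee, amount FROM message WHERE status = 'NEW'"
    ).fetchall()
    sent = 0
    for msg_id, payee, amount in rows:
        try:
            publish(msg_id, payee, amount)
        except Exception:
            continue  # Leave the row as NEW so the next scan retries it.
        with conn:
            conn.execute("UPDATE message SET status = 'SENT' WHERE msg_id = ?", (msg_id,))
        sent += 1
    return sent
```

If the process crashes after `publish` succeeds but before the status update commits, the next scan delivers the same message again, so the receiving side has to tolerate duplicates.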

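With notification solved, a new problem appears. What if a message is consumed more than once and extra money is added to the user's account? Wouldn't the consequences be serious?

On reflection, the message consumer can also record consumption status in a "consumption status table". Before executing the "credit" operation, it checks whether the message (identified by a unique ID) has already been consumed; after consumption completes, it updates the "consumption status table" under local transaction control. This avoids the duplicate consumption problem.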
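A sketch of the consumer side, again with `sqlite3` standing in for the payee bank's database. The `consumed` table and the function names are hypothetical. As a small variation on the check-then-update wording above, this version lets the primary key on `msg_id` do the duplicate check, so the check and the credit cannot be separated by a race.

```python
import sqlite3


def init_consumer_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema: balances plus a table of already-processed message IDs.
    conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
    conn.execute("CREATE TABLE consumed (msg_id TEXT PRIMARY KEY)")


def credit_once(conn: sqlite3.Connection, msg_id: str, payee: str, amount: int) -> bool:
    """Credit the payee unless msg_id was already processed.

    Returns True if the credit was applied, False for a duplicate message.
    """
    try:
        with conn:
            # A duplicate msg_id violates the primary key and aborts the transaction.
            conn.execute("INSERT INTO consumed (msg_id) VALUES (?)", (msg_id,))
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?",
                (amount, payee),
            )
    except sqlite3.IntegrityError:
        return False
    return True
```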

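Summary: The approach above is a classic implementation. It largely avoids distributed transactions and achieves "eventual consistency". However, relational databases have throughput and performance bottlenecks, and frequent reads and writes of messages put pressure on the database. So in truly high-concurrency scenarios, this solution also has bottlenecks and limits.

MQ (non-transactional messages)

When using an MQ product that does not support transactional messages, it is usually hard to manage the business operation and the MQ operation within one local transaction scope. Put plainly, taking the "interbank transfer" above as an example, it is hard to guarantee that delivering the message to the MQ will succeed after the debit completes. Consistency therefore seems hard to guarantee.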

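Let's first analyze the message producer side. The producer updates the database and then delivers the message to the MQ inside one local transaction, and any exception raised by the delivery is thrown outward.

With that arrangement, the possible cases are:

  1. The database operation succeeds and the delivery to the MQ also succeeds; everyone is happy.
  2. The database operation fails, so no message is delivered to the MQ.
  3. The database operation succeeds, but the delivery to the MQ fails and throws an exception; the database update that was just executed is rolled back.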
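The three cases can be reproduced with a short sketch. `sqlite3` stands in for the business database, the `account` table is a hypothetical schema, and `send` is a plain callable standing in for whatever MQ client is used.

```python
import sqlite3
from typing import Callable


def debit_and_publish(
    conn: sqlite3.Connection,
    send: Callable[[dict], None],
    payer: str,
    payee: str,
    amount: int,
) -> None:
    """Debit the payer, then deliver the message, inside one local transaction."""
    # "with conn" commits if the block succeeds and rolls back if it raises.
    with conn:
        cur = conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ? AND balance >= ?",
            (amount, payer, amount),
        )
        if cur.rowcount != 1:
            raise ValueError("insufficient funds or unknown account")  # case 2
        # If send() raises, the exception propagates and the debit is rolled back (case 3).
        send({"payer": payer, "payee": payee, "amount": amount})
```

One caveat worth keeping in mind: if `send` times out after the broker has actually accepted the message, or the commit fails after a successful send, the message exists without a matching debit. The sketch does not handle that window.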

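From the cases above, there seems to be no big problem. Now let's analyze the problems on the consumer side:

  1. After a message is dequeued, the consumer's corresponding business operation must succeed. If the business operation fails, the message must not expire or be lost. The message and the business operation need to stay consistent.
  2. Avoid duplicate consumption as far as possible. If a message is consumed more than once, the business result must not be affected.

How do we keep the message consistent with the business operation, without loss?

Mainstream MQ products all support message persistence. If the consumer crashes or consumption fails, a retry mechanism can be applied (some MQs let you customize the number of retries).

How do we avoid the problems caused by duplicate consumption?

  1. Make the business service interface called by the consumer idempotent.
  2. Record consumption status in a consumption log or a similar status table so that duplicates can be detected (it is recommended to implement this in the business layer rather than rely on the MQ product to provide the feature).

Summary: This approach is fairly common, and its performance and throughput are better than the relational-database message table solution. If the MQ itself and the business services are both highly available, it can in theory satisfy most business scenarios. However, without thorough testing, using it directly in payment and trading business is not recommended.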

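MQ (transactional messages)

For example, when Bob transfers money to Smith, should we send the message first or perform the debit first?

Either order can go wrong. If the message is sent first and the debit then fails, Smith's account gains money out of nowhere. Conversely, if the debit is performed first and the message sent afterwards, the debit may succeed while the message is never delivered, and Smith does not receive the money. Besides the exception-catching and rollback approach described above, is there another way?

Let's take Alibaba's RocketMQ middleware as an example and analyze its design and implementation.

In the first stage, RocketMQ sends a Prepared message and obtains the message's address. In the second stage, the local transaction is executed. In the third stage, the message is accessed via the address obtained in the first stage and its state is modified. Attentive readers may spot another problem: what if sending the confirmation message fails? RocketMQ periodically scans the transactional messages in the message cluster. When it finds a Prepared message, it asks the message sender: was Bob's money actually deducted or not? Should the message be rolled back, or should the confirmation be sent again? RocketMQ decides whether to roll back or continue sending the confirmation according to the policy set by the sender. This guarantees that message sending and the local transaction succeed or fail together, as shown in the following figure: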
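The three stages and the periodic check can be imitated with a toy in-memory broker. This is only a sketch of the idea and not the RocketMQ API: `ToyBroker`, `run_transfer`, and the `confirm_lost` flag (which simulates a lost confirmation) are all hypothetical.

```python
from typing import Callable, Dict, List


class ToyBroker:
    """In-memory toy broker that holds prepared messages until they are confirmed."""

    def __init__(self) -> None:
        self.prepared: Dict[str, dict] = {}
        self.visible: List[dict] = []

    def send_prepared(self, msg_id: str, body: dict) -> None:
        # Stage 1: the message is stored but not yet visible to consumers.
        self.prepared[msg_id] = body

    def confirm(self, msg_id: str, commit: bool) -> None:
        # Stage 3: make the message visible on commit, or drop it on rollback.
        body = self.prepared.pop(msg_id, None)
        if body is not None and commit:
            self.visible.append(body)

    def check_back(self, local_tx_committed: Callable[[str], bool]) -> None:
        # Periodic scan: ask the sender about every message still in Prepared state.
        for msg_id in list(self.prepared):
            self.confirm(msg_id, local_tx_committed(msg_id))


def run_transfer(
    broker: ToyBroker,
    outcomes: Dict[str, bool],
    msg_id: str,
    body: dict,
    local_tx: Callable[[], bool],
    confirm_lost: bool = False,
) -> None:
    broker.send_prepared(msg_id, body)   # Stage 1: send the Prepared message.
    outcomes[msg_id] = local_tx()        # Stage 2: run the local transaction, record the result.
    if not confirm_lost:                 # Stage 3: confirm, unless the confirmation is lost.
        broker.confirm(msg_id, outcomes[msg_id])
```

Here `outcomes` plays the role of the sender's record of whether the debit really happened; `check_back` consults it to resolve messages whose confirmation never arrived.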

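Summary: As far as the author knows, nearly all well-known e-commerce platforms and Internet companies use a similar design to achieve "eventual consistency". This approach suits a wide range of business scenarios and is fairly reliable. However, it is technically difficult to implement. At the time of writing, the mainstream open-source MQs (ActiveMQ, RabbitMQ, Kafka) did not support transactional messages, so secondary development or a new implementation was required. Regrettably, the transactional message part of RocketMQ's code was also not open source at that time and had to be implemented by the user.

Other compensation methods

Anyone who has integrated Alipay's transaction interface knows that in Alipay's callback page and interface we usually decrypt the parameters and then call the service that updates the transaction status, marking the order as paid. Alipay stops sending callback requests only when our callback page outputs the string "success" or a status code indicating that the business was processed successfully. Otherwise, Alipay sends the callback request to the merchant again at intervals until the success marker is output.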
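The retry-until-acknowledged behaviour can be sketched from the notifying side as follows. This is not Alipay's implementation: the function name, the `delays` schedule, and the injectable `sleep` parameter are assumptions for illustration, and unlike the description above the sketch gives up once the schedule is exhausted.

```python
import time
from typing import Callable, Iterable


def notify_until_success(
    callback: Callable[[], str],
    delays: Iterable[float],
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call the merchant callback until it answers "success" or the schedule runs out.

    One attempt is made per entry in delays; after a failed attempt the
    notifier waits that many seconds before trying again.
    """
    for delay in delays:
        try:
            if callback().strip() == "success":
                return True
        except Exception:
            pass  # Treat network errors and timeouts as "not acknowledged".
        sleep(delay)
    return False
```

The merchant's handler must therefore be idempotent, because the same notification can arrive several times.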

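This is a typical example of compensation, very similar to the retry compensation mechanisms of some MQs.

In mature systems, high-priority services and interfaces usually have high overall availability. When a business operation fails because of a transient network fault or a call timeout, this kind of retry mechanism is very effective.

Of course, consider a more extreme scenario: if the system itself has a bug or the program logic is wrong, retrying 10,000 times will not help. Wouldn't that lead to the tragedy of "the customer has clearly paid, but the order shows as unpaid and is not shipped"?

To make the transaction system more reliable, we usually add detailed logging to high-priority service code such as transactions, and an email notification is sent as soon as a fatal exception of this kind occurs inside the system. In addition, a scheduled background job scans and analyzes these logs; when it detects such a special case, it tries to compensate programmatically and notifies the relevant people by email.

In some special cases there is also "manual compensation", which is the last line of defense.

Conclusion

The author has roughly summarized the design ideas, strengths, and weaknesses of the solutions above, and readers should now have some understanding of them. Transaction consistency in distributed systems is inherently a hard technical problem, and no simple, perfect solution currently handles every scenario. The choice still depends on the user's specific business scenario.

About the author

Ding Lang currently works at a vertical e-commerce platform as a technical architect. He focuses on high-concurrency, high-availability architecture design and has in-depth research and extensive practical experience in service-oriented systems, database and table sharding, and performance tuning. He is passionate about technical research and sharing.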
