Solving distributed transactions with a Kafka message queue (reliable-message eventual consistency: the local message service pattern) (repost)

Introduction to the Microservice Framework Spring Cloud, Part 1: Implementing Distributed Transactions Using Events and Message Queues

Unlike in a monolithic application, transactions are hard to perform in a distributed environment: there are usually multiple data sources, and local database transactions alone cannot guarantee consistency across them. A two-phase or three-phase commit protocol can complete a distributed transaction in this case, but its performance is generally poor because the transaction manager must wait repeatedly on multiple data sources. There is another approach that also solves the distributed transaction problem, with better performance, and it is what this article introduces: implementing distributed transactions with events, local transactions, and a message queue.

Let's start with a simple example. Almost every Internet application has a user registration function. In this example, registration consists of two steps:
1. Save the user information once registration succeeds.
2. Issue the user a voucher, to encourage the user to consume.
In a monolithic application this is easy to implement: within one local transaction, insert a record into the user table, insert a record into the voucher table, and commit. But if our application is built with microservices, users and vouchers may be two independent services, each with its own application and database, and there is no way to simply use a local transaction to guarantee the atomicity of the two operations. Let's see how to meet this requirement with an event mechanism and a message queue. (The message queue I use here is Kafka; the principle also applies to other queues such as ActiveMQ/RabbitMQ.)
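For contrast, here is a minimal sketch of the monolithic version, where a single local transaction covers both inserts. This uses sqlite as a stand-in database; the table and column names are illustrative, not from the article.

```python
import sqlite3

# In a monolith, both inserts live in ONE local database transaction,
# so the user and the voucher are saved atomically (or not at all).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("CREATE TABLE vouchers (owner TEXT, amount INTEGER)")

def register_with_voucher(name):
    with conn:  # opens a transaction; commits on success, rolls back on error
        conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
        conn.execute("INSERT INTO vouchers (owner, amount) VALUES (?, 10)", (name,))
```

Once users and vouchers move into separate services with separate databases, this single `with conn:` block is exactly what we lose.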

We create an event for the user registration operation, called the user-created event (USER_CREATED). After the user service successfully saves the user record, it sends a user-created event to the message queue. The voucher service listens for user-created events; once it receives one, it creates a voucher for that user in its own database. These steps look simple and intuitive, but how do we guarantee the atomicity of the transaction? Consider the following two scenarios:
1. The user service crashes after saving the user record but before sending the message to the message queue. How do we guarantee that the user-created event still reaches the message queue?
2. The voucher service crashes after receiving the user-created event but before it has time to process it. How is the earlier user-created event consumed after restart?
The essence of both problems is: how to make writing to the database and writing to the message queue a single atomic operation. Setting 2PC aside, we can solve this problem with an event table. Below is the class diagram.

EventPublish is a table that records events to be published. Its fields:
id: a globally unique ID generated when the event is created, such as a UUID.
status: the event status, an enumeration. For now there are only two statuses: to be published (NEW) and published (PUBLISHED).
payload: the event content. We serialize the event content to JSON and store it in this field.
eventType: the event type, an enumeration. Every event has a type; for example, the user creation USER_CREATED mentioned earlier is an event type.
EventProcess is used to record events to be processed. Its fields are basically the same as EventPublish.
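The two tables in the class diagram can be sketched as follows. This is a minimal Python model of the fields the article describes; only the field names and status values come from the article, the rest is illustrative.

```python
from dataclasses import dataclass
from enum import Enum

class PublishStatus(Enum):
    NEW = "NEW"              # to be published
    PUBLISHED = "PUBLISHED"  # already sent to the queue

class ProcessStatus(Enum):
    NEW = "NEW"              # received, not yet handled
    PROCESSED = "PROCESSED"  # handled by the callback

@dataclass
class EventPublish:
    id: str                  # globally unique, e.g. a UUID
    status: PublishStatus
    event_type: str          # e.g. "USER_CREATED"
    payload: str             # event content serialized as JSON

@dataclass
class EventProcess:
    id: str                  # same id as the published event
    status: ProcessStatus
    event_type: str
    payload: str
```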

Let's first look at how events are published. Below is the sequence diagram of the user service publishing a user-created event:
1. After receiving the user's request, the user service opens a transaction, inserts a record into the user table, inserts a record with status NEW into the EventPublish table (its payload holds the event content), and commits the transaction.
2. A timer in the user service opens a transaction, then queries EventPublish for records with status NEW. For each record found, it takes the payload and publishes the message to the corresponding Kafka topic.
After the send succeeds, it updates the EventPublish status in the database to PUBLISHED and commits the transaction.
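The publishing side can be sketched as below: a minimal in-memory model, with a dict standing in for the database rows and a list standing in for the Kafka topic. The function names and the `10`-unit voucher elsewhere are my own illustrations, not from the article.

```python
import json
import uuid

# In-memory stand-ins for the user service's database and for Kafka.
user_table = []
event_publish = []   # rows: {"id", "status", "event_type", "payload"}
kafka_topic = []     # messages sent to the USER_CREATED topic

def register_user(name):
    """Step 1: one local transaction saves the user row AND a NEW event row."""
    user_table.append({"name": name})
    event_publish.append({
        "id": str(uuid.uuid4()),          # globally unique event id
        "status": "NEW",
        "event_type": "USER_CREATED",
        "payload": json.dumps({"name": name}),
    })

def publish_pending_events():
    """Step 2: the timer sends each NEW event to Kafka, then marks it PUBLISHED."""
    for row in event_publish:
        if row["status"] == "NEW":
            kafka_topic.append({"id": row["id"], "payload": row["payload"]})
            row["status"] = "PUBLISHED"
```

Because step 1 writes the user and the event in the same local transaction, a crash between the two steps only leaves NEW rows behind, which the timer re-publishes after restart.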

Below is the sequence diagram of the voucher service handling the user-created event:
1. The voucher service receives the user-created event from Kafka (in fact the voucher service pulls the message itself; we ignore the message-queue implementation details here) and inserts a record with status NEW into the EventProcess table, with the payload holding the event content. If the save succeeds, it acknowledges the message back to Kafka.
2. A timer in the voucher service opens a transaction, then queries EventProcess for records with status NEW. For each record found, it takes the payload and hands it to the event callback handler, which here simply creates the voucher record. After successful processing it updates the EventProcess status in the database to PROCESSED, then commits the transaction.
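The consuming side follows the same two-step pattern. Another minimal in-memory sketch; the callback here just creates a voucher row, and the voucher amount is an assumption for illustration.

```python
import json

# In-memory stand-ins for the voucher service's database.
event_process = []   # rows: {"id", "status", "payload"}
voucher_table = []

def receive_event(message):
    """Step 1: on receipt from Kafka, persist the event with status NEW, then ack."""
    event_process.append({
        "id": message["id"],
        "status": "NEW",
        "payload": message["payload"],
    })
    return True  # acknowledgement back to Kafka only after the save succeeds

def process_pending_events():
    """Step 2: the timer hands NEW events to the callback, then marks them PROCESSED."""
    for row in event_process:
        if row["status"] == "NEW":
            user = json.loads(row["payload"])
            voucher_table.append({"owner": user["name"], "amount": 10})
            row["status"] = "PROCESSED"
```

If the service crashes after `receive_event` but before `process_pending_events` runs, the NEW rows survive in the database and are processed after restart; if it crashes before the ack, Kafka redelivers.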

Now back to the two questions raised earlier:
1. The user service crashes after saving the user record but before sending the message to the message queue. How do we guarantee that the user-created event still reaches the message queue?
According to the publishing sequence diagram, we split event creation and event publishing into two steps. If the event was created but the service crashed while publishing, the timer re-publishes the not-yet-published events after restart. If the service crashed while creating the event, then because event creation and the business operation share one database transaction, the business operation fails too, and the consistency of the database state is preserved.
2. The voucher service crashes after receiving the user-created event but before it has time to process it. How is the earlier user-created event consumed after restart?
According to the processing sequence diagram, we split event receipt and event processing into two steps. If the event was received but the service crashed while processing it, the timer re-processes the not-yet-processed events after restart. If the service crashed while receiving the event, Kafka redelivers it to the service.

In this way, we guarantee eventual consistency across multiple data sources without 2PC.
Compared with synchronous transaction schemes such as 2PC/3PC, this asynchronous approach has the advantages typical of asynchronous systems:
1. High transaction throughput, since there is no waiting for other data sources to respond.
2. Good fault tolerance: service B can even be offline while service A publishes its events.
Disadvantages:
1. Programming and debugging are more complex.
2. More intermediate states become visible. In the example above, after the user service has saved the user and published the event but before the voucher service has processed it, a user who logs in will find they have no voucher yet. Some businesses can tolerate this; others cannot. So think it through before development.

In addition, the flow above can be improved in a few places during implementation:
1. When the timer updates EventPublish statuses to PUBLISHED, it can batch-update multiple records at once.
2. When the timer queries EventProcess and hands records to the event callback handler, it can use a thread pool for asynchronous processing to shorten the EventProcess cycle.
3. When saving EventPublish and EventProcess records, also save them to Redis so that subsequent operations can work on the Redis copy; but handle the possible inconsistency between cache and database carefully.
4. Because Kafka may redeliver messages, inserting into EventProcess on receipt may fail with a primary-key conflict (duplicate messages carry the same event id); in that case simply discard the message.
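Point 4 above is the idempotency guard. A minimal sketch, using a set to stand in for the EventProcess primary-key constraint (in a real database the duplicate insert itself would raise the conflict):

```python
processed_ids = set()  # stands in for the EventProcess primary key

def receive_event_idempotent(message):
    """Discard redelivered messages: the same event id means a duplicate."""
    if message["id"] in processed_ids:
        return "DISCARDED"  # primary-key conflict in the real table
    processed_ids.add(message["id"])
    return "SAVED"
```

This works because the event id is generated once by the publisher and travels with every redelivery, so duplicates are detectable by the consumer.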
