Three scenarios where you need to consider consistency issues

1. Introduction

In distributed system development, many scenarios require guaranteeing data consistency, for example receiving MQ messages, receiving HTTP requests, and internal business processing. If you don't know these scenarios or how to deal with them, please continue reading.

2. Receiving MQ messages

Receiving MQ messages is a common scenario in distributed system development. An external system pushes MQ messages to our system, and our system runs its business logic after receiving each message. On the surface this is a very simple flow, but once data consistency is involved, it is not so simple.

Why is it not so simple? Let's take a look at the following scenarios:

2.1 Ack the message first, then process the business logic

ack();
// Process the business logic

Acking the message first and then processing the business logic causes no problems in the ideal case. You may wonder: if there is no problem, what else is there to consider? Note that "no problem" holds only under ideal conditions. If a call to an external interface fails or the database goes down during business processing, the message has already been acked and will not be redelivered, so the message is lost and the data becomes inconsistent.

2.2 Process the business logic first, then ack the message

// Process the business logic
ack();

Since acking the message first does not work, is processing the business logic first and then acking the message free of problems? As far as message loss goes, yes: if business processing fails, the message is not acked and will be re-consumed, so no data is lost. But this introduces another problem: idempotency. The same message may be consumed more than once, and if idempotency is not handled properly, the data will still become inconsistent.

Processing first and acking afterwards also causes a second problem. If the business system has a bug, the message is never acked, so it is redelivered again and again and message processing enters an infinite loop.

So neither approach works. Is there no solution? Of course there is; let's go through the options one by one.

2.3 Option 1: message ack mechanism + database + scheduled task

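try {
    try {
        // Look up the message by its unique ID to see whether it has already been processed;
        // if not, run the business logic; if so, do nothing
    } catch (Exception e) {
        // Insert the message that failed processing into the database
    }
    ack();
} catch (Exception e) {
    unack();
}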

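The general idea is to use the message's unique ID to determine whether the message has already been processed. If it has not, process it: on success, ack the message; on failure, store the message in the database. If storing it in the database also fails, unack the message so that it is redelivered to the message server and consumed again, until the database recovers. Messages that were stored after a processing failure can be retried by a scheduled task.

This option solves not only the idempotency problem from 2.2 but also the problem of a business bug sending message processing into an infinite loop (by limiting the number of retries). It still has one problem that is outside the scope of this article: message backlog.

2.4 Option 2: message ack mechanism + database + scheduled task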
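The flow of option 1 can be sketched in runnable form roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: the in-memory collections stand in for database tables, the returned `AckResult` stands in for the broker's ack/unack calls, and every name (`SchemeOneConsumer`, `MAX_RETRIES`, `deadLetters`, `databaseUp`) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch of option 1; the collections stand in for database tables.
class SchemeOneConsumer {
    static final class Message {
        final String id;
        final String body;
        Message(String id, String body) { this.id = id; this.body = body; }
    }

    enum AckResult { ACK, UNACK } // stands in for the broker's ack()/unack()

    static final int MAX_RETRIES = 3; // assumed limit; stops endless reprocessing

    final Set<String> processedIds = new HashSet<>();            // "processed" table
    final Map<String, Message> failedMessages = new HashMap<>(); // "failed" table
    final Map<String, Integer> retryCounts = new HashMap<>();
    final List<Message> deadLetters = new ArrayList<>();         // given up, needs a human
    final Consumer<Message> businessLogic;
    boolean databaseUp = true; // flip to false to simulate a database outage

    SchemeOneConsumer(Consumer<Message> businessLogic) {
        this.businessLogic = businessLogic;
    }

    AckResult onMessage(Message msg) {
        try {
            try {
                if (!alreadyProcessed(msg.id)) { // idempotency check by unique ID
                    businessLogic.accept(msg);
                    processedIds.add(msg.id);
                }
            } catch (Exception e) {
                saveFailed(msg); // throws if the database is down as well
            }
            return AckResult.ACK;
        } catch (Exception e) {
            return AckResult.UNACK; // redeliver until the database recovers
        }
    }

    boolean alreadyProcessed(String id) {
        requireDatabase();
        return processedIds.contains(id);
    }

    void saveFailed(Message msg) {
        requireDatabase();
        failedMessages.put(msg.id, msg);
    }

    void requireDatabase() {
        if (!databaseUp) throw new IllegalStateException("database unavailable");
    }

    // Intended to be called periodically by a scheduled task.
    void retryFailed() {
        for (Message msg : new ArrayList<>(failedMessages.values())) {
            int attempts = retryCounts.merge(msg.id, 1, Integer::sum);
            try {
                businessLogic.accept(msg);
                processedIds.add(msg.id);
                failedMessages.remove(msg.id);
            } catch (Exception e) {
                if (attempts >= MAX_RETRIES) { // retry limit reached: stop retrying
                    failedMessages.remove(msg.id);
                    deadLetters.add(msg);
                }
            }
        }
    }
}
```

In this sketch a message that keeps failing is moved to `deadLetters` after `MAX_RETRIES` attempts instead of being retried forever, which is one possible way to apply the retry limit mentioned above.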

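try {
    // Look up the message by its unique ID; if it does not exist, insert it into the database;
    // if it already exists, do nothing
    ack();
    // Process the business logic asynchronously
} catch (Exception e) {
    unack();
}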
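Read literally, the pseudocode for option 2 stores every new message first, acks it, and only then processes it in the background. A minimal runnable sketch of that reading follows. The maps stand in for a message table, `AckResult` stands in for ack/unack, and all names (`SchemeTwoConsumer`, `Status`, `retryFailed`, `stopAndWait`) are hypothetical; the scheduled retry step is an assumption carried over from the option's title.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch of option 2; the maps stand in for a message table.
class SchemeTwoConsumer {
    enum Status { PENDING, DONE, FAILED }
    enum AckResult { ACK, UNACK } // stands in for the broker's ack()/unack()

    final Map<String, String> bodies = new ConcurrentHashMap<>();   // id -> message body
    final Map<String, Status> statuses = new ConcurrentHashMap<>(); // id -> processing state
    final ExecutorService worker = Executors.newSingleThreadExecutor();
    final Consumer<String> businessLogic;
    volatile boolean databaseUp = true; // flip to false to simulate an outage

    SchemeTwoConsumer(Consumer<String> businessLogic) {
        this.businessLogic = businessLogic;
    }

    AckResult onMessage(String id, String body) {
        try {
            if (!databaseUp) throw new IllegalStateException("database unavailable");
            // Insert only if the unique ID is new; duplicates are acked and ignored.
            if (bodies.putIfAbsent(id, body) == null) {
                statuses.put(id, Status.PENDING);
                worker.submit(() -> process(id)); // only schedules the work
            }
            return AckResult.ACK;
        } catch (Exception e) {
            return AckResult.UNACK; // storing failed, so ask for redelivery
        }
    }

    void process(String id) {
        try {
            businessLogic.accept(bodies.get(id));
            statuses.put(id, Status.DONE);
        } catch (Exception e) {
            statuses.put(id, Status.FAILED); // picked up later by retryFailed()
        }
    }

    // Intended to be called periodically by a scheduled task.
    void retryFailed() {
        statuses.forEach((id, status) -> {
            if (status == Status.FAILED) process(id);
        });
    }

    // Stops the background worker and waits for queued work to finish.
    boolean stopAndWait() {
        worker.shutdown();
        try {
            return worker.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

Because the executor thread is not a daemon thread, a program using this sketch needs to call `stopAndWait()` before it can exit.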

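Comparing the pros and cons of 2.3 and 2.4

Option | Advantages | Disadvantages
2.3 | Only messages whose processing failed are inserted into the database, so the number of stored messages stays small | Slow message processing causes a message backlog
2.4 | Messages are processed asynchronously, so no backlog builds up | Every message is stored in the database, so the number of stored messages can be large

Choose between the two options according to your actual situation. For handling message backlog, see also: "How do you handle message backlog?"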

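3. Receiving HTTP requests

Looking at this flow, you might think it is very simple: our system receives the request, processes it, and returns the result. It is indeed that simple, but if business processing in our system fails, the external system retries. If our system still has not recovered by the time the retries are used up, the data in that request is lost, which causes data inconsistency.

Analyze the scenario carefully and you will find that the key cause of the inconsistency is the business processing in our system. If our system could respond immediately after correctly receiving the external system's request, the problem would be solved.

After receiving the HTTP request, we only need to write the request content to the database. If the write succeeds, process the request asynchronously and return success; if the write fails, return failure and rely on the external system's retries to make sure the data eventually reaches the database. Requests whose processing fails can be retried by a scheduled task.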
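A minimal sketch of this "persist first, respond, process later" idea is shown below. It leaves out the actual HTTP server and models the handler as a method that returns a status code. The request ID used for deduplicating caller retries is an assumption that the text does not spell out, the maps stand in for a request table, and all names (`RequestIntake`, `handle`, `processOutstanding`) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch: store the request, answer right away, process later.
class RequestIntake {
    static final int OK = 200;
    static final int SERVER_ERROR = 500;

    enum Status { PENDING, DONE, FAILED }

    final Map<String, String> payloads = new LinkedHashMap<>(); // "request" table
    final Map<String, Status> statuses = new LinkedHashMap<>(); // processing state
    final Consumer<String> businessLogic;
    boolean databaseUp = true; // flip to false to simulate a database outage

    RequestIntake(Consumer<String> businessLogic) {
        this.businessLogic = businessLogic;
    }

    // The HTTP handler: succeeds as soon as the request is safely stored.
    int handle(String requestId, String payload) {
        try {
            save(requestId, payload);
            return OK;
        } catch (Exception e) {
            return SERVER_ERROR; // the caller is expected to retry
        }
    }

    void save(String requestId, String payload) {
        if (!databaseUp) throw new IllegalStateException("database unavailable");
        // A retried request with the same ID is stored only once.
        if (payloads.putIfAbsent(requestId, payload) == null) {
            statuses.put(requestId, Status.PENDING);
        }
    }

    // Run by a background worker or scheduled task; handles new and failed requests.
    // Returns how many requests were completed in this run.
    int processOutstanding() {
        int completed = 0;
        for (Map.Entry<String, Status> entry : statuses.entrySet()) {
            if (entry.getValue() == Status.DONE) continue;
            try {
                businessLogic.accept(payloads.get(entry.getKey()));
                entry.setValue(Status.DONE);
                completed++;
            } catch (Exception e) {
                entry.setValue(Status.FAILED); // retried on the next run
            }
        }
        return completed;
    }
}
```

The design choice here is that the response depends only on the database write, so a failure in business processing never reaches the caller and never consumes the caller's retry budget.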

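4. Internal business processing

After some business operations succeed, our system often needs to notify certain external systems, and a failure in an external system must not prevent the current operation from completing. Since the notification must not affect the current operation, it can be sent asynchronously. But asynchronous sending means the operation can succeed while the notification fails, which again causes data inconsistency.

Is there a way to guarantee that when the business operation succeeds, the notification to the external system also succeeds?

We can use a local transaction: insert the data to be sent to the external system into the database within the same transaction as the current business operation, notify the external system asynchronously, and retry failed notifications with a scheduled task.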
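The local-transaction approach can be sketched as follows. This is an illustration only: the "transaction" is simulated in memory with a manual rollback, whereas a real system would rely on a database transaction covering both inserts. The order example and all names (`OutboxSketch`, `completeOrder`, `dispatchPending`, `failOutboxInsert`) are hypothetical.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

// Hypothetical sketch: business data and the pending notification are written together.
class OutboxSketch {
    final Set<String> orders = new HashSet<>();               // business table stand-in
    final Map<String, String> outbox = new LinkedHashMap<>(); // pending notifications
    boolean failOutboxInsert = false; // simulate a failure in the middle of the transaction

    // Both writes succeed or neither does (simulated local transaction).
    void completeOrder(String orderId) {
        try {
            orders.add(orderId);
            if (failOutboxInsert) throw new IllegalStateException("outbox insert failed");
            outbox.put(orderId, "order-completed:" + orderId);
        } catch (RuntimeException e) {
            orders.remove(orderId); // roll back the business write
            outbox.remove(orderId);
            throw e;
        }
    }

    // Run asynchronously and again by a scheduled task; returns how many were sent.
    int dispatchPending(Consumer<String> notifier) {
        int sent = 0;
        Iterator<Map.Entry<String, String>> it = outbox.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            try {
                notifier.accept(entry.getValue());
                it.remove(); // delivered, so it no longer needs a retry
                sent++;
            } catch (Exception e) {
                // Keep the row; the next scheduled run retries it.
            }
        }
        return sent;
    }
}
```

Note that a notification may be delivered more than once if the sender fails after the external system has already accepted it, so the receiving side still needs to be idempotent.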

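5. Summary

As this article shows, the solutions to consistency problems all come down to database + retry, so when you need to solve a consistency problem, think along these lines first.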


Origin juejin.im/post/7079686504619966500