Let's talk about database and cache consistency

A few years ago, while browsing blogs, I saw a post about database and cache consistency. I scoffed and skipped right over it, thinking: such a simple question, what is there to discuss? I held that view for a long time, until one day I noticed that more and more people were discussing database and cache consistency. I finally read a few of those posts properly and realized that it is not a simple question at all. So today, let's talk about database and cache consistency.

A quick primer

Some readers who are newer to the field may not have worked with a cache yet, so let me spend a minute on what a cache is, why we need one, and how a database and a cache are used together.

Reading from the database is a relatively time-consuming operation. If every read goes to the database, the database comes under pressure and the program performs relatively poorly, so it makes sense to introduce a cache.

Caching is one of the most important, most effective, and simplest ways to improve a program's performance.

With a cache in place, a read looks in the cache first. On a cache hit, the data is returned directly. On a miss, the read goes to the database and then puts the data it read into the cache, so that the next read is a hit.
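The read path described above can be sketched in a few lines. This is only a minimal illustration: plain dicts stand in for the real database and the real cache (Redis, for example), and the names `database`, `cache`, and `read` are made up for this sketch.

```python
# Plain dicts stand in for the real database and the real cache.
database = {1: "product v1"}
cache = {}


def read(key):
    """Look in the cache first; fall back to the database on a miss."""
    if key in cache:
        return cache[key]            # cache hit: return directly
    value = database.get(key)        # cache miss: read the database
    if value is not None:
        cache[key] = value           # populate the cache for the next read
    return value


first = read(1)    # miss: loaded from the database, then cached
second = read(1)   # hit: served from the cache
```

After the first call the entry sits in `cache`, so the second call never touches `database`.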

A write, besides updating the database, also has to delete the cache entry. If it does not, reads keep returning the old data from the cache.

Delete the cache first, then update the database

This approach clearly has a problem.

Consider a concurrent read and write:

The write arrives first and deletes the cache;
Before the write has updated the database, a read comes in, misses the cache, and fetches the old data from the database;
The write updates the database;
The read puts the old data into the cache.

As a result, the data in the database and the data in the cache no longer match. To make the process easier to follow, here is a rough diagram:

[Figure: timeline of the concurrent write and read described above]
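The same interleaving can be replayed step by step in code. This is a deterministic sketch rather than a real concurrent program: dicts stand in for the database and the cache, and the four steps are simply executed in the unlucky order.

```python
database = {"item": "old"}
cache = {"item": "old"}

# Step 1: the write request deletes the cache entry.
cache.pop("item", None)

# Step 2: before the write reaches the database, a read misses the cache
# and fetches the old value from the database.
value_seen_by_read = database["item"]

# Step 3: the write request updates the database.
database["item"] = "new"

# Step 4: the read request puts the old value into the cache.
cache["item"] = value_seen_by_read

print(database["item"], cache["item"])  # new old
```

The cache now holds `old` while the database holds `new`, and it stays that way until something else deletes or expires the entry.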

So this approach clearly has a flaw. But is it really good for nothing?

No. Imagine this scenario: a write request comes in and tries to delete the cache, but at that moment the Redis server has a problem, or the network does, so the delete fails and an exception is thrown, and the program never proceeds to update the database. From the point of view of consistency this is actually a good outcome: the database and the cache hold the same data, even if it is the old data.

Update the database first, then delete the cache

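I suspect the vast majority of readers use this approach. I used to think there was nothing to discuss about database and cache consistency because it was too simple, and that was precisely because I thought this approach was perfect. Only later did I slowly realize that it has problems of its own.

Having seen the problem with the first approach, you have probably guessed that this one suffers from a similar problem.

Consider a concurrent read and write when the entry is not in the cache:

The read arrives first, misses the cache, and reads the data from the database. At this point it stalls for some reason and does not put the data into the cache right away;
The write arrives, updates the database, and deletes the cache;
The read resumes and writes the old data into the cache.

This leaves the database and the cache inconsistent. However, the probability is very low, because it requires a concurrent read and write while the entry is not cached. In general, a database write is much slower than a database read, and on top of that the read's cache write has to land after the write's cache delete for the problem to appear. So this problem can probably be ignored.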
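For comparison, here is the same kind of step-by-step replay for this ordering. Again this is only a deterministic sketch with dicts standing in for the database and the cache; note that the cache has to start out empty for the race to be possible at all.

```python
database = {"item": "old"}
cache = {}  # the entry is not cached yet

# Step 1: the read misses the cache, fetches the old value from the
# database, and then stalls before it can populate the cache.
value_seen_by_read = database["item"]

# Step 2: the write updates the database and deletes the cache entry
# (a no-op here, because nothing is cached yet).
database["item"] = "new"
cache.pop("item", None)

# Step 3: the read resumes and writes the old value into the cache.
cache["item"] = value_seen_by_read

print(database["item"], cache["item"])  # new old
```

The whole write has to fit between step 1 and step 3 of the read, which is why this interleaving is rare in practice.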

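After all that, we still have not seen a fatal problem with updating the database first and deleting the cache afterwards. Don't worry; let's imagine another scenario: a write request comes in and updates the database, but when it tries to delete the cache, the delete fails because the Redis server has a problem or the network does. Now the database holds the new data while the cache still holds the old data: a textbook case of database and cache inconsistency.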
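The difference between the two orderings when the cache is unreachable can be shown side by side. This is a sketch under stated assumptions: `FlakyCache` and `CacheUnavailable` are made-up stand-ins for a Redis client and its connection error, not a real client API.

```python
class CacheUnavailable(Exception):
    """Raised by the stand-in cache when it cannot be reached."""


class FlakyCache(dict):
    """Dict-backed stand-in for Redis whose delete can be made to fail."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.available = True

    def delete(self, key):
        if not self.available:
            raise CacheUnavailable(key)
        self.pop(key, None)


def delete_then_update(db, cache, key, value):
    cache.delete(key)   # fails first, so the database is never touched
    db[key] = value


def update_then_delete(db, cache, key, value):
    db[key] = value     # already applied by the time the delete fails
    cache.delete(key)


def run(strategy):
    """Run one write with the cache down; return (db value, cache value)."""
    db = {"item": "old"}
    cache = FlakyCache(item="old")
    cache.available = False
    try:
        strategy(db, cache, "item", "new")
    except CacheUnavailable:
        pass  # the write request fails; inspect what state is left behind
    return db["item"], cache["item"]


print(run(delete_then_update))  # ('old', 'old')
print(run(update_then_delete))  # ('new', 'old')
```

With delete-first, both stores still agree on the old value. With update-first, the database has moved on while the cache has not.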

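Delayed double delete

As we have seen, updating the database first and then deleting the cache has two problems. Both are low-probability, but programmers, forever chasing perfection, cannot allow such things to happen, so a third approach appeared: delayed double delete.

Delayed double delete means: delete the cache first, then update the database, and finally, after a certain delay, delete the cache again.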


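Doing this mitigates both problems to some extent. The first delete effectively checks whether the cache service is available and the network is healthy. The second delete is delayed so that any read request running concurrently with the write has finished before it runs.

But this still has a problem: what if the first delete succeeds and the second one fails?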
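A minimal sketch of delayed double delete, assuming dicts in place of the database and the cache and using `threading.Timer` from the standard library for the delay. The function name and the 50 ms delay are arbitrary choices for illustration; a real delay would have to be tuned to how long reads take.

```python
import threading

database = {"item": "old"}
cache = {"item": "old"}


def write_with_delayed_double_delete(key, value, delay_seconds=0.05):
    cache.pop(key, None)     # first delete
    database[key] = value    # update the database
    # Second delete, scheduled to run after the delay on another thread.
    timer = threading.Timer(delay_seconds, cache.pop, args=(key, None))
    timer.start()
    return timer


timer = write_with_delayed_double_delete("item", "new")

# Simulate a slow concurrent read that puts the old value back
# right after the write has returned.
cache["item"] = "old"

timer.join()  # wait for the second delete to run
print(database["item"], cache.get("item"))  # new None
```

The stale entry written by the slow read is removed by the second delete, so the next read will miss and reload the new value from the database.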

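Deleting the cache through an in-memory queue

All three approaches above have some problems:

Updating the database and deleting the cache are coupled together, which does not follow the single responsibility principle very well;
If writes are frequent, they may put some pressure on Redis;
What should happen if deleting the cache fails?

To solve these three problems, a fourth approach appeared: deleting the cache through an in-memory queue. The write operation only updates the database and then puts the Id of the data into an in-memory queue. A background thread consumes the queue and deletes the cache, and if a delete fails it can be retried several times.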


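This decouples updating the database from deleting the cache, and a failed delete can be retried several times. Because a background thread consumes the in-memory queue to delete the cache instead of deleting it directly, there is some delay between updating the database and deleting the cache, and that delay should be enough for in-flight reads to finish.

But this approach has its downsides too:

The program becomes several times more complex: you have to maintain a thread, a queue, and a consumer;
If writes are very frequent and the queue fills up, consumption may be slow, so the cache is only deleted some time after the database has been updated.

But that cannot be helped. There is no such thing as a perfect solution.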
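The queue-based approach could look roughly like this. It is a sketch built on the standard library's `queue.Queue` and `threading.Thread`; the retry limit, the simulated failures, and all function names are assumptions made for this example.

```python
import queue
import threading

database = {"item": "old"}
cache = {"item": "old"}
invalidations = queue.Queue()   # the in-memory queue of Ids to invalidate
MAX_ATTEMPTS = 3                # assumed retry limit
failures_left = 2               # simulate two failed deletes in a row


def delete_from_cache(key):
    """Stand-in for a cache delete that fails the first two times."""
    global failures_left
    if failures_left > 0:
        failures_left -= 1
        raise ConnectionError("cache unavailable")
    cache.pop(key, None)


def write(key, value):
    database[key] = value       # the write only touches the database...
    invalidations.put(key)      # ...and enqueues the Id


def worker():
    while True:
        key = invalidations.get()
        if key is None:         # sentinel: stop the worker
            invalidations.task_done()
            break
        for _ in range(MAX_ATTEMPTS):
            try:
                delete_from_cache(key)
                break           # deleted successfully
            except ConnectionError:
                continue        # retry; gives up after MAX_ATTEMPTS
        invalidations.task_done()


thread = threading.Thread(target=worker, daemon=True)
thread.start()

write("item", "new")
invalidations.join()            # wait until the Id has been processed
invalidations.put(None)
thread.join()
print(database["item"], cache.get("item"))  # new None
```

The first two delete attempts fail and the third succeeds, so the entry is gone by the time the queue has drained. A production version would also need to decide what to do when every attempt fails.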

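Third-party queue

Generally speaking, a system is split into a front-end system and a back-office system. The front-end system mostly reads; only the back-office system writes.

Take a product center as an example. The front end faces users: when a user opens a product detail page, the data is fetched from the cache. The back office faces business staff, who can modify product information in the back-office system.

In a company of any size, the front-end system and the back-office system are certainly not on the same server, and they are owned by different departments, so an in-memory queue is out of the question. If the back-office system updates the database and then deletes the cache directly, the following story is bound to happen.

Back-office system, Xiao Ming: What format does your front-end system use for the product detail cache key? Send it to me.
Front-end system, Xiao Hua: Product:XXXXX.
Back-office system, Xiao Ming: OK.

A few days later, Xiao Hua comes to find Xiao Ming.

Front-end system, Xiao Hua: Something is wrong. Why didn't you delete the product detail cache for products in a promotion?
Back-office system, Xiao Ming: What?! How was I supposed to know you have two caches? Send me the key format for the promotion product detail cache.
Front-end system, Xiao Hua: Activity:Product:XXXX.
Back-office system, Xiao Ming: OK.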

A few days later, a developer from the order system comes to find Xiao Ming.
Order system, Xiao Qiang: After you modify product details, you also need to delete the order system's product detail cache.
Back-office system, Xiao Ming: ...

A few days later, a developer from the advertising system comes to find Xiao Ming.
Advertising system, Xiao Wang: After you modify product details, you also need to delete the advertising system's product detail cache.

Xiao Ming of the back-office system, dead at 25.

If a third-party queue such as RabbitMQ or Kafka is introduced, Xiao Ming does not have to "die". After updating the database, the back-office system does not care about caches at all; it only publishes the Id of the data to the message queue. The front-end system, the advertising system, and the order system each consume the messages and delete their own caches.
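The shape of this design can be sketched without a real broker. In the example below, `Broker` is a tiny in-process stand-in for RabbitMQ or Kafka (it is not either library's API), the topic name `product-changed` is made up, and the `Order:Product:` key format is hypothetical; only `Product:` and `Activity:Product:` come from the story above.

```python
from collections import defaultdict


class Broker:
    """In-process stand-in for a message broker with fan-out delivery."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self.subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self.subscribers[topic]:
            handler(message)


broker = Broker()
database = {}
frontend_cache = {"Product:42": "old", "Activity:Product:42": "old"}
order_cache = {"Order:Product:42": "old"}  # hypothetical key format


# Each system knows its own key formats and deletes its own entries.
def frontend_on_product_changed(product_id):
    frontend_cache.pop(f"Product:{product_id}", None)
    frontend_cache.pop(f"Activity:Product:{product_id}", None)


def order_on_product_changed(product_id):
    order_cache.pop(f"Order:Product:{product_id}", None)


broker.subscribe("product-changed", frontend_on_product_changed)
broker.subscribe("product-changed", order_on_product_changed)


# The back-office system only updates the database and publishes the Id.
def update_product(product_id, details):
    database[product_id] = details
    broker.publish("product-changed", product_id)


update_product(42, "new details")
print(frontend_cache, order_cache)  # {} {}
```

The back-office code never mentions a cache key, so a new consumer (the advertising system, say) can be added by subscribing, without touching `update_product`.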

The approaches described above are all fairly common and fairly simple, and different scenarios can of course combine them. But there is no "silver bullet" and no perfect solution; which approach fits depends on your team and your scenario.

That is all for today's topic.