[Repost] Distributed systems: an analysis of database and cache double-write consistency schemes

Reposted from: https://www.cnblogs.com/rjzheng/p/9041659.html

Introduction

Why write this article?

First, caches are widely used in projects because of their high-concurrency and high-performance characteristics. When it comes to reading from the cache there is no real dispute: everyone follows the flow in the figure below.
[Figure: cache read flow]
Updating the cache is a different story. After updating the database, should you update the cache or delete it? Or should you delete the cache first and then update the database? Opinions differ widely, and so far no single blog post analyzes all of these options thoroughly. So, with some trepidation and at the risk of being flamed, I wrote this article.

Article structure

This article has three parts:
1. An explanation of the cache update strategies
2. An analysis of each strategy's drawbacks
3. Improvements that address those drawbacks

Main text

A note first. In theory, setting an expiration time on cached data is the solution that guarantees eventual consistency. Under this scheme, every cached entry gets an expiration time, the database is the source of truth for all writes, and cache operations are best effort only. In other words, if a database write succeeds but the cache update fails, then once the entry expires, later read requests will read the new value from the database and backfill the cache. The strategies discussed below therefore do not rely on cache expiration.
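To make the expiration fallback concrete, here is a minimal sketch. It assumes an in-memory map standing in for both the cache and the database, and an explicit clock value passed by the caller; ExpiringCache, writeDbOnly and read are illustrative names, not part of any real cache client.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a cache with per-entry expiration in front of a "database" map.
class ExpiringCache {
    private static final class Entry {
        final String value;
        final long expiresAt;
        Entry(String value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<String, Entry> cache = new HashMap<>();
    private final Map<String, String> db = new HashMap<>();
    private final long ttlMillis;

    ExpiringCache(long ttlMillis) {
        this.ttlMillis = ttlMillis;
    }

    // Models a write whose database part succeeded but whose cache update failed:
    // only the database is touched.
    void writeDbOnly(String key, String value) {
        db.put(key, value);
    }

    // Serve from the cache while the entry is fresh; otherwise read the
    // database and backfill the cache with a new expiration time.
    String read(String key, long nowMillis) {
        Entry e = cache.get(key);
        if (e != null && nowMillis < e.expiresAt) {
            return e.value;
        }
        String fromDb = db.get(key);
        if (fromDb != null) {
            cache.put(key, new Entry(fromDb, nowMillis + ttlMillis));
        } else {
            cache.remove(key);
        }
        return fromDb;
    }
}
```

With a TTL of 1000 ms, a value cached at time 0 keeps being served (even if stale) until time 1000, after which the next read picks up whatever the database holds.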
Here we discuss three update strategies:

  1. Update the database first, then update the cache
  2. Delete the cache first, then update the database
  3. Update the database first, then delete the cache

I trust nobody will ask why "update the cache first, then update the database" is not on the list.

(1) Update the database first, then update the cache

This scheme is widely opposed. Why? There are two reasons.
Reason 1 (thread safety)
Suppose request A and request B both perform updates at the same time. The following can happen:
(1) Thread A updates the database
(2) Thread B updates the database
(3) Thread B updates the cache
(4) Thread A updates the cache
Request A should have updated the cache before request B did, but because of network delays or similar causes, B updated the cache first. The cache ends up holding A's stale value while the database holds B's value. That is dirty data, so this scheme is ruled out.
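The four steps above can be replayed deterministically. This is only a sketch: two plain maps stand in for the database and the cache, and the class name UpdateCacheRace is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class UpdateCacheRace {
    // Replays the interleaving and returns {database value, cache value}.
    static String[] replay() {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put("k", "A");     // (1) thread A updates the database
        db.put("k", "B");     // (2) thread B updates the database
        cache.put("k", "B");  // (3) thread B updates the cache
        cache.put("k", "A");  // (4) thread A updates the cache, arriving late
        return new String[] { db.get("k"), cache.get("k") };
    }

    public static void main(String[] args) {
        String[] r = replay();
        System.out.println("db=" + r[0] + " cache=" + r[1]);
    }
}
```

The database ends with B's value and the cache with A's value, which is exactly the mismatch described above.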
Reason 2 (business scenarios)
There are two points:
(1) If your workload writes to the database often but reads rarely, this scheme updates the cache over and over before the data is ever read, which wastes performance.
(2) If the value written to the cache is not the value written to the database, but the result of a series of complex computations, then recomputing the cached value after every database write is clearly wasteful. Deleting the cache is the better fit.

Next comes the most contentious question: delete the cache first and then update the database, or update the database first and then delete the cache?

(2) Delete the cache first, then update the database

This scheme can produce inconsistency for the following reason. Suppose request A performs an update while request B performs a query. The following can happen:
(1) Request A starts a write and deletes the cache
(2) Request B queries and finds the cache empty
(3) Request B queries the database and gets the old value
(4) Request B writes the old value into the cache
(5) Request A writes the new value to the database
The cache and the database are now inconsistent. Moreover, if no expiration time is set on the cache, the entry stays dirty forever.
So how do we fix it? Use the delayed double-delete strategy.
Pseudocode:

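public void write(String key, Object data) throws InterruptedException {
    redis.delKey(key);    // first delete
    db.updateData(data);  // write the database
    Thread.sleep(1000);   // wait for in-flight reads
    redis.delKey(key);    // second delete
}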

In plain words:
(1) Evict the cache first
(2) Then write the database (these two steps are the same as before)
(3) Sleep for 1 second, then evict the cache again
This way, any dirty cache data produced during that 1 second is deleted again.
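To see why the second delete helps, the interleaving from this section can be replayed with and without it. This is a sketch under simplifying assumptions: plain maps stand in for the cache and the database, the "delay" is modelled simply by running the second delete after request B has finished, and DelayedDoubleDeleteReplay is an illustrative name.

```java
import java.util.HashMap;
import java.util.Map;

class DelayedDoubleDeleteReplay {
    // Replays the five steps in order and returns {database value, cache value}.
    // A null cache value means the key is absent from the cache.
    static String[] replay(boolean secondDelete) {
        Map<String, String> db = new HashMap<>();
        Map<String, String> cache = new HashMap<>();
        db.put("k", "old");
        cache.put("k", "old");

        cache.remove("k");                // (1) A deletes the cache
        String seenByB = cache.get("k");  // (2) B finds the cache empty
        if (seenByB == null) {
            seenByB = db.get("k");        // (3) B reads the old value from the database
            cache.put("k", seenByB);      // (4) B writes the old value into the cache
        }
        db.put("k", "new");               // (5) A writes the new value to the database
        if (secondDelete) {
            cache.remove("k");            // delayed second delete, after B has finished
        }
        return new String[] { db.get("k"), cache.get("k") };
    }
}
```

Without the second delete the cache keeps "old" while the database holds "new"; with it the stale entry is gone and the next read backfills the new value.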
So where does the 1 second come from, and how long should the sleep actually be?
You should measure how long the read-path business logic takes in your own project. The write path's sleep time is then that read time plus a few hundred ms. The goal is to make sure the read request has finished, so that the write request can delete any dirty cache data the read request left behind.
What if you use a MySQL read/write splitting architecture?
OK, in that case the inconsistency arises as follows. Again there are two requests: request A performs an update and request B performs a query.
(1) Request A starts a write and deletes the cache
(2) Request A writes the data to the database
(3) Request B queries the cache and finds no value
(4) Request B queries the replica; master-slave synchronization has not finished yet, so it gets the old value
(5) Request B writes the old value into the cache
(6) The database finishes master-slave synchronization, and the replica now holds the new value
That is how the data becomes inconsistent. The fix is still the delayed double-delete strategy, except that the sleep time becomes the master-slave synchronization delay plus a few hundred ms.
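The two sizing rules are simple arithmetic, sketched below. The method names and the idea of passing the margin explicitly are assumptions for illustration; the measured read time and replication delay have to come from your own system.

```java
import java.time.Duration;

class DoubleDeleteDelay {
    // Single database: time taken by the read-path business logic plus a safety margin.
    static Duration forSingleDb(Duration readPathTime, Duration margin) {
        return readPathTime.plus(margin);
    }

    // Read/write splitting: master-slave synchronization delay plus a safety margin.
    static Duration forReplicatedDb(Duration replicationDelay, Duration margin) {
        return replicationDelay.plus(margin);
    }
}
```

For example, a read path measured at 200 ms with a 300 ms margin gives a 500 ms sleep before the second delete.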
What if this synchronous eviction strategy lowers throughput?
OK, then make the second delete asynchronous: start a separate thread and delete from there. The write request no longer has to sleep before returning, which raises throughput.
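One way to do the asynchronous second delete is with a scheduled executor from java.util.concurrent. The sketch below assumes in-memory maps in place of Redis and the database; AsyncDoubleDelete and its fields are illustrative names.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

class AsyncDoubleDelete {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> db = new ConcurrentHashMap<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    // Returns right after the database write. The second delete runs later on
    // the scheduler thread; the returned future only exists so that callers
    // can wait for it if they want to.
    ScheduledFuture<?> write(String key, String value, long delayMillis) {
        cache.remove(key);   // first delete
        db.put(key, value);  // write the database
        return scheduler.schedule(() -> { cache.remove(key); },
                delayMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() {
        scheduler.shutdown();
    }
}
```

The write path no longer sleeps. The trade-off is that a failure of the delayed delete now happens off the request thread, so it needs its own handling, which the following sections discuss.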
What if the second delete fails?
This is a very good question, because if the second delete fails, the following happens. Again there are two requests, request A performing an update and request B performing a query; for simplicity, assume a single database:
(1) Request A starts a write and deletes the cache
(2) Request B queries and finds the cache empty
(3) Request B queries the database and gets the old value
(4) Request B writes the old value into the cache
(5) Request A writes the new value to the database
(6) Request A tries to delete the value that request B wrote into the cache, and fails
In other words, if the second cache delete fails, the cache and the database are inconsistent again.
How do we solve that?
For the concrete solution, see my analysis of update strategy (3) below.

(3) Update the database first, then delete the cache

First, some background. There is a well-known cache update pattern from abroad called the "Cache-Aside pattern". It states:

  • Miss: the application first tries to read from the cache; on a miss, it reads from the database and, on success, puts the data into the cache.
  • Hit: the application reads the data from the cache and returns it.
  • Update: write the data to the database first and, on success, invalidate the cache.

In addition, the well-known social network Facebook, in the paper "Scaling Memcache at Facebook", also describes using the strategy of updating the database first and then deleting the cache.
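The three Cache-Aside rules map onto two small methods. This is a minimal single-threaded sketch with maps in place of a real cache and database; CacheAsideStore and the dbReads counter are illustrative additions, the latter only there to make hits and misses visible.

```java
import java.util.HashMap;
import java.util.Map;

class CacheAsideStore {
    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> db = new HashMap<>();
    int dbReads = 0;

    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;            // hit: return the cached value
        }
        dbReads++;
        String value = db.get(key);   // miss: read from the database
        if (value != null) {
            cache.put(key, value);    // on success, put it into the cache
        }
        return value;
    }

    void update(String key, String value) {
        db.put(key, value);           // write the database first
        cache.remove(key);            // then invalidate the cache entry
    }
}
```

Two consecutive reads of the same key hit the database only once; an update in between forces the next read back to the database.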
Does this approach have no concurrency problem?
It does. Suppose there are two requests, request A performing a query and request B performing an update. The following can happen:
(1) The cache entry has just expired
(2) Request A queries the database and gets the old value
(3) Request B writes the new value to the database
(4) Request B deletes the cache
(5) Request A writes the old value it read into the cache
If this happens, dirty data does occur.
But how likely is it?
It has a built-in precondition: the database write in step (3) must take less time than the database read in step (2), so that step (4) can happen before step (5). But think about it: database reads are generally much faster than writes (otherwise why do read/write splitting at all? The whole point is that reads are faster and consume fewer resources). So step (3) is very unlikely to take less time than step (2), and this situation rarely arises.
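The ordering condition can be written down as a small predicate. This is my own formalization of steps (2) to (5) in terms of event timestamps, not something taken from the original sources; the parameter names are hypothetical and the times are milliseconds on an imagined shared clock.

```java
class StaleBackfillCheck {
    // Returns true when the interleaving leaves the old value in the cache.
    static boolean staleBackfill(long aReadsDb, long aWritesCache,
                                 long bWritesDb, long bDeletesCache) {
        // Step (2) before step (3): A read the database before B's write landed.
        boolean aReadOldValue = aReadsDb < bWritesDb;
        // Step (4) before step (5): B's delete ran before A's backfill.
        boolean deleteBeforeBackfill = bDeletesCache < aWritesCache;
        return aReadOldValue && deleteBeforeBackfill;
    }
}
```

Both conditions must hold at once: A's whole read-then-backfill window has to straddle B's write and B's delete, which is why a slow read combined with a fast write is required.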
Suppose someone insists on nitpicking, or is a perfectionist, and wants it solved anyway. What then?
How do we solve this concurrency problem?
First, setting an expiration time on the cache is one option. Second, use the asynchronous delayed delete from strategy (2), which ensures the delete runs after the read request has finished.
Are there other causes of inconsistency?
Yes, and this one affects both update strategy (2) and update strategy (3): what if deleting the cache fails? Won't that cause inconsistency? For example, a write request updates the database, but the cache delete fails, and the data is now inconsistent. This is also the open question left over from update strategy (2).
How do we solve it?
Provide a guaranteed retry mechanism. Here are two options.
Option 1
As shown in the figure below
[Figure: Option 1 flow]
The flow is as follows:
(1) Update the database
(2) The cache delete fails for some reason
(3) Send the key that needs to be deleted to a message queue
(4) Consume the message yourself to get the key that needs to be deleted
(5) Keep retrying the delete until it succeeds
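The five steps can be sketched as follows. An in-process ArrayDeque stands in for the message queue, and the cache delete is passed in as a predicate that returns false on failure; both are assumptions made to keep the example self-contained, and RetryingCacheDeleter is an illustrative name.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.function.Predicate;

class RetryingCacheDeleter {
    private final Queue<String> retryQueue = new ArrayDeque<>(); // stand-in for a message queue
    private final Predicate<String> deleteFromCache;             // returns false when the delete fails

    RetryingCacheDeleter(Predicate<String> deleteFromCache) {
        this.deleteFromCache = deleteFromCache;
    }

    // Steps (1) to (3): update the database, try the delete, enqueue the key on failure.
    void update(Map<String, String> db, String key, String value) {
        db.put(key, value);
        if (!deleteFromCache.test(key)) {
            retryQueue.add(key);
        }
    }

    // Steps (4) and (5): consume keys and retry until every delete succeeds.
    // Returns the number of retry attempts. Note that this loops for as long
    // as deletes keep failing; a real consumer would add a backoff.
    int drain() {
        int attempts = 0;
        String key;
        while ((key = retryQueue.poll()) != null) {
            attempts++;
            if (!deleteFromCache.test(key)) {
                retryQueue.add(key);
            }
        }
        return attempts;
    }
}
```

Even in this toy form the drawback is visible: the write path itself has to know about the retry queue.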
However, this option has a drawback: it is highly intrusive to business code. Hence Option 2, in which a subscriber program subscribes to the database binlog and obtains the data that needs handling. Inside the application, a separate piece of code receives the information from the subscriber and deletes the cache.
Option 2
[Figure: Option 2 flow]
The flow is as follows:
(1) Update the database
(2) The database writes the operation to the binlog
(3) The subscriber program extracts the required data and the key
(4) A separate piece of non-business code receives this information
(5) It tries to delete the cache and finds that the delete failed
(6) It sends the information to a message queue
(7) It takes the data from the message queue again and retries the operation
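Steps (3) to (7) might look roughly like the sketch below. Everything here is hypothetical: RowChange is an invented shape for what a binlog subscriber could hand over (it is not the API of canal or any other middleware), the "table:primaryKey" cache key format is an assumption, and an in-process queue again stands in for the message queue.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.function.Predicate;

class BinlogCacheInvalidator {
    // Hypothetical shape of one change extracted from the binlog.
    static final class RowChange {
        final String table;
        final String primaryKey;
        RowChange(String table, String primaryKey) {
            this.table = table;
            this.primaryKey = primaryKey;
        }
    }

    private final Predicate<String> deleteFromCache;             // returns false when the delete fails
    private final Queue<String> retryQueue = new ArrayDeque<>(); // stand-in for a message queue

    BinlogCacheInvalidator(Predicate<String> deleteFromCache) {
        this.deleteFromCache = deleteFromCache;
    }

    // Assumed cache key convention: "<table>:<primary key>".
    static String cacheKey(RowChange change) {
        return change.table + ":" + change.primaryKey;
    }

    // Steps (3) to (6): derive the key, try the delete, enqueue on failure.
    void onRowChange(RowChange change) {
        String key = cacheKey(change);
        if (!deleteFromCache.test(key)) {
            retryQueue.add(key);
        }
    }

    // Step (7): one retry pass over the queued keys; returns how many are still pending.
    int retryPending() {
        int n = retryQueue.size();
        for (int i = 0; i < n; i++) {
            String key = retryQueue.poll();
            if (!deleteFromCache.test(key)) {
                retryQueue.add(key);
            }
        }
        return retryQueue.size();
    }
}
```

The business code that writes the database does not appear here at all, which is the point of this option.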

Note: for MySQL there is an existing middleware called canal that can subscribe to the binlog. For Oracle, I do not currently know of any ready-made middleware. Also, I implemented the retry mechanism with a message queue. If your consistency requirements are not very strict, you can simply start another thread in the application and retry at intervals. Adapt this freely; it is only meant as a starting point.
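The lighter-weight "retry from another thread at intervals" variant could be sketched like this. The class and method names are illustrative, and the cache delete is again a caller-supplied predicate that returns false on failure.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

class PeriodicDeleteRetrier {
    private final Set<String> pending = ConcurrentHashMap.newKeySet();
    private final Predicate<String> deleteFromCache;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    PeriodicDeleteRetrier(Predicate<String> deleteFromCache) {
        this.deleteFromCache = deleteFromCache;
    }

    // Called by the write path when a cache delete fails.
    void recordFailure(String key) {
        pending.add(key);
    }

    // One retry pass: keys whose delete now succeeds are dropped from the set.
    void retryOnce() {
        pending.removeIf(deleteFromCache);
    }

    // Runs retryOnce on a background thread at a fixed interval.
    void start(long periodMillis) {
        scheduler.scheduleWithFixedDelay(this::retryOnce,
                periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }

    int pendingCount() {
        return pending.size();
    }

    void stop() {
        scheduler.shutdownNow();
    }
}
```

Unlike a message queue, the pending set lives only in memory, so failed keys are lost if the process restarts. That is acceptable only under the relaxed consistency requirement mentioned above.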

Summary

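This article is really a summary of the consistency schemes already in circulation on the Internet. For the strategy of deleting the cache first and then updating the database, some proposals also maintain an in-memory queue. I looked at them and found the implementation extremely complex and unnecessary, so I left them out. Finally, I hope you found this useful.

References

1. Master-slave DB and cache consistency (主从DB与cache一致性)
2. Cache update routines (缓存更新的套路)

 
Author: 孤独烟  Source: http://rjzheng.cnblogs.com/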


Origin www.cnblogs.com/wjqhuaxia/p/11785240.html