The cache and database double-write consistency problem: a must-ask interview question!

Any program that uses a cache such as Redis will face the double-write consistency problem. Many programmers stumble on it.

That is because no matter how you answer, the answer never looks perfect.

The first question is whether to write to the cache or the database first. Suppose we write to the cache first and then to the database. If the cache write succeeds and the database write fails, the two are inconsistent.
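To make the failure concrete, here is a minimal sketch, assuming a plain dict as the cache and a hypothetical `FlakyDatabase` class standing in for a real database. None of these names come from a real library; they only illustrate the ordering.

```python
class FlakyDatabase:
    """Hypothetical in-memory stand-in for a real database."""

    def __init__(self):
        self.rows = {}
        self.fail_next_write = False

    def write(self, key, value):
        if self.fail_next_write:
            self.fail_next_write = False
            raise RuntimeError("simulated database write failure")
        self.rows[key] = value


def update_cache_then_db(cache, db, key, value):
    cache[key] = value    # step 1: the cache write succeeds
    db.write(key, value)  # step 2: may raise, leaving the cache ahead of the DB


cache = {"stock": 10}
db = FlakyDatabase()
db.rows["stock"] = 10
db.fail_next_write = True  # force the next database write to fail

try:
    update_cache_then_db(cache, db, "stock", 9)
except RuntimeError:
    pass  # the caller sees the error, but the cache has already changed

inconsistent = cache["stock"] != db.rows["stock"]
```

After the failed update, the cache holds the new value while the database still holds the old one.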

You might say: when the database write fails, I will just delete the cache entry again. But can you guarantee that the cache deletion will succeed? Besides, after your cache write succeeds, a query request may read the cache before the database has been updated.

So we improve the update strategy. We delete the cache first, then update the database, and then delete the cache again after the update succeeds.

If requests are executed serially, this method is logically sound. However, serializing requests puts heavy pressure on the service and may even bring it down. If requests are not executed serially, then after you have deleted the cache but before you have updated the database, another thread can read the old data from the DB and load it into the cache. Then you update the DB. During this gap, reads from the cache return old data until you perform the second cache deletion.
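The gap can be replayed step by step. The sketch below is only an illustration under assumptions: `cache` and `db` are plain dicts, `read` is a hypothetical read path that repopulates the cache on a miss, and the interleaving of the two threads is written out by hand rather than produced by real concurrency.

```python
cache = {"name": "old"}
db = {"name": "old"}


def read(key):
    """Read path: on a cache miss, load from the DB and repopulate the cache."""
    if key not in cache:
        cache[key] = db[key]
    return cache[key]


# Writer, step 1: delete the cache entry.
cache.pop("name", None)

# Another thread reads in the gap and loads the OLD value back into the cache.
seen_in_gap = read("name")

# Writer, step 2: update the database.
db["name"] = "new"

# Until the second delete runs, readers are served the old value from the cache.
stale_before_second_delete = read("name")

# Writer, step 3: delete the cache entry again.
cache.pop("name", None)

# The next read misses the cache and picks up the new value from the DB.
fresh_after_second_delete = read("name")
```

The second delete does repair the cache, but only after a window in which readers saw old data.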

So it seems that handling the cache first and the database second does not work. Let's look at the second approach: write to the DB first, then write to the cache.

If the DB write fails, there is no need to write to the cache. But if the DB write succeeds and the cache write then fails, the data is inconsistent again.
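Both branches can be sketched in a few lines. This is a toy model, not a real client: `db` and `cache` are dicts, and the `db_ok` and `cache_ok` flags are hypothetical switches used here only to simulate a failure at each step.

```python
class CacheWriteError(Exception):
    """Simulated cache failure (hypothetical, for illustration only)."""


def write_db_then_cache(db, cache, key, value, db_ok=True, cache_ok=True):
    if not db_ok:
        return False  # DB write failed: skip the cache, nothing diverges
    db[key] = value
    if not cache_ok:
        # The DB already holds the new value, but the cache never receives it.
        raise CacheWriteError(key)
    cache[key] = value
    return True


db, cache = {"price": 100}, {"price": 100}

# Case 1: the DB write fails, so the cache is never touched.
write_db_then_cache(db, cache, "price", 120, db_ok=False)
consistent_after_db_failure = db["price"] == cache["price"]

# Case 2: the DB write succeeds, then the cache write fails.
try:
    write_db_then_cache(db, cache, "price", 120, cache_ok=False)
except CacheWriteError:
    pass
consistent_after_cache_failure = db["price"] == cache["price"]
```

Only the second case leaves the two stores disagreeing: the DB has moved on and the cache has not.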

What about concurrency? Suppose request A performs a query and request B performs an update. Then the following situation can occur:

When request A checks the cache, the cache entry has just expired. So A queries the database and gets a value. Request B then writes a new value to the database and deletes the cache entry. Finally, request A writes the old value it read into the cache.
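Here is the same interleaving written out in order, as a rough sketch. The dicts `db` and `cache` and the key `"user:1"` are made up for illustration, and the two requests are sequenced by hand instead of running as real threads.

```python
db = {"user:1": "old"}
cache = {}  # the entry for "user:1" has just expired

# Request A: cache miss, so it reads the database and gets the old value.
a_value = cache.get("user:1")
if a_value is None:
    a_value = db["user:1"]

# Request B: writes the new value to the database, then deletes the cache entry.
db["user:1"] = "new"
cache.pop("user:1", None)

# Request A (resuming): writes the value it read earlier into the cache.
cache["user:1"] = a_value

stale = cache["user:1"] != db["user:1"]
```

The cache now holds the old value and will keep serving it until the entry expires or is deleted again.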

Would you be surprised, though, if I told you that Facebook adopts this kind of scheme?

Facebook is aware of this problem too. The case can occur in theory, but in practice its probability is likely very low, because it requires a cache miss on read (the entry has expired) to coincide with a concurrent write. A database write is usually much slower than a read, and it has to take locks. For the stale value to end up in the cache, the read must reach the database before the write does, yet update the cache after the write has deleted the cache entry. The chance of all these conditions being met at once is small.

Most importantly, even if this does happen, it has little impact on some of Facebook's businesses. Balancing business needs against technical strategy is the real art of architecture.

In addition, none of the patterns I mentioned earlier (Cache Aside Pattern, Read through Pattern, Write through Pattern, Write behind caching Pattern) is perfect.

Again, if you must guarantee strong consistency, use a protocol such as 2PC, 3PC, or Paxos.

No technology is perfect, and every technology has its drawbacks. Architecture is an art, and it all depends on how you strike the balance!

Origin blog.51cto.com/15127565/2664937