How to ensure consistency between the cache and the database?

Table of contents

Delete the cache first, then update the database

Solution

Delayed double delete

Update the database first, then delete the cache

Solution

Message queue

Advanced message queue

Other solutions

Set cache expiration time

Why delete instead of update the cache?

Summary


When data is written, there are two main ways to order the cache and database operations.

Delete the cache first, then update the database

The cache is deleted first, but the database update has not yet completed. If another thread reads at this moment, it misses the cache, reads the old value from the database, and writes that old value back into the cache. The cache and the database are now inconsistent.
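The interleaving described above can be replayed step by step. This is a minimal sketch, assuming plain dictionaries as stand-ins for the database and the cache; the names `db`, `cache`, `read`, and the key `user:1` are illustrative and not from any real system.

```python
# In-memory stand-ins for a real database and cache (hypothetical).
db = {"user:1": "old"}
cache = {"user:1": "old"}

def read(key):
    """Cache-aside read: on a miss, load from the database and repopulate the cache."""
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

# Writer, step 1: delete the cache entry first.
cache.pop("user:1", None)

# A reader runs before the writer has updated the database:
# it misses the cache, loads the old value, and writes it back.
stale = read("user:1")

# Writer, step 2: the database update finally lands.
db["user:1"] = "new"

# The database holds the new value while the cache still serves the old one.
print("db:", db["user:1"], "cache:", cache["user:1"])
```

Without an expiration time or a second delete, the stale entry stays in the cache until something else evicts it.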

Solution

Delayed double delete

The idea of delayed double delete is to evict any stale value that another thread wrote into the cache while the database was being updated. After updating the database, sleep for a period of time, and then delete the cache again.

The sleep time must be estimated from how long the business logic takes to read the data and write it to the cache, and it must be longer than that.

The process is as follows:

  1. Thread 1 deletes the cache and then starts updating the database.

  2. Thread 2 reads the cache, finds that the entry has been deleted, and reads directly from the database. Because thread 1 has not finished the update, thread 2 reads the old value and writes it to the cache.

  3. Thread 1 finishes the update, sleeps for the estimated time, and deletes the cache again. Because the sleep is longer than the time thread 2 needs to read the data and write the cache, this second delete removes the stale value.

  4. Any thread that reads afterwards misses the cache and loads the latest value from the database.
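The four steps above can be sketched with two threads. This is only an illustration under stated assumptions: the dictionaries `db` and `cache` stand in for real stores, and the `db_latency` and `delay` values are arbitrary numbers chosen so the reader arrives in the middle of the update.

```python
import threading
import time

# In-memory stand-ins for a real database and cache (hypothetical).
db = {"k": "old"}
cache = {"k": "old"}

def read(key):
    """Cache-aside read: on a miss, load from the database and repopulate the cache."""
    if key not in cache:
        cache[key] = db[key]
    return cache[key]

def write(key, value, db_latency, delay):
    cache.pop(key, None)      # first delete
    time.sleep(db_latency)    # simulated slow database update
    db[key] = value
    time.sleep(delay)         # must exceed a reader's load + write-back time
    cache.pop(key, None)      # second delete evicts any stale write-back

writer = threading.Thread(target=write, args=("k", "new", 0.2, 0.2))
writer.start()
time.sleep(0.05)              # thread 2 arrives while the update is in flight
stale = read("k")             # misses, then caches the old value
writer.join()                 # wait for the second delete
fresh = read("k")             # misses again and loads the new value
print(stale, fresh)
```

In a real service the second delete would more likely be scheduled asynchronously than run in the request thread, since sleeping there lengthens every write.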

Update the database first, then delete the cache

What about the reverse order: update the database first, and then delete the cache?

The problem here is more obvious. If the database update succeeds but the cache delete fails, or has not happened yet, other threads read the old value from the cache and the inconsistency still occurs.

Solution

Message queue

Many articles on the Internet describe this approach, but its shortcomings are fairly obvious.

Update the database first and, on success, send a message to the message queue. The consumer deletes the cache when it processes the message, and the queue's retry mechanism provides eventual consistency.
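The flow above can be sketched as follows, assuming Python's in-process `queue.Queue` as a stand-in for real message middleware. The failing `delete_from_cache` helper, the `failures_left` counter, and the `max_attempts` limit are all hypothetical and exist only to exercise the retry path.

```python
import queue

# In-memory stand-ins (hypothetical); a real system would use actual middleware.
db = {"k": "old"}
cache = {"k": "old"}
invalidations = queue.Queue()

failures_left = 2  # make the first two delete attempts fail

def delete_from_cache(key):
    """Delete a cache entry, failing while the simulated outage lasts."""
    global failures_left
    if failures_left > 0:
        failures_left -= 1
        raise ConnectionError("cache unavailable")
    cache.pop(key, None)

def update(key, value):
    db[key] = value
    invalidations.put(key)  # publish only after the database write succeeds

def consume(max_attempts=5):
    """Drain the queue; re-enqueue a key whose delete failed, up to max_attempts."""
    attempts = {}
    while not invalidations.empty():
        key = invalidations.get()
        try:
            delete_from_cache(key)
        except ConnectionError:
            attempts[key] = attempts.get(key, 0) + 1
            if attempts[key] < max_attempts:
                invalidations.put(key)  # retry later
    return attempts

update("k", "new")
attempts = consume()
print(attempts, "k" in cache)
```

Between the database write and the successful delete, readers still see the old cached value, which is the short-term inconsistency this approach accepts.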

This solution actually introduces further problems.

  1. Introducing message middleware makes the system more complicated, and guaranteeing that messages are not lost is itself troublesome.

  2. Even if both the database update and the cache delete succeed, message delay causes a short period of inconsistency, although this delay is usually acceptable.

Advanced message queue

Introducing a message queue solely to solve cache consistency is too complicated.

In practice, many large companies already run message queues that subscribe to binlog messages, mainly for data verification.

We can therefore delete the cache from a consumer of that binlog queue. The advantage is that you do not have to introduce the queue yourself or intrude on your business code: the middleware decouples the two for you, and the middleware itself guarantees high availability.

Of course, message delay still exists, but this is better than introducing a message queue just for cache invalidation.

Moreover, if concurrency is not particularly high, the timeliness and consistency of this approach are acceptable.
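A binlog consumer that invalidates the cache might look like the sketch below. The event shape (`table`, `type`, `pk`) and the `table:pk` key scheme are assumptions made for illustration; real binlog middleware delivers its own message format, which this sketch does not try to reproduce.

```python
# In-memory stand-in for a real cache (hypothetical).
cache = {"user:1": "old-1", "user:2": "old-2"}

def cache_keys_for(event):
    """Map a row-change event to the cache keys it affects (assumed key scheme)."""
    return [f"{event['table']}:{event['pk']}"]

def on_binlog_event(event):
    """Delete cache entries for rows that were changed or removed."""
    if event["type"] in ("UPDATE", "DELETE"):
        for key in cache_keys_for(event):
            cache.pop(key, None)

# Hypothetical events, as a binlog subscriber might hand them to the consumer.
events = [
    {"table": "user", "type": "UPDATE", "pk": 1},
    {"table": "user", "type": "INSERT", "pk": 3},
]
for event in events:
    on_binlog_event(event)

print(sorted(cache))
```

The business code that performs the write never touches the cache here, which is the decoupling the text describes.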

Other solutions

Set cache expiration time

Every time a value is put into the cache, set an expiration time, such as 5 minutes. Subsequent writes only modify the database and do not touch the cache; once the entry expires, the next read reloads it from the database.

This solution can be used when the consistency requirements are not very high.

It has another problem: if the data is updated very frequently, the inconsistency becomes significant.
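The expiration-based approach can be sketched with a small TTL cache. This is an illustration only: the `TTLCache` class, its `loader` callback, and the fake clock are hypothetical, and a real deployment would normally rely on the expiration feature of the cache server itself.

```python
import time

class TTLCache:
    """Tiny read-through cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._data = {}  # key -> (value, expires_at)

    def get(self, key, loader):
        entry = self._data.get(key)
        now = self.clock()
        if entry is not None and now < entry[1]:
            return entry[0]              # still fresh: serve from cache
        value = loader(key)              # missing or expired: reload
        self._data[key] = (value, now + self.ttl)
        return value

db = {"k": "v1"}
now = [0.0]  # fake clock so the example does not need to wait 5 minutes
cache = TTLCache(ttl_seconds=300, clock=lambda: now[0])

first = cache.get("k", db.__getitem__)   # miss: loads "v1"
db["k"] = "v2"                           # the write touches only the database
now[0] = 299
stale = cache.get("k", db.__getitem__)   # not yet expired: old value served
now[0] = 300
fresh = cache.get("k", db.__getitem__)   # expired: reloaded from the database
print(first, stale, fresh)
```

The window in which `stale` is served is bounded by the expiration time, so a shorter TTL trades more database reads for less staleness.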

In production, we handle some cached activity data this way, because activities do not change frequently and short-term inconsistency is not a big problem for them.

Why delete instead of update the cache?

Take the approach of updating the database first and then deleting the cache as an example.

If we updated the cache instead of deleting it, the sequence would be: update the database first, and then update the cache.

For example, if the database is updated 1,000 times in one hour, the cache must also be updated 1,000 times, yet the cache may be read only once in that hour. Are those 1,000 cache updates necessary?

Conversely, with deletion, even if the database is updated 1,000 times, the value is written to the cache only once: it is loaded from the database only when the cache is actually read.
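The arithmetic above can be checked with a small counting sketch. The `simulate` function and its `strategy` argument are hypothetical; it counts only how many times a value is written into the cache. The delete strategy still issues one delete per database update, but a delete does not have to compute or transfer the cached value.

```python
def simulate(strategy, updates=1000, reads=1):
    """Count cache value writes for the 'update' or 'delete' strategy."""
    db, cache = {"k": 0}, {}
    cache_writes = 0
    for i in range(1, updates + 1):
        db["k"] = i
        if strategy == "update":
            cache["k"] = i           # rewrite the cached value on every update
            cache_writes += 1
        else:
            cache.pop("k", None)     # just invalidate
    for _ in range(reads):
        if "k" not in cache:
            cache["k"] = db["k"]     # lazy load on the first read
            cache_writes += 1
    return cache_writes

update_writes = simulate("update")
delete_writes = simulate("delete")
print(update_writes, delete_writes)
```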

Summary

First of all, to be clear: on a write, the cache entry should be deleted, not updated.

There are two orders in which to delete the cache:

  1. Delete the cache first, and then update the database. The solution is delayed double delete.

  2. Update the database first, and then delete the cache. The solution is a message queue or binlog-based synchronization. Introducing a message queue just for this causes more problems, so using one directly is not recommended.

For scenarios where the cache consistency requirement is not very high, setting an expiration time is enough.

In fact, if concurrency is not high, these problems rarely occur whether you delete the cache before or after the database update. Under high concurrency, however, you should know how to solve them.

Origin blog.csdn.net/jack1liu/article/details/111755183