Seeing the sun through the clouds: how do you ensure consistency between Redis and the database?

Overview

Four classic problems come up when using Redis: cache penetration, cache breakdown, cache avalanche, and double-write consistency between the cache and the database (MySQL).

How much the first three matter depends on the scale of the business, but the last one cannot be avoided. Even in a small e-commerce business, an inconsistency between the price or inventory in the cache and in the database can cost the company dearly. That is why this has become a must-ask interview question.

So how do you fix double-write inconsistency between Redis and MySQL? Put differently, how do you keep the two consistent?

Problem background

To answer a question, we must first understand where it comes from.

Without Redis in the picture, is there a consistency problem when we simply write data to a single MySQL instance? Of course not.

The transaction support of MySQL's InnoDB engine already guarantees consistency there, so we won't go into detail.

Once Redis is introduced, the read path looks like this: if the value exists in the cache, return it directly; if not, read it from the database and then write it into the cache. Consistency therefore means one of two states:

  1. The data exists in the cache and matches the data in MySQL.

  2. The data is not in the cache, and MySQL holds the latest value.

For the first case, we must keep the MySQL and Redis copies identical; for the second, we must make sure the data is written into the cache in an appropriate way. There is no transaction spanning both MySQL and Redis to enforce these two points, so whenever either is violated, the data becomes inconsistent.
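As a concrete reference, here is a minimal sketch of that read path using redis-py; `query_db` stands in for the actual MySQL query, and the key name and TTL are illustrative assumptions:

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

def query_db(product_id):
    # hypothetical stand-in for: SELECT price FROM product WHERE id = ...
    raise NotImplementedError

def read_price(product_id):
    key = f"product:price:{product_id}"
    cached = r.get(key)
    if cached is not None:            # cache hit: return directly
        return json.loads(cached)
    value = query_db(product_id)      # cache miss: read from MySQL
    # write back so later reads hit the cache; the TTL bounds how long
    # a stale value can survive even if every invalidation fails
    r.set(key, json.dumps(value), ex=3600)
    return value
```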

Caches are used in two patterns: the read-write cache and the read-only cache:

  • Read-write cache: besides serving reads, data in the cache is modified directly, either by synchronous write-through or asynchronous write-back. Write-through modifies the cache and the database together; write-back only flushes changes to the database when the cache entry is evicted. Write-back has a higher chance of inconsistency and targets heavy read-write workloads, so we do not consider it here.

  • Read-only cache: the cache content is never modified directly. When the database is modified, the cached entry is simply invalidated; on a cache miss, the value is read from the database and then set into the cache. This pattern suits workloads with many reads and few writes, and it is the one we discuss below.

The conclusion up front: technically speaking, we can hardly guarantee strict consistency between Redis and the database. Every scheme below only reduces the probability of inconsistency and shortens the window of inconsistency as much as possible.

1. Delete the cache first and then update the database

If you delete the cache first and then update the database, the following race can occur:

  1. Thread 1 deletes the cache and prepares to update the database.

  2. Thread 2 comes in to read, finds nothing in the cache, and therefore reads from the database. Since reads are much faster than writes, there is a good chance thread 1 has not finished its update yet.

  3. Thread 2 writes the old value back into Redis, and thread 1 then completes the database change. The data is now inconsistent.
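Sketched in the same style as the read-path example above (reusing the `r` client, with a hypothetical `update_db` for the UPDATE statement), the flawed write order looks like this:

```python
def update_price_delete_first(product_id, new_price):
    # flawed order: delete the cache BEFORE updating the database
    r.delete(f"product:price:{product_id}")
    # a reader arriving here misses the cache, reads the OLD row from
    # MySQL, and writes that old value back into Redis (steps 2 and 3)
    update_db(product_id, new_price)  # hypothetical UPDATE ... WHERE id = ...
```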

Since deleting first can go wrong like this, the intuitive fix is: just delete once more afterwards.

 

2. Delayed double delete of the cache

Compared with the previous scheme, we add one more deletion after a short sleep, but this has its own drawback.

How do we choose the sleep time? It must be longer than the time thread 2 needs to read the database and set the cache. If it is too short, the second delete may run before the stale value is set and miss it; if it is too long, it hurts the throughput of the system.
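A sketch of the delayed double delete, again with the hypothetical `update_db` helper; the delay value is illustrative and must be tuned as described above:

```python
import time

def update_price_double_delete(product_id, new_price, delay=0.5):
    key = f"product:price:{product_id}"
    r.delete(key)                     # first delete
    update_db(product_id, new_price)  # hypothetical database write
    # the delay must exceed the time a reader needs to "read DB + set
    # cache", so the second delete removes any stale value a reader re-set
    time.sleep(delay)
    r.delete(key)                     # second delete
```

In practice the sleep-and-delete step is often pushed to a background thread or task queue so the write request itself is not held up by the sleep.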

3. Update the database first, then delete the cache

 

Can we drop the first delete and keep only the second one: update the database first, then delete the cache? That way, thread 2 keeps reading the old data until thread 1 commits, and only after thread 1's update does a read go to the database and fetch the new value. This seems to solve the problem perfectly.
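The write path then shrinks to two steps (same hypothetical helpers as before):

```python
def update_price(product_id, new_price):
    update_db(product_id, new_price)         # 1. update MySQL first
    r.delete(f"product:price:{product_id}")  # 2. then invalidate the cache
```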

But there is an edge case.

  1. Thread 1 updates the database but has not yet committed the transaction; it then deletes the cache, and the data change only takes effect when the transaction commits.

  2. In the gap between the cache deletion and the final commit, thread 2 reads the cache, finds it missing, and sets the old database value back into the cache.

  3. So after thread 1 finally commits, the database and the cache are inconsistent again.

But how likely is this? Very unlikely. The gap before a commit is obviously very short, a single network round trip, but the possibility still exists.

This problem can also be handled as follows: when deleting the cache, also set a lock key with a short expiration time, executed as one atomic operation via a Lua script. A reading thread that finds the cache missing will block and wait, or return a query failure. The lock is deleted after the database transaction commits successfully. Of course, this scheme reduces throughput, or makes the interface temporarily unavailable.
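Here is one way to sketch that idea with redis-py's `eval`; the lock key naming, the 3-second TTL, the fail-fast read path, and the `update_db`/`commit_db`/`query_db` helpers are all assumptions, not a production recipe:

```python
# atomically delete the cached value and set a short-lived lock
DELETE_AND_LOCK = """
redis.call('DEL', KEYS[1])
redis.call('SET', KEYS[2], '1', 'EX', ARGV[1])
return 1
"""

def write_with_lock(product_id, new_price):
    key  = f"product:price:{product_id}"
    lock = f"lock:{key}"
    update_db(product_id, new_price)          # transaction not yet committed
    r.eval(DELETE_AND_LOCK, 2, key, lock, 3)  # delete cache + 3s lock, atomically
    commit_db()                               # hypothetical transaction commit
    r.delete(lock)                            # release the lock after commit

def read_with_lock(product_id):
    key  = f"product:price:{product_id}"
    lock = f"lock:{key}"
    cached = r.get(key)
    if cached is not None:
        return cached
    if r.exists(lock):  # a write is in flight: fail fast (or block and retry)
        raise RuntimeError("temporarily unavailable")
    value = query_db(product_id)
    r.set(key, value, ex=3600)
    return value
```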

What if thread 1 commits the transaction before deleting the cache?

 

The flow becomes much simpler. After thread 1 has modified the database, thread 2 may read the old value in the short window before the Redis key is deleted, but thread 1 will delete it eventually. This scheme clearly has the shortest inconsistency window and the lowest cost.

But what if that last step, deleting the cache, fails? The old data would then sit in the cache indefinitely.

 

4. Retry the deletion with a message queue

In both the second and the third scheme, the final cache deletion may fail. We can add a retry mechanism through a message queue for that last deletion, guaranteeing eventual consistency.

  1. Update the database.

  2. Delete the cache and send an MQ message.

  3. The consumer receives the message and deletes the cache again; if the deletion fails, it keeps retrying.

This scheme does intrude on the business code, but I think that is acceptable: encapsulate the delete logic so that "delete the cache + send the MQ message" live in one helper, and make the MQ consumer a generic delete-retry function.
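A sketch of that encapsulation, reusing the `r` client from earlier; `mq_producer.send`, `message.body`, `message.ack`, and `message.nack` are placeholders for whatever message queue client you actually use (Kafka, RabbitMQ, RocketMQ, ...):

```python
def delete_cache(key):
    """General delete helper: try the delete and always queue a message
    so a consumer can verify and retry until it succeeds."""
    try:
        r.delete(key)
    finally:
        mq_producer.send("cache-delete", key)  # hypothetical MQ producer

def on_cache_delete(message):
    """Generic MQ consumer used as the retry loop."""
    try:
        r.delete(message.body)
        message.ack()   # hypothetical ack: stop redelivery on success
    except redis.RedisError:
        message.nack()  # hypothetical nack: MQ redelivers, we retry
```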
