Double-write consistency solutions for the database and cache

To take load off the database and speed up reads, a database is often paired with a cache.
For reads, the cache is checked first. On a cache miss, the database is queried, and the result is returned and stored in the cache.
Writes are where the order of operations confuses many people. The wrong order leaves the data in the cache and the database inconsistent. Should you write to the cache first, or to the database? Read on.


Cache-aside (bypass caching) strategy


Basic reading method

Let's look at reads first: read the cache; if the data is not there, read it from the database, store it in the cache, and return it in the response.
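The read path can be sketched as follows. This is a minimal illustration, not a production implementation: the `cache` and `database` dictionaries are in-memory stand-ins (assumptions) for a real cache and a real database, and `read` is a hypothetical helper name.

```python
# In-memory stand-ins (assumptions): a real system would use a cache server and a database.
cache = {}
database = {"user:1": "Alice"}

def read(key):
    """Cache-aside read: try the cache, fall back to the database, then fill the cache."""
    if key in cache:
        return cache[key]        # cache hit
    value = database.get(key)    # cache miss: query the database
    if value is not None:
        cache[key] = value       # store the result for later reads
    return value

first = read("user:1")    # miss: loaded from the database and cached
second = read("user:1")   # hit: served from the cache
```

Keys that do not exist in the database are not cached in this sketch, so repeated lookups of a missing key always reach the database.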



Update the database first, then delete the cache

Now for writes. Instinctively, you might ask why the cache is deleted after the database update rather than updated. Updating the cache does work in many business scenarios, but be careful in the following cases.

(1) Frequent updates waste resources.
Suppose a field in the database is updated frequently over some period. The cache is then updated once for every database update, yet the cached value may be read only a few times in that period. Most of those cache updates are wasted work.

(2) The cached value is expensive to compute.
Sometimes the cached value is costly to compute, for example when it is derived from several tables. Every modification then requires querying those tables again just to refresh the cache, which costs more than it gains.

(3) Both of the above apply.
This is the worst case: the data is modified frequently, and the cached value is also expensive to compute.
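To make the cost difference concrete, here is a rough sketch that counts how often the cached value is computed when the cache is updated on every write, compared with simply deleting the entry. The numbers (100 writes followed by 2 reads) are made up for illustration, and `expensive_compute` is a hypothetical stand-in for a costly multi-table calculation.

```python
compute_calls = 0

def expensive_compute(value):
    """Hypothetical stand-in for a costly multi-table calculation."""
    global compute_calls
    compute_calls += 1
    return value * 2

def run(writes, reads_after, update_cache_on_write):
    """Apply `writes` updates, then `reads_after` reads; return how many computations ran."""
    global compute_calls
    compute_calls = 0
    db, cache = {"k": 0}, {}
    for i in range(1, writes + 1):
        db["k"] = i
        if update_cache_on_write:
            cache["k"] = expensive_compute(db["k"])   # recompute on every write
        else:
            cache.pop("k", None)                      # only invalidate the entry
    for _ in range(reads_after):
        if "k" not in cache:
            cache["k"] = expensive_compute(db["k"])   # lazy recompute on a miss
    return compute_calls

update_cost = run(writes=100, reads_after=2, update_cache_on_write=True)   # 100 computations
delete_cost = run(writes=100, reads_after=2, update_cache_on_write=False)  # 1 computation
```

With these assumed numbers, updating the cache on every write computes the value 100 times, while deleting the entry computes it once, on the first read.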

Since updating the cache is often impractical, why not delete the entry instead?

Still update the database first, but delete the cache entry instead of updating it.
When the next read request arrives, the entry is missing because the write deleted it, so the value is read from the database and written back to the cache. This is lazy loading: the cached value is computed only when it is needed, which avoids a lot of computation and frequent cache updates.
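A minimal sketch of this write path, using in-memory dictionaries as stand-ins (assumptions) for the cache and the database; `read` and `write` are hypothetical helper names.

```python
cache = {"price:42": 100}
database = {"price:42": 100}

def read(key):
    """Cache-aside read: fill the cache from the database on a miss."""
    if key not in cache:
        cache[key] = database[key]
    return cache[key]

def write(key, value):
    """Update the database first, then delete (not update) the cache entry."""
    database[key] = value    # 1. update the database
    cache.pop(key, None)     # 2. delete the cache entry

write("price:42", 120)
cached_after_write = "price:42" in cache   # the entry is gone after the write
value = read("price:42")                   # lazily reloaded from the database
```

The cache entry is rebuilt only when the next read actually asks for it.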

This looks fine so far. But think again: what happens if the database update succeeds and the cache deletion fails?



Initially the database and the cache hold the same data. A write request arrives: the database update succeeds, but the cache deletion fails. The database now holds the latest data while the cache still holds the old data. A read request that arrives at this point hits the cache and returns the old data.
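The failure can be reproduced in a small sketch. `FlakyCache` is a hypothetical in-memory cache whose `delete` can be made to fail, standing in for something like a network error when talking to a real cache server.

```python
class FlakyCache(dict):
    """Hypothetical cache whose delete can fail, e.g. because of a network error."""
    fail_delete = False

    def delete(self, key):
        if self.fail_delete:
            raise ConnectionError("cache unreachable")
        self.pop(key, None)

cache = FlakyCache({"stock": 10})
database = {"stock": 10}

def write(key, value):
    database[key] = value   # 1. the database update succeeds
    cache.delete(key)       # 2. the cache deletion may fail

cache.fail_delete = True
try:
    write("stock", 9)
except ConnectionError:
    pass  # the database already holds the new value; the cache was never cleared

stale_value = cache["stock"]      # old value still served from the cache
fresh_value = database["stock"]   # new value in the database
```

After the failed deletion the cache still answers with 10 while the database holds 9.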


Double-write consistency solutions


1. Delete the cache first, then update the database

Since the problem is caused by a failed cache deletion, make sure the cache entry is deleted successfully before touching the database. In other words, delete the cache first, and then update the database.

At this point you may wonder: what if the database update fails? In that case the cache is empty after the successful deletion, but the database update fails and the database still holds the old data. A read request that arrives now misses the cache, reads the old data from the database, and writes it to the cache. The cache and the database therefore stay consistent.
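A sketch of this case, again with in-memory stand-ins (assumptions) for the cache and the database. `failing_update` is a hypothetical database update that always fails, for example a transaction that rolls back.

```python
cache = {"stock": 10}
database = {"stock": 10}

def failing_update(key, value):
    """Hypothetical database update that always fails, e.g. a rolled-back transaction."""
    raise RuntimeError("database update failed")

def write(key, value, db_update):
    cache.pop(key, None)     # 1. delete the cache entry first
    db_update(key, value)    # 2. then update the database (may raise)

def read(key):
    if key not in cache:
        cache[key] = database[key]   # cache miss: reload from the database
    return cache[key]

try:
    write("stock", 9, failing_update)
except RuntimeError:
    pass  # the cache is empty and the database is unchanged

value = read("stock")   # reloads the old, but consistent, value
```

The read returns the old value, and the cache and the database agree.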

So is everything fine now, and can you call it a day? Not yet. There is still a problem.

2. Delayed double-delete cache strategy

If the project has very low concurrency and only a few visits per day, "delete the cache first, then update the database" works well enough, and data inconsistency occurs only rarely. But this strategy is only a rudimentary solution. Why?

Under high concurrency, a write request and a read request can easily arrive at almost the same time.

The write request first deletes the cache entry and then goes on to update the database. At this moment, however, the update has not completed yet, for example because its transaction has not committed.

Meanwhile, the read request finds no data in the cache, queries the database, and writes the result into the cache.

The read request therefore gets the old, unmodified data and caches it. Shortly afterwards the write request finishes updating the database, and the cache and the database are now inconsistent.
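The interleaving can be replayed step by step. This sketch does not use real threads; it simply executes the operations of the two requests in the problematic order, on in-memory stand-ins (assumptions) for the cache and the database.

```python
cache = {"stock": 10}
database = {"stock": 10}

# Step 1 (write request): delete the cache entry.
cache.pop("stock", None)

# Step 2 (read request): cache miss, so read the database, which still holds the old value.
value_seen_by_reader = database["stock"]

# Step 3 (read request): write the old value back into the cache.
cache["stock"] = value_seen_by_reader

# Step 4 (write request): the database update finally commits.
database["stock"] = 9

inconsistent = cache["stock"] != database["stock"]   # cache holds 10, database holds 9
```

Nothing failed in this sequence, yet the cache keeps the old value until something deletes or overwrites the entry.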



Solution
The write request first deletes the Redis cache entry. After the database update succeeds, it waits for a short while and then deletes the cache entry again.
Reads stay fast with this solution, but dirty data can still be served for a short time, between the moment a stale value is cached and the second delete.
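A minimal sketch of the delayed double delete, assuming in-memory dictionaries in place of Redis and the database. The helper names and the 50 ms delay are illustrative assumptions; in practice the delay is usually chosen to be somewhat longer than a read request needs to query the database and fill the cache. `update_with_concurrent_reader` is a hypothetical update during which a concurrent reader caches the old value, which is exactly the race described above.

```python
import threading

cache = {"stock": 10}
database = {"stock": 10}

def update_with_concurrent_reader(key, value):
    """Hypothetical database update during which a reader refills the cache with old data."""
    cache[key] = database[key]   # a concurrent reader caches the old value mid-update
    database[key] = value        # the update then commits

def write(key, value, db_update, delay_seconds=0.05):
    cache.pop(key, None)         # 1. first delete
    db_update(key, value)        # 2. update the database
    # 3. second delete, scheduled to run after the delay
    timer = threading.Timer(delay_seconds, cache.pop, args=(key, None))
    timer.start()
    return timer

timer = write("stock", 9, update_with_concurrent_reader)
timer.join()                                    # wait for the second delete (demo only)
value = cache.get("stock", database["stock"])   # cache is empty again, so the database value is used
```

Scheduling the second delete on a timer keeps the write request from blocking for the length of the delay; sleeping inside the request would be a simpler but slower alternative.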

Summary

Cache-aside (bypass caching) strategy

On a read, check the cache first. If the data is not cached, read it from the database, store it in the cache, and return the response.

When updating, update the database first, and then delete the cache.

Double-write consistency solutions

Delete the cache first, then update the database:
This solves the inconsistency between the database and the cache caused by a failed cache deletion, and it suits business scenarios with low concurrency.

Delayed double-delete cache strategy:
This solves the inconsistency caused by read and write requests arriving at the same time under high concurrency. Reads are fast, but dirty data may appear for a short time.


Note

Each solution has its own advantages and disadvantages, and no single technical solution fits every business. Choose a solution according to the needs of the business itself. There is no best solution, only the most suitable one.


Origin blog.csdn.net/QiuHaoqian/article/details/109089867