[Redis] Consistency between the Redis cache and the database

[1] Four synchronization strategies

When both the cache and the database have to be written, there are four ways to order the two writes and keep them consistent, that is, four synchronization strategies:
(1) update the cache first, then update the database;
(2) update the database first, then update the cache;
(3) delete the cache first, then update the database;
(4) update the database first, then delete the cache.

Two questions follow: is it better to update the cache or to delete it, and should the database or the cache be operated on first?

[2] Update the cache or delete the cache

(1) Update cache

(1) Advantages
The cache is refreshed on every data change, so queries rarely miss.

(2) Disadvantages
Updating the cache is relatively expensive. If the value has to go through a complex calculation before it is written to the cache, frequent updates will hurt server performance. In a write-heavy scenario, the cache may be updated many times without any request reading the data in between, so most of that work is wasted.

(2) Delete the cache

(1) Advantages
The operation is simple: however complicated the update is, the cached data is simply deleted.

(2) Disadvantages
After the cache is deleted, the next query misses the cache and has to read the database again.

From the comparison above, deleting the cache is generally the better option.

[3] Update the database first or delete the cache first

(1) When a failure occurs

First, compare deleting the cache first with updating the database first when one of the steps fails:

1- Delete the cache first, then update the database (updating the database failed)

Problems that can occur when deleting the cache first and then updating the database, if a step fails:
(1) Thread A deletes the cache successfully, but then fails to update the database;
(2) Thread B tries to read from the cache; because the cache entry was deleted, thread B cannot get the data from the cache and reads it from the database instead. Since the database update failed, thread B gets the old data from the database and writes it into the cache;
(3) In the end, the cache and the database are consistent, but both still hold the old data.
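
The read path that thread B follows in step (2) can be sketched as follows. This is only a minimal illustration: the class name `CacheAsideReader` and the two in-memory maps standing in for Redis and the database are assumptions, not part of the original article.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-ins for Redis and the database.
class CacheAsideReader {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();

    // Read path: try the cache, fall back to the database on a miss,
    // then write the loaded value back into the cache.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String fromDb = database.get(key);
        if (fromDb != null) {
            cache.put(key, fromDb);
        }
        return fromDb;
    }
}
```

If the database still holds the old value when `read` runs, that old value is exactly what ends up back in the cache.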


2- Update the database first, then delete the cache (deleting the cache failed)

Problems that can occur when updating the database first and then deleting the cache, if a step fails:
(1) Thread A updates the database successfully, but then fails to delete the cache;
(2) Thread B reads the cache successfully, but because the deletion failed, what thread B gets is the old data still in the cache;
(3) Eventually thread A deletes the cache successfully (for example, on a retry), and threads that access the same data afterwards see the same data as the database;
(4) In the end, the cache and the database are consistent, but some threads read old data along the way.


3- Summary

From the comparison above, when a failure occurs it is not possible to say clearly whether deleting the cache first or updating the database first is better, because both have problems. How can the problems in these scenarios be solved? A retry mechanism is recommended.

(2) When there is no failure

1- Delete the cache first, then update the database

(1) Thread A deletes the cache successfully;
(2) Thread B reads the cache and misses;
(3) Thread B reads the database successfully and gets the old data;
(4) Thread B writes the old data into the cache;
(5) Thread A writes the new data to the database.

Both of thread A's steps succeed, but because of concurrency, thread B accessed the cache between them. The end result is that the cache holds old data while the database holds new data, so the two are inconsistent.
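
The interleaving can be replayed deterministically to see the result. The sketch below runs the five steps in order on a single thread; the class `DeleteFirstRace` and the maps standing in for Redis and the database are hypothetical and exist only for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Single-threaded replay of the interleaving; the maps are hypothetical stand-ins.
class DeleteFirstRace {
    static final Map<String, String> cache = new HashMap<>();
    static final Map<String, String> database = new HashMap<>();

    static void replay() {
        // Initial state: cache and database agree on the old value.
        database.put("k", "old");
        cache.put("k", "old");

        cache.remove("k");                 // (1) thread A deletes the cache
        String seenByB = cache.get("k");   // (2) thread B misses the cache
        if (seenByB == null) {
            seenByB = database.get("k");   // (3) thread B reads old data from the database
            cache.put("k", seenByB);       // (4) thread B writes old data into the cache
        }
        database.put("k", "new");          // (5) thread A writes new data to the database
    }
}
```

After `replay()`, the cache holds the old value and the database holds the new one, which is the inconsistency described above.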

Delayed double deletion solves this problem

Deleting the cache first and then updating the database can therefore leave the data inconsistent even when nothing fails. If this ordering has to be used in practice for other reasons, the delayed double-delete strategy can help. The basic idea is:
(1) delete the cache;
(2) update the database;
(3) sleep for N milliseconds;
(4) delete the cache again.

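public void write(String key, Object data) throws InterruptedException {
    // (1) Delete the cache before touching the database.
    Redis.delKey(key);
    // (2) Update the database.
    db.updateData(data);
    // (3) Wait for in-flight reads to finish; tune the delay to your workload.
    Thread.sleep(1000);
    // (4) Delete again to evict any old value written back in the meantime.
    Redis.delKey(key);
}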

After blocking for a while, the cache is deleted again, which removes any inconsistent data written to the cache in the meantime. The sleep time should be estimated from how long a read takes in your business (reading the database and writing the result back to the cache) and set slightly longer than that. With this, the database and the cache end up consistent again.
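
Blocking the writing thread with `Thread.sleep` holds up the caller for the whole delay. One possible variation, not described in the original article, is to schedule the second delete on a background executor. The sketch below assumes in-memory maps in place of Redis and the database; the class name `DelayedDoubleDelete` and the returned `CompletableFuture` are illustrative choices.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: delayed double delete without blocking the caller.
class DelayedDoubleDelete {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();
    final ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();

    // Returns a future that completes once the second delete has run.
    CompletableFuture<Void> write(String key, String value, long delayMillis) {
        cache.remove(key);           // (1) first delete
        database.put(key, value);    // (2) update the database
        CompletableFuture<Void> secondDelete = new CompletableFuture<>();
        // (3) + (4) delete again after the delay, on the scheduler thread
        scheduler.schedule(() -> {
            cache.remove(key);
            secondDelete.complete(null);
        }, delayMillis, TimeUnit.MILLISECONDS);
        return secondDelete;
    }
}
```

The trade-off is that the write returns before the second delete has happened, so a scheduled delete is lost if the process stops during the delay.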

What if the database uses read-write separation? (force reads from the master database)

If the database uses a read-write separation architecture, a new problem appears. Suppose there are two requests, request A (an update) and request B (a query):
(1) Request A performs the update and deletes the Redis entry;
(2) Request A updates the master database, and the master starts synchronizing the data to the slave database;
(3) Request B queries Redis and finds no data;
(4) Request B reads from the slave database;
(5) At this point master-slave synchronization has not finished, so the data obtained is old.

The solution is to force the query to go to the master database whenever the database is read in order to refill Redis.
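
A minimal sketch of that routing rule, assuming three in-memory maps as stand-ins for Redis, the master database, and a slave that may lag behind. The class name `MasterReadOnRefill` is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: cache refills always read from the master.
class MasterReadOnRefill {
    final Map<String, String> cache = new HashMap<>();
    final Map<String, String> master = new HashMap<>();
    final Map<String, String> slave = new HashMap<>(); // may lag behind the master

    // Cache hits are served directly. On a miss, the refill query goes to
    // the master, so a lagging slave cannot put old data back into the cache.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String fresh = master.get(key);
        if (fresh != null) {
            cache.put(key, fresh);
        }
        return fresh;
    }
}
```

Only the refill path is pinned to the master here; queries that never touch the cache could still go to the slave.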

What should I do if the deletion fails?

If the deletion still fails, it can be retried, but the number of retries must be limited. Once the limit is exceeded, take measures such as reporting an error, writing a log entry, and sending an email alert.
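
A bounded retry could look like the sketch below. The helper `BoundedRetry.retry` is hypothetical; the action passed in stands for the cache deletion and returns `true` on success, and the final `System.err` line is only a placeholder for real logging or alerting.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: retry an action a limited number of times.
class BoundedRetry {
    // Runs the action up to maxAttempts times and returns true on the first success.
    static boolean retry(BooleanSupplier action, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                if (action.getAsBoolean()) {
                    return true;
                }
            } catch (RuntimeException e) {
                // An exception counts as a failed attempt; fall through and retry.
            }
        }
        // Retries exhausted: this is where error reporting, logging, or an alert would go.
        System.err.println("cache delete failed after " + maxAttempts + " attempts");
        return false;
    }
}
```

For example, `BoundedRetry.retry(() -> deleteKey("user:1"), 3)` would try the deletion at most three times, where `deleteKey` is whatever deletion call the application uses.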

2- Update the database first, then delete the cache (optimal solution)

(1) Thread A updates the database successfully;
(2) Thread B reads the cache successfully and gets the old data;
(3) Thread A deletes the cache successfully.

In the end the cache and the database are consistent and up to date: the old entry is gone, and the next read reloads the latest data from the database. Thread B does read old data along the way, and other threads like it may also read old data from the cache between thread A's two steps, but because those two steps run quickly, the impact is small. Once both steps are done, threads that read the cache no longer run into thread B's problem.
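
The write path for this strategy is short. The sketch below assumes in-memory maps in place of Redis and the database; the class name `UpdateThenDelete` is hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of the "update the database, then delete the cache" write path.
class UpdateThenDelete {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();

    void write(String key, String value) {
        database.put(key, value);  // (1) update the database first
        cache.remove(key);         // (3) then delete the cache entry
    }
}
```

Any read that lands between the two statements still sees the old cached value, which is the short window described above.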

Compensation for deletes using message queues

Updating the database first and then deleting the cache can also go wrong. For example, the database update succeeds, but an error occurs while deleting the cache and the deletion does not take effect. From then on, every read of the cache returns the wrong (old) data.

The solution is to use a message queue to compensate for the failed deletion. The logic is as follows:
(1) Request thread A updates the database first;
(2) Deleting from Redis reports an error, and the deletion fails;
(3) The Redis key is then sent to the message queue as the message body;
(4) After receiving the message from the queue, the system deletes the key from Redis again.
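
The four steps can be sketched as follows. To keep the example self-contained, an in-process `LinkedBlockingQueue` stands in for a real message broker and a flag simulates Redis being unavailable; the class `DeleteCompensation` and all of its members are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: compensate failed cache deletes through a queue.
class DeleteCompensation {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();
    // In-process queue standing in for a real message queue.
    final BlockingQueue<String> retryQueue = new LinkedBlockingQueue<>();
    volatile boolean cacheAvailable = true; // toggled to simulate a Redis failure

    boolean deleteFromCache(String key) {
        if (!cacheAvailable) {
            return false;
        }
        cache.remove(key);
        return true;
    }

    void write(String key, String value) {
        database.put(key, value);        // (1) update the database
        if (!deleteFromCache(key)) {     // (2) the delete fails
            retryQueue.offer(key);       // (3) send the key as the message body
        }
    }

    // (4) Consumer side: retry each pending key once; re-queue it if it fails again.
    void consumePending() {
        int pending = retryQueue.size();
        for (int i = 0; i < pending; i++) {
            String key = retryQueue.poll();
            if (key != null && !deleteFromCache(key)) {
                retryQueue.offer(key);
            }
        }
    }
}
```

In a real deployment the consumer would run in its own process or thread and the queue would be durable, so a pending delete survives a restart.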

However, this solution has a drawback: it is intrusive to the business code and couples tightly with it. An optimization is available. Every update to a MySQL database is recorded in the binlog, so we can subscribe to the MySQL binlog, find the corresponding operations there, and operate on the cache from that.
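
The cache-side handler for such a subscription might look like the sketch below. It does not use any real binlog client API: the `RowChange` type, the `onRowChange` callback, and the `table:primaryKey` cache key format are all assumptions made for illustration, and a map stands in for Redis.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: invalidate cache entries from decoded binlog events.
class BinlogCacheInvalidator {
    // Assumed shape of a row-change event after the binlog has been decoded.
    static final class RowChange {
        final String table;
        final String primaryKey;

        RowChange(String table, String primaryKey) {
            this.table = table;
            this.primaryKey = primaryKey;
        }
    }

    final Map<String, String> cache = new ConcurrentHashMap<>();

    // Called by the (assumed) binlog subscriber for each committed row change.
    void onRowChange(RowChange change) {
        // Assumed cache key convention: "<table>:<primaryKey>".
        cache.remove(change.table + ":" + change.primaryKey);
    }
}
```

With this arrangement the business code only writes to the database, and cache deletion happens in a separate component driven by the binlog.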

[4] Summary

Under normal circumstances, deleting the cache is better than updating it, and updating the database first is better than deleting the cache first. Overall, [update the database first, then delete the cache] has the least impact of the four strategies and is the best option.

However, if you need to use [delete the cache first, then update the database], you can address its problems with [delayed double deletion], [forcing reads to the master database under read-write separation], and a [retry mechanism].

If deleting the cache fails when using [update the database first, then delete the cache], you can use [binlog synchronization to Redis] to solve the problem.


Origin blog.csdn.net/weixin_44823875/article/details/129051198