How to ensure data consistency when the cache and the database are double-written?

While doing some system optimization, I came up with the idea of tiered storage for data. Some data in the system has low real-time requirements, such as configuration information: a configuration may change only once in a long while. Other data, such as order and payment flow data, has very high real-time requirements. So the data is divided into three levels according to how fresh it needs to be.

  • Level 1: order data and payment flow data. These have high requirements for real-time accuracy, so no cache is added at all; reads and writes go directly to the database.

  • Level 2: user-related data. This data is read far more often than it is written, so we cache it in Redis.

  • Level 3: payment configuration information. This data has nothing to do with individual users, is small in volume, is read frequently, and is almost never modified, so we cache it in local memory (a minimal sketch follows this list).
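
To make the Level 3 idea concrete, here is a minimal sketch (not the actual code of the system) of a local in-memory configuration cache with a time-to-live; `load_config_from_db` and the 300-second TTL are assumptions for illustration:

```python
import time

def load_config_from_db(key):
    # Hypothetical loader; the real system would query the payment config table.
    return {"example": "config"}

class LocalConfigCache:
    """Tiny in-process cache for rarely changing configuration (Level 3)."""

    def __init__(self, ttl_seconds=300):
        self._ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[1] > time.time():
            return entry[0]                                   # still fresh: serve from memory
        value = load_config_from_db(key)                      # expired or missing: reload
        self._store[key] = (value, time.time() + self._ttl)
        return value
```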

However, as soon as a cache is involved, whether it is local memory or Redis, there is a data synchronization problem: the configuration information lives in memory, and the memory cannot perceive modifications made to the data in the database. This means the data in the database and the data in the cache can become inconsistent. The rest of this article discusses how to keep the cache and the database consistent when both are written.

Solutions

So here we list all the strategies and discuss their pros and cons.

  1. Update the database first, then update the cache
  2. Update the database first, then delete the cache
  3. Update the cache first, then update the database
  4. Delete the cache first, then update the database

Update the database first, then update the cache

This scheme is rarely used in practice, and the main reason is the cache-update step. Why? Because in some business scenarios the cached value is not simply a copy of what is read from the database; producing it may require a series of calculations, which makes every cache update expensive. If there are many write requests but few read requests, recomputing and rewriting the cache on every single write wastes a great deal of work.

For example, suppose the database holds the value 1 and ten requests come in, each adding 1 to it, with no reads arriving during that period. If we update the cache on every write, the cache is rewritten ten times and a lot of cold data is produced. If we delete the cache instead of updating it, the cache is rebuilt only once, when the next read request arrives.
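
As a sketch of the "delete instead of update" idea in cache-aside style, assuming redis-py and hypothetical `db_read_counter` / `db_increment_counter` helpers (the key name `counter` is also made up for the example):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def db_read_counter():
    # Hypothetical database read; a real system would run a SELECT here.
    return 0

def db_increment_counter():
    # Hypothetical database write; a real system would run an UPDATE here.
    pass

def write_path():
    db_increment_counter()       # 1. write to the database
    r.delete("counter")          # 2. delete the cached value instead of recomputing it

def read_path():
    cached = r.get("counter")
    if cached is not None:
        return int(cached)       # cache hit
    value = db_read_counter()    # cache miss: load from the database...
    r.set("counter", value)      # ...and back-fill the cache once
    return value
```

No matter how many writes arrive, the cache is only recomputed when a read actually needs it.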

Update the cache first, then update the database

This case hardly needs separate discussion: it has the same drawback as the first scheme.

Delete the cache first, then update the database

This scheme also has problems; the specific reasons are as follows.

At this point there are two requests: request A (an update) and request B (a query).

  1. Request A first deletes the data in Redis, and then goes to the database to perform the update.
  2. Request B then queries Redis, finds no data, reads the value from the database, and writes it back into Redis.
  3. But at this point request A's update has not completed, or its transaction has not yet been committed, so the value request B cached is the old one.

The result is that the database and Redis are inconsistent. How do we solve this? The simplest remedy is the delayed double deletion strategy.

Delayed double deletion
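
Delayed double deletion means: delete the cache, update the database, and then delete the cache once more after a short delay (or after the transaction has committed), so that any stale value written back by a concurrent read is cleared as well. A minimal sketch, assuming redis-py, a hypothetical `db_update` helper, and an illustrative 500 ms delay:

```python
import threading
import redis

r = redis.Redis(host="localhost", port=6379)

def db_update(key, value):
    # Hypothetical helper: update the row inside a transaction and commit.
    pass

def update_with_delayed_double_delete(key, value, delay_seconds=0.5):
    r.delete(key)                 # 1. first deletion, before touching the database
    db_update(key, value)         # 2. update the database (the transaction commits here)
    # 3. delete again after a short delay so that a stale value back-filled by a
    #    concurrent read (request B in the scenario above) is also removed
    threading.Timer(delay_seconds, lambda: r.delete(key)).start()
```

The delay only needs to be long enough to cover the window in which a concurrent read could have loaded the old value.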

However, even if we make sure to delete the cache only after the transaction has committed, another problem remains: if you are using MySQL's read-write separation (master-slave) architecture, there is a replication lag between the master and the slaves.

Master-slave synchronization time difference

Again there are two requests: request A (an update) and request B (a query).

  1. Request A deletes the key in Redis as part of its update.
  2. Request A performs the update on the master library, and the master starts synchronizing the change to the slave library.
  3. Request B queries Redis and finds no data there.
  4. Request B therefore reads from the slave library.
  5. At this point replication has not finished, so request B gets the old data (and may write it back into Redis).

The fix here is: whenever a query is going to back-fill Redis, force that query to read from the master library instead of the slave.

Read from the master library
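
A sketch of the read path under this fix; `query_master` is a hypothetical helper that routes the SELECT to the master, and only the read that back-fills the cache is forced onto the master (ordinary reads that do not touch the cache can keep using the slave):

```python
import redis

r = redis.Redis(host="localhost", port=6379)

def query_master(key):
    # Hypothetical helper: run the SELECT against the master database.
    return "value-from-master"

def read_path(key):
    cached = r.get(key)
    if cached is not None:
        return cached.decode()    # cache hit: serve straight from Redis
    # Cache miss: this read will back-fill Redis, so it must go to the master
    # to avoid caching stale data from a lagging slave.
    value = query_master(key)
    r.set(key, value)
    return value
```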

Update the database first, then delete the cache

Problem: this scheme can also go wrong. For example, the database update succeeds, but the subsequent cache deletion throws an error and fails. From then on, every read that hits the cache returns stale data.

The solution at this point is to use a message queue to compensate for failed deletions (see the sketch after this list). The logic is as follows:

  1. Request A updates the database.
  2. The subsequent delete on Redis throws an error and fails.
  3. The Redis key is then sent to a message queue as the message body.
  4. A consumer receives the message and deletes the key from Redis again, retrying until it succeeds.
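
A minimal sketch of this compensation flow. The queue here is Kafka via kafka-python purely as an example; the topic name `cache-delete-retry` and the `db_update` helper are assumptions:

```python
import redis
from kafka import KafkaProducer, KafkaConsumer

r = redis.Redis(host="localhost", port=6379)
producer = KafkaProducer(bootstrap_servers="localhost:9092")

def db_update(key, value):
    # Hypothetical database update helper.
    pass

def write_path(key, value):
    db_update(key, value)                              # 1. update the database
    try:
        r.delete(key)                                  # 2. try to delete the cache
    except redis.RedisError:
        # 3. deletion failed: hand the key to the queue for compensation
        producer.send("cache-delete-retry", key.encode())

def compensation_consumer():
    consumer = KafkaConsumer("cache-delete-retry",
                             bootstrap_servers="localhost:9092")
    for message in consumer:                           # 4. retry the deletion
        try:
            r.delete(message.value.decode())
        except redis.RedisError:
            producer.send("cache-delete-retry", message.value)  # re-enqueue on failure
```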

However, this solution has a drawback: it intrudes heavily into the business code and couples the two tightly. There is a further optimization. Every update to the MySQL database leaves a corresponding record in the binlog, so we can subscribe to the MySQL binlog and operate on the cache from there instead of from the business code.

Delete cache by subscribing to binlog
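
As one possible illustration (not necessarily the author's setup), here is a sketch that uses the python-mysql-replication library to tail the binlog and evict cache entries; tools such as Canal are a common alternative. The connection settings, schema `pay`, table `pay_config`, and key format are assumptions:

```python
import redis
from pymysqlreplication import BinLogStreamReader
from pymysqlreplication.row_event import DeleteRowsEvent, UpdateRowsEvent, WriteRowsEvent

r = redis.Redis(host="localhost", port=6379)

MYSQL = {"host": "localhost", "port": 3306, "user": "repl", "passwd": "repl"}

def run():
    stream = BinLogStreamReader(
        connection_settings=MYSQL,
        server_id=100,                       # must be unique among replication clients
        only_schemas=["pay"],                # hypothetical schema
        only_tables=["pay_config"],          # hypothetical table backing the cache
        only_events=[WriteRowsEvent, UpdateRowsEvent, DeleteRowsEvent],
        blocking=True,
        resume_stream=True,
    )
    for event in stream:
        for row in event.rows:
            # Updates expose new values under "after_values"; inserts/deletes use "values".
            values = row.get("after_values", row.get("values", {}))
            config_id = values.get("id")
            if config_id is not None:
                r.delete(f"pay_config:{config_id}")   # evict the cached entry for this row

if __name__ == "__main__":
    run()
```

Because this runs as an independent process, the business code no longer needs any cache-deletion logic, which is exactly the decoupling described above.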

Summary

Each scheme has its own advantages and disadvantages. For example, with "delete the cache first, then update the database", we ended up solving the replication-lag problem by forcing the queries that back-fill Redis to read from the master library. That approach intrudes into the business code, but it does not require an additional system and does not increase the complexity of the overall service. For the last scheme, we ended up subscribing to the binlog and building an independent system to operate Redis; its drawback is precisely that it increases the system's complexity. In the end, every choice has to be evaluated against our own business; no technique fits every business. There is no best solution, only the one that suits us best.
