Redis cache and database double write consistency

The problem of double-write consistency between a database and a cache (such as Redis) is a common one, independent of the programming language, and it becomes especially serious in high-concurrency scenarios. In this article, I will work from the simple to the subtle, walking through the common solutions to the database/cache double-write consistency problem, the pitfalls hidden in those solutions, and which solution is best.

In theory, setting an expiration time on the cache is enough to guarantee eventual consistency. Under this scheme, every cached entry carries an expiration time, all writes treat the database as the source of truth, and cache operations are merely best-effort. That is, if the database write succeeds but the cache update fails, then once the entry expires, subsequent read requests will naturally read the new value from the database and backfill the cache. The strategies discussed below therefore do not rely on cache expiration.
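The expiration-based fallback can be sketched minimally in Python, using plain dicts as stand-ins for Redis and the database (all names here are illustrative, not a real Redis API):

```python
import time

db = {"user:1": "alice"}      # stand-in for the database
cache = {}                    # stand-in for Redis: key -> (value, expires_at)
TTL = 0.2                     # cache expiration time in seconds

def read(key):
    entry = cache.get(key)
    if entry is not None and entry[1] > time.time():
        return entry[0]                        # cache hit, not yet expired
    value = db.get(key)                        # miss/expired: read the database
    cache[key] = (value, time.time() + TTL)    # backfill with an expiration time
    return value

read("user:1")                # first read backfills "alice"
db["user:1"] = "bob"          # the write hits the database; the cache update "fails"
stale = read("user:1")        # still serves the stale value...
time.sleep(TTL + 0.05)
fresh = read("user:1")        # ...until the entry expires and is refilled
```

The stale window is bounded by the TTL, which is exactly why expiration alone already gives eventual consistency.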

Here, we discuss four update strategies:

  1. Update the cache first, then update the database

  2. Update the database first, then update the cache

  3. Delete the cache first, then update the database

  4. Update the database first, then delete the cache

1. Update the cache first, then update the database

This strategy should not be used in real projects; its problems are obvious:

As shown in the figure above, consider a user's write operation: the cache has just been written when a network exception occurs, causing the write to the database to fail.

The result is that the cache holds the latest data but the database does not, so the cached value has become dirty data. If one of the user's query requests happens to read it at this moment, there is a real problem, because the data does not exist in the database at all, and that is a serious inconsistency.

We all know that the main purpose of a cache is to hold database data temporarily in memory so that subsequent queries are faster.

But if a piece of data does not exist in the database at all, what is the point of caching such phantom data?

Therefore, updating the cache first and then the database is not advisable, and it is rarely used in practice.
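The failure mode can be made concrete with a minimal sketch (dicts stand in for Redis and the database; the simulated `IOError` represents the network exception):

```python
cache, db = {}, {}

def write_cache_first(key, value, db_ok=True):
    cache[key] = value                  # step 1: update the cache
    if not db_ok:
        raise IOError("network error")  # step 2: the database write fails
    db[key] = value

try:
    write_cache_first("user:1", "alice", db_ok=False)
except IOError:
    pass
# the cache now holds a value the database has never seen: dirty data
```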

2. Update the database first, then update the cache

This scheme is also generally rejected, for two main reasons:

Reason one:

From a thread-safety perspective, if requests A and B both perform update operations at the same time, the following interleaving can occur:

(1) Thread A updates the database
(2) Thread B updates the database
(3) Thread B updates the cache
(4) Thread A updates the cache

Request A's cache update should have happened before request B's, but due to network delays and the like, B updated the cache before A did, and A's stale value then overwrote B's fresh one. This produces dirty data, so the ordering cannot be relied upon.
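Replaying the four numbered steps sequentially (a sketch, with dicts standing in for the database and cache) shows exactly where the stores diverge:

```python
cache, db = {}, {}

# A writes 1, B writes 2; the four steps interleave exactly as listed above:
db["stock"] = 1      # (1) thread A updates the database
db["stock"] = 2      # (2) thread B updates the database
cache["stock"] = 2   # (3) thread B updates the cache
cache["stock"] = 1   # (4) thread A's delayed cache update lands last

# database says 2, cache says 1: dirty data until the next write
```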

Reason two:

From the perspective of business scenarios, there are two problems:

(1) In a business scenario with many database writes and few reads, this scheme updates the cache frequently even though the data is rarely read, wasting performance.

(2) If the value written to the cache is not the raw database value but the result of a series of complex calculations, then recomputing that cached value after every single database write is likewise a waste of performance.

Clearly, deleting the cache is the better fit. The remaining question, and the most controversial one, is whether to delete the cache first and then update the database, or to update the database first and then delete the cache.

3. Delete the cache first, then update the database

The reason this scheme can cause inconsistency is as follows: at the same moment, request A performs an update while request B performs a query. The following sequence can then occur:

(1) Request A deletes the cache before performing its write
(2) Request B queries and finds the cache empty
(3) Request B queries the database and gets the old value
(4) Request B writes the old value into the cache
(5) Request A writes the new value into the database

If no expiration time is set on the cache, the data will remain dirty indefinitely!
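Replaying the five steps sequentially (again a dict-based sketch) makes the permanent divergence visible:

```python
cache, db = {}, {"user:1": "old"}

cache.pop("user:1", None)     # (1) request A deletes the cache
_ = cache.get("user:1")       # (2) request B queries the cache: miss
value = db["user:1"]          # (3) request B reads the OLD value from the database
cache["user:1"] = value       # (4) request B backfills the old value
db["user:1"] = "new"          # (5) request A finally writes the new value

# with no TTL, nothing will ever evict the stale cache entry
```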

Solution: delayed double deletion strategy:

(1) Delete the cache first
(2) Then write to the database
(3) Sleep for a period (say, one second), then delete the cache again

The purpose of the second delete is to remove any dirty cache entry produced during the sleep window (the sleep time must be chosen according to how long the project's read-side business logic takes).
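A minimal sketch of delayed double deletion, with the concurrent reader's dirty backfill simulated inline between steps (2) and (3):

```python
import time

db, cache = {"user:1": "old"}, {"user:1": "old"}

def delayed_double_delete(key, value, sleep_s=0.05):
    cache.pop(key, None)          # (1) delete the cache first
    db[key] = value               # (2) write the new value to the database
    cache[key] = "old"            # a concurrent read slips in and backfills the OLD value
    time.sleep(sleep_s)           # (3) sleep past that read's round trip...
    cache.pop(key, None)          #     ...then delete the cache a second time

delayed_double_delete("user:1", "new")
# the second delete has removed the dirty entry; the next read backfills "new"
```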

What if it is MySQL's read-write separation architecture?

In that case, the inconsistency arises as follows. Again there are two requests: request A performs an update and request B performs a query.

(1) Request A to perform a write operation and delete the cache;
(2) Request A to write data into the database;
(3) Request B to query the cache and find that the cache has no value;
(4) Request B queries from the database; at this point master-slave synchronization has not completed, so it reads the old value;
(5) Request B to write the old value into the cache;
(6) The database completes the master-slave synchronization, and the slave library becomes the new value;

The scenario above is the cause of the inconsistency. The delayed double-deletion strategy still applies, except that the sleep time becomes the master-slave synchronization delay plus a few hundred milliseconds.

What should I do if the throughput decreases due to the delayed double deletion strategy?

Start another thread to perform the second delete asynchronously. The write request then returns without sleeping, which restores throughput.
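One way to sketch this is with `threading.Timer`, which runs the second delete on another thread after a delay while the write path returns at once (dicts again stand in for Redis and the database):

```python
import threading

db, cache = {"user:1": "old"}, {"user:1": "old"}

def write_with_async_double_delete(key, value, delay_s=0.05):
    cache.pop(key, None)                                       # first delete
    db[key] = value                                            # write the database
    timer = threading.Timer(delay_s, cache.pop, (key, None))   # second delete fires
    timer.start()                                              # on another thread
    return timer   # the write path returns immediately; no sleep on the caller

t = write_with_async_double_delete("user:1", "new")
t.join()           # demo only: wait so the final state can be inspected
```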

That raises another question: what if the second cache deletion itself fails? The retry mechanism discussed below addresses this.

4. Update the database first, then delete the cache

This strategy is not free of concurrency problems either. In the following situation, dirty data can still be produced. Suppose there are two requests: request A performs a query and request B performs an update. The following sequence can occur:

(1) The cache entry happens to have just expired
(2) Request A to query the database and get an old value
(3) Request B to write the new value into the database
(4) Request B to delete the cache
(5) Request A to write the old value found into the cache

However, this situation is still relatively rare, and the following conditions must be met at the same time:

1. The cache just expires automatically.

2. Requesting A to find out the old value from the database and updating the cache takes longer than requesting B to write to the database and delete the cache.

We all know that querying a database is generally faster than writing to it, let alone writing to the database and then also deleting the cache. So in most cases, a write request takes longer than a read request.

It can be seen that the probability that the system satisfies the above two conditions at the same time is very small.

It is recommended to use the scheme of writing to the database first and then deleting the cache. Although it cannot avoid data inconsistency 100%, it has the smallest probability of the problem occurring compared with the other schemes.
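The recommended write path, paired with the usual cache-aside read path, looks like this minimal sketch (dicts as stand-ins):

```python
db, cache = {"user:1": "old"}, {"user:1": "old"}

def write(key, value):
    db[key] = value         # 1. update the database first
    cache.pop(key, None)    # 2. then delete the cache

def read(key):
    if key in cache:
        return cache[key]   # cache hit
    value = db.get(key)     # miss: load from the database
    cache[key] = value      # backfill so later reads hit the cache
    return value

write("user:1", "new")
result = read("user:1")     # the miss after deletion backfills the new value
```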

But in this scenario, what if deleting the cache fails?

Answer: The retry mechanism needs to be added.

In the interface, if the database update succeeds but the cache delete fails, retry immediately up to 3 times. If any attempt succeeds, return success directly. If all three fail, write the failed key to a retry table in the database for subsequent processing.
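A sketch of the synchronous retry, with the flaky cache and the retry table simulated in memory (the `failures` counter stands in for transient Redis outages):

```python
cache = {"user:1": "old"}
retry_table = []                  # stand-in for a retry table in the database
failures = {"left": 2}            # simulate a cache that fails twice, then recovers

def delete_cache(key):
    if failures["left"] > 0:
        failures["left"] -= 1
        raise ConnectionError("redis unavailable")
    cache.pop(key, None)

def delete_with_retry(key, attempts=3):
    for _ in range(attempts):
        try:
            delete_cache(key)
            return True           # any success returns immediately
        except ConnectionError:
            continue
    retry_table.append(key)       # all attempts failed: persist for later processing
    return False

ok = delete_with_retry("user:1")  # succeeds on the third attempt here
```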

Of course, if you directly retry synchronously in the interface, when the concurrency of the interface is relatively high, the performance of the interface may be slightly affected.

At this time, you need to change to asynchronous retry.

There are many ways to asynchronously retry, such as:

1. Start a separate thread each time, dedicated to retrying. In a high-concurrency scenario, however, this may create too many threads and cause an OOM problem in the system, so it is not recommended.

2. Hand over the retried tasks to the thread pool, but if the server restarts, some data may be lost.

3. Write the retry data to the table, and then use elastic-job and other scheduled tasks to retry.

4. Write the retried request into message middleware such as mq, and process it in the consumer of mq.
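Option 4 can be sketched with Python's standard `queue.Queue` standing in for the message middleware: the write path only enqueues the failed delete, and a consumer thread performs the actual deletion.

```python
import queue
import threading

cache = {"user:1": "old"}
retry_q = queue.Queue()        # stand-in for message middleware such as mq

def on_delete_failed(key):
    retry_q.put(key)           # the write path enqueues the failed delete and returns

def consumer():
    while True:
        key = retry_q.get()
        if key is None:        # demo-only shutdown signal
            break
        cache.pop(key, None)   # the mq consumer performs the actual delete

worker = threading.Thread(target=consumer)
worker.start()
on_delete_failed("user:1")     # the inline cache delete failed; hand it to mq
retry_q.put(None)
worker.join()
```

A real mq additionally gives persistence and automatic redelivery, which the in-memory queue does not.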

However, all of the methods above share a drawback: they are quite intrusive to the business code. Hence method 5:

Subscribe to MySQL's binlog. In the subscriber, whenever a data-update event is observed, delete the corresponding cache entry.

Alibaba already provides ready-made middleware for this, Canal; if you are interested, it is worth studying on your own.

By the same token, even with Canal there can still be deletion failures, which requires the retry mechanism discussed earlier.

If the Canal client (that is, the subscriber) still fails to delete the cache, it is recommended to write the request to mq and let mq retry automatically.
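The subscriber side can be sketched as a row-change callback (hypothetical: Canal's real client API is Java, and the event field names here are purely illustrative):

```python
cache = {"user:1": "old"}
mq = []                               # stand-in for the retry queue

def on_binlog_event(event):
    # hypothetical row-change callback invoked by the binlog subscriber
    if event["type"] in ("insert", "update", "delete"):
        key = f'{event["table"]}:{event["pk"]}'
        try:
            cache.pop(key, None)      # delete the corresponding cache entry
        except ConnectionError:
            mq.append(key)            # on failure, hand off to mq for retry

on_binlog_event({"type": "update", "table": "user", "pk": 1})
```

Because the deletion is driven by the binlog, the business write path needs no cache code at all, which is the whole appeal of this approach.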



Origin blog.csdn.net/cj_eryue/article/details/129737398