How to ensure consistency between the Redis cache and the database? Synchronous deletion, delayed double deletion, asynchronous monitoring, and multi-layer guarantee schemes


Table of contents

1. Four basic synchronization strategies

1.1 Synchronization strategy

1.2 Update cache or delete cache?

1.2.1 Advantages and disadvantages of updating the cache

1.2.2 Advantages and disadvantages of deleting the cache (recommended)

1.3 Operate the database first or delete the cache first?

1.3.1 Advantages and disadvantages of deleting the cache first and then operating the database

1.3.2 Advantages and disadvantages of operating the database first and then deleting the cache (recommended)

1.4 Optimal synchronization strategy: update the database first, then delete the cache

2. Synchronous deletion + reliable message scheme

3. Delayed double deletion: a higher consistency solution

4. Asynchronous monitoring + reliable message deletion scheme

5. Multiple guarantees: the final strong-consistency solution


1. Four basic synchronization strategies

1.1 Synchronization strategy

To keep the cache and the database consistent under double writes, there are four basic synchronization strategies: update the cache first and then the database; update the database first and then the cache; delete the cache first and then update the database; and update the database first and then delete the cache.

  • Update the cache first, then the database: if the second step fails, the cache holds data the database never received (dirty data)
  • Update the database first, then the cache: if the second step fails, the cache still holds the old data
  • Delete the cache first, then update the database: if the second step fails, the cache is simply empty and the next read reloads from the database
  • Update the database first, then delete the cache (recommended): if the second step fails, the cache still holds the old data

1.2 Update cache or delete cache?

1.2.1 Advantages and disadvantages of updating the cache

The advantage of updating the cache is that it is refreshed on every data change, so query misses are rare. The drawback is cost: if the value must be recomputed through complex calculations before being written to the cache, frequent rebuilds hurt server performance. And in write-heavy scenarios, the cache may be updated over and over without any read ever using those values.

1.2.2 Advantages and disadvantages of deleting the cache (recommended)

The advantage of deleting the cache is simplicity: no matter how complex the update is, the cached entry is just removed. The disadvantage is that after deletion the next read is likely to miss and must go back to the database.

In contrast, deleting the cache is clearly the better choice.
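The miss-and-reload cost mentioned above comes from the cache-aside read path: a deleted key forces the next reader back to the database. A minimal sketch in Java, with plain in-memory maps standing in for Redis and the database (class and key names are illustrative, not from the original article):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Minimal cache-aside read path; the maps stand in for Redis and MySQL.
public class CacheAsideRead {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Map<String, String> db = new ConcurrentHashMap<>();

    // Read from the cache first; on a miss, load from the database
    // and repopulate the cache so the next read hits.
    public static String read(String key) {
        String value = cache.get(key);
        if (value != null) {
            return value; // cache hit
        }
        value = db.get(key); // cache miss: fall back to the database
        if (value != null) {
            cache.put(key, value); // repopulate the cache
        }
        return value;
    }
}
```

After a deletion, exactly one read pays the database round trip; subsequent reads are served from the cache again.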

1.3 Operate the database first or delete the cache first?

1.3.1 Advantages and disadvantages of deleting the cache first and then operating the database

Case 1: The database and cache content are inconsistent

When thread 1 has deleted the cache but not yet updated the database, thread 2 reads the cache. Since thread 1 just cleared it, thread 2 misses, reads the old value from the database, and writes that value back into the cache. Thread 1 then updates the database successfully, leaving the database and the cache inconsistent.

Case 2: Cache breakdown, database stuck

When thread 1 has deleted the cache but not yet updated the database, a burst of read requests arrives. With nothing in the cache, the requests break through straight to the database, which can overwhelm it.

1.3.2 Advantages and disadvantages of operating the database first and then deleting the cache (recommended)

Dirty data problem: if the database is updated first but the cache deletion fails, old data lingers in the cache while the new data sits only in the database.

Solution: asynchronous retry mechanism

When this problem occurs, a retry mechanism is the usual fix. To keep retries from affecting the main business flow, they are generally executed asynchronously. Note that even with retries, concurrency leaves a window between the database update and the cache deletion in which readers can still see old data in the cache while new data sits in the database, so the two stores can be briefly inconsistent.
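The bounded-retry idea can be sketched as follows. The failure counter merely simulates transient Redis errors; in a real service the loop would be submitted to a background thread or task queue so the business request is not blocked (all names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Bounded retry of a cache deletion; the map stands in for Redis.
public class CacheDeleteRetry {
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static int failuresToSimulate = 0; // simulates transient Redis errors

    static boolean tryDelete(String key) {
        if (failuresToSimulate > 0) { failuresToSimulate--; return false; }
        cache.remove(key);
        return true;
    }

    // Retry the deletion a bounded number of times. In production this loop
    // runs asynchronously so the main business flow is never blocked by it.
    public static boolean deleteWithRetry(String key, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (tryDelete(key)) return true;
        }
        return false; // exhausted: log and alert; dirty data remains until fixed
    }
}
```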

1.4 Optimal synchronization strategy: update the database first, then delete the cache

So the conclusion is that updating the database first and then deleting the cache is the approach with the least impact. If the second step fails, a retry mechanism can solve the problem.

Synchronous deletion scheme: update the database first, then delete the cache. Suitable for scenarios where strict data consistency is not required.

Process: Update the database first, then delete the cache.
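The write path above fits in a few lines of Java, with in-memory maps standing in for MySQL and Redis (class and key names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// "Update the database first, then delete the cache" write path.
public class SyncDelete {
    static final Map<String, String> db = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    public static void update(String key, String newValue) {
        db.put(key, newValue); // step 1: persist the new value
        cache.remove(key);     // step 2: invalidate the stale cache entry
    }
}
```

The next read misses and reloads the fresh value from the database, which is exactly the cache-aside behavior this strategy relies on.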

Questions:

  • Dirty data under concurrency: between a reader querying the database and writing the result into the cache, another thread may update the database and delete the cache; the reader then writes the now-stale value into the cache
  • Cache deletion failure: if the delete fails, the cache keeps serving old data

2. Synchronous deletion + reliable message scheme

Synchronous deletion + reliable message deletion: suitable for scenarios where data consistency is not mandatory

Process: update the database first, then delete the cache. If the deletion fails, send a reliable MQ message and keep retrying the deletion until it succeeds or five retries are exhausted.

Problem: if all MQ retries fail, dirty data persists long-term.
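The whole scheme can be sketched with a plain queue standing in for the reliable MQ, a failure counter simulating transient Redis errors, and maps standing in for MySQL and Redis (all names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Synchronous deletion backed by a reliable message: a failed delete is handed
// to a queue, and the consumer retries it up to 5 times.
public class ReliableMessageDelete {
    static final Map<String, String> db = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Deque<String> mq = new ArrayDeque<>(); // stands in for MQ
    static int failuresToSimulate = 0;
    static final int MAX_RETRIES = 5;

    static boolean tryDelete(String key) {
        if (failuresToSimulate > 0) { failuresToSimulate--; return false; }
        cache.remove(key);
        return true;
    }

    public static void update(String key, String value) {
        db.put(key, value);
        if (!tryDelete(key)) {
            mq.push(key); // hand the failed deletion to the reliable queue
        }
    }

    // MQ consumer: retry the deletion until it succeeds or retries run out.
    public static boolean consumeOne() {
        String key = mq.poll();
        if (key == null) return true; // nothing pending
        for (int i = 0; i < MAX_RETRIES; i++) {
            if (tryDelete(key)) return true;
        }
        return false; // gave up: long-term dirty data, the weakness noted above
    }
}
```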

3. Delayed double deletion: a higher consistency solution

Delayed double deletion scheme: a scheme with higher consistency than the synchronous deletion strategy.

Process: delete the cache first, then update the database, and delete the cache once more after a short delay, roughly long enough for the slave library to catch up with the update.

Problem: the delay is hard to tune, so there is no guarantee the second deletion happens after the slave library has caught up. If it fires too early, a reader can still query the not-yet-updated slave and write the stale value back into the cache.
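The two deletions can be sketched as follows. To keep the sketch deterministic, the second deletion is returned as a task; in production it would be handed to a scheduler or delay queue to run after roughly the master-slave replication lag (all names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Delayed double deletion: delete, update the database, delete again later.
public class DelayedDoubleDelete {
    static final Map<String, String> db = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();

    // Returns the second deletion as a task; a real service schedules it
    // to run after approximately the replication lag.
    public static Runnable update(String key, String newValue) {
        cache.remove(key);              // first deletion
        db.put(key, newValue);          // database update
        return () -> cache.remove(key); // second, delayed deletion
    }
}
```

If a concurrent reader re-caches a stale value between the two deletions, the delayed second delete clears it, which is the whole point of the scheme.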

4. Asynchronous monitoring + reliable message deletion scheme

Asynchronous monitoring + reliable message deletion: a solution many large companies use.

Process:

  1. Do nothing extra after updating the database;
  2. Canal or a similar component watches the binlog and, on detecting an update, sends a reliable MQ message to delete the cache;
  3. If the cache deletion fails, the message is redelivered (manual ack plus retry) within a bounded number of attempts.

Advantages:

  • Asynchronous deletion, higher performance;
  • Reliable message retry mechanism, multiple deletions ensure successful deletion.

Problem: binlog-capture components such as Canal must themselves be highly available; if Canal goes down, dirty data persists long-term.
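The flow can be sketched with an in-memory event queue standing in for Canal's binlog stream and MQ; the key point is that the writer only touches the database, and the consumer acks (removes) an event only after the deletion succeeds, so failed deletions are redelivered (all names are illustrative):

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Binlog-driven cache deletion with manual ack semantics.
public class BinlogDrivenDelete {
    static final Map<String, String> db = new ConcurrentHashMap<>();
    static final Map<String, String> cache = new ConcurrentHashMap<>();
    static final Deque<String> binlogEvents = new ArrayDeque<>();
    static int failuresToSimulate = 0; // simulates a failed cache delete

    // The business write path does nothing but update the database.
    public static void update(String key, String value) {
        db.put(key, value);
        binlogEvents.add(key); // in reality Canal tails the binlog and emits this
    }

    // One consumer poll: delete the cache for the changed key. Without an ack
    // (success), the event stays in the queue and is redelivered later.
    public static void consumeOne() {
        String key = binlogEvents.peek();
        if (key == null) return;
        if (failuresToSimulate > 0) { failuresToSimulate--; return; } // no ack
        cache.remove(key);
        binlogEvents.poll(); // manual ack: remove the event only after success
    }
}
```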

5. Multiple guarantees: the final strong-consistency solution

Multi-layer guarantee scheme: synchronous deletion + asynchronous monitoring + reliable message deletion, with an expiration time set on every cache write and reads forced to the master; suitable when strict data consistency is required.

  1. Synchronous deletion: update the database first, then delete the cache; after this step the updating request must not query the data again, to avoid reading and re-caching old data before the deletion completes.
  2. Canal monitoring: Canal or a similar component watches the binlog and, on detecting an update, sends a reliable MQ message to delete the cache; this is the second layer ensuring the deletion succeeds.
  3. Delayed message verification: Canal also sends a delayed MQ message that re-checks cache consistency N seconds later.
  4. Cache expiration time: every cache write carries an expiration time; this is the third layer guaranteeing that stale data eventually disappears.
  5. Forced reads from the Redis master: cache reads always go to the Redis master node, since master-slave replication lags; thanks to the sharded-cluster mechanism there is no need to worry about extra load on the master.
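The expiration-time layer is the last-resort safety net: even if every deletion layer fails, stale data vanishes once its TTL elapses. A sketch with a logical clock instead of real time, to stay deterministic (in real Redis this is simply `SET key value EX seconds`; all class and field names are illustrative):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// A cache where every write carries a TTL; expired entries read as misses.
public class TtlCache {
    static final Map<String, String> values = new ConcurrentHashMap<>();
    static final Map<String, Long> expiresAt = new ConcurrentHashMap<>();
    static long now = 0; // logical clock standing in for wall-clock time

    public static void put(String key, String value, long ttl) {
        values.put(key, value);
        expiresAt.put(key, now + ttl);
    }

    public static String get(String key) {
        Long deadline = expiresAt.get(key);
        if (deadline == null || now >= deadline) { // expired: treat as a miss
            values.remove(key);
            expiresAt.remove(key);
            return null;
        }
        return values.get(key);
    }
}
```

Once the stale entry expires, the next read misses and reloads the correct value from the database, restoring consistency without any deletion having succeeded.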


Source: blog.csdn.net/qq_40991313/article/details/131304564