How do Redis and MySQL maintain data consistency?

In high-concurrency scenarios, a large number of requests hitting MySQL directly can easily cause performance problems, so Redis is commonly used to cache data and reduce the load on the database. However, MySQL and Redis are two different systems, so keeping data consistent between them is critical.

1. What causes data inconsistency?

  1. In high-concurrency business scenarios, the database is usually the weakest link. Therefore, Redis is used as a buffer so that requests hit Redis first instead of going directly to a database such as MySQL;
  2. Reading from the cache is generally unproblematic, but as soon as data is updated, both the database and the cache must be updated, and consistency problems between the cache (Redis) and the database (MySQL) easily arise;
  3. The business scenario discussed here is the classic cache-aside read flow: read from Redis first; on a miss, read from MySQL and backfill the cache.
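The cache-aside read flow above can be sketched as follows. This is a minimal illustration: the two `HashMap`s are hypothetical stand-ins for Redis and MySQL, not real client code.

```java
import java.util.HashMap;
import java.util.Map;

public class CacheAsideRead {
    static Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    static Map<String, String> db = new HashMap<>();    // stand-in for MySQL

    // Cache-aside read: try the cache first, fall back to the database on a
    // miss, then backfill the cache so later reads are served from "Redis".
    static String read(String key) {
        String value = cache.get(key);
        if (value == null) {
            value = db.get(key);       // cache miss: query the database
            if (value != null) {
                cache.put(key, value); // backfill the cache
            }
        }
        return value;
    }

    public static void main(String[] args) {
        db.put("user:1", "Alice");
        System.out.println(read("user:1"));      // first read hits the DB and backfills
        System.out.println(cache.get("user:1")); // now served from the cache
    }
}
```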

2. The problem of cache deletion

Whether you write to the MySQL database first and then delete the Redis cache, or delete the cache first and then write to the database, data inconsistency can occur.

1. Delete the cache first

        1) The Redis cache entry is deleted first, but before the new value has been written to MySQL, another thread reads the same key;

        2) That thread finds the cache empty, reads the old value from the MySQL database, and writes it back into the cache; the cache now holds dirty data;

        3) After the database update completes, Redis and MySQL are inconsistent.
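The race above can be replayed deterministically. In this sketch the thread interleaving is written out as sequential steps, and plain maps are hypothetical stand-ins for Redis and MySQL:

```java
import java.util.HashMap;
import java.util.Map;

public class DeleteFirstRace {
    static Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    static Map<String, String> db = new HashMap<>();    // stand-in for MySQL

    public static void simulate() {
        db.put("k", "old");
        cache.put("k", "old");

        // Writer, step 1: delete the cache before updating the database.
        cache.remove("k");

        // Reader runs in the gap: cache miss, reads the OLD value from the
        // database, and backfills it into the cache (dirty data).
        String v = cache.get("k");
        if (v == null) {
            v = db.get("k");
            cache.put("k", v);
        }

        // Writer, step 2: only now updates the database.
        db.put("k", "new");
    }

    public static void main(String[] args) {
        simulate();
        // Cache holds "old" while the database holds "new": inconsistent.
        System.out.println(cache.get("k") + " vs " + db.get("k"));
    }
}
```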

2. Write to the database first, then delete the cache

        1) The database is written first and the cache is deleted afterwards; if the writing thread crashes after the write but before the delete, the cache is never invalidated;

        2) Subsequent reads are then served the stale cache entry, which leads to data inconsistency;

        3) Moreover, because reads and writes run concurrently and their order cannot be guaranteed, inconsistencies between the cache and the database can also arise from unlucky interleavings.
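The failure in step 1) can be sketched the same way. Here `unreliableDelete` is a hypothetical helper that simulates the writer crashing (or Redis being unreachable) before the delete runs:

```java
import java.util.HashMap;
import java.util.Map;

public class WriteFirstFailure {
    static Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    static Map<String, String> db = new HashMap<>();    // stand-in for MySQL

    // Simulates a cache delete that may fail before taking effect.
    static void unreliableDelete(String key, boolean fail) {
        if (fail) throw new RuntimeException("delete failed");
        cache.remove(key);
    }

    public static void simulate() {
        db.put("k", "old");
        cache.put("k", "old");

        db.put("k", "new"); // step 1: write the database first
        try {
            unreliableDelete("k", true); // step 2 fails: cache never invalidated
        } catch (RuntimeException ignored) {
            // the writer is gone; nothing retries the delete
        }
    }

    public static void main(String[] args) {
        simulate();
        // Readers keep seeing the stale "old" value from the cache.
        System.out.println(cache.get("k") + " vs " + db.get("k"));
    }
}
```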


3. Solutions

1. Delayed double deletion strategy

1). Basic idea

Perform the redis.del(key) operation both before and after writing to the database, and set a reasonable sleep interval in between.

The pseudo code is as follows:

public void write(String key, Object data) throws InterruptedException {
    redis.delKey(key);   // first delete: evict the old cache entry
    db.updateData(data); // write the new value to MySQL
    Thread.sleep(500);   // wait for in-flight reads to finish
    redis.delKey(key);   // second delete: remove any dirty backfill
}

2). Specific steps

  • Delete the cache first;
  • Rewrite the database;
  • Sleep for 500 milliseconds;
  • Delete the cache again.

Question: how is this 500 milliseconds determined, i.e. how long should the sleep be?

  • You need to measure how long your project's read-path business logic takes.
  • The goal is to ensure that any read request that started before the write has finished, so that the second delete can remove the dirty data that read may have backfilled.
  • This strategy should also account for replication lag between the Redis and database master and slave nodes.
  • The final sleep time for the write path: the read-path duration plus a few hundred milliseconds.

    For example: Sleep for 1 second.

3). Setting a cache expiration time is the key point

  • In theory, setting an expiration time on the cache is a way to guarantee eventual consistency;
  • All writes take the database as the source of truth; once the expiration time is reached, the cache entry is deleted;
  • Any later read request then fetches the new value from the database and backfills the cache.
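The TTL behavior described above can be illustrated with a tiny expiring cache. This is a sketch only: the class is hypothetical, and a logical `now` parameter replaces the wall clock so the behavior is deterministic:

```java
import java.util.HashMap;
import java.util.Map;

public class TtlCache {
    static class Entry {
        String value;
        long expiresAt;
        Entry(String value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    Map<String, Entry> entries = new HashMap<>();

    // Store a value with a time-to-live, like SET key value EX ttl in Redis.
    void put(String key, String value, long ttlMillis, long now) {
        entries.put(key, new Entry(value, now + ttlMillis));
    }

    // An expired entry behaves like a miss, forcing a reload from the database.
    String get(String key, long now) {
        Entry e = entries.get(key);
        if (e == null || now >= e.expiresAt) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }

    public static void main(String[] args) {
        TtlCache c = new TtlCache();
        c.put("k", "stale", 500, 0);         // cached at t=0 with a 500 ms TTL
        System.out.println(c.get("k", 100)); // before expiry: still served stale
        System.out.println(c.get("k", 600)); // after expiry: miss, re-read MySQL
    }
}
```

Once the entry expires, the next read falls through to the database and backfills the fresh value, which is exactly the eventual-consistency guarantee the TTL provides.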

4). Disadvantages of the solution

    Combining the double-delete strategy with a cache timeout, the worst case is:

  • data may remain inconsistent for up to the cache expiration time;
  • write requests take longer because of the added sleep.

2. Asynchronously update the cache (a synchronization mechanism based on the MySQL binlog)

1). The overall idea

        1) For data-update operations, subscribe to the MySQL binlog for incremental consumption;

        2) Publish the change messages to a message queue;

        3) Consume the queue and apply the incremental changes to Redis.

2). Operation

  • Read from Redis: hot data lives in Redis;

  • Write to MySQL: inserts, deletes, and updates are all performed in MySQL;

  • Update Redis: MySQL data operations are recorded in the binlog and propagated to Redis in near real time through the message queue.

3). Redis update process

        There are two main types of data synchronization:

        1) full: write all data to Redis in one pass;

        2) incremental: update in real time.

What is discussed here is the incremental case, i.e. the update, insert, and delete change records from MySQL.

After the binlog is read, it is parsed, and a message queue is used to push the changes and update the Redis cache on each node.

        1) In this way, as soon as MySQL produces a write, update, or delete, the corresponding binlog message is pushed toward Redis;

        2) Redis is then updated according to the records in the binlog;

        3) In fact, this mechanism is very similar to MySQL's master-slave replication, which also uses the binlog to achieve data consistency.

You can also use third-party message middleware such as Kafka or RabbitMQ to push the updates to Redis.
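The binlog-to-Redis pipeline above can be sketched as a consumer loop. Everything here is a hypothetical stand-in: the queue plays the role of the message middleware, the events play the role of parsed binlog records, and the map plays the role of Redis:

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

public class BinlogCacheSync {
    enum Op { INSERT, UPDATE, DELETE }

    // A parsed binlog change record (hypothetical shape, illustration only).
    static class Event {
        Op op; String key; String value;
        Event(Op op, String key, String value) { this.op = op; this.key = key; this.value = value; }
    }

    static Map<String, String> cache = new HashMap<>(); // stand-in for Redis
    static Queue<Event> queue = new ArrayDeque<>();     // stand-in for Kafka/RabbitMQ

    // Consume events in binlog order so the cache converges on the DB state.
    static void drain() {
        Event e;
        while ((e = queue.poll()) != null) {
            switch (e.op) {
                case INSERT:
                case UPDATE:
                    cache.put(e.key, e.value);
                    break;
                case DELETE:
                    cache.remove(e.key);
                    break;
            }
        }
    }

    public static void main(String[] args) {
        queue.add(new Event(Op.INSERT, "user:1", "Alice"));
        queue.add(new Event(Op.UPDATE, "user:1", "Alice2"));
        queue.add(new Event(Op.DELETE, "user:1", null));
        drain();
        System.out.println(cache.containsKey("user:1")); // deleted in MySQL, so gone here too
    }
}
```

Because the events are applied in the order the binlog recorded them, the cache ends up mirroring the database, which is the same idea MySQL's own master-slave replication relies on.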

4. Summary

In high-concurrency application scenarios with strict consistency requirements, you first need to identify the cause of the inconsistency between the database and the cache.

Two solutions to data consistency in high-concurrency scenarios were presented: the delayed double-delete strategy and the asynchronous, binlog-based cache update.

In addition, setting an expiration time on the cache is a key operation for guaranteeing eventual consistency; it should be chosen to fit the business.

 

Origin blog.csdn.net/dreaming317/article/details/129844613