Revisiting Distributed Cache Consistency

A cache is a technique that uses high-speed storage media to improve read performance. From an engineering point of view, caches can be divided into local caches and distributed caches. A local cache is usually held in static variables, so it can be shared across classes within one process but not across processes. A distributed cache can be shared across processes and is often hosted by Redis.

Introducing a distributed cache not only improves query performance and throughput significantly, it also shields the database from the impact of heavy traffic. However, technology is often a double-edged sword: along with the benefits of distributed caching comes the challenge of data consistency. In a distributed environment, some inconsistency between the database and the distributed cache is unavoidable. What we can do is minimize the time window of inconsistency and aim for eventual consistency.

1 Consistency Challenge

Once a distributed cache is introduced, query scenarios often involve cache backfill, while update scenarios involve cache updates. Both backfilling and updating can trigger data inconsistency, but at root the cache update is the culprit: if stale data is backfilled into Redis, a cache update must have occurred in the meantime.

A cache update generally takes one of four combined actions:

  • Update MySQL first and then update Redis
  • Update Redis first and then update MySQL
  • Update MySQL first and then delete Redis
  • Delete Redis first and then update MySQL

Let's analyze how each of these four cache update actions causes data inconsistency, and work out which one is best, that is, which one has the smallest inconsistency window. In the article "Research on Redis Distributed Locks", I mentioned that Martin Kleppmann, reasoning from three assumptions (process suspension, network delay, and clock drift), considers RedLock a neither-here-nor-there distributed lock solution. Indeed, process suspension, network delay, and clock drift can be regarded as the three major obstacles in JVM distributed systems. To describe data inconsistency more intuitively and clearly, we again use process suspension as the means of analysis; network delay has the same effect.

Update MySQL first and then update Redis

When MySQL is updated first and Redis is updated afterwards, no cache backfill is involved.

modify_mysql_modify_redis.png
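
As a minimal sketch of one commonly cited interleaving under this scheme (not necessarily the one drawn in the figure), assume two writers A and B update the same key and writer A is suspended between its two steps. The dictionaries `mysql` and `redis` and the helper functions are hypothetical in-memory stand-ins, not real clients.

```python
# Hypothetical in-memory stand-ins for the two stores (illustration only).
mysql = {"stock": 0}
redis = {"stock": 0}


def update_mysql(value):
    mysql["stock"] = value


def update_redis(value):
    redis["stock"] = value


# Writer A updates MySQL, then is suspended (for example, by a long GC pause).
update_mysql(1)
# Writer B runs both of its steps while A is suspended.
update_mysql(2)
update_redis(2)
# Writer A resumes and finishes its second step with an outdated value.
update_redis(1)

print(mysql["stock"], redis["stock"])  # 2 1
```

MySQL ends up holding the value from writer B while Redis holds the value from writer A, and nothing corrects the cache until the next write or an expiry.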

Update Redis first and then update MySQL

When Redis is updated first and MySQL is updated afterwards, no cache backfill is involved either.

modify_redis_modify_mysql.png

Update MySQL first and then delete Redis

When MySQL is updated first and Redis is deleted afterwards, cache backfill is involved.

modify_mysql_delete_redis_1st.png

The figure above does show data inconsistency, but it is not a serious problem because eventual consistency is still reached. The following situation, however, is worse.

modify_mysql_delete_redis_2nd.png

Delete Redis first and then update MySQL

When Redis is deleted first and MySQL is updated afterwards, cache backfill is also involved.

delete_redis_modify_mysql.png

The first two schemes update the cache on every write, which is laborious and thankless: if the cached value is never read after it is updated, the work is wasted, so these schemes should be avoided. Of the latter two schemes, updating MySQL first and then deleting Redis is better than deleting Redis first and then updating MySQL, because the time window between a database query and the cache backfill is usually smaller than the time window between a cache deletion and the database update. In other words, updating MySQL first and then deleting Redis has a lower probability of causing data inconsistency.
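
The two windows compared above can be made concrete with a deterministic sketch. It assumes a single key, a reader that backfills on a cache miss, and hand-ordered steps in place of real concurrency; the `Stores` class and both function names are hypothetical stand-ins, not real MySQL or Redis clients.

```python
class Stores:
    """Hypothetical in-memory stand-ins for MySQL and Redis."""

    def __init__(self, value):
        self.mysql = {"k": value}
        self.redis = {"k": value}


def update_then_delete_race():
    s = Stores("old")
    s.redis.pop("k", None)       # assume the cache entry had already expired
    # Reader: cache miss, so it queries MySQL ...
    read_value = s.mysql["k"]    # reads "old"
    # ... and is suspended before it can backfill.
    # Writer: update MySQL first, then delete Redis.
    s.mysql["k"] = "new"
    s.redis.pop("k", None)
    # Reader resumes and backfills the value it read earlier.
    s.redis["k"] = read_value
    return s


def delete_then_update_race():
    s = Stores("old")
    # Writer: delete Redis first ...
    s.redis.pop("k", None)
    # ... and is suspended before it updates MySQL.
    # Reader: cache miss, query MySQL, backfill immediately.
    s.redis["k"] = s.mysql["k"]  # backfills "old"
    # Writer resumes and updates MySQL.
    s.mysql["k"] = "new"
    return s


for race in (update_then_delete_race, delete_then_update_race):
    result = race()
    print(race.__name__, result.mysql["k"], result.redis["k"])
```

Both interleavings leave "new" in MySQL and "old" in Redis. The first needs a cache miss plus a reader stalled between its query and its backfill, while the second only needs any read to land between the writer's delete and its database update, which is why the first scheme is the less likely of the two to go wrong.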

2 Better Solutions

If the business scenario does not have high concurrency, updating MySQL first and then deleting Redis is sufficient. If you want stronger consistency, I think the scheme in Ctrip's article "Dry Goods | Design Practice for Second-Level Consistency Between Distributed Cache and DB" is worth learning from. Its core points are as follows:

  • The business layer contains only cache query logic and no update, delete, or backfill logic for the distributed cache. An independent cache update platform is responsible for updating, deleting, and backfilling the distributed cache.
  • When data in the database changes, the business layer is responsible for pushing the data change events for INSERT, UPDATE, and DELETE statements to MQ. To preserve the order of operations on the same key in a distributed environment, events are routed by mod(hash(key)) so that the same key is always sent to the same queue in MQ.
  • The data change events in MQ are consumed serially by the cache update platform, but data inconsistency still cannot be avoided 100%. The order of data change events sent to the same queue may be wrong because of network jitter and other factors, so the cached data carries a version number. This version number can be the update time (monotonically increasing). The version number is compared whenever the cache is actually written, to avoid filling old data into the distributed cache.
  • For hot keys, after the cache update platform completes the corresponding data operation, it sends a broadcast message to MQ to notify the business side to update its local cache.
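
The key routing and version check in the points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Ctrip implementation: `NUM_QUEUES`, `queue_index`, and `apply_event` are hypothetical names, the cache is a plain dictionary, and `zlib.crc32` stands in for the hash function because Python's built-in `hash()` for strings is randomized per process and would not route consistently across producers.

```python
import zlib

NUM_QUEUES = 4  # hypothetical number of MQ queues


def queue_index(key: str, num_queues: int = NUM_QUEUES) -> int:
    """mod(hash(key)): a stable hash, so one key always maps to one queue."""
    return zlib.crc32(key.encode("utf-8")) % num_queues


def apply_event(cache: dict, key: str, value, version: int) -> bool:
    """Write to the cache only if the event is newer than the cached entry."""
    current = cache.get(key)
    if current is not None and current["version"] >= version:
        return False  # stale or duplicate event: leave the cache untouched
    cache[key] = {"value": value, "version": version}
    return True


cache = {}
# Two events for the same key arrive out of order (version 2 before version 1).
print(apply_event(cache, "user:1", "new", version=2))  # True
print(apply_event(cache, "user:1", "old", version=1))  # False
print(cache["user:1"]["value"])                        # new
print(queue_index("user:1") == queue_index("user:1"))  # True
```

Against a real distributed cache, the compare and the write would have to happen as one atomic step; the sketch sidesteps this because it runs in a single thread on a local dictionary.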

This article does not cover the CDC solution separately because, apart from the CDC tool itself, its details are already reflected in the Ctrip solution.

Summary

Of course, data consistency is not the only challenge in distributed caching; there are also cache penetration, cache breakdown, and cache avalanche.

distributed_cache_common_problems.png

Origin juejin.im/post/7222214389138374717