Three cache update strategies: Cache Aside, Read/Write Through, and Write Back

Author: Xiaolin coding
Computer science fundamentals site: https://xiaolincoding.com

Hello everyone, I am Xiaolin.

Let me talk to you today about common cache update strategies.

  • Cache Aside (bypass caching) strategy;
  • Read/Write Through strategy;
  • Write Back strategy.

In actual development, keeping Redis and MySQL in sync usually follows the Cache Aside strategy; the other two strategies are mainly used inside computer systems themselves.

Cache Aside (bypass caching) strategy

The Cache Aside (bypass caching) strategy is the most commonly used. The application interacts with both the database and the cache directly and is itself responsible for maintaining the cache. The strategy splits into a "read strategy" and a "write strategy".

(figure: Cache Aside read and write strategies)

Write strategy steps:

  • Update the data in the database first, and then delete the data in the cache.

Read strategy steps:

  • If the read data hits the cache, the data is returned directly;
  • If the read data does not hit the cache, the data is read from the database, then written to the cache, and returned to the user.
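The read and write strategies above can be sketched in a few lines, with one plain dict standing in for Redis and another for MySQL (a minimal illustration, not production code):

```python
# Minimal Cache Aside sketch: plain dicts stand in for Redis and MySQL.
cache = {}                # stands in for Redis
db = {"user:1:age": 20}   # stands in for MySQL

def read(key):
    # Read strategy: try the cache first, fall back to the database.
    if key in cache:
        return cache[key]      # cache hit
    value = db[key]            # cache miss: read from the database
    cache[key] = value         # backfill the cache
    return value

def write(key, value):
    # Write strategy: update the database FIRST, then delete the cache.
    db[key] = value
    cache.pop(key, None)

print(read("user:1:age"))   # miss -> 20, now cached
write("user:1:age", 21)     # db updated, cached entry deleted
print(read("user:1:age"))   # miss again -> 21, reloaded from db
```

The next read after a write always misses and repopulates the cache from the freshly updated database.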

Note that the order of the write strategy's steps cannot be reversed: you must not delete the cache first and then update the database. Under concurrent "read + write" traffic, that reversed order causes the cache and the database to become inconsistent.

For example, suppose a user's age is 20 and request A wants to update it to 21, so A deletes the cached entry first. At this moment, request B reads the user's age, misses the cache, reads 20 from the database, and writes 20 into the cache. Only then does request A update the database, setting the age to 21.

(figure: race when deleting the cache before updating the database)

In the end, the user's age is 20 (old value) in the cache and 21 (new value) in the database, and the data in the cache and database are inconsistent.
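This race can be reproduced by running the steps by hand in the problematic order (a toy, single-threaded simulation; a real occurrence needs two concurrent requests with unlucky timing):

```python
# Simulate the "delete cache first, then update database" race step by step.
cache = {}
db = {"age": 20}

# Request A (write age = 21), step 1: delete the cache entry.
cache.pop("age", None)

# Request B (read age) runs in between: cache miss, so it reads the
# OLD value 20 from the database and writes it back into the cache.
value = cache.get("age")
if value is None:
    value = db["age"]
    cache["age"] = value

# Request A, step 2: only now does it update the database.
db["age"] = 21

print(cache["age"], db["age"])  # 20 21 -> cache and database disagree
```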

Why does "update the database first, then delete the cache" not suffer the same inconsistency problem?

Again, consider concurrent "read + write" requests.

Suppose the user's data is not in the cache. Request A reads the data and gets age 20 from the database. Before A writes that value into the cache, request B updates the age in the database to 21 and deletes the (still empty) cache entry. Request A then writes the stale age 20 into the cache.

(figure: race when updating the database before deleting the cache)

In the end, the cache holds age 20 (the old value) while the database holds 21 (the new value): still inconsistent. So in theory, updating the database first and then deleting the cache can also cause inconsistency, but in practice the probability is low.

Cache writes are usually much faster than database writes, so it is hard in practice for request A's cache write to land only after request B has both updated the database and deleted the cache. As long as request A writes the cache before request B deletes it, the delete removes the stale value, the next read misses, and the cache is repopulated from the database, so the inconsistency does not persist.

The Cache Aside strategy suits read-heavy, write-light workloads, but not write-heavy ones: frequent writes keep deleting cached data, which hurts the cache hit rate. If the business has strict cache-hit-rate requirements, two mitigations can be considered:

  • One approach is to update the cache as well when updating the data, but take a distributed lock before updating the cache, so that only one thread updates the cache at a time and no race can occur. Of course, this costs some write performance;
  • Another approach is to update the cache when updating the data, but give the cached entry a short expiration time. Even if the cache briefly becomes inconsistent, the stale entry expires soon, so the impact on the business is acceptable.
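The second mitigation can be sketched with a dict whose entries carry an expiry timestamp (illustrative only; with Redis you would simply set the key with a short `EX`/TTL instead):

```python
import time

# Approach 2 sketch: write the cache on every update, but with a short
# TTL, so any stale value left by a race self-heals quickly.
CACHE_TTL = 0.1  # 100 ms, deliberately short for the demo
cache = {}       # key -> (value, expires_at)
db = {"age": 20}

def write(key, value):
    db[key] = value
    cache[key] = (value, time.monotonic() + CACHE_TTL)  # update cache too

def read(key):
    entry = cache.get(key)
    if entry is not None and time.monotonic() < entry[1]:
        return entry[0]            # fresh cache hit
    value = db[key]                # expired or missing: reload from db
    cache[key] = (value, time.monotonic() + CACHE_TTL)
    return value

write("age", 21)
# Pretend a concurrent race left a stale value in the cache:
cache["age"] = (20, time.monotonic() + CACHE_TTL)
time.sleep(CACHE_TTL + 0.01)       # wait out the TTL
print(read("age"))  # 21 -> stale entry expired and was reloaded from db
```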

Read/Write Through (read through / write through) strategy

The principle of Read/Write Through is that the application interacts only with the cache and never with the database directly; the cache component itself talks to the database, effectively taking over the database updates on the application's behalf.

Read Through Strategy

First check whether the data exists in the cache. If it does, return it directly. If not, the cache component queries the database, writes the result into the cache, and then returns the data to the application.

Write Through Strategy

On a data update, first check whether the data to be written already exists in the cache:

  • If it exists in the cache, update the cached copy, have the cache component synchronously update the database, and then notify the application that the update is complete;
  • If it does not exist in the cache, update the database directly and return.
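Both paths can be wrapped behind a single cache component, so the application never touches the database itself (a minimal sketch; a real cache component would also handle serialization, TTLs, and failures):

```python
# Read/Write Through sketch: the application talks only to CacheComponent;
# the component itself talks to the backing store.
class CacheComponent:
    def __init__(self, backing_store):
        self._cache = {}
        self._db = backing_store   # e.g. a database access layer

    def read(self, key):
        # Read Through: on a miss, the CACHE loads from the db and backfills.
        if key not in self._cache:
            self._cache[key] = self._db[key]
        return self._cache[key]

    def write(self, key, value):
        # Write Through: if cached, update the cached copy; either way,
        # synchronously write the database before returning to the caller.
        if key in self._cache:
            self._cache[key] = value
        self._db[key] = value      # synchronous database update

db = {"age": 20}
c = CacheComponent(db)
print(c.read("age"))             # 20, loaded through the cache
c.write("age", 21)
print(c.read("age"), db["age"])  # 21 21 -> cache and db stay in sync
```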

The following is a schematic diagram of the Read Through/Write Through strategy:

(figure: Read Through / Write Through flow)

The defining characteristic of the Read Through/Write Through strategy is that the cache node, rather than the application, deals with the database. In everyday development it is less common than Cache Aside, because the distributed cache components we typically use, whether Memcached or Redis, provide neither the ability to write to the database nor the ability to load data from it automatically. The strategy is worth considering, however, when using a local cache.

Write Back strategy

When updating data, the Write Back strategy updates only the cache, marks the cached entry as dirty, and returns immediately without touching the database. The database is updated later, asynchronously and in batches.
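The mechanics can be sketched as a cache that tracks dirty keys and flushes them to the backing store in one batch (illustrative only; real write-back caches live in hardware or the kernel):

```python
# Write Back sketch: writes touch only the cache and mark entries dirty;
# a separate flush step batches the dirty entries into the backing store.
class WriteBackCache:
    def __init__(self, backing_store):
        self._cache = {}
        self._dirty = {}           # dict used as an ordered set of dirty keys
        self._db = backing_store

    def write(self, key, value):
        self._cache[key] = value   # update the cache only
        self._dirty[key] = None    # mark dirty and return immediately
        # NOTE: the backing store is NOT touched here.

    def flush(self):
        # Batched write-back of all dirty entries (normally async/periodic).
        for key in self._dirty:
            self._db[key] = self._cache[key]
        self._dirty.clear()

db = {}
wb = WriteBackCache(db)
wb.write("a", 1)
wb.write("b", 2)
print(db)    # {} -> nothing persisted yet; a crash here loses a and b
wb.flush()
print(db)    # {'a': 1, 'b': 2} -> dirty entries written back in one batch
```

The gap between `write` and `flush` is exactly the window in which a power failure loses data, as discussed below.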

In fact, the Write Back strategy cannot be applied to the common Redis + database setup, because Redis has no facility for asynchronously writing data back to the database.

Write Back is a design from computer architecture: CPU caches and the operating system's file system cache both adopt the write-back strategy.

The Write Back strategy is especially suited to write-heavy workloads, because a write only has to update the cache before returning. Writing a file, for example, actually writes into the file system's cache and then returns; the data is not written to disk at that point.

The problem is that the data is not strongly consistent and there is a risk of data loss: the cache generally lives in volatile memory, so if the cache machine loses power, dirty data that has not yet been written back is gone. This is why, after a system loses power, some recently written files disappear: the Page Cache had not yet been flushed to disk.

Here is a flowchart of the Write Back strategy as used between the CPU cache and memory:

(figure: Write Back flow between CPU cache and memory)

Does this process look familiar? I covered it in the CPU cache article.


Origin: blog.csdn.net/qq_34827674/article/details/125869674