Redis Study Notes [8]: Issue Summary

1. Redis data expiration policies and memory eviction policies


Redis handles expired data with a combination of lazy (on-access) deletion and periodic deletion, but both have drawbacks. Periodically checking every key for expiration would cause performance problems, so the periodic deletion strategy samples keys at random; lazy deletion checks whether a key has expired when it is operated on, and deletes it immediately if so. Together, these strategies can still leave some expired keys accumulating in memory and driving the Redis server's memory usage high. They therefore need to be combined with the maxmemory-policy setting in redis.conf, which determines how the Redis server reclaims memory when there is not enough room to write new data:
1) noeviction: return an error directly;
2) allkeys-lru: evict among all keys according to LRU;
3) allkeys-random: evict random keys among all keys;
4) volatile-lru / volatile-random / volatile-ttl: used when the Redis server acts as both a cache and a DB; eviction happens only among the set of keys that have an expiration time, and volatile-ttl evicts the keys whose expiration time comes earliest.
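For example, a minimal excerpt of the relevant redis.conf settings (the values shown are illustrative, not recommendations):

```conf
# cap Redis memory usage (illustrative value)
maxmemory 2gb
# evict the least-recently-used key among all keys once the cap is hit
maxmemory-policy allkeys-lru
```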


2. Solutions to Redis cache penetration and cache avalanche problems


Cache penetration and cache avalanche can be seen as the same class of problem at different degrees of severity. When a request reaches Redis and finds no corresponding cached data, the request is sent on to the DB. If the DB does return the data, the problem stays at the cache-penetration level, and the data fetched from the DB is cached in Redis. If the DB has no corresponding data either, and such requests reach a volume that consumes all of the DB's resources, DB connection exceptions occur and the problem escalates into a cache avalanche.
Ideas for solving cache penetration include the following: whatever the DB lookup returns (recording null when there is no value), write a record into the Redis cache; maintain a BitMap in the DAO layer, using one bit per key to record whether that key has a corresponding value, thereby avoiding redundant DB operations; and use a dedicated background thread to refresh Redis data that is about to expire, again avoiding cache penetration.
Ideas for solving cache avalanche include the following: add a mutex around the DB connection, so that when a large number of cache lookups fail, the requests queue up for the DB data; give data that is set in the same batch a random offset on its expiration time, to avoid collective expiration; and use a double-cache or multi-level caching strategy, which also requires cache warming.


2.1) What is a cache avalanche?

If cache entries expire in a concentrated burst over a short period, a large wave of cache misses occurs and every query falls on the database, producing a cache avalanche.

Because the old cache has expired and the new cache is not yet in place, every request that would have hit the cache during this window goes to the database instead, putting enormous CPU and memory pressure on it; in severe cases the database goes down.

2.2) What are the solutions to prevent a cache avalanche?

1) Locking and queuing

Solve it with a mutex: use Redis SETNX to set a mutex key. Only the caller whose set succeeds proceeds to load from the database and write the result back into the cache; the other callers retry the whole cache-get method. A minimal sketch follows.
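A minimal sketch of this mutex rebuild using the Jedis client; the connection details, TTLs, and the loadFromDb helper are illustrative assumptions, not part of the original text:

```java
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;
import redis.clients.jedis.params.SetParams;

public class MutexCacheLoader {
    private final JedisPool pool = new JedisPool("localhost", 6379);

    public String get(String key) throws InterruptedException {
        try (Jedis jedis = pool.getResource()) {
            String value = jedis.get(key);
            if (value != null) {
                return value;                        // cache hit
            }
            // SET mutex:key 1 NX EX 30 -- only one caller acquires the mutex
            String lockKey = "mutex:" + key;
            String acquired = jedis.set(lockKey, "1", SetParams.setParams().nx().ex(30));
            if ("OK".equals(acquired)) {
                try {
                    value = loadFromDb(key);         // hypothetical DB loader
                    jedis.setex(key, 300, value);    // write the value back to the cache
                } finally {
                    jedis.del(lockKey);              // release the mutex
                }
                return value;
            }
        }
        Thread.sleep(50);                            // lost the race: back off briefly,
        return get(key);                             // then retry the whole method
    }

    private String loadFromDb(String key) {
        return "value-from-db";                      // stand-in for a real DB query
    }
}
```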

2) Cache warming

Cache warming means loading the relevant data into the caching system right after the system goes live. This avoids the situation where the first user requests must query the database and only then cache the data: users directly query data that has already been preheated into the cache. A cache-reload mechanism can also update the cache in advance; before an expected burst of concurrent traffic, manually trigger the loading of the various cache keys.
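A sketch of such a warm-up routine under the same Jedis assumptions; the hot-key list, TTL, and loadFromDb helper are hypothetical:

```java
import java.util.List;
import redis.clients.jedis.Jedis;

public class CachePreheater {
    // run once at startup, or trigger manually before an expected traffic peak
    public void preheat(List<String> hotKeys) {
        try (Jedis jedis = new Jedis("localhost", 6379)) {
            for (String key : hotKeys) {
                jedis.setex(key, 300, loadFromDb(key)); // hypothetical DB loader
            }
        }
    }

    private String loadFromDb(String key) {
        return "value-from-db";                         // stand-in for a real DB query
    }
}
```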

3) Double-cache policy

Keep the original cache C1 and a copy C2. When C1 has expired, requests can fall back to C2; C1 is given a short expiration time and C2 a long one. A read-path sketch follows.
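A minimal read-path sketch of the double cache; the c1:/c2: key prefixes and the TTLs are illustrative assumptions:

```java
import redis.clients.jedis.Jedis;

public class DoubleCacheReader {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String get(String key) {
        String value = jedis.get("c1:" + key);   // C1: primary copy, short TTL
        if (value != null) {
            return value;
        }
        value = jedis.get("c2:" + key);          // C2: backup copy, long TTL
        if (value != null) {
            // a fuller version would also refresh C1 here, e.g. asynchronously
            return value;                        // serve the backup while C1 is absent
        }
        value = loadFromDb(key);                 // hypothetical DB loader
        jedis.setex("c1:" + key, 300, value);    // short-term expiration
        jedis.setex("c2:" + key, 3600, value);   // long-term expiration
        return value;
    }

    private String loadFromDb(String key) {
        return "value-from-db";
    }
}
```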

4) Scheduled cache refresh

For caches with low freshness requirements, load the data while the container initializes at startup, or use a scheduled task to refresh or remove cache entries.
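A sketch using a JDK scheduled executor; the interval, TTL, and key list are assumptions:

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import redis.clients.jedis.Jedis;

public class ScheduledCacheRefresher {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    public void start(List<String> keys) {
        // refresh the listed entries every 5 minutes
        scheduler.scheduleAtFixedRate(() -> {
            try (Jedis jedis = new Jedis("localhost", 6379)) {
                for (String key : keys) {
                    jedis.setex(key, 600, loadFromDb(key)); // hypothetical DB loader
                }
            }
        }, 0, 5, TimeUnit.MINUTES);
    }

    private String loadFromDb(String key) {
        return "value-from-db";
    }
}
```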

5) Set different expiration times, so that the moments at which cache entries expire are spread as evenly as possible, for instance by adding a random offset to each TTL.
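A one-method sketch of TTL jitter; the base TTL and jitter range are arbitrary choices:

```java
import java.util.concurrent.ThreadLocalRandom;
import redis.clients.jedis.Jedis;

public class JitteredTtlWriter {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public void put(String key, String value) {
        int baseTtlSeconds = 300;                                    // shared base TTL
        int jitterSeconds = ThreadLocalRandom.current().nextInt(60); // extra 0-59 s
        // entries written in the same batch now expire at spread-out moments
        jedis.setex(key, baseTtlSeconds + jitterSeconds, value);
    }
}
```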


3. Cache breakdown

1) What is cache breakdown?

In a high-concurrency system, a large number of requests query the same key simultaneously, and that key happens to expire at exactly that moment; as a result, all of those requests land on the database. This phenomenon is called cache breakdown.

2) What problems does it cause?

The database receives an excessive number of requests at a single moment, and its load surges.

3) How to solve it

Since the phenomenon above is many threads querying the database for the same data, we can place a mutex on that data: the first request to arrive takes the lock and performs the query.

The other threads that get this far cannot acquire the lock and wait; the first thread queries the data and then populates the cache. By the time the waiting threads proceed, they find the cache populated and read from it directly. A per-key locking sketch follows.
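A sketch using an in-process per-key lock with a double check; this suffices for a single application instance, while a multi-instance deployment would need a distributed lock such as the SETNX mutex sketched in section 2.2. Connection details and loadFromDb are assumptions:

```java
import java.util.concurrent.ConcurrentHashMap;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class BreakdownSafeReader {
    private final JedisPool pool = new JedisPool("localhost", 6379);
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    public String get(String key) {
        try (Jedis jedis = pool.getResource()) {
            String value = jedis.get(key);
            if (value != null) {
                return value;                    // cache hit
            }
            Object lock = locks.computeIfAbsent(key, k -> new Object());
            synchronized (lock) {
                value = jedis.get(key);          // re-check: another thread may have rebuilt it
                if (value == null) {
                    value = loadFromDb(key);     // hypothetical DB loader
                    jedis.setex(key, 300, value);
                }
            }
            return value;
        }
    }

    private String loadFromDb(String key) {
        return "value-from-db";
    }
}
```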


4. Cache penetration

1) What is cache penetration?

Cache penetration means the user queries data that is not in the database at all, so naturally it is never in the cache either. Each such query fails to find a value for the key in the cache, queries the database again, and comes back empty (effectively performing two useless lookups). Requests like this bypass the cache and hit the database directly.

2) What solutions prevent cache penetration?

1) Cache null values

If a query returns empty data (whether because the data does not exist or because of a system failure), we still cache the empty result, but with a very short expiration time, no longer than five minutes. With this default stored in the cache, the second lookup for the same key finds a value in the cache and no longer goes on to the database.
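A sketch of null-value caching using an empty-string sentinel; the marker and the TTLs are illustrative choices:

```java
import redis.clients.jedis.Jedis;

public class NullCachingReader {
    private static final String EMPTY_MARKER = "";   // sentinel meaning "known missing"
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String get(String key) {
        String value = jedis.get(key);
        if (value != null) {
            // a cached sentinel answers the query without touching the DB
            return EMPTY_MARKER.equals(value) ? null : value;
        }
        value = loadFromDb(key);                 // hypothetical loader; may return null
        if (value == null) {
            jedis.setex(key, 60, EMPTY_MARKER);  // cache the miss briefly (well under 5 min)
        } else {
            jedis.setex(key, 300, value);
        }
        return value;
    }

    private String loadFromDb(String key) {
        return null;                             // stand-in: the data is absent from the DB
    }
}
```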

2) Use a Bloom filter (BloomFilter)

Advantages: small memory footprint, since only bits are stored; very high performance, since hashes of the key are used to decide whether the key exists.

Hash all data that could possibly exist into a sufficiently large bitmap; data that definitely does not exist is intercepted by this bitmap, shielding the underlying storage system from the query pressure.

That is, add a BloomFilter layer: before querying the cache, first check the BloomFilter for the key. If the key cannot exist, return directly; if it might exist, query the cache; and if the cache has nothing, query the database.
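A sketch placing Guava's BloomFilter in front of the cache; the expected-insertion count and false-positive rate are illustrative:

```java
import java.nio.charset.StandardCharsets;
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import redis.clients.jedis.Jedis;

public class BloomFilteredReader {
    private final Jedis jedis = new Jedis("localhost", 6379);
    // sized for ~1M keys at a 1% false-positive rate (illustrative numbers)
    private final BloomFilter<String> filter = BloomFilter.create(
            Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    public void register(String key) {
        filter.put(key);                     // call once for every key that exists in the DB
    }

    public String get(String key) {
        if (!filter.mightContain(key)) {
            return null;                     // definitely absent: skip cache and DB entirely
        }
        String value = jedis.get(key);       // might exist: try the cache first
        if (value == null) {
            value = loadFromDb(key);         // hypothetical DB loader
            if (value != null) {
                jedis.setex(key, 300, value);
            }
        }
        return value;
    }

    private String loadFromDb(String key) {
        return "value-from-db";
    }
}
```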

5. Several common cache patterns

1) Cache Aside

When the application queries data, it first reads from the cache; if the cache misses, it reads from the database, and the data obtained from the database is then placed into the cache.

When the application wants to update some data, it updates the data in the database, and once that update completes, it invalidates the corresponding entry in the cache. A sketch of both paths follows.
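A minimal Cache-Aside sketch covering the read and write paths; loadFromDb and updateDb are hypothetical stand-ins:

```java
import redis.clients.jedis.Jedis;

public class CacheAside {
    private final Jedis jedis = new Jedis("localhost", 6379);

    public String read(String key) {
        String value = jedis.get(key);
        if (value == null) {                 // miss: fall through to the database
            value = loadFromDb(key);         // hypothetical DB loader
            jedis.setex(key, 300, value);    // place the data into the cache
        }
        return value;
    }

    public void write(String key, String newValue) {
        updateDb(key, newValue);             // 1. update the database first
        jedis.del(key);                      // 2. then invalidate the cache entry
    }

    private String loadFromDb(String key) {
        return "value-from-db";
    }

    private void updateDb(String key, String value) {
        // stand-in for a real DB update
    }
}
```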

1) Why not update the data in the cache as well, right after finishing the database update?

Mainly because doing so performs two write operations, which under concurrency can produce dirty data. For example: two requests, A and B, execute concurrently; A reads the data while B updates it. The cache is initially empty. Request A reads the data from the database and is about to write it back to the cache when, at just that moment, request B updates the data: it finishes the database update and then updates the cache. Request A then continues and writes its data into the cache, but that data is now old, i.e. dirty data.

2) Does the Cache Aside pattern, then, have no dirty-data problem?

It can still produce dirty data in extreme cases. For example, requests A and B execute concurrently; A reads data while B writes it. If the cache initially has no entry, request A misses the cache and reads the data from the database, preparing to write it back to the cache. At that moment, request B writes: it writes the data to the database and then invalidates the cache entry. After that, request A, still holding the old value it read from the database earlier, writes it into the cache, so the stale value goes in. The cache ends up inconsistent with the database, i.e. dirty data.

2) Read/Write Through

The application reads and updates data by accessing the cache service directly.

The cache service synchronously propagates data updates to the database.

The probability of dirty data is lower, but the application depends strongly on the cache, which places higher stability requirements on the caching service.

3) Write Behind

The application reads and updates data by accessing the cache service directly.

The cache service updates the database asynchronously (via an asynchronous task).

Speed and efficiency are very high, but data consistency is weaker, data can be lost, and the implementation logic is more complex.
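A toy Write-Behind sketch: writes land in the cache immediately, and a daemon thread flushes them to the database later. The deferred queue is exactly where data can be lost, and a real implementation would add batching, retries, and crash safety; updateDb is a hypothetical stand-in:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import redis.clients.jedis.Jedis;
import redis.clients.jedis.JedisPool;

public class WriteBehindCache {
    private final JedisPool pool = new JedisPool("localhost", 6379);
    private final BlockingQueue<String> dirtyKeys = new LinkedBlockingQueue<>();

    public WriteBehindCache() {
        Thread flusher = new Thread(() -> {
            try {
                while (true) {
                    String key = dirtyKeys.take();      // wait for a deferred write
                    try (Jedis jedis = pool.getResource()) {
                        updateDb(key, jedis.get(key));  // hypothetical DB writer
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        flusher.setDaemon(true);    // queued writes die with the JVM: the data-loss case
        flusher.start();
    }

    public void write(String key, String value) {
        try (Jedis jedis = pool.getResource()) {
            jedis.set(key, value);                      // update the cache synchronously
        }
        dirtyKeys.offer(key);                           // defer the database write
    }

    private void updateDb(String key, String value) {
        // stand-in for a real DB update
    }
}
```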



Origin: blog.csdn.net/ljm_c_bok/article/details/104830069