Cache Penetration, Avalanche, and Breakdown in Redis

Overview:

  1. Cache penetration: a large number of requests for keys that do not exist at all bypass the cache and increase the pressure on the application server and the database
  2. Cache avalanche: a large number of keys in Redis expire at the same time, database pressure surges, and the server may crash
  3. Cache breakdown: a single hotspot key in Redis expires while a large number of users are accessing it, so database access pressure surges instantly

The root cause shared by all three: the Redis hit rate drops, so requests land directly on the DB.

        Under normal circumstances, most resource requests are answered by Redis; only the small fraction of requests that miss Redis go on to the DB. The pressure on the DB is therefore low and it works normally.

Solutions:

1. Cache penetration

        The data for the requested key does not exist in Redis. Every request for this key misses the cache and is forwarded to the database; if the traffic is heavy, the database may be overwhelmed. For example, consider querying user information with a user ID that exists in neither the Redis cache nor the database: if a hacker exploits this vulnerability and floods the system with requests for non-existent data, the database can be crushed.


Solutions:

Data that is certain not to exist exposes a weakness of passive caching: the cache is only written on a miss, and, for fault tolerance, data that cannot be found in the storage layer is not written back to the cache. As a result, every request for this non-existent data goes to the storage layer, defeating the purpose of the cache.

  • Cache empty values: if a query returns an empty result (whether or not the data exists), cache the empty result (null) anyway to relieve pressure on the database, and give it a very short expiration time, no longer than five minutes. (Only a simple stopgap.)
  • Set an accessible list (whitelist): use the bitmaps type to define a list of accessible ids, with each id serving as an offset into the bitmap. On each access, check the id against the bitmap; if the id is not in the bitmap, intercept the request and deny access.
  • Bloom filter: hash all possibly-existing data into a sufficiently large bitmap. Data that certainly does not exist will be intercepted by this bitmap, avoiding query pressure on the underlying storage system.
  • Data validation: similar to user-permission checks, obviously invalid accesses such as id=-3872 are intercepted directly, so these requests never reach Redis or the DB.
  • Real-time monitoring: when the Redis hit rate starts to drop rapidly, inspect the accessing clients and the data being accessed, and work with operations staff to set up a blacklist and restrict service.
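The empty-value caching idea above can be sketched in a few lines. This is an in-process simulation, with a Python dict standing in for Redis and another dict for the database; the names `get_user`, `NULL_TTL`, and `NORMAL_TTL` are illustrative assumptions, not part of any real API.

```python
import time

CACHE = {}                             # simulated Redis: key -> (value, expires_at)
DB = {"user:1": {"name": "alice"}}     # simulated database

NULL_TTL = 60                          # short TTL for cached nulls (well under 5 minutes)
NORMAL_TTL = 3600                      # normal TTL for real data

def cache_get(key):
    """Return (value, hit). Expired entries count as misses."""
    entry = CACHE.get(key)
    if entry is None:
        return None, False
    value, expires_at = entry
    if time.time() > expires_at:
        del CACHE[key]
        return None, False
    return value, True

def get_user(key):
    value, hit = cache_get(key)
    if hit:
        return value                   # may be None: the cached "empty" marker
    value = DB.get(key)                # cache miss -> query the database
    # Cache even the empty result, but with a much shorter TTL, so that
    # repeated requests for a non-existent key stop reaching the DB.
    ttl = NORMAL_TTL if value is not None else NULL_TTL
    CACHE[key] = (value, time.time() + ttl)
    return value
```

After the first miss on a non-existent key, subsequent requests within the short TTL are answered by the cached null instead of the database.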

Precautions:

  1. When caching a null value, do not set the key's expiration time too long, to avoid occupying too many Redis resources
  2. Null-value caching is a passive defense. When a hacker hammers the system with requests for many different non-existent keys, a large number of null values must be written to Redis, which may exhaust Redis memory
  3. With a Bloom filter, you can judge whether a resource exists at the moment the user accesses it, and directly deny access if it does not
  4. A Bloom filter has a certain false-positive rate, so it is generally combined with interface rate limiting (capping the frequency of a user's requests over a period of time), permission checks, blacklists, and so on to mitigate cache penetration
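To make the Bloom filter idea concrete, here is a minimal from-scratch sketch: k hash functions set bits in an m-bit array, and a lookup that finds any unset bit proves the item was never added (false positives are possible, false negatives are not). The class name, parameters (`m=1024`, `k=3`), and the `allow_request` helper are all assumptions for illustration; production systems would typically use a library or the RedisBloom module instead.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array.
    False positives are possible; false negatives are not."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0                  # integer used as a bit array

    def _positions(self, item):
        # Derive k positions by salting a cryptographic hash.
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(h, 16) % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits >> pos & 1 for pos in self._positions(item))

# Preload every valid id (hypothetical sample data).
bf = BloomFilter()
for user_id in ("1001", "1002", "1003"):
    bf.add(user_id)

def allow_request(user_id):
    # Reject ids the filter has definitely never seen,
    # before the request reaches Redis or the database.
    return bf.might_contain(user_id)
```

Because of the false-positive rate, a small fraction of invalid ids may still pass the filter, which is why the text recommends pairing it with rate limiting and permission checks.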

2. Cache avalanche

        The data for a large number of keys exists, but their entries in Redis have expired. If many concurrent requests arrive at that moment, each will find the cache expired, load the data from the backend DB, and set it back into the cache; together these concurrent requests can instantly overwhelm the backend DB. A cache avalanche involves many keys failing at once, so Redis stops hitting and database pressure surges; cache breakdown means a single popular key fails, with the same effect on the database.


Solutions:

  • Build a multi-level cache architecture: nginx cache + Redis cache + other caches (Ehcache, etc.). The program design becomes more complicated
  • Use locks or queues: locks or queues ensure that large numbers of threads do not read from and write to the database all at once, so that concurrent requests do not all fall on the underlying storage system when keys expire. This is inefficient and not suited to high-concurrency scenarios
  • Set a cache expiration flag: record whether the cached data is about to expire (with some lead time); if it is, trigger a separate background thread to refresh the actual key's cache
  • Spread out the expiration times: add a randomly generated offset to each key's expiration time so that keys do not all expire together
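The last point, spreading out expiration times, is the cheapest defense and can be sketched in one function. The base TTL of 3600 seconds and the 600-second jitter window are arbitrary example values; with a real Redis client this value would be passed to something like `SETEX`.

```python
import random

def ttl_with_jitter(base=3600, jitter=600):
    """Return the base TTL plus a random offset, so keys that are
    loaded together do not all expire at the same instant."""
    return base + random.randint(0, jitter)

# Hypothetical usage: each key gets a slightly different lifetime.
ttls = {f"item:{i}": ttl_with_jitter() for i in range(3)}
```

Instead of one synchronized expiry wave, reloads from the DB are smeared across the jitter window.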

3. Cache breakdown:

The cause of cache breakdown: a hot key in Redis expires just as a large number of users are accessing it

  • Redis itself is running normally and there is no mass expiration (an expired key cannot be served from the cache; a miss requires a database access)
  • Database access pressure surges instantly

Solutions:

  • Pre-warm hot data: before the Redis access peak, store the expected hot data in Redis in advance, and lengthen the TTL of these hot keys
  • Monitor data and adjust in time: monitor which data is popular in production, and adjust key expiration times in real time
  • Locking mechanism: only one request can acquire the mutex; it queries the data in the DB and writes it back to Redis, after which all other requests get their responses from Redis
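The locking mechanism above can be sketched in-process. A dict stands in for Redis, and `try_lock` simulates the atomic acquire that a real deployment would do with `SET lock:key value NX EX seconds`; the names `get_hot`, `try_lock`, and the sample data are illustrative assumptions.

```python
import threading
import time

CACHE = {}                       # simulated Redis cache
DB = {"hot:item": "payload"}     # simulated database
LOCKS = {}                       # simulated lock keys
_guard = threading.Lock()        # makes LOCKS updates atomic in this sketch

def try_lock(key):
    """Only one caller acquires the lock (stand-in for SET ... NX EX)."""
    with _guard:
        if key in LOCKS:
            return False
        LOCKS[key] = True
        return True

def release_lock(key):
    with _guard:
        LOCKS.pop(key, None)

def get_hot(key):
    value = CACHE.get(key)
    if value is not None:
        return value                       # served by Redis
    if try_lock("lock:" + key):
        try:
            value = DB[key]                # only this request hits the DB
            CACHE[key] = value             # write the result back to Redis
            return value
        finally:
            release_lock("lock:" + key)
    # Everyone else waits briefly, then retries against the rebuilt cache.
    time.sleep(0.05)
    return get_hot(key)
```

When the hot key expires, one request rebuilds the cache while the rest retry until Redis can answer again, so the DB sees a single load instead of a stampede. A real distributed lock also needs an expiration on the lock key itself, so a crashed holder cannot block everyone forever.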

Origin blog.csdn.net/zhoushimiao1990/article/details/130658714