Resolving Redis-related cache problems: cache avalanche, cache penetration, and cache breakdown

Cache avalanche, cache penetration, cache warm-up, cache update, cache degradation...

1. Cache avalanche (many keys expire at once)

Concept: Simply put, a large portion of the cache becomes invalid at the same time before new entries are written (for example, many keys were set with the same expiration time and all expire together). Requests that should have hit the cache instead all fall through to the database, putting huge pressure on the database's CPU and memory and, in severe cases, bringing the database down. This triggers a chain reaction that can cause the entire system to collapse.

Solution: Most system designers use locks (the most common solution) or queues to guarantee that a large number of threads do not read from and write to the database at the same time, preventing a flood of concurrent requests from landing on the underlying storage system when the cache fails. A simpler approach is to spread out cache expiration times by adding random jitter.
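The jitter idea can be sketched as follows. This is a minimal illustration using a plain dict as a stand-in for Redis; the key names and TTL values are made up for the example.

```python
import random
import time

# In-memory stand-in for a Redis cache: key -> (value, absolute expiry time).
cache = {}

BASE_TTL = 600       # base expiration: 10 minutes
JITTER_RANGE = 300   # up to 5 extra minutes, spread randomly per key

def set_with_jitter(key, value):
    """Store a value with a randomized TTL so keys written together
    do not all expire at the same moment."""
    ttl = BASE_TTL + random.randint(0, JITTER_RANGE)
    cache[key] = (value, time.time() + ttl)
    return ttl

# Warm 1000 keys at once; their expirations are spread over a 5-minute window
# instead of all landing on the same instant.
ttls = [set_with_jitter(f"user:{i}", i) for i in range(1000)]
```

With a real Redis client the same idea applies, e.g. `setex(key, ttl, value)` in redis-py with a jittered `ttl`.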

2. Cache breakdown (a single hot key expires)

A. The data is not in the cache but is in the database. If it is hot data, a large number of requests arrive at the moment the cached entry expires; these requests penetrate through to the DB, producing a surge of database queries and heavy load.

B. The difference from a cache avalanche: breakdown concerns a single cached key, while an avalanche involves many keys.

C. Prevention: mark hot data as never expiring, use a scheduled task to refresh the cache periodically, or use a mutex lock so that only one request rebuilds the cache.
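The mutex-lock approach can be sketched like this. It is a single-process illustration using a dict as a stand-in for Redis and `threading.Lock` as a stand-in for a distributed lock (in real Redis this is typically done with `SET key value NX EX ttl`); the key and value names are hypothetical.

```python
import threading
import time

cache = {}                        # stand-in for Redis: key -> value
rebuild_lock = threading.Lock()   # stand-in for a distributed lock
db_queries = 0

def query_db(key):
    global db_queries
    db_queries += 1
    time.sleep(0.01)              # simulate a slow database read
    return f"value-for-{key}"

def get_hot(key):
    """On a cache miss, only the thread holding the lock rebuilds the entry;
    the others re-check the cache after the lock is released."""
    val = cache.get(key)
    if val is not None:
        return val
    with rebuild_lock:
        val = cache.get(key)      # double-check after acquiring the lock
        if val is None:
            val = query_db(key)
            cache[key] = val
    return val

# 20 concurrent requests for the same expired hot key -> only one DB query.
threads = [threading.Thread(target=get_hot, args=("hot:item",)) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The double-check inside the lock is what keeps the database query count at one even under heavy concurrency.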

3. Cache penetration (querying data that does not exist)

a. A query for non-existent data always misses the cache, for example a request with id=-1 (often a deliberate hacking attack).

b. Since the data cannot be found in the database either, nothing is ever written to the cache, so every such request goes to the database and the cache loses its purpose. A large volume of queries for non-existent data can bring the database down; hammering an application with non-existent keys is a common attack pattern.

Solution:

a. Bloom filter (the most common): hash all possibly existing data into a sufficiently large bitmap. A key that definitely does not exist is intercepted by the bitmap, shielding the underlying storage system from the query.

b. A simpler, cruder method: if a query returns an empty result (whether because the data does not exist or because of a system fault), still cache the empty result, but with a very short expiration time, no more than five minutes. Subsequent requests get the cached default value instead of hitting the database again.

c. Add validation at the interface layer to reject obviously invalid parameters.
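Solution (a) can be sketched with a minimal hand-rolled Bloom filter (production systems would use a tested library or Redis modules; the sizes, key format, and id range below are made-up example values):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hash functions over an m-bit array.
    A key reported absent is definitely absent; a key reported present
    may be a false positive."""
    def __init__(self, m=65536, k=4):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8)

    def _positions(self, key):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))

# Load all valid ids at startup, then reject impossible ids
# before they ever touch the cache or the database.
bf = BloomFilter()
for valid_id in range(1, 1001):
    bf.add(f"item:{valid_id}")

def lookup(item_id):
    key = f"item:{item_id}"
    if not bf.might_contain(key):
        return None      # definitely not in the database: skip cache and DB
    return key           # would proceed to the cache/DB lookup here
```

A request like `lookup(-1)` is filtered out up front, so the forged-key attack described above never reaches the database.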

4. Cache warm-up

Concept:

Cache warm-up means loading the relevant data into the cache system right after the system goes online. This avoids the pattern of querying the database first and then caching the result when a user requests it: the user's query hits the pre-warmed cache directly.

Solutions:

1. Write a cache-refresh page and trigger it manually when going online;

2. If the data volume is small, load it automatically when the project starts;

3. Refresh the cache regularly;
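Solution 2 above can be sketched as a startup hook. This is a hypothetical illustration: the dict stands in for Redis, and `load_hot_rows_from_db` stands in for a real query such as selecting the most-viewed products.

```python
cache = {}  # stand-in for Redis

def load_hot_rows_from_db():
    # Stand-in for something like:
    # SELECT id, name FROM products ORDER BY views DESC LIMIT 3
    return [(1, "keyboard"), (2, "mouse"), (3, "monitor")]

def warm_up():
    """Preload hot rows into the cache so the first user requests
    hit the cache instead of the database."""
    for item_id, name in load_hot_rows_from_db():
        cache[f"product:{item_id}"] = name

warm_up()  # call once at application startup
```

In a real project this would run from an application-startup listener or a deployment script.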

5. Cache update

Besides the cache-invalidation strategies built into the cache server (Redis offers 6 eviction policies by default), we can also customize cache eviction according to specific business needs. There are two common strategies:

(1) Clean up the expired cache regularly;

(2) When a user request arrives, check whether the cache entry it uses has expired; if so, fetch fresh data from the underlying system and update the cache.

Both have advantages and disadvantages. The drawback of the first is that maintaining a large number of cached keys is cumbersome; the drawback of the second is that the expiry check must run on every user request, which makes the logic relatively complex. Weigh the two against your own application scenario.

6. Cache degradation

When traffic spikes and a service has problems (such as slow response times or no response at all), we still need to keep the service available, even in a degraded form. The system can degrade automatically based on key metrics, or expose switches for manual degradation. The ultimate goal of degradation is to keep core services available. Note that some operations cannot be degraded (such as adding to the shopping cart or checkout).

Degradation plans can be set by analogy with log levels:

(1) General: some services occasionally time out because of network jitter or a service going online; these can be degraded automatically;

(2) Warning: a service's success rate fluctuates over a period of time (for example, between 95% and 100%); it can be degraded automatically or manually, and an alert is sent;

(3) Error: for example, availability drops below 90%, the database connection pool is exhausted, or traffic surges to the maximum threshold the system can bear; degrade automatically or manually as the situation requires;

(4) Critical error: for example, data is corrupted for some special reason and urgent manual degradation is required. The purpose of service degradation is to prevent a Redis failure from triggering an avalanche on the database. For unimportant cached data, a degradation strategy can be adopted: a common practice is that when Redis is unavailable, the application does not query the database but returns a default value to the user directly.
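That last practice can be sketched as a fallback wrapper. The failing cache client, key name, and default value below are all hypothetical illustrations:

```python
class DownCache:
    """Simulates a Redis client whose connection is down."""
    def get(self, key):
        raise ConnectionError("cache unavailable")

def get_with_degradation(cache_client, key, default):
    """Serve from the cache when possible; if the cache layer errors out,
    return a safe default for non-critical data instead of hitting the DB."""
    try:
        value = cache_client.get(key)
        return value if value is not None else default
    except ConnectionError:
        # Degrade: do NOT fall through to the database; serving the default
        # protects the DB from an avalanche of cache misses.
        return default

# Redis is down, so the homepage banner list degrades to an empty default.
result = get_with_degradation(DownCache(), "banner:home", default="[]")
```

A manual degradation switch would simply force this fallback path regardless of cache health.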


Origin blog.csdn.net/weixin_45496190/article/details/108166363