Redis knowledge points

1. Cache Avalanche

We can understand cache avalanche simply as: the original cache has expired but the new cache has not yet been written (for example, when we set a large batch of cache entries with the same expiration time, a wide swath of the cache expires at the same moment). All the requests that should have hit the cache then query the database instead, which puts enormous pressure on the database's CPU and memory and, in severe cases, crashes the database. This sets off a chain reaction that can bring down the entire system.

Normally, data is fetched from the Redis cache. The schematic diagram is as follows:

[Diagram: the normal flow, where requests hit Redis and return cached data]

The schematic diagram of the moment of cache invalidation is as follows:

[Diagram: at the moment the cache expires, requests fall through to the database]

Solutions for cache avalanche:

(1) When concurrency is not especially high, the most commonly used solution is to lock and queue requests, as sketched below.
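A minimal sketch of the idea in Python, assuming redis-py; `load_from_db()` is a hypothetical loader, and the key names and timeouts are illustrative rather than taken from the original code:

```python
import time
import redis

r = redis.Redis()

def get_with_lock(key, ttl=300, lock_ttl=10):
    while True:
        value = r.get(key)
        if value is not None:
            return value
        # SET NX EX acts as a distributed mutex: only one caller rebuilds.
        if r.set(f"lock:{key}", "1", nx=True, ex=lock_ttl):
            try:
                value = load_from_db(key)   # hypothetical database loader
                r.set(key, value, ex=ttl)
                return value
            finally:
                r.delete(f"lock:{key}")
        # All other callers wait briefly and retry, i.e. "queue" on the lock.
        time.sleep(0.05)
```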

Locking and queuing only relieves pressure on the database; it does not improve system throughput. Under high concurrency, if the key is locked while the cache is rebuilt, 999 out of every 1000 requests are blocked, and users may wait until they time out. It treats the symptom, not the disease!

Note: lock-and-queue has to address the concurrency problems of a distributed environment, and may in turn require solving the distributed-lock problem; threads are blocked as well, so the user experience is poor! It is therefore rarely used in truly high-concurrency scenarios!

(2) Add a corresponding cache flag to each piece of cached data, recording whether the cached data has expired; if the flag shows it has expired, refresh the data cache in the background, as sketched below.
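A minimal sketch of the flag scheme, again assuming redis-py; the `mark:` key prefix and `load_from_db()` are illustrative:

```python
import threading
import redis

r = redis.Redis()

def get_with_flag(key, flag_ttl=1800, data_ttl=3600):
    value = r.get(key)
    if r.get(f"mark:{key}") is None:
        # Flag expired: re-arm it and refresh the real key in the background;
        # this request still returns the old (stale) value immediately.
        r.set(f"mark:{key}", "1", ex=flag_ttl)
        threading.Thread(target=refresh, args=(key, data_ttl)).start()
    return value

def refresh(key, data_ttl):
    r.set(key, load_from_db(key), ex=data_ttl)  # hypothetical database loader
```

Note that the 30/60-minute TTLs described below correspond to `flag_ttl=1800` and `data_ttl=3600` here.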

Explanation:

1. Cache flag: records whether the cached data has expired. When it expires, it triggers a notification to another thread to refresh the cache of the actual key in the background;

2. Cached data: its expiration time is set to twice that of the cache flag; for example, the flag is cached for 30 minutes and the data for 60 minutes. This way, when the flag key expires, the actual cache can still return the old data to the caller, and fresh data is only returned after the background thread finishes the update.

For cache avalanche, three solutions have been proposed here: using locks or queues, setting an expiration flag to refresh the cache, and setting different expiration times for different keys. There is also an approach known as the "second-level cache"; interested readers can research it on their own.

2. Cache penetration

Cache penetration refers to queries for data that does not exist in the database, and therefore cannot exist in the cache either. Every such query misses the cache, falls through to the database, and comes back empty (two useless lookups). The request thus bypasses the cache and hits the database directly, which is the frequently mentioned cache hit-rate problem.

Solutions for cache penetration:

(1) Use a Bloom filter: hash all data that could possibly exist into a sufficiently large bitmap, so that a query for data that definitely does not exist is intercepted by the bitmap, shielding the underlying storage system from the query pressure.
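A minimal Bloom filter sketch using only the Python standard library; the bitmap size, hash count, and seeding are illustrative, not production tuning:

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1 << 20, hashes=5):
        self.size = size
        self.hashes = hashes
        self.bits = bytearray(size // 8)

    def _positions(self, key):
        # Derive `hashes` bit positions from seeded digests of the key.
        for seed in range(self.hashes):
            digest = hashlib.md5(f"{seed}:{key}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(key))
```

At startup, add every valid key to the filter; at query time, a key that fails `might_contain()` is rejected before it ever reaches the cache or the database.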

(2) If a query returns an empty result (whether because the data does not exist or because of a system failure), we still cache the empty result, but with a very short expiration time, no longer than five minutes. A default value is stored directly in the cache, so the second lookup for the same key is served from the cache without touching the database. This method is the simplest and crudest!

 

Caching empty results means the next identical request returns empty straight from the cache, avoiding the penetration caused by empty query results, as sketched below. You can also set up a separate cache region for null values and pre-check the queried key against it before releasing the request to the normal cache-processing logic.
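A minimal sketch of caching empty results, assuming redis-py; the `__NULL__` sentinel and `load_from_db()` are illustrative, and the five-minute cap follows the text above:

```python
import redis

r = redis.Redis()
NULL = b"__NULL__"  # sentinel meaning "known to be absent"

def get_or_null(key, ttl=3600, null_ttl=300):
    value = r.get(key)
    if value == NULL:
        return None                        # known miss: skip the database
    if value is not None:
        return value
    value = load_from_db(key)              # hypothetical database loader
    if value is None:
        r.set(key, NULL, ex=null_ttl)      # cache the empty result briefly
        return None
    r.set(key, value, ex=ttl)
    return value
```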

3. Cache preheating

Cache preheating means loading the relevant cache data into the caching system in advance, right after the system goes online. This avoids having to query the database first and then cache the data when a user requests it; the user directly hits the pre-warmed cache!

Cache warm-up solutions:

(1) Write a cache-refresh page and trigger it manually at launch;

(2) If the amount of data is not large, it can be loaded automatically when the project starts (see the sketch after this list);

(3) Refresh the cache regularly;
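A minimal warm-up sketch for option (2), assuming redis-py; `hot_item_keys()` and `load_from_db()` are hypothetical helpers standing in for your data layer:

```python
import redis

r = redis.Redis()

def preheat_cache(ttl=3600):
    # Run once at application startup, before traffic is admitted.
    for key in hot_item_keys():       # hypothetical: the keys worth warming
        r.set(key, load_from_db(key), ex=ttl)
```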

4. Cache update

Beyond the cache-invalidation strategies that come with the cache server (Redis offers six strategies to choose from by default), we can also customize cache eviction according to specific business needs. There are two common strategies:

(1) Regularly clear the expired cache;

(2) When a user makes a request, check whether the cache entry the request uses has expired; if it has, fetch fresh data from the underlying system and update the cache (see the sketch below).

Each has its pros and cons. The drawback of the first is that maintaining a large number of cached keys is troublesome; the drawback of the second is that the cache must be checked for expiration on every user request, which makes the logic relatively complex! Weigh the two against your own application scenario.
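A minimal sketch of the second strategy, an expiry check on every read, assuming redis-py; the `ts:` timestamp key and `load_from_db()` are illustrative:

```python
import time
import redis

r = redis.Redis()

def get_lazy(key, max_age=3600):
    value = r.get(key)
    stamp = r.get(f"ts:{key}")
    fresh = stamp is not None and time.time() - float(stamp) < max_age
    if value is not None and fresh:
        return value
    # Missing or expired: fetch from the underlying system, update both keys.
    value = load_from_db(key)             # hypothetical database loader
    r.set(key, value)
    r.set(f"ts:{key}", str(time.time()))
    return value
```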

5. Cache Degradation

When traffic surges, when a service has problems (such as slow or unresponsive response times), or when non-core services affect the performance of core flows, the service must still be kept available, even in degraded form. The system can degrade automatically based on key metrics, or switches can be configured for manual degradation.

The ultimate goal of degradation is to keep core services available, even if degraded. And some services can never be degraded (e.g. adding to cart, checkout). A minimal sketch of a degrade switch follows.
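This sketch stores a manual degrade switch in Redis, assuming redis-py; the switch key, `recommend_service()`, and `default_recommendations()` are illustrative:

```python
import redis

r = redis.Redis()

def get_recommendations(user_id):
    # Ops can flip the switch manually with: SET degrade:recommend 1
    if r.get("degrade:recommend") == b"1":
        return default_recommendations()   # hypothetical static fallback
    try:
        return recommend_service(user_id)  # hypothetical non-core service
    except Exception:
        return default_recommendations()   # also degrade on failure
```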

Before degrading, you should survey the system: can you sacrifice a pawn to save the king, that is, sort out which services must be protected at all costs and which may be degraded? For example, you can refer to a plan modeled on log levels:

(1) General: for example, some services occasionally time out because of network jitter or because the service is being deployed; these can be degraded automatically;

(2) Warning: some services' success rate fluctuates for a period of time (for example, between 95% and 100%); these can be degraded automatically or manually, and an alarm should be raised;

(3) Error: for example, availability drops below 90%, the database connection pool is exhausted, or traffic suddenly surges to the maximum threshold the system can bear; at this point the service can be degraded automatically or manually, depending on the situation;

(4) Serious error: for example, data becomes corrupted for some special reason; an emergency manual degrade is required.

 

