Redis cache exceptions

Cache avalanche

  1. What is a cache avalanche?
    A cache avalanche occurs when a large portion of the cache expires at the same time, so subsequent requests all fall through to the database, which must absorb a large number of requests in a short period and may crash.
  2. Solutions
    1. Randomize the expiration time of cached data so that large amounts of data do not expire at the same moment.
    2. When concurrency is not particularly high, the most common solution is to lock and queue.
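      Pseudocode:
      /*
      Note: locking and queueing only relieves pressure on the database; it does not raise system throughput.
      Under high concurrency the key stays locked while the cache is rebuilt, so of 1000 incoming requests, 999 are blocked.
      Users can still time out while waiting, so this treats the symptom rather than the cause.
      In a distributed environment the approach may also require a distributed lock,
      and blocked threads make for a poor user experience, so it is rarely used under truly high concurrency.
      */
      // Pseudocode: CacheHelper and GetProductListFromDB() are placeholders.
      public String GetProductListNew() {
          int cacheTime = 30; // cache lifetime, in minutes
          String cacheKey = "product_list";
          // intern() guarantees that every thread locks on the same String instance.
          String lockKey = cacheKey.intern();

          String cacheValue = CacheHelper.Get(cacheKey);
          if (cacheValue != null) {
              return cacheValue;
          }
          synchronized (lockKey) {
              // Double-check: another thread may have rebuilt the cache while we waited.
              cacheValue = CacheHelper.Get(cacheKey);
              if (cacheValue == null) {
                  // Usually a SQL query goes here.
                  cacheValue = GetProductListFromDB();
                  CacheHelper.Add(cacheKey, cacheValue, cacheTime);
              }
          }
          return cacheValue;
      }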
      
    3. Attach a cache mark to each cached entry that records whether the entry has expired; when the mark has expired, refresh the cached data.
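      Pseudocode:
      /* Explanation:
      1. Cache mark: records whether the cached data has expired; when the mark expires, another thread is notified to refresh the cache of the actual key in the background.
      2. Cached data: its expiration time is twice that of the mark, e.g. a 30-minute mark with a 60-minute data cache.
      This way, after the mark key expires the actual cache can still return the old data to callers, and the new data is returned only after the background thread finishes the update.

      Three solutions to the cache avalanche have been presented here:
          using locks or queues, refreshing the cache via an expiration mark,
          and giving keys different expiration times.
          A further approach known as "second-level cache" is left for interested readers to explore.
      */
      // Pseudocode: CacheHelper and GetProductListFromDB() are placeholders, and
      // ThreadPool.QueueUserWorkItem stands for any background executor.
      public String GetProductListNew() {
          int cacheTime = 30; // lifetime of the mark, in minutes
          String cacheKey = "product_list";
          // Cache mark
          String cacheSign = cacheKey + "_sign";

          String sign = CacheHelper.Get(cacheSign);
          // Fetch the cached value
          String cacheValue = CacheHelper.Get(cacheKey);
          if (sign != null) {
              return cacheValue; // Mark not expired: return directly
          }
          // Mark expired: reset it first so that later requests do not also trigger a refresh.
          CacheHelper.Add(cacheSign, "1", cacheTime);
          ThreadPool.QueueUserWorkItem((arg) -> {
              // Usually a SQL query goes here.
              String freshValue = GetProductListFromDB();
              // Keep the data for twice the mark's lifetime so stale data can be served meanwhile.
              CacheHelper.Add(cacheKey, freshValue, cacheTime * 2);
          });
          return cacheValue; // Possibly stale, or null on a cold cache
      }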
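
The first solution above, randomized expiration, can be sketched in a few lines. This is only an illustration: the class name TtlJitter, the method withJitter, and the 1800 s base with up to 300 s of jitter are assumptions, not values from the original article.

```java
import java.util.concurrent.ThreadLocalRandom;

class TtlJitter {
    /** Returns baseSeconds plus a random offset in [0, maxJitterSeconds]. */
    static int withJitter(int baseSeconds, int maxJitterSeconds) {
        return baseSeconds + ThreadLocalRandom.current().nextInt(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        // Keys written in the same batch get expirations spread between 1800 and 2100 seconds.
        for (int i = 0; i < 3; i++) {
            System.out.println("product_" + i + " ttl=" + withJitter(1800, 300));
        }
    }
}
```

The resulting value would be passed as the expiration time when writing each key, so that entries loaded together no longer expire together.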
      

Cache penetration

  1. What is cache penetration?
    Cache penetration occurs when requests ask for data that exists neither in the cache nor in the database, so every such request falls through to the database, which can crash under a large number of requests in a short period.
  2. Solutions
    1. Add validation at the interface layer, such as user authentication and basic checks on the id; requests with id<=0 are rejected outright.
    2. When data is found neither in the cache nor in the database, write the key to the cache with a null value (key-null) and a short expiration time, such as 30 seconds (too long a time would keep the key unusable after the data is legitimately created). This stops an attacker from hammering the database with the same id repeatedly.
    3. Use a Bloom filter: hash all keys that may exist into a sufficiently large bitmap, so that a request for data that definitely does not exist is intercepted by the bitmap and never puts query pressure on the underlying storage system.
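
The three measures above can be combined in one lookup path. The sketch below is a simplified, in-memory illustration: PenetrationGuard, the two hash functions, and the 2^20-bit filter size are assumptions, a ConcurrentHashMap stands in for Redis, and a plain Map stands in for the database. In a real Redis setup the cached "not found" marker would be written with a short expiration such as 30 seconds.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

class PenetrationGuard {
    private static final int BITS = 1 << 20;            // Bloom filter size (assumed)
    private final BitSet bloom = new BitSet(BITS);
    // Stand-in for Redis; Optional.empty() marks a cached "not found".
    private final Map<Long, Optional<String>> cache = new ConcurrentHashMap<>();
    private final Map<Long, String> db;                  // stand-in for the database
    private final AtomicInteger dbQueries = new AtomicInteger();

    PenetrationGuard(Map<Long, String> db) {
        this.db = db;
        // Register every existing id in the Bloom filter up front.
        for (long id : db.keySet()) {
            bloom.set(h1(id));
            bloom.set(h2(id));
        }
    }

    private static int h1(long id) {
        return Math.floorMod(Long.hashCode(id), BITS);
    }

    private static int h2(long id) {
        return Math.floorMod(Long.hashCode(id * 0x9E3779B97F4A7C15L), BITS);
    }

    String get(long id) {
        if (id <= 0) return null;                                   // 1. basic validation
        if (!bloom.get(h1(id)) || !bloom.get(h2(id))) return null;  // 3. definitely absent
        Optional<String> cached = cache.get(id);
        if (cached != null) return cached.orElse(null);             // hit, possibly a cached miss
        dbQueries.incrementAndGet();
        Optional<String> loaded = Optional.ofNullable(db.get(id));
        cache.put(id, loaded);                                      // 2. cache misses as well
        return loaded.orElse(null);
    }

    int dbQueryCount() {
        return dbQueries.get();
    }
}
```

A Bloom filter can report false positives but never false negatives, which is why the null-value cache is still kept behind it: ids that slip through the filter are caught by the cached miss on the next request.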

Cache breakdown

  1. What is cache breakdown?
    Cache breakdown occurs when data is absent from the cache but present in the database (usually because the cache entry has expired). Because many users request it concurrently, they all miss the cache at the same time and all go to the database at the same time, so the pressure on the database spikes instantly. Unlike a cache avalanche, cache breakdown involves concurrent queries for the same piece of data; a cache avalanche means that many different entries have expired, so many lookups miss the cache and go to the database.
  2. Solutions
    1. Set hotspot data to never expire.
    2. Add a mutex so that only one request rebuilds the cache entry while the others wait.
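
A minimal sketch of the mutex approach, assuming a single JVM: one lock object per key, so that concurrent misses on the same hot key result in a single database load. HotKeyLoader and the loadFromDb callback are hypothetical names, and a ConcurrentHashMap stands in for Redis.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

class HotKeyLoader {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, Object> locks = new ConcurrentHashMap<>();

    String get(String key, Function<String, String> loadFromDb) {
        String value = cache.get(key);
        if (value != null) return value;
        // One lock per key: requests for other keys are not blocked.
        Object lock = locks.computeIfAbsent(key, k -> new Object());
        synchronized (lock) {
            value = cache.get(key);            // double-check after acquiring the lock
            if (value == null) {
                value = loadFromDb.apply(key); // only the first thread gets here
                if (value != null) cache.put(key, value);
            }
        }
        return value;
    }
}
```

Compared with locking on the whole cache, a per-key lock keeps unrelated keys responsive while the hot key is being rebuilt.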

Cache warm-up

  1. What is cache warm-up?
    Cache warm-up means loading the relevant data into the cache system as soon as the system goes online. This avoids having the first user requests query the database and then populate the cache; users read the pre-warmed cache data directly.
  2. Solutions
    1. Write a cache refresh page and trigger it manually when going online.
    2. If the amount of data is small, load it automatically when the project starts.
    3. Refresh the cache on a schedule.
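
Loading at startup and refreshing on a schedule can be sketched as follows. CacheWarmer and the loadHotData supplier are hypothetical; the supplier represents whatever database query returns the data worth pre-loading, and a ConcurrentHashMap stands in for the cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class CacheWarmer {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Supplier<Map<String, String>> loadHotData; // hypothetical DB query

    CacheWarmer(Supplier<Map<String, String>> loadHotData) {
        this.loadHotData = loadHotData;
    }

    /** Loads all hot entries into the cache; call once at startup. */
    void warmUp() {
        cache.putAll(loadHotData.get());
    }

    /** Re-runs warmUp at a fixed interval; the caller must shut the scheduler down. */
    ScheduledExecutorService refreshEvery(long period, TimeUnit unit) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Note: if warmUp throws, scheduleAtFixedRate suppresses all later runs.
        scheduler.scheduleAtFixedRate(this::warmUp, period, period, unit);
        return scheduler;
    }

    String get(String key) {
        return cache.get(key);
    }
}
```

At startup an application would call warmUp() before accepting traffic, then refreshEvery(10, TimeUnit.MINUTES) or a similar interval suited to how quickly the data changes.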

Cache degradation

https://www.iteye.com/blog/1181731633-2370315

When traffic surges, a service has problems (such as slow or no responses), or non-core services degrade the performance of the core flow, the service must still stay available, even at reduced quality. The system can degrade automatically based on key metrics, or manual degradation can be enabled through configuration switches.

The ultimate goal of cache degradation is to keep core services available, even if at reduced quality. Some services cannot be degraded at all (such as adding to the shopping cart or checkout).

Before degrading, review the system to decide what can be sacrificed to protect what matters most, that is, which parts must be protected and which can be degraded. For example, a plan can be modeled on log levels:

  1. General: some services occasionally time out because of network jitter or because the service is being deployed; these can be degraded automatically.
  2. Warning: a service's success rate fluctuates over a period (for example, between 95% and 100%); it can be degraded automatically or manually, and an alert is sent.
  3. Error: availability drops below 90%, the database connection pool is exhausted, or traffic suddenly reaches the maximum the system can withstand; degrade automatically or manually depending on the situation.
  4. Serious error: the data is wrong for some special reason; urgent manual degradation is required.

The purpose of service degradation is to prevent a Redis failure from causing an avalanche in the database. For unimportant cached data, a degradation strategy can therefore be adopted. A common practice is that when Redis has a problem, the application does not query the database but directly returns a default value to the user.
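
The practice of returning a default value can be sketched with a manual switch plus automatic fallback when the cache call fails. DegradableCache, cacheLookup, and dbLookup are hypothetical names: cacheLookup represents a Redis read that may throw when Redis is unavailable, and dbLookup the normal read-through query.

```java
import java.util.function.Function;

class DegradableCache {
    private volatile boolean degraded = false;            // manual degradation switch
    private final Function<String, String> cacheLookup;   // hypothetical Redis read, may throw
    private final Function<String, String> dbLookup;      // hypothetical database read
    private final String defaultValue;

    DegradableCache(Function<String, String> cacheLookup,
                    Function<String, String> dbLookup,
                    String defaultValue) {
        this.cacheLookup = cacheLookup;
        this.dbLookup = dbLookup;
        this.defaultValue = defaultValue;
    }

    void setDegraded(boolean on) {
        degraded = on;
    }

    String get(String key) {
        if (degraded) return defaultValue;                // manual degradation
        String value;
        try {
            value = cacheLookup.apply(key);
        } catch (RuntimeException e) {
            return defaultValue;                          // cache is down: protect the database
        }
        return value != null ? value : dbLookup.apply(key); // ordinary miss: read through
    }
}
```

This only suits data whose default is acceptable to show, such as a recommendation list; services that cannot be degraded, like checkout, should not go through such a wrapper.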

Hot data and cold data

https://www.jianshu.com/p/053ba529bf02

Cache hot key

When a key in the cache (such as a promotional item) expires, there may be a large number of concurrent requests for that key at that very moment. On finding the entry expired, each request typically loads the data from the back-end DB and writes it back to the cache, so the burst of concurrent requests can instantly overwhelm the back-end DB.

Solution

Lock the cache query: if the key does not exist, acquire a lock, load the data from the DB into the cache, and then release the lock. Other processes that find the lock held wait for it to be released and then either return the cached data or query the DB.
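
The wait-then-return-or-query behaviour described above can be sketched with a timed lock. This is a single-JVM illustration under stated assumptions: HotKeyCache, loadFromDb, and waitMillis are hypothetical, and across several processes the in-memory lock would have to be replaced by a distributed one, for example a Redis key written with SET and the NX and PX options.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

class HotKeyCache {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<String, ReentrantLock> locks = new ConcurrentHashMap<>();

    String get(String key, Function<String, String> loadFromDb, long waitMillis) {
        String value = cache.get(key);
        if (value != null) return value;
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        boolean locked = false;
        try {
            // Wait for whoever is rebuilding the entry, but only up to waitMillis.
            locked = lock.tryLock(waitMillis, TimeUnit.MILLISECONDS);
            value = cache.get(key);            // may have been filled while we waited
            if (value == null) {
                // Either we hold the lock or the wait timed out: query the DB.
                value = loadFromDb.apply(key);
                if (value != null) cache.put(key, value);
            }
            return value;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return loadFromDb.apply(key);
        } finally {
            if (locked) lock.unlock();
        }
    }
}
```

The timeout bounds how long a request can be blocked, at the cost of letting some requests reach the DB when the rebuild is slow.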


Origin blog.csdn.net/weixin_44533129/article/details/112734436