Four major Redis cache problems

When designing a caching system, the problems you have to consider are cache penetration, cache breakdown, and the avalanche effect when cached data expires.

When the front end sends a request, the back end first tries to read the data from the cache. On a hit, it returns the result directly. On a miss, it reads from the database, updates the cache with the value it found, and returns the result. If the database has no such data either, it returns an empty result.
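The read flow described above can be sketched as follows. This is a minimal illustration, not a production implementation: the class name CacheAsideReader and the two in-memory maps standing in for Redis and the database are assumptions made for the example.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Minimal read path: try the cache, fall back to the database, refill the cache.
// Both maps are in-memory stand-ins: "cache" plays Redis, "database" plays MySQL.
class CacheAsideReader {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();
    int databaseReads = 0; // demo counter only, not thread-safe

    Optional<String> get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return Optional.of(cached);      // cache hit
        }
        databaseReads++;
        String stored = database.get(key);   // cache miss: read the database
        if (stored == null) {
            return Optional.empty();         // not in the database either
        }
        cache.put(key, stored);              // refill the cache
        return Optional.of(stored);
    }

    public static void main(String[] args) {
        CacheAsideReader reader = new CacheAsideReader();
        reader.database.put("user:1", "alice");
        System.out.println(reader.get("user:1")); // loaded from the database
        System.out.println(reader.get("user:1")); // served from the cache
        System.out.println(reader.get("user:2")); // empty result
    }
}
```

Note that in this basic form a key missing from the database is never cached, so every request for it reaches the database.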

Cache penetration

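Description:
Cache penetration refers to requests for data that exists in neither the cache nor the database, which users nevertheless keep issuing. The cache is written passively on a miss, and for fault tolerance nothing is written to it when the storage layer finds no data. As a result, every request for this nonexistent data goes to the storage layer, which defeats the purpose of the cache.
Under heavy traffic the database may go down, and if someone repeatedly attacks the application with nonexistent keys, this becomes a vulnerability.
Examples are requests for id "-1" or for a very large id that does not exist. Such users are likely attackers, and the attack puts excessive pressure on the database.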

Solution:
Add validation at the interface layer: user authentication checks and basic id checks, for example rejecting id <= 0 outright.
When a key is found in neither the cache nor the database, the pair can still be written to the cache as key-null with a short expiry time, such as 5 seconds (too long an expiry would keep the key unusable in normal operation after the data is created). This stops an attacker from brute-forcing the same id over and over.
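The two measures above can be sketched together. This is only an illustration under stated assumptions: NullCachingReader, its in-memory maps, the injected clock, and the 60-second lifetime for real values are all hypothetical; only the 5-second lifetime for null entries comes from the text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// In-memory stand-in for a cache that remembers "not found" answers for a short time.
class NullCachingReader {
    static final long NULL_TTL_MILLIS = 5_000;   // short lifetime for cached misses
    static final long VALUE_TTL_MILLIS = 60_000; // assumed lifetime for real values

    private static final class Entry {
        final String value;    // null records "the database has no such row"
        final long expiresAt;
        Entry(String value, long expiresAt) { this.value = value; this.expiresAt = expiresAt; }
    }

    private final Map<Long, Entry> cache = new HashMap<>();
    final Map<Long, String> database = new HashMap<>();
    private final LongSupplier clock;  // injected so expiry can be tested
    int databaseReads = 0;

    NullCachingReader(LongSupplier clock) { this.clock = clock; }

    // Returns the stored value, or null when the id is invalid or unknown.
    String get(long id) {
        if (id <= 0) {
            return null;                 // basic id check: reject before touching any store
        }
        long now = clock.getAsLong();
        Entry entry = cache.get(id);
        if (entry != null && now < entry.expiresAt) {
            return entry.value;          // may be a cached miss (null)
        }
        databaseReads++;
        String stored = database.get(id);
        long ttl = (stored == null) ? NULL_TTL_MILLIS : VALUE_TTL_MILLIS;
        cache.put(id, new Entry(stored, now + ttl)); // misses are cached too, but briefly
        return stored;
    }

    public static void main(String[] args) {
        NullCachingReader reader = new NullCachingReader(System::currentTimeMillis);
        reader.get(-1);   // rejected by the id check
        reader.get(999);  // misses the database and caches the miss
        reader.get(999);  // answered from the cached miss
        System.out.println("database reads: " + reader.databaseReads);
    }
}
```

The short lifetime matters: if a row with that id is created later, readers see it as soon as the cached miss expires.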

Cache breakdown

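Description:
Cache breakdown refers to data that is missing from the cache but present in the database (usually because the cache entry has expired). Because there are so many concurrent users, they all miss the cache at the same moment and all go to the database for the same data, so database load spikes instantly.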

Solution:
1. Set hot data to never expire.
2. Apply rate limiting, circuit breaking, and degradation to interfaces. Important interfaces must have a rate-limiting policy to stop malicious users from hammering them. A degradation plan should also be ready, so that when a service interface becomes unavailable the circuit breaker trips and requests fail fast.
3. Use a Bloom filter. A Bloom filter is similar to a hash set and quickly judges whether an element is in a collection. The typical use is to check quickly whether a key exists in a container and to return immediately if it does not. A Bloom filter can report false positives but never false negatives. The keys to a Bloom filter are its hash algorithm and the size of the container.
4. Use a mutex (mutex key).
A common industry approach is the mutex. Briefly, on a cache miss (the value read is null), do not load from the db immediately. First use a cache operation that reports whether it succeeded (such as Redis SETNX or Memcache ADD) to set a mutex key. If the operation succeeds, load from the db and write the value back to the cache. Otherwise, retry the whole cache get method.
SETNX is short for "SET if Not eXists": the key is set only when it does not already exist, which is enough to achieve the effect of a lock.

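public String get(String key) throws InterruptedException {
    String value = redis.get(key);
    if (value != null) {
        return value;
    }
    // A null value means the cache entry has expired.
    String key_mutex = "mutex:" + key;  // the mutex key name is an assumption
    // Give the mutex a 3-minute timeout so that a failed del cannot block
    // db loads forever the next time the cache entry expires.
    // setnx(key, value, seconds) is shorthand here; in Redis itself the atomic
    // form is SET key value NX EX seconds.
    if (redis.setnx(key_mutex, 1, 3 * 60) == 1) {  // the mutex was set successfully
        try {
            value = db.get(key);
            redis.set(key, value, expire_secs);
        } finally {
            redis.del(key_mutex);
        }
        return value;
    }
    // Another thread holds the mutex and is loading from the db and refilling
    // the cache, so wait briefly and retry reading the cache.
    Thread.sleep(50);
    return get(key);  // retry
}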
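The Bloom filter from item 3 above can be sketched in a few lines. This is a toy version for illustration: the class name SimpleBloomFilter, the choice of String.hashCode plus FNV-1a as the two base hashes, and the double-hashing scheme are assumptions of the example, not something the text prescribes.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

// Minimal Bloom filter over strings. Hash choices are illustrative, not tuned.
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;       // number of bits in the container
    private final int hashCount;  // number of bit positions set per key

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    // false: the key was definitely never added; true: it probably was.
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = fnv1a(key);
        return Math.floorMod(h1 + i * h2, size);
    }

    // 32-bit FNV-1a hash of the key's UTF-8 bytes.
    private static int fnv1a(String key) {
        int hash = 0x811C9DC5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            hash ^= (b & 0xFF);
            hash *= 16777619;
        }
        return hash;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1 << 16, 3);
        filter.add("user:1");
        System.out.println(filter.mightContain("user:1")); // true
        System.out.println(filter.mightContain("user:2")); // false unless a false positive occurs
    }
}
```

In use, every key that exists in the database is added to the filter, and a request whose key fails mightContain is answered without touching the cache or the database. A larger container and a suitable number of hashes lower the false-positive rate.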

Cache avalanche

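Description:
A cache avalanche happens when a large batch of cached data reaches its expiry time at once while query volume is huge, putting so much pressure on the database that it may even go down. It differs from cache breakdown: breakdown means many concurrent requests for the same piece of data, while an avalanche means many different entries have expired, so many lookups miss the cache and fall through to the database.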

Solution:
Randomize the expiry time of cached data so that large amounts of data do not expire at the same moment.
If the cache is deployed as a distributed cluster, spread hot data evenly across the different cache nodes.
Set hot data to never expire.
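Randomized expiry can be as simple as adding a random offset to a base lifetime. The sketch below is one way to do it; the helper name withJitter and the example values (a 3600-second base with up to 300 seconds of jitter) are assumptions for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;

class ExpiryJitter {
    // Returns baseSeconds plus a random extra in the range [0, maxJitterSeconds].
    static long withJitter(long baseSeconds, long maxJitterSeconds) {
        if (baseSeconds <= 0 || maxJitterSeconds < 0) {
            throw new IllegalArgumentException("baseSeconds must be > 0 and maxJitterSeconds >= 0");
        }
        return baseSeconds + ThreadLocalRandom.current().nextLong(maxJitterSeconds + 1);
    }

    public static void main(String[] args) {
        // Keys loaded together get expiry times spread over 3600..3900 seconds.
        for (String key : new String[] {"item:1", "item:2", "item:3"}) {
            System.out.println(key + " expires in " + withJitter(3600, 300) + "s");
        }
    }
}
```

The returned value would be passed as the expiry when writing each entry to the cache, so entries loaded in the same batch no longer expire in the same instant.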

Data consistency

Reading through the cache is generally unproblematic. Once data is updated, however, both the database and the cache have to change, and inconsistencies between the cache (Redis) and the database (MySQL) arise easily.

Whether you write to MySQL first and then delete the Redis cache, or delete the cache first and then write to the database, inconsistency is possible. For example:

1. If the Redis cache is deleted first and another thread reads before the MySQL write has completed, that thread finds the cache empty, reads the old value from the database, and writes it into the cache. The cache now holds dirty data.

2. If the database is written first and the writing thread crashes before deleting the cache, the cache entry is never deleted and the data is again inconsistent.

Because reads and writes run concurrently and their order cannot be guaranteed, the cache and the database can end up inconsistent.

How can this be solved? The approach below is the simpler of the usual options; choose by weighing business needs against technical cost.

Cache and database consistency solutions

1. Approach one: delayed double deletion

Run redis.del(key) both before and after the database write, and set a reasonable cache expiry time.

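public void write(String key, Object data) throws InterruptedException {
    redis.delKey(key);     // first delete, before the database write
    db.updateData(data);   // write the database
    Thread.sleep(500);     // wait for in-flight reads to finish
    redis.delKey(key);     // second delete removes stale data those reads cached
}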
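So how is the 500 ms determined, and how long should the thread actually sleep?

Measure how long the read path of your own project takes. The goal is to make sure in-flight read requests have finished, so that the write request can delete any dirty cache data they produced.

This strategy must also account for the time Redis and the database need for master-replica synchronization. The final sleep time for a write is the duration of the read path plus a few hundred ms, for example 1 second.

3. Set a cache expiry time

In theory, setting an expiry time on cache entries is what guarantees eventual consistency. All writes treat the database as the source of truth. Once a cache entry expires, later read requests naturally read the new value from the database and refill the cache.

4. Drawbacks of this approach

Even with double deletion combined with a cache expiry time, the worst case is that data stays inconsistent for the length of the expiry time, and write requests also take longer.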
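One common variation, which is not part of the strategy as described above, is to run the second delete on a background scheduler so that the writing thread does not sleep. The sketch below illustrates the idea under stated assumptions: the class DelayedDoubleDelete and its in-memory maps are stand-ins for Redis and MySQL, and the delay value is a placeholder to be chosen as discussed above.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Delayed double deletion where the second delete runs on a scheduler thread,
// so the writer returns without sleeping. The maps stand in for Redis and MySQL.
class DelayedDoubleDelete {
    final Map<String, String> cache = new ConcurrentHashMap<>();
    final Map<String, String> database = new ConcurrentHashMap<>();
    private final long delayMillis;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread thread = new Thread(runnable, "second-delete");
                thread.setDaemon(true); // do not keep the JVM alive for pending deletes
                return thread;
            });

    DelayedDoubleDelete(long delayMillis) { this.delayMillis = delayMillis; }

    // Returns a future that completes once the second delete has run.
    ScheduledFuture<?> write(String key, String value) {
        cache.remove(key);         // first delete, before the database write
        database.put(key, value);  // write the database
        return scheduler.schedule(() -> { cache.remove(key); }, delayMillis, TimeUnit.MILLISECONDS);
    }

    void shutdown() { scheduler.shutdown(); }

    public static void main(String[] args) throws Exception {
        DelayedDoubleDelete store = new DelayedDoubleDelete(500);
        ScheduledFuture<?> secondDelete = store.write("user:1", "new");
        store.cache.put("user:1", "old"); // a slow reader refills the cache with stale data
        secondDelete.get();               // wait for the delayed delete
        System.out.println(store.cache.containsKey("user:1")); // false
        store.shutdown();
    }
}
```

The trade-off is that if the process stops before the scheduled task runs, the second delete is lost, so the cache expiry time remains the final safeguard.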


See also: https://www.jianshu.com/p/61c6f30dc043
https://blog.csdn.net/qq_16803227/article/details/92001895

Origin blog.csdn.net/jingshuipengpeng/article/details/105302152