Caching practices in Internet applications

Introduction

In today's Internet applications, caching is a powerful tool that plays a pivotal role in application performance, and it is used almost everywhere. Along the path of a request (user -> browser cache -> reverse proxy cache -> web server cache -> application cache -> database cache), nearly every hop relies on a cache. Switches, network adapters, and hard drives have caches too, but they are outside the scope of this discussion. The "cache" discussed here is the technique of trading space for time: data is stored temporarily somewhere, in memory or on disk, in order to avoid a time-consuming operation. Typical time-consuming operations are database queries and expensive computations. Caching is also used to reduce pressure on the server. That pressure comes from queries and computations as well: each one may be short, but when they are very frequent the total cost adds up, requests queue up, and the server can no longer keep up.

Cache media

In hardware terms a cache lives either in memory or on disk, but technically the media can be divided into memory, disk files, and databases.

Memory: Storing the cache in memory is the fastest option because it adds no extra I/O overhead. The drawback is that memory is not persisted to disk: if the application crashes and restarts, the data is hard or impossible to recover.
Hard disk: Many caching frameworks combine memory and disk. When the allocated memory is full, or in abnormal situations, in-memory data can be passively or actively persisted to disk in order to free space or back up the data.
Database: As mentioned above, one goal of adding a cache is to reduce I/O pressure on the database. The database meant here is a special NoSQL store with a simple key-value structure (such as BerkeleyDB or Redis), whose response time and throughput are far better than those of the relational databases we normally use.

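Cache hit rate

The cache hit rate is usually defined as the ratio of cache lookups that hit to the total number of cache lookups. The higher the hit rate the better; it is an important indicator of whether the cache is being used well.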
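
As a small illustration of the ratio described above, the sketch below counts lookups and hits and divides one by the other. The names HitRateCounter, record, and hitRate are made up for this example and do not come from any cache library; returning 0.0 before the first lookup is also just an assumption.

```java
import java.util.concurrent.atomic.LongAdder;

// Hypothetical helper (not from any cache library): tracks lookups and hits.
class HitRateCounter {
    private final LongAdder hits = new LongAdder();
    private final LongAdder lookups = new LongAdder();

    // Call once per cache lookup; pass true when the lookup found a value.
    void record(boolean hit) {
        lookups.increment();
        if (hit) {
            hits.increment();
        }
    }

    // hit rate = hits / total lookups; assumed to be 0.0 before any lookup.
    double hitRate() {
        long total = lookups.sum();
        return total == 0 ? 0.0 : (double) hits.sum() / total;
    }
}

class HitRateDemo {
    public static void main(String[] args) {
        HitRateCounter counter = new HitRateCounter();
        counter.record(true);
        counter.record(true);
        counter.record(true);
        counter.record(false);
        // 3 hits out of 4 lookups
        System.out.println(counter.hitRate());
    }
}
```

Caching libraries usually expose a similar statistic themselves, so a hand-written counter like this is mainly useful when wrapping a cache client that does not.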

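Cache eviction strategy

In distributed caches, eviction strategies mainly fall into the following types:
1. Time-based
TTL (time to live): how long after its creation a cache entry expires
TTI (time to idle): how long after its last read or update a cache entry expires
2. Space-based
The storage size is capped, for example the space requested when applying for a TAIR instance. Once this threshold is exceeded, data is removed according to an eviction algorithm.
3. Capacity-based
The number of cache entries is capped. Once this threshold is exceeded, data is removed according to an eviction algorithm.

Overall, a cache system first removes expired data based on time. If the space or capacity threshold is still exceeded, it removes data according to the configured algorithm. The main eviction algorithms are LRU, FIFO, and LFU, and LRU is the most commonly used: the distributed caches memcached, redis, and tair all support it, as do the local caches guava cache and ehcache.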
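
To make the difference between TTL and TTI concrete, here is a minimal sketch of a cache entry that checks both. ExpiringEntry and its fields are hypothetical names chosen for illustration, and the current time is passed in explicitly (an assumption made to keep the logic easy to follow), whereas a real cache would read a clock.

```java
// Hypothetical entry type (not from any cache library) illustrating TTL and TTI.
class ExpiringEntry<V> {
    private final V value;
    private final long createdAtMillis;
    private long lastAccessMillis;
    private final long ttlMillis; // allowed lifetime counted from creation
    private final long ttiMillis; // allowed idle time counted from the last access

    ExpiringEntry(V value, long nowMillis, long ttlMillis, long ttiMillis) {
        this.value = value;
        this.createdAtMillis = nowMillis;
        this.lastAccessMillis = nowMillis;
        this.ttlMillis = ttlMillis;
        this.ttiMillis = ttiMillis;
    }

    // An entry is expired as soon as either limit is reached.
    boolean isExpired(long nowMillis) {
        boolean ttlExpired = nowMillis - createdAtMillis >= ttlMillis;
        boolean ttiExpired = nowMillis - lastAccessMillis >= ttiMillis;
        return ttlExpired || ttiExpired;
    }

    // Returns the value and resets the idle timer, or null if the entry has expired.
    V get(long nowMillis) {
        if (isExpired(nowMillis)) {
            return null;
        }
        lastAccessMillis = nowMillis;
        return value;
    }
}

class ExpiringEntryDemo {
    public static void main(String[] args) {
        // TTL of 60 s and TTI of 10 s, created at time 0.
        ExpiringEntry<String> entry = new ExpiringEntry<>("profile", 0L, 60_000L, 10_000L);
        System.out.println(entry.get(5_000L));        // read inside the idle window
        System.out.println(entry.isExpired(20_000L)); // idle for 15 s since the last read
    }
}
```

Reading an entry pushes back its TTI deadline but never its TTL deadline, which is why a frequently read entry still expires eventually.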

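A brief introduction to the eviction algorithms:

FIFO (first in, first out)
The entries that entered the cache first are evicted first when the cache runs out of space (exceeds the maximum number of elements), to make room for new data. The algorithm compares the creation time of the elements. This strategy suits scenarios that require fresh data, since it favors keeping the newest data available.
LFU (least frequently used)
Regardless of whether elements have expired, the algorithm looks at how many times each element has been used and evicts the elements used least often. It compares the elements' hitCount (number of hits). This strategy suits scenarios where high-frequency data must stay available.
LRU (least recently used)
Regardless of whether elements have expired, the algorithm looks at the timestamp of each element's last use and evicts the element whose last use lies furthest in the past. It compares the time of the most recent get. This strategy suits hot-data scenarios, since it favors keeping hot data available.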
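
The LRU strategy described above can be sketched in a few lines on top of java.util.LinkedHashMap, whose access-order mode and removeEldestEntry hook exist for exactly this purpose. The class name LruCache is chosen for this example only. This is a single-threaded sketch and not how memcached, redis, or tair implement eviction.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: LinkedHashMap keeps entries ordered by most recent access.
// Not thread-safe; wrap it or use a caching library for concurrent use.
class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    LruCache(int maxEntries) {
        // The third argument (accessOrder = true) orders entries from least
        // to most recently accessed instead of by insertion time.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true evicts
    // the eldest entry, which in access order is the least recently used.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}

class LruDemo {
    public static void main(String[] args) {
        LruCache<String, Integer> cache = new LruCache<>(2);
        cache.put("a", 1);
        cache.put("b", 2);
        cache.get("a");    // "a" becomes the most recently used entry
        cache.put("c", 3); // capacity exceeded, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Passing false as the third constructor argument keeps insertion order instead, which turns the same class into a FIFO cache.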

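Cache usage scenarios and classification

The purpose of a cache is to improve overall system performance. A cache works by reading data from the cache first and, on a miss, reading the actual data from the slow device and synchronizing it into the cache. Frequently queried data, frequently accessed data, hot data, I/O-bound data, data that is expensive to compute, and data that fits the five-minute rule and the principle of locality can all be cached.

Common caching scenarios in Internet applications include:

Database cache: As business volume grows, the database stores more and more data and handles more and more concurrent requests. Database load rises and response times get slower, and in severe cases the service may even be interrupted. Enabling a cache at this point can improve system performance.
Temporary data storage: Applications need to maintain large amounts of temporary data, such as counters, distributed locks, and user sessions. Storing this data in a distributed cache reduces memory-management overhead and improves the application's workload.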

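In terms of how tightly the cache is coupled to the application, caches in Internet applications fall into two main categories: local caches and distributed caches.

Local cache: A cache component inside the application. Its biggest advantage is that the application and the cache live in the same process, so cache requests are very fast and incur no network overhead. A local cache is a good fit when a single application needs no cluster support, or when the nodes of a cluster do not need to notify each other. Its drawback is that, because the cache is coupled to the application, multiple applications cannot share it directly: every application or cluster node has to maintain its own separate cache, which wastes memory.
Guava Cache, Ehcache, and MapDB can all serve as local caches in the Java heap. Java also supports off-heap memory, and Ehcache 3.x and MapDB 3.x support it as well. Off-heap memory means that objects are allocated outside the Java virtual machine's heap, in memory managed directly by the operating system rather than by the virtual machine. Netty uses off-heap memory for its memory management. Use off-heap memory with caution, because improper use easily leads to OOM. Off-heap versus heap memory is not covered further here.
Distributed cache: A cache component or service separate from the application. Its biggest advantage is that it is an independent application, isolated from the local application, so multiple applications can share it directly. memcached, redis, and tair are all distributed caches.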

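Cache usage in practice

Using a cache well takes some skill. Used improperly, it can cause data-consistency problems, or cache penetration that leads to an application avalanche.
The principle of locality was mentioned above. Briefly, the forms of locality relevant to caching are:

Temporal locality: data that has been accessed is likely to be accessed again
Spatial locality: data near recently accessed data is likely to be accessed

Based on the principle of locality, cache design has to take many factors into account:

Cache associativity
Write policy
Replacement policy (cache replacement)
Cache coherency
The dog-piling effect (cache stampede) that cache expiry can trigger
.....

The following sections cover some cache practices based on the points above.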

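Cache and DB data consistency

Keeping data updates and the cache in sync is not high technology, but it is not easy to get right. Some scenarios need real-time consistency; others only need eventual consistency.

For strong consistency, the following schemes can be used:

Delete the cache after updating the database.
This approach is simple to implement. Its drawback is that after the cache entry is deleted, concurrent queries all find the cache empty and all fall through to the database, so database pressure spikes.
Update the cache after updating the database.
This improves on the delete approach but has drawbacks too: an extra query is needed before writing, and in some scenarios it cannot be used at all. With paginated queries, for example, there are so many combinations of request parameters that the application cannot know how many keys exist, so it cannot write them proactively and can only wait for the cache to expire.

Both of these real-time synchronization mechanisms operate on the database first and then on the cache. Because the two are heterogeneous data stores, a transaction cannot guarantee consistency between them. The cache also involves network I/O, so timeouts when connecting to the distributed cache must be handled; otherwise the transaction may time out and application threads may hang.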
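
The first scheme (update the database, then delete the cache entry) can be sketched as follows. The two maps are in-memory stand-ins for a real database and a real distributed cache, and DeleteOnWriteStore is a hypothetical name; none of this is a real client API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// In-memory stand-ins (assumptions for illustration): "db" plays the database,
// "cache" plays the distributed cache.
class DeleteOnWriteStore {
    final Map<String, String> db = new ConcurrentHashMap<>();
    final Map<String, String> cache = new ConcurrentHashMap<>();

    // Write path: database first, then delete (not update) the cache entry.
    // With a remote cache the delete can fail or time out, so real code needs
    // a timeout and a retry or compensation step here.
    void update(String key, String value) {
        db.put(key, value);
        cache.remove(key);
    }

    // Read path: on a miss, load from the database and repopulate the cache.
    String read(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        String loaded = db.get(key);
        if (loaded != null) {
            cache.put(key, loaded);
        }
        return loaded;
    }
}

class DeleteOnWriteDemo {
    public static void main(String[] args) {
        DeleteOnWriteStore store = new DeleteOnWriteStore();
        store.update("user:1", "Alice");
        System.out.println(store.read("user:1")); // miss, loaded from db, then cached
        store.update("user:1", "Bob");            // cache entry is deleted
        System.out.println(store.read("user:1")); // next read reloads the new value
    }
}
```

Because the two writes are not atomic, a reader can still see the old cached value in the short window between the database update and the cache delete, which is the limitation the paragraph above describes.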

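For eventual consistency, the following schemes can be used:

MQ asynchronous refresh and scheduled refresh: With MQ asynchronous refresh, the cache is refreshed through asynchronous messages, and a suitable compensation mechanism is needed in case an update fails. With scheduled refresh, all objects that need updating are stored in a scheduled-task table, and a scheduled task scans the table and updates them asynchronously. Neither mechanism guarantees that a cache query is consistent with the DB at every moment, but both guarantee eventual consistency.
Automatic expiry: Set a sensible expiry time for each cache entry according to the business scenario. The higher the consistency requirement, the shorter the expiry time should be.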

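Cache concurrency

When a cache entry expires, the application tries to fetch the data from the back-end database, which looks like a reasonable flow. Under high concurrency, however, many requests may fetch from the database at once, hitting the back-end database hard and possibly even causing an "avalanche". In addition, while a cache key is being updated it may also be read by a large number of requests, which causes consistency problems. How can this be avoided? A lock-like mechanism (a reentrant lock) comes to mind: when the cache is being updated or has expired, a request first tries to acquire the lock and releases it once the update or the database fetch is done. Other requests only have to wait a little and can then read directly from the cache.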
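
The lock-based idea above can be sketched with one ReentrantLock per key and a second cache check after the lock is acquired. LockingCache and its loader parameter are hypothetical names for this example; the sketch covers a single JVM only, so a distributed cache shared by several nodes would need a distributed lock instead.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Function;

// Sketch: only one thread per key loads from the database; the others wait
// on the lock and then read the freshly cached value.
class LockingCache<K, V> {
    private final ConcurrentMap<K, V> cache = new ConcurrentHashMap<>();
    // One lock per key. This map is never cleaned up here, which a real
    // implementation would have to handle.
    private final ConcurrentMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    V get(K key, Function<K, V> loader) {
        V value = cache.get(key);
        if (value != null) {
            return value; // fast path: no locking on a hit
        }
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        try {
            // Re-check: another thread may have loaded the value while we waited.
            value = cache.get(key);
            if (value == null) {
                value = loader.apply(key); // hypothetical database load
                if (value != null) {
                    cache.put(key, value);
                }
            }
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Simulates expiry or an explicit delete.
    void invalidate(K key) {
        cache.remove(key);
    }
}
```

A call such as cache.get("user:1", key -> queryDatabase(key)), where queryDatabase is whatever loading function the application has, runs the load once even when many threads miss at the same moment.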

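Cache penetration

Under high concurrency, if a key is requested heavily and misses the cache, the application, for the sake of fault tolerance, tries to fetch it from the back-end database, so a large number of requests reach the database. If the data for that key is itself empty, the database executes many unnecessary concurrent queries and comes under great pressure.

When we use a cache in an application, the logic is very likely the one shown in the pseudocode below: read from the cache first; if nothing is there, query the database or obtain the data some other way, store it in the cache, and return it.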

data = cache.get(key);
if (data == null || !isValid(data)) {
    sql = "SELECT ......";
    data = db.query(sql);
    // data may be null
    if (data != null) {
        cache.set(key, data, expire);
    }
}
return data;

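Most people probably think this code is fine, and many write it this way.

Problem:
When the key's content does not exist in the database either, data in the code above is always null and the cache never holds anything. If requests for this key suddenly surge (this happens often, for example with queries for data that does not exist), a large number of requests bypass the cache and go straight to the back-end database, putting excessive load on the database's I/O and QPS and quite possibly bringing the system down.

Solutions:
1. Cache empty objects
When the database query returns null, we should cache the null result as well.
For example, if the database returns a list, we can store an empty list: cache.set(key,new List(),expire).
The freshness of the cached data must also be ensured. This approach is cheap to implement and suits data with a low hit rate that may be updated frequently.
2. Filter separately
Store all keys whose data may be empty in one place and intercept requests before they are served, so that they do not penetrate to the back-end database. This approach is more complex to implement and suits data with a low hit rate that is updated infrequently.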

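A commonly used scheme is to intercept requests up front with a Bloom Filter. A Bloom Filter is a very space-efficient probabilistic data structure made up of a bit array and a set of hash functions. It can be used to test whether an element is in a set. Its advantages are that its space efficiency and query time far surpass those of ordinary algorithms; its drawback is a certain false-positive rate. A Bloom Filter is therefore unsuitable for "zero error" applications, but where a low error rate is tolerable it trades a small number of errors for a huge saving in storage.
Guava and Google's Bigtable both have Bloom filter implementations. For more on Bloom filters, see: https://www.javacodegeeks.com/2012/11/bloom-filter-implementation-in-java-on-github.html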
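
To show the bit array and the set of hash functions at work, here is a deliberately small Bloom filter built on java.util.BitSet. SimpleBloomFilter is a name made up for this example, and the way the second hash is derived is an assumption chosen for brevity, not a recommendation. Production code would normally use a tested implementation such as Guava's BloomFilter.

```java
import java.util.BitSet;

// Minimal Bloom filter sketch: a bit array plus hashCount derived hash functions.
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // Derives the i-th bit position from two base hashes (double hashing).
    private int position(String key, int i) {
        int h1 = key.hashCode();
        // Assumption: swapping the 16-bit halves is a cheap second hash,
        // good enough for illustration only.
        int h2 = (h1 >>> 16) | (h1 << 16);
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(position(key, i));
        }
    }

    // false means the key was definitely never added;
    // true means it was possibly added (false positives can occur).
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(position(key, i))) {
                return false;
            }
        }
        return true;
    }
}

class BloomDemo {
    public static void main(String[] args) {
        SimpleBloomFilter existingKeys = new SimpleBloomFilter(1024, 3);
        existingKeys.add("user:1");
        existingKeys.add("user:2");
        // A key that was added is always reported as possibly present.
        System.out.println(existingKeys.mightContain("user:1"));
        // A key that was never added is usually, but not always, rejected.
        System.out.println(existingKeys.mightContain("user:999"));
    }
}
```

In front of a cache, the filter is loaded with the keys that do exist, and a request is sent on to the cache and database only when mightContain returns true.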

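Both schemes above guard against cache penetration. The first is simpler but takes up some extra cache space. The second is more complex but uses less cache space.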
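
The first scheme, caching empty results, can be sketched by storing an Optional so that "looked up and not found" is distinguishable from "never looked up". NullCachingLoader and the db function are hypothetical stand-ins, and expiry is left out for brevity; as noted above, a real cache should give the empty marker a short expiry time.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Sketch: cache "not found" as Optional.empty() so repeated requests for a
// missing key stop reaching the database.
class NullCachingLoader<K, V> {
    private final Map<K, Optional<V>> cache = new ConcurrentHashMap<>();
    private final Function<K, V> db; // hypothetical stand-in for the database query

    NullCachingLoader(Function<K, V> db) {
        this.db = db;
    }

    V get(K key) {
        Optional<V> cached = cache.get(key);
        if (cached == null) {
            // True miss: this key has not been looked up yet.
            cached = Optional.ofNullable(db.apply(key));
            // Optional.empty() records that the database has no such row.
            cache.put(key, cached);
        }
        return cached.orElse(null);
    }
}

class NullCachingDemo {
    public static void main(String[] args) {
        NullCachingLoader<String, String> loader =
                new NullCachingLoader<String, String>(key -> {
                    System.out.println("querying database for " + key);
                    return null; // pretend the row does not exist
                });
        loader.get("missing"); // queries the database once
        loader.get("missing"); // served from the cached empty marker
    }
}
```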

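Hotspot cache

Suppose the data for key XXX is accessed extremely often, but XXX has an expiry time in the cache. Once the entry expires, a great many threads concurrently query the database and then update the cache, which puts the system under excessive pressure. There are usually a few ways to handle this:
1. Lock, so that only one thread at a time may query the database and update the cache
2. Give the cache entry no expiry time, and have a background thread refresh it periodically
3. Introduce a circuit-breaking mechanism similar to Hystrix, so that only a limited number of requests may query the database and update the cache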
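
As a rough illustration of the third option, the sketch below uses a java.util.concurrent.Semaphore to cap how many requests may hit the database at once and sends the rest to a fallback. This is a simplified stand-in written for this article, not Hystrix's API and not a full circuit breaker; LoadLimiter and its parameters are hypothetical names.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Sketch: at most maxConcurrentLoads callers run the database load at the
// same time; everyone else fails fast with a fallback value.
class LoadLimiter {
    private final Semaphore permits;

    LoadLimiter(int maxConcurrentLoads) {
        this.permits = new Semaphore(maxConcurrentLoads);
    }

    <T> T load(Supplier<T> dbLoad, Supplier<T> fallback) {
        if (!permits.tryAcquire()) {
            // No permit free: do not queue up behind the database.
            return fallback.get();
        }
        try {
            return dbLoad.get();
        } finally {
            permits.release();
        }
    }
}

class LoadLimiterDemo {
    public static void main(String[] args) {
        LoadLimiter limiter = new LoadLimiter(2);
        String value = limiter.<String>load(
                () -> "fresh value from the database", // hypothetical database load
                () -> "stale or default value");       // hypothetical fallback
        System.out.println(value);
    }
}
```

What the fallback returns (a stale value, a default, or an error) is a business decision that this sketch leaves open.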

If the cache is memcached or redis, the schemes above can be considered for hot data. If you use tair, tair provides its own hot-key solution. Its main principle is to enable the client-side LocalCache: every write operation automatically force-deletes the key from the LocalCache, and a read operation reads from the LocalCache first; on a miss, the value is fetched from the server and stored in the LocalCache on success. In the hot-key defense system, the client needs to enable hotspot mode. In this mode only keys marked as hot can be cached, and keys in the LocalCache that are not hot are gradually evicted. In other words, once the client turns this mode on, the working mode of the LocalCache is forcibly changed.

Cached data is usually loaded in one of two modes, lazy loading or preloading, and preloading is the better choice for hotspot caches. I have seen cases where an application hung right after release because queries for some hot data arrived under high concurrency and request traffic poured in before the cache had been loaded.

Caching large objects

In some scenarios we want to cache large objects: generating a large object is expensive, so we generate it once and reuse it as many times as possible to improve QPS.
The usual solution is to compress the data. Compression can be done on the client, or the cache server can do it. With memcached, compression has to happen on the client, for example with the gzip algorithm. With tair, the server automatically compresses a value once it exceeds a certain threshold.
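
Client-side compression before writing to the cache can be sketched with the JDK's GZIPOutputStream and GZIPInputStream. GzipCodec is a helper name invented for this example; how the compressed bytes are then sent to memcached depends on the client library and is not shown.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper: gzip a value before caching it and gunzip it after reading.
class GzipCodec {
    static byte[] compress(byte[] raw) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // Closing the gzip stream writes the trailer, so read the bytes afterwards.
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(raw);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[4096];
            int n;
            while ((n = gzip.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}

class GzipDemo {
    public static void main(String[] args) {
        StringBuilder big = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            big.append("some repetitive cached payload ");
        }
        byte[] raw = big.toString().getBytes(StandardCharsets.UTF_8);
        byte[] packed = GzipCodec.compress(raw);
        System.out.println(raw.length + " bytes before, " + packed.length + " bytes after");
    }
}
```

Compression trades CPU time for network and memory savings, so it pays off mainly for large, compressible values; small values can even grow slightly because of the gzip header.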
