Understanding Redis Data Eviction Policies in Depth


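    Redis lets you cap the amount of memory it uses through server.maxmemory (the maxmemory directive), which is very useful when memory is limited. For example, on a machine with 8 GB of RAM running four Redis instances, limiting each instance to 1.5 GB relieves memory pressure and gives a more stable service.

    When memory use exceeds the limit, Redis evicts key-value pairs according to the configured policy, so that enough room remains to store new data. Once Redis has decided to evict a key, it deletes the data and propagates the change locally (AOF persistence) and to the slaves (replication link).

    The redis.conf file contains a good explanation of this mechanism: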

    # Don't use more memory than the specified amount of bytes.
    # When the memory limit is reached Redis will try to remove keys
    # accordingly to the eviction policy selected (see maxmemmory-policy).
    #
    # If Redis can't remove keys according to the policy, or if the policy is
    # set to 'noeviction', Redis will start to reply with errors to commands
    # that would use more memory, like SET, LPUSH, and so on, and will continue
    # to reply to read-only commands like GET.
    #
    # This option is usually useful when using Redis as an LRU cache, or to set
    # an hard memory limit for an instance (using the 'noeviction' policy).
    #
    # WARNING: If you have slaves attached to an instance with maxmemory on,
    # the size of the output buffers needed to feed the slaves are subtracted
    # from the used memory count, so that network problems / resyncs will
    # not trigger a loop where keys are evicted, and in turn the output
    # buffer of slaves is full with DELs of keys evicted triggering the deletion
    # of more keys, and so forth until the database is completely emptied.
    #
    # In short... if you have slaves attached it is suggested that you set a lower
    # limit for maxmemory so that there is some free RAM on the system for slave
    # output buffers (but this is not needed if the policy is 'noeviction').
    #
    # maxmemory <bytes>

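    Note: when Redis runs in a master-slave setup, maxmemory should be set somewhat lower than the physical memory actually available, to leave enough room for the slave output buffers.

    Redis provides six eviction policies: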

    # maxmemory <bytes>

    # MAXMEMORY POLICY: how Redis will select what to remove when maxmemory
    # is reached. You can select among five behaviors:
    #
    # volatile-lru -> remove the key with an expire set using an LRU algorithm
    # allkeys-lru -> remove any key accordingly to the LRU algorithm
    # volatile-random -> remove a random key with an expire set
    # allkeys-random -> remove a random key, any key
    # volatile-ttl -> remove the key with the nearest expire time (minor TTL)
    # noeviction -> don't expire at all, just return an error on write operations
    #
    # Note: with any of the above policies, Redis will return an error on write
    # operations, when there are not suitable keys for eviction.
    #
    # At the date of writing this commands are: set setnx setex append
    # incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
    # sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
    # zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
    # getset mset msetnx exec sort
    #
    # The default is:
    #
    # maxmemory-policy noeviction

    # LRU and minimal TTL algorithms are not precise algorithms but approximated
    # algorithms (in order to save memory), so you can tune it for speed or
    # accuracy. For default Redis will check five keys and pick the one that was
    # used less recently, you can change the sample size using the following
    # configuration directive.
    #
    # The default of 5 produces good enough results. 10 Approximates very closely
    # true LRU but costs a bit more CPU. 3 is very fast but not very accurate.
    #
    # maxmemory-samples 5

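  1. volatile-lru: among the keys that have an expire set, evict the least recently used one.
  2. allkeys-lru: among all keys (with or without an expire set), evict the least recently used one.
  3. volatile-random: among the keys that have an expire set, evict a random one.
  4. allkeys-random: among all keys (with or without an expire set), evict a random one.
  5. volatile-ttl: among the keys that have an expire set, evict the one that is about to expire soonest.
  6. noeviction: never evict any data (Redis still frees objects through reference counting); when memory runs out, the command gets an error back.

    The default policy is noeviction. LRU in Redis is an approximation: by default Redis picks 5 random keys and evicts the least recently used among them. The maxmemory-samples directive in the configuration file sets how many keys Redis examines. The more keys it checks, the longer each eviction takes, but the more accurate the result (that is, the longer the evicted object has actually been idle). Choosing the value is a trade-off between the two.

    The cache management itself is implemented by the freeMemoryIfNeeded function in redis.c. If maxmemory is set, this function is called before every command is executed to check whether enough memory is available, and it either frees memory or returns an error. If enough memory cannot be freed, the main logic refuses to run commands flagged REDIS_CMD_DENYOOM and replies with the error message command not allowed when used memory > 'maxmemory'.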

    int freeMemoryIfNeeded(void) {
        size_t mem_used, mem_tofree, mem_freed;
        int slaves = listLength(server.slaves);

        /* Remove the size of slaves output buffers and AOF buffer from the
         * count of used memory. */
        /* Note: because the slave output buffers and the AOF buffers are not
         * counted as used memory, maxmemory should be set below the physical
         * memory to leave room for these buffers. */
        mem_used = zmalloc_used_memory();
        if (slaves) {
            listIter li;
            listNode *ln;

            listRewind(server.slaves,&li);
            while((ln = listNext(&li))) {
                redisClient *slave = listNodeValue(ln);
                unsigned long obuf_bytes = getClientOutputBufferMemoryUsage(slave);
                if (obuf_bytes > mem_used)
                    mem_used = 0;
                else
                    mem_used -= obuf_bytes;
            }
        }
        if (server.appendonly) {
            mem_used -= sdslen(server.aofbuf);
            mem_used -= sdslen(server.bgrewritebuf);
        }

        /* Check if we are over the memory limit. */
        if (mem_used <= server.maxmemory) return REDIS_OK;

        if (server.maxmemory_policy == REDIS_MAXMEMORY_NO_EVICTION)
            return REDIS_ERR; /* We need to free memory, but policy forbids. */

        /* Compute how much memory we need to free. */
        mem_tofree = mem_used - server.maxmemory;
        mem_freed = 0;
        while (mem_freed < mem_tofree) {
            int j, k, keys_freed = 0;

            for (j = 0; j < server.dbnum; j++) {
                long bestval = 0; /* just to prevent warning */
                sds bestkey = NULL;
                struct dictEntry *de;
                redisDb *db = server.db+j;
                dict *dict;

                if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                    server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM)
                {
                    dict = server.db[j].dict;
                } else {
                    dict = server.db[j].expires;
                }
                if (dictSize(dict) == 0) continue;

                /* volatile-random and allkeys-random policy */
                if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM ||
                    server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_RANDOM)
                {
                    /* Random eviction: pick one random key from dict. */
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetEntryKey(de);
                }

                /* volatile-lru and allkeys-lru policy */
                else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
                    server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
                {
                    /* To save work, LRU here (like expire-based eviction) is
                     * approximate: sample maxmemory_samples keys from the
                     * relevant dict (5 by default per the redis.conf quoted
                     * above) and evict the least recently used of them. */
                    for (k = 0; k < server.maxmemory_samples; k++) {
                        sds thiskey;
                        long thisval;
                        robj *o;

                        de = dictGetRandomKey(dict);
                        thiskey = dictGetEntryKey(de);
                        /* When policy is volatile-lru we need an additonal lookup
                         * to locate the real key, as dict is set to db->expires. */
                        /* (db->expires does not record the last access time of the key.) */
                        if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
                            de = dictFind(db->dict, thiskey);
                        o = dictGetEntryVal(de);
                        thisval = estimateObjectIdleTime(o);

                        /* Higher idle time is better candidate for deletion */
                        if (bestkey == NULL || thisval > bestval) {
                            bestkey = thiskey;
                            bestval = thisval;
                        }
                    }
                }

                /* volatile-ttl */
                else if (server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_TTL) {
                    /* Same approach as above: choose among maxmemory_samples samples. */
                    for (k = 0; k < server.maxmemory_samples; k++) {
                        sds thiskey;
                        long thisval;

                        de = dictGetRandomKey(dict);
                        thiskey = dictGetEntryKey(de);
                        thisval = (long) dictGetEntryVal(de);

                        /* Expire sooner (minor expire unix timestamp) is better
                         * candidate for deletion */
                        if (bestkey == NULL || thisval < bestval) {
                            bestkey = thiskey;
                            bestval = thisval;
                        }
                    }
                }

                /* Finally remove the selected key. */
                if (bestkey) {
                    long long delta;

                    robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
                    propagateExpire(db,keyobj); /* propagate the DEL to AOF and slaves */
                    /* We compute the amount of memory freed by dbDelete() alone.
                     * It is possible that actually the memory needed to propagate
                     * the DEL in AOF and replication link is greater than the one
                     * we are freeing removing the key, but we can't account for
                     * that otherwise we would never exit the loop.
                     *
                     * AOF and Output buffer memory will be freed eventually so
                     * we only care about memory used by the key space. */
                    delta = (long long) zmalloc_used_memory();
                    dbDelete(db,keyobj);
                    delta -= (long long) zmalloc_used_memory();
                    mem_freed += delta;
                    server.stat_evictedkeys++;
                    decrRefCount(keyobj);
                    keys_freed++;

                    /* When the memory to free starts to be big enough, we may
                     * start spending so much time here that is impossible to
                     * deliver data to the slaves fast enough, so we force the
                     * transmission here inside the loop. */
                    if (slaves) flushSlavesOutputBuffers();
                }
            }
            /* One pass over every DB is done; the while condition then checks
             * whether the deleted keys freed enough memory. */
            if (!keys_freed) return REDIS_ERR; /* nothing to free... */
        }
        return REDIS_OK;
    }

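    Note: this function is called before a command is executed, and it returns OK as soon as the memory in use is at or below the limit. Executing the command afterwards can therefore push Redis above maxmemory. In other words, maxmemory is the ceiling Redis enforces before running a command, not a hard cap on the memory Redis actually uses (and that is before the slave buffers and AOF buffers are taken into account).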
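The check-then-execute order described in the note can be illustrated with a toy model. This is only a sketch under stated assumptions: `free_memory_if_needed`, the `sizes` array and the fixed eviction order are hypothetical stand-ins, not Redis code, and the sizes of the evictable keys are assumed to sum to no more than the current usage.

```c
#include <stddef.h>

/* Toy model (not Redis code): sizes[] holds the byte size of each evictable
 * key, and keys are evicted in array order starting at *next.
 * Returns 0 once usage is at or below the limit, or -1 if every key has been
 * evicted and usage is still above it. */
static int free_memory_if_needed(size_t *used, size_t limit,
                                 const size_t *sizes, size_t nkeys,
                                 size_t *next) {
    while (*used > limit) {
        if (*next >= nkeys) return -1;   /* nothing left to evict */
        *used -= sizes[*next];           /* "delete" the key */
        (*next)++;
    }
    return 0;
}
```

With a limit of 80, a usage of 100 and keys of 15, 10 and 5 bytes, the first call evicts two keys and brings usage down to 75. A write that then adds 20 bytes leaves usage at 95, above the limit, until the next command triggers the check again.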

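    LRU eviction mechanism

    The server state holds an LRU clock, server.lruclock, which is refreshed periodically by the Redis timer routine serverCron(). The value of server.lruclock is computed from server.unixtime.

    In addition, struct redisObject shows that every Redis object carries its own lru field. As you would expect, redisObject.lru is updated each time the data is accessed.

    LRU eviction works as follows:

    Redis picks a few key-value pairs at random from the data set and evicts the one with the smallest lru value. So Redis does not guarantee that it finds the least recently used (LRU) key in the whole data set, only the least recently used among the few keys it sampled.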
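The sampling idea can be sketched in a few lines of C. This is an illustrative sketch, not the Redis implementation: `entry_t` and `pick_lru_victim` are hypothetical names, `rand()` stands in for a random dictionary lookup, and clock wrap-around is ignored.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached object: only the last-access tick matters. */
typedef struct {
    unsigned lru;   /* smaller value = accessed longer ago */
} entry_t;

/* Sample `samples` random entries and return the index of the one with the
 * smallest lru value. Returns -1 when the table is empty or samples <= 0. */
static int pick_lru_victim(const entry_t *table, size_t n, int samples) {
    int best = -1;
    if (n == 0) return -1;
    for (int k = 0; k < samples; k++) {
        size_t i = (size_t)rand() % n;   /* random candidate */
        if (best < 0 || table[i].lru < table[best].lru) best = (int)i;
    }
    return best;
}
```

A larger `samples` value makes it more likely that the truly oldest entry is among the candidates, at the cost of more lookups per eviction, which is the trade-off that maxmemory-samples controls.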

    The redisObject structure is declared in redis.h as follows:

    #define REDIS_LRU_BITS 24
    #define REDIS_LRU_CLOCK_MAX ((1<<REDIS_LRU_BITS)-1) /* Max value of obj->lru */
    #define REDIS_LRU_CLOCK_RESOLUTION 1000 /* LRU clock resolution in ms */
    typedef struct redisObject {
        // Type of the stored object
        unsigned type:4;
        // Encoding of the content
        unsigned encoding:4;
        // LRU timestamp, measured against server.lruclock
        unsigned lru:REDIS_LRU_BITS; /* lru time (relative to server.lruclock) */
        // Reference counter used by the reference counting scheme
        int refcount;
        // Pointer to the data
        void *ptr;
    } robj;

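    The definition of redisObject shows that every object stored in Redis carries not only a reference counter but also an lru field, which records the value of the LRU clock when the object was last touched. The clock itself, server.lruclock, is refreshed on every timer tick by calling getLRUClock(), which takes the current system time in milliseconds, divides it by REDIS_LRU_CLOCK_RESOLUTION, and keeps the low 24 bits. The largest value the counter can hold is therefore 24 one-bits, that is ((1<<REDIS_LRU_BITS)-1) = 2^24 - 1. With a resolution of 1000 ms each tick is one second, so the clock wraps after roughly 194 days rather than years.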
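The wrap-around period can be worked out directly from the two defines. The sketch below mirrors their values under different, hypothetical names (`LRU_BITS`, `lru_clock_from_ms`, `lru_wrap_days`); it is a worked calculation, not code taken from Redis.

```c
#include <stdint.h>

#define LRU_BITS 24
#define LRU_CLOCK_MAX ((1u << LRU_BITS) - 1)
#define LRU_CLOCK_RESOLUTION_MS 1000

/* Truncate a millisecond timestamp to a 24-bit tick counter,
 * one tick per LRU_CLOCK_RESOLUTION_MS milliseconds. */
static uint32_t lru_clock_from_ms(uint64_t now_ms) {
    return (uint32_t)((now_ms / LRU_CLOCK_RESOLUTION_MS) & LRU_CLOCK_MAX);
}

/* Number of whole days before the counter wraps back to zero. */
static uint64_t lru_wrap_days(void) {
    uint64_t wrap_ms = ((uint64_t)LRU_CLOCK_MAX + 1) * LRU_CLOCK_RESOLUTION_MS;
    return wrap_ms / (1000ull * 60 * 60 * 24);
}
```

2^24 ticks of one second each are 16,777,216 seconds, and dividing by 86,400 seconds per day gives a little over 194 days.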

    server.lruclock is updated by the timer that runs in redis.c, as shown below (by default the timer in redis.c is configured to fire every 100 ms):

    int serverCron(struct aeEventLoop *eventLoop, long long id, void *clientData) {
        .....
        run_with_period(100) trackOperationsPerSecond();

        /* We have just REDIS_LRU_BITS bits per object for LRU information.
         * So we use an (eventually wrapping) LRU clock.
         *
         * Note that even if the counter wraps it's not a big problem,
         * everything will still work but some object will appear younger
         * to Redis. However for this to happen a given object should never be
         * touched for all the time needed to the counter to wrap, which is
         * not likely.
         *
         * Note that you can change the resolution altering the
         * REDIS_LRU_CLOCK_RESOLUTION define. */
        server.lruclock = getLRUClock();
        ....
        return 1000/server.hz;
    }

    With that in mind, let's look at how Redis assigns the unsigned lru field of a redisObject when it creates an object. The code is in object.c:

    robj *createObject(int type, void *ptr) {
        robj *o = zmalloc(sizeof(*o));
        o->type = type;
        o->encoding = REDIS_ENCODING_RAW;
        o->ptr = ptr;
        o->refcount = 1;

        // The key step: every object Redis creates records the LRU clock at creation time
        /* Set the LRU to the current lruclock (minutes resolution). */
        o->lru = LRU_CLOCK();
        return o;
    }

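    The most important statement in this code is o->lru = LRU_CLOCK(). LRU_CLOCK is a macro, defined as follows:

    #define LRU_CLOCK() ((1000/server.hz <= REDIS_LRU_CLOCK_RESOLUTION) ? server.lruclock : getLRUClock())

    REDIS_LRU_CLOCK_RESOLUTION is 1000 and sets the precision of the LRU clock in milliseconds. It is a compile-time define (as the serverCron comment above says, you change the resolution by altering the define), not a redis.conf setting. This is where server.lruclock proves useful: when the timer fires at least as often as the LRU resolution requires, the cached server.lruclock is precise enough to be assigned to the new object directly, which saves the cost of calling getLRUClock() and reading the system time on every object creation.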

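    With this groundwork in place we reach the central part, the LRU algorithm in Redis, taking volatile-lru as the example (eviction among the keys that have an expire set). When Redis handles a command it calls processCommand, and when maxmemory is set in the configuration file, processCommand calls freeMemoryIfNeeded to free memory that is no longer needed.

    Below is the LRU-related part of freeMemoryIfNeeded; the other branches are similar. Note that this excerpt uses an eviction pool, which the full listing earlier does not, so it appears to come from a later version of the source.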

    // Different policies operate on different data sets
    if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
        server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM)
    {
        dict = server.db[j].dict;
    } else { // operate only on the keys that have an expire set
        dict = server.db[j].expires;
    }
    if (dictSize(dict) == 0) continue;

    /* volatile-random and allkeys-random policy */
    // Evict a randomly chosen key
    if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_RANDOM ||
        server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_RANDOM)
    {
        de = dictGetRandomKey(dict);
        bestkey = dictGetKey(de);
    }

    /* volatile-lru and allkeys-lru policy */
    // The actual LRU algorithm
    else if (server.maxmemory_policy == REDIS_MAXMEMORY_ALLKEYS_LRU ||
        server.maxmemory_policy == REDIS_MAXMEMORY_VOLATILE_LRU)
    {
        struct evictionPoolEntry *pool = db->eviction_pool;

        while(bestkey == NULL) {
            // Take random samples and apply LRU to them to choose the data to evict
            evictionPoolPopulate(dict, db->dict, db->eviction_pool);
            /* Go backward from best to worst element to evict. */
            for (k = REDIS_EVICTION_POOL_SIZE-1; k >= 0; k--) {
                if (pool[k].key == NULL) continue;
                de = dictFind(dict,pool[k].key);

                sdsfree(pool[k].key);
                // Shift the elements from pool+k+1 onward one slot to the left
                memmove(pool+k,pool+k+1,
                    sizeof(pool[0])*(REDIS_EVICTION_POOL_SIZE-k-1));
                /* Clear the element on the right which is empty
                 * since we shifted one position to the left. */
                pool[REDIS_EVICTION_POOL_SIZE-1].key = NULL;
                pool[REDIS_EVICTION_POOL_SIZE-1].idle = 0;

                // The key to evict has been chosen
                if (de) {
                    bestkey = dictGetKey(de);
                    break;
                } else {
                    /* Ghost... */
                    continue;
                }
            }
        }
    }

    After reading the code above you may still be wondering where the promised LRU algorithm went. The answer is in the implementation of evictionPoolPopulate:

    #define EVICTION_SAMPLES_ARRAY_SIZE 16
    void evictionPoolPopulate(dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
        int j, k, count;
        // Stack-allocated sample array holding up to EVICTION_SAMPLES_ARRAY_SIZE (16) samples
        dictEntry *_samples[EVICTION_SAMPLES_ARRAY_SIZE];
        dictEntry **samples;

        // If the configured maxmemory-samples is at most 16, use the stack array;
        // otherwise allocate a larger array on the heap
        if (server.maxmemory_samples <= EVICTION_SAMPLES_ARRAY_SIZE) {
            samples = _samples;
        } else {
            samples = zmalloc(sizeof(samples[0])*server.maxmemory_samples);
        }

    #if 1 /* Use bulk get by default. */
        // Fetch server.maxmemory_samples random entries from the sample dict into samples
        count = dictGetRandomKeys(sampledict,samples,server.maxmemory_samples);
    #else
        count = server.maxmemory_samples;
        for (j = 0; j < count; j++) samples[j] = dictGetRandomKey(sampledict);
    #endif

        for (j = 0; j < count; j++) {
            unsigned long long idle;
            sds key;
            robj *o;
            dictEntry *de;

            de = samples[j];
            key = dictGetKey(de);
            if (sampledict != keydict) de = dictFind(keydict, key);
            o = dictGetVal(de);
            // Compute the idle (LRU) time
            idle = estimateObjectIdleTime(o);

            k = 0;
            // Find the right slot for de in the pool, which is kept in ascending order of idle time
            while (k < REDIS_EVICTION_POOL_SIZE &&
                   pool[k].key &&
                   pool[k].idle < idle) k++;
            if (k == 0 && pool[REDIS_EVICTION_POOL_SIZE-1].key != NULL) {
                /* Can't insert if the element is < the worst element we have
                 * and there are no empty buckets. */
                continue;
            } else if (k < REDIS_EVICTION_POOL_SIZE && pool[k].key == NULL) {
                /* Inserting into empty position. No setup needed before insert. */
            } else {
                // There is still free space: memmove the elements right to make room at k
                if (pool[REDIS_EVICTION_POOL_SIZE-1].key == NULL) {
                    memmove(pool+k+1,pool+k,
                        sizeof(pool[0])*(REDIS_EVICTION_POOL_SIZE-k-1));
                } else { // No free space left: drop the first element
                    /* No free space on right? Insert at k-1 */
                    k--;
                    /* Shift all elements on the left of k (included) to the
                     * left, so we discard the element with smaller idle time. */
                    // The following frees up slot k
                    sdsfree(pool[0].key);
                    memmove(pool,pool+1,sizeof(pool[0])*k);
                }
            }
            // Insert at slot k
            pool[k].key = sdsdup(key);
            pool[k].idle = idle;
        }
        // At this point the pool holds its entries in ascending order of idle time
        if (samples != _samples) zfree(samples);
    }
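The insertion step of the pool is easier to follow in isolation. The sketch below reproduces the same three cases (reject, shift right, drop the first entry and shift left) on a small array. It is a simplified illustration: `pool_entry`, `pool_insert`, the integer keys, the `used` flag and a pool size of 4 are all assumptions made for the example, whereas the real code stores sds strings and marks an empty slot with a NULL key.

```c
#include <string.h>

#define POOL_SIZE 4

typedef struct {
    int used;                 /* 0 = empty slot */
    int key;                  /* hypothetical integer key id */
    unsigned long long idle;  /* idle time of the key */
} pool_entry;

/* Insert (key, idle) while keeping the pool sorted by ascending idle time.
 * Returns 1 if the candidate was inserted, 0 if it was rejected. */
static int pool_insert(pool_entry *pool, int key, unsigned long long idle) {
    int k = 0;
    /* Find the first slot that is empty or holds an entry at least as idle. */
    while (k < POOL_SIZE && pool[k].used && pool[k].idle < idle) k++;

    if (k == 0 && pool[POOL_SIZE-1].used) {
        /* Pool is full and the candidate is no more idle than the least
         * idle entry: not worth keeping. */
        return 0;
    } else if (k < POOL_SIZE && !pool[k].used) {
        /* Empty slot: nothing to shift. */
    } else if (!pool[POOL_SIZE-1].used) {
        /* Free space on the right: shift entries k.. one slot right. */
        memmove(pool+k+1, pool+k, sizeof(pool[0])*(POOL_SIZE-k-1));
    } else {
        /* Pool is full: drop slot 0 (smallest idle time), shift the
         * entries before k one slot left, and insert at k-1. */
        k--;
        memmove(pool, pool+1, sizeof(pool[0])*k);
    }
    pool[k].used = 1;
    pool[k].key = key;
    pool[k].idle = idle;
    return 1;
}
```

Because the pool is sorted in ascending order, the best eviction candidate is always the last used slot, which is why the caller walks the pool from the end toward the start.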

    The code above does not include the calculation of the LRU time itself. That calculation lives in the estimateObjectIdleTime function in object.c:

    // Roughly estimate how long the object has been idle
    unsigned long long estimateObjectIdleTime(robj *o) {
        unsigned long long lruclock = LRU_CLOCK();
        if (lruclock >= o->lru) {
            return (lruclock - o->lru) * REDIS_LRU_CLOCK_RESOLUTION;
        } else { // Rare case: the LRU clock has wrapped since the key was last accessed
            return (lruclock + (REDIS_LRU_CLOCK_MAX - o->lru)) *
                        REDIS_LRU_CLOCK_RESOLUTION;
        }
    }
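The two branches can be checked with concrete numbers. This is a small standalone sketch of the same arithmetic, using hypothetical names (`idle_ms`, `CLOCK_MAX`, `RESOLUTION_MS`) and assuming the 24-bit clock and 1000 ms resolution quoted earlier.

```c
#include <stdint.h>

#define CLOCK_MAX ((1u << 24) - 1)   /* largest value of the 24-bit clock */
#define RESOLUTION_MS 1000           /* one tick = 1000 ms */

/* Idle time in milliseconds given the current clock and the object's
 * last-access clock, both already truncated to 24 bits. */
static uint64_t idle_ms(uint32_t now, uint32_t last) {
    if (now >= last) {
        return (uint64_t)(now - last) * RESOLUTION_MS;
    }
    /* The clock wrapped after the last access: count the ticks up to
     * CLOCK_MAX, then the ticks since the counter restarted. */
    return ((uint64_t)now + (CLOCK_MAX - last)) * RESOLUTION_MS;
}
```

For example, an object last touched at tick 4 and examined at tick 10 has been idle for 6000 ms. An object last touched 2 ticks before CLOCK_MAX and examined at tick 5 after the wrap is reported as idle for 7000 ms.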

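    TTL eviction mechanism

    Among the data structures of a Redis database there is a table of key expire times, redisDb.expires. TTL eviction is similar to LRU eviction.

    TTL eviction works as follows:

    Redis picks a few key-value pairs at random from the expires table and evicts the one with the smallest TTL, that is, the one that will expire soonest. Here too, Redis does not guarantee that it finds the soonest-to-expire key in the whole expires table, only the soonest among the few keys it sampled.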
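The selection among the sampled keys is a plain minimum over their expire timestamps. The sketch below takes the sampled timestamps as an array so it stays deterministic; `pick_ttl_victim` and the array layout are hypothetical and only illustrate the comparison, not how Redis stores or samples expires.

```c
#include <stddef.h>

/* Given the expire timestamps (Unix time) of the sampled keys, return the
 * index of the sample that expires soonest, or -1 if there are no samples. */
static int pick_ttl_victim(const long long *expire_at, size_t count) {
    int best = -1;
    for (size_t k = 0; k < count; k++) {
        /* A smaller timestamp means the key expires sooner. */
        if (best < 0 || expire_at[k] < expire_at[best]) best = (int)k;
    }
    return best;
}
```

This matches the `thisval < bestval` comparison in the volatile-ttl branch of freeMemoryIfNeeded: the candidate with the smallest expire timestamp wins.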


Reposted from blog.csdn.net/suixinsuoyu12519/article/details/82627826