Analysis of Redis's LRU Eviction Strategy

When Redis is used as a cache, some scenarios require attention to memory consumption. Redis frees up space by deleting expired keys, and it has two strategies for doing so:

  • Lazy deletion: every time a key is fetched from the keyspace, it is checked for expiration; if the key has expired it is deleted, otherwise it is returned (see the sketch after this list).
  • Periodic deletion: every so often, the program scans the database and deletes the expired keys it finds.
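
A minimal sketch of lazy deletion under stated assumptions: entry_t, now_ms and lookup are illustrative stand-ins, not Redis internals; the point is only that the expiry check happens on read.

#include <stdio.h>
#include <sys/time.h>

typedef struct {
    const char *key;
    const char *value;
    long long expire_at_ms;        /* absolute expiry; 0 = never expires */
} entry_t;

static long long now_ms(void) {
    struct timeval tv;
    gettimeofday(&tv, NULL);
    return (long long)tv.tv_sec * 1000 + tv.tv_usec / 1000;
}

/* Lazy deletion: expired entries are detected (and removed) on access. */
static const char *lookup(entry_t *e) {
    if (e->expire_at_ms != 0 && now_ms() >= e->expire_at_ms) {
        e->value = NULL;           /* stand-in for actually deleting the key */
        return NULL;               /* an expired entry behaves as missing */
    }
    return e->value;
}

int main(void) {
    entry_t e = { "session:42", "data", now_ms() - 1 };  /* already expired */
    const char *v = lookup(&e);
    printf("%s\n", v ? v : "(nil)");                     /* prints: (nil) */
    return 0;
}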

In addition, Redis can enable an LRU mechanism that automatically evicts key-value pairs when memory runs low.

LRU algorithm

When data must be evicted from a cache, we would like to evict entries that will never be used again and keep the ones that will be accessed frequently; the fundamental problem is that a cache cannot predict the future. LRU offers a heuristic: data that was accessed recently is more likely to be accessed again in the future. Cache workloads typically have skewed access distributions, where a small part of the data receives most of the accesses. When the access pattern rarely changes, we can record the last access time of each entry, and the entry with the smallest idle time can be considered the most likely to be accessed in the future.
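
To make this concrete, here is a minimal sketch of exact LRU eviction under that assumption: every entry carries a last-access timestamp and the victim is the entry that has been idle the longest. cache_entry and evict_lru are illustrative names, not Redis code.

#include <stdio.h>
#include <time.h>

typedef struct {
    const char *key;
    time_t last_access;            /* refreshed on every read of the entry */
} cache_entry;

/* Return the index of the entry with the largest idle time (i.e. the
 * smallest last_access): the exact-LRU eviction victim. */
static int evict_lru(const cache_entry *entries, int n) {
    int victim = 0;
    for (int i = 1; i < n; i++)
        if (entries[i].last_access < entries[victim].last_access)
            victim = i;
    return victim;
}

int main(void) {
    time_t now = time(NULL);
    cache_entry e[] = {
        { "A", now - 3 },          /* idle for 3 seconds */
        { "B", now - 1 },          /* idle for 1 second  */
        { "C", now - 9 },          /* idle for 9 seconds */
    };
    printf("evict %s\n", e[evict_lru(e, 3)].key);        /* prints: evict C */
    return 0;
}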

As an example, consider the following access pattern: A is accessed once every 5s, B once every 2s, and C and D once every 10s; the | character marks the point in time at which the idle times are computed:

~~~~~A~~~~~A~~~~~A~~~~A~~~~~A~~~~~A~~|
~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~|
~~~~~~~~~~C~~~~~~~~~C~~~~~~~~~C~~~~~~|
~~~~~D~~~~~~~~~~D~~~~~~~~~D~~~~~~~~~D|

LRU works well here for A, B and C: it correctly predicts that the probability of future access is B > A > C. For D, however, it computes the smallest idle time of all, because D happens to have been accessed most recently, even though it is the least frequently used key.

Nevertheless, LRU is in general an algorithm with good enough performance.

LRU configuration parameters

Redis has three configuration parameters related to LRU:

  • maxmemory: the memory limit Redis uses for storing data, for example 100mb. When memory consumption exceeds this value, eviction is triggered. A value of 0 means the amount of cached data is unlimited, i.e. LRU never takes effect; 0 is the default on 64-bit systems, while 32-bit systems default to an implicit 3GB limit.
  • maxmemory_policy: the eviction policy applied once the limit is hit.
  • maxmemory_samples: the sampling size used by the approximation. The larger the value, the closer the behaviour is to true LRU, but the higher the cost, which affects performance; the default is 5 samples. (A sample configuration follows this list.)
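
For example, a minimal redis.conf fragment covering all three (the values are illustrative; note that the configuration file spells the directives with hyphens):

maxmemory 100mb
maxmemory-policy allkeys-lru
maxmemory-samples 5

The same settings can also be changed at runtime, e.g. with CONFIG SET maxmemory-policy allkeys-lru.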

Eviction policies

The eviction policies that maxmemory_policy can be set to are the following:

  • noeviction: if the cached data exceeds the maxmemory limit, return an error to the client for any command that would allocate memory (most write commands, though a few such as DEL are exceptions).
  • allkeys-lru: evict using LRU across all keys.
  • volatile-lru: evict using LRU, but only among keys that have an expiration time set.
  • allkeys-random: evict random keys across the whole keyspace.
  • volatile-random: evict random keys among those with an expiration time set.
  • volatile-ttl: evict only among keys with an expiration time set, preferring keys with a smaller TTL (time to live).

volatile-lru, volatile-random and volatile-ttl do not operate on the full keyspace, so they may fail to free enough memory. When no key has expired, or no key has an expiration set at all, these three policies degenerate into behaviour similar to noeviction.

General rules of thumb:

  • Use allkeys-lru when requests are expected to follow a power-law distribution (the Pareto rule and so on), i.e. some subset of elements is accessed far more often than the rest.
  • Use allkeys-random when all keys are scanned in a continuous cycle, or when the distribution of requests is expected to be uniform (all elements accessed with roughly equal probability).
  • Use volatile-ttl when the cached objects are created with varying TTL values, which gives Redis useful hints about eviction candidates.

The volatile-lru and volatile-random policies are useful when you want a single Redis instance to provide both caching and persistence for a set of keys: keys without an expiration time are persisted, while keys with an expiration time participate in cache eviction. However, running two separate instances is usually the better way to solve this problem.

Setting an expiration time on a key costs memory, so the allkeys-lru policy is more space-efficient: under it there is no need to set expiration times at all.

Approximate LRU algorithm

An exact LRU algorithm needs a doubly linked list to record the order of recent accesses, but Redis, to save memory, does not implement LRU in full. Instead of always reclaiming the key that has gone unaccessed the longest, Redis runs an approximation of LRU: it samples a small number of keys and evicts the one among them that has been idle the longest. The accuracy of the algorithm can be tuned by adjusting the number of keys sampled per eviction, maxmemory-samples.

According to the Redis author, 24 bits of space could be squeezed out of each redisObject. That is not enough to store two pointers, but it is enough for a low-resolution timestamp: the object stores, in seconds, the Unix time at which it was created or last updated, i.e. the LRU clock. A 24-bit seconds counter only overflows after about 194 days, and cached data is updated far more frequently than that.
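
As a quick sanity check of the 194-day figure, a one-liner over the 24-bit width:

#include <stdio.h>

int main(void) {
    unsigned long long range = 1ULL << 24;     /* 16,777,216 representable seconds */
    printf("%.1f days\n", range / 86400.0);    /* prints: 194.2 days */
    return 0;
}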

Redis keeps the keyspace in a hash table. Selecting the least recently accessed key among all keys would require an extra data structure to store the ordering information, which is clearly uneconomical. Initially, Redis simply picked 3 keys at random and evicted the least recently used of them; later the algorithm was improved to a policy of sampling N keys, with a default of 5 (a sketch of this scheme follows).
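
A compact sketch of that sample-and-evict scheme, under stated assumptions: a flat array and rand() stand in for Redis's hash-table sampling, and the names are illustrative.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

typedef struct { const char *key; long long last_access; } item_t;

/* Sample `samples` random slots and return the least recently used one.
 * Duplicate picks are possible; the scheme tolerates that. */
static int pick_victim(const item_t *tab, int n, int samples) {
    int victim = rand() % n;
    for (int i = 1; i < samples; i++) {
        int j = rand() % n;
        if (tab[j].last_access < tab[victim].last_access)
            victim = j;
    }
    return victim;
}

int main(void) {
    srand((unsigned)time(NULL));
    item_t tab[] = {
        { "a", 100 }, { "b", 50 }, { "c", 300 }, { "d", 20 }, { "e", 250 },
    };
    /* With 5 samples over 5 slots, "d" is likely (but not guaranteed,
     * since samples may repeat) to be chosen. */
    printf("evict %s\n", tab[pick_victim(tab, 5, 5)].key);
    return 0;
}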

Redis 3.0 improved the algorithm by introducing a pool of candidate keys, holding 16 keys by default and kept sorted by idle time. A newly sampled key enters the pool only if the pool is not yet full, or if its idle time is larger than the smallest idle time in the pool.

The following comparison (the figure is from the Redis documentation and is not reproduced here) shows the real LRU algorithm against the approximation:

The light gray band shows objects that were evicted, the gray band shows objects that were not evicted, and the green band shows newly added objects. As can be seen, with maxmemory-samples at 5, Redis 3.0 performs better than Redis 2.8; with a sample size of 10, the Redis 3.0 approximation comes very close to the theoretical performance of true LRU.

When the data access pattern is close to a power-law distribution, i.e. accesses concentrate on the most-used subset of keys, the LRU approximation handles it very well.

In simulation experiments it turns out that, under a power-law access pattern, the true LRU algorithm and the approximation are almost indistinguishable.

LRU source code analysis

In Redis, keys and values are both handled as redisObject structures:

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;

The 24-bit lru field of redisObject (LRU_BITS is 24) records the object's LRU time.

When a Redis command accesses cached data, it calls the function lookupKey:

robj *lookupKey(redisDb *db, robj *key, int flags) {
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);

        /* Update the access time for the ageing algorithm.
         * Don't do it if we have a saving child, as this will trigger
         * a copy on write madness. */
        if (server.rdb_child_pid == -1 &&
            server.aof_child_pid == -1 &&
            !(flags & LOOKUP_NOTOUCH))
        {
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                updateLFU(val);
            } else {
                val->lru = LRU_CLOCK();
            }
        }
        return val;
    } else {
        return NULL;
    }
}

When the policy is LRU (rather than LFU), this function updates the object's lru value, setting it to LRU_CLOCK():

/* Return the LRU clock, based on the clock resolution. This is a time
 * in a reduced-bits format that can be used to set and check the
 * object->lru field of redisObject structures. */
unsigned int getLRUClock(void) {
    return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
}

/* This function is used to obtain the current LRU clock.
 * If the current resolution is lower than the frequency we refresh the
 * LRU clock (as it should be in production servers) we return the
 * precomputed value, otherwise we need to resort to a system call. */
unsigned int LRU_CLOCK(void) {
    unsigned int lruclock;
    if (1000/server.hz <= LRU_CLOCK_RESOLUTION) {
        atomicGet(server.lruclock,lruclock);
    } else {
        lruclock = getLRUClock();
    }
    return lruclock;
}

LRU_CLOCK() depends on LRU_CLOCK_RESOLUTION (default 1000 ms), which is the precision of the LRU algorithm, i.e. how long one LRU tick lasts. server.hz is the server's refresh frequency: when the server refreshes its cached clock at least as often as the LRU resolution requires (with the default hz of 10, 1000/10 = 100 ms ≤ 1000 ms), LRU_CLOCK() uses the precomputed server time directly, avoiding the overhead of a system call.

The entry point for Redis command processing is processCommand:

int processCommand(client *c) {

    /* Handle the maxmemory directive.
     *
     * Note that we do not want to reclaim memory if we are here re-entering
     * the event loop since there is a busy Lua script running in timeout
     * condition, to avoid mixing the propagation of scripts with the
     * propagation of DELs due to eviction. */
    if (server.maxmemory && !server.lua_timedout) {
        int out_of_memory = freeMemoryIfNeededAndSafe() == C_ERR;
        /* freeMemoryIfNeeded may flush slave output buffers. This may result
         * into a slave, that may be the active client, to be freed. */
        if (server.current_client == NULL) return C_ERR;

        /* It was impossible to free enough memory, and the command the client
         * is trying to execute is denied during OOM conditions or the client
         * is in MULTI/EXEC context? Error. */
        if (out_of_memory &&
            (c->cmd->flags & CMD_DENYOOM ||
             (c->flags & CLIENT_MULTI && c->cmd->proc != execCommand))) {
            flagTransaction(c);
            addReply(c, shared.oomerr);
            return C_OK;
        }
    }
}

Only the maxmemory handling is shown above. freeMemoryIfNeededAndSafe is the function responsible for freeing memory:

int freeMemoryIfNeeded(void) {
    /* By default replicas should ignore maxmemory
     * and just be masters exact copies. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK;

    size_t mem_reported, mem_tofree, mem_freed;
    mstime_t latency, eviction_latency;
    long long delta;
    int slaves = listLength(server.slaves);

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return C_OK;
    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return C_OK;

    mem_freed = 0;

    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        goto cant_free; /* We need to free memory, but policy forbids. */

    latencyStartMonitor(latency);
    while (mem_freed < mem_tofree) {
        int j, k, i, keys_freed = 0;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                /* We don't want to make local-db choices when expiring keys,
                 * so to start populate the eviction pool sampling keys from
                 * every DB. */
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Go backward from best to worst element to evict. */
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires,
                            pool[k].key);
                    }

                    /* Remove the entry from the pool. */
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    /* If the key exists, is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }

        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            /* When evicting a random key, we try to evict a key for
             * each DB, so we use the static 'next_db' variable to
             * incrementally visit all DBs. */
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }

        /* Finally remove the selected key. */
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
            propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
            /* We compute the amount of memory freed by db*Delete() alone.
             * It is possible that actually the memory needed to propagate
             * the DEL in AOF and replication link is greater than the one
             * we are freeing removing the key, but we can't account for
             * that otherwise we would never exit the loop.
             *
             * AOF and Output buffer memory will be freed eventually so
             * we only care about memory used by the key space. */
            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            latencyRemoveNestedEvent(latency,eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            decrRefCount(keyobj);
            keys_freed++;

            /* When the memory to free starts to be big enough, we may
             * start spending so much time here that is impossible to
             * deliver data to the slaves fast enough, so we force the
             * transmission here inside the loop. */
            if (slaves) flushSlavesOutputBuffers();

            /* Normally our stop condition is the ability to release
             * a fixed, pre-computed amount of memory. However when we
             * are deleting objects in another thread, it's better to
             * check, from time to time, if we already reached our target
             * memory, since the "mem_freed" amount is computed only
             * across the dbAsyncDelete() call, while the thread can
             * release the memory all the time. */
            if (server.lazyfree_lazy_eviction && !(keys_freed % 16)) {
                if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                    /* Let's satisfy our stop condition. */
                    mem_freed = mem_tofree;
                }
            }
        }

        if (!keys_freed) {
            latencyEndMonitor(latency);
            latencyAddSampleIfNeeded("eviction-cycle",latency);
            goto cant_free; /* nothing to free... */
        }
    }
    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);
    return C_OK;

cant_free:
    /* We are here if we are not able to reclaim memory. There is only one
     * last thing we can try: check if the lazyfree thread has jobs in queue
     * and wait... */
    while(bioPendingJobsOfType(BIO_LAZY_FREE)) {
        if (((mem_reported - zmalloc_used_memory()) + mem_freed) >= mem_tofree)
            break;
        usleep(1000);
    }
    return C_ERR;
}

/* This is a wrapper for freeMemoryIfNeeded() that only really calls the
 * function if right now there are the conditions to do so safely:
 *
 * - There must be no script in timeout condition.
 * - Nor we are loading data right now.
 *
 */
int freeMemoryIfNeededAndSafe(void) {
    if (server.lua_timedout || server.loading) return C_OK;
    return freeMemoryIfNeeded();
}

The eviction strategies selectable through maxmemory_policy are all implemented inside this function.

It can be seen that when an LRU policy is employed, Redis starts from database 0 (of 16 by default) and, depending on the policy, samples candidate keys from each redisDb's dict (all keys) or expires (keys with an expiration time) to update the candidate pool. The pool update logic is evictionPoolPopulate:

void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
    int j, k, count;
    dictEntry *samples[server.maxmemory_samples];

    count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);
    for (j = 0; j < count; j++) {
        unsigned long long idle;
        sds key;
        robj *o;
        dictEntry *de;

        de = samples[j];
        key = dictGetKey(de);

        /* If the dictionary we are sampling from is not the main
         * dictionary (but the expires one) we need to lookup the key
         * again in the key dictionary to obtain the value object. */
        if (server.maxmemory_policy != MAXMEMORY_VOLATILE_TTL) {
            if (sampledict != keydict) de = dictFind(keydict, key);
            o = dictGetVal(de);
        }

        /* Calculate the idle time according to the policy. This is called
         * idle just because the code initially handled LRU, but is in fact
         * just a score where an higher score means better candidate. */
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
            idle = estimateObjectIdleTime(o);
        } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            /* When we use an LRU policy, we sort the keys by idle time
             * so that we expire keys starting from greater idle time.
             * However when the policy is an LFU one, we have a frequency
             * estimation, and we want to evict keys with lower frequency
             * first. So inside the pool we put objects using the inverted
             * frequency subtracting the actual frequency to the maximum
             * frequency of 255. */
            idle = 255-LFUDecrAndReturn(o);
        } else if (server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) {
            /* In this case the sooner the expire the better. */
            idle = ULLONG_MAX - (long)dictGetVal(de);
        } else {
            serverPanic("Unknown eviction policy in evictionPoolPopulate()");
        }

        /* Insert the element inside the pool.
         * First, find the first empty bucket or the first populated
         * bucket that has an idle time smaller than our idle time. */
        k = 0;
        while (k < EVPOOL_SIZE &&
               pool[k].key &&
               pool[k].idle < idle) k++;
        if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) {
            /* Can't insert if the element is < the worst element we have
             * and there are no empty buckets. */
            continue;
        } else if (k < EVPOOL_SIZE && pool[k].key == NULL) {
            /* Inserting into empty position. No setup needed before insert. */
        } else {
            /* Inserting in the middle. Now k points to the first element
             * greater than the element to insert.  */
            if (pool[EVPOOL_SIZE-1].key == NULL) {
                /* Free space on the right? Insert at k shifting
                 * all the elements from k to end to the right. */

                /* Save SDS before overwriting. */
                sds cached = pool[EVPOOL_SIZE-1].cached;
                memmove(pool+k+1,pool+k,
                    sizeof(pool[0])*(EVPOOL_SIZE-k-1));
                pool[k].cached = cached;
            } else {
                /* No free space on right? Insert at k-1 */
                k--;
                /* Shift all elements on the left of k (included) to the
                 * left, so we discard the element with smaller idle time. */
                sds cached = pool[0].cached; /* Save SDS before overwriting. */
                if (pool[0].key != pool[0].cached) sdsfree(pool[0].key);
                memmove(pool,pool+1,sizeof(pool[0])*k);
                pool[k].cached = cached;
            }
        }

        /* Try to reuse the cached SDS string allocated in the pool entry,
         * because allocating and deallocating this object is costly
         * (according to the profiler, not my fantasy. Remember:
         * premature optimizbla bla bla bla. */
        int klen = sdslen(key);
        if (klen > EVPOOL_CACHED_SDS_SIZE) {
            pool[k].key = sdsdup(key);
        } else {
            memcpy(pool[k].cached,key,klen+1);
            sdssetlen(pool[k].cached,klen);
            pool[k].key = pool[k].cached;
        }
        pool[k].idle = idle;
        pool[k].dbid = dbid;
    }
}

Redis randomly samples maxmemory_samples keys, computes the idle time of each, and inserts a key into the pool when it qualifies (the pool has free slots, or the key's idle time is larger than that of some key already in the pool). After the pool has been updated, the key with the largest idle time in the pool is evicted.

estimateObjectIdleTime is used to compute the idle time of a Redis object:

/* Given an object returns the min number of milliseconds the object was never
 * requested, using an approximated LRU algorithm. */
unsigned long long estimateObjectIdleTime(robj *o) {
    unsigned long long lruclock = LRU_CLOCK();
    if (lruclock >= o->lru) {
        return (lruclock - o->lru) * LRU_CLOCK_RESOLUTION;
    } else {
        return (lruclock + (LRU_CLOCK_MAX - o->lru)) *
                    LRU_CLOCK_RESOLUTION;
    }
}

The idle time is essentially the difference between the global LRU_CLOCK() and the object's lru value, multiplied by the precision LRU_CLOCK_RESOLUTION to convert LRU ticks (seconds, at the default resolution) into milliseconds; the else branch accounts for wrap-around of the 24-bit clock.
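
The same arithmetic as a standalone snippet, using assumed sample values (10500 and 10200 are made up for illustration; the macros mirror the Redis defaults):

#include <stdio.h>

#define LRU_CLOCK_RESOLUTION 1000   /* ms per LRU tick, the Redis default */
#define LRU_CLOCK_MAX ((1<<24)-1)   /* maximum value of the 24-bit clock */

int main(void) {
    unsigned int lruclock = 10500, obj_lru = 10200;   /* assumed sample values */
    unsigned long long idle;
    if (lruclock >= obj_lru)
        idle = (unsigned long long)(lruclock - obj_lru) * LRU_CLOCK_RESOLUTION;
    else  /* the 24-bit clock wrapped around since the object was touched */
        idle = ((unsigned long long)lruclock + (LRU_CLOCK_MAX - obj_lru))
               * LRU_CLOCK_RESOLUTION;
    printf("idle = %llu ms\n", idle);   /* prints: idle = 300000 ms (5 min) */
    return 0;
}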

Origin: www.cnblogs.com/linxiyue/p/10945216.html