When Redis is used as a cache, memory consumption has to be considered in some scenarios. Redis deletes expired keys to free up space, using two strategies:
- Lazy deletion: every time a key is read from the keyspace, Redis checks whether it has expired. If it has, the key is deleted; if not, the key is returned.
- Periodic deletion: at regular intervals, Redis checks the database and deletes the expired keys it finds.
In addition, Redis can enable LRU eviction to automatically evict some key-value pairs.
LRU algorithm
When data has to be evicted from the cache, we would like to evict the data that will not be used again and keep the data that will be accessed frequently. The biggest problem is that a cache cannot predict the future. One solution is to predict with LRU: data that was accessed recently is more likely to be accessed again. Cached data usually follows this kind of access distribution: a small part of the data receives most of the accesses. When the access pattern rarely changes, we can record the last access time of each item; the item with the smallest idle time can be considered the most likely to be accessed in the future.
Consider the following access pattern: A is accessed every 5 s, B every 2 s, and C and D every 10 s. The | marks the point at which idle time is calculated:
~~~~~A~~~~~A~~~~~A~~~~A~~~~~A~~~~~A~~|
~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~~B~|
~~~~~~~~~~C~~~~~~~~~C~~~~~~~~~C~~~~~~|
~~~~~D~~~~~~~~~~D~~~~~~~~~D~~~~~~~~~D|
As can be seen, LRU works well for A, B and C, correctly predicting that the probability of future access is B > A > C. It fails for D, which gets the smallest idle time even though it is accessed only once every 10 s.
In general, however, LRU performs well enough.
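The prediction in the diagram can be reproduced with a few lines of C. This is only a minimal sketch: the function names are hypothetical, and the timestamps in the usage note are read off the diagram as character positions, which is an assumption about its scale.

```c
#include <assert.h>
#include <stddef.h>

/* Idle time is the distance between "now" and the last access. */
static unsigned idle_time(unsigned now, unsigned last_access) {
    assert(now >= last_access);
    return now - last_access;
}

/* Return the index of the entry with the largest idle time,
 * i.e. the entry a true LRU would evict first. */
static size_t lru_victim(const unsigned *last_access, size_t n, unsigned now) {
    assert(n > 0);
    size_t victim = 0;
    for (size_t i = 1; i < n; i++) {
        if (idle_time(now, last_access[i]) >
            idle_time(now, last_access[victim]))
            victim = i;
    }
    return victim;
}
```

With last accesses of 35, 36, 31 and 37 for A, B, C and D and the cut-off at 37, the idle times are 2, 1, 6 and 0. C is the eviction victim, and D looks like the hottest key although it is one of the least frequently accessed.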
LRU configuration parameters
Redis has three configuration parameters related to LRU:
- maxmemory: the memory limit for stored data, for example 100m. When the cache's memory consumption exceeds this value, eviction is triggered. A value of 0 means the amount of cached data is not limited, i.e. LRU has no effect. The default is 0 on 64-bit systems; 32-bit systems have an implicit limit of 3 GB.
- maxmemory-policy: the eviction policy applied once eviction is triggered.
- maxmemory-samples: the sampling precision, i.e. the number of keys sampled at random for each eviction. The larger the value, the closer the result is to true LRU, but the higher the cost and the impact on performance. The default is 5.
Elimination strategy
The eviction policy maxmemory-policy can take the following values:
- noeviction: when cached data exceeds the maxmemory limit and a client issues a command that would allocate memory (most write commands, with DEL and a few other commands as exceptions), an error is returned to the client.
- allkeys-lru: evict among all keys using LRU.
- volatile-lru: evict using LRU, but only among keys that have an expiration time set.
- allkeys-random: evict random keys among all keys.
- volatile-random: evict random keys among keys that have an expiration time set.
- volatile-ttl: evict only among keys that have an expiration time set, preferring keys with a smaller TTL (time to live).
The volatile-lru, volatile-random and volatile-ttl policies do not work on the full key set, so they may fail to free enough memory. When no key has an expiration time set, these three policies behave like noeviction.
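Why the volatile-* policies can degrade to noeviction becomes clearer when looking at how many keys each policy is allowed to choose from. The following is a minimal sketch; the enum and the function are hypothetical names for illustration and are not taken from the Redis source.

```c
#include <assert.h>
#include <stddef.h>

enum policy {
    NOEVICTION,
    ALLKEYS_LRU,
    VOLATILE_LRU,
    ALLKEYS_RANDOM,
    VOLATILE_RANDOM,
    VOLATILE_TTL
};

/* Number of keys a policy may pick an eviction victim from. */
static size_t eviction_candidates(enum policy p, size_t total_keys,
                                  size_t keys_with_ttl) {
    assert(keys_with_ttl <= total_keys);
    switch (p) {
    case ALLKEYS_LRU:
    case ALLKEYS_RANDOM:
        return total_keys;      /* every key is a candidate */
    case VOLATILE_LRU:
    case VOLATILE_RANDOM:
    case VOLATILE_TTL:
        return keys_with_ttl;   /* only keys with an expiration time */
    case NOEVICTION:
    default:
        return 0;               /* nothing may be evicted */
    }
}
```

With 1000 keys and no expiration time set on any of them, every volatile-* policy has zero candidates, exactly like noeviction.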
General rule of thumb:
- Use allkeys-lru when you expect requests to follow a power-law distribution (Pareto principle and similar), i.e. a subset of the elements is accessed far more often than the rest.
- Use allkeys-random when all keys are accessed continuously in a cycle, or when you expect a uniform distribution (all elements are accessed with roughly the same probability).
- Use volatile-ttl when the cached objects are given different TTL values; this policy works best when the TTLs differ.
The volatile-lru and volatile-random policies are useful when you want a single Redis instance to act both as a cache and as a store for a set of persistent keys. Keys without an expiration time are kept persistently, while keys with an expiration time take part in cache eviction. However, running two separate instances is usually a better way to solve this problem.
Setting an expiration time on a key costs memory, so allkeys-lru saves space: with this policy there is no need to set an expiration time on keys.
Approximate LRU algorithm
A true LRU algorithm needs a doubly linked list that keeps the data in order of most recent access. To save memory, Redis does not implement LRU exactly. Redis does not always pick the key that has gone unaccessed for the longest time. Instead it runs an approximation of LRU: it samples a small number of keys and evicts the one among them that has been idle the longest. The precision of the algorithm can be tuned through maxmemory-samples, the number of keys sampled for each eviction.
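The sampling idea can be sketched in a few lines of C. This is an illustration only, not the Redis implementation: the function names and the small built-in random number generator are assumptions made so that the example is self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Tiny linear congruential generator, so the sketch needs no global state. */
static unsigned next_random(unsigned *state) {
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Sample nsamples keys at random (with replacement) and return the index
 * of the sampled key with the largest idle time. */
static size_t sampled_lru_victim(const unsigned *idle, size_t nkeys,
                                 size_t nsamples, unsigned *rng_state) {
    assert(nkeys > 0 && nsamples > 0);
    size_t best = next_random(rng_state) % nkeys;
    for (size_t s = 1; s < nsamples; s++) {
        size_t candidate = next_random(rng_state) % nkeys;
        if (idle[candidate] > idle[best]) best = candidate;
    }
    return best;
}
```

The victim is the best key among the samples, not necessarily the best key overall. A larger nsamples makes it more likely that the globally oldest key is among the samples, at the cost of more work per eviction.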
According to the Redis author, 24 bits of space could be squeezed out of each Redis Object. 24 bits are not enough to store two pointers, but they are enough to store a low-resolution timestamp. The Redis Object stores, in seconds, the unix time at which the object was created or last updated; this is the LRU clock. A 24-bit counter in seconds overflows after about 194 days, which is enough because cached data is updated very frequently.
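The 194-day figure follows directly from the width of the field. A small worked calculation, with a hypothetical helper name:

```c
#include <assert.h>

/* Days until a clock of `bits` bits, ticking once every resolution_ms
 * milliseconds, wraps around (rounded down). */
static unsigned long long clock_wrap_days(unsigned bits,
                                          unsigned long long resolution_ms) {
    assert(bits > 0 && bits < 32);
    unsigned long long ticks = 1ULL << bits;
    unsigned long long ms_per_day = 24ULL * 60 * 60 * 1000;
    return ticks * resolution_ms / ms_per_day;
}
```

For 24 bits and a resolution of 1000 ms, this is 16,777,216 seconds, or 194 full days.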
Redis stores the keyspace in a hash table. Picking the least recently accessed key out of all keys would require an additional data structure to hold the ordering information, which is clearly uneconomical. Initially, Redis simply picked three random keys and evicted the best candidate among them. The algorithm was later generalized to sample N keys, with a default of 5.
Redis 3.0 improved the algorithm by adding a pool of candidate keys. The pool holds 16 keys by default, sorted by idle time. A new key enters the pool only if the pool is not full or if its idle time is greater than the smallest idle time in the pool.
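The admission rule of the pool can be sketched as follows. This is a simplified illustration under stated assumptions: keys are plain integer ids, and the struct and function names are hypothetical rather than the ones used in Redis.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 16

struct candidate { int key_id; unsigned long long idle; };
struct pool { struct candidate slot[POOL_SIZE]; size_t used; };

/* Offer a key to the pool, which is kept sorted by ascending idle time.
 * Returns 1 if the key was admitted, 0 if it was rejected. */
static int pool_offer(struct pool *p, int key_id, unsigned long long idle) {
    assert(p->used <= POOL_SIZE);
    /* Find the first entry whose idle time is not smaller than ours. */
    size_t k = 0;
    while (k < p->used && p->slot[k].idle < idle) k++;
    if (p->used == POOL_SIZE) {
        /* Full pool: reject keys that do not beat the smallest idle time. */
        if (k == 0) return 0;
        /* Drop slot[0] and shift the smaller entries one step left. */
        memmove(&p->slot[0], &p->slot[1], (k - 1) * sizeof p->slot[0]);
        k--;
    } else {
        /* Free space: shift the larger entries one step right. */
        memmove(&p->slot[k + 1], &p->slot[k],
                (p->used - k) * sizeof p->slot[0]);
        p->used++;
    }
    p->slot[k].key_id = key_id;
    p->slot[k].idle = idle;
    return 1;
}

/* Remove and return the best eviction candidate (largest idle time),
 * or -1 if the pool is empty. */
static int pool_pop_best(struct pool *p) {
    if (p->used == 0) return -1;
    return p->slot[--p->used].key_id;
}
```

Because the pool survives between evictions, good candidates found by earlier samples are not lost, which is why a small sample size can still come close to true LRU.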
The following image compares true LRU with the approximated LRU:
The light gray band shows objects that were evicted, the gray band shows objects that were not evicted, and the green band shows newly added objects. As can be seen, with maxmemory-samples set to 5, Redis 3.0 gives better results than Redis 2.8. With a sample size of 10, the Redis 3.0 approximation is very close to the theoretical performance of true LRU.
When the access pattern is close to a power-law distribution, i.e. most accesses are concentrated on a subset of the keys, the approximated LRU handles it very well.
In simulation experiments we found that, with a power-law access pattern, there is almost no difference between true LRU and the approximated LRU.
LRU source code analysis
Redis wraps the values stored in the keyspace in redisObject objects (keys passed to commands are robj as well):
typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS; /* LRU time (relative to global lru_clock) or
                            * LFU data (least significant 8 bits frequency
                            * and most significant 16 bits access time). */
    int refcount;
    void *ptr;
} robj;
The 24-bit lru field (LRU_BITS is 24) records the LRU time of the redisObject.
When a Redis command accesses cached data, it calls the function lookupKey:
robj *lookupKey(redisDb *db, robj *key, int flags) {
    dictEntry *de = dictFind(db->dict,key->ptr);
    if (de) {
        robj *val = dictGetVal(de);

        /* Update the access time for the ageing algorithm.
         * Don't do it if we have a saving child, as this will trigger
         * a copy on write madness. */
        if (server.rdb_child_pid == -1 &&
            server.aof_child_pid == -1 &&
            !(flags & LOOKUP_NOTOUCH))
        {
            if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
                updateLFU(val);
            } else {
                val->lru = LRU_CLOCK();
            }
        }
        return val;
    } else {
        return NULL;
    }
}
When the policy is LRU (not LFU), this function updates the object's lru field to the value of LRU_CLOCK():
/* Return the LRU clock, based on the clock resolution. This is a time
 * in a reduced-bits format that can be used to set and check the
 * object->lru field of redisObject structures. */
unsigned int getLRUClock(void) {
    return (mstime()/LRU_CLOCK_RESOLUTION) & LRU_CLOCK_MAX;
}

/* This function is used to obtain the current LRU clock.
 * If the current resolution is lower than the frequency we refresh the
 * LRU clock (as it should be in production servers) we return the
 * precomputed value, otherwise we need to resort to a system call. */
unsigned int LRU_CLOCK(void) {
    unsigned int lruclock;
    if (1000/server.hz <= LRU_CLOCK_RESOLUTION) {
        atomicGet(server.lruclock,lruclock);
    } else {
        lruclock = getLRUClock();
    }
    return lruclock;
}
LRU_CLOCK() depends on LRU_CLOCK_RESOLUTION (default 1000), which is the precision of the LRU algorithm, i.e. how long one LRU unit lasts, in milliseconds. server.hz is the server's refresh frequency. If the server refreshes its cached clock at least as often as the LRU resolution requires (1000/server.hz is no greater than LRU_CLOCK_RESOLUTION), LRU_CLOCK() uses the precomputed server.lruclock directly, which avoids the overhead of a system call.
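The two decisions described here, reducing a millisecond timestamp to a 24-bit clock value and choosing between the cached and the freshly computed clock, can be tried out in isolation. This is a minimal sketch; the SKETCH_ names and both functions are hypothetical stand-ins for the Redis macros and functions shown above.

```c
#include <assert.h>

#define SKETCH_CLOCK_BITS 24
#define SKETCH_CLOCK_MAX ((1u << SKETCH_CLOCK_BITS) - 1)

/* Reduce a millisecond timestamp to a 24-bit clock value. */
static unsigned clock_from_ms(unsigned long long now_ms,
                              unsigned resolution_ms) {
    assert(resolution_ms > 0);
    return (unsigned)((now_ms / resolution_ms) & SKETCH_CLOCK_MAX);
}

/* The cached clock is precise enough when the server refreshes it at
 * least once per resolution interval. */
static int can_use_cached_clock(int hz, int resolution_ms) {
    assert(hz > 0);
    return 1000 / hz <= resolution_ms;
}
```

With hz set to 10 the cached value is refreshed every 100 ms, well within the 1000 ms resolution, so the cached clock is used. With hz set to 1 and a resolution of 500 ms, the cached value would be too stale.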
The entry point for command processing in Redis is processCommand:
int processCommand(client *c) {
    /* ... earlier checks omitted ... */

    /* Handle the maxmemory directive.
     *
     * Note that we do not want to reclaim memory if we are here re-entering
     * the event loop since there is a busy Lua script running in timeout
     * condition, to avoid mixing the propagation of scripts with the
     * propagation of DELs due to eviction. */
    if (server.maxmemory && !server.lua_timedout) {
        int out_of_memory = freeMemoryIfNeededAndSafe() == C_ERR;
        /* freeMemoryIfNeeded may flush slave output buffers. This may result
         * into a slave, that may be the active client, to be freed. */
        if (server.current_client == NULL) return C_ERR;

        /* It was impossible to free enough memory, and the command the client
         * is trying to execute is denied during OOM conditions or the client
         * is in MULTI/EXEC context? Error. */
        if (out_of_memory &&
            (c->cmd->flags & CMD_DENYOOM ||
             (c->flags & CLIENT_MULTI && c->cmd->proc != execCommand))) {
            flagTransaction(c);
            addReply(c, shared.oomerr);
            return C_OK;
        }
    }

    /* ... rest of processCommand omitted ... */
}
Only the part that frees memory is listed here. freeMemoryIfNeededAndSafe is a wrapper around freeMemoryIfNeeded, the function that actually frees memory:
int freeMemoryIfNeeded(void) {
    /* By default replicas should ignore maxmemory
     * and just be masters exact copies. */
    if (server.masterhost && server.repl_slave_ignore_maxmemory) return C_OK;

    size_t mem_reported, mem_tofree, mem_freed;
    mstime_t latency, eviction_latency;
    long long delta;
    int slaves = listLength(server.slaves);

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return C_OK;
    if (getMaxmemoryState(&mem_reported,NULL,&mem_tofree,NULL) == C_OK)
        return C_OK;

    mem_freed = 0;

    if (server.maxmemory_policy == MAXMEMORY_NO_EVICTION)
        goto cant_free; /* We need to free memory, but policy forbids. */

    latencyStartMonitor(latency);
    while (mem_freed < mem_tofree) {
        int j, k, i, keys_freed = 0;
        static unsigned int next_db = 0;
        sds bestkey = NULL;
        int bestdbid;
        redisDb *db;
        dict *dict;
        dictEntry *de;

        if (server.maxmemory_policy & (MAXMEMORY_FLAG_LRU|MAXMEMORY_FLAG_LFU) ||
            server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL)
        {
            struct evictionPoolEntry *pool = EvictionPoolLRU;

            while(bestkey == NULL) {
                unsigned long total_keys = 0, keys;

                /* We don't want to make local-db choices when expiring keys,
                 * so to start populate the eviction pool sampling keys from
                 * every DB. */
                for (i = 0; i < server.dbnum; i++) {
                    db = server.db+i;
                    dict = (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) ?
                            db->dict : db->expires;
                    if ((keys = dictSize(dict)) != 0) {
                        evictionPoolPopulate(i, dict, db->dict, pool);
                        total_keys += keys;
                    }
                }
                if (!total_keys) break; /* No keys to evict. */

                /* Go backward from best to worst element to evict. */
                for (k = EVPOOL_SIZE-1; k >= 0; k--) {
                    if (pool[k].key == NULL) continue;
                    bestdbid = pool[k].dbid;

                    if (server.maxmemory_policy & MAXMEMORY_FLAG_ALLKEYS) {
                        de = dictFind(server.db[pool[k].dbid].dict,
                            pool[k].key);
                    } else {
                        de = dictFind(server.db[pool[k].dbid].expires,
                            pool[k].key);
                    }

                    /* Remove the entry from the pool. */
                    if (pool[k].key != pool[k].cached)
                        sdsfree(pool[k].key);
                    pool[k].key = NULL;
                    pool[k].idle = 0;

                    /* If the key exists, is our pick. Otherwise it is
                     * a ghost and we need to try the next element. */
                    if (de) {
                        bestkey = dictGetKey(de);
                        break;
                    } else {
                        /* Ghost... Iterate again. */
                    }
                }
            }
        }

        /* volatile-random and allkeys-random policy */
        else if (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM ||
                 server.maxmemory_policy == MAXMEMORY_VOLATILE_RANDOM)
        {
            /* When evicting a random key, we try to evict a key for
             * each DB, so we use the static 'next_db' variable to
             * incrementally visit all DBs. */
            for (i = 0; i < server.dbnum; i++) {
                j = (++next_db) % server.dbnum;
                db = server.db+j;
                dict = (server.maxmemory_policy == MAXMEMORY_ALLKEYS_RANDOM) ?
                        db->dict : db->expires;
                if (dictSize(dict) != 0) {
                    de = dictGetRandomKey(dict);
                    bestkey = dictGetKey(de);
                    bestdbid = j;
                    break;
                }
            }
        }

        /* Finally remove the selected key. */
        if (bestkey) {
            db = server.db+bestdbid;
            robj *keyobj = createStringObject(bestkey,sdslen(bestkey));
            propagateExpire(db,keyobj,server.lazyfree_lazy_eviction);
            /* We compute the amount of memory freed by db*Delete() alone.
             * It is possible that actually the memory needed to propagate
             * the DEL in AOF and replication link is greater than the one
             * we are freeing removing the key, but we can't account for
             * that otherwise we would never exit the loop.
             *
             * AOF and Output buffer memory will be freed eventually so
             * we only care about memory used by the key space. */
            delta = (long long) zmalloc_used_memory();
            latencyStartMonitor(eviction_latency);
            if (server.lazyfree_lazy_eviction)
                dbAsyncDelete(db,keyobj);
            else
                dbSyncDelete(db,keyobj);
            latencyEndMonitor(eviction_latency);
            latencyAddSampleIfNeeded("eviction-del",eviction_latency);
            latencyRemoveNestedEvent(latency,eviction_latency);
            delta -= (long long) zmalloc_used_memory();
            mem_freed += delta;
            server.stat_evictedkeys++;
            notifyKeyspaceEvent(NOTIFY_EVICTED, "evicted",
                keyobj, db->id);
            decrRefCount(keyobj);
            keys_freed++;

            /* When the memory to free starts to be big enough, we may
             * start spending so much time here that is impossible to
             * deliver data to the slaves fast enough, so we force the
             * transmission here inside the loop. */
            if (slaves) flushSlavesOutputBuffers();

            /* Normally our stop condition is the ability to release
             * a fixed, pre-computed amount of memory. However when we
             * are deleting objects in another thread, it's better to
             * check, from time to time, if we already reached our target
             * memory, since the "mem_freed" amount is computed only
             * across the dbAsyncDelete() call, while the thread can
             * release the memory all the time. */
            if (server.lazyfree_lazy_eviction && !(keys_freed % 16)) {
                if (getMaxmemoryState(NULL,NULL,NULL,NULL) == C_OK) {
                    /* Let's satisfy our stop condition. */
                    mem_freed = mem_tofree;
                }
            }
        }

        if (!keys_freed) {
            latencyEndMonitor(latency);
            latencyAddSampleIfNeeded("eviction-cycle",latency);
            goto cant_free; /* nothing to free... */
        }
    }
    latencyEndMonitor(latency);
    latencyAddSampleIfNeeded("eviction-cycle",latency);
    return C_OK;

cant_free:
    /* We are here if we are not able to reclaim memory. There is only one
     * last thing we can try: check if the lazyfree thread has jobs in queue
     * and wait... */
    while(bioPendingJobsOfType(BIO_LAZY_FREE)) {
        if (((mem_reported - zmalloc_used_memory()) + mem_freed) >= mem_tofree)
            break;
        usleep(1000);
    }
    return C_ERR;
}
/* This is a wrapper for freeMemoryIfNeeded() that only really calls the
 * function if right now there are the conditions to do so safely:
 *
 * - There must be no script in timeout condition.
 * - Nor we are loading data right now.
 *
 */
int freeMemoryIfNeededAndSafe(void) {
    if (server.lua_timedout || server.loading) return C_OK;
    return freeMemoryIfNeeded();
}
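Stripped of policy selection, latency monitoring and replication, the control flow of freeMemoryIfNeeded is a loop that keeps evicting until enough memory has been freed, and gives up when nothing is left to evict. The following is a minimal sketch of just that stop condition. The function, its parameters and the SKETCH_ constants are hypothetical, and the victims are assumed to arrive in eviction order with known sizes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_OK 0
#define SKETCH_ERR -1

/* victim_sizes holds the bytes freed by each successive victim, in
 * eviction order. The number of evicted keys is stored in *evicted. */
static int free_until_target(const size_t *victim_sizes, size_t nvictims,
                             size_t mem_tofree, size_t *evicted) {
    assert(evicted != NULL);
    size_t mem_freed = 0;
    *evicted = 0;
    while (mem_freed < mem_tofree) {
        /* Nothing left to free: the caller has to report an error. */
        if (*evicted == nvictims) return SKETCH_ERR;
        mem_freed += victim_sizes[*evicted];
        (*evicted)++;
    }
    return SKETCH_OK;
}
```

With victims of 100, 200 and 300 bytes, a target of 250 bytes is reached after two evictions, while a target of 1000 bytes cannot be reached and the sketch returns the error code, mirroring the cant_free path.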
The eviction policies selected by maxmemory_policy are all implemented inside this function.
With an LRU policy, the function iterates over the databases starting from database 0 (there are 16 databases by default). Depending on the policy, it uses either the redisDb's dict (all keys) or its expires (keys with an expiration time) to update the candidate key pool. The pool is updated by evictionPoolPopulate:
void evictionPoolPopulate(int dbid, dict *sampledict, dict *keydict, struct evictionPoolEntry *pool) {
    int j, k, count;
    dictEntry *samples[server.maxmemory_samples];

    count = dictGetSomeKeys(sampledict,samples,server.maxmemory_samples);
    for (j = 0; j < count; j++) {
        unsigned long long idle;
        sds key;
        robj *o;
        dictEntry *de;

        de = samples[j];
        key = dictGetKey(de);

        /* If the dictionary we are sampling from is not the main
         * dictionary (but the expires one) we need to lookup the key
         * again in the key dictionary to obtain the value object. */
        if (server.maxmemory_policy != MAXMEMORY_VOLATILE_TTL) {
            if (sampledict != keydict) de = dictFind(keydict, key);
            o = dictGetVal(de);
        }

        /* Calculate the idle time according to the policy. This is called
         * idle just because the code initially handled LRU, but is in fact
         * just a score where an higher score means better candidate. */
        if (server.maxmemory_policy & MAXMEMORY_FLAG_LRU) {
            idle = estimateObjectIdleTime(o);
        } else if (server.maxmemory_policy & MAXMEMORY_FLAG_LFU) {
            /* When we use an LRU policy, we sort the keys by idle time
             * so that we expire keys starting from greater idle time.
             * However when the policy is an LFU one, we have a frequency
             * estimation, and we want to evict keys with lower frequency
             * first. So inside the pool we put objects using the inverted
             * frequency subtracting the actual frequency to the maximum
             * frequency of 255. */
            idle = 255-LFUDecrAndReturn(o);
        } else if (server.maxmemory_policy == MAXMEMORY_VOLATILE_TTL) {
            /* In this case the sooner the expire the better. */
            idle = ULLONG_MAX - (long)dictGetVal(de);
        } else {
            serverPanic("Unknown eviction policy in evictionPoolPopulate()");
        }

        /* Insert the element inside the pool.
         * First, find the first empty bucket or the first populated
         * bucket that has an idle time smaller than our idle time. */
        k = 0;
        while (k < EVPOOL_SIZE &&
               pool[k].key &&
               pool[k].idle < idle) k++;
        if (k == 0 && pool[EVPOOL_SIZE-1].key != NULL) {
            /* Can't insert if the element is < the worst element we have
             * and there are no empty buckets. */
            continue;
        } else if (k < EVPOOL_SIZE && pool[k].key == NULL) {
            /* Inserting into empty position. No setup needed before insert. */
        } else {
            /* Inserting in the middle. Now k points to the first element
             * greater than the element to insert. */
            if (pool[EVPOOL_SIZE-1].key == NULL) {
                /* Free space on the right? Insert at k shifting
                 * all the elements from k to end to the right. */

                /* Save SDS before overwriting. */
                sds cached = pool[EVPOOL_SIZE-1].cached;
                memmove(pool+k+1,pool+k,
                    sizeof(pool[0])*(EVPOOL_SIZE-k-1));
                pool[k].cached = cached;
            } else {
                /* No free space on right? Insert at k-1 */
                k--;
                /* Shift all elements on the left of k (included) to the
                 * left, so we discard the element with smaller idle time. */
                sds cached = pool[0].cached; /* Save SDS before overwriting. */
                if (pool[0].key != pool[0].cached) sdsfree(pool[0].key);
                memmove(pool,pool+1,sizeof(pool[0])*k);
                pool[k].cached = cached;
            }
        }

        /* Try to reuse the cached SDS string allocated in the pool entry,
         * because allocating and deallocating this object is costly
         * (according to the profiler, not my fantasy. Remember:
         * premature optimizbla bla bla bla. */
        int klen = sdslen(key);
        if (klen > EVPOOL_CACHED_SDS_SIZE) {
            pool[k].key = sdsdup(key);
        } else {
            memcpy(pool[k].cached,key,klen+1);
            sdssetlen(pool[k].cached,klen);
            pool[k].key = pool[k].cached;
        }
        pool[k].idle = idle;
        pool[k].dbid = dbid;
    }
}
Redis randomly samples maxmemory_samples keys and computes the idle time of each. A key can enter the pool when it meets the condition: the pool has an empty slot, or the key's idle time is larger than that of some key already in the pool. After the pool has been updated, the key with the largest idle time in the pool is evicted.
estimateObjectIdleTime computes the idle time of a Redis object:
/* Given an object returns the min number of milliseconds the object was never
 * requested, using an approximated LRU algorithm. */
unsigned long long estimateObjectIdleTime(robj *o) {
    unsigned long long lruclock = LRU_CLOCK();
    if (lruclock >= o->lru) {
        return (lruclock - o->lru) * LRU_CLOCK_RESOLUTION;
    } else {
        return (lruclock + (LRU_CLOCK_MAX - o->lru)) *
                    LRU_CLOCK_RESOLUTION;
    }
}
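The idle time is essentially the difference between the global LRU_CLOCK() and the object's lru field, multiplied by the resolution LRU_CLOCK_RESOLUTION, which converts seconds to milliseconds. The else branch handles the case where the 24-bit clock has wrapped around since the object was last accessed.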
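To see the wraparound branch in action, the same calculation can be run on plain integers. This is a standalone sketch that mirrors the function above; the SKETCH_ constants and idle_ms are hypothetical names, with the clock and the object's lru value passed in as parameters instead of being read from the server.

```c
#include <assert.h>

#define SKETCH_LRU_BITS 24
#define SKETCH_LRU_MAX ((1ULL << SKETCH_LRU_BITS) - 1)
#define SKETCH_RESOLUTION_MS 1000ULL

/* Idle time in milliseconds for a given clock value and object lru value. */
static unsigned long long idle_ms(unsigned long long clock,
                                  unsigned long long obj_lru) {
    assert(clock <= SKETCH_LRU_MAX && obj_lru <= SKETCH_LRU_MAX);
    if (clock >= obj_lru)
        return (clock - obj_lru) * SKETCH_RESOLUTION_MS;
    /* The clock wrapped around after the object was last touched. */
    return (clock + (SKETCH_LRU_MAX - obj_lru)) * SKETCH_RESOLUTION_MS;
}
```

A clock of 100 and an lru of 90 give 10000 ms. A clock of 5 and an lru four ticks below the maximum give 9000 ms, because the clock has wrapped in between.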