1. Redis rehash problem review
A brief review: once the number of keys exceeds a threshold, the rehash needs additional memory for the new hash table:
Number of keys | Additional hash table memory required |
---|---|
134,217,728 | 2GB |
67,108,864 | 1GB |
33,554,432 | 512.0MB |
16,777,216 | 256.0MB |
8,388,608 | 128.0MB |
4,194,304 | 64.0MB |
2,097,152 | 32.0MB |
1,048,576 | 16.0MB |
524,288 | 8.0MB |
As the number of keys grows, so does the additional memory required for the rehash. This raises both the availability risk (mass eviction, Redis synchronization) and the data loss risk (eviction).
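The figures in the table can be reproduced with a little arithmetic. The sketch below assumes a 64-bit build (8-byte bucket pointers) and that the table grows to the next power of two that fits one more key; the helper names are illustrative and are not Redis functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 64-bit build, so each hash bucket is one 8-byte pointer. */
#define BUCKET_BYTES 8ULL

/* Smallest power of two that is >= n (hypothetical helper). */
uint64_t next_power_of_two(uint64_t n) {
    uint64_t size = 1;
    while (size < n) size <<= 1;
    return size;
}

/* Bytes needed for the new bucket array when a table that already holds
 * `keys` entries has to grow to fit one more entry. */
uint64_t rehash_extra_bytes(uint64_t keys) {
    assert(keys < (1ULL << 60)); /* keep the arithmetic far from overflow */
    return next_power_of_two(keys + 1) * BUCKET_BYTES;
}
```

For example, rehash_extra_bytes(67108864) evaluates to 1,073,741,824 bytes (1GB), which matches the second row of the table.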
2. Rehash optimizations in Redis 6.2 and 7+
Rehash behavior has been optimized since Redis 6.2. From the release notes:
Limit the main db dictionaries expansion to prevent key eviction (#7954)
In the past big dictionary rehashing could result in massive data eviction. Now this rehashing is delayed (up to a limit), which can result in performance loss due to hash collisions.
In short: when the hash table is large (many keys), the rehash is delayed to prevent mass data loss and availability problems.
3. Experiment
1. Version selection
- Redis 6.0.15
- Redis 7.0.11
2. Experimental conditions
- maxmemory = 5.6GB
- First load: 67,100,000 keys, then observe memory usage
- Second load: 2,000,000 more keys, then watch availability and eviction
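The post does not show how the keys were loaded. One possible way, sketched here as an assumption rather than the author's method, is to generate SET commands in the Redis protocol (RESP) and feed them to redis-cli --pipe. The key pattern key:<n>, the one-byte value, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write one SET command in RESP format into buf.
 * Returns the number of bytes written, or -1 if buf is too small. */
int format_set_command(char *buf, size_t buflen,
                       const char *key, const char *val) {
    int n = snprintf(buf, buflen,
                     "*3\r\n$3\r\nSET\r\n$%zu\r\n%s\r\n$%zu\r\n%s\r\n",
                     strlen(key), key, strlen(val), val);
    if (n < 0 || (size_t)n >= buflen) return -1;
    return n;
}

/* Emit `count` SET commands for keys "key:<start>", "key:<start+1>", ...
 * to `out`. Key names and the value "v" are placeholders. */
int emit_keys(FILE *out, unsigned long start, unsigned long count) {
    char key[64], cmd[256];
    assert(out != NULL);
    for (unsigned long i = start; i < start + count; i++) {
        snprintf(key, sizeof(key), "key:%lu", i);
        int n = format_set_command(cmd, sizeof(cmd), key, "v");
        if (n < 0) return -1;
        if (fwrite(cmd, 1, (size_t)n, out) != (size_t)n) return -1;
    }
    return 0;
}
```

A small main that calls emit_keys(stdout, 0, 67100000) could be piped into redis-cli --pipe for the first load, and emit_keys(stdout, 67100000, 2000000) for the second. The value size would need adjusting to reproduce the memory figures reported here.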
3. Start the experiment
(1) Redis 6.0.15
First load: 67,100,000 keys, then observe memory usage:
maxmemory: 5.6GB
used_memory_human: 5.5GB
dbsize: 67,100,000
Second load: 2,000,000 more keys, then watch availability and eviction:
- Client requests timed out, with a stall of more than 30 seconds (max 36,511 ms), as shown by redis-cli --latency-history:
min: 0, max: 1, avg: 0.06 (1485 samples) -- 15.01 seconds range
min: 0, max: 36511, avg: 45.63 (801 samples) -- 44.61 seconds range
- A large number of keys were evicted, more than 5 million:
evicted_keys:5,909,376
- Peak memory: about 1GB more, used by the rehash:
used_memory_peak_human:6.50G
(2) Redis 7.0.11
First load: 67,100,000 keys, then observe memory usage:
maxmemory: 5.6GB
used_memory_human: 5.5GB
dbsize: 67,100,000
Second load: 2,000,000 more keys, then watch availability and eviction:
- No client timeouts, as shown by redis-cli --latency-history:
min: 0, max: 2, avg: 0.05 (1484 samples) -- 15.01 seconds range
min: 0, max: 3, avg: 0.24 (1454 samples) -- 15.00 seconds range
- Normal, gradual eviction (only 0.1GB of headroom was left below maxmemory):
evicted_keys:152485
evicted_keys:443253
evicted_keys:751165
evicted_keys:1058191
evicted_keys:1367445
evicted_keys:1662485
evicted_keys:1662485
- Peak memory: no rehash took place:
used_memory_peak_human:5.50G
4. Experimental comparison
Version | Rehash occurred | Severe timeout |
---|---|---|
6.0.15 | Yes | Yes, unavailable for more than 30 seconds |
7.0.11 | No | No, normal eviction |
4. Code analysis
1. dict:
The newly added expandAllowed callback decides whether the dict is currently allowed to expand (and therefore to rehash):
typedef struct dictType {
    ....
    int (*expandAllowed)(size_t moreMem, double usedRatio);
    ....
} dictType;
When the dict is about to expand, _dictExpandIfNeeded now performs an extra check, dictTypeExpandAllowed:
/* Expand the hash table if needed */
static int _dictExpandIfNeeded(dict *d)
{
    ......
    if (!dictTypeExpandAllowed(d))
        return DICT_OK;
    ......
}
If the dict type's expandAllowed is NULL, expansion is always allowed. Otherwise the callback is invoked with the memory the new table would need and the current load ratio:
static int dictTypeExpandAllowed(dict *d) {
    if (d->type->expandAllowed == NULL) return 1;
    return d->type->expandAllowed(
                DICTHT_SIZE(_dictNextExp(d->ht_used[0] + 1)) * sizeof(dictEntry*),
                (double)d->ht_used[0] / DICTHT_SIZE(d->ht_size_exp[0]));
}
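The callback pattern above can be tried in isolation. The following is a simplified stand-in, not the Redis source: toyDict, toyDictType, toyNextSize, and denyLargeTables are hypothetical names, and the initial table size of 4 is an assumption. Only the NULL-means-allowed rule and the two arguments (bytes for the new bucket array, current load ratio) are taken from the code above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for dictType and dict (hypothetical). */
typedef struct toyDictType {
    int (*expandAllowed)(size_t moreMem, double usedRatio);
} toyDictType;

typedef struct toyDict {
    toyDictType *type;
    unsigned long used;   /* number of stored entries */
    unsigned long size;   /* number of buckets, a power of two */
} toyDict;

/* Smallest power of two >= n, starting from an assumed initial size of 4. */
unsigned long toyNextSize(unsigned long n) {
    unsigned long size = 4;
    while (size < n) size <<= 1;
    return size;
}

/* Same shape as dictTypeExpandAllowed: no callback means "always allowed". */
int toyExpandAllowed(const toyDict *d) {
    assert(d->size > 0);
    if (d->type->expandAllowed == NULL) return 1;
    return d->type->expandAllowed(
        toyNextSize(d->used + 1) * sizeof(void *),
        (double)d->used / (double)d->size);
}

/* Example callback: refuse any expansion that needs more than 1 KiB. */
int denyLargeTables(size_t moreMem, double usedRatio) {
    (void)usedRatio;
    return moreMem <= 1024;
}
```

A dict type that leaves the callback as NULL keeps the old behavior, which is why only the types that opt in are affected by the new check.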
2. Redis dict types
Both dbDictType and dbExpiresDictType in Redis set dictExpandAllowed as their expandAllowed callback:
/* Db->dict, keys are sds strings, vals are Redis objects. */
dictType dbDictType = {
    dictSdsHash,                /* hash function */
    NULL,                       /* key dup */
    NULL,                       /* val dup */
    dictSdsKeyCompare,          /* key compare */
    dictSdsDestructor,          /* key destructor */
    dictObjectDestructor,       /* val destructor */
    dictExpandAllowed,          /* allow to expand */
    dictEntryMetadataSize       /* size of entry metadata in bytes */
};
/* Db->expires */
dictType dbExpiresDictType = {
    dictSdsHash,                /* hash function */
    NULL,                       /* key dup */
    NULL,                       /* val dup */
    dictSdsKeyCompare,          /* key compare */
    NULL,                       /* key destructor */
    NULL,                       /* val destructor */
    dictExpandAllowed           /* allow to expand */
};
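From dictExpandAllowed and overMaxmemoryAfterAlloc, it can be seen that:
- If the dict's load ratio is greater than the maximum load factor (HASHTABLE_MAX_LOAD_FACTOR), the expansion and rehash always go ahead.
- If the load ratio is less than or equal to the maximum load factor, the expansion is allowed only when one of the following holds:
  - maxmemory is 0 (no limit).
  - Currently used memory plus the memory needed for the rehash does not exceed maxmemory.
  - Object memory (used memory minus the overhead that is not counted toward maxmemory, such as various buffers) plus the memory needed for the rehash does not exceed maxmemory.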
int dictExpandAllowed(size_t moreMem, double usedRatio) {
    if (usedRatio <= HASHTABLE_MAX_LOAD_FACTOR) {
        return !overMaxmemoryAfterAlloc(moreMem);
    } else {
        return 1;
    }
}
/* Return 1 if used memory is more than maxmemory after allocating more memory,
 * return 0 if not. Redis may reject user's requests or evict some keys if used
 * memory exceeds maxmemory, especially, when we allocate huge memory at once. */
int overMaxmemoryAfterAlloc(size_t moremem) {
    if (!server.maxmemory) return 0; /* No limit. */
    /* Check quickly. */
    size_t mem_used = zmalloc_used_memory();
    if (mem_used + moremem <= server.maxmemory) return 0;
    size_t overhead = freeMemoryGetNotCountedMemory();
    mem_used = (mem_used > overhead) ? mem_used - overhead : 0;
    return mem_used + moremem > server.maxmemory;
}
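To see how this decision plays out with the numbers from the experiment, the logic can be replayed without a running server. This is a sketch under stated assumptions: memState and the sim* functions are hypothetical stand-ins that take memory figures as plain parameters, and the value 1.618 for HASHTABLE_MAX_LOAD_FACTOR is assumed from Redis's server.h and should be checked against the version in use.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of HASHTABLE_MAX_LOAD_FACTOR; verify in your server.h. */
#define SIM_MAX_LOAD_FACTOR 1.618

typedef struct memState {
    uint64_t maxmemory;  /* 0 means no limit */
    uint64_t used;       /* stands in for zmalloc_used_memory() */
    uint64_t overhead;   /* stands in for freeMemoryGetNotCountedMemory() */
} memState;

/* Same checks as overMaxmemoryAfterAlloc, on explicit inputs. */
int simOverMaxmemoryAfterAlloc(const memState *m, uint64_t moreMem) {
    if (m->maxmemory == 0) return 0;                    /* no limit */
    if (m->used + moreMem <= m->maxmemory) return 0;    /* quick check */
    uint64_t counted = (m->used > m->overhead) ? m->used - m->overhead : 0;
    return counted + moreMem > m->maxmemory;
}

/* Same branches as dictExpandAllowed, on explicit inputs. */
int simExpandAllowed(const memState *m, uint64_t moreMem, double usedRatio) {
    assert(usedRatio >= 0.0);
    if (usedRatio <= SIM_MAX_LOAD_FACTOR)
        return !simOverMaxmemoryAfterAlloc(m, moreMem);
    return 1;
}
```

With maxmemory at about 5.6GB, 5.5GB in use, and 1GB needed for the new table, simExpandAllowed returns 0 at a load ratio of 1.0, which is consistent with the 7.0.11 run showing no rehash. Once the ratio passes the load factor it returns 1 regardless of memory.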
5. Conclusion
- Redis 6.2+ effectively solves the availability and eviction problems caused by rehashing a large dict.
- Redis 7 is by now relatively stable (it was released in April 2022), and 7.0.11 is widely used in production.
- The new version only delays the rehash; it does not stop rehashing altogether. The performance impact of a higher load factor (more hash collisions) still needs attention.
- Redis 6.2+ also optimizes eviction itself (incremental eviction processing), which will be covered later.
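To put a number on the "delayed, not stopped" point: the delay ends once the load ratio exceeds the maximum load factor. The sketch below computes the entry count at which that happens for a given bucket count. It assumes the factor is 1.618 (taken from Redis's server.h, worth verifying for the version in use), and forcedExpandThreshold is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed value of HASHTABLE_MAX_LOAD_FACTOR; verify in your server.h. */
#define SIM_MAX_LOAD_FACTOR 1.618

/* Smallest entry count whose load ratio (used / buckets) exceeds the
 * factor, i.e. the point where expansion is no longer held back. */
uint64_t forcedExpandThreshold(uint64_t buckets) {
    assert(buckets > 0 && buckets < (1ULL << 52));
    /* Start at or just below the boundary, then step up past it. */
    uint64_t used = (uint64_t)(SIM_MAX_LOAD_FACTOR * (double)buckets);
    while ((double)used / (double)buckets <= SIM_MAX_LOAD_FACTOR) used++;
    return used;
}
```

For the 67,108,864-bucket table that the experiment's key count implies, this gives 108,582,142 entries. Up to that point the table keeps absorbing keys with longer collision chains, which is the performance cost the release notes mention.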