Exploring the Redis scan and keys commands

Foreword

Redis has a classic question: when the amount of data is huge, how do you find all keys whose names match a certain pattern? There are two ways:

1. The keys command

Simple and crude, but expensive at scale. Because Redis is single-threaded, the keys command executes in a blocking way. keys works by full traversal with O(n) complexity: the more keys the Redis database holds, the longer the lookup takes and the longer the blocking lasts.

keys * queries all keys, and keys codehole* queries keys with the prefix codehole. It is a blunt instrument with poor performance, since it searches the entire Redis keyspace.

Disadvantages:

1. There are no offset or limit parameters; every key satisfying the condition is returned at once. If an instance has, say, millions of matching keys, the output scrolls endlessly across the screen with no end in sight, which is painful.

2. keys is a traversal algorithm with O(n) complexity. If the instance holds more than ten million keys, this command makes the Redis service stall: all other read and write commands are delayed or even time out, because Redis is a single-threaded program that executes commands sequentially, so other commands must wait until the keys command finishes.

2. The scan command

scan is a non-blocking way to find keys. In most cases it can replace the keys command, and its options are more powerful.

Features of the scan command:

1. Although the overall complexity is also O(n), the traversal proceeds in steps through a cursor, so the server thread is not blocked;

2. It provides a count parameter that controls the maximum amount of work done per call; count is only a hint, and the number of results actually returned may be more or fewer;

3. Like keys, it provides pattern matching (match);

4. The server does not need to save any state for the cursor; the only cursor state is the integer cursor that scan returns to the client;

5. Returned results may contain duplicates, and the client must deduplicate them; this is very important.
Under normal circumstances scan produces no duplicates, but a rehash during iteration can cause keys to be re-read. For example, with four buckets, suppose a rehash to eight buckets happens after buckets 0 and 1 have been read: the elements of bucket 0 are split between buckets 0 and 4, and those of bucket 1 between buckets 1 and 5, which would cause re-reads under a plain sequential order. Redis therefore traverses by incrementing the high bit of the cursor, as in the tables below: once 00 and 10 (0 and 2) have been visited in the small table, their expanded buckets 000, 100, 010, 110 (0, 4, 2, 6) need not be visited again. When the table shrinks, however, some keys may still be traversed repeatedly.

2-bit cursors (4 buckets):

00  0
10  2
01  1
11  3

3-bit cursors (8 buckets):

000  0
100  4
010  2
110  6
001  1
101  5
011  3
111  7

6. If the data is modified during the traversal, it is undefined whether the modified data will be visited;

7. A single empty result does not mean the traversal has ended; the traversal is complete only when the returned cursor value is zero (see the sketch after this list).
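
The following is a minimal sketch of a complete scan loop, assuming the redis-py Python client (the pattern and count values are illustrative, borrowed from the example later in this article). It shows both termination on cursor zero and client-side deduplication:

import redis

r = redis.Redis(host="localhost", port=6379)

cursor = 0
seen = set()
while True:
    # Each call passes the cursor returned by the previous call.
    cursor, keys = r.scan(cursor=cursor, match="key1111*", count=20)
    seen.update(keys)   # deduplicate: scan may return a key more than once
    if cursor == 0:     # only a zero cursor ends the traversal,
        break           # an empty batch in the middle does not

print(len(seen), "matching keys")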

3. Specific usage of the keys and scan commands

For example, how do we query the keys that begin with key1111?

With the keys command, execute keys key1111* and everything is returned at once.
(screenshot: keys key1111* lists all matching keys at once)

Similarly, with the scan command, use scan 0 match key1111* count 20.

(screenshot: scan 0 match key1111* count 20 returns a new cursor and an empty result list)
The scan syntax is: SCAN cursor [MATCH pattern] [COUNT count]. The default COUNT value is 10.

The SCAN command is a cursor-based iterator. This means that each invocation must use the cursor returned by the previous invocation as its cursor argument in order to continue the iteration.

Here we used the scan 0 match key1111* count 20 command for this query. Somewhat surprisingly, the first call returned no results at all, which is best explained by looking at how the scan command works.

When scan traverses keys, 0 means this is the first iteration; match key1111* means matching keys that begin with key1111; and count 20 does not mean returning 20 qualifying keys, but limits (approximately) the number of dictionary slots the server traverses in a single call.

So what is a dictionary slot? Is it the same as a slot in Redis Cluster? The answer is no. In fact, the scan output already gives the answer.

If the number of "dictionary slots" were the number of cluster slots, then, knowing that a cluster has 16384 slots, traversing 16384 slots would be guaranteed to cover every key. But as the output clearly shows, even after traversing 20,000 dictionary slots the cursor had still not completed the traversal, so a dictionary slot is not the same concept as a cluster slot.

Testing shows that the COUNT value needed for a scan to match every qualifying key in one pass depends on the number of keys in the instance. If you scan with a count larger than the number of keys, you will necessarily find all qualifying keys in a single call; for example, with 100,000 keys, traversing 200,000 dictionary slots in one pass is guaranteed to enumerate the complete result.

(screenshot: scanning with a count larger than the total number of keys returns all matching keys in one call, with cursor 0)
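
A sketch of such a test, assuming the redis-py client (key names and sizes are illustrative, not taken from the original screenshot):

import redis

r = redis.Redis()

# Populate 100,000 keys; the pipeline just batches the inserts.
with r.pipeline(transaction=False) as p:
    for i in range(100_000):
        p.set(f"key1111:{i}", i)
    p.execute()

# With count larger than the total number of keys, a single call should
# cover every dictionary slot and return all matches at once.
cursor, keys = r.scan(cursor=0, match="key1111:*", count=200_000)
print(cursor, len(keys))   # cursor 0 means the traversal finished in one pass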

4. Exploring the underlying source code of scan

The structure of Redis

Redis uses a hash table as the underlying implementation, for no reason other than efficiency and simplicity. Speaking of hash tables, many Java programmers think first of HashMap. Indeed, the underlying storage structure for Redis keys is an array-plus-linked-list structure much like HashMap's, where the first-dimension array has a size of 2^n (n >= 0) and each expansion doubles the array length.

The scan command traverses this first-dimension array. The cursor value returned each time is an index into this array. The count parameter states how many array elements to traverse, and all qualifying results chained under those elements are returned. Because the linked list hanging off each element has a different length, the number of results returned by each call also differs.
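
As a mental model of this layout, here is a toy sketch in Python (hypothetical names; the real implementation is the C dict in dict.c):

class Entry:
    def __init__(self, key, value, next=None):
        self.key, self.value, self.next = key, value, next

class ToyDict:
    def __init__(self, size=4):
        self.table = [None] * size           # size is always a power of two

    def _index(self, key):
        return hash(key) & (len(self.table) - 1)   # sizemask = size - 1

    def insert(self, key, value):
        i = self._index(key)
        self.table[i] = Entry(key, value, self.table[i])  # chain at the head

    def scan_step(self, cursor):
        # One scan step: collect every entry chained under bucket `cursor`.
        results, e = [], self.table[cursor]
        while e:
            results.append((e.key, e.value))
            e = e.next
        return results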

The traversal order of SCAN

To see the traversal order of the scan command concretely, let's walk through a small example.

127.0.0.1:6379> keys *
1) "db_number"
2) "key1"
3) "myKey"
127.0.0.1:6379> scan 0 MATCH * COUNT 1
1) "2"
2) 1) "db_number"
127.0.0.1:6379> scan 2 MATCH * COUNT 1
1) "1"
2) 1) "myKey"
127.0.0.1:6379> scan 1 MATCH * COUNT 1
1) "3"
2) 1) "key1"
127.0.0.1:6379> scan 3 MATCH * COUNT 1
1) "0"
2) (empty list or set)

Our Redis instance has 3 keys, and each call traverses only one element of the first-dimension array. As shown above, the SCAN traversal order is

0->2->1->3

This order looks a little strange. It becomes easier to understand once we convert it to binary.

00->10->01->11

Notice that each step in this sequence adds 1 at the high bit. Ordinary binary addition adds and carries from right to left; this sequence adds and carries from left to right. The Redis source code confirms this.

In the dictScan function in dict.c, the cursor is handled as follows:

v = rev(v);  /* reverse the bits of the cursor */
v++;         /* ordinary increment */
v = rev(v);  /* reverse back: the net effect is adding 1 at the high bit */

That is: reverse the cursor's bits, add one, then reverse again, which is exactly the "add 1 at the high bit" operation described above.
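
To make the transform concrete, here is a small Python sketch (not the Redis source; rev here is parameterized by table size rather than operating on a fixed machine word) that reproduces the traversal order for 4- and 8-bucket tables:

def rev(v, bits):
    # Reverse the lowest `bits` bits of v.
    out = 0
    for _ in range(bits):
        out = (out << 1) | (v & 1)
        v >>= 1
    return out

def scan_order(bits):
    # Yield cursors in SCAN order for a table of 2**bits buckets.
    v = 0
    while True:
        yield v
        v = rev(rev(v, bits) + 1, bits)   # reverse, increment, reverse back
        if v == 0:                        # wrapped around: traversal complete
            break

print(list(scan_order(2)))   # [0, 2, 1, 3]
print(list(scan_order(3)))   # [0, 4, 2, 6, 1, 5, 3, 7]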

You may wonder why this order is used instead of the ordinary 0, 1, 2, ... order. The reason is that the traversal must handle the dictionary growing or shrinking while it is in progress (one has to admire how thoroughly the developers considered the problem).

Let's look at how the traversal proceeds when the dictionary grows during a SCAN. Suppose the original array has 4 elements, i.e. the index has two bits; it now needs to be expanded to 3 bits and rehashed.

(figure: rehash of the 4-bucket table into 8 buckets)

All elements originally chained under bucket xx are redistributed to 0xx and 1xx. In the figure above, when we are about to traverse 10, the dict rehashes; scan then continues the traversal from 010, and 000 and 100 (the elements originally chained under 00) are not traversed again.

Now consider shrinking. Suppose the dict shrinks from 3 bits to 2 bits. When we are about to traverse 110, the shrink happens, and scan traverses 10 instead. The elements chained under 010 are then traversed a second time, but nothing before 010 is revisited. So shrinking can still produce some duplicate elements.

Summary: scan traverses using high-bit-carry addition, carrying from the high bit toward the low bit, precisely so that slots are neither repeated nor missed when the dictionary grows or shrinks.
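
The grow-safety can be checked mechanically: in the 8-bucket order, the two children of each 4-bucket slot b (namely b and b+4) appear as an adjacent pair, in the same relative position as b. A tiny check of this property:

# SCAN orders for 4 and 8 buckets, as derived above.
order2 = [0, 2, 1, 3]
order3 = [0, 4, 2, 6, 1, 5, 3, 7]

for i, b in enumerate(order2):
    # slot b in the small table expands to buckets b and b+4
    assert order3[2*i : 2*i + 2] == [b, b + 4]
print("every small-table slot expands to an adjacent pair: no re-reads on grow")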

Redis rehash

Rehashing is a relatively complex process. To avoid blocking the Redis process, it uses an incremental rehash mechanism.

/* dictionary */
typedef struct dict {
    // type-specific functions
    dictType *type;
    // private data
    void *privdata;
    // hash tables (one old, one new)
    dictht ht[2];
    // rehash index
    // -1 when no rehash is in progress
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */
    // number of safe iterators currently running
    int iterators; /* number of iterators currently running */
} dict;

In Redis's dict structure there are two hash tables, one new and one old. During rehash, Redis migrates elements from the old table to the new one step by step. Let's look at the source code of the dict rehash operation.

/* Performs N steps of incremental rehashing. Returns 1 if there are still
 * keys to move from the old to the new hash table, otherwise 0 is returned.
 *
 * Note that a rehashing step consists in moving a bucket (that may have more
 * than one key as we use chaining) from the old to the new hash table, however
 * since part of the hash table may be composed of empty spaces, it is not
 * guaranteed that this function will rehash even a single bucket, since it
 * will visit at max N*10 empty buckets in total, otherwise the amount of
 * work it does would be unbound and the function may block for a long time. */
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht[0].used != 0) {
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(d->ht[0].size > (unsigned long)d->rehashidx);
        while(d->ht[0].table[d->rehashidx] == NULL) {
            d->rehashidx++;
            if (--empty_visits == 0) return 1;
        }
        de = d->ht[0].table[d->rehashidx];
        /* Move all the keys in this bucket from the old to the new hash HT */
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & d->ht[1].sizemask;
            de->next = d->ht[1].table[h];
            d->ht[1].table[h] = de;
            d->ht[0].used--;
            d->ht[1].used++;
            de = nextde;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check if we already rehashed the whole table... */
    if (d->ht[0].used == 0) {
        zfree(d->ht[0].table);
        d->ht[0] = d->ht[1];
        _dictReset(&d->ht[1]);
        d->rehashidx = -1;
        return 0;
    }

    /* More to rehash... */
    return 1;
}

From the comments we can see that rehashing migrates one bucket at a time; a bucket is just one element of the first-dimension array mentioned earlier, i.e. one linked list is migrated per step. Let's walk through this code.

  • First, check whether a rehash is in progress; if so, continue; otherwise return immediately.
  • Then perform the incremental rehash in n steps, also checking that elements remain in the old table, for safety.
  • Before rehashing a bucket, assert that the bucket to migrate is within bounds.
  • Then skip empty buckets. The empty_visits variable caps how many empty buckets may be visited, mainly to avoid blocking Redis for too long.
  • Next comes the migration itself: every element in the current bucket is rehashed into the new table, and the element counts of both tables are updated.
  • After a bucket has been migrated, the old table's bucket pointer is set to NULL.
  • Finally, check whether the whole migration is complete; if so, free the old table's space and reset the rehash index; otherwise tell the caller there is still data to migrate.

Because Redis uses this incremental rehash mechanism, the scan command must scan both the old table and the new table and return the combined results to the client.
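
Conceptually (a sketch, not the real dictScan logic), one scan step during a rehash must visit the cursor's bucket in the small table plus every bucket in the large table whose index has the cursor as its low-bit suffix:

def buckets_to_visit(v, small_size, large_size):
    # One SCAN step while a rehash between small_size and large_size buckets
    # is in progress: visit small-table bucket v, plus each large-table
    # bucket whose low bits equal v.
    yield ("small", v)
    for high in range(large_size // small_size):
        yield ("large", v | (high * small_size))

print(list(buckets_to_visit(1, 4, 8)))
# [('small', 1), ('large', 1), ('large', 5)]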

Summary: when Redis grows the dictionary, it moves the old data into a new array via incremental rehash, keeping both the old and the new array during the process.

