Redis high-performance data structures: the sorted set (zset)

Background

We have already covered two data structures. Today we look at the most distinctive data structure in Redis: the zset (sorted set).

ZSET

Introduction

A zset is an ordered collection of unique members, similar to a combination of SortedSet and HashMap in Java. In Redis it is implemented with two underlying data structures: one is the ziplist (compressed list), and the other is the skiplist, one of the most classic data structures in Redis.

Choice of underlying data structure

Choosing the encoding on first insert

  1. When the ZADD command adds the first element to an empty key, the program decides which encoding to create for the sorted set by checking that first element.
  2. A ziplist is created if both of the following conditions are met:
    • The server property server.zset_max_ziplist_entries is greater than 0
    • The member length of the element does not exceed the server property server.zset_max_ziplist_value (default 64)
  3. Otherwise, the skiplist encoding is used.
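The decision above can be sketched in a few lines of C. This is only an illustration: the zset_limits struct, the zset_encoding enum and choose_initial_encoding are hypothetical names, not the Redis API, and the limits are passed in rather than read from the server.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; Redis does not expose these. */
typedef enum { ENC_ZIPLIST, ENC_SKIPLIST } zset_encoding;

typedef struct {
    size_t max_ziplist_entries; /* stands in for server.zset_max_ziplist_entries */
    size_t max_ziplist_value;   /* stands in for server.zset_max_ziplist_value */
} zset_limits;

/* Pick the encoding for a brand-new sorted set from its first member. */
zset_encoding choose_initial_encoding(const zset_limits *cfg,
                                      size_t first_member_len) {
    assert(cfg != NULL);
    /* ziplist only if it is enabled and the first member is short enough */
    if (cfg->max_ziplist_entries > 0 &&
        first_member_len <= cfg->max_ziplist_value)
        return ENC_ZIPLIST;
    return ENC_SKIPLIST;
}
```

With the defaults quoted above (128 and 64), a 10-byte first member gives a ziplist, while a 65-byte one starts directly as a skiplist.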

Later encoding conversion

  1. A zset that starts out as a ziplist is converted to a skiplist in either of the following two cases:
    • The number of elements saved in the ziplist exceeds server.zset_max_ziplist_entries (default 128)
    • The member length of a newly added element is greater than server.zset_max_ziplist_value (default 64)
  2. Why is the conversion needed? We covered this when summarizing the Hash object. A ziplist is one contiguous block of memory with no reserved space. Its advantage is that it saves space, but having to grow the block on every insertion is one of the main things that hurts its performance. Next, let's look at how the skiplist solves these problems.
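To make the cost of "contiguous, no reserved space" concrete, here is a minimal sketch of a sorted, tightly packed array. It is not the ziplist format itself; packed_scores and packed_insert are hypothetical names, and only scores are stored. Every insert has to grow the block by one slot and shift everything after the insertion point.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packed array: exactly `len` doubles, no spare capacity. */
typedef struct {
    double *items;
    size_t len;
} packed_scores;

/* Insert `score` keeping ascending order.
 * Returns how many elements had to be shifted (the O(n) part). */
size_t packed_insert(packed_scores *p, double score) {
    assert(p != NULL);
    /* Grow by exactly one slot; realloc may move the whole block. */
    double *grown = realloc(p->items, (p->len + 1) * sizeof(double));
    assert(grown != NULL);
    p->items = grown;

    /* Linear scan for the insertion point. */
    size_t pos = 0;
    while (pos < p->len && p->items[pos] <= score) pos++;

    /* Shift the tail one slot to the right to make room. */
    size_t shifted = p->len - pos;
    memmove(&p->items[pos + 1], &p->items[pos], shifted * sizeof(double));
    p->items[pos] = score;
    p->len++;
    return shifted;
}
```

For a handful of small elements this is cheap and very compact, which is why it is used below the thresholds. As the collection grows, the reallocation and shifting dominate.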

SKIPLIST (skip list)

Introduction

  1. A Redis zset is a composite structure. On one hand it needs a hash structure to store the mapping between value and score. On the other hand it must be able to sort by score and to return the list of values within a given score range. That second part is the job of the skip list.
  2. The skip list is a randomized data structure, proposed by William Pugh in the paper "Skip lists: a probabilistic alternative to balanced trees". It stores elements in order in a hierarchy of linked lists, with efficiency comparable to a balanced tree: search, delete and insert can all be completed in logarithmic expected time, and the implementation is much simpler and more intuitive than a balanced tree.

Basic data structure

[Figure: schematic diagram of a skip list]

  1. The figure above is a schematic diagram of a skip list with only four levels drawn. The Redis skip list has up to 64 levels, which means it can hold up to 2^64 elements. Each kv block corresponds to the zskiplistNode structure in the code below. The kv header uses the same structure, but its value field is null (invalid) and its score is Double.MIN_VALUE, so that it sits at the very bottom of the ordering. The kv nodes are strung together with pointers into a doubly linked list, ordered from smallest to largest. Different kv nodes may have different heights, and the higher the level, the fewer kv nodes it contains. The kv nodes on the same level are linked by pointers, and the traversal of every level starts from the kv header.
  2. A clearer skip list implementation article: https://lotabout.me/2018/skip-list/
  3. Now that we understand how the skip list works, let's compare it with the ziplist. The skip list certainly consumes more memory than the ziplist, but its query efficiency is very high. It also does not need contiguous memory, so it is friendlier to a fragmented heap. If 120KB of non-contiguous memory remains, a linked structure can still use it, whereas a structure like the ziplist cannot, because it needs 120KB of contiguous space.

Source code

typedef struct zskiplist {

    // head node, tail node
    struct zskiplistNode *header, *tail;

    // number of nodes
    unsigned long length;

    // current maximum level among the nodes in the list
    int level;

} zskiplist;

typedef struct zskiplistNode {

    // member object
    robj *obj;

    // score
    double score;

    // backward pointer
    struct zskiplistNode *backward;

    // levels
    struct zskiplistLevel {

        // forward pointer
        struct zskiplistNode *forward;

        // number of nodes this level's forward pointer spans
        unsigned int span;

    } level[];

} zskiplistNode;
Find
  • Imagine the skip list had only one level. Insert and delete operations must first locate the right node (the last element smaller than the target, which is also the one just before the first element larger than it). Locating would then be inefficient: the complexity would be O(n), because nodes have to be traversed one by one. You may think of binary search, but binary search only works on an ordered array. Once the skip list has a multi-level structure, the complexity of locating drops to O(lg(n)).
  • Read this article to quickly understand: https://lotabout.me/2018/skip-list/
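A small sketch of the multi-level search, under simplifying assumptions: the list is built once with fixed, hand-picked heights instead of random ones, nodes carry only a score, and all names (node, skiplist, sl_build, sl_find) are hypothetical rather than taken from Redis. The steps counter records how many forward moves the search makes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVEL 4
#define NODE_COUNT 8

typedef struct node {
    double score;
    struct node *forward[MAX_LEVEL]; /* forward[i] = next node on level i */
} node;

typedef struct {
    node header;            /* sentinel, holds no element */
    node nodes[NODE_COUNT];
    int level;              /* number of levels in use */
} skiplist;

/* Build a fixed list with scores 10, 20, ..., 80. Heights are chosen
 * deterministically (1,2,1,3,1,2,1,4) so the walk is reproducible. */
void sl_build(skiplist *sl) {
    node *last[MAX_LEVEL];
    for (int l = 0; l < MAX_LEVEL; l++) {
        sl->header.forward[l] = NULL;
        last[l] = &sl->header;
    }
    sl->header.score = 0.0;
    sl->level = 1;
    for (int i = 1; i <= NODE_COUNT; i++) {
        node *n = &sl->nodes[i - 1];
        int height = 1;
        for (int k = i; (k & 1) == 0 && height < MAX_LEVEL; k >>= 1) height++;
        n->score = 10.0 * i;
        for (int l = 0; l < MAX_LEVEL; l++) n->forward[l] = NULL;
        for (int l = 0; l < height; l++) { /* append n on each of its levels */
            last[l]->forward[l] = n;
            last[l] = n;
        }
        if (height > sl->level) sl->level = height;
    }
}

/* Start at the top level, move right while the next score is smaller,
 * then drop one level. *steps counts the forward moves taken. */
const node *sl_find(const skiplist *sl, double score, int *steps) {
    assert(sl != NULL && steps != NULL);
    const node *x = &sl->header;
    *steps = 0;
    for (int i = sl->level - 1; i >= 0; i--) {
        while (x->forward[i] && x->forward[i]->score < score) {
            x = x->forward[i];
            (*steps)++;
        }
    }
    x = x->forward[0]; /* candidate: first node not smaller than score */
    return (x && x->score == score) ? x : NULL;
}
```

On this list, finding 70 takes two forward moves (to 40, then to 60) before the final check on the bottom level, where a single-level scan would have to pass six nodes first.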
Insert source code
/* Insert a new node in the skiplist. Assumes the element does not already
 * exist (up to the caller to enforce that). The skiplist takes ownership
 * of the passed SDS string 'ele'. */
zskiplistNode *zslInsert(zskiplist *zsl, double score, sds ele) {
    // stores the search path
    zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x;
    // stores the rank (number of nodes crossed) at each level
    unsigned int rank[ZSKIPLIST_MAXLEVEL];
    int i, level;
    serverAssert(!isnan(score));
    x = zsl->header;
    // descend level by level toward the target position to build the "search path"
    for (i = zsl->level-1; i >= 0; i--) {
        /* store rank that is crossed to reach the insert position */
        rank[i] = i == (zsl->level-1) ? 0 : rank[i+1];
        // if the scores are equal, the value has to be compared as well
        while (x->level[i].forward &&
                (x->level[i].forward->score < score ||
                    (x->level[i].forward->score == score &&
                    sdscmp(x->level[i].forward->ele,ele) < 0)))
        {
            rank[i] += x->level[i].span;
            x = x->level[i].forward;
        }
        update[i] = x;
    }
    // the actual insertion starts here
    /* we assume the element is not already inside, since we allow duplicated
     * scores, reinserting the same element should never happen since the
     * caller of zslInsert() should test in the hash table if the element is
     * already inside or not. */
    // pick a random level
    level = zslRandomLevel();
    // fill in the spans
    if (level > zsl->level) {
        for (i = zsl->level; i < level; i++) {
            rank[i] = 0;
            update[i] = zsl->header;
            update[i]->level[i].span = zsl->length;
        }
        // update the height of the skip list
        zsl->level = level;
    }
    // create the new node
    x = zslCreateNode(level,score,ele);
    // rewire the forward pointers
    for (i = 0; i < level; i++) {
        x->level[i].forward = update[i]->level[i].forward;
        update[i]->level[i].forward = x;
        /* update span covered by update[i] as x is inserted here */
        x->level[i].span = update[i]->level[i].span - (rank[0] - rank[i]);
        update[i]->level[i].span = (rank[0] - rank[i]) + 1;
    }
    /* increment span for untouched levels */
    for (i = level; i < zsl->level; i++) {
        update[i]->level[i].span++;
    }
    // rewire the backward pointers
    x->backward = (update[0] == zsl->header) ? NULL : update[0];
    if (x->level[0].forward)
        x->level[0].forward->backward = x;
    else
        zsl->tail = x;
    zsl->length++;
    return x;
}

First, we work out the "search path" while looking for a suitable insertion point, and then we can create the new node. When creating it, we randomly assign a level to the node, and then link the nodes on the search path and the new node together through the forward and backward pointers. If the height assigned to the new node is greater than the current maximum height of the skip list, the maximum height of the skip list has to be updated.
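The "randomly assign a level" step is usually done by repeated coin flips, as in Pugh's paper. Below is a minimal sketch of that idea; random_level is a hypothetical helper, not the zslRandomLevel source, and the promotion probability p and the level cap are passed in as parameters (0.25 is used in the example as an assumption).

```c
#include <assert.h>
#include <stdlib.h>

/* Every node gets level 1. It is promoted one level higher with
 * probability p, repeatedly, up to max_level. */
int random_level(double p, int max_level) {
    assert(p > 0.0 && p < 1.0 && max_level >= 1);
    int level = 1;
    /* (double)rand() / (RAND_MAX + 1.0) is a uniform value in [0, 1) */
    while (level < max_level &&
           (double)rand() / ((double)RAND_MAX + 1.0) < p)
        level++;
    return level;
}
```

With p = 0.25, roughly three quarters of the nodes stay on level 1 and each higher level holds about a quarter of the nodes of the level below it, which is what keeps the upper levels sparse.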

Delete process

The deletion process is similar to the insertion process: the "search path" has to be found first. Then the forward and backward pointers of the affected nodes are rearranged on each level. At the same time, take care to update the maximum level (maxLevel).
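The deletion steps can be sketched as follows. This is a simplified illustration, not the Redis zslDelete code: nodes carry only a score, spans are left out, and node, skiplist and sl_delete are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVEL 4

typedef struct node {
    double score;
    struct node *backward;           /* previous node on level 0, NULL for the first */
    struct node *forward[MAX_LEVEL]; /* next node on each level */
} node;

typedef struct {
    node header;   /* sentinel, holds no element */
    node *tail;
    int level;     /* highest level currently in use */
    size_t length;
} skiplist;

/* Unlink the first node whose score equals `score`.
 * Returns the unlinked node (the caller owns it) or NULL if absent. */
node *sl_delete(skiplist *sl, double score) {
    assert(sl != NULL);
    node *update[MAX_LEVEL];
    node *x = &sl->header;

    /* 1. Search path: the last node before the target on every level. */
    for (int i = sl->level - 1; i >= 0; i--) {
        while (x->forward[i] && x->forward[i]->score < score)
            x = x->forward[i];
        update[i] = x;
    }
    x = x->forward[0];
    if (x == NULL || x->score != score) return NULL;

    /* 2. Bypass the target on each level where it is linked. */
    for (int i = 0; i < sl->level; i++)
        if (update[i]->forward[i] == x)
            update[i]->forward[i] = x->forward[i];

    /* 3. Fix the backward pointer of the successor, or the tail. */
    if (x->forward[0])
        x->forward[0]->backward = x->backward;
    else
        sl->tail = x->backward;

    /* 4. Drop top levels that became empty. */
    while (sl->level > 1 && sl->header.forward[sl->level - 1] == NULL)
        sl->level--;
    sl->length--;
    return x;
}
```

Step 4 is the "update the maximum level" part: if the removed node was the only one reaching the top levels, the list shrinks back to the tallest remaining node.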

Update process
  • When we call zadd, if the value does not exist yet, this is the insertion process. If the value already exists and only its score changes, an update process is required. If the new score does not change the sort position, nothing has to be moved and the score of the element can simply be modified in place. But if the sort position changes, the position must be adjusted. How is that done?
/* Remove and re-insert when score changes. */
if (score != curscore) {
    zskiplistNode *node;
    serverAssert(zslDelete(zs->zsl,curscore,ele,&node));
    znode = zslInsert(zs->zsl,score,node->ele);
    /* We reused the node->ele SDS string, free the node now
    * since zslInsert created a new one. */
    node->ele = NULL;
    zslFreeNode(node);
    /* Note that we did not removed the original element from
    * the hash table representing the sorted set, so we just
    * update the score. */
    dictGetVal(de) = &znode->score; /* Update score ptr. */
    *flags |= ZADD_UPDATED;
}
return 1;

A simple strategy is to delete the element first and then insert it again, which requires two path searches. This is how Redis does it. In fact, when the score changes, the code above deletes and reinserts directly, without checking whether the position actually needs to be adjusted.
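For comparison, the "no position change" check mentioned earlier could look like the sketch below. It is not part of the code quoted above; znode and can_update_in_place are hypothetical names, and only the bottom-level neighbours of the node are considered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bottom-level view of a skip list node. */
typedef struct znode {
    double score;
    struct znode *backward; /* previous node, NULL for the first */
    struct znode *forward;  /* next node, NULL for the last */
} znode;

/* True when changing x's score to newscore keeps x strictly between
 * its neighbours, so the score could be overwritten in place. */
bool can_update_in_place(const znode *x, double newscore) {
    assert(x != NULL);
    bool after_prev  = x->backward == NULL || x->backward->score < newscore;
    bool before_next = x->forward  == NULL || x->forward->score  > newscore;
    return after_prev && before_next;
}
```

The comparison is deliberately strict: when the new score ties with a neighbour, the order also depends on the member, so the sketch falls back to delete and reinsert in that case.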

Summary

  • The Redis ZSET has two encodings: ziplist and skiplist
  • The thresholds that trigger the switch from ziplist to skiplist can be customized
  • The skip list is a randomized data structure, and search, insert and delete can be completed in logarithmic expected time.
  • We also took a brief look at the underlying skip list data structure in Redis
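As an illustration of the second point, and assuming a Redis version that still uses the ziplist encoding, the two thresholds correspond to the following redis.conf directives. The values shown are the defaults quoted earlier; newer versions may name these settings differently.

```conf
# Maximum number of entries a zset may hold while staying a ziplist
zset-max-ziplist-entries 128
# Maximum member length (in bytes) allowed in a ziplist-encoded zset
zset-max-ziplist-value 64
```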

The article linked above also talks about red-black trees, and I have seen many interview questions along these lines:

  • Why does the ordered set of redis use skiplist instead of red-black tree?

Answer:

In the Redis author's own words:

  1. They are not very memory intensive. It’s up to you basically. Changing parameters about the probability of a node to have a given number of levels will make then less memory intensive than btrees.

  2. A sorted set is often target of many ZRANGE or ZREVRANGE operations, that is, traversing the skip list as a linked list. With this operation the cache locality of skip lists is at least as good as with other kind of balanced trees.

  3. They are simpler to implement, debug, and so forth. For instance thanks to the skip list simplicity I received a patch (already in Redis master) with augmented skip lists implementing ZRANK in O(log(N)). It required little changes to the code.
(Run it through Google Translate yourself if you need to!)

Other answers:

  1. https://www.zhihu.com/question/20202931
  2. https://blog.csdn.net/hebtu666/article/details/102556064

Origin blog.csdn.net/weixin_40413961/article/details/108041318