Redis source code analysis: an introduction to the skip list (skiplist)

<div class="iteye-blog-content-contain" style="font-size: 14px"><p><div id="Article"> </p>
<p> <p>Source code version: redis-4.0.1</p> </p>
<p> <p>Source location: </p> server.h: definitions of the zskiplistNode and zskiplist data structures. t_zset.c: the functions whose names start with zsl implement the SkipList operations.  </p>
<p> <h1 id="Introduction to a jump table">I. Introduction to the skip list</h1> </p>
<p> <p>A skip list (SkipList) is a data structure for solving the search problem, yet it is neither a balanced tree nor a hash table. Its defining property is that the elements are kept in order. For a fuller explanation of skip lists, see Zhang Tielei's "Detailed Explanation of Redis Internal Data Structure (6)". According to the comment at the top of t_zset.c, the Redis implementation differs from the original algorithm in three ways:</p> </p>
<p> <blockquote> </p>

<p>  <p>a) this implementation allows for repeated scores. // the same score may appear more than once<br /> b) the comparison is not just by key (our 'score') but by satellite data. // nodes are compared by score first, then by the value of the element<br /> c) there is a back pointer, so it's a doubly linked list with the back pointers being only at "level 1". // the lowest level forms a doubly linked list, which allows backward traversal</p> </p>
<p> </blockquote> </p>
<p>  <p> Next, let's take a look at the data structure definition of SkipList. </p> </p>
<p> <h1 id="Second data structure definition">II. Data structure definitions</h1> </p>
<p> <pre class="brush:sql;"></p>
<p>typedef struct zskiplistNode {     </p>
<p>    sds ele;                              // element (the member string)</p>
<p>    double score;                         // score used for ordering</p>
<p>    struct zskiplistNode *backward;       // back pointer; makes the lowest level a doubly linked list</p>
<p>    struct zskiplistLevel {               // one entry per level of this node</p>
<p>        unsigned int span;                // number of nodes crossed to reach forward at this level</p>
<p>        struct zskiplistNode *forward;    // next node at this level</p>
<p>    } level[];                            // flexible array; at most 32 levels, as defined by ZSKIPLIST_MAXLEVEL</p>
<p>} zskiplistNode;</p>
<p></pre> </p>
<p> <p> That is, use the above zskiplistNode to organize a SkipList: </p> </p>
<p> <pre class="brush:sql;"></p>
<p>typedef struct zskiplist { </p>
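<p>    struct zskiplistNode *header;     // header node (holds no element)</p>
<p>    struct zskiplistNode *tail;       // last node, or NULL when the list is empty</p>
<p>    unsigned long length;             // number of elements</p>
<p>    int level;                        // highest level currently in use</p>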
<p>} zskiplist;</pre> </p>
<p> <p> The core data structures are the above two. </p> </p>
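<p> <p> The level[] member above is a flexible array member, so the size of a node is decided when it is allocated. The following is a minimal standalone sketch of that allocation pattern. The demo_* names are hypothetical stand-ins for illustration (not Redis identifiers), and plain malloc is assumed in place of Redis's zmalloc. </p> </p>

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for zskiplistNode, reduced to the fields needed here. */
typedef struct demo_node {
    double score;
    struct demo_node *backward;
    struct demo_level {
        struct demo_node *forward;
        unsigned int span;
    } level[];   /* flexible array member: its length is chosen at allocation time */
} demo_node;

/* Number of bytes needed for a node that has `level` entries in level[]. */
size_t demo_node_size(int level) {
    return sizeof(demo_node) + (size_t)level * sizeof(struct demo_level);
}

/* Allocate a node with exactly `level` levels; returns NULL on allocation failure. */
demo_node *demo_node_create(int level, double score) {
    assert(level >= 1);
    demo_node *n = malloc(demo_node_size(level));
    if (n == NULL) return NULL;
    n->score = score;
    n->backward = NULL;
    for (int i = 0; i < level; i++) {
        n->level[i].forward = NULL;
        n->level[i].span = 0;
    }
    return n;
}
```

<p> <p> A node of height 1 therefore pays for a single level entry, while only tall nodes pay for more. </p> </p>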
<p> <h1 id="three create, insert, search, delete, release">III. Create, insert, search, delete, free</h1> </p>
<p> <p>We use the following example to trace through the SkipList code. It covers creation, insertion, search, deletion and freeing. (To try it, replace the body of the main function in Redis with the code below.) </p> </p>
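<p> <pre class="brush:sql;"></p>
<p>// zslGetElementByRank() is used in main() below, so it has to be declared first</p>
<p>zskiplistNode *zslGetElementByRank(zskiplist *zsl, unsigned long rank);</p>
<p> </p>
<p>int main(int argc, char **argv) {</p>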
<p>    unsigned long ret;</p>
<p>    zskiplistNode *node;</p>
<p>    zskiplist *zsl = zslCreate();</p>
<p> </p>
<p>    zslInsert(zsl, 65.5, sdsnew(&quot;tom&quot;));             //level = 1</p>
<p>    zslInsert(zsl, 87.5, sdsnew(&quot;jack&quot;));            //level = 4</p>
<p>    zslInsert(zsl, 70.0, sdsnew(&quot;alice&quot;));           //level = 3</p>
<p>    zslInsert(zsl, 95.0, sdsnew(&quot;tony&quot;));            //level = 2</p>
<p> </p>
<p>    zrangespec spec = {                      // define a range: 70.0 &lt;= x &lt;= 90.0</p>
<p>            .min = 70.0,</p>
<p>            .max = 90.0,</p>
<p>            .minex = 0,</p>
<p>            .maxex = 0};</p>
<p> </p>
<p>    printf(&quot;zslFirstInRange 70.0 &lt;= x &lt;= 90.0, x is:&quot;);  // smallest element within the range</p>
<p>    node = zslFirstInRange(zsl, &amp;spec);</p>
<p>    printf(&quot;%s-&gt;%f\n&quot;, node-&gt;ele, node-&gt;score);</p>
<p> </p>
<p>    printf(&quot;zslLastInRange 70.0 &lt;= x &lt;= 90.0, x is:&quot;);   // largest element within the range</p>
<p>    node = zslLastInRange(zsl, &amp;spec);</p>
<p>    printf(&quot;%s-&gt;%f\n&quot;, node-&gt;ele, node-&gt;score);</p>
<p> </p>
<p>    printf(&quot;tony's Ranking is :&quot;);                       // look up the rank from score and element</p>
<p>    ret = zslGetRank(zsl, 95.0, sdsnew(&quot;tony&quot;));</p>
<p>    printf(&quot;%lu\n&quot;, ret);</p>
<p> </p>
<p>    printf(&quot;The Rank equal 4 is :&quot;);                     // look up the element at a given rank</p>
<p>    node = zslGetElementByRank(zsl, 4);</p>
<p>    printf(&quot;%s-&gt;%f\n&quot;, node-&gt;ele, node-&gt;score);</p>
<p> </p>
<p>    ret = zslDelete(zsl, 70.0, sdsnew(&quot;alice&quot;), &amp;node);  // delete an element</p>
<p>    if (ret == 1) {</p>
<p>        printf(&quot;Delete node:%s-&gt;%f success!\n&quot;, node-&gt;ele, node-&gt;score);</p>
<p>    }</p>
<p> </p>
<p>    zslFree(zsl);                                        // free zsl</p>
<p> </p>
<p>    return 0;</p>
<p>}</p>
<p> </p>
<p>Out &gt; </p>
<p>zslFirstInRange 70.0 &lt;= x &lt;= 90.0, x is:alice-&gt;70.000000</p>
<p>zslLastInRange 70.0 &lt;= x &lt;= 90.0, x is:jack-&gt;87.500000</p>
<p>tony's Ranking is :4</p>
<p>The Rank equal 4 is :tony-&gt;95.000000</p>
<p>Delete node:alice-&gt;70.000000 success!</pre> Next, we walk through the code step by step. First, zskiplist *zsl = zslCreate(); creates a SkipList. The key point is that zsl-&gt;header is initialized with the maximum number of levels, 32, because ZSKIPLIST_MAXLEVEL is defined as 32. The choice of this value is tied to the random function that picks a node's level in the SkipList; see the article referenced at the beginning for details. Here is the code of zslCreate: </p>
<p> <pre class="brush:sql;"></p>
<p>zskiplist *zslCreate(void) {</p>
<p>    int j;</p>
<p>    zskiplist *zsl;</p>
<p> </p>
<p>    zsl = zmalloc(sizeof(*zsl));</p>
<p>    zsl-&gt;level = 1;                                          // the list starts with a single level</p>
<p>    zsl-&gt;length = 0;</p>
<p>    zsl-&gt;header = zslCreateNode(ZSKIPLIST_MAXLEVEL,0,NULL);  // the header gets all 32 levels</p>
<p>    for (j = 0; j &lt; ZSKIPLIST_MAXLEVEL; j++) {</p>
<p>        zsl-&gt;header-&gt;level[j].forward = NULL;</p>
<p>        zsl-&gt;header-&gt;level[j].span = 0;</p>
<p>    }</p>
<p>    zsl-&gt;header-&gt;backward = NULL;    </p>
<p>    zsl-&gt;tail = NULL;                                        // no tail yet</p>
<p>    return zsl; </p>
<p>}</p>
<p> </p>
<p>// zslCreateNode allocates a zskiplistNode with `level` levels and stores score and ele in it</p>
<p>zskiplistNode *zslCreateNode(int level, double score, sds ele) { </p>
<p>    zskiplistNode *zn =</p>
<p>        zmalloc(sizeof(*zn)+level*sizeof(struct zskiplistLevel));</p>
<p>    zn-&gt;score = score;</p>
<p>    zn-&gt;ele = ele;</p>
<p>    return zn;</p>
<p>}</pre> </p>
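<p> <p> The article does not list the random level function that the 32-level header relates to. The following is a minimal sketch of the usual geometric scheme, not the Redis source: each extra level is granted with a fixed probability. The name demo_random_level and the constants DEMO_P (assumed to be 0.25) and DEMO_MAXLEVEL are hypothetical, and the standard rand() is assumed as the random source. </p> </p>

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_MAXLEVEL 32   /* same cap the article cites for ZSKIPLIST_MAXLEVEL */
#define DEMO_P 0.25        /* assumed probability of promoting a node one more level */

/* Return a level in [1, DEMO_MAXLEVEL]; level k+1 is DEMO_P times as likely as level k. */
int demo_random_level(void) {
    int level = 1;
    /* Map rand() to [0, 1) and keep promoting while the draw falls below DEMO_P. */
    while (level < DEMO_MAXLEVEL &&
           (double)rand() / ((double)RAND_MAX + 1.0) < DEMO_P)
        level += 1;
    assert(level >= 1 && level <= DEMO_MAXLEVEL);
    return level;
}
```

<p> <p> Under these assumptions about three quarters of all nodes have a single level, so tall nodes are rare and a cap of 32 levels is hardly ever reached in practice. </p> </p>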
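<p> <p> Next comes insertion, which is done by zslInsert(): </p> </p>
<p> <pre class="brush:sql;"></p>
<p>zskiplistNode *zslInsert(zskiplist *zsl, double score, sds ele) {</p>
<p>    /* Insertion takes three steps:</p>
<p>     *  1: Find the insert position from the score passed in.</p>
<p>     *     Just as when inserting into an ordered singly linked list, we must first find, at each level, the last node that sorts before the new one and save it.</p>
<p>     *  2: Obtain a level from the random function and create the new node.</p>
<p>     *  3: Update the pointers at each level so that the new node is inserted.</p>
<p>     */ </p>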
<p> </p>
<p>    zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x;</p>
<p>    unsigned int rank[ZSKIPLIST_MAXLEVEL];</p>
<p>    int i, level;</p>
<p> </p>
<p>    /* Step 1: find the insert position for score; save the predecessor at each level in update[] and the rank reached in rank[] */</p>
<p>    serverAssert(!isnan(score));</p>
<p>    x = zsl-&gt;header;</p>
<p>    for (i = zsl-&gt;level-1; i &gt;= 0; i--) {</p>
<p>        /* store rank that is crossed to reach the insert position */</p>
<p>        rank[i] = i == (zsl-&gt;level-1) ? 0 : rank[i+1];</p>
<p>        while (x-&gt;level[i].forward &amp;&amp;</p>
<p>                (x-&gt;level[i].forward-&gt;score &lt; score ||</p>
<p>                    (x-&gt;level[i].forward-&gt;score == score &amp;&amp;</p>
<p>                    sdscmp(x-&gt;level[i].forward-&gt;ele,ele) &lt; 0)))</p>
<p>        {</p>
<p>            rank[i] += x-&gt;level[i].span;</p>
<p>            x = x-&gt;level[i].forward;</p>
<p>        }</p>
<p>        update[i] = x;</p>
<p>    }</p>
<p>    /* we assume the element is not already inside, since we allow duplicated</p>
<p>     * scores, reinserting the same element should never happen since the</p>
<p>     * caller of zslInsert() should test in the hash table if the element is</p>
<p>     * already inside or not. */</p>
<p> </p>
<p>    /* Step 2: obtain a level and create the new node */</p>
<p>    level = zslRandomLevel();               </p>
<p>    if (level &gt; zsl-&gt;level) {</p>
<p>        for (i = zsl-&gt;level; i &lt; level; i++) {</p>
<p>            rank[i] = 0;</p>
<p>            update[i] = zsl-&gt;header;</p>
<p>            update[i]-&gt;level[i].span = zsl-&gt;length;</p>
<p>        }</p>
<p>        zsl-&gt;level = level;</p>
<p>    }</p>
<p>    x = zslCreateNode(level,score,ele);</p>
<p> </p>
<p>    /* Step 3: update the pointers at each level to link in the newly created node */</p>
<p>    for (i = 0; i &lt; level; i++) {</p>
<p>        x-&gt;level[i].forward = update[i]-&gt;level[i].forward;</p>
<p>        update[i]-&gt;level[i].forward = x;</p>
<p> </p>
<p>        /* update span covered by update[i] as x is inserted here */</p>
<p>        x-&gt;level[i].span = update[i]-&gt;level[i].span - (rank[0] - rank[i]);</p>
<p>        update[i]-&gt;level[i].span = (rank[0] - rank[i]) + 1;</p>
<p>    }</p>
<p> </p>
<p>    /* increment span for untouched levels */</p>
<p>    for (i = level; i &lt; zsl-&gt;level; i++) {</p>
<p>        update[i]-&gt;level[i].span++;</p>
<p>    }</p>
<p> </p>
<p>    /* update the backward pointers */</p>
<p>    x-&gt;backward = (update[0] == zsl-&gt;header) ? NULL : update[0];</p>
<p>    if (x-&gt;level[0].forward)</p>
<p>        x-&gt;level[0].forward-&gt;backward = x;</p>
<p>    else</p>
<p>        zsl-&gt;tail = x;</p>
<p>    zsl-&gt;length++;</p>
<p>    return x;</p>
<p>}</pre> </p>
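<p> <p> The two span assignments in zslInsert() can be read as splitting one old span into two. The following small sketch isolates that arithmetic so it can be checked by hand; demo_split_span and demo_span_pair are hypothetical names introduced only for illustration. </p> </p>

```c
#include <assert.h>

/* The two spans that result from linking a new node in at level i. */
typedef struct {
    unsigned int pred_span;      /* new span of update[i], which now points at the new node */
    unsigned int inserted_span;  /* span of the new node, which points at the old successor */
} demo_span_pair;

/*
 * old_span: span of update[i] before the insert
 * rank0:    rank of the level-0 predecessor (rank[0] in zslInsert)
 * ranki:    rank of update[i] (rank[i] in zslInsert)
 */
demo_span_pair demo_split_span(unsigned int old_span,
                               unsigned int rank0, unsigned int ranki) {
    demo_span_pair r;
    assert(rank0 >= ranki);  /* a lower level can only be further along the list */
    r.inserted_span = old_span - (rank0 - ranki);
    r.pred_span = (rank0 - ranki) + 1;
    return r;
}
```

<p> <p> For example, when alice is inserted in the article's example, the header's span at the second level is 2 (it points at jack), rank[0] is 1 (tom precedes alice) and rank[1] is 0. The split gives the header a span of 2 and alice a span of 1. The two results always add up to the old span plus one, since exactly one node was added. </p> </p>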
<p> <p> Note that span is the number of nodes crossed when moving from the current node to its forward node at that level. Summing the spans along the search path is what makes it possible to fetch an element by rank. update[i] holds the node after which the new node is inserted at level i, and it is used when the pointers are updated in step 3. After the first element has been inserted (level = 1), zsl looks like this: </p> </p>
<p> <p><img alt="Write image description here" src=" /uploadfile/Collfiles/20171114/20171114090518124.jpg" /></p> We then insert the remaining three elements, whose levels are jack-&gt;4, alice-&gt;3 and tony-&gt;2. zsl now looks as shown in the figure below; note how the spans have been updated: </p>
<p> <p><img alt="Write the picture description here" src=" /uploadfile/Collfiles/20171114/20171114090519125.jpg" /></p> That completes insertion. Next, let's look at the search operations. The example code above performs four kinds of lookup: </p>
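<p> <br /> 1) Find the smallest element within the specified range: zslFirstInRange()<br /> 2) Find the largest element within the specified range: zslLastInRange()<br /> 3) Get the rank of an element from its score and data: zslGetRank()<br /> 4) Get the element at a given rank: zslGetElementByRank()</p>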
<p> <p>We analyze (1) and (4); (2) and (3) work along the same lines. Starting with (1): the zrangespec structure is used to define the range 70.0 &lt;= x &lt;= 90.0. It is declared as follows: </p> </p>
<p> <pre class="brush:sql;"></p>
<p>typedef struct {</p>
<p>    double min, max;    // lower and upper bounds of the range</p>
<p>    int minex, maxex;   // whether min and max are exclusive: 0 means inclusive, 1 means exclusive</p>
<p>} zrangespec;</p>
<p> </p>
<p>/* The range is defined as follows */</p>
<p>zrangespec spec = {</p>
<p>            .min = 70.0,</p>
<p>            .max = 90.0,</p>
<p>            .minex = 0,</p>
<p>            .maxex = 0};                 // designated initializers set each field</pre> </p>
<p> <p>Next, zslFirstInRange() is called; it walks the list and returns the smallest node that satisfies 70.0 &lt;= x &lt;= 90.0. The code is as follows:</p> </p>
<p> <pre class="brush:sql;"></p>
<p>/* Find the first node that is contained in the specified range.</p>
<p> * Returns NULL when no element is contained in the range. */</p>
<p>zskiplistNode *zslFirstInRange(zskiplist *zsl, zrangespec *range) {</p>
<p>    zskiplistNode *x;</p>
<p>    int i;</p>
<p> </p>
<p>    /* If everything is out of range, return early. */</p>
<p>    if (!zslIsInRange(zsl,range)) return NULL;                // no element of zsl falls within the range</p>
<p> </p>
<p>    x = zsl-&gt;header;   </p>
<p>    for (i = zsl-&gt;level-1; i &gt;= 0; i--) {                     // start from the highest level</p>
<p>        /* Go forward while *OUT* of range. */                </p>
<p>        while (x-&gt;level[i].forward &amp;&amp;                         // next node exists &amp;&amp; its score is still below min</p>
<p>            !zslValueGteMin(x-&gt;level[i].forward-&gt;score,range))</p>
<p>            // advance to the next node on this level</p>
<p>                x = x-&gt;level[i].forward;</p>
<p>    }</p>
<p> </p>
<p>    /* This is an inner range, so the next node cannot be NULL. */</p>
<p>    x = x-&gt;level[0].forward;                                 // first node whose score reaches min</p>
<p>    serverAssert(x != NULL);       </p>
<p> </p>
<p>    /* Check if score &lt;= max. */</p>
<p>    if (!zslValueLteMax(x-&gt;score,range)) return NULL;       // reject it if its score exceeds max</p>
<p>    return x;</p>
<p>}</pre> </p>
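<p> <p> The listing above calls zslValueGteMin() and zslValueLteMax() without showing them. The following is a sketch of how such boundary checks can honor the minex and maxex flags. The demo_* names and the demo_rangespec type are hypothetical and only mirror the zrangespec fields described earlier; this is an illustration of the idea, not the Redis code. </p> </p>

```c
#include <assert.h>

/* Hypothetical mirror of zrangespec: bounds plus "exclusive" flags. */
typedef struct {
    double min, max;
    int minex, maxex;   /* 0 means the bound is inclusive, 1 means exclusive */
} demo_rangespec;

/* Is value at or above the lower bound (strictly above if the bound is exclusive)? */
int demo_value_gte_min(double value, const demo_rangespec *spec) {
    return spec->minex ? (value > spec->min) : (value >= spec->min);
}

/* Is value at or below the upper bound (strictly below if the bound is exclusive)? */
int demo_value_lte_max(double value, const demo_rangespec *spec) {
    return spec->maxex ? (value < spec->max) : (value <= spec->max);
}

/* A value is in range when it passes both boundary checks. */
int demo_value_in_range(double value, const demo_rangespec *spec) {
    assert(spec != 0);
    return demo_value_gte_min(value, spec) && demo_value_lte_max(value, spec);
}
```

<p> <p> With the inclusive range 70.0 to 90.0 from the example, 70.0 is in range; setting minex to 1 would exclude it. </p> </p>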
<p> <p>The core idea of the traversal is:<br /> (1) High level -&gt; low level<br /> (2) Small score -&gt; large score<br /> That is, starting from the highest level, we keep moving forward while the next node's score is still below the lower bound. Once the next node would reach or pass it (or there is no next node), we drop down one level from the current node and continue forward. The route taken to find the node with score 70.0 is shown in the figure below (red line): </p> </p>
<p> <p><img alt="Write image description here" src="/uploadfile/Collfiles/20171114/20171114090519126.jpg" /></p> Now for lookup (4). zslGetElementByRank() works mainly from the span field. The code is as follows: </p>
<p> <pre class="brush:sql;"></p>
<p>zskiplistNode * zslGetElementByRank(zskiplist *zsl, unsigned long rank) {</p>
<p>    zskiplistNode *x;</p>
<p>    unsigned long traversed = 0;</p>
<p>    int i;</p>
<p> </p>
<p>    x = zsl-&gt;header;</p>
<p>    for (i = zsl-&gt;level-1; i &gt;= 0; i--) {</p>
<p>        while (x-&gt;level[i].forward &amp;&amp; (traversed + x-&gt;level[i].span) &lt;= rank)</p>
<p>        {</p>
<p>            traversed += x-&gt;level[i].span;</p>
<p>            x = x-&gt;level[i].forward;</p>
<p>        }</p>
<p>        if (traversed == rank) {</p>
<p>            return x;</p>
<p>        }</p>
<p>    }</p>
<p>    return NULL;</p>
<p>}</pre> </p>
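<p> <p> To see insertion and rank lookup work together outside Redis, here is a simplified standalone model. It is a sketch under several assumptions: all demo_sl_* names are hypothetical, levels are passed in by the caller instead of being random (so the article's levels can be reproduced), the level array has a fixed size, elements are plain C strings compared with strcmp, and there are no backward or tail pointers. </p> </p>

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_SL_MAXLEVEL 8   /* hypothetical cap for this sketch; the article cites 32 */

typedef struct demo_sl_node {
    const char *ele;
    double score;
    struct {
        struct demo_sl_node *forward;
        unsigned int span;
    } level[DEMO_SL_MAXLEVEL];   /* fixed size here for simplicity */
} demo_sl_node;

typedef struct {
    demo_sl_node *header;
    unsigned long length;
    int level;
} demo_sl;

demo_sl *demo_sl_create(void) {
    demo_sl *l = calloc(1, sizeof(*l));
    assert(l != NULL);
    l->header = calloc(1, sizeof(demo_sl_node));   /* zeroed: no forwards, spans 0 */
    assert(l->header != NULL);
    l->level = 1;
    return l;
}

/* Insert (score, ele) with a caller-chosen level, maintaining spans as zslInsert does. */
void demo_sl_insert(demo_sl *l, double score, const char *ele, int level) {
    demo_sl_node *update[DEMO_SL_MAXLEVEL], *x = l->header;
    unsigned int rank[DEMO_SL_MAXLEVEL];
    assert(level >= 1 && level <= DEMO_SL_MAXLEVEL);
    for (int i = l->level - 1; i >= 0; i--) {
        rank[i] = (i == l->level - 1) ? 0 : rank[i + 1];
        while (x->level[i].forward &&
               (x->level[i].forward->score < score ||
                (x->level[i].forward->score == score &&
                 strcmp(x->level[i].forward->ele, ele) < 0))) {
            rank[i] += x->level[i].span;
            x = x->level[i].forward;
        }
        update[i] = x;
    }
    for (int i = l->level; i < level; i++) {   /* levels that did not exist before */
        rank[i] = 0;
        update[i] = l->header;
        update[i]->level[i].span = (unsigned int)l->length;
    }
    if (level > l->level) l->level = level;
    x = calloc(1, sizeof(demo_sl_node));
    assert(x != NULL);
    x->ele = ele;
    x->score = score;
    for (int i = 0; i < level; i++) {
        x->level[i].forward = update[i]->level[i].forward;
        update[i]->level[i].forward = x;
        x->level[i].span = update[i]->level[i].span - (rank[0] - rank[i]);
        update[i]->level[i].span = (rank[0] - rank[i]) + 1;
    }
    for (int i = level; i < l->level; i++)     /* levels that skip over the new node */
        update[i]->level[i].span++;
    l->length++;
}

/* Return the node at 1-based rank, or NULL if the rank is past the end. */
demo_sl_node *demo_sl_by_rank(demo_sl *l, unsigned long rank) {
    demo_sl_node *x = l->header;
    unsigned long traversed = 0;
    for (int i = l->level - 1; i >= 0; i--) {
        while (x->level[i].forward && traversed + x->level[i].span <= rank) {
            traversed += x->level[i].span;
            x = x->level[i].forward;
        }
        if (traversed == rank) return x;
    }
    return NULL;
}

/* Free every node by walking level 0, then the list itself. */
void demo_sl_free(demo_sl *l) {
    demo_sl_node *n = l->header, *next;
    while (n) { next = n->level[0].forward; free(n); n = next; }
    free(l);
}
```

<p> <p> Inserting tom (65.5, level 1), jack (87.5, level 4), alice (70.0, level 3) and tony (95.0, level 2) in that order and then asking for rank 4 returns tony in this model, which matches the output of the article's example. The lookup reaches jack in one step from the header at the top level, because that pointer carries a span of 3. </p> </p>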
<p> <p>The traversal follows the same idea as before. Its route is shown in the figure below:</p> </p>
<p> <p><img alt="Write picture description here" src="/uploadfile/Collfiles/20171114/20171114090520128.jpg" /></p> Next, let's look at zslDelete(). The call ret = zslDelete(zsl, 70.0, sdsnew(&quot;alice&quot;), &amp;node); deletes the element in zsl whose score is 70.0 and whose data is alice. This reflects the second characteristic of the Redis SkipList: elements are compared not only by score but also by data. Here is the code of zslDelete: </p>
<p> <pre class="brush:sql;"></p>
<p>int zslDelete(zskiplist *zsl, double score, sds ele, zskiplistNode **node) {</p>
<p>    zskiplistNode *update[ZSKIPLIST_MAXLEVEL], *x;</p>
<p>    int i;</p>
<p> </p>
<p>    x = zsl-&gt;header;</p>
<p>    for (i = zsl-&gt;level-1; i &gt;= 0; i--) {</p>
<p>        while (x-&gt;level[i].forward &amp;&amp;</p>
<p>                (x-&gt;level[i].forward-&gt;score &lt; score ||</p>
<p>                    (x-&gt;level[i].forward-&gt;score == score &amp;&amp;</p>
<p>                     sdscmp(x-&gt;level[i].forward-&gt;ele,ele) &lt; 0)))</p>
<p>        {</p>
<p>            x = x-&gt;level[i].forward;</p>
<p>        }</p>
<p>        update[i] = x;</p>
<p>    }</p>
<p>    /* We may have multiple elements with the same score, what we need</p>
<p>     * is to find the element with both the right score and object. */</p>
<p>    x = x-&gt;level[0].forward;</p>
<p>    if (x &amp;&amp; score == x-&gt;score &amp;&amp; sdscmp(x-&gt;ele,ele) == 0) {</p>
<p>        zslDeleteNode(zsl, x, update);</p>
<p>        if (!node)</p>
<p>            zslFreeNode(x);</p>
<p>        else</p>
<p>            *node = x;</p>
<p>        return 1;</p>
<p>    }</p>
<p>    return 0; /* not found */</p>
<p>}</pre> Note that the fourth parameter of zslDelete() has type zskiplistNode **. If it is not NULL, the function does not free the node after unlinking it; instead it stores the node's address in *node, and the caller becomes responsible for freeing that memory. The traversal and comparison work as before: update[] records, for each level, the node that precedes the one being deleted. The while loop uses the condition sdscmp(x-&gt;level[i].forward-&gt;ele,ele) &lt; 0 because zslInsert() orders nodes by the same rule. The final check if (x &amp;&amp; score == x-&gt;score &amp;&amp; sdscmp(x-&gt;ele,ele) == 0) is needed because the Redis SkipList allows several elements with the same score.  </p>
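<p> <p> The ordering rule shared by zslInsert() and zslDelete() is a comparison on the pair (score, element). The following sketch spells it out as a standalone comparator. demo_entry_cmp is a hypothetical name, and strcmp on C strings is used here in place of Redis's sdscmp on sds strings, so this illustrates the rule and is not a drop-in replacement. </p> </p>

```c
#include <assert.h>
#include <string.h>

/*
 * Compare two entries by score first and by element second.
 * Returns -1, 0 or 1 when the first entry sorts before, equal to or after the second.
 */
int demo_entry_cmp(double score1, const char *ele1,
                   double score2, const char *ele2) {
    assert(ele1 != NULL && ele2 != NULL);
    if (score1 < score2) return -1;
    if (score1 > score2) return 1;
    int c = strcmp(ele1, ele2);   /* scores tie: fall back to the element */
    return (c > 0) - (c < 0);     /* normalize to -1, 0 or 1 */
}
```

<p> <p> Two entries with score 70.0 and elements alice and bob are thus distinct and ordered, which is why a delete has to match both fields before unlinking a node. </p> </p>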
<p> <p> Finally, the release function zslFree(zsl). The idea is simple: every node is linked into level[0] (which is also a doubly linked list), so walking level[0] from the header and freeing each node in turn is enough. </p> </p>
<p> <pre class="brush:sql;"></p>
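<p>void zslFree(zskiplist *zsl) {</p>
<p>    zskiplistNode *node = zsl-&gt;header-&gt;level[0].forward, *next;</p>
<p> </p>
<p>    zfree(zsl-&gt;header);                  // the header holds no element, so it is freed directly</p>
<p>    while(node) {                        // walk level[0] and free every node</p>
<p>        next = node-&gt;level[0].forward;</p>
<p>        zslFreeNode(node);</p>
<p>        node = next;</p>
<p>    }</p>
<p>    zfree(zsl);</p>
<p>}</pre> </p>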
<p> <table> </p>
<p>  <thead> </p>
<p>   <tr> </p>
<p>    <th> Operation</th> </p>
<p>    <th> Average case</th> </p>
<p>    <th> Worst case</th> </p>
<p>   </tr> </p>
<p>  </thead> </p>
<p>  <tbody> </p>
<p>   <tr> </p>
<p>    <td>Insert</td> </p>
<p>    <td>O(log n)</td> </p>
<p>    <td>O(n)</td> </p>
<p>   </tr> </p>
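<p>   <tr> </p>
<p>    <td>Delete</td> </p>
<p>    <td>O(log n)</td> </p>
<p>    <td>O(n)</td> </p>
<p>   </tr> </p>
<p>   <tr> </p>
<p>    <td>Search</td> </p>
<p>    <td>O(log n)</td> </p>
<p>    <td>O(n)</td> </p>
<p>   </tr> </p>
<p>  </tbody> </p>
<p> </table> </p>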