Redis quicklist, ziplist, and linkedlist (focus on quicklist)

Preface

Recently I have been reading the book "Redis Design and Implementation". It is really good, and I was hooked right away. Thank you to the author. While studying, however, I ran into a problem: my server runs Redis 5.0.9, while the book covers Redis 3.0. In the first part, the chapters on data structures and objects, there is a difference in the data structure used underneath the list type that Redis exposes. Since the book does not cover it, I looked up some material online, studied it, and wrote this summary as my own notes.

Differences

The difference is this: before Redis 3.2, a list key was implemented with either the ziplist or the linkedlist encoding. From 3.2 onward, a data structure called quicklist is used instead.

First, the versions before Redis 3.2. With ziplist and linkedlist as the two possible implementations, there is a criterion for choosing between them. A ziplist is chosen when:

  • The length of every string element stored in the list object is less than 64 bytes;
  • The number of elements stored in the list object is less than 512.

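Both conditions must hold for ziplist to be selected as the underlying implementation. If either one fails, linkedlist is selected instead. In the session below, the fourth element has to be at least 64 bytes long for the switch to linkedlist to happen; the string appears shortened in this listing.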

127.0.0.1:6379> rpush blah "hello" "world" "again"
3
127.0.0.1:6379> object encoding blah
ziplist
127.0.0.1:6379> rpush blah "wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww"
4
127.0.0.1:6379> object encoding blah
linkedlist
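
As a rough illustration of the selection rule above, here is a minimal C sketch. The function name choose_list_encoding and the two macro names are made up for this example, and the comparisons follow the wording of this post (strictly less than 64 bytes, strictly fewer than 512 elements) rather than the Redis source:

```c
#include <stddef.h>
#include <string.h>

/* Thresholds as quoted in the text above; the macro names are hypothetical. */
#define MAX_ZIPLIST_VALUE   64   /* every element must be shorter than this (bytes) */
#define MAX_ZIPLIST_ENTRIES 512  /* the list must hold fewer elements than this */

/* Hypothetical helper: returns "ziplist" when both conditions hold,
 * otherwise "linkedlist". */
const char *choose_list_encoding(const char *const *items, size_t n) {
    if (n >= MAX_ZIPLIST_ENTRIES) return "linkedlist";
    for (size_t i = 0; i < n; i++) {
        if (strlen(items[i]) >= MAX_ZIPLIST_VALUE) return "linkedlist";
    }
    return "ziplist";
}
```

With the three short strings from the session it returns "ziplist". As soon as any element reaches 64 bytes, or the list reaches 512 elements, it returns "linkedlist".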

Now for Redis 3.2 and later:

This is where the quicklist data structure comes in. The book does not cover it, so I looked up the details and summarized them in this post.

On my Redis 5.0.9 installation, the same commands give a different result:

127.0.0.1:6379> rpush blah "hello" "world" "again"
3
127.0.0.1:6379> object encoding blah
quicklist
127.0.0.1:6379> rpush blah "wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww"
4
127.0.0.1:6379> object encoding blah
quicklist

Introduction to quicklist data structure

I won't introduce ziplist and linkedlist here, since the book covers them. Let's look at quicklist.
The implementation of quicklist builds on both ziplist and linkedlist and combines the two structures. It stores the data as ziplist segments, each held by a quicklistNode. Each quicklistNode points to one ziplist, and the quicklistNodes are linked to each other through prev and next pointers. The general structure looks like this:

(Figure: a quicklist with three quicklistNodes. The head and tail nodes point to ziplists, the middle node points to a quicklistLZF, and the ziplists hold different numbers of entries.)

You may have some questions when you see this structure:

  1. What does this structure mean, and why are some nodes ziplist while others are quicklistLZF?
  2. Why are the head and tail nodes of the quicklist ziplist, while the one in the middle is quicklistLZF?
  3. Why do the ziplists of different quicklistNodes hold different numbers of entries?

Why optimize the underlying data structure into a quicklist?

Before answering the questions above, let's start with a more basic one: why did Redis change the underlying data structure of the list key to quicklist?
There are two sides to the reason:

  1. A ziplist stores its data in one contiguous block, so a large list needs a large contiguous memory region. Suppose we want to store a lot of data and memory has no free contiguous region that is big enough. The request cannot be satisfied, even though there may be many small, scattered free regions that would be enough in total.
  2. A linkedlist does not need contiguous storage, which avoids the drawback above. However, each node is allocated as its own piece of memory, which can cause a lot of memory fragmentation.

Based on these two considerations, Redis introduced the quicklist data structure in version 3.2. A quicklist is essentially a segmented ziplist. Why? The basic storage unit of a quicklist is the quicklistNode, and the content of each quicklistNode is stored in a ziplist, as the diagram above shows.
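
To make the "segmented" idea concrete, here is a minimal C sketch. It is not Redis code: the names segNode, segList, seg_push_tail, seg_free and the capacity SEG_CAP are invented for the example, and a small int array stands in for the ziplist inside each node:

```c
#include <stdlib.h>

#define SEG_CAP 4 /* illustrative per-node capacity, not a Redis value */

/* Simplified stand-in for quicklistNode: a small contiguous array
 * plays the role of the ziplist. */
typedef struct segNode {
    struct segNode *prev, *next;
    int items[SEG_CAP];
    unsigned int count;  /* items used in this node */
} segNode;

typedef struct segList {
    segNode *head, *tail;
    unsigned long count; /* total items in all nodes */
    unsigned long len;   /* number of nodes */
} segList;

/* Append a value; a new node is allocated only when the tail is full.
 * Returns 0 on success, -1 if allocation fails. */
int seg_push_tail(segList *l, int value) {
    if (l->tail == NULL || l->tail->count == SEG_CAP) {
        segNode *n = calloc(1, sizeof(*n));
        if (n == NULL) return -1;
        n->prev = l->tail;
        if (l->tail) l->tail->next = n; else l->head = n;
        l->tail = n;
        l->len++;
    }
    l->tail->items[l->tail->count++] = value;
    l->count++;
    return 0;
}

/* Release every node and reset the list to empty. */
void seg_free(segList *l) {
    segNode *n = l->head;
    while (n) { segNode *next = n->next; free(n); n = next; }
    l->head = l->tail = NULL;
    l->count = l->len = 0;
}
```

Pushing 10 values with a capacity of 4 produces 3 nodes instead of 10 separate allocations (as a linkedlist would need) or one block that keeps growing (as a single ziplist would).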

Why do the ziplists of different quicklistNodes hold different numbers of entries?

With quicklist in place, there is still a question to settle: how many entries should the ziplist inside each quicklistNode hold? The same two concerns apply:

  1. If each ziplist holds too few entries, the structure drifts toward a linkedlist, and a lot of memory fragmentation may result.
  2. If each ziplist holds too many entries, a large contiguous memory region is needed again.

The Redis designers have already thought about this for us. The configuration file has a parameter for it: list-max-ziplist-size

This parameter can be set to a positive or a negative number:

  1. A positive value limits the length of the ziplist on each quicklist node by number of entries. For example, a value of 5 means each ziplist holds at most 5 entries.
  2. A negative value limits the length of the ziplist on each quicklist node by size in bytes. Only the five values from -1 to -5 are allowed, with the following meanings:
    1), -5: The ziplist on each quicklist node cannot exceed 64 KB (1 KB = 1024 bytes).
    2), -4: The ziplist on each quicklist node cannot exceed 32 KB.
    3), -3: The ziplist on each quicklist node cannot exceed 16 KB.
    4), -2: The ziplist on each quicklist node cannot exceed 8 KB.
    5), -1: The ziplist on each quicklist node cannot exceed 4 KB.

This is why the ziplists on different nodes can hold different numbers of entries: under a byte-size limit, how many entries fit in a node depends on how large those entries are. We can also set the parameter value ourselves.
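
The two kinds of limit can be sketched in C as follows. This is a simplified illustration, not the Redis implementation: the names SIZE_LIMITS and node_can_accept are hypothetical, and the per-entry header overhead of a real ziplist is ignored:

```c
#include <stddef.h>

/* Byte limits for list-max-ziplist-size values -1..-5, as listed above. */
static const size_t SIZE_LIMITS[] = {4096, 8192, 16384, 32768, 65536};

/* Hypothetical helper: may a node that currently holds `count` entries in
 * `bytes` bytes accept one more entry of `entry_bytes` bytes under `fill`?
 * Returns 1 if yes, 0 if no (also 0 for values outside the documented range). */
int node_can_accept(int fill, unsigned int count, size_t bytes, size_t entry_bytes) {
    if (fill > 0)                       /* positive: limit by entry count */
        return count < (unsigned int)fill;
    if (fill < 0 && fill >= -5)         /* negative: limit by size in bytes */
        return bytes + entry_bytes <= SIZE_LIMITS[-fill - 1];
    return 0;
}
```

Under a limit of -2 (8 KB), a node fills up after roughly 8 entries of 1000 bytes each but can take about 81 entries of 100 bytes each, so the entry counts of two nodes in the same list can differ widely.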

What does this structure mean, and why are some nodes ziplist while others are quicklistLZF?

There is another configuration parameter here: list-compress-depth
When a list is very long, the data at both ends is accessed most often and the data in the middle is accessed relatively rarely, so the middle nodes can be compressed to save more space. The list-compress-depth parameter controls this.
Its values have the following meanings:

  • 0: A special value that means no compression. This is the Redis default.
  • 1: One node at each end of the quicklist is left uncompressed, and the nodes in between are compressed.
  • 2: Two nodes at each end of the quicklist are left uncompressed, and the nodes in between are compressed.
  • 3: Three nodes at each end of the quicklist are left uncompressed, and the nodes in between are compressed.
  • … and so on.

This answers the question: the nodes at both ends are plain ziplists, and the nodes in between are quicklistLZF.
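
The depth rule can be written as a small C predicate. This is only a sketch of the rule as described above; the function name node_is_compressed is made up, and it ignores the fact that some nodes may be left uncompressed for other reasons (the attempted_compress field in the source further down hints at nodes that are too small to compress):

```c
/* Hypothetical helper: is the node at position `index` (0 = head) compressed,
 * given a quicklist of `len` nodes and a list-compress-depth of `depth`? */
int node_is_compressed(unsigned long index, unsigned long len, unsigned int depth) {
    if (depth == 0) return 0;            /* 0 disables compression */
    if (index >= len) return 0;          /* not a valid node position */
    if (index < depth) return 0;         /* within `depth` nodes of the head */
    if (len - index <= depth) return 0;  /* within `depth` nodes of the tail */
    return 1;                            /* a middle node */
}
```

For a quicklist of 5 nodes and a depth of 1, positions 0 and 4 stay as plain ziplists and positions 1 to 3 are compressed. With a depth of 2, only position 2 is compressed.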

In the default Redis configuration file, list-max-ziplist-size is -2 and list-compress-depth is 0.

Simple understanding of source code structure

Let me briefly describe how a quicklist stores its data, based on the figure above. It is built from two main structures, quicklist and quicklistNode. Below is the source code I found; you can skim through it.

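typedef struct quicklistNode {
    struct quicklistNode *prev;  /* previous node in the quicklist */
    struct quicklistNode *next;  /* next node in the quicklist */
    unsigned char *zl;           /* ziplist, or quicklistLZF when compressed */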
    unsigned int sz;             /* ziplist size in bytes */
    unsigned int count : 16;     /* count of items in ziplist */
    unsigned int encoding : 2;   /* RAW==1 or LZF==2 */
    unsigned int container : 2;  /* NONE==1 or ZIPLIST==2 */
    unsigned int recompress : 1; /* was this node previous compressed? */
    unsigned int attempted_compress : 1; /* node can't compress; too small */
    unsigned int extra : 10; /* more bits to steal for future usage */
} quicklistNode;

typedef struct quicklistLZF {
    unsigned int sz; /* LZF size in bytes*/
    char compressed[];
} quicklistLZF;

typedef struct quicklist {
    quicklistNode *head;
    quicklistNode *tail;
    unsigned long count; /* total count of all entries in all ziplists */
    unsigned int len; /* number of quicklistNodes */
    int fill : 16; /* fill factor for individual nodes */
    unsigned int compress : 16; /* depth of end nodes not to compress;0=off */
} quicklist;

The quicklist struct itself mainly holds the pointers to the head and tail nodes, along with the total entry count, the node count, and the fill and compress settings.
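
One thing the per-node count field makes possible is skipping whole nodes when looking for an entry by position. The following C sketch shows the idea with trimmed-down structs; miniNode, miniList, and mini_locate are names invented for this example and are not part of Redis:

```c
#include <stddef.h>

/* Trimmed-down copies of the structs above: only the fields this sketch needs. */
typedef struct miniNode {
    struct miniNode *prev, *next;
    unsigned int count;   /* entries in this node's ziplist */
} miniNode;

typedef struct miniList {
    miniNode *head, *tail;
    unsigned long count;  /* total entries in all nodes */
} miniList;

/* Hypothetical helper: find the node holding logical index `idx` (0-based,
 * counted from the head) and store the position inside that node in *offset.
 * Returns NULL if idx is out of range. */
miniNode *mini_locate(const miniList *l, unsigned long idx, unsigned int *offset) {
    if (idx >= l->count) return NULL;
    for (miniNode *n = l->head; n != NULL; n = n->next) {
        if (idx < n->count) {
            *offset = (unsigned int)idx;
            return n;
        }
        idx -= n->count;  /* skip the whole node without reading its entries */
    }
    return NULL;
}
```

With three nodes holding 3, 5, and 2 entries, index 4 lands in the second node at offset 1, after skipping the first node in a single step.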

Summary

In short, quicklist combines the advantages of ziplist and linkedlist to further optimize the underlying storage of Redis list keys.

Final words

Read more and practice more, and over time we will discover many interesting things.

Origin blog.csdn.net/MarkusZhang/article/details/109000894