"Redis Design and Implementation" study notes (a)

@ (Redis)

I. Data structures and objects

1. SDS

SDS (Simple Dynamic String) is the string data structure that Redis implements for itself.
Structure:

struct sdshdr {
    int len;    // length of the string stored in buf
    int free;   // number of unused bytes in buf
    char buf[]; // byte array that holds the characters
};

Why not use a plain C character array?

SDS records the string length, so getting the length is an O(1) operation.
SDS allocates more memory than it needs. For example, a string of length 5 is given more than 5 bytes, so the next modification often does not need to reallocate. This reduces the number of allocations and frees and improves efficiency.
The underlying storage of an SDS is still a C character array, so it stays compatible with the C string functions.
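
The first two points can be illustrated with a small sketch. This is not the Redis source: the helpers sds_new, sds_len and sds_cat are hypothetical names, and the growth policy (reserve as much spare space as the new length) is an assumption made for the example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header layout from the notes; buf is a C99 flexible array member. */
struct sdshdr {
    int len;    /* bytes in use, excluding the trailing '\0' */
    int free;   /* unused bytes still available in buf */
    char buf[];
};

/* Hypothetical helper: create an SDS holding a copy of init, with no spare room. */
struct sdshdr *sds_new(const char *init) {
    assert(init != NULL);
    int len = (int)strlen(init);
    struct sdshdr *s = malloc(sizeof(*s) + len + 1);
    if (s == NULL) return NULL;
    s->len = len;
    s->free = 0;
    memcpy(s->buf, init, len + 1);   /* copies the '\0' as well */
    return s;
}

/* O(1): the length is read from the header instead of being counted. */
int sds_len(const struct sdshdr *s) { return s->len; }

/* Hypothetical helper: append t, reallocating only when free is too small.
 * Assumed policy: after growing, the spare space equals the new length.
 * On allocation failure NULL is returned and the old string is left untouched. */
struct sdshdr *sds_cat(struct sdshdr *s, const char *t) {
    int add = (int)strlen(t);
    if (s->free < add) {
        int newlen = s->len + add;
        struct sdshdr *grown = realloc(s, sizeof(*s) + newlen * 2 + 1);
        if (grown == NULL) return NULL;
        s = grown;
        s->free = newlen * 2 - s->len;
    }
    memcpy(s->buf + s->len, t, add + 1);
    s->len += add;
    s->free -= add;
    return s;
}
```

Appending " world" to "hello" triggers one reallocation; a following short append fits into the spare space and does not allocate again.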

2. Linked list

Data structures:
List node

typedef struct listNode {
    struct listNode *prev; // previous node
    struct listNode *next; // next node
    void *value;           // value of the node
} listNode;

List

typedef struct list {
    listNode *head;    // head node
    listNode *tail;    // tail node
    unsigned long len; // number of nodes
} list;

3. Dictionary

Hash table node

typedef struct dictEntry {
    void *key; // key
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v; // value: either a pointer (val) or an integer (u64, s64)
    struct dictEntry *next; // next entry in the same bucket, forming a linked list
} dictEntry;

Hash table

typedef struct dictht {
    dictEntry **table;      // hash table array
    unsigned long size;     // size of the hash table
    unsigned long sizemask; // size mask, always equal to size - 1
    unsigned long used;     // number of nodes stored
} dictht;

Dictionary

typedef struct dict {
    dictType *type; // type-specific functions
    void *privdata; // private data
    dictht ht[2];   // the two hash tables
    int rehashidx;  // rehash index; -1 when no rehash is in progress
} dict;

A dictionary holds two hash tables: ht[0] stores the data and ht[1] is used only during rehash.
To store k0 = v0, the program first computes the hash of k0, say 8. With size 4 and sizemask 3, the index is 8 & 3 = 0. Here & is bitwise AND; because size is always a power of two, hash & sizemask gives the same result as hash % size (8 % 4 = 0). The program then looks at ht[0].table[0]. If the bucket is empty, the new node is stored there. If it already holds nodes, the chain is followed through the next pointers to make sure k0 is not already present, and the new node is then linked into the chain. Redis links it at the head of the chain, since the chain keeps no tail pointer.
To read the value of k0, the program computes the index in the same way and walks the chain at that index until it finds the node whose key equals k0.
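
The index calculation and the chain walk can be sketched as follows. The types entry and table and the functions bucket_index and table_find are simplified, hypothetical stand-ins for dictEntry and dictht, and the hash value is passed in directly because the hash function itself is not covered in these notes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Reduced versions of the structures above: string keys, int values. */
typedef struct entry {
    const char *key;
    int val;
    struct entry *next;      /* next entry in the same bucket */
} entry;

typedef struct table {
    entry **buckets;
    unsigned long size;      /* always a power of two */
    unsigned long sizemask;  /* size - 1 */
} table;

/* Because size is a power of two, hash & sizemask equals hash % size. */
unsigned long bucket_index(const table *t, unsigned long hash) {
    assert((t->size & (t->size - 1)) == 0);
    return hash & t->sizemask;
}

/* Walk the chain at the computed index until the key matches. */
entry *table_find(const table *t, unsigned long hash, const char *key) {
    for (entry *e = t->buckets[bucket_index(t, hash)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) return e;
    return NULL;
}
```

With size 4, a hash of 8 maps to bucket 0 either way: 8 & 3 and 8 % 4 are both 0.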

As a result:

If the chain at an index is very long, finding a node and reading its value takes O(N) time.
When the number of nodes (used) grows past a certain point, the dictionary is rehashed: size, the length of the table, is increased, which shortens the chains.

Rehash

The rehash process:

When a rehash is triggered, the dictionary's rehashidx is set to 0 to mark that a rehash is in progress.
Redis does not start a separate process for this. It works progressively through the keys of ht[0], recomputes each index and moves the nodes into ht[1], whose size differs from that of ht[0] (larger when expanding).
While the rehash is in progress, delete, lookup and update operations are performed on both tables.
Insertions go only into ht[1].
When the rehash completes, ht[1] becomes ht[0], ht[1] is reset to an empty table, and rehashidx is set back to -1.
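
A single step of such a rehash might look like the sketch below. It is a simplified illustration under several assumptions: the names node, htable, dict and rehash_step are hypothetical, each node stores its hash so no hash function is needed, one call moves exactly one non-empty bucket, and the bucket arrays are owned by the caller (the sketch never frees them). The rehash index field is called rehashidx here.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    unsigned long hash;   /* stored hash of the key */
    struct node *next;
} node;

typedef struct htable {
    node **table;
    unsigned long size;   /* power of two */
    unsigned long used;
} htable;

typedef struct dict {
    htable ht[2];
    long rehashidx;       /* -1 when no rehash is in progress */
} dict;

/* Move one non-empty bucket from ht[0] to ht[1].
 * Returns 1 if more work remains, 0 once the rehash has finished. */
int rehash_step(dict *d) {
    if (d->rehashidx == -1) return 0;
    htable *src = &d->ht[0], *dst = &d->ht[1];
    assert((dst->size & (dst->size - 1)) == 0);

    if (src->used > 0) {
        /* Buckets below rehashidx are already empty; skip further empty ones. */
        while (src->table[d->rehashidx] == NULL) d->rehashidx++;
        node *n = src->table[d->rehashidx];
        while (n != NULL) {
            node *next = n->next;
            unsigned long idx = n->hash & (dst->size - 1);  /* new index */
            n->next = dst->table[idx];                      /* link at chain head */
            dst->table[idx] = n;
            src->used--;
            dst->used++;
            n = next;
        }
        src->table[d->rehashidx] = NULL;
        d->rehashidx++;
    }
    if (src->used == 0) {
        /* Finished: ht[1] takes the place of ht[0] and ht[1] is reset. */
        *src = *dst;
        dst->table = NULL;
        dst->size = 0;
        dst->used = 0;
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}
```

Spreading the work over many small steps is what lets the server keep answering commands while a large table is being moved.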

Load factor = used / size

When the load factor is less than 0.1, a shrink is performed.
When the server is not running BGSAVE or BGREWRITEAOF and the load factor is greater than or equal to 1, an expansion is performed.
When the server is running BGSAVE or BGREWRITEAOF and the load factor is greater than or equal to 5, an expansion is performed.
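
These rules fit into one small decision function. The sketch below is only an illustration: resize_decision and the resize_action enum are hypothetical names, and the thresholds used are 0.1 for shrinking and 1 or 5 for expanding.

```c
#include <assert.h>

typedef enum { RESIZE_NONE, RESIZE_EXPAND, RESIZE_SHRINK } resize_action;

/* used: number of stored nodes; size: number of buckets (must be > 0);
 * child_running: non-zero while BGSAVE or BGREWRITEAOF is in progress. */
resize_action resize_decision(unsigned long used, unsigned long size,
                              int child_running) {
    assert(size > 0);
    double load = (double)used / (double)size;
    /* The expansion threshold is higher while a child process is running. */
    if (load >= (child_running ? 5.0 : 1.0)) return RESIZE_EXPAND;
    if (load < 0.1) return RESIZE_SHRINK;
    return RESIZE_NONE;
}
```

For example, a table with 4 buckets and 4 nodes is expanded right away when no child process is running, but is left alone during a BGSAVE until it holds 20 nodes.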

4. Skip list

A skip list is mainly used to speed up traversal.
Take an ordered list: 1, 2, 3, 4, 5, 50.
To find node 50 you would have to walk through the five nodes before it, which is slow. If the first node also recorded where 50 or 5 is, the program could skip the nodes in between and go straight to the target node or close to it.
Note that the list must be ordered.

Skip list structure

header: points to the head node
tail: points to the tail node
level: the number of levels of the tallest node
length: the number of nodes

Jump table node structure

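typedef struct zskiplistNode {
    struct zskiplistNode *backward; // pointer to the previous node
    double score;                   // score of the node
    robj *obj;                      // object stored in the node
    struct zskiplistLevel {         // levels of the node, an array
        struct zskiplistNode *forward; // pointer to the next node at this level
        unsigned int span;             // distance covered by the forward pointer
    } level[];
} zskiplistNode;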

A skip list has one node per element.
When a new node is created, the program picks a random value between 1 and 32, following a power law so that larger values are less likely, and uses it as the size of the level array. That size is the height of the node.
In the example above, node 1 may have a level whose forward pointer goes to node 50 with a span of 5. To find 50, the program starts at node 1, sees that 1 is less than 50, and follows the levels of node 1:

If a level points to node 50, the target is found immediately.
If a level points to node 5, then 5 is not the target, so the search continues from there at the next level.
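
The search can be sketched on a reduced node type. This is a simplified illustration, not the zskiplist code: sknode, skip_find and MAX_LEVEL are hypothetical names, the span field is left out, each node has a fixed-size forward array, and the search starts from a separate header node at its highest level.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVEL 4   /* small cap for the sketch; the notes mention up to 32 */

typedef struct sknode {
    double score;
    int levels;                          /* number of levels in use */
    struct sknode *forward[MAX_LEVEL];   /* forward[i]: next node at level i */
} sknode;

/* Search from the header: at each level, move forward while the next score is
 * still below the target, then drop one level. *steps counts forward moves. */
sknode *skip_find(sknode *header, double target, int *steps) {
    assert(header != NULL && steps != NULL);
    sknode *x = header;
    *steps = 0;
    for (int i = header->levels - 1; i >= 0; i--) {
        while (x->forward[i] != NULL && x->forward[i]->score < target) {
            x = x->forward[i];
            (*steps)++;
        }
    }
    x = x->forward[0];   /* candidate: first node with score >= target */
    return (x != NULL && x->score == target) ? x : NULL;
}
```

In the list 1, 2, 3, 4, 5, 50, if node 5 has a second level that the header points to, reaching 50 takes one forward move to node 5 instead of five moves along the bottom level.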

5. Integer set

When a set contains only integers and the number of elements is small, an integer set (intset) is used.

typedef struct intset {
    uint32_t encoding; // encoding, i.e. the width of each integer, for example 16 or 32 bits
    uint32_t length;   // number of elements in the set
    int8_t contents[]; // the elements; their real type depends on encoding
} intset;

Upgrade

If the encoding is 16-bit and a 32-bit integer is inserted, the whole structure is upgraded to 32 bits.
It is never downgraded.
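
The upgrade rule can be expressed as a pair of small functions. This is a sketch under assumptions: the constants ENC_INT16, ENC_INT32 and ENC_INT64 and the functions required_encoding and encoding_after_insert are hypothetical names, and the encoding is represented by the element width in bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Encodings expressed as element width in bytes (an assumption of this sketch). */
enum { ENC_INT16 = 2, ENC_INT32 = 4, ENC_INT64 = 8 };

/* Smallest encoding able to hold v. */
int required_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return ENC_INT32;
    return ENC_INT16;
}

/* Encoding of the set after inserting v: it stays the same or grows,
 * and never shrinks, which mirrors the "no downgrade" rule. */
int encoding_after_insert(int current, int64_t v) {
    assert(current == ENC_INT16 || current == ENC_INT32 || current == ENC_INT64);
    int need = required_encoding(v);
    return need > current ? need : current;
}
```

Inserting 70000 into a 16-bit set moves it to 32 bits; inserting 1 into a 32-bit set leaves it at 32 bits.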

6. Ziplist (compressed list)

When a list or a hash is short, it is stored as a ziplist.

Attributes:

zlbytes: uint32_t, total number of bytes the ziplist occupies
zltail: uint32_t, offset in bytes from the start address to the last node, so the tail can be located quickly
zllen: number of nodes
entryX: the nodes themselves
zlend: marks the end of the ziplist

Unlike the structures above, which are C structs inside Redis, a ziplist is a single contiguous block of memory.

1. Layout of a node, i.e. the structure of each entryX

previous_entry_length: length in bytes of the previous node
encoding: type and length of the data stored in content
content: the data itself

To find an element:

Read zltail and jump to the tail node.
Read the tail node's encoding to determine the length of its content.
Read the content.
If the content is not the one being looked for,
use previous_entry_length to step back to the start of the previous node and repeat the steps above.
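
The backward walk can be sketched on a deliberately simplified byte layout. The layout is an assumption made for the example and is not the real ziplist encoding: each entry is one byte for the previous entry's length, one byte for the content length (standing in for encoding), then the content. The function name zl_find_backward is hypothetical, and the tail offset is passed in directly instead of being read from zltail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified entry layout (assumed for this sketch):
 * [prev_len: 1 byte][content_len: 1 byte][content bytes] */

/* Search backwards from the entry at offset tail. Returns the offset of the
 * entry whose content equals key, or -1 if no entry matches. */
long zl_find_backward(const unsigned char *zl, size_t tail, const char *key) {
    assert(zl != NULL && key != NULL);
    size_t klen = strlen(key);
    size_t pos = tail;
    for (;;) {
        unsigned char prev_len = zl[pos];
        unsigned char content_len = zl[pos + 1];
        if (content_len == klen && memcmp(zl + pos + 2, key, klen) == 0)
            return (long)pos;
        if (prev_len == 0) return -1;   /* the head entry has no predecessor */
        pos -= prev_len;                /* jump to the start of the previous entry */
    }
}
```

No pointers are stored anywhere: moving to the previous entry is plain offset arithmetic, which is where the memory saving comes from.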

If a new element is inserted at the head, the previous_entry_length of the old head node must be updated.
If an element is deleted, the previous_entry_length of the node after it must be updated as well.

Advantages of the ziplist

Simple, and it saves memory.
Reads and writes are fairly simple and do not require changing many values.

7. Objects

Redis has five object types:

String
List
Hash
Set
Sorted set

The TYPE command shows the object type of a key.

Depending on the situation, each of these five object types uses one of the data structures described above, and the data structure can change while the server is running.

The OBJECT ENCODING command shows the data structure behind a key.

redis 127.0.0.1:6701> set test 'aaa'
OK
redis 127.0.0.1:6701> type test
string
redis 127.0.0.1:6701> object encoding test
"embstr"
redis 127.0.0.1:6701>

embstr is the SDS described above.

Data structures used by Redis (encoding name, then structure):

int: integer
embstr: SDS, embstr-encoded
raw: SDS (Simple Dynamic String)
hashtable: dictionary
linkedlist: double-ended linked list
ziplist: ziplist (compressed list)
intset: integer set
skiplist: skip list together with a dictionary

Structure of an object

typedef struct redisObject {
    unsigned type:4;     // object type, one of the five types above
    unsigned encoding:4; // data structure in use, one of the encodings above
    void *ptr;           // pointer to the underlying data structure
} robj;

1. String object

The encoding can be int, raw or embstr.
If the string consists only of digits and fits in an integer, int is used.
If the string is longer than 39 bytes, raw is used.
If the string is 39 bytes or shorter, embstr is used.

The difference between raw and embstr is that raw needs two memory allocations (one for the redisObject, one for the SDS), while embstr needs only one.
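
The choice between the three encodings can be sketched as below. The functions string_encoding and is_canonical_long are hypothetical; the integer test (optional minus sign, digits only, no leading zeros, must fit in a long) is an assumption of this sketch, and the 39-byte limit is the one given in these notes.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Assumed integer test: canonical decimal form that fits in a long. */
static int is_canonical_long(const char *s) {
    const char *p = s;
    if (*p == '-') p++;
    if (*p == '\0') return 0;                  /* empty string or a lone "-" */
    if (*p == '0' && p[1] != '\0') return 0;   /* leading zero, e.g. "007" */
    if (*p == '0' && p != s) return 0;         /* "-0" */
    for (const char *q = p; *q != '\0'; q++)
        if (*q < '0' || *q > '9') return 0;
    errno = 0;
    (void)strtol(s, NULL, 10);
    return errno != ERANGE;                    /* reject values that overflow */
}

/* Returns "int", "embstr" or "raw" for the given string value. */
const char *string_encoding(const char *s) {
    assert(s != NULL);
    if (is_canonical_long(s)) return "int";
    return strlen(s) <= 39 ? "embstr" : "raw";
}
```

This matches the session shown earlier: the value 'aaa' is short and not a number, so it is reported as embstr.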

2. List object

A list object is stored as a ziplist or a linkedlist.
If every element is shorter than 64 bytes and the list has fewer than 512 elements, ziplist is used; otherwise linkedlist.
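
As a small illustration of this rule, the sketch below picks the encoding for a list of strings. The function name list_encoding is hypothetical and the limits 64 and 512 are hard-coded from the text above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns "ziplist" when all n items are shorter than 64 bytes and n < 512,
 * otherwise "linkedlist". */
const char *list_encoding(const char *const *items, size_t n) {
    assert(items != NULL || n == 0);
    if (n >= 512) return "linkedlist";
    for (size_t i = 0; i < n; i++)
        if (strlen(items[i]) >= 64) return "linkedlist";
    return "ziplist";
}
```

A single 64-byte element is enough to switch the whole list to linkedlist, no matter how few elements it has.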

3. Hash object

A hash object is stored as a ziplist or a hashtable.
If every key and value is shorter than 64 bytes and there are fewer than 512 key-value pairs, ziplist is used; otherwise hashtable.

4. Set object

The encoding is intset or hashtable.

When all elements are integers and there are no more than 512 of them, intset is used; otherwise hashtable.

5. Sorted set

The encoding is ziplist or skiplist.
When there are fewer than 128 elements and every element is shorter than 64 bytes, ziplist is used; otherwise skiplist.

6. Memory reclamation

Redis uses reference counting to reclaim memory.

Each object has an int refcount attribute that records how many times it is referenced.
In the code that uses objects:

Whenever a new reference to an object is taken, incrRefCount must be called on it to increase the count by 1.
Whenever a reference is dropped, decrRefCount must be called to decrease refcount by 1. When the count reaches 0, the object's memory is released.
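
A minimal sketch of this scheme follows. The type obj and the functions obj_create, obj_incr and obj_decr are hypothetical stand-ins for redisObject, incrRefCount and decrRefCount; the sketch assumes that a new object starts with a count of 1 and that the payload was allocated with malloc.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct obj {
    int refcount;
    void *ptr;      /* payload owned by the object, allocated with malloc */
} obj;

/* Create an object that takes ownership of ptr; the creator holds one reference. */
obj *obj_create(void *ptr) {
    obj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->refcount = 1;
    o->ptr = ptr;
    return o;
}

/* Take an additional reference. */
void obj_incr(obj *o) {
    assert(o != NULL && o->refcount > 0);
    o->refcount++;
}

/* Drop a reference. Returns 1 if this was the last one and the object was freed. */
int obj_decr(obj *o) {
    assert(o != NULL && o->refcount > 0);
    if (--o->refcount == 0) {
        free(o->ptr);
        free(o);
        return 1;
    }
    return 0;
}
```

After obj_decr returns 1 the pointer must not be used again, which is why every holder of a reference has to pair each obj_incr with exactly one obj_decr.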

7. Object sharing

String objects for the integers 0 to 9999 are created when Redis starts and are then shared.

Other values are not shared because, to share them, Redis would have to check on every creation whether an equal object already exists in memory, and that lookup is too expensive.

8. Object idle time

Each object has an attribute unsigned lru:22; that records when the object was last accessed.

The OBJECT IDLETIME command shows how long a key has been idle, which is the current time minus lru.
When Redis reclaims memory and the eviction policy is volatile-lru or allkeys-lru, keys that have been idle longer are reclaimed first.