Redis source analysis (c) - the underlying data structures


This article covers the basic data structures underlying Redis: SDS, the linked list, and the dictionary.

SDS

SDS (Simple Dynamic String) is the most basic data structure in Redis. Rather than using C's native strings, Redis implements its own dynamic string type.

The SDS data structure:

struct sdshdr {
    
    // length of the space already used in buf
    int len;

    // length of the remaining free space in buf
    int free;

    // data space
    char buf[];
};

So an SDS contains three fields: len, the length of buf in use; free, the remaining unused length of buf; and buf itself.

Why this design?

  1. The string length can be obtained directly.
    In C, getting a string's length means traversing it from the string pointer, which is O(n). An SDS reads the length directly from len, which is O(1).

  2. Buffer overflows are prevented.
    Because a C string does not record its length, appending characters carelessly can overflow the buffer and overwrite adjacent data. Before an SDS grows, it checks whether free is large enough for the new data; if it is not, the whole buf is expanded first, which prevents the overflow.

  3. Fewer memory reallocations when the string length changes.
    Redis is a high-performance in-memory database with strict demands on response speed, and its strings are likely to change frequently. By tracking unused space in free, SDS decouples the length of the string from the length of the underlying buf: the length of buf is no longer the length of the string. On this basis SDS implements space pre-allocation and lazy release.

    • Pre-allocation
      When an SDS is modified and needs to grow, if the new len is less than 1 MB, an equal amount of free space is allocated, so buf occupies 2 * len + 1 bytes. The extra byte stores the terminating null byte.
      If the new len is 1 MB or more, 1 MB of free space is allocated, so buf occupies len + 1 MB + 1 bytes.
    • Lazy release
      When an SDS string is shortened, Redis does not immediately shrink the memory the SDS occupies; it only increases free. An API is provided so that the memory occupied by the SDS is shrunk only when it really needs to be released.
  4. Binary safety.
    A C string uses "\0" to mark the end of the string, whereas an SDS uses len. A binary stream may itself contain "\0", which would end a C string prematurely. Because SDS does not rely on "\0" to find the end, it can store arbitrary binary data as well as text.

  5. Compatibility with C.
    By convention SDS still terminates buf with "\0", so some of the common C string functions can be used on it.
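
To make the pre-allocation and binary-safety points concrete, here is a minimal sketch in C. It assumes the sdshdr layout shown earlier; the helpers sketch_new and sketch_cat and the constant SKETCH_PREALLOC_MAX are hypothetical names for illustration and are not Redis's API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_PREALLOC_MAX (1024 * 1024) /* the 1 MB threshold from the text */

/* Same layout as the sdshdr shown earlier. */
struct sdshdr {
    int len;     /* bytes of buf in use */
    int free;    /* unused bytes remaining in buf */
    char buf[];  /* data, followed by a terminating '\0' */
};

/* Hypothetical helper: create an SDS from the first initlen bytes of init. */
static struct sdshdr *sketch_new(const void *init, size_t initlen) {
    struct sdshdr *sh = malloc(sizeof(*sh) + initlen + 1);
    if (sh == NULL) return NULL;
    memcpy(sh->buf, init, initlen);
    sh->buf[initlen] = '\0';
    sh->len = (int)initlen;
    sh->free = 0;
    return sh;
}

/* Hypothetical helper: append addlen bytes, growing buf only when free is too small. */
static struct sdshdr *sketch_cat(struct sdshdr *sh, const void *data, size_t addlen) {
    assert(sh != NULL);
    if ((size_t)sh->free < addlen) {
        size_t newlen = (size_t)sh->len + addlen;
        /* Pre-allocation: double below 1 MB, otherwise add 1 MB of slack. */
        size_t cap = (newlen < SKETCH_PREALLOC_MAX) ? newlen * 2
                                                    : newlen + SKETCH_PREALLOC_MAX;
        struct sdshdr *grown = realloc(sh, sizeof(*sh) + cap + 1);
        if (grown == NULL) return NULL; /* sh is still valid on failure */
        sh = grown;
        sh->free = (int)(cap - (size_t)sh->len);
    }
    /* The copy is driven by addlen, not by '\0', so embedded null bytes survive. */
    memcpy(sh->buf + sh->len, data, addlen);
    sh->len += (int)addlen;
    sh->free -= (int)addlen;
    sh->buf[sh->len] = '\0';
    return sh;
}
```

For example, appending the three bytes "c\0d" to an SDS holding "ab" leaves len at 5 and free at 5, while strlen on buf would stop at the embedded null byte and report 3.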

List

C has no built-in linked list, so Redis implements its own. A list node in Redis is defined as:

typedef struct listNode {

    // previous node
    struct listNode *prev;

    // next node
    struct listNode *next;

    // value of the node
    void *value;

} listNode;

This is a typical doubly linked list node.

Redis also provides the following structures for operating on the doubly linked list:

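/*
 * Doubly linked list iterator
 */
typedef struct listIter {

    // node the iteration has currently reached
    listNode *next;

    // direction of iteration
    int direction;

} listIter;

/*
 * Doubly linked list structure
 */
typedef struct list {

    // head node
    listNode *head;

    // tail node
    listNode *tail;

    // function for duplicating a node value
    void *(*dup)(void *ptr);

    // function for freeing a node value
    void (*free)(void *ptr);

    // function for comparing a node value against a key
    int (*match)(void *ptr, void *key);

    // number of nodes in the list
    unsigned long len;

} list;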

The linked list structure is relatively simple. To summarize its properties:

  • It is doubly linked, so finding the previous or next node of a given node is O(1).
  • The list records its head and tail, so finding the head or tail node is O(1).
  • The list records its length in len, so getting the length is O(1).
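
These O(1) properties can be illustrated with a small sketch in C. The list struct here is trimmed down (the dup, free, and match callbacks are left out), and the helpers sketch_push_tail and sketch_del_node are hypothetical names for illustration, not functions taken from the Redis source.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;
} listNode;

/* Trimmed-down list: the dup/free/match callbacks are omitted for brevity. */
typedef struct list {
    listNode *head;
    listNode *tail;
    unsigned long len;
} list;

/* Hypothetical helper: append a value at the tail in O(1) using the tail pointer. */
static listNode *sketch_push_tail(list *l, void *value) {
    assert(l != NULL);
    listNode *node = malloc(sizeof(*node));
    if (node == NULL) return NULL;
    node->value = value;
    node->next = NULL;
    node->prev = l->tail;
    if (l->tail != NULL) l->tail->next = node;
    else l->head = node;            /* first node is both head and tail */
    l->tail = node;
    l->len++;
    return node;
}

/* Hypothetical helper: unlink and free a node in O(1), given its pointer. */
static void sketch_del_node(list *l, listNode *node) {
    assert(l != NULL && node != NULL);
    if (node->prev != NULL) node->prev->next = node->next;
    else l->head = node->next;      /* node was the head */
    if (node->next != NULL) node->next->prev = node->prev;
    else l->tail = node->prev;      /* node was the tail */
    free(node);
    l->len--;
}
```

Because every node carries both prev and next, removing a node needs no traversal, and len is simply kept up to date on each change rather than recounted.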

Dictionary

The dictionary is very similar to HashMap in Java.

The Redis dictionary is built from three basic data structures. The lowest-level unit is the hash table node, structured as follows:

typedef struct dictEntry {
    
    // key
    void *key;

    // value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;

    // points to the next hash table node, forming a linked list
    struct dictEntry *next;

} dictEntry;

A hash table node is in fact a node of a singly linked list: it stores a pointer to the next node. key is the node's key and v is its value, which may be a pointer, a uint64_t integer, or an int64_t integer. next points to the next node.

A hash table links these nodes together through an array:

typedef struct dictht {
    
    // hash table array
    dictEntry **table;

    // hash table size
    unsigned long size;
    
    // hash table size mask, used to compute index values
    // always equal to size - 1
    unsigned long sizemask;

    // number of nodes currently in the hash table
    unsigned long used;

} dictht;

In fact, readers familiar with Java's basic data structures will notice that this is very similar to Java's HashMap: an array plus linked lists.
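
As an illustration of the array-plus-linked-list layout, the following C sketch adds and finds entries in a single dictht. It assumes string keys and a table size that is a power of two (so that size - 1 works as a bit mask); the hash function (djb2, chosen only for brevity) and the sketch_ helper names are hypothetical and are not part of the Redis API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct dictEntry {
    void *key;
    union { void *val; uint64_t u64; int64_t s64; } v;
    struct dictEntry *next;
} dictEntry;

typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

/* Hypothetical string hash (djb2), chosen only for brevity. */
static unsigned long sketch_hash(const char *key) {
    unsigned long h = 5381;
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h;
}

/* Hypothetical helper: size must be a power of two so that size - 1 is a mask. */
static int sketch_ht_init(dictht *ht, unsigned long size) {
    assert(size > 0 && (size & (size - 1)) == 0);
    ht->table = calloc(size, sizeof(dictEntry *));
    if (ht->table == NULL) return -1;
    ht->size = size;
    ht->sizemask = size - 1;
    ht->used = 0;
    return 0;
}

/* Hypothetical helper: store an integer value under key, at the head of its bucket.
 * It does not check for an existing entry with the same key. */
static dictEntry *sketch_ht_add(dictht *ht, char *key, uint64_t value) {
    unsigned long idx = sketch_hash(key) & ht->sizemask;
    dictEntry *e = malloc(sizeof(*e));
    if (e == NULL) return NULL;
    e->key = key;
    e->v.u64 = value;           /* the union lets small integers avoid a pointer */
    e->next = ht->table[idx];   /* a colliding node is chained behind the new one */
    ht->table[idx] = e;
    ht->used++;
    return e;
}

/* Hypothetical helper: walk the bucket's chain, comparing keys. */
static dictEntry *sketch_ht_find(const dictht *ht, const char *key) {
    dictEntry *e = ht->table[sketch_hash(key) & ht->sizemask];
    while (e != NULL && strcmp(e->key, key) != 0) e = e->next;
    return e;
}
```

With a table of size 1 every key lands in bucket 0, so adding "a" and then "b" produces the chain b -> a, which makes the collision handling easy to observe.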

The dictionary data structure:

typedef struct dict {

    // type-specific functions
    dictType *type;

    // private data
    void *privdata;

    // hash tables
    dictht ht[2];

    // rehash index
    // -1 when no rehash is in progress
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */

    // number of safe iterators currently running
    int iterators; /* number of iterators currently running */

} dict;

dictType is a set of type-specific functions, as follows:

/*
 * Dictionary type-specific functions
 */
typedef struct dictType {

    // function that computes the hash value
    unsigned int (*hashFunction)(const void *key);

    // function that duplicates a key
    void *(*keyDup)(void *privdata, const void *key);

    // function that duplicates a value
    void *(*valDup)(void *privdata, const void *obj);

    // function that compares keys
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);

    // function that destroys a key
    void (*keyDestructor)(void *privdata, void *key);
    
    // function that destroys a value
    void (*valDestructor)(void *privdata, void *obj);

} dictType;
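
To show how such a function table might be filled in, here is a sketch of a dictType for keys that are null-terminated C strings. All of the sketch_ callbacks and the sketch_string_key_type instance are hypothetical; the hash is djb2, chosen only for brevity, and treating a nonzero keyCompare result as "equal" is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct dictType {
    unsigned int (*hashFunction)(const void *key);
    void *(*keyDup)(void *privdata, const void *key);
    void *(*valDup)(void *privdata, const void *obj);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    void (*keyDestructor)(void *privdata, void *key);
    void (*valDestructor)(void *privdata, void *obj);
} dictType;

/* Hypothetical hash callback: djb2 over a null-terminated string. */
static unsigned int sketch_str_hash(const void *key) {
    const unsigned char *p = key;
    unsigned int h = 5381;
    while (*p) h = h * 33 + *p++;
    return h;
}

/* Hypothetical key-copy callback: the dictionary owns its own copy of the key. */
static void *sketch_str_dup(void *privdata, const void *key) {
    (void)privdata;
    assert(key != NULL);
    size_t n = strlen(key) + 1;
    char *copy = malloc(n);
    if (copy != NULL) memcpy(copy, key, n);
    return copy;
}

/* Hypothetical compare callback: nonzero means "equal" in this sketch. */
static int sketch_str_compare(void *privdata, const void *k1, const void *k2) {
    (void)privdata;
    return strcmp(k1, k2) == 0;
}

/* Hypothetical key destructor: releases the copy made by sketch_str_dup. */
static void sketch_str_free(void *privdata, void *key) {
    (void)privdata;
    free(key);
}

/* Values are neither copied nor freed here, so those slots stay NULL. */
static dictType sketch_string_key_type = {
    sketch_str_hash, sketch_str_dup, NULL,
    sketch_str_compare, sketch_str_free, NULL
};
```

Keeping these callbacks in a separate struct is what lets one dictionary implementation serve different key and value types: the dict only ever calls through the function pointers.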

Here we can see that a dict has two dictht. Normally only ht[0] is used; ht[1] is used only while the table is being rehashed to expand or shrink it.

When studying a hash structure, a natural first question is how data is inserted. Let us sort out the logic of inserting data into a dict.

  • The hash value of the key is computed, and the hash maps it to a position in the table array.

  • If that position already holds data, a hash collision has occurred. The new node's next pointer is set to the node already there, forming a linked list.

  • If keys collide many times, the linked lists grow longer and longer and dictionary lookups slow down. To keep the load at a normal level, Redis rehashes the dictionary, resizing the table array. Rehash is therefore the part of the Redis dictionary worth focusing on. It proceeds as follows:

    • Based on the amount of data in ht[0] and the type of operation (expand or shrink), allocate ht[1] at the appropriate size.
    • Rehash the data in ht[0] into ht[1].
    • When the rehash is complete, make ht[1] the new ht[0] and create a fresh ht[1] for the next rehash.

Progressive rehash

If a dictionary holds a large number of keys, say more than ten million, a rehash takes a relatively long time. So that the dictionary can keep serving requests while it rehashes, Redis implements a progressive rehash. The steps are as follows:

  1. Allocate space for ht[1], so that the dictionary holds both ht[0] and ht[1].
  2. Maintain a rehashidx in the dictionary and set it to 0, indicating that a rehash is in progress.
  3. During the rehash, each time an operation is performed on the dictionary, in addition to the requested operation, the key-value pairs in ht[0] at index rehashidx are rehashed into ht[1], and rehashidx is then incremented.
  4. As operations proceed, all of the data in ht[0] is eventually rehashed into ht[1]. rehashidx is then set to -1, ending the progressive rehash.
    This lets the data be rehashed smoothly and prevents a long rehash from blocking the thread.
  • While a rehash is in progress, delete, lookup, and update operations are performed on both hash tables. A lookup searches ht[0] first and, if the key is not found there, goes on to search ht[1]. An insert only adds data to ht[1]. This guarantees that ht[1] only grows while ht[0] only shrinks.
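
The progressive rehash steps can be sketched in C as follows. This is a simplified model under stated assumptions, not the Redis implementation: keys are plain integers used directly as hash values, table sizes are powers of two, duplicate keys are not checked, the rehash is started by hand rather than by a load check, and each operation migrates exactly one bucket. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct entry {
    unsigned long key;            /* integer keys keep the sketch short */
    long val;
    struct entry *next;
} entry;

typedef struct table {
    entry **slots;
    unsigned long size, sizemask, used;
} table;

typedef struct sketch_dict {
    table ht[2];
    int rehashidx;                /* -1 when no rehash is in progress */
} sketch_dict;

static int table_init(table *t, unsigned long size) {
    t->slots = calloc(size, sizeof(entry *));
    if (t->slots == NULL) return -1;
    t->size = size; t->sizemask = size - 1; t->used = 0;
    return 0;
}

static int sketch_init(sketch_dict *d, unsigned long size) {
    d->rehashidx = -1;
    d->ht[1] = (table){0};
    return table_init(&d->ht[0], size);
}

/* Steps 1 and 2: allocate ht[1] and set rehashidx to 0. */
static int sketch_start_rehash(sketch_dict *d, unsigned long newsize) {
    assert(d->rehashidx == -1 && (newsize & (newsize - 1)) == 0);
    if (table_init(&d->ht[1], newsize) != 0) return -1;
    d->rehashidx = 0;
    return 0;
}

/* Step 3: move one bucket of ht[0] into ht[1], then advance rehashidx. */
static void sketch_rehash_step(sketch_dict *d) {
    if (d->rehashidx == -1) return;
    entry *e = d->ht[0].slots[d->rehashidx];
    while (e != NULL) {
        entry *next = e->next;
        unsigned long idx = e->key & d->ht[1].sizemask;
        e->next = d->ht[1].slots[idx];
        d->ht[1].slots[idx] = e;
        d->ht[0].used--;
        d->ht[1].used++;
        e = next;
    }
    d->ht[0].slots[d->rehashidx++] = NULL;
    /* Step 4: once every bucket has moved, ht[1] becomes ht[0]. */
    if ((unsigned long)d->rehashidx == d->ht[0].size) {
        free(d->ht[0].slots);
        d->ht[0] = d->ht[1];
        d->ht[1] = (table){0};
        d->rehashidx = -1;
    }
}

/* Inserts go to ht[1] while rehashing, so ht[0] never grows. */
static int sketch_add(sketch_dict *d, unsigned long key, long val) {
    sketch_rehash_step(d);
    table *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    entry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->key = key;
    e->val = val;
    e->next = t->slots[key & t->sizemask];
    t->slots[key & t->sizemask] = e;
    t->used++;
    return 0;
}

/* Lookups check ht[0] first, then ht[1] if a rehash is in progress. */
static entry *sketch_find(sketch_dict *d, unsigned long key) {
    sketch_rehash_step(d);
    for (int i = 0; i < 2; i++) {
        table *t = &d->ht[i];
        for (entry *e = t->slots[key & t->sizemask]; e != NULL; e = e->next) {
            if (e->key == key) return e;
        }
        if (d->rehashidx == -1) break;   /* ht[1] is unused outside a rehash */
    }
    return NULL;
}
```

Each call to sketch_add or sketch_find pays for moving a single bucket, so the cost of the rehash is spread across ordinary operations instead of being paid in one long pause.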

Origin blog.csdn.net/csdn_kou/article/details/103048966