Storage system basics: Redis (1) Data structures and data types

1. Redis data structure

Redis uses six underlying data structures: simple dynamic strings, linked lists, dictionaries, skip lists, integer sets, and compressed lists.

(1) Simple Dynamic String (SDS)
Redis uses plain C strings only as string literals. In most other cases, it represents strings with SDS (Simple Dynamic String).

The data structure of SDS:

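struct sdshdr {
    // Number of bytes used in the buf array;
    // equal to the length of the string stored in the SDS
    int len;

    // Number of unused bytes in the buf array
    int free;

    // Byte array that holds the string
    char buf[];
};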

Compared with C strings, SDS has the following advantages:

  • The string length can be obtained in constant time, O(1).
  • Buffer overflows are prevented.
  • Fewer memory reallocations are needed when a string is modified.
  • It is binary-safe.
  • It is compatible with part of the C string library functions.
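
The first three advantages follow directly from the len and free fields. The following is a minimal sketch, not the actual Redis code: the demo_sds_* helpers are hypothetical names, and the growth policy (reserving as much spare space as the new length) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as the sdshdr structure shown above. */
struct sdshdr {
    int len;     /* bytes used in buf, not counting the trailing '\0' */
    int free;    /* bytes allocated in buf but not yet used */
    char buf[];  /* string data followed by a trailing '\0' */
};

/* Hypothetical helper: build an SDS from a C string, with no spare space. */
static struct sdshdr *demo_sds_new(const char *init) {
    size_t n = strlen(init);
    struct sdshdr *s = malloc(sizeof(*s) + n + 1);
    assert(s != NULL);
    s->len = (int)n;
    s->free = 0;
    memcpy(s->buf, init, n + 1);
    return s;
}

/* O(1): the length is read from the header instead of scanning for '\0'. */
static int demo_sds_len(const struct sdshdr *s) {
    return s->len;
}

/* Append t. The free field is checked first, so the copy can never run
 * past the end of buf; when growing, extra space is reserved so that
 * later appends need no reallocation (policy assumed for this sketch). */
static struct sdshdr *demo_sds_cat(struct sdshdr *s, const char *t) {
    int n = (int)strlen(t);
    if (s->free < n) {
        int newcap = (s->len + n) * 2;
        s = realloc(s, sizeof(*s) + (size_t)newcap + 1);
        assert(s != NULL);
        s->free = newcap - s->len;
    }
    memcpy(s->buf + s->len, t, (size_t)n + 1);
    s->len += n;
    s->free -= n;
    return s;
}
```

Because buf still ends with '\0', it can be passed to read-only C string functions such as strcmp, which is what the last advantage refers to.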

2. Linked list (list)

The data structure of the linked list:

typedef struct listNode {
    // Previous node
    struct listNode *prev;

    // Next node
    struct listNode *next;

    // Value of the node
    void *value;
} listNode;

typedef struct list {
    // Head node
    listNode *head;

    // Tail node
    listNode *tail;

    // Number of nodes in the list
    unsigned long len;

    // Function that duplicates a node value
    void *(*dup)(void *ptr);

    // Function that frees a node value
    void (*free)(void *ptr);

    // Function that compares a node value with a key
    int (*match)(void *ptr, void *key);
} list;


  • Linked lists are widely used to implement Redis features such as list keys, publish/subscribe, the slow query log, and monitors.

  • Each node is represented by a listNode structure with pointers to the previous and next nodes, so the Redis linked list is a doubly linked list.

  • Each linked list is represented by a list structure, which holds the head node pointer, the tail node pointer, and the length of the list.

  • Because the prev pointer of the head node and the next pointer of the tail node both point to NULL, the Redis linked list is acyclic.

  • By setting different type-specific functions for the linked list, the Redis linked list can be used to store various types of values.
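
The properties above can be sketched with a stripped-down version of the two structures. This is an illustration only: the dup, free and match callbacks are left out, and the demo_list_* functions are hypothetical names, not the Redis API.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct listNode {
    struct listNode *prev;
    struct listNode *next;
    void *value;             /* void * lets a node hold any type of value */
} listNode;

typedef struct list {
    listNode *head;
    listNode *tail;
    unsigned long len;       /* kept up to date, so the length is O(1) */
} list;

static list *demo_list_create(void) {
    list *l = calloc(1, sizeof(*l));   /* head = tail = NULL, len = 0 */
    assert(l != NULL);
    return l;
}

/* O(1) append, because the list structure keeps a tail pointer. */
static void demo_list_add_tail(list *l, void *value) {
    listNode *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->next = NULL;          /* the tail's next is always NULL (acyclic) */
    n->prev = l->tail;
    if (l->tail != NULL)
        l->tail->next = n;
    else
        l->head = n;         /* first node: its prev stays NULL */
    l->tail = n;
    l->len++;
}

/* Frees the nodes and the list, but not the values they point to. */
static void demo_list_release(list *l) {
    listNode *n = l->head;
    while (n != NULL) {
        listNode *next = n->next;
        free(n);
        n = next;
    }
    free(l);
}
```

Walking from tail through the prev pointers visits the values in reverse order, which is the benefit of keeping both directions in every node.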

3. Dictionary (dict)

The data structure of the dictionary:

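typedef struct dictht {
    // Hash table array
    dictEntry **table;

    // Size of the hash table
    unsigned long size;

    // Size mask, used to compute the index; always equal to size - 1
    unsigned long sizemask;

    // Number of nodes currently in the hash table
    unsigned long used;
} dictht;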

typedef struct dictEntry {
    // Key
    void *key;

    // Value
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
    } v;

    // Pointer to the next hash table node, forming a linked list
    struct dictEntry *next;
} dictEntry;

typedef struct dict {
    // Type-specific functions
    dictType *type;

    // Private data
    void *privdata;

    // Hash tables
    dictht ht[2];

    // Rehash index;
    // the value is -1 when no rehash is in progress
    int rehashidx; /* rehashing not in progress if rehashidx == -1 */
} dict;

typedef struct dictType {
    // Function that computes the hash value
    unsigned int (*hashFunction)(const void *key);

    // Function that duplicates a key
    void *(*keyDup)(void *privdata, const void *key);

    // Function that duplicates a value
    void *(*valDup)(void *privdata, const void *obj);

    // Function that compares two keys
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);

    // Function that destroys a key
    void (*keyDestructor)(void *privdata, void *key);

    // Function that destroys a value
    void (*valDestructor)(void *privdata, void *obj);
} dictType;


  • Dictionaries are widely used to implement various functions of Redis, including databases and hash keys.
  • The dictionary in Redis uses a hash table as its underlying implementation. Each dictionary has two hash tables: one for normal use and one used only during rehashing.
  • When the dictionary is used as the underlying implementation of the database or of a hash key, Redis uses the MurmurHash2 algorithm to calculate the hash value of the key.
  • The hash table resolves key collisions by separate chaining: key-value pairs assigned to the same index are linked into a singly linked list.
  • When the hash table is expanded or shrunk, the program rehashes all key-value pairs from the existing hash table into the new one. This rehash is not done all at once but incrementally.
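
The index calculation, the chaining, and the incremental rehash described above can be sketched as follows. This is a simplified illustration under stated assumptions: keys are C strings, values are ints, the hash is djb2 standing in for MurmurHash2, and all demo* names are hypothetical rather than taken from Redis.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct demoEntry {
    const char *key;
    int val;
    struct demoEntry *next;      /* next entry in the same bucket */
} demoEntry;

typedef struct demoTable {
    demoEntry **table;
    unsigned long size;          /* a power of two in this sketch */
    unsigned long sizemask;      /* size - 1 */
    unsigned long used;
} demoTable;

typedef struct demoDict {
    demoTable ht[2];
    long rehashidx;              /* -1 when no rehash is in progress */
} demoDict;

/* Stand-in hash function (djb2), NOT MurmurHash2. */
static unsigned long demo_hash(const char *key) {
    unsigned long h = 5381;
    while (*key) h = h * 33 + (unsigned char)*key++;
    return h;
}

static void demo_table_init(demoTable *t, unsigned long size) {
    t->table = calloc(size, sizeof(*t->table));
    assert(t->table != NULL);
    t->size = size;
    t->sizemask = size - 1;
    t->used = 0;
}

/* index = hash & sizemask; colliding entries are chained at the bucket head. */
static void demo_table_add(demoTable *t, demoEntry *e) {
    unsigned long idx = demo_hash(e->key) & t->sizemask;
    e->next = t->table[idx];
    t->table[idx] = e;
    t->used++;
}

static void demo_dict_init(demoDict *d, unsigned long size) {
    demo_table_init(&d->ht[0], size);
    memset(&d->ht[1], 0, sizeof(d->ht[1]));
    d->rehashidx = -1;
}

static void demo_start_rehash(demoDict *d, unsigned long new_size) {
    demo_table_init(&d->ht[1], new_size);
    d->rehashidx = 0;
}

/* Move one non-empty bucket from ht[0] to ht[1].
 * Returns 1 while work remains, 0 once the rehash has finished. */
static int demo_rehash_step(demoDict *d) {
    if (d->rehashidx == -1) return 0;
    if (d->ht[0].used > 0) {
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;
        demoEntry *e = d->ht[0].table[d->rehashidx];
        while (e != NULL) {
            demoEntry *next = e->next;
            d->ht[0].used--;
            demo_table_add(&d->ht[1], e);
            e = next;
        }
        d->ht[0].table[d->rehashidx++] = NULL;
    }
    if (d->ht[0].used == 0) {            /* done: ht[1] becomes ht[0] */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        memset(&d->ht[1], 0, sizeof(d->ht[1]));
        d->rehashidx = -1;
        return 0;
    }
    return 1;
}

/* During a rehash an entry may live in either table, so both are searched. */
static demoEntry *demo_find(const demoDict *d, const char *key) {
    for (int i = 0; i < 2; i++) {
        const demoTable *t = &d->ht[i];
        if (t->table != NULL) {
            demoEntry *e = t->table[demo_hash(key) & t->sizemask];
            for (; e != NULL; e = e->next)
                if (strcmp(e->key, key) == 0) return e;
        }
        if (d->rehashidx == -1) break;
    }
    return NULL;
}
```

Each call to demo_rehash_step does a bounded amount of work, so the cost of moving a large table can be spread over many operations instead of blocking on one.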

4. Skip list (skiplist)

The skip list is an ordered data structure in which each node keeps multiple pointers to other nodes so that nodes can be reached quickly. A lookup takes O(log N) time on average and O(N) in the worst case, and nodes can also be processed in batches through sequential operations.
The data structure of the skip list:

typedef struct zskiplistNode {
    // Backward pointer
    struct zskiplistNode *backward;

    // Score
    double score;

    // Member object
    robj *obj;

    // Levels (a flexible array member must be the last field)
    struct zskiplistLevel {
        // Forward pointer
        struct zskiplistNode *forward;

        // Span
        unsigned int span;
    } level[];
} zskiplistNode;

typedef struct zskiplist {
    // Header node and tail node
    struct zskiplistNode *header, *tail;

    // Number of nodes in the list
    unsigned long length;

    // Level count of the node with the most levels
    int level;
} zskiplist;


  • The skip list is one of the underlying implementations of the sorted set.
  • The Redis skip list implementation consists of two structures: zskiplist stores information about the skip list (such as the header node, tail node, and length), and zskiplistNode represents a skip list node.
  • The height of each skip list node is a random number between 1 and 32.
  • Within one skip list, multiple nodes can have the same score, but the member object of each node must be unique.
  • Nodes in the skip list are ordered by score. When scores are equal, nodes are ordered by member object.
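
Two of the rules above lend themselves to a short sketch: choosing a random height between 1 and 32, and ordering nodes by score and then by member. The code below is illustrative only. The promotion probability of 0.25, the use of rand(), and the representation of members as plain C strings are assumptions of this sketch, and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAXLEVEL 32   /* upper bound on node height, as stated above */
#define DEMO_P 0.25        /* assumed chance of growing one more level */

/* Returns a height in [1, DEMO_MAXLEVEL]; each extra level is less
 * likely than the previous one, so tall nodes are rare. */
static int demo_random_level(void) {
    int level = 1;
    while (level < DEMO_MAXLEVEL && (rand() / (RAND_MAX + 1.0)) < DEMO_P)
        level++;
    return level;
}

/* Ordering rule: compare scores first, and fall back to the member only
 * when the scores are equal. Returns <0, 0 or >0 like strcmp. */
static int demo_node_compare(double score_a, const char *member_a,
                             double score_b, const char *member_b) {
    if (score_a < score_b) return -1;
    if (score_a > score_b) return 1;
    return strcmp(member_a, member_b);
}
```

Because members are unique, two distinct nodes never compare as equal under this rule, which gives every node a well-defined position in the list.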

5. Integer set (intset)

The integer set (intset) is one of the underlying implementations of set keys. When a set contains only integer elements and the number of elements is small, Redis uses an integer set as the underlying implementation of the set key.
The data structure of the integer set:

typedef struct intset {
    // Encoding
    uint32_t encoding;

    // Number of elements in the set
    uint32_t length;

    // Array that holds the elements
    int8_t contents[];
} intset;


  • The integer set is one of the underlying implementations of set keys.
  • The integer set is backed by an array that stores the elements in sorted order without duplicates. When necessary, the program changes the element type of the array to fit a newly added element.
  • The upgrade operation makes the integer set flexible to use while saving as much memory as possible.
  • Integer sets support only upgrades, not downgrades.
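
The upgrade behaviour can be sketched as follows. This is a simplified illustration, not the Redis implementation: the encoding is stored as the number of bytes per element, contents is a separately allocated buffer rather than a flexible array, the insertion position is found by a linear scan, and the demo_* names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Encodings expressed as bytes per element (an assumption of this sketch). */
enum { DEMO_ENC_INT16 = 2, DEMO_ENC_INT32 = 4, DEMO_ENC_INT64 = 8 };

typedef struct demo_intset {
    uint32_t encoding;   /* bytes per element */
    uint32_t length;     /* number of elements */
    int8_t *contents;    /* heap buffer holding the sorted elements */
} demo_intset;

/* Smallest encoding that can hold v. */
static uint32_t demo_required_encoding(int64_t v) {
    if (v < INT32_MIN || v > INT32_MAX) return DEMO_ENC_INT64;
    if (v < INT16_MIN || v > INT16_MAX) return DEMO_ENC_INT32;
    return DEMO_ENC_INT16;
}

/* Read element i, interpreting the buffer with encoding enc. */
static int64_t demo_get(const demo_intset *s, uint32_t i, uint32_t enc) {
    const int8_t *p = s->contents + (size_t)i * enc;
    if (enc == DEMO_ENC_INT64) { int64_t v; memcpy(&v, p, sizeof v); return v; }
    if (enc == DEMO_ENC_INT32) { int32_t v; memcpy(&v, p, sizeof v); return v; }
    int16_t v; memcpy(&v, p, sizeof v); return v;
}

/* Write element i using the set's current encoding. */
static void demo_set(demo_intset *s, uint32_t i, int64_t value) {
    int8_t *p = s->contents + (size_t)i * s->encoding;
    if (s->encoding == DEMO_ENC_INT64) {
        memcpy(p, &value, sizeof value);
    } else if (s->encoding == DEMO_ENC_INT32) {
        int32_t v = (int32_t)value; memcpy(p, &v, sizeof v);
    } else {
        int16_t v = (int16_t)value; memcpy(p, &v, sizeof v);
    }
}

/* Insert value in sorted position, upgrading the encoding if it does not
 * fit. Returns 1 if the value was added, 0 if it was already present. */
static int demo_intset_add(demo_intset *s, int64_t value) {
    uint32_t old = s->encoding;
    uint32_t need = demo_required_encoding(value);
    uint32_t enc = need > old ? need : old;   /* upgrade only, never downgrade */
    uint32_t pos = 0;

    while (pos < s->length && demo_get(s, pos, old) < value) pos++;
    if (pos < s->length && demo_get(s, pos, old) == value) return 0;

    int8_t *buf = realloc(s->contents, (size_t)(s->length + 1) * enc);
    assert(buf != NULL);
    s->contents = buf;
    s->encoding = enc;
    /* Move elements from the back: each one is read with the old encoding
     * and written with the new one, so no unread element is overwritten. */
    for (uint32_t i = s->length; i-- > 0; )
        demo_set(s, i < pos ? i : i + 1, demo_get(s, i, old));
    demo_set(s, pos, value);
    s->length++;
    return 1;
}
```

A set that only ever holds small values stays at two bytes per element, and the wider encodings are paid for only once a value actually needs them.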

6. Compressed list (ziplist)

The compressed list (ziplist) is one of the underlying implementations of list keys and hash keys. When a list key contains only a small number of items, and each item is either a small integer or a short string, Redis uses a compressed list as the underlying implementation of the list key.

  • The compressed list is a sequential data structure designed to save memory.
  • The compressed list is one of the underlying implementations of list keys and hash keys.
  • A compressed list can contain multiple nodes, and each node stores either a byte array or an integer value.
  • Adding a node to a compressed list, or deleting one from it, may trigger a cascade update, but such updates are unlikely in practice.

7. Redis data type

In Redis, keys are always strings, while values can take many data types, which gives developers a rich set of storage options. The five commonly used value types are string, list, dictionary (hash), set, and sorted set (sortedset).

8. String

The string type is the simplest. Its underlying data structure is the simple dynamic string (SDS) in Redis.

9. List (list)

The list type stores an ordered group of values. It has two implementations: a compressed list (ziplist) and a doubly linked list.
When the amount of data stored in the list is small, the list can be implemented as a compressed list.

Specifically, the following two conditions must be met at the same time:

  • Each element stored in the list (which may be a string) is smaller than 64 bytes;
  • The number of elements in the list is less than 512.

Dictionary (hash)

The dictionary type stores a set of key-value pairs. It also has two implementations: a compressed list and a hash table.
As with lists, Redis uses a compressed list to implement the dictionary type only when the amount of stored data is small.

Two conditions must be met:

  • The size of each key and value stored in the dictionary must be less than 64 bytes;
  • The number of key-value pairs in the dictionary must be less than 512.

10. Set

The set type stores a group of unique values. It has two implementations: one based on an ordered array and one based on a hash table.

Redis uses an ordered array to implement the set type when the stored data meets both of the following conditions:

  • The stored values are all integers;
  • The number of stored elements does not exceed 512.

Sorted set (sortedset)

A sorted set stores a group of values, each paired with a score. The values are ordered by score in a structure such as a skip list, which supports fast lookup by score value and by score range.

Sorted sets also have two implementations: the skip list and the compressed list. A compressed list is used only when both of the following conditions hold:

  • The size of all data must be less than 64 bytes;
  • The number of elements must be less than 128.
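
The size conditions given in this section for lists, dictionaries, sets, and sorted sets can be summarised as simple predicates. The sketch below only restates the thresholds quoted in the text. The function names and the two-value demo_encoding type are hypothetical, and the code does not show how Redis itself makes the decision.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    DEMO_COMPACT,   /* compressed list or ordered integer array */
    DEMO_GENERAL    /* linked list, hash table, or skip list */
} demo_encoding;

/* List and dictionary: every element (or key and value) is under
 * 64 bytes and there are fewer than 512 entries. */
static demo_encoding demo_list_or_hash_encoding(size_t max_item_bytes,
                                                size_t count) {
    return (max_item_bytes < 64 && count < 512) ? DEMO_COMPACT : DEMO_GENERAL;
}

/* Set: all values are integers and there are at most 512 of them. */
static demo_encoding demo_set_encoding(int all_integers, size_t count) {
    return (all_integers && count <= 512) ? DEMO_COMPACT : DEMO_GENERAL;
}

/* Sorted set: every value is under 64 bytes and there are fewer
 * than 128 elements. */
static demo_encoding demo_sorted_set_encoding(size_t max_item_bytes,
                                              size_t count) {
    return (max_item_bytes < 64 && count < 128) ? DEMO_COMPACT : DEMO_GENERAL;
}
```

In every case both conditions must hold at the same time, so a single oversized element or one entry too many is enough to rule out the compact implementation.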

Origin blog.csdn.net/m0_50662680/article/details/109309394