Detailed explanation of redis_5 data structures and their underlying implementation principles

1. Data structure in redis

Redis supports five data types: string (string), hash (hash), list (list), set (unordered set) and zset (ordered set)
insert image description here
In the seckill project, I have used redis Set and Hash structure:

String: A key corresponds to a string, and string is the most basic data type of Redis. (The byte abase framework only implements the string data structure of redis, so if we want to store complex data structures, we can only convert them into strings in json format for storage) list: a key corresponds to a list of strings
, The bottom layer is implemented using a doubly linked list, and it supports many operations supported by a doubly linked list.
Hash:
insert image description here

Set: For example, an instance of Set: A = {'a', 'b', 'c'}, A is the key of the set, 'a', 'b' and 'c' are the members of the set. Unordered, no repeating elements.
SortedSet: A score is added to the set, and the data in the set is ordered.

3. The underlying implementation of redis data structure

string
is implemented using a data type called Simple Dynamic String (SDS).

/*  
 * 保存字符串对象的结构  
 */  
struct sdshdr {
    
      
    int len;  // buf 中已占用空间的长度  
    int free;  // buf 中剩余可用空间的长度
    char buf[];  // 数据空间  
};

Advantages of SDS over C strings:

SDS saves the length of the string, but the C string does not save the length. It is necessary to traverse the entire array (until '\0' is found) to get the length of the string.
When modifying SDS, check whether the given SDS space is sufficient, if not, expand the SDS space first to prevent buffer overflow. The C string does not check whether the string space is sufficient, and it is easy to cause a buffer overflow when calling some functions (such as the strcat string concatenation function).
The mechanism of SDS pre-allocating space can reduce the number of times to reallocate space for strings.

4、 list

Implemented using a doubly linked list.
insert image description here

5、hash

The hash structure is actually a dictionary with many key-value pairs (similar to python's dict type).
Redis's hash table is a dicttht structure:
typedef struct dicttht { dictEntry **table;//Hash table array unsigned long size;//Hash table size unsigned long sizemask;//Hash table size mask, use For calculating the index value unsigned long used;//The number of existing nodes in the hash table }




insert image description here

The structure of the hash table node is as follows:
typeof struct dictEntry{ void *key;//key union{ //The types of values ​​corresponding to different keys may be different, use union to deal with this problem void *val; uint64_tu64; int64_ts64; } struct dictEntry *next; }







One of the methods to resolve hash conflicts is the zipper method.
In order to keep the loading factor of the hash table within a reasonable range, it is necessary to expand or shrink the size of the hash table, which is called rehash. There are a total of two hash table dictht structures in the dictionary, ht[0] is used to store key-value pairs, ht[1] is used to temporarily store data during rehash, usually the hash table it points to is empty and needs to be expanded or contracted ht[0]'s hash table is allocated space for it.

For example, expanding the hash table is to allocate a space twice the size of ht[0] for ht[1], then migrate all the data of ht[0] to ht[1] through rehash, and finally release ht[0] ], make ht[1] become ht[0], and assign an empty hash table to ht[1]. Shrinking hashtables is similar.

Progressive rehash: redis does not specifically find time to perform rehash at one time, but gradually. During rehash, it does not affect external access to ht[0]. It is required to synchronize the corresponding data to ht[1] when modifying the dictionary , when all data transfer is completed, rehash ends.
———————————————

6、set

set can be implemented with intset or dictionary.

Intset
only uses intset when the data are all integer values ​​and the number is less than 512. Intset is an ordered set composed of integers, which can be used for binary search.
insert image description here
Dictionaries
Use dictionaries (zipper method) when the usage conditions of intset are not met. When using dictionaries, set the value to null.
insert image description here

7、 check

Each element in zset contains the data itself and a corresponding score (score).
Classic example: the key of a zset is "math", which represents the grades of the math class, and then a lot of data can be inserted into this key. When entering data, each time you need to enter a name and a corresponding score. Then the name is the data itself, and the score is its score.

The data of zset itself does not allow duplication, but the score allows duplication.

The underlying implementation principle of zset:
insert image description here

When the data is small, use ziplist: ziplist occupies continuous memory, and each element is stored continuously in the form of (data + score), sorted by score from small to large. In order to save memory, the space occupied by each element of ziplist can be different. For large data (long long), more bytes are used for storage, and for small data (short), less bytes are used for storage. Therefore, it is necessary to traverse in order when searching. Ziplist saves memory but has low search efficiency.
When there is a lot of data, use a dictionary + skip table:

Supongo que te gusta

Origin blog.csdn.net/chuige2013/article/details/130257327
Recomendado
Clasificación