Redis data structures (1): Redis data storage and the implementation of the String type

1 Introduction

Redis is an in-memory, non-relational key-value database. It is widely used in projects because of its fast read and write responses, atomic operations, and rich data types: String, List, Hash, Set, and Sorted Set. This article discusses how Redis stores data and how its data structures are implemented, starting with the String type.

2 Data storage

2.1 RedisDB

Redis stores data in redisDb structures. By default there are 16 databases, numbered 0 to 15. Each database is an independent keyspace, so the same key name in different databases does not conflict. You can switch databases with the SELECT command. Cluster mode only uses db0.

typedef struct redisDb {
    dict *dict; /* The keyspace for this DB */
    dict *expires; /* Timeout of keys with a timeout set */
    ...
} redisDb;
  • dict: database key space, saves all key-value pairs in the database
  • expires: the expiration dictionary; each key is a key from the keyspace, and each value is the UNIX timestamp (in milliseconds) at which that key expires

2.2 Redis Hash Table Implementation

2.2.1 Hash dictionary dict

For KV storage, the first structure that comes to mind is a map. Redis implements it with dict, whose data structure is as follows:

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    unsigned long iterators; /* number of iterators currently running */
} dict;
  • type: a pointer to a dictType structure. Each dictType holds a set of functions for manipulating key-value pairs of a specific type, and Redis sets different type-specific functions for dictionaries with different purposes.
  • privdata: private data, the optional parameters that are passed to those type-specific functions
  • ht[2]: an array of two dictht hash tables. Normally the dictionary only uses ht[0]; ht[1] is only used while ht[0] is being rehashed.
  • rehashidx: the rehash index; the value is -1 when no rehash is in progress

A hash function has two properties:

  • Identical inputs always produce the same output
  • Different inputs may produce the same output

The second property means that hash collisions are unavoidable. dict deals with them using the functions in dictType: hashFunction selects the bucket, and keyCompare tells apart the keys that land in the same bucket.

typedef struct dictType {
    uint64_t (*hashFunction)(const void *key);
    int (*keyCompare)(void *privdata, const void *key1, const void *key2);
    ...
} dictType;
  • hashFunction: The method used to calculate the hash value of the key
  • keyCompare: The method used to compare two keys for equality

2.2.2 Hash table dictht

dict.h/dictht represents a hash table with the following structure:

typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;
  • table: array pointer, each element in the array is a pointer to a dict.h/dictEntry structure, and each dictEntry structure holds a key-value pair.
  • size: records the size of the hash table, that is, the size of the table array, the size is always 2^n
  • sizemask: always equal to size - 1, this property, together with the hash value, determines at which index of the table array a key should be placed.
  • used: records the number of existing nodes (key-value pairs) in the hash table.
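The role of sizemask can be sketched in a few lines. This is a minimal illustration, not Redis code: bucket_index is a hypothetical helper name, and it relies only on the property stated above that size is always a power of two.

```c
#include <stdint.h>

/* Hypothetical helper (not part of Redis): maps a hash value to an
 * index in the table array. Because size is a power of two,
 * size - 1 has all low bits set, so the bitwise AND is equivalent
 * to hash % size but cheaper. */
static unsigned long bucket_index(uint64_t hash, unsigned long size) {
    unsigned long sizemask = size - 1;
    return (unsigned long)(hash & sizemask);
}
```

For a table of size 8 the mask is 7 (binary 111), so a hash of 13 (binary 1101) lands in bucket 5.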

key-value pair dict.h/dictEntry

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;
  • key: holds the key in the key-value pair (an SDS string)
  • v: holds the value in the key-value pair, which can be a uint64_t integer, an int64_t integer, a double, or a pointer (val) to a value wrapped in a redisObject
  • next: points to the next node in the same bucket, forming a linked list. This pointer chains together key-value pairs whose hashes map to the same index, which is how key collisions are resolved.
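To make the chaining through next concrete, here is a minimal sketch of a fixed-size chained hash table. All names (toyEntry, toyTable, toy_hash, toy_add, toy_find, toy_clear) are hypothetical, the string hash is a simple djb2-style stand-in rather than the function Redis uses, and strcmp plays the role of keyCompare.

```c
#include <stdlib.h>
#include <string.h>

#define TOY_SIZE 4  /* power of two, so TOY_SIZE - 1 works as a mask */

typedef struct toyEntry {
    const char *key;
    int val;
    struct toyEntry *next;  /* next entry in the same bucket */
} toyEntry;

typedef struct toyTable {
    toyEntry *table[TOY_SIZE];
    unsigned long used;
} toyTable;

/* djb2-style string hash, a stand-in for dictType.hashFunction. */
static unsigned long toy_hash(const char *s) {
    unsigned long h = 5381;
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Inserts at the head of the bucket's chain (no duplicate check).
 * Returns 0 on success, -1 if allocation fails. */
static int toy_add(toyTable *t, const char *key, int val) {
    toyEntry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    unsigned long idx = toy_hash(key) & (TOY_SIZE - 1);
    e->key = key;
    e->val = val;
    e->next = t->table[idx];
    t->table[idx] = e;
    t->used++;
    return 0;
}

/* Walks the chain of one bucket; strcmp stands in for keyCompare. */
static toyEntry *toy_find(const toyTable *t, const char *key) {
    unsigned long idx = toy_hash(key) & (TOY_SIZE - 1);
    for (toyEntry *e = t->table[idx]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) return e;
    return NULL;
}

/* Frees every entry and resets the table. */
static void toy_clear(toyTable *t) {
    for (int i = 0; i < TOY_SIZE; i++) {
        toyEntry *e = t->table[i];
        while (e != NULL) {
            toyEntry *n = e->next;
            free(e);
            e = n;
        }
        t->table[i] = NULL;
    }
    t->used = 0;
}
```

With only four buckets, inserting six keys guarantees that at least two of them share a bucket, and toy_find still returns the right entry because it compares keys along the chain.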

A hash table inevitably runs into hash collisions. Entries that collide form a linked list in the same array slot. When the number of entries far exceeds the size of the table, many slots degenerate into long linked lists, and in the extreme case the lookup time complexity goes from O(1) to O(n). Conversely, if the number of entries keeps shrinking, the table wastes space. Redis handles both situations by expanding or shrinking the table according to its load factor:

  • Load factor: number of nodes saved in the hash table / hash table size, i.e. load_factor = ht[0].used / ht[0].size

Expand operation (either condition triggers it):

  • The server is not currently executing the BGSAVE or BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 1
  • The server is currently executing the BGSAVE or BGREWRITEAOF command, and the load factor of the hash table is greater than or equal to 5

Shrink operation:

  • When the load factor of the hash table is less than 0.1, the program automatically starts to shrink the hash table.
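The thresholds above can be written as two small predicates. This is a sketch under stated assumptions: needs_expand and needs_shrink are hypothetical names, child_running stands for "BGSAVE or BGREWRITEAOF is in progress", and treating an empty table as always needing expansion is a choice made for this sketch.

```c
#include <stdbool.h>

/* True when used / size has reached the expand threshold:
 * 1 normally, 5 while a background save or rewrite is running. */
static bool needs_expand(unsigned long used, unsigned long size,
                         bool child_running) {
    if (size == 0) return true;  /* assumption: an empty table always grows */
    unsigned long threshold = child_running ? 5 : 1;
    return used >= size * threshold;
}

/* True when used / size < 0.1, written without floating point. */
static bool needs_shrink(unsigned long used, unsigned long size) {
    return size > 0 && used * 10 < size;
}
```

The comparisons are rearranged to use integer multiplication, which avoids division and floating-point rounding.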

If Redis moved all the key-value pairs in one go during expansion, a large table would block the server and client commands could not be processed for a while. Redis therefore rehashes incrementally (progressive rehash). The steps are as follows:

  1. Allocate space for ht[1], so that the dictionary holds two hash tables at the same time
  2. Set rehashidx to 0, which marks the official start of the rehash
  3. During the rehash, every time an add, delete, find, or update operation is performed on the dictionary, the program not only performs that operation but also rehashes all key-value pairs in the ht[0] bucket at index rehashidx to ht[1]. When that bucket is done, the program increments rehashidx by one
  4. At some point all key-value pairs of ht[0] have been rehashed to ht[1]. The program then sets rehashidx to -1, indicating that the rehash is complete

During the progressive rehash, delete, find, and update operations are performed on both hash tables. For example, to look up a key, the program first searches ht[0] and, if the key is not found, continues the search in ht[1]. Newly added key-value pairs are always saved to ht[1]; nothing is added to ht[0] any more. This guarantees that the number of key-value pairs in ht[0] only decreases, so ht[0] eventually becomes an empty table as the rehash proceeds. If the dictionary receives no operations for a long time, the server's periodic event-loop task carries on the rehash.
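The four steps and the lookup rule can be sketched as follows. This is a simplified model, not the Redis implementation: all names (rEntry, rTable, rDict and the functions) are hypothetical, keys are unsigned integers used directly as their own hash, and the sketch moves one non-empty bucket per operation without the limits a production implementation would place on scanning empty buckets.

```c
#include <stdlib.h>

typedef struct rEntry { unsigned long key; int val; struct rEntry *next; } rEntry;
typedef struct rTable { rEntry **table; unsigned long size, sizemask, used; } rTable;
typedef struct rDict  { rTable ht[2]; long rehashidx; } rDict;

static int table_init(rTable *t, unsigned long size) {  /* size: power of two */
    t->table = calloc(size, sizeof(rEntry *));
    if (t->table == NULL) return -1;
    t->size = size; t->sizemask = size - 1; t->used = 0;
    return 0;
}

static int dict_init(rDict *d, unsigned long size) {
    d->ht[1] = (rTable){0};
    d->rehashidx = -1;
    return table_init(&d->ht[0], size);
}

/* Steps 1 and 2: allocate ht[1] and set rehashidx to 0. */
static int dict_start_rehash(rDict *d, unsigned long newsize) {
    if (table_init(&d->ht[1], newsize) != 0) return -1;
    d->rehashidx = 0;
    return 0;
}

/* Step 3: move one non-empty bucket. Step 4: finish when ht[0] is empty. */
static void rehash_step(rDict *d) {
    if (d->rehashidx == -1) return;
    if (d->ht[0].used > 0) {
        /* Buckets below rehashidx are already empty, so a non-empty
         * bucket must exist at or after rehashidx. */
        while (d->ht[0].table[d->rehashidx] == NULL) d->rehashidx++;
        rEntry *e = d->ht[0].table[d->rehashidx];
        while (e != NULL) {
            rEntry *next = e->next;
            unsigned long idx = e->key & d->ht[1].sizemask;
            e->next = d->ht[1].table[idx];
            d->ht[1].table[idx] = e;
            d->ht[0].used--;
            d->ht[1].used++;
            e = next;
        }
        d->ht[0].table[d->rehashidx] = NULL;
        d->rehashidx++;
    }
    if (d->ht[0].used == 0) {      /* ht[1] becomes the new ht[0] */
        free(d->ht[0].table);
        d->ht[0] = d->ht[1];
        d->ht[1] = (rTable){0};
        d->rehashidx = -1;
    }
}

/* New entries go to ht[1] while a rehash is in progress. */
static int dict_add(rDict *d, unsigned long key, int val) {
    rehash_step(d);
    rTable *t = (d->rehashidx != -1) ? &d->ht[1] : &d->ht[0];
    rEntry *e = malloc(sizeof(*e));
    if (e == NULL) return -1;
    e->key = key; e->val = val;
    e->next = t->table[key & t->sizemask];
    t->table[key & t->sizemask] = e;
    t->used++;
    return 0;
}

/* Searches ht[0] first, then ht[1] if a rehash is in progress. */
static rEntry *dict_find(rDict *d, unsigned long key) {
    rehash_step(d);
    for (int i = 0; i < 2; i++) {
        rTable *t = &d->ht[i];
        if (t->size > 0)
            for (rEntry *e = t->table[key & t->sizemask]; e != NULL; e = e->next)
                if (e->key == key) return e;
        if (d->rehashidx == -1) break;  /* ht[1] is unused outside a rehash */
    }
    return NULL;
}

static void dict_release(rDict *d) {
    for (int i = 0; i < 2; i++) {
        for (unsigned long b = 0; b < d->ht[i].size; b++)
            for (rEntry *e = d->ht[i].table[b], *n; e != NULL; e = n) {
                n = e->next;
                free(e);
            }
        free(d->ht[i].table);
    }
}
```

Because every add and find pays for migrating at most one bucket, the cost of the resize is spread over many operations instead of being paid in a single long pause.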

server.h/redisObject

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    int refcount;
    void *ptr;
} robj;
  • type:4: the data type of the object (string, list, hash, set, or sorted set). It constrains which commands a client may run against the key; a command meant for a different type is rejected. 4 bits
  • encoding:4: the underlying encoding Redis uses to store the value, 4 bits
  • lru:LRU_BITS: access information for the object, used by the memory eviction policy
  • refcount: the reference count used to manage the object's memory, 4 bytes
  • ptr: a pointer to the actual stored value, 8 bytes
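The field sizes listed above add up to 16 bytes, which can be checked with a standalone copy of the structure. This sketch assumes LRU_BITS is 24 and a typical 64-bit platform with 4-byte int and 8-byte pointers; print_robj_layout is a hypothetical helper.

```c
#include <stddef.h>
#include <stdio.h>

#define LRU_BITS 24  /* assumed width of the lru field */

typedef struct redisObject {
    unsigned type:4;        /* 4 bits */
    unsigned encoding:4;    /* 4 bits */
    unsigned lru:LRU_BITS;  /* 24 bits; the three bit-fields share 4 bytes */
    int refcount;           /* 4 bytes */
    void *ptr;              /* 8 bytes on a 64-bit platform */
} robj;

/* Prints the layout so the 4 + 4 + 8 = 16 byte total can be inspected. */
static void print_robj_layout(void) {
    printf("sizeof(robj)=%zu refcount@%zu ptr@%zu\n",
           sizeof(robj), offsetof(robj, refcount), offsetof(robj, ptr));
}
```

The three bit-fields total 32 bits, so they pack into the first 4 bytes, followed by refcount at offset 4 and ptr at offset 8.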

The complete structure diagram is as follows:

3 String type

3.1 String type usage scenarios

Redis strings can hold three kinds of values: strings, integers, and floating-point numbers. The main usage scenarios are as follows:

1) Dynamic page cache
For example, when a dynamic page is generated for the first time, the back-end data can be rendered and stored in a Redis string. On later visits the page is read directly from Redis without any database request. The characteristic is that the first access is relatively slow and subsequent accesses are fast.

2) Data cache
In front-end/back-end separated development, some data is stored in the database but rarely changes, for example a table of countries and regions. If the back end reads from the relational database on every front-end request, the overall performance of the website suffers.
We can store all the region information in a Redis string on the first visit; on later requests the region JSON string is read directly from Redis and returned to the front end.

3) Data statistics
Redis integers can be used to count website visits or downloads of a file, using atomic increments and decrements (INCR / DECR).

4) Limiting the number of requests within a time window
For example, a logged-in user requests an SMS verification code, and the code is valid for 5 minutes. When the user requests the SMS interface for the first time, a Redis string keyed by the user id records that an SMS has been sent, with an expiration time of 5 minutes. When the user requests the SMS interface again and the record still exists, no further SMS is sent.

5) Distributed session
When nginx is used for load balancing and each back-end server stores its own sessions, session information is lost whenever a request is routed to a different server, because the sessions are not shared. A third-party store is therefore needed to hold the sessions, either a relational database or a non-relational database such as Redis. For this workload, the read and write performance of a relational database is far below that of a non-relational database such as Redis.

3.2 Implementation of String type - SDS structure

Redis does not use C strings directly to implement the String type; it uses its own structure, SDS (simple dynamic string). Before Redis 3.2, SDS was defined as follows:

struct sdshdr {
    int len;
    int free;
    char buf[];
};
  • len: the number of bytes in use in buf, i.e. the length of the string
  • free: the number of allocated but unused bytes in buf
  • buf: the byte array that holds the actual data

3.3 Differences between SDS and C strings

3.3.1 Query Time Complexity

In C, getting the length of a string takes O(n), because the string must be scanned up to the terminating null character. SDS records the length in len, so the same query takes O(1).

3.3.2 Buffer overflow

C strings do not record their own length, which easily leads to buffer overflows. The SDS API guards against this through its space allocation strategy: before modifying an SDS, it first checks whether the SDS has enough space for the modification. If not, it automatically expands the SDS to the required size and only then performs the modification. Callers therefore never resize an SDS by hand, and modifications made through the SDS API do not overflow the buffer.

In SDS, the length of the buf array is not necessarily the number of characters plus one, the array can contain unused bytes, and the number of these bytes is recorded by the free attribute of SDS. Through unused space, SDS implements two optimization strategies: space pre-allocation and lazy space release:

  • Space pre-allocation: when a modification needs more room, the program first checks whether the unused space is sufficient. If it is, the unused space is used directly and no memory reallocation takes place. If it is not, the program allocates not only the space the modification needs but also extra unused space: the new size is (len + addlen) * 2, where addlen is the number of bytes being added, as long as len + addlen is below 1 MB; above that, only 1 MB of extra space is added each time. With this strategy, growing a string N times in a row requires at most N memory reallocations instead of exactly N.
  • Lazy space release: this optimizes string shortening. When the string saved by an SDS is shortened, the program does not immediately reallocate memory to reclaim the bytes that are no longer needed; it records their number in the free attribute and keeps them for future use.
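The pre-allocation rule can be sketched as a pure function that returns the capacity to use after an append. grow_capacity and PREALLOC_LIMIT are hypothetical names chosen for this illustration, and alloc here means the current capacity of buf (len plus the unused bytes).

```c
#include <stddef.h>

#define PREALLOC_LIMIT (1024 * 1024)  /* the 1 MB threshold from the text */

/* Returns the capacity buf should have after appending addlen bytes.
 * Assumes alloc >= len. The capacity is unchanged when the unused
 * space (alloc - len) already covers the append. */
static size_t grow_capacity(size_t len, size_t alloc, size_t addlen) {
    if (alloc - len >= addlen) return alloc;  /* no reallocation needed */
    size_t newlen = len + addlen;
    if (newlen < PREALLOC_LIMIT) return newlen * 2;  /* double small strings */
    return newlen + PREALLOC_LIMIT;                  /* add 1 MB to large ones */
}
```

For example, appending 3 bytes to a full 5-byte string yields a capacity of 16, so the next few appends need no reallocation at all.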

3.3.3 Binary Safety

The characters in a C string must conform to some encoding (such as ASCII), and the string cannot contain null characters except at its end; otherwise the first null character the program reads is mistaken for the end of the string.

The SDS APIs are all binary-safe: the data in the buf array is treated as raw binary, and the program makes no restrictions, filters, or assumptions about it. Data is read back exactly as it was written. In other words, Redis uses this array to hold a series of bytes rather than characters.
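The difference is easy to demonstrate with a length-prefixed buffer. The toySds type and toy_sds_new function below are hypothetical and much simpler than the real SDS; they only show that an explicit length survives an embedded null byte, while strlen does not.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal length-prefixed string, loosely modelled on the idea of SDS. */
typedef struct toySds {
    size_t len;   /* number of bytes stored, known in O(1) */
    char buf[];   /* data followed by a terminating '\0' */
} toySds;

/* Copies exactly len bytes, including any embedded '\0' bytes.
 * Returns NULL if allocation fails. */
static toySds *toy_sds_new(const void *data, size_t len) {
    toySds *s = malloc(sizeof(*s) + len + 1);
    if (s == NULL) return NULL;
    memcpy(s->buf, data, len);
    s->buf[len] = '\0';
    s->len = len;
    return s;
}
```

Storing the 7 bytes of "foo\0bar" this way keeps len at 7 and all 7 bytes intact, whereas strlen on the same buffer stops at the embedded null and reports 3.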

3.4 SDS structure optimization

Many values stored as String are only a few bytes long, yet the int-typed len and free fields take 4 bytes each, 8 bytes of header in total. Since version 3.2, Redis picks one of five header structures, sdshdr5, sdshdr8, sdshdr16, sdshdr32, or sdshdr64, according to the length of the string. The structures are as follows:

struct __attribute__ ((__packed__)) sdshdr5 {
    unsigned char flags; /* 3 lsb of type, and 5 msb of string length */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len; /* used */
    uint8_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr16 {
    uint16_t len; /* used */
    uint16_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr32 {
    uint32_t len; /* used */
    uint32_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
struct __attribute__ ((__packed__)) sdshdr64 {
    uint64_t len; /* used */
    uint64_t alloc; /* excluding the header and null terminator */
    unsigned char flags; /* 3 lsb of type, 5 unused bits */
    char buf[];
};
  • flags: the 3 low bits store the header type. In sdshdr5 the 5 high bits store the string length; in the other headers they are unused
  • len: the used length
  • alloc: the allocated size (excluding the header and the null terminator); the remaining space is alloc - len
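How a header might be chosen from the string length can be sketched as below. The names hdrType, pick_header, and header_size are hypothetical, and the thresholds are an assumption derived from the width of each header's length field (5 bits, 8 bits, 16 bits, 32 bits); the header sizes follow from the packed structures above.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { HDR_5, HDR_8, HDR_16, HDR_32, HDR_64 } hdrType;

/* Picks the smallest header whose length field can represent len. */
static hdrType pick_header(uint64_t len) {
    if (len < (1u << 5))    return HDR_5;   /* fits in 5 bits of flags */
    if (len < (1u << 8))    return HDR_8;
    if (len < (1u << 16))   return HDR_16;
    if (len < (1ull << 32)) return HDR_32;
    return HDR_64;
}

/* Header size in bytes for each packed layout: len + alloc + flags. */
static size_t header_size(hdrType t) {
    switch (t) {
    case HDR_5:  return 1;          /* flags only */
    case HDR_8:  return 1 + 1 + 1;
    case HDR_16: return 2 + 2 + 1;
    case HDR_32: return 4 + 4 + 1;
    default:     return 8 + 8 + 1;
    }
}
```

Under these assumptions a 10-byte string needs a 1-byte header and a 100-byte string a 3-byte header, compared with the fixed 8 bytes of the older layout.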

3.5 String Encodings

redisObject wraps the stored value and optimizes storage by choosing an encoding for it. The String type has three encodings:

  • embstr:
    The CPU reads memory one 64-byte cache line at a time. A redisObject is 16 bytes, so reading it also brings in the 48 bytes that follow. With the embstr encoding the SDS is stored right after the redisObject, instead of at a separate address that must be fetched through the ptr pointer. An sdshdr8 header takes 3 bytes and the null terminator 1 byte, which leaves 44 bytes for data. A value of at most 44 bytes can therefore be read in a single cache-line fetch.
  • int:
    If the string is at most 20 characters long and can be converted to an integer, the integer is stored directly in the ptr field of the redisObject.
  • raw:
    Any other string is stored as an SDS in a separate allocation that ptr points to.
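The choice between the three encodings can be sketched as a small decision function. The names strEncoding, EMBSTR_LIMIT, is_canonical_integer, and pick_encoding are hypothetical. The integer check here uses strtoll plus a round-trip comparison, which is an assumption of this sketch and not necessarily how Redis parses integers; it rejects forms such as "012" or "+5".

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { ENC_INT, ENC_EMBSTR, ENC_RAW } strEncoding;

/* 64-byte cache line - 16-byte redisObject - 3-byte sdshdr8 - 1-byte '\0' */
#define EMBSTR_LIMIT (64 - 16 - 3 - 1)

/* True if s is at most 20 characters and is the canonical decimal
 * form of a value that fits in a long long. */
static int is_canonical_integer(const char *s) {
    size_t len = strlen(s);
    if (len == 0 || len > 20) return 0;
    errno = 0;
    char *end;
    long long v = strtoll(s, &end, 10);
    if (errno != 0 || *end != '\0') return 0;
    char buf[32];
    snprintf(buf, sizeof buf, "%lld", v);
    return strcmp(buf, s) == 0;  /* round trip rejects "012", " 7", "+5" */
}

/* Applies the three rules in order: int, then embstr, then raw. */
static strEncoding pick_encoding(const char *s) {
    if (is_canonical_integer(s)) return ENC_INT;
    if (strlen(s) <= EMBSTR_LIMIT) return ENC_EMBSTR;
    return ENC_RAW;
}
```

On a running server, the encoding actually chosen for a key can be inspected with the OBJECT ENCODING command.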

4 Summary

Redis is a popular KV data store because lookups and basic operations are O(1) and because it offers rich data types backed by optimized data structures. Understanding these types and structures helps us use Redis well in day-to-day work. The next article will introduce ZipList, QuickList, and SkipList, which are used by the other common data types: List, Hash, Set, and Sorted Set. If anything in this article is unclear or inaccurate, discussion and corrections are welcome.


Author: Sheng Xu

Origin my.oschina.net/u/4090830/blog/5586124