Analysis of the SDS implementation principle of Redis string objects
Preface
In the previous article, we introduced the 9 data types that Redis supports and their basic usage, but we stopped at usage. Starting with this article, we will analyze the underlying storage structure of each data type in turn. This article covers the underlying storage structure of the most important and most frequently used one: the string object.
String object
Redis is written in C, and C strings are not binary safe, so Redis does not use C strings directly. Instead, it defines its own data structure to represent strings, called the simple dynamic string, or SDS for short.
Why Redis string objects are binary safe
In C, a string is stored in a char array and must end with the null character '\0'. The string does not record its own length, so to get the length you must traverse the whole string until you reach '\0' (which is not counted in the length), and the time complexity is O(n).
Precisely because C uses the null character '\0' to mark the end of a string, a C string can only hold text data. It cannot hold binary data such as pictures, audio, video, and compressed files, which may contain '\0' bytes anywhere, so it is not binary safe.
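The limitation is easy to reproduce. The following is a minimal sketch, not Redis code: the helper name `visible_length` and the sample `payload` are made up for illustration. It shows that the standard `strlen` function stops at the first null byte, so everything after it is invisible to C string functions.

```c
#include <stddef.h>
#include <string.h>

/* Length of a buffer as the C string functions see it: strlen() stops at
 * the first '\0', no matter how many bytes the buffer really holds. */
size_t visible_length(const char *buf) {
    return strlen(buf);
}

/* A 7-byte payload with a zero byte in the middle, as binary data often has.
 * The array itself is 8 bytes because the literal adds a trailing '\0'. */
static const char payload[] = "abc\0def";
```

Calling `visible_length(payload)` reports 3 bytes, although the payload holds 7 bytes of data.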
In order to implement binary-safe strings in Redis, the original C language strings have been improved. The structure of an SDS string is shown below:
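struct sdshdr {
    int len;    // length of buf in use, i.e. the length of the SDS (not counting the trailing '\0')
    int free;   // length of buf that is allocated but unused
    char buf[]; // byte array that holds the string
};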
After improvement, if you want to get the length of the SDS in Redis, you don’t need to traverse the buf array. You can get the length by directly reading the len attribute. The time complexity becomes O(1), and the efficiency is greatly improved. Judging the length of the string no longer depends on the null character'\0', so it can store binary data such as pictures, audio, video and compressed files.
PS: Note, however, that SDS still follows the C convention of ending the string with '\0'. This makes it possible to reuse some of the native C string APIs.
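To make both points concrete, here is a minimal sketch built on the pre-3.2 header layout shown above. The helpers `demo_sds_new` and `demo_sds_len` are hypothetical names for illustration, not the Redis API. The sketch copies bytes with `memcpy` so that embedded '\0' bytes survive, and it still appends a trailing '\0'.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same layout as the pre-3.2 header shown above. */
struct sdshdr {
    int len;    /* bytes used in buf, not counting the trailing '\0' */
    int free;   /* bytes allocated in buf but not yet used */
    char buf[]; /* the string bytes */
};

/* Hypothetical helper: copy len raw bytes into a freshly allocated header. */
struct sdshdr *demo_sds_new(const void *data, int len) {
    assert(len >= 0);
    struct sdshdr *sh = malloc(sizeof(struct sdshdr) + (size_t)len + 1);
    if (sh == NULL) return NULL;
    sh->len = len;
    sh->free = 0;
    memcpy(sh->buf, data, (size_t)len); /* memcpy, not strcpy: '\0' bytes are copied too */
    sh->buf[len] = '\0';                /* keep the C convention so C APIs can be reused */
    return sh;
}

/* O(1): read the stored length instead of scanning for '\0'. */
int demo_sds_len(const struct sdshdr *sh) {
    return sh->len;
}
```

For the 7-byte input "abc\0def", `demo_sds_len` returns 7, while `strlen` on the same buffer would still stop after "abc".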
In Redis 3.2 and later, SDS was optimized further: the header type is now chosen according to the size of the string. There are five types, sdshdr5, sdshdr8, sdshdr16, sdshdr32, and sdshdr64, which are used for strings of up to 32 bytes (2^5), 256 bytes (2^8), 64 KB (2^16), 4 GB (2^32), and 2^64 bytes respectively. Because the current version limits both keys and values to 512 MB, sdshdr64 is not used for now. According to the comments in the source code, sdshdr5 is not used for values, but it is used for keys. Unlike the other types, sdshdr5 does not record unused space, so my guess is that it is better suited to fixed-size scenarios (such as keys).
Taking sdshdr8 as an example, its fields have the following meaning:
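struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len;         // size of the space in use
    uint8_t alloc;       // total size of the allocated space (including unused space)
    unsigned char flags; // marks whether this SDS is an sdshdr8, an sdshdr16, and so on
    char buf[];          // byte array that actually holds the string
};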
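The size-based choice of header can be sketched as a simple threshold function. This is an illustration only: `demo_hdr_type` and `demo_pick_header` are hypothetical names, and the sketch assumes the sizes listed above are strict upper bounds, since a length has to fit in 5, 8, 16, or 32 bits.

```c
#include <stdint.h>

/* Illustrative labels for the five header types (not the Redis constants). */
enum demo_hdr_type { DEMO_HDR_5, DEMO_HDR_8, DEMO_HDR_16, DEMO_HDR_32, DEMO_HDR_64 };

/* Pick the smallest header whose length field can hold string_size. */
enum demo_hdr_type demo_pick_header(uint64_t string_size) {
    if (string_size < (1ULL << 5))  return DEMO_HDR_5;
    if (string_size < (1ULL << 8))  return DEMO_HDR_8;
    if (string_size < (1ULL << 16)) return DEMO_HDR_16;
    if (string_size < (1ULL << 32)) return DEMO_HDR_32;
    return DEMO_HDR_64;
}
```

Under these assumptions, an 11-byte string such as lonely_wolf falls into the smallest class, and a 300-byte string needs the 16-bit header.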
SDS space allocation strategy
In C, because a string does not record its own length, expanding a string can easily cause a buffer overflow.
Suppose the two strings lonely and Redis sit in one contiguous region of memory with only two free bytes between them. If we now want to change the string lonely into lonelyWolf, we need four more bytes, which the gap cannot hold, so we must request new space. If the programmer forgets to request the space, or requests too little, the Re at the start of the following string Redis is overwritten.
Similarly, if you shorten a string, you need to reallocate to release the memory that is no longer needed. Otherwise the string keeps occupying space it does not use, which causes a memory leak.
In C, therefore, avoiding buffer overflows and memory leaks depends entirely on the programmer, which is hard to get right. Neither problem occurs with SDS, because its internal space allocation strategy runs automatically whenever we operate on an SDS, with no manual intervention.
Space preallocation
Space preallocation means that when we expand an SDS through the API and the unused space is not enough, the program allocates not only the space the SDS needs but also some extra unused space. How much extra space is allocated depends on two cases:
- 1. If the len attribute after expansion is less than or equal to 1 MB (that is, 1024*1024 bytes), unused space of the same size as len is allocated as well (at this point, the used space of the buf array equals the unused space).
- 2. If the len attribute after expansion is greater than 1 MB, 1 MB of unused space is allocated.
The advantage of implementing the space pre-allocation strategy is that after the unused space is allocated in advance, there is no need to allocate space every time the string is increased, which reduces the number of memory reallocations.
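The two cases can be written down as a small function. This is a sketch of the rule as described in this section, not a copy of the Redis source; `DEMO_MAX_PREALLOC` and `demo_prealloc_size` are names chosen for illustration.

```c
#include <stddef.h>

#define DEMO_MAX_PREALLOC ((size_t)1024 * 1024) /* the 1 MB threshold from the text */

/* Given the length a string will have after growing, return how many bytes
 * of buf to allocate in total (used + spare), not counting the trailing '\0'. */
size_t demo_prealloc_size(size_t new_len) {
    if (new_len <= DEMO_MAX_PREALLOC)
        return new_len * 2;             /* spare space equals used space */
    return new_len + DEMO_MAX_PREALLOC; /* spare space is capped at 1 MB */
}
```

For example, growing a string to 13 bytes reserves 26 bytes, while growing it to 3 MB reserves 4 MB. At exactly 1 MB both branches give the same result, 2 MB.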
Lazy space release
Lazy space release means that when we shorten an SDS through the API, the program does not release the unused space immediately. It only updates the free attribute, so the space is kept for later use. To keep this from wasting memory, SDS also provides an API that really releases the unused space when necessary.
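A minimal sketch of both halves of this strategy, using the pre-3.2 header layout. The function names `demo_sds_truncate` and `demo_sds_release_free` are hypothetical and only illustrate the idea; they are not the Redis functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sdshdr {
    int len;    /* bytes used in buf */
    int free;   /* bytes allocated but unused */
    char buf[];
};

/* Lazy part: shortening releases no memory, the cut-off bytes are
 * simply counted as free space for later reuse. */
void demo_sds_truncate(struct sdshdr *sh, int new_len) {
    assert(new_len >= 0 && new_len <= sh->len);
    sh->free += sh->len - new_len;
    sh->len = new_len;
    sh->buf[new_len] = '\0';
}

/* Explicit part: shrink the allocation so that no spare space remains. */
struct sdshdr *demo_sds_release_free(struct sdshdr *sh) {
    struct sdshdr *smaller = realloc(sh, sizeof(struct sdshdr) + (size_t)sh->len + 1);
    if (smaller == NULL) return sh; /* shrinking failed, the old block is still valid */
    smaller->free = 0;
    return smaller;
}
```

Truncating lonelyWolf to lonely leaves len at 6 and free at 4. Only the explicit call returns those 4 bytes to the allocator.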
Differences between SDS and C strings
The differences between SDS and C strings are summarized below:
C string | SDS |
---|---|
Can only store text data that does not contain the null character '\0' | Can store text or binary data, including the null character '\0' |
Getting the string length is O(n) | Getting the string length is O(1) |
String operations may cause a buffer overflow | No buffer overflow |
Changing the string length N times requires N memory reallocations | Changing the string length N times requires at most N memory reallocations |
Can use all C string functions | Can use some C string functions |
SDS underlying storage object
After everything said above, many people may think that Redis stores data directly in SDS structures, but that is not the case. Recall that Redis stands for Remote Dictionary Server: every data type in Redis wraps its underlying data structure once more, and a dictionary object is created to store it.
dictEntry object
Every time a key-value pair is created, Redis creates two objects, a key object and a value object. In Redis, every object is wrapped in a redisObject, and the key object and the value object are in turn wrapped together in a dictEntry object. The following is the definition of dictEntry (in dict.h in the source code):
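typedef struct dictEntry {
    void *key; // points to the key, i.e. an SDS
    union {
        void *val; // points to the value, i.e. one of the 5 common data types
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next; // points to the next key-value pair (pairs with the same hash form a linked list, which resolves hash collisions)
} dictEntry;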
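The next pointer is what resolves collisions: entries that land in the same bucket form a singly linked list that is searched in order. The sketch below walks such a chain. `demo_bucket_find` is a hypothetical helper, and for simplicity it assumes keys are plain C strings compared with `strcmp`, whereas Redis compares keys through the dictionary's own key-compare function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct dictEntry {
    void *key;
    union {
        void *val;
        uint64_t u64;
        int64_t s64;
        double d;
    } v;
    struct dictEntry *next;
} dictEntry;

/* Walk one bucket's chain and return the entry whose key matches, or NULL. */
dictEntry *demo_bucket_find(dictEntry *head, const char *key) {
    for (dictEntry *e = head; e != NULL; e = e->next) {
        if (strcmp((const char *)e->key, key) == 0) return e;
    }
    return NULL;
}
```

The cost of a lookup grows with the length of the chain, which is why a hash table tries to keep chains short.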
When we execute the following command:
set name lonely_wolf
You will get an object like this (some irrelevant attributes are omitted):
[Figure: redisObject]
The redisObject above is our value object (in fact, the key is also a redisObject). The following is the data structure definition of redisObject (in server.h in the source code):
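typedef struct redisObject {
    unsigned type:4;       // object type (4 bits = 0.5 bytes)
    unsigned encoding:4;   // encoding (4 bits = 0.5 bytes)
    unsigned lru:LRU_BITS; // time the object was last accessed by the application (24 bits = 3 bytes)
    int refcount;          // reference count; 0 means the object can be garbage collected (32 bits = 4 bytes)
    void *ptr;             // points to the actual underlying storage structure, such as an SDS (8 bytes)
} robj;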
So, in the end, we can simplify the above picture as shown below ([x] will be selected as a suitable value according to the length):
Object type
The object type is the type attribute of redisObject. It has the following 5 main values:
Type attribute | Description | type command return value |
---|---|---|
REDIS_STRING | String object | string |
REDIS_LIST | List object | list |
REDIS_HASH | Hash object | hash |
REDIS_SET | Set object | set |
REDIS_ZSET | Sorted set object | zset |
As you can see, this corresponds to our 5 commonly used basic data types.
Encoding
The encoding is the encoding attribute of redisObject. We can use the object encoding command to view the encoding of a given object.
A string object has three main encodings, as shown in the following table:
Encoding attribute | Description | object encoding command return value |
---|---|---|
OBJ_ENCODING_INT | String object stored as an integer value | int |
OBJ_ENCODING_EMBSTR | String object implemented by SDS using embstr encoding | embstr |
OBJ_ENCODING_RAW | String object implemented using SDS | raw |
- int encoding
When a string object stores an integer that fits in an 8-byte long (at most 2^63 - 1), Redis uses int encoding. In this case the ptr pointer of the redisObject is replaced directly by the long value.
- embstr encoding
When a string object stores a string of at most 44 bytes (39 before version 3.2), Redis uses embstr encoding.
- raw encoding
When a string object stores a string longer than 44 bytes, Redis uses raw encoding.
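The three rules can be sketched as a decision function. This is a simplified illustration, not the Redis implementation: `demo_choose_encoding` and `demo_is_long` are hypothetical names, the input is assumed to be an ordinary null-terminated C string, and the integer check uses the standard `strtoll`, which is more lenient than a production parser (for example, it accepts leading zeros).

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

enum demo_encoding { DEMO_ENC_INT, DEMO_ENC_EMBSTR, DEMO_ENC_RAW };

#define DEMO_EMBSTR_LIMIT 44 /* longest string that still uses embstr (3.2 and later) */

/* Returns 1 if s is a decimal integer that fits in a signed 64-bit value. */
static int demo_is_long(const char *s) {
    char *end;
    if (*s == '\0' || *s == '+' || isspace((unsigned char)*s)) return 0;
    errno = 0;
    strtoll(s, &end, 10);
    /* errno is ERANGE on overflow; *end must be '\0' so the whole string was a number. */
    return errno == 0 && *end == '\0';
}

enum demo_encoding demo_choose_encoding(const char *s) {
    if (demo_is_long(s)) return DEMO_ENC_INT;
    if (strlen(s) <= DEMO_EMBSTR_LIMIT) return DEMO_ENC_EMBSTR;
    return DEMO_ENC_RAW;
}
```

Under these assumptions, 12345 maps to int, lonely_wolf maps to embstr, and any non-numeric string of 45 bytes or more maps to raw. A number just above 2^63 - 1 no longer fits in a long, so it is treated as an ordinary string.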
Why the embstr limit changed from 39 bytes to 44 bytes
With embstr encoding, the redisObject and the SDS occupy one contiguous block of memory. Redis limits this block to 64 bytes, of which the redisObject takes 16 bytes. Before Redis 3.2 the SDS header took 8 bytes, and the '\0' at the end of the string takes 1 byte, which leaves 64-16-8-1=39 bytes.
From Redis 3.2 on, SDS was optimized, and embstr encoding stores the string with sdshdr8. The sdshdr8 header takes only 3 bytes (24 bits: len + alloc + flags), and the trailing '\0' takes 1 byte, which leaves 64-16-3-1=44 bytes.
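The arithmetic can be checked with `sizeof`. The sketch below repeats the two struct definitions from this article and assumes a typical 64-bit platform (8-byte pointers, 4-byte int) and a compiler that supports the GCC/Clang `__attribute__ ((__packed__))` extension. `DEMO_ALLOC_BLOCK` and `demo_embstr_capacity` are illustrative names.

```c
#include <stddef.h>
#include <stdint.h>

#define LRU_BITS 24

typedef struct redisObject {
    unsigned type:4;
    unsigned encoding:4;
    unsigned lru:LRU_BITS;
    int refcount;
    void *ptr;
} robj;

struct __attribute__ ((__packed__)) sdshdr8 {
    uint8_t len;
    uint8_t alloc;
    unsigned char flags;
    char buf[];
};

#define DEMO_ALLOC_BLOCK 64 /* size of the single allocation, per the text */

/* Bytes left for string content once both headers and the '\0' are placed. */
size_t demo_embstr_capacity(void) {
    return DEMO_ALLOC_BLOCK - sizeof(robj) - sizeof(struct sdshdr8) - 1;
}
```

On such a platform `sizeof(robj)` is 16 and `sizeof(struct sdshdr8)` is 3, so the function returns 44.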
The difference between embstr encoding and raw encoding
embstr encoding is an optimized storage method. It allocates the redisObject and the SDS as one contiguous block, so it needs only one memory allocation (and likewise only one release). With raw encoding, the redisObject and the SDS are not contiguous, so two allocations are needed (and likewise two releases). The drawback of embstr is that the redisObject and the SDS live together, so modifying the string would require reallocating both. To avoid this, an embstr-encoded string is read-only and cannot be modified in place.
For example, after an append operation on an embstr-encoded string object, the encoding becomes raw even if the length has not yet reached 45. This is because embstr-encoded strings are read-only: when one needs to be modified, Redis first converts it to raw encoding internally and then performs the operation. Similarly, if an int-encoded string can no longer be stored as a long after an operation (the value is no longer an integer, or it exceeds 2^63 - 1), the int encoding is changed to raw encoding.
PS: Note that once the encoding has been upgraded (int -> embstr -> raw), it is not rolled back, even if the string is later modified to a value that would fit the original encoding.
Last access time (lru)
This attribute records the time when the object was last accessed. The object idletime command returns the idle time of an object, that is, the current time minus the lru time.
Note that the object idletime command itself does not update the lru attribute.
When maxmemory is set and the eviction policy maxmemory-policy is configured as volatile-lru or allkeys-lru, the keys with the longest idle time are evicted first once memory usage reaches the maxmemory value.
# Without a unit, the value is interpreted as bytes
maxmemory 512MB
maxmemory-policy volatile-lru
Refcount
C itself does not provide a memory reclamation mechanism, so Redis implements a simple reference-counting scheme for garbage collection. In short, each time the object is referenced, the count is incremented by 1. When refcount equals 0, the object has no remaining references and can be garbage collected.
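Reference counting in its simplest form looks like the sketch below. The type `demo_obj` and the functions `demo_obj_new`, `demo_obj_incr`, and `demo_obj_decr` are hypothetical names for illustration, not the Redis functions. The sketch assumes the payload was allocated with `malloc` and is owned by the object.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct demo_obj {
    int refcount;
    void *ptr; /* payload owned by the object */
} demo_obj;

/* A new object starts with one reference: its creator. */
demo_obj *demo_obj_new(void *payload) {
    demo_obj *o = malloc(sizeof(*o));
    if (o == NULL) return NULL;
    o->refcount = 1;
    o->ptr = payload;
    return o;
}

/* Another holder starts using the object. */
void demo_obj_incr(demo_obj *o) {
    o->refcount++;
}

/* A holder is done with the object. Returns 1 if this was the last
 * reference and the object was freed, 0 otherwise. */
int demo_obj_decr(demo_obj *o) {
    assert(o->refcount > 0);
    if (--o->refcount == 0) {
        free(o->ptr);
        free(o);
        return 1;
    }
    return 0;
}
```

With two holders, the first `demo_obj_decr` only lowers the count, and the second one frees the object.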
Summary
This article analyzed the string object, the most commonly used of the five common Redis data types. It is stored in SDS at the bottom layer, and we looked at how SDS is wrapped, its space allocation and release strategies, and its encoding types. We also compared SDS with C strings to explain why Redis chose SDS instead of C strings.
The next article will introduce the underlying storage structure and principle analysis of the list type among the five commonly used data types.
Follow me to learn and make progress together with the lone wolf.